Diane-35: No Reason to Panic!

12 03 2013

Dear English-speaking readers of this blog,

This post is about the anti-acne drug Diane-35, which (together with other 3rd- and 4th-generation combined oral contraceptives (COCs)) has been linked to the deaths of several women in Canada, France and the Netherlands. Since most of the media attention (and panic) is in the Netherlands, the remainder of this post focuses on the Dutch situation. Please write in the comments (or tweet) if you would like me to summarize the health concerns of these COCs in a separate English post.

————————-

Media Uproar

Lately there has been a great deal of media attention for Diane-35. It all started in France, where the French medicines regulator ANSM wanted to take Diane-35* off the market at the end of January, because 4 women had died after using it over the past 25 years. This was rejected on appeal, after which the ANSM asked the EMA (European Medicines Agency) to re-examine the safety of Diane-35 and of 3rd/4th-generation combined oral contraceptives (COCs).

In January a 21-year-old Dutch user of Diane-35 also died. Retrospectively, the Netherlands Pharmacovigilance Centre Lareb received 97 reports of adverse effects of Diane-35, among them 9 deaths from 2011 and earlier.** Deaths were also reported in women who had used comparable (3rd- and 4th-generation) oral contraceptives, such as Yasmin.

All of these women died of blood clots in their blood vessels (thrombosis or pulmonary embolism). In total there were 89 reports of blood clots, which is always a serious adverse effect, even without a fatal outcome.

This prompted Canada and Belgium to dive into their statistics as well. In Canada, 11 women taking Diane-35 turned out to have died since 2000, and in Belgium there have been 29 reports of thrombosis associated with 3rd/4th-generation pills since 2008 (5 involving Diane-35, none fatal).

The news hit like a bomb. Many people panicked, or are angry with Bayer*, the CBG (the Dutch Medicines Evaluation Board) and/or health minister Schippers, who in their view are failing to act. On Twitter I see a constant stream of tweets like:

"Diane-35 pill: is it called that because you don't always make it to 35 on it?"

"We recall tonnes of #beef over some #horsemeat, but deaths from the Diane-35 #pill? The government does nothing."

Old News

Such reactions are greatly exaggerated. There is absolutely no reason to panic.

Still, where there is smoke there is fire, even if in this case it is only a small one.

But as far as Diane-35 is concerned, that smoke has been there for years. So why is everyone suddenly shouting "Fire!"?

Most of the deaths date from before this year. That the French authorities are now showing so much decisiveness is probably because they were accused of laxity in the recent scandals around PIP breast implants and Mediator, a drug that has caused more than 500 deaths. [Reuters; see also the blog of Henk Jan Out]

Moreover, it has long been known that Diane-35 increases the risk of blood clots.

Not Just Diane-35

The risks of Diane-35 cannot be viewed separately from the risks of oral contraceptives in general.

In composition, Diane-35 closely resembles the 3rd-generation COCs, but it is unique in that it contains cyproterone acetate instead of a 3rd-generation progestogen. 'The pill' contains levonorgestrel, a 2nd-generation progestogen. In addition, all combined pills nowadays contain a low dose of ethinylestradiol.

As mentioned, it has been known for years that all COCs, including 'the pill', slightly increase the risk of blood clots in blood vessels [1,2,3]. At most, 2nd-generation COCs (with levonorgestrel) increase that risk by a factor of 4. Third-generation pills appear to increase it further; by exactly how much is a matter of debate. For Diane-35, one study sees little or no additional effect [4], others a 1.5-fold [5] to 2-fold [3] stronger effect. The overall picture looks roughly like this:

[Figure: Absolute and relative risk of VTE (venous thromboembolism) for the different pills. From: http://www.anticonceptie-online.nl/pil.htm]

Risks in Perspective

A 1.5-2 times higher risk compared with the "regular pill" sounds enormous. And it would indeed be a big effect if thromboembolism were common. Suppose 1 in 100 people developed thrombosis each year; then 2-4 in 100 people per year would develop thrombosis on 'the pill' and 3-8 in 100 on Diane-35 or a 3rd- or 4th-generation pill. That would be a large absolute risk, one you would normally not take.

But thromboembolism is rare. It occurs in 5-10 per 100,000 women per year, and overall about 1 in a million women will die from it. That is a very small chance: four to six times a probability of just above zero is still a probability of close to zero. So in absolute terms, Diane-35 and other COCs carry little risk.
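To make the difference between relative and absolute risk concrete, here is a minimal Python sketch (mine, not from the original post) that turns a baseline incidence and a relative risk into absolute case counts. The baseline of 5 per 100,000 women per year and the relative risks of roughly 4 and 6-8 come from the figures above; everything else is illustrative.

```python
def absolute_risk(baseline_per_100k, relative_risk):
    """Yearly VTE cases per 100,000 women for a given relative risk."""
    return baseline_per_100k * relative_risk

baseline = 5  # assumed baseline: VTE cases per 100,000 women/year without hormonal contraception

scenarios = [
    ("no combined pill", 1),
    ("2nd-generation pill (RR ~4)", 4),
    ("Diane-35 / 3rd-4th generation (assumed RR ~7)", 7),
]

for label, rr in scenarios:
    cases = absolute_risk(baseline, rr)
    print(f"{label:45s} ~{cases:.0f} per 100,000/year "
          f"({cases - baseline:.0f} extra cases vs no pill)")
```

Even at an assumed relative risk of 7 this amounts to roughly 30 extra cases per 100,000 women per year, which is exactly the point being made here: a several-fold increase of a very small risk is still a small risk.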

Moreover, thrombosis need not be caused directly by the pill. Smoking, age, (excess) weight and a hereditary predisposition to clotting problems can also play a (major) role, and these factors can reinforce one another. For this reason COCs (including the regular pill) are not recommended for risk groups (older women who smoke heavily, women with a predisposition to thrombosis, etc.).

The number of adverse-effect reports possibly associated with the use of Diane-35 actually indicates that this is a relatively safe drug.

An Acceptable Risk?

To put things even more in perspective: pregnancy carries a risk of thrombosis twice as high as Diane-35, and in the 12 weeks after delivery the risk is another 4-8 times higher than during pregnancy (FDA). Yet women do not refrain from becoming pregnant because of this. Having a child usually (implicitly) outweighs (small) risks, of which thrombosis is one.

Risks, then, cannot be viewed separately from benefits. If the benefit is large, people will accept a certain amount of risk (depending on the severity of the condition and the usefulness of the drug). On the other hand, you do not want to run even a very small risk if a drug does not benefit you, or if equally good but safer alternatives exist.

But shouldn't the patient be allowed to make that trade-off herself, together with her doctor?

No Place for Diane-35 as a Contraceptive

Diane-35 works as a contraceptive, but it is not (or no longer) registered for that indication. The latest contraception guideline of the Dutch College of General Practitioners (NHG) from 2010 [6] explicitly states that there is no longer a place for the pill with cyproterone acetate, because the regular 'pill' prevents pregnancy just as well and carries a (somewhat) lower risk of thrombosis. So why run a potentially higher risk when it is not necessary? Unfortunately, the NHG guideline is less explicit about 2nd- versus 3rd-generation COCs.

Other countries often take the same view (in the US, however, Diane-35 is not registered at all).

This, for example, is what the RCOG (Royal College of Obstetricians and Gynaecologists, UK) says in its evidence-based guideline that deals specifically with COCs and the risk of thrombosis [1]:

[Screenshot: RCOG recommendation on cyproterone acetate (CPA)]

Diane-35 as a Treatment for Severe Acne and Hirsutism

Because the cyproterone acetate in Diane-35 has a strong anti-androgenic effect, it can be used for severe acne and hirsutism (the latter mainly in women with PCOS, a gynaecological condition). If desired, it can then also serve as a contraceptive: two birds with one stone.

Clinical Evidence, a very good evidence-based source that weighs the benefits and harms of treatments against each other, concludes that despite their efficacy, drugs containing cyproterone acetate are not preferred over drugs such as metformin for severe hirsutism (in PCOS). The risk of thrombosis was taken into account in this assessment. [7]

[Screenshot: Clinical Evidence summary on cyproterone acetate for PCOS]

According to a Cochrane systematic review, all COCs help against acne, but COCs with cyproterone seemed somewhat more effective than pills with a 2nd- or 3rd-generation progestogen. The results, however, were inconsistent and the studies not very strong. [8]

Some conclude from this Cochrane review that all COCs work equally well and that the regular pill therefore deserves preference (see for instance the recent BMJ article by Helmerhorst [2] and the NHG acne guideline [9]).

But in the most recent Acneiform Dermatoses guideline [10] of the Dutch Society for Dermatology and Venereology (NVDV), a different conclusion is drawn from the same evidence:

[Screenshot: NVDV guideline recommendation on hormonal therapy for acne]

So the Dutch dermatologists give a positive recommendation for Diane-35 over other contraceptives in women who also want contraception. Nowhere in this guideline is thrombosis explicitly mentioned as a possible adverse effect.

Prescribing in Practice

If Diane-35 is not prescribed as a contraceptive and is only used for severe forms of acne or hirsutism, how can a drug with such a low risk become such a sizable problem? After all, both the target group and the chance of adverse effects are very small. And what about 3rd- and 4th-generation COCs, which would not even be prescribed for acne? Their target group should be even smaller.

The reality is that the scale of the problem is caused not so much by on-label use but, as Janine Budding already pointed out on her blog Medical Facts, by off-label prescribing, i.e. prescribing for an indication other than the one for which the drug is registered. In France, half of the women on oral contraceptives use a 3rd- or 4th-generation COC: that is downright excessive, and not in line with the guidelines.

In the Netherlands, more than 161,000 women were taking Diane-35 or a generic variant with exactly the same composition. Many Dutch and Canadian women, too, use Diane-35 and other 3rd- and 4th-generation COCs purely as a contraceptive. Partly because some GPs prescribe it out of habit or think a girl will be rid of her pimples at the same time, and partly because, in the Netherlands and France, Diane-35 is reimbursed while the regular pill is not. There is, certainly in France, a run on a 'free' pill.

Online companies may also play a role. They often do not inform their customers properly. One such company (with poor information about Diane on its website) even goes so far as to create the Twitter account @diane35nieuws as a front for online pill sales.

What Now?

Although the risks of Diane-35 have long been known, appear to be small, and are moreover comparable to those of 3rd- and 4th-generation COCs, a massive backlash against Diane-35 has built up that seems unstoppable. Not the experts but the media and politicians appear to be driving the discussion, which is very confusing and sometimes misleading for patients.

In my view, given the current unrest, the decision of the Dutch GPs, gynaecologists and, recently, the dermatologists not to prescribe Diane-35 to new patients for the time being, until the authorities have ruled on its safety***, is a sensible one.

What is not sensible is to simply stop taking Diane-35 on your own. Always discuss with your doctor first what the best option is for you.

In 1995 a comparable reaction to warnings about the thrombosis risks of certain COCs led to a veritable "pill scare": women switched to another pill en masse or stopped altogether. The result was a peak in unwanted pregnancies (which, incidentally, carry a much higher risk of thrombosis) and abortions. The conclusion at the time [11]:

“The level of risk should, in future, be more carefully assessed and advice more carefully presented in the interests of public health.”

Apparently this lesson has been lost on the Netherlands and France.

Although I think Diane-35 offers real added value over existing drugs for only a limited group of patients, it is regrettable that, on the basis of unfounded reactions, patients may soon no longer have any freedom of choice. Shouldn't they be allowed to weigh the benefits and harms for themselves?

It is understandable (though perhaps not very professional) that dermatologists react with frustration, now that a particular group of patients is falling between two stools and policy is being changed not on the basis of evidence and arguments but under pressure from the media and politics.

[Screenshot: reactions from dermatologists]

Because let's face it: some dermatological and gynaecological patients do benefit from Diane-35.

And finally, a wonderful reaction from an acne patient to a blog post by Ivan Wolfers. She sums up the essence in a few sentences. Like the women quoted above, she is a patient who makes well-considered decisions together with her doctor on the basis of the available information.

As it should be…

[Screenshot: reaction from an acne patient]

Notes

* Diane-35 is produced by Bayer. It is also known as Minerva and Elisa, and abroad, for example, as Dianette. There are also many generic preparations with the same composition.

** In the meantime, 4 more deaths following the use of Diane-35 have been reported in the Netherlands (Artsennet, 2013-03-11).

*** Hopefully the regular pill will then also be included in the comparison. That is only fair: after all, it is a comparison.

References

  1. Venous Thromboembolism and Hormonal Contraception. Green-top Guideline No. 40 (2010). Royal College of Obstetricians and Gynaecologists.
  2. Helmerhorst F.M. & Rosendaal F.R. (2013). Is an EMA review on hormonal contraception and thrombosis needed? BMJ.
  3. van Hylckama Vlieg A., Helmerhorst F.M., Vandenbroucke J.P., Doggen C.J.M. & Rosendaal F.R. (2009). The venous thrombotic risk of oral contraceptives, effects of oestrogen dose and progestogen type: results of the MEGA case-control study. BMJ.
  4. Spitzer W.O. (2003). Cyproterone acetate with ethinylestradiol as a risk factor for venous thromboembolism: an epidemiological evaluation. Journal of Obstetrics and Gynaecology Canada (JOGC).
  5. Martínez F., Ramírez I., Pérez-Campos E., Latorre K. & Lete I. (2012). Venous and pulmonary thromboembolism and combined hormonal contraceptives. Systematic review and meta-analysis. The European Journal of Contraception & Reproductive Health Care.
  6. Brand A., Bruinsma A., van Groeningen K., Kalmijn S., Kardolus I., Peerden M., Smeenk R., de Swart S., Kurver M. & Goudswaard L. (2010). NHG-Standaard Anticonceptie.
  7. Cahill D. (2009). PCOS. Clinical Evidence.
  8. Arowojolu A.O., Gallo M.F., Lopez L.M. & Grimes D.A. (2012). Combined oral contraceptive pills for treatment of acne. Cochrane Database of Systematic Reviews.
  9. Kertzman M.G.M., Smeets J.G.E., Boukes F.S. & Goudswaard A.N. [Summary of the practice guideline 'Acne' (second revision) from the Dutch College of General Practitioners]. Nederlands Tijdschrift voor Geneeskunde.
  10. Richtlijn Acneïforme Dermatosen (2010). Nederlandse Vereniging voor Dermatologie en Venereologie (NVDV).
  11. Furedi A. The public health implications of the 1995 'pill scare'. Human Reproduction Update.




Sugary Drinks as the Culprit in Childhood Obesity? An RCT among Primary School Children

24 09 2012

Childhood obesity is a growing health problem. Since 1980, the proportion of overweight children has almost tripled in the USA: nowadays approximately 17% of children and adolescents are obese. (Source: cdc.gov [6])

Common sense tells me that obesity is the result of too high a calorie intake without sufficient physical activity, which is just what the CDC states. I'm not surprised that the CDC also mentions the greater availability of high-energy-dense foods and sugary drinks at home and at school as main reasons for the increased calorie intake among children.

In my teens I already realized that the sugar in sodas was just "empty calories", and I replaced tonic and cola with low-calorie Rivella (and omitted sugar from tea). When my children were young I urged the day care to refrain from routinely giving lemonade (often in vain).

I was therefore a bit surprised to notice all the fuss in the Dutch newspapers [NRC] [7] about a new Dutch study [1] showing that sugary drinks contributed to obesity. My first reaction was “Duhhh?!…. so what?”.

Also, it bothered me that the researchers had performed an RCT (randomized controlled trial) in kids, giving one half of them sugar-sweetened drinks and the other half sugar-free drinks. "Is it ethical to perform such a scientific "experiment" in healthy kids?", I wondered, "giving more than 300 kids 14 kilos of sugar over 18 months, without them knowing it?"

But after reading the newspaper article and the actual paper [1], I found that the study was very well thought out, also ethically.

It is true that the association between sodas and weight gain has been shown before. But these studies were either observational studies, where one cannot look at the effect of sodas in isolation (kids who drink a lot of sodas often eat more junk food and watch more television, so these other lifestyle aspects may be the real culprit), or inconclusive RCTs (e.g. because of low sample size). Weak studies and inconclusive evidence will not convince policy makers, organizations and beverage companies (nor schools) to take action.

As explained previously in The Best Study Design… For Dummies [8], the best way to test whether an intervention has a health effect is to do a double-blind RCT, where the intervention (in this case: sugary drinks) is compared to a control (drinks with artificial sweetener instead of sugar) and where neither the study participants nor the researchers directly involved know who receives the actual intervention and who the sham one.

The study of Katan and his group[1] was a large, double blinded RCT with a long follow-up (18 months). The researchers recruited 641 normal-weight schoolchildren from 8 primary schools.

Importantly, the study only included children who normally drank sugared drinks at school (see announcement in Dutch). Thus participation in the trial only meant that half of the children received less sugar during the study period. The researchers would have preferred drinking water as a control, but to ensure that the sugar-free and sugar-containing drinks tasted and looked essentially the same they used an artificial sweetener as the control.

The children drank 8 ounces (250 ml) of a 104-calorie sugar-sweetened or a no-calorie sugar-free fruit-flavoured drink every day for 18 months. Compliance was good, as children who drank the artificially sweetened beverages had the expected level of urinary sucralose (sweetener).

At the end of the study the kids in the sugar-free group had gained a kilo less weight than their peers. They also had a significantly lower increase in BMI and gained less body fat.

Thus, according to Katan in the Dutch newspaper NRC [7], "it is time to get rid of the beverage vending machines".

But does this research really support that conclusion and does it, as some headlines state [9]: “powerfully strengthen the case against soda and other sugary drinks as culprits in the obesity epidemic?”

Rereading the paper, I wondered why this study was actually performed.

If the trial was meant to find out whether putting children on artificially sweetened beverages (instead of sugary drinks) would lead to less fat gain, then why didn't the researchers do an intention-to-treat (ITT) analysis? In an ITT analysis, trial participants are compared, in terms of their final results, according to the groups to which they were initially randomized. This permits a pragmatic evaluation of the benefit of a treatment policy.
Suppose there were more dropouts in the intervention group; that might indicate that people had a reason not to adhere to the treatment. Indeed there were many dropouts overall: 26% of the children stopped consuming the drinks, 29% in the sugar-free group and 22% in the sugar group.
Interestingly, the majority of the children who stopped did so because they no longer liked the drink (68/94 versus 45/70 dropouts in the sugar-free versus the sugar group).
And the proportion of children who correctly guessed that their drinks were "artificially sweetened" was 21% higher than expected by chance (correct identification was 3% lower in the sugar group).
Did some children stop using the non-sugary drinks because they found the taste less nice than usual, or artificial? Perhaps.
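As a rough back-of-the-envelope check (mine, not the authors'), the difference in "stopped because of the taste" between the two groups can be compared with a simple chi-square test on the numbers quoted above:

```python
from scipy.stats import chi2_contingency

# Dropouts who stopped because they no longer liked the drink, out of all dropouts
# per group (numbers quoted above: 68/94 in the sugar-free group vs 45/70 in the sugar group).
disliked = {"sugar-free": (68, 94), "sugar": (45, 70)}

table = [[d, n - d] for d, n in disliked.values()]   # [disliked, other reasons]
chi2, p, dof, _expected = chi2_contingency(table)

for group, (d, n) in disliked.items():
    print(f"{group:11s}: {d}/{n} = {d / n:.0%} stopped because of the taste")
print(f"chi-square = {chi2:.2f}, p = {p:.2f}")
```

Whatever the exact p value, the direction of the difference (more taste-related dropouts in the sugar-free group) fits the suspicion voiced above.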

This might indicate that replacing sugary drinks with artificially sweetened drinks might not be as effective in "practice".

Indeed, most of the effect on the main outcome, the difference in BMI z-score (the number of standard deviations by which a child differs from the Dutch mean for his or her age and sex), was "strongest" after 6 months and faded after 12 months.

Mind you, the researchers did neatly correct for the missing data by multiple imputation. As long as the children participated in the study, their changes in body weight and fat paralleled those of children who finished the study. However, the positive effect of the earlier use of non-sugary drinks faded in children who went back to drinking sugary drinks. This is not unexpected, but it underlines the point I raised above: the effect may be less drastic in the “real world”.

Another (smaller) RCT, published in the same issue of the NEJM [2] (editorial in [4]), aimed to test the effect of an intervention to cut the intake of sugary drinks in obese adolescents. The intervention (home deliveries of bottled water and diet drinks for one year) led to a significant reduction in mean BMI (body mass index), but not in percentage body fat, especially in Hispanic adolescents. However, at one year of follow-up (thus one year after the intervention had stopped) the differences between the groups had evaporated again.

But perhaps the trial was "just" meant as a biological-physiological experiment, as Hans van Maanen suggested in his critical response in de Volkskrant [10].

Indeed, the data actually show that sugar in drinks can lead to a greater increase in obesity-related parameters (and vice versa), avoiding the endless fructose-glucose debate [11].

In the media, Katan stresses the mechanistic aspects too. He claims that children who drank the sweetened drinks didn't compensate for the lower intake of sugars by eating more. In the NY Times he is cited as follows [12]: "When you change the intake of liquid calories, you don't get the effect that you get when you skip breakfast and then compensate with a larger lunch…"

This seems a logical explanation, but I can't find any substantiation for it in the article.

Still, "food intake of the children at lunch time, shortly after the morning break when the children have consumed the study drinks" was a secondary outcome in the original protocol! (See the nice comparison of the two most disparate descriptions of the trial design at clinicaltrials.gov [5], partly shown in the figure below.)

"Energy intake during lunchtime" was later replaced by a "sensory evaluation" (with questions like: "How satiated do you feel?"). The results, however, were not reported in the current paper. That is also true for a questionnaire about dental health.

Looking at the two protocol versions I saw other striking differences. In the version of 2009-05-28, the primary outcomes of the study were the children's body weight (BMI z-score), waist circumference (later replaced by waist-to-height ratio), skin folds and bioelectrical impedance.
The latter three became secondary outcomes in the final version. Why?

Click to enlarge (source Clinicaltrials.gov [5])

It is funny that although the main outcome is the BMI z score, the authors mainly discuss the effects on body weight and body fat in the media (but perhaps this is better understood by the audience).

Furthermore, the effect on weight is less than expected: 1 kilo instead of 2.3 kilos. And only part of it is accounted for by a difference in body fat: −0.55 kilo of fat as measured by electrical impedance and −0.35 kilo as measured by changes in skinfold thickness. The standard deviations are enormous.

Look for instance at the primary end point (BMI z-score) at 0 and 18 months in both groups. The change in this period is what counts. The difference in change between the two groups from baseline is −0.13, with a P value of 0.001.

(data are based on the full cohort, with imputed data, taken from Table 2)

Sugar-free group : 0.06±1.00  [0 Mo]  –> 0.08±0.99 [18 Mo] : change = 0.02±0.41  

Sugar group: 0.01±1.04  [0 Mo]  –> 0.15±1.06 [18 Mo] : change = 0.15±0.42 

Difference in change from baseline: −0.13 (−0.21 to −0.05) P = 0.001

Looking at these data I'm impressed by the standard deviations (replaced by standard errors in the somewhat nicer looking fig 3). What does a value of 0.01 ± 1.04 represent? There is a looooot of variation (even though the BMI z-score is corrected for age and sex). Although no statistical differences were found for baseline values between the groups, the "eyeball test" tells me the sugar group has a slight "advantage": they seem to start with slightly lower baseline values (overall, except for body weight).
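As a quick plausibility check (my own, not from the paper), the reported difference and its confidence interval can be roughly reconstructed from these group means and SDs, assuming the full imputed cohort of 641 children was split roughly evenly over the two groups:

```python
import math

# Change in BMI z-score over 18 months (mean ± SD), taken from Table 2 of the paper.
# Group sizes are an assumption (~641 children split roughly in half).
mean_sugarfree, sd_sugarfree, n_sugarfree = 0.02, 0.41, 320
mean_sugar,     sd_sugar,     n_sugar     = 0.15, 0.42, 321

diff = mean_sugarfree - mean_sugar
se   = math.sqrt(sd_sugarfree**2 / n_sugarfree + sd_sugar**2 / n_sugar)
ci   = (diff - 1.96 * se, diff + 1.96 * se)

print(f"difference in change: {diff:.2f}, 95% CI ({ci[0]:.2f} to {ci[1]:.2f})")
# prints roughly -0.13 with CI (-0.19 to -0.07), close to the reported (-0.21 to -0.05)
```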

Anyway, the changes are significant… But statistical significance is not the same as clinical relevance.

At second glance, the data look less impressive than the media reports suggest.

Another important point, raised by van Maanen [10], is that the children's weight increased more in this study than in the general Dutch population: 6-7 kilos instead of 3 kilos.

In conclusion, the study by the group of Katan et al is a large, unique, randomized trial, that looked at the effects of replacement of sugar by artificial sweeteners in drinks consumed by healthy school children. An effect was noticed on several “obesity-related parameters”, but the effects were not large and possibly don’t last after discontinuation of the trial.

It is important that a single factor, the sugar component in beverages, was tested in isolation. This shows that sugar itself "does matter". However, the trial does not show that sugary drinks are the main factor in childhood obesity (as suggested in some media reports).

It is clear that the investigators feel very engaged, they really want to tackle the childhood obesity problem. But they should separate the scientific findings from common sense.

The cans fabricated for this trial were registered under the trade name Blikkie (Dutch for “Little Can”). This was to make sure that the drinks would never be sold by smart business guys using the slogan: “cans which have scientifically been proven to help to keep your child lean and healthy”.[NRC]

Still, soft drink stakeholders may well argue that low-calorie drinks are just fine and that curbing sodas is not the magic bullet.

But it is a good start, I think.

Photo credits Cola & Obesity: Melliegrunt, Flickr [CC]

  1. de Ruyter JC, Olthof MR, Seidell JC, & Katan MB (2012). A Trial of Sugar-free or Sugar-Sweetened Beverages and Body Weight in Children. The New England journal of medicine PMID: 22998340
  2. Ebbeling CB, Feldman HA, Chomitz VR, Antonelli TA, Gortmaker SL, Osganian SK, & Ludwig DS (2012). A Randomized Trial of Sugar-Sweetened Beverages and Adolescent Body Weight. The New England journal of medicine PMID: 22998339
  3. Qi Q, Chu AY, Kang JH, Jensen MK, Curhan GC, Pasquale LR, Ridker PM, Hunter DJ, Willett WC, Rimm EB, Chasman DI, Hu FB, & Qi L (2012). Sugar-Sweetened Beverages and Genetic Risk of Obesity. The New England journal of medicine PMID: 22998338
  4. Caprio S (2012). Calories from Soft Drinks – Do They Matter? The New England journal of medicine PMID: 22998341
  5. Changes to the protocol http://clinicaltrials.gov/archive/NCT00893529/2011_02_24/changes
  6. Overweight and Obesity: Childhood obesity facts  and A growing problem (www.cdc.gov)
  7. NRC Wim Köhler Eén kilo lichter.NRC | Zaterdag 22-09-2012 (http://archief.nrc.nl/)
  8.  The Best Study Design… For Dummies (https://laikaspoetnik.wordpress.com)
  9. Studies point to sugary drinks as culprits in childhood obesity – CTV News (ctvnews.ca)
  10. Hans van Maanen. Suiker uit fris, De Volkskrant, 29 september 2012 (freely accessible at http://www.vanmaanen.org/)
  11. Sugar-Sweetened Beverages, Diet Coke & Health. Part I. (https://laikaspoetnik.wordpress.com)
  12. Roni Caryn Rabina. Avoiding Sugared Drinks Limits Weight Gain in Two Studies. New York Times, September 21, 2012




The Scatter of Medical Research and What to do About it.

18 05 2012

Paul Glasziou, GP and professor in Evidence Based Medicine, co-authored a new article in the BMJ [1]. Similar to another paper [2] I discussed before [3], this paper deals with the difficulty clinicians have in staying up to date with the literature. But where the previous paper [2,3] highlighted the mere increase in the number of research articles over time, the current paper looks at the scatter of randomized clinical trials (RCTs) and systematic reviews (SRs) across the different journals cited in one year (2009) in PubMed.

Hoffmann et al analyzed 7 specialties and 9 subspecialties that are considered the leading contributors to the burden of disease in high-income countries.

They followed a relatively straightforward method for identifying the publications. Each search string consisted of a MeSH term (controlled term) to identify the selected disease or disorders, a publication type [pt] to identify the type of study, and the year of publication. For example, the search strategy for randomized trials in cardiology was: "heart diseases"[MeSH] AND randomized controlled trial[pt] AND 2009[dp]. (When searching "heart diseases" as a MeSH term, narrower terms are also searched.) Meta-analysis[pt] was used to identify systematic reviews.
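For readers who want to try this themselves, here is a minimal sketch (an illustration, not the authors' code) of how such a PubMed count can be reproduced with Biopython's Entrez (E-utilities) wrapper; the email address is a placeholder.

```python
from Bio import Entrez

Entrez.email = "you@example.org"  # placeholder; NCBI requires a contact address

query = '"heart diseases"[MeSH] AND randomized controlled trial[pt] AND 2009[dp]'
handle = Entrez.esearch(db="pubmed", term=query, retmax=0)  # retmax=0: we only want the count
record = Entrez.read(handle)
handle.close()

print(f"RCTs in cardiology published in 2009: {record['Count']}")
```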

Using this approach, Hoffmann et al found 14,343 RCTs and 3,214 SRs published in 2009 in the selected (sub)specialties. There was a clear scatter across journals, but this scatter varied considerably among specialties:

“Otolaryngology had the least scatter (363 trials across 167 journals) and neurology the most (2770 trials across 896 journals). In only three subspecialties (lung cancer, chronic obstructive pulmonary disease, hearing loss) were 10 or fewer journals needed to locate 50% of trials. The scatter was less for systematic reviews: hearing loss had the least scatter (10 reviews across nine journals) and cancer the most (670 reviews across 279 journals). For some specialties and subspecialties the papers were concentrated in specialty journals; whereas for others, few of the top 10 journals were a specialty journal for that area.
Generally, little overlap occurred between the top 10 journals publishing trials and those publishing systematic reviews. The number of journals required to find all trials or reviews was highly correlated (r=0.97) with the number of papers for each specialty/ subspecialty.”

Previous work had already suggested that this scatter of research has a long tail: half of the publications appear in a small core of journals, whereas the remaining articles are scattered across many journals (see the figure below).

Click to enlarge and see the legend at BMJ 2012;344:e3223 [CC]

The good news is that SRs are less scattered and that general journals appear more often in the top 10 journals publishing SRs. Indeed for 6 of the 7 specialties and 4 of the 9 subspecialties, the Cochrane Database of Systematic Reviews had published the highest number of systematic reviews, publishing between 6% and 18% of all the systematic reviews published in each area in 2009. The bad news is that even keeping up to date with SRs seems a huge, if not impossible, challenge.

In other words, it is not sufficient for clinicians to rely on personal subscriptions to a few journals in their specialty (which is common practice). Hoffmann et al suggest several solutions to help clinicians cope with the increasing volume and scatter of research publications.

  • a central library of systematic reviews (but apparently the Cochrane Library fails to fulfill such a role according to the authors, because many reviews are out of date and are perceived as less clinically relevant)
  • registry of planned and completed systematic reviews, such as prospero. (this makes it easier to locate SRs and reduces bias)
  • Syntheses of evidence and synopses, like ACP Journal Club, which summarizes the best evidence in internal medicine
  • Specialised databases that collate and critically appraise randomized trials and systematic reviews, like www.pedro.org.au for physical therapy. In my personal experience, however, this database is often out of date and not comprehensive
  • Journal scanning services like EvidenceUpdates (from mcmaster.ca), which scans over 120 journals, filters articles on the basis of quality, has practising clinicians rate them for relevance and newsworthiness, and makes them available as email alerts and in a searchable database. I use this service too, but besides the fact that not all specialties are covered, the rating of evidence may not always be objective (see previous post [4])
  • The use of social media tools to alert clinicians to important new research.

Most of these solutions are (long) existing solutions that do not or only partly help to solve the information overload.

I was surprised that the authors didn't propose the use of personalized alerts. PubMed's My NCBI feature allows you to create automatic email alerts on a topic and to subscribe to electronic tables of contents (which could include ACP Journal Club). Suppose that a physician browses 10 journals roughly covering 25% of the trials. He or she does not need to read all the other journals from cover to cover to avoid missing one potentially relevant trial. Instead it is far more efficient to perform a topic search to filter relevant studies from journals that seldom publish trials on the topic of interest. One could even use the search of Hoffmann et al to achieve this.* Although in reality, most clinical researchers will have narrower fields of interest than all studies about endocrinology and neurology.

At our library we are working on creating deduplicated, easy-to-read alerts that combine the tables of contents of certain journals with topic (and author) searches in PubMed, EMBASE and other databases. There are existing tools that do the same.

Another way to reduce the individual work (reading) load is to organize journal clubs or, even better, regular CATs (critically appraised topics). In the Netherlands, CATs are a compulsory part of residency training. A few doctors do the work for many. Usually they choose topics that are clinically relevant (or for which the evidence is unclear).

The authors briefly mention that their search strategy might have missed some eligible papers and included some that are not truly RCTs or SRs, because they relied on PubMed's publication type to retrieve RCTs and SRs. For systematic reviews this may be a greater problem than recognized, for the authors used meta-analysis[pt] to identify systematic reviews. Unfortunately PubMed has no publication type for systematic reviews, but it may be clear that there are many more systematic reviews than meta-analyses. Possibly systematic reviews even have a different scatter pattern than meta-analyses (i.e. the latter might be preferentially included in core journals).

Furthermore, not all meta-analyses and systematic reviews are reviews of RCTs (thus it is not completely fair to compare MAs with RCTs only). On the other hand, it is an omission of this study (not discussed by the authors) that only interventions are considered. Nowadays physicians have many other questions than those related to therapy, such as questions about prognosis, harm and diagnosis.

I did a little, imperfect search just to see whether the use of search terms other than meta-analysis[pt] would have any influence on the outcome. I searched for (1) meta-analysis[pt] and (2) systematic review[tiab] (title and abstract) for papers about endocrine diseases. Then I subtracted (1) from (2), to analyse the systematic reviews not indexed as meta-analysis[pt].

Thus:

(ENDOCRINE DISEASES[MESH] AND SYSTEMATIC REVIEW[TIAB] AND 2009[DP]) NOT META-ANALYSIS[PT]

I analyzed the top 10/11 journals publishing these study types.
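A tally like this can also be scripted; the sketch below (again an illustration with a placeholder email address, not what I actually used) counts the journals behind such a search with Biopython and the E-utilities:

```python
from collections import Counter
from Bio import Entrez, Medline

Entrez.email = "you@example.org"  # placeholder contact address

query = ('endocrine diseases[MeSH] AND systematic review[tiab] '
         'AND 2009[dp] NOT meta-analysis[pt]')

handle = Entrez.esearch(db="pubmed", term=query, retmax=500)
ids = Entrez.read(handle)["IdList"]
handle.close()

# Fetch the MEDLINE records and tally the journal title abbreviations (field "TA").
handle = Entrez.efetch(db="pubmed", id=ids, rettype="medline", retmode="text")
journals = Counter(rec.get("TA", "unknown") for rec in Medline.parse(handle))
handle.close()

for journal, n in journals.most_common(10):
    print(f"{n:3d}  {journal}")
```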

This little experiment suggests that:

  1. The precise scatter might differ per search: apparently the systematic review[tiab] search yielded different top 10/11 journals (for this sample) than the meta-analysis[pt] search (partially because Cochrane systematic reviews apparently don't mention "systematic review" in the title or abstract?).
  2. The authors underestimate the number of systematic reviews: simply searching for systematic review[tiab] already found approx. 50% additional systematic reviews compared to meta-analysis[pt] alone.
  3. As expected (by me at least), many of the SRs and MAs were NOT dealing with interventions; see for instance the first 5 hits (out of 108 and 236 respectively).
  4. Together these findings indicate that the true information overload is far greater than shown by Hoffmann et al (not all systematic reviews are found, and of all available study designs only RCTs are searched).
  5. On the other hand this indirectly shows that SRs are a better way to keep up to date than suggested: SRs also summarize non-interventional research (the ratio of SRs of RCTs to individual RCTs is much lower than suggested).
  6. It also means that the role of the Cochrane systematic reviews in aggregating RCTs is underestimated by the published graphs (the MA[pt] set is diluted with non-RCT systematic reviews; thus the proportion of Cochrane SRs among the truly interventional MAs becomes larger).

Well anyway, these imperfections do not contradict the main point of this paper: that trials are scattered across hundreds of general and specialty journals and that “systematic reviews” (or meta-analyses really) do reduce the extent of scatter, but are still widely scattered and mostly in different journals to those of randomized trials.

Indeed, personal subscriptions to journals seem insufficient for keeping up to date.
Besides supplementing subscriptions with methods such as journal scanning services, I would recommend the use of personalized alerts from PubMed and several prefiltered sources, including an EBM search engine like TRIP (www.tripdatabase.com/).

*but I would broaden it to find all aggregate evidence, including ACP, Clinical Evidence, syntheses and synopses, not only meta-analyses.

**I do appreciate that one of the co-authors is a medical librarian: Sarah Thorning.

References

  1. Hoffmann, Tammy, Erueti, Chrissy, Thorning, Sarah, & Glasziou, Paul (2012). The scatter of research: cross sectional comparison of randomised trials and systematic reviews across specialties BMJ, 344 : 10.1136/bmj.e3223
  2. Bastian, H., Glasziou, P., & Chalmers, I. (2010). Seventy-Five Trials and Eleven Systematic Reviews a Day: How Will We Ever Keep Up? PLoS Medicine, 7 (9) DOI: 10.1371/journal.pmed.1000326
  3. How will we ever keep up with 75 trials and 11 systematic reviews a day (laikaspoetnik.wordpress.com)
  4. Experience versus Evidence [1]. Opioid Therapy for Rheumatoid Arthritis Pain. (laikaspoetnik.wordpress.com)




What Did Deep DNA Sequencing of Traditional Chinese Medicines (TCMs) Really Reveal?

30 04 2012

A recent study published in PLoS Genetics [1] on a genetic audit of Traditional Chinese Medicines (TCMs) was widely covered in the news. The headlines are a bit confusing, as they say different things. Some headlines say "Dangers of Chinese Medicine Brought to Light by DNA Studies", others that Bear and Antelope DNA are Found in Traditional Chinese Medicine, and still others, more neutrally: Breaking down traditional Chinese medicine.

What have Bunce and his group really done and what is the newsworthiness of this article?

Photos of 4 of the TCM samples used in this study (doi:10.1371/journal.pgen.1002657.g001)

The researchers, from Murdoch University, Australia, applied second-generation, high-throughput sequencing to identify the plant and animal composition of 28 TCM samples (see figure). These TCM samples had been seized by the Australian Customs and Border Protection Service at airports and seaports across Australia because they contravened Australia's international wildlife trade laws (Part 13A EPBC Act 1999).

Using primers specific for the plastid trnL gene (plants) and the mitochondrial 16S ribosomal RNA (animals), DNA of sufficient quality was obtained from 15 of the 28 (54%) TCM samples. The resultant 49,000 amplicons (amplified sequences) were analyzed by high-throughput sequencing and compared to reference databases.
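To give an idea of what the "compared to reference databases" step looks like in practice, here is a minimal, hypothetical Biopython sketch that submits a single made-up amplicon to NCBI web BLAST; the actual study used its own curated trnL and 16S reference sets and a bulk pipeline, not per-sequence web queries.

```python
from Bio.Blast import NCBIWWW, NCBIXML

# Placeholder nucleotide fragment; a real trnL or 16S read from the samples would be
# needed for meaningful hits.
amplicon = "GGGCAATCCTGAGCCAAATCCTGTTTTCCGAAAACAAACAAAGG"

result_handle = NCBIWWW.qblast("blastn", "nt", amplicon)  # queries GenBank's nt database online
record = NCBIXML.read(result_handle)

# Report the top hits with their percent identity.
for alignment in record.alignments[:5]:
    best_hsp = alignment.hsps[0]
    identity = best_hsp.identities / best_hsp.align_length
    print(f"{alignment.title[:60]}  identity={identity:.1%}")
```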

Due to better GenBank coverage, the analysis of vertebrate DNA was simpler and less ambiguous than the analysis of the plant origins.

Four TCM samples – Saiga Antelope Horn powder, Bear Bile powder, a powder in a box with a bear outline, and Chu Pak Hou Tsao San powder – were found to contain DNA from species listed under CITES (the Convention on International Trade in Endangered Species). This is no real surprise, as the packages were labelled as such.

On the other hand, some TCM samples, like the "100% pure" Saiga Antelope powder, were "diluted" with DNA from bovids (e.g. goats and sheep), deer and/or toads. In 78% of the samples, animal DNA was identified that had not been clearly labelled as such on the packaging.

In total, 68 different plant families were detected in the medicines. Some of the TCMs contained plants of potentially toxic genera like Ephedra and Asarum. Ephedra contains the sympathomimetic ephedrine, which has led to many, sometimes fatal, intoxications, also in Western countries. It should be noted, however, that pharmacological activity cannot be demonstrated by DNA analysis. Similarly, certain species of Asarum (wild ginger) contain the nephrotoxic and carcinogenic aristolochic acid, but it would require further testing to establish the presence of aristolochic acid in the samples positive for Asarum. Plant DNA assigned to genera that are potentially toxic, allergenic (nuts, soy) and/or subject to CITES regulation was also recovered. Again, other gene regions would need to be targeted to reveal the exact species involved.

Most newspapers emphasized that the study has brought to light "the dangers of TCM".

For this reason The Telegraph interviewed an expert in the field, Edzard Ernst, Professor of Complementary Medicine at the University of Exeter. Ernst:

“The risks of Chinese herbal medicine are numerous: firstly, the herbs themselves can be toxic; secondly, they might interact with prescription drugs; thirdly, they are often contaminated with heavy metals; fourthly, they are frequently adulterated with prescription drugs; fifthly, the practitioners are often not well trained, make unsubstantiated claims and give irresponsible, dangerous advice to their patients.”

Ernst is right about the risks. However, these adverse effects of TCM have long been known. Fifteen years ago I happened to write a bibliography about "adverse effects of herbal medicines*" (in Dutch; a good book on this topic is [2]). I excluded interactions with prescription drugs, contamination with heavy metals and adulteration with prescription drugs, because those events (publications in PubMed and EMBASE) were too numerous(!). Toxic Chinese herbs mostly caused acute toxicity through aconitine, anticholinergic (Datura, Atropa) and podophyllotoxin intoxications. In Belgium, 80 young women developed nephropathy (kidney problems) after attending a "slimming" clinic, because of a mix-up of Stephania (Chinese: fangji) with Aristolochia fangchi (which contains the toxic aristolochic acid). Some of the women later developed urinary tract cancer.

In other words, toxic side effects of herbs including chinese herbs are long known. And the same is true for the presence of (traces of) endangered species in TCM.

In a media release, the Complementary Healthcare Council (CHC) of Australia emphasized that the 15 TCM products featured in this study were rogue products seized by Customs because they were found to contain prohibited and undeclared ingredients. The CHC emphasizes the proficiency of its rigorous regulatory regime around complementary medicines, i.e. all ingredients used in listed products must be on the permitted list of ingredients. However, Australian regulations do not apply to products purchased online from overseas.

Thus if the findings are not new and (perhaps) not applicable to most legal TCM, then what is the value of this paper?

The new aspect is the high-throughput DNA sequencing approach, which allows determination of a larger number of animal and plant taxa than would have been possible by morphological and/or biochemical means. Various kinds of TCM samples are suitable: powders, tablets, capsules, flakes and herbal teas.

There are also some limitations:

  1. DNA of sufficient quality could only be obtained from approximately half of the samples.
  2. Plant sequences could often not be resolved beyond the family level. Therefore it could often not be established whether an endangered or toxic species was really present (or an innocent family member).
  3. Only DNA sequences can be determined, not pharmacological activity.
  4. The method is at best semi-quantitative.
  5. Only plant and animal ingredients are determined, not contaminating heavy metals or prescription drugs.

In the future, species assignment (2) can be improved with the development of better reference databases involving multiple genes and (3) can be solved by combining genetic (sequencing) and metabolomic (for compound detection) approaches. According to the authors this may be a cost-effective way to audit TCM products.

Non-technical approaches may be equally important, like convincing consumers not to use medicines containing animal traces (let alone endangered species), not to order TCM online, and to avoid the use of complex, uncontrolled TCM mixes.

Furthermore, there should be more info on what works and what doesn’t.

*including but not limited to TCM

References

  1. Coghlan ML, Haile J, Houston J, Murray DC, White NE, Moolhuijzen P, Bellgard MI, & Bunce M (2012). Deep Sequencing of Plant and Animal DNA Contained within Traditional Chinese Medicines Reveals Legality Issues and Health Safety Concerns. PLoS genetics, 8 (4) PMID: 22511890 (Free Full Text)
  2. De Smet P.A.G.M., Keller K., Hansel R. & Chandler R.F. (eds). Adverse Effects of Herbal Drugs 2. Springer, 1993. ISBN 0-387-55800-4.
  3. DNA may weed out toxic Chinese medicine (abc.net.au)
  4. Bedreigde beren in potje Lucas Brouwers, NRC Wetenschap 14 april 2012, bl 3 [Dutch]
  5. Dangers in herbal medicine (continued) – DNA sequencing to hunt illegal ingredients (somethingaboutscience.wordpress.com)
  6. Breaking down traditional Chinese medicine. (green.blogs.nytimes.com)
  7. Dangers of Chinese Medicine Brought to Light by DNA Studies (news.sciencemag.org)
  8. Chinese herbal medicines contained toxic mix (cbc.ca)
  9. Screen uncovers hidden ingredients of Chinese medicine (Nature News)
  10. Media release: CHC emphasises proficiency of rigorous regulatory regime around complementary medicines (http://www.chc.org.au/)




Evidence Based Point of Care Summaries [2] More Uptodate with Dynamed.

18 10 2011

This post is part of a short series about Evidence Based Point of Care Summaries or POCs. In this series I will review 3 recent papers that objectively compare a selection of POCs.

In the previous post I reviewed a paper from Rita Banzi and colleagues from the Italian Cochrane Centre [1]. They analyzed 18 POCs with respect to their “volume”, content development and editorial policy. There were large differences among POCs, especially with regard to evidence-based methodology scores, but no product appeared the best according to the criteria used.

In this post I will review another paper by Banzi et al, published in the BMJ a few weeks ago [2].

Using a prospective cohort design, this article examined the speed with which evidence-based point-of-care summaries were updated.

First the authors selected all the systematic reviews signaled by the American College of Physicians (ACP) Journal Club and Evidence-Based Medicine Primary Care and Internal Medicine from April to December 2009. In the same period the authors selected all the Cochrane systematic reviews labelled as “conclusion changed” in the Cochrane Library. In total 128 systematic reviews were retrieved, 68 from the literature surveillance journals (53%) and 60 (47%) from the Cochrane Library. Two months after the collection started (June 2009) the authors did a monthly screen for a year to look for potential citation of the identified 128 systematic reviews in the POCs.

Only the 5 POCs that ranked in the top quarter for at least 2 (out of 3) desirable dimensions were studied, namely Clinical Evidence, Dynamed, EBM Guidelines, UpToDate and eMedicine. Surprisingly, eMedicine was among the selected POCs, despite having a rating of "1" on a scale of 1 to 15 for EBM methodology. One would think that being evidence-based is a fundamental prerequisite for EBM POCs…?!

Results were represented as a (rather odd, but clear) "survival analysis" ("death" = a citation in a summary).

Fig.1 : Updating curves for relevant evidence by POCs (from [2])
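For readers unfamiliar with this use of survival curves, the sketch below (toy data, not the study's) shows how such "updating curves" can be drawn with the Python lifelines package, treating the citation of a systematic review in a POC as the "event":

```python
from lifelines import KaplanMeierFitter
import matplotlib.pyplot as plt

# Months until a review was cited in the summary; None means it was never cited
# during the 12-month follow-up (censored). These numbers are made up.
months_to_citation = [2, 3, 1, 5, None, 4, None, 2, 8, None]
durations = [m if m is not None else 12 for m in months_to_citation]
cited = [m is not None for m in months_to_citation]

kmf = KaplanMeierFitter()
kmf.fit(durations, event_observed=cited, label="hypothetical POC")
kmf.plot_survival_function()   # proportion of reviews *not yet* cited over time
plt.xlabel("months since the systematic review was signalled")
plt.ylabel("proportion not yet cited")
plt.show()
```

A faster-updating POC shows a curve that drops more steeply, which is essentially what Fig. 1 visualizes for the five products.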

I will be brief about the results.

Dynamed clearly beat all the other products in updating speed.

Expressed in figures, the updating speed of Dynamed was 78% and 97% greater than that of EBM Guidelines and Clinical Evidence, respectively. Dynamed had a median citation time of around two months and EBM Guidelines around 10 months, quite close to the limit of the follow-up, but the citation rates of the other three point-of-care summaries (UpToDate, eMedicine, Clinical Evidence) were so slow that they exceeded the follow-up period and the authors could not compute the median.

Dynamed outperformed the other POCs in updating of systematic reviews independently of the route. EBM Guidelines and UpToDate had similar overall updating rates, but Cochrane systematic reviews were more likely to be cited by EBM Guidelines than by UpToDate (odds ratio 0.02, P<0.001). Perhaps not surprising, as EBM Guidelines has a formal agreement with the Cochrane Collaboration to use Cochrane contents and label its summaries as "Cochrane inside." On the other hand, UpToDate was faster than EBM Guidelines in updating systematic reviews signaled by literature surveillance journals.

Dynamed‘s higher updating ability was not due to a difference in identifying important new evidence, but to the speed with which this new information was incorporated in their summaries. Possibly the central updating of Dynamed by the editorial team might account for the more prompt inclusion of evidence.

As the authors rightly point out, slowness in updating could mean that new relevant information is ignored, and could thus "affect the validity of point of care information services".

A slow updating rate may be considered a more serious problem for POCs that "promise" to "continuously update their evidence summaries" (EBM Guidelines) or to "perform a continuous comprehensive review and to revise chapters whenever important new information is published, not according to any specific time schedule" (UpToDate). (See the table with descriptions of the updating mechanisms.)

In contrast, eMedicine doesn't provide any detailed information on its updating policy, another reason why it doesn't belong in this list of best POCs.
Clinical Evidence, however, clearly states: "We aim to update Clinical Evidence reviews annually. In addition to this cycle, details of clinically important studies are added to the relevant reviews throughout the year using the BMJ Updates service." But BMJ Updates was not considered in the current analysis. Furthermore, patience is rewarded with excellent and complete summaries of evidence (in my opinion).

Indeed a major limitation of the current (and the previous) study by Banzi et al [1,2] is that they have looked at quantitative aspects and items that are relatively “easy to score”, like “volume” and “editorial quality”, not at the real quality of the evidence (previous post).

Although the findings were new to me, others have recently published similar results (studies were performed in the same time-span):

Shurtz and Foster [3] of the Texas A&M University Medical Sciences Library (MSL) also sought to establish a rubric for evaluating evidence-based medicine (EBM) point-of-care tools in a health sciences library.

They, too, looked at editorial quality and speed of updating, as well as content coverage, search options, quality control, and grading.

Their main conclusion is that “differences between EBM tools’ options, content coverage, and usability were minimal, but that the products’ methods for locating and grading evidence varied widely in transparency and process”.

Thus this is in line with what Banzi et al reported in their first paper. They also share Banzi's conclusion about differences in speed of updating:

“DynaMed had the most up-to-date summaries (updated on average within 19 days), while First Consult had the least up to date (updated on average within 449 days). Six tools claimed to update summaries within 6 months or less. For the 10 topics searched, however, only DynaMed met this claim.”

Table 3 from Shurtz and Foster [3] 

Ketchum et al [4] also conclude that DynaMed had the largest proportion of current (2007-2009) references (170/1131, 15%). In addition, they found that DynaMed had the largest total number of references (1131/2330, 48.5%).

Yes, and you might have guessed it. The paper of Andrea Ketchum is the 3rd paper I’m going to review.

I also recommend reading the paper by the librarians Shurtz and Foster [3], which I found along the way. It has too much overlap with the Banzi papers to devote a separate post to it. Still, it provides better background information than the Banzi papers, focuses on POCs that claim to be EBM, and doesn't try to weigh one element over another.

References

  1. Banzi, R., Liberati, A., Moschetti, I., Tagliabue, L., & Moja, L. (2010). A Review of Online Evidence-based Practice Point-of-Care Information Summary Providers Journal of Medical Internet Research, 12 (3) DOI: 10.2196/jmir.1288
  2. Banzi, R., Cinquini, M., Liberati, A., Moschetti, I., Pecoraro, V., Tagliabue, L., & Moja, L. (2011). Speed of updating online evidence based point of care summaries: prospective cohort analysis BMJ, 343 (sep22 2) DOI: 10.1136/bmj.d5856
  3. Shurtz, S., & Foster, M. (2011). Developing and using a rubric for evaluating evidence-based medicine point-of-care tools Journal of the Medical Library Association : JMLA, 99 (3), 247-254 DOI: 10.3163/1536-5050.99.3.012
  4. Ketchum, A., Saleh, A., & Jeong, K. (2011). Type of Evidence Behind Point-of-Care Clinical Information Products: A Bibliometric Analysis Journal of Medical Internet Research, 13 (1) DOI: 10.2196/jmir.1539
  5. Evidence Based Point of Care Summaries [1] No “Best” Among the Bests? (laikaspoetnik.wordpress.com)
  6. How will we ever keep up with 75 Trials and 11 Systematic Reviews a Day? (laikaspoetnik.wordpress.com
  7. UpToDate or Dynamed? (Shamsha Damani at laikaspoetnik.wordpress.com)
  8. How Evidence Based is UpToDate really? (laikaspoetnik.wordpress.com)






Evidence Based Point of Care Summaries [1] No “Best” Among the Bests?

13 10 2011

For many of today's busy practicing clinicians, keeping up with the enormous and ever-growing amount of medical information poses substantial challenges [6]. It's impractical to do a PubMed search to answer each clinical question and then synthesize and appraise the evidence, simply because busy health care providers have limited time and many questions per day.

As repeatedly mentioned on this blog ([6,7]), it is far more efficient to try to find aggregate (or pre-filtered, pre-appraised) evidence first.

Haynes' "5S" levels of evidence (adapted by [1])

There are several forms of aggregate evidence, often represented as the higher layers of an evidence pyramid (because they aggregate individual studies, which form the lowest layer). Confusingly, there are many different pyramids, however [8], with different hierarchies based on different principles.

According to the "5S" paradigm [9] (now evolved into 6S [10]), the peak of the pyramid is formed by the ideal, but not yet realized, computer decision support systems, which link individual patient characteristics to the current best evidence. According to the 5S model, the next best sources are evidence-based textbooks.
(Note: EBM and textbooks almost seem a contradiction in terms to me; personally I would not put many of the POCs anywhere near the top. Also see my post: How Evidence Based is UpToDate really?)

Whatever their exact place in the EBM-pyramid, these POCs are helpful to many clinicians. There are many different POCs (see The HLWIKI Canada for a comprehensive overview [11]) with a wide range of costs, varying from free with ads (e-Medicine) to very expensive site licenses (UpToDate). Because of the costs, hospital libraries have to choose among them.

Choices are often based on user preferences and satisfaction and balanced against costs, scope of coverage etc. Choices are often subjective and people tend to stick to the databases they know.

Initial literature about POCs concentrated on user preferences and satisfaction. A New Zealand study [3] among 84 GPs showed no significant difference in preference for, or usage levels of DynaMed, MD Consult (including FirstConsult) and UpToDate. The proportion of questions adequately answered by POCs differed per study (see introduction of [4] for an overview) varying from 20% to 70%.
McKibbon and Fridsma ([5] cited in [4]) found that the information resources chosen by primary care physicians were seldom helpful in providing the correct answers, leading them to conclude that:

“…the evidence base of the resources must be strong and current…We need to evaluate them well to determine how best to harness the resources to support good clinical decision making.”

Recent studies have tried to objectively compare online point-of-care summaries with respect to their breadth, content development, editorial policy, the speed of updating and the type of evidence cited. I will discuss 3 of these recent papers, but will review each paper separately. (My posts tend to be pretty long and in-depth. So in an effort to keep them readable I try to cut down where possible.)

Two of the three papers are published by Rita Banzi and colleagues from the Italian Cochrane Centre.

In the first paper, reviewed here, Banzi et al [1] first identified English Web-based POCs using Medline, Google, librarian association websites, and information conference proceedings from January to December 2008. In order to be eligible, a product had to be an online-delivered summary that is regularly updated, claims to provide evidence-based information and is to be used at the bedside.

They found 30 eligible POCs, of which the following 18 databases met the criteria: 5-Minute Clinical Consult, ACP-Pier, BestBETs, CKS (NHS), Clinical Evidence, DynaMed, eMedicine,  eTG complete, EBM Guidelines, First Consult, GP Notebook, Harrison’s Practice, Health Gate, Map Of Medicine, Micromedex, Pepid, UpToDate, ZynxEvidence.

They assessed and ranked these 18 point-of-care products according to: (1) coverage (volume) of medical conditions, (2) editorial quality, and (3) evidence-based methodology. (For operational definitions see appendix 1)

From a quantitative perspective, DynaMed, eMedicine, and First Consult were the most comprehensive (88%) and eTG complete the least (45%).

The best editorial quality of EBP was delivered by Clinical Evidence (15), UpToDate (15), eMedicine (13), Dynamed (11) and eTG complete (10). (Scores are shown in brackets)

Finally, BestBETs, Clinical Evidence, EBM Guidelines and UpToDate obtained the maximal score (15 points each) for best evidence-based methodology, followed by DynaMed and Map Of Medicine (12 points each).
As expected eMedicine, eTG complete, First Consult, GP Notebook and Harrison’s Practice had a very low EBM score (1 point each). Personally I would not have even considered these online sources as “evidence based”.

The calculations seem very “exact”, but assumptions upon which these figures are based are open to question in my view. Furthermore all items have the same weight. Isn’t the evidence-based methodology far more important than “comprehensiveness” and editorial quality?

Certainly because “volume” is “just” estimated by analyzing to what extent 4 random chapters of the ICD-10 classification are covered by the POCs. Some sources, like Clinical Evidence and BestBETs (scoring low on this item), don’t aim to be comprehensive but only “answer” a limited number of questions: they are not textbooks.

Editorial quality is determined by scoring of the specific indicators of transparency: authorship, peer reviewing procedure, updating, disclosure of authors’ conflicts of interest, and commercial support of content development.

For the EB methodology, Banzi et al scored the following indicators:

  1. Is a systematic literature search or surveillance the basis of content development?
  2. Is the critical appraisal method fully described?
  3. Are systematic reviews preferred over other types of publication?
  4. Is there a system for grading the quality of evidence?
  5. When expert opinion is included, is it easily recognizable and distinguishable from studies’ data and results?

The score for each of these indicators is 3 for “yes”, 1 for “unclear”, and 0 for “no” (if judged “not adequate” or “not reported”).
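
Just to make the arithmetic of this rubric concrete, here is a minimal sketch of how the five indicators map onto the 0-15 EBM-methodology score. This is purely my own illustration; the example answers are hypothetical, not scores taken from the paper:

```python
# Minimal sketch of the Banzi et al EBM-methodology rubric (my own illustration;
# the example answers below are hypothetical, not scores from the paper).

SCORES = {"yes": 3, "unclear": 1, "no": 0}   # "not adequate"/"not reported" count as "no"

INDICATORS = [
    "systematic literature search or surveillance",
    "critical appraisal method fully described",
    "systematic reviews preferred over other publication types",
    "grading system for the quality of evidence",
    "expert opinion clearly recognizable",
]

def ebm_methodology_score(answers):
    """answers: one of 'yes'/'unclear'/'no' per indicator; returns a 0-15 score."""
    assert len(answers) == len(INDICATORS)
    return sum(SCORES[a] for a in answers)

# Hypothetical product: four clear "yes" answers and one "unclear" -> 13 points
print(ebm_methodology_score(["yes", "yes", "yes", "yes", "unclear"]))
```

Note how coarse the scale is: a single indicator judged “no” instead of “yes” immediately costs 3 of the 15 points, which is exactly what happened to DynaMed below.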

This leaves little room for qualitative differences and mainly relies upon adequate reporting. As discussed earlier in a post where I questioned the evidence-based-ness of UpToDate, there is a difference between tailored searches and checking a limited list of sources (indicator 1.). It also matters whether the search is mentioned or not (transparency), whether it is qualitatively ok and whether it is extensive or not. For lists, it matters how many sources are “surveyed”. It also matters whether one or both methods are used… These important differences are not reflected by the scores.

Furthermore some points may be more important than others. Personally I find step 1 the most important. For what good is appraising and grading if it isn’t applied to the most relevant evidence? It is “easy” to do a grading or to copy it from other sources (yes, I wouldn’t be surprised if some POCs are doing this).

On the other hand, a zero for one indicator can have too much weight on the score.

DynaMed got 12 instead of the maximum 15 points because their editorial policy page didn’t explicitly describe their absolute prioritization of systematic reviews, although they really adhere to that in practice (see the comment by editor-in-chief Brian Alper [2]). Had DynaMed received the deserved 3 points for this indicator, it would have had the highest overall score.

The authors further conclude that none of the dimensions turned out to be significantly associated with the other dimensions. For example, BestBETs scored among the worst on volume (comprehensiveness), with an intermediate score for editorial quality, and the highest score for evidence-based methodology.  Overall, DynaMed, EBM Guidelines, and UpToDate scored in the top quartile for 2 out of 3 variables and in the 2nd quartile for the 3rd of these variables. (but as explained above Dynamed really scored in the top quartile for all 3 variables)

On basis of their findings Banzi et al conclude that only a few POCs satisfied the criteria, with none excelling in all.

The finding that Pepid, eMedicine, eTG complete, First Consult, GP Notebook, Harrison’s Practice and 5-Minute Clinical Consult only obtained 1 or 2 of the maximum 15 points for EBM methodology confirms my “intuitive grasp” that these sources really don’t deserve the label “evidence based”. Perhaps we should make a more strict distinction between “point of care” databases as a point where patients and practitioners interact, particularly referring to the context of the provider-patient dyad (definition by Banzi et al) and truly evidence based summaries. Only few of the tested databases would fit the latter definition. 

In summary, Banzi et al reviewed 18 Online Evidence-based Practice Point-of-Care Information Summary Providers. They comprehensively evaluated and summarized these resources with respect to (1) coverage (volume) of medical conditions, (2) editorial quality, and (3) evidence-based methodology.

Limitations of the study, also according to the authors, were the lack of a clear definition of these products, the arbitrariness of the scoring system and the emphasis on the quality of reporting. Furthermore, the study didn’t really assess the products qualitatively (i.e. with respect to performance), nor did it take into account that products might have different aims. Clinical Evidence, for instance, only summarizes evidence on the effectiveness of treatments for a limited number of diseases. Therefore it scores badly on volume while excelling on the other items.

Nevertheless, it is helpful that POCs are objectively compared, and the comparison may serve as a starting point for decisions about acquisition.

References (not in chronological order)

  1. Banzi, R., Liberati, A., Moschetti, I., Tagliabue, L., & Moja, L. (2010). A Review of Online Evidence-based Practice Point-of-Care Information Summary Providers Journal of Medical Internet Research, 12 (3) DOI: 10.2196/jmir.1288
  2. Alper, B. (2010). Review of Online Evidence-based Practice Point-of-Care Information Summary Providers: Response by the Publisher of DynaMed Journal of Medical Internet Research, 12 (3) DOI: 10.2196/jmir.1622
  3. Goodyear-Smith F, Kerse N, Warren J, & Arroll B (2008). Evaluation of e-textbooks. DynaMed, MD Consult and UpToDate. Australian family physician, 37 (10), 878-82 PMID: 19002313
  4. Ketchum, A., Saleh, A., & Jeong, K. (2011). Type of Evidence Behind Point-of-Care Clinical Information Products: A Bibliometric Analysis Journal of Medical Internet Research, 13 (1) DOI: 10.2196/jmir.1539
  5. McKibbon, K., & Fridsma, D. (2006). Effectiveness of Clinician-selected Electronic Information Resources for Answering Primary Care Physicians’ Information Needs Journal of the American Medical Informatics Association, 13 (6), 653-659 DOI: 10.1197/jamia.M2087
  6. How will we ever keep up with 75 Trials and 11 Systematic Reviews a Day? (laikaspoetnik.wordpress.com)
  7. 10 + 1 PubMed Tips for Residents (and their Instructors) (laikaspoetnik.wordpress.com)
  8. Time to weed the (EBM-)pyramids?! (laikaspoetnik.wordpress.com)
  9. Haynes RB. Of studies, syntheses, synopses, summaries, and systems: the “5S” evolution of information services for evidence-based healthcare decisions. Evid Based Med 2006 Dec;11(6):162-164. [PubMed]
  10. DiCenso A, Bayley L, Haynes RB. ACP Journal Club. Editorial: Accessing preappraised evidence: fine-tuning the 5S model into a 6S model. Ann Intern Med. 2009 Sep 15;151(6):JC3-2, JC3-3. PubMed PMID: 19755349 [free full text].
  11. How Evidence Based is UpToDate really? (laikaspoetnik.wordpress.com)
  12. Point_of_care_decision-making_tools_-_Overview (hlwiki.slais.ubc.ca)
  13. UpToDate or Dynamed? (Shamsha Damani at laikaspoetnik.wordpress.com)






Medical Black Humor, that is Neither Funny nor Appropriate.

19 09 2011

Last week, I happened to see this Facebook post of The Medical Registrar, where she offends a GP, Anne Marie Cunningham*, who wrote a critical post about black medical humor at her blog “Wishful Thinking in Medical Education”. I couldn’t resist placing a likewise “funny” comment in this hostile environment, where everyone seemed to agree (till then) and tried to beat each other in levels of wittiness (“most naive child like GP ever” – “literally the most boring blog I have ever read”, “someone hasn’t met many midwives in that ivory tower there.”, ~ insulting for a trout etc.):

“Makes no comment, other than anyone who uses terms like “humourless old trout” for a GP who raises a relevant point at her blog is an arrogant jerk and an unempathetic bastard, until proven otherwise…  No, seriously, from a patient’s viewpoint terms like “labia ward” are indeed derogatory and should be avoided on open social media platforms.”

I was angered, because it is so easy to attack someone personally instead of discussing the issues raised.

Perhaps you first want to read the post of Anne Marie yourself (and please pay attention to the comments too).

Social media, black humour and professionals…

Anne Marie mainly discusses her feelings after she came across a discussion between several male doctors on Twitter using slang like ‘labia ward’ and ‘birthing sheds’ for birth wards, “cabbage patch” for the intensive care unit and madwives for midwives (midwitches is another one). She discussed it with the doctors in question, but only one of them admitted he had perhaps misjudged sending the tweet. After consulting other professionals privately, she wrote a post on her blog without revealing the identity of the doctors involved. She also put it in a wider context by referring to the medical literature on professionalism and black humour, quoting Berk (and others):

“Simply put, derogatory and cynical humour as displayed by medical personnel are forms of verbal abuse, disrespect and the dehumanisation of their patients and themselves. Those individuals who are the most vulnerable and powerless in the clinical environment – students, patients and patients’ families – have become the targets of the abuse. Such humour is indefensible, whether the target is within hearing range or not; it cannot be justified as a socially acceptable release valve or as a coping mechanism for stress and exhaustion.”

The doctors involved did not make any effort to explain what motivated them. But two female anesthetic registrars frankly commented on Anne Marie’s post (one of them having coined the term “labia ward”, thereby disproving that this term is misogynistic per se). Both explain that using such slang terms isn’t about insulting anyone and that they are still professionals caring for patients:

 It is about coping, and still caring, without either going insane or crying at work (try to avoid that – wait until I’m at home). Because we can’t fall apart. We have to be able to come out of resus, where we’ve just been unable to save a baby from cotdeath, and cope with being shouted and sworn at be someone cross at being kept waiting to be seen about a cut finger. To our patients we must be cool, calm professionals. But to our friends, and colleagues, we will joke about things that others would recoil from in horror. Because it beats rocking backwards and forwards in the country.

[Just a detail, but “labia ward” is a simple play on words to convey that not all women in the “labor ward” are involved in labor. However, this too is a misnomer. Labia have little to do with severe pre-eclampsia, intra-uterine death or a late termination of pregnancy.]

To a certain extent medical slang is understandable, but it should stay behind the doors of the ward or at least not be said in a context that could offend colleagues and patients or their carers. And that is the entire issue. The discussion here was on Twitter, which is an open platform. Tweets are not private and can be read by other doctors, midwives, the NHS and patients. Or as e-Patient Dave expresses so eloquently:

I say, one is responsible for one’s public statements. Cussing to one’s buddies on a tram is not the same as cussing in a corner booth at the pub. If you want to use venting vocabulary in a circle, use email with CC’s, or a Google+ Circle.
One may claim – ONCE – ignorance, as in, “Oh, others could see that??” It must, I say, then be accompanied by an earnest “Oh crap!!” Beyond that, it’s as rude as cussing in a streetcorner crowd.

Furthermore, it seemed the tweet served no other goal than to be satirical, sardonic, sarcastic and subversive (words in the bio of the anesthetist concerned). And the sarcasm isn’t limited to these one or two tweets. Just the other day he was insulting a medical student, saying among other things: “I haven’t got anything against you. I don’t even know you. I can’t decide whether it’s paranoia, or narcissism, you have”.

We are not talking about restriction of “free speech” here. Doctors just have to think twice before they say something, anything, on Twitter and Facebook, especially when they present themselves as MDs. Not only because it can be offensive to colleagues and patients, but also because they have a role-model function for younger doctors and medical students.

Isolated tweets of one or two doctors using slang are not the biggest problem, in my opinion. What I found far more worrying was the arrogant and insulting comment on Facebook and the massive support it got from other doctors and medical students. Apparently there are many “I-like-to-exhibit-my-dark-humor-skills-and-don’t-give-a-shit-what-you-think” doctors on Facebook (and Twitter) and they have a large like-minded medical audience: the Medical Registrar page alone has 19,000 (!) “fans”.

Sadly there is a total lack of reflection and reason in many of the comments. What to think of:

“wow, really. The quasi-academic language and touchy-feely social social science bullshit aside, this woman makes very few points, valid or otherwise. Much like these pages, if you’re offended, fuck off and don’t follow them on Twitter, and cabbage patch to refer to ITU is probably one of the kinder phrases I’ve heard…”

and

“Oh my god. Didnt realise there were so many easily offended, left winging, fun sponging, life sucking, anti- fun, humourless people out there. Get a grip people. Are you telling me you never laughed at the revue’s at your medical schools?”

and

“It may be my view and my view alone but the people who complain about such exchanges, on the whole, tend to be the most insincere, narcissistic and odious little fuckers around with almost NO genuine empathy for the patient and the sole desire to make themselves look like the good guy rather than to serve anyone else.”

It seems these doctors and their fans don’t possess the communicative and empathic skills one would hope them to have.

One might object that it is *just* Facebook or that “#twitter is supposed to be fun, people!” (dr Fiona) 

I wouldn’t agree for 3 reasons:

  • Doctors are not teenagers anymore and need to act as grown-ups (or better: as professionals)
  • There is no reason to believe that people who make it their habit to offend others online behave very differently IRL
  • Seeing Twitter as “just for fun” is an underestimation of the real power of Twitter

Note: *It is purely coincidental that the previous post also involved Anne Marie.





RIP Statistician Paul Meier. Proponent not Father of the RCT.

14 08 2011

This headline in Boing Boing caught my eye today:  RIP Paul Meier, father of the randomized trial

Not surprisingly, I knew that Paul Meier (with Kaplan) introduced the Kaplan-Meier estimator (1958), a very important tool for measuring how many patients survive a medical treatment. But I didn’t know he was “father of the randomized trial”….
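
As an aside, for readers who don’t know the Kaplan-Meier estimator: it multiplies, at each observed event time, the fraction of patients still at risk who survive beyond that time, which neatly handles patients who are censored (lost to follow-up) along the way. A minimal sketch with made-up data, purely my own illustration of the idea:

```python
# Minimal Kaplan-Meier sketch (my own illustration; the data below are made up).
# S(t) = product over event times t_i <= t of (1 - d_i / n_i), where d_i is the
# number of deaths at t_i and n_i the number still at risk just before t_i.

def kaplan_meier(times, events):
    """times: follow-up time per patient; events: 1 = died, 0 = censored.
    Returns a list of (event time, estimated survival probability)."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival, curve = 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        removed = sum(1 for tt, _ in data if tt == t)
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, round(survival, 3)))
        n_at_risk -= removed
        i += removed
    return curve

print(kaplan_meier([3, 5, 5, 8, 10, 12], [1, 1, 0, 1, 0, 1]))
# [(3, 0.833), (5, 0.667), (8, 0.444), (12, 0.0)]
```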

But is he really? The “father of the randomized trial”, “probably best known for the introduction of randomized trials into the evaluation of medical treatments”, as Boing Boing states?

Boing Boing’s very short article is based on the New York Times article: Paul Meier, Statistician Who Revolutionized Medical Trials, Dies at 87. According to the NY Times “Dr. Meier was one of the first and most vocal proponents of what is called “randomization.” 

Randomization, the NY-Times explains, is:

Under the protocol, researchers randomly assign one group of patients to receive an experimental treatment and another to receive the standard treatment. In that way, the researchers try to avoid unintentionally skewing the results by choosing, for example, the healthier or younger patients to receive the new treatment.

(for a more detailed explanation see my previous posts The best study designs…. for dummies and #NotSoFunny #16 – Ridiculing RCTs & EBM)

Meier was a very successful proponent, that is for sure. According to Sir Richard Peto, Dr. Meier “perhaps more than any other U.S. statistician, was the one who influenced U.S. drug regulatory agencies, and hence clinical researchers throughout the U.S. and other countries, to insist on the central importance of randomized evidence.”

But an advocate need not be a father, for advocates are seldom the inventors/creators. A proponent is more of a nurse, a mentor or a … foster-parent.

Is Meier the true father/inventor of the RCT? And if not, who is?

Googling “Father of the randomized trial” won’t help, because all 1,610 hits point to Dr. Meier…. thanks to Boing Boing’s careless copying.

What I have read so far doesn’t point to one single creator. And the RCT wasn’t just suddenly there. It started with the comparison of treatments under controlled conditions. Back in 1753, the British naval surgeon James Lind published his famous account of 12 scurvy patients, “their cases as similar as I could get them”, noting that “the most sudden and visible good effects were perceived from the uses of the oranges and lemons” and that citrus fruit cured scurvy [3]. The French physician Pierre Louis and Harvard anatomist Oliver Wendell Holmes (19th century) were also fierce proponents of supporting conclusions about the effectiveness of treatments with statistics, not subjective impressions [4].

But what was the first real RCT?

Perhaps the first real RCT was the Nuremberg salt test (1835) [6]. This was possibly not only the first RCT, but also the first scientific demonstration of the lack of effect of a homeopathic dilution. More than 50 visitors of a local tavern participated in the experiment. Half of them received a vial filled with distilled snow water, the other half a vial with ordinary salt in a homeopathic C30 dilution of distilled snow water. None of the participants knew whether he had got the “actual medicine” or not (blinding). The numbered vials were coded and the code was broken after the experiment (allocation concealment).
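
The design elements mentioned here, random allocation, identical-looking vials (blinding) and a numbered code list that is only broken afterwards (allocation concealment), translate almost directly into a few lines of code. A minimal sketch, purely illustrative and not based on the historical protocol details:

```python
# Minimal sketch of randomization with blinding and allocation concealment,
# loosely inspired by the Nuremberg salt test (purely illustrative).
import random

def allocate(n_participants, seed=1835):
    """Randomly assign numbered vials to 'salt C30' or 'pure snow water', half/half."""
    rng = random.Random(seed)
    half = n_participants // 2
    contents = ["salt C30"] * half + ["pure snow water"] * (n_participants - half)
    rng.shuffle(contents)
    # The code list links vial numbers to contents. It stays sealed until all
    # symptom reports are in (allocation concealment); participants only ever
    # see a vial number (blinding).
    return {vial_no: content for vial_no, content in enumerate(contents, start=1)}

code_list = allocate(50)
# After the experiment the code is broken and outcomes are compared per group:
print(code_list[1], "|", code_list[2])
```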

The first publications of RCTs were in the fields of psychology and agriculture. As a matter of fact, another famous statistician, Ronald A. Fisher (of Fisher’s exact test), seems to have played a more important role in the genesis and popularization of RCTs than Meier, albeit in agricultural research [5,7]. The book “The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century” describes how Fisher devised a randomized trial on the spot to test the contention of a lady that she could taste the difference between tea into which milk had been poured and tea that had been poured into milk (almost according to homeopathic principles) [7].

According to Wikipedia [5], the first published (medical) RCT appeared in the 1948 paper entitled “Streptomycin treatment of pulmonary tuberculosis”. One of the authors, Austin Bradford Hill, is (also) credited with having conceived the modern RCT.

Thus the road to the modern RCT is long, starting with the notions that experiments should be done under controlled conditions and that it doesn’t make sense to base treatment on intuition. Later, experiments were designed in which treatments were compared to placebo (or other treatments) in a randomized and blinded fashion, with concealment of allocation.

Paul Meier was not the inventor of the RCT, but a successful vocal proponent of the RCT. That in itself is commendable enough.

And although the Boing Boing article was incorrect, and many people googling for “father of the RCT” will find the wrong answer from now on, it did raise my interest in the history of the RCT and the role of statisticians in the development of science and clinical trials.
I plan to read a few of the articles and books mentioned below, like the relatively lighthearted “The Lady Tasting Tea” [7]. You can expect a book review once I have finished reading it.

Note added 15-05 13.45 pm:

Today a more accurate article appeared in the Boston Globe (“Paul Meier; revolutionized medical studies using math”), which does justice to the important role of Dr Meier in the espousal of randomization as an essential element in clinical trials. For that is what he did.

Quote:

Dr. Meier published a scathing paper in the journal Science, “Safety Testing of Poliomyelitis Vaccine,’’ in which he described deficiencies in the production of vaccines by several companies. His paper was seen as a forthright indictment of federal authorities, pharmaceutical manufacturers, and the National Foundation for Infantile Paralysis, which funded the research for a polio vaccine.

  1. RIP Paul Meier, father of the randomized trial (boingboing.net)
  2. Paul Meier, Statistician Who Revolutionized Medical Trials, Dies at 87 (nytimes.com)
  3. M L Meldrum A brief history of the randomized controlled trial. From oranges and lemons to the gold standard. Hematology/ Oncology Clinics of North America (2000) Volume: 14, Issue: 4, Pages: 745-760, vii PubMed: 10949771  or see http://www.mendeley.com
  4. Fye WB. The power of clinical trials and guidelines,and the challenge of conflicts of interest. J Am Coll Cardiol. 2003 Apr 16;41(8):1237-42. PubMed PMID: 12706915. Full text
  5. http://en.wikipedia.org/wiki/Randomized_controlled_trial
  6. Stolberg M (2006). Inventing the randomized double-blind trial: The Nuremberg salt test of 1835. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).
  7. The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century Peter Cummings, MD, MPH, Jama 2001;286(10):1238-1239. doi:10.1001/jama.286.10.1238  Book Review.
    Book by David Salsburg, 340 pp, with illus, $23.95, ISBN 0-7167-41006-7, New York, NY, WH Freeman, 2001.
  8. Kaptchuk TJ. Intentional ignorance: a history of blind assessment and placebo controls in medicine. Bull Hist Med. 1998 Fall;72(3):389-433. PubMed PMID: 9780448. abstract
  9. The best study design for dummies/ (https://laikaspoetnik.wordpress.com: 2008/08/25/)
  10. #Notsofunny: Ridiculing RCT’s and EBM (https://laikaspoetnik.wordpress.com: 2010/02/01/)
  11. RIP Paul Meier : Research Randomization Advocate (mystrongmedicine.com)
  12. If randomized clinical trials don’t show that your woo works, try anthropology! (scienceblogs.com)
  13. The revenge of “microfascism”: PoMo strikes medicine again (scienceblogs.com)




HOT TOPIC: Does Soy Relieve Hot Flashes?

20 06 2011

The theme of the upcoming Grand Rounds, held on June 21st (the 1st day of summer) at Shrink Rap, is “hot”. A bit far-fetched, but aah you know…. shrinks. Of course they hope (read: assume) that we will display Weiner-like exhibitionism on our blogs, or go into spicy details of hot sexpectations or other Penis Friday NCBI-ROFL posts. But no, not me, scientist and librarian to my bone marrow. I will stick to boring, solid science and do a thorough search to find the evidence. Here I will discuss whether soy really helps to relieve hot flashes (also called hot flushes).

…..As illustrated by this HOT picture, I should post as well…..

(CC from Katy Tresedder, Flickr):

Yes, many menopausal women plagued by hot flashes seek relief in soy or other phytoestrogens (estrogen-like chemicals derived from plants). I know, because I happen to have many menopausal women in my circle of friends who prefer taking soy over estrogen. They would rather not take normal hormone replacement therapy, because this can have adverse effects if taken for a longer time. Soy, on the other hand, is considered a “natural remedy” and harmless. Probably physiological doses of soy (food) are harmless and therefore a better choice than the similarly “natural” black cohosh, which is suspected to cause liver injury and other adverse effects.

But is soy effective?

I did a quick search in PubMed and found a Cochrane Systematic Review from 2007 that was recently edited with no change to the conclusions.

This review looked at several phytoestrogens that were offered in several forms: dietary soy (9x) (powder, cereals, drinks, muffins), soy extracts (9x), red clover extracts (7x, including Promensil (5x)), genistein extract, flaxseed, hop extract and a Chinese medicinal herb.

Thirty randomized controlled trials with a total of 2730 participants met the inclusion criteria: the participants were women in or just before their menopause complaining of vasomotor symptoms (thus having hot flashes) for at least 12 weeks. The intervention was a food or supplement with high levels of phytoestrogens (not any other herbal treatment) and this was compared with placebo, no treatment or hormone replacement therapy.

Only 5 trials using the red clover extract Promensil were homogeneous enough to combine in a meta-analysis. The effect on one outcome (incidence of hot flashes) is shown below. As can be seen at a glance, Promensil had no significant effect, whether given in a low (40 mg/day) or a higher (80 mg/day) dose. This was also true for the other outcomes.

The other phytoestrogen interventions were very heterogeneous with respect to dose, composition and type. This was especially true for the dietary soy treatment. Although some of the trials showed a positive effect of phytoestrogens on hot flashes and night sweats, overall, phytoestrogens were no better than the comparisons.

Most trials were small,  of short duration and/or of poor quality. Fewer than half of the studies (n=12) indicated that allocation had been concealed from the trial investigators.

One striking finding was that there was a strong placebo effect in most trials, with a reduction in the frequency of hot flashes ranging from 1% to 59%.

I also found another systematic review in PubMed, by Bolaños R et al, that limited itself to soy. Other differences from the Cochrane systematic review (besides the much simpler search 😉 ) were: inclusion of more recently published clinical trials, no inclusion of unpublished studies and less strict exclusion on the basis of low methodological quality. Furthermore, genistein was (rightly) considered a soy product.

The group of studies that used soy dietary supplements showed the highest heterogeneity. Overall, the results “showed a significant tendency(?)” in favor of soy. Nevertheless, the authors conclude (like the Cochrane authors) that it is still difficult to establish conclusive results given the high heterogeneity found in the studies (but apparently the data could still be pooled?).

References

  • Lethaby A, Marjoribanks J, Kronenberg F, Roberts H, Eden J, & Brown J. (2007). Phytoestrogens for vasomotor menopausal symptoms Cochrane Database of Systematic Reviews (4) : 10.1002/14651858.CD001395.pub3.
  • Bolaños R, Del Castillo A, & Francia J (2010). Soy isoflavones versus placebo in the treatment of climacteric vasomotor symptoms: systematic review and meta-analysis. Menopause (New York, N.Y.), 17 (3), 660-6 PMID: 20464785




The Second #TwitJC Twitter Journal Club

14 06 2011

In the previous post I wrote about a new initiative on Twitter, the Twitter Journal Club (hashtag #TwitJC), and shared some constructive criticism. The Twitter Journal Club is clearly an original and admirable initiative that has gained a lot of interest, but there is some room for improvement.

I raised two issues: 1. discussions with 100 people are not easy to follow on Twitter, and 2. walking through a checklist for critical appraisal is not the most interesting thing to do (particularly because it had already been done).

But as one of the organizers explained, the first session was just meant for promoting #twitjc. Instead of the expected 6 people, 100 tweople showed up.

In the second session, last Sunday evening, the organizers followed a different structure.

Thus, I thought it would only be fair, to share my experiences with the second session as well. This time I managed to follow it from start to finish.

Don’t worry, discussing the journal club won’t become a regular item. I will leave the organization up to the organizers. The sessions might inspire me, though, to write a blog post on the topic now and then. But that can only work synergistically (at least for me, because it forces me to rethink it all).

This time the discussion was about Rose’s Prevention Paradox (PDF), a 30-year-old paper that is still relevant. The paper is more of an opinion piece, so the discussion focused on the implications of the prevention paradox theory. It was really helpful that Fi wrote an introduction to the paper and points for discussion beforehand. There were 5 questions (and many sub-questions).

I still found it very hard to follow it all at Twitter, as illustrated by the following tweet:

  • laikas I think I lost track. Which question are we? #twitjc Sun Jun 12 20:07:03
  • laikas @MsPhelps ik werd wel helemaal duizelig van al die tweets. Er zijn toch wel veel mensen die steeds een andere vraag stellen voor de 1e is beantwoord -9:47 PM Jun 12th, 2011 (translation: I got quite dizzy from all those tweets; quite a lot of people keep posing a new question before the first one has been answered)

I followed the tweets at http://tweetchat.com/room/twitjc. Imagine tweets rolling by while you try to pick out the tweets you want to respond to (either because they are very relevant, or because you disagree). By the time you have finished your own tweet, 20 possibly very interesting tweets have already passed by, including the next question by the organizers (unfortunately they didn’t use the official @twitjournalclub account for this).

Well, I suppose I am not very good at this. Partly because I’m Dutch (thus it takes longer to compose my tweets), partly because I’m not a fast thinker. I’m better at thorough analyses, at my blog for instance.

But this is Twitter. To quote Johan Cruyff, a legendary soccer player from Holland: “Every disadvantage has its advantage”.

Twitter may not favor organized discussions, but on the other hand it is very engaging, thought-provoking and easily accessible. Where else do you meet 100 experts/doctors willing to exchange thoughts about an interesting medical topic?

The tweets below are in line with/reflect my opinion on this second Twitter Journal Club (RT means retweeting/repeating the tweet):

  • laikas RT @themattmak@fidouglas @silv24 Congratulations again on a great #twitjc. Definitely more controversial and debate generating than last week’s! -9:18 PM Jun 12th, 2011
  • laikas @silv24 well i think it went well (it is probably me, I’m 2 slow). This paper is broad, evokes much discussion & many examples can B given -9:45 PM Jun 12th, 2011
  • DrDLittle Less structure to #twitJC last night but much wider debate 7:41 AM Jun 13th, 2011
  • amitns @DrDLittle It’s obviously a very complex topic, more structure would have stifled the debate. A lot of food for thought.#twitJC -7:45 AM Jun 13th, 2011

Again, the Twitter Journal Club gained a lot of interest. Scientists and teachers are considering borrowing the concept. Astronomers are already preparing their first meeting on Thursday… And Nature seems to be on top of it as well: it will interview the organizers of both the medical and the astronomy journal clubs.

Emergency physician Tom Young, with experience in critical appraisal, just summarized it nicely (still hot off the press):

The two meetings of the journal club so far have not focussed in on this particular system; the first used a standard appraisal tool for randomised controlled trials, the second was more laissez-faire in its approach. This particular journal club is finding its feet in a new setting (that of Twitter) and will find its strongest format through trial and error. indeed, to try to manage such a phenomenon might be likened to ‘herding cats’ that often used description of trying to manage doctors, and I think, we would all agree would be highly inadvisable. Indeed, one of its strengths is that participants, or followers, will take from it what they wish, and this will be something, rather than nothing, whatever paper is discussed, even if it is only contact with another Tweeter, with similar or divergent views. 

Indeed, what I gained from these two meetings is that I met various nice and interesting people (including the organizers, @fidouglas and @silv24). Furthermore, I enjoyed the discussions and picked up some ideas and examples that I otherwise wouldn’t have known about. The last online meeting sparked my interest in the prevention paradox. Before the meeting, I had only read the paper at a glance. After the session I decided to read it again, in more detail. As a matter of fact I feel inspired to write a blog post about this theory. Originally I planned to write a summary here, but the post is probably getting too long. Thus I will await the summary by the organizers and see if I have time to discuss it as well.






The #TwitJC Twitter Journal Club, a New Initiative on Twitter. Some Initial Thoughts.

10 06 2011

There is a new initiative on Twitter: the Twitter Journal Club. It was initiated by Fi Douglas (@fidouglas), a medical student at Cambridge, and Natalie Silvey (@silv24), a junior doctor in the West Midlands.

Fi and Natalie have set up a blog for this event: http://twitjc.wordpress.com/

A Twitter Journal Club operates in the same way as any other journal club, except that the forum is Twitter.

The organizers choose a paper, which they announce at their website (you can make suggestions here or via a tweet). Ideally, people should read the entire paper before the Twitter session. A short summary with key points (i.e. see here) is posted on the website.

The first topic was:  Early Goal-Directed Therapy in the Treatment of Severe Sepsis and Septic Shock [PDF]

It started last Sunday at 8 pm (Dutch time) and took almost 2 hours to complete.

@twitjournalclub (the Twitter account of the organizers) started with a short introduction. People introduced themselves as they entered the discussion. Each tweet in the discussion was tagged with #TwitJC (a so-called hashtag), because otherwise it wouldn’t get picked up by people following the hashtag. (Tweetchat automatically adds the hashtag you type in.)

Although it was the first session, many people (perhaps almost 100?!) joined the Journal Club, both actively and more passively. That is a terrific achievement. Afterwards it got a very positive Twitter “press”. If you manage to engage people like @nothern_doctor, @doctorblogs, @amcunningham and @drgrumble, and people like @bengoldacre, @cebmblog and @david_colquhoun find it a terrific concept, then you know it is a great idea that meets a need. As such, reason enough to continue.

There were also voices that were not purely positive. @DrVes sees it as a great effort, but added that “we need to go beyond this 1950s model rather than adapt it to social media.” Apparently this tweet was not well received, but I think he made a very sensible statement.

We can (and should) ask ourselves whether Twitter is the right medium for such an event.

@DrVes has experience with Twitter Journal Clubs. He participated in the first medical journal club on Twitter at the Allergy and Immunology program of Creighton University back in 2008 and presented a poster at an allergy meeting in 2009.

BUT, as far as I can tell, that Twitter Journal Club was both much smaller in scale (7 fellows?) and different in design. It seems that tweets summarized what was being said at a real journal club teaching session. Ves Dimov:

“The updates were followed in real time by the Allergy and Immunology fellows at the Louisiana State University (Shreveport) and some interested residents at Cleveland Clinic, along with the 309 subscribers of my Twitter account named AllergyNotes“.

So that is the same as tweeting during a conference or a lecture to inform others about the most interesting facts/statements. It is one-way-tweeting (overall there were just 24 updates with links).

I think the present  Twitter Journal Club was more like a medical Twitter chat (also the words of Ves).

Is chatting on Twitter effective?

Well that depends on what one wants to achieve.

Apparently for all people participating, it was fun to do and educative.

I joined too late to tell, thus I awaited the transcript. But boy, who wants to read 31 pages of “chaotic tweets”? Because that is what a Twitter chat is if many people join.  All tweets are ordered chronologically. Good for the archive, but if the intention is to make the transcribed chat available to people who couldn’t attend, it needs deleting, cutting, pasting and sorting. But that is a lot of work if done manually.

I tried it for part of the transcript. Compare the original transcript here with this Google Doc.

The “remix of tweets” also illustrates that people have their own “mini-chats”, and “off-topic” (but often very relevant) questions.

In addition, the audience is very mixed. Some people seem to have little experience with critical appraisal or concepts like “intention to treat” (ITT) and would perhaps benefit from supplementary information beforehand (i.e. documents at the TwitJC website). Others are experienced doctors with a lot of clinical expertise, who always put theoretical things in perspective. Very valuable, but often they are far ahead in the discussion.

The name of the event is Twitter  Journal Club. Journal Club is a somewhat ambiguous term. According to Wikipedia “A journal club is a group of individuals who meet regularly to critically evaluate recent articles in scientific literature”. It can deal with any piece which looks interesting to share, including hypotheses and preclinical papers about mechanisms of actions.

Thus, to me Journal club is not per definition EBM (Evidence Based Medicine).

Other initiatives are a critical appraisal of a single study and a CAT, a critical appraisal of a topic (sometimes wrongly called a PICO; the PICO is only part of it).

The structure of the present journal club was more that of a critical appraisal. It followed the normal checklist for an RCT: What is being studied? Is the paper valid (appropriately allocated, blinded, etc.)? What are the results (NNT etc.)? And are the results valid outside the context of the paper?
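
For readers who are new to the “results” step: the number needed to treat (NNT) is simply the reciprocal of the absolute risk reduction. A minimal sketch with made-up event rates (deliberately not the figures of the sepsis trial under discussion):

```python
# Number needed to treat from two event rates (made-up numbers, not the
# actual figures of the trial discussed in the journal club).

def nnt(control_event_rate, treatment_event_rate):
    """NNT = 1 / absolute risk reduction (ARR)."""
    arr = control_event_rate - treatment_event_rate
    return 1 / arr

# Hypothetical example: mortality 40% with standard care vs 30% with the intervention
print(round(nnt(0.40, 0.30)))   # ARR = 0.10, so ~10 patients must be treated to prevent 1 death
```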

In my opinion, formal critical appraisal of the paper costs a lot of time and is not the most interesting part. Looking at my edited transcript, you see that half of the people are answering the question and they all say the same: “clearly focused question” is the answer to the first question (but even in the edited transcript this takes 3 pages), and “clear interventions (helpful flowcharts)” is the answer to the second question.

Half of the people have their own questions. Very legitimate and good questions, but not in line with the questions of @twitjournalclub. Talking about the NNT and about whether the results are really revolutionary, is VERY relevant, but should be left till the end.

A Twitter chat with approximately 100 people needs a tight structure.

However, I wonder whether this approach to critical appraisal is the most interesting, all the more so because this part didn’t evoke much discussion.

Plus it has already been done!!

I searched the TRIP database with the title of the paper to find critical appraisals or synopses of the paper. I found 3 synopses, 2 of which follow more or less the structure of this journal club: here and here (and this older one). They answer all the questions about validity.

Wouldn’t it have been better, with this older key paper (2001), to just use the existing critical appraisals as background information and discuss the implications? Or to discuss new supporting or contradictory findings?

The very limited search in TRIP (title of paper only) showed some new interesting papers on the topic (external validation, cost effectiveness, implementation, antibiotics) and I am sure there are many more.

A CAT may also be more interesting than a synopsis, because “other pieces of evidence” are also taken into consideration and one discusses a topic, not a single paper. But perhaps this is too difficult to do, because one also has to do a thorough search and there is too much to discuss. Alternatively, one could choose a recent systematic review, which summarizes the existing RCTs.

Anyway, I think the journal club could improve by not following the entire checklist (boring! done!), but using it as background. Furthermore, I think there should be 3-5 questions that are very relevant to discuss. As in the #HSCMEU discussions, people could pose those questions beforehand. In this way it is easier to adhere to the structure.

As to Twitter as the medium for this journal club: I am not fond of long Twitter chats, because they tend to be chaotic, there is a lot of reiteration, people tend to tweet rather than “listen”, and there is the constraint of 140 characters. Personally I would prefer a webinar, where people discuss the topic and you can pose questions via Twitter or otherwise.
Other alternatives wouldn’t work for me either. A Facebook journal club (described by Neil Mehta) looks more static (commenting on a short summary of a paper), and Skyping is difficult with more than 10 people and not easy to transcribe.

But as said, there is a lot of enthusiasm for this Twitter Journal Club, even outside the medical world. This “convincing effort” inspired others to start an Astronomy Twitter Journal Club.

Perhaps a little modification of goals and structure could make it even more interesting. I will try to attend the next event, which is about Geoffrey Rose’s ‘Prevention Paradox’ paper, officially titled ”Strategy of prevention: lessons from cardiovascular disease”, available here.

Notes added:

[1] A summary of the first Twitter journal club has just been posted. This is really valuable and takes away the disadvantages of reading an entire transcript (but one misses a lot of interesting aspects too)!

[2] This is the immediate response of one of the organizers on Twitter. I’m very pleased to notice that they will put more emphasis on the implications of the papers discussed. That would take away much of my criticism.

(Read tweets from bottom to top).

References

  1. Welcome (twitjc.wordpress.com)
  2. An important topic for the first Twitter Journal Club (twitjc.wordpress.com)
  3. Rivers E, Nguyen B, Havstad S, Ressler J, Muzzin A, Knoblich B, Peterson E, Tomlanovich M; Early Goal-Directed Therapy Collaborative Group. Early goal-directed therapy in the treatment of severe sepsis and septic shock. N Engl J Med. 2001 Nov 8;345(19):1368-77. PubMed PMID: 11794169. (PDF).
  4. The First Journal Club on Twitter – Then and Now (casesblog.blogspot.com)
  5. Allergy and Immunologyclub on Twitter (allergynotes.blogspot.com)
  6. The Utility of a Real-time Microblogging Service for Journal Club in Allergy and Immunology. Dimov, V.; Randhawa, S.; Auron, M.; Casale, T. American College of Allergy, Asthma & Immunology (ACAAI) 2009 Annual Meeting. Ann Allergy Asthma Immunol., Vol 103:5, Suppl. 3, A126, Nov 2009.
  7. https://docs.google.com/document/pub?id=1qzk1WzjNO5fbWd0PAax6cIDdUGGg1sDn86FPT1li-sQ (short remix of the transcript)
  8. Model for a Journal Club using Google Reader and Facebook OR if the prophet does not go to the Mountain…. bring the journal club to FB! (blogedutech.blogspot.com)
  9. Astronomy Twitter Journal Club/ (sarahaskew.net)
  10. A summary of week one: Rivers et al (twitjc.wordpress.com)




To Retract or Not to Retract… That’s the Question

7 06 2011

In the previous post I discussed [1] that editors of Science asked for the retraction of a paper linking XMRV retrovirus to ME/CFS.

The decision of the editors was based on the failure of at least 10 other studies to confirm these findings and on growing support that the results were caused by contamination. When the authors refused to retract their paper, Science issued an Expression of Concern [2].

In my opinion retraction is premature. Science should at least await the results of two multi-center studies, that were designed to confirm or disprove the results. These studies will continue anyway… The budget is already allocated.

Furthermore, I can’t suppress the idea that Science asked for a retraction to exonerate themselves for the bad peer review (the paper had serious flaws) and their eagerness to swiftly publish the possibly groundbreaking study.

And what about the other studies linking the XMRV to ME/CFS or other diseases: will these also be retracted?
And what happens in the improbable case that the multi-center studies confirm the 2009 paper? Would Science republish the retracted paper?

Thus in my opinion, it is up to other scientists to confirm or disprove findings published. Remember that falsifiability was Karl Popper’s basic scientific principle. My conclusion was that “fraud is a reason to retract a paper and doubt is not”. 

This is my opinion, but is this opinion shared by others?

When should editors retract a paper? Is fraud the only reason? When should editors issue a letter of concern? Are there guidelines?

Let me first say that even editors don’t agree. Schekman, the editor-in-chief of PNAS, has no direct plans to retract another paper reporting XMRV-like viruses in CFS [3].

Schekman considers it “an unusual situation to retract a paper even if the original findings in a paper don’t hold up: it’s part of the scientific process for different groups to publish findings, for other groups to try to replicate them, and for researchers to debate conflicting results.”

Back at the Virology Blog [4] there was also a vivid discussion about the matter. Prof. Vincent Racaniello gave the following answer in response to a question from a reader:

I don’t have any hard numbers on how often journals ask scientists to retract a paper, only my sense that it is very rare. Author retractions are more frequent, but I’m only aware of a handful of those in a year. I can recall a few other cases in which the authors were asked to retract a paper, but in those cases scientific fraud was involved. That’s not the case here. I don’t believe there is a standard policy that enumerates how such decisions are made; if they exist they are not public.

However, there is a Guideline for editors, the Guidance from the Committee on Publication Ethics (COPE) (PDF) [5]

Ivan Oransky, of the great blog Retraction Watch, linked to it when we discussed reasons for retraction.

With regard to retraction the COPE-guidelines state that journal editors should consider retracting a publication if:

  1. they have clear evidence that the findings are unreliable, either as a result of misconduct (e.g. data fabrication) or honest error (e.g. miscalculation or experimental error)
  2. the findings have previously been published elsewhere without proper crossreferencing, permission or justification (i.e. cases of redundant publication)
  3. it constitutes plagiarism
  4. it reports unethical research

According to the same guidelines journal editors should consider issuing an expression of concern if:

  1. they receive inconclusive evidence of research or publication misconduct by the authors 
  2. there is evidence that the findings are unreliable but the authors’ institution will not investigate the case 
  3. they believe that an investigation into alleged misconduct related to the publication either has not been, or would not be, fair and impartial or conclusive 
  4. an investigation is underway but a judgement will not be available for a considerable time

Thus in the case of the Science XMRV/CFS paper an expression of concern certainly applies (all 4 points) and one might even consider a retraction, because the results seem unreliable (point 1). But it is not 100% established that the findings are false. There is only serious doubt……

The guidelines seem to leave room for separate decisions. To retract a paper in case of plain fraud is not under discussion. But when is an error sufficiently established ànd important to warrant retraction?

Apparently retractions are on the rise. Although still rare (0.02% of all publications by the late 2000s), there has been a tenfold increase in retractions compared to the early 1980s (see the review at Scholarly Kitchen [6] of two papers: [7] and [8]). However, it is unclear whether increasing rates of retraction reflect more fraudulent or erroneous papers or greater diligence. The first paper [7] also highlights that, out of fear of litigation, editors are generally hesitant to retract an article without the author’s permission.

At the blog Nerd Alert they give a nice overview [9] (based on Retraction Watch, but then summarized in one post 😉 ). They clarify that papers are retracted for “less dastardly reasons than those cases that hit the national headlines and involve purposeful falsification of data”, such as the fraudulent papers of Andrew Wakefield (autism caused by vaccination). Besides the mistaken publication of the same paper twice, data over-interpretation, plagiarism and the like, the reason can also be more trivial: ordering the wrong mice or using an incorrectly labeled bottle.

Still, scientist don’t unanimously agree that such errors should lead to retraction.

Drug Monkey blogs about his discussion [10] with @ivanoransky over a recent post at Retraction Watch, which asks whether a failure to replicate a result justifies a retraction [11]. Oransky presents a case where a researcher (B) couldn’t reproduce the findings of another lab (A) and demonstrated mutations in the published protein sequence that excluded the mechanism proposed in A’s paper. This wasn’t retracted, possibly because B didn’t follow the published experimental protocols of A in all details (reminds me of the XMRV controversy).

Drugmonkey says (quote):  (cross-posted at Scientopia here — hmmpf isn’t that an example of redundant publication?)

“I don’t give a fig what any journals might wish to enact as a policy to overcompensate for their failures of the past.
In my view, a correction suffices” (provided that search engines like Google and PubMed make clear that the paper was in fact corrected).

Drug Monkey has a point there. A clear watermark should suffice.

However, we should note that most papers are retracted by authors, not by the editors/journals, and that the majority of “retracted papers” remain available. Just 13.2% are deleted from the journal’s website, and 31.8% are not clearly labelled as such.

Summary of how the naïve reader is alerted to paper retraction (from Table 2 in [7], see: Scholarly Kitchen [6])

  • Watermark on PDF (41.1%)
  • Journal website (33.4%)
  • Not noted anywhere (31.8%)
  • Note appended to PDF (17.3%)
  • PDF deleted from website (13.2%)

My conclusion?

Of course fraudulent papers should be retracted. Also papers with obvious errors that invalidate the conclusions.

However, we should be extremely hesitant to retract papers that can’t be reproduced, if there is no undisputed evidence of error.

Otherwise we would have to retract almost all published papers at one point or another. Because if Professor Ioannidis is right (and he probably is), “much of what medical researchers conclude in their studies is misleading, exaggerated, or flat-out wrong” (see my previous post [12], “Lies, Damned Lies, and Medical Science” [13] and Ioannidis’ crushing article “Why most published research findings are false” [14]).

All retracted papers (and papers with major deficiencies and shortcomings) should be clearly labeled as such (as Drugmonkey proposed, not only at the PDF and at the Journal website, but also by search engines and biomedical databases).

Or let’s hope, with Biochembelle [15], that the future of scientific publishing will make retractions for technical issues obsolete (whether in the form of nano-publications [16] or otherwise):

One day the scientific community will trade the static print-type approach of publishing for a dynamic, adaptive model of communication. Imagine a manuscript as a living document, one perhaps where all raw data would be available, others could post their attempts to reproduce data, authors could integrate corrections or addenda….

NOTE: Retraction Watch (@ivanoransky) and @laikas have voted in @drugmonkeyblog‘s poll about what a retracted paper means [here]. Have you?

References

  1. Science Asks to Retract the XMRV-CFS Paper, it Should Never Have Accepted in the First Place. (laikaspoetnik.wordpress.com 2011-06-02)
  2. Alberts B. Editorial Expression of Concern. Science. 2011-05-31.
  3. Given Doubt Cast on CFS-XMRV Link, What About Related Research? (blogs.wsj.com)
  4. XMRV is a recombinant virus from mice  (Virology Blog : 2011/05/31)
  5. Retractions: Guidance from the Committee on Publication Ethics (COPE) Elizabeth Wager, Virginia Barbour, Steven Yentis, Sabine Kleinert on behalf of COPE Council:
    http://www.publicationethics.org/files/u661/Retractions_COPE_gline_final_3_Sept_09__2_.pdf
  6. Retract This Paper! Trends in Retractions Don’t Reveal Clear Causes for Retractions (scholarlykitchen.sspnet.org)
  7. Wager E, Williams P. Why and how do journals retract articles? An analysis of Medline retractions 1988-2008. J Med Ethics. 2011 Apr 12. [Epub ahead of print] 
  8. Steen RG. Retractions in the scientific literature: is the incidence of research fraud increasing? J Med Ethics. 2011 Apr;37(4):249-53. Epub 2010 Dec 24.
  9. Don’t touch that blot. (nerd-alert.net/blog/weeklies/ : 2011/02/25)
  10. What_does_a_retracted_paper_mean? (scienceblogs.com/drugmonkey: 2011/06/03)
  11. So when is a retraction warranted? The long and winding road to publishing a failure to replicate (retractionwatch.wordpress.com : 2011/06/03/)
  12. Much Ado About ADHD-Research: Is there a Misrepresentation of ADHD in Scientific Journals? (laikaspoetnik.wordpress.com 2011-06-02)
  13. “Lies, Damned Lies, and Medical Science” (theatlantic.com :2010/11/)
  14. Ioannidis, J. (2005). Why Most Published Research Findings Are False. PLoS Medicine, 2 (8) DOI: 10.1371/journal.pmed.0020124
  15. Retractions: What are they good for? (biochembelle.wordpress.com : 2011/06/04/)
  16. Will Nano-Publications & Triplets Replace The Classic Journal Articles? (laikaspoetnik.wordpress.com 2011-06-02)

NEW* (Added 2011-06-08):

Science Asks to Retract the XMRV-CFS Paper, it Should Never Have Accepted in the First Place.

2 06 2011

Wow! Breaking!

As reported in WSJ earlier this week [1], editors of the journal Science asked Mikovits and her co-authors to voluntarily retract their 2009 Science paper [2].

In this paper Mikovits and colleagues of the Whittemore Peterson Institute (WPI) and the Cleveland Clinic reported the presence of xenotropic murine leukemia virus-related virus (XMRV) in peripheral blood mononuclear cells (PBMC) of patients with chronic fatigue syndrome (CFS). They used the highly contamination-prone nested PCR to detect XMRV. This two-round PCR enables detection of a rare target sequence by producing an unimaginably huge number of copies of that sequence.
XMRV was first demonstrated in cell lines and tissue samples of prostate cancer patients.
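
A back-of-the-envelope sketch (my own illustration, assuming ideal doubling per cycle; real efficiencies are lower and the cycle numbers are hypothetical) of why two rounds of PCR are so sensitive, and thus so contamination-prone:

    # Theoretical fold-amplification of nested (two-round) PCR,
    # assuming perfect doubling each cycle -- illustrative numbers only.
    def amplification(cycles, efficiency=1.0):
        """Fold-amplification after `cycles` cycles at the given per-cycle efficiency."""
        return (1 + efficiency) ** cycles

    round1 = amplification(30)   # outer primers, e.g. 30 cycles
    round2 = amplification(30)   # inner (nested) primers on the round-1 product
    print(f"One round:  ~{round1:.1e}-fold")
    print(f"Two rounds: ~{round1 * round2:.1e}-fold")
    # A single contaminating template molecule can therefore produce a strong band,
    # which is why rigorous negative controls are essential.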

All the original authors, except for one [3], refused to retract the paper [4]. This prompted Science editor-in-chief Bruce Alberts to issue an Expression of Concern [5], which was published two days earlier than planned because of the early release of the news in WSJ mentioned above [1] (see Retraction Watch [6]).

The expression of concern also follows the publication of two papers in the same journal.

In the first Science paper [7] Knox et al. found no murine-like gammaretroviruses in any of the 61 CFS patients previously identified as XMRV-positive, using the same PCR and culturing techniques as used by Lombardi et al. This paper made ERV (who consistently critiqued the Lombardi paper from the start) laugh out loud [8], because Knox also showed that human sera neutralize the virus in the blood, indicating it can hardly infect human cells in vivo. Knox also showed the WPI's sequences to be similar to the XMRV plasmid VP62, which is known to often contaminate laboratory reagents.*

Contamination as the most likely reason for the positive WPI results is also the message of the second Science paper. Here, Paprotka et al. [9] show that XMRV was not present in the original prostate tumor that gave rise to the XMRV-positive 22Rv1 cell line, but originated (as a laboratory artifact) by recombination of two viruses during passaging of the cell line in nude mice. For a further explanation see the Virology Blog [10].

Now that Science's editors have expressed their concern, the tweets, blog posts and health news articles are predominantly negative about the XMRV findings in CFS/ME, where they were earlier positive or neutral. Tweets like "Mouse virus #XMRV doesn't cause chronic fatigue #CFS http://t.co/Bekz9RG" (Reuters) or "Origins of XMRV deciphered, undermining claims for a role in human disease: Delineation of the origin of… http://bit.ly/klDFuu #cancer" (National Cancer Institute) are unprecedented.

So is the appeal by Science to retract the paper justified?

Well yes and no.

The timing is rather odd:

  • Why does Science only express concern after publication of these two latest Science papers? There are almost a dozen other studies that failed to reproduce the WPI findings. Moreover, 4 earlier papers in Retrovirology already indicated that disease-associated XMRV sequences are consistent with laboratory contamination. (See an overview of all published articles at A Photon in the Darkness [11].)
  • There are still (neutral) scientists who believe that genuine human infections with XMRV exist at a relatively low prevalence. (van der Kuyl et al.: XMRV is not a mousy virus [12])
  • And why doesn't Science await the results from the official confirmation studies meant to finally settle whether XMRV exists in our blood supply and/or CFS (by the Blood Working Group and the NIH-sponsored study by Lipkin et al.)?
  • Why (and this is the most important question) did Science ever decide to publish the piece in the first place, as the study had several flaws?
I do believe that new research that turns existing paradigms upside down deserves a chance, also a chance to get disproved. Yes, such papers might be published in prominent scientific journals like Science, provided they are technically and methodologically sound at the very least. The Lombardi paper wasn't.

Here I repeat my concerns expressed in earlier posts [13 and 14]. (please read these posts first, if you are unfamiliar with PCR).

Shortcomings in PCR-technique and study design**:

  • No positive control and no demonstration of the sensitivity of the PCR assay. Usually a known concentration or a serial dilution of a (weakly) positive sample is taken as a control; this allows one to determine the sensitivity of the assay (see the dilution sketch after this list).
  • Nonspecific bands in negative samples (indicating suboptimal conditions).
  • Just one vial without added DNA per experiment as a negative control. (Negative controls are needed to exclude contamination).
  • CFS-positive and negative samples were run on separate gels (this increases bias, because conditions and the chance of contamination are not the same for all samples; it also raises the question of whether the samples were processed differently).
  • Furthermore, only results obtained at the Cleveland Clinic are shown. (Were similar results not obtained at the WPI? See below.)
Contamination not excluded as a possible explanation:
  • No variation in the XMRV-sequences detected (expected if the findings are real)
  • Although the PCR is near the detection limit, only single-round products are shown. These are much stronger than expected even after two rounds. This is very confusing, because WPI later exclaimed that preculturing PBMC plus nested PCR (2 rounds) were absolutely required to get a positive result. But the legend of Fig. 1 in the original Science paper clearly says PCR after one round. Strong (homogeneous) bands after one round of PCR are highly suggestive of contamination.
  • No effort to exclude contamination of samples with mouse DNA (see below)
  • No determination of the viral DNA integration sites.
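
The dilution sketch referred to above (a hypothetical illustration of what a sensitivity control does; the copy numbers are made up and this is not the WPI protocol): a ten-fold dilution series of a known positive sample tells you the lowest template amount the assay still reliably detects.

    # Hypothetical sketch: estimating assay sensitivity from a ten-fold dilution series.
    # A reaction can only be positive if at least one template molecule is present
    # (Poisson sampling); the numbers below are illustrative, not real data.
    import math

    def detection_probability(mean_copies_per_reaction):
        """P(at least one template molecule in the reaction) under Poisson statistics."""
        return 1 - math.exp(-mean_copies_per_reaction)

    for copies in [1000, 100, 10, 1, 0.1]:   # expected copies per reaction
        p = detection_probability(copies)
        print(f"{copies:>6} copies/reaction -> positive in ~{p:.0%} of replicates")
    # The dilution at which replicates start dropping out marks the practical detection
    # limit; without such a control, a negative (or weakly positive) result is uninterpretable.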

Mikovits also stressed that she never used the XMRV-positive cell lines in 2009. But what about the Cleveland Clinic, nota bene the institute that co-discovered XMRV and that had produced the strongly positive PCR-products (…after a single PCR-round…)?

On the other hand, the authors had other proof of the presence of a retrovirus: detection of (low levels of) antibodies to XMRV in patient sera, and transmissibility of XMRV. On request they later applied the mouse mitochondrial assay to successfully exclude the presence of mouse DNA in their samples (but this doesn't exclude all forms of contamination, and certainly not at the Cleveland Clinic).

These shortcomings alone should have been sufficient for the reviewers, had they seen them and/or deemed them of sufficient importance, to halt publication and to ask for additional studies**.

I was once in a similar situation. I found a rare cancer-specific chromosomal translocation in normal cells, but I couldn't exclude PCR contamination. The reviewers asked me to exclude contamination by sequencing the breakpoints, which only succeeded after two years of extra work. In retrospect I'm thankful to the reviewers for preventing me from publishing a possibly faulty paper, which could have ruined my career (yes, because contamination is a real problem in PCR). And my paper improved tremendously by the additional experiments.

Yes, it is peer review that failed here, Science. You should have asked for extra confirmatory tests and a better design in the first place. That would have spared a lot of anguish and, had the findings been reproducible, would have yielded more convincing data.

There were a couple of incidents after the study was published that made me further doubt the robustness of WPI's scientific data, and even (after a while) made me doubt whether WPI, and Judy Mikovits in particular, was adhering to good scientific (and ethical) practices.

  • WPI suddenly disclosed (Feb 18, 2010) that culturing PBMCs is necessary to obtain a positive PCR signal. As a matter of fact they maintain this in their recent protest letter to Science. They refer to the original Science paper, but this paper doesn't mention the need for culturing at all!!
  • WPI suggests their researchers had detected XMRV in patient samples from both Dr. Kerr's and Dr. van Kuppeveld's 'XMRV-negative' CFS cohorts, thus in patient samples obtained without a culture-enrichment step….. There can only be one truth: the main criticism of the negative studies was that improper CFS criteria were used. Thus either this CFS population is wrongly defined and DOESN'T contain XMRV (with any method), OR it fulfills the criteria of CFS and XMRV can be detected by applying the proper technique. It is so confusing!
  • Although Mikovits first reported that they found little to no virus variation, they later claimed to find a lot of variation.
  • WPI employees behave unprofessionally towards fellow scientists who failed to reproduce their findings.
Other questionable practices:
  • Mikovits also claims that people with autism harbor XMRV. One wonders which disease ISN'T associated with XMRV….
  • Despite the uncertainties about XMRV in CFS-patients, let alone the total LACK of demonstration of a CAUSAL RELATIONSHIP, Mikovits advocates the use of *not harmless* anti-retrovirals by CFS-patients.
  • At this stage of controversy, the WPI-XMRV test is sold as “a reliable diagnostic tool“ by a firm (VIP Dx) with strong ties to WPI. Mikovits even tells patients in a mail: “First of all the current diagnostic testing will define with essentially 100% accuracy! XMRV infected patients”. WTF!? 
  • This test is not endorsed in Belgium, and even Medicare only reimbursed 15% of the PCR-test.
  • The ties of WPI to RedLabs & VIP Dx are not clearly disclosed in the Science paper. There is only a small note (added in proof!) that Lombardi is operations manager of VIP Dx, "in negotiations with the WPI to offer a diagnostic test for XMRV".
Please see this earlier post [13] for broader coverage. Or read the post [16] of Keith Grimaldi, scientific director of Eurogene and expert in personal genomics, whom I asked to comment on the "diagnostic" tests. In his post he very clearly describes "what is exactly wrong about selling an unregulated clinical test to a very vulnerable and exploitable group based on 1 paper on a small isolated sample".

It is really surprising this wasn't picked up by the media, by the government or by the scientific community. Will the new findings have any consequences for the XMRV diagnostic tests? I fear WPI will get away with it for the time being. I agree with Lipkin, who coordinates the NIH-sponsored multi-center CFS-XMRV study, that calls to retract the paper are premature at this point. Furthermore, as addressed by WSJ [17], if the Science paper is retracted because the XMRV findings are called into question, what about the papers also reporting a link between XMRV(-like) viruses and CFS or prostate cancer?

WSJ reports that Schekman, the editor-in-chief of PNAS, has no direct plans to retract the paper of Alter et al. reporting XMRV-like viruses in CFS [discussed in 18]. Schekman considers it "an unusual situation to retract a paper even if the original findings in a paper don't hold up: it's part of the scientific process for different groups to publish findings, for other groups to try to replicate them, and for researchers to debate conflicting results."

I agree, this is the normal procedure once a paper is accepted and published. Fraud is a reason to retract a paper; doubt is not.

Notes

*samples, NOT patients, as I saw a patient's erroneous interpretation: "if it is contamination in the lab how can I have it as a patient?" (tweet is now deleted). No, according to the contamination theory, XMRV contamination is not IN you, but in the processed samples or in the reaction mixtures used.

** The reviewers did ask for additional evidence, but not with respect to the PCR experiments, which are the most prone to contamination and false results.

  1. Chronic-Fatigue Paper Is Questioned (online.wsj.com)
  2. Lombardi VC, Ruscetti FW, Das Gupta J, Pfost MA, Hagen KS, Peterson DL, Ruscetti SK, Bagni RK, Petrow-Sadowski C, Gold B, Dean M, Silverman RH, & Mikovits JA (2009). Detection of an infectious retrovirus, XMRV, in blood cells of patients with chronic fatigue syndrome. Science (New York, N.Y.), 326 (5952), 585-9 PMID: 19815723
  3. WPI Says No to Retraction / Levy Study Dashes Hopes /NCI Shuts the Door on XMR (phoenixrising.me)
  4. http://wpinstitute.org/news/docs/FinalreplytoScienceWPI.pdf
  5. Alberts B. Editorial Expression of Concern. Science. 2011 May 31.
  6. Science asks authors to retract XMRV-chronic fatigue paper; when they refuse, issue Expression of Concern. 2011/05/31/ (retractionwatch.wordpress.com)
  7. K. Knox, Carrigan D, Simmons G, Teque F, Zhou Y, Hackett Jr J, Qiu X, Luk K, Schochetman G, Knox A, Kogelnik AM & Levy JA. No Evidence of Murine-Like Gammaretroviruses in CFS Patients Previously Identified as XMRV-Infected. Science. 2011 May 31. (10.1126/science.1204963).
  8. XMRV and chronic fatigue syndrome: So long, and thanks for all the lulz, Part I [erv] (scienceblogs.com)
  9. Paprotka T, Delviks-Frankenberry KA, Cingoz O, Martinez A, Kung H-J, Tepper CG, Hu W-S , Fivash MJ, Coffin JM, & Pathak VK. Recombinant origin of the retrovirus XMRV. Science. 2011 May 31. (10.1126/science.1205292).
  10. XMRV is a recombinant virus from mice  (Virology Blog : 2011/05/31)
  11. Science asks XMRV authors to retract paper (photoninthedarkness.com : 2011/05/31)
  12. van der Kuyl AC, Berkhout B. XMRV: Not a Mousy Virus. J Formos Med Assoc. 2011 May;110(5):273-4. PDF
  13. Finally a Viral Cause of Chronic Fatigue Syndrome? Or Not? – How Results Can Vary and Depend on Multiple Factor (laikaspoetnik.wordpress.com: 2010/02/15/)
  14. Three Studies Now Refute the Presence of XMRV in Chronic Fatigue Syndrome (CFS) (laikaspoetnik.wordpress.com 2010/04/27)
  15. WPI Announces New, Refined XMRV Culture Test – Available Now Through VIP Dx in Reno (prohealth.com 2010/01/15)
  16. The murky side of physician prescribed LDTs (eurogene.blogspot.com : 2010/09/06)
  17. Given Doubt Cast on CFS-XMRV Link, What About Related Research? (blogs.wsj.com)
  18. Does the NHI/FDA Paper Confirm XMRV in CFS? Well, Ditch the MR and Scratch the X… and… you’ve got MLV. (laikaspoetnik.wordpress.com : 2010/08/30/)

Health Experts & Patient Advocates Beware: 10 Reasons Why you Shouldn’t be a Curator at Organized Wisdom!! #OrganizedWisdom

11 05 2011

Last year I aired my concern about Organized Wisdom in a post called Expert Curators, WisdomCards & The True Wisdom of @organizedwisdom.

Organized Wisdom shares health links of health experts or advocates who (according to OW's FAQ) either requested a profile or were recommended by OW's Medical Review Board. I was one of those so-called Expert Curators. However, I had never requested a profile and I seriously doubt whether someone from the medical board had actually read any of my tweets or my blog posts.

This was one of the many issues with Organized Wisdom. But the main issue was its lack of credibility and transparency. I vented my complaints, I removed my profile from OW, stopped following updates at Twitter and informed some fellow curators.

I almost forgot about it, till Simon Sikorski, MD, commented at my blog, informing me that my complaints hadn’t been fully addressed and convincing me things were even worse than I thought.

He has started a campaign to do something about this Unethical Health Information Content Farming by Organized Wisdom (OW).

While discussing this affair with a few health experts and patient advocates I was disappointed by the reluctant reactions of a few people: “Well, our profiles are everywhere”, “Thanks I will keep an eye open”, “cannot say much yet”. How much evidence does one need?

Of course there were also people (well-known MDs and researchers) who immediately removed their profile and compared OW's approach with that of Wellsphere, which scammed the health blogosphere. Yes, OW also scrapes and steals your intellectual property (blog and/or tweet content), but the difference is: OW doesn't ask you to join, it just puts up your profile and shares it with the world.

As a medical librarian and e-patient I find the quality, reliability and objectivity of health information of utmost importance. I believe in the emancipation of patients (“Patient is not a third person word”, e-patient Dave), but it can only work if patients are truly well informed. This is difficult enough, because of the information overload and the conflicting data. We don’t need any further misinformation and non-transparency.

I believe that Organized Wisdom puts the reputation of its "curators" at stake and that it is neither a trustworthy nor a useful resource for health information, for the following reasons (x: see also Simon's blog post and slides; his emphasis is more on content theft):

1. Profiles of Expert Curators are set up without their knowledge and consent
Most curators I asked didn’t know they were expert curators. Simon has spoken with 151 of the 5700 expert curators and not one of those persons knew he/she was listed on OW. (x)

2. The name Expert Curator suggests that you (can) curate information, but you cannot.
The information is automatically produced and is shown unfiltered (and often shown in duplicate, because many different people can link to the same source). It is not possible to edit the cards.
Ideally, curating should even be more than filtering (see this nice post about Social Media Content Curators, where curation is defined as the act of synthesizing and interpreting in order to present a complete record of a concept).

3. OW calls your profile address: “A vanity URL¹”.

Is that how they see you? Well, it must be said they try to win you over with pure flattery. And they often succeed….

¹Quote OW: “We credit, honor, and promote our Health Experts, including offering: A vanity URL to promote so visitors can easily share your Health Profile with others, e.g. my.organizedwisdom.com/ePatientDave.
Note: this too is quite similar to Wellsphere's approach (read more at e-patients.net)

4. Bots tap into your tweets and/or scrape the content off your website
(x: see healthcare content farms monetizing scheme)

5. Scraping your content can affect your search rankings (x)
This probably affects starting/small blogs the most. I checked two posts from well-known blogs and their websites still came up first.

6. The site is funded/sponsored by pharmaceutical companies.
 "Tailored" ads show up next to the so-called Wisdom Cards dealing with the same topic. If no pharmaceutical business has responded, Google ads show up instead.
See the form where they actually invite pharma companies to select a target condition for advertising. Note that the target conditions fit the OW topics.

7. The Wisdom Cards are no more than links to your tweets or posts. They have no added value. 

8. Worse, tweets and links are shown out of context.
I provided various examples in my previous post (mainly in the comment section).

A Cancer and Homeopathy WisdomCard™ shows Expert Curator Liz Ditz sharing a link about Cancer and Homeopathy. The link she shares is a dangerous article by a doctor working in a Homeopathic General Hospital in India, "reporting" several cases of miraculous cures by Conium 1M, Thuja 50M and other watery dilutions. I'm sure that Liz Ditz didn't say anything positive about the "article". Still it seems she "backs it up". Perhaps she tweeted: "Look what dangerous crap."
When I informed her, Liz said: "AIEEEE…. didn't sign up with Organized Wisdom that I know of". She felt she was used to lend credulous support to homeopathy & naturopathy.

Note: Liz's card has disappeared (because she opted out), but I was surprised to find that the link (http://organizedwisdom.com/Cancer-and-Homeopathy/wt/med) still works and links to other "evidence" on the same topic.


9. There is no quality control. Not of the wisdom cards and not of the expert curators.
Many curators are not what I would call true experts, and I'm not alone: @holly comments at a TechCrunch post: "I am glad you brought up the 'written by people who do not have a clue, let alone ANY medical training [of any kind] at all.' I have no experience with any kind of medical education, knowledge or even the slightest clue of a tenth of the topics covered on OW, yet for some reason they tried to recruit me to review cards there!?!"

The emphasis is also on alternative treatments: prevention of cancer, asthma, ADHD by herbs etc. In addition to "Health Centers", there are also Wellness Centers (Aging, Diet, Fitness, etc.) and Living Centers (Beauty, Cooking, Environment). A single card can share information from 2 or 3 centers (diabetes and multivitamins, for example).

And as said, all links of expert curators are placed unfiltered, even when you make a joke or mention you're on vacation. Whether you're a Top Health Expert or Advocate (there is a regular shout-out) just depends on the number of links you share, thus NOT on quality. For this reason the real experts often end up at lower positions.

Some cards are just link bait.


10. Organized Wisdom is heavily promoting its site.
Last year it launched activitydigest, automatic digests meant to stimulate "engagement" of expert curators. It tries to connect with top health experts, pharma people and patient advocates, hoping they will support OW. This leads to uncritical interviews such as at Pixels and Pills, at Health Interview (Reader's Digest + Organized Wisdom = Wiser Patients), and at Xconomy.com (OrganizedWisdom recruits experts to filter health information on the web).

What can you do?

  • Check whether you have a profile at Organized Wisdom here.
  • Take a good look at Organized Wisdom and what it offers. It isn’t difficult and it doesn’t take much time to see through the facade.
  • If you don’t agree with what it represents, please consider to opt out.
  • You can email info@organizedwisdom.com to let your profile as expert curator removed.
  • If you agree that what OW does is no good practice, you could do the following (most are suggestions of Simon):
  • spread the word and inform others
  • join the conversation on Twitter #EndToFarms
  • join the tweetup on what you can do about this scandal and how to protect yourself from liability (more details will be offered by Simon at his regularly updated blog post)
  • If you don’t agree this Content Farm deserves HONcode certification, notify HON at  https://www.healthonnet.org/HONcode/Conduct.html?HONConduct444558
Please don’t sit back and think that being a wisdom curator does not matter. Don’t show off  with an Organized Wisdom badget, widget or link at your blog or website.  Resist the flattery of being called an expert curator, because it doesn’t mean anything in this context. And by being part of Organized Wisdom, you indirectly support their practice. This may seriously affect your own reputation and indirectly you may contribute to misinformation.

Or as Heidi’s commented to my previous post:

I am flabbergasted that people’s reputation are being used to endorse content without their say so.
Even more so that they cannot delete their profile and withdraw their support.*

For me those two things on their own signal big red flags:

The damage to a health professional’s reputation as a result could be great.
Misleading the general public with poor (yes dangerous) information another

Altogether unethical.

*This was difficult at that time.

Update May 10, 2011: News from Simon: 165 individuals & 5 hospitals have now spoken up about the unfolding scandal and are doing something about it (Tuesday).

Update May 12, 2011: If I failed to convince you, please read the post of Ramona Bates MD (@rlbates at Twitter, plastic surgeon, blogger at Suture for a Living), called "More Organized Wisdom Un-Fair Play". Ramona asked for her profile to be removed from OW half a year ago. Recommended pages at her blog seem to be written by other people.
She concludes:

“Once again, I encourage my fellow healthcare bloggers (doctors, nurses, patient advocates, etc) to remove yourself from any association with Organized Wisdom and other sites like them”
