Why Publishing in the NEJM is not the Best Guarantee that Something is True: a Response to Katan

27 10 2012

In a previous post [1] I reviewed a recent Dutch study published in the New England Journal of Medicine (NEJM) [2] about the effects of sugary drinks on the body mass index of school children.

The study was widely covered by the media. The NRC, for which the main author Martijn Katan works as a science columnist, spent two full (!) pages on the topic, without a single critical comment [3].
As if this wasn’t enough, the latest column of Katan again dealt with his article (text freely available at mkatan.nl)[4].

I found Katan’s column “Col hors Catégorie” [4] quite arrogant, especially because he tried to belittle a (as he called it) “know-it-all” journalist who criticized his work  in a rivaling newspaper. This wasn’t fair, because the journalist had raised important points [5, 1] about the work.

The piece focussed on the long road of getting papers published in a top journal like the NEJM.
Katan considers the NEJM the “Tour de France” among medical journals: it is a top achievement to publish in this journal.

Katan also states that “publishing in the NEJM is the best guarantee something is true”.

I think the latter statement is wrong for a number of reasons.*

  1. First, most published findings are false [6]. Thus journals can never “guarantee”  that published research is true.
    Factors that make it less likely that research findings are true include a small effect size, a greater number and lesser preselection of tested relationships, selective outcome reporting and the “hotness” of the field (all applying more or less to Katan’s study; he also changed the primary outcomes during the trial [7]), as well as a small study, a great financial interest and a low pre-study probability (not applicable).
  2. It is true that NEJM has a very high impact factor. This is  a measure for how often a paper in that journal is cited by others. Of course researchers want to get their paper published in a high impact journal. But journals with high impact factors often go for trendy topics and positive results. In other words it is far more difficult to publish a good quality study with negative results, and certainly in an English high impact journal. This is called publication bias (and language bias) [8]. Positive studies will also be more frequently cited (citation bias) and will more likely be published more than once (multiple publication bias) (indeed, Katan et al already published about the trial [9], and have not presented all their data yet [1,7]). All forms of bias are a distortion of the “truth”.
    (This is the reason why the search for a (Cochrane) systematic review must be very sensitive [8] and not restricted to core clinical journals, but even include non-published studies: for these studies might be “true”, but have failed to get published).
  3. Indeed, the group of Ioannidis just published a large-scale statistical analysis [10] showing that medical studies revealing “very large effects” seldom stand up when other researchers try to replicate them. Often studies with large effects measure laboratory and/or surrogate markers (like BMI) instead of really clinically relevant outcomes (diabetes, cardiovascular complications, death).
  4. More specifically, the NEJM does regularly publish studies about pseudoscience or bogus treatments. See for instance this blog post [11] of ScienceBased Medicine on Acupuncture Pseudoscience in the New England Journal of Medicine (which by the way is just a review). A publication in the NEJM doesn’t guarantee it isn’t rubbish.
  5. Importantly, the NEJM has the highest proportion of trials (RCTs) with sole industry support (35% compared to 7% in the BMJ) [12]. On several occasions I have discussed these conflicts of interests and their impact on the outcome of studies [13, 14]; see also [15, 16]. In their study, Gøtzsche and his colleagues from the Nordic Cochrane Centre [12] also showed that industry-supported trials were more frequently cited than trials with other types of support, and that omitting them from the impact factor calculation decreased journal impact factors. The impact factor decrease was as much as 15% for NEJM (versus 1% for BMJ in 2007)! For the journals that provided data, income from the sales of reprints contributed 3% and 41% of the total income for BMJ and The Lancet, respectively.
    A recent study, co-authored by Ben Goldacre (MD & science writer) [17], confirms that funding by the pharmaceutical industry is associated with high numbers of reprint orders. Again only the BMJ and The Lancet provided all necessary data.
  6. Finally and most relevant to the topic is a study [18], also discussed at Retractionwatch[19], showing that  articles in journals with higher impact factors are more likely to be retracted and surprise surprise, the NEJM clearly stands on top. Although other reasons like higher readership and scrutiny may also play a role [20], it conflicts with Katan’s idea that  “publishing in the NEJM is the best guarantee something is true”.

I wasn’t aware of the latter study and would like to thank drVes and Ivan Oransky for responding to my crowdsourcing at Twitter.


  1. Sugary Drinks as the Culprit in Childhood Obesity? a RCT among Primary School Children (laikaspoetnik.wordpress.com)
  2. de Ruyter JC, Olthof MR, Seidell JC, & Katan MB (2012). A trial of sugar-free or sugar-sweetened beverages and body weight in children. The New England journal of medicine, 367 (15), 1397-406 PMID: 22998340
  3. Wim Köhler. Eén kilo lichter. NRC, Saturday 22-09-2012 (http://archief.nrc.nl/)
  4. Martijn Katan. Col hors Catégorie [Dutch], published in de NRC (20 October) (www.mkatan.nl)
  5. Hans van Maanen. Suiker uit fris, De Volkskrant, 29 september 2012 (freely accessible at http://www.vanmaanen.org/)
  6. Ioannidis, J. (2005). Why Most Published Research Findings Are False PLoS Medicine, 2 (8) DOI: 10.1371/journal.pmed.0020124
  7. Changes to the protocol http://clinicaltrials.gov/archive/NCT00893529/2011_02_24/changes
  8. Publication Bias. The Cochrane Collaboration open learning material (www.cochrane-net.org)
  9. de Ruyter JC, Olthof MR, Kuijper LD, & Katan MB (2012). Effect of sugar-sweetened beverages on body weight in children: design and baseline characteristics of the Double-blind, Randomized INtervention study in Kids. Contemporary clinical trials, 33 (1), 247-57 PMID: 22056980
  10. Pereira, T., Horwitz, R.I., & Ioannidis, J.P.A. (2012). Empirical Evaluation of Very Large Treatment Effects of Medical Interventions JAMA: The Journal of the American Medical Association, 308 (16) DOI: 10.1001/jama.2012.13444
  11. Acupuncture Pseudoscience in the New England Journal of Medicine (sciencebasedmedicine.org)
  12. Lundh, A., Barbateskovic, M., Hróbjartsson, A., & Gøtzsche, P. (2010). Conflicts of Interest at Medical Journals: The Influence of Industry-Supported Randomised Trials on Journal Impact Factors and Revenue – Cohort Study PLoS Medicine, 7 (10) DOI: 10.1371/journal.pmed.1000354
  13. One Third of the Clinical Cancer Studies Report Conflict of Interest (laikaspoetnik.wordpress.com)
  14. Merck’s Ghostwriters, Haunted Papers and Fake Elsevier Journals (laikaspoetnik.wordpress.com)
  15. Lexchin, J. (2003). Pharmaceutical industry sponsorship and research outcome and quality: systematic review BMJ, 326 (7400), 1167-1170 DOI: 10.1136/bmj.326.7400.1167
  16. Smith R (2005). Medical journals are an extension of the marketing arm of pharmaceutical companies. PLoS medicine, 2 (5) PMID: 15916457 (free full text at PLOS)
  17. Handel, A., Patel, S., Pakpoor, J., Ebers, G., Goldacre, B., & Ramagopalan, S. (2012). High reprint orders in medical journals and pharmaceutical industry funding: case-control study BMJ, 344 (jun28 1) DOI: 10.1136/bmj.e4212
  18. Fang, F., & Casadevall, A. (2011). Retracted Science and the Retraction Index Infection and Immunity, 79 (10), 3855-3859 DOI: 10.1128/IAI.05661-11
  19. Is it time for a Retraction Index? (retractionwatch.wordpress.com)
  20. Agrawal A, & Sharma A (2012). Likelihood of false-positive results in high-impact journals publishing groundbreaking research. Infection and immunity, 80 (3) PMID: 22338040


* Addendum: my (unpublished) letter to the NRC

Tour de France.
After the NRC had earlier devoted 2 pages to singing the praises of Katan’s new study, Katan found it necessary to do the same once more in his own column. Referring to your own work is allowed, even in a column, but then we as readers should learn something from it. What is the message of this piece “Col hors Catégorie”? It mainly describes the long road to getting a scientific study published in a top journal, in this case the New England Journal of Medicine (NEJM), “the Tour de France among medical journals”. The piece ends with a tackle on a journalist “who thought he knew better”. But oh well, what does that matter when the whole world is cheering? Very unsporting, because that journalist (van Maanen, Volkskrant) did in fact score on a number of points. Katan’s core point that an NEJM publication “is the best guarantee that something is true” is also open to considerable dispute. The NEJM does indeed have a high impact factor, a measure of how often articles are cited. However, the NEJM also has the highest ‘article retraction’ index. In addition, the NEJM has the highest percentage of industry-sponsored clinical trials, which boost the overall impact factor. Moreover, top journals mainly go for “positive results” and “trendy topics”, which promotes publication bias. If we extend the comparison with the Tour de France: completing this prestigious race does not guarantee that participants have not used banned substances. Despite the strict doping controls.

To Retract or Not to Retract… That’s the Question

7 06 2011

In the previous post I discussed [1] that editors of Science asked for the retraction of a paper linking XMRV retrovirus to ME/CFS.

The decision of the editors was based on the failure of at least 10 other studies to confirm these findings and on growing support that the results were caused by contamination. When the authors refused to retract their paper, Science issued an Expression of Concern [2].

In my opinion retraction is premature. Science should at least await the results of two multi-center studies, that were designed to confirm or disprove the results. These studies will continue anyway… The budget is already allocated.

Furthermore, I can’t suppress the idea that Science asked for a retraction to exonerate themselves for the bad peer review (the paper had serious flaws) and their eagerness to swiftly publish the possibly groundbreaking study.

And what about the other studies linking the XMRV to ME/CFS or other diseases: will these also be retracted?
And what happens in the improbable case that the multi-center studies confirm the 2009 paper? Would Science republish the retracted paper?

Thus in my opinion, it is up to other scientists to confirm or disprove findings published. Remember that falsifiability was Karl Popper’s basic scientific principle. My conclusion was that “fraud is a reason to retract a paper and doubt is not”. 

This is my opinion, but is this opinion shared by others?

When should editors retract a paper? Is fraud the only reason? When should editors issue a letter of concern? Are there guidelines?

Let me first say that even editors don’t agree. Schekman, the editor-in-chief of PNAS, has no direct plans to retract another paper reporting XMRV-like viruses in CFS [3].

Schekman considers it “an unusual situation to retract a paper even if the original findings in a paper don’t hold up: it’s part of the scientific process for different groups to publish findings, for other groups to try to replicate them, and for researchers to debate conflicting results.”

Back at the Virology Blog [4] there was also a vivid discussion about the matter. Prof. Vincent Racaniello gave the following answer in response to a question from a reader:

I don’t have any hard numbers on how often journals ask scientists to retract a paper, only my sense that it is very rare. Author retractions are more frequent, but I’m only aware of a handful of those in a year. I can recall a few other cases in which the authors were asked to retract a paper, but in those cases scientific fraud was involved. That’s not the case here. I don’t believe there is a standard policy that enumerates how such decisions are made; if they exist they are not public.

However, there is a Guideline for editors, the Guidance from the Committee on Publication Ethics (COPE) (PDF) [5]

Ivan Oransky, of the great blog Retraction Watch, linked to it when we discussed reasons for retraction.

With regard to retraction the COPE-guidelines state that journal editors should consider retracting a publication if:

  1. they have clear evidence that the findings are unreliable, either as a result of misconduct (e.g. data fabrication) or honest error (e.g. miscalculation or experimental error)
  2. the findings have previously been published elsewhere without proper crossreferencing, permission or justification (i.e. cases of redundant publication)
  3. it constitutes plagiarism
  4. it reports unethical research

According to the same guidelines journal editors should consider issuing an expression of concern if:

  1. they receive inconclusive evidence of research or publication misconduct by the authors 
  2. there is evidence that the findings are unreliable but the authors’ institution will not investigate the case 
  3. they believe that an investigation into alleged misconduct related to the publication either has not been, or would not be, fair and impartial or conclusive 
  4. an investigation is underway but a judgement will not be available for a considerable time

Thus in the case of the Science XMRV/CFS paper an expression of concern certainly applies (all 4 points) and one might even consider a retraction, because the results seem unreliable (point 1). But it is not 100% established that the findings are false. There is only serious doubt…

The guidelines seem to leave room for separate decisions. To retract a paper in case of plain fraud is not under discussion. But when is an error sufficiently established and important to warrant retraction?

Apparently retractions are on the rise. Although still rare (0.02% of all publications by the late 2000s) there has been a tenfold increase in retractions compared to the early 1980s (see review at Scholarly Kitchen [6] about two papers: [7] and [8]). However it is unclear whether increasing rates of retraction reflect more fraudulent or erroneous papers or a better diligence. The  first paper [7] also highlights that, out of fear of litigation, editors are generally hesitant to retract an article without the author’s permission.

At the blog Nerd Alert they give a nice overview [9] (based on Retraction Watch, but then summarized in one post ;) ) . They clarify that papers are retracted for “less dastardly reasons then those cases that hit the national headlines and involve purposeful falsification of data”, such as the fraudulent papers of Andrew Wakefield (autism caused by vaccination). Besides the mistaken publication of the same paper twice, data over-interpretation, plagiarism and the like, the reason can also be more trivial: ordering the wrong mice or using an incorrectly labeled bottle.

Still, scientists don’t unanimously agree that such errors should lead to retraction.

Drug Monkey blogs about his discussion [10] with @ivanoransky over a recent post at Retraction Watch, which asks whether a failure to replicate a result justifies a retraction [11]. Ivan Oransky presents a case in which a researcher (B) couldn’t reproduce the findings of another lab (A) and demonstrated mutations in the published protein sequence that excluded the mechanism proposed in A’s paper. The paper wasn’t retracted, possibly because B didn’t follow the published experimental protocols of A in all details (which reminds me of the XMRV controversy).

Drugmonkey says (quote):  (cross-posted at Scientopia here – hmmpf isn’t that an example of redundant publication?)

“I don’t give a fig what any journals might wish to enact as a policy to overcompensate for their failures of the past.
In my view, a correction suffices” (provided that search engines like Google and PubMed make clear that the paper was in fact corrected).

Drug Monkey has a point there. A clear watermark should suffice.

However, we should note that most papers are retracted by authors, not the editors/journals, and that the majority of “retracted papers” remain available. Just 13.2% are deleted from the journal’s website. And 31% are not clearly labelled as such.

Summary of how the naïve reader is alerted to paper retraction (from Table 2 in [7], see: Scholarly Kitchen [6])

  • Watermark on PDF (41.1%)
  • Journal website (33.4%)
  • Not noted anywhere (31.8%)
  • Note appended to PDF (17.3%)
  • PDF deleted from website (13.2%)

My conclusion?

Of course fraudulent papers should be retracted. Also papers with obvious errors that invalidate the conclusions.

However, we should be extremely hesitant to retract papers that can’t be reproduced, if there is no undisputed evidence of error.

Otherwise we should retract almost all published papers at one point or another. Because if Professor Ioannidis is right (and he probably is) “Much of what medical researchers conclude in their studies is misleading, exaggerated, or flat-out wrong” (see previous post [12], “Lies, Damned Lies, and Medical Science” [13] and Ioannidis’ crushing article “Why most published research findings are false” [14]).

All retracted papers (and papers with major deficiencies and shortcomings) should be clearly labeled as such (as Drugmonkey proposed, not only at the PDF and at the Journal website, but also by search engines and biomedical databases).

Or let’s hope, with Biochembelle [15], that the future of scientific publishing will make retractions for technical issues obsolete (whether in the form of nano-publications [16] or otherwise):

One day the scientific community will trade the static print-type approach of publishing for a dynamic, adaptive model of communication. Imagine a manuscript as a living document, one perhaps where all raw data would be available, others could post their attempts to reproduce data, authors could integrate corrections or addenda….

NOTE: Retraction Watch (@ivanoransky) and @laikas have voted in @drugmonkeyblog‘s poll about what a retracted paper means [here]. Have you?


  1. Science Asks to Retract the XMRV-CFS Paper, it Should Never Have Accepted in the First Place. (laikaspoetnik.wordpress.com 2011-06-02)
  2. Alberts B. Editorial Expression of Concern. Science. 2011-05-31.
  3. Given Doubt Cast on CFS-XMRV Link, What About Related Research? (blogs.wsj.com)
  4. XMRV is a recombinant virus from mice  (Virology Blog : 2011/05/31)
  5. Retractions: Guidance from the Committee on Publication Ethics (COPE) Elizabeth Wager, Virginia Barbour, Steven Yentis, Sabine Kleinert on behalf of COPE Council:
  6. Retract This Paper! Trends in Retractions Don’t Reveal Clear Causes for Retractions (scholarlykitchen.sspnet.org)
  7. Wager E, Williams P. Why and how do journals retract articles? An analysis of Medline retractions 1988-2008. J Med Ethics. 2011 Apr 12. [Epub ahead of print] 
  8. Steen RG. Retractions in the scientific literature: is the incidence of research fraud increasing? J Med Ethics. 2011 Apr;37(4):249-53. Epub 2010 Dec 24.
  9. Don’t touch that blot. (nerd-alert.net/blog/weeklies/ : 2011/02/25)
  10. What_does_a_retracted_paper_mean? (scienceblogs.com/drugmonkey: 2011/06/03)
  11. So when is a retraction warranted? The long and winding road to publishing a failure to replicate (retractionwatch.wordpress.com : 2011/06/03/)
  12. Much Ado About ADHD-Research: Is there a Misrepresentation of ADHD in Scientific Journals? (laikaspoetnik.wordpress.com 2011-06-02)
  13. “Lies, Damned Lies, and Medical Science” (theatlantic.com :2010/11/)
  14. Ioannidis, J. (2005). Why Most Published Research Findings Are False. PLoS Medicine, 2 (8) DOI: 10.1371/journal.pmed.0020124
  15. Retractions: What are they good for? (biochembelle.wordpress.com : 2011/06/04/)
  16. Will Nano-Publications & Triplets Replace The Classic Journal Articles? (laikaspoetnik.wordpress.com 2011-06-02)

NEW* (Added 2011-06-08):


Much Ado About ADHD-Research: Is there a Misrepresentation of ADHD in Scientific Journals?

9 02 2011

The reliability of science is increasingly under fire. We all know that the media often give a distorted picture of scientific findings (e.g. Hot news: Curry, Curcumin, Cancer & cure). But there is also an ever growing number of scientific misreports or even fraud (see the BMJ editorial announcing retraction of the Wakefield paper about a causal relation between MMR vaccination and autism). Apart from real scientific misconduct there are Ghost Marketing and “Publication Bias”, which make (large) positive studies easier to find than those with negative or non-significant results.
Then there are also the ever growing contradictions, which make the public sigh: what IS true in science?

Indeed, according to doctor John Ioannidis, “Much of what medical researchers conclude in their studies is misleading, exaggerated, or flat-out wrong” (see “Lies, Damned Lies, and Medical Science” in the Atlantic (2010)). In 2005 he wrote the famous PLOS article “Why most published research findings are false” [2].

With Ioannidis as an editor, a new PLOS ONE paper has recently been published on the topic [1]. The authors Gonon, Bezard and Boraud state that there is often a huge gap between neurobiological facts and firm conclusions stated by the media. They suggest that the misrepresentation often starts in the scientific papers, and is echoed by the media.

Although this article has already been reviewed by another researchblogger (Hadas Shema), I would like to give my own views on this paper.

Gonon et al found 3 types of misrepresentation.*

1. Internal inconsistencies (between results and claimed conclusions).

In a (non-systematic) review of 360 ADHD articles Gonon et al. [1] found two studies with “obvious” discrepancies between results and claimed conclusions. One paper claimed that dopamine is depressed in the brain of ADHD patients. Caveats were only mentioned in the results section, and of course only the positive message was echoed by the media, without any questioning of alternative explanations (in this case a high baseline dopamine tone). The other paper [3] claimed that treatment with stimulant medications was associated with more favorable long-term school outcomes. However, the average reading score and the school drop-out rate did not differ significantly between the treatment and control group. The newspapers nevertheless trumpeted that “ADHD drugs help boost children’s grades”.

2. Fact Omission

To quantify fact omission in the scientific literature, Gonon et al systematically searched for ADHD articles mentioning the D4 dopamine receptor (DRD4) gene. Among the 117 primary human studies with actual data (like odds ratios), 74 articles state in their summary that alleles of the DRD4 gene are significantly associated with ADHD, but only 19 summaries mentioned that the risk was small. Fact omission was even more prevalent in articles that only cite studies about DRD4. Not surprisingly, 82% of the media articles didn’t report that DRD4 only confers a small risk either.
In accordance with Ioannidis’ findings [2], Gonon et al found that the most robust effects were reported in initial studies: odds ratios decreased from 2.4 in the oldest study in 1996 to 1.27 in the most recent meta-analysis.
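To put these counts in proportion, here is a small sketch of the arithmetic. It uses only the figures quoted above; the variable names are my own, and I am assuming that the 19 summaries mentioning the small risk are a subset of the 74 summaries claiming an association (the text does not say so explicitly).

```python
# Figures quoted in the text above (from Gonon et al.); names are illustrative.
primary_studies = 117      # primary human DRD4 studies with actual data
claim_association = 74     # summaries stating a significant association
mention_small_risk = 19    # summaries mentioning that the risk is small

# Share of primary studies whose summary claims an association.
share_claiming = claim_association / primary_studies

# Share of those summaries that add the caveat.
# ASSUMPTION: the 19 are a subset of the 74.
share_with_caveat = mention_small_risk / claim_association

# An odds ratio of 1 means "no association", so the excess over 1 is
# the part of the effect that shrank between 1996 and the meta-analysis.
or_first, or_latest = 2.4, 1.27
excess_shrinkage = 1 - (or_latest - 1) / (or_first - 1)

print(f"summaries claiming an association: {share_claiming:.1%}")
print(f"of those, mentioning the small risk: {share_with_caveat:.1%}")
print(f"shrinkage of the excess odds: {excess_shrinkage:.1%}")
```

Under that assumption, roughly three in four of the summaries that report an association leave out that the risk is small, and most of the initially reported excess odds has disappeared in the latest meta-analysis.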

3. Extrapolating basic and pre-clinical findings to new therapeutic prospects

Animal ADHD models have their limitations because investigations based on mouse behavior cannot capture the complexity of ADHD. Analysis of all ADHD-related studies in mice showed that 23% of the conclusions were overstated. The frequency of this overstatement was positively related to the impact factor of the journal.

Again, the positive message was copied by the press. (see Figure below)


Figure from PLOS ONE [CC



The article by Gonon et al is another example that “published research findings are false” [2], or at least not completely true. The authors show that the press isn’t culprit number one, but that it “just” copies the overstatements in the scientific abstracts.

The merit of Gonon et al is that they have extensively looked at a great number of articles and at press articles citing those articles.

The first type of misrepresentation wasn’t systematically studied, but types 2 and 3 misrepresentations were studied by analyzing papers on a specific ADHD topic obtained by a systematic search.

One of the solutions the authors propose is that “journal editors collectively reject sensationalism and clearly condemn data misrepresentation”. I agree and would like to add that the reviewers should check that the summary actually reflects the data. Some journals already have strict criteria in this respect. It struck me that the few summaries I checked were very unstructured and short, unlike most summaries I see. Possibly, unstructured abstracts are more typical of journals about neuroscience and animal research.

The choice of the ADHD-topics investigated doesn’t seem random. A previous review[4], written by Francois Gonon deals entirely with “the need to reexamine the dopaminergic hypothesis of ADHD” . The type 1 misrepresentation data stem from this opinion piece.

The putative ADHD-DRD4 gene association and the animal studies, taken as examples for type 2 and type 3 misrepresentations respectively, can also be seen as topics of the “ADHD is a genetic disease” -kind.

Gonon et al clearly favor the hypothesis that ADHD is primarily caused by environmental factors. In his opinion piece [4] Gonon starts by saying:

This dopamine-deficit theory of ADHD is often based upon an overly simplistic dopaminergic theory of reward. Here, I question the relevance of this theory regarding ADHD. I underline the weaknesses of the neurochemical, genetic, neuropharmacological and imaging data put forward to support the dopamine-deficit hypothesis of ADHD. Therefore, this hypothesis should not be put forward to bias ADHD management towards psychostimulants.

I wonder whether it is fair of the authors to limit the study to ADHD topics they oppose in order to (indirectly) confirm their “ADHD has a social origin” hypothesis. Indeed in the paragraph “social and public health consequences” Gonon et al state:

Unfortunately, data misrepresentation biases the scientific evidence in favor of the first position stating that ADHD is primarily caused by biological factors.

I do not think that this conclusion is justified by their findings, since similar data misrepresentation might also occur in papers investigating social causes or treatments, but this was not investigated. (mmm, a misrepresentation of the third kind??)

I also wonder why impact factor data were only given for the animal studies.

Gonon et al interpret a lot, also in their results section. For instance, they mention that 2 out of 360 articles show obvious discrepancies between results and claimed conclusions. This is not much. Then they reason:

Our observation that only two articles among 360 show obvious internal inconsistencies must be considered with caution however. First, our review of the ADHD literature was not a systematic one and was not aimed at pointing out internal inconsistencies. Second, generalization to other fields of the neuroscience literature would be unjustified

But this is what they do. See title:

” Misrepresentation of Neuroscience Data Might Give Rise to Misleading Conclusions in the Media.”

Furthermore they selectively report themselves. The Barbaresi paper [3], a large retrospective cohort,  did not find an effect on average reading score and school drop-outs, but it did find a significantly lowered grade retention, which is -after all- an important long-term school outcome.

Misrepresentation type 2 (“omission”)  I would say.*


  1. Gonon, F., Bezard, E., & Boraud, T. (2011). Misrepresentation of Neuroscience Data Might Give Rise to Misleading Conclusions in the Media: The Case of Attention Deficit Hyperactivity Disorder PLoS ONE, 6 (1) DOI: 10.1371/journal.pone.0014618
  2. Ioannidis, J. (2005). Why Most Published Research Findings Are False PLoS Medicine, 2 (8) DOI: 10.1371/journal.pmed.0020124
  3. Barbaresi, W., Katusic, S., Colligan, R., Weaver, A., & Jacobsen, S. (2007). Modifiers of Long-Term School Outcomes for Children with Attention-Deficit/Hyperactivity Disorder: Does Treatment with Stimulant Medication Make a Difference? Results from a Population-Based Study Journal of Developmental & Behavioral Pediatrics, 28 (4), 274-287 DOI: 10.1097/DBP.0b013e3180cabc28
  4. GONON, F. (2009). The dopaminergic hypothesis of attention-deficit/hyperactivity disorder needs re-examining Trends in Neurosciences, 32 (1), 2-8 DOI: 10.1016/j.tins.2008.09.010


[*A short comment in the NRC Handelsblad (Febr 5th) comes to a similar conclusion]

FDA to Regulate Genetic Testing by DTC-Companies Like 23andMe

14 06 2010

Direct-to-consumer (DTC) genetic testing refers to genetic tests that are marketed directly to consumers via television, print advertisements, or the Internet. This form of testing, which is also known as at-home genetic testing, provides access to a person’s genetic information without necessarily involving a doctor or insurance company in the process. [definition from NLM's Genetic Home Reference Handbook]

Almost two years ago I wrote about 23andMe (23andMe: 23notMe, not yet),  a well known DTC company, that offers a genetics scan (SNP-genotyping) to the public ‘for research’, ‘for education’ and ‘for fun’:

“Formally 23andMe denies there is a diagnostic purpose (in part, surely, because the company doesn’t want to antagonize the FDA, which strictly regulates diagnostic testing for disease). However, 23andme does give information on your risk profile for certain diseases, including Parkinson”

In another post, Personalized Genetics: Too Soon, Too Little?, I summarized an editorial by Ioannidis on the topic. His (and my) conclusion was that “the promise of personalized genetic prediction may be exaggerated and premature”. The most important issue is that the predictive power to individualize risks is relatively weak. Ioannidis emphasized that despite the poor evidence, direct-to-consumer genetic testing has already begun and is here to stay. He proposed several safeguards, including transparent and thorough reporting, unbiased continuous synthesis and grading of the evidence, and alerting the public that most genetic tests have not yet been shown to be clinically useful.

And now these “precautionary measures” actually seem to happen.
Last week the FDA sent 5 DTC companies, including 23andMe, a letter saying “their tests are medical devices that must receive regulatory approval before they can be marketed.” (e.g. see the NY Times article).

Alberto Gutierrez, who leads diagnostic test regulation at the FDA, wrote in the letters:

“Premarket review allows for an independent and unbiased assessment of a diagnostic test’s ability to generate test results that can reliably be used to support good health care decisions,”

These letters are part of an initiative to better explain the FDA’s actions by providing information that supports clinical medicine, biomedical innovation, and public health,” (May 19 New England Journal of Medicine commentary, source: see AMED-news)

Although it doesn’t look like the tests will be taken off the market, 23andMe does take quite a rebellious attitude: one of its directors called the FDA “appallingly paternalistic.”

Many support this view: “people have the right to know their own genetic make-up”, so to speak. Furthermore, as discussed above, 23andMe denies that its genetic scans are meant for diagnosis.

In my view the latter is largely untrue. At the very least, 23andMe suggests that a scan tells you something about your risks for certain diseases.
However, the risks are often not that straightforward. You just can’t “measure” the risk of a multifactorial disease like diabetes by “scanning” a few weakly predisposing genes. Often the results are given as relative risks, which is highly confusing. In her TED talk, 23andMe director Anne Wojcicki said that her husband Sergey Brin (Google) had a 50% chance of getting Parkinson, but his relative risk (RR, based on the LRRK2 mutation, which isn’t the most crucial gene for getting Parkinson) varies from 20% to 80%, which means that this mutation increases his absolute risk of getting Parkinson from 2-5% (normal chance) to 4-10% at the most (see this post).
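The gap between a relative and an absolute risk is easy to see with a few lines of Python. This is only a rough sketch: it assumes that the “20% to 80%” above is a relative increase over the baseline risk, it uses the 2-5% baseline quoted above, and the function name is made up for illustration.

```python
def absolute_risk(baseline_risk, relative_increase):
    """Absolute risk after applying a relative increase to a baseline risk.

    Both arguments are fractions: 0.02 means 2%, 0.8 means "80% higher".
    """
    return baseline_risk * (1 + relative_increase)


# Assumed inputs: the baseline (2-5%) and relative increase (20-80%) quoted in the text.
lowest = absolute_risk(0.02, 0.20)   # low baseline, small increase
highest = absolute_risk(0.05, 0.80)  # high baseline, large increase

print(f"absolute risk between {lowest:.1%} and {highest:.1%}")
```

Under these assumptions the absolute risk stays below 10%, which is in line with the “at the most” figure given above and a long way from a “50% chance”.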

Furthermore, as reported by Venter and colleagues in Nature (October 8, 2009): for seven diseases, 50% or less of the predictions of two companies agreed across five individuals (e.g. for one disease: 23andMe RR 4.02 and Navigenics RR 1.25). At the same time, such *fun* diagnoses could lead to serious concern in patients, or to wrong or unnecessary decisions by them (removal of ovaries, changing drug doses).

There are also concerns with regard to their good-practice standards, as 23andMe just flipped a 96-well plate of customer DNA (see Genetic Future for a balanced post), which upset a mother who noticed that her son didn’t have compatible genes. But let’s assume that proper precautions will prevent this from happening again.

There are also positive aspects: results of a preliminary study showed that people who find out they have a high genetic risk for cardiovascular disease are more likely to change their diet and exercise patterns than are those who learn they have a high risk from family history (Technology Review: Genetic Testing Can Change Behavior).

Furthermore, people buy those tests themselves and, indeed, their genes are their own.

However, I agree with Dr. Gutierrez of the FDA, who said: “We really don’t have any issues with denying people information. We just want to make sure the information they are given is correct.” (NY Times). The FDA is putting consumers first.

However, it will be very difficult to be consistent. What about total body scans in normal healthy people, detecting innocent incidentalomas? Or what about the controversial XMRV tests offered by the Whittemore Peterson Institute (WPI) directly to CFS patients? (see these posts) And one step further (although not in the diagnostic field): the ineffective CAM/homeopathic products sold over the counter?

I wouldn’t mind if these tests/products were held up to the light. Consumers should not be misled by the results of unproven or invalid tests, and where needed should be offered the guidance of a healthcare provider.

But if tests are valid and risk predictions correct, it is up to the “consumer” if he/she wants to purchase such a test.


What Five FDA Letters Mean for the Future of DTC Genetic Testing at Genomics Law Report is highly recommended, but couldn’t be accessed while writing this post.

[Added: 2010-06-14 13.10]

  • The problem accessing Genomics Law Report is resolved.
  • Also recommended: the post “FDA to regulate genetic tests as “devices”” at PHG Foundation. This post highlights that simply trying to classify the complete genomic testing service as “a device” is inadequate and will not address the difficult issues at hand. One of the biggest issues is that, while classifying DTC genetic tests as devices is certainly appropriate for assessing their analytical validity and direct safety, it does not and cannot provide an assessment of the service, and thus of the predictions and interpretations resulting from the genome scans. Although standard medical testing has traditionally been overseen by professional medical bodies, the current genomic risk profiling tests are simply not good enough to be used by health care services. (see post)

Personalized Genetics: Too Soon, Too Little?

9 02 2009

Personalized Medicine is the concept that managing a patient’s health should be based on the individual patient’s specific characteristics instead of on the standards of care. Often the term ‘personalized medicine’ is restricted to the use of information about a patient’s genotype or gene expression profile to further tailor medical care to an individual’s needs (see [1]).

This so-called Personalized Genetics is a beautiful concept. Suppose you could predict people’s risk for a certain disease and prevent it by encouraging positive lifestyle changes and/or starting a tailor-made therapy. Suppose you could predict which patients would respond to an intervention and which people should avoid certain medications. Wouldn’t that be wonderful and much better than treating everybody the same way only to benefit a few?

Research like the Human Genome Project and recent advances in genomics research have boosted progress in the discovery of susceptibility genes and fueled expectations about the opportunities of genetic profiling for personalizing medicine.

But are the high expectations justified?

For personalized genetics to be (clinically) effective it must fulfill the following requirements (based on [2]):

  1. Clear and strong association of the gene (expression) variant with the susceptibility to a disease or the outcome of a treatment
  2. Improved prediction compared to other risk factors, including traditional risk factors and clinical judgment…
  3. ..as determined in good quality studies with a sufficient number of events (if the events are rare you cannot accurately predict the outcome)
    (1-3 make up the predictive performance)
  4. The availability of effective interventions or effective alternatives
  5. Cost-effectiveness


According to an editorial in the January issue of the Annals of Internal Medicine “the promise of personalized genetic prediction may be exaggerated and premature” [2]. This is especially true for many complex diseases, where 1 variant alone is unlikely to make the difference.

The editorial is written by John Ioannidis, who is a professor at the University of Ioannina School of Medicine in Greece and has an adjunct appointment at Tufts University School of Medicine in Boston. His research focuses on meta-analysis and evidence-based medicine with special emphasis on research methodology. Ioannidis is a brilliant researcher, epidemiologist and inspiring lecturer (I once attended one of his lectures at a Cochrane Colloquium). Therefore I would urge everyone interested in personalized genetics to read his editorial.

Here I will give a summary of the editorial entitled “Personalized genetic prediction: too limited, too expensive, or too soon?” [2]. The editorial summarizes two publications in the same issue of the journal [3,4] and gives an overview of the literature.

Ioannidis stresses that recent studies into the predictive performance of common genetic traits have several shortcomings, including an often weak design with few events (*3), incomplete comparison with traditional risk factors (*2) and exaggerated prediction of effects because of the models used (*1).

To date, genotypic information does not substantially improve the prediction of future cardiovascular disease (CVD), prostate cancer and type 2 diabetes beyond traditional risk factors. In the case of age-related macular degeneration, genetic information does increase the ability to predict progression to the disease. However, the predictive power to individualize risks remains relatively weak.

Indeed, a recent paper published in PLoS Genetics [5] reinforces that a strong association between single nucleotide polymorphisms (SNPs) and a multifactorial disease like age-related macular degeneration, type 2 diabetes, CVD or Crohn disease may be very valuable for establishing etiological hypotheses, but does not guarantee effective discrimination between cases and controls and is therefore of little clinical value as yet. For further details with regard to the methods used to determine the clinical validity of genetic testing you are encouraged to read the entire (free) paper [5].
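Why a strongly associated marker can still be a poor classifier can be illustrated with a small calculation. The sketch below is not taken from [5]: the odds ratio and carrier frequency are hypothetical numbers, and the function name is invented. It treats a single variant as a yes/no test and uses the fact that the area under the ROC curve (AUC) of a binary test equals the mean of its sensitivity and specificity.

```python
def marker_discrimination(odds_ratio, carrier_freq_controls):
    """Sensitivity, specificity and AUC of one binary marker used as a test.

    odds_ratio: odds of carrying the variant in cases relative to controls.
    carrier_freq_controls: fraction of controls carrying the variant.
    """
    odds_controls = carrier_freq_controls / (1 - carrier_freq_controls)
    odds_cases = odds_ratio * odds_controls
    carrier_freq_cases = odds_cases / (1 + odds_cases)

    sensitivity = carrier_freq_cases         # cases who test "positive"
    specificity = 1 - carrier_freq_controls  # controls who test "negative"
    auc = (sensitivity + specificity) / 2    # ROC area for a single cut-off
    return sensitivity, specificity, auc


# Hypothetical marker: odds ratio of 3, carried by 10% of controls.
sens, spec, auc = marker_discrimination(odds_ratio=3.0, carrier_freq_controls=0.10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, AUC={auc:.3f}")
```

With these made-up inputs only a quarter of the cases carry the variant, and the AUC is not far above 0.5, the value expected from tossing a coin, even though an odds ratio of 3 would count as a strong association.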

Likewise, the study of Paynter et al [3] reviewed by Ioannidis, shows that genetic variation in chromosome 9p21.3 (rs10757274) was strongly and consistently associated with incident CVD in a cohort of white women, but did not improve on the discrimination or classification of predicted risk achieved with traditional risk factors, high-sensitivity C-reactive protein, and family history of premature myocardial infarction. Thus “knowing a patient’s rs10757274 genotype would not help a clinician make better preventive or therapeutic decisions to reduce future risk for heart disease”.
This also holds true for many other potentially causal single SNPs: they have a relatively small effect on their own. Complex diseases are probably the result of numerous gene-gene and gene-environment interactions, which may differ from one population to the other and only explain a small proportion of the trait variance.

Even improved prediction (*1-3) does not necessarily make a predictive test useful. The prevalence of the disease is also an important determinant: people with high-risk gene variants for a rare disease may have a significantly higher-than-average risk, but still a negligible probability of developing the disease.
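A back-of-the-envelope calculation shows how this works out for a rare disease. The numbers below are hypothetical, the function name is invented, and the sketch assumes a single variant that multiplies the risk of carriers by a fixed relative risk, so that the overall population risk is a weighted mix of the carrier and non-carrier risks.

```python
def carrier_risk(population_risk, carrier_freq, relative_risk):
    """Absolute risk for carriers of a risk variant.

    population_risk: overall risk of the disease in the population.
    carrier_freq: fraction of the population carrying the variant.
    relative_risk: risk in carriers divided by risk in non-carriers.
    """
    # population_risk = (1 - f) * r0 + f * RR * r0, solved for r0
    risk_noncarriers = population_risk / (1 - carrier_freq + carrier_freq * relative_risk)
    return relative_risk * risk_noncarriers


# Hypothetical rare disease: 1 in 10,000 overall; 5% carry a variant with RR = 5.
risk = carrier_risk(population_risk=1 / 10_000, carrier_freq=0.05, relative_risk=5.0)
print(f"risk for carriers: {risk:.3%}")
```

In this made-up scenario carriers have a five times higher risk than non-carriers, yet their absolute risk is still only about 0.04%.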

Clinical utility of the genetic prediction also depends on the availability of effective interventions (*4) and the cost-effectiveness (*5). Another paper in the same Ann. Intern. Med. issue [4] shows that although CYP2C9 and VKORC1 strongly predict the chance of bleeding as a side effect of warfarin treatment, genotype-guided dosing appeared not to be cost-effective for patients requiring initiation of warfarin therapy. Piquant detail: the FDA has approved this kind of genetic testing, although there is no good evidence that such genotyping does in fact reduce the risk of hemorrhage in everyday clinical practice. Such knowledge would require large, well-designed RCTs.

Ioannidis emphasizes that despite the poor evidence, genetic testing and commercial use (direct to consumer genetic testing) have already begun and are here to stay. He proposes several safeguards, including transparent and thorough reporting, unbiased continuous synthesis and grading of the evidence and alerting the public that most genetic tests have not yet been shown to be clinically useful. He concludes the editorial as follows:

Helping patients and physicians to decide when to do genetic tests will be a tough task because neither knows much about the rapidly emerging field of genomics. We need to learn more about what our genome can tell us and, more important, what it cannot tell us.

* refers to list, points 1-5


1. http://en.wikipedia.org/wiki/Personalized_medicine
2. Ioannidis JP. Personalized genetic prediction: too limited, too expensive, or too soon? Ann Intern Med. 2009;150(2):139-141. PMID: 19153414.
3. Paynter NP, Chasman DI, Buring JE, Shiffman D, Cook NR, Ridker PM. Cardiovascular disease risk prediction with and without knowledge of genetic variation at chromosome 9p21.3. Ann Intern Med. 2009 Jan 20;150(2):65-72.
4. Eckman MH, Rosand J, Greenberg SM, Gage BF. Cost-effectiveness of using pharmacogenetic information in warfarin dosing for patients with nonvalvular atrial fibrillation. Ann Intern Med. 2009 Jan 20;150(2):73-83.
5. Jakobsdottir J, Gorin MB, Conley YP, Ferrell RE, Weeks DE. Interpretation of genetic association studies: markers with replicated highly significant odds ratios may be poor classifiers. PLoS Genet. 2009 Feb;5(2):e1000337. Epub 2009 Feb 6 (free full text).
6. Janssens AC, van Duijn CM. Genome-based prediction of common diseases: advances and prospects. Hum Mol Genet. 2008 Oct 15;17(R2):R166-73. Review.

You might also want to read:

23andme 23notme not yet (post 2008/09/29/)

