Why Publishing in the NEJM is not the Best Guarantee that Something is True: a Response to Katan

27 10 2012

ResearchBlogging.orgIn a previous post [1] I reviewed a recent  Dutch study published in the New England Journal of Medicine (NEJM [2] about the effects of sugary drinks on the body mass index of school children.

The study got widely covered by the media. The NRC, for which the main author Martijn Katan works as a science columnist,  columnist, spent  two full (!) pages on the topic -with no single critical comment-[3].
As if this wasn’t enough, the latest column of Katan again dealt with his article (text freely available at mkatan.nl)[4].

I found Katan’s column “Col hors Catégorie” [4] quite arrogant, especially because he tried to belittle a (as he called it) “know-it-all” journalist who criticized his work  in a rivaling newspaper. This wasn’t fair, because the journalist had raised important points [5, 1] about the work.

The piece focussed on the long road of getting papers published in a top journal like the NEJM.
Katan considers the NEJM as the “Tour de France” among  medical journals: it is a top achievement to publish in this paper.

Katan also states that “publishing in the NEJM is the best guarantee something is true”.

I think the latter statement is wrong for a number of reasons.*

  1. First, most published findings are false [6]. Thus journals can never “guarantee”  that published research is true.
    Factors that  make it less likely that research findings are true include a small effect size,  a greater number and lesser preselection of tested relationships, selective outcome reporting, the “hotness” of the field (all applying more or less to Katan’s study, he also changed the primary outcomes during the trial[7]), a small study, a great financial interest and a low pre-study probability (not applicable) .
  2. It is true that NEJM has a very high impact factor. This is  a measure for how often a paper in that journal is cited by others. Of course researchers want to get their paper published in a high impact journal. But journals with high impact factors often go for trendy topics and positive results. In other words it is far more difficult to publish a good quality study with negative results, and certainly in an English high impact journal. This is called publication bias (and language bias) [8]. Positive studies will also be more frequently cited (citation bias) and will more likely be published more than once (multiple publication bias) (indeed, Katan et al already published about the trial [9], and have not presented all their data yet [1,7]). All forms of bias are a distortion of the “truth”.
    (This is the reason why the search for a (Cochrane) systematic review must be very sensitive [8] and not restricted to core clinical journals, but even include non-published studies: for these studies might be “true”, but have failed to get published).
  3. Indeed, the group of Ioannidis  just published a large-scale statistical analysis[10] showing that medical studies revealing “very large effects” seldom stand up when other researchers try to replicate them. Often studies with large effects measure laboratory and/or surrogate markers (like BMI) instead of really clinically relevant outcomes (diabetes, cardiovascular complications, death)
  4. More specifically, the NEJM does regularly publish studies about pseudoscience or bogus treatments. See for instance this blog post [11] of ScienceBased Medicine on Acupuncture Pseudoscience in the New England Journal of Medicine (which by the way is just a review). A publication in the NEJM doesn’t guarantee it isn’t rubbish.
  5. Importantly, the NEJM has the highest proportion of trials (RCTs) with sole industry support (35% compared to 7% in the BMJ) [12] . On several occasions I have discussed these conflicts of interests and their impact on the outcome of studies ([13, 14; see also [15,16] In their study, Gøtzsche and his colleagues from the Nordic Cochrane Centre [12] also showed that industry-supported trials were more frequently cited than trials with other types of support, and that omitting them from the impact factor calculation decreased journal impact factors. The impact factor decrease was even 15% for NEJM (versus 1% for BMJ in 2007)! For the journals who provided data, income from the sales of reprints contributed to 3% and 41% of the total income for BMJ and The Lancet.
    A recent study, co-authored by Ben Goldacre (MD & science writer) [17] confirms that  funding by the pharmaceutical industry is associated with high numbers of reprint ordersAgain only the BMJ and the Lancet provided all necessary data.
  6. Finally and most relevant to the topic is a study [18], also discussed at Retractionwatch[19], showing that  articles in journals with higher impact factors are more likely to be retracted and surprise surprise, the NEJM clearly stands on top. Although other reasons like higher readership and scrutiny may also play a role [20], it conflicts with Katan’s idea that  “publishing in the NEJM is the best guarantee something is true”.

I wasn’t aware of the latter study and would like to thank drVes and Ivan Oranski for responding to my crowdsourcing at Twitter.


  1. Sugary Drinks as the Culprit in Childhood Obesity? a RCT among Primary School Children (laikaspoetnik.wordpress.com)
  2. de Ruyter JC, Olthof MR, Seidell JC, & Katan MB (2012). A trial of sugar-free or sugar-sweetened beverages and body weight in children. The New England journal of medicine, 367 (15), 1397-406 PMID: 22998340
  3. NRC Wim Köhler Eén kilo lichter.NRC | Zaterdag 22-09-2012 (http://archief.nrc.nl/)
  4. Martijn Katan. Col hors Catégorie [Dutch], (published in de NRC,  (20 oktober)(www.mkatan.nl)
  5. Hans van Maanen. Suiker uit fris, De Volkskrant, 29 september 2012 (freely accessible at http://www.vanmaanen.org/)
  6. Ioannidis, J. (2005). Why Most Published Research Findings Are False PLoS Medicine, 2 (8) DOI: 10.1371/journal.pmed.0020124
  7. Changes to the protocol http://clinicaltrials.gov/archive/NCT00893529/2011_02_24/changes
  8. Publication Bias. The Cochrane Collaboration open learning material (www.cochrane-net.org)
  9. de Ruyter JC, Olthof MR, Kuijper LD, & Katan MB (2012). Effect of sugar-sweetened beverages on body weight in children: design and baseline characteristics of the Double-blind, Randomized INtervention study in Kids. Contemporary clinical trials, 33 (1), 247-57 PMID: 22056980
  10. Pereira, T., Horwitz, R.I., & Ioannidis, J.P.A. (2012). Empirical Evaluation of Very Large Treatment Effects of Medical InterventionsEvaluation of Very Large Treatment Effects JAMA: The Journal of the American Medical Association, 308 (16) DOI: 10.1001/jama.2012.13444
  11. Acupuncture Pseudoscience in the New England Journal of Medicine (sciencebasedmedicine.org)
  12. Lundh, A., Barbateskovic, M., Hróbjartsson, A., & Gøtzsche, P. (2010). Conflicts of Interest at Medical Journals: The Influence of Industry-Supported Randomised Trials on Journal Impact Factors and Revenue – Cohort Study PLoS Medicine, 7 (10) DOI: 10.1371/journal.pmed.1000354
  13. One Third of the Clinical Cancer Studies Report Conflict of Interest (laikaspoetnik.wordpress.com)
  14. Merck’s Ghostwriters, Haunted Papers and Fake Elsevier Journals (laikaspoetnik.wordpress.com)
  15. Lexchin, J. (2003). Pharmaceutical industry sponsorship and research outcome and quality: systematic review BMJ, 326 (7400), 1167-1170 DOI: 10.1136/bmj.326.7400.1167
  16. Smith R (2005). Medical journals are an extension of the marketing arm of pharmaceutical companies. PLoS medicine, 2 (5) PMID: 15916457 (free full text at PLOS)
  17. Handel, A., Patel, S., Pakpoor, J., Ebers, G., Goldacre, B., & Ramagopalan, S. (2012). High reprint orders in medical journals and pharmaceutical industry funding: case-control study BMJ, 344 (jun28 1) DOI: 10.1136/bmj.e4212
  18. Fang, F., & Casadevall, A. (2011). Retracted Science and the Retraction Index Infection and Immunity, 79 (10), 3855-3859 DOI: 10.1128/IAI.05661-11
  19. Is it time for a Retraction Index? (retractionwatch.wordpress.com)
  20. Agrawal A, & Sharma A (2012). Likelihood of false-positive results in high-impact journals publishing groundbreaking research. Infection and immunity, 80 (3) PMID: 22338040


* Addendum: my (unpublished) letter to the NRC

Tour de France.
Nadat het NRC eerder 2 pagina’ s de loftrompet over Katan’s nieuwe studie had afgestoken, vond Katan het nodig om dit in zijn eigen column dunnetjes over te doen. Verwijzen naar je eigen werk mag, ook in een column, maar dan moeten wij daar als lezer wel wijzer van worden. Wat is nu de boodschap van dit stuk “Col hors Catégorie“? Het beschrijft vooral de lange weg om een wetenschappelijke studie gepubliceerd te krijgen in een toptijdschrift, in dit geval de New England Journal of Medicine (NEJM), “de Tour de France onder de medische tijdschriften”. Het stuk eindigt met een tackle naar een journalist “die dacht dat hij het beter wist”. Maar ach, wat geeft dat als de hele wereld staat te jubelen? Erg onsportief, omdat die journalist (van Maanen, Volkskrant) wel degelijk op een aantal punten scoorde. Ook op Katan’s kernpunt dat een NEJM-publicatie “de beste garantie is dat iets waar is” valt veel af te dingen. De NEJM heeft inderdaad een hoge impactfactor, een maat voor hoe vaak artikelen geciteerd worden. De NEJM heeft echter ook de hoogste ‘artikelterugtrekkings’ index. Tevens heeft de NEJM het hoogste percentage door de industrie gesponsorde klinische trials, die de totale impactfactor opkrikken. Daarnaast gaan toptijdschriften vooral voor “positieve resultaten” en “trendy onderwerpen”, wat publicatiebias in de hand werkt. Als we de vergelijking met de Tour de France doortrekken: het volbrengen van deze prestigieuze wedstrijd garandeert nog niet dat deelnemers geen verboden middelen gebruikt hebben. Ondanks de strenge dopingcontroles.

FUTON Bias. Or Why Limiting to Free Full Text Might not Always be a Good Idea.

8 09 2011

ResearchBlogging.orgA few weeks ago I was discussing possible relevant papers for the Twitter Journal Club  (Hashtag #TwitJC), a succesful initiative on Twitter, that I have discussed previously here and here [7,8].

I proposed an article, that appeared behind a paywall. Annemarie Cunningham (@amcunningham) immediately ran the idea down, stressing that open-access (OA) is a pre-requisite for the TwitJC journal club.

One of the TwitJC organizers, Fi Douglas (@fidouglas on Twitter), argued that using paid-for journals would defeat the objective that  #TwitJC is open to everyone. I can imagine that fee-based articles could set a too high threshold for many doctors. In addition, I sympathize with promoting OA.

However, I disagree with Annemarie that an OA (or rather free) paper is a prerequisite if you really want to talk about what might impact on practice. On the contrary, limiting to free full text (FFT) papers in PubMed might lead to bias: picking “low hanging fruit of convenience” might mean that the paper isn’t representative and/or doesn’t reflect the current best evidence.

But is there evidence for my theory that selecting FFT papers might lead to bias?

Lets first look at the extent of the problem. Which percentage of papers do we miss by limiting for free-access papers?

survey in PLOS by Björk et al [1] found that one in five peer reviewed research papers published in 2008 were freely available on the internet. Overall 8,5% of the articles published in 2008 (and 13,9 % in Medicine) were freely available at the publishers’ sites (gold OA).  For an additional 11,9% free manuscript versions could be found via the green route:  i.e. copies in repositories and web sites (7,8% in Medicine).
As a commenter rightly stated, the lag time is also important, as we would like to have immediate access to recently published research, yet some publishers (37%) impose an access-embargo of 6-12 months or more. (these papers were largely missed as the 2008 OA status was assessed late 2009).

PLOS 2009

The strength of the paper is that it measures  OA prevalence on an article basis, not on calculating the share of journals which are OA: an OA journal generally contains a lower number of articles.
The authors randomly sampled from 1.2 million articles using the advanced search facility of Scopus. They measured what share of OA copies the average researcher would find using Google.

Another paper published in  J Med Libr Assoc (2009) [2], using similar methods as the PLOS survey examined the state of open access (OA) specifically in the biomedical field. Because of its broad coverage and popularity in the biomedical field, PubMed was chosen to collect their target sample of 4,667 articles. Matsubayashi et al used four different databases and search engines to identify full text copies. The authors reported an OA percentage of 26,3 for peer reviewed articles (70% of all articles), which is comparable to the results of Björk et al. More than 70% of the OA articles were provided through journal websites. The percentages of green OA articles from the websites of authors or in institutional repositories was quite low (5.9% and 4.8%, respectively).

In their discussion of the findings of Matsubayashi et al, Björk et al. [1] quickly assessed the OA status in PubMed by using the new “link to Free Full Text” search facility. First they searched for all “journal articles” published in 2005 and then repeated this with the further restrictions of “link to FFT”. The PubMed OA percentages obtained this way were 23,1 for 2005 and 23,3 for 2008.

This proportion of biomedical OA papers is gradually increasing. A chart in Nature’s News Blog [9] shows that the proportion of papers indexed on the PubMed repository each year has increased from 23% in 2005 to above 28% in 2009.
(Methods are not shown, though. The 2008 data are higher than those of Björk et al, who noticed little difference with 2005. The Data for this chart, however, are from David Lipman, NCBI director and driving force behind the digital OA archive PubMed Central).
Again, because of the embargo periods, not all literature is immediately available at the time that it is published.

In summary, we would miss about 70% of biomedical papers by limiting for FFT papers. However, we would miss an even larger proportion of papers if we limit ourselves to recently published ones.

Of course, the key question is whether ignoring relevant studies not available in full text really matters.

Reinhard Wentz of the Imperial College Library and Information Service already argued in a visionary 2002 Lancet letter[3] that the availability of full-text articles on the internet might have created a new form of bias: FUTON bias (Full Text On the Net bias).

Wentz reasoned that FUTON bias will not affect researchers who are used to comprehensive searches of published medical studies, but that it will affect staff and students with limited experience in doing searches and that it might have the same effect in daily clinical practice as publication bias or language bias when doing systematic reviews of published studies.

Wentz also hypothesized that FUTON bias (together with no abstract available (NAA) bias) will affect the visibility and the impact factor of OA journals. He makes a reasonable cause that the NAA-bias will affect publications on new, peripheral, and under-discussion subjects more than established topics covered in substantive reports.

The study of Murali et al [4] published in Mayo Proceedings 2004 confirms that the availability of journals on MEDLINE as FUTON or NAA affects their impact factor.

Of the 324 journals screened by Murali et al. 38.3% were FUTON, 19.1%  NAA and 42.6% had abstracts only. The mean impact factor was 3.24 (±0.32), 1.64 (±0.30), and 0.14 (±0.45), respectively! The authors confirmed this finding by showing a difference in impact factors for journals available in both the pre and the post-Internet era (n=159).

Murali et al informally questioned many physicians and residents at multiple national and international meetings in 2003. These doctors uniformly admitted relying on FUTON articles on the Web to answer a sizable proportion of their questions. A study by Carney et al (2004) [5] showed  that 98% of the US primary care physicians used the Internet as a resource for clinical information at least once a week and mostly used FUTON articles to aid decisions about patient care or patient education and medical student or resident instruction.

Murali et al therefore conclude that failure to consider FUTON bias may not only affect a journal’s impact factor, but could also limit consideration of medical literature by ignoring relevant for-fee articles and thereby influence medical education akin to publication or language bias.

This proposed effect of the FFT limit on citation retrieval for clinical questions, was examined in a  more recent study (2008), published in J Med Libr Assoc [6].

Across all 4 questions based on a research agenda for physical therapy, the FFT limit reduced the number of citations to 11.1% of the total number of citations retrieved without the FFT limit in PubMed.

Even more important, high-quality evidence such as systematic reviews and randomized controlled trials were missed when the FFT limit was used.

For example, when searching without the FFT limit, 10 systematic reviews of RCTs were retrieved against one when the FFT limit was used. Likewise when searching without the FFT limit, 28 RCTs were retrieved and only one was retrieved when the FFT limit was used.

The proportion of missed studies (appr. 90%) is higher than in the studies mentioned above. Possibly this is because real searches have been tested and that only relevant clinical studies  have been considered.

The authors rightly conclude that consistently missing high-quality evidence when searching clinical questions is problematic because it undermines the process of Evicence Based Practice. Krieger et al finally conclude:

“Librarians can educate health care consumers, scientists, and clinicians about the effects that the FFT limit may have on their information retrieval and the ways it ultimately may affect their health care and clinical decision making.”

It is the hope of this librarian that she did a little education in this respect and clarified the point that limiting to free full text might not always be a good idea. Especially if the aim is to critically appraise a topic, to educate or to discuss current best medical practice.


  1. Björk, B., Welling, P., Laakso, M., Majlender, P., Hedlund, T., & Guðnason, G. (2010). Open Access to the Scientific Journal Literature: Situation 2009 PLoS ONE, 5 (6) DOI: 10.1371/journal.pone.0011273
  2. Matsubayashi, M., Kurata, K., Sakai, Y., Morioka, T., Kato, S., Mine, S., & Ueda, S. (2009). Status of open access in the biomedical field in 2005 Journal of the Medical Library Association : JMLA, 97 (1), 4-11 DOI: 10.3163/1536-5050.97.1.002
  3. WENTZ, R. (2002). Visibility of research: FUTON bias The Lancet, 360 (9341), 1256-1256 DOI: 10.1016/S0140-6736(02)11264-5
  4. Murali NS, Murali HR, Auethavekiat P, Erwin PJ, Mandrekar JN, Manek NJ, & Ghosh AK (2004). Impact of FUTON and NAA bias on visibility of research. Mayo Clinic proceedings. Mayo Clinic, 79 (8), 1001-6 PMID: 15301326
  5. Carney PA, Poor DA, Schifferdecker KE, Gephart DS, Brooks WB, & Nierenberg DW (2004). Computer use among community-based primary care physician preceptors. Academic medicine : journal of the Association of American Medical Colleges, 79 (6), 580-90 PMID: 15165980
  6. Krieger, M., Richter, R., & Austin, T. (2008). An exploratory analysis of PubMed’s free full-text limit on citation retrieval for clinical questions Journal of the Medical Library Association : JMLA, 96 (4), 351-355 DOI: 10.3163/1536-5050.96.4.010
  7. The #TwitJC Twitter Journal Club, a new Initiative on Twitter. Some Initial Thoughts. (laikaspoetnik.wordpress.com)
  8. The Second #TwitJC Twitter Journal Club (laikaspoetnik.wordpress.com)
  9. How many research papers are freely available? (blogs.nature.com)


Get every new post delivered to your Inbox.

Join 607 other followers