BAD Science or BAD Science Journalism? – A Response to Daniel Lakens

10 02 2013

Two weeks ago there was a heated debate among Dutch tweeps on “bad science, bad science journalism and bad science communication“. The debate was started and fueled by several Dutch blog posts on the topic [1,4-6].

A controversial post, with both fierce proponents and fierce opponents, was the one by Daniel Lakens [1], an assistant professor in Applied Cognitive Psychology.

I was among the opponents. Not because I dislike a fresh point of view, but because Daniel’s reasoning is flawed and because he continually compares apples and oranges.

Since Twitter debates lack depth and structure, and since I cannot comment on his Google Sites blog, I pursue the discussion here.

The title of Daniel’s post is (freely translated, like the rest of his post):

“Is this what one calls good science?”

In his post he criticizes a Dutch science journalist, Hans van Maanen, and specifically his recent column [2], where Hans discusses a paper published in Pediatrics [3].

This longitudinal study tested the Music Marker theory among 309 Dutch kids. The researchers gathered information about the kids’ favorite types of music and tracked incidents of “minor delinquency”, such as shoplifting or vandalism, from the time they were 12 until they reached age 16 [4]. The researchers conclude that liking music that goes against the mainstream (rock, heavy metal, gothic, punk, African American music, and electronic dance music) at age 12 is a strong predictor of future minor delinquency at 16, in contrast to liking chart pop, classical music, or jazz.

The university press office sent out a press release [5], which was picked up by news media [4,6], and one of the Dutch authors of the study, Loes Keijsers, tweeted enthusiastically: “Want to know whether a 16-year-old will be delinquent? Then look at his music taste at age 12!”

According to Hans, Loes could easily have broadcast (more) balanced tweets, like “Music preference doesn’t predict shoplifting” or “12-year-olds who like Bach keep quiet about shoplifting when 16.” But even then, Hans argues, the tweets wouldn’t have been scientifically underpinned either.

In column style Hans explains why he thinks the study isn’t methodologically strong: no absolute numbers are given, and although 7 out of 11 (!) music styles are positively associated with delinquency, the correlations are not impressive. The strongest predictor (gothic music preference) explains no more than 9% of the variance in delinquent behaviour, which can include anything from shoplifting, vandalism, fighting and graffiti spraying to switching price tags. Furthermore, the risks of later “delinquent” behavior are small: on a scale from 1 (never) to 4 (4 times or more), the mean score was 1.12. Hans also wonders whether it is a good idea at all to monitor kids with a certain music taste.
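To see where that 9% comes from: the proportion of variance explained is simply the squared correlation coefficient. As a worked example (assuming an underlying correlation of about r = 0.30, which is what a 9% figure implies; the paper itself reports the exact coefficients):

$$ r^2 = (0.30)^2 = 0.09 \approx 9\% $$

In other words, even the strongest “predictor” leaves more than 90% of the variance in delinquent behaviour unexplained.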

Thus Hans concludes “this study isn’t good science”. Daniel, however, concludes that Hans’ writing is not good science journalism.

First Daniel recalls that he and other PhD students took a course on how to peer review scientific papers. On the basis of their peer review of a (published) article, 90% of the students decided to reject it. The two main lessons Daniel learned were:

  • It is easy to criticize a scientific paper and grind it down. No single contribution to science (no single article) is perfect.
  • New scientific insights, although imperfect, are worth sharing, because they help science evolve. *¹

According to Daniel, science journalists often make the same mistake as the peer-reviewing PhD students: criticizing individual studies without a “meta-view” of science.

Peer review and journalism, however, are different things (apples and oranges, if you like).

Peer review (with all its imperfections) serves to filter, check and improve the quality of individual scientific papers, (usually) before they are published [10]. My papers that went through peer review were generally accepted. Of course there were the negative reviewers, often the ignorant ones, and the naggers, but many reviewers offered critique that helped improve my paper, sometimes substantially. As a peer reviewer myself, I simply try to separate the wheat from the chaff and to enhance the quality of the papers that pass.

Science journalism also has a filter function: it filters already peer-reviewed scientific papers* for its readership, “the public”, by selecting novel, relevant science and translating the scientific, jargon-laden language into language readers can understand and appreciate. Of course science journalists should also put the publication into perspective (call it “meta”).

Surely the PhD students’ finger exercise resembles the normal peer review process about as much as peer review resembles science journalism.

I understand that pure nitpicking seldom serves a goal, but nitpicking rarely occurs in science journalism. The opposite, however, is commonplace.

Daniel disapproves of Hans van Maanen’s criticism, because Hans isn’t “meta” enough. Daniel: “Arguing whether an effect size is small or mediocre is nonsense, because no individual study gives a good estimate of the effect size. You need to do more research and combine the results in a meta-analysis.”

Apples and oranges again.

Being “meta” has little to do with meta-analysis. Being meta is … uh … pretty meta. You could think of it as seeing beyond (meta) the findings of one single study*.

A meta-analysis, however, is a statistical technique for combining the findings of independent but comparable (homogeneous) studies in order to estimate the true effect size more powerfully (pretty exact). This is an important but difficult methodological task for a scientist, not for a journalist. If a meta-analysis on the topic exists, journalists should take it into account, of course (and so should the researchers). If not, they should put the single study in a broader perspective (what does the study add to existing knowledge?) and show why this single study is or is not well done.
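For completeness: in its textbook fixed-effect form, a meta-analysis pools the effect estimates of the individual studies as an inverse-variance weighted average, so that the most precise studies weigh the most:

$$ \hat{\theta} = \frac{\sum_i w_i \hat{\theta}_i}{\sum_i w_i}, \qquad w_i = \frac{1}{v_i} $$

where $\hat{\theta}_i$ is the effect estimate of study $i$ and $v_i$ its variance. This is standard statistics, and clearly a job for researchers, not for journalists.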

Daniel takes this further by stating that “one study is no study” and that journalists who simply echo the press release of a study and journalists who merely criticize a single publication (like Hans) are clueless about science.

Apples and oranges! How can one lump science communicators (“media releases”), echoing journalists (“the media”) and critical journalists together?

I see more value in a critical analysis than in blind rejoicing over hot air, as long as the criticism guides the reader in appreciating the study.

And if there is just one single novel study that seems important enough to get media attention, shouldn’t we judge the research on its own merits?

Then Daniel asks himself: “If I do criticize those journalists, shouldn’t I also criticize those scientists who published just a single study and wrote a press release about it?”

His conclusion? “No”.

Daniel explains: science never provides absolute certainty; at most the evidence is strong enough to state what is likely true. This can only be achieved by a lot of research by different investigators.

Therefore, he argues, you should believe in your ideas and encourage other scientists to pursue your findings. It doesn’t help when you say that music preference doesn’t predict shoplifting; it does help when you use the media to draw attention to your research. Many researchers are now aware of the “Music Marker Theory”, so the press release had its desired effect. By expressing a firm belief in their conclusions, the authors encourage other scientists to spend their scarce time on this topic. These scientists will try to repeat and falsify the study, an essential step in cumulative science. At a time when science is under pressure, scientists shouldn’t stop writing enthusiastic press releases or tweets.

The latter paragraph is sheer nonsense!

Critical analysis of one study by a journalist isn’t what undermines public confidence in science. Rather, it’s the media circus that blows the implications of scientific findings out of proportion.

As exemplified by the hilarious PhD comic below, research results are propagated by PR (science communication), picked up by the media, broadcast, and spread via the internet. At the end of the cycle, conclusions are reached that are not backed up by (sufficient) evidence.

PhD Comics – The News Cycle

Daniel is right about some things. First, one study is indeed no study, in the sense that concepts are continuously tested and corrected: falsification is a central property of science (Popper). He is also right that science doesn’t offer absolute certainty (an aspect often not understood by the public). And yes, researchers should believe in their findings and encourage other scientists to check and repeat their experiments.

Though not primarily via the media, but via the normal scientific route. Good scientists keep track of new findings in their field anyway. And suppose only findings trumpeted in the media were pursued by other scientists?


And authors shouldn’t make overstatements; they shouldn’t raise expectations to a level that cannot be met. The Dutch study only shows weak associations. It simply isn’t true that the study allows us to “predict” at an individual level whether a 12-year-old will “act out” at 16.

This doesn’t help laypeople to understand the findings and to appreciate science.

The idea that the media should just serve to spotlight a paper seems objectionable to me.

Going back to the meta-level: what about the role of science communicators, media, science journalists and researchers?

According to the journalist Maarten Keulemans, we should just get rid of all science communicators as a layer between scientists and journalists [7]. But Michel van Baal [9] and Roy Meijer [8] have a point when they say that journalists do a lot of PR-ing too and should do better than rehash news releases.*²

Now what about Daniel’s criticism of van Maanen? In my opinion, van Maanen is one of those rare critical journalists who serve as an antidote against uncritical media diarrhea (see the comic above). He is comparable to another lone voice in the media, Ben Goldacre. It didn’t surprise me that Daniel didn’t approve of him (and his book Bad Science) either [11].

Does this mean that I find Hans van Maanen a terrific science journalist? No, not really, although I often agree with him (see for instance this post [12]). He is one of those rare journalists with real expertise in research methodology. However, his columns don’t seem to be written for a large audience: they seem too complex for most lay people. One thing I learned during a science journalism course is that one should explain all jargon to one’s audience.

Personally, I find this critical Dutch blog post [13] about the Music Marker Theory far more balanced. After a clear description of the study, Linda Duits concludes that the results are pretty obvious, but that the mini-hype surrounding this research was caused by the positive tone of the press release. She stresses that prediction is not predetermination and that the musical genres themselves are not important: hip-hop doesn’t lead to criminal activity, nor metal to vandalism.

And this critical piece in Jezebel [14] reaches far more people by talking in plain, colourful language, hilarious at times.

It also has a swell title: “Delinquents Have the Best Taste in Music”. Now that is an apt conclusion!

———————-

*¹ Since Daniel refers neither to open (trial) data access nor to the fact that peer review may …, I ignore these aspects for the sake of the discussion.

*² Coincidence? Keulemans covered the music marker study quite uncritically (positively).

Photo Credits

http://www.phdcomics.com/comics/archive.php?comicid=1174

References

  1. Daniel Lakens: Is dit nou goede Wetenschap? - Jan 24, 2013 (sites.google.com/site/lakens2/blog)
  2. Hans van Maanen: De smaak van boefjes in de dop, De Volkskrant, Jan 12, 2013 (vanmaanen.org/hans/columns/)
  3. ter Bogt, T., Keijsers, L., & Meeus, W. (2013). Early Adolescent Music Preferences and Minor Delinquency PEDIATRICS DOI: 10.1542/peds.2012-0708
  4. Lindsay Abrams: Kids Who Like ‘Unconventional Music’ More Likely to Become Delinquent, the Atlantic, Jan 18, 2013
  5. Muziekvoorkeur belangrijke voorspeller voor kleine criminaliteit. Jan 8, 2013 (pers.uu.nl)
  6. Maarten Keulemans: Muziek is goede graadmeter voor puberaal wangedrag – De Volkskrant, Jan 12, 2013 (volkskrant.nl)
  7. Maarten Keulemans: Als we nou eens alle wetenschapscommunicatie afschaffen? – Jan 23, 2013 (denieuwereporter.nl)
  8. Roy Meijer: Wetenschapscommunicatie afschaffen, en dan? – Jan 24, 2013 (denieuwereporter.nl)
  9. Michel van Baal: Wetenschapsjournalisten doen ook aan PR – Jan 25, 2013 (denieuwereporter.nl)
  10. What peer review means for science (guardian.co.uk)
  11. Daniel Lakens: Waarom raadde Maarten Keulemans me Bad Science van Goldacre aan? – Oct 25, 2012
  12. Why Publishing in the NEJM is not the Best Guarantee that Something is True: a Response to Katan - Sept 27, 2012 (laikaspoetnik.wordpress.com)
  13. Linda Duits: Debunk: worden pubers crimineel van muziek? (dieponderzoek.nl)
  14. Lindy West: Science: “Delinquents Have the Best Taste in Music” (jezebel.com)




FUTON Bias. Or Why Limiting to Free Full Text Might not Always be a Good Idea.

8 09 2011

A few weeks ago I was discussing possible relevant papers for the Twitter Journal Club (hashtag #TwitJC), a successful initiative on Twitter that I have discussed previously here and here [7,8].

I proposed an article that appeared behind a paywall. Annemarie Cunningham (@amcunningham) immediately shot the idea down, stressing that open access (OA) is a prerequisite for the TwitJC journal club.

One of the TwitJC organizers, Fi Douglas (@fidouglas on Twitter), argued that using paid-for journals would defeat the objective that #TwitJC is open to everyone. I can imagine that fee-based articles could set too high a threshold for many doctors. In addition, I sympathize with promoting OA.

However, I disagree with Annemarie that an OA (or rather free) paper is a prerequisite if you really want to talk about what might impact practice. On the contrary, limiting yourself to free full text (FFT) papers in PubMed might lead to bias: picking the “low-hanging fruit of convenience” might mean that the paper isn’t representative and/or doesn’t reflect the current best evidence.

But is there evidence for my theory that selecting FFT papers might lead to bias?

Let’s first look at the extent of the problem. What percentage of papers do we miss by limiting ourselves to free-access papers?

A survey in PLoS ONE by Björk et al. [1] found that one in five peer-reviewed research papers published in 2008 were freely available on the internet. Overall, 8.5% of the articles published in 2008 (and 13.9% in medicine) were freely available at the publishers’ sites (gold OA). For an additional 11.9%, free manuscript versions could be found via the green route, i.e. copies in repositories and on web sites (7.8% in medicine). Together that adds up to roughly 20%, hence “one in five”.
As a commenter rightly stated, the lag time is also important, as we would like immediate access to recently published research, yet some publishers (37%) impose an access embargo of 6-12 months or more. (These papers were largely missed, as the 2008 OA status was assessed in late 2009.)

PLOS 2009

The strength of the paper is that it measures OA prevalence on a per-article basis rather than by calculating the share of journals that are OA: an OA journal generally contains fewer articles.
The authors randomly sampled from 1.2 million articles using the advanced search facility of Scopus, and measured what share of OA copies the average researcher would find using Google.

Another paper, published in J Med Libr Assoc (2009) [2] and using methods similar to those of the PLoS survey, examined the state of open access specifically in the biomedical field. Because of its broad coverage and popularity in the biomedical field, PubMed was chosen to collect the target sample of 4,667 articles. Matsubayashi et al. used four different databases and search engines to identify full-text copies. The authors reported an OA percentage of 26.3 for peer-reviewed articles (70% of all articles), which is comparable to the results of Björk et al. More than 70% of the OA articles were provided through journal websites. The percentages of green OA articles on the websites of authors or in institutional repositories were quite low (5.9% and 4.8%, respectively).

In their discussion of the findings of Matsubayashi et al., Björk et al. [1] quickly assessed the OA status in PubMed by using the (then new) “link to Free Full Text” search facility. First they searched for all “journal articles” published in 2005 and then repeated this with the further restriction of “link to FFT”. The PubMed OA percentages obtained this way were 23.1 for 2005 and 23.3 for 2008.
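Anyone can redo this kind of count today. Below is a minimal sketch (my own illustration, not the method of Björk et al.) in Python, using NCBI’s public E-utilities; I’m assuming the current free full text[sb] subset filter as the equivalent of the old “link to Free Full Text” limit:

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_count(term: str) -> int:
    """Return the number of PubMed records matching a search term."""
    url = f"{EUTILS}?db=pubmed&term={urllib.parse.quote(term)}"
    with urllib.request.urlopen(url) as response:
        tree = ET.parse(response)
    return int(tree.findtext("Count"))  # esearch reports the hit count in <Count>

# All journal articles from 2008 versus the free-full-text subset.
base = "journal article[pt] AND 2008[dp]"
total = pubmed_count(base)
free = pubmed_count(base + " AND free full text[sb]")
print(f"{free}/{total} = {100 * free / total:.1f}% free full text")
```

(The exact percentages will of course differ from those of Björk et al., because records keep being added and embargoes keep expiring.)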

This proportion of freely available biomedical papers is gradually increasing. A chart in Nature’s News Blog [9] shows that the proportion of free papers indexed in PubMed each year has increased from 23% in 2005 to above 28% in 2009.
(Methods are not shown, though. The 2008 figure is higher than that of Björk et al., who noticed little difference from 2005. The data for this chart, however, come from David Lipman, NCBI director and driving force behind the digital OA archive PubMed Central.)
Again, because of the embargo periods, not all literature is immediately available at the time that it is published.

In summary, we would miss about 70% of biomedical papers by limiting ourselves to FFT papers, and an even larger proportion if we look only at recently published ones.

Of course, the key question is whether ignoring relevant studies not available in full text really matters.

Reinhard Wentz of the Imperial College Library and Information Service already argued in a visionary 2002 Lancet letter [3] that the availability of full-text articles on the internet might have created a new form of bias: FUTON bias (Full Text On the Net bias).

Wentz reasoned that FUTON bias will not affect researchers who are used to comprehensive searches of the published medical literature, but that it will affect staff and students with limited search experience, and that in daily clinical practice it might have the same effect as publication bias or language bias has on systematic reviews of published studies.

Wentz also hypothesized that FUTON bias (together with “no abstract available” (NAA) bias) would affect the visibility and the impact factor of OA journals. He makes a reasonable case that NAA bias will affect publications on new, peripheral and under-discussion subjects more than established topics covered in substantive reports.

The study by Murali et al. [4], published in Mayo Clinic Proceedings in 2004, confirms that the availability of journals on MEDLINE as FUTON or NAA affects their impact factor.

Of the 324 journals screened by Murali et al., 38.3% were FUTON, 19.1% NAA, and 42.6% had abstracts only. The mean impact factors were 3.24 (±0.32) for FUTON journals, 1.64 (±0.30) for abstract-only journals, and 0.14 (±0.45) for NAA journals! The authors confirmed this finding by showing a difference in impact factors for journals available in both the pre- and post-internet era (n=159).

Murali et al. informally questioned many physicians and residents at multiple national and international meetings in 2003. These doctors uniformly admitted relying on FUTON articles on the web to answer a sizable proportion of their questions. A study by Carney et al. (2004) [5] showed that 98% of US primary care physicians used the internet as a resource for clinical information at least once a week, and mostly used FUTON articles to aid decisions about patient care, patient education, and medical student or resident instruction.

Murali et al. therefore conclude that failure to consider FUTON bias may not only affect a journal’s impact factor, but could also limit consideration of the medical literature by ignoring relevant for-fee articles, and could thereby influence medical education, akin to publication or language bias.

This proposed effect of the FFT limit on citation retrieval for clinical questions was examined in a more recent study (2008), published in J Med Libr Assoc [6].

Across all four questions, based on a research agenda for physical therapy, the FFT limit reduced the number of citations to 11.1% of the total number retrieved without the FFT limit in PubMed.

Even more importantly, high-quality evidence such as systematic reviews and randomized controlled trials was missed when the FFT limit was used.

For example, searching without the FFT limit retrieved 10 systematic reviews of RCTs, against just one with the FFT limit. Likewise, searching without the FFT limit retrieved 28 RCTs; with the FFT limit, only one.
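To make this concrete with the pubmed_count sketch from above, such a comparison is a two-liner. The clinical query below is my own made-up example, not one of the four physical-therapy questions from the study:

```python
# Compare retrieval for a clinical question with and without the FFT subset.
query = '"low back pain"[tiab] AND exercise[tiab] AND randomized controlled trial[pt]'
print(pubmed_count(query))                              # all matching RCTs
print(pubmed_count(query + " AND free full text[sb]"))  # only the freely available ones
```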

The proportion of missed studies (approx. 90%) is higher than in the studies mentioned above, possibly because real searches were tested and only relevant clinical studies were considered.

The authors rightly conclude that consistently missing high-quality evidence when searching clinical questions is problematic, because it undermines the process of evidence-based practice. Krieger et al. finally conclude:

“Librarians can educate health care consumers, scientists, and clinicians about the effects that the FFT limit may have on their information retrieval and the ways it ultimately may affect their health care and clinical decision making.”

It is the hope of this librarian that she did a little educating in this respect and clarified the point that limiting to free full text might not always be a good idea, especially if the aim is to critically appraise a topic, to educate, or to discuss current best medical practice.

References

  1. Björk, B., Welling, P., Laakso, M., Majlender, P., Hedlund, T., & Guðnason, G. (2010). Open Access to the Scientific Journal Literature: Situation 2009. PLoS ONE, 5(6). DOI: 10.1371/journal.pone.0011273
  2. Matsubayashi, M., Kurata, K., Sakai, Y., Morioka, T., Kato, S., Mine, S., & Ueda, S. (2009). Status of open access in the biomedical field in 2005. Journal of the Medical Library Association (JMLA), 97(1), 4-11. DOI: 10.3163/1536-5050.97.1.002
  3. Wentz, R. (2002). Visibility of research: FUTON bias. The Lancet, 360(9341), 1256. DOI: 10.1016/S0140-6736(02)11264-5
  4. Murali, N.S., Murali, H.R., Auethavekiat, P., Erwin, P.J., Mandrekar, J.N., Manek, N.J., & Ghosh, A.K. (2004). Impact of FUTON and NAA bias on visibility of research. Mayo Clinic Proceedings, 79(8), 1001-6. PMID: 15301326
  5. Carney, P.A., Poor, D.A., Schifferdecker, K.E., Gephart, D.S., Brooks, W.B., & Nierenberg, D.W. (2004). Computer use among community-based primary care physician preceptors. Academic Medicine, 79(6), 580-90. PMID: 15165980
  6. Krieger, M., Richter, R., & Austin, T. (2008). An exploratory analysis of PubMed’s free full-text limit on citation retrieval for clinical questions. Journal of the Medical Library Association (JMLA), 96(4), 351-355. DOI: 10.3163/1536-5050.96.4.010
  7. The #TwitJC Twitter Journal Club, a new Initiative on Twitter. Some Initial Thoughts. (laikaspoetnik.wordpress.com)
  8. The Second #TwitJC Twitter Journal Club (laikaspoetnik.wordpress.com)
  9. How many research papers are freely available? (blogs.nature.com)






