No, Google Scholar Shouldn’t be Used Alone for Systematic Review Searching

9 07 2013

Several papers have addressed the usefulness of Google Scholar as a source for systematic review searching. Unfortunately the quality of those papers is often well below the mark.

In 2010 I already [1] (in the words of Isla Kuhn [2]) “robustly rebutted” the paper by Anders, “PubMed versus Google Scholar for Retrieving Evidence” [3], at this blog.

But earlier this year another controversial paper was published [4]:

“Is the coverage of Google Scholar enough to be used alone for systematic reviews?”

It is one of the most highly accessed papers of BMC Medical Informatics and Decision Making and has been welcomed in (for instance) the Twittersphere.

Researchers seem  to blindly accept the conclusions of the paper:

But don’t rush  and assume you can now forget about PubMed, MEDLINE, Cochrane and EMBASE for your systematic review search and just do a simple Google Scholar (GS) search instead.

You might  throw the baby out with the bath water….

… As has been immediately recognized by many librarians, either at their blogs (see the blogs of Dean Giustini [5], Patricia Anderson [6] and Isla Kuhn [1]) or in direct comments to the paper (by Tuulevi Ovaska, Michelle Fiander and Alison Weightman [7]).

In their paper, Jean-François Gehanno et al examined whether GS was able to retrieve all the 738 original studies included in 29 Cochrane and JAMA systematic reviews.

And YES! GS had a coverage of 100%!


All those fools at the Cochrane who do exhaustive searches in multiple databases using controlled vocabulary and a lot of synonyms when a simple search in GS could have sufficed…

But it is a logical fallacy to conclude from their findings that GS alone will suffice for SR-searching.

Firstly, as Tuulevi [7] rightly points out :

“Of course GS will find what you already know exists”

Or in the words of one of the official reviewers [8]:

What the authors show is only that if one knows what studies should be identified, then one can go to GS, search for them one by one, and find out that they are indexed. But, if a researcher already knows the studies that should be included in a systematic review, why bother to also check whether those studies are indexed in GS?


Secondly, it is also the precision that counts.

As Dean explains at his blog, a 100% recall with a precision of 0.1% (and it can be worse!) means that in order to find 36 relevant papers you have to go through ~36,700 items.
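Dean's arithmetic is easy to verify; a quick sketch using his rounded numbers of what ~0.1% precision means in screening workload:

```python
# Numbers from Dean Giustini's example (rounded): a GS search that
# achieves 100% recall but only ~0.1% precision.
relevant_found = 36       # relevant papers in the result set
items_retrieved = 36_700  # total items the search returns

precision = relevant_found / items_retrieved
items_per_relevant = items_retrieved // relevant_found

print(f"precision: {precision:.2%}")                          # roughly 0.1%
print(f"items screened per relevant paper: {items_per_relevant}")
```

In other words, over a thousand items must be screened for every relevant paper found.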


Are the authors suggesting that researchers consider a precision level of 0.1% acceptable for the SR? Who has time to sift through that amount of information?

It is like searching for needles in a haystack. Correction: it is like searching for particular hay stalks in a haystack. It is very difficult to find them if they are hidden among other hay stalks. Suppose the hay stalks were all labeled (title) and I had a powerful hay-stalk magnet (“title search”): it would be a piece of cake to retrieve them. This is what we call a “known item search”. But would you even consider going through the haystack and checking the stalks one by one? Because that is what we would have to do if we used Google Scholar as a one-stop search tool for systematic reviews.

Another main point of criticism is that the authors have a grave and worrisome lack of understanding of systematic review methodology [6] and don’t grasp the importance of the search interface and of knowledge of indexing, which are both integral to searching for systematic reviews [7].

One wonders how the paper even passed peer review, as one of the two reviewers (Miguel Garcia-Perez [8]) had already smashed the paper to pieces.

The authors’ method is inadequate and their conclusion is not logically connected to their results. No revision (major, minor, or discretionary) will save this work. (…)

Miguel’s well-founded criticism was not well addressed by the authors [9]. Apparently the editors didn’t see through it and relied on the second peer reviewer [10], who merely said it was a “great job” etcetera, but that recall should not be written with a capital R
(and that was about the only revision the authors made).

Perhaps it needs another paper to convince Gehanno et al and the uncritical readers of their manuscript.

Such a paper might just have been published [11]. It is written by Dean Giustini and Maged Kamel Boulos and is entitled:

Google Scholar is not enough to be used alone for systematic reviews

It is a simple and straightforward paper, but it makes its points clearly.

Giustini and Kamel Boulos looked for a recent SR in their own area of expertise (Chou et al [12]) that included a comparable number of references to that of Gehanno et al. Next they tested GS’s ability to locate these references.

Although most papers cited by Chou et al. (n=476/506; ~95%) were ultimately found in GS, numerous iterative searches were required to find the references, and each citation had to be managed one at a time. Thus GS was not able to locate all references found by Chou et al., and the whole exercise was rather cumbersome.

As expected, trying to find the papers by a “real-life” GS search was almost impossible. Due to its rudimentary structure, GS did not understand the expert search strings and was unable to translate them. Thus using Chou et al.’s original search strategy and keywords yielded an unmanageable result of more than 750,000 items.

Giustini and Kamel Boulos note that GS’s ability to search the full text of papers, combined with its PageRank algorithm, can be useful.

On the other hand, GS’s changing content, unknown updating practices and poor reliability make it an inappropriate sole choice for systematic reviewers:

As searchers, we were often uncertain that results found one day in GS had not changed a day later and trying to replicate searches with date delimiters in GS did not help. Papers found today in GS did not mean they would be there tomorrow.

But most importantly, not all known items could be found and the search process and selection are too cumbersome.

So shall we now, once and for all, conclude that GS is NOT sufficient to be used alone for SR searching?

We don’t need another bad paper addressing this.

But I would really welcome a well-performed paper looking at the additional value of GS in SR searching. For I am sure that GS may be valuable for some questions and some topics in some respects. We have to find out which.


  1. PubMed versus Google Scholar for Retrieving Evidence, 2010/06
  2. Google scholar for systematic reviews…. hmmmm, 2013/01
  3. Anders M.E. & Evans D.P. (2010). Comparison of PubMed and Google Scholar literature searches. Respiratory Care, May;55(5):578-83
  4. Gehanno J.F., Rollin L. & Darmoni S. (2013). Is the coverage of Google Scholar enough to be used alone for systematic reviews? BMC Medical Informatics and Decision Making, 13:7 (open access)
  5. Is Google Scholar enough for SR searching? No. 2013/01
  6. What’s Wrong With Google Scholar for “Systematic” Review, 2013/01
  7. Comments at Gehanno’s paper
  8. Official reviewer’s report on Gehanno’s paper [1]: Miguel Garcia-Perez, 2012/09
  9. Authors’ response to comments
  10. Official reviewer’s report on Gehanno’s paper [2]: Henrik von Wehrden, 2012/10
  11. Giustini D. & Kamel Boulos M.N. (2013). Google Scholar is not enough to be used alone for systematic reviews. Online Journal of Public Health Informatics, 5(2)
  12. Chou W.Y.S., Prestin A., Lyons C. & Wen K.Y. (2013). Web 2.0 for Health Promotion: Reviewing the Current Evidence. American Journal of Public Health, 103(1):e9-e18

The Scatter of Medical Research and What to do About it.

18 05 2012

Paul Glasziou, GP and professor in Evidence Based Medicine, co-authored a new article in the BMJ [1]. Similar to another paper [2] I discussed before [3], this paper deals with the difficulty for clinicians of staying up to date with the literature. But where the previous paper [2,3] highlighted the mere increase in the number of research articles over time, the current paper looks at the scatter of randomized clinical trials (RCTs) and systematic reviews (SRs) across the different journals cited in one year (2009) in PubMed.

Hoffmann et al analyzed 7 specialties and 9 subspecialties that are considered the leading contributors to the burden of disease in high-income countries.

They followed a relatively straightforward method for identifying the publications. Each search string consisted of a MeSH term (controlled term) to identify the selected disease or disorders, a publication type [pt] to identify the type of study, and the year of publication. For example, the search strategy for randomized trials in cardiology was: “heart diseases”[MeSH] AND randomized controlled trial[pt] AND 2009[dp]. (When searching “heart diseases” as a MeSH term, narrower terms are also searched.) Meta-analysis[pt] was used to identify systematic reviews.
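As a sketch, the pattern Hoffmann et al describe can be expressed as a small query builder (the helper function is mine, not from the paper):

```python
def build_pubmed_query(mesh_term: str, pub_type: str, year: int) -> str:
    """Compose a PubMed search string of the form the authors describe:
    disease MeSH term AND publication type AND publication year."""
    return f'"{mesh_term}"[MeSH] AND {pub_type}[pt] AND {year}[dp]'

# The cardiology example from the paper:
rct_query = build_pubmed_query("heart diseases", "randomized controlled trial", 2009)
sr_query = build_pubmed_query("heart diseases", "meta-analysis", 2009)
print(rct_query)
```

Because “heart diseases”[MeSH] is exploded automatically, the same string also retrieves papers indexed under its narrower terms.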

Using this approach Hoffmann et al found 14,343 RCTs and 3,214 SRs published in 2009 in the selected (sub)specialties. There was a clear scatter across journals, but this scatter varied considerably among specialties:

“Otolaryngology had the least scatter (363 trials across 167 journals) and neurology the most (2770 trials across 896 journals). In only three subspecialties (lung cancer, chronic obstructive pulmonary disease, hearing loss) were 10 or fewer journals needed to locate 50% of trials. The scatter was less for systematic reviews: hearing loss had the least scatter (10 reviews across nine journals) and cancer the most (670 reviews across 279 journals). For some specialties and subspecialties the papers were concentrated in specialty journals; whereas for others, few of the top 10 journals were a specialty journal for that area.
Generally, little overlap occurred between the top 10 journals publishing trials and those publishing systematic reviews. The number of journals required to find all trials or reviews was highly correlated (r=0.97) with the number of papers for each specialty/ subspecialty.”

Previous work had already suggested that this scatter of research has a long tail: half of the publications are concentrated in a minority of journals, whereas the remaining articles are scattered among many journals (see figure below).

Click to enlarge and see legends at BMJ 2012;344:e3223 [CC]

The good news is that SRs are less scattered and that general journals appear more often in the top 10 journals publishing SRs. Indeed for 6 of the 7 specialties and 4 of the 9 subspecialties, the Cochrane Database of Systematic Reviews had published the highest number of systematic reviews, publishing between 6% and 18% of all the systematic reviews published in each area in 2009. The bad news is that even keeping up to date with SRs seems a huge, if not impossible, challenge.

In other words, it is not sufficient for clinicians to rely on personal subscriptions to a few journals in their specialty (which is common practice). Hoffmann et al suggest several solutions to help clinicians cope with the increasing volume and scatter of research publications.

  • a central library of systematic reviews (but apparently the Cochrane Library fails to fulfill such a role according to the authors, because many reviews are out of date and are perceived as less clinically relevant)
  • a registry of planned and completed systematic reviews, such as PROSPERO (this makes it easier to locate SRs and reduces bias)
  • syntheses of evidence and synopses, like the ACP Journal Club, which summarizes the best evidence in internal medicine
  • specialised databases that collate and critically appraise randomized trials and systematic reviews, like the one for physical therapy. In my personal experience, however, this database is often out of date and not comprehensive
  • journal scanning services like EvidenceUpdates, which scans over 120 journals, filters articles on the basis of quality, has practising clinicians rate them for relevance and newsworthiness, and makes them available as email alerts and in a searchable database. I use this service too, but besides the fact that not all specialties are covered, the rating of evidence may not always be objective (see previous post [4])
  • the use of social media tools to alert clinicians to important new research.

Most of these are (long-)existing solutions that only partly, if at all, help to solve the information overload.

I was surprised that the authors didn’t propose the use of personalized alerts. PubMed’s My NCBI feature allows you to create automatic email alerts on a topic and to subscribe to electronic tables of contents (which could include ACP Journal Club). Suppose a physician browses 10 journals roughly covering 25% of the trials. He or she does not need to read all the other journals from cover to cover to avoid missing one potentially relevant trial. Instead it is far more efficient to perform a topic search to filter relevant studies from journals that seldom publish trials on the topic of interest. One could even use the search of Hoffmann et al to achieve this.* Although in reality, most clinical researchers will have narrower fields of interest than all studies about endocrinology or neurology.

At our library we are working on creating deduplicated, easy-to-read alerts that collate the tables of contents of certain journals with topic (and author) searches in PubMed, EMBASE and other databases. There are existing tools that do the same.

Another way to reduce the individual work (reading) load is to organize journal clubs or, even better, regular CATs (critically appraised topics). In the Netherlands, CATs are a compulsory item for residents. A few doctors do the work for many. Usually they choose topics that are clinically relevant (or for which the evidence is unclear).

The authors briefly mention that their search strategy might have missed some eligible papers and included some that are not truly RCTs or SRs, because they relied on PubMed’s publication types to retrieve RCTs and SRs. For systematic reviews this may be a greater problem than recognized, for the authors used meta-analysis[pt] to identify systematic reviews. Unfortunately PubMed has no publication type for systematic reviews, but it should be clear that there are many more systematic reviews than meta-analyses. Possibly systematic reviews even have a different scatter pattern than meta-analyses (i.e. the latter might be preferentially included in core journals).

Furthermore, not all meta-analyses and systematic reviews are reviews of RCTs (thus it is not completely fair to compare MAs with RCTs only). On the other hand, it is a (not discussed) omission of this study that only interventions are considered. Nowadays physicians have many other questions than those related to therapy, such as questions about prognosis, harm and diagnosis.

I did a little, imperfect search just to see whether the use of search terms other than meta-analysis[pt] would have any influence on the outcome. I searched for (1) meta-analysis[pt] and (2) systematic review[tiab] (title and abstract) for papers about endocrine diseases. Then I subtracted 1 from 2 (to analyse the systematic reviews not indexed as meta-analysis[pt]).
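The three searches just described can be written out as PubMed query strings. A sketch, in which the exact MeSH term and the 2009 date restriction are my assumptions, not the literal searches I ran:

```python
# Assumed topic restriction (MeSH term and year are illustrative).
topic = '"endocrine system diseases"[MeSH] AND 2009[dp]'

q_meta = f"{topic} AND meta-analysis[pt]"        # search (1)
q_sr = f"{topic} AND systematic review[tiab]"    # search (2)
q_extra = f"({q_sr}) NOT ({q_meta})"             # (2) NOT (1): SRs not indexed as MA

print(q_extra)
```

Running the three strings and comparing hit counts and top journals is all the “experiment” amounts to.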



I analyzed the top 10/11 journals publishing these study types.

This little experiment suggests that:

  1. the precise scatter might differ per search: apparently the systematic review[tiab] search yielded a different top 10/11 of journals (for this sample) than the meta-analysis[pt] search (partially because Cochrane systematic reviews apparently don’t mention “systematic review” in the title and abstract?).
  2. the authors underestimate the number of systematic reviews: simply searching for systematic review[tiab] already found approximately 50% additional systematic reviews compared to meta-analysis[pt] alone.
  3. as expected (by me at least), many of the SRs and MAs were NOT dealing with interventions; see, for instance, the first 5 hits (out of 108 and 236 respectively).
  4. together these findings indicate that the true information overload is far greater than shown by Hoffmann et al (not all systematic reviews are found, and of all available study designs only RCTs are searched).
  5. on the other hand, this indirectly shows that SRs are a better way to keep up to date than suggested: SRs also summarize non-interventional research (the ratio of SRs of RCTs to individual RCTs is much lower than suggested).
  6. it also means that the role of the Cochrane systematic reviews in aggregating RCTs is underestimated by the published graphs (the MA[pt] section is diluted with non-RCT systematic reviews, thus the proportion of the Cochrane SRs among the interventional MAs becomes larger).

Anyway, these imperfections do not contradict the main point of this paper: that trials are scattered across hundreds of general and specialty journals, and that “systematic reviews” (or meta-analyses really) do reduce the extent of scatter, but are still widely scattered, and mostly in journals different from those publishing the randomized trials.

Indeed, personal subscriptions to journals seem insufficient for keeping up to date.
Besides supplementing subscriptions with methods such as journal scanning services, I would recommend the use of personalized alerts from PubMed and several prefiltered sources, including an EBM search engine like TRIP.

*but I would broaden it to find all aggregate evidence, including ACP, Clinical Evidence, syntheses and synopses, not only meta-analyses.

**I do appreciate that one of the co-authors is a medical librarian: Sarah Thorning.


  1. Hoffmann, Tammy, Erueti, Chrissy, Thorning, Sarah, & Glasziou, Paul (2012). The scatter of research: cross sectional comparison of randomised trials and systematic reviews across specialties. BMJ, 344: e3223. DOI: 10.1136/bmj.e3223
  2. Bastian, H., Glasziou, P., & Chalmers, I. (2010). Seventy-Five Trials and Eleven Systematic Reviews a Day: How Will We Ever Keep Up? PLoS Medicine, 7(9). DOI: 10.1371/journal.pmed.1000326
  3. How will we ever keep up with 75 trials and 11 systematic reviews a day
  4. Experience versus Evidence [1]. Opioid Therapy for Rheumatoid Arthritis Pain.

Experience versus Evidence [1]. Opioid Therapy for Rheumatoid Arthritis Pain.

5 12 2011

Rheumatoid arthritis (RA) is a chronic autoimmune disease, which causes inflammation of the joints that eventually leads to progressive joint destruction and deformity. Patients have swollen, stiff and painful joints. The main aim of treatment is to reduce swelling and inflammation, to alleviate pain and stiffness, and to maintain normal joint function. While there is no cure, it is important to properly manage pain.

The mainstays of therapy in RA are disease-modifying anti-rheumatic drugs (DMARDs) and non-steroidal anti-inflammatory drugs (NSAIDs). These drugs primarily target inflammation. However, since inflammation is not the only factor that causes pain in RA, patients may not be (fully) responsive to treatment with these medications.
Opioids are another class of pain-relieving substances (analgesics). They are frequently used in RA, but their role in chronic non-cancer pain, including RA, is not firmly established.

A recent Cochrane Systematic Review [1] assessed the beneficial and harmful effects of opioids in RA.

Eleven studies (672 participants) were included in the review.

Four studies only assessed the efficacy of  single doses of different analgesics, often given on consecutive days. In each study opioids reduced pain (a bit) more than placebo. There were no differences in effectiveness between the opioids.

Seven studies of 1-6 weeks’ duration assessed 6 different oral opioids, either alone or combined with non-opioid analgesics.
The only strong opioid investigated was controlled-release morphine sulphate, in a single study with 20 participants.
Six studies compared an opioid (often combined with a non-opioid analgesic) to placebo. Opioids were slightly better than placebo in improving patient-reported global impression of clinical change (PGIC) (3 studies, 324 participants: relative risk (RR) 1.44, 95% CI 1.03 to 2.03), but did not lower the number of withdrawals due to inadequate analgesia in 4 studies.
Notably, none of the 11 studies reported the primary and probably more clinically relevant outcome “proportion of participants reporting ≥ 30% pain relief”.

On the other hand, adverse events (most commonly nausea, vomiting, dizziness and constipation) were more frequent in patients receiving opioids compared to placebo (4 studies, 371 participants: odds ratio 3.90, 95% CI 2.31 to 6.56). Withdrawal due to adverse events was non-significantly higher in the opioid-treated group.

Comparing opioids to other analgesics instead of placebos seems more relevant. Among the 11 studies, only 1 study compared an opioid (codeine with paracetamol) to an NSAID (diclofenac). This study found no difference in efficacy or safety between the two treatments.

The 11 included studies were very heterogeneous (i.e. different opioids studied, with or without concurrent use of non-opioid analgesics, different outcomes measured) and the risk of bias was generally high. Furthermore, most studies were published before 2000, when treatment of RA was less optimal.

The authors therefore conclude:

In light of this, the quantitative findings of this review must be interpreted with great caution. At best, there is weak evidence in favour of the efficacy of opioids for the treatment of pain in patients with RA but, as no study was longer than six weeks in duration, no reliable conclusions can be drawn regarding the efficacy or safety of opioids in the longer term.

This was the evidence, now the opinion.

I found this Cochrane Review via an EvidenceUpdates email alert from the BMJ Group and McMaster PLUS.

EvidenceUpdate alerts are meant to “provide you with access to current best evidence from research, tailored to your own health care interests, to support evidence-based clinical decisions. (…) All citations are pre-rated for quality by research staff, then rated for clinical relevance and interest by at least 3 members of a worldwide panel of practicing physicians”

I usually don’t care about the rating, because it is mostly 5-6 on a scale of 7. This was also true for the current SR.

There is a more detailed rating available (when clicking the link; free registration required). Usually the newsworthiness of SRs scores relatively low (because they summarize ‘old’ studies?). Personally I would think that the relevance and newsworthiness would be higher for the special interest group, pain.

But the comment of the first of the 3 clinical raters was most revealing:

He/she comments:

As a Palliative care physician and general internist, I have had excellent results using low potency opiates for RA and OA pain. The palliative care literature is significantly more supportive of this approach vs. the Cochrane review.

So personal experience wins over evidence?* How did this palliative care physician assess effectiveness? Just by giving a single dose of an opiate? How did he or she rate the effectiveness of the opioids? Was it compared to placebo or an NSAID (was it compared at all?), and were adverse effects measured?

And what is “the palliative care literature” the commenter is referring to? Apparently not this Cochrane Review. Apparently not the 11 controlled trials included in the Cochrane review. Apparently not the several other Cochrane reviews on the use of opioids for chronic non-cancer pain, and not the guidelines, syntheses and synopses I found via the TRIP database. All conclude that using opioids to treat chronic non-cancer pain is supported by very limited evidence, that adverse effects are common and that long-term use may lead to opioid addiction.

I’m sorry to note that although the alerting service is great as an alert, such personal ratings are not very helpful for interpreting and truly rating the evidence.

I would prefer a truly objective, structured critical appraisal, like this one on a similar topic by DARE (“Opioids for chronic noncancer pain: a meta-analysis of effectiveness and side effects”), and/or an objective piece that puts the new data into clinical perspective.

*Just to be clear, the expertise and opinions of experts are also important in decision making. Rightly, Sackett [2] emphasized that good doctors use both individual clinical expertise and the best available external evidence. However, that doesn’t mean that one personal opinion and/or preference replaces all the existing evidence.


  1. Whittle SL, Richards BL, Husni E, & Buchbinder R (2011). Opioid therapy for treating rheumatoid arthritis pain. Cochrane database of systematic reviews (Online), 11 PMID: 22071805
  2. Sackett DL, Rosenberg WM, Gray JA, Haynes RB, & Richardson WS (1996). Evidence based medicine: what it is and what it isn’t. BMJ (Clinical research ed.), 312 (7023), 71-2 PMID: 8555924

PubMed’s Higher Sensitivity than OVID MEDLINE… & other Published Clichés.

21 08 2011

Is it just me, or are biomedical papers about searching for a systematic review often of low quality or just too damn obvious? I’m seldom excited about papers dealing with optimal search strategies or peculiarities of PubMed, even though it is my specialty.
It is my impression that many of the lower-quality and/or less relevant papers are written by clinicians/researchers instead of information specialists (or at least without a medical librarian as first author).

I can’t help thinking that many of those authors just happen to see an odd feature in PubMed or encounter an unexpected phenomenon in the process of searching for a systematic review.
They think: “Hey, that’s interesting” or “That’s odd. Let’s write a paper about it. An easy way to boost our scientific output!”
What they don’t realize is that the published findings are often common knowledge to experienced MEDLINE searchers.

Let’s give two recent examples of what I think are redundant papers.

The first example is a letter under the heading “Clinical Observation” in Annals of Internal Medicine, entitled:

“Limitations of the MEDLINE Database in Constructing Meta-analyses”.[1]

As the authors rightly state, “a thorough literature search is of utmost importance in constructing a meta-analysis”. Since the PubMed interface from the National Library of Medicine is a cornerstone of many meta-analyses, the authors (two MDs) focused on the freely available PubMed (with MEDLINE as its largest part).

The objective was:

“To assess the accuracy of MEDLINE’s “human” and “clinical trial” search limits, which are used by authors to focus literature searches on relevant articles.” (emphasis mine)

O.k…. Stop! I know enough. This paper should have been titled: “Limitations of Limits in MEDLINE”.

Limits are NOT DONE when searching for a systematic review, for the simple reason that most limits (except language and dates) are MeSH terms.
It takes a while before the indexers have assigned a MeSH term to a paper, and not all papers are correctly (or consistently) indexed. Thus, by using limits you will automatically miss recent, not yet indexed or incorrectly indexed papers, whereas it is your goal (or it should be) to find as many relevant papers as possible for your systematic review. And wouldn’t it be sad if you missed that one important RCT that was published just the other day?

On the other hand, one doesn’t want to drown in irrelevant papers. How can one reduce “noise” while minimizing the risk of losing relevant papers?

  1. Use both MeSH terms and textwords to “limit” your search, i.e. also search “trial” as a textword, i.e. in title and abstract: trial[tiab]
  2. Use more synonyms and truncation (random*[tiab] OR placebo[tiab])
  3. Don’t actively limit but use double negation. Thus, to get rid of animal studies, don’t limit to humans (this is the same as combining with the MeSH term humans[mh]) but safely exclude animals as follows: NOT (animals[mh] NOT humans[mh]) (= exclude papers indexed with “animals” except when those papers are also indexed with “humans”).
  4. Use existing Methodological Filters (ready-made search strategies) designed to help focusing on study types. These filters are based on one or more of the above-mentioned principles (see earlier posts here and here).
    Simple methodological filters can be found at the PubMed Clinical Queries. For instance, the narrow filter for therapy not only searches for the publication type “Randomized controlled trial” (a limit), but also for randomized, controlled and trial as textwords.
    Usually broader (more sensitive) filters are used for systematic reviews. The Cochrane Handbook proposes the following filter, which maximizes precision and sensitivity, to identify randomized trials in PubMed:
    (randomized controlled trial [pt] OR controlled clinical trial [pt] OR randomized [tiab] OR placebo [tiab] OR clinical trials as topic [mesh: noexp] OR randomly [tiab] OR trial [ti]) NOT (animals [mh] NOT humans [mh]).
    When few hits are obtained, one can either use a broader filter or no filter at all.
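Putting these points together: a sketch of how a topic search might be combined with the Cochrane sensitivity-and-precision-maximizing filter quoted above (the helper function and the example topic string are mine):

```python
# The Cochrane Handbook's sensitivity-and-precision-maximizing RCT
# filter for PubMed, exactly as quoted above.
COCHRANE_RCT_FILTER = (
    "(randomized controlled trial [pt] OR controlled clinical trial [pt] "
    "OR randomized [tiab] OR placebo [tiab] OR clinical trials as topic [mesh: noexp] "
    "OR randomly [tiab] OR trial [ti]) NOT (animals [mh] NOT humans [mh])"
)

def filtered_query(topic: str) -> str:
    """Combine a topic search (MeSH terms plus textwords) with the filter."""
    return f"({topic}) AND {COCHRANE_RCT_FILTER}"

print(filtered_query('"heart diseases"[MeSH] OR heart disease*[tiab]'))
```

Note that the filter itself applies points 1-3: publication types plus textwords, synonyms, and the double negation for animal studies.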

In other words, it is a beginner’s mistake to use limits when searching for a systematic review.
Besides publishing what should be common knowledge (even our medical students learn it), the authors make many other (little) mistakes; their precise search is difficult to reproduce and far from complete. This has already been addressed by Dutch colleagues in a comment [2].

The second paper is:

PubMed had a higher sensitivity than Ovid-MEDLINE in the search for systematic reviews [3], by Katchamart et al.

Again this paper focuses on the usefulness of PubMed for identifying RCTs for a systematic review, but it concentrates on the differences between PubMed and OVID in this respect. The paper starts by explaining that PubMed:

provides access to bibliographic information in addition to MEDLINE, such as in-process citations (..), some OLDMEDLINE citations (….) citations that precede the date that a journal was selected for MEDLINE indexing, and some additional life science journals that submit full texts to PubMed Central and receive a qualitative review by NLM.

Given these “facts”, am I exaggerating when I say that the authors are pushing at an open door when their main conclusion is that PubMed retrieved more citations overall than Ovid-MEDLINE? The one (!) relevant article missed in OVID was a 2005 study published in a Japanese journal that MEDLINE only started indexing in 2007. It was therefore in PubMed, but not in OVID MEDLINE.

An important aspect to keep in mind when searching OVID/MEDLINE (as I have discussed earlier here and here). But worth a paper?

Recently, after finishing an exhaustive search in OVID/MEDLINE, we noticed that we had missed an RCT in PubMed that was not yet available in OVID/MEDLINE. I just added one sentence to the search methods:

Additionally, PubMed was searched for randomized controlled trials ahead of print, not yet included in OVID MEDLINE. 

Of course, I could have devoted a separate article to this finding. But it is so self-evident, that I don’t think it would be worth it.
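For what it’s worth, such a supplementary search boils down to one query pattern. A sketch (the helper name is mine; `medline[sb]` is PubMed’s subset of records fully indexed for MEDLINE, so negating it leaves the ahead-of-print and in-process citations):

```python
def not_yet_in_medline(topic: str) -> str:
    """Restrict a topic search to PubMed records not (yet) fully indexed
    for MEDLINE, e.g. ahead-of-print and in-process citations."""
    return f"({topic}) NOT medline[sb]"

print(not_yet_in_medline("rheumatoid arthritis AND randomized[tiab]"))
```

One line of query syntax: hardly material for a separate article.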

The authors have expressed their findings in terms of sensitivity (85% for Ovid-MEDLINE vs. 90% for PubMed; the 5% difference is that ONE missing paper), precision and number needed to read (comparable for OVID-MEDLINE and PubMed).

If I might venture another opinion: it looks like editors of medical and epidemiology journals quickly fall for “diagnostic parameters” on a topic that they don’t understand very well: library science.

The sensitivity/precision data found have little general value, because:

  • it concerns a single search on a single topic
  • there are few relevant papers (17-18)
  • useful features of OVID MEDLINE that are not available in PubMed were not used, i.e. adjacency searching could enhance the retrieval of relevant papers in OVID MEDLINE (adjacency = words searched within a specified maximal distance of each other)
  • the searches are not comparable, nor are the search field commands.

The latter is very important if one doesn’t want to compare apples and oranges.
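As an illustration of the adjacency point above, a sketch of what such a search could look like in Ovid syntax (the example terms are mine, not from the authors’ strategy):

```
(rheumatoid adj3 arthritis).ti,ab.
```

Here adj3 retrieves records in which the two words occur within three words of each other, in either order; PubMed offered no equivalent at the time.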

Let’s take a look at the first part of the search (which is in itself well structured and covers many synonyms).
First part of the search (figure)
This part of the search deals with the P: patients with rheumatoid arthritis (RA). The authors first search for relevant MeSH (sets 1-5) and then for a few textwords. The MeSH are fine. The authors have chosen to use Arthritis, Rheumatoid and a few narrower terms (MeSH tree shown at the right). They have taken care to use the MeSH:noexp command in PubMed to prevent the automatic explosion of narrower terms in PubMed (although this is superfluous for MeSH terms that have no narrower terms, like Caplan syndrome).

But the fields chosen for the free text search (sets 6-9) are not comparable at all.

In OVID the .mp field is used, whereas “all fields” or even no field tags are used in PubMed.

I am not even fond of the uncontrolled use of .mp (I would rather search in title and abstract; remember, we already have the proper MeSH terms), but “all fields” is even broader than .mp.

In general, a .mp search looks in the Title, Original Title, Abstract, Subject Heading, Name of Substance, and Registry Word fields. “All fields” would be .af in OVID, not .mp.

Searching for rheumatism in OVID using the .mp field yields 7,879 hits, against 31,390 hits when one searches in the .af field.

Thus four times as many. The extra fields searched include, for instance, the journal and the address fields: one finds all articles in the journal Arthritis & Rheumatism [line 6], or papers co-authored by someone at the dept. of rheumatoid surgery [line 9].
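Side by side, the two Ovid commands discussed above (hit counts as reported in the text):

```
rheumatism.mp.  -->   7,879 hits  (title, original title, abstract, subject heading,
                                   name of substance, registry word)
rheumatism.af.  -->  31,390 hits  (all fields, including e.g. journal name and
                                   author address)
```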

Worse, in PubMed the “all fields” command doesn’t prevent the automatic mapping.

In PubMed, Rheumatism[All Fields] is translated as follows:

“rheumatic diseases”[MeSH Terms] OR (“rheumatic”[All Fields] AND “diseases”[All Fields]) OR “rheumatic diseases”[All Fields] OR “rheumatism”[All Fields]

Oops, Rheumatism[All Fields] is searched as the (exploded!) MeSH term rheumatic diseases: that is, rheumatic diseases (not included in the MeSH search) plus all its narrower terms! This makes the entire first part of the PubMed search redundant (where the authors searched for non-exploded specific terms). It also explains the large difference in hits for rheumatism between PubMed and OVID/MEDLINE: 11,910 vs. 6,945.
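One way to see this mapping for yourself is to inspect the query translation that PubMed’s ESearch E-utility returns. A minimal sketch (it only builds the request URLs; the XML reply contains a &lt;QueryTranslation&gt; element showing the expansion quoted above):

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def esearch_url(term):
    """Build an ESearch request; its XML reply includes <QueryTranslation>."""
    return ESEARCH + "?" + urlencode({"db": "pubmed", "term": term})

# An [All Fields] term is still subject to automatic term mapping ...
mapped = esearch_url("Rheumatism[All Fields]")
# ... whereas [tiab] restricts the search to title/abstract words only.
restricted = esearch_url("Rheumatism[tiab]")
```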

Not only do the authors use the .mp and [All Fields] commands instead of the preferred [tiab] field, they also apply this broader field to the existing (optimized) Cochrane filter, which uses [tiab]. Finally, they use limits!

Well, anyway, I hope I have made my point: a useful comparison between strategies can only be made if optimal and comparable strategies are used. Sensitivity doesn’t mean anything here.

Coming back to my original point: I do think that some conclusions of these papers are “good to know”. As a matter of fact, they should be basic knowledge for anyone planning an exhaustive search for a systematic review. We do not need bad studies to show this.

Perhaps an expert paper (or a series) on this topic, understandable for clinicians, would be of more value.

Or the recognition that such search papers should be designed and written by librarians with ample experience in searching for systematic reviews.

* = truncation (search for different word endings); [tiab] = title and abstract; [ti] = title; [mh] = MeSH term; [pt] = publication type

Photo credit

The image is taken from the Dragonfly-blog; here the Flickr-image Brain Vocab Sketch by labguest was adapted by adding the Pubmed logo.


  1. Winchester DE, & Bavry AA (2010). Limitations of the MEDLINE database in constructing meta-analyses. Annals of Internal Medicine, 153 (5), 347-8. PMID: 20820050
  2. Leclercq E, Kramer B, & Schats W (2011). Limitations of the MEDLINE database in constructing meta-analyses. Annals of Internal Medicine, 154 (5). PMID: 21357916
  3. Katchamart W, Faulkner A, Feldman B, Tomlinson G, & Bombardier C (2011). PubMed had a higher sensitivity than Ovid-MEDLINE in the search for systematic reviews. Journal of Clinical Epidemiology, 64 (7), 805-7. PMID: 20926257
  4. Search OVID EMBASE and Get MEDLINE for Free…. without knowing it (2010/10/19)
  5. 10 + 1 PubMed Tips for Residents (and their Instructors) (2009/06/30)
  6. Adding Methodological Filters to MyNCBI (2009/11/26)
  7. Search Filters 1. An Introduction (2009/01/22)

How will we ever keep up with 75 Trials and 11 Systematic Reviews a Day?

6 10 2010

An interesting paper was published in PLoS Medicine [1]. As an information specialist working part time for the Cochrane Collaboration* (see below), this topic is close to my heart.

The paper, published in PLOS Medicine is written by Hilda Bastian and two of my favorite EBM devotees ànd critics, Paul Glasziou and Iain Chalmers.

Their article gives a good overview of the rise in the number of trials, of systematic reviews (SR’s) of interventions, and of medical papers in general. The paper (published as a Policy Forum piece) raises some important issues, but the message is not as sharp and clear as usual.

Take the title for instance.

Seventy-Five Trials and Eleven Systematic Reviews a Day:
How Will We Ever Keep Up?

What do you consider its most important message?

  1. That doctors suffer from an information overload that is only going to get worse (my own reading, and probably in part also that of @kevinclauson, who tweeted about it to medical librarians)
  2. That the solution to this information overload consists of Cochrane systematic reviews, because they aggregate the evidence from individual trials (as @doctorblogs twittered)
  3. That it is just about “too many systematic reviews (SR’s)?”, the title of the PLOS press release (so the other way around)
  4. That it is about too much of everything, and about the not-always-good quality of SR’s: @kevinclauson and @pfanderson discussed that they both use the same “#Cochrane Disaster” (see Kevin’s blog) in their teaching
  5. That Archie Cochrane’s* dream is unachievable and ought perhaps to be replaced by something less Utopian (the comment by Richard Smith, former editor of the BMJ: points 1, 3, 4 and 5 together, plus a new aspect: SR’s should not only include randomized controlled trials (RCT’s))

The paper reads easily, but matters of importance are often only touched upon.  Even after reading it twice, I wondered: a lot is being said, but what is really their main point and what are their answers/suggestions?

But let’s look at their arguments and pieces of evidence. (Black text is from their paper, blue are my remarks.)

The landscape

I often start my presentations on “searching for evidence” by showing the figure to the right, which is from an older PLoS article. It illustrates the information overload. Sometimes I also show another slide, with 5-10 year older data, saying that there are 55 trials a day, 1400 new records added per day to MEDLINE and 5000 biomedical articles a day. I also add that specialists have to read 17-22 articles a day to keep up to date with the literature. GPs have to read even more, because they are generalists. So those 75 trials, and the subsequent information overload, are not really a shock to me.

Indeed, the authors start by saying that “keeping up with information in health care has never been easy.” They give an interesting overview of the driving forces behind the increase in trials, and of the initiation of SR’s and critical appraisals to synthesize the evidence from all individual trials and so overcome the information overload (SR’s and other forms of aggregate evidence decrease the number needed to read).

In box 1 they give an overview of the earliest systematic reviews. These SR’s often had a great impact on medical practice (see for instance an earlier discussion on the role of the Crash trial and of the first Cochrane review).
They also touch upon the institution of the Cochrane Collaboration. The Cochrane Collaboration is named after Archie Cochrane, who “reproached the medical profession for not having managed to organise a ‘critical summary, by speciality or subspecialty, adapted periodically, of all relevant randomised controlled trials’”. He inspired the establishment of the international Oxford Database of Perinatal Trials and encouraged the use of systematic reviews of randomized controlled trials (RCT’s).

A timeline with some of the key events is shown in Figure 1.

Where are we now?

The second paragraph shows many interesting graphs (Figs. 2-4).

Annoyingly, PLoS only allows one-sentence legends. The details are in the (Word) supplement, without proper references to the actual figure numbers. Grrrr..! This is completely unnecessary in reviews/editorials/policy forums. And, as said, annoying, because you have to read a Word file to understand where the data actually come from.

Bastian et al. have used MEDLINE’s publication types (i.e. case reports [pt], reviews [pt], controlled clinical trial [pt]) and search filters (the Montori SR filter and the Haynes narrow therapy filter, which is built into PubMed’s Clinical Queries) to estimate the yearly rise in the number of each study type. The total numbers of clinical trials in CENTRAL (the largest database of controlled clinical trials, abbreviated as CCTR in the article) and of reviews in the Cochrane Database of Systematic Reviews (CDSR) are easy to retrieve, because the numbers are published quarterly (now monthly) by the Cochrane Library. By definition, CDSR only contains SR’s, and CENTRAL (as I prefer to call it) contains almost exclusively controlled clinical trials.
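As a sketch of how such yearly counts can be reproduced with PubMed’s ESearch E-utility (the exact filters in the paper’s supplement may differ; rettype=count makes ESearch return only the hit count, not the record IDs):

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def yearly_count_url(year, pub_type="randomized controlled trial"):
    """URL that returns only the number of records with the given
    publication type for one publication year."""
    term = f"{pub_type}[pt] AND {year}[dp]"
    return ESEARCH + "?" + urlencode({"db": "pubmed", "term": term,
                                      "rettype": "count"})

# One request per year, 1950 through 2007 (the range plotted in Fig. 2).
urls = [yearly_count_url(y) for y in range(1950, 2008)]
```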

In short, these are the conclusions from their three figures:

  • Fig 2: The number of published trials has risen sharply from 1950 to 2010
  • Fig 3: The number of systematic reviews and meta-analyses has risen tremendously as well
  • Fig 4: But systematic reviews and clinical trials are still far outnumbered by narrative reviews and case reports.

O.k., that’s clear, and they raise a good point: an “astonishing growth has occurred in the number of reports of clinical trials since the middle of the 20th century, and in reports of systematic reviews since the 1980s—and a plateau in growth has not yet been reached.”
Plus, indirectly: the increase in systematic reviews didn’t lower the number of trials and narrative reviews. Thus the information overload is still increasing.
But instead of discussing these findings they go into an endless discussion of the actual data and the fact that we “still do not know exactly how many trials have been done”, only to end the discussion by saying that “even though these figures must be seen as more illustrative than precise…”. And then you think: so what? I don’t really get the point of this part of their article.


Fig. 2: The number of published trials, 1950 to 2007.



With regard to Figure 2 they say for instance:

The differences between the numbers of trial records in MEDLINE and CCTR (CENTRAL) (see Figure 2) have multiple causes. Both CCTR and MEDLINE often contain more than one record from a single study, and there are lags in adding new records to both databases. The NLM filters are probably not as efficient at excluding non-trials as are the methods used to compile CCTR. Furthermore, MEDLINE has more language restrictions than CCTR. In brief, there is still no single repository reliably showing the true number of randomised trials. Similar difficulties apply to trying to estimate the number of systematic reviews and health technology assessments (HTAs).

Sorry, but although some of these points may be true, Bastian et al. don’t go into the main reason for the difference between both graphs, i.e. the higher number of trial records in CCTR (CENTRAL) than in MEDLINE: it can be simply explained by the fact that CENTRAL contains records from MEDLINE as well as from many other electronic databases and from hand-searched materials (see this post).
With respect to other details: I don’t know which NLM filter they refer to, but if they mean the narrow therapy filter, this filter is specifically meant to find randomized controlled trials and is far more specific and less sensitive than the Cochrane methodological filters for retrieving controlled clinical trials. In addition, MEDLINE does not have more language restrictions per se: it just contains an (extensive) selection of journals. (Plus, people more easily use language limits in MEDLINE, but that is beside the point.)

Elsewhere the authors say:

In Figures 2 and 3 we use a variety of data sources to estimate the numbers of trials and systematic reviews published from 1950 to the end of 2007 (see Text S1). The number of trials continues to rise: although the data from CCTR suggest some fluctuation in trial numbers in recent years, this may be misleading because the Cochrane Collaboration virtually halted additions to CCTR as it undertook a review and internal restructuring that lasted a couple of years.

As I recall it, the situation is like this: until 2005 the Cochrane Collaboration ran the so-called “retag project”, in which they searched for controlled clinical trials in MEDLINE and EMBASE (with a very broad methodological filter). All controlled trial articles were loaded into CENTRAL, and the NLM retagged the controlled clinical trials that weren’t tagged with the appropriate publication type in MEDLINE. The Cochrane stopped the laborious retag project in 2005, but still continues the (now) monthly electronic search updates performed by the various Cochrane groups (for their topics only). They also still continue handsearching. So they didn’t (virtually?!) halt additions to CENTRAL, although it seems likely that stopping the retag project caused the plateau. Again, the authors’ main points are dwarfed by not-very-accurate details.

Some interesting points in this paragraph:

  • We still do not know exactly how many trials have been done.
  • For a variety of reasons, a large proportion of trials have remained unpublished (negative publication bias!) (note: Cochrane reviews try to lower this kind of bias by applying no language limits and by including unpublished data, e.g. conference proceedings, too)
  • Many trials have been published in journals without being electronically indexed as trials, which makes them difficult to find. (note: this has improved tremendously since the CONSORT statement, an evidence-based, minimum set of recommendations for reporting RCTs, and thanks to the Cochrane retag project discussed above)
  • Astonishing growth has occurred in the number of reports of clinical trials since the middle of the 20th century, and in reports of systematic reviews since the 1980s—and a plateau in growth has not yet been reached.
  • Trials are now registered in prospective trial registers at inception, theoretically enabling an overview of all published and unpublished trials (note: this will also facilitate to find out reasons for not publishing data, or alteration of primary outcomes)
  • Once the International Committee of Medical Journal Editors announced that their journals would no longer publish trials that had not been prospectively registered, far more ongoing trials were being registered per week (200 instead of 30). In 2007, the US Congress made detailed prospective trial registration legally mandatory.

The authors do not discuss that better reporting of trials and the retag project might have facilitated the indexing and retrieval of trials.

How Close Are We to Archie Cochrane’s Goal?

According to the authors there are various reasons why Archie Cochrane’s goal will not be achieved without some serious changes in course:

  • The increase in systematic reviews didn’t displace other less reliable forms of information (Figs 3 and 4)
  • Only a minority of trials have been assessed in systematic reviews
  • The workload involved in producing reviews is increasing
  • The bulk of systematic reviews are now many years out of date.

Where to Now?

In this paragraph the authors discuss what should be changed:

  • Prioritize trials
  • Wider adoption of the concept that trials will not be supported unless an SR has shown the trial to be necessary.
  • Prioritizing SR’s: reviews should address questions that are relevant to patients, clinicians and policymakers.
  • Choose between elaborate reviews that answer a part of the relevant questions, or “leaner” reviews of most of what we want to know. Apparently the authors have already opted for the latter: they prefer:
    • shorter and less elaborate reviews
    • faster production ànd update of SR’s
    • no unnecessary inclusion of study types other than randomized trials (unless the question concerns less common adverse effects)
  • More international collaboration and thereby a better use  of resources for SR’s and HTAs. As an example of a good initiative they mention “KEEP Up,” which will aim to harmonise updating standards and aggregate updating results, initiated and coordinated by the German Institute for Quality and Efficiency in Health Care (IQWiG) and involving key systematic reviewing and guidelines organisations such as the Cochrane Collaboration, Duodecim, the Scottish Intercollegiate Guidelines Network (SIGN), and the National Institute for Health and Clinical Excellence (NICE).

Summary and comments

The main aim of this paper is to discuss to what extent the medical profession has managed to make “critical summaries, by speciality or subspeciality, adapted periodically, of all relevant randomized controlled trials”, as proposed 30 years ago by Archie Cochrane.

The emphasis of the paper is mostly on the number of trials and systematic reviews, not on qualitative aspects. Furthermore, there is too much emphasis on the methods for determining the number of trials and reviews.

The main conclusion of the authors is that an astonishing growth has occurred in the number of reports of clinical trials as well as in the number of SR’s, but that these systematic pieces of evidence shrink into insignificance compared to the unsystematically produced narrative reviews and case reports published. That is an important, but not an unexpected, conclusion.

Bastian et al. don’t address whether systematic reviews have made the growing number of trials easier to access or digest. Neither do they go into developments that have facilitated the retrieval of clinical trials and aggregate evidence from databases like PubMed: the Cochrane retag project, the CONSORT statement, and the existence of publication types and search filters (which they themselves use to filter out trials and systematic reviews). They also skip sources other than systematic reviews that make it easier to find the evidence: databases with evidence-based guidelines, the TRIP database, Clinical Evidence.
As Clay Shirky said: “It’s Not Information Overload. It’s Filter Failure.”

It is also good to note that case reports and narrative reviews serve other aims. For medical practitioners rare case reports can be very useful for their clinical practice and good narrative reviews can be valuable for getting an overview in the field or for keeping up-to-date. You just have to know when to look for what.

Bastian et al. have several suggestions for improvement, but these suggestions are not always underpinned. For instance, they propose access to all systematic reviews and trials. Perfect. But how can this be attained? We could stimulate authors to publish their trials in open access journals. For Cochrane reviews this would be desirable but difficult, as we cannot ask authors who already work for months for free on an SR to also pay the publication charges themselves. The Cochrane Collaboration is an international organization that does not receive subsidies for this. So how could this be achieved?

In my opinion, we can expect the most important benefits from prioritizing trials ànd SR’s, faster production ànd updating of SR’s, more international collaboration, and less duplication. It is a pity the authors do not mention projects other than “KEEP Up”. As discussed in previous posts, the Cochrane Collaboration also recognizes the many issues raised in this paper, and aims to speed up updates and to produce evidence on priority topics (see here and here). Evidence Aid is an example of a successful effort. But this is only the Cochrane Collaboration; many more non-Cochrane systematic reviews are produced.

And then we arrive at the next issue: not all systematic reviews are created equal. There are a lot of so-called “systematic reviews” that aren’t the conscientious, explicit and judiciously created syntheses of evidence they ought to be.

Therefore, I do not think that the proposal that each single trial should be preceded by a systematic review is a very good idea.
In the Netherlands, writing an SR is already required for NWO grants. In practice, people just approach me, as a searcher, in the days before Christmas with the idea of submitting the grant proposal (including the SR) early in January. This is evidently a fast procedure, but it doesn’t result in a high-standard SR upon which others can rely.

Another point is that such simple and fast production of SR’s will only lead to a larger increase in the number of SR’s, an effect that the authors wanted to prevent.

Of course it is necessary to get a (reliable) picture of what has already been done, and to prevent unnecessary duplication of trials and systematic reviews. The best solution would be a triplet (nano-publication)-like repository of trials and systematic reviews.

Ideally, researchers and doctors should first check such a database for existing systematic reviews. Only if no recent SR is present should they continue writing an SR themselves. Perhaps it sometimes suffices to search for trials and write a short synthesis.

There is another point I do not agree with. I do not think that SR’s of interventions should only include RCT’s. We should include those study types that are relevant. If RCT’s furnish clear proof, then RCT’s are all we need. But sometimes, or in some topics/specialties, RCT’s are not available. Inclusion of other study designs, rated with GRADE (proposed by Guyatt), gives a better overall picture. (Also see the post #notsofunny: ridiculing RCT’s and EBM.)

The authors strive for simplicity. However, the real world isn’t that simple. In this paper they have limited themselves to evidence of the effects of health care interventions. Finding and assessing prognostic, etiological and diagnostic studies is methodologically even more difficult. Still, many clinicians have these kinds of questions. Therefore systematic reviews of other study designs (diagnostic accuracy or observational studies) are also of great importance.

In conclusion, whereas I do not agree with all points raised, this paper touches upon a lot of important issues and achieves what can be expected from a discussion paper:  a thorough shake-up and a lot of discussion.


  1. Bastian, H., Glasziou, P., & Chalmers, I. (2010). Seventy-Five Trials and Eleven Systematic Reviews a Day: How Will We Ever Keep Up? PLoS Medicine, 7 (9) DOI: 10.1371/journal.pmed.1000326

Related Articles

#Cochrane Colloquium 2009: Better Working Relationship between Cochrane and Guideline Developers

19 10 2009

Last week I attended the annual Cochrane Colloquium in Singapore. I will summarize some of the meetings.

Here is a summary of an interesting (parallel) special session: Creating a closer working relationship between Cochrane and Guideline Developers. This session was brought together as a partnership between the Guidelines International Network (G-I-N) and The Cochrane Collaboration to look at the current experience of guideline developers and their use of Cochrane reviews (see abstract).

Emma Tavender of the EPOC Australian Satellite, Australia, reported on the survey carried out by the UK Cochrane Centre to identify the use of Cochrane reviews in guidelines produced in the UK (I did not attend this presentation).

Pwee Keng Ho, Ministry of Health, Singapore, is leading the health technology assessment (HTA) and guideline development program of the Singapore Ministry of Health. He spoke about the issues faced as a guideline developer using Cochrane reviews, or, in his own words, his task was “to summarize whether guideline developers like Cochrane systematic reviews or not”.

Keng Ho presented the results of 3 surveys of different guideline developers. Most surveys had very few respondents: 12-29, if I remember correctly.

Each survey had approximately the same questions, but in a different order. On the face of it, the 3 surveys gave the same picture.

Main points:

  • some guideline developers are not familiar with Cochrane systematic reviews
  • others have no access to them
  • of those who are familiar with Cochrane reviews and do have access to them, most found the reviews useful and reliable (in one survey half of the respondents were neutral)
  • most importantly, they actually did use Cochrane reviews for most of their guidelines
  • these guideline developers also used the Cochrane methodology to make their guidelines (whereas most physicians are not inclined to use the exhaustive search strategies and systematic approach of the Cochrane Collaboration)
  • an often-heard critique from guideline developers concerned the non-comprehensive coverage of topics by Cochrane reviews. However, unlike in Western countries, the Singapore Ministry of Health mentioned acupuncture and herbs as missing topics (for certain diseases).

This incomplete coverage, caused by a choice of subjects that is not demand-driven, was a recurrent topic at this meeting and a main issue recognized by the entire Cochrane community. Therefore, priority setting for Cochrane systematic reviews is one of the main topics addressed at this Colloquium and in the Cochrane strategic review.

Kay Dickersin of the US Cochrane Center, USA, reported on the issues raised at the stakeholders meeting held in June 2009 in the US (see here for agenda) on whether systematic reviews can effectively inform guideline development, with a particular focus on areas of controversy and debate.

The Stakeholder summit concentrated on using quality SR’s for guidelines. This is different from effectiveness research, for which the Institute of Medicine (IOM) sets the standards: local and specialist guidelines require a different expertise and approach.

All kinds of people are involved in the development of guidelines, i.e. nurses, consumers, physicians.
Important issues to address, point by point:

  • Some may not understand the need to be systematic
  • How to get physicians on board: they are not very comfortable with extensive searching and systematic work
  • Ongoing education, like how-to workshops, is essential
  • What to do if there is no evidence?
  • More transparency; handling conflicts of interest
  • Guidelines differ, including the rating of the evidence. Almost everyone in the Stakeholders meeting used GRADE to grade the evidence, but not as it was originally described. There were numerous variations on the same theme. One question is whether there should be one system or not.
  • Another recurrent issue was that guidelines should be made actionable.

Here are podcasts covering the meeting.

Gordon Guyatt, McMaster University, Canada, gave  an outline of the GRADE approach and the purpose of ‘Summary of Findings’ tables, and how both are perceived by Cochrane review authors and guideline developers.

Gordon Guyatt, whose magnificent book “Users’ Guides to the Medical Literature” (JAMAevidence) lies on my desk, was clearly in favor of adherence to the original GRADE guidelines. Forty organizations have adopted these GRADE guidelines.

GRADE stands for “Grading of Recommendations Assessment, Development and Evaluation”. It is a system for grading evidence, for instance when submitting a clinical guidelines article. Six articles in the BMJ are specifically devoted to GRADE (see here for one (full text); and 2 (PubMed)). GRADE not only takes the rigor of the methods into account, but also the balance between the benefits and the risks, burdens, and costs.

Suppose a guideline recommends thrombolysis to treat disease X because good-quality small RCTs show thrombolysis to be slightly but significantly more effective than heparin in this disease. However, by relying only on direct evidence from the RCTs, one does not take into account that observational studies have long shown that thrombolysis increases the risk of massive bleeding in diseases Y and Z. Clearly, the risk of harm is the same in disease X: both benefits and harms should be weighed.
Guyatt gave several other examples illustrating the importance of grading the evidence and the understandable overview presented in the Summary of Findings Table.

Another issue is that guideline makers are distressingly ready to embrace surrogate endpoints instead of outcomes that are more relevant to the patient. For instance, it is not very meaningful if angiographic outcomes are improved but mortality or the recurrence of cardiovascular disease are not.
GRADE takes into account whether indirect evidence is used: it downgrades the evidence rating. Downgrading also occurs in the case of low-quality RCTs, or when the benefits do not clearly outweigh the harms.

Guyatt pleaded for uniform use of GRADE, and advised everybody to get comfortable with it.

Although I must say that it can feel somewhat uncomfortable to assign absolute ratings to non-absolute differences. These are really man-made formulas that people have agreed upon. On the other hand, it is a good thing that it is not only the outcomes of the RCTs with respect to benefits (of sometimes surrogate markers) that count.

A final remark of Guyatt: “Everybody makes the claim they are following an evidence-based approach, but you have to teach them what that really means.”
Indeed, many people describe their findings and/or recommendations as evidence-based, because “EBM sells well”, but upon closer examination many reports are hardly worth the name.


Grey Literature: Time to make it systematic

6 09 2009

Guest author: Shamsha Damani (@shamsha)

Grey literature is a term I first encountered in library school; I remember dubbing it “the-wild-goose-chase search” because it is time consuming, totally un-systematic, and a huge pain altogether. Things haven’t changed much in the grey literature arena, as I found out last week, when my boss asked me to help with the grey literature part of a systematic review.

Let me back up a bit and offer the official definition for grey literature by the experts of the Grey Literature International Steering Committee: “Information produced on all levels of government, academics, business and industry in electronic and print formats not controlled by commercial publishing i.e. where publishing is not the primary activity of the producing body.” Grey literature can include things such as policy documents, government reports, academic papers, theses, dissertations, bibliographies, conference abstracts/proceedings/papers, newsletters, PowerPoint presentations, standards/best practice documents, technical specifications, working papers and more! (Benzies et al 2006). So what is so time consuming about all this? There is no one magic database that will search all these at once. Translation: you have to search a gazillion places separately, which means you have to learn how to search each of these gazillion websites/databases separately. Now if doing searches for systematic reviews is your bread-and-butter, then you are probably scoffing already. But for a newbie like me, I was drowning big time.

After spending what seemed like an eternity to finish my search, I went back to the literature to see why inclusion of grey literature was so important. I know that grey literature adds to the evidence base and results in a comprehensive search, but it is often not peer-reviewed, and the quality of some of the documents is often questionable. So what I dug up was a bit surprising. The first was a Cochrane Review from 2007 titled “Grey literature in meta-analyses of randomized trials of health care interventions (review).” The authors concluded that not including grey literature in meta-analyses produced inflated results when looking at treatment effects. So the reason for inclusion of grey literature made sense: to reduce publication bias. Another paper published in the Bulletin of the World Health Organization concluded that grey literature tends to be more current, provides global coverage, and may have an impact on cost-effectiveness of various treatment strategies. This definitely got my attention because of the new buzzword in Washington: Comparative Effectiveness Research (CER). A lot of the grey literature is comprised of policy documents so it definitely has a big role to play in systematic reviews as well. However, the authors also pointed out that there is no systematic way to search the grey literature and undertaking such a search can be very expensive and time consuming. This validated my frustrations, but gave no solutions.

When I was struggling to get through my search, I was delighted to find a wonderful resource from the Canadian Agency for Drugs and Technologies in Health. They have created a document called “Grey Matters: A Practical Search Tool for Evidence-Based Medicine”, which is a 34-page checklist of many of the popular websites for searching grey literature, including a built-in documentation system. It was still tedious work because I had to search a ton of places, many resulting in no hits. But at least I had a start and a transparent way of documenting my work.

However, I’m still at a loss for why there are no official guidelines for librarians searching for grey literature. There are clear guidelines for authors of grey literature. Benzies and colleagues give compelling reasons for inclusion of grey literature in a systematic review, complete with a checklist for authors! Why not have guidelines for searching too? I know that every search would require different tools, but I think a master list could be created, sort of like a must-search-these-first type of list. It surely would help a newbie like me. I know that many libraries have such lists, but they tend to be 10 pages long, with bibliographies for bibliographies! Based on my experience, I would start with the following resources the next time I encounter a grey literature search:

  1. National Guideline Clearinghouse
  2. Centre for Reviews and Dissemination
  3. Agency for Healthcare Research and Quality (AHRQ)
  4. Health Technology Assessment International (HTAI)
  5. Turning Research Into Practice (TRIP)

Some databases like Mednar, Deep Dyve, RePORTer, OAIster, and Google Scholar also deserve a mention, but I have not had much luck with them. This is obviously not meant to be an exhaustive list. For that, I present my Delicious page, which is ever-growing.
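For fellow newbies, the truly tedious part is the bookkeeping: where did you search, when, with which query, and how many hits. Here is a minimal sketch (in Python, with made-up queries; the source names are just the ones from my list above, nothing official) of how one could keep a transparent search log across many grey-literature sources:

```python
from datetime import date

# A personal checklist of grey-literature sources to work through.
SOURCES = [
    "National Guideline Clearinghouse",
    "Centre for Reviews and Dissemination",
    "AHRQ",
    "HTAI",
    "TRIP",
]

def log_search(log, source, query, hits):
    """Record one search so the strategy stays reproducible."""
    log.append({
        "source": source,
        "query": query,
        "hits": hits,
        "date": date.today().isoformat(),
    })
    return log

log = []
for src in SOURCES:
    # hits=0 here is a placeholder; in practice you record the real count.
    log_search(log, src, '"intermittent claudication" AND exercise', hits=0)

# Which checklist sources still need searching?
searched = {entry["source"] for entry in log}
remaining = [s for s in SOURCES if s not in searched]
print(f"{len(log)} searches documented, {len(remaining)} sources remaining")
# prints: 5 searches documented, 0 sources remaining
```

Even a bare-bones log like this answers the “where, when, and with what did you search” questions that reviewers (and your future self) will ask, which is exactly what the Grey Matters checklist’s built-in documentation does on paper.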

Finally, a request for the experts out there: if you have any tips on how to make this process less painful, please share them here. The newbies of the world will appreciate it.

Shamsha Damani

Clinical Librarian


#CECEM Bridging the Gap between Evidence Based Practice and Practice Based Evidence

15 06 2009

A very interesting presentation at the CECEM was given by the organizer of this continental Cochrane meeting, Rob de Bie. De Bie is Professor of Physiotherapy Research and director of Education of the Faculty of Health within the dept. of Epidemiology of the Maastricht University. He is both a certified physiotherapist and an epidemiologist. Luckily he kept the epidemiologic theory to a minimum. In fact he is a very engaging speaker who keeps your attention to the end.


While guidelines were already present in the Middle Ages in the form of formalized treatment of daily practice, clinical guidelines have emerged more recently. These are systematically developed statements which assist clinicians and patients in making decisions about appropriate treatment for specific conditions.

Currently, there are 3 kinds of guidelines, each with its own shortcomings.

  • Consensus-based. Consensus may be largely influenced by group dynamics. As the quip goes: consensus = non-sensus, and consensus guidelines are “guide-lies”.
  • Expert-based. Possibly even worse than consensus-based: it can suffer from all kinds of bias, such as expert and opinion bias or external financing.
  • Evidence-based. Recommendations are based on the best available evidence, deal with specific interventions for specific populations, and follow a systematic approach.

The quality of evidence-based guidelines depends on whether the evidence is good enough, transparent, credible, available, applied, and not ‘muddled’ by health care insurers.
It is good to realize that some trials are never done, for instance because of ethical considerations. It is also true that only part of what you read (in the conclusions) has actually been done, and some trials are republished several times, each time with a better outcome…

Systematic reviews and good-quality trials that don’t give answers.

Next Rob showed us the results of a study (Jadad and McQuay, J Clin Epidemiol, 1996) with the efficacy stated in the review plotted on the X-axis and the quality score on the Y-axis. Surprisingly, high-quality meta-analyses were less likely to produce positive results. Similar results were also obtained by Suttorp et al. in 2006 (see figure below).


Photo by Chris Mavergames

There may be several reasons why good trials do not always give good answers. Well-known reasons are lack of randomization or blinding. However, Rob focused on a less obvious reason. Despite its high level of evidence, a Randomized Controlled Trial (RCT) may not always be suitable to provide good answers applicable to all patients, because RCTs often fail to reflect true clinical practice. The inclusion of patients in RCTs is often selective: middle-aged men, with exclusion of co-morbidity, whereas co-morbidity occurs in over 20% of people aged 60 and older and in over 40% of people aged 80 and older (André Knottnerus in his speech).

Usefulness of a Nested Trial Cohort Study coupled to an EHR to study interventions.

Next, Rob showed that a nested trial cohort study can be useful to study the effectiveness of interventions. He used this in conjunction with an EHR (electronic health record), which could be accessed by both practitioner and patient.

One of the diseases studied in this way was intermittent claudication. Most commonly, intermittent claudication is a manifestation of peripheral arterial disease in the legs, causing pain and cramps in the legs while walking (hence the name). The mortality is high: the 5-year mortality rates lie between those of colorectal cancer and non-Hodgkin lymphoma. This is related to the underlying atherosclerosis.

There are several risk factors, some of which cannot be modified, like hereditary factors, age and gender. Other factors, like smoking, diet, physical inactivity and obesity can be tackled. These factors are interrelated.

Rob showed that, whereas there may be an overall null effect of exercise in the whole population, the effect may differ per subgroup.


  • Patients with mild disease and no co-morbidity may directly benefit from exercise therapy (blue area).
  • Exercise has no effect on smokers, probably because smoking is the main causative factor.
  • People with unstable diabetes first show an improvement, which levels off after a few weeks due to exercise-induced hypo- or hyperglycaemia.
  • A similar effect is seen in COPD patients: the exercise becomes less effective as the patients become short of breath.

It is important to first regulate the diabetes or COPD before continuing the exercise therapy. By individually optimizing the intervention(s), a far greater overall effect is achieved: a 191% improvement in the maximal (pain-free) walking distance, compared with, for instance, <35% according to a 2007 Cochrane systematic review.

Another striking effect: exercise therapy affects some of the prognostic factors. Whereas there is no effect on BMI (which remains an important risk factor), age and diabetes become less important risk factors.

(figure: shift in prognostic factors)

Because guidelines are quickly outdated, the findings are directly implemented in the existing guidelines.

Another astonishing fact: the physiotherapists pay for the system, not the patients or the government.

More information can be found on … Although the presentation is not (yet?) available on the net, I found a comparable presentation here.

** (2009-06-15) Good news: the program and all presentations can now be viewed at:

#CECEM David Tovey -the Cochrane Library’s First Editor in Chief

13 06 2009

This week I was attending another congress, the Continental European Cochrane Entities Meeting (CECEM).

This annual meeting is meant for staff from Cochrane entities: centre staff, RGCs (Review Group Coordinators), TSCs (Trial Search Coordinators) and other staff members of the Cochrane Collaboration based in continental Europe.

CECEM 2009 was held in Maastricht, the beautiful old Roman city in the south of the Netherlands: the city where my father was born and where I spent many holidays.

One interesting presentation was by Cochrane’s first Editor in Chief, David Tovey, previously a GP in an urban practice in London for 14 years and Editorial Director of the BMJ Group’s ‘Knowledge’ division (responsible for BMJ Clinical Evidence and its sister product Best Treatments; see the announcement in Medical News Today).

David began by saying that the end user is really the key person and that the impact of the Cochrane Reviews is most important.

“How is it that a Senior health manager in the UK may shrug his shoulders when you ask him if he has ever heard of Cochrane?”

“How do we make sure that our work had impact? Should we make use of user generated content?”

Quality is central, but quality depends on four pillars. Cochrane reviews should be reliable, timely, relevant and accessible.


How quality is perceived is dependent on the end users. There are several kinds of end users, each with his own priorities.

  1. doctor: wants comprehensive and up-to-date info, wants to understand and get answers quickly.
  2. patient: trustworthiness, up-to-date, wants to be able to make sense of it.
  3. scientist: wants to see how the conclusions are derived.
  4. policy and guideline-makers.

Reliable: Several articles have shown Cochrane systematic reviews to be more reliable than other systematic reviews (Moher, PLoS, BMJ)*

Timely: First it takes time to submit a title for a Cochrane review, and then it takes at least 2 years before a protocol becomes a review. Some reviews take even longer than 2 years. So there is room for improvement.

Patients are also very important end users. Strikingly, the systematic review about the use of cranberry to prevent recurrent urinary tract infection is the most frequently viewed article, and this is not because doctors are most interested in this particular treatment….

Doctors: Doctors often rely on their colleagues for a quick and trustworthy answer. Challenge: “can we make consulting the Cochrane Library as easy as asking a colleague: thus timely and easy?”


  • making plain language summaries more understandable
  • Summary of Findings
  • podcasts of systematic reviews (very successful till now), e.g. see an earlier post.
  • Web 2.0 innovations

Key challenges:

  • ensure and develop consistent quality
  • (timely) updating
  • putting the customer first: applicability & prioritization
  • web delivery
  • resources (not every group has the same resources)
  • make clear what an update means and how important this update is: are there new studies found? are these likely to change conclusions or not? When was the last amendment to the search?

I found the presentation very interesting. What I also liked is that David stayed with us for two days (also during the social program) and was easily approachable. I very much support the idea of a user-centric approach. However, I had expected the emphasis to be less on timeliness (of updates, for instance) and more on how users (patients, doctors) can get more involved and how the subjects that are most urgently needed get reviewed. Indeed, when I twittered that Tovey suggested that we “make consulting the Cochrane Library as easy as asking a colleague”, Jon Brassey of TRIP answered that a lot has to be done to fulfill this, as the Cochrane only answers 2 out of 350+ questions asked by GPs in the UK, a statement that appeared to be based on his own experience (Jon is the founder of the TRIP database).

But in principle I think that Jon is correct. Right now too few questions (in the field of interventions) are directly answered by Cochrane Systematic Reviews and too little is done to reach and involve the Cochrane Library users.

(screenshot: Twitter discussion during CECEM)

During the CECEM other speakers addressed some of these issues in more detail. André Knottnerus, Chair of the Dutch Health Council, discussed “the impact of Cochrane Reviews”, and Rob de Bie of the Rehabilitation & Related Therapies field discussed “Bridging the gap between evidence based practice and practice based evidence”, while Dave Brooker launched ideas about how to implement Web 2.0 tools. I hope to summarize these (and other) presentations in a blogpost later on.

*have to look this up

NOTE (2009-11-10).

I had forgotten about this blank “citation” till this post was cited in quite another context (see comments), and someone commented that the asterisk next to “the amazing statement” still had to be looked up, indirectly arguing that the statement was therefore not reliable, and continued by giving an example of a typically flawed Cochrane review that hit the headlines 4 years ago, a typical exception to the rule that “Cochrane systematic reviews are more reliable than other systematic reviews”. Of course, when it is said that A is more trustworthy than B, this is meant on average. I’m a searcher, and on average Cochrane searches are excellent, but when I do my best I can surely find some that are not good at all. Without doubt that also pertains to other parts of Cochrane systematic reviews.
In addition, and that was the topic of the presentation, there is room for improvement.

Now about the asterisk, which according to Susannah should have been (YIKES!) 100 times bigger. This was a post based on a live presentation and I couldn’t pick up all the references on the slides while making notes. I had hoped that David Tovey would have made his ppt public, so I could have checked the references he gave. But he didn’t and so I forgot about it. Now I’ve looked some references up, and, although they might not be identical to the references that David mentioned, they are in line with what he said:

  1. Moher D, Tetzlaff J, Tricco AC, Sampson M, Altman DG (2007). Epidemiology and Reporting Characteristics of Systematic Reviews. PLoS Med 4(3): e78. doi:10.1371/journal.pmed.0040078 (free full text)
  2. The PLoS Medicine Editors (2007). Many Reviews Are Systematic but Some Are More Transparent and Completely Reported than Others. PLoS Med 4(3): e147. doi:10.1371/journal.pmed.0040147 (free full text; editorial comment on [1])
  3. Tricco AC, Tetzlaff J, Pham B, Brehaut J, Moher D (2009). Non-Cochrane vs. Cochrane reviews were twice as likely to have positive conclusion statements: cross-sectional study. J Clin Epidemiol 62(4):380-386.e1. Epub 2009 Jan 6. [PubMed citation]
  4. Jørgensen AW, Hilden J, Gøtzsche PC (2006). Cochrane reviews compared with industry supported meta-analyses and other meta-analyses of the same drugs: systematic review. BMJ 333:782. doi:10.1136/bmj.38973.444699.0B (free full text)
  5. Jadad AR, Moher M, Browman GP, Booker L, Sigouin C, Fuentes M, Stevens R (2000). Systematic reviews and meta-analyses on treatment of asthma: critical evaluation. BMJ 320:537-540. doi:10.1136/bmj.320.7234.537 (free full text)

In previous posts (Merck’s Ghostwriters, Haunted Papers and Fake Elsevier Journals and One Third of the Clinical Cancer Studies Report Conflict of Interest) I regularly discussed that pharma-sponsored trials rarely produce results that are unfavorable to the companies’ products [e.g. see here for an overview, and the many papers of Lisa Bero].

Also pertinent to the abovementioned discussion at E-patient-Net is my earlier post: The Trouble with Wikipedia as a Source for Medical Information. (references still not in the correct order. Yikes!)


New Cochrane Handbook: altered search policies

14 11 2008

cochrane-symbolThe Cochrane Handbook for Systematic Reviews of Interventions is the official document that describes in detail the process of preparing and maintaining Cochrane systematic reviews on the effects of healthcare interventions.

The current version of the Handbook, 5.0.1 (updated September 2008), is available either for purchase from John Wiley & Sons, Ltd or for download only to members of The Cochrane Collaboration (via the Collaboration’s information management system, Archie).
Version 5.0.0, updated February 2008, is freely available in browseable format, here. It should be noted, however, that this version is not as up to date as version 5.0.1. The methodological search filters, for instance, are not completely identical.

As an information specialist I will concentrate on Chapter 6: Searching for studies.

This chapter consists of the following paragraphs:

  • 6.1 Introduction
  • 6.2 Sources to search
  • 6.3 Planning the search process
  • 6.4 Designing search strategies
  • 6.5 Managing references
  • 6.6 Documenting and reporting the search process
  • 6.7 Chapter information
  • 6.8 References

As in previous versions, the essence of a Cochrane search is to perform a comprehensive (sensitive) search for relevant studies (RCTs) to minimize bias. The most prominent changes are:

1. More emphasis on the central role of the Trial Search Coordinator (TSC) in the search process.
Practically every paragraph summary begins with the advice to consult the TSC, e.g. in 6.1: Cochrane review authors should seek advice from the Trials Search Co-ordinator of their Cochrane Review Group (CRG) before starting a search.

One of the main roles of TSCs is assisting authors with searching, although the range of assistance may vary from advice on how to run searches to designing, running and sending the searches to authors.

I know from experience that most authors do not have enough search literacy to complete the entire search satisfactorily on their own. Not even all librarians may be equipped to perform such exhaustive searches. That is why the handbook says: “If a CRG is currently without a Trials Search Co-ordinator authors should seek the guidance of a local healthcare librarian or information specialist, where possible one with experience of conducting searches for systematic reviews.”

Another essential core function of the TSC is the development and maintenance of the Specialized Register, containing all relevant studies in their area of interest, which is submitted to CENTRAL (the Cochrane Central Register of Controlled Trials) on a quarterly basis. CENTRAL is the most comprehensive source of reports of controlled trials (~500,000 records), available in “The Cochrane Library” (there it is called CLINICAL TRIALS). CENTRAL is available to all Cochrane Library subscribers, whereas the Specialized Register is only available via the TSC.


Redrawn from the Handbook Fig. 6.3.a: The contents of CENTRAL

2. Therefore, trial registers are an increasingly important source of information. CENTRAL is considered to be the best single source of reports of trials that might be eligible for inclusion in Cochrane reviews. However, contrary to what many authors would expect, a search of MEDLINE (PubMed) alone is not considered adequate.

The approach now is: Specialized Registers/CENTRAL and MEDLINE should be searched as a minimum, together with EMBASE if it is available (apart from topic-specific databases and snowballing). MEDLINE should be searched from 2005 onwards, since CENTRAL contains all records from MEDLINE indexed with the Publication Type term ‘Randomized Controlled Trial’ or ‘Controlled Clinical Trial’ (a substantial proportion of these MEDLINE records have been retagged as a result of the work of The Cochrane Collaboration (Dickersin 2002)).

Personally, for non-Cochrane searches, I would rather search the other way around: MEDLINE (OVID) first, then EMBASE (OVID) and finally CENTRAL, and deduplicate the results afterwards (in Reference Manager, for instance). The (Wiley) Cochrane Library is not easy to search for non-experienced users (you have to know the MeSH terms beforehand; there is as yet no mapping). If you start your search in MEDLINE (OVID), you can easily transform it for EMBASE and subsequently CENTRAL (using MeSH and EMBASE keywords as well as textwords).
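To illustrate that deduplication step (a hypothetical Python sketch with made-up records, not how Reference Manager works internally): results retrieved from MEDLINE, EMBASE and CENTRAL can be merged on a normalized title key, so that the same trial found in two databases is kept only once.

```python
def normalize(title):
    """Crude duplicate-detection key: lowercase, alphanumerics only,
    so trivial differences in case and punctuation don't matter."""
    return "".join(ch for ch in title.lower() if ch.isalnum())

def deduplicate(*result_sets):
    """Merge search results in order, keeping the first record per key."""
    seen, merged = set(), []
    for results in result_sets:
        for record in results:
            key = normalize(record["title"])
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged

# Made-up example records from three databases:
medline = [{"title": "Exercise for intermittent claudication", "db": "MEDLINE"}]
embase  = [{"title": "Exercise for Intermittent Claudication.", "db": "EMBASE"}]
central = [{"title": "A different trial entirely", "db": "CENTRAL"}]

unique = deduplicate(medline, embase, central)
print(len(unique))  # 2: the MEDLINE and EMBASE records collapse into one
```

In practice reference managers match on more than the title (authors, year, journal, page numbers), but the principle is the same: search the databases in your preferred order, then collapse the overlap.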

3. The full search strategies for each database searched need to be included in an appendix, with the total number of hits retrieved by the electronic searches included in the Results section. Indeed, reporting has been very variable, with some authors referring only to the general search strategy of the group, which made the search part less transparent.

4. Two new Cochrane Highly Sensitive Search Strategies for identifying randomized trials in MEDLINE have been developed: a sensitivity-maximizing version and a sensitivity- and precision-maximizing version. These filters (to be combined with the subject search) were designed for MEDLINE-indexed records; therefore, a separate search is needed to find non-indexed records as well. An EMBASE RCT filter is still under development.
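To give a feel for how such a filter is used: the sensitivity-maximizing PubMed version is often reproduced along the lines below. Treat this as a sketch combined with a made-up subject search; the authoritative wording of the filter should always be taken from the Handbook itself.

```python
# The RCT filter as commonly reproduced for the PubMed interface
# (check the Handbook for the authoritative, current wording).
RCT_FILTER = (
    "(randomized controlled trial[pt] OR controlled clinical trial[pt] "
    "OR randomized[tiab] OR placebo[tiab] OR drug therapy[sh] "
    "OR randomly[tiab] OR trial[tiab] OR groups[tiab]) "
    "NOT (animals[mh] NOT humans[mh])"
)

def build_search(subject_query):
    """AND a subject search together with the methodological filter."""
    return f"({subject_query}) AND {RCT_FILTER}"

# Example subject search (made up for illustration):
print(build_search('"intermittent claudication" AND exercise'))
```

The filter is deliberately broad (sensitivity-maximizing): it will pull in many non-RCTs, which is exactly the trade-off a comprehensive Cochrane search accepts in order to miss as few trials as possible.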

These methodological filters will be exhaustively discussed in another post.

The (un)usefulness of regular breast exam

7 09 2008

Regular breast exam, either by women themselves (BSE, breast self-exam) or by a doctor or nurse, has been promoted for many years, because it would help to detect breast cancer earlier, and “when breast cancer is found earlier, it’s easier to treat and cure”. At least that is what most people believe and what has been advocated by organizations and Internet companies (e.g. those selling special gloves) (see figure).

The idea that regular breast exam is truly beneficial, however, has recently been challenged by a Cochrane systematic review conducted by Kösters and Gøtzsche.[1] This review has stirred up quite a debate among doctors, guideline-makers, patients and women in general. Many major organizations and advocacy groups have stopped recommending routine BSE. Reactions of patients vary from ‘relieved’ to ‘confused that it is no longer needed’ or even a bit angry (‘it is my body and I decide whether I check it or not’). See for instance these reactions: 1, 2, 3. Coverage in the media is sometimes misleading, but reactions of (some) doctors or “experts in the field” do not always help to convey a clear message to the public either. Some seize the opportunity to rant against EBM (Evidence Based Medicine) in general, which makes things even less transparent; see for instance this post by Dr Rich (although he has some good points as well), this story in the Herald and this one in Medscape.

I will try to cover the story in question-and-answer form.

1. What is the conclusion from the study?
The authors conclude that regular breast examination (BE) does more harm than good and is therefore not recommended.

2. Which harm, which good?
Breast examination didn’t lower mortality (no benefit), whereas it led to about twice as many unnecessary biopsies (harm).

3. Why did they look at mortality only?
They didn’t; they also scored the number and stage of the cancers identified. However, mortality (or really survival) is the outcome that matters most to patients. Suppose screening finds more breast cancers, but early intervention does not lead to any cure; then the early recognition of the cancer is of no real value to the patient.

4. Why are more unnecessary biopsies considered as harm?
Biopsies are invasive procedures and lead to unnecessary anxiety, which can have a long-lasting effect on psychological well-being. Extra tests to rule out cancer also cost a lot of money. Whether it is ‘worth it’ depends on whether, and to what extent, people’s lives are saved (or their quality of life improved).
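To see why “no effect on mortality, more biopsies” is the decisive trade-off, here is a toy calculation with purely illustrative numbers (not the trial data): a risk ratio of 2.0 for biopsies alongside a risk ratio of 1.0 for death means the screening doubled a harm while delivering no survival benefit.

```python
def risk_ratio(events_a, n_a, events_b, n_b):
    """Relative risk of an event in group A versus group B."""
    return (events_a / n_a) / (events_b / n_b)

# Illustrative numbers only: 2% of screened women biopsied vs 1% of controls.
rr_biopsy = risk_ratio(2000, 100_000, 1000, 100_000)

# A mortality risk ratio of 1.0 means no survival benefit from screening.
rr_death = risk_ratio(50, 100_000, 50, 100_000)

print(rr_biopsy, rr_death)  # 2.0 1.0
```

This is the same logic the reviewers apply, just stripped of the statistics: when the benefit side of the ledger is empty, any extra harm, however modest per woman, tips the balance.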

5. What kind of study is it?
It is a systematic review (of controlled clinical trials) made by the Cochrane Collaboration (see glossary). Generally these systematic reviews are of high methodological quality, because of the systematic and explicit methods used to identify, select and critically appraise relevant research. After an extensive search for all trials, only controlled clinical trials (studies of the highest evidence level) with predefined characteristics are included. Thus the authors really look for all the high-level evidence there is, instead of grabbing some papers from the drawer or looking at the core English-language journals only.

6. Is this new information?
No, not really. In fact this systematic review is an update of a previous version, published in 2003. The studies included and the conclusions remain the same. As shown in the scheme below (taken from a figure in a very interesting opinion paper entitled “Challenges to cancer control by screening”; see abstract here), the attitude towards breast self-examination already changed soon after the original trials were published.


M.N. Pollak and W.D. Foulkes: Nature Reviews Cancer 3, 297 (2003)

7. Omg? ….
All Cochrane systematic reviews have to be updated regularly to see if there is any new evidence that could alter the conclusions. In this case, after updating the search, no new studies of good quality were found. However, there are still some trials ongoing.

8. Can we rely on these conclusions? Is the Cochrane Review of good enough quality?
The Cochrane review itself is of high quality, but the two randomized studies included, one from Russia (1999: ~122,500 participants) and one from Shanghai (2002: ~266,000 participants), have some serious flaws. For instance, both studies lacked adequate allocation concealment (keeping those enrolling participants unaware of the upcoming assignments). Inadequate concealment undermines the validity of a trial (see for instance this 2002 Lancet paper). Also, the description of the statistical methods was lacking. Furthermore, data from the Moscow branch of the Russian study were incomplete (these were excluded), mammography might have been used additionally, and in the Shanghai trial there was a large difference in all-cause mortality in favor of the control group, suggesting that the two groups were imbalanced from the start.

9. Can the results of these rather old trials from countries as China and Russia be directly translated to the situation in Western Countries with a high standard of care?
Intuitively I would say ‘probably not’. However, we still don’t know whether the current western quality of care would actually lead to a better outcome after early detection, because this has never been tested in a well-performed controlled trial.

10. Is this outcome applicable to anyone?
No, the studies are applicable to healthy, middle-aged women without any particular risk. Screening methods might be more useful or even required for women at high risk (e.g. familial predisposition, previous ovarian or breast cancer).

11. Still, in recent interviews experts in the field say they do know that BSE is beneficial. One doctor, for instance, referred (in this Medscape paper) to a recent trial that concluded that breast self-examination should be promoted for early detection of breast cancer (see here).
Either these doctors/experts give their personal opinion, refer to unpublished data, or refer to studies with a lower evidence level. For instance, the study referred to by Dr. Goldstein above was a retrospective study looking at how accurately women could detect a breast tumor. Retrospective studies are more prone to bias (see a previous post on levels of evidence for dummies). Furthermore, this study didn’t evaluate a hard outcome (survival, better prognosis), and there are just as many retrospective studies claiming the opposite, e.g. this article by Newcomb et al. in J Natl Cancer Inst 1991 (abstract).

12. Should women refrain from breast self-examination then?
I found a short article (half an A4 page) in the Dutch women’s magazine (!) Viva very clear and concise.
Four women gave their opinions.

A patient who had had a previous breast tumor kept on checking it (she is in a high-risk group).

The director of a patient association said: “there is no evidence that BSE is beneficial: don’t feel guilty if you don’t check your breasts. But it might have a reassuring effect if you do”.
The spokeswoman of the Dutch association “struggle against cancer” (KWF) said that they no longer promoted structural breast exams, but advised women to “know your body” and know the alarm signals (a retracting nipple, etc.), much the same way as you check for alterations in nevi. Most women find small alterations anyway, said another, for instance when taking a shower.
Indeed, exemplified by my own experience: 18 years ago my mother detected breast cancer when feeling a lump in her breast under the shower (malignant, but curable).

The Cochrane authors are also very clear in their review about the necessity of women noticing changes to their breast.

“Some women will continue with breast self-examination or will wish to be taught the technique. We suggest that the lack of supporting evidence from the two major studies should be discussed with these women to enable them to make an informed decision.
It would be wrong, however, to conclude that women need not be aware of any breast changes. It is possible that increased breast awareness may have contributed to the decrease in mortality from breast cancer that has been noted in some countries. Women should, therefore, be encouraged to seek medical advice if they detect any change in their breasts that may be breast cancer.”

Listen to this podcast featuring the Cochrane authors to learn more about their findings.



Periodieke borstcontrole, uitgevoerd door vrouwen zelf of door artsen/verplegers, is jarenlang gepromoot, omdat je hierdoor eerder borstkanker zou ontdekken, waardoor het beter te genezen is. Deze gedachte wordt actief uitgedragen door verschillende organisaties en Internetbedrijven (die bijvoorbeeld speciale handschoenen verkopen, zie figuur).

Dat regelmatige borstcontrole zinvol zou zijn, wordt echter tegengesproken door een recent Cochrane Systematisch Review, uitgevoerd door Kösters and Gøtzsche.[1] Dit review heeft heel wat losgemaakt bij dokters, makers van richtlijnen, patienten en vrouwen in het algemeen. Veel belangrijke organisaties bevelen niet langer het maandelijks controleren van de borsten aan. De reacties van (engelstalige) patienten varieert van ‘opgelucht’ tot ‘in verwarring gebracht’ of lichtelijk boos (‘ik maak verdorie zelf nog wel even uit wat ik doe’). Zie bijv. enige reacties hier: 1, 2, 3. De berichtgeving door sommige media is soms misleidend. Dat is vaker zo, maar vervelender is het dat reacties van sommige artsen of ‘experts’ heel gekleurd zijn, waardoor de boodschap niet goed overkomt bij het publiek. Sommigen grijpen de gelegenheid aan om even goed op EBM (Evidence Based Medicine) af te geven, zie bijv. deze post van Dr Rich (die overigens ook zinnige dingen opmerkt), dit bericht in de Herald and dit in Medcape (zie onder).

Ik zal proberen om dit onderwerp in een vraag-en antwoord-vorm te bespreken.

1. Wat zijn de conclusies uit de studie?
Dat structureel borstonderzoek door vrouwen zelf of door artsen/verplegers meer kwaad dan goed doet, en dus niet langer aanbevolen kan worden.

2. Welk kwaad, welk goed?
Maandelijkse controle van de borsten leidt niet tot minder sterfte (niet ‘beter’), maar wel tot 2x zoveel biopsies (kwaad, ‘harm’).

3. Waarom kijken ze alleen naar sterfte?
Ze keken ook naar het aantal ontdekte kankers en hun stadia, maar sterfte (of eigenlijk overleving) is veel belangrijker voor de patient. Stel dàt je eerder borstkanker vindt door screening, maar dit leidt niet tot genezing en/of een betere kwaliteit van leven, dan schiet de patient daar niets mee op, integendeel (zij weet het langer).

4. Why are biopsies considered 'harmful'?
A biopsy is a medical procedure that, certainly in the case of suspected cancer, can have a long-lasting negative effect on someone's mental wellbeing. Biopsies and other tests needed to rule out cancer also cost a lot of money. Whether this is 'worth it' depends on how useful those tests really are, that is, whether they increase the chance of survival or of a better quality of life.

5. What kind of study is it?
It is a systematic review (of 'controlled' clinical trials) produced by authors of the Cochrane Collaboration (see Glossary). In general these reviews are of excellent methodological quality, because studies are searched, selected, appraised and summarised according to a fixed protocol, with all criteria laid down in advance. The authors thus really try to unearth all the evidence (positive or negative, without language restrictions), rather than pulling a few articles off the shelf or selecting only the top journals.

6. Is this information new?
No, not really. This systematic review is in fact an update of a previous version from 2003. The included studies and the conclusions are the same. As shown in the diagram below (Nature Reviews Cancer 2003, summary here), attitudes towards breast self-examination have been shifting ever since the publication of the original studies (those included in the Cochrane review). The American Cancer Society, for instance, has not recommended monthly self-examination since then.


M.N. Pollak et al: Nature Reviews Cancer 3, 297 (2003)

7. Huh? ….
All Cochrane systematic reviews are supposed to be updated regularly, to check whether new evidence has emerged that might lead to a different conclusion. In this case no new studies of good quality were found. A few trials are, however, still under way.

8. Can we rely on these conclusions? Are Cochrane reviews of sufficient quality?
The Cochrane review itself is of good quality, but the two studies included in it (one from Russia, 1999, with about 122,500 participants, and one from Shanghai, 2002, with about 266,000 participants) do have their shortcomings. In both studies the concealment of allocation (keeping patients and caregivers unaware of treatment assignment) was inadequate, which makes a trial less valid (see for instance this 2002 article in the Lancet). Furthermore, the description of the statistical methods was incomplete, data from the Moscow arm of the Russian study were incomplete (these were excluded), and in the Shanghai study there was a large difference in overall mortality (not just breast cancer mortality), a clear indication that the two groups were not comparable from the start.

9. Can the results of these older studies from countries like China and Russia simply be applied to Western countries?
Intuitively I would say no. Care in Western countries and present-day treatments may well be better. We simply do not know whether screening by self-examination would lead to a better outcome here, because that has never been studied in good controlled trials.

10. Do the conclusions apply to all women?
No, the studies were all done in, and therefore only apply to, healthy women of roughly 35 to 65 years old. Screening methods, including breast self-examination, are still recommended for women in high-risk groups (women with a hereditary predisposition, or who have previously had breast or ovarian cancer).

11. Yet certain experts maintain that self-examination is beneficial. One doctor (Dr. Goldstein) referred in a Medscape interview (see here) to a very recent study (see here).
These doctors/experts are giving their personal opinion, referring to unpublished studies or to studies with a lower level of evidence. The study Dr. Goldstein refers to, for example, is a retrospective study that only investigates how well women can detect breast cancer. Retrospective studies are always more prone to bias (see a previous post on the best study type… for dummies). Moreover, this study did not look at hard outcomes (survival, better prognosis). And there are equally retrospective studies claiming the opposite; see for instance this article by Newcomb PA et al in J Natl Cancer Inst. 1991 (abstract).

12. Should women then stop checking their breasts altogether?
By chance, sitting on a terrace somewhere, I came across a short piece in Viva (1-7 Aug 2008) that I found very clear.
Four women gave their opinions.

A woman who had previously had breast cancer kept checking monthly (risk group).
The director of the Dutch Breast Cancer Association said: "It has been scientifically shown that breast self-examination does not reduce breast cancer mortality. Don't feel guilty if you don't do it. If you do it in order to get to know your breasts well, the effect is mainly psychological." The spokeswoman for the Dutch Cancer Society (KWF Kankerbestrijding) said that they no longer promote periodic self-examination, but that they are not saying it is pointless either. It is like moles: you don't check those in a structured way either, but if you see a change you do go to your GP. In line with this, the director of Pink Ribbon said that 90% of women notice breast cancer themselves: if something is there you will notice it anyway, for instance while showering. I can confirm this from personal experience: years ago my mother felt a lump while she was showering (malignant, but cured).

In their review the Cochrane authors also say very explicitly that it is absolutely essential to see a doctor if women notice changes in their breasts.

“Some women will continue with breast self-examination or will wish to be taught the technique. We suggest that the lack of supporting evidence from the two major studies should be discussed with these women to enable them to make an informed decision.
It would be wrong, however, to conclude that women need not be aware of any breast changes. It is possible that increased breast awareness may have contributed to the decrease in mortality from breast cancer that has been noted in some countries. Women should, therefore, be encouraged to seek medical advice if they detect any change in their breasts that may be breast cancer.”

Here is the podcast in which the Cochrane authors talk about their study.



Thesis Mariska Leeflang: Systematic Reviews of Diagnostic Test Accuracy.

22 08 2008

While I was on vacation Mariska Leeflang received her PhD from the University of Amsterdam. The ceremony took place on July 1st, 2008.

Her thesis is entitled: Systematic Reviews of Diagnostic Test Accuracy.

Mariska is a colleague working (part time) at the Dutch Cochrane Centre (DCC). She studied veterinary science in Utrecht, but gradually noticed that she was more interested in research than in veterinary practice. Four years ago she applied for a job at the Dept. of Clinical Epidemiology, Biostatistics and Bioinformatics (KEBB) at the Amsterdam Academic Medical Centre (AMC). With a CV full of odd subjects, such as livestock and a course on delivering anesthetic drugs from a distance, she thought she would never make it, but she did.

Those four years have been very fruitful. She did research on diagnostic accuracy, is a member of the Cochrane Diagnostic Test Accuracy Working Group and is first author of one of the Cochrane pilot reviews of diagnostic test accuracy (chapter 7 of the thesis). [Note: Cochrane Diagnostic Test Accuracy Reviews are a new initiative; until recently all Cochrane systematic reviews were about health care interventions.]
Mariska also supports authors of Cochrane systematic reviews, has given many presentations and has led many workshops. In fact, she also gave in-service training in diagnostic studies to our group of clinical librarians, and together we have given several courses on Evidence Based Medicine and systematic reviews. In her leisure time she chairs "Stichting DIO" (Veterinary Science & Development Cooperation).

She will continue to work for the Cochrane Collaboration, including the DCC, but has also accepted a job at the Royal Tropical Institute (KIT).

Because of her background Mariska often gives her work a light "vet" touch.

“The cover of her thesis, for instance, is inspired by Celtic artwork and reflects the process of a systematic review: parts become a whole. The anthropomorphic (human-like) and zoomorphic (animal-like) creatures represent the background of the author. The stethoscopes and the corners refer specifically to diagnostic test accuracy reviews. The snakes eating their own tails stand, in Celtic mythology, for longevity and the ever-lasting life cycle.”

Also, she often closes her presentations with a slide of swimming pigs, the pig being a symbol of "luck".

So I would like to close this post in turn by wishing Mariska: “Good Luck”

Thesis: ISBN: 978-90-9023139-6
Digital Version at :
Index: (I’ll come back to chapter 1 and 2 another time)
Chapter 1: Systematic Reviews of Diagnostic Test Accuracy – New Developments within The Cochrane Collaboration – Submitted
Chapter 2: The use of methodological search filters to identify diagnostic accuracy studies can lead to the omission of relevant studies – J Clin Epidemiol. 2006;59(3):234-40
Chapter 3: Impact of adjustment for quality on results of meta-analyses of diagnostic accuracy – Clin Chem. 2007;53(2):164-72
Chapter 4: Bias in sensitivity and specificity caused by data driven selection of optimal cut-off values: mechanisms, magnitude and solutions – Clin Chem. 2008; 54(4):729-37
Chapter 5: Diagnostic accuracy may vary with prevalence: Implications for evidence-based diagnosis – Accepted by J Clin Epidemiol
Chapter 6: Accuracy of fibronectin tests for the prediction of pre-eclampsia: a systematic review – Eur J Obstet Gynecol Reprod Biol. 2007;133(1):12-9
Chapter 7: Galactomannan detection for the diagnosis of invasive aspergillosis in immunocompromized patients. A Cochrane Review of Diagnostic Test Accuracy – Conducted as a pilot Cochrane Diagnostic Test Accuracy review



Two new Cochrane Groups

8 05 2008

Two groups have officially joined the Cochrane Collaboration: the Cochrane Public Health Review Group and the Cochrane Prognosis Methods Group.

The Cochrane Public Health Review group belongs to the Cochrane Review Groups, i.e. groups that produce Cochrane Reviews in specific medical topic areas.

The Cochrane Prognosis Methods Group will be the 13th Cochrane Methods Group. This group will have two primary roles:

  1. Work with existing Cochrane entities, including Methods Groups, to ensure the best use of prognostic evidence in Cochrane reviews.
  2. Conduct research to advance the methods of prognosis reviews and of other types of reviews where similar methods apply.

By calling into existence Methods Groups like the Cochrane Prognosis, Cochrane Adverse Effects, Cochrane Screening and Diagnostic Tests and the Cochrane Qualitative Research Methods Groups, the Cochrane Collaboration will no longer concentrate solely on systematic reviews of randomized controlled trials/interventions. That narrow focus has been a major criticism of Cochrane systematic reviews.

For people not familiar with the structure of the Cochrane Collaboration, see the schematic picture below or follow this link



BMI meeting, April 2008

21 04 2008

Last Friday, April 18, was the national meeting day of the BMI, CCZ, BPZ and WEB&Z. The BMI is the Biomedical Information section of the Dutch association of information professionals (NVB). The other abbreviations stand for working groups/committees within the NVB: CCZ = Central Catalogue of Hospital Libraries, BPZ = Librarians of Psychiatric Care Institutions, and WEB&Z = formerly the Biomedical working group of VOGIN.

The programme consisted of three general members' meetings (ALVs), of the CCZ, the BPZ and the BMI, alternated with three lectures. Three ALVs and only one room was a bit awkward: in my case it meant that I attended the BMI ALV, but had to wait (at length) in the coffee area annex corridor during the other two. I spent that time usefully and pleasantly, but it could be streamlined. I also found it a great pity that there was hardly any plenary discussion after the lectures, and that people were expected to continue the discussion literally in the corridors. And there was plenty to discuss…

The first lecture in particular stirred things up. Unfortunately I missed half of it, because I mistook Hilversum station for Amersfoort ;) . Fortunately Ronald van Dieën also covered the BMI day on his blog, so I can take the first points from him.

The first speaker was Geert van der Heijden, associate professor of Clinical Epidemiology at the Julius Center for Health Sciences of the UMC Utrecht. Geert coordinates the START block for sixth-year students (Supervised Training in professional Attitude, Research and Teaching) and the Academic Skills component of the medical Master's programme. I knew Geert superficially, because we had (separately) been interviewed for the junior doctors' magazine "Arts in Spe" about integrating EBM search training into the curriculum. Now that I have heard him in person, I read his interview with very different eyes. Then I mainly saw the similarities; now the differences.

His presentation was entitled "How does the clinician search?". Anyone expecting Geert to describe how the average clinician actually searches was in for a surprise. Geert mainly presented the search method he teaches and advocates to doctors. This method is by no means established practice and seems diametrically opposed to the approach of medical information specialists, who after all made up his audience at that moment. The mere fact that he claims you should ABOVE ALL NOT use MeSH goes against what we medical information specialists teach and propagate. Whether the room was so quiet because it was taken aback by all the shocking statements, or because people did not know where to begin a rebuttal, is an open question. I literally saw several mouths hanging open in astonishment.

As Ronald already put it, this set quite a cat among the pigeons of the "medical information specialists". I do not, however, share his opinion that Geert backed it all up well with arguments. He is certainly a gifted speaker and delivered it all with verve, but I had the strong impression that his approach was mainly practice- or eminence-based rather than evidence-based.

Below are some of his propositions, the first five taken from Ronald:

  1. "A researcher tries to earn publication air miles with impact factors."
  2. "In Utrecht students get about 500 hours of Clinical Epidemiology and Evidence Based Practice, whereas in Oxford (the roots of EBM) they get only 10 hours."
  3. "Contemporary EBM tactics (Sicily statement). (see for instance here: ….)
  4. "Fill knowledge gaps with problem-solving skills."
  5. EBM = eminence biased medicine. There is much that is good in it, but beware….
  6. The main goal of a literature search: reduce the Number Needed to Read.
  7. Never trust second-hand information (what we call pre-filtered or aggregated evidence) such as TRIP, UpToDate, Cochrane Systematic Reviews, BMJ Clinical Evidence. People say the Cochrane Systematic Reviews are so good, but one misplaced decimal point cost thousands of lives. So read and appraise the primary sources!
  8. The Cochrane Collaboration only concerns itself with systematic reviews of interventions; it does nothing in the far more important domains of "diagnosis" and "prognosis".
  9. PICO (patient, intervention, comparison, outcome) only works for therapy, not for other types of question.
  10. Instead, split the question into three components: the domain (the category of patients), the determinant (the diagnostic test, prognostic variable or treatment) and the outcome (disease, mortality and …..)
  11. Searching goes as follows: write down as many synonyms as possible for each of the three components, combine these with "OR", and combine the components with "AND".
  12. Search the synonyms in title and abstract only (field code [tiab]) and NEVER with MeSH (MEDLINE Subject Headings). According to Geert, MeSH terms are NEVER useful: they are often too broad, sometimes outdated, and you miss recent articles with them, because indexing can reportedly take 3-12 months.
  13. NEVER use the Clinical Queries. The methodological filters built into PubMed, the so-called Clinical Queries, are based solely on MeSH and therefore unusable. Moreover, they were developed for very specific subject areas, such as cardiology, and are therefore not generally applicable.
  14. According to Cochrane, if you "miss" a study you should write to the authors. That hardly ever works. It is better to snowball via Web of Science and related articles, and to ADJUST YOUR SEARCH accordingly.
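The search recipe in points 10-12 can be sketched as a small helper that assembles a PubMed query string. This is only my own illustration of the recipe, not something shown in the talk; the function name and the example synonyms are hypothetical.

```python
def build_query(components):
    """Combine synonym lists per the recipe: within each component the
    synonyms are OR-ed and restricted to title/abstract via PubMed's
    [tiab] field tag; the components themselves are AND-ed together."""
    groups = []
    for synonyms in components:
        tagged = ['"%s"[tiab]' % term for term in synonyms]
        groups.append("(" + " OR ".join(tagged) + ")")
    return " AND ".join(groups)

# A hypothetical three-component question (domain / determinant / outcome):
query = build_query([
    ["breast cancer", "breast carcinoma"],      # domain
    ["self-examination", "self examination"],   # determinant
    ["mortality", "survival"],                  # outcome
])
print(query)
```

Note how quickly such a query narrows: every extra AND-ed component cuts away anything whose title/abstract happens not to mention one of the listed synonyms, which is exactly the recall problem discussed below.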

If you work according to Van der Heijden's method you would supposedly be done searching in half an hour, and have the articles selected and appraised within two hours. Well, I could not match that.

The statements shown above in red are not (entirely) correct. 8: Therapy is, in my humble opinion, still an important domain; moreover, the Cochrane Collaboration is also going to produce systematic reviews of diagnostic accuracy studies. 13: The Clinical Queries do not, in fact, rely on MeSH alone.

I can (partly) agree with the statements shown in green, but they do not differ essentially from what I (we?) aim for and do myself, which is what is implicitly suggested.
Many information specialists will also:

  • pursue 6 (admittedly by doing 7),
  • emphasise 9 (the PICO was indeed developed for interventions and is less suitable for other domains),
  • and, analogous to 10, write the question down in components (though we label the components differently).
  • Writing to authors (14) is done only as a last resort. We first use the options Geert offers as alternatives: snowballing with the aim of adjusting the search strategy. (I know this because I teach the course "Searching for Cochrane Systematic Reviews" myself.)

The major remaining differences are then: (7) our motto "aggregated evidence first", and (12) searching with MeSH versus searching in title and abstract, plus the fact that all components are combined with AND, which I do only sparingly. For the more terms/components you combine with "AND", the greater the chance of missing something. Sometimes it is necessary, but you do not set out that way a priori.

I found it a bit cheap that Geert brought up the error made by one Cochrane reviewer that allegedly cost thousands of lives. Then let him also say that the Cochrane initiative has saved the lives of hundreds of thousands, because it has finally mapped out which therapies are and are not effective. For every study you depend on how well the study was done, on correct statistics, and so on. The very advantage of aggregated evidence is that a doctor does not have to read through all the original studies to find out what works (NNR!!!). Imagine every doctor having to find, appraise and summarise ALL the individual studies for every question….. That would be, as Cochrane often calls it, "duplication of effort". But if you want to know exactly how things stand, or want to be very complete, then you will indeed have to find and appraise the original studies yourself.
Funny, by the way, that 22 of the 70 articles Geert has co-authored can be counted as aggregated evidence (including Cochrane Reviews)….. Would he also advise readers against selecting those articles? ;)

As for searching with MeSH: I think few "searchers" rely on MeSH alone. We use text words as well. How much use is made of MeSH depends very much on the goal and the time available; you always have to weigh the pros and cons. By not using MeSH you also forgo the synonyms function and the possibility of exploding (including narrower terms). Just try to capture all synonyms for cancer in one search: cancer, cancers, tumor, tumour(s), neoplasm(s), malignancy (-ies), and on top of that all the different cancers: adenocarcinoma, lymphoma, Hodgkin's disease, etc. With the MeSH term "Neoplasms" you find all spelling variants, synonyms and all types of cancer in one go.
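To make that contrast concrete, here is a sketch (my own illustration, with a deliberately incomplete synonym list, which is exactly the point) comparing the hand-built text-word query with the single MeSH term:

```python
# Text-word approach: every synonym and spelling variant must be
# enumerated by hand, and specific cancer types (adenocarcinoma,
# lymphoma, ...) are still missing from this list.
synonyms = ["cancer", "cancers", "tumor", "tumors", "tumour", "tumours",
            "neoplasm", "neoplasms", "malignancy", "malignancies"]
textword_query = " OR ".join('"%s"[tiab]' % s for s in synonyms)

# MeSH approach: one subject heading, automatically exploded by PubMed
# to include all narrower terms in the MeSH tree, plus the synonyms
# and spelling variants mapped to it.
mesh_query = '"Neoplasms"[Mesh]'

print(textword_query)
print(mesh_query)
```

Ten OR-ed text words versus a single exploded heading, and the text-word version still misses every specific tumour type not listed.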

In any case, Geert confronted us with a very different point of view and held up a mirror to us. Sometimes it is good to be shaken awake and to reflect on your own (sometimes too?) routine approach. Geert did not shy away from the challenge of comparing the two search methods head to head. So who knows what may yet come of this. Might we reach a consensus?

The following talks were less provocative, but certainly worthwhile.

Web 2.0 guru Wouter Gerritsma (WoWter) brought us up to date on web 2.0, health 2.0 and (medical) library 2.0. Very fittingly, he used very modern means: a PowerPoint presentation that can be admired on SlideShare, and a wiki from which he kept clicking links. Unfortunately the internet connection was at times not so 2.0, so that, for instance, the vivid YouTube explanation "Web 2.0 … The Machine is Us/ing Us" could not be played. But the handy thing about such a wiki is, of course, that you can look it up and play it afterwards. The presentation covered some practical examples (library, health care, doctors) and dealt with the various web 2.0 tools: RSS, blogs, personalised pages, tagging and wikis. I was rather proud that my blog, as well as that of "de bibliotheker", was briefly shown as an example of novice (medical library) SPOETNIK bloggers. The SPOETNIK course and "23 things" were in any case recommended for beginners. If you want to know more, have another look at the wiki: it offers a nice overview.

Finally, Tanja van Bon and Sjors Clemens gave a duo presentation on e-learning. As an original opening they started by asking questions rather than ending with them. They then gave a nice introduction to e-learning and showed how they implemented it in their hospital.

Between and after the lectures there was ample time to exchange ideas, at the end even over drinks, for those who were not the designated driver. Definitely a very successful day. I will be going more often!


PS: I see that "de bibliotheker" has meanwhile also written a piece about Geert van der Heijden's lecture. Perhaps also worth reading.


