Friday Foolery #51 Statistically Funny

1 06 2012

Epidemiologists, people working in the EBM field and, above all, statisticians are said to have no sense of humor.*

Hilda Bastian is a clear exception to this rule.

I met Hilda a few years ago at a Cochrane colloquium. At that time she was working as a consumer advocate in Australia. Nowadays she is editor and curator of PubMed Health. According to her Twitter bio (she tweets as @hildabast) she is (still) "Interested in effective communication as well as effective health care". She also writes important articles, like "Seventy-Five Trials and Eleven Systematic Reviews a Day: How Will We Ever Keep Up?" (PLoS Medicine 2010), reviewed at this blog.

Today I learned she also has a great creative talent in cartoon drawing, in the field of …  yeah… EBM, epidemiology & statistics.

Below is one of her cartoons, which fits in well with a recent article in the BMJ by Ray Moynihan (retweeted by Hilda): Preventing overdiagnosis: how to stop harming the healthy. In her post she refers to another article, Overdiagnosis in cancer (JNCI 2010), saying:

“Finding and aggressively treating non-symptomatic disease that would never have made people sick, inventing new conditions and re-defining the thresholds for old ones: will there be anyone healthy left at all?”

I invite you to visit Hilda's blog Statistically funny ("Commenting on the science of unbiased health research with cartoons") and to enjoy her cartoons, which are often inspired by recent publications in the field.

* My post #NotSoFunny #16: ridiculing RCTs and EBM even led David Rind to sigh that “EBM folks are not necessarily known for their great senses of humor”. (so I’m no exception to the rule ;)

The Scatter of Medical Research and What to do About it.

18 05 2012

Paul Glasziou, GP and professor of Evidence Based Medicine, co-authored a new article in the BMJ [1]. Similar to another paper [2] I discussed before [3], this paper deals with the difficulty for clinicians of staying up to date with the literature. But where the previous paper [2,3] highlighted the mere increase in the number of research articles over time, the current paper looks at the scatter of randomized controlled trials (RCTs) and systematic reviews (SRs) across the different journals cited in one year (2009) in PubMed.

Hoffmann et al analyzed 7 specialties and 9 subspecialties that are considered to contribute most to the burden of disease in high-income countries.

They followed a relatively straightforward method for identifying the publications. Each search string consisted of a MeSH term (controlled term) to identify the selected disease or disorders, a publication type [pt] to identify the type of study, and the year of publication. For example, the search strategy for randomized trials in cardiology was: "heart diseases"[MeSH] AND randomized controlled trial[pt] AND 2009[dp]. (When searching "heart diseases" as a MeSH term, narrower terms are also searched.) Meta-analysis[pt] was used to identify systematic reviews.
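For the curious, here is a minimal sketch (not from the paper) of how such counts could be reproduced with NCBI's E-utilities via Biopython; the e-mail address is a placeholder required by NCBI, and live counts will of course differ from the 2009 snapshot used by the authors.

```python
# Minimal sketch (not from the paper): reproducing such a PubMed count with
# NCBI E-utilities via Biopython. The e-mail address is a placeholder required
# by NCBI, and live counts will differ from the 2009 snapshot used by the authors.
from Bio import Entrez

Entrez.email = "your.name@example.org"  # placeholder

def pubmed_count(query: str) -> int:
    """Return the number of PubMed records matching a query."""
    handle = Entrez.esearch(db="pubmed", term=query, retmax=0)
    record = Entrez.read(handle)
    handle.close()
    return int(record["Count"])

# The example search strategy for randomized trials in cardiology:
rct_query = '"heart diseases"[MeSH] AND randomized controlled trial[pt] AND 2009[dp]'
sr_query = '"heart diseases"[MeSH] AND meta-analysis[pt] AND 2009[dp]'

print("RCTs:", pubmed_count(rct_query))
print("SRs (meta-analysis[pt]):", pubmed_count(sr_query))
```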

Using this approach Hoffmann et al found 14,343 RCTs and 3214 SRs published in 2009 in the field of the selected (sub)specialties. There was a clear scatter across journals, but this scatter varied considerably among specialties:

“Otolaryngology had the least scatter (363 trials across 167 journals) and neurology the most (2770 trials across 896 journals). In only three subspecialties (lung cancer, chronic obstructive pulmonary disease, hearing loss) were 10 or fewer journals needed to locate 50% of trials. The scatter was less for systematic reviews: hearing loss had the least scatter (10 reviews across nine journals) and cancer the most (670 reviews across 279 journals). For some specialties and subspecialties the papers were concentrated in specialty journals; whereas for others, few of the top 10 journals were a specialty journal for that area.
Generally, little overlap occurred between the top 10 journals publishing trials and those publishing systematic reviews. The number of journals required to find all trials or reviews was highly correlated (r=0.97) with the number of papers for each specialty/ subspecialty.”

Previous work had already suggested that this scatter of research has a long tail: half of the publications appear in a small minority of journals, whereas the remaining articles are scattered across many journals (see figure below).

Click to enlarge and see legends at BMJ 2012;344:e3223 [CC]
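To make the "long tail" concrete, here is a small illustrative sketch (with made-up journal counts, not the authors' data) of how the "number of journals needed to locate 50% of trials" can be computed once per-journal trial counts are available.

```python
# Illustrative sketch with made-up numbers (not the authors' data): how many
# journals, taken from the largest downwards, are needed to cover 50% of trials.
from collections import Counter

trials_per_journal = Counter({
    "Journal A": 120, "Journal B": 80, "Journal C": 45, "Journal D": 30,
    "Journal E": 20, "Journal F": 15, "Journal G": 5, "Journal H": 2,
    "Journal I": 2, "Journal J": 1,   # ...plus a long tail of 1-2 trial journals
})

def journals_for_coverage(counts: Counter, fraction: float = 0.5) -> int:
    """Smallest number of journals (largest first) covering `fraction` of all trials."""
    total = sum(counts.values())
    covered = 0
    for n_journals, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered >= fraction * total:
            return n_journals
    return len(counts)

print(journals_for_coverage(trials_per_journal))  # journals needed to locate 50% of trials
```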

The good news is that SRs are less scattered and that general journals appear more often in the top 10 journals publishing SRs. Indeed for 6 of the 7 specialties and 4 of the 9 subspecialties, the Cochrane Database of Systematic Reviews had published the highest number of systematic reviews, publishing between 6% and 18% of all the systematic reviews published in each area in 2009. The bad news is that even keeping up to date with SRs seems a huge, if not impossible, challenge.

In other words, it is not sufficient for clinicians to rely on personal subscriptions to a few journals in their specialty (which is common practice). Hoffmann et al suggest several solutions to help clinicians cope with the increasing volume and scatter of research publications.

  • a central library of systematic reviews (but apparently the Cochrane Library fails to fulfill such a role according to the authors, because many reviews are out of date and are perceived as less clinically relevant)
  • a registry of planned and completed systematic reviews, such as PROSPERO (this makes it easier to locate SRs and reduces bias)
  • syntheses of evidence and synopses, like ACP Journal Club, which summarizes the best evidence in internal medicine
  • Specialised databases that collate and critically appraise randomized trials and systematic reviews, like the one for physical therapy. In my personal experience, however, this database is often out of date and not comprehensive
  • Journal scanning services like EvidenceUpdates (from the BMJ Group and McMaster PLUS), which scans over 120 journals, filters articles on the basis of quality, has practising clinicians rate them for relevance and newsworthiness, and makes them available as email alerts and in a searchable database. I use this service too, but besides the fact that not all specialties are covered, the rating of evidence may not always be objective (see previous post [4])
  • The use of social media tools to alert clinicians to important new research.

Most of these solutions have existed for a long time and do not, or only partly, solve the information overload.

I was surprised that the authors didn't propose the use of personalized alerts. PubMed's My NCBI feature allows you to create automatic email alerts on a topic and to subscribe to electronic tables of contents (which could include ACP Journal Club). Suppose a physician browses 10 journals that roughly cover 25% of the trials. He or she does not need to read all the other journals from cover to cover to avoid missing a potentially relevant trial. Instead it is far more efficient to perform a topic search to filter relevant studies from journals that seldom publish trials on the topic of interest. One could even use the searches of Hoffmann et al to achieve this.* In reality, though, most clinicians will have narrower fields of interest than all studies about endocrinology or neurology.
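As an illustration of what such a personalized alert could look like, here is a minimal sketch of a PubMed query that combines a topic with study-type filters and excludes the journals one already browses; the journal names and the topic are hypothetical placeholders, not taken from the paper.

```python
# Hypothetical sketch of a topic-based alert query (e.g. saved in My NCBI):
# combine a field of interest with study-type filters and exclude the journals
# one already reads cover to cover. Journal names and the topic are placeholders.
browsed_journals = ['"N Engl J Med"[ta]', '"Lancet"[ta]', '"BMJ"[ta]']

topic = '"diabetes mellitus, type 2"[MeSH]'   # hypothetical field of interest
evidence = '(randomized controlled trial[pt] OR meta-analysis[pt] OR systematic[sb])'

# Alert only on relevant trials/reviews appearing *outside* the browsed journals,
# so nothing is missed without scanning hundreds of low-yield journals:
alert_query = f'{topic} AND {evidence} NOT ({" OR ".join(browsed_journals)})'
print(alert_query)
```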

At our library we are working on creating deduplicated, easy-to-read alerts that combine the tables of contents of certain journals with topic (and author) searches in PubMed, EMBASE and other databases. There are existing tools that do much the same.

Another way to reduce the individual reading load is to organize journal clubs or, even better, regular CATs (critically appraised topics). In the Netherlands, CATs are a compulsory item for residents. A few doctors do the work for many. Usually they choose topics that are clinically relevant (or for which the evidence is unclear).

The authors briefly mention that their search strategy might have missed some eligible papers and included some that are not truly RCTs or SRs, because they relied on PubMed's publication types to retrieve RCTs and SRs. For systematic reviews this may be a greater problem than recognized, because the authors used meta-analysis[pt] to identify systematic reviews. Unfortunately PubMed has no publication type for systematic reviews, but it should be clear that there are many more systematic reviews than meta-analyses. Systematic reviews might even have a different scatter pattern than meta-analyses (i.e. the latter might be preferentially published in core journals).

Furthermore, not all meta-analyses and systematic reviews are reviews of RCTs (thus it is not completely fair to compare MAs with RCTs only). On the other hand it is an omission of this study (not discussed) that only interventions are considered. Nowadays physicians have many questions other than those related to therapy, such as questions about prognosis, harm and diagnosis.

I did a little, admittedly imperfect, search just to see whether the use of search terms other than meta-analysis[pt] would have any influence on the outcome. I searched for (1) meta-analysis[pt] and (2) systematic review[tiab] (title and abstract) for papers about endocrine diseases. Then I subtracted 1 from 2 (to analyse the systematic reviews not indexed as meta-analysis[pt]).



I analyzed the top 10/11 journals publishing these study types.
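For those who want to repeat this little experiment, a minimal sketch using Biopython's E-utilities wrapper follows; the exact MeSH term ("endocrine system diseases") and the e-mail address are my own placeholders, and live counts will differ from the 2009 figures shown here.

```python
# Minimal sketch of this little experiment with Biopython's E-utilities wrapper.
# The MeSH term and e-mail address are my own placeholders; live counts will
# differ from the 2009 figures discussed in the post.
from Bio import Entrez

Entrez.email = "your.name@example.org"  # placeholder

def pubmed_count(query: str) -> int:
    handle = Entrez.esearch(db="pubmed", term=query, retmax=0)
    count = int(Entrez.read(handle)["Count"])
    handle.close()
    return count

base = '"endocrine system diseases"[MeSH] AND 2009[dp]'
queries = {
    "meta-analysis[pt]":       f'{base} AND meta-analysis[pt]',
    "systematic review[tiab]": f'{base} AND systematic review[tiab]',
    "SR[tiab] NOT MA[pt]":     f'{base} AND systematic review[tiab] NOT meta-analysis[pt]',
}
for label, query in queries.items():
    print(label, pubmed_count(query))
```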

This little experiment suggests that:

  1. The precise scatter might differ per search: the systematic review[tiab] search yielded different top 10/11 journals (for this sample) than the meta-analysis[pt] search (partly because Cochrane systematic reviews apparently don't mention "systematic review" in the title or abstract?).
  2. The authors underestimate the number of systematic reviews: simply searching for systematic review[tiab] already found approximately 50% additional systematic reviews compared with meta-analysis[pt] alone.
  3. As expected (by me at least), many of the SRs and MAs did NOT deal with interventions; see, for example, the first 5 hits (out of 108 and 236 respectively).
  4. Together these findings indicate that the true information overload is far greater than shown by Hoffmann et al (not all systematic reviews are found, and of all available study designs only RCTs are searched).
  5. On the other hand, this indirectly shows that SRs are a better way to keep up to date than suggested: SRs also summarize non-interventional research (the ratio of SRs of RCTs to individual RCTs is much lower than suggested).
  6. It also means that the role of the Cochrane systematic reviews in aggregating RCTs is underestimated by the published graphs (the meta-analysis[pt] set is diluted with non-RCT systematic reviews, so the proportion of Cochrane SRs among the truly interventional MAs is actually larger).

Anyway, these imperfections do not contradict the main point of this paper: that trials are scattered across hundreds of general and specialty journals and that "systematic reviews" (or really meta-analyses) do reduce the extent of scatter, but are still widely scattered and mostly in different journals from those publishing the randomized trials.

Indeed, personal subscriptions to journals seem insufficient for keeping up to date.
Besides supplementing subscriptions with methods such as journal scanning services, I would recommend the use of personalized alerts from PubMed and several prefiltered sources, including an EBM search engine like TRIP.

*but I would broaden it to find all aggregate evidence, including ACP, Clinical Evidence, syntheses and synopses, not only meta-analyses.

**I do appreciate that one of the co-authors is a medical librarian: Sarah Thorning.


  1. Hoffmann, T., Erueti, C., Thorning, S., & Glasziou, P. (2012). The scatter of research: cross sectional comparison of randomised trials and systematic reviews across specialties. BMJ, 344: e3223. DOI: 10.1136/bmj.e3223
  2. Bastian, H., Glasziou, P., & Chalmers, I. (2010). Seventy-Five Trials and Eleven Systematic Reviews a Day: How Will We Ever Keep Up? PLoS Medicine, 7 (9). DOI: 10.1371/journal.pmed.1000326
  3. How will we ever keep up with 75 trials and 11 systematic reviews a day
  4. Experience versus Evidence [1]. Opioid Therapy for Rheumatoid Arthritis Pain.

Can Guidelines Harm Patients?

2 05 2012

Recently I saw an intriguing "personal view" in the BMJ written by Grant Hutchison, entitled "Can Guidelines Harm Patients Too?" [1]. Hutchison is a consultant anesthetist with, as he calls it, chronic guideline fatigue syndrome. He underwent an acute exacerbation of his "condition" with the arrival of another set of guidelines in his email inbox. Hutchison:

On reviewing the level of evidence provided for the various recommendations being offered, I was struck by the fact that no relevant clinical trials had been carried out in the population of interest. Eleven out of 25 of the recommendations made were supported only by the lowest levels of published evidence (case reports and case series, or inference from studies not directly applicable to the relevant population). A further seven out of 25 were derived only from the expert opinion of members of the guidelines committee, in the absence of any guidance to be gleaned from the published literature.

Hutchison’s personal experience is supported by evidence from two articles [2,3].

One paper, published in JAMA in 2009 [2], concludes that ACC/AHA (American College of Cardiology and American Heart Association) clinical practice guidelines are largely developed from lower levels of evidence or expert opinion, and that the proportion of recommendations for which there is no conclusive evidence is growing. Only 314 of 2711 recommendations (median, 11%) are classified as level of evidence A, i.e. based on evidence from multiple randomized trials or meta-analyses. The largest proportion of recommendations (1246/2711; median, 48%) are level of evidence C, i.e. based on expert opinion, case studies, or standards of care. Strikingly, only 245 of 1305 class I recommendations are based on the highest, level A, evidence (median, 19%).

Another paper, published in Archives of Internal Medicine in 2011 [3], reaches similar conclusions from an analysis of the Infectious Diseases Society of America (IDSA) practice guidelines. Of the 4218 individual recommendations found, only 14% were supported by the strongest (level I) quality of evidence; more than half were based on level III evidence only. As with the ACC/AHA guidelines, only a small proportion (23%) of the strongest IDSA recommendations were based on level I evidence (in this case ≥1 randomized controlled trial, see below). And here too, new recommendations were mostly based on level II and III evidence.

Although there is little to argue about Hutchison’s observations, I do not agree with his conclusions.

In his view guidelines are equivalent to a bullet-pointed list or flow diagram, allowing busy practitioners to move on from practice based on mere anecdote and opinion. It therefore seems contradictory that half of the EBM guidelines are based on little more than anecdote (case series, extrapolation from other populations) and opinion. He then argues that guidelines, like other therapeutic interventions, should be considered in terms of the balance between benefit and risk, and that the risk associated with the dissemination of poorly founded guidelines must also be considered. One of those risks is that doctors will simply tend to adhere to the guidelines, and may even change their own (adequate) practice in the absence of any scientific evidence against it. If a patient is harmed despite punctilious adherence to the guideline rules, "it is easy to be seduced into assuming that the bad outcome was therefore unavoidable". But perhaps harm was done by following the guideline…

First of all, the overall evidence shows that adherence to guidelines can improve patient outcomes and provide more cost-effective care (Naveed Mustfa refers to [4] in a comment).

Hutchison's piece is opinion-based and rather driven by (understandable) gut feelings and implicit assumptions that also surround EBM in general.

  1. First there is the assumption that guidelines are a fixed set of rules, like a protocol, with no room for preferences (of either the doctor or the patient), interpretation and experience. In the same way as EBM is often degraded to "cookbook medicine", EBM guidelines are turned into mere bullet-pointed lists made by a bunch of experts who just want to impose their opinions as truth.
  2. The second assumption (shared by many) is that evidence based medicine is synonymous with "randomized controlled trials". By analogy, only those EBM guideline recommendations "count" that are based on RCTs or meta-analyses.

Before I continue, I would strongly advise all readers (and certainly all EBM and guideline skeptics) to read the excellent and clearly written BMJ editorial by David Sackett et al. that deals with the misconceptions, myths and prejudices surrounding EBM: Evidence based medicine: what it is and what it isn't [5].

Sackett et al define EBM as “the conscientious, explicit and judicious use of current best evidence in making decisions about the care of individual patients” [5]. Sackett emphasizes that “Good doctors use both individual clinical expertise and the best available external evidence, and neither alone is enough. Without clinical expertise, practice risks becoming tyrannised by evidence, for even excellent external evidence may be inapplicable to or inappropriate for an individual patient. Without current best evidence, practice risks becoming rapidly out of date, to the detriment of patients.”

Guidelines are meant to give recommendations based on the best available evidence. Guidelines should not be a set of rules, set in stone. Ideally, guidelines have gathered evidence in a transparent way and make it easier for the clinicians to grasp the evidence for a certain procedure in a certain situation … and to see the gaps.

Contrary to what many people think, EBM is not restricted to randomized trials and meta-analyses. It involves tracking down the best external evidence there is. As I explained in #NotSoFunny #16 – Ridiculing RCTs & EBM, evidence is not an all-or-nothing thing: RCTs (if well performed) are the most robust, but if they are not available we have to rely on "lower" evidence (from cohort studies to case-control studies to case series, or even expert opinion).
On the other hand, RCTs are often not even suitable to answer questions in domains other than therapy (etiology/harm, prognosis, diagnosis): by definition the level of evidence for these kinds of questions will inevitably be low*. Also, for some interventions RCTs are not appropriate or feasible, or are too costly to perform (cesarean vs vaginal birth, experimental therapies, rare diseases; see also [3]).

It is also good to realize that guidance based on numerous randomized controlled trials is probably not, or only to a limited extent, applicable to groups of patients who are seldom included in an RCT: the cognitively impaired, patients with multiple comorbidities [6], old patients [6], children and (often) women.

Finally not all RCTs are created equal (various forms of bias; surrogate outcomes; small sample sizes, short follow-up), and thus should not all represent the same high level of evidence.*

Thus, in my opinion, low levels of evidence are not by definition problematic, even when they are the basis for strong recommendations, as long as it is clear how the recommendations were reached and as long as they are well underpinned (by whatever evidence or rationale). One could even see the exposed gaps in evidence as a positive thing, as they may highlight the need for clinical research in certain fields.

There is one BIG BUT: my assumption is that guidelines are "just" recommendations based on exhaustive and objective reviews of existing evidence. No more, no less. This means that the clinician must have the freedom to deviate from the recommendations, based on his or her own expertise and/or the situation and/or the patient's preferences. All the more so when the evidence on which these strong recommendations are based is scant. Sackett already warned against the possible hijacking of EBM by purchasers and managers (and, may I add, health insurers and governmental agencies) to cut the costs of health care and to impose "rules".

I therefore think it is odd that the ACC/AHA guidelines prescribe that class I recommendations SHOULD be performed/administered even if they are based only on level C evidence (see figure).

I also find it odd that different guidelines use different nomenclatures. The ACC/AHA have class I, IIa, IIb and III recommendations and level A, B and C evidence, where level A represents sufficient evidence from multiple randomized trials and meta-analyses. The strength of recommendations in the IDSA guidelines, by contrast, ranges from A through C (or D/E for recommendations against use), and the quality of evidence ranges from level I through III, where level I indicates evidence from (just) one properly randomized controlled trial. As explained in [3], this system was introduced to evaluate the effectiveness of preventive health care interventions in Canada (for which RCTs are apt).

Finally, guidelines and guideline makers should probably be more open to input and feedback from the people who apply these guidelines.


* The new GRADE (Grading of Recommendations Assessment, Development, and Evaluation) system, which also takes good-quality observational studies into account, may offer a solution.

Another possibly relevant post at this blog: The Best Study Design for … Dummies

Taken from a summary of an ACC/AHA guideline. Click to enlarge.


  1. Hutchison, G. (2012). Guidelines can harm patients too BMJ, 344 (apr18 1) DOI: 10.1136/bmj.e2685
  2. Tricoci P, Allen JM, Kramer JM, Califf RM, & Smith SC Jr (2009). Scientific evidence underlying the ACC/AHA clinical practice guidelines. JAMA : the journal of the American Medical Association, 301 (8), 831-41 PMID: 19244190
  3. Lee, D., & Vielemeyer, O. (2011). Analysis of Overall Level of Evidence Behind Infectious Diseases Society of America Practice Guidelines Archives of Internal Medicine, 171 (1), 18-22 DOI: 10.1001/archinternmed.2010.482
  4. Menéndez R, Reyes S, Martínez R, de la Cuadra P, Manuel Vallés J, & Vallterra J (2007). Economic evaluation of adherence to treatment guidelines in nonintensive care pneumonia. The European respiratory journal : official journal of the European Society for Clinical Respiratory Physiology, 29 (4), 751-6 PMID: 17005580
  5. Sackett, D., Rosenberg, W., Gray, J., Haynes, R., & Richardson, W. (1996). Evidence based medicine: what it is and what it isn’t BMJ, 312 (7023), 71-72 DOI: 10.1136/bmj.312.7023.71
  6. Aylett, V. (2010). Do geriatricians need guidelines? BMJ, 341 (sep29 3) DOI: 10.1136/bmj.c5340

Experience versus Evidence [1]. Opioid Therapy for Rheumatoid Arthritis Pain.

5 12 2011

Rheumatoid arthritis (RA) is a chronic autoimmune disease that causes inflammation of the joints, eventually leading to progressive joint destruction and deformity. Patients have swollen, stiff and painful joints. The main aims of treatment are to reduce swelling and inflammation, to alleviate pain and stiffness, and to maintain normal joint function. While there is no cure, it is important to manage the pain properly.

The mainstays of therapy in RA are disease-modifying anti-rheumatic drugs (DMARDs) and non-steroidal anti-inflammatory drugs (NSAIDs). These drugs primarily target inflammation. However, since inflammation is not the only factor that causes pain in RA, patients may not be (fully) responsive to treatment with these medications.
Opioids are another class of pain-relieving substances (analgesics). They are frequently used in RA, but their role in chronic non-cancer pain, including RA, is not firmly established.

A recent Cochrane Systematic Review [1] assessed the beneficial and harmful effects of opioids in RA.

Eleven studies (672 participants) were included in the review.

Four studies only assessed the efficacy of  single doses of different analgesics, often given on consecutive days. In each study opioids reduced pain (a bit) more than placebo. There were no differences in effectiveness between the opioids.

Seven studies, between one and six weeks in duration, assessed 6 different oral opioids, either alone or combined with non-opioid analgesics.
The only strong opioid investigated was controlled-release morphine sulphate, in a single study with 20 participants.
Six studies compared an opioid (often combined with a non-opioid analgesic) to placebo. Opioids were slightly better than placebo in improving the patient-reported global impression of clinical change (PGIC) (3 studies, 324 participants: relative risk (RR) 1.44, 95% CI 1.03 to 2.03), but did not lower the number of withdrawals due to inadequate analgesia in 4 studies.
Notably, none of the 11 studies reported the primary and probably more clinically relevant outcome "proportion of participants reporting ≥ 30% pain relief".

On the other hand, adverse events (most commonly nausea, vomiting, dizziness and constipation) were more frequent in patients receiving opioids than placebo (4 studies, 371 participants: odds ratio 3.90, 95% CI 2.31 to 6.56). Withdrawal due to adverse events was non-significantly higher in the opioid-treated group.
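For readers less familiar with these effect measures, here is a minimal worked sketch (with entirely invented 2×2 counts, not the review's data) of how an odds ratio and its 95% confidence interval are calculated for such an adverse-event comparison; the pooled figure in the review of course comes from a meta-analysis across studies, not from a single table like this.

```python
# Worked sketch with entirely invented 2x2 counts (not the review's data):
# an odds ratio with a 95% confidence interval via the standard log-OR method.
import math

events_opioid, total_opioid = 90, 190     # hypothetical adverse-event counts
events_placebo, total_placebo = 40, 181   # hypothetical adverse-event counts

a, b = events_opioid, total_opioid - events_opioid      # opioid arm: events / no events
c, d = events_placebo, total_placebo - events_placebo   # placebo arm: events / no events

odds_ratio = (a * d) / (b * c)
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)            # standard error of ln(OR)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(f"OR {odds_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```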

Comparing opioids to other analgesics instead of placebos seems more relevant. Among the 11 studies, only 1 study compared an opioid (codeine with paracetamol) to an NSAID (diclofenac). This study found no difference in efficacy or safety between the two treatments.

The 11 included studies were very heterogeneous (e.g. different opioids studied, with or without concurrent use of non-opioid analgesics, different outcomes measured) and the risk of bias was generally high. Furthermore, most studies were published before 2000, when treatment of RA was less optimal.

The authors therefore conclude:

In light of this, the quantitative findings of this review must be interpreted with great caution. At best, there is weak evidence in favour of the efficacy of opioids for the treatment of pain in patients with RA but, as no study was longer than six weeks in duration, no reliable conclusions can be drawn regarding the efficacy or safety of opioids in the longer term.

This was the evidence, now the opinion.

I found this Cochrane Review via an EvidenceUpdates email alert from the BMJ Group and McMaster PLUS.

EvidenceUpdate alerts are meant to “provide you with access to current best evidence from research, tailored to your own health care interests, to support evidence-based clinical decisions. (…) All citations are pre-rated for quality by research staff, then rated for clinical relevance and interest by at least 3 members of a worldwide panel of practicing physicians”

I usually don’t care about the rating, because it is mostly 5-6 on a scale of 7. This was also true for the current SR.

A more detailed rating is available (when clicking the link; free registration required). Usually the newsworthiness of SRs scores relatively low (because they summarize 'old' studies?). Personally I would have thought that the relevance and newsworthiness would be higher for the special interest group, pain.

But the comment of the first of the 3 clinical raters was most revealing:

He/she comments:

As a Palliative care physician and general internist, I have had excellent results using low potency opiates for RA and OA pain. The palliative care literature is significantly more supportive of this approach vs. the Cochrane review.

So personal experience trumps evidence?* How did this palliative care physician assess effectiveness? By just giving a single dose of an opiate? How did he or she rate the effectiveness of the opioids? Was it compared to placebo or an NSAID (was it compared at all?), and were adverse effects measured?

And what is "the palliative care literature" the commenter is referring to? Apparently not this Cochrane review. Apparently not the 11 controlled trials included in the Cochrane review. Apparently not the several other Cochrane reviews on the use of opioids for chronic non-cancer pain, nor the guidelines, syntheses and synopses I found via the TRIP database. All conclude that the use of opioids to treat chronic non-cancer pain is supported by very limited evidence, that adverse effects are common, and that long-term use may lead to opioid addiction.

I'm sorry to note that, although the alerting service is great as an alert, such personal ratings are not very helpful for interpreting and truly rating the evidence.

I would prefer a truly objective, structured critical appraisal, like this one on a similar topic by DARE ("Opioids for chronic noncancer pain: a meta-analysis of effectiveness and side effects"), and/or an objective piece that puts the new data into clinical perspective.

* Just to be clear, clinicians' own expertise and the opinions of experts are also important in decision making. Rightly, Sackett [2] emphasized that good doctors use both individual clinical expertise and the best available external evidence. However, that doesn't mean that one personal opinion and/or preference replaces all the existing evidence.


  1. Whittle SL, Richards BL, Husni E, & Buchbinder R (2011). Opioid therapy for treating rheumatoid arthritis pain. Cochrane database of systematic reviews (Online), 11 PMID: 22071805
  2. Sackett DL, Rosenberg WM, Gray JA, Haynes RB, & Richardson WS (1996). Evidence based medicine: what it is and what it isn’t. BMJ (Clinical research ed.), 312 (7023), 71-2 PMID: 8555924

Evidence Based Point of Care Summaries [2] More Uptodate with Dynamed.

18 10 2011

This post is part of a short series about evidence-based point-of-care summaries, or POCs. In this series I will review 3 recent papers that objectively compare a selection of POCs.

In the previous post I reviewed a paper from Rita Banzi and colleagues from the Italian Cochrane Centre [1]. They analyzed 18 POCs with respect to their “volume”, content development and editorial policy. There were large differences among POCs, especially with regard to evidence-based methodology scores, but no product appeared the best according to the criteria used.

In this post I will review another paper by Banzi et al, published in the BMJ a few weeks ago [2].

This article examined, using a prospective cohort design, the speed with which evidence-based point-of-care summaries were updated.

First the authors selected all the systematic reviews signaled by the American College of Physicians (ACP) Journal Club and Evidence-Based Medicine Primary Care and Internal Medicine from April to December 2009. In the same period the authors selected all the Cochrane systematic reviews labelled as “conclusion changed” in the Cochrane Library. In total 128 systematic reviews were retrieved, 68 from the literature surveillance journals (53%) and 60 (47%) from the Cochrane Library. Two months after the collection started (June 2009) the authors did a monthly screen for a year to look for potential citation of the identified 128 systematic reviews in the POCs.

Only the 5 POCs that were ranked in the top quarter for at least 2 (out of 3) desirable dimensions were studied, namely Clinical Evidence, DynaMed, EBM Guidelines, UpToDate and eMedicine. Surprisingly, eMedicine was among the selected POCs, despite having a rating of "1" on a scale of 1 to 15 for EBM methodology. One would think that evidence-based-ness is a fundamental prerequisite for EBM POCs…?!

Results were represented as a (rather odd, but clear) “survival analysis” ( “death” = a citation in a summary).

Fig.1 : Updating curves for relevant evidence by POCs (from [2])

I will be brief about the results.

DynaMed clearly beat all the other products in its updating speed.

Expressed in figures, the updating speed of DynaMed was 78% and 97% greater than those of EBM Guidelines and Clinical Evidence, respectively. DynaMed had a median time to citation of around two months and EBM Guidelines of around 10 months, quite close to the limit of the follow-up; the citation of new evidence by the other three point-of-care summaries (UpToDate, eMedicine, Clinical Evidence) was so slow that it exceeded the follow-up period and the authors could not compute the median.
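A minimal sketch (my own illustration, not the authors' analysis) of how such a median time-to-citation is estimated when some summaries are still not updated at the end of follow-up, i.e. with right-censored data; all durations below are invented.

```python
# Illustration (not the authors' code): median time-to-citation with right-censoring.
# Summaries never updated during follow-up are censored, which is why no median
# could be computed for the slowest products. All durations below are invented.
def km_median(durations, observed):
    """Kaplan-Meier median; None if the survival curve never drops to 0.5."""
    at_risk = len(durations)
    surv = 1.0
    for t, cited in sorted(zip(durations, observed)):
        if cited:                       # a citation (the "event") at time t
            surv *= (at_risk - 1) / at_risk
        at_risk -= 1                    # censored summaries also leave the risk set
        if surv <= 0.5:
            return t
    return None

# Hypothetical months-to-citation for one POC; False = still uncited at 14 months.
months = [2, 3, 5, 9, 14, 14, 14, 14, 14, 14]
cited  = [True, True, True, True, False, False, False, False, False, False]
print(km_median(months, cited))  # None: more than half still uncited at end of follow-up
```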

DynaMed outperformed the other POCs in the updating of systematic reviews, independently of the route. EBM Guidelines and UpToDate had similar overall updating rates, but Cochrane systematic reviews were more likely to be cited by EBM Guidelines than by UpToDate (odds ratio 0.02, P<0.001). This is perhaps not surprising, as EBM Guidelines has a formal agreement with the Cochrane Collaboration to use Cochrane content and label its summaries as "Cochrane inside." On the other hand, UpToDate was faster than EBM Guidelines in updating systematic reviews signaled by the literature surveillance journals.

DynaMed's higher updating ability was not due to a difference in identifying important new evidence, but to the speed with which this new information was incorporated into its summaries. Possibly the central updating of DynaMed by the editorial team accounts for the more prompt inclusion of evidence.

As the authors rightly point out, slowness in updating could mean that new relevant information is ignored and could thus affect "the validity of point of care information services".

A slow updating rate may be considered a more serious shortcoming for POCs that "promise" to "continuously update their evidence summaries" (EBM Guidelines) or to "perform a continuous comprehensive review and to revise chapters whenever important new information is published, not according to any specific time schedule" (UpToDate). (See the table with descriptions of the updating mechanisms.)

In contrast, eMedicine doesn't provide any detailed information on its updating policy, another reason why it doesn't belong in this list of best POCs.
Clinical Evidence, however, clearly states: "We aim to update Clinical Evidence reviews annually. In addition to this cycle, details of clinically important studies are added to the relevant reviews throughout the year using the BMJ Updates service." But BMJ Updates is not considered in the current analysis. Furthermore, patience is rewarded with excellent and complete summaries of evidence (in my opinion).

Indeed a major limitation of the current (and the previous) study by Banzi et al [1,2] is that they have looked at quantitative aspects and items that are relatively “easy to score”, like “volume” and “editorial quality”, not at the real quality of the evidence (previous post).

Although the findings were new to me, others have recently published similar results (studies were performed in the same time-span):

Shurtz and Foster [3] of the Texas A&M University Medical Sciences Library (MSL) also sought to establish a rubric for evaluating evidence-based medicine (EBM) point-of-care tools in a health sciences library.

They, too, looked at editorial quality and speed of updating plus reviewing content, search options, quality control, and grading.

Their main conclusion is that “differences between EBM tools’ options, content coverage, and usability were minimal, but that the products’ methods for locating and grading evidence varied widely in transparency and process”.

Thus this is in line with what Banzi et al reported in their first paper. They also share Banzi's conclusion about differences in speed of updating:

“DynaMed had the most up-to-date summaries (updated on average within 19 days), while First Consult had the least up to date (updated on average within 449 days). Six tools claimed to update summaries within 6 months or less. For the 10 topics searched, however, only DynaMed met this claim.”

Table 3 from Shurtz and Foster [3] 

Ketchum et al [4] also conclude that DynaMed had the largest proportion of current (2007-2009) references (170/1131, 15%). In addition they found that DynaMed had the largest total number of references (1131/2330, 48.5%).

Yes, and you might have guessed it. The paper of Andrea Ketchum is the 3rd paper I’m going to review.

I also recommend reading the paper by the librarians Shurtz and Foster [3], which I found along the way. It has too much overlap with the Banzi papers to devote a separate post to it. Still, it provides better background information than the Banzi papers, focuses on POCs that claim to be evidence based, and doesn't try to weigh one element over another.


  1. Banzi, R., Liberati, A., Moschetti, I., Tagliabue, L., & Moja, L. (2010). A Review of Online Evidence-based Practice Point-of-Care Information Summary Providers Journal of Medical Internet Research, 12 (3) DOI: 10.2196/jmir.1288
  2. Banzi, R., Cinquini, M., Liberati, A., Moschetti, I., Pecoraro, V., Tagliabue, L., & Moja, L. (2011). Speed of updating online evidence based point of care summaries: prospective cohort analysis BMJ, 343 (sep22 2) DOI: 10.1136/bmj.d5856
  3. Shurtz, S., & Foster, M. (2011). Developing and using a rubric for evaluating evidence-based medicine point-of-care tools Journal of the Medical Library Association : JMLA, 99 (3), 247-254 DOI: 10.3163/1536-5050.99.3.012
  4. Ketchum, A., Saleh, A., & Jeong, K. (2011). Type of Evidence Behind Point-of-Care Clinical Information Products: A Bibliometric Analysis Journal of Medical Internet Research, 13 (1) DOI: 10.2196/jmir.1539
  5. Evidence Based Point of Care Summaries [1] No "Best" Among the Bests?
  6. How will we ever keep up with 75 Trials and 11 Systematic Reviews a Day?
  7. UpToDate or Dynamed? (Shamsha Damani)
  8. How Evidence Based is UpToDate really?


Evidence Based Point of Care Summaries [1] No “Best” Among the Bests?

13 10 2011

For many of today's busy practicing clinicians, keeping up with the enormous and ever-growing amount of medical information poses substantial challenges [6]. It is impractical to do a PubMed search to answer each clinical question and then synthesize and appraise the evidence, simply because busy health care providers have limited time and many questions per day.

As repeatedly mentioned on this blog ([6-7]), it is far more efficient to try to find aggregate (or pre-filtered or pre-appraised) evidence first.

Haynes ‘‘5S’’ levels of evidence (adapted by [1])

There are several forms of aggregate evidence, often represented as the higher layers of an evidence pyramid (because they aggregate individual studies, represented by the lowest layer). There are confusingly many pyramids, however [8] with different kinds of hierarchies and based on different principles.

According to the "5S" paradigm [9] (now evolved into 6S [10]) the peak of the pyramid consists of the ideal, but not yet realized, computer decision support systems, which link individual patient characteristics to the current best evidence. According to the 5S model the next best sources are evidence-based textbooks.
(Note: EBM and textbooks almost seem a contradiction in terms to me; personally I would not put many of the POCs anywhere near the top. Also see my post: How Evidence Based is UpToDate really?)

Whatever their exact place in the EBM pyramid, these POCs are helpful to many clinicians. There are many different POCs (see HLWIKI Canada for a comprehensive overview [12]) with a wide range of costs, varying from free with ads (eMedicine) to very expensive site licenses (UpToDate). Because of the costs, hospital libraries have to choose among them.

Choices are often based on user preferences and satisfaction and balanced against costs, scope of coverage etc. Choices are often subjective and people tend to stick to the databases they know.

Initial literature about POCs concentrated on user preferences and satisfaction. A New Zealand study [3] among 84 GPs showed no significant difference in preference for, or usage levels of DynaMed, MD Consult (including FirstConsult) and UpToDate. The proportion of questions adequately answered by POCs differed per study (see introduction of [4] for an overview) varying from 20% to 70%.
McKibbon and Fridsma ([5] cited in [4]) found that the information resources chosen by primary care physicians were seldom helpful in providing the correct answers, leading them to conclude that:

“…the evidence base of the resources must be strong and current…We need to evaluate them well to determine how best to harness the resources to support good clinical decision making.”

Recent studies have tried to objectively compare online point-of-care summaries with respect to their breadth, content development, editorial policy, the speed of updating and the type of evidence cited. I will discuss 3 of these recent papers, but will review each paper separately. (My posts tend to be pretty long and in-depth. So in an effort to keep them readable I try to cut down where possible.)

Two of the three papers are published by Rita Banzi and colleagues from the Italian Cochrane Centre.

In the first paper, reviewed here, Banzi et al [1] first identified English Web-based POCs using Medline, Google, librarian association websites, and information conference proceedings from January to December 2008. In order to be eligible, a product had to be an online-delivered summary that is regularly updated, claims to provide evidence-based information and is to be used at the bedside.

They found 30 potentially eligible POCs, of which the following 18 databases met the criteria: 5-Minute Clinical Consult, ACP-Pier, BestBETs, CKS (NHS), Clinical Evidence, DynaMed, eMedicine, eTG complete, EBM Guidelines, First Consult, GP Notebook, Harrison's Practice, Health Gate, Map Of Medicine, Micromedex, Pepid, UpToDate, ZynxEvidence.

They assessed and ranked these 18 point-of-care products according to: (1) coverage (volume) of medical conditions, (2) editorial quality, and (3) evidence-based methodology. (For operational definitions see appendix 1)

From a quantitative perspective DynaMed, eMedicine, and First Consult were the most comprehensive (88%) and eTG complete the least (45%).

The best editorial quality was delivered by Clinical Evidence (15), UpToDate (15), eMedicine (13), DynaMed (11) and eTG complete (10). (Scores are shown in brackets.)

Finally, BestBETs, Clinical Evidence, EBM Guidelines and UpToDate obtained the maximal score (15 points each) for best evidence-based methodology, followed by DynaMed and Map Of Medicine (12 points each).
As expected eMedicine, eTG complete, First Consult, GP Notebook and Harrison’s Practice had a very low EBM score (1 point each). Personally I would not have even considered these online sources as “evidence based”.

The calculations seem very “exact”, but assumptions upon which these figures are based are open to question in my view. Furthermore all items have the same weight. Isn’t the evidence-based methodology far more important than “comprehensiveness” and editorial quality?

Certainly because "volume" is "just" estimated by analyzing to what extent 4 random chapters of the ICD-10 classification are covered by the POCs. Some sources, like Clinical Evidence and BestBETs (which score low on this item), don't aim to be comprehensive but only "answer" a limited number of questions: they are not textbooks.

Editorial quality is determined by scoring specific indicators of transparency: authorship, the peer review procedure, updating, disclosure of authors' conflicts of interest, and commercial support of content development.

For the EB methodology, Banzi et al scored the following indicators:

  1. Is a systematic literature search or surveillance the basis of content development?
  2. Is the critical appraisal method fully described?
  3. Are systematic reviews preferred over other types of publication?
  4. Is there a system for grading the quality of evidence?
  5. When expert opinion is included, is it easily recognizable as distinct from studies' data and results?

The score for each of these indicators is 3 for "yes", 1 for "unclear", and 0 for "no" (if judged "not adequate" or "not reported").
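Just to make the arithmetic of the rubric explicit, here is a minimal sketch of the scoring scheme as I read it (not code from the paper); the indicator names are my own paraphrases of the five questions above.

```python
# Minimal sketch of the scoring scheme as I read it (not code from the paper):
# each indicator scores 3 for "yes", 1 for "unclear", 0 for "no"; maximum 15.
INDICATOR_POINTS = {"yes": 3, "unclear": 1, "no": 0}

INDICATORS = [
    "systematic literature search or surveillance",
    "critical appraisal method fully described",
    "systematic reviews preferred over other publications",
    "grading system for quality of evidence",
    "expert opinion recognizable from study data",
]

def ebm_methodology_score(answers: dict) -> int:
    """Sum the points over the five indicators (answers map indicator -> yes/unclear/no)."""
    return sum(INDICATOR_POINTS[answers[indicator]] for indicator in INDICATORS)

# Hypothetical product that reports everything except how expert opinion is flagged:
example = {indicator: "yes" for indicator in INDICATORS}
example["expert opinion recognizable from study data"] = "unclear"
print(ebm_methodology_score(example))  # 13 out of a maximum of 15
```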

This leaves little room for qualitative differences and mainly relies upon adequate reporting. As discussed earlier in a post where I questioned the evidence-based-ness of UpToDate, there is a difference between tailored searches and checking a limited list of sources (indicator 1.). It also matters whether the search is mentioned or not (transparency), whether it is qualitatively ok and whether it is extensive or not. For lists, it matters how many sources are “surveyed”. It also matters whether one or both methods are used… These important differences are not reflected by the scores.

Furthermore some points may be more important than others. Personally I find step 1 the most important. For what good is appraising and grading if it isn’t applied to the most relevant evidence? It is “easy” to do a grading or to copy it from other sources (yes, I wouldn’t be surprised if some POCs are doing this).

On the other hand, a zero for one indicator can have too much weight on the score.

DynaMed got 12 instead of the maximum 15 points, because its editorial policy page didn't explicitly describe its absolute prioritization of systematic reviews, although it really adheres to that in practice (see the comment by editor-in-chief Brian Alper [2]). Had DynaMed received the deserved points for this indicator, it would have had the highest overall score.

The authors further conclude that none of the dimensions turned out to be significantly associated with the other dimensions. For example, BestBETs scored among the worst on volume (comprehensiveness), with an intermediate score for editorial quality and the highest score for evidence-based methodology. Overall, DynaMed, EBM Guidelines, and UpToDate scored in the top quartile for 2 out of 3 variables and in the 2nd quartile for the 3rd of these variables (but, as explained above, DynaMed really scored in the top quartile for all 3 variables).

On basis of their findings Banzi et al conclude that only a few POCs satisfied the criteria, with none excelling in all.

The finding that Pepid, eMedicine, eTG complete, First Consult, GP Notebook, Harrison's Practice and 5-Minute Clinical Consult obtained only 1 or 2 of the maximum 15 points for EBM methodology confirms my "intuitive grasp" that these sources really don't deserve the label "evidence based". Perhaps we should make a stricter distinction between "point of care" databases, as a point where patients and practitioners interact, "particularly referring to the context of the provider-patient dyad" (definition by Banzi et al), and truly evidence-based summaries. Only a few of the tested databases would fit the latter definition.

In summary, Banzi et al reviewed 18 online evidence-based practice point-of-care information summary providers. They comprehensively evaluated and summarized these resources with respect to coverage (volume) of medical conditions, editorial quality, and evidence-based methodology.

Limitations of the study, also according to the authors, were the lack of a clear definition of these products, the arbitrariness of the scoring system, and the emphasis on the quality of reporting. Furthermore, the study didn't really assess the products qualitatively (i.e. with respect to performance). Nor did it take into account that products might have different aims. Clinical Evidence, for instance, only summarizes evidence on the effectiveness of treatments for a limited number of diseases. It therefore scores badly on volume while excelling on the other items.

Nevertheless it is helpful that POCs are objectively compared, and it may serve as a starting point for decisions about acquisition.

References (not in chronological order)

  1. Banzi, R., Liberati, A., Moschetti, I., Tagliabue, L., & Moja, L. (2010). A Review of Online Evidence-based Practice Point-of-Care Information Summary Providers Journal of Medical Internet Research, 12 (3) DOI: 10.2196/jmir.1288
  2. Alper, B. (2010). Review of Online Evidence-based Practice Point-of-Care Information Summary Providers: Response by the Publisher of DynaMed Journal of Medical Internet Research, 12 (3) DOI: 10.2196/jmir.1622
  3. Goodyear-Smith F, Kerse N, Warren J, & Arroll B (2008). Evaluation of e-textbooks. DynaMed, MD Consult and UpToDate. Australian family physician, 37 (10), 878-82 PMID: 19002313
  4. Ketchum, A., Saleh, A., & Jeong, K. (2011). Type of Evidence Behind Point-of-Care Clinical Information Products: A Bibliometric Analysis Journal of Medical Internet Research, 13 (1) DOI: 10.2196/jmir.1539
  5. McKibbon, K., & Fridsma, D. (2006). Effectiveness of Clinician-selected Electronic Information Resources for Answering Primary Care Physicians’ Information Needs Journal of the American Medical Informatics Association, 13 (6), 653-659 DOI: 10.1197/jamia.M2087
  6. How will we ever keep up with 75 Trials and 11 Systematic Reviews a Day?
  7. 10 + 1 PubMed Tips for Residents (and their Instructors)
  8. Time to weed the (EBM-)pyramids?!
  9. Haynes RB. Of studies, syntheses, synopses, summaries, and systems: the “5S” evolution of information services for evidence-based healthcare decisions. Evid Based Med 2006 Dec;11(6):162-164. [PubMed]
  10. DiCenso A, Bayley L, Haynes RB. ACP Journal Club. Editorial: Accessing preappraised evidence: fine-tuning the 5S model into a 6S model. Ann Intern Med. 2009 Sep 15;151(6):JC3-2, JC3-3. PubMed PMID: 19755349 [free full text].
  11. How Evidence Based is UpToDate really?
  12. Point of care decision-making tools – Overview
  13. UpToDate or Dynamed? (Shamsha Damani)


#FollowFriday #FF @DrJenGunter: EBM Sex Health Expert Wielding the Lasso of Truth

19 08 2011

If you're on Twitter you have probably seen the #FF or #FollowFriday phenomenon. FollowFriday is a way to recommend people on Twitter to others, for at least 2 reasons: to acknowledge your favorite tweeple and to make it easier for your followers to find new and interesting people.

However, some #FollowFriday tweet series are more like weekly spam. Almost 2 years ago I blogged about the misuse of FF recommendations and gave some suggestions for doing #FollowFriday the right way: not by merely mentioning many people in numerous tweets, but by recommending one or a few people at a time and explaining why this person is so awesome to follow.

Twitter Lists are also useful tools for recommending people (see post). You could construct lists of your favorite Twitter people for others to follow. I have created a general FollowFridays list, where I list all the people I have recommended in a #FF-tweet and/or post.

In this post I would like to take up the tradition of highlighting the #FF favs at my blog.

This FollowFriday I recommend:  

Jennifer Gunter

Jennifer Gunter (@DrJenGunter on Twitter) is a beautiful lady, but she shouldn't be tackled without gloves, for she is a true defender of evidence-based medicine and wields the lasso of truth.

Her specialty is OB/GYN and she is a sexual health expert. No surprise, then, that many of her tweets are related to this topic, some very serious, some with a humorous undertone. And there are also just-for-fun (re)tweets, like:

LOL -> “@BackpackingDad: New Word: Fungry. Full-hungry. “I just ate a ton of nachos, but hot damn am I fungry for those Buffalo wings!””

Dr Jen Gunter also has a blog, Dr. Jen Gunter (wielding the lasso of truth).

Again we find the same spectrum of posts, mostly in the field of ob/gyn. You need not be an ob/gyn nor an EBM expert to enjoy them. Jen’s posts are written in plain language, suitable for anyone to understand (including patients).

Some titles:

In addition, there are hilarious posts like "Cosmo's sex position of the day proves they know nothing about good sex or women", where she criticizes Cosmo for tweeting impossible sex positions ("If you're over 40, I dare you to even GET into that position!"), which she thinks were created by one of the following:

A) a computer who has never had sex and is not programmed to understand how the female body bends.
B) a computer programmer who has never had sex and has no understanding of how the female body bends.
C) a Yogi master/Olympic athlete.

Sometimes the topic is blogging. Jen is a fierce proponent of medical blogging. She sees it as a way to "promote" yourself as a doctor, to learn from your readers, to "contribute credible content [that] drowns out garbage medical information" (true), and as an ideal platform to deliver content to your patients and like-minded medical professionals (a great idea).

Read more at:

You can follow Jen via her Twitter account (@DrJenGunter) and/or you can follow my lists. She is on the ebm-cochrane-sceptics and followfridays lists.

Of course you can also subscribe to her blog.
