Can Guidelines Harm Patients?

2 05 2012

Recently I saw an intriguing “personal view” in the BMJ, written by Grant Hutchison and entitled “Guidelines can harm patients too” [1]. Hutchison is a consultant anesthetist with, as he calls it, chronic guideline fatigue syndrome. He suffered an acute exacerbation of his “condition” when yet another set of guidelines arrived in his email inbox. Hutchison:

On reviewing the level of evidence provided for the various recommendations being offered, I was struck by the fact that no relevant clinical trials had been carried out in the population of interest. Eleven out of 25 of the recommendations made were supported only by the lowest levels of published evidence (case reports and case series, or inference from studies not directly applicable to the relevant population). A further seven out of 25 were derived only from the expert opinion of members of the guidelines committee, in the absence of any guidance to be gleaned from the published literature.

Hutchison’s personal experience is supported by evidence from two articles [2,3].

One paper, published in JAMA in 2009 [2], concludes that ACC/AHA (American College of Cardiology and American Heart Association) clinical practice guidelines are largely developed from lower levels of evidence or expert opinion, and that the proportion of recommendations for which there is no conclusive evidence is growing. Only 314 of 2711 recommendations (median, 11%) are classified as level of evidence A, i.e. based on evidence from multiple randomized trials or meta-analyses. The largest share of recommendations (1246/2711; median, 48%) are level of evidence C, i.e. based on expert opinion, case studies, or standards of care. Strikingly, only 245 of 1305 class I recommendations (median, 19%) are based on the highest level A evidence.
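As I read the paper, the percentages in parentheses are medians across the individual guidelines, so they need not match the pooled fractions exactly. A minimal sketch in plain Python, using only the counts quoted above (the labels and function name are my own), shows what the pooled percentages would be:

```python
# Counts quoted from the JAMA paper [2]; the labels are illustrative only.
counts = {
    "level A, all recommendations": (314, 2711),
    "level C, all recommendations": (1246, 2711),
    "level A, class I recommendations": (245, 1305),
}


def pooled_percentage(numerator, denominator):
    """Pooled share in percent, rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


pooled = {label: pooled_percentage(n, d) for label, (n, d) in counts.items()}

for label, pct in pooled.items():
    print(f"{label}: {pct}%")
```

The pooled shares come out at roughly 11.6%, 46.0% and 18.8%, close to but not identical with the reported medians of 11%, 48% and 19%.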

Another paper, published in Arch Intern Med in 2011 [3], reaches similar conclusions after analyzing the Infectious Diseases Society of America (IDSA) practice guidelines. Of the 4218 individual recommendations found, only 14% were supported by the strongest (level I) quality of evidence; more than half were based on level III evidence only. As with the ACC/AHA guidelines, only a small part (23%) of the strongest IDSA recommendations were based on level I evidence (in this case ≥1 randomized controlled trial, see below). And here too, the new recommendations were mostly based on level II and III evidence.

Although there is little to argue about Hutchison’s observations, I do not agree with his conclusions.

In his view, guidelines are equivalent to a bullet-pointed list or flow diagram, allowing busy practitioners to move on from practice based on mere anecdote and opinion. It therefore seems contradictory that half of the EBM guidelines are based on little more than anecdote (case series, extrapolation from other populations) and opinion. He then argues that guidelines, like other therapeutic interventions, should be considered in terms of the balance between benefit and risk, and that the risk associated with the dissemination of poorly founded guidelines must also be considered. One of those risks is that doctors will simply tend to adhere to the guidelines, and may even change their own (adequate) practice in the absence of any scientific evidence against it. If a patient is harmed despite punctilious adherence to the guideline rules, “it is easy to be seduced into assuming that the bad outcome was therefore unavoidable”. But perhaps harm was done by following the guideline…

First of all, the overall evidence shows that adherence to guidelines can improve patient outcomes and provide more cost-effective care (Naveed Mustfa, in a comment, refers to [4]).

Hutchison’s piece is opinion-based and largely driven by (understandable) gut feelings and implicit assumptions, which also surround EBM in general.

  1. First there is the assumption that guidelines are a fixed set of rules, like a protocol, and that there is no room for preferences (of both the doctor and the patient), interpretation and experience. In the same way as EBM is often degraded to “cookbook medicine”, EBM guidelines are turned into mere bullet-pointed lists made by a bunch of experts who just want to impose their opinions as truth.
  2. The second assumption (shared by many) is that evidence based medicine is synonymous with “randomized controlled trials”. By analogy, only those EBM guideline recommendations “count” that are based on RCTs or meta-analyses.

Before I continue, I would strongly advise all readers (and certainly all EBM and guideline skeptics) to read this excellent and clearly written BMJ editorial by David Sackett et al., which deals with misconceptions, myths and prejudices surrounding EBM: Evidence based medicine: what it is and what it isn’t [5].

Sackett et al define EBM as “the conscientious, explicit and judicious use of current best evidence in making decisions about the care of individual patients” [5]. Sackett emphasizes that “Good doctors use both individual clinical expertise and the best available external evidence, and neither alone is enough. Without clinical expertise, practice risks becoming tyrannised by evidence, for even excellent external evidence may be inapplicable to or inappropriate for an individual patient. Without current best evidence, practice risks becoming rapidly out of date, to the detriment of patients.”

Guidelines are meant to give recommendations based on the best available evidence. Guidelines should not be a set of rules, set in stone. Ideally, guidelines have gathered evidence in a transparent way and make it easier for the clinicians to grasp the evidence for a certain procedure in a certain situation … and to see the gaps.

Contrary to what many people think, EBM is not restricted to randomized trials and meta-analyses. It involves tracking down the best external evidence there is. As I explained in #NotSoFunny #16 – Ridiculing RCTs & EBM, evidence is not an all-or-nothing thing: RCTs (if well performed) are the most robust, but if they are not available we have to rely on “lower” evidence (from cohort to case-control to case series, or even expert opinion).
On the other hand, RCTs are often not even suitable for answering questions in domains other than therapy (etiology/harm, prognosis, diagnosis): by definition, the level of evidence for these kinds of questions will inevitably be low*. Also, for some interventions RCTs are not appropriate, not feasible or too costly to perform (cesarean vs vaginal birth, experimental therapies, rare diseases; see also [3]).

It is also good to realize that guidance based on numerous randomized controlled trials probably has limited or no applicability to groups of patients who are seldom included in an RCT: the cognitively impaired, patients with multiple comorbidities [6], the elderly [6], children and (often) women.

Finally not all RCTs are created equal (various forms of bias; surrogate outcomes; small sample sizes, short follow-up), and thus should not all represent the same high level of evidence.*

Thus, in my opinion, low levels of evidence are not by definition problematic, even if they are the basis for strong recommendations, as long as it is clear how the recommendations were reached and as long as these are well underpinned (by whatever evidence or motivation). One could even see the exposed gaps in evidence as a positive thing, as they may highlight the need for clinical research in certain fields.

There is one BIG BUT: my assumption is that guidelines are “just” recommendations based on exhaustive and objective reviews of existing evidence. No more, no less. This means that the clinician must have the freedom to deviate from the recommendations, based on his own expertise and/or the situation and/or the patient’s preferences. All the more so when the evidence on which these strong recommendations are based is scant. Sackett already warned against the possible hijacking of EBM by purchasers and managers (and, may I add, health insurers and governmental agencies) to cut the costs of health care and to impose “rules”.

I therefore think it is odd that the ACC/AHA guidelines prescribe that Class I recommendations SHOULD be performed/administered even if they are based on level C evidence (see Figure).

I also find it odd that different guidelines use different nomenclature. The ACC/AHA guidelines have Class I, IIa, IIb and III recommendations and level A, B and C evidence, where level A represents sufficient evidence from multiple randomized trials and meta-analyses. In the IDSA guidelines, by contrast, the strength of recommendations runs from A through C (or D/E for recommendations against use) and the quality of evidence from level I through III, where level I indicates evidence from (just) one properly randomized controlled trial. As explained in [3], this system was introduced to evaluate the effectiveness of preventive health care interventions in Canada (for which RCTs are apt).

Finally, guidelines and guideline makers should probably be more open to input and feedback from the people who apply these guidelines.


*The new GRADE (Grading of Recommendations Assessment, Development, and Evaluation) scoring system, which also takes good-quality observational studies into account, may offer a potential solution.

Another possibly relevant post at this blog: The Best Study Design for … Dummies

Taken from a summary of an ACC/AHA guideline at


  1. Hutchison, G. (2012). Guidelines can harm patients too BMJ, 344 (apr18 1) DOI: 10.1136/bmj.e2685
  2. Tricoci P, Allen JM, Kramer JM, Califf RM, & Smith SC Jr (2009). Scientific evidence underlying the ACC/AHA clinical practice guidelines. JAMA : the journal of the American Medical Association, 301 (8), 831-41 PMID: 19244190
  3. Lee, D., & Vielemeyer, O. (2011). Analysis of Overall Level of Evidence Behind Infectious Diseases Society of America Practice Guidelines Archives of Internal Medicine, 171 (1), 18-22 DOI: 10.1001/archinternmed.2010.482
  4. Menéndez R, Reyes S, Martínez R, de la Cuadra P, Manuel Vallés J, & Vallterra J (2007). Economic evaluation of adherence to treatment guidelines in nonintensive care pneumonia. The European respiratory journal : official journal of the European Society for Clinical Respiratory Physiology, 29 (4), 751-6 PMID: 17005580
  5. Sackett, D., Rosenberg, W., Gray, J., Haynes, R., & Richardson, W. (1996). Evidence based medicine: what it is and what it isn’t BMJ, 312 (7023), 71-72 DOI: 10.1136/bmj.312.7023.71
  6. Aylett, V. (2010). Do geriatricians need guidelines? BMJ, 341 (sep29 3) DOI: 10.1136/bmj.c5340

An Evidence Pyramid that Facilitates the Finding of Evidence

20 03 2010

Earlier I wrote that there are confusingly many search and EBM pyramids. I distinguished 3 categories of pyramids:

  1. Search Pyramids
  2. Pyramids of EBM-sources
  3. Pyramids of EBM-levels (levels of evidence)

In my courses where I train doctors and medical students how to find evidence quickly, I use a pyramid that is a mixture of 1. and 2. This is a slide from a 2007 course.

This pyramid consists of 4 layers (from top down):

  1. EBM-(evidence based) guidelines.
  2. Synopses & Syntheses*: a synopsis is a summary and critical appraisal of one article, whereas synthesis is a summary and critical appraisal of a topic (which may answer several questions and may cover many articles).
  3. Systematic Reviews (a systematic summary and critical appraisal of original studies) which may or may not include a meta-analysis.
  4. Original Studies.

The upper 3 layers represent “Aggregate Evidence”. This is evidence from secondary sources, that search, summarize and critically appraise original studies (lowest layer of the pyramid).

The layers do not necessarily represent the levels of evidence and should not be confused with Pyramids of EBM-levels (type 3). An Evidence Based guideline can have a lower level of evidence than a good systematic review, for instance.
The present pyramid is only meant to lead the way through the labyrinth of sources, and thus to speed up the process of searching. The relevance and the quality of the evidence should always be checked.

The idea is:

  • The higher the level in the pyramid, the fewer publications it contains (the narrower it becomes)
  • Each level summarizes and critically appraises the underlying levels.

I advise people to try to find aggregate evidence first, and thus to drill down (hence the drill in the Figure).

The advantage: faster results, lower number to read (NNR).

During the first courses I gave, I just made a pyramid in Word with the links to the main sources.

Our library ICT department converted it into an HTML document with clickable links.

However, although the pyramid looked quite complex, not all main evidence sources were included. Plus some sources belong to different layers. The Trip Database for instance searches sources from all layers.

Our ICT-department came up with a much better looking and better functioning 3-D pyramid, with databases like TRIP in the sidebar.

Moving the mouse over a pyramid layer opens a pop-up with links to the databases belonging to that layer.

Furthermore, the sources included in the pyramid differ per specialty. For the Department of Gynecology, for example, we include POPLINE and MIDIRS in the lowest layer, and the RCOG and NVOG (Dutch) guidelines in the EBM-guidelines layer.

Together, my colleagues and I decide whether a source is evidence based (we don’t include UpToDate, for instance) and where it belongs. Each clinical librarian (we all serve different departments) then decides which databases to include. Clients can make suggestions.

Below is a short YouTube video showing how this pyramid can be used. Because of the rather poor quality, the video is best viewed in full-screen mode.
I have no audio (yet), so in short this is what you see:

Made with Screenr:

The pyramid is highly appreciated by our clients and students.

But it is just a start. My dream is to visualize the entire pathway from question to PICO, checklists, FAQs and database of results per type of question/reason for searching (fast question, background question, CAT etc.).

I’m just waiting for someone to fulfill the technical part of this dream.


*Note that there may be different definitions as well. The top layers in the 5S pyramid of Brian Haynes are defined as follows: syntheses & synopses (succinct descriptions of selected individual studies or systematic reviews, such as those found in the evidence-based journals); summaries, which integrate the best available evidence from the lower layers to develop practice guidelines based on a full range of evidence (e.g. Clinical Evidence, National Guidelines Clearinghouse); and, at the peak of the model, systems, in which the individual patient’s characteristics are automatically linked to the current best evidence that matches the patient’s specific circumstances and the clinician is provided with key aspects of management (e.g., computerised decision support systems).

Begin with the richest source of aggregate (pre-filtered) evidence and work downwards in order to decrease the number needed to read: there are fewer EBM guidelines than there are systematic reviews and (certainly) individual papers.

CC (2) Duodecim: Connecting patients (and doctors) to the best-evidence

5 10 2008

This is the second post in the series Cochrane Colloquium (CC) 2008.

In the previous post, I mentioned a very interesting opening session.

Here I will summarize one of the presentations in that opening session, i.e. the presentation by Pekka Mustonen, called:

Connecting patients to the best-evidence through technology: An effective solution or “the great seduction”?

Pekka essentially showed us what the Finnish have achieved with their Duodecim database.

Duodecim was started as a health portal for professionals only. It is a database (a decision support system) made by doctors for doctors. It contains Evidence-Based Medicine (EBM) Guidelines with:

  • regularly updated recommendations
  • links to evidence, including guidelines and Cochrane Systematic Reviews
  • commentaries

Busy clinicians don’t have the time to perform an extensive search for the best available evidence each time they have a clinical question. Ideally, they would only have to carry out one search, taking no more than one minute to find the right information.

This demand seems to be reasonably met by Duodecim.

Notably, Duodecim is not only very popular as a source for clinicians and nurses; the guidelines are also read and followed by them. Those familiar with healthcare know that this is usually the main obstacle: getting doctors and nurses to actually use the guidelines.

According to Pekka, patients are even more important than doctors for implementing guidelines: half of the patients don’t seem to follow their doctor’s advice. If the advice is to keep using inhaled steroids for the long-term management of asthma, for instance, many patients won’t follow it. “When you reach patients, small changes can have large benefits”, he said.

However, although many patients rely on the internet to find health information, formal health information sites face fierce competition online. It is difficult for consumers to separate the wheat from the chaff.

Still, Duodecim has managed to make a website for the general public that is now as popular as the original physicians database is for doctors, the only difference being that doctors use the database continuously, whereas the general public just consults the database when they are confronted with a health problem.
The database contains 1000 EBM key articles, and its content is integrated with personal health records. The site looks rather straightforward, neither glitzy nor flashy. This is intentional: it should look like a serious and trustworthy professional health care site.

A survey revealed that Duodecim performed a lot better than Google in answering health care questions, and does lead to more people either deciding NOT to consult a physician (because they are reassured), or deciding to consult one (because the symptoms might be more serious than thought). Thus it can make a difference!

The results are communicated differently to patients than to doctors. For instance, whether it is useful to wear stockings during long-haul flights to prevent deep venous thrombosis (DVT) in patients who have either a low or a high risk of thrombosis is explained to the physician in terms of RR, ARR, RRR and NNT.
Patients see a table with red (high-risk patients) and green (low-risk patients) columns. The conclusions are translated as follows:

If 1000 patients at low risk for DVT wear stockings on long-haul flights:

  • 9 will avoid it
  • 1 will get it
  • 1 out of 1000 (will get it)
  • 990 use stockings in vain

If 1000 patients at high risk for DVT wear stockings on long-haul flights:

  • 27 will avoid it
  • 3 will get it
  • 1 out of 333 (will get it)
  • 970 use stockings in vain
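To connect the patient-facing counts with the physician-facing measures (RR, ARR, RRR and NNT), here is a minimal sketch in Python. It is not taken from Duodecim. It assumes that, without stockings, everyone listed under “will avoid it” or “will get it” would have developed DVT, so the baseline risk is taken as 10 per 1000 (low risk) and 30 per 1000 (high risk). The function and variable names are my own.

```python
def effect_measures(avoided, still_affected, n=1000):
    """Derive RR, ARR, RRR and NNT from the patient-facing counts.

    Assumption: without stockings, all patients counted as "avoided" or
    "still affected" would have had a DVT, so the control (baseline) risk
    is (avoided + still_affected) / n.
    """
    control_events = avoided + still_affected
    return {
        "control_risk": control_events / n,       # risk without stockings
        "treated_risk": still_affected / n,       # risk with stockings
        "RR": still_affected / control_events,    # relative risk
        "ARR": avoided / n,                       # absolute risk reduction
        "RRR": avoided / control_events,          # relative risk reduction
        "NNT": n / avoided,                       # number needed to treat
    }


low_risk = effect_measures(avoided=9, still_affected=1)
high_risk = effect_measures(avoided=27, still_affected=3)

for name, result in (("low risk", low_risk), ("high risk", high_risk)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

Under this assumption the relative risk is 0.1 in both groups (a relative risk reduction of 90%), but the absolute risk reduction is 0.9% versus 2.7%, so roughly 111 low-risk patients or 37 high-risk patients would have to wear stockings to prevent one DVT. That contrast is exactly what the red and green columns try to convey without the jargon.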

This database will be integrated with permanent health records and virtual health checks. It is also linked to a TV program that aims to change people’s way of living. Online you can take a life expectancy test to see what age you would reach if you continued your current lifestyle (compare “je echte leeftijd”, “your real age” [Dutch]).

“What young people don’t realize”, Pekka said, “is that most older people find that the best of life starts at the age of 60” (?!). Thus, it doesn’t end at 30, as most youngsters think. But young people will only notice this if they reach old age in good health. To do so, they must change their habits while they are still young.

The Finnish database is free for Finnish people.

Quite coincidentally (while asking for a free USB stick at the Wiley stand 😉 ) I found out that Wiley’s database EBM Guidelines links to the Duodecim platform (see below). It might be quite interesting to take a trial, I think.

(Although this presumably is only the professional part of Duodecim, thus not the patient oriented database.)