No, Google Scholar Shouldn’t be Used Alone for Systematic Review Searching

9 07 2013

Several papers have addressed the usefulness of Google Scholar as a source for systematic review searching. Unfortunately the quality of those papers is often well below the mark.

In 2010 I already [1] (in the words of Isla Kuhn [2]) “robustly rebutted” the Anders paper “PubMed versus Google Scholar for Retrieving Evidence” [3] on this blog.

But earlier this year another controversial paper was published [4]:

“Is the coverage of Google Scholar enough to be used alone for systematic reviews?”

It is one of the highly accessed papers of BMC Medical Informatics and Decision Making and has been welcomed in (for instance) the Twittersphere.

Researchers seem to blindly accept the conclusions of the paper.

But don’t rush and assume you can now forget about PubMed, MEDLINE, Cochrane and EMBASE for your systematic review search and just do a simple Google Scholar (GS) search instead.

You might throw the baby out with the bathwater…

… As was immediately recognized by many librarians, either on their blogs (see the blogs of Dean Giustini [5], Patricia Anderson [6] and Isla Kuhn [2]) or in direct comments to the paper (by Tuulevi Ovaska, Michelle Fiander and Alison Weightman [7]).

In their paper, Jean-François Gehanno et al examined whether GS was able to retrieve all the 738 original studies included in 29 Cochrane and JAMA systematic reviews.

And YES! GS had a coverage of 100%!

WOW!

All those fools at the Cochrane who do exhaustive searches in multiple databases using controlled vocabulary and a lot of synonyms when a simple search in GS could have sufficed…

But it is a logical fallacy to conclude from their findings that GS alone will suffice for SR-searching.

Firstly, as Tuulevi [7] rightly points out:

“Of course GS will find what you already know exists”

Or in the words of one of the official reviewers [8]:

What the authors show is only that if one knows what studies should be identified, then one can go to GS, search for them one by one, and find out that they are indexed. But, if a researcher already knows the studies that should be included in a systematic review, why bother to also check whether those studies are indexed in GS?

Right!

Secondly, it is also the precision that counts.

As Dean explains at his blog, a 100% recall with a precision of 0.1% (and it can be worse!) means that in order to find 36 relevant papers you have to go through ~36,700 items.
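In other words (my own back-of-the-envelope arithmetic on Dean's numbers):

```latex
\text{precision} = \frac{\text{relevant records}}{\text{records retrieved}}
 \approx \frac{36}{36\,700} \approx 0.1\%
\qquad\Longrightarrow\qquad
\text{records to screen} \approx \frac{36}{0.001} \approx 36\,000
```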

Dean:

Are the authors suggesting that researchers consider a precision level of 0.1% acceptable for the SR? Who has time to sift through that amount of information?

It is like searching for needles in a haystack. Correction: it is like searching for particular hay stalks in a haystack; it is very difficult to find them when they are hidden among all the other hay stalks. Suppose the hay stalks were all labeled (title) and I had a powerful hay-stalk magnet (“title search”): it would be a piece of cake to retrieve them. This is what we call a “known item search”. But would you even consider going through the haystack and checking the stalks one by one? Because that is what we would have to do if we used Google Scholar as a one-stop search tool for systematic reviews.

Another main point of criticism is that the authors show a grave and worrisome lack of understanding of systematic review methodology [6] and don’t grasp the importance of the search interface and of knowledge of indexing, both of which are integral to searching for systematic reviews [7].

One wonders why the paper even passed peer review, as one of the two reviewers (Miguel Garcia-Perez [8]) had already smashed the paper to pieces:

The authors’ method is inadequate and their conclusion is not logically connected to their results. No revision (major, minor, or discretionary) will save this work. (…)

Miguel’s well-founded criticism was not adequately addressed by the authors [9]. Apparently the editors didn’t see through this and relied on the second peer reviewer [10], who merely said it was a “great job” etcetera, but that recall should not be written with a capital R (and that was about the only revision the authors made).

Perhaps it needs another paper to convince Gehanno et al and the uncritical readers of their manuscript.

Such a paper might just have been published [11]. It is written by Dean Giustini and Maged Kamel Boulos and is entitled:

Google Scholar is not enough to be used alone for systematic reviews

It is a simple and straightforward paper, but it makes its points clearly.

Giustini and Kamel Boulos looked for a recent SR in their own area of expertise (Chou et al. [12]) that included a comparable number of references to that of Gehanno et al. Next they tested GS’s ability to locate these references.

Although most papers cited by Chou et al. (n=476/506; ~95%) were ultimately found in GS, numerous iterative searches were required and each citation had to be handled one at a time. Thus GS was not able to locate all references found by Chou et al., and the whole exercise was rather cumbersome.

As expected, trying to find the papers with a “real-life” GS search was almost impossible: due to its rudimentary structure, GS did not understand the expert search strings and was unable to translate them. Using Chou et al.’s original search strategy and keywords yielded an unmanageable result of more than 750,000 items.

Giustini and Kamel Boulos note that GS’s ability to search the full text of papers, combined with its PageRank algorithm, can be useful.

On the other hand, GS’s changing content, unknown updating practices and poor reliability make it an inappropriate sole choice for systematic reviewers:

As searchers, we were often uncertain that results found one day in GS had not changed a day later and trying to replicate searches with date delimiters in GS did not help. Papers found today in GS did not mean they would be there tomorrow.

But most importantly, not all known items could be found and the search process and selection are too cumbersome.

So, shall we now once and for all conclude that GS is NOT sufficient to be used alone for SR searching?

We don’t need another bad paper addressing this.

But I would really welcome a well-performed paper looking at the additional value of GS in SR searching, for I am sure that GS may be valuable for some questions and some topics in some respects. We have to find out which.

References

  1. PubMed versus Google Scholar for Retrieving Evidence 2010/06 (laikaspoetnik.wordpress.com)
  2. Google scholar for systematic reviews…. hmmmm  2013/01 (ilk21.wordpress.com)
  3. Anders M.E. & Evans D.P. (2010) Comparison of PubMed and Google Scholar literature searches, Respiratory care, May;55(5):578-83  PMID:
  4. Gehanno J.F., Rollin L. & Darmoni S. (2013). Is the coverage of Google Scholar enough to be used alone for systematic reviews., BMC medical informatics and decision making, 13:7  PMID:  (open access)
  5. Is Google scholar enough for SR searching? No. 2013/01 (blogs.ubc.ca/dean)
  6. What’s Wrong With Google Scholar for “Systematic” Review 2013/01 (etechlib.wordpress.com)
  7. Comments at Gehanno’s paper (www.biomedcentral.com)
  8. Official Reviewer’s report of Gehanno’s paper [1]: Miguel Garcia-Perez, 2012/09
  9. Authors response to comments  (www.biomedcentral.com)
  10. Official Reviewer’s report of Gehanno’s paper [2]: Henrik von Wehrden, 2012/10
  11. Giustini D. & Kamel Boulos M.N. (2013). Google Scholar is not enough to be used alone for systematic reviews, Online Journal of Public Health Informatics, 5 (2) DOI:
  12. Chou W.Y.S., Prestin A., Lyons C. & Wen K.Y. (2013). Web 2.0 for Health Promotion: Reviewing the Current Evidence, American Journal of Public Health, 103 (1) e9-e18. DOI:




The Scatter of Medical Research and What to do About it.

18 05 2012

Paul Glasziou, GP and professor in Evidence Based Medicine, co-authored a new article in the BMJ [1]. Similar to another paper [2] I discussed before [3], this paper deals with the difficulty for clinicians of staying up to date with the literature. But where the previous paper [2,3] highlighted the mere increase in the number of research articles over time, the current paper looks at the scatter of randomized clinical trials (RCTs) and systematic reviews (SRs) across different journals cited in one year (2009) in PubMed.

Hoffmann et al analyzed 7 specialties and 9 subspecialties that are considered to contribute most to the burden of disease in high-income countries.

They followed a relatively straightforward method for identifying the publications. Each search string consisted of a MeSH term (controlled term) to identify the selected disease or disorders, a publication type [pt] to identify the type of study, and the year of publication. For example, the search strategy for randomized trials in cardiology was: “heart diseases”[MeSH] AND randomized controlled trial[pt] AND 2009[dp]. (When “heart diseases” is searched as a MeSH term, narrower terms are also searched.) Meta-analysis[pt] was used to identify systematic reviews.
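For illustration only, here is a minimal sketch of how such counts could be reproduced programmatically, assuming the Biopython Entrez module and a placeholder e-mail address; the search strings follow the paper, but the script itself is my own, not part of Hoffmann et al's methods.

```python
from Bio import Entrez  # Biopython wrapper around NCBI's E-utilities

Entrez.email = "your.name@example.org"  # placeholder; NCBI asks for a contact address

# Hoffmann-style search strings: a MeSH term for the (sub)specialty,
# a publication type for the study design, and the publication year.
queries = {
    "cardiology RCTs 2009": '"heart diseases"[MeSH] AND randomized controlled trial[pt] AND 2009[dp]',
    "cardiology SRs 2009":  '"heart diseases"[MeSH] AND meta-analysis[pt] AND 2009[dp]',
}

for label, query in queries.items():
    handle = Entrez.esearch(db="pubmed", term=query, retmax=0)
    record = Entrez.read(handle)
    handle.close()
    print(label, record["Count"])  # number of PubMed hits for each search
```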

Using this approach Hoffmann et al found 14,343 RCTs and 3,214 SRs published in 2009 in the selected (sub)specialties. There was a clear scatter across journals, but this scatter varied considerably among specialties:

“Otolaryngology had the least scatter (363 trials across 167 journals) and neurology the most (2770 trials across 896 journals). In only three subspecialties (lung cancer, chronic obstructive pulmonary disease, hearing loss) were 10 or fewer journals needed to locate 50% of trials. The scatter was less for systematic reviews: hearing loss had the least scatter (10 reviews across nine journals) and cancer the most (670 reviews across 279 journals). For some specialties and subspecialties the papers were concentrated in specialty journals; whereas for others, few of the top 10 journals were a specialty journal for that area.
Generally, little overlap occurred between the top 10 journals publishing trials and those publishing systematic reviews. The number of journals required to find all trials or reviews was highly correlated (r=0.97) with the number of papers for each specialty/ subspecialty.”

Previous work already suggested that this scatter of research has a long tail: half of the publications appear in a small minority of journals, whereas the remaining articles are scattered among many journals (see the figure below).

Click to enlarge and see the legends at BMJ 2012;344:e3223 [CC]

The good news is that SRs are less scattered and that general journals appear more often in the top 10 journals publishing SRs. Indeed for 6 of the 7 specialties and 4 of the 9 subspecialties, the Cochrane Database of Systematic Reviews had published the highest number of systematic reviews, publishing between 6% and 18% of all the systematic reviews published in each area in 2009. The bad news is that even keeping up to date with SRs seems a huge, if not impossible, challenge.

In other words, it is not sufficient for clinicians to rely on personal subscriptions to a few journals in their specialty (which is common practice). Hoffmann et al suggest several solutions to help clinicians cope with the increasing volume and scatter of research publications.

  • a central library of systematic reviews (but apparently the Cochrane Library fails to fulfill such a role according to the authors, because many reviews are out of date and are perceived as less clinically relevant)
  • a registry of planned and completed systematic reviews, such as PROSPERO (this makes it easier to locate SRs and reduces bias)
  • syntheses of evidence and synopses, like ACP Journal Club, which summarizes the best evidence in internal medicine
  • Specialised databases that collate and critically appraise randomized trials and systematic reviews, like www.pedro.org.au for physical therapy. In my personal experience, however, this database is often out of date and not comprehensive
  • Journal scanning services like EvidenceUpdates (from mcmaster.ca), which scans over 120 journals, filters articles on the basis of quality, has practising clinicians rate them for relevance and newsworthiness, and makes them available as email alerts and in a searchable database. I use this service too, but besides the fact that not all specialties are covered, the rating of evidence may not always be objective (see previous post [4])
  • The use of social media tools to alert clinicians to important new research.

Most of these solutions have long existed and do not, or only partly, solve the information overload.

I was surprised that the authors didn’t propose the use of personalized alerts. PubMed’s My NCBI feature allows you to create automatic email alerts on a topic and to subscribe to electronic tables of contents (which could include ACP Journal Club). Suppose a physician browses 10 journals that roughly cover 25% of the trials. He or she does not need to read all the other journals from cover to cover to avoid missing one potentially relevant trial. Instead it is far more efficient to perform a topic search to filter relevant studies from the journals that seldom publish trials on the topic of interest. One could even use the search of Hoffmann et al to achieve this.* Although in reality, most clinical researchers will have narrower fields of interest than all studies about endocrinology or neurology.
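As a concrete (and entirely hypothetical) illustration of such an alert, the snippet below builds the kind of query one could save in My NCBI: a Hoffmann-style topic search from which the journals a physician already browses are excluded. The journal names are placeholders, not recommendations.

```python
# Hypothetical example: a cardiologist already browses a few core journals
# and only wants to be alerted to trials published elsewhere.
browsed_journals = ['"N Engl J Med"[ta]', '"Lancet"[ta]', '"Circulation"[ta]']  # placeholders

topic = '"heart diseases"[MeSH] AND randomized controlled trial[pt]'
alert_query = f"({topic}) NOT ({' OR '.join(browsed_journals)})"
print(alert_query)
# ("heart diseases"[MeSH] AND randomized controlled trial[pt])
#   NOT ("N Engl J Med"[ta] OR "Lancet"[ta] OR "Circulation"[ta])
```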

At our library we are working on creating deduplicated, easy-to-read alerts that collate the tables of contents of certain journals with topic (and author) searches in PubMed, EMBASE and other databases. There are existing tools that do the same.

Another way to reduce the individual work (reading) load is to organize journal clubs or, even better, regular CATs (critically appraised topics). In the Netherlands, CATs are a compulsory item for residents. A few doctors do the work for many. Usually they choose topics that are clinically relevant (or for which the evidence is unclear).

The authors briefly mention that their search strategy might have missed some eligible papers and included some that are not truly RCTs or SRs, because they relied on PubMed’s publication type to retrieve RCTs and SRs. For systematic reviews this may be a greater problem than recognized, because the authors used meta-analysis[pt] to identify systematic reviews. Unfortunately PubMed has no publication type for systematic reviews, but it should be clear that there are many more systematic reviews than meta-analyses. Systematic reviews might even have a different scatter pattern than meta-analyses (i.e. the latter might be preferentially included in core journals).

Furthermore, not all meta-analyses and systematic reviews are reviews of RCTs (thus it is not completely fair to compare MAs with RCTs only). On the other hand, it is a (not discussed) omission of this study that only interventions are considered. Nowadays physicians have many other questions than those related to therapy, such as questions about prognosis, harm and diagnosis.

I did a little, admittedly imperfect, search just to see whether the use of other search terms than meta-analysis[pt] would influence the outcome. I searched for (1) meta-analysis[pt] and (2) systematic review[tiab] (title and abstract) among papers about endocrine diseases. Then I subtracted 1 from 2 (to analyse the systematic reviews not indexed as meta-analysis[pt]).

Thus:

(ENDOCRINE DISEASES[MESH] AND SYSTEMATIC REVIEW[TIAB] AND 2009[DP]) NOT META-ANALYSIS[PT]

I analyzed the top 10/11 journals publishing these study types.
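For anyone who wants to repeat this little experiment, here is a rough sketch of how it might be scripted with Biopython; this is a reconstruction for illustration (with a placeholder e-mail address), not the exact procedure I followed at the time.

```python
from collections import Counter
from Bio import Entrez, Medline

Entrez.email = "your.name@example.org"  # placeholder

query = ('(endocrine diseases[MeSH] AND systematic review[tiab] AND 2009[dp]) '
         'NOT meta-analysis[pt]')

# Get the PMIDs, fetch the MEDLINE records, and tally the journals.
handle = Entrez.esearch(db="pubmed", term=query, retmax=500)
pmids = Entrez.read(handle)["IdList"]
handle.close()

handle = Entrez.efetch(db="pubmed", id=",".join(pmids), rettype="medline", retmode="text")
journals = Counter(rec.get("TA", "unknown") for rec in Medline.parse(handle))
handle.close()

for journal, n in journals.most_common(11):
    print(n, journal)  # the top 10/11 journals for this sample
```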

This little experiment suggests that:

  1. the precise scatter might differ per search: apparently the systematic review[tiab] search yielded different top 10/11 journals (for this sample) than the meta-analysis[pt] search (partially because Cochrane systematic reviews apparently don’t mention “systematic review” in title or abstract?).
  2. the authors underestimate the number of systematic reviews: simply searching for systematic review[tiab] already found approximately 50% additional systematic reviews compared to meta-analysis[pt] alone.
  3. As expected (by me at least), many of the SRs and MAs were NOT dealing with interventions; see for instance the first 5 hits (out of 108 and 236 respectively).
  4. Together these findings indicate that the true information overload is far greater than shown by Hoffmann et al (not all systematic reviews are found, and of all available study designs only RCTs are searched).
  5. On the other hand this indirectly shows that SRs are a better way to keep up to date than suggested: SRs also summarize non-interventional research (the ratio of SRs of RCTs to individual RCTs is much lower than suggested).
  6. It also means that the role of the Cochrane systematic reviews in aggregating RCTs is underestimated by the published graphs (the MA[pt] set is diluted with non-RCT systematic reviews, so the proportion of the Cochrane SRs among the interventional MAs becomes larger).

Anyway, these imperfections do not contradict the main point of this paper: trials are scattered across hundreds of general and specialty journals, and “systematic reviews” (or really meta-analyses) do reduce the extent of scatter, but are still widely scattered and mostly in different journals to those of the randomized trials.

Indeed, personal subscriptions to journals seem insufficient for keeping up to date.
Besides supplementing subscriptions with methods such as journal scanning services, I would recommend the use of personalized alerts from PubMed and several prefiltered sources, including an EBM search engine like TRIP (www.tripdatabase.com).

*but I would broaden it to find all aggregate evidence, including ACP, Clinical Evidence, syntheses and synopses, not only meta-analyses.

**I do appreciate that one of the co-authors is a medical librarian: Sarah Thorning.

References

  1. Hoffmann, Tammy, Erueti, Chrissy, Thorning, Sarah, & Glasziou, Paul (2012). The scatter of research: cross sectional comparison of randomised trials and systematic reviews across specialties BMJ, 344 : 10.1136/bmj.e3223
  2. Bastian, H., Glasziou, P., & Chalmers, I. (2010). Seventy-Five Trials and Eleven Systematic Reviews a Day: How Will We Ever Keep Up? PLoS Medicine, 7 (9) DOI: 10.1371/journal.pmed.1000326
  3. How will we ever keep up with 75 trials and 11 systematic reviews a day (laikaspoetnik.wordpress.com)
  4. Experience versus Evidence [1]. Opioid Therapy for Rheumatoid Arthritis Pain. (laikaspoetnik.wordpress.com)




PubMed’s Higher Sensitivity than OVID MEDLINE… & other Published Clichés.

21 08 2011

Is it just me, or are biomedical papers about searching for a systematic review often of low quality or just too damn obvious? I’m seldom excited about papers dealing with optimal search strategies or peculiarities of PubMed, even though this is my specialty.
It is my impression that many of the lower-quality and/or less relevant papers are written by clinicians/researchers instead of information specialists (or at least without a medical librarian as first author).

I can’t help thinking that many of those authors just happen to see an odd feature in PubMed or encounter an unexpected phenomenon in the process of searching for a systematic review.
They think: “Hey, that’s interesting” or “That’s odd. Let’s write a paper about it. An easy way to boost our scientific output!”
What they don’t realize is that the published findings are often common knowledge to experienced MEDLINE searchers.

Let’s give two recent examples of what I think are redundant papers.

The first example is a letter under the heading “Clinical Observation” in Annals of Internal Medicine, entitled:

“Limitations of the MEDLINE Database in Constructing Meta-analyses”.[1]

As the authors rightly state, “a thorough literature search is of utmost importance in constructing a meta-analysis”. Since the PubMed interface from the National Library of Medicine is a cornerstone of many meta-analyses, the authors (two MDs) focused on the freely available PubMed (with MEDLINE as its largest part).

The objective was:

“To assess the accuracy of MEDLINE’s “human” and “clinical trial” search limits, which are used by authors to focus literature searches on relevant articles.” (emphasis mine)

O.k…. Stop! I know enough. This paper should have been titled: “Limitations of Limits in MEDLINE”.

Limits are NOT DONE when searching for a systematic review, for the simple reason that most limits (except language and dates) are MeSH terms.
It takes a while before the indexers have assigned MeSH terms to new papers, and not all papers are correctly (or consistently) indexed. Thus, by using limits you will automatically miss recent, not-yet-indexed or incorrectly indexed papers, whereas it is your goal (or it should be) to find as many relevant papers as possible for your systematic review. And wouldn’t it be sad if you missed that one important RCT that was published just the other day?

On the other hand, one doesn’t want to drown in irrelevant papers. How can one reduce “noise” while minimizing the risk of losing relevant papers?

  1. Use both MeSH terms and textwords to “limit” your search, i.e. also search “trial” as a textword, i.e. in title and abstract: trial[tiab]
  2. Use more synonyms and truncation (random*[tiab] OR placebo[tiab])
  3. Don’t actively limit, but use double negation. Thus, to get rid of animal studies, don’t limit to humans (this is the same as ANDing with humans[mh]), but safely exclude animals as follows: NOT (animals[mh] NOT humans[mh]) (= exclude papers indexed with “animals” except when these papers are also indexed with “humans”).
  4. Use existing methodological filters (ready-made search strategies) designed to help focus on study types. These filters are based on one or more of the above-mentioned principles (see earlier posts here and here).
    Simple methodological filters can be found in the PubMed Clinical Queries. For instance, the narrow filter for therapy not only searches for the publication type “randomized controlled trial” (a limit), but also for randomized, controlled and trial as textwords.
    Usually broader (more sensitive) filters are used for systematic reviews. The Cochrane handbook proposes the following filter, maximizing sensitivity and precision, to identify randomized trials in PubMed (see http://www.cochrane-handbook.org/):
    (randomized controlled trial [pt] OR controlled clinical trial [pt] OR randomized [tiab] OR placebo [tiab] OR clinical trials as topic [mesh: noexp] OR randomly [tiab] OR trial [ti]) NOT (animals [mh] NOT humans [mh]).
    When few hits are obtained, one can either use a broader filter or no filter at all. (A small scripted example of combining this filter with a topic search follows after this list.)
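By way of illustration, here is a minimal sketch (assuming Biopython and a placeholder e-mail address) of combining a topic search of your own with the Cochrane RCT filter quoted above; the topic is just an example and the script is mine, not part of the Cochrane handbook.

```python
from Bio import Entrez

Entrez.email = "your.name@example.org"  # placeholder

# The sensitivity- and precision-maximizing RCT filter quoted above.
cochrane_rct_filter = (
    "(randomized controlled trial[pt] OR controlled clinical trial[pt] "
    "OR randomized[tiab] OR placebo[tiab] OR clinical trials as topic[mesh:noexp] "
    "OR randomly[tiab] OR trial[ti]) NOT (animals[mh] NOT humans[mh])"
)

topic = "asthma[tiab] AND education[tiab]"  # example topic; substitute your own terms
handle = Entrez.esearch(db="pubmed", term=f"({topic}) AND {cochrane_rct_filter}", retmax=20)
record = Entrez.read(handle)
handle.close()
print(record["Count"], record["IdList"][:20])
```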

In other words, it is a beginner’s mistake to use limits when searching for a systematic review.
Besides publishing what should be common knowledge (even our medical students learn it), the authors make many other (little) mistakes; their precise search is difficult to reproduce and far from complete. This has already been addressed by Dutch colleagues in a comment [2].

The second paper is:

PubMed had a higher sensitivity than Ovid-MEDLINE in the search for systematic reviews [3], by Katchamart et al.

Again this paper focuses on the usefulness of PubMed for identifying RCTs for a systematic review, but it concentrates on the differences between PubMed and OVID in this respect. The paper starts by explaining that PubMed:

provides access to bibliographic information in addition to MEDLINE, such as in-process citations (..), some OLDMEDLINE citations (….) citations that precede the date that a journal was selected for MEDLINE indexing, and some additional life science journals that submit full texts to PubMed Central and receive a qualitative review by NLM.

Given these “facts”, am I exaggerating when I say that the authors are pushing at an open door with their main conclusion that PubMed retrieved more citations overall than Ovid-MEDLINE? The one (!) relevant article missed in OVID was a 2005 study published in a Japanese journal that MEDLINE only started indexing in 2007. It was therefore in PubMed, but not in OVID MEDLINE.

An important aspect to keep in mind when searching OVID/MEDLINE (I have discussed it earlier here and here). But worth a paper?

Recently, after finishing an exhaustive search in OVID/MEDLINE, we noticed that we had missed an RCT in PubMed that was not yet available in OVID/MEDLINE. I just added one sentence to the search methods:

Additionally, PubMed was searched for randomized controlled trials ahead of print, not yet included in OVID MEDLINE. 

Of course, I could have devoted a separate article to this finding. But it is so self-evident that I don’t think it would be worth it.

The authors have expressed their findings in sensitivity (85% for Ovid-MEDLINE vs. 90% for PubMed; the 5% difference is that ONE missing paper), precision and number needed to read (comparable for OVID-MEDLINE and PubMed).

If I may venture another opinion: it looks like editors of medical and epidemiology journals quickly fall for “diagnostic parameters” on a topic they don’t understand very well: library science.

The sensitivity/precision data found have little general value, because:

  • it concerns a single search on a single topic
  • there are few relevant papers (17-18)
  • useful features of OVID MEDLINE that are not available in PubMed are not used, e.g. adjacency searching could enhance the retrieval of relevant papers in OVID MEDLINE (adjacency = words searched within a specified maximal distance of each other)
  • the searches are not comparable, nor are the search field commands.

The latter is very important if one doesn’t wish to compare apples and oranges.

Lets take a look at the first part of the search (which is in itself well structured and covers many synonyms).
First part of the search - Click to enlarge
This part of the search deals with the P: patients with rheumatoid arthritis (RA). The authors first search for relevant MeSH terms (sets 1-5) and then for a few textwords. The MeSH terms are fine. The authors have chosen to use Arthritis, Rheumatoid and a few narrower terms (MeSH tree shown at the right). The authors have taken care to use the MeSH:noexp command in PubMed to prevent the automatic explosion of narrower terms (although this is superfluous for MeSH terms that have no narrower terms, like Caplan syndrome etc.).

But the fields chosen for the free text search (sets 6-9) are not comparable at all.

In OVID the .mp field is used, whereas “all fields” or even no field tag at all is used in PubMed.

I am not even fond of the uncontrolled use of .mp (I would rather search in title and abstract; remember, we already have the proper MeSH terms), but “all fields” is even broader than .mp.

In general, a .mp search looks in the Title, Original Title, Abstract, Subject Heading, Name of Substance, and Registry Word fields. “All fields” would be .af in OVID, not .mp.

Searching for rheumatism in OVID using the .mp field yields 7,879 hits, against 31,390 hits when one searches in the .af field.

Thus 4 times as many. Extra fields searched include, for instance, the journal and the address field: one finds all articles in the journal Arthritis & Rheumatism [line 6], or papers co-authored by someone at a department of rheumatoid surgery [line 9].

Worse, in PubMed the “all fields” tag doesn’t prevent automatic term mapping.

In PubMed, Rheumatism[All Fields] is translated as follows:

“rheumatic diseases”[MeSH Terms] OR (“rheumatic”[All Fields] AND “diseases”[All Fields]) OR “rheumatic diseases”[All Fields] OR “rheumatism”[All Fields]

Oops, Rheumatism[All Fields] is also searched as the (exploded!) MeSH term rheumatic diseases: thus rheumatic diseases (a term not included in the authors’ MeSH search) plus all its narrower terms! This makes the entire first part of the PubMed search obsolete (where the authors had carefully searched for non-exploded specific terms). It also explains the large difference in hits for rheumatism between PubMed and OVID/MEDLINE: 11,910 vs 6,945.
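You can check this translation yourself: esearch reports the query PubMed actually ran. A small sketch, assuming Biopython and a placeholder e-mail address:

```python
from Bio import Entrez

Entrez.email = "your.name@example.org"  # placeholder

for term in ("Rheumatism[All Fields]", "rheumatism[tiab]"):
    handle = Entrez.esearch(db="pubmed", term=term, retmax=0)
    record = Entrez.read(handle)
    handle.close()
    # QueryTranslation shows what PubMed really searched,
    # including any automatic mapping to (exploded) MeSH terms.
    print(term, "->", record["QueryTranslation"])
```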

Not only do the authors use the .mp and [All Fields] fields instead of the preferred [tiab] field, they also apply this broader field to the existing (optimized) Cochrane filter, which uses [tiab]. Finally, they use limits!

Anyway, I hope I have made my point that a useful comparison between strategies can only be made if optimal and comparable strategies are used. Sensitivity doesn’t mean anything here.

Coming back to my original point: I do think that some conclusions of these papers are “good to know”. As a matter of fact, they should be basic knowledge for anyone planning an exhaustive search for a systematic review. We do not need bad studies to show this.

Perhaps an expert paper (or a series) on this topic, understandable for clinicians, would be of more value.

Or the recognition that such search papers should be designed and written by librarians with ample experience in searching for systematic reviews.

NOTE:
* = truncation = search for different word endings; [tiab] = title and abstract; [ti] = title; [mh] = MeSH; [pt] = publication type

Photo credit

The image is taken from the Dragonfly-blog; here the Flickr-image Brain Vocab Sketch by labguest was adapted by adding the Pubmed logo.

References

  1. Winchester DE, & Bavry AA (2010). Limitations of the MEDLINE database in constructing meta-analyses. Annals of internal medicine, 153 (5), 347-8 PMID: 20820050
  2. Leclercq E, Kramer B, & Schats W (2011). Limitations of the MEDLINE database in constructing meta-analyses. Annals of internal medicine, 154 (5) PMID: 21357916
  3. Katchamart W, Faulkner A, Feldman B, Tomlinson G, & Bombardier C (2011). PubMed had a higher sensitivity than Ovid-MEDLINE in the search for systematic reviews. Journal of clinical epidemiology, 64 (7), 805-7 PMID: 20926257
  4. Search OVID EMBASE and Get MEDLINE for Free…. without knowing it (laikaspoetnik.wordpress.com 2010/10/19/)
  5. 10 + 1 PubMed Tips for Residents (and their Instructors) (laikaspoetnik.wordpress.com 2009/06/30)
  6. Adding Methodological filters to myncbi (laikaspoetnik.wordpress.com 2009/11/26/)
  7. Search filters 1. An Introduction (laikaspoetnik.wordpress.com 2009/01/22/)




Collaborating and Delivering Literature Search Results to Clinical Teams Using Web 2.0 Tools

8 08 2010

There seem to be two camps in the library world, the medical world and many other worlds: those who embrace Web 2.0 because they consider it useful for their practice, and those who are unaware of Web 2.0 or think it is just a fad. There are only a few ways the Web 2.0-critical people can be convinced: by arguments (hardly), by studies that show evidence of its usefulness, and by examples of what works and what doesn’t.

The paper by Shamsha Damani and Stephanie Fulton published in the latest Medical Reference Services Quarterly [1] falls into the latter category. Perhaps the name Shamsha Damani rings a bell: she is a prominent twitterer and has written guest posts on this blog on several occasions (here, here, here and here).

As clinical librarians at The University of Texas MD Anderson Cancer Center, Shamsha and Stephanie are immersed in clinical teams and provide evidence-based literature for various institutional clinical algorithms designed for patient care.

These were some of the problems the clinical librarians encountered when sharing the results of their searches with the teams by classic methods (email):

First, team members were from different departments and were dispersed across the sprawling hospital campus. Since the teams did not meet in person very often, it was difficult for the librarians to receive timely feedback on the results of each literature search. Second, results sent from multiple database vendors were either not received or were overlooked by team members. Third, even if users received the bibliography, they still had to manually search for and locate the full text of articles. The librarians also experimented with e-mailing EndNote libraries; however, many users were not familiar with EndNote and did not have the time to learn how to use it. E-mails in general tended to get lost in the shuffle, and librarians often found themselves re-sending e-mails with attachments. Lastly, it was difficult to update the results of a literature search in a consistent manner and obtain meaningful feedback from the entire team.

Therefore, they tried several Web 2.0 tools for sharing search results with their clinical teams.
In their article, the librarians share their experience with the various applications they explored that allowed centralization of the search results, provided easy online access, and enabled collaboration within the group.

Online reference management tools were the librarians’ first choice, since these are specifically designed to help users gather and store references from multiple databases and allow sharing of results. Of the available tools, RefWorks was eventually not tested, because it required two sets of usernames and passwords. In contrast, EndNote Web can be accessed from any computer with a single username and password. EndNote Web is suitable for downloading and managing references from multiple databases, for retrieving full-text papers and for online collaboration. In theory, that is. In practice, the team members experienced several difficulties: trouble remembering the usernames and passwords, and difficulties using the link resolver and navigating to the full text of each article and back to the EndNote Web homepage. Furthermore, accessing the full text of each article was considered too laborious a process.

Next, free social bookmarking sites were tested, which allow users to bookmark web sites and articles, to share the bookmarks and to access them from any computer. However, most team members didn’t create an account and could therefore not make use of the collaborative features. The bookmarking sites were deemed “user-unfriendly”, because (1) the overall layout and the presentation of results, with their many links, were experienced as confusing, (2) the sorting possibilities were not suitable for this purpose and (3) it was impossible to search within the abstracts, which were not part of the bookmarked records. This was true both for Delicious and for Connotea, even though the latter is more apt for science and medicine, includes bibliographic information and allows import and export of references from other systems. Another drawback was that the librarians needed to bookmark and comment on each individual article.

Wikis (PBWorks and SharePoint) appeared the most user-friendly, because they were intuitive and easy to use: the librarians had created a shared username and password for the entire team, the wiki was behind the hospital’s firewall (preferred by the team) and the users could access the articles with one click. For the librarians it was labor-intensive, as they annotated the bibliographies, published them on the wiki and added persistent links to each article. It is not clear from the article how final reference lists were created by the team afterwards. Probably by cut & paste, because wikis are not really suitable as a word processor, nor are they suitable for import and export of references.

Some Remarks

It is informative to read the pros and cons of the various Web 2.0 tools for collaborating and delivering search results. For me, it was even more valuable to read how the research was done. As the authors note (quote):

There is no ‘‘one-size-fits-all’’ approach. Each platform must be tested and evaluated to see how and where it fits within the user’s workflow. When evaluating various Web 2.0 technologies, librarians should try to keep users at the forefront and seek feedback frequently in order to provide better service. Only after months of exploration did the librarians at MD Anderson Cancer Center learn that their users preferred wikis and 1-click access to full-text articles. Librarians were surprised to learn that users did not like the library’s link resolvers and wanted a more direct way to access information.

Indeed, there is no ‘‘one-size-fits-all’’ approach. For that reason too, the results obtained may only apply in certain settings.

I was impressed by the level of involvement of the clinical librarians and the time they put not only into searching, but also into presenting the data, ranking the references according to study design, publication type and date, and annotating the references. I hope they prune the results as well, because applying this procedure to 1000 or more references is no joke. And although it may be ideal for the library users, not all librarians work like this. I know of no Dutch librarian who does. Because of the workload, such a ready-made wiki may not be feasible for many librarians.

The librarians’ starting point was to find an easy and intuitive web-based tool that allowed collaboration and sharing of references.
The emphasis seems to be more on the sharing, since end-users did not seem to collaborate via the wikis themselves. I also wonder whether the simpler and free Google Docs wouldn’t fulfill most of the needs. In addition, some of the tools might have been perceived as more useful if users had received some training beforehand.
The training we offer in Reference Manager is usually sufficient to learn to work efficiently with this quite complex reference management tool. Of course, desktop software is not suitable for online collaboration (although references can always be exported to an easier system), but a short training may take away most of the barriers people feel when using a new tool (with the advantage that they can use the tool for other purposes as well).

In short,

Of the Web 2.0 tools tested, wikis were the most intuitive and easy-to-use tools for collaborating with clinical teams and for delivering literature search results. Although they are easy to use for end-users, they seem very time-consuming for the librarians, who make ready-to-use lists with annotations.

Clinical teams of MD Anderson must be very lucky with their clinical librarians.

Reference
Damani S, & Fulton S (2010). Collaborating and delivering literature search results to clinical teams using web 2.0 tools. Medical reference services quarterly, 29 (3), 207-17 PMID: 20677061


———————————

Added: August 9th 2010, 21:30

On the basis of the comments below (Annemarie Cunningham) and on Twitter (@Dymphie – here and here (Dutch)), I think it is a good idea to include a figure of one of the published wiki lists.

It looks beautiful, but, as said, where is the collaborative aspect? Like Dymphie, I have the impression that these lists are no different from “normal” reference lists. Or am I missing something? I also agree with Dymphie that instructing people in Reference Manager may be much more efficient for this purpose.

It is interesting to read Christina Pikas’ view of this paper. At her blog Christina’s LIS Rant (which has just moved to the new Scientopia platform), Christina first describes how she delivers her search results to her customers and which platforms she uses for this. Then she shares some thoughts about the paper, like:

  • they (the authors) ruled out RefWorks because it required two sets of logins/passwords – hmm, why not RefWorks with RefShare? Why two sets of passwords?
  • SharePoint wikis suck. I would probably use some other type of web part – even a discussion board entry for each article.
  • they really didn’t use the 2.0 aspects of the 2.0 tools – particularly in the case of the wiki. The most valued aspects were access without a lot of logins and then access to the full text without a lot of clicks.

Like Christina, I would be interested in hearing about other approaches, particularly ones using newer tools.






Will Nano-Publications & Triplets Replace The Classic Journal Articles?

23 06 2010

“Libraries and journal articles as we know them will cease to exist,” said Barend Mons at the symposium in honor of our library’s 25th anniversary (June 3rd). “Possibly we will have another kind of party in another 25 years”… he continued, grinning.

What he had to say in the next half hour intrigued me. And although I had no pen with me (it was our party, remember), I thought it was interesting enough to devote a post to it.

I’m basing this post not only on my memory (we had a lot of Italian wine at the buffet), but also on an article Mons referred to [1], a Dutch newspaper article [2], other articles [3-6] and PowerPoints [7-9] on the topic.

This is a field I know little about, so I will try to keep it simple (also for my sake).

Mons started by touching on a problem that is very familiar to doctors, scientists and librarians: information overload from a growing web of linked data. He showed a picture that looked like the one at the right (though I’m sure those are Twitter networks).

As he said elsewhere [3]:

(..) the feeling that we are drowning in information is widespread (..) we often feel that we have no satisfactory mechanisms in place to make sense of the data generated at such a daunting speed. Some pharmaceutical companies are apparently seriously considering refraining from performing any further genome-wide association studies (… whole genome association –…) as the world is likely to produce many more data than these companies will ever be able to analyze with currently available methods .

With the current search engines we have to do a lot of digging to get the answers [8]. Computers are central to this digging, because there is no way people can stay updated, even in their own field.

However, computers can’t deal with the current web and with scientific information as produced in classic articles (even the electronic versions), for the following reasons:

  1. Homonyms. Words that sound or are spelled the same but have a different meaning. Acronyms are notorious in this respect. Barend gave PSA as an example but, without realizing it, he used a better one: PPI. This means Proton Pump Inhibitor to me, but apparently Protein-Protein Interaction to him.
  2. Redundancy. To keep journal articles readable we often use different words to denote the same thing. These do not add to the real new findings in a paper. In fact, the majority of digital information is duplicated repeatedly. For example, “mosquitoes transfer malaria” is a factual statement repeated in many consecutive papers on the subject.
  3. The connection between words is not immediately clear (to a computer). For instance, TNF inhibitors can be used to treat skin disorders, but the same drugs can also cause them.
  4. Data are not structured beforehand.
  5. Weight: some “facts” are “harder” than others.
  6. Not all data are available or accessible. Many data are either not published (e.g. negative studies), not freely available or not easy to find. Some portals (GoPubMed, NCBI) provide structured information (fields, including keywords), but do not enable searching the full text.
  7. Data are spread out. Data are kept in “data silos” not meant for sharing [8] (ppt2). One would like to query 1000 databases simultaneously, but this would require semantic web standards for publishing, sharing and querying knowledge from diverse sources.

In a nutshell, the problem is, as Barend put it: “Why bury data first and then mine it again?” [9]

Homonyms, redundancy and connection can be tackled, at least in the field Barend is working in (bioinformatics).

Different terms denoting the same concept (i.e. synonyms) can be mapped to a single concept identifier (i.e. a list of synonyms), whereas identical terms used to indicate different concepts (i.e. homonyms) can be resolved by a disambiguation algorithm.

The shortest meaningful sentence is a triplet: a combination of subject, predicate and object. A triplet indicates the connection and its direction. “Mosquitoes cause/transfer malaria” is such a triplet, where mosquitoes and malaria are concepts. In the field of proteins, “UNIPROT 05067 is a protein” is a triplet (where UNIPROT 05067 and protein are concepts), as are “UNIprotein 05067 is located in the membrane” and “UNIprotein 0506 interacts with UNIprotein 0506” [8]. Since these triplets (statements) derive from different databases, consistent naming and availability of information are crucial to find them. Barend and colleagues are the people behind WikiProteins, an open, collaborative wiki focusing on proteins and their role in biology and medicine [4-6].
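To make the idea of a triplet a bit more tangible, here is a small sketch of my own (not the Concept Web Alliance's actual data model) expressing "mosquitoes transfer malaria" as a machine-readable triple with the Python rdflib library, with a certainty weight attached in the spirit of a nano-publication; the namespace and property names are hypothetical.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

EX = Namespace("http://example.org/")  # hypothetical namespace for this illustration

g = Graph()

# The triplet itself: subject - predicate - object.
g.add((EX.Mosquito, EX.transfers, EX.Malaria))

# A separate statement about the assertion: its type and a weight
# between 0 (uncertain) and 1 (very certain).
nanopub = URIRef("http://example.org/nanopub/1")
g.add((nanopub, RDF.type, EX.NanoPublication))
g.add((nanopub, EX.certainty, Literal(0.99)))

print(g.serialize(format="turtle"))
```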

Concepts and triplets are widely accepted in the world of bioinformatics. To get an idea of what this means for searching, see the search engine Quertle, which allows semantic searching of PubMed and full-text biomedical literature, with automatic extraction of key concepts. Searching for ESR1 $BiologicalProcess will find abstracts mentioning all kinds of processes in which ESR1 (aka ERα, ERalpha, EStrogen Receptor 1) is involved. The search can be refined by choosing ‘narrower terms’ like “proliferation” or “transcription”.

The new aspect is that Mons wants to turn those triplets into (what he calls) nano-publications. Because not every statement is equally ‘hard’, nano-publications are weighted by assigning numbers from 0 (uncertain) to 1 (very certain). The nano-publication “mosquitoes transfer malaria” will get a number approaching 1.

Such nano-publications offer little room for nuance, interpretation and discussion. Mons does not propose to entirely replace traditional articles with nano-publications. Quote [3]:

While arguing that research results should be available in the form of nano-publications, we are emphatically not saying that traditional, classical papers should not be published any longer. But their role is now chiefly for the official record, the “minutes of science”, and not so much as the principal medium for the exchange of scientific results. That exchange, which increasingly needs the assistance of computers to be done properly and comprehensively, is best done with machine-readable, semantically consistent nano-publications.

According to Mons, authors and their funders should start requesting and expecting the papers that they have written and funded to be semantically coded when published, preferably by the publisher and otherwise by libraries: the technology exists to provide Web browsers with the functionality for users to identify nano-publications, and annotate them.

Like the WikiProteins wiki, nano-publications will be entirely open access. It will suffice to properly cite the original finding/publication.

In addition there is a new kind of “peer review”. An expert network is set up to immediately assess a twittered nano-publication when it comes out, so that  the publication is assessed by perhaps 1000 experts instead of 2 or 3 reviewers.

On a small scale, this is already happening. Nano-publications are sent as tweets to people like Gert-Jan van Ommen (past president of HUGO and co-author of 5 of my publications, or vice versa), who then gives a red (“don’t believe”) or a green (“believe”) light via one click on his BlackBerry.

As  Mons put it, it looks like a subjective event, quite similar to “dislike” and “like” in social media platforms like Facebook.

Barend often referred to a PLoS ONE paper by van Haagen et al [1], showing the superiority of the concept-profile based approach not only in detecting explicitly described PPIs, but also in inferring new PPIs.

[You can skip the part below if you’re not interested in details of this paper]

Van Haagen et al first established a set of 61,807 known human PPIs and a much larger set of probable non-interacting protein pairs (NIPPs) from online human-curated databases (the NIPPs also from the IntAct database).

For the concept-based approach they used the concept-recognition software Peregrine, which includes synonyms and spelling variations  of concepts and uses simple heuristics to resolve homonyms.

This concept-profile based approach was compared with several other approaches, all depending on co-occurrence (of words or concepts):

  • Word-based direct relation. This approach uses direct PubMed queries (words) to detect if proteins co-occur in the same abstract (thus the names of two proteins are combined with the boolean ‘AND’). This is the simplest approach and represents how biologists might use PubMed to search for information.
  • Concept-based direct relation (CDR). This approach uses concept-recognition software to find PPIs, taking synonyms into account and resolving homonyms. Here two concepts (in this case two proteins) are detected if they co-occur in the same abstract.
  • STRING. The STRING database contains a text mining score which is based on direct co-occurrences in literature.

The results show that, using concept profiles, 43% of the known PPIs were detected with a specificity of 99%, and 66% of all known PPIs with a specificity of 95%. In contrast, the direct-relation methods and STRING show much lower scores:

                            Word-based   CDR    Concept profiles   STRING
Sensitivity at spec = 99%      28%       37%          43%            39%
Sensitivity at spec = 95%      33%       41%          66%            41%
Area under Curve               0.62      0.69         0.90           0.69

These findings suggest that not all proteins with high similarity scores are known to interact; they may be related in another way, e.g. they could be involved in the same pathway or be part of the same protein complex without physically interacting. Indeed, concept-based profiling was superior in predicting relationships between proteins potentially present in the same complex or pathway (thus A-C inferred from the co-occurring protein pairs A-B and B-C).
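As a toy illustration of that last point (this is just the bare co-occurrence idea, not the authors' concept-profile method): if A co-occurs with B and B with C, then A-C can be proposed as a candidate relation. The protein names echo the CAPN3/PARVB example below; the pairs themselves are made up.

```python
from itertools import combinations
from collections import defaultdict

# Hypothetical co-occurrence pairs harvested from abstracts.
cooccurrences = {("CAPN3", "dysferlin"), ("dysferlin", "PARVB"),
                 ("CAPN3", "alpha-actinin"), ("alpha-actinin", "PARVB")}

neighbours = defaultdict(set)
for a, b in cooccurrences:
    neighbours[a].add(b)
    neighbours[b].add(a)

# Propose pairs that never co-occur directly but share intermediate concepts.
for a, c in combinations(sorted(neighbours), 2):
    shared = neighbours[a] & neighbours[c]
    if shared and (a, c) not in cooccurrences and (c, a) not in cooccurrences:
        print(f"candidate relation: {a} - {c} (via {sorted(shared)})")
```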

Since there is often a substantial time lag between the first publication of a finding, and the time the PPI is entered in a database, a retrospective study was performed to examine how many of the PPIs that would have been predicted by the different methods in 2005 were confirmed in 2007. Indeed, using concept profiles, PPIs could be efficiently predicted before they enter PPI databases and before their interaction was explicitly described in the literature.

The practical value of the method for the discovery of novel PPIs is illustrated by the experimental confirmation of the inferred physical interaction between CAPN3 and PARVB, which was based on the frequent co-occurrence of both proteins with concepts like Z-disc, dysferlin and alpha-actinin. The relationships between proteins predicted are broader than PPIs, and include proteins in the same complex or pathway. Depending on the type of relationships deemed useful, the precision of the method can be as high as 90%.

In line with their open access policy, they have made the full set of predicted interactions available in a downloadable matrix and through the webtool Nermal, which lists the most likely interaction partners for a given protein.

According to Mons, this framework will be a very rich source for new discoveries, as it will enable scientists to prioritize potential interaction partners for further testing.

Barend Mons started with the statement that nano-publications will replace classic articles (and the need for libraries). However, things are never as black as they seem.
Mons showed that a nano-publication is basically a “peer-reviewed, openly available” triplet. Triplets can be effectively retrieved and inferred from available databases/papers using a concept-based approach.
Nevertheless, effectiveness needs to be enhanced by semantically coding triplets when they are published.

What will this mean for clinical medicine? Bioinformatics is quite another discipline, with better structured and more straightforward data (interaction, identity, place). Interestingly, Mons and van Haagen plan further studies, in which they will evaluate whether concept profiles can also be applied to the prediction of other types of relations, for instance between drugs or genes and diseases. The future will tell whether the above-mentioned approach is also useful in clinical medicine.

Implementation of the following (implicit) recommendations would be advisable, independent of the possible success of nano-publications:

  • Less emphasis on “publish or perish” (thus more on the data themselves, whether positive, negative, trendy or not)
  • Better structured data, partly by structuring articles. This has already improved over the years by introducing structured abstracts, availability of extra material (appendices, data) online and by guidelines, such as STARD (The Standards for Reporting of Diagnostic Accuracy)
  • Open Access
  • Availability of full text
  • Availability of raw data

One might argue that disclosing data is unlikely when pharma is involved. It is very hopeful therefore, that a group of major pharmaceutical companies have announced that they will share pooled data from failed clinical trials in an attempt to figure out what is going wrong in the studies and what can be done to improve drug development (10).

Unfortunately I don’t have Mons’ presentation at my disposal. Therefore, here are two other presentations about triplets, concepts and the semantic web.


References

  1. van Haagen HH, ‘t Hoen PA, Botelho Bovo A, de Morrée A, van Mulligen EM, Chichester C, Kors JA, den Dunnen JT, van Ommen GJ, van der Maarel SM, Kern VM, Mons B, & Schuemie MJ (2009). Novel protein-protein interactions inferred from literature context. PloS one, 4 (11) PMID: 19924298
  2. Twitteren voor de wetenschap (Twittering for Science), Maartje Bakker, Volkskrant (2010-06-05)
  3. Barend Mons and Jan Velterop (?) Nano-Publication in the e-science era (Concept Web Alliance, Netherlands BioInformatics Centre, Leiden University Medical Center.) http://www.nbic.nl/uploads/media/Nano-Publication_BarendMons-JanVelterop.pdf, accessed June 20th, 2010.
  4. Mons, B., Ashburner, M., Chichester, C., van Mulligen, E., Weeber, M., den Dunnen, J., van Ommen, G., Musen, M., Cockerill, M., Hermjakob, H., Mons, A., Packer, A., Pacheco, R., Lewis, S., Berkeley, A., Melton, W., Barris, N., Wales, J., Meijssen, G., Moeller, E., Roes, P., Borner, K., & Bairoch, A. (2008). Calling on a million minds for community annotation in WikiProteins Genome Biology, 9 (5) DOI: 10.1186/gb-2008-9-5-r89
  5. Science Daily (2008/05/08) Large-Scale Community Protein Annotation — WikiProteins
  6. Boing Boing: (2008/05/28) WikiProteins: a collaborative space for biologists to annotate proteins
  7. (ppt1) SWAT4LS 2009: Semantic Web Applications and Tools for Life Sciences, Amsterdam, Science Park, Friday, 20th of November 2009. http://www.swat4ls.org/
  8. (ppt2) Michel Dumontier: Triples for the People: Scientists Liberating Biological Knowledge with the Semantic Web
  9. (ppt3, only slide shown): Bibliography 2.0: A citeulike case study from the Wellcome Trust Genome Campus – by Duncan Hill (EMBL-EBI)
  10. WSJ (2010/06/11) Drug Makers Will Share Data From Failed Alzheimer’s Trials




PubMed versus Google Scholar for Retrieving Evidence

8 06 2010

A while ago a resident in dermatology told me she got many hits out of PubMed, but zero results out of TRIP. It appeared she had used the same search for both databases: alopecia areata and diphenciprone (a drug with a lot of synonyms). Searching TRIP for alopecia (in the title) only, we found a Cochrane review and a relevant NICE guideline.

Usually, each search engine has its own search and index features. When comparing databases one should compare “optimal” searches and keep in mind for what purpose the search engines are designed. TRIP is most suited to searching aggregate evidence, whereas PubMed is most suited to searching individual biomedical articles.

Michael Anders and Dennis Evans ignore this rule of thumb in their recent paper “Comparison of PubMed and Google Scholar Literature Searches”. And this is not the only shortcoming of the paper.

The authors performed searches on 3 different topics to compare PubMed and Google Scholar search results. Their main aim was to see which database was the most useful to find clinical evidence in respiratory care.

Well quick guess: PubMed wins…

The 3 respiratory care topics were selected from a list of systematic reviews on the Website of the Cochrane Collaboration and represented in-patient care, out-patient care, and pediatrics.

The references in the three chosen Cochrane systematic reviews served as a “reference” (or “gold”) standard. However, abstracts, conference proceedings and responses to letters were excluded.

So far so good. But note that the outcome of the study only allows us to draw conclusions about interventional questions, which seek to find controlled clinical trials. Other principles may apply to other domains (diagnosis, etiology/harm, prognosis) or to other types of studies. And it certainly doesn’t apply to non-EBM topics.

The authors designed ONE search for each topic, taking 2 common clinical terms from the title of each Cochrane review and connecting them with the Boolean operator AND (see the table below; the quotation marks were not part of the actual searches). No synonyms were used and the translation of the searches in PubMed wasn't checked (luckily the mapping was rather good).

“Mmmmm…”

Topic and search terms:

  • Noninvasive positive-pressure ventilation for cardiogenic pulmonary edema: "noninvasive positive-pressure ventilation" AND "pulmonary edema"
  • Self-management education and regular practitioner review for adults with asthma: "asthma" AND "education"
  • Ribavirin for respiratory syncytial virus: "ribavirin" AND "respiratory syncytial virus"

In PubMed they applied the narrow methodological filter, or Clinical Query, for the domain therapy.
This prefab search strategy (randomized controlled trial[Publication Type] OR (randomized[Title/Abstract] AND controlled[Title/Abstract] AND trial[Title/Abstract])), developed by Haynes, is suitable for quickly finding the available evidence (provided one is looking for RCTs and isn't doing an exhaustive search) (see previous posts 2, 3, 4).
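For those who like to script such comparisons, a topic search combined with this filter can be reproduced through NCBI's E-utilities, for instance with Biopython. This is just a minimal sketch, not the authors' actual method; the e-mail address is a placeholder and today's hit counts will differ from those in the 2010 paper:

from Bio import Entrez

Entrez.email = "you@example.org"  # placeholder; NCBI asks for a contact address

topic = '"ribavirin" AND "respiratory syncytial virus"'
narrow_therapy = ('randomized controlled trial[Publication Type] OR '
                  '(randomized[Title/Abstract] AND controlled[Title/Abstract] '
                  'AND trial[Title/Abstract])')

handle = Entrez.esearch(db="pubmed", term=f"({topic}) AND ({narrow_therapy})", retmax=100)
result = Entrez.read(handle)
handle.close()

print(result["Count"])       # number of hits with the narrow therapy filter applied
print(result["IdList"][:5])  # the first few PMIDs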

Google Scholar, as we all probably know, does not have such methodological filters, but the authors "limited" their search by using the Advanced option, entering the 2 search terms in the "Find articles… with all of the words" box (a Boolean AND) and restricting the search to the subject area "Medicine, Pharmacology, and Veterinary Science".

They also did a separate search for publications that were available at their own library, which is of limited value to others, since subscriptions differ per library.

Next they determined the sensitivity (the number of relevant records retrieved as a proportion of the total number of records in the gold standard) and the precision or positive predictive value, the  fraction of returned positives that are true positives (explained in 3).
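In code these two measures are just ratios; a small Python sketch, with the figures reported further down plugged in as an example, makes the trade-off explicit:

def sensitivity(relevant_retrieved, gold_standard_size):
    # recall: share of the gold-standard records that the search retrieved
    return relevant_retrieved / gold_standard_size

def precision(relevant_retrieved, total_retrieved):
    # positive predictive value: share of retrieved records that are relevant
    return relevant_retrieved / total_retrieved

# figures reported in the paper (see below)
print(f"Precision, PubMed:          {precision(59, 467):.1%}")    # ~12.6% (reported as 13%)
print(f"Precision, Google Scholar:  {precision(57, 80730):.2%}")  # ~0.07%
print(f"Recall RSV topic, PubMed:   {sensitivity(12, 12):.0%}")   # 100%
print(f"Recall RSV topic, GS:       {sensitivity(7, 12):.0%}")    # 58%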

Let me guess: sensitivity might be equal or somewhat higher in Google Scholar, but precision is undoubtedly much lower. This is because Google Scholar:

  • often searches the full text instead of just the title, abstract and (added) keywords/MeSH
  • inflates results by retrieving one and the same reference cited in many different papers (which may not directly deal with the subject)
  • cannot be limited by methodology, study type or "evidence"
  • has no automatic mapping and explosion (which in PubMed is a way to find more synonyms and thus more relevant studies)
  • has a broader coverage (grey literature, books, more topics)
  • lags behind PubMed in receiving updates from MEDLINE

Results: PubMed and Google Scholar had pretty much the same recall, except for the ribavirin and RSV topic, where recall was higher in PubMed: PubMed found 100% (12/12) of the included trials, Google Scholar 58% (7/12).

The authors offer no discussion as to why. Since Google Scholar should at least find the words in the titles and abstracts of PubMed records, I repeated the search in PubMed restricted to the title and abstract fields, i.e. ribavirin[tiab] AND respiratory syncytial virus[tiab]*, and limited it with the narrow therapy filter: I found 26 papers instead of 32. The following titles were missing when I searched title and abstract only (between square brackets: the relevant MeSH term, which is the reason the paper was found in the mapped search, plus whether the record is a letter and/or lacks an abstract, so that only title and MeSH were searchable; in bold in the original: why the terms are not found in title or abstract):

  1. Evaluation by survival analysis on effect of traditional Chinese medicine in treating children with respiratory syncytial viral pneumonia of phlegm-heat blocking Fei syndrome. [MeSH: Respiratory Syncytial Virus Infections]
  2. Ribavarin in ventilated respiratory syncytial virus bronchiolitis: a randomized, placebo-controlled trial. [MeSH: Respiratory Syncytial Virus Infections; NO ABSTRACT, LETTER]
  3. Study of interobserver reliability in clinical assessment of RSV lower respiratory illness. [MeSH: Respiratory Syncytial Virus Infections*]
  4. Ribavirin for severe RSV infection. N Engl J Med. [MeSH: Respiratory Syncytial Viruses; NO ABSTRACT, LETTER]
  5. Stutman HR, Rub B, Janaim HK. New data on clinical efficacy of ribavirin. [MeSH: Respiratory Syncytial Viruses; NO ABSTRACT]
  6. Clinical studies with ribavirin. [MeSH: Respiratory Syncytial Viruses; NO ABSTRACT]

Three of the papers had the additional MeSH term respiratory syncytial virus infections and the other three respiratory syncytial viruses. Although not all of these papers may be relevant (2 are comments/letters), it illustrates why PubMed may yield results that are not retrieved by Google Scholar (if one doesn't use synonyms).

In contrast to Google Scholar, PubMed translates the search ribavirin AND respiratory syncytial virus so that the MeSH terms "ribavirin"[MeSH], "respiratory syncytial viruses"[MeSH] and (indirectly) "respiratory syncytial virus infections"[MeSH] are also searched.

Thus, with the above-mentioned search, Google Scholar could have missed articles with terms like RSV or respiratory syncytial viral pneumonia (or with unspecific wording, like clinical efficacy).
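This check, too, can be scripted: run the mapped search and the title/abstract-restricted search through the same esearch call as in the sketch above and compare the PMID sets. Again a hedged sketch with a placeholder e-mail; current counts will be higher than the 32 versus 26 of 2010:

from Bio import Entrez

Entrez.email = "you@example.org"  # placeholder

narrow_therapy = ('randomized controlled trial[Publication Type] OR '
                  '(randomized[Title/Abstract] AND controlled[Title/Abstract] '
                  'AND trial[Title/Abstract])')

def pmids(query):
    # return the set of PMIDs for a query combined with the narrow therapy filter
    handle = Entrez.esearch(db="pubmed", term=f"({query}) AND ({narrow_therapy})", retmax=10000)
    result = Entrez.read(handle)
    handle.close()
    return set(result["IdList"])

mapped = pmids("ribavirin AND respiratory syncytial virus")            # automatic term mapping adds MeSH
tiab = pmids("ribavirin[tiab] AND respiratory syncytial virus[tiab]")  # text words in title/abstract only

print(len(mapped), len(tiab))
print(mapped - tiab)  # records found only via MeSH mapping: the ones a text-word search misses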

The other result of the study (the result section comprises 3 sentences) is that “For each individual search, PubMed had better precision”.

The precision was 59/467 (13%) in PubMed and 57/80,730 (0.07%) in Google Scholar (p<0.001)!!
(note: they had to add author names to the Google Scholar search to find the papers in the haystack 😉)

Héhéhé, how surprising. Is it any wonder that no clinician or librarian would ever think of using Google Scholar as the primary, let alone the only, source for searching medical evidence?
It should also ring a bell that [QUOTE**]:
"In the Cochrane reviews the researchers retrieved information from multiple databases, including MEDLINE, the Cochrane Airways Group trial register (derived from MEDLINE)***, CENTRAL, EMBASE, CINAHL, DARE, NHSEED, the Acute Respiratory Infections Group's specialized register, and LILACS…"
Note that Google Scholar isn't mentioned as a source! Google Scholar is only recommendable for finding work that cites (already found) relevant articles (so-called forward searching), if one has no access to Web of Science or Scopus. Thus only to catch the last fish.

Perhaps the paper would have been more interesting if the authors had looked at any ADDED VALUE of Google Scholar when exhaustively searching for evidence. Then it would have been crucial to look for grey literature too (instead of excluding it), because this could be a strong point of Google Scholar. Furthermore, one could have examined whether forward searching yielded extra papers.

The higher precision of PubMed is attributed to the narrow therapy filter used, but the vastly lower precision of Google Scholar is also due to its searching of the full text, including the reference lists.

For instance, searching for ribavirin AND respiratory syncytial virus in PubMed yields 523 hits. This can be reduced to 32 hits by applying the narrow therapy filter: a reduction by a factor of 16.
A similar search in Google Scholar yields 4,080 hits. Thus, even compared with the unfiltered PubMed search, Google Scholar returns almost 8 times as many hits.

That evokes another research idea: what would have happened if randomized (OR randomised) had been added to the Google Scholar search? Would this have increased the precision? For the above search it lowers the yield by a factor of 2, and the first hits look very relevant.

It is really funny, but the authors undermine their own conclusion that "These results are important because efficient retrieval of the best available scientific evidence can inform respiratory care protocols, recommendations for clinical decisions in individual patients, and education, while minimizing information overload" by saying elsewhere that "It is unlikely that users consider more than the first few hundred search results, so RTs who conduct literature searches with Google Scholar on these topics will be much less likely to find references cited in Cochrane reviews."

Indeed, no one would take it into their head to try to pick the relevant papers out of those 4,080 retrieved hits. So what is this study worth from a practical point of view?

Well anyway, just as you can ask for the sake of asking, you can research for the sake of researching. Despite being an EBM addict, I prefer a good subjective overview of this topic over a weak, quasi-evidence-based research paper.

Does this mean Google Scholar is useless? Does it mean that all those PhDs hooked on Google Scholar are wrong?

No, Google Scholar serves certain purposes.

Just like the example of PubMed and TRIP, you need to know what is in it for you and how to use it.

I used Google Scholar when I was a researcher:

  • to quickly find a known reference
  • to find citing papers
  • to get an idea of how often articles have been cited / to find the most relevant papers in a quick and dirty way (i.e. by browsing)
  • for quick and dirty searches, by putting word strings between quotation marks (phrase searching)
  • to search the full text. I used quite extensive searches to find out what methods were used (for instance methods AND (synonym1 OR syn2 OR syn3)). An interesting possibility is to do a second search for only the last few words of a retrieved string. This will often reveal the next words in the sentence. Often you can repeat this trick, reading a piece of the paper without needing access.

If you want to know more about the pros and cons of Google Scholar, I recommend the recent overview by the expert librarian Dean Giustini: "Sure Google Scholar is ideal for some things" [7]. He also compiled a "Google Scholar bibliography" with ~115 articles as of May 2010.

Speaking of librarians: why was the study performed by PhD RRTs (RNs), and why wasn't the university librarian involved?****

* this is a search string, and stricter than respiratory AND syncytial AND virus
** abbreviations used instead of full (database) names
*** this is wrong: such a register contains references to controlled clinical trials from EMBASE, CINAHL and all kinds of other databases in addition to MEDLINE.
**** other than to read the manuscript afterwards.

References

  1. Anders ME, & Evans DP (2010). Comparison of PubMed and Google Scholar Literature Searches. Respiratory care, 55 (5), 578-83 PMID: 20420728
  2. This Blog: https://laikaspoetnik.wordpress.com/2009/11/26/adding-methodological-filters-to-myncbi/
  3. This Blog: https://laikaspoetnik.wordpress.com/2009/01/22/search-filters-1-an-introduction/
  4. This Blog: https://laikaspoetnik.wordpress.com/2009/06/30/10-1-pubmed-tips-for-residents-and-their-instructors/
  5. NeuroDojo (2010/05) Pubmed vs Google Scholar? [also gives a nice overview of pros and cons]
  6. GenomeWeb (2010/05/10) Content versus interface at the heart of Pubmed versus Scholar? [response to 5]
  7. The Search principle Blog (2010/05) Sure Google Scholar is ideal for some things.




An Evidence Pyramid that Facilitates the Finding of Evidence

20 03 2010

Earlier I described that there are so many search and EBM pyramids that it gets confusing. I distinguished 3 categories of pyramids:

  1. Search Pyramids
  2. Pyramids of EBM-sources
  3. Pyramids of EBM-levels (levels of evidence)

In my courses, where I train doctors and medical students to find evidence quickly, I use a pyramid that is a mixture of categories 1 and 2. This is a slide from a 2007 course.

This pyramid consists of 4 layers (from top down):

  1. EBM-(evidence based) guidelines.
  2. Synopses & Syntheses*: a synopsis is a summary and critical appraisal of one article, whereas a synthesis is a summary and critical appraisal of a topic (which may answer several questions and may cover many articles).
  3. Systematic Reviews (a systematic summary and critical appraisal of original studies) which may or may not include a meta-analysis.
  4. Original Studies.

The upper 3 layers represent “Aggregate Evidence”. This is evidence from secondary sources, that search, summarize and critically appraise original studies (lowest layer of the pyramid).

The layers do not necessarily represent the levels of evidence and should not be confused with Pyramids of EBM-levels (type 3). An Evidence Based guideline can have a lower level of evidence than a good systematic review, for instance.
The present pyramid is only meant to lead the way through the labyrinth of sources, and thus to speed up the process of searching. The relevance and the quality of the evidence should always be checked.

The idea is:

  • The higher the level in the pyramid, the fewer publications it contains (the narrower it becomes)
  • Each level summarizes and critically appraises the underlying levels.

I advise people to try to find aggregate evidence first, thus to drill down (hence the drill in the figure).

The advantage: faster results and a lower number needed to read (NNR).

During the first courses I gave, I just made a pyramid in Word with the links to the main sources.

Our library ICT department converted it into a HTML document with clickable links.

However, although the pyramid looked quite complex, not all main evidence sources were included, and some sources belong to more than one layer. The Trip Database, for instance, searches sources from all layers.

Our ICT-department came up with a much better looking and better functioning 3-D pyramid, with databases like TRIP in the sidebar.

Moving the  mouse over a pyramid layer invokes a pop-up with links to the databases belonging to that layer.

Furthermore the sources included in the pyramid differ per specialty. So for the department Gynecology we include POPLINE and MIDIRS in the lowest layer, and the RCOG and NVOG (Dutch) guidelines in the EBM-guidelines layer.

Together my colleagues and I decide whether a source is evidence based (we don’t include UpToDate for instance) and where it  belongs. Each clinical librarian (we all serve different departments) then decides which databases to include. Clients can give suggestions.

Below is a short YouTube video showing how this pyramid can be used. Because of the rather poor quality, the video is best viewed in full-screen mode.
There is no audio (yet), so in short this is what you see:

Made with Screenr:  http://screenr.com/8kg

The pyramid is highly appreciated by our clients and students.

But it is just a start. My dream is to visualize the entire pathway from question to PICO, checklists, FAQs and database of results per type of question/reason for searching (fast question, background question, CAT etc.).

I’m just waiting for someone to fulfill the technical part of this dream.

————–

*Note that there may be different definitions as well. The top layers in the 5S pyramid of Brian Haynes are defined as follows: syntheses & synopses (succinct descriptions of selected individual studies or systematic reviews, such as those found in the evidence-based journals), summaries, which integrate the best available evidence from the lower layers to develop practice guidelines based on a full range of evidence (e.g. Clinical Evidence, National Guidelines Clearinghouse), and, at the peak of the model, systems, in which the individual patient's characteristics are automatically linked to the current best evidence that matches the patient's specific circumstances and the clinician is provided with key aspects of management (e.g. computerised decision support systems).

Begin with the richest source of aggregate (pre-filtered) evidence and work your way down, in order to decrease the number needed to read: there are fewer EBM guidelines than there are systematic reviews and (certainly) fewer than individual papers.




Searching Skills Toolkit. Finding the Evidence [Book Review]

4 03 2010

Most books on Evidence Based Medicine pay little attention to the first two steps of EBM: asking focused, answerable questions and searching for the evidence. Being able to appraise an article but not being able to find the best evidence can be challenging and frustrating for busy clinicians.

"Searching Skills Toolkit: Finding The Evidence" is a pocket-sized book that aims to teach clinicians how to search for evidence. It is the third toolkit book in the series edited by Heneghan et al. (author of the CEBM blog Trust the Evidence). The authors, Caroline de Brún and Nicola Pearce-Smith, are experts in searching (a librarian and an information scientist, respectively).

According to the description at Wiley's, the distinguishing feature of this searching skills book is its user-friendliness. "The guiding principle is that readers do not want to become librarians, but they are faced with practical difficulties when searching for evidence, such as lack of skills, lack of time and information overload. They need to learn simple search skills, and be directed towards the right resources to find the best evidence to support their decision-making."

Does this book give guidance that makes searching for evidence easy? Is this book the ‘perfect companion’ to doctors, nurses, allied health professionals, managers, researchers and students, as it promises?

I find it difficult to answer, partly because I'm not a clinician and partly because, being a medical information specialist myself, I would often tackle a search differently.

The booklet is pocket-sized and easy to take along. The layout is clear and pleasant. The approach is original and practical. Despite its small size, the booklet contains a wealth of information. Table 1, for instance, gives an overview of truncation symbols, wildcards and Boolean operators for Cochrane, Dialog, EBSCO, OVID, PubMed and WebSPIRS (see photo). And although this is mouth-watering for many medical librarians, one wonders whether such detailed information is really useful for the clinician.

Furthermore, 34 of the 102 pages (one third) are devoted to searching specific healthcare databases. IMHO, of these databases only PubMed and the Cochrane Library are useful to the average clinician. In addition, most of the screenshots of the individual databases are too small to read. And due to the PubMed redesign, the PubMed description is no longer up to date.

The readers are guided to the chapters on searching by asking themselves beforehand:

  1. The time available to search: 5 minutes, an hour or time to do a comprehensive search. This is an important first step, which is often not considered by other books and short guides.
    Primary sources, secondary sources and ‘other’ sources are given per time available. This is all presented in a table with reference to key chapters and related chapters. These particular chapters enable the reader to perform these short, intermediate or long searches.
  2. What type of publication he is looking for: a guideline, a systematic review, patient information or an RCT (with tips where to find them).
  3. Whether the query is about a specific topic, i.e. drug or safety information or health statistics.

All useful information, but I would have discussed topic 3 before covering EBM, because this doesn’t fit into the ‘normal’ EBM search.  So for drug information you could directly go to the FDA, WHO or EMEA website. Similarly, if my question was only to find a guideline I would simply search one or more guideline databases.
Furthermore, it would be easier to stack the short, intermediate and long searches on top of each other instead of presenting them side by side. The basic principle would be (in my opinion at least) to start with a PICO, to (almost) always search secondary sources first (fast search), to search for primary publications (original research) in PubMed if necessary, and to broaden the search to other databases (broad search) for exhaustive searches. This is easy to remember, even without the schemes in the book.

Some minor points. There is an overemphasis on UK sources. Thus the first source listed for finding guidelines is the (UK) National Library of Guidelines, where I would put the National Guideline Clearinghouse (or the TRIP database) first. And why is MedlinePlus not included as a source for patients, whereas NHS Choices is?

There is also an overemphasis on interventions. How PICO’s are constructed for other domains (diagnosis, etiology/harm and prognosis) is barely touched upon. It is much more difficult to make PICOs and search in these domains. More practical examples would also have been helpful.

Overall, I find this book very useful. The authors are clearly experts in searching and they fill a gap in the market: there is no comparable book on “the searching of the evidence”. Therefore, despite some critique and preferences for another approach, I do recommend this book to doctors who want to learn basic searching skills. As a medical information specialist I keep it in my pocket too: just in case…

Overview

What I liked about the book:

  • Pocket size, easy to take along
  • Well written
  • Clear diagrams
  • Broad coverage
  • Good description of (many) databases
  • Step-by-step approach

What I liked less about it:

  • Screen dumps are often too small to read and therefore not useful
  • Emphasis on UK sources
  • Domains other than "therapy" (etiology/harm, prognosis, diagnosis) are barely touched upon
  • Too few clinical examples
  • A too strict division into short, intermediate and long searches: these are not intrinsically different

The Chapters

  1. Introduction.
  2. Where to start? Summary tables and charts.
  3. Sources of clinical information: an overview.
  4. Using search engines on the World Wide Web.
  5. Formulating clinical questions.
  6. Building a search strategy.
  7. Free text versus thesaurus.
  8. Refining search results.
  9. Searching specific healthcare databases.
  10. Citation pearl searching.
  11. Saving/recording citations for future use.
  12. Critical appraisal.
  13. Further reading by topic or PubMed ID.
  14. Glossary of terms.
  15. Appendix 1: Ten tips for effective searching.
  16. Appendix 2: Teaching tips

References

  1. Searching Skills Toolkit – Finding The Evidence (Paperback, 2009/02/17) by Caroline De Brún and Nicola Pearce-Smith; Carl Heneghan et al. (Editors). Wiley-Blackwell / BMJ Books
  2. Kamal R Mahtani Evid Based Med 2009;14:189 doi:10.1136/ebm.14.6.189 (book review by a clinician)





When more is less: Truncation, Stemming and Pluralization in the Cochrane Library

5 01 2010

I'm on two mailing lists of the Cochrane Collaboration: the TSC-list (TSC = Trials Search Coordinator) and the IRMG-list (IRMG = Information Retrieval Methods Group of the Cochrane). Sometimes difficult search problems are posted on these lists, and it is challenging to try to find the solutions. I can't remember a case where a solution was not found.

A while ago a member of the list was puzzled why he got the following retrieval result from the Cochrane Library:

ID   Search                   Hits
#1   (breast near tumour*)     254
#2   (breast near tumour)      640
#3   (breast near tumor*)      428
#4   (breast near tumor)       640

where near = adjacent (thus breast should be directly before tumour) and the asterisk (*) is the truncation symbol: an asterisk at the end of a word retrieves all terms that begin with that word root. Thus tumour* should find tumours as well as tumour, and so broaden the search.

The results are odd, because #2 (without truncation) gives more hits than #1 (with truncation), and the same is true for #4 versus #3. One would expect truncation to give more results. What could be the reason behind it?

I suspected the problem had to do with the truncation. I searched for breast and tumour with and without truncation (#1 to #4) and only tumour* gave odd results: tumour* gave far fewer results than tumour. (To exclude that it had to do with the fields being searched, I searched only the fields ti (title), ab (abstract) and kw (keywords).)

Records found with tumour but not with tumour* contained the word tumor (not shown). Thus tumour automatically also searches for tumor (and vice versa). This process is called stemming.

According to the Help-function of the Cochrane Library:

Stemming: The stemming feature within the search allows words with small spelling variants to be matched. The term tumor will also match tumour.

In addition, as I realized later, the Cochrane Library has pluralization and singularization features.

Pluralization and singularization matches: Pluralized forms of words also match singular versions, and vice versa. The term drugs will find both drug and drugs. To match just the singular or the plural form of a term, use an exact match search and include the word in quotation marks.

Indeed, (tumor* OR tumour*) (or, in short, tumo*r*) retrieves a little more than tumor OR tumour: words like tumoral, tumorous and tumorectomy. Not particularly useful, although it is not necessarily disadvantageous when used adjacent to breast, as this will filter out most of the noise.

[Screenshot: tumor spelling variants searched in the title (ti) only: it doesn't matter how you spell tumor (#8, #9, #10, #11), as long as you don't truncate (while using a single variant)]

Thus stemming, pluralization and singularization only work without truncation. When you truncate, you should add the spelling variants yourself, because stemming and pluralization no longer take place. Truncation remains useful if you're interested in other word variants that are not automatically accounted for.
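To make that rule concrete, here is a toy Python sketch of the behaviour described above. It is purely illustrative, not the Cochrane Library's actual matching code, and its variant table only knows the tumor/tumour pair:

# toy illustration of the rule: truncation switches stemming/pluralization off
STEM_VARIANTS = {"tumor": {"tumor", "tumour"}, "tumour": {"tumor", "tumour"}}

def matches(term, word):
    if term.endswith("*"):                        # truncation: literal prefix match only
        return word.startswith(term[:-1])
    forms = set(STEM_VARIANTS.get(term, {term}))  # stemming maps spelling variants...
    forms |= {f + "s" for f in forms}             # ...pluralization adds plural forms
    return word in forms

print(matches("tumour", "tumor"))     # True:  stemming kicks in
print(matches("tumour*", "tumor"))    # False: truncation switched stemming off
print(matches("tumour*", "tumours"))  # True:  literal prefixes still match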

Put another way: knowing that stemming and pluralization take place, you can simply search for the singular or plural form, in American or British spelling. So breast near tumor (or simply breast tumor) would have been fine. This is why these features were introduced in the first place. 😉

By the way, truncation and stemming (but not pluralization) are also features in PubMed. And this can give similar and other problems. But this will be dealt with in another blogpost.





Medlibs Round 1.8 at Highlight Health

14 11 2009

For those that haven’t yet seen it:

The MedLib's Round, the monthly blog carnival that highlights some of the best writing on medical librarianship (encompassing all stages in the publication and dissemination of medical information: writing, publishing, searching, citing, managing and social networking), is up at Highlight Health (link).

The theme of this incredible 8th edition is: Finding Credible Health Information Online.

Walter Jessen introduces the round as follows:

There’s a revolution occurring on the Web: those “authoritative” articles written on traditional, static websites are being replaced with blogs, wikis and online social networks. In the sphere of health, medicine and information technology, this “real-time Web” consists of many who are experts in the field; these are their posts listed below.
In the digital age, these are the characteristics of new media: recent, relevant, reachable and reliable.

Subjects: "Searching the Web for health information", "Biomedical research", "Web 2.0 tools", "PubMed Redesigned" and "Social media and participatory medicine", with contributions from Women's Health News, Our Bodies Our Blog [@rachel_w]*, Emerging Technologies Librarian [@pfanderson], Musings of a Distractible Mind [@doc_rob], Laika's MedLibLog [@ericrumsey, Janet Wale, @Laikas], Significant Science [@hleman], Websearch Guide Internet News [Gwen Harris], Alisha764's Blog [@alisha764], Next Generation Science [@NextGenScience], Dr Shock MD PhD [@DrShock], Life in the Fast Lane [@sandnsurf], Knowledge Beyond Words [@novoseek], Eagle Dawg Blog [@eagledawg], The Search Principle Blog [@giustini], Krafty Librarian [@Krafty], Dose of Digital [@jonmrich], e-Patients.net [@SusannahFox] and Highlight HEALTH [@HighlightHEALTH].

Walter Jessen [@wjjessen] concluded the blog carnival with a great presentation by Kevin Clauson [@kevinclauson] on the role of Facebook and Twitter in pharmacy and the development of participatory medicine. Since I intended to show this presentation anyway, I might as well place it here 😉 :

Please enjoy reading the blog carnival at Highlight Health.

The host of the next edition of MedLib’s Round will be Knowledge Beyond Words (http://blog.novoseek.com). Valentin of Novoseek invites you to start submitting through this form http://blogcarnival.com/bc/submit_6092.html

Past and future hosts can be found on the Medlib’s Archive.

*links refer to the Twitter addresses.

Reblog this post [with Zemanta]




Finding Skin Disease Pictures on the Web

10 11 2009

Guest author: Eric Rumsey (@ericrumsey on Twitter)
Librarian and Web Developer at University of Iowa
Creator and Keeper of Hardin MD

——————————————————————————————-

When looking for skin disease pictures on the Web, the first step is to search for the specific disease terms of interest in Google Image Search. You will likely find something, but don’t assume that it comes close to being everything — Very likely it doesn’t! In my experience, it will have somewhere in the range of 10-30% of everything on the Web. In particular, it will not have images from what I consider to be the single most comprehensive, reliable site for skin disease pictures — DermNet.com, by  Alan N. Binnick & Thomas P. Habif, Dartmouth Medical School.

Though Dermnet.com is a large site, with high-quality pictures, it does not appear in Google Image Search, apparently because the tagging/metadata is so sparse. Indeed, the pictures on the site are virtually without any accompanying text. They are classed by disease, but not by any other characteristics, e.g. age, gender, or anatomical region.

A relatively small subset of the images in Dermnet.com are included in Hardin MD, where the tagging/metadata is more complete, making them easier to search. These images are included by special arrangement with people at Dermnet, who have given us permission to include them in Hardin MD.





Grey Literature: Time to make it systematic

6 09 2009

Guest author: Shamsha Damani (@shamsha)

Grey literature is a term I first encountered in library school; I remember dubbing it “the-wild-goose-chase search” because it is time consuming, totally un-systematic, and a huge pain altogether. Things haven’t changed much in the grey literature arena, as I found out last week, when my boss asked me to help with the grey literature part of a systematic review.

Let me back up a bit and offer the official definition for grey literature by the experts of the Grey Literature International Steering Committee: “Information produced on all levels of government, academics, business and industry in electronic and print formats not controlled by commercial publishing i.e. where publishing is not the primary activity of the producing body.” Grey literature can include things such as policy documents, government reports, academic papers, theses, dissertations, bibliographies, conference abstracts/proceedings/papers, newsletters, PowerPoint presentations, standards/best practice documents, technical specifications, working papers and more! (Benzies et al 2006). So what is so time consuming about all this? There is no one magic database that will search all these at once. Translation: you have to search a gazillion places separately, which means you have to learn how to search each of these gazillion websites/databases separately. Now if doing searches for systematic reviews is your bread-and-butter, then you are probably scoffing already. But for a newbie like me, I was drowning big time.

After spending what seemed like an eternity to finish my search, I went back to the literature to see why inclusion of grey literature was so important. I know that grey literature adds to the evidence base and results in a comprehensive search, but it is often not peer-reviewed, and the quality of some of the documents is often questionable. So what I dug up was a bit surprising. The first was a Cochrane Review from 2007 titled “Grey literature in meta-analyses of randomized trials of health care interventions (review).” The authors concluded that not including grey literature in meta-analyses produced inflated results when looking at treatment effects. So the reason for inclusion of grey literature made sense: to reduce publication bias. Another paper published in the Bulletin of the World Health Organization concluded that grey literature tends to be more current, provides global coverage, and may have an impact on cost-effectiveness of various treatment strategies. This definitely got my attention because of the new buzzword in Washington: Comparative Effectiveness Research (CER). A lot of the grey literature is comprised of policy documents so it definitely has a big role to play in systematic reviews as well. However, the authors also pointed out that there is no systematic way to search the grey literature and undertaking such a search can be very expensive and time consuming. This validated my frustrations, but gave no solutions.

When I was struggling to get through my search, I was delighted to find a wonderful resource from the Canadian Agency for Drugs and Technologies in Health. They have created a document called “Grey Matters: A Practical Search Tool for Evidence-Based Medicine”, which is a 34-page checklist of many of the popular websites for searching grey literature, including a built-in documentation system. It was still tedious work because I had to search a ton of places, many resulting in no hits. But at least I had a start and a transparent way of documenting my work.

However, I’m still at a loss for why there are no official guidelines for librarians to search for grey literature. There are clear guidelines for authors of grey literature. Benzies and colleagues give compelling reasons for inclusion of grey literature in a systematic review, complete with a checklist for authors! Why not have guidelines for searching too? I know that every search would require different tools; but I think that a master list can be created, sort of like a must-search-these-first type of a list. It surely would help a newbie like me. I know that many libraries have such lists but they tend to be 10 pages long, with bibliographies for bibliographies! Based on my experience, I would start with the following resources the next time I encounter a grey literature search:

  1. National Guideline Clearinghouse
  2. Centre for Reviews and Dissemination
  3. Agency for Healthcare Research and Quality (AHRQ)
  4. Health Technology Assessment International (HTAI)
  5. Turning Research Into Practice (TRIP)

Some databases like Mednar, Deep Dyve, RePORTer, OAIster, and Google Scholar also deserve a mention but I have not had much luck with them. This is obviously not meant to be an exhaustive list. For that, I present my delicious page: http://delicious.com/shamsha/greylit, which is also ever-growing.

Finally, a request for the experts out there: if you have any tips on how to make this process less painful, please share it here. The newbies of the world will appreciate it.

Shamsha Damani

Clinical Librarian





MedLib’s Round 1.3

8 04 2009

The 3rd Medlib’s Round, a blog carnival of medical-library related blogposts, is up at First Person Narrative. Anne Welsh did a great job pulling together an interesting collection of posts.

From Anne’s introduction

This month’s theme was “evidence” – not just in the terms of “Evidence Based Medicine” but in the widest possible sense. Evidence is a hot topic in the UK at the moment – indeed, the National Library for Health (NLH) is to be relaunched at the end of this month as NHS Evidence, “a web-based service that will help people find, access and use high-quality clinical and non-clinical evidence and best practice.”

Please have a look at the First Person Narrative and enjoy reading.

Want to stay informed? You can take out an RSS subscription to the MedLib's Round. An aggregated feed of credible, rotating health and medicine blog carnivals is also available (thanks, Walter Jessen).

**************************************************************

The Next MedLib’s Round will be hosted by Nicole S. Dettmar at Eagle Dawg Blog. Nikki is a medical librarian at the National Network of Libraries of Medicine (NN/LM). The main theme will be PubMed or 3rd party PubMed tools. Post addressing this subject will get extra emphasis.

You can submit the permalink (URL) of a post you have already written on your blog via the Blog Carnival submission form (you have to log in, scroll down (!), submit links to selected posts and give an optional description). Don't forget to submit before Saturday May 2, 2009, around midnight (EST).

Perhaps you would like to host a future edition as well. If so, please inform me which edition (June, July or August) you would like to host.

Further Reading:





Advanced Neuritis in PubMed

8 03 2009

Almost a year ago (June 2008) I discussed PubMed's Advanced Search Beta in a series entitled PubMed: Past, Present and Future. At that time I disliked the Advanced Search Beta, and I still do.

In November last year some of its features improved: a Clear button was added, there are Focused Queries with links to the Clinical Queries and Special Queries pages, and the author/journal search was extended with optional fields so that it looks more like the valuable Single Citation Matcher in the blue sidebar of the basic PubMed page. There is also a link to the MeSH database (see NLM Technical Bulletin, November 2008).
Although these are real improvements, the links to the Queries and to the MeSH database are inconspicuous, at the end of the page below all kinds of limits. My major objections to the Advanced Search are that people are more inclined to narrow their search by using as many limits as possible (because these are so prominently present) and that MeSH terms cannot easily be looked up and/or are wrongly translated. Previously I gave some examples, where lung cancer[mesh] was searched whereas the MeSH term is lung neoplasms, or where recurrent pregnancy loss[MeSH] returned no results because the term is habitual abortion (see previous post).

I avoid the Advanced Search as long as I can, but the problem is that library users don't. They like to experiment, especially when they consider themselves advanced searchers.

Last month a neurologist asked me if I could check his search for a diagnostic systematic review. A search for a systematic review should be comprehensive and thus contain both MeSH terms (the controlled vocabulary of MEDLINE) and free-text words (tw).

He had been a resident in Neurology for 5 years and knew how to search PubMed.

Below is the first part of his search.

((((((((motor neuropathy[MeSH Terms] OR motor neuron[tw] OR motor neuropathy[tw]) OR multifocal motor neuropathy[tw]) OR demyelinating neuropathy[tw]) OR multifocal demyelinating motor neuropathy[tw]) OR neuropathy[tw]) OR neuropathies[tw]) AND (((((((((((((((((((((((((…..

Grosso modo it looked all right and well structured. The awful number of brackets is often seen when people combine their search sets directly in PubMed (although I was already glad there were no brackets around every single word and he hadn't copied the entire translation from the Details tab). And some terms were superfluous: you don't have to search for multiword terms containing neuropathy (e.g. motor neuropathy), because these are already found by searching neuropathy.

So we made the search simpler, like this:

(motor neuropathy[MeSH Terms] OR motor neuron[tw] OR neuropathy[tw] OR neuropathies[tw]) AND (………

Just to be sure I asked him: “Do you mind if we check the MeSH? Motor Neuropathy looks just fine, but you never know.”

To my surprise, typing motor neuropathy in the MeSH search bar yielded 4 suggestions, none of which was motor neuropathy.

[Screenshot: MeSH database suggestions for "motor neuropathy"]

The most suitable term appeared to be Neuritis. When taking this MeSH term to PubMed, we got exactly the same number of hits as with motor neuropathy. Mere coincidence? No, the hits were no different (#1 NOT #4 gave zero results).

[Screenshot: PubMed search history comparing motor neuropathy[MeSH] and neuritis[MeSH]]

Looking up the query translation under the Details tab confirmed my suspicion: motor neuropathy[mesh] was translated as "neuritis"[MeSH]. This is disturbing. Not only is there no MeSH term specific for motor neuropathy, but people are also put on the wrong track, since it looks as if motor neuropathy[mesh] is recognized as such.

[Screenshot: the Details tab showing the query translation]
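You can also inspect this translation programmatically: the esearch E-utility returns the same query translation that the Details tab shows. A minimal Biopython sketch (placeholder e-mail; whether the query still maps to neuritis today depends on the current MeSH mappings):

from Bio import Entrez

Entrez.email = "you@example.org"  # placeholder

handle = Entrez.esearch(db="pubmed", term="motor neuropathy[MeSH Terms]")
result = Entrez.read(handle)
handle.close()

print(result["Count"])             # number of hits
print(result["QueryTranslation"])  # how PubMed actually interpreted the query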

Then it occurred to me that I had seen a similarly odd "translation" when using PubMed Advanced Search (see above). So I asked him: "Did you by any chance use the Advanced Search?" He had.

To check this I searched in Advanced Search for the MeSH term motor neuropathy. And yes indeed, motor neuropathy[MeSH] was searched, or so it seemed (in reality, we now know, Neuritis was searched). The difference with searching the MeSH database is that there I know I am searching for neuritis (I choose to), whereas the Advanced Search misleads me by suggesting I'm searching for motor neuropathy.

[Screenshots: Advanced Search with motor neuropathy entered as a MeSH term, and the resulting search]

Why do I bother? Why not just use motor neuropathy[mesh]? First, because I don't get what I want: I get neuritis[mesh], not neuropathy! Second, and most importantly, because it is not the most appropriate MeSH term.

To find more appropriate MeSH terms I use a trick: I look for the MeSH terms assigned to articles with motor neuropathy in their title, assuming that motor neuropathy is an important aspect of those papers.

Although you can look up the MeSH terms assigned to each individual citation in PubMed in the Citation display format, it takes a lot of time to go through the papers one at a time. Therefore I prefer to use GoPubMed or, even better, PubReMiner for this purpose, because these give you a frequency list of the MeSH terms assigned.
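The same trick can be scripted with the E-utilities: fetch the MEDLINE records of articles with motor neuropathy in the title and tally their MeSH headings. A hedged sketch with Biopython (placeholder e-mail; GoPubMed and PubReMiner do this, and more, out of the box):

from collections import Counter
from Bio import Entrez, Medline

Entrez.email = "you@example.org"  # placeholder

handle = Entrez.esearch(db="pubmed", term="motor neuropathy[ti]", retmax=500)
id_list = Entrez.read(handle)["IdList"]
handle.close()

handle = Entrez.efetch(db="pubmed", id=id_list, rettype="medline", retmode="text")
mesh_counts = Counter(
    heading.split("/")[0].lstrip("*")   # drop subheadings and the major-topic asterisk
    for record in Medline.parse(handle)
    for heading in record.get("MH", [])
)
handle.close()

for mesh, n in mesh_counts.most_common(10):
    print(n, mesh)                      # candidate MeSH terms for the search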

Of the 379 hits found in GoPubMed, 219 were categorized as Motor Neuron Disease, 153 as Demyelinating Diseases and 145 as Polyneuropathies. These categories are MeSH terms you can use for your search.

[Screenshot: GoPubMed MeSH frequency list for "motor neuropathy"]

Similarly, of the 380 references found in PubReMiner, many papers were indexed with Motor Neuron Disease, Demyelinating Diseases, Polyneuropathies, Peripheral Nervous System Diseases and Motor Neurons.

(Below are the numbers of papers indexed with the indicated MeSH terms in PubReMiner; PubReMiner shows the subheading coupled to the MeSH term.)

  • 65 Motor Neuron Disease/diagnosis
  • 32 Motor Neurons/physiology
  • 26 Demyelinating Diseases/diagnosis
  • 16 Peripheral nervous system diseases/diagnosis
  • 8 Polyneuropathies

Using this approach we were able to set up a more complete search in PubMed. Remember, it was the neurologist's purpose to do an exhaustive search; for a less exhaustive search we would only have used motor neuropath* and perhaps motor neuron disease[mesh].

How different it is when you use the OVID interface to search MEDLINE.

When you type Motor Neuropathy, several MeSH terms are suggested, many of which are useful:

[Screenshot: OVID MeSH mapping suggestions for "motor neuropathy"]

When you click on Motor Neuron Disease, you see the hierarchical context and can choose which terms you would like to add. We chose not to explode Motor Neuron Disease, but to include only one narrower term in our search: amyotrophic lateral sclerosis.

[Screenshot: the MeSH tree for Motor Neuron Disease in OVID]

Finally the first part of the search in MEDLINE (OVID) looked like this. It is rather broad but the second part of the search (not shown) puts it into context.

1. motor neuron disease/ or amyotrophic lateral sclerosis/
2. exp Motor Neurons/
3. Demyelinating Diseases/
4. neuromuscular diseases/ or peripheral nervous system diseases/ or neuritis/ or polyneuropathies/
5. (neuropathy or neuropathies).tw.
6. motor neuron*.tw.
7. or/1-6

OVID MEDLINE was easier to use: you get what you see (and want), and the search is easier to save and edit. Furthermore, the entire MEDLINE search can easily be transformed into an EMBASE search: just replace the MeSH terms by EMBASE keywords.

I’m not happy with the Advanced Search for reasons explained above. I don’t find the altered mapping and citation sensor a success either. I don’t like that they removed the blue side bar in some display formats. And I’m really getting depressed by NLM’s announcement (November 2008):

PubMed Advanced Search will soon no longer be a beta site. It is now the place to go to use features such as field searching and limits. In the near future the tabs for Limits, Preview/Index, History, Clipboard, and Details will be removed from the basic PubMed pages. History, Limits, Index of Fields, and a link to Details are available from the Advanced Search screen. A link for the Clipboard appears to the right of the search box on the PubMed screen when the Clipboard has content.

If I understand it correctly, this means that the PubMed Advanced Search is taking over from the basic search.

It looks as if my original idea was right: PubMed is going for the masses; it is going for Google-like quick searches by people who don't know much about MEDLINE and don't want to learn it. But you have to know some basic principles to get the most out of subject searching. It is such a pity that PubMed tries to copy its clones, whereas it holds all the trumps. No 3rd-party tool offers the same possibilities that PubMed offers, although some are more suitable for certain purposes (see the GoPubMed and PubReMiner examples above).

At least make two interfaces, one for the beginner (the present Advanced Search) and one for librarians and other people doing subject searches.

But I am under no illusion that the people at PubMed/NLM will listen to me, and I'm not going to contact them a third time. PubMed's route is set, I guess.