Silly Sunday #29 World Cup 2010-Twitter Reports of England’s Loss.

27 06 2010

The 2010 Soccer World Cup started 2 weeks ago. For now I only follow the Dutch team live, but indirectly I follow many other matches via Twitter. It is very entertaining, especially if things go awry, like the way the English were crushed by the Germans today (1:4). This was partly due to the referee, who ruled out a legitimate goal by Frank Lampard that would have made it 2:2.

Below are some of the tweets in my timeline. I especially like @precordialthump’s comparison of the English knock-out with apoptosis.

@Precordialthump opens with the best Fawlty Towers fragment: “Don’t mention the war”. I can’t resist showing the fragment here.

And don’t miss the pic: “It wasn’t a goal” (via nutrigenomics)

  1. Maria Wolters
    mariawolters PHEW! #ger AND #gha are through! Go Ghana, go Africa! Now on to #ger / #eng. Mwahahaha …. #fb
  2. precordialthump
  3. precordialthump
    precordialthump Oh my god!!!! Come on England – 1966 in reverse!!!
  4. Sally Church
    MaverickNY @SallyWalker exactly kind of gobsmacked. If they ditch all the bad refs there won’t be any left for the final tho
  5. Maria Wolters
    mariawolters at least #eng will be spared the excruciating penalty shootout this time #brightside #schlaaaaand #fb
  6. Richard Herring
    Herring1967 I blame our 12th invisible player. Everyone keeps passing to him and then he fucks it up.
  7. Theodor Adorno
    TW_Adorno Your team qualified with ease under a Labour Govt and have struggled in every game under the Conservatives. How could this be?
  8. Stephanie Merritt
    thestephmerritt Is this happening because they’ve cut the defence budget? #ididafootballjoke
  9. precordialthump
    precordialthump The England team’s performance turned out to be the World Cup football equivalent of apoptosis… well done, Germany.
  10. Sally Church
    MaverickNY @whydotpharma not sure which was worse: refereeing, #eng or american tv commentary. Probably the last one was most clueless.
  11. jdc 325
    jdc325 Watched the England game with my Dad. My summary: what a shit waste of time. I could have gone for a walk or read a book.
  12. Nutrigenomics
    nutrigenomics Ha RT @biomatushiq: [pretty fast] ROFL RT @sotak: It wasn’t a goal! [pic] http://bit.ly/aHon2g #worldcup #eng #ger
  13. Daft-bint
    TheMarydoll Just been announced that the england team are flying back to glasgow airport so they can get a hero’s welcome.
  14. Laika (Jacqueline)
    laikas RT @BrettAwesome: Breaking News: England have a new coach. It takes them to the airport in 15 minutes.
  15. Maria Wolters

this quote was brought to you by quoteurl





Will Nano-Publications & Triplets Replace The Classic Journal Articles?

23 06 2010

“Libraries and journal articles as we know them will cease to exist,” said Barend Mons at the symposium in honor of our library’s 25th anniversary (June 3rd). “Possibly we will have another kind of party in another 25 years…” he continued, grinning.

What he had to say the next half hour intrigued me. And although I had no pen with me (it was our party, remember), I thought it was interesting enough to devote a post to it.

I’m basing this post not only on my memory (we had a lot of Italian wine at the buffet), but also on an article Mons referred to [1], a Dutch newspaper article [2], other articles [3-6] and PowerPoint presentations [7-9] on the topic.

This is a field I know little about, so I will try to keep it simple (also for my sake).

Mons started by touching on a problem that is very familiar to doctors, scientists and librarians: information overload by a growing web of linked data.  He showed a picture that looked like the one at the right (though I’m sure those are Twitter Networks).

As he said elsewhere [3]:

(..) the feeling that we are drowning in information is widespread (..) we often feel that we have no satisfactory mechanisms in place to make sense of the data generated at such a daunting speed. Some pharmaceutical companies are apparently seriously considering refraining from performing any further genome-wide association studies (… whole genome association –…) as the world is likely to produce many more data than these companies will ever be able to analyze with currently available methods .

With the current search engines we have to do a lot of digging to get the answers [8]. Computers are central to this digging, because there is no way people can stay updated, even in their own field.

However, computers can’t deal with the current web and with scientific information as produced in classic articles (even the electronic versions), for the following reasons:

  1. Homonyms. Words that sound or are spelled the same but have a different meaning. Acronyms are notorious in this respect. Barend gave PSA as an example, but, without realizing it, he used a better example: PPI. This means Proton Pump Inhibitor to me, but apparently Protein-Protein Interaction to him.
  2. Redundancy. To keep journal articles readable we often use different words to denote the same thing. These do not add to the real new findings in a paper. In fact the majority of digital information is duplicated repeatedly. For example, “Mosquitoes transfer malaria” is a factual statement repeated in many consecutive papers on the subject.
  3. The connection between words is not immediately clear (for a computer). For instance, TNF inhibitors can be used to treat skin disorders, but the same drugs can also cause them.
  4. Data are not structured beforehand.
  5. Weight: some “facts” are “harder” than others.
  6. Not all data are available or accessible. Many data are either not published (e.g. negative studies), not freely available or not easy to find.  Some portals (GoPubmed, NCBI) provide structural information (fields, including keywords), but do not enable searching full text.
  7. Data are spread. Data are kept in “data silos” not meant for sharing [8](ppt2). One would like to simultaneously query 1000 databases, but this would require semantic web standards for publishing, sharing and querying knowledge from diverse sources…..

In a nutshell, the problem is as Barend put it: “Why bury data first and then mine it again?” [9]

Homonyms, redundancy and connection can be tackled, at least in the field Barend is working in (bioinformatics).

Different terms denoting the same concept (i.e. synonyms) can be mapped to a single concept identifier (i.e. a list of synonyms), whereas identical terms used to indicate different concepts (i.e. homonyms) can be resolved by a disambiguation algorithm.
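To make the idea of concept mapping a bit more concrete, here is a minimal sketch in Python. It is not the software Mons and colleagues use: the concept identifiers (C1, C2, C3), the cue words and the "pick the sense with the most matching context words" rule are all assumptions made purely for illustration.

```python
# Illustrative concept table. The identifiers and cue words are invented;
# only the terms (ESR1, ERalpha, PPI) are taken from the post.
CONCEPTS = {
    "C1": {"synonyms": {"esr1", "eralpha", "estrogen receptor 1"},
           "cues": set()},
    "C2": {"synonyms": {"ppi", "proton pump inhibitor"},
           "cues": {"omeprazole", "reflux", "gastric", "acid"}},
    "C3": {"synonyms": {"ppi", "protein-protein interaction"},
           "cues": {"binding", "complex", "pathway", "protein"}},
}


def candidates(term):
    """Return every concept id that lists this term as a synonym."""
    t = term.lower()
    return [cid for cid, c in CONCEPTS.items() if t in c["synonyms"]]


def resolve(term, context=""):
    """Map a term to one concept id, using context words for homonyms."""
    cands = candidates(term)
    if not cands:
        return None
    if len(cands) == 1:
        return cands[0]  # plain synonym lookup, nothing to disambiguate
    words = set(context.lower().split())
    # Naive disambiguation: the sense sharing most cue words with the context.
    return max(cands, key=lambda cid: len(CONCEPTS[cid]["cues"] & words))


print(resolve("ERalpha"))
print(resolve("PPI", "the patient takes a PPI for gastric reflux"))
print(resolve("PPI", "a novel PPI in the same protein complex"))
```

Synonyms collapse onto one identifier, while the homonym PPI ends up at a different identifier depending on the surrounding words. A real disambiguation algorithm would of course be far more careful than this word-overlap count.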

The shortest meaningful sentence is a triplet: a combination of subject, predicate and object. A triplet indicates the connection and its direction. “Mosquitoes cause/transfer malaria” is such a triplet, where mosquitoes and malaria are concepts. In the field of proteins: “UNIPROT 05067 is a protein” is a triplet (where UNIPROT 05067 and protein are concepts), as are “UNIprotein 05067 is located in the membrane” and “UNIprotein 0506 interacts with UNIprotein 0506” [8]. Since these triplets (statements) derive from different databases, consistent naming and availability of information are crucial to find them. Barend and colleagues are the people behind WikiProteins, an open, collaborative wiki focusing on proteins and their role in biology and medicine [4-6].

Concepts and triplets are widely accepted in the world of bioinformatics. To get an idea of what this means for searching, see the search engine Quertle, which allows semantic search of PubMed and full-text biomedical literature with automatic extraction of key concepts. Searching for ESR1 $BiologicalProcess will search abstracts mentioning all kinds of processes in which ESR1 (aka ERα, ERalpha, EStrogen Receptor 1) is involved. The search can be refined by choosing ‘narrower terms’ like “proliferation” or “transcription”.

The new aspect is that Mons wants to turn those triplets into (what he calls) nano-publications. Because not every statement is equally ‘hard’, nano-publications are weighted by assigning numbers from 0 (uncertain) to 1 (very certain). The nano-publication “mosquitoes transfer malaria” will get a number approaching 1.
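For readers who think in code, a weighted triplet could be represented roughly as follows. This is only a sketch of the idea as I understood it from the talk: the class names, the `source` field, the threshold and the two certainty values (0.99 and 0.60) are my own assumptions, not part of any nano-publication standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class NanoPublication:
    triple: Triple
    certainty: float  # 0 = uncertain, 1 = very certain
    source: str       # hypothetical field: where the statement was found

    def __post_init__(self):
        if not 0.0 <= self.certainty <= 1.0:
            raise ValueError("certainty must lie between 0 and 1")


def well_supported(pubs, threshold=0.9):
    """Keep only the statements whose weight reaches the threshold."""
    return [p for p in pubs if p.certainty >= threshold]


# The weights below are invented for illustration.
pubs = [
    NanoPublication(Triple("mosquitoes", "transfer", "malaria"), 0.99, "illustrative"),
    NanoPublication(Triple("TNF inhibitors", "cause", "skin disorders"), 0.60, "illustrative"),
]

for p in well_supported(pubs):
    print(p.triple.subject, p.triple.predicate, p.triple.obj, p.certainty)
```

The point of the weight is that a computer can filter or rank statements by how ‘hard’ they are, instead of treating every sentence in every paper as equally true.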

Such nano-publications offer little shading and possibility for interpretation and discussion. Mons does not propose to entirely replace traditional articles by nano-publications. Quote [3]:

While arguing that research results should be available in the form of nano-publications, we are emphatically not saying that traditional, classical papers should not be published any longer. But their role is now chiefly for the official record, the “minutes of science”, and not so much as the principal medium for the exchange of scientific results. That exchange, which increasingly needs the assistance of computers to be done properly and comprehensively, is best done with machine-readable, semantically consistent nano-publications.

According to Mons, authors and their funders should start requesting and expecting the papers that they have written and funded to be semantically coded when published, preferably by the publisher and otherwise by libraries: the technology exists to provide Web browsers with the functionality for users to identify nano-publications, and annotate them.

Like the wikiprotein-wiki, nano-publications will be entirely open access. It will suffice to properly cite the original finding/publication.

In addition there is a new kind of “peer review”. An expert network is set up to immediately assess a twittered nano-publication when it comes out, so that  the publication is assessed by perhaps 1000 experts instead of 2 or 3 reviewers.

On a small scale, this is already happening. Nano-publications are sent as tweets to people like Gert Jan van Ommen (past president of HUGO and co-author of 5 of my publications (or v.v.)), who then give a red light (don’t believe) or a green light (believe) with one click on their BlackBerry.

As  Mons put it, it looks like a subjective event, quite similar to “dislike” and “like” in social media platforms like Facebook.

Barend often referred to a PLoS ONE paper by van Haagen et al. [1], showing the superiority of the concept-profile based approach not only in detecting explicitly described PPIs, but also in inferring new PPIs.

[You can skip the part below if you're not interested in details of this paper]

Van Haagen et al. first established a set of 61,807 known human PPIs and of many more probable Non-Interacting Protein Pairs (NIPPs) from online human-curated databases (and NIPPs also from the IntAct database).

For the concept-based approach they used the concept-recognition software Peregrine, which includes synonyms and spelling variations  of concepts and uses simple heuristics to resolve homonyms.

This concept-profile based approach was compared with several other approaches, all depending on co-occurrence (of words or concepts):

  • Word-based direct relation. This approach uses direct PubMed queries (words) to detect if proteins co-occur in the same abstract (thus the names of two proteins are combined with the boolean ‘AND’). This is the simplest approach and represents how biologists might use PubMed to search for information.
  • Concept-based direct relation (CDR). This approach uses concept-recognition software to find PPIs, taking synonyms into account and resolving homonyms. Here two concepts (in this case two proteins) are detected if they co-occur in the same abstract.
  • STRING. The STRING database contains a text mining score which is based on direct co-occurrences in literature.

The results show that, using concept profiles, 43% of the known PPIs were detected, with a specificity of 99%, and 66% of all known PPIs with a specificity of 95%. In contrast, the direct relations methods and STRING show much lower scores:

Measure | Word-based | CDR | Concept profiles | STRING
Sensitivity at spec = 99% | 28% | 37% | 43% | 39%
Sensitivity at spec = 95% | 33% | 41% | 66% | 41%
Area under Curve | 0.62 | 0.69 | 0.90 | 0.69

These findings suggested that not all proteins with high similarity scores are known to interact, but that they may be related in another way, e.g. they could be involved in the same pathway or be part of the same protein complex without physically interacting. Indeed, concept-based profiling was superior in predicting relationships between proteins potentially present in the same complex or pathway (thus A-C inferred from co-occurring protein pairs A-B and B-C).

Since there is often a substantial time lag between the first publication of a finding, and the time the PPI is entered in a database, a retrospective study was performed to examine how many of the PPIs that would have been predicted by the different methods in 2005 were confirmed in 2007. Indeed, using concept profiles, PPIs could be efficiently predicted before they enter PPI databases and before their interaction was explicitly described in the literature.

The practical value of the method for discovery of novel PPIs is illustrated by the experimental confirmation of the inferred physical interaction between CAPN3 and PARVB, which was based on frequent co-occurrence of both proteins with concepts like Z-disc, dysferlin, and alpha-actinin. The relationships between proteins predicted are broader than PPIs, and include proteins in the same complex or pathway. Dependent on the type of relationships deemed useful, the precision of the method can be as high as 90%.
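The difference between direct co-occurrence and a concept profile can be shown with a toy example. Please note the assumptions: the four "abstracts" below are invented, each is reduced to a set of recognised concepts, and Jaccard overlap is a simple stand-in for the similarity measure that van Haagen et al. actually used. Only the concept names (CAPN3, PARVB, Z-disc, dysferlin, alpha-actinin) come from the paper.

```python
# Toy "abstracts", each reduced to the set of concepts recognised in it.
# The documents are invented for illustration.
ABSTRACTS = [
    {"CAPN3", "Z-disc", "dysferlin"},
    {"PARVB", "Z-disc", "alpha-actinin"},
    {"CAPN3", "alpha-actinin"},
    {"PARVB", "dysferlin"},
]


def direct_cooccurrence(a, b, abstracts):
    """Count abstracts mentioning both concepts (the 'direct relation')."""
    return sum(1 for doc in abstracts if a in doc and b in doc)


def concept_profile(concept, abstracts):
    """Collect every other concept that co-occurs with the given one."""
    profile = set()
    for doc in abstracts:
        if concept in doc:
            profile |= doc - {concept}
    return profile


def profile_similarity(a, b, abstracts):
    """Jaccard overlap of two concept profiles (a stand-in measure)."""
    pa = concept_profile(a, abstracts) - {b}
    pb = concept_profile(b, abstracts) - {a}
    union = pa | pb
    return len(pa & pb) / len(union) if union else 0.0


print(direct_cooccurrence("CAPN3", "PARVB", ABSTRACTS))
print(sorted(concept_profile("CAPN3", ABSTRACTS)))
print(profile_similarity("CAPN3", "PARVB", ABSTRACTS))
```

In this toy set the two proteins never appear in the same abstract, so a direct co-occurrence search finds nothing, yet their profiles overlap completely. That is the kind of indirect link (A-C via shared B) that the concept-profile approach is meant to pick up.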

In line with their open access policy, they have made the full set of predicted interactions available in a downloadable matrix and through the webtool Nermal, which lists the most likely interaction partners for a given protein.

According to Mons, this framework will be a very rich source for new discoveries, as it will enable scientists to prioritize potential interaction partners for further testing.

Barend Mons started with the statement that nano-publications will replace the classic articles (and the need for libraries). However, things are never as black as they seem.
Mons showed that a nano-publication is basically a “peer-reviewed, openly available” triplet. Triplets can be effectively retrieved and inferred from available databases/papers using a concept-based approach.
Nevertheless, effectiveness needs to be enhanced by semantically coding triplets when published.

What will this mean for clinical medicine? Bioinformatics is quite another discipline, with better structured and more straightforward data (interaction, identity, place). Interestingly, Mons and van Haagen plan to do further studies, in which they will evaluate whether concept profiles can also be applied in the prediction of other types of relations, for instance between drugs or genes and diseases. The future will tell whether the above-mentioned approach is also useful in clinical medicine.

Implementation of the following (implicit) recommendations would be advisable, independent of the possible success of nano-publications:

  • Less emphasis on “publish or perish” (thus more on the data themselves, whether positive, negative, trendy or not)
  • Better structured data, partly by structuring articles. This has already improved over the years by introducing structured abstracts, availability of extra material (appendices, data) online and by guidelines, such as STARD (The Standards for Reporting of Diagnostic Accuracy)
  • Open Access
  • Availability of full text
  • Availability of raw data

One might argue that disclosing data is unlikely when pharma is involved. It is therefore encouraging that a group of major pharmaceutical companies has announced that they will share pooled data from failed clinical trials in an attempt to figure out what is going wrong in the studies and what can be done to improve drug development [10].

Unfortunately I don’t have Mons’ presentation at my disposal. Therefore, here are two other presentations about triplets, concepts and the semantic web.


References

  1. van Haagen HH, ‘t Hoen PA, Botelho Bovo A, de Morrée A, van Mulligen EM, Chichester C, Kors JA, den Dunnen JT, van Ommen GJ, van der Maarel SM, Kern VM, Mons B, & Schuemie MJ (2009). Novel protein-protein interactions inferred from literature context. PloS one, 4 (11) PMID: 19924298
  2. Twitteren voor de wetenschap, Maartje Bakker, Volkskrant (2010-06-05) (Twittering for Science)
  3. Barend Mons and Jan Velterop (?) Nano-Publication in the e-science era (Concept Web Alliance, Netherlands BioInformatics Centre, Leiden University Medical Center.) http://www.nbic.nl/uploads/media/Nano-Publication_BarendMons-JanVelterop.pdf, accessed June 20th, 2010.
  4. Mons, B., Ashburner, M., Chichester, C., van Mulligen, E., Weeber, M., den Dunnen, J., van Ommen, G., Musen, M., Cockerill, M., Hermjakob, H., Mons, A., Packer, A., Pacheco, R., Lewis, S., Berkeley, A., Melton, W., Barris, N., Wales, J., Meijssen, G., Moeller, E., Roes, P., Borner, K., & Bairoch, A. (2008). Calling on a million minds for community annotation in WikiProteins Genome Biology, 9 (5) DOI: 10.1186/gb-2008-9-5-r89
  5. Science Daily (2008/05/08) Large-Scale Community Protein Annotation — WikiProteins
  6. Boing Boing: (2008/05/28) WikiProteins: a collaborative space for biologists to annotate proteins
  7. (ppt1) SWAT4LS 2009: Semantic Web Applications and Tools for Life Sciences http://www.swat4ls.org/
    Amsterdam, Science Park, Friday, 20th of November 2009
  8. (ppt2) Michel Dumontier: triples for the people scientists liberating biological knowledge with the semantic web
  9. (ppt3, only slide shown): Bibliography 2.0: A citeulike case study from the Wellcome Trust Genome Campus – by Duncan Hill (EMBL-EBI)
  10. WSJ (2010/06/11) Drug Makers Will Share Data From Failed Alzheimer’s Trials




Friday Foolery #28 Radiant Pin-Up Calendar

18 06 2010

It is HOT & Radiating.

Eizo, a medical diagnostic supply company, issued a very special pin-up calendar.
No body part of the girls was concealed from the camera…. It is really very original…

But… why does no one ask whether this illuminating work [full body irradiation x12 (if the same girl poses on the calendar), x attempts ......] is a responsible thing to do? It is no CT-scan, but still…

The calendar was made by the agency Butter; First seen on: Daily Art Press

Perhaps you also like: Friday Foolery #10. 6 x X-Rays





The June MedLib’s Round is up & Call for Submissions.

14 06 2010

Yes, the latest edition of the Medlibs Round is up at EBM and Clinical Support Librarians@UCHC! Kathleen aka Creaky did a wonderful job compiling this round, with the main theme “service”.

Posts vary from a summary of the MLA congress, to the first systematic review search of a librarian, a friend’s request for help in finding information on breast cancer, disclosure of conflicts of interest, collaborative librarianship through social media and (much) more. Read it all at Creaky’s blog!

Now it is only 2 weeks until the next deadline (Saturday, July the 3rd is the official deadline). So you’d better start writing and/or submit recent posts!

There is no theme: I will accept all relevant and good quality posts pertaining to medical librarianship, in the broadest sense of the word.

I would like to see posts about (for instance):

  • Social media (and medical information)
  • Searches, Search-engines & Databases (e.g. PubMed)
  • Reference Manager Systems
  • Library congresses (EAHIL?)
  • Web 2.0 tools, iPhones, iPads
  • Open Access, Publishing
  • Reliability of Information, Patient Information
  • Evidence Based Medicine

So NOT ONLY librarians, but also doctors and other healthcare workers, patients, pharma-people and scientists are invited to submit.

Submitting is easy: just submit the permalink (URL) of your post (at your blog) at the blog carnival here. For examples and FAQs see the MedLibs Round-ARCHIVE.

If you have no blog but would like to submit you are welcome to write a guest post at this blog.

************

I would also like to take the opportunity to ask if there are any Med- or Medlib-bloggers out there who would like to host the MEDLIBS round in August, September, October or later this year!
A host for August is rather urgent as I will be on vacation the second half of July.

And if you didn’t fill in the poll below you still have the opportunity to do so. When we have a new name, the next step is to ask Robin of Survive the Journey to make a logo for us.





FDA to Regulate Genetic Testing by DTC-Companies Like 23andMe

14 06 2010

Direct-to-consumer (DTC) genetic testing refers to genetic tests that are marketed directly to consumers via television, print advertisements, or the Internet. This form of testing, which is also known as at-home genetic testing, provides access to a person’s genetic information without necessarily involving a doctor or insurance company in the process. [definition from NLM's Genetics Home Reference Handbook]

Almost two years ago I wrote about 23andMe (23andMe: 23notMe, not yet),  a well known DTC company, that offers a genetics scan (SNP-genotyping) to the public ‘for research’, ‘for education’ and ‘for fun’:

“Formally 23andMe denies there is a diagnostic purpose (in part, surely, because the company doesn’t want to antagonize the FDA, which strictly regulates diagnostic testing for disease). However, 23andme does give information on your risk profile for certain diseases, including Parkinson”

In another post, Personalized Genetics: Too Soon, Too Little?, I summarized an editorial by Ioannidis on the topic. His (and my) conclusion was that “the promise of personalized genetic prediction may be exaggerated and premature”. The most important issue is that the predictive power to individualize risks is relatively weak. Ioannidis emphasized that despite the poor evidence, direct-to-consumer genetic testing has already begun and is here to stay. He proposed several safeguards, including transparent and thorough reporting, unbiased continuous synthesis and grading of the evidence, and alerting the public that most genetic tests have not yet been shown to be clinically useful.

And now these “precautionary measures” actually seem to happen.
Last week the FDA sent 5 DTC companies, including 23andMe, a letter saying “their tests are medical devices that must receive regulatory approval before they can be marketed” (see e.g. the NY Times article).

Alberto Gutierrez, who leads diagnostic test regulation at the FDA, wrote in the letters:

“Premarket review allows for an independent and unbiased assessment of a diagnostic test’s ability to generate test results that can reliably be used to support good health care decisions,”

“These letters are part of an initiative to better explain the FDA’s actions by providing information that supports clinical medicine, biomedical innovation, and public health” (May 19 New England Journal of Medicine commentary; source: see AMED-news).

Although it doesn’t look like the tests will be taken from the market, 23andMe does take quite a rebellious attitude: one of its directors called the FDA “appallingly paternalistic.”

Many support this view: “people have the right to know their own genetic make-up”, so to say. Furthermore as discussed above, 23andMe denies that their genetic scans are meant for diagnosis.

In my view the latter is largely untrue. At the very least, 23andMe suggests that a scan does tell you something about your risk of certain diseases.
However, the risks are often not that straightforward. You just can’t “measure” the risk of a multifactorial disease like diabetes by “scanning” a few weakly predisposing genes. Often the results are given as relative risks, which is highly confusing. In her TED talk the 23andMe director Anne Wojcicki said that her husband Sergey Brin (Google) had a 50% chance of getting Parkinson’s, but his relative risk (RR, based on the LRRK2 mutation, which isn’t the most crucial gene for getting Parkinson’s) varies from 20% to 80%, which means that this mutation increases his absolute risk of getting Parkinson’s from 2-5% (normal chance) to 4-10% at the most (see this post).
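The step from relative to absolute risk is simple arithmetic, sketched below. One assumption needs to be stated loudly: I read the "20% to 80%" figure as a relative increase on top of the baseline risk. The function name is made up for this example, and this is a back-of-the-envelope check, not an epidemiological model.

```python
def apply_relative_increase(baseline_risk, relative_increase):
    """Absolute risk after raising a baseline risk by a relative amount."""
    return baseline_risk * (1.0 + relative_increase)


# Figures from the text: baseline 2-5%, relative increase 20-80%.
BASELINE = (0.02, 0.05)
INCREASE = (0.20, 0.80)

# Worst case: apply the largest increase to both ends of the baseline range.
at_most = tuple(apply_relative_increase(b, INCREASE[1]) for b in BASELINE)
# Mildest case: apply the smallest increase to both ends.
at_least = tuple(apply_relative_increase(b, INCREASE[0]) for b in BASELINE)

print("largest increase :", [f"{r:.1%}" for r in at_most])
print("smallest increase:", [f"{r:.1%}" for r in at_least])
```

Under that reading, the largest increase lifts the 2-5% baseline to roughly 3.6-9%, which is in line with the ceiling of about 4-10% mentioned above. Either way the absolute risk stays far below the 50% chance quoted in the talk.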

Furthermore, as reported by Venter in Nature (October 8, 2009): for seven diseases, 50% or less of the predictions of two companies agreed across five individuals (e.g. for one disease: 23andMe RR 4.02, Navigenics RR 1.25). On the other hand, *fun* diagnoses could lead to serious concern in, or wrong/unnecessary decisions (removal of ovaries, changing drug doses) by, patients.

There are also concerns with regard to their good-practice standards, as 23andMe just flipped a 96-well plate of customer DNA (see Genetic Future for a balanced post), which upset a mother who noticed that her son didn’t have compatible genes. But let’s assume that proper precautions will prevent this from happening again.

There are also positive aspects: results of a preliminary study showed that people who find out they have a high genetic risk for cardiovascular disease are more likely to change their diet and exercise patterns than are those who learn they have a high risk from family history (Technology Review: Genetic Testing Can Change Behavior).

Furthermore, people buy those tests themselves and, indeed, their genes are their own.

However, I agree with Dr. Gutierrez of the FDA, who said: “We really don’t have any issues with denying people information. We just want to make sure the information they are given is correct.” (NY Times). The FDA is putting the consumers first.

However, it will be very difficult to be consistent. What about total body scans in normal healthy people, detecting innocent incidentalomas? Or what about the controversial XMRV tests offered by the Whittemore Peterson Institute (WPI) directly to CFS patients? (see these posts) And one step further (although not in the diagnostic field): the ineffective CAM/homeopathic products sold over the counter?

I wouldn’t mind if these tests/products would be held up to the light. Consumers should not be misled by the results of unproven or invalid tests, and where needed should be offered the guidance of a healthcare provider.

But if tests are valid and risk predictions correct, it is up to the “consumer” if he/she wants to purchase such a test.

—————–

What Five FDA Letters Mean for the Future of DTC Genetic Testing at Genomics Law Report is highly recommended, but couldn’t be accessed while writing the post.

[Added: 2010-06-14 13.10]

  • The problem accessing Genomics Law Report is resolved.
  • Also recommended: the post “FDA to regulate genetic tests as ‘devices’” at PHG Foundation. This post highlights that simply trying to classify the complete genomic testing service as “a device” is inadequate and will not address the difficult issues at hand. One of the biggest issues is that, while classifying DTC genetic tests as devices is certainly appropriate for assessing their analytical validity and direct safety, it does not and cannot provide an assessment of the service, thus of the predictions and interpretations resulting from the genome scans. Although standard medical testing has traditionally been overseen by professional medical bodies, the current genomic risk profiling tests are simply not good enough to be used by health care services. (see post)




Friday Foolery [27] Twitter Parade

11 06 2010

There are many Twitter-apps that show “how you are doing” on Twitter.  They show your ranking, your interaction with others, the words you tweet the most, etc. (see this post)

Twitter Parade (http://isparade.jp/#) is more of a fun tool. If you fill in a name, e.g. laikas, you see all your followers literally following you in a parade. Some are tweeting.
The first time I saw it I really had to laugh. It is fun to see a very serious doctor applauding or to see Dr. Shock barking like a dog.

You also realize that >1500 followers is really a lot. There is no end to the row.

Instead of  a name you can also fill in a keyword.

Since WordPress.com doesn’t allow scripts I embedded a YouTube video instead.

But it is more fun to try the Twitter parade yourself here (the page takes some time to load).
Perhaps you can find me walking, jumping, clapping there… for you.





PubMed versus Google Scholar for Retrieving Evidence

8 06 2010

A while ago a resident in dermatology told me she got many hits out of PubMed, but zero results out of TRIP. It appeared she had used the same search for both databases: alopecia areata and diphenciprone (a drug with a lot of synonyms). Searching TRIP for alopecia (in the title) only, we found a Cochrane Review and a relevant NICE guideline.

Usually, each search engine has its own search and index features. When comparing databases one should compare “optimal” searches and keep in mind for what purpose the search engines are designed. TRIP is most suited to search aggregate evidence, whereas PubMed is most suited to search individual biomedical articles.

Michael Anders and Dennis Evans ignore this “rule of thumb” in their recent paper “Comparison of PubMed and Google Scholar Literature Searches”. And this is not the only shortcoming of the paper.

The authors performed searches on 3 different topics to compare PubMed and Google Scholar search results. Their main aim was to see which database was the most useful to find clinical evidence in respiratory care.

Well, quick guess: PubMed wins…

The 3 respiratory care topics were selected from a list of systematic reviews on the Website of the Cochrane Collaboration and represented in-patient care, out-patient care, and pediatrics.

The references in the three chosen Cochrane Systematic Reviews served as a “reference” (or “gold”) standard. However, abstracts, conference proceedings, and responses to letters were excluded.

So far so good. But note that the outcome of the study only allows us to draw conclusions about interventional questions, that seek to find controlled clinical trials. Other principles may apply to other domains (diagnosis, etiology/harm, prognosis ) or to other types of studies. And it certainly doesn’t apply to non-EBM-topics.

The authors designed ONE search for each topic, by taking 2 common clinical terms from the title of each Cochrane review connected by the Boolean operator “AND” (see Table, ” ” are not used). No synonyms were used and the translation of searches in PubMed wasn’t checked (luckily the mapping was rather good).

“Mmmmm…”

Topic | Search Terms
Noninvasive positive-pressure ventilation for cardiogenic pulmonary edema | “noninvasive positive-pressure ventilation” AND “pulmonary edema”
Self-management education and regular practitioner review for adults with asthma | “asthma” AND “education”
Ribavirin for respiratory syncytial virus | “ribavirin” AND “respiratory syncytial virus”

In PubMed they applied the narrow methodological filter, or Clinical Query, for the domain therapy.
This prefab search strategy, (randomized controlled trial[Publication Type] OR (randomized[Title/Abstract] AND controlled[Title/Abstract] AND trial[Title/Abstract])), developed by Haynes, is suitable to quickly detect the available evidence (provided one is looking for RCTs and doesn’t do an exhaustive search). (see previous posts 2, 3, 4)
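To show how such a search is put together, here is a small sketch that combines two topic terms with the narrow therapy filter quoted above. The function name `build_query` is made up for this example, and the code only builds the query string: it does not contact PubMed, and it says nothing about how PubMed translates the terms.

```python
# Narrow therapy filter as quoted in the text.
NARROW_THERAPY_FILTER = (
    "(randomized controlled trial[Publication Type] OR "
    "(randomized[Title/Abstract] AND controlled[Title/Abstract] "
    "AND trial[Title/Abstract]))"
)


def build_query(term_a, term_b, methodological_filter=NARROW_THERAPY_FILTER):
    """Join two topic terms with AND and append the methodological filter."""
    topic = f"({term_a} AND {term_b})"
    return f"{topic} AND {methodological_filter}"


# The three term pairs used in the paper.
SEARCHES = [
    ("noninvasive positive-pressure ventilation", "pulmonary edema"),
    ("asthma", "education"),
    ("ribavirin", "respiratory syncytial virus"),
]

for a, b in SEARCHES:
    print(build_query(a, b))
```

Written out like this, it is easy to see how thin each search is: two terms, no synonyms, and one fixed filter.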

Google Scholar, as we all probably know, does not have such methodological filters, but the authors “limited” their search by using the Advanced option and entering the 2 search terms in the “Find articles….with all of the words” space (so this is a boolean “AND”), and they limited the search to the subject area “Medicine, Pharmacology, and Veterinary Science”.

They did a separate search for publications that were available at their library, which has limited value for others, subscriptions being different for each library.

Next they determined the sensitivity (the number of relevant records retrieved as a proportion of the total number of records in the gold standard) and the precision or positive predictive value, the  fraction of returned positives that are true positives (explained in 3).

Let me guess: sensitivity might be equal or somewhat higher, and precision is undoubtedly much lower in Google Scholar. This is because (in) Google Scholar:

  • you can often search full text instead of just in the abstract, title and (added) keywords/MeSH
  • the results are inflated by finding one and the same reference cited in many different papers (that might not directly deal with the subject).
  • you can’t limit on methodology, study type or “evidence”
  • there is no automatic mapping and explosion (which may provide a way to find more synonyms and thus more relevant studies)
  • has a broader coverage (grey literature, books, more topics)
  • lags behind PubMed in receiving updates from MEDLINE

Results: PubMed and Google Scholar had pretty much the same recall, except for ribavirin and RSV, where the recall was higher in PubMed: PubMed found 100% (12/12) of the included trials, and Google Scholar 58% (7/12).

No discussion as to the why. Since Google Scholar should find the words in the titles and abstracts of PubMed records, I repeated the search in PubMed but restricted it to the title and abstract fields. So I searched ribavirin[tiab] AND respiratory syncytial virus[tiab]* and limited it with the narrow therapy filter: I found 26 papers instead of 32. These 6 titles were missing when I only searched title and abstract (between brackets: the relevant MeSH (the reason why the paper was found), absence of an abstract (thus only title and MeSH) and whether it is a letter; bold: why the terms in title and abstract are not found)

  1. Evaluation by survival analysis on effect of traditional Chinese medicine in treating children with respiratory syncytial viral pneumonia of phlegm-heat blocking Fei syndrome.
    [MeSH: Respiratory Syncytial Virus Infections/]
  2. Ribavarin in ventilated respiratory syncytial virus bronchiolitis: a randomized, placebo-controlled trial.
    [MeSH: Respiratory Syncytial Virus Infections/] [NO ABSTRACT, LETTER]
  3. Study of interobserver reliability in clinical assessment of RSV lower respiratory illness.
    [MeSH: Respiratory Syncytial Virus Infections*]
  4. Ribavirin for severe RSV infection. N Engl J Med.
    [MeSH: Respiratory Syncytial Viruses] [NO ABSTRACT, LETTER]
  5. Stutman HR, Rub B, Janaim HK. New data on clinical efficacy of ribavirin.
    [MeSH: Respiratory Syncytial Viruses] [NO ABSTRACT]
  6. Clinical studies with ribavirin.
    [MeSH: Respiratory Syncytial Viruses] [NO ABSTRACT]

Three of the papers had the additional MeSH respiratory syncytial viruses and the three others respiratory syncytial virus infections. Although not all papers may be relevant (2 are comments/letters), it illustrates why PubMed may yield results that are not retrieved by Google Scholar (if one doesn’t use synonyms).

In contrast to Google Scholar, PubMed translates the search ribavirin AND respiratory syncytial virus so that the MeSH terms “ribavirin”, “respiratory syncytial viruses”[MeSH Terms] and (indirectly) “respiratory syncytial virus infections”[MeSH] are also found.

Thus in Google Scholar articles with terms like RSV and respiratory syncytial viral pneumonia (or lack of specifications, like clinical efficacy) could have been missed with the above-mentioned search.
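
A toy example shows the effect. The matcher below does nothing more than a literal, case-insensitive check of whether every search term occurs in a title. It is a deliberate simplification and an assumption for illustration only: it is not how Google Scholar (which also searches full text) or PubMed actually matches, and the function name matches_all is made up.

```python
def matches_all(title: str, terms: list[str]) -> bool:
    """True if every term occurs literally in the title (case-insensitive)."""
    lowered = title.lower()
    return all(term.lower() in lowered for term in terms)


terms = ["ribavirin", "respiratory syncytial virus"]

# Titles taken from the list of six missed papers above.
titles = [
    "Evaluation by survival analysis on effect of traditional Chinese "
    "medicine in treating children with respiratory syncytial viral "
    "pneumonia of phlegm-heat blocking Fei syndrome.",
    "Study of interobserver reliability in clinical assessment of RSV "
    "lower respiratory illness.",
    "Ribavirin for severe RSV infection.",
    "Clinical studies with ribavirin.",
]

for title in titles:
    # Each of these lacks at least one literal term ("RSV", "viral", ...).
    print(matches_all(title, terms), "-", title[:60])
```

Without synonyms (RSV, respiratory syncytial viral) or a MeSH-like mapping, none of these titles satisfies both terms.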

The other result of the study (the result section comprises 3 sentences) is that “For each individual search, PubMed had better precision”.

The precision was 59/467 (13%) in PubMed and 57/80,730 (0.07%) in Google Scholar (p<0.001)!!
(note: they had to add author names in the Google Scholar search to find the papers in the haystack ;)
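
For those who like to check the arithmetic, the percentages follow directly from the reported counts. A small sketch (the variable names are mine):

```python
# Counts as reported in the study: relevant records / total records retrieved.
pubmed_relevant, pubmed_total = 59, 467
scholar_relevant, scholar_total = 57, 80_730

pubmed_precision = pubmed_relevant / pubmed_total
scholar_precision = scholar_relevant / scholar_total
ratio = pubmed_precision / scholar_precision

print(f"PubMed:         {pubmed_precision:.0%}")
print(f"Google Scholar: {scholar_precision:.2%}")
print(f"PubMed precision is about {ratio:.0f} times higher")
```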

Héhéhé, how surprising. Well why would it be that no clinician or librarian would ever think of using Google Scholar as the primary, let alone the only, source to search for medical evidence?
It should also ring a bell that [QUOTE**]:
“In the Cochrane reviews the researchers retrieved information from multiple databases, including MEDLINE, the Cochrane Airways Group trial register (derived from MEDLINE)***, CENTRAL, EMBASE, CINAHL, DARE, NHSEED, the Acute Respiratory Infections Group’s specialized register, and LILACS…”
Note: Google Scholar isn’t mentioned as a source! Google Scholar can only be recommended for finding work that cites relevant articles already found (this is called forward searching), if one has no access to Web of Science or SCOPUS. Thus only to catch the last fish.

Perhaps the paper could have been more interesting if the authors had looked at any ADDED VALUE of Google Scholar when exhaustively searching for evidence. Then it would have been crucial to look for grey literature too (instead of excluding it), because this could be a strong point of Google Scholar. Furthermore, one could have investigated whether forward searching yielded extra papers.

The precision of PubMed is attributed to the narrow therapy filter that was used, but the vastly lower precision of Google Scholar is also due to searching the full text, including the reference lists.

For instance, searching for ribavirin AND respiratory syncytial virus in PubMed yields 523 hits. This can be reduced to 32 hits when applying the narrow therapy filter. This means a reduction by a factor of 16.
Yet a similar search in Google Scholar yields 4,080 hits. Thus without the filter there is still an almost 8 times higher yield from Google Scholar than from PubMed.
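
The two factors mentioned here are simple ratios of the hit counts, as this small sketch shows (the variable names are mine; the counts are the ones given above):

```python
pubmed_unfiltered = 523   # ribavirin AND respiratory syncytial virus
pubmed_filtered = 32      # same search plus the narrow therapy filter
scholar_hits = 4_080      # similar search in Google Scholar

filter_reduction = pubmed_unfiltered / pubmed_filtered
scholar_vs_pubmed = scholar_hits / pubmed_unfiltered

print(f"filter reduces PubMed hits by a factor of {filter_reduction:.1f}")
print(f"Google Scholar returns {scholar_vs_pubmed:.1f} times as many hits")
```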

That evokes another research idea: what would have happened if randomized (OR randomised) had been added to the Google Scholar search? Would this have increased the precision? In the case of the above search it lowers the yield by a factor of 2, and the first hits look very relevant.

It is really funny, but the authors undermine their own conclusion that “These results are important because efficient retrieval of the best available scientific evidence can inform respiratory care protocols, recommendations for clinical decisions in individual patients, and education, while minimizing information overload.” by saying elsewhere that “It is unlikely that users consider more than the first few hundred search results, so RTs who conduct literature searches with Google Scholar on these topics will be much less likely to find references cited in Cochrane reviews.”

Indeed, no one would take it into their head to try to find the relevant papers among those 4,080 hits. So what is this study worth from a practical point of view?

Well anyway, just as you can ask for the sake of asking, you can research for the sake of researching. Despite being an EBM addict, I prefer a good subjective overview of this topic over a scientifically weak, quasi-evidence-based research paper.

Does this mean Google Scholar is useless? Does it mean that all those PhDs hooked on Google Scholar are wrong?

No, Google Scholar serves certain purposes.

Just like the example of PubMed and TRIP, you need to know what is in it for you and how to use it.

I used Google Scholar when I was a researcher:

  • to quickly find a known reference
  • to find citing papers
  • to get an idea of how often articles have been cited / to find the most relevant papers in a quick and dirty way (i.e. by browsing)
  • for quick and dirty searches by putting word strings between quotation marks.
  • to search full text. I used quite extensive searches to find out what methods were used (for instance methods AND (synonym1 or syn2 or syn3)). An interesting possibility is to do a second search for only the last few words (in a string). This will often reveal the next words in the sentence. Often you can repeat this trick, reading a piece of the paper without need for access.

If you want to know more about the pros and cons of Google Scholar I recommend the recent overview by the expert librarian Dean Giustini: “Sure Google Scholar is ideal for some things” [7]. He also compiled a “Google scholar bibliography” with ~115 articles as of May 2010.

Speaking of librarians, why was the study performed by PhD RRT (RN)s, and why wasn’t the university librarian involved?****

* this is searched as a string (phrase), which is stricter than respiratory AND syncytial AND virus
** abbreviations used instead of full (database) names
*** this is wrong: a register contains references to controlled clinical trials from EMBASE, CINAHL and all kinds of databases in addition to MEDLINE.
**** other than to read the manuscript afterwards.

References

  1. Anders ME, & Evans DP (2010). Comparison of PubMed and Google Scholar Literature Searches. Respiratory care, 55 (5), 578-83 PMID: 20420728
  2. This Blog: http://laikaspoetnik.wordpress.com/2009/11/26/adding-methodological-filters-to-myncbi/
  3. This Blog: http://laikaspoetnik.wordpress.com/2009/01/22/search-filters-1-an-introduction/
  4. This Blog: http://laikaspoetnik.wordpress.com/2009/06/30/10-1-pubmed-tips-for-residents-and-their-instructors/
  5. NeuroDojo (2010/05) Pubmed vs Google Scholar? [also gives a nice overview of pros and cons]
  6. GenomeWeb (2010/05/10) Content versus interface at the heart of Pubmed versus Scholar?/ [response to 5]
  7. The Search principle Blog (2010/05) Sure Google Scholar is ideal for some things.




Friday Foolery [26] Nightmare turns into DreamNight (at the Zoo)

5 06 2010

Today I took the plunge, changed clothes at work, “jumped” into my old running shoes and went off for an 8.5 km run homeward. Just outside the building I heard a couple whisper “accident” and I saw several ambulances driving towards the highway. Half a kilometer down the road ambulances were still leaving the hospital. There was a continuous wailing sound. There were ambulances, police cars and fire engines everywhere. Something big must have happened. A disaster on the highway perhaps?

It looked like this:

Crossing the bridge over the highway, I didn’t see anything, not even the usual Friday evening rush hour. …

I stopped to check Twitter and searched for “accident”. There seemed to be a serious accident on the A2 highway, but this was further south.

Finally at home (it took me longer than I had hoped) I checked Twitter again. It turned out that there had been no accident or disaster, nor was it an exercise: it was the once-yearly Dreamnight at the zoo. This is:

“an annual and entrance-free eveningopening of a zoo exclusively for chronically ill and disabled children, their parents and brothers and sisters”

The ambulances and other vehicles are just their (loud) escort to the zoo.

This year it is exactly 15 years ago that the dreamnight-project was born. The first edition was held in the Sophia’s Children Hospital in Rotterdam – The Netherlands: 175 very ill children came with their parents and siblings…. all together some 750 special guests were entertained.

When European zoos joined, the name “Dreamnight at the Zoo” was introduced. Later dreamnight got other partners, like museums and attraction parks.

The night is meant to give VICs (very important children) and their parents an unforgettable evening. Police, firemen and paramedics also help to make it a memorable day. Today was a bright and sunny day. I’m sure the children and their families had a great evening.

It really is a project that is well worth the effort. It is the dream of the organizers that all zoos in the world will one day call the first Friday of June (or December in Australia) the “dreamnight at the zoo”….

For more information, see the website http://www.dreamnightatthezoo.nl/ [5 languages] or contact info@dreamnightatthezoo.nl

There is also a special site for Artis dreamnight: http://www.dreamnightatartis.nl/ (Dutch)







MedLibs Round: Update & Call for Submissions June 2010

4 06 2010

In the past months we had some excellent hosts of the round, really “la crème de la crème” of the medical information/library blogosphere:

2010 was heralded by Dr Shock MD PhD, followed by Emerging Technologies Librarian (@pfanderson), The Krafty Librarian (@krafty) and @Eagledawg (Nikki Dettmar).

Nikki hosted the round for a second time, but now on her new blog: Eagledawg.net. The title: E(Patients)-I(Pad)-O(pportunities):Medlibs Round

Last month the round was hosted by Danni (Danni4info) at The Health Informaticist, my favorite English EBM-library blog. It is a great round again, “dealing with PubMed trending analysis, liability in information provision, the ‘splinternet’, a search engine optimisation (SEO) teaser from CILIP’s fresh off the presses Update magazine, and more.” Missed it? You can read it here.

And now we have a few days left to submit our posts for the next MedLibs Round, hosted by yet another excellent EBM/librarian blogger: @creaky at EBM and Clinical Support Librarians@UCHC.

She would like posts about “Reference Questions (or People) I Won’t Forget” (thus “memorable” encounters that took place in a public service/reference desk setting, over your career) or “how the library/librarian” has helped you.
But as always other relevant and good quality posts related to medical information and medical librarianship will also be considered.

For more details see the (2nd!) Call for submissions post at EBM and Clinical Support Librarians@UCHC

I am sure you all have a story to tell. So please share it with @creaky and us!

As always, you can submit the permalink (URL) (of your post(s) on your blog) here.

************

I would also like to take the opportunity to ask if there are any med- or medlib-bloggers out there who would like to host the MedLibs Round in August, September or October.

The MedLibs Round is still called the MedLibs Round because I got too little response (6 votes including mine) to the poll with other name suggestions. Nor did I get any suggestions regarding the design of the MedLibs logo that Robin of Survive the Journey has offered to make [for details see request here]. I hope you will take the time to fill in the poll below, and to think about any suggestions for a logo. Thanks!

@ links to the Twitter accounts







