The Scatter of Medical Research and What to do About it.

18 05 2012

Paul Glasziou, GP and professor in Evidence Based Medicine, co-authored a new article in the BMJ [1]. Like another paper [2] that I discussed before [3], this paper deals with the difficulty clinicians have in staying up to date with the literature. But where the previous paper [2,3] highlighted the sheer increase in the number of research articles over time, the current paper looks at the scatter of randomized clinical trials (RCTs) and systematic reviews (SRs) across different journals cited in one year (2009) in PubMed.

Hoffmann et al analyzed 7 specialties and 9 subspecialties that are considered the leading contributors to the burden of disease in high-income countries.

They followed a relatively straightforward method for identifying the publications. Each search string consisted of a MeSH term (controlled term) to identify the selected disease or disorder, a publication type [pt] to identify the type of study, and the year of publication. For example, the search strategy for randomized trials in cardiology was: “heart diseases”[MeSH] AND randomized controlled trial[pt] AND 2009[dp]. (When searching “heart diseases” as a MeSH term, narrower terms are also searched.) Meta-analysis[pt] was used to identify systematic reviews.
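
The three-part search string described above can be sketched in a few lines of Python. This is only an illustration of how such strings are put together: the function name `build_query` is my own, and the strings use straight quotes as you would type them into PubMed.

```python
def build_query(mesh_term: str, publication_type: str, year: int) -> str:
    """Combine a MeSH term, a publication type and a publication year.

    Hypothetical helper: it only assembles the text of the search string,
    it does not contact PubMed.
    """
    return f'"{mesh_term}"[MeSH] AND {publication_type}[pt] AND {year}[dp]'


# The cardiology example from the paper, for trials and for "systematic reviews".
rct_query = build_query("heart diseases", "randomized controlled trial", 2009)
sr_query = build_query("heart diseases", "meta-analysis", 2009)

print(rct_query)
print(sr_query)
```

Swapping in another MeSH term gives the corresponding string for any of the other (sub)specialties.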

Using this approach Hoffmann et al found 14 343 RCTs and 3214 SRs published in 2009 in the field of the selected (sub)specialties. There was a clear scatter across journals, but this scatter varied considerably among specialties:

“Otolaryngology had the least scatter (363 trials across 167 journals) and neurology the most (2770 trials across 896 journals). In only three subspecialties (lung cancer, chronic obstructive pulmonary disease, hearing loss) were 10 or fewer journals needed to locate 50% of trials. The scatter was less for systematic reviews: hearing loss had the least scatter (10 reviews across nine journals) and cancer the most (670 reviews across 279 journals). For some specialties and subspecialties the papers were concentrated in specialty journals; whereas for others, few of the top 10 journals were a specialty journal for that area.
Generally, little overlap occurred between the top 10 journals publishing trials and those publishing systematic reviews. The number of journals required to find all trials or reviews was highly correlated (r=0.97) with the number of papers for each specialty/ subspecialty.”

Previous work already suggested that this scatter of research has a long tail. Half of the publications appear in a small minority of journals, whereas the remaining articles are scattered across many journals (see Fig. below).
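
To make the "number of journals needed to locate 50% of trials" measure concrete, here is a minimal sketch of how it could be computed from a list with one journal name per article. The function name and the sample data are made up for illustration; this is not the authors' code.

```python
from collections import Counter


def journals_needed(journal_per_article, fraction=0.5):
    """Return how many of the most productive journals are needed
    to cover the given fraction of all articles."""
    counts = Counter(journal_per_article)
    total = sum(counts.values())
    covered = 0
    # Walk through the journals from most to least productive.
    for needed, (_journal, n_articles) in enumerate(counts.most_common(), start=1):
        covered += n_articles
        if covered >= fraction * total:
            return needed
    return 0  # empty input


# Made-up sample: one journal name per trial.
sample = ["Journal A"] * 5 + ["Journal B"] * 3 + ["Journal C"] + ["Journal D"]

print(journals_needed(sample))        # journals needed for 50% of the trials
print(journals_needed(sample, 1.0))   # journals needed for all trials
```

In a long-tailed distribution the first number stays small while the second grows with every extra journal that published only one or two trials.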

Figure from BMJ 2012;344:e3223 [CC]; see the legends there.

The good news is that SRs are less scattered and that general journals appear more often in the top 10 journals publishing SRs. Indeed for 6 of the 7 specialties and 4 of the 9 subspecialties, the Cochrane Database of Systematic Reviews had published the highest number of systematic reviews, publishing between 6% and 18% of all the systematic reviews published in each area in 2009. The bad news is that even keeping up to date with SRs seems a huge, if not impossible, challenge.

In other words, it is not sufficient for clinicians to rely on personal subscriptions to a few journals in their specialty (which is common practice). Hoffmann et al suggest several solutions to help clinicians cope with the increasing volume and scatter of research publications.

  • a central library of systematic reviews (but apparently the Cochrane Library fails to fulfill such a role according to the authors, because many reviews are out of date and are perceived as less clinically relevant)
  • registry of planned and completed systematic reviews, such as PROSPERO (this makes it easier to locate SRs and reduces bias)
  • Syntheses of evidence and synopses, like the ACP Journal Club, which summarizes the best evidence in internal medicine
  • Specialised databases that collate and critically appraise randomized trials and systematic reviews, like www.pedro.org.au for physical therapy. In my personal experience, however, this database is often out of date and not comprehensive
  • Journal scanning services like EvidenceUpdates (from mcmaster.ca), which scans over 120 journals, filters articles on the basis of quality, has practising clinicians rate them for relevance and newsworthiness, and makes them available as email alerts and in a searchable database. I use this service too, but apart from the fact that not all specialties are covered, the rating of evidence may not always be objective (see previous post [4])
  • The use of social media tools to alert clinicians to important new research.

Most of these solutions are (long) existing solutions that do not or only partly help to solve the information overload.

I was surprised that the authors didn’t propose the use of personalized alerts. PubMed’s My NCBI feature allows you to create automatic email alerts on a topic and to subscribe to electronic tables of contents (which could include ACP Journal Club). Suppose that a physician browses 10 journals roughly covering 25% of the trials. He/she does not need to read all the other journals from cover to cover to avoid missing one potentially relevant trial. Instead it is far more efficient to perform a topic search to filter relevant studies from journals that seldom publish trials on the topic of interest. One could even use the search of Hoffmann et al to achieve this.* In reality, though, most clinical researchers will have narrower fields of interest than all studies about endocrinology and neurology.

At our library we are working on creating deduplicated, easy-to-read alerts that collate the tables of contents of certain journals with topic (and author) searches in PubMed, EMBASE and other databases. There are existing tools that do the same.
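
The deduplication step of such a collated alert can be sketched as follows. This is not our actual tool, just a minimal illustration under the assumption that every record is a dictionary with an optional "pmid" and an optional "doi"; the field names and the sample records are invented.

```python
def record_keys(record):
    """Collect the identifiers by which a record can be recognised."""
    keys = set()
    if record.get("pmid"):
        keys.add(("pmid", str(record["pmid"])))
    if record.get("doi"):
        keys.add(("doi", record["doi"].lower()))  # DOIs are case-insensitive
    return keys


def merge_alerts(*sources):
    """Merge several result lists, keeping the first copy of each article."""
    seen = set()
    merged = []
    for records in sources:
        for record in records:
            keys = record_keys(record)
            if keys & seen:      # already delivered by an earlier source
                continue
            seen |= keys
            merged.append(record)
    return merged


# Invented sample data: a table of contents, a PubMed topic search, an EMBASE search.
toc = [{"pmid": "111", "doi": "10.1000/a", "title": "Trial A"}]
pubmed_topic = [{"pmid": "111", "title": "Trial A"}, {"pmid": "222", "title": "Trial B"}]
embase_topic = [{"doi": "10.1000/A", "title": "Trial A"}, {"doi": "10.1000/c", "title": "Review C"}]

titles = [r["title"] for r in merge_alerts(toc, pubmed_topic, embase_topic)]
print(titles)
```

Records without any identifier are always kept, since there is nothing to match them on; a real tool would probably also compare titles.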

Another way to reduce the individual (reading) workload is to organize journal clubs or, even better, regular CATs (critically appraised topics). In the Netherlands, CATs are a compulsory item for residents. A few doctors do the work for many. Usually they choose topics that are clinically relevant (or for which the evidence is unclear).

The authors briefly mention that their search strategy might have missed some eligible papers and included some that are not truly RCTs or SRs, because they relied on PubMed’s publication type to retrieve RCTs and SRs. For systematic reviews this may be a greater problem than recognized, for the authors used meta-analysis[pt] to identify systematic reviews. Unfortunately PubMed has no publication type for systematic reviews, but it is clear that there are many more systematic reviews than meta-analyses. Systematic reviews might even have a different scatter pattern than meta-analyses (i.e. the latter might be preferentially published in core journals).

Furthermore not all meta-analyses and systematic reviews are reviews of RCTs (thus it is not completely fair to compare MAs with RCTs only). On the other hand it is a (not discussed) omission of this study, that only interventions are considered. Nowadays physicians have many other questions than those related to therapy, like questions about prognosis, harm and diagnosis.

I did a small, imperfect search just to see whether the use of search terms other than meta-analysis[pt] would have any influence on the outcome. I searched for (1) meta-analysis[pt] and (2) systematic review[tiab] (title and abstract) in papers about endocrine diseases. Then I subtracted 1 from 2 (to analyse the systematic reviews not indexed as meta-analysis[pt]).

Thus:

(ENDOCRINE DISEASES[MESH] AND SYSTEMATIC REVIEW[TIAB] AND 2009[DP]) NOT META-ANALYSIS[PT]

I analyzed the top 10/11 journals publishing these study types.

This little experiment suggests that:

  1. the precise scatter might differ per search: apparently the systematic review[tiab] search yielded different top 10/11 journals (for this sample) than the meta-analysis[pt] search. (partially because Cochrane systematic reviews apparently don’t mention systematic reviews in title and abstract?).
  2. the authors underestimate the number of systematic reviews: simply searching for systematic review[tiab] already found approximately 50% additional systematic reviews compared to meta-analysis[pt] alone
  3. As expected (by me at least), many of the SRs and MAs were NOT dealing with interventions, e.g. see the first 5 hits (out of 108 and 236 respectively).
  4. Together these findings indicate that the true information overload is far greater than shown by Hoffmann et al (not all systematic reviews are found, and of all available study designs only RCTs are searched).
  5. On the other hand this indirectly shows that SRs are a better way to keep up to date than suggested: SRs also summarize non-interventional research (the ratio of SRs of RCTs to individual RCTs is much lower than suggested)
  6. It also means that the role of the Cochrane Systematic reviews to aggregate RCTs is underestimated by the published graphs (the MA[pt] section is diluted with non-RCT- systematic reviews, thus the proportion of the Cochrane SRs in the interventional MAs becomes larger)

Well anyway, these imperfections do not contradict the main point of this paper: that trials are scattered across hundreds of general and specialty journals and that “systematic reviews” (or meta-analyses really) do reduce the extent of scatter, but are still widely scattered and mostly in different journals to those of randomized trials.

Indeed, personal subscriptions to journals seem insufficient for keeping up to date.
Besides supplementing subscriptions with methods such as journal scanning services, I would recommend the use of personalized alerts from PubMed and several prefiltered sources, including an EBM search engine like TRIP (www.tripdatabase.com/).

*but I would broaden it to find all aggregate evidence, including ACP, Clinical Evidence, syntheses and synopses, not only meta-analyses.

**I do appreciate that one of the co-authors is a medical librarian: Sarah Thorning.

References

  1. Hoffmann, T., Erueti, C., Thorning, S., & Glasziou, P. (2012). The scatter of research: cross sectional comparison of randomised trials and systematic reviews across specialties BMJ, 344 DOI: 10.1136/bmj.e3223
  2. Bastian, H., Glasziou, P., & Chalmers, I. (2010). Seventy-Five Trials and Eleven Systematic Reviews a Day: How Will We Ever Keep Up? PLoS Medicine, 7 (9) DOI: 10.1371/journal.pmed.1000326
  3. How will we ever keep up with 75 trials and 11 systematic reviews a day (laikaspoetnik.wordpress.com)
  4. Experience versus Evidence [1]. Opioid Therapy for Rheumatoid Arthritis Pain. (laikaspoetnik.wordpress.com)




Another bug in My NCBI?

15 10 2008

This bug is now fixed (15-11-2008) !!!

——————————————————

It is confusing, but each week I have another post on the appearance, disappearance or reappearance of a bug in PubMed’s My NCBI:

For me, combining a collection with searches is an essential feature of My Collections. Often, when I develop a sensitive search, I collect all relevant studies, especially the ones that were not in my search (i.e. found by checking references or ‘related articles’). Then I optimize the search and hope all the relevant records will be found. This can be checked by combining (a) search(es) with the collection(s). If the search is good, all relevant records will be found.

Of course this will only work when you CAN combine the collection from My NCBI with one or more searches in the History.
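
Outside PubMed, the same check comes down to simple set arithmetic on record identifiers. The sketch below assumes you have exported the PMIDs of the search and of the collection as two lists; the function name and the numbers are invented.

```python
def check_search(search_pmids, collection_pmids):
    """Compare a search result with a collection of known relevant records.

    Returns the fraction of the collection that the search retrieved,
    plus the relevant records the search missed.
    """
    search = set(search_pmids)
    collection = set(collection_pmids)
    found = collection & search    # like: collection AND search
    missed = collection - search   # like: collection NOT search
    recall = len(found) / len(collection) if collection else 1.0
    return recall, sorted(missed)


# Invented identifiers: 4 known relevant records, the search finds 3 of them.
relevant = ["101", "102", "103", "104"]
retrieved = ["101", "102", "104", "999"]

recall, missed = check_search(retrieved, relevant)
print(recall, missed)
```

The missed records are the interesting ones: their titles, abstracts and MeSH terms usually show which terms are still lacking in the search.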

A cumbersome workaround, which only works for one collection at a time, is to send the collection (executed in PubMed) to the Clipboard and combine this set (#0) with the searches, but I prefer a simpler solution. In fact, this has always been possible in the past….

Well we will write again to the help desk.
Hopefully I will report the bug repair next week and there will be no follow up.

—————————-

For the second time there is a bug in My NCBI. This time it concerns “My Collections”. When you activate a collection, the corresponding records (39 items in the example) are run in PubMed, but they do not end up in the History.

I find that very annoying, because I mainly use My Collections to set up extensive searches.

I save all relevant articles in My Collections and run them at a later time. Then I combine them with one or more searches. That way I can check whether such a search finds all relevant articles (e.g. received from a client or found via related articles). If it does not, this is a way to find missing terms.

This procedure no longer works, because a set from My Collections does not end up in the History.

I did come up with a temporary workaround, namely sending these items to the Clipboard in PubMed, so that they still appear in the History as set #0. Of course that only works with 1 set at a time and is rather cumbersome.

By the way, this always used to work, so it is probably down to the hasty ‘repairs’ and adjustments again.

Well, that means another email to the help desk.

Hopefully it will be fixed soon and you will not hear from me for a while..





Bug My NCBI repaired

8 10 2008

I’m pleased to announce that the bug in PubMed’s My NCBI, that I pointed out a week ago, has been repaired.

For two weeks, since the update of My NCBI, searches composed of set numbers were saved incorrectly in My NCBI, that is, literally as #1 AND #2, or in the example I gave as: #3 + RCT filter, instead of: hirsutism and spironolactone (+ RCT filter), which was the actual search behind it (see Figures below).

This was the response I just received from someone at the U.S. National Library of Medicine:

“You can now save searches with search statement (aka History) numbers. Unfortunately, any that you recently created that didn’t work are not going to work, so please delete those.

As part of the fix, we made some changes to how links for saved search names work in My NCBI. On the screen where you used to see “View Results,” use the search name to link to run the search in PubMed. The “Edit” link now takes you to where you can change the specs of the search. These changes are not yet finished. When we have things running normally we will provide more detailed information in our newsletter.

Thank you for your patience.”

I’ve checked it and it really works. Thank god it does. It is really an essential feature, especially for the inexperienced searcher: the (correct number of) brackets are automatically put in the right positions.

I’m also pleased with the way the saved searches are presented. It is far more logical that the search is executed when you click on the underlined name (which looks like a link) and that you can edit where it says “edit”.

I’m looking forward to the other enhancements.

search was erroneously saved as (#3) AND ....

Search is now saved as: (((hirsutism) AND (spironolactone) AND ....

The old (wrong) and updated search in My NCBI (in the new layout)





About “1 AND 2 = 3” in My NCBI

1 10 2008

The PubMed My NCBI feature has been updated. The navigation is entirely different and -in my view- less intuitive and more complex. The increased complexity may relate to the new features, some seeming rather unnecessary (filters), others looking promising: my bibliography, persistent cookies, no limit to the number of saved searches or collections per account (hurray!).

You can find details about the My NCBI changes in the NLM-bulletin and in MyNCBI-help.

For now, I just want to address one point, that hopefully is a “temporary error”.

I noticed it last Friday and thought that it was just one of those technical errors that occur frequently in PubMed these days and are fixed without any notice.

But the mistake (?) is still there. It is about HOW PubMed searches are saved.

Before, if you combined two sets, say “#1 AND #2”, set #3 would be created: #1 AND #2.
If you saved #3 in My NCBI, you would save the entire search behind #1 AND #2, but now only the string “#1 AND #2” is saved. You can easily imagine that the set numbers #1 and #2 are only meaningful if sets #1 and #2 are still present and the same as in the original search.
A Dutch colleague just called out that he got an error message when trying to execute a saved search. Set X was not recognized….

Example.

Suppose you want to find an answer to the following question: Is spironolactone useful (compared to cyproterone acetate for instance) to reduce hirsutism in women with PCOS?

You search for:

  • hirsutism (#1) and spironolactone (#2) (checking that these are mapped to the appropriate MeSH using Details)
  • combine the two sets with AND.
  • Subsequently combine #3 with a narrow filter for the Therapy Domain (filter for RCT’s) in the Clinical Queries.
  • Set #4 (=#3 AND filter) gives 23 results.
  • You save set #4 in My NCBI.
  • But what happens:
    It is saved as #3 AND filter, not as: hirsutism AND spironolactone AND filter.
    Reexecuting the search if the original History is gone yields 0 results (or an erroneous result).
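
The difference between saving the set numbers and saving the search behind them can be shown with a small sketch. It is only my illustration of the principle, not how PubMed does it internally; the History is mimicked by a dictionary and RCT_FILTER is a placeholder for the actual Clinical Queries filter string.

```python
import re


def expand(query, history):
    """Replace each #N by the (fully expanded) search statement it stands for."""
    def replace(match):
        number = int(match.group(1))
        # A set number that is no longer in the History raises a KeyError.
        return "(" + expand(history[number], history) + ")"
    return re.sub(r"#(\d+)", replace, query)


# Mimicked Search History for the hirsutism example.
history = {
    1: "hirsutism",
    2: "spironolactone",
    3: "#1 AND #2",
    4: "#3 AND RCT_FILTER",  # placeholder, not the real filter
}

saved_wrongly = history[4]                    # only meaningful while set #3 exists
saved_correctly = expand(history[4], history)

print(saved_wrongly)
print(saved_correctly)
```

The first string depends on whatever happens to be set #3 at the moment the search is rerun; the second one is self-contained and can be rerun in any session.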

Personally I can circumvent most problems, because I optimize my searches in Word (also nice as safeguard when the PubMed servers are overheated), but for most users this is an unnecessary extra step.

I hope this bug (?, I hope it is a bug) is quickly fixed by NLM.

Please inform them by writing to the PubMed helpdesk (at the bottom of the PubMed front page). I will do the same.





Blog Spam and Spam Blogs (2)

14 09 2008

In a previous post I gave two examples of Health Blogs that are really pills-selling-sites. In this post I will show two examples of real Spam Blogs.

Spam blogs or splogs are usually fake weblogs whose content is often either inauthentic text or merely stolen (scraped) from other websites. This spam artificially increases the site’s search engine ranking, increasing the number of potential visitors.

Database-management blog: no longer exists

Original post at this blog above and comment below.

One Spam blog that I wanted to show you, is no longer available. It is called Database Management.

Technorati-profile (authority=51)

This blog had no content of its own, but scraped it from blog posts having the (WordPress?) tag “database”. Although the post does link to the original site, it doesn’t give the author’s proper name, but some automatically generated fake name. For instance Shamisos instead of Laikaspoetnik (see Fig).

When I tried to place a comment on their site I had to log in to the WordPress account (although I was already logged in to mine). That’s when I began to really distrust it.

Its Technorati profile still exists (see Fig.). It is clear that the blog rapidly increased its “authority” in the few months it existed. From zero to 51.
Many blogs linking to this blog are also gone or peculiar. Other blogs might have just linked to the spam blog because they assumed that this was the original post, not the copy. Presumably by having so much content on ‘database management’ the splog gets more traffic (of the preferred kind). This might be an example of a splog that backlinks to a portfolio of affiliate websites, to artificially inflate paid ad impressions from visitors, and/or as a link outlet to get new sites indexed (Wikipedia).

The second example of a spam blog is a very interesting site for Medical Librarians: Generic Pub, with the web address http://genericpubmed.com/pub/, with posts about PubMed. Really high quality information. Why? Because the posts derive from elsewhere. All of my posts about PubMed are in there, as are those of my colleagues, and perhaps your posts as well. There is no clue as to where the post really came from. You don’t get any pingbacks, unless the (original) post linked to you. That’s how I found out. As with the other spam blogs you cannot comment. Comments are always closed.

one of my posts on Generic Pub

The blogroll of Generic Pub

Generic PubMed homepage

The site does not hide its real intentions. To the left is a huge pill “cialis” and the blogroll consists of only pills, as well as PubMed tag feeds of Technorati and WordPress.

If you strip the web address down to http://genericpubmed.com you arrive at the homepage, which is unmistakably a pharmaceutical e-commerce website. Why is this done? Perhaps the site looks more reliable with all those PubMed posts, or perhaps the site is easier to find.

One way or another, these two sites steal posts from other sites. Tags used by Technorati or by WordPress can easily be transformed into a feed, which makes it very easy for these spam bloggers to automatically import blog posts with a certain tag.
By the way, did you find your post in there?

Previous post, see here.

————————————————————————–

In an earlier post I gave 2 examples of blogs whose real aim is to sell pills.

Now 2 examples of real spam blogs.

According to Wikipedia: spam blogs or splogs are usually fake weblogs, whose content is often more or less stolen (“scraped”) from other websites. This raises the ranking in search engines and increases the number of visitors.

One spam blog that I wanted to show you is no longer available, namely Database Management.

This blog had lifted all its content from posts with the (WordPress?) tag “database”. It does link to the original site, but the name of the author is replaced by some automatically generated name, e.g. Shamisos instead of Laikaspoetnik (see Fig. above).

When I wanted to place a comment on this site, I was forced to log in to WordPress, even though I was already logged in. From that moment on I really did not trust it any more.

The Technorati profile of this site still exists (see Fig. above). In a few months the blog rose from 0.0 to 51 in “authority”.
Many blogs that link to this blog have also been discontinued or are suspicious. Other blogs may have linked to this splog merely by accident, because they thought they were dealing with the original post, not the copy. Presumably the splog gets more traffic this way from people who are specifically interested in database management. Possibly this is a splog that links back to a number of clones and vice versa (Wikipedia).

The 2nd example of a splog is a very interesting site for medical information specialists, namely Generic Pub with the web address: genericpubmed.com/pub. All very high-quality posts about PubMed. But they are stolen. All my posts with the tag PubMed can be found there, as can those of my colleagues and perhaps your posts too.
Nowhere can the true origin of the posts be traced. The real authors normally do not get a pingback, only if the original post contains a link to them. That is actually how I found out. As with the other splogs, you cannot leave a comment.

The website does not hide its real intentions. On the left is a gigantic pill “cialis” and the blogroll contains only names of pills, as well as the feeds of the PubMed tags of Technorati and WordPress.
If you strip the web address down to: genericpubmed.com you arrive at the homepage, unmistakably an e-commerce site. Why hide behind such a blog? Does it make the site look more reliable, or do potential customers find the site more easily?

Either way, these 2 sites steal from other websites. Taking a feed of Technorati or WordPress tags is a piece of cake, and this makes it very easy for these spam bloggers to automatically import blog posts with a certain tag.
By the way, have you traced your post yet?

Previous post in this series, see here.





PubMed Search Clinic on ATM, Citation Sensor, Advanced Search: Video available.

21 07 2008

The video from the online Search clinic on recent PubMed changes, announced in a previous post is now available at: nlm.nih.gov (pmupdate08): click here.

Direct link to the video only: https://webmeeting.nih.gov/p91519064/

A good coverage is given by Michelle Kraft (Krafty Librarian) at her site (click here).

The clinic, presented by Katherine Majewski, covered recent changes to PubMed, described earlier in the NLM information bulletins on the new ATM and the Beta Advanced Search page.
Recent changes have also been amply described (and discussed) at several of my previous posts, most notably this one.

Here is an overview, with emphasis on new aspects (at least to me).

Citation Sensor:

In the clinic the citation sensor was defined as: “a new feature designed for users seeking specific citations”. However, it is not a separate search box. The citation sensor works automatically when you type words into the general search bar. If combinations of words are recognized as representing citations (e.g. volume numbers, author names, journal titles), the matches are displayed in a yellow box above the retrieval.

In my previous post I already discussed that the sensor doesn’t always work perfectly and like Krafty, I think that the Single Citation Matcher (in the blue side bar) performs better. It suggests author and journal titles as you write them. Furthermore, you can just fill in the specific information you know in specific fields, i.e. if the author name is misspelled/wrong, it often suffices to fill in year, page number and title word(s), to name just one possible combination. In response to a question, Majewski said the sensor is not an advantage per se as opposed to the Single Citation Matcher. Probably it is just handy for people used to a Google-like way of searching.

One thing new to me was that there are two “Details” when performing a search.

When you type: choi blood 2008, the citation sensor finds 6 hits, 3 of them shown in the yellow box.
The Details button shows: choi[All Fields] AND (“blood”[Subheading] OR “blood”[All Fields] OR “blood”[MeSH Terms]) AND 2008[All Fields].

However, when you click 6 articles to see them all, the Details button shows how the citation sensor has translated the search into: choi[Author] AND (blood[Author] OR “Blood”[Journal]) AND 2008[Publication Date]

Thus in fact the search is translated twice (although the citation sensor-results are always a subset of the full results). If you click on 6 articles, the 2nd translation appears as a 2nd search in the Search History.

ATM – Automatic Term Mapping.

ATM has been changed in conjunction with the citation sensor in order to identify queries that contain citation-type information. The old ATM mapped search terms to the subject, journal, and author tables, in that order. If a MeSH match was found, PubMed would search for that MeSH term as well as for the user input as a textword (title, abstract). Automatic term mapping would then stop, because it had found a match with MeSH. Thus citations whose terms appear not only in the MeSH table but also in the author or journal table would have been missed, as in Burns Laryngoscope 2005. The old ATM would map Burns and Laryngoscope as MeSH (subject search), but the new ATM also searches these terms in ‘all fields’, thus enabling the retrieval of the paper by Burns in Laryngoscope.
In the Q & A part of the session Majewski advised using qualifiers such as [MeSH] when Burns is searched purely as a topic. I only wonder if and how most untrained users would find this out.

Another consequence, not really addressed here, is that multi-word terms are split and the words searched individually. With the new ATM, gene therapy is searched not only as the phrase gene therapy (as MeSH term and textword) but also as “gene”[All Fields] AND “therapy”[All Fields], which leads to a far greater retrieval (almost 250%). Few of these extra hits are relevant (see previous post).
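
The effect of splitting can be illustrated with a strongly simplified sketch of the translation described above. This is my own reconstruction for illustration only, certainly not NLM's actual mapping algorithm, and the exact field tags PubMed shows under Details may differ.

```python
def new_atm_translation(term):
    """Simplified picture of how a term that matches a MeSH heading is
    translated: as MeSH, as a phrase, and (new) as separate words."""
    words = term.split()
    parts = [f'"{term}"[MeSH Terms]', f'"{term}"[All Fields]']
    if len(words) > 1:
        # The new element: every word searched on its own, combined with AND.
        split = " AND ".join(f'"{word}"[All Fields]' for word in words)
        parts.insert(1, f"({split})")
    return " OR ".join(parts)


print(new_atm_translation("gene therapy"))
print(new_atm_translation("burns"))
```

Because the parts are combined with OR, every record that merely contains both words somewhere (for instance a paper on drug therapy that mentions a gene) is now retrieved as well, which explains the extra noise.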

Statistics, however, show that across thousands of ‘real’ queries the new ATM returned only 10% extra hits on average (see ATM-FAQ for more information). According to NLM, the enhanced ATM and citation sensor have considerably improved searching PubMed, probably because most people just come to PubMed to look for a specific paper or subject (running one or two search commands). The new features enhance citation searches, while subject searches do not suffer too much as long as multiple terms (concepts) are used, as this filters out much of the noise seen with one term (because the term is searched within the context of the other word).

My remark that most of my patrons do do subject searches was interpreted as “do do broad searches“. Which in effect they do (i.e. searches for systematic reviews), but I do not think the suggested NCBI books might be very helpful to them, although it might indeed serve those people (patients?) that want information about broad subjects like “burns”. Perhaps PubMed/NCBI can offer subject searchers other tools as well.

Notably, based on user input there are now (as of July 2nd) some exceptions to the new ATM-rule:
Substance names (such as ferrous gluconate) and
MeSH with stand alone letters or numbers (like complement factor B) will not be broken apart, but searched as a phrase.

Advanced Search (Beta-version)
Advanced Search is amply discussed in a previous post. However, I didn’t mention that the page consists of 4 collapsible boxes beneath the Search Bar (I missed this: you have to click a small minus sign at the upper left of each box in order to collapse it). In essence you can search by many fields, the default fields displayed being Author, Journal, and Publication Date (box 2) and All Fields (box 4). There is an index available for each selected field (little buttons to the right of the search boxes). I see no difference between box 2 and box 4 other than the default field and the fact that you can only make multiple choices from the index in box 4. Answering a question from the audience, Majewski said they might consider allowing multiple choices in box 2 as well.
Box 3 shows limit options, much the same as the Limits tab on the usual front page, except that you can use the lock icon to stop your limits from being applied to future searches (by default, limits are carried over to future searches).

Thus again this new ‘enhancement’ mainly facilitates citation searches, not subject searches. Clinical Queries are absent and it is for instance not possible to look up any MeSH other than by index, and even this often goes wrong with multi-word terms. The question why MeSH-trees were unavailable in the beta-version remained unanswered at the clinic.
It was a relief though to hear that there were no intentions to replace the normal PubMed frontpage by this advanced search page in due course.

Katherine Majewski ended the clinic by saying that answers to the questions posed during the clinic would be shown at this NLM-page later. She also encouraged attendees to give positive and negative feedback by writing to the NLM customer service, and to be as specific as possible if a search was negatively affected by the recent PubMed changes.

——————————-

De video van de PubMed Search Clinic, die ik in een eerder bericht aankondigde is nu te zien op: http://www.nlm.nih.gov/bsd/disted/clinics/pmupdate08.html.

Directe link naar de video: klik hier

Michelle Kraft (Krafty Librarian) heeft de clinic al goed op haar blog samengevat.

De webpresentatie, gegeven door Katherine Majewski, behandelde de recente PubMed-veranderingen, zoals aangekondigd in de NLM informatiebulletins (gewijzigde ATM-mapping resp. Beta Advanced Search)
Eerder heb ik deze veranderingen ook al uitgebreid beschreven en becommentarieerd. (zie bijv. hier).

Hier een samenvatting, met nadruk op nieuwe aspecten

Citation Sensor:

In the web presentation the “citation sensor” was described as: “a new feature designed for users seeking specific citations”. It is not a separate search option, however. The citation sensor does its work automatically when you type words into the general search bar. Hits recognized as citations are shown separately on a yellow background.

I noted earlier that the sensor does not always work well and, like Krafty, I think that the Single Citation Matcher (in the blue bar) works much better. It offers word suggestions while you type and lets you fill in each available piece of information specifically. If you don't know an author, you can often make do with year, page number and title words, to name just one combination. According to Majewski the sensor is not necessarily better either. It is probably mainly useful for people who are used to a Google-style way of searching and otherwise know little about PubMed. Personally I would like to be able to switch the citation sensor on or off at will.

Only now did I notice that there are 2 “Details” when the citation sensor maps something.

If you type: choi blood 2008, the sensor finds 6 hits and shows 3.
Under Details you can see that PubMed translates the search as: choi[All Fields] AND (“blood”[Subheading] OR “blood”[All Fields] OR “blood”[MeSH Terms]) AND 2008[All Fields].

If you click on 6 articles to see them all, Details shows how the citation sensor has translated the search: choi[Author] AND (blood[Author] OR “Blood”[Journal]) AND 2008[Publication Date]

So there are actually 2 ‘translation steps’. If you click on 6 articles, the 2nd mapping appears as a search set in the search history.


ATM – Automatic Term Mapping.

Like the citation sensor, ATM has been adapted to make it easier to search for specific articles. The old ATM stopped looking up terms in the MeSH, author and journal lists as soon as a matching MeSH was found. In addition, the typed word was searched as a text word. As a result, terms that occurred both in the MeSH list and in the author or journal list were never searched as anything other than MeSH (and text word). So Burns Laryngoscope 2005 would never have found the article by Burns in Laryngoscope. With the new ATM it does.
Majewski advised using field tags (qualifiers) such as MeSH if you want to search on a subject like ‘Burns’. One does wonder to what extent the average PubMed user knows this.

During the session it was not really addressed that terms consisting of several words are split and searched in all fields. I showed earlier that the new ATM yields 2.5 times more hits with a term like gene therapy and that most of these hits are of little relevance.

According to NLM statistics, real searches yield on average only 10% extra hits (see the ATM FAQ for more info) and searches have improved considerably thanks to the changes. Probably because most people only look something up quickly (1-2 queries) and are mainly interested in specific articles. For them, typing a few terms into the search bar now pays off sooner, and as long as I combine many terms I am not much bothered by noise when searching by subject either. But I am certainly not convinced that this improves subject searching.

My remark that my customers mainly search by subject was taken to mean that they mainly search broadly. That is true, but I don't think they benefit much from suggestions like NCBI books. These seem more suitable for people who want a general introduction to a subject like burns, patients for example. Perhaps PubMed/NCBI has other tools for exhaustive searches in store….

Based on user feedback there have been some exceptions to the new ATM rule since 2 July, namely:
Substance names (such as ferrous gluconate) and
MeSH terms with stand-alone letters and numbers are no longer split, but searched as a phrase.

Advanced Search (beta version)
I have also discussed Advanced Search at length before (see here). What I only notice now is that the boxes below the search line can be expanded and collapsed. There is a minuscule minus sign at the very top left of each box, which you have to click to collapse the box.

The essence of Advanced Search is that you can search many different fields, but the default fields are again citation-oriented, i.e. Author, Journal, and Publication Date (box 2) and All Fields (box 4). You can look up terms for each chosen field in an index (small button on the right). I don't really see a difference between box 2 and 4, apart from the default field and the fact that in the 4th box you can select several terms at once. Possibly this option will also come to box 2.
In box 3 you can select limits, very much like the Limits tab on the PubMed front page. It is nice that you can apply a limit to just one search if you wish (default: it stays on for all searches).

So Advanced Search beta, too, mainly serves those who are looking for particular articles. For instance, you can only look up MeSH in the index and there are no Clinical Queries. The question why the MeSH hierarchy could not be consulted from the beta version remained unanswered.
It was a relief that, according to Majewski, Advanced Search was not intended to replace the normal front page in due course.

Katherine Majewski ended the session by announcing that answers to the questions asked would appear on this page later.

She also asked everyone to report any problems with the changes to the help desk as specifically as possible.





PubMed Online Search Clinic on ATM!

17 07 2008

Just a short note at the last moment.

Back from vacation I picked up some twitter and blog messages announcing a PubMed search clinic offered on July 17 (today!) at 2pm Eastern time (8pm Amsterdam/Paris time, see timetable throughout the world).

A 30 minute online search clinic will be presented by the NLM® and the National Training Center and Clearinghouse (NTCC) via Adobe® Connect™ on Thursday, July 17th (2pm ET). The presentation will cover changes to PubMed including changes to how PubMed handles your search (the new automatic term mapping process), the citation sensor, and the beta Advanced Search page.

There is a maximum capacity of 300 participants, on a first come, first served basis. However, the clinic will be recorded and will be available for viewing later.

To follow the clinic log in at: https://webmeeting.nih.gov/pmupdate08/

or: http://www.nlm.nih.gov/bsd/disted/clinics/pmupdate08.html.
Here you find more info about the clinic, as well as tips for successful participation in the clinic. Be sure to test it beforehand.

Sources:

The Krafty Librarian: @Krafty (twitter) and several posts on her blog.

Nikki (Eagledawgs) guest post on David Rothman’s blog

Background info on what others have blogged about the recent PubMed changes can be found in another Krafty Librarian post and in several of my previous posts, including PubMed: Past, Present And Future, PART II

************************

Just at the last minute.

Just back from vacation I saw some tweets and blog posts announcing a “PubMed search clinic”.

It starts at 8 p.m. (what time where?).

It lasts 30 minutes and covers the recent changes in PubMed: the new ATM (automatic term mapping), the citation sensor and Advanced Search Beta.

300 people can take part, on a first come, first served basis. The clinic will be recorded, however, so you can watch it later.

Log in before 19.00: https://webmeeting.nih.gov/pmupdate08/

More info at: http://www.nlm.nih.gov/bsd/disted/clinics/pmupdate08.html.
Including tips for following the clinic successfully.

Sources:

The Krafty Librarian: @Krafty (twitter) and several blog posts.

Nikki (Eagledawgs), guest post on David Rothman’s blog.

Background info on what others think of the changes can also be found on the Krafty Librarian’s site (see here). Some of my earlier posts, such as PubMed: Past, Present And Future, PART II, are also devoted to them.





PubMed: Past, Present and Future PART III

27 06 2008

The Future: ????

This is a continuation of Part I and II

Part I (click here) describes that PubMed contains many different tools, some of which are quite difficult to use and/or hidden, e.g. the Single Citation Matcher, the MeSH database and Clinical Queries.

This counterintuitive character of the PubMed interface leads to Google-like searches that are often ineffective, driving some people crazy. Anna Kushnir, for instance, started a little riot on her blog by shouting out loud that she hates PubMed. Her rant elicited a response from Dr. Lipman of the NCBI, who reassured her “that a number of changes are underway that will make PubMed work better for her and many other users”.

Part II (click here) describes the new PubMed features recently introduced to meet the wishes of an apparent majority of people that come to PubMed in search for specific information ‘with one finger snap’:

  • ATM has been modified to enable retrieval of citations: multi-word terms are split and sought individually in all fields, including author, address, journal-title field
  • Introduction of a Citation Sensor, that matches searches with citations
  • Advanced Search Beta that allows you to specify fields for searching
  • Disappearance of the blue side bar to play along with new features.

These modifications facilitate retrieval of citations to some extent, though not as effectively as the ‘good old’ Single Citation Matcher and at the cost of effective subject searching. In particular, the renewed ATM leads to an unacceptably low precision (ample examples given).

Part III is about the future, but what the future has in store I do not know. I have some ideas though as to what I would like the (near) future to bring (or NLM/NCBI to change) :

[General]
People come to PubMed with different kinds of backgrounds, information skills, questions and aims. Rather than creating one tool that serves them all, but imperfectly, why not create different tools that serve each group well? Why replace Mercedes cars with Flintstone-mobiles because 90% of the people would rather use their feet? Make Mercedes cars for the other 10%, or teach the Flintstones how to enjoy driving a real motor-driven car!
Thus make it easy on newbies and people just passing by, give them an idea what they might have missed and why, but still enable exhaustive subject searching for those that wish/need to do so. PubMed should not be just Google-like because people are used to that. Number one priority should be that people find what they are looking for! If this means that they have to do a little training: o.k., what’s wrong with that? I agree fully with David Rothman’s view on the anna-kushnir-when-the-user-actually-is-broken story as expressed in his excellent post. I particularly like the parable he gave:

“I remember asking my father to teach me to program in BASIC. He cheerfully agreed and handed me the big brown manual”

PubMed should not imitate its look-alikes, which do an awful lot better with regard to user-friendliness (see for instance here), but generally are NOT very suitable for (more exhaustive) medical subject searching.

[Specific]
At least disconnect reference and subject searching (please??…)

  • The Single Citation Matcher is a perfect tool that, when found, is easy to use for everyone. Why not give it a more self-evident name, like “reference-seeker” OR “find (the) citation” and put it in a more prominent place?
    If NLM/NCBI decides that the general search bar (or PubMed entrance) should be apt for citation searching, why not create a second one for subject searching? Perhaps give people (optional) tips how to continue:

Want to look up a reference: Go to the Single Citation Matcher.
Have a medical question? Go to the subject search bar (or -page).
Do you want to find the best evidence? Go to clinical queries.

Institutional or personal customization of the interface would be a pro. The OVID-SP interface has many of the above characteristics.

  • For subject searching, the old features suffice. Indeed:
    • Give back the blue side bar to reduce the number of clicks needed to (re)enter the MeSH-database, Clinical Queries, Single Citation Matcher etc.
    • Undo the new ATM feature! The only thing that is ‘enhanced’ is the number needed to read. It’s awful!
    • No Advanced Search Beta in the present form, only some of its features, like locking/unlocking some of the limits and multiple field-selection in the index.
      The idea of boxes is nice though.
      MeSH-fields should be default in any new (advanced) interface, as are Clinical Queries and the MeSH-database.
  • [Dreams]
    • Two different interfaces: for simple and for advanced searches. The first may look like advanced search beta (but with optional boxes), the second with an interface that facilitates comprehensive searching, i.e. staying within the History, powerful tools always one click away, easy navigating and sending terms from MeSH-database to PubMed (no ‘Send to Search Box’).
    • Possibility to save and edit the History and not just one search (like in OVID) and perhaps, perhaps, the adjacency function?
    • All important tools like MyNCBI, RSS, MeSH, more ‘visible and intuitive’ for all.
    • Modernization of MeSH (especially in the non-clinical field) and one MeSH for one concept, i.e. not: (Protein Kinase Inhibitors AND Receptor, Epidermal Growth Factor/antagonists & inhibitors) for EGFR tyrosine kinase inhibitors.

How can (some of) these changes be achieved? “Should I shout out loud: I hate PubMed” in order to be heard? No way. I like PubMed. In essence it is a powerful tool freely available to everyone.

But I hope that PubMed (NCBI/NLM) will not merely watch statistics and listen to the voice of the clamorous crowd, but will also listen to the few expert librarians, who represent a large community. They often know the information needs of their clients and the barriers in the information-seeking process very well, since they help and train them every day…
—————————-

NL flag

The future: ????

This is a continuation of Parts I and II

Part I (click here) discusses that PubMed has many search functions, which are often rather complex and hard to find, such as the search bar, the Single Citation Matcher, the MeSH database and Clinical Queries.

Because PubMed comes across as so complicated, people tend to search via the search bar as they would in Google, with disappointing results. Out of ignorance they shift the blame onto PubMed. Anna Kushnir, for instance, ranted loudly on her weblog that she hated PubMed. Dr. Lipman (NCBI) responded with the message: “a number of changes are underway that will make PubMed work better for her and many other users”.

Part II (click here) describes the new search functions that were recently introduced to meet the wishes of those who apparently form the majority: people who quickly search PubMed to find something:

  • ATM has been modified so that PubMed can also find citations: terms are now searched in all fields and split if they consist of several words.
  • Citation Sensor, which ‘recognizes’ citations.
  • Advanced Search Beta, in which you can search specific fields.
  • Disappearance of the blue side bar.

Thanks to these changes specific references can sometimes be found a little more easily, but not nearly as effectively as with the ‘good old’ Single Citation Matcher, and at the cost of an unacceptable amount of noise, mainly due to the new ATM.

Part III is about the future. What it will bring I obviously don't know. I do know what I would like to see changed.

[General]
PubMed users differ in background, search skills, questions and aims. Why make all these people search in the same way; why not offer different options for different users? Why would you want to replace Mercedes cars with Flintstone cars because 90% of people prefer to use their feet? Make Mercedes cars (or Opel Astras) for the other 10%, or teach the Flintstones how to drive a petrol-powered car.
Make it easy for beginners or people who just want to look something up quickly, point out what they may be missing and why, but keep it possible to do extensive searches in an easy way. Surely PubMed doesn't have to resemble Google just because people are used to Google? What matters most is that people find what they are looking for. If that means they have to put a little effort into learning it, that's o.k., isn't it?! I fully agree with David Rothman’s view on the Anna Kushnir affair. This comparison is quite apt too:

“I remember asking my father to teach me to program in BASIC. He cheerfully agreed and handed me the big brown manual”

PubMed should not try to imitate its copies either. These are much more user-friendly (see for instance here), but usually not particularly suited to extensive subject searching. And that is precisely the added value of PubMed.

[Specific]
Disconnect citation searching from subject searching.

  • The Single Citation Matcher is extremely well suited to finding citations and easy to use. It should be easier to find and have a more self-evident name, such as “reference-seeker” or “find (the) citation”.
    If NLM/NCBI decides that the general search bar should mainly find citations, why not have a 2nd bar/page for searching by subject? Perhaps with some (optional) tips:

Want to look up a reference: Go to the Single Citation Matcher.
Have a medical question? Go to the subject search bar (or -page).
Do you want to find the best evidence? Go to clinical queries.

It would be nice if you could change the default settings, as in OVID-SP.

  • As far as subject searching is concerned, the old PubMed largely suffices, so:
    • Give us back the blue menu bar, so that we don't have to click endlessly to get (back) to the MeSH database, Clinical Queries and Single Citation Matcher.
    • Give us back the old ATM! The only thing that has ‘gone up’ is the “number needed to read”. So many more articles and so much more noise.
    • No Advanced Search Beta, or at most in an adapted form. Locking certain limits and being able to select multiple fields are good changes.
      The boxes (for searching, search history, limits etc.) are not a bad idea either.
      MeSH index fields should be present by default, as should Clinical Queries and the MeSH database.
  • [Dreams]
    • Two different interfaces for beginners and advanced users. The 1st could resemble Advanced Search Beta (but with MeSH fields), the 2nd should enable comprehensive searching. You would want to stay within the search history, have all important tools within reach and navigate easily from PubMed to the MeSH database and vice versa (not via the ‘Send to Search box’, for example).
    • The possibility to save and edit the entire search history and perhaps, perhaps …. adjacency searching (as in OVID)?
    • All important search options such as MyNCBI, RSS and MeSH more clearly visible and easy to use.
    • Bringing MeSH up to date, especially for the preclinical fields, and one MeSH for one concept, e.g. not: (Protein Kinase Inhibitors AND Receptor, Epidermal Growth Factor/antagonists & inhibitors) to find ‘EGFR tyrosine kinase inhibitors’.

How can (some of) these changes be achieved? Do I also have to shout that I hate PubMed before I am heard?
Of course not, I don't hate PubMed. It is wonderful that something like PubMed exists. In principle it is an excellent database with a great many possibilities. And what is also very important: it is freely available to everyone.

I just hope that the people behind PubMed (NCBI/NLM) will not only look at the statistics and listen to the voice of the masses, but will also take the opinion of information specialists to heart. For they actually represent a very large group of users and know from experience what their customers are looking for and which problems they run into.





PubMed: Past, Present And Future, PART II

15 06 2008

The Present: PubMed is going for the mass.

This is a continuation of Part I (click here to read)

… Well, it seems that some of these enhancements are in the process of being implemented, considering recent major changes to PubMed’s interface:

1. Automatic Term Mapping (ATM).

ATM is the most recent, most radical and yet most poorly announced change.

Suddenly, when preparing a Master Class, searching via the search bar gave different, sometimes odd results. PubMed looked the same, but the DETAILS-tab showed the automatic term mapping (ATM) to be different. PubMed’s “New and Noteworthy” confirmed that ATM had been drastically modified. See here for the announcement.

Consider this (given) example. Searching gene therapy would give:

with the Old ATM:
“gene therapy”[MeSH Terms] OR gene therapy[Text Word]

and the NEW ATM:
“gene therapy”[MeSH Terms] OR (“gene”[All Fields] AND “therapy”[All Fields]) OR “gene therapy”[All Fields].

Thus the new ATM expands the search:

1. by searching in All Fields instead of the tw-field (Title, Abstract, MeSH)
2. by splitting multi-word terms. Gene therapy is no longer sought as “gene therapy”, but as “gene” and “therapy”.
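
The difference between the two translations can be mimicked in a few lines of Python. This is only an illustrative sketch: the function names are made up, it assumes that the whole phrase happens to match a MeSH term, and it is certainly not how NLM implements ATM.

```python
def old_atm(term: str) -> str:
    """Mimic the old translation: MeSH match plus the phrase as a text word."""
    return f'"{term}"[MeSH Terms] OR {term}[Text Word]'


def new_atm(term: str) -> str:
    """Mimic the new translation: MeSH match, every single word in
    All Fields, and the whole phrase in All Fields."""
    words = term.split()
    parts = [f'"{term}"[MeSH Terms]']
    if len(words) > 1:
        # Multi-word terms are split and the words are combined with AND.
        split = " AND ".join(f'"{w}"[All Fields]' for w in words)
        parts.append(f"({split})")
    parts.append(f'"{term}"[All Fields]')
    return " OR ".join(parts)


print(old_atm("gene therapy"))
print(new_atm("gene therapy"))
```

The middle part of the new translation, the separate words combined with AND in All Fields, is where the extra hits come from.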

According to the NLM this facilitates finding synonyms like “gene silencing therapy…” and finding X in the author field. They should add: whether you WANT TO FIND IT OR NOT. Thus from now on you will search all fields automatically, including author, journal and address field.

Should I be glad to find more? NO, I use the Single Citation Matcher if I want to find a citation by author X, and I would rather expand the search by adding terms that matter.
Suppose I would like to search for ‘gene silencing therapy’ as well; then I would add gene silencing therap*[tiab], since searching for these words as a string will broaden the search without increasing noise.

However, gene silencing (preventing a gene from working, e.g. by antisense oligos or siRNA) is not really a gene therapy (insertion of a gene). So for most searches on ‘gene therapy’, ‘gene silencing’ is no valuable addition. And if it were, MeSH terms like “Gene Silencing” and its narrower term RNA Interference should be included as well.

With gene therapy the new ATM now (June 5th) retrieves 90942 hits instead of 36557, a surplus of 54385 hits; that is 2 ½ times as many!!! The expansion adds very few meaningful terms. It mainly retrieves citations with therapy in ANY field and:

  • gene as an author [au] : 53 extra hits
  • gene in the address field [ad], like hkj@gene.com or Department of Gene Regulation : 1327 extra hits
  • gene in the journal name, including “Gene” : 1487 extra hits
  • gene and therapy in the abstract/MeSH without direct connection to each other: papers about the impact of gene expression profiling on breast cancer outcomes (following chemotherapy NOT gene therapy), of experimental studies on change in gene-regulation following therapy etcetera: the majority of the extra hits. Estimation > 90%?: does anyone realize how often ‘gene’ and ‘therapy’ (in text, MeSH, subheadings and all other fields?) are used outside the context of gene therapy?
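
For those who want to check the arithmetic, here is a small Python sketch using only the hit counts quoted above. It assumes (my assumption, not something I verified) that the author, address and journal groups do not overlap.

```python
old_hits = 36557   # hits with the old ATM, as reported above
new_hits = 90942   # hits with the new ATM on June 5th

surplus = new_hits - old_hits
ratio = new_hits / old_hits

# Extra hits itemized above (author, address and journal fields).
# Assumption: these three groups do not overlap.
itemized = {"author": 53, "address": 1327, "journal": 1487}
itemized_total = sum(itemized.values())

# Share of the surplus that these three fields do NOT explain.
remainder_share = (surplus - itemized_total) / surplus

print(f"surplus: {surplus}")
print(f"ratio new/old: {ratio:.2f}")
print(f"itemized extra hits: {itemized_total}")
print(f"share of surplus not explained by these fields: {remainder_share:.1%}")
```

Under that assumption the three fields explain only a small part of the surplus, which fits the rough estimate that more than 90% of the extra hits come from gene and therapy occurring unconnected somewhere in the record.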

I guess I’m not the only one that is not pleased with this “enhancement”. Most users I know use Pubmed for subject searching and they unanimously experience the high number needed to read (high recall, low precision) as the major obstacle. ATM will only make this worse.

And what about:

  • people unaware of any changes and just relying on the search bar for subject searching, supposing it works the same as before?
  • the effect on alerts (RSS or MyNCBI)?
  • important updates of prior searches, i.e. for systematic reviews. With ATM you may retrieve MUCH more irrelevant papers. How to explain different results over time?
  • Although of minor importance, our courses, tutorials, exercises, the PubMed book my colleagues just wrote, all have to be adapted.

Thus I will stop advising students/meds to simply use the search bar and just check the Details, because this will surely frustrate them. Rather, I will advise them to add tags themselves: look for the appropriate MeSH for Y in the MeSH database and add Y*[tiab] as well. Even for simple subject searches!

Who wants the search d-dimer diagnosis lung embolism to be translated as:

(“fibrin fragment D”[Substance Name] OR (“fibrin”[All Fields] AND “fragment”[All Fields] AND “D”[All Fields]) OR “fibrin fragment D”[All Fields] OR (“d”[All Fields] AND “dimer”[All Fields]) OR “d dimer”[All Fields]) AND (“diagnosis”[Subheading] OR “diagnosis”[All Fields] OR “diagnosis”[MeSH Terms]) AND (“lung”[MeSH Terms] OR “lung”[All Fields]) AND (“embolism”[MeSH Terms] OR “embolism”[All Fields])

Very impressive, isn’t it, but the correct MeSH for lung embolism, pulmonary embolism is not mapped!!!!

Is it good then for preclinical people, e.g. molecular biologists? Suppose you’re looking for signal transducer and activator of transcription 3 (that’s one protein); most lab people will use either the full name or stat 3, stat(3), stat-3 or stat3

1. stat 3 maps to: (“Stat”[Journal] OR “stat”[All Fields]) AND 3[All Fields] = 4031 hits

2. Stat-3 maps to: stat-3[All Fields] = 591 hits

3. stat3 maps to: “stat3 transcription factor”[MeSH Terms] OR (“stat3”[All Fields] AND “transcription”[All Fields] AND “factor”[All Fields]) OR “stat3 transcription factor”[All Fields] OR “stat3”[All Fields] = 4639 hits
(Note that grey terms are superfluous: by searching stat3 you already find stat3 transcription factors)

Not very consistent and only the 3rd variation will be mapped to the proper MeSH, BUT (like 1.) will also give things like:

  • DeltaB=(1.18+/-0.09_{stat}+/-0.07_{syst}+/-0.01_{theor})x10;{-3}
  • EPI STAT, version 3.2.2.
  • Via Santa Marta n. 3 (address) and pH-stat
  • D Stat (author) and vol nr 3.

Thus it would be better to search for either merely

“stat3 transcription factor”[MeSH Terms]

or add synonyms (with OR) like stat-3[tiab], stat3[tiab], “stat 3”[tiab], “signal transducer and activator of transcription 3”[tiab].

This will increase precision and even recall.
However, one has to know how to find the correct terms and tags.
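
Once you do know the MeSH term and the synonyms, putting the search string together is mechanical. A minimal Python sketch (the helper names `tiab` and `build_subject_query` are hypothetical, just for illustration):

```python
def tiab(term: str) -> str:
    """Tag a free-text term for title/abstract; quote it when it contains spaces."""
    return f'"{term}"[tiab]' if " " in term else f"{term}[tiab]"


def build_subject_query(mesh_term: str, synonyms: list[str]) -> str:
    """Combine one MeSH term with free-text synonyms using OR."""
    parts = [f'"{mesh_term}"[MeSH Terms]'] + [tiab(s) for s in synonyms]
    return " OR ".join(parts)


query = build_subject_query(
    "stat3 transcription factor",
    ["stat-3", "stat3", "stat 3",
     "signal transducer and activator of transcription 3"],
)
print(query)
```

The resulting string can be pasted into the search bar as is. Because every term carries its own tag, ATM has nothing left to guess.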

2. Citation Sensor

The renewed ATM was introduced together with the Citation Sensor that recognizes combinations of search terms characteristic of citation searching, e.g. volume nrs, author names, journal titles and publication dates, which it then matches to citations. These are shown separately in a yellow area above the retrieval.

Searching for limpens oncogene indeed suggests one paper of Limpens in Oncogene. This option can be very handy when one wants to retrieve a citation.

However, typing gene therapy 2007 405 gives 59 hits, but the citation sensor does not sense the specific paper in “Gene”, year 2007, vol 405 (although it is retrieved).

The Single Citation Matcher would have done better, giving a single (correct) hit on both occasions.

Donna Berryman came to a very similar conclusion when writing to the MedLibList. She shows some other nice examples (i.e. that the citation sensor shows 4 citations from the journal Cancer by author Lung when searching lung cancer!!).

Donna explains that at the NLM booth at MLA, she was told that Pubmed changes were made to meet the wishes of a “significant” number of people that were going to PubMed, entering an author name and a journal title (with no field qualifiers) and expecting to retrieve a particular citation.

I’ve seen the nih.gov webmeeting presentation Donna referred to (click here) as well as another (click here), both tips from the MedLib twitterers @pfanderson and @eagledawg. Eagledawg (Nikki) also wrote 2 blog posts about this subject, see here (May) and here (June).

It was quite revealing to see that emphasis was given to numbers: number of visitors, number of queries versus number of documents, and speed:

“if the query takes 2-3 minutes we lose users!”.

Well, I can understand that NLM doesn’t want to discourage potential users, but I don’t understand why all functionalities have to be mixed in a way that only serves quick and dirty searches, and not even very effectively. As Donna puts it: the new ATM is moving PubMed away from being a subject-based search. Again, most of my customers do subject searching.

3. Advanced search beta

Advanced search is a beta (version) and thus may be adapted based on findings and feedback (see here for announcement)

I don’t really know what to think of it. Firstly I wonder whether the Advanced Search is an extra option or meant to replace the present front page in due course. Secondly the Advanced Search looks quite complex, but not particularly advanced. The regular front page has more options (although hidden). This is certainly not an advanced tool for librarians, but is it an adequate tool for other users, clinicians or researchers?

Advanced Search beta consists of 5 separate “boxes”.

  1. The search-bar with a preview or a search option. Surprisingly the search option brings you back to the old front page. When you opt for “preview” you stay in the ‘advanced’ search.
  2. Search History showing the last 5 searches. If you exceed 5 searches a “More History” button appears. When clicked it brings up the full display.
  3. Search by selected Fields. There are 3 default lines set up for Author, Journal and Publication date searching. Thus again, emphasis is given to reference instead of subject searching. Similar to the Single Citation Matcher, there is an auto-complete feature for authors and journals. On the right of each line is an index feature. If you want to do a subject search (which in fact most advanced searchers do), you have to open the list of fields using the pull-down menu. However, for MeSH terms this is not ideal. Suppose you want to look up the MeSH for recurrent pregnancy loss (the term mostly used by clinicians). The MeSH is Abortion, Habitual. You won’t find the MeSH by looking at recur….. In effect, you won’t find it by looking up habit…. either. You have to start typing abortion…!?
    When you find an appropriate MeSH, you can choose to search for the MeSH coupled to a particular subheading (i.e. haemonchiasis/blood). You can see immediately how many hits will be retrieved (63).

    Suppose a clinician wants to know whether PGS is indicated in RPL. He pulls open the MeSH-field, types recurrent pregnancy loss, adds another MeSH-field and fills in preimplantation genetic screening, because he thinks PubMed will match it for him. He clicks a few limits because he thinks that might help to narrow his search, clicks the search button, waits and … ends up at the regular front page showing zero results. So none of the steps he took led him anywhere, because the appropriate MeSH terms (Abortion, Habitual and Preimplantation Diagnosis) weren’t found, and he still has no clue as to what terms he should use.

    Even if the correct MeSH is found, the notation may be quite misleading. For example, after typing lung cancer into the box next to ‘Search MeSH terms’ , the History in PubMed will show lung cancer[MeSH Terms], whereas “lung cancer” is NOT the MeSH term. Thus people are going to think that lung cancer is the MeSH, because it looks like this. If they look in the Details box, however, they’ll see the real “lung neoplasms”[MeSH terms]. How are people going to know what’s what? (Thanks to Donna for providing this example).

    At least, in case of lung cancer, the correct MeSH-term is being searched. In contrast, a term like Lung embolism is not searched as Pulmonary embolism[mesh], and gives zero hits. Funny, because searching via the normal search bar would at least translate lung embolism in embolism[mesh] and lung[mesh]. (and there are several tricks whereby you can subsequently find the proper MeSH)


    Thus, in Advanced Search Beta, searching MeSH via ‘search MeSH-terms’ will only work when you know the (exact) MeSH-term in advance.
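
    The pitfall can be made concrete with a small sketch. The lookup table below is a hypothetical stand-in for an index of search terms, filled only with the term pairs mentioned in this post; the function names are my own, and nothing here calls PubMed or the real MeSH vocabulary. The point is merely that a tagged MeSH query should be built only after the descriptor has actually been resolved.

```python
# Hypothetical mini-index: clinician's term -> MeSH descriptor.
# Only the pairs discussed in this post are included; the real MeSH
# vocabulary is far larger and is not queried here.
ENTRY_TERMS = {
    "recurrent pregnancy loss": "Abortion, Habitual",
    "preimplantation genetic screening": "Preimplantation Diagnosis",
    "lung cancer": "Lung Neoplasms",
    "lung embolism": "Pulmonary Embolism",
}


def resolve_mesh(term):
    """Return the MeSH descriptor for a typed term, or None if unknown."""
    return ENTRY_TERMS.get(term.strip().lower())


def mesh_query(term):
    """Build a tagged query only when the descriptor is actually known."""
    descriptor = resolve_mesh(term)
    if descriptor is None:
        raise KeyError(f"no MeSH descriptor known for {term!r}; look it up first")
    return f'"{descriptor.lower()}"[MeSH Terms]'


if __name__ == "__main__":
    print(mesh_query("lung cancer"))               # "lung neoplasms"[MeSH Terms]
    print(mesh_query("Recurrent pregnancy loss"))  # "abortion, habitual"[MeSH Terms]
```

    Raising an error for an unresolved term is a deliberate choice in this sketch: a silent zero-hit search is exactly what left the clinician in the example above without a clue.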

  4. The 4th box is really the limits-tab from the usual front page, but shown in full. A nice option is that you can lock certain limits while unlocking others (that is you can apply one limit to the next search and other limits to this and subsequent searches).
  5. The 5th box is (again) an Index of Fields. However it allows you to enter multiple terms.

In short, I’m not particularly impressed by this advanced search beta. It is too complex for a quick and dirty search as well as for a reference search. However, it is also not very well suited for an (advanced) subject search. It is not possible to look up any MeSH other than by index, and even this often goes wrong.
Some important functionalities are not included, like the Clinical Queries. Furthermore, by displaying limits so prominently, many people will automatically use them. Personally I’m very reluctant to use limits, because you miss non-indexed (i.e. recent) papers.

So I agree with tunaiskewl:

“I stumbled across a beta Advanced Search in PubMed today. Has anyone else played with this? It appears that it merges the Preview/Index, History, Limits, and field searching screens all together in one place. Perhaps this will make some of PubMed’s features more obvious to searchers, but I’m not seeing too much benefit to it otherwise…”

4. Other minor recent changes include:

  • Create Collection in MyNCBI by one step via the send to option (this is wonderful!)
  • PubMedID (ID for Pubmed Central, at the bottom right)
  • Collaborators display (separate from authors)
  • In Abstract Plus – (very popular with users, dynamic display format)
  • Blue side bar gone in certain display formats. Again this is done to make room for new functionalities (bad! It takes me 2 steps to go back to MeSH, Clinical Queries or whatever)

—————–

NL flag NL vlag

The Present: PubMed is going for the mass.

This is a follow-up to Part 1 (see here).

It looks as if several changes have already been implemented, namely:

1. Automatic Term Mapping (ATM).

Although ATM is a far-reaching change, users have hardly been informed about it.

I found out by chance while preparing an elective course for 2nd-year students with a colleague. Searching via the search bar gave very different results, while nothing about PubMed looked different on the surface. The Details tab showed a completely different automatic term mapping, also called ATM or mapping. It was announced in PubMed’s “New and Noteworthy”, but how many people read that?

The following example is given there:

Gene therapy is mapped as follows:

with the old ATM: “gene therapy”[MeSH Terms] OR gene therapy[Text Word]

with the new ATM: “gene therapy”[MeSH Terms] OR (“gene”[All Fields] AND “therapy”[All Fields]) OR “gene therapy”[All Fields].

So the new ATM broadens the search:

1. by searching All Fields instead of the tw field (Title, Abstract, MeSH)

2. by splitting multi-word terms into their individual words. Gene therapy is no longer searched only as “gene therapy”, but also as “gene” AND “therapy”.

According to the NLM, this way you also search for synonyms such as “gene silencing therapy…” and you also find X in the author field when you search for X. They really should have added: whether you want to find it or not. So from now on you automatically search all fields unless you add tags yourself.
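
The difference between the two translations can be mimicked in a few lines of Python. This is only a sketch of the pattern shown in the gene therapy example, under the assumption that the typed phrase is itself a MeSH term; the function names old_atm and new_atm are made up for illustration, and this is not the NLM's actual mapping code.

```python
def old_atm(phrase):
    """Old-style translation: MeSH match OR the phrase as a Text Word."""
    return f'"{phrase}"[MeSH Terms] OR {phrase}[Text Word]'


def new_atm(phrase):
    """New-style translation: MeSH match, OR every word ANDed in All Fields,
    OR the intact phrase in All Fields."""
    words = phrase.split()
    split_part = " AND ".join(f'"{w}"[All Fields]' for w in words)
    return (
        f'"{phrase}"[MeSH Terms] OR ({split_part}) '
        f'OR "{phrase}"[All Fields]'
    )


if __name__ == "__main__":
    print(old_atm("gene therapy"))
    print(new_atm("gene therapy"))
```

The ANDed single words in the middle are what let in records where the two words merely co-occur anywhere in the record.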

Am I glad that I now find more? Well, no. I use the Single Citation Mapper when I want to find citation Y by author X, and I prefer to broaden searches by adding relevant terms.
So at most I would add gene silencing therap*[tiab] to the search if I wanted to search very broadly. This broadens my search without unnecessary noise. On closer consideration, however, “gene therapy” (insertion of a gene) is essentially different from gene silencing (preventing a gene from working by means of antisense oligos or siRNA). So this concept does not seem really useful to me for most searches on gene therapy. (By the way: there is a good MeSH for “Gene Silencing”; the narrower term is RNA Interference.)

With gene therapy, ATM now (5 June) finds 90942 hits instead of 36557, so 54385 extra hits, which is 2½ times as many!!! Most of these extra hits are not relevant. You also find citations with therapy in ANY field and:

  • gene as author: 53 extra hits
  • gene in the address field: hkj@gene.com or Department of Gene Regulation: 1327 extra hits
  • gene in the journal title, such as “Gene”: 1487 extra hits
  • gene and therapy in the abstract/MeSH without any meaningful relation: articles on the effect of gene expression profiling on breast cancer prognosis (after chemo, not after gene therapy), studies on changes in gene regulation after therapy X. The majority of the extra hits will fall into this category.
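
The arithmetic behind these numbers can be checked quickly. The sketch below uses only the hit counts quoted above; the one assumption, flagged in the code, is that the three counted categories (author, address, journal title) do not overlap, which the counts themselves do not tell us.

```python
# Hit counts quoted in the text (5 June).
hits_new_atm = 90942
hits_old_atm = 36557

extra_hits = hits_new_atm - hits_old_atm
ratio = hits_new_atm / hits_old_atm

# Extra hits attributed to the first three categories in the list.
named_categories = {"author": 53, "address": 1327, "journal title": 1487}
named_total = sum(named_categories.values())
# Assumption: the three categories do not overlap; the text does not say.
named_share = named_total / extra_hits

print(extra_hits)                   # 54385
print(round(ratio, 2))              # 2.49
print(named_total)                  # 2867
print(round(100 * named_share, 1))  # 5.3
```

Under that assumption the three counted categories explain only about 5% of the extra hits, which fits the remark that the bulk falls into the last, uncounted category.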

Probably more people dislike ‘this enhancement’. Most users I know search by subject, and the biggest problem they run into is that they find too much that is not relevant (high number needed to read). ATM only makes this worse.

And what about:

  • people who are unaware of anything and use the search bar just as before
  • the effect on existing alerts (RSS or MyNCBI)?
  • updates of earlier searches, for example for a systematic review. Because of ATM, the same search suddenly finds more hits after a certain point in time (if no tags were used)
  • adapting courses, tutorials, assignments, and the PubMed book my colleagues have just written? And who says this is the end of it?

From now on I will advise (almost) everyone to no longer just search via the search bar and merely check the Details, but to use the most appropriate MeSH term(s) and, if needed, to search for one or more synonyms in title and abstract.

D-dimer diagnosis lung embolism is translated by the current ATM as:

(“fibrin fragment D”[Substance Name] OR (“fibrin”[All Fields] AND “fragment”[All Fields] AND “D”[All Fields]) OR “fibrin fragment D”[All Fields] OR (“d”[All Fields] AND “dimer”[All Fields]) OR “d dimer”[All Fields]) AND (“diagnosis”[Subheading] OR “diagnosis”[All Fields] OR “diagnosis”[MeSH Terms]) AND (“lung”[MeSH Terms] OR “lung”[All Fields]) AND (“embolism”[MeSH Terms] OR “embolism”[All Fields])

Impressive, isn’t it? But the most appropriate MeSH, pulmonary embolism, is not found!!!!

Is it good for molecular biologists and other preclinical researchers, then? Suppose you are looking for the protein signal transducer and activator of transcription 3. Most people then search for the full name or for stat 3, stat(3), stat-3 or stat3.

1. stat 3 gives: (“Stat”[Journal] OR “stat”[All Fields]) AND 3[All Fields] = 4031 hits

2. Stat-3 gives: stat-3[All Fields] = 591 hits

3. stat3 gives: “stat3 transcription factor”[MeSH Terms] OR (“stat3”[All Fields] AND “transcription”[All Fields] AND “factor”[All Fields]) OR “stat3 transcription factor”[All Fields] OR “stat3”[All Fields] = 4639 hits
(The terms shown in grey are in fact redundant, because searching for stat3 already finds them.)

The translation is not very consistent; only variant 3 is mapped to a MeSH, BUT it, like variant 1, also finds completely irrelevant hits such as:

  • DeltaB=(1.18+/-0.09_{stat}+/-0.07_{syst}+/-0.01_{theor})x10;{-3}
  • EPI STAT, version 3.2.2.
  • Via Santa Marta n. 3 (address) and pH-stat
  • D Stat (author) and vol nr 3.

It is therefore better either to search on the MeSH only

“stat3 transcription factor”[MeSH Terms]

or to add synonyms to it, such as stat-3[tiab], stat3[tiab], “stat 3”[tiab], “signal transducer and activator of transcription 3”[tiab].

This increases precision and even recall. But you do need to know how to find the terms and tags.
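
Putting such a tagged search together by hand is error-prone, so here is a minimal sketch of how it could be assembled. The helper names tag and mesh_plus_synonyms are hypothetical; the code only builds the query string and does not contact PubMed. Synonyms containing a space are quoted so they stay together as a phrase.

```python
def tag(term, field):
    """Attach a field tag, quoting terms that contain a space."""
    quoted = f'"{term}"' if " " in term else term
    return f"{quoted}[{field}]"


def mesh_plus_synonyms(mesh_term, synonyms):
    """OR a MeSH term with free-text synonyms limited to title/abstract."""
    parts = [f'"{mesh_term}"[MeSH Terms]']
    parts += [tag(s, "tiab") for s in synonyms]
    return " OR ".join(parts)


if __name__ == "__main__":
    query = mesh_plus_synonyms(
        "stat3 transcription factor",
        ["stat-3", "stat3", "stat 3",
         "signal transducer and activator of transcription 3"],
    )
    print(query)
```

Because every term carries its own tag, a string built this way should not be expanded into All Fields the way an untagged term is.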

2. Citation Sensor

The Citation Sensor was introduced together with the new ATM. It recognizes terms that are characteristic of citations. When citations are found, they are shown separately in a yellow box above the search results.

When you search for limpens oncogene, the article by Limpens in Oncogene is shown. This option can be handy when you want to find a citation.

However, if you search for gene therapy 2007 405, the citation sensor does not pick up the article in “Gene” 2007, vol 405 among the 57 hits.

The Single Citation Mapper would have done better: one single correct hit in both examples.

Donna Berryman came to the same conclusion in her MedLibList mail. She gives a few other nice examples, such as the citation sensor finding 4 citations by author Lung in the journal Cancer when you search for lung cancer!!

Donna says she heard at an NLM booth at the MLA that the PubMed changes were made for the benefit of a considerable number of people who came to PubMed only to enter an author or journal title, because they thought they could find a particular article that way.

Here is (probably) the web meeting Donna refers to, and here is another one (a tip from the MedLib twitterers @pfanderson and @eagledawg). Eagledawg (Nikki), as I read later, also wrote 2 blog posts on this subject, see here (May) and here (June).

I found it rather disconcerting that numbers counted so heavily.

“if the query takes 2-3 minutes we loose users!”.

Of course I understand that the NLM also wants to accommodate those who are only looking for a single article, but does that have to come at the expense of other functionality? Even searching for a specific article does not always go smoothly. It seems that, as Donna puts it, searching by subject becomes less important with the new ATM. Once again: most people I know search by subject.

3. Advanced search beta

Advanced search is a beta (version), so still in the trial phase (see here).

I don’t quite know what to make of it yet. Will it sit alongside the old entry page or replace it? I find it looks rather complex and yet not very advanced. Not all options of the normal start page are present.

There are 5 different boxes.

  1. The search bar with a preview and a search option. Oddly enough, clicking search takes you back to the old familiar PubMed page. If you choose “preview” instead, you do stay in the ‘advanced’ search.
  2. Search History. With more than 5 searches you have to click “More History” to see the full search history.
  3. Search by selected Fields. By default you can search by Author, Journal and Publication date. So again the focus is on finding references. The auto-complete feature for authors and journals is handy (as in the Single Citation Mapper). On the right is a clickable index. You can search other fields by clicking the pull-down menu. However, searching MeSH this way is not very convenient. Suppose you want to search for recurrent pregnancy loss. The MeSH is Abortion, Habitual. You won’t find it by looking up recur….. in the index, nor by looking up habit…. (In an update of the English version I added a number of examples showing that looking up MeSH terms via Advanced Search beta does not work well; I will translate them here in due course.)

    If you find a MeSH, you can search it on its own or coupled to a subheading (e.g. haemonchiasis/blood). The number of hits (63) is shown immediately.

  4. The 4th box is really the limits tab, but shown in full. What is new is that you can leave certain limits on (locked), while using others for the next search only.
  5. The 5th box is again an index of all fields. Here, however, you can enter several terms at once.

In summary, I am not particularly impressed by this ‘advanced’ search option. It is too complicated and too unintuitive for a quick search or for look-up work, but it is also not well suited for an advanced search, mainly because you can only look up MeSH via indexes. There are also fewer options; the Clinical Queries are missing, for example. On the other hand, the Limits are so prominent that users may be more inclined than usual to apply them. Personally I use them very sparingly!

4. Minor changes

  • You can now simply create a Collection in MyNCBI via the send to option (perfect!)
  • PubMedID (ID for Pubmed Central, at the bottom right)
  • Collaborators display (separate from authors)
  • In Abstract Plus
  • The blue side bar on the left (with advanced options) is no longer shown in certain display formats. This is supposed to make room for new functionalities (such as the related reviews), but I find it very annoying because I need more steps to go back to MeSH or Clinical Queries after each individual search.







PubMed: Past, Present And Future, PART I

11 06 2008

I.The Past:
Extremely simple, yet incredibly difficult

For Part II (discussion of ATM and Advanced Search beta), see here.

Searching PubMed has never been easy, not for the advanced searcher nor the beginner.

As an advanced searcher you have (had?) to find your way through the Search Bar, MeSH-database, look for broader, narrower or related terms, know when to explode MeSH, add major topics or subheadings or not, know when to use ‘Links’ or the ‘Search to Sendbox’ to send Searches to PubMed. Know when and how to use Clinical Queries, Limits, Field Codes (nowadays called tags 🙂 ), History, MyNCBI saved searches and collections, linkouts, AND, OR, NOT and so on….

It takes some investment of time to become an effective & advanced PubMed searcher.

For the less experienced and/or more rushed people (busy clinicians, young investigators, lab people) trying to find an answer to medical questions, the search bar often sufficed. Here you just typed in some words that were not only searched in title and abstract, but also translated into the corresponding MeSH (if recognized as synonyms), a process called automatic mapping. People just had to check “Details” to verify proper mapping. It often went well, but sometimes the mapping was completely wrong (e.g. type walking aids and you will search for the MeSH term for AIDS, although HIV has nothing to do with it).

The overwhelming number of hits could be effectively reduced, with some risk of losing relevant hits, by using the Limits option or the Methodological Filters in the Clinical Queries (EBM). Because MeSH is disease-oriented, PubMed is not very well suited for preclinical or basic scientific searches. This often leads to frustration (see below).

Some people just want to look up citations and there is a perfect tool for it: the Single Citation Mapper. It is so wonderful, just type in an author, the journal, first page or whatever. It has an autofill function, so I even prefer this tool to find a journal or an author (instead of the indexes, which is yet another option).

Merely summing up these possibilities makes me see stars. After a PubMed course, a student who knew a lot about programming sighed:

“Wow, there is a lot of stuff in there, but it is all so concealed and difficult to find….”

That’s true, and this, together with the superficial resemblance of the search bar to the Google search bar, makes inexperienced users use PubMed in a Googlish way: just type in some words ….. and you probably… don’t find what you want. This was especially true for people looking up a particular paper and not familiar with the Single Citation Mapper, hidden in the blue side bar. (The picture on the left is from “Arts in Spe”, with the title “Searching like in Google”.)

Or as the Harvard PhD student Anna Kushnir expressed her frustrations when ranting against PubMed:

“I hate PubMed. I hate it with a burning passion. For a site that is as vital to scientific progress as PubMed is, their search engine is shamefully bad. It’s embarrassingly, frustratingly, painfully bad. (…)

… I can hold a paper in my hands, search for two authors’ last names and have PubMed come up with nothing. (….)

Why is PubMed so behind the times? Why? How does it even work? Does it search only the abstract? Does it also search the body of the papers that are available online? Why does it get so massively confused by an author’s initials and last name together, in one search. […]

I don’t think I should have to be, or enlist the services of, a medical librarian in order to do a simple search on a literature search engine. PubMed should be an intuitive search engine such as Google, or others. […] PubMed should be tuned to my needs and my skill set. I should not have to tune to it. […]”

There was an overwhelming response to her post and Anna’s story was covered in many blogs. I don’t want to revive that discussion; I just want to mention Graham Steel’s comment.

“@ Anna,

You might just be in luck thanks to voicing your frustrations online !!

I brought this post to the attention of Dr Lipman who I’ve just heard back from.

He’s authorized me to post here on his behalf. (Thanks Dr Lipman)

Although the current engine works well for some users and some queries, I understand Anna’s frustration and we are in the midst of a number of changes that will make PubMed work better for her and many other users.

We will be adding a number of other “sensors” which will run in parallel with the default search. From monitoring results of enhancements we’ve added to some of our other Entrez databases.

A number of these complaints are fair and we’ll be doing our best to address them. With the large number of users we have, it will be clear what areas we’ll improving and what areas will need more work.”

I’m now beginning to understand what Graham Steel meant in his reply to Anna.

Coincidentally or not, PubMed has introduced a couple of changes that seem to meet Anna’s demands. This will be the subject of the second part of this Trilogy, see HERE

—————-

NL flag NL vlag

I. The Past: Extremely simple, yet incredibly difficult

Searching PubMed has never been easy, for anyone, beginner or advanced.

If you really want to make full use of PubMed, you need to be able to handle not only the search bar but also the MeSH database. You need to know what broader and narrower terms are, know when automatic explosion is desirable or not, when to use major topics, when to add a subheading, and whether to ‘bring’ MeSH terms to PubMed via ‘Links’ or via the ‘Search to Sendbox’. You need to know whether and how to use Clinical Queries, Limits, Field Codes (tags), the History, MyNCBI saved searches and collections, linkouts, AND, OR, NOT and so on and so forth….

So it takes a while before you know all the ins and outs and make effective use of PubMed’s advanced options.

For less advanced users, or people who quickly want an answer to a (bio)medical question, such as doctors in busy clinical practice, basic scientists and preclinical researchers, the search bar often sufficed. You could simply type in some words and, behind the scenes, PubMed searches (searched) not only in title and abstract, but also translated the words into MeSH (if they are recognized as synonyms). This is called (automatic term) mapping or ATM. Easy, but it is advisable to look at the “Details” tab to see whether the search is translated properly. Sometimes it goes completely wrong. For example, if you type walking aids, the MeSH for the disease AIDS is searched, among others, although that of course has nothing to do with it.

To reduce the enormous number of hits you can apply Limits or Methodological Filters in the Clinical Queries (EBM questions). Because MeSH terms are rather disease-oriented, PubMed is not ideally suited to non-medical questions. This can sometimes lead to frustration (see below).

If you just want to look up particular articles, you can do so very conveniently via the Single Citation Mapper. Simply type in the name of an author, the journal, the page or volume number, and/or a title word, and the article is found in no time.

Just summing up all these possibilities already makes my head spin. How must it come across to beginners?

After a PubMed course, a student with a lot of programming experience sighed to me:

“So many possibilities, but it is all very hidden and very hard to find. Not very user-friendly.”

That is certainly true, and because the PubMed search bar superficially resembles Google, inexperienced searchers (the Google generation in particular) search in it as they do in Google. They simply type in the whole search strategy and expect to find something quickly. Unfortunately, that is not how it works. Specific articles in particular often could not be found this way, because terms were automatically mapped to MeSH but usually not searched in the journal or author field. That was exactly what the handy Single Citation Mapper was for. Many people do not know it, however, because the name is meaningless and the option is tucked away in the blue side bar.

PhD student Anna Kushnir also ran into this and let off steam about it on her blog:

“I hate PubMed. I hate it with a burning passion. For a site that is as vital to scientific progress as PubMed is, their search engine is shamefully bad. It’s embarrassingly, frustratingly, painfully bad. (…)

… I can hold a paper in my hands, search for two authors’ last names and have PubMed come up with nothing. (….)

Why is PubMed so behind the times? Why? How does it even work? Does it search only the abstract? Does it also search the body of the papers that are available online? Why does it get so massively confused by an author’s initials and last name together, in one search. […]

I don’t think I should have to be, or enlist the services of, a medical librarian in order to do a simple search on a literature search engine. PubMed should be an intuitive search engine such as Google, or others. […] PubMed should be tuned to my needs and my skill set. I should not have to tune to it. […]”

This blog post caused quite a stir, among both supporters and opponents. I will not stir up the dust again now; I just want to mention Graham Steel’s comment.

@ Anna,

You might just be in luck thanks to voicing your frustrations online !!

I brought this post to the attention of Dr Lipman who I’ve just heard back from.

He’s authorized me to post here on his behalf. (Thanks Dr Lipman)

Although the current engine works well for some users and some queries, I understand Anna’s frustration and we are in the midst of a number of changes that will make PubMed work better for her and many other users.

We will be adding a number of other “sensors” which will run in parallel with the default search. From monitoring results of enhancements we’ve added to some of our other Entrez databases.

A number of these complaints are fair and we’ll be doing our best to address them. With the large number of users we have, it will be clear what areas we’ll improving and what areas will need more work.

I am now beginning to understand a little what Graham meant by this.

Because, coincidentally or not, a few things have changed drastically in PubMed, changes that seem to meet Anna’s demands.

Which changes these are and what effect they have is discussed in part 2 of this trilogy, see HERE





Related Articles = Lateral Navigation

18 05 2008

What I did pick up from the WordPress announcement by Matt on possibly related posts (see previous post) is the term “lateral navigation” for navigating from one post to another. Why is this such a nice term?

Well, in my classes on systematic searching I teach people to perform (1) backward searching (checking citations in the reference list of selected papers), (2) forward searching (looking for papers that cite relevant papers) and (3) to browse Related Articles in PubMed or use “Find Similar” in OVID (MEDLINE, EMBASE). This approach is called snowballing or the pearl method. It serves to find papers that you might have missed, but even more so to find new terms to add to your search, so you catch these ‘missing studies’ with your final search strategy.

The term lateral searching is so perfect because you can easily visualize what it stands for and it fits in with backward and forward searching.

So lateral searching will now be added to my slides! (see figure)

lateral searching
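
For those who like to see the three directions spelled out, here is a toy sketch. The paper IDs and the two dictionaries are entirely made up; in practice the reference lists, citing papers and Related Articles would come from the databases themselves, not from a hand-typed table.

```python
# Toy citation data with hypothetical paper IDs.
# CITES[p] is the reference list of paper p.
CITES = {
    "seed": {"ref1", "ref2"},
    "later1": {"seed"},
    "later2": {"seed", "ref1"},
}
# RELATED[p] stands in for a "Related Articles" / "Find Similar" list.
RELATED = {"seed": {"similar1"}}


def backward(paper):
    """Backward searching: papers in the reference list of `paper`."""
    return set(CITES.get(paper, set()))


def forward(paper):
    """Forward searching: papers that cite `paper`."""
    return {p for p, refs in CITES.items() if paper in refs}


def lateral(paper):
    """Lateral searching: papers listed as related to `paper`."""
    return set(RELATED.get(paper, set()))


def snowball(seeds):
    """One round of snowballing in all three directions."""
    found = set()
    for paper in seeds:
        found |= backward(paper) | forward(paper) | lateral(paper)
    return found - set(seeds)


if __name__ == "__main__":
    print(sorted(snowball({"seed"})))
```

The sketch performs a single round only; each newly found paper would still have to be screened for relevance and mined for new search terms before it is used as a seed for another round.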

———————————————-

NL flag NL vlag

Matt’s (WordPress) post on “possibly related posts” (see my previous post) introduced me to a term that was new to me: “lateral navigation”, or in my case even better “lateral searching” (searching sideways). I find this a very apt term for searching for related articles.

In my “systematic searching” courses I advise people to systematically find missing studies, starting from included (selected) articles, by (1) “backward searching” (checking the reference list), (2) “forward searching” (looking for citing articles) and (3) going through Related Articles in PubMed or “Find Similar” in OVID (MEDLINE, EMBASE). This search method is also called the snowball or pearl method. It serves not only to find the missing articles but above all to find new terms with which you can perfect your search, so that your final search catches these and other articles.

The term “lateral searching” fits so nicely with the terms backward and forward searching because all 3 express a movement, of which the sideways movement seems the least goal-directed, and indeed it is. If you are not careful, you drift from one study to the next and lose the systematic approach. Nice if you want to come up with new ideas, not good if you want to search systematically.

So from now on “lateral searching” will be on my PowerPoint presentation! (see figure)

——————————

Previous posts on related articles/posts at this blog:
https://laikaspoetnik.wordpress.com/2008/05/16/possibly-an-announcement-about-possibly-related-posts/






New: Related Reviews in PubMed

14 05 2008

The latest feature to be added to the PubMed AbstractPlus display format is ‘Related Reviews’, links from the set of ‘Related Articles’ to review articles (articles indexed with the Publication Type ‘Review’).

Reading this in the NLM Technical Bulletin (May 06 2008), I rushed to have a look in PubMed, but all I could find was the familiar set of 5 related articles (sometimes accompanied by Patient Drug Information) next to the paper I selected.

related reviews PubMed

What went wrong?

Looking at “the small print” in the bulletin I read:

“Related Reviews will initially appear during randomly selected PubMed sessions; therefore, not all users will see them. We expect that this feature will be expanded to all users if it proves to be popular.”

O.k., so this is in fact a randomized trial, and I just received the placebo.

How would NLM decide whether this feature is popular? The number of clicks?

*****************************

Hot off the press from the NLM Technical Bulletin (May 06 2008): ‘Related Reviews’ are being added to the PubMed AbstractPlus display format; these are links from “related articles” to reviews (articles with the Publication Type ‘Review’).

I tried it right away in PubMed, but all I saw was the old, familiar 5 related articles (and sometimes “Patient Drug Information”) next to a selected article (see figure above).

What was wrong?

In the ‘small print’ at the bottom I read the following:

“Related Reviews will initially appear during randomly selected PubMed sessions; therefore, not all users will see them. We expect that this feature will be expanded to all users if it proves to be popular.”

O.k., so it is a randomized trial, and I just got the placebo.

How could the NLM reach a sound decision on the basis of these data: the number of clicks on the related reviews?





Presentation by Geert van der Heijden on Slideshare

23 04 2008

The presentation Geert van der Heijden gave on Friday 18 April (BMI-ALV) has now been put on Slideshare (Lieuwe Kool is skipping a few web 2.0 steps ;).
I had already mentioned it in an earlier post, but there it sat amid so much text that it probably got somewhat buried.
Besides, I had no experience yet with embedding a slide series, so that seemed a nice little experiment. A pity that Geert’s presentation was not podcast, because then I would have completed this week’s SPOETNIK assignment at the same time. I am rather behind in that respect.

It is good that Geert has made this presentation available. This way people can quietly (re)read Geert’s story in his own words instead of the derivatives of it on this or that blog.
This way I can also read what I missed in the first quarter of an hour.

Geert’s presentation is online on Slideshare:





The best moment teaching EBM-searching skills?

6 04 2008

When you are a (future) doctor you will obviously need to look for publications at one stage or another. PubMed is the place to look for relevant medical papers. Usually medical students begin to feel the urge to learn the ins and outs of PubMed (and searching in general) once they do their scientific training (4th year) or their internship, especially when they have to perform a CAT, critically appraised topic. Then it turns out their superficial knowledge of PubMed is one of the main hurdles. They find too many hits or too few and/or miss the relevant ones.

To help them I started a monthly 2-hour class in which I teach interns (at the dept. of Gynaecology) the basics of EBM, at least the first two steps: constructing a well answerable question using the PICO method (including defining the domain/levels of evidence) and finding the evidence in PubMed as well as in aggregate resources (these two steps are called EBS or evidence based searching). Interns are asked to prepare 4 questions, all based on previous CATs. The first question is answered during an interactive PowerPoint presentation (first hour), the other 3 are practised ‘hands on’. If needed I give them personal aftercare.

It is a highly appreciated course, and it helped to improve the quality of the CATs. So that’s very encouraging.

I often get the same feedback from the user surveys:

  • well structured and informative
  • why didn’t we get this earlier?
  • too much information at once (especially at the end of the day)

To meet their wish my colleagues started a short introduction in PubMed prior to this ‘advanced’ class. As a result, the students are better acquainted with PubMed and we can delve more in depth into the subject. Last session they even prepared all questions. I wasn’t aware and asked one of the students (quite disappointed) why he still put the words in one string in the search bar instead of looking up each word separately and checking whether the words mapped correctly to the appropriate MeSH. He replied: “But I already did this at home. I checked out all the words.” showing his notes. And I must admit his search was quite good. So I was very satisfied with this group of students.

But the feedback remains the same. well structured and informative – why didn’t we get this earlier? – too much information at once. (especially at the end of the day)

Thus one would be inclined to think there is a need to teach students earlier on.

Now, coincidentally, a new curriculum has started in our academic hospital, in which EBM is incorporated into the clinical modules. The 1st year students learn about information resources and study designs. In the 2nd year they learn the basics of PubMed, EBM, PICO’s, Evidence Based Searching and Systematic Reviews.

Our library is involved in the educational process with respect to information resources, PICO’s and searching. Most of the teaching is in the form of e-learning (Dutch: COO, computer ondersteunend onderwijs) using the QMP (question mark perception) system, which is basically designed to test knowledge.

We have made a tutorial for PubMed (a basic learn-the-buttons-and-MeSH course) and I prepared an e-learning module on PICO’s, study designs and aggregate evidence for the Cardiology block. This took me 6 weeks! It was reasonably well received by the students… That is, by those who bothered to give feedback.

During the course “Pulmonology” (February/March) we gave 30 “Finding the Evidence Search Workshops”, each to a group of 6 to 12 students.
I had quite high expectations, since in theory these students should have a good theoretical basis (considering the earlier e-learning tutorials).

However, their knowledge was quite disappointing, and even more so were their motivation and attitude. They were just a bunch of kids, most of them not very interested in PubMed, searching, EBM or anything of the kind. They were often giggling and chatting, which I find rather distracting, or were passive, silent and gazing, which is even more distracting. And when I took a glimpse at their screens I often saw Gmail and unfamiliar colourful sites instead of PubMed.

I wondered at what point these students would pupate and transform into the butterflies called interns. At this stage I couldn’t imagine them sitting at my bedside as doctors I would trust unconditionally.

Was it really this bad? No, I’m exaggerating a bit. When I sounded them out, it appeared that they find the scientific methodology courses too fragmented, too basic and not the core of their study: firstly they want to pass their exams and secondly they want to become a doctor(!), not a scientist or a librarian. I suppose e-learning and tutorials are not the ideal tools, not even for the computer generation. E-learning has to be dosed and is not as inspiring as a good tutor (at least that is what I think).

Anyway, after one hour of yawning, sighing and bewildered looks, and after a much needed coffee break with cookies (a brilliant move by two of my colleagues), I got the impression the penny finally dropped. Some students mumbled: “Mmm, I think I’m starting to understand it”, others smiled and uttered “Yes!”, and the remaining questions were answered rather swiftly by most students. It even turned out that some of the glossy sites I had seen were online medical dictionaries, which they used to look up the correct terms. Yes, this young generation is capable of multitasking.

If these courses were evaluated in the same way as the above-mentioned CAT course, I guess the outcome would be as follows:

  • not particularly interesting
  • why do we have to learn this now? can’t it wait?
  • too much information….

We still need to find the ideal timing for these courses and also a better dosing. The best timing is when they need it the most, I suppose. The students who absorbed the information best were those who needed the information right away or found out that they had needed it before (i.e. they now realized that their previous searches were far from ideal). The form is also something to work on. Especially the e-learning modules should be better integrated into the clinical blocks. It is not sufficient to merely tune in to the subject of the block. For students to appreciate and retain information, searching skills need to be taught in tandem with assignments. Students need to see the relevance of what they are learning.