Things to Keep in Mind when Searching OVID MEDLINE instead of PubMed

25 11 2011

When I search extensively for systematic reviews, I prefer OVID MEDLINE to PubMed for several reasons. Among them: it is easier to build a systematic search in OVID; the search history has a more structured format that is easy to edit; the search features are more advanced, giving you more control over the search; and translating a search to OVID EMBASE, PsycINFO and the Cochrane Library is “peanuts”, relatively speaking.

However, there are at least two things to keep in mind when searching OVID MEDLINE instead of PubMed.

1. You may miss publications, most notably recent papers.

PubMed doesn’t only provide access to MEDLINE, but also contains some other citations, including in-process citations which provide a record for an article before it is indexed with MeSH and added to MEDLINE.

As previously mentioned, I once missed a crucial RCT that was available in PubMed, but not yet available in OVID/MEDLINE.

A few weeks ago one of my clients said that she found 3 important papers with a simple PubMed search that were not retrieved by my exhaustive OVID MEDLINE search (Doh!).
All articles were recent ones [Epub ahead of print, PubMed – as supplied by publisher]. I checked that these articles were indeed not yet included in OVID MEDLINE, and they weren’t.

As said, PubMed doesn’t have all the search features of OVID MEDLINE, and I felt a certain reluctance to build a completely new exhaustive search in PubMed. I would probably retrieve many irrelevant papers, which I had tried to avoid by searching OVID*. I therefore decided to roughly translate the OVID search using textwords only (the missed articles had no MeSH attached). It was a matter of copy-pasting the single textwords from the OVID MEDLINE search (omitting the adjacency operators) and adding the tag [tiab], which means that terms are searched as textwords (in title and abstract) in PubMed (#2, only part of the long search string is shown).
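The copy-paste translation described above can be sketched in a few lines of Python. This is only an illustration: the function name `to_pubmed_tiab` and the word list are hypothetical, and OVID-only syntax (adjacency operators, the `?` wildcard) is assumed to have been removed by hand first.

```python
def to_pubmed_tiab(textwords):
    """Join single textwords into a PubMed title/abstract query."""
    parts = []
    for word in textwords:
        word = word.strip()
        if not word:
            continue  # skip empty entries left over from copy-pasting
        if " " in word:
            word = f'"{word}"'  # quote multi-word phrases
        parts.append(f"{word}[tiab]")
    return "(" + " OR ".join(parts) + ")"


# Hypothetical word list, loosely based on a thyroid hormone search.
words = ["triiodothyronine", "FT3", "thyroxin*", "FT4", "T3", "T4"]
query = to_pubmed_tiab(words)
print(query)
```

Because the adjacency operators are gone, a query like this is broader than the OVID original, so more noise is to be expected.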

To see whether all articles missed in OVID were in the non-MEDLINE set, I added the command: NOT MEDLINE[sb] (#3). Of the 332 records (#2), 28 belonged to the non-MEDLINE subset. All 3 relevant articles, not found in OVID MEDLINE, were in this set.

In total, there were 15 unique records not present in the OVID MEDLINE and EMBASE search. This additional search in PubMed was certainly worth the effort as it yielded more than 3 new relevant papers. (Apparently there was a boom in relevant papers on the topic, recently)

In conclusion, when doing an exhaustive search in OVID MEDLINE it is worth doing an additional search in PubMed to find the non-MEDLINE papers. Often these are very relevant papers that you wouldn’t want to miss. Depending on your aim, a simpler, broader search for textwords only, limited with NOT MEDLINE[sb], can suffice.**
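As a rough sketch, the non-MEDLINE limit amounts to wrapping the textword query in parentheses and appending the subset filter. The helper name `limit_to_non_medline` and the example query are assumptions for illustration only.

```python
def limit_to_non_medline(query):
    """Wrap a PubMed query and exclude records in the MEDLINE subset."""
    # The parentheses keep the NOT from binding to the last term only.
    return f"({query}) NOT medline[sb]"


textword_query = "thyroxin*[tiab] OR FT4[tiab]"  # hypothetical example query
print(limit_to_non_medline(textword_query))
```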

From now on, I will always include this PubMed step in my exhaustive searches. 

2. OVID MEDLINE contains duplicate records

I use Reference Manager to deduplicate the records retrieved from all databases and I share the final database with my client. I keep track of the number of hits in each database and of the number of duplicates to facilitate the reporting of the search procedure later on (using the PRISMA flowchart, see above). During this procedure, I noticed that I always got FEWER records in Reference Manager when I imported records from OVID MEDLINE, but not when I imported records from the other databases. Thus it appears that OVID MEDLINE contains duplicate records.
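The bookkeeping behind such a flowchart is simple arithmetic. A minimal sketch, in which the function name, the database names used as keys and all numbers are made up for illustration:

```python
def flowchart_counts(hits_per_database, unique_records):
    """Derive the numbers reported in a search flowchart."""
    identified = sum(hits_per_database.values())
    return {
        "identified": identified,
        "duplicates_removed": identified - unique_records,
        "after_deduplication": unique_records,
    }


# Made-up hit counts per database and a made-up number of unique records.
hits = {"OVID MEDLINE": 410, "OVID EMBASE": 520, "Cochrane Library": 35}
counts = flowchart_counts(hits, unique_records=700)
print(counts)
```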

For me it was just a fact that there were duplicate records in OVID MEDLINE. But others were surprised to hear this.

Where everyone else just wrote down the total number of hits in OVID MEDLINE, I always used the number of hits after deduplication in Reference Manager. But this is quite a detour and not easy to explain in the PRISMA flowchart.

I wondered whether this deduplication could be done in OVID MEDLINE directly. I knew you could deduplicate a multifile search, but would it also be possible to deduplicate a set from one database only? According to OVID help there should be a button somewhere, but I couldn’t find it (curious if you can).

Googling, I found another OVID manual saying:

..dedup n = Removes duplicate records from multifile search results. For example, ..dedup 5 removes duplicate records from the multifile results set numbered 5.

Although the manual only talked about “multifile searches”, I tried the command (..dedup 34) on the final search set (34) in OVID MEDLINE, and voilà, 21 duplicates were found (exactly the same number as removed by Reference Manager).

The duplicates had the same PubMed ID (PMID, the .an. command in OVID), and were identical or almost identical.

Differences that I noticed were minimal changes in the MeSH (i.e. one or more MeSH  and/or subheadings changed) and changes in journal format (abbreviation used instead of full title).
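Since the duplicates share a PMID, exported records can in principle be deduplicated on that field alone. The following is a minimal sketch under stated assumptions: the record layout (`pmid`, `revised`, `journal`) and the sample data are hypothetical, keeping the most recently revised version is a design choice, and this is not the algorithm that OVID or Reference Manager uses.

```python
def dedup_by_pmid(records):
    """Keep one record per PMID, preferring the latest revision date."""
    newest = {}
    for rec in records:
        kept = newest.get(rec["pmid"])
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings;
        # an empty string (never revised) sorts before any date.
        if kept is None or rec["revised"] > kept["revised"]:
            newest[rec["pmid"]] = rec
    return list(newest.values())


# Hypothetical export: two versions of one record plus one unrelated record.
records = [
    {"pmid": "11111111", "revised": "2010-12-09", "journal": "Journal of Examples"},
    {"pmid": "11111111", "revised": "2011-08-18", "journal": "J Ex"},
    {"pmid": "22222222", "revised": "", "journal": "Another Journal"},
]
unique = dedup_by_pmid(records)
print(len(records) - len(unique), "duplicate(s) removed")
```

Matching on PMID only catches duplicates of this kind; it will not merge records for the same article that come from different databases.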

Why are these duplicates present in OVID MEDLINE and not in PubMed?

These are the details of the PMID 20846254 in OVID (2 records) and in PubMed (1 record)

The Electronic Date of Publication (PHST) was September 16th 2010. Two days later the record was included in PubMed, but MeSH were added 3 months later (MHDA: 2011/02/12). Around this date records are also entered in OVID MEDLINE. The only difference between the 2 records in OVID MEDLINE is that one record appears to have been revised on 2011-10-13, whereas the other was not.

The duplicate records of 18231698 have again the same creation date (20080527) and entry date (20081203), but one is revised 2110-20-09 and updated 2010-12-14, while the other is revised 2011-08-18 and updated 2011-08-19 (thus almost one year later).

Possibly PubMed changes some records, instantaneously replacing the old ones, but OVID only includes the new PubMed records during MEDLINE-updates and doesn’t delete the old version.

Anyway, wouldn’t it be a good thing if OVID deduplicated its MEDLINE records on a daily basis or would replace the old ones when loading  new records from MEDLINE?

In the meantime, I recommend applying the deduplicate command yourself to get the exact number of unique records retrieved by your search in OVID MEDLINE.

*mostly because PubMed doesn’t have an adjacency-operator.
** Of course, only if you have already an extensive OVID MEDLINE search.




16 responses

25 11 2011
Elaine Garrett

This is a really useful discussion, thank you.

I almost always seem to have duplicate entries in my searches, and had never thought to try the dedup command.
Are you using the “without revisions” segments of Medline? I just compared a simple search in the two, and duplicates have been removed from the “without revisions” database segment.

MEZZ: Ovid MEDLINE(R) without Revisions 1948 to Present
#   Searches                                            Results
1   osteoporosis/                                       30561
2   limit 1 to yr="2010"                                1512
3   remove duplicates from 2                            1512
4   the role of platelets in bone remodeling.m_titl.    1

MESZ: Ovid MEDLINE(R) 1948 to November Week 3 2011
#   Searches                                            Results
1   osteoporosis/                                       32076
2   limit 1 to yr="2010"                                1717
3   remove duplicates from 2                            1513
4   the role of platelets in bone remodeling.m_titl.    2

25 11 2011
Mairna Englesakis

Wonderful item – thank you!

I also search the “OvidSP Medline In-Process & Other Non-Indexed Citations” when conducting searches in support of systematic reviews. I often find it quite useful, especially for new stuff, and non-indexed journal articles. It does require text word searching, though…

26 11 2011

Dear Jacqueline,
I have 3 remarks about deduplication of the Medline/Ovid records:
Deduplication is not possible for sets larger than 6000 records.
I have seen “duplicate pairs” that were not real duplicates, e.g. AN 22064423 and AN 22064419.
You have to scrutinize them but they are NOT equal!
I do not know the algorithm that Ovid uses.
Deduplication in RefMan is more transparent.
I have contacted Ovid several times to ask them to remove the duplicates themselves.
But they always answered that they could only do so after a reload or something like that.

28 11 2011
Leone Snowden

Hi Laika,
I would usually search PreMedline for records which aren’t fully indexed in Medline – and I like to search it separately since it will be a textword only search. I know PubMed covers both Medline and PreMedline, and has content which is non-Medline. Is there any evidence that PubMed gets the unindexed content before PreMedline does?
Leone Snowden
NSW Medicines Information Centre
Sydney, Australia

30 11 2011
Leslie Radentz, MD

Very generous of you to share this valuable information.
Thank You,
Leslie Radentz, MD

1 12 2011
David Kaunelis

Dear Jacqueline,

This is a very interesting post. I agree that it is very important to search PubMed in addition to Medline. It’s something we do routinely at the Canadian Agency for Drugs and Technologies in Health (CADTH). However, I think there is a somewhat more precise approach for retrieving items in PubMed but not in Medline than you provided. Your PubMed search suggestion:

[Search string] NOT Medline

does retrieve items that are not in Medline, but it still retrieves some items that are in Medline (for example, oddly enough, those with a status of “PubMed not Medline” – these items are in Medline but not indexed). Also, this search will include items with a status of “In-Process”. These will also be in Medline as long as you make sure to search a complete version (such as Ovid MEDLINE(R) In-Process & Other Non-Indexed Citations, Ovid MEDLINE(R) Daily and Ovid MEDLINE(R) 1948 to Present).

I have found that this search works more precisely:

[Search string] AND publisher [sb]

This search limits only to those articles that still have a status of “Publisher”, including most Epub ahead of print articles. It seems to do a very good job of retrieving all the citations that are only in PubMed. CADTH has created a poster that provides more detail. It’s available at:

Dave Kaunelis
Ottawa, Canada

2 12 2011
Jacqueline (aka Laika)

Dear commenters,
Thanks for commenting and sharing your valuable ideas and experience. I will update my post with some of the new insights I gained.

@Leslie Thnx for connecting on LinkedIn and directly responding to my blog post. Glad you find this rather technical post meaningful.

@elaine: I use the following segments: Ovid MEDLINE(R) In-Process & Other Non-Indexed Citations and Ovid MEDLINE(R) 1948 to Present. At our library we only have access to this segment: Ovid MEDLINE(R) 1948 to November Week 3 2011, Ovid MEDLINE(R) Daily Update November 16, 2011. It is an interesting suggestion though, and I will ask whether we can get access to MEZZ as well. A question that remains is: which version do you prefer, the non-revised one or the revised one? I would rather have the latter.

@dieuwke thanks for your tips and comments. You are right. At first I thought: these citations have different PMIDs, so they won’t be deduplicated. But they are (by using the command ..dedup)! This example is exceptional, however, because it concerns news items on the same page in Nature, written by the same author and with approximately the same title. Possibly (depending on how you deduplicate) Reference Manager (RM) would consider them duplicates too.
I don’t find deduplication in RM more transparent. Of course you can set your own rules, but the reader of the systematic review doesn’t know which ones you used and what has been deduplicated. I used to send duplicates to a separate RM file, but the new RM version crashes so often that I stopped doing this. Thus I have no way to easily check whether the removed citations are true duplicates.
I find it more transparent to show what you have done in OVID MEDLINE, because everyone can repeat this exactly the way you did. (I always include at least one complete search in the appendix of the SR). But from now on I will check whether duplicates are true duplicates in OVID. Thnx for the note of caution!


@mairna. I haven’t been clear enough, I suppose. I forgot to mention that I do use “OvidSP Medline In-Process & Other Non-Indexed Citations”. I find it so self-evident to use textwords in addition to MeSH that I didn’t mention this either. The point of this piece was to show that you miss articles in OVID MEDLINE (incl. non-indexed citations) even though you take care to include textwords.

@Leone How do you search PreMedline? It is no longer a separate database, so do you use a command other than NOT medline [sb]? Or do you mean the PreMedline search is automatically retrieved in a PubMed search?
I’m not sure I understand your question ” Is there any evidence that PubMed gets the unindexed content before PreMedline does?” I would say, per definition it wouldn’t. But perhaps there are others knowing more abt PubMed who can answer your question.

Thanks for sharing your findings and your excellent poster on this topic. It is good to know that CADTH routinely searches PubMed in addition to Medline.

You suggest using publisher [sb] instead of the NOT medline [sb] command that I have used.
I have checked it using the example I gave.
With AND publisher [sb] I find 9 references.
With NOT medline [sb] I get 30 references.

The entire publisher[sb] set is within the NOT medline[sb] set. Most importantly, this publisher[sb] set contains the 3 relevant papers that I missed in OVID.

But I missed 2 other citations with publisher[sb] that were not in OVID MEDLINE.

1. One citation that my PubMed search retrieved and my OVID MEDLINE search did not was in fact present in OVID MEDLINE; I retrieved it in PubMed only because my PubMed search is much broader. I search for T4[tiab] OR T3[tiab], whereas my search in OVID uses adjacency operators to confine the meaning of T4 and T3 to hormones and not T cells, for instance.
My command for these hormones in OVID MEDLINE:

exp Thyroid Hormones/ or exp Thyrotropin/ or Thyrotropin-Releasing Hormone/
(Triiodothyronine or FT3 or t?yroxin* or FT4).tw,ot,kw.
((total or hormone* or free or level* or test*) adj2 (T3 or T4 or T 3 or T 4)).tw,ot,kw.

Thus this reference is not relevant because it is about T4 cells not the T4 hormone. Yet 1 irrelevant citation is no problem for a systematic review. It might become a problem though if you have a high retrieval and a very inaccurate PubMed search.

2. The other citation (18210580) was relevant to the topic, but apparently not included in OVID (1949, OLDMEDLINE).

My conclusion: limiting to publisher [sb]  is a cleaner way to limit to a recent subset of publications, not yet included in OVID MEDLINE.

Limiting to the non-MEDLINE set by applying the command NOT medline[sb] retrieves not only the publisher [sb] subset, but also other publications that are in PubMed but not in MEDLINE. Some of these citations are relevant. There is a great overlap with OVID MEDLINE, however (pubmednotmedline [sb] and inprocess [sb]), but this is no problem if the citations are deduplicated using a database like RM (see PRISMA flow chart).

Thus it is up to the searcher & client to decide whether they only aim to find the “as supplied by publisher” subset, or whether they want to find other non-MEDLINE articles as well. This must be balanced against the possibly higher noise level inherent to PubMed searches.

2 12 2011
David Kaunelis

Thanks for your update. Based on your comments, I think the below search may be most optimal if your goal is to retrieve everything that is in PubMed, but not in Medline:

[Search string] NOT (Medline [sb] OR inprocess [sb] OR pubmednotmedline [sb])

This will remove the overlap you mentioned and avoid having to remove duplicates in your reference manager software. As you mentioned, if your goal is to find only articles that have not yet been added to Medline, then using

[Search string ] AND publisher [sb]

is preferable, especially for larger searches.

2 12 2011

That is a very apt analysis, David.

The difference between NOT (Medline [sb] OR inprocess [sb] OR pubmednotmedline [sb]) and AND publisher[sb] is marginal though. In my case just 1 paper (10 versus 9). But it seems better than NOT MEDLINE[sb] alone (30 hits, 19 duplicates).

I think I will try the 3 commands the next few times. Just to check that I don’t miss anything.

2 12 2011
David Kaunelis

Hi Jacqueline,

Thanks for your further comments. After some thought, I would like to expand a little on what I said in my above post. First I would like to note why the two articles were not retrieved using the publisher [sb] filter in the search example used above:

Article 1 was retrieved in PubMed due to a broadening of the search strategy. Since the article is in Medline, it would not be retrieved using the publisher [sb] filter. It would have been retrieved in the original Medline search if the same strategy had been used,

Article 2 was not retrieved because it has a status of OldMedline. It’s not in the Medline database, but is in the OldMedline database which is available as an Ovid database. If researchers are interested in older articles, OldMedline and Medline can be searched together in Ovid.

In all, there are five citation status subsets in PubMed: Medline, OldMedline, In-Process, PubMedNotMedline and Publisher. Three of these (Medline, In-Process and PubMedNotMedline) are found in Medline (or are added within a day or two).

The only two subsets not found in Medline are Publisher and OldMedline. So the alternative strategy I mentioned in my previous post:

[Search string] NOT (Medline [sb] OR inprocess [sb] OR pubmednotmedline [sb])

retrieves the same results as this one:

[Search string] AND (publisher [sb] OR oldmedline [sb])

And if you aren’t interested in older articles, then just using the publisher [sb] filter will find all citations in PubMed but not in Medline.
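The equivalence of the two filters can be checked with a small set-logic sketch. It assumes that every record carries exactly one of the five status subsets listed above; the record ids and the helper names `exclude` and `include` are hypothetical.

```python
# Hypothetical records: id -> citation status subset (exactly one each).
records = {
    "rec1": "medline",
    "rec2": "inprocess",
    "rec3": "pubmednotmedline",
    "rec4": "publisher",
    "rec5": "oldmedline",
    "rec6": "publisher",
}


def exclude(recs, statuses):
    """Mimic: [Search string] NOT (a [sb] OR b [sb] ...)."""
    return {rid for rid, status in recs.items() if status not in statuses}


def include(recs, statuses):
    """Mimic: [Search string] AND (a [sb] OR b [sb] ...)."""
    return {rid for rid, status in recs.items() if status in statuses}


not_filter = exclude(records, {"medline", "inprocess", "pubmednotmedline"})
and_filter = include(records, {"publisher", "oldmedline"})
print(sorted(not_filter))
print(not_filter == and_filter)
```

If a record could have no status, or a status outside these five, the two filters would no longer be guaranteed to agree.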


2 12 2011

A response faster than light. 😉

You are right (again). The only thing to “worry” about is the “a day or two difference” before the citations appear in MEDLINE. But now I’m splitting hairs, I suppose.

2 12 2011
David Kaunelis

Yes, the time lag can be a problem if you won’t be maintaining alerts on your search. But if you are concerned, you can always run a second search without the publisher [sb] filter and limit to articles published within the past day or two. I think this hair is now fully split!

2 12 2011

I agree! Have a nice weekend, David.

11 01 2012
“Pharmacological Action” in PubMed has no True Equivalent in OVID MEDLINE « Laika's MedLibLog

[…] Things to Keep in Mind when Searching OVID MEDLINE instead of PubMed […]

10 06 2013

Thanks for this discussion, Laika, David, and others. I also use the “in-data-review” [sb] as an additional citation status subset when I am isolating the newest records. Is anyone else doing this? They do seem to be a unique subset as when combining i.e., ANDing them with any of the other subsets (publisher, in process, pubmednotmedline) produces 0 records.

10 06 2013

Apologies – the actual syntax is indatareview [sb]. I retested and these all appear to have the “in process” tag, too, so it is redundant.

Search indatareview [sb] AND in process [sb] 123266 records
