Why scholars cite the things they cite – the real reasons

In my previous post, I took a look at some of the scholarship about why certain articles are cited more than others.

I feel bad, because by focusing on all of the little things that correlate with citation rate, I didn’t talk about the substantive aspects of how a citation is used.

Sometimes a citation is used to simply say "I agree." Other times it may be used to say "You're wrong."

Cue the next article I found by R.B. Williams (2011) about the history and classification of citation systems in the biosciences.

This was an exciting article to read for two reasons.  First, I had been looking for some information about the history of various citation styles for a while.  (It isn’t easy.  Try Google-ing “history of citation styles”).

Second, the article made me aware of the scholarship about how citations are actually used within scientific documents.  I am particularly drawn to the questions posed by Moravcsik and Murugesan back in 1975.

  1. Is the reference conceptual or operational? In other words, is the reference made in connection with a concept or theory that is used in the referring paper, or is it made in connection with a tool or physical technique used in the referring paper? The distinction is not meant to be a value judgment, and is not to be taken as synonymous with judging the importance of the paper referred to.
  2. Is the reference organic or perfunctory? In other words, is the reference truly needed for the understanding of the referring paper (or to the working out of the content of that paper), or is it mainly an acknowledgment that some other work in the same general area has been performed?
  3. Is the reference evolutionary or juxtapositional? In other words, is the referring paper built on the foundations provided by the reference, or is it an alternative to it?
  4. Is the reference confirmative or negational? In other words, is it claimed by the referring paper that the reference is correct, or is its correctness disputed? Incorrectness need not be claimed through an actual demonstration of an error in the paper referred to, but could also be established, for example, through inferior agreement with experimental data.
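For readers who like to see a scheme written down, the four questions above can be sketched as four independent labels attached to a single citation. This is only an illustration: the class name `CitationFunction`, its field names, and the decision to treat each question as a strict either/or are my assumptions, not something taken from Moravcsik and Murugesan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CitationFunction:
    """One citation, labelled along the four questions (hypothetical structure)."""

    conceptual: bool    # True = conceptual, False = operational
    organic: bool       # True = organic, False = perfunctory
    evolutionary: bool  # True = evolutionary, False = juxtapositional
    confirmative: bool  # True = confirmative, False = negational

    def describe(self) -> str:
        """Return the four labels as a readable, comma-separated string."""
        return ", ".join([
            "conceptual" if self.conceptual else "operational",
            "organic" if self.organic else "perfunctory",
            "evolutionary" if self.evolutionary else "juxtapositional",
            "confirmative" if self.confirmative else "negational",
        ])


# A passing nod to related theory that the citing paper agrees with.
background = CitationFunction(
    conceptual=True, organic=False, evolutionary=True, confirmative=True
)
print(background.describe())
```

A student could fill in a label like this for each source in a draft bibliography to see how many citations are doing real work and how many are only passing acknowledgments.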

First, these questions have a real importance when we start thinking about the ways in which citation metrics don’t necessarily get at the importance of scientific work.
And second, I think there is some potential in these ideas to help students when they write term papers and cite their sources.

Traditionally, I teach students that they need to cite their sources in order to acknowledge the scholarly work of others.  I talk about the implications of not citing something (it was your own idea, it’s common knowledge, it’s plagiarism), but I don’t really go into more detail about why you might cite something.

By breaking down the purpose of a citation explicitly, as these questions do, perhaps we can better prepare students to effectively use the research articles they find in their term papers and projects.

Now, I’m no expert on teaching writing.  But the best term papers do an effective job of integrating the various sources they find into a cohesive narrative.  Perhaps we could be more explicit about how this is done, and perhaps these ideas can help the students envision what their citations and their term paper might look like.  Perhaps.


Moravcsik, M. J., & Murugesan, P. (1975). Some Results on the Function and Quality of Citations. Social studies of science, 5(1), 86-92. Sage Publications. Retrieved from http://sss.sagepub.com/content/5/1/86.full.pdf

Williams, R. B. (2011). Citation systems in the biosciences: A history, classification and descriptive terminology. Journal of Documentation, 67(6), 995-1014. doi:10.1108/00220411111183564

Will this post be cited more often? Non-content factors that influence citation rates.

For many researchers, the citation is a make-or-break concept.  Most ranking algorithms use citations to determine a journal’s influence or impact.  Publication in “high impact” journals is often the key to tenure and promotion, and the number of times an article has been cited is often widely touted in tenure and promotion packets.

Image courtesy of Flickr user futureatlas.com

With careers, funding and much else riding on citation, it would be useful for scholars and librarians to know why a particular item gets cited.  We’d all like to think that the only reason an article is cited is because its content is relevant (and more relevant than other items) to the study at hand.

Unfortunately, there is some evidence to suggest that other, non-content, factors influence the likelihood of an item being cited.

Big caveat: The quality of these studies is highly variable and their results are sometimes contradictory.  Correlation does not equal causation.

Nevertheless, most of the non-content factors influencing citation rate relate to article discoverability.  You can’t be cited if you can’t be read, and you can’t be read if you can’t be found.  How likely is an article to be found in a database?  Was the article discussed in a newspaper or other popular science forum?  Does the title clearly explain what the article is about (and make you want to read more)?  Is the article already connected to a wide circle of readers via multiple authors or large universities?  While there are classic examples of important scientific publications published in obscure journals, those are the exception and not the norm.

So, in no particular order, here are a few things that folks suggest might influence how often your article is cited:

A lot of research has looked into various aspects of article titles on subsequent citations.

  • Type of title – In an interesting study looking at article titles from PLoS journals, Jamali and Nikzad (2011) wondered if the type of article title affected the citation rate of an article.  In general, they found that article titles that asked a question were downloaded more but cited less than descriptive or declarative titles.  Interestingly, Ball (2009) found that the number of such interrogative titles has increased 50% – 200% in the last 40 years.
  • Length of title – The evidence conflicts. Jamali and Nikzad suggest that articles with longer titles are downloaded and cited less, and Moore (2010) in a quick study found no correlation. However, Habibzadeh and Yadollahie (2010) suggested that longer titles are cited more (especially in high impact factor journals), and Jacques and Sebire (2010) found a positive correlation between article title length and citation rate.
  • Specific terms – Disciplinary abbreviations (very specific keywords) may lead to more citations (Moore, 2010), whereas articles with specific country names in the title might be cited less (Jacques and Sebire, 2010).
  • Humorous titles – To my disappointment, a study of articles with amusing titles in prestigious psychology journals by Sagi and Yechiam (2008) found that these articles were less likely to be cited than other articles with unfunny titles.  Since funny titles are often less descriptive of the actual research, these articles could be more difficult to find in databases.
Articles with funny titles, like this one featured on Discover's Discoblog, may not be cited as much as others.

Positive Results – There is strong evidence to suggest that positive results are much more likely to be submitted and published than negative results.  It seems as though positive results are also more likely to be cited.  Banobi et al. (2011) found that rebuttal articles (either technical reports or full length articles) were less likely to be cited than the original articles, i.e. the articles with positive results were more likely to be cited.  This is consistent with the results of Leimu and Koricheva (2005), who found that articles that successfully proved their original hypothesis were more likely to be cited than articles that disproved the original hypothesis.

Number of authors – Leimu and Koricheva (2005) found a positive correlation between the number of authors and the number of citations in the ecological literature, while Kulkarni et al. (2007) found that group authorship in medical journals increased citation counts by 11.1.  However, a blog post by Moore (2010) suggested that it wasn’t the number of authors that was important, but their reputation.  A recent study of the chemical literature that was able to account for article quality (as measured by reviewers’ ratings) found a correlation with author reputation but no correlation to the number of authors (Bornmann et al. 2012).

Industry relationship – Studying medical journals, Kulkarni et al. (2007) found that industry funded research that reported results beneficial to the industry (i.e. a medical device that worked or a drug that didn’t show harmful side effects) was more likely to be cited than non-industry funded, negative research.

Data sharing – Piwowar et al. (2007) found that within a specific scholarly community (cancer microarray clinical trial publications) free availability of research data led to a higher citation rate, independent of journal impact factor.

Open Access – Lots of studies have been done with mixed results.  A slightly higher number of studies seem to suggest that open access leads to higher citations (See the excellent review article by Wagner (2010)).

Popular press coverage – It makes intuitive sense that journal articles spotlighted by the popular press might be cited more, but this is difficult to prove.  Perhaps the press is merely good at identifying those articles that would be highly cited anyway.  Phillips et al. (1991) were able to take advantage of an interesting situation when the New York Times went on strike in 1978 but continued to produce a “paper of record” that was never published.  They found that items written about in the “paper of record” but not published were no more likely to be cited than other articles.

Length of your bibliography – A study by Webster et al. (2009) suggests a correlation between the length of an article’s bibliography and the number of times it is later cited.  They suggest an “I’ll cite you since you cited me” mentality, but online commentators suggest that this is merely a specious relationship (See Corbyn, 2010, and comments therein).


So, if you want to publish a paper that gets the highest number of citations, what should you do?  Do your study with a large number of prestigious co-authors.  Submit your long article containing positive results and a big bibliography to an open access journal.  Say something nice about a pharmaceutical company.  Share your data and get the New York Times to write about it.

Oh, and it might be useful to have some interesting and solid science in there somewhere.


Really Long Bibliography:

Ball, R. (2009). Scholarly communication in transition: The use of question marks in the titles of scientific articles in medicine, life sciences and physics 1966–2005. Scientometrics, 79(3), 667–679.  Retrieved from: http://www.akademiai.com/index/UH466Q5P3722N37L.pdf

Banobi, J. A., Branch, T. A., & Hilborn, R. (2011). Do rebuttals affect future science? Ecosphere, 2(3), art37. doi:10.1890/ES10-00142.1

Bornmann, L., Schier, H., Marx, W., & Daniel, H. D. (2012). What factors determine citation counts of publications in chemistry besides their quality? Journal of Informetrics, 6(1), 11-18. Elsevier Ltd. doi:10.1016/j.joi.2011.08.004

Corbyn, Z. (2010). An easy way to boost a paper’s citations. Nature. Nature Publishing Group. doi:10.1038/news.2010.406

Habibzadeh, F., & Yadollahie, M. (2010). Are Shorter Article Titles More Attractive for Citations? Cross-sectional Study of 22 Scientific Journals. Croatian Medical Journal, 51(2), 165-170. doi:10.3325/cmj.2010.51.165

Jacques, T. S., & Sebire, N. J. (2010). The impact of article titles on citation hits: an analysis of general and specialist medical journals. JRSM short reports, 1(1), 2. doi:10.1258/shorts.2009.100020

Jamali, H. R., & Nikzad, M. (2011). Article title type and its relation with the number of downloads and citations. Scientometrics, (49), 653-661. doi:10.1007/s11192-011-0412-z

Kulkarni, A. V., Busse, J. W., & Shams, I. (2007). Characteristics associated with citation rate of the medical literature. PloS one, 2(5), e403. doi:10.1371/journal.pone.0000403

Leimu, R., & Koricheva, J. (2005). What determines the citation frequency of ecological papers? Trends in ecology & evolution, 20(1), 28-32. doi:10.1016/j.tree.2004.10.010

Moore, A. (2010). Do Article Title Attributes Influence Citations? Wiley-Blackwell Publishing News. Retrieved December 16, 2011, from http://blogs.wiley.com/publishingnews/2010/09/02/do-article-title-attributes-influence-citations/

Phillips, D. P., Kanter, E. J., Bednarczyk, B., & Tastad, P. L. (1991). Importance of the Lay Press in the Transmission of Medical Knowledge to the Scientific Community. The New England Journal of Medicine, 325(16), 1180-1183.  Available via: http://www.ncbi.nlm.nih.gov/pubmed/1891034

Piwowar, H. A., Day, R. S., & Fridsma, D. B. (2007). Sharing detailed research data is associated with increased citation rate. PloS ONE, 2(3), e308. doi:10.1371/journal.pone.0000308

Sagi, I., & Yechiam, E. (2008). Amusing titles in scientific journals and article citation. Journal of Information Science, 34(5), 680-687. doi:10.1177/0165551507086261

Wagner, A. B. (2010). Open access citation advantage: an annotated bibliography. Issues in Science and Technology Librarianship, (60). Retrieved from http://www.istl.org/10-winter/article2.html

Webster, G. D., Jonason, P. K., & Schember, T. O. (2009). Hot Topics and Popular Papers in Evolutionary Psychology : Analyses of Title Words and Citation Counts in Evolution and Human Behavior , 1979 – 2008. Evolutionary Psychology, 7(3), 348-362. Retrieved from http://www.epjournal.net/filestore/ep07348362.pdf

Trust but verify the results of automatic citation creation tools

Almost all academic databases these days will allow you to export a properly formatted citation (APA, MLA, etc.) for a book or journal article within that database. This is a wonderful feature for undergraduates that saves a lot of really annoying formatting. It is especially helpful for eliminating the annoyance of re-arranging author first names and last names and putting in appropriate punctuation.

Unfortunately, it doesn’t always come out perfectly.

For example, the citation database Scopus regularly produces a citation indicating that an article is “Available from http://www.scopus.com”, which is completely incorrect.  Only the citation is available from Scopus; the full text of the item is found elsewhere.

So in my library instruction sessions I regularly encourage students to double check the results of these citation generators (in databases, in web services like EasyBib and in programs like Mendeley and EndNote).

Because this is what happens when you don’t look things over:

Bad citation from the tumblr blog "Shit my Students Write"

So take a minute or two to look over your bibliography – you don’t want to look silly.
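Part of that double check can be automated. The sketch below scans an exported citation for phrases that a generator is known to add by mistake. The function name `find_citation_problems` and the list `SUSPECT_PHRASES` are hypothetical, the list contains only the Scopus phrase mentioned above, and the example citation is one I assembled by hand for illustration rather than real database output.

```python
# Phrases that should not appear in a finished citation.
# Only the Scopus phrase comes from this post; extend the list as needed.
SUSPECT_PHRASES = [
    "available from http://www.scopus.com",
]


def find_citation_problems(citation, suspect_phrases=SUSPECT_PHRASES):
    """Return the suspect phrases found in a citation (case-insensitive)."""
    lowered = citation.lower()
    return [phrase for phrase in suspect_phrases if phrase in lowered]


# Hand-made example of an export with the stray database statement.
example = (
    "Banobi, J. A., Branch, T. A., & Hilborn, R. (2011). "
    "Do rebuttals affect future science? Ecosphere, 2(3), art37. "
    "Available from http://www.scopus.com"
)

for problem in find_citation_problems(example):
    print("Check this citation, it contains:", problem)
```

A script like this only catches mistakes you already know to look for, so it supplements reading the bibliography and does not replace it.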

Digital Object Identifiers for 17th century publications

The Royal Society of London recently announced that the historic archive of the Philosophical Transactions of the Royal Society will now be openly available (Happy Open Access Week!) 

Philosophical Transactions is typically regarded as the first scientific journal and has been in continuous publication since it started in 1665. (A French journal, the Journal des sçavans, started publication three months prior to the Philosophical Transactions, but since it appealed to a wider audience and included a larger percentage of book reviews, many do not consider it the first real scientific journal).

We’ve had access to this archive for a while now via JSTOR, and I love having the ability to see the very beginnings of the scientific journal article.

What intrigued me when I started digging into their now-open archive was the delightful juxtaposition of the 1665 publication date and the modern DOI.

Since these historical documents are available online, they are digital objects, and assigning DOIs makes a lot of sense.  It also makes each individual article much easier to find.
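As a small illustration of why a DOI makes an article easier to find, here is a sketch that turns a DOI into a link through the public resolver at doi.org. The function name `doi_to_url` is my own, and the cleanup rules (stripping a leading "doi:" and surrounding spaces) are assumptions about how DOIs tend to be written in bibliographies.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Build a resolver link from a DOI as it might appear in a citation."""
    cleaned = doi.strip()
    # Drop a leading "doi:" label if the citation style included one.
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[len("doi:"):].strip()
    # Percent-encode unusual characters but keep the "/" separator.
    return DOI_RESOLVER + quote(cleaned, safe="/")


# The DOI of the 1665 Boyle article cited at the end of this post.
print(doi_to_url("doi: 10.1098/rstl.1665.0007"))
```

Pasting the resulting link into a browser should lead to the publisher's page for the article, wherever that page currently lives.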

So, hurrah for Open Access, hurrah for easy access (not necessarily the same thing), and hurrah for An Account of a Very Odd Monstrous Calf:

Boyle, Robert. 1665.  An Account of a Very Odd Monstrous Calf.  Philosophical Transactions. 1: 10. doi: 10.1098/rstl.1665.0007

Tracking down a citation shouldn’t be this hard

Tracking down a citation you already have should be a relatively simple task.

A colleague of mine asked for help the other day tracking down a citation.  A variety of circumstances made it anything but straightforward and served to remind me about some of the confusing parts of the scholarly communication system (and that I really love my job).

A student had approached the reference desk looking for a citation to this article:

Tan, D. X.; Chen, L. D.; Poeggeler, B.; Manchester, L. C.; Reiter, R. J. (1993). “Melatonin: a potent, endogenous hydroxyl radical scavenger”. Endocrine J 1: 57–60.

The student had found the citation via the Wikipedia entry for Melatonin.  My colleague started out with the usual process – look up the journal, find the right volume and go from there.  Except when you look up Endocrine Journal, you find that the volume number doesn’t match the year, nor are there any articles with a similar title in the publication.  Author searches in the same journal also yield nothing.

Since the citation came from Wikipedia, it seemed probable that there was an error.  So she did a search on Google and Google Scholar to try to find a correct citation.  Neither search turned up the article, but Google Scholar indicated that the article had been cited over 1000 times!  The student found another article by some of the same authors on the topic and was content, but my colleague still wanted the answer.  With other students waiting for reference help, she sent the question along to me.

I was checking my email after my kids went to bed and thought I’d poke around a little to see what I could find.  I re-did the searches my colleague did so that I understood the problem. Theoretically, the article has to exist, since it has been cited so many times.  So why couldn’t we find it?  I tried Google Scholar, PubMed, Scopus and found nothing (we don’t have Web of Science here).  I searched for additional publications by the same authors but I still didn’t find anything close to this one.

So I started looking for similarly named publications.  The journal Endocrine Journal is published by the Japan Endocrine Society and the years don’t match up, so perhaps the abbreviation refers to something different?  I located a journal called simply Endocrine (try finding that one in a Google search!) published by Springer.  This started to look promising because the first volume of Endocrine was published in 1993, just what we want.  But this volume isn’t available on the publisher’s website, so I couldn’t confirm my suspicions.

If Endocrine is the journal we want, why can’t I find it indexed in a database?  I checked indexing information.  PubMed only started indexing it in 1997.  Scopus started indexing it in 1993, but only with the fifth issue, and we need issue 1.  And Google Scholar won’t have it (other than the citation) because it isn’t on the Springer website or in PubMed.

I started to think that the citation really referred to an article in Endocrine, not Endocrine Journal.  But Scopus has over 1000 folks citing Endocrine Journal.  It seemed unlikely that so many people would make the same error.

I stayed up past my bedtime having fun tracking this down.  I emailed my thoughts to my colleague and I wondered if perhaps Web of Science indexed this item from issue 1.

The next day, we asked a colleague at another institution to do a quick search for us in Web of Science.  No hits on the article title.  Perhaps Web of Science didn’t index it from issue 1 either, or perhaps I’m just wrong (it’s been known to happen).

Then I checked Ulrich’s guide to periodicals.  We have it in print here, and the brief entry illustrates the missing piece of our puzzle.

The entry in Ulrich's clearly indicated this journal's former title, a fact that is missing from the publisher's website.

From 1993 to 1994 there were two Endocrine Journals!

For a brief period of time (<2 years), Endocrine called itself Endocrine Journal.  Perhaps they discovered the Japan Endocrine Society’s Endocrine Journal as the internet was making international collaboration easier.

Since I found the original ISSN (0969-711X), I submitted an ILL request to confirm my thoughts.  Sure enough, here’s the article masthead, but with Macmillan Press as the publisher, not Springer.  The early issues available on the Springer website have Stockton Press as the publisher in 1995.  It seems to have changed publisher several times.

Screen shot of the PDF file I received via ILL. Note the publisher at the top.

What’s the moral of this story?

  1. Journals really need to select unique names.  (Do new journals think about Google-ability of their names?)
  2. I picked the right profession because I had fun chasing this down.
  3. Given my difficulty tracking this down, I have to ask: How many of the 1000 folks that cited this article actually tracked it down?  I bet there are some who never laid eyes on it.

More importantly, it can be very easy for valuable information to disappear entirely.  We live in an era of information overload.  Yes, people have been saying the same thing since the invention of the printing press, but these days it isn’t a matter of finding any information, it is a matter of sorting to find the right information.  And even today, an item published just before the explosion of online scholarly information could almost disappear.  Although it may seem like it, not everything is available in Google.

Guest post on ACRLog

Readers of this blog may be interested in a guest post I wrote for the Association of College and Research Libraries blog, ACRLog.

Last week I taught an information literacy class to a group of senior Chemistry students. We didn’t talk about databases or indexes, we talked about numbers. We talked about impact factors and h-indexes and alternative metrics, and the students loved it. Librarians have used these metrics for years in collection development, and have looked them up to help faculty with tenure and promotion packets. But many librarians don’t know where the numbers come from, or what some of the criticisms are.

Read the rest of the post here.

Do researchers find all the relevant literature? Not so much.

In a typical term paper assignment, faculty ask students to review the literature, synthesize their findings and write a cohesive narrative about a particular topic.  They expect students to find the most important research on the subject and determine what the general scientific consensus is, taking into account any disagreements.   By the time most students get to their senior year in college, most appear to do an okay job of this.

But do the faculty follow their own guidelines when writing up their own research?  A recent study in the journal Ecosphere suggests that researchers aren’t always finding, reading or critically analyzing the original and rebuttal papers.

Banobi, Branch and Hilborn (2011) selected 7 high profile papers originally published in Science or Nature, all of which had at least one rebuttal published.  The authors identified papers that cited the original article or the rebuttal and then analyzed:

  • Number of citations to the original paper vs. citations to the rebuttal,
  • How well the citing paper agreed with the original paper or the rebuttal (and whether this changed after the publication of the rebuttal)
  • Whether citations to the original paper decreased over time

After correcting for the effects of self-citation, their results are remarkable:

  • Original papers were cited 17 times more than the rebuttals.
  • They found a lot of papers that cited only the original paper, and 95% of these accepted the original at face value
  • Only about 5% of the citations to the original papers were critical (at all) of the original article.
  • Some papers cited the original and the rebuttals as though they both supported the same position!

Why is this happening?

Banobi et al. suggest that:

This confirms our intuitive sense that most authors, except the relative few that are writing and citing rebuttals, tend to accept a paper’s conclusions uncritically.

Additionally, we can wonder if the authors have really read all of the papers they cite (something suggested by Simkin and Roychowdhury 2003) or found all of the relevant research (suggested by Robinson and Goodman (2011), my discussion here).

The authors suggest that original articles and rebuttals need to be better linked in our information retrieval systems, something that I’ve touched on earlier.  But a lack of such system tools does not absolve the authors of their responsibility to find relevant earlier work.  Good keyword searches will often easily turn up the rebuttal papers, and citation searching (available for free on Google Scholar if you don’t have Web of Science or Scopus) should be required!

We may also need to examine the possibility that some researchers are just as guilty as their students of not finding and reading the relevant literature.


Banobi, J., Branch, T., & Hilborn, R. (2011). Do rebuttals affect future science? Ecosphere, 2 (3) DOI: 10.1890/ES10-00142.1

Robinson, K. A., & Goodman, S. N. (2011). A systematic examination of the citation of prior research in reports of randomized, controlled trials. Annals of internal medicine, 154(1), 50-5. DOI: 10.1059/0003-4819-154-1-201101040-00007.

Simkin, M. V., & Roychowdhury, V. P. (2002). Read before you cite! Complex Systems, 14, 269-272. Retrieved July 15, 2011, from http://arxiv.org/abs/cond-mat/0212043.

Note: Hat tip to Richard P. Grant who posted a link to the Banobi et al. article on Google+.