In my previous post, I took a look at some of the scholarship about why certain articles are cited more than others.
I feel bad, because by focusing on all of the little things that correlate with citation rate, I didn’t talk about the substantive aspects of how a citation is used.
Cue the next article I found by R.B. Williams (2011) about the history and classification of citation systems in the biosciences.
This was an exciting article to read for two reasons. First, I had been looking for some information about the history of various citation styles for a while. (It isn’t easy. Try Googling “history of citation styles”.)
Second, the article made me aware of the scholarship about how citations are actually used within scientific documents. I am particularly drawn to the questions posed by Moravcsik and Murugesan back in 1975.
Is the reference conceptual or operational? In other words, is the reference made in connection with a concept or theory that is used in the referring paper, or is it made in connection with a tool or physical technique used in the referring paper? The distinction is not meant to be a value judgment, and is not to be taken as synonymous with judging the importance of the paper referred to.
Is the reference organic or perfunctory? In other words, is the reference truly needed for the understanding of the referring paper (or to the working out of the content of that paper), or is it mainly an acknowledgment that some other work in the same general area has been performed?
Is the reference evolutionary or juxtapositional? In other words, is the referring paper built on the foundations provided by the reference, or is it an alternative to it?
Is the reference confirmative or negational? In other words, is it claimed by the referring paper that the reference is correct, or is its correctness disputed? Incorrectness need not be claimed through an actual demonstration of an error in the paper referred to, but could also be established, for example, through inferior agreement with experimental data.
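The four dimensions above amount to a small classification scheme, and they could be sketched as one. This is a hypothetical Python model of Moravcsik and Murugesan's questions (the class and field names are my own, not from the paper):

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical labels for the four Moravcsik & Murugesan (1975) dimensions.
class Role(Enum):
    CONCEPTUAL = "conceptual"        # cited for a concept or theory
    OPERATIONAL = "operational"      # cited for a tool or technique

class Necessity(Enum):
    ORGANIC = "organic"              # truly needed to understand the paper
    PERFUNCTORY = "perfunctory"      # acknowledgment of related work

class Relation(Enum):
    EVOLUTIONARY = "evolutionary"    # builds on the cited work
    JUXTAPOSITIONAL = "juxtapositional"  # an alternative to it

class Stance(Enum):
    CONFIRMATIVE = "confirmative"    # treats the cited work as correct
    NEGATIONAL = "negational"        # disputes its correctness

@dataclass
class CitationContext:
    cited_work: str
    role: Role
    necessity: Necessity
    relation: Relation
    stance: Stance

# Example: a paper that builds on, and agrees with, a cited theory.
ctx = CitationContext("Moravcsik & Murugesan 1975",
                      Role.CONCEPTUAL, Necessity.ORGANIC,
                      Relation.EVOLUTIONARY, Stance.CONFIRMATIVE)
```

Note that each dimension is independent: a citation can be perfunctory and negational, or organic and juxtapositional, in any combination.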
First, these questions have a real importance when we start thinking about the ways in which citation metrics don’t necessarily get at the importance of scientific work.
And second, I think there is some potential in these ideas to help students when they write term papers and cite their sources.
Traditionally, I teach students that they need to cite their sources in order to acknowledge the scholarly work of others. I talk about the implications of not citing something (it was your own idea, it’s common knowledge, it’s plagiarism), but I don’t really go into more detail about why you might cite something.
By breaking down the purpose of a citation explicitly, as these questions do, perhaps we can better prepare students to effectively use the research articles they find in their term papers and projects.
Now, I’m no expert on teaching writing. But the best term papers do an effective job of integrating the various sources they find into a cohesive narrative. Perhaps we could be more explicit about how this is done, and perhaps these ideas can help the students envision what their citations and their term paper might look like. Perhaps.
For many researchers, the citation is a make-or-break concept. Most ranking algorithms use citations to determine a journal’s influence or impact. Publication in “high impact” journals is often the key to tenure and promotion, and the number of times an article has been cited is often widely touted in tenure and promotion packets.
With careers, funding and much else riding on citation, it would be useful for scholars and librarians to know why a particular item gets cited. We’d all like to think that the only reason an article is cited is because its content is relevant (and more relevant than other items) to the study at hand.
Unfortunately, there is some evidence to suggest that other, non-content, factors influence the likelihood of an item being cited.
Big caveat: The quality of these studies is highly variable and their results are sometimes contradictory. Correlation does not equal causation.
Nevertheless, most of the non-content factors influencing citation rate relate to article discoverability. You can’t be cited if you can’t be read, and you can’t be read if you can’t be found. How likely is an article to be found in a database? Was the article discussed in a newspaper or other popular science forum? Does the title clearly explain what the article is about (and make you want to read more)? Is the article already connected to a wide circle of readers via multiple authors or large universities? While there are classic examples of important scientific publications published in obscure journals, those are the exception and not the norm.
So, in no particular order, here are a few things that folks suggest might influence how often your article is cited:
A lot of research has looked into various aspects of article titles on subsequent citations.
Type of title – In an interesting study looking at article titles from PLoS journals, Jamali and Nikzad (2011) wondered if the type of article title affected the citation rate of an article. In general, they found that article titles that asked a question were downloaded more but cited less than descriptive or declarative titles. Interestingly, Ball (2009) found that the number of such interrogative titles has increased 50%–200% in the last 40 years.
Length of title – Jamali and Nikzad suggest that articles with longer titles are downloaded and cited less, and Moore (2010) in a quick study found no correlation. However, Habibzadeh and Yadollahie (2010) suggested that longer titles are cited more (especially in high impact factor journals), and a positive correlation between article title length and citation rate was found by Jacques and Sebire (2010).
Specific terms – Disciplinary abbreviations (very specific keywords) may lead to more citations (Moore, 2010), whereas articles with specific country names in the title might be cited less (Jacques and Sebire, 2010).
Humorous titles – To my disappointment, a study of articles with amusing titles in prestigious psychology journals by Sagi and Yechiam (2008) found that these articles were less likely to be cited than articles with unfunny titles. Since funny titles are often less descriptive of the actual research, these articles could be more difficult to find in databases.
Positive Results – There is strong evidence to suggest that positive results are much more likely to be submitted and published than negative results. It seems as though positive results are also more likely to be cited. Banobi et al. (2011) found that rebuttal articles (either technical reports or full-length articles) were less likely to be cited than the original articles, i.e. the articles with positive results were more likely to be cited. This correlates well with the results of Leimu and Koricheva (2005), who found that articles that successfully proved their original hypothesis were more likely to be cited than articles that disproved the original hypothesis.
Number of authors – Leimu and Koricheva (2005) found a positive correlation between the number of authors and the number of citations in the ecological literature, while Kulkarni et al. (2007) found that group authorship in medical journals increased citation counts by 11.1. However, a blog post by Moore (2010) suggested that it wasn’t the number of authors that was important, but their reputation. A recent study of the chemical literature that was able to account for article quality (as measured by reviewers’ ratings) found a correlation with author reputation but no correlation with the number of authors (Bornmann et al. 2012).
Industry relationship – Studying medical journals, Kulkarni et al. (2007) found that industry-funded research that reported results beneficial to the industry (i.e. a medical device that worked or a drug that didn’t show harmful side effects) was more likely to be cited than non-industry-funded, negative research.
Data sharing – Piwowar et al. (2007) found that within a specific scholarly community (cancer microarray clinical trial publications) free availability of research data led to a higher citation rate, independent of journal impact factor.
Open Access – Lots of studies have been done with mixed results. A slightly higher number of studies seem to suggest that open access leads to higher citations (See the excellent review article by Wagner (2010)).
Popular press coverage – It makes intuitive sense that journal articles spotlighted by the popular press might be cited more, but this is difficult to prove. Perhaps the press is merely good at identifying those articles that would be highly cited anyway. Phillips et al. (1991) were able to take advantage of an interesting situation when the New York Times went on strike in 1978 but continued to produce a “paper of record” that was never published. Phillips et al. found that items written about in the “paper of record” but not published were no more likely to be cited than other articles.
Length of your bibliography – A study by Webster et al. (2009) suggests a correlation between the length of an article’s bibliography and the number of times it is later cited. They suggest an “I’ll cite you since you cited me” mentality, but online commentators suggest that this is merely a specious relationship (see Corbyn, 2010, and comments therein).
So, if you want to publish a paper that gets the highest number of citations, what should you do? Do your study with a large number of prestigious co-authors. Submit your long article containing positive results and a big bibliography to an open access journal. Say something nice about a pharmaceutical company. Share your data and get the New York Times to write about it.
Oh, and it might be useful to have some interesting and solid science in there somewhere.
Really Long Bibliography:
Ball, R. (2009). Scholarly communication in transition: The use of question marks in the titles of scientific articles in medicine, life sciences and physics 1966–2005. Scientometrics, 79(3), 667–679. Retrieved from: http://www.akademiai.com/index/UH466Q5P3722N37L.pdf
Banobi, J. A., Branch, T. A., & Hilborn, R. (2011). Do rebuttals affect future science? Ecosphere, 2(3), art37. doi:10.1890/ES10-00142.1
Bornmann, L., Schier, H., Marx, W., & Daniel, H. D. (2012). What factors determine citation counts of publications in chemistry besides their quality? Journal of Informetrics, 6(1), 11-18. Elsevier Ltd. doi:10.1016/j.joi.2011.08.004
Habibzadeh, F., & Yadollahie, M. (2010). Are Shorter Article Titles More Attractive for Citations? Cross-sectional Study of 22 Scientific Journals. Croatian Medical Journal, 51(2), 165-170. doi:10.3325/cmj.2010.51.165
Jacques, T. S., & Sebire, N. J. (2010). The impact of article titles on citation hits: an analysis of general and specialist medical journals. JRSM short reports, 1(1), 2. doi:10.1258/shorts.2009.100020
Jamali, H. R., & Nikzad, M. (2011). Article title type and its relation with the number of downloads and citations. Scientometrics, (49), 653-661. doi:10.1007/s11192-011-0412-z
Kulkarni, A. V., Busse, J. W., & Shams, I. (2007). Characteristics associated with citation rate of the medical literature. PloS one, 2(5), e403. doi:10.1371/journal.pone.0000403
Leimu, R., & Koricheva, J. (2005). What determines the citation frequency of ecological papers? Trends in ecology & evolution, 20(1), 28-32. doi:10.1016/j.tree.2004.10.010
Phillips, D. P., Kanter, E. J., Bednarczyk, B., & Tastad, P. L. (1991). Importance of the Lay Press in the Transmission of Medical Knowledge to the Scientific Community. The New England Journal of Medicine, 325(16), 1180-1183. Available via: http://www.ncbi.nlm.nih.gov/pubmed/1891034
Piwowar, H. A., Day, R. S., & Fridsma, D. B. (2007). Sharing detailed research data is associated with increased citation rate. PloS ONE, 2(3), e308. doi:10.1371/journal.pone.0000308
Sagi, I., & Yechiam, E. (2008). Amusing titles in scientific journals and article citation. Journal of Information Science, 34(5), 680-687. doi:10.1177/0165551507086261
Webster, G. D., Jonason, P. K., & Schember, T. O. (2009). Hot Topics and Popular Papers in Evolutionary Psychology: Analyses of Title Words and Citation Counts in Evolution and Human Behavior, 1979–2008. Evolutionary Psychology, 7(3), 348-362. Retrieved from http://www.epjournal.net/filestore/ep07348362.pdf
Almost all academic databases these days will allow you to export a properly formatted citation (APA, MLA, etc.) for a book or journal article within that database. This is a wonderful feature for undergraduates that saves a lot of really annoying formatting. It is especially helpful for eliminating the annoyance of re-arranging author first names and last names and putting in appropriate punctuation.
Unfortunately, it doesn’t always come out perfectly.
For example, the citation database Scopus regularly produces a citation indicating that an article is “Available from http://www.scopus.com,” which is completely incorrect. Only the citation is available from Scopus; the full text of the item is found elsewhere.
So in my library instruction sessions I regularly encourage students to double check the results of these citation generators (in databases, in web services like EasyBib and in programs like Mendeley and EndNote).
Because this is what happens when you don’t look things over:
So take a minute or two to look over your bibliography – you don’t want to look silly.
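Some of this double-checking could even be partly automated. Here is a hypothetical sketch of a helper that flags suspect text in an exported citation; the patterns are my own guesses at common artifacts (including the Scopus one), not an exhaustive or authoritative list:

```python
import re

# Hypothetical patterns for common citation-generator artifacts (assumptions).
SUSPECT_PATTERNS = [
    r"Available from http://www\.scopus\.com",  # Scopus exports its own URL
    r"n\.d\.",                                  # missing publication date
    r"Retrieved from\s*$",                      # dangling retrieval statement
]

def flag_citation(citation: str) -> list[str]:
    """Return the suspect patterns found in an exported citation string."""
    return [p for p in SUSPECT_PATTERNS if re.search(p, citation)]

cite = ("Smith, J. (2011). A study of things. Journal of Stuff, 1(1), 1-10. "
        "Available from http://www.scopus.com")
print(flag_citation(cite))  # the Scopus artifact is flagged
```

A script like this catches only the mechanical problems, of course; it can’t tell you whether the author names or page numbers came out mangled, so eyeballs are still required.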
Philosophical Transactions is typically regarded as the first scientific journal and has been in continuous publication since it started in 1665. (A French journal, the Journal des sçavans, started publication three months prior to the Philosophical Transactions, but since it appealed to a wider audience and included a larger percentage of book reviews, many do not consider it the first real scientific journal.)
We’ve had access to this archive for a while now via JSTOR, and I love having the ability to see the very beginnings of the scientific journal article.
What intrigued me when I started digging into their now-open archive was the delightful juxtaposition of the 1665 publication date and the modern DOI.
Since these historical documents are available online, they are digital objects, and assigning DOIs makes a lot of sense. It also makes each individual article much easier to find.
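Part of what makes a DOI so handy is that it resolves through a single central proxy at doi.org, so any article's identifier can be turned into a working link. A trivial illustrative sketch (the helper function is my own, not a standard API):

```python
def doi_to_url(doi: str) -> str:
    """Build the standard resolver URL for a DOI (central proxy at doi.org)."""
    return f"https://doi.org/{doi.strip()}"

# Example using the Banobi et al. (2011) DOI cited elsewhere in this blog:
print(doi_to_url("10.1890/ES10-00142.1"))
```

Following that URL redirects to wherever the publisher currently hosts the article, which is exactly why a 1665 article with a modern DOI stays findable even if the hosting platform changes.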
Tracking down a citation you already have should be a relatively simple task.
A colleague of mine asked for help the other day tracking down a citation. A variety of circumstances made it anything but straightforward and served to remind me about some of the confusing parts of the scholarly communication system (and that I really love my job).
A student had approached the reference desk looking for a citation to this article:
Tan, D. X.; Chen, L. D.; Poeggeler, B.; Manchester, L. C.; Reiter, R. J. (1993). “Melatonin: a potent, endogenous hydroxyl radical scavenger”. Endocrine J 1: 57–60.
The student had found the citation via the Wikipedia entry for Melatonin. My colleague started out with the usual process – look up the journal, find the right volume and go from there. Except when you look up Endocrine Journal, you find that the volume number doesn’t match the year, nor are there any articles with a similar title in the publication. Author searches in the same journal also yield nothing.
Since the citation came from Wikipedia, it seemed probable that there was an error. So she did a search on Google and Google Scholar to try to find a correct citation. Neither search turned up the article, but Google Scholar indicated that the article had been cited over 1,000 times! The student found another article by some of the same authors on the topic and was content, but my colleague still wanted the answer. With other students waiting for reference help, she sent the question along to me.
I was checking my email after my kids went to bed and thought I’d poke around a little to see what I could find. I re-did the searches my colleague had done so that I understood the problem. Theoretically, the article had to exist, since it had been cited so many times. So why couldn’t we find it? I tried Google Scholar, PubMed, and Scopus and found nothing (we don’t have Web of Science here). I searched for additional publications by the same authors but still didn’t find anything close to this one.
So I started looking for similarly named publications. The journal Endocrine Journal is published by the Japan Endocrine Society and the years don’t match up, so perhaps the abbreviation refers to something different? I located a journal called simply Endocrine (try finding that one in a Google search!) published by Springer. This started to look promising because the first volume of Endocrine was published in 1993, just what we wanted. But this volume isn’t available on the publisher’s website, so I couldn’t confirm my suspicions.
If Endocrine is the journal we want, why can’t I find it indexed in a database? I checked indexing information. PubMed only started indexing it in 1997. Scopus started indexing it in 1993, but only with the fifth issue, and we need issue 1. And Google Scholar won’t have it (other than the citation) because it isn’t on the Springer website or in PubMed.
I started to think that the citation really referred to an article in Endocrine, not Endocrine Journal. But Scopus has over 1,000 folks citing Endocrine Journal. It seemed unlikely that so many people would make the same error.
I stayed up past my bedtime having fun tracking this down. I emailed my thoughts to my colleague and I wondered if perhaps Web of Science indexed this item from issue 1.
The next day, we asked a colleague at another institution to do a quick search for us in Web of Science. No hits on the article title. Perhaps Web of Science didn’t index it from issue 1 either, or perhaps I’m just wrong (it’s been known to happen).
From 1993 to 1994 there were two Endocrine Journals!
For a brief period of time (<2 years), Endocrine called itself Endocrine Journal. Perhaps they discovered the Japan Endocrine Society’s Endocrine Journal as the internet was making international collaboration easier.
Since I found the original ISSN (0969-711X), I submitted an ILL request to confirm my thoughts. Sure enough, here’s the article masthead, but with Macmillan Press as the publisher, not Springer. The early issues available on the Springer website have Stockton Press as the publisher in 1995. It seems to have changed publisher several times.
What’s the moral of this story?
Journals really need to select unique names. (Do new journals think about Google-ability of their names?)
I picked the right profession because I had fun chasing this down.
Given my difficulty tracking this down, I have to ask: How many of the 1000 folks that cited this article actually tracked it down? I bet there are some who never laid eyes on it.
More importantly, it can be very easy for valuable information to disappear entirely. We live in an era of information overload. Yes, people have been saying the same thing since the invention of the printing press, but these days it isn’t a matter of finding any information, it is a matter of sorting to find the right information. And even today, an item published just before the explosion of online scholarly information could almost disappear. Although it may seem like it, not everything is available in Google.
Readers of this blog may be interested in a guest post I wrote for the Association of College and Research Libraries blog, ACRLog.
Last week I taught an information literacy class to a group of senior Chemistry students. We didn’t talk about databases or indexes, we talked about numbers. We talked about impact factors and h-indexes and alternative metrics, and the students loved it. Librarians have used these metrics for years in collection development, and have looked them up to help faculty with tenure and promotion packets. But many librarians don’t know where the numbers come from, or what some of the criticisms are.
In a typical term paper assignment, faculty ask students to review the literature, synthesize their findings and write a cohesive narrative about a particular topic. They expect students to find the most important research on the subject and determine what the general scientific consensus is, taking into account any disagreements. By the time most students get to their senior year in college, most appear to do an okay job of this.
But do the faculty follow their own guidelines when writing up their own research? A recent study in the journal Ecosphere suggests that researchers aren’t always finding, reading or critically analyzing the original and rebuttal papers.
Banobi, Branch and Hilborn (2011) selected seven high-profile papers originally published in Science or Nature, all of which had at least one rebuttal published. The authors identified papers that cited the original article or the rebuttal and then analyzed:
Number of citations to the original paper vs. citations to the rebuttal
How well the citing paper agreed with the original paper or the rebuttal (and whether this changed after the publication of the rebuttal)
Whether citations to the original paper decreased over time
After correcting for the effects of self-citation, their results are remarkable:
Original papers were cited 17 times more than the rebuttals.
They found a lot of papers that cited only the original paper, and 95% of these accepted the original at face value.
Only about 5% of the citations to the original papers were critical (at all) of the original article.
Some papers cited the original and the rebuttals as though they both supported the same position!
Why is this happening?
Banobi et al. suggest that:
This confirms our intuitive sense that most authors, except the relative few that are writing and citing rebuttals, tend to accept a paper’s conclusions uncritically.
Additionally, we can wonder if the authors have really read all of the papers they cite (something suggested by Simkin and Roychowdhury 2003) or found all of the relevant research (as suggested by Robinson and Goodman (2011), my discussion here)
The authors suggest that original articles and rebuttals need to be better linked in our information retrieval systems, something that I’ve touched on earlier. But a lack of such system tools does not absolve the authors of their responsibility to find relevant earlier work. Good keyword searches will often easily turn up the rebuttal papers, and citation searching (available for free on Google Scholar if you don’t have Web of Science or Scopus) should be required!
We may also need to examine the possibility that some researchers are just as guilty as their students of not finding and reading the relevant literature.
Banobi, J., Branch, T., & Hilborn, R. (2011). Do rebuttals affect future science? Ecosphere, 2 (3) DOI: 10.1890/ES10-00142.1
Robinson, K. A., & Goodman, S. N. (2011). A systematic examination of the citation of prior research in reports of randomized, controlled trials. Annals of internal medicine, 154(1), 50-5. DOI: 10.1059/0003-4819-154-1-201101040-00007.
There are many, many parts of my job that I love. Teaching students the mechanics of a citation style is not one of them. I don’t mind teaching about many aspects of citations, including effective use of in-text citation, or even technology sessions on using tools like Mendeley. But teaching the basic, “this is what an NLM article citation style looks like” is one of my least favorite parts of my job.
This is partly because I can completely sympathize with students when they complain about the preponderance of citation styles – it doesn’t make much practical sense.
It’s also probably because my style of teaching about citation styles isn’t very exciting.
My basic plan starts with a PowerPoint presentation in which I discuss the following:
Why we use specific styles – it isn’t just to annoy undergraduates, it is to facilitate clear communication among scholars.
Specific rules for articles, books or websites in the selected citation style – especially the bits that tend to mess with students
Resources to make all of this easier – “bibliography” output buttons in databases, references managers like Mendeley and Zotero
This is normally followed by an in-class practice session where they are given a sample article, website, book etc. and asked to create a citation.
I follow up with a homework assignment via the LMS in which I ask them to create a properly formatted citation for a resource they will use in an upcoming assignment. I provide feedback and they should have at least one good citation for their project.
I believe that this information is useful to students, and the faculty who ask for such a session believe it is worth giving up class time, but it isn’t the most interesting.
So I put the question to the universe – what are some teaching strategies that can make the boring fundamentals of citation styles more engaging (to both me and my students)?
One of the things that faculty often complain about is that students don’t adequately track down and cite enough relevant material for their term papers and projects. This problem isn’t confined to undergraduates. A study in the January 4, 2011 issue of the Annals of Internal Medicine by Karen Robinson and Steven Goodman finds that medical researchers aren’t doing a very good job of citing previous research either.
Specifically, Robinson and Goodman looked at reports of randomized, controlled trials to determine if the authors cited previous, related trials. Citing previous trials is an important part of putting the results of the current trial in context, and in the case of medicine, may help save lives.
In order to do this study, the authors used meta-analysis to locate groups of related papers. They reasoned that if the studies were similar enough to group mathematically, they were similar enough to cite each other. They allowed for a 1-year gap between an original publication and a citation.
Overall, they found that only 25% of relevant papers were actually cited.
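That 25% figure is simply a coverage ratio: of the prior papers judged relevant, how many were actually cited. A toy sketch of the calculation, with made-up paper IDs (the function and IDs are mine, not from the study):

```python
def citation_coverage(relevant: set[str], cited: set[str]) -> float:
    """Fraction of relevant prior papers that were actually cited."""
    if not relevant:
        return 1.0  # nothing relevant existed, so nothing was missed
    return len(relevant & cited) / len(relevant)

# Hypothetical example: four relevant prior trials, only one of them cited.
relevant_priors = {"trial-A", "trial-B", "trial-C", "trial-D"}
cited_in_report = {"trial-A", "review-X"}

print(citation_coverage(relevant_priors, cited_in_report))  # 0.25
```

Note that the set intersection ignores cited items that weren’t in the relevant group (like “review-X” here), so padding a bibliography with tangential sources doesn’t raise the score.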
Why might a citation not be included? I can think of a few reasons.
The authors couldn’t find the previous study
The authors found the previous study but didn’t think it was relevant enough to cite
The authors found the study and purposefully excluded it for some nefarious purpose
Robinson and Goodman seem to favor the first explanation most of all:
The obvious remedy – requiring a systematic review of relevant literature [before an RCT is funded] – is hampered by a lack of necessary skills and resources.
This obviously speaks to the importance of information literacy skills in both undergraduates and medical school students. One of the most troubling things about the article results was Robinson and Goodman’s determination that a very simple PubMed search could locate most of the articles on one of the topics assessed.
An interesting recommendation that Robinson and Goodman repeat throughout the article is to suggest that a description of the search strategy for prior results be included in the final published article (and they follow their own advice in an appendix to the article).
Of course, it is hard to believe that this problem is limited to just the authors of randomized control trials in biomedicine. It wouldn’t take much to convince me that this problem exists throughout scholarly work, restricting the speed at which new discoveries are made. I would bet that the problem can get particularly difficult in interdisciplinary areas.
We need to start with our undergraduates and convince them that it isn’t enough just to find the minimum number of required sources, but to really get at the heart of previous work on a topic. This leads naturally into the topic of getting students to pick manageable project topics. Of course, undergraduates like clear guidelines (and for the most part this is good teaching strategy), but upper level undergraduates should be able to handle the requirement that they find most of the relevant literature on a topic.
Robinson KA, & Goodman SN (2011). A systematic examination of the citation of prior research in reports of randomized, controlled trials. Annals of internal medicine, 154 (1), 50-5 PMID: 21200038
Despite the vast array of challenges and problems with creating and tracking citations to journal articles, the scholarly publishing realm has developed (over the past 350 years) standards to deal with these things. New concepts such as DOIs, an increase in the number of providers who track citations (Web of Knowledge, Scopus, Google Scholar), and tools to easily format citations have made all of this a bit easier.
Scholars are now facing new challenges in creating and tracking citations. The types of material being cited are probably more varied than ever. Scholars are citing archived data sets, websites that may not exist in few months (or years), multimedia, and perhaps even blog posts and tweets in addition to the traditional journal articles, books and technical reports.
At the Science Online 2011 conference, several speakers led discussions that focused on the challenges and possible solutions to some of these new issues.
Jason Hoyt, Chief Scientist at Mendeley, discussed some of their new initiatives to track citations based on user libraries. Since I don’t want to spread misinformation about the nature of these initiatives and I’m not entirely clear about them, you’ll just have to stay tuned for more information.
Martin Fenner discussed his work with project ORCID, which will be a publisher-independent tool to help with author disambiguation.
Overall, there was an interesting discussion about the nature of citation itself. The way the metrics count it, a citation is a citation. You get ‘credit’ for a citation even if the folks who cite you say that you are completely wrong. Is there a way to use the semantic web to indicate how a citation is being used? For example, Scopus indicates that Andrew Wakefield’s retracted paper about autism and vaccines has been cited 714 times since its publication, including almost 65 citations since the paper was retracted at the beginning of 2010. Could there be a way to easily say how many of these citations say that Wakefield was wrong?
With all of these interesting advances, there are a lot of challenges. Can the same set of metadata used to describe genetic data be used to describe high energy physics data? Are we moving toward a future where scholarly metadata is exponentially more fuzzy than it is now? Will standard procedures develop – is there an incentive for standard procedures to develop? Who will develop them?
I don’t know enough to even hazard a guess at the answer to these questions. For at least a little while, before scientists, publishers and librarians work out the details, undergraduate students are going to be even more frustrated at citing material for their projects, especially due to varying faculty expectations. The “How do you cite this?” questions at the reference desk will get much more complicated before they get any easier.