For the third year in a row, I was lucky enough to attend the ScienceOnline “unconference” in North Carolina. This gathering of scientists, professors, journalists, editors, librarians and other interested folks makes for three of the most intellectually dense days of the year. With a wide variety of sessions and an extraordinary group of conference goers, it is easily my favorite conference. Details of each of the sessions can be found on the conference wiki.
As I got back to work over the past week and a half, the sessions and hallway conversations from the conference kept rolling around in my head. The consistent thing among everything at the conference seems to be the importance of context. A few examples from the conference and beyond:
Data – Without context, data is useless. Context ranges from something as simple as a unit or a label to information about the procedure used to get the data. One reason that scientists cite for their reluctance to share data is the concern that the data will be misused. Without proper context, data is more likely to be misused, but scientists don’t want to spend their time adding metadata to their datasets; they want to spend their time doing science. In addition, although I was excited about some of the new data sharing services that are springing up, some of them require very little metadata. This is great if you don’t have a lot of time to upload the data, but slightly pointless if the data can’t be used because of the lack of context.
Popular science – I read a lot of science blogs and science news. I read (ok, skim) a lot of scientific journal articles. In all these cases, the information needs a bit of context. A classic example from after the conference is a press release about a new article discussing the origins of the Little Ice Age. Most of the initial news reports failed to provide a good background to the story, largely because they were based on the press release. Later posts and more in-depth stories (from those who had read the journal article) were able to help us better understand where this information comes from.
Undergraduates and blogging – Along the same lines, I sat down with an undergraduate today to help her understand how to read a scientific article. We talked a bit about what each section of the paper was likely to contain, and one of the most important things was context in the Introduction and Discussion. Likewise, students need to understand the context of a blog post. What is likely to be discussed? Where can you find more information? What are the social norms of the blogging world?
The semantic web – This is all about context. Nothing but context. Just make it machine readable. Ontologies can help make connections between data points and establish relationships between concepts. Without context, it’s just a bunch of unrelated zeros and ones.
Altmetrics – Just like you need context with research data, measuring research output requires context. Right now, many researchers have a vague idea of what a good impact factor in their field is. Through hard work and effort, they have a sense of context. For the new metrics, most researchers don’t yet have enough experience to be able to put them in context. What does it mean that 4 folks mentioned your article on Twitter or that 18 folks bookmarked your article on Mendeley? Is that good? Bad? Mediocre?
Managing Digital Information* – One big challenge here is to put this all in context. That’s why we love email programs that put email messages into threads – we like seeing what came before and after. How can we see what needs to be seen, ignore the things that can be ignored, and know how it all fits together?
I have heard my boss argue that one of the things that libraries are really good at is providing context, although we rarely put it in those terms. Looking over the list I created above, I think it’s impossible to argue otherwise.
Folks in libraries and information centers create metadata, provide books and reference materials to help folks understand the context of the world around them, teach folks about online environments and explore new technologies. We make lists and compare things, and provide tools to help manage digital information.
Yes, we are in the information business, but perhaps it is a bit more descriptive to say that we are in the business of providing context.
*I still can’t get over the irony of this session. In front of a standing room only crowd, the moderators tried to engage us in a discussion of how to manage and deal with the information deluge, while at the same time we had laptop computers and smart phones open to catch every tweet and blog post about what was being said.
I must admit that the shift to digital textbooks concerns me slightly, but not because I object to the format: I read a lot of ebooks* on my iPhone, via the Kindle and Nook apps.
What concerns me is the decreasing number of options for students when eTextbooks come into the picture.
In a print world, students have lots of options: they can buy books new or used, share books with friends, borrow them from a library (for a few hours or for the semester if they are lucky) and sell them back at the end of the semester for at least a bit of what they paid for them.
But digital textbooks generally have a single price point. And it isn’t always cheaper than the print version. And you can’t share the books with a friend. And you can’t sell them back. Sometimes you don’t even get to keep them.
Certainly, some pricing makes these digital textbooks cheaper, but not always. Faculty have not always paid attention to textbook cost when selecting a book for their classes, and I worry that the more complicated factors involved in eTextbook access will also get overlooked.
So what can we do? Well, we have a textbooks on reserve program at my library where we try to get faculty to donate copies of textbooks to the library. Part of this program is simply talking to faculty about how difficult it can be for some students to purchase all of their textbooks. Perhaps the extension of this program is to help faculty understand all of the options available.
Stereotypically, librarians have a lot of time to read at work. In reality, any reading we do (even for professional purposes) is typically done in our “free” time. With the birth of my second daughter this year, that free time was almost nonexistent, and as a result, I read considerably fewer books this year. You can view the complete list at Worldcat.org to find each book in a library near you.
Some great nonfiction:
Fey, T. (2011). Bossypants. New York: Little, Brown and Co.
In my previous post, I took a look at some of the scholarship about why certain articles are cited more than others.
I feel bad, because by focusing on all of the little things that correlate with citation rate, I didn’t talk about the substantive aspects of how a citation is used.
Cue the next article I found by R.B. Williams (2011) about the history and classification of citation systems in the biosciences.
This was an exciting article to read for two reasons. First, I had been looking for some information about the history of various citation styles for a while. (It isn’t easy. Try Googling “history of citation styles.”)
Second, the article made me aware of the scholarship about how citations are actually used within scientific documents. I am particularly drawn to the questions posed by Moravcsik and Murugesan back in 1975.
Is the reference conceptual or operational? In other words, is the reference made in connection with a concept or theory that is used in the referring paper, or is it made in connection with a tool or physical technique used in the referring paper? The distinction is not meant to be a value judgment, and is not to be taken as synonymous with judging the importance of the paper referred to.
Is the reference organic or perfunctory? In other words, is the reference truly needed for the understanding of the referring paper (or to the working out of the content of that paper), or is it mainly an acknowledgment that some other work in the same general area has been performed?
Is the reference evolutionary or juxtapositional? In other words, is the referring paper built on the foundations provided by the reference, or is it an alternative to it?
Is the reference confirmative or negational? In other words, is it claimed by the referring paper that the reference is correct, or is its correctness disputed? Incorrectness need not be claimed through an actual demonstration of an error in the paper referred to, but could also be established, for example, through inferior agreement with experimental data.
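For anyone who wants to try tagging the citations in a paper this way, the four questions map naturally onto a small record with one field per question. What follows is only a rough sketch: the class names, field names and the `share_perfunctory` helper are my own hypothetical labels, not anything defined by Moravcsik and Murugesan.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    CONCEPTUAL = "conceptual"      # tied to a concept or theory
    OPERATIONAL = "operational"    # tied to a tool or technique


class Need(Enum):
    ORGANIC = "organic"            # truly needed to understand the paper
    PERFUNCTORY = "perfunctory"    # mainly an acknowledgment


class Relation(Enum):
    EVOLUTIONARY = "evolutionary"        # the paper builds on the reference
    JUXTAPOSITIONAL = "juxtapositional"  # the paper is an alternative to it


class Stance(Enum):
    CONFIRMATIVE = "confirmative"  # the reference is claimed to be correct
    NEGATIONAL = "negational"      # its correctness is disputed


@dataclass(frozen=True)
class CitationUse:
    """One reference in a paper, tagged along all four questions."""
    cited_work: str
    kind: Kind
    need: Need
    relation: Relation
    stance: Stance


def share_perfunctory(uses):
    """Fraction of tagged references that are merely perfunctory."""
    if not uses:
        return 0.0
    return sum(u.need is Need.PERFUNCTORY for u in uses) / len(uses)
```

A student (or a librarian) could tag each entry in a bibliography and then see at a glance how many references are doing real work in the paper.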
First, these questions have a real importance when we start thinking about the ways in which citation metrics don’t necessarily get at the importance of scientific work.
And second, I think there is some potential in these ideas to help students when they write term papers and cite their sources.
Traditionally, I teach students that they need to cite their sources in order to acknowledge the scholarly work of others. I talk about the implications of not citing something (it was your own idea, it’s common knowledge, it’s plagiarism), but I don’t really go into more detail about why you might cite something.
By breaking down the purpose of a citation explicitly, as these questions do, perhaps we can better prepare students to effectively use the research articles they find in their term papers and projects.
Now, I’m no expert on teaching writing. But the best term papers do an effective job of integrating the various sources they find into a cohesive narrative. Perhaps we could be more explicit about how this is done, and perhaps these ideas can help the students envision what their citations and their term paper might look like. Perhaps.
For many researchers, the citation is a make-or-break concept. Most ranking algorithms use citations to determine a journal’s influence or impact. Publication in “high impact” journals is often the key to tenure and promotion, and the number of times an article has been cited is often widely touted in tenure and promotion packets.
With careers, funding and much else riding on citation, it would be useful for scholars and librarians to know why a particular item gets cited. We’d all like to think that the only reason an article is cited is because its content is relevant (and more relevant than other items) to the study at hand.
Unfortunately, there is some evidence to suggest that other, non-content, factors influence the likelihood of an item being cited.
Big caveat: The quality of these studies is highly variable and their results are sometimes contradictory. Correlation does not equal causation.
Nevertheless, most of the non-content factors influencing citation rate relate to article discoverability. You can’t be cited if you can’t be read, and you can’t be read if you can’t be found. How likely is an article to be found in a database? Was the article discussed in a newspaper or other popular science forum? Does the title clearly explain what the article is about (and make you want to read more)? Is the article already connected to a wide circle of readers via multiple authors or large universities? While there are classic examples of important scientific publications published in obscure journals, those are the exception and not the norm.
So, in no particular order, here are a few things that folks suggest might influence how often your article is cited:
A lot of research has looked into various aspects of article titles on subsequent citations.
Type of title – In an interesting study looking at article titles from PLoS journals, Jamali and Nikzad (2011) wondered if the type of article title affected the citation rate of an article. In general, they found that articles with titles that asked a question were downloaded more but cited less than those with descriptive or declarative titles. Interestingly, Ball (2009) found that the number of such interrogative titles has increased by 50% to 200% in the last 40 years.
Length of title – Jamali and Nikzad (2011) suggest that articles with longer titles are downloaded and cited less, while Moore (2010), in a quick study, found no correlation. However, Habibzadeh and Yadollahie (2010) suggested that longer titles are cited more (especially in high impact factor journals), and Jacques and Sebire (2010) found a positive correlation between article title length and citation rate.
Specific terms – Disciplinary abbreviations (very specific keywords) may lead to more citations (Moore, 2010), whereas articles with specific country names in the title might be cited less (Jacques and Sebire, 2010).
Humorous titles – To my disappointment, a study of articles with amusing titles in prestigious psychology journals by Sagi and Yechiam (2008) found that these articles were less likely to be cited than other articles with unfunny titles. Since funny titles are often less descriptive of the actual research, these articles could be more difficult to find in databases.
Positive Results – There is strong evidence to suggest that positive results are much more likely to be submitted and published than negative results. It seems as though positive results are also more likely to be cited. Banobi et al. (2011) found that rebuttal articles (either technical reports or full length articles) were less likely to be cited than the original articles, i.e. the articles with positive results were more likely to be cited. This correlates well with the results of Leimu and Koricheva (2005), who found that articles that successfully proved their original hypothesis were more likely to be cited than articles that disproved the original hypothesis.
Number of authors – Leimu and Koricheva (2005) found a positive correlation between the number of authors and the number of citations in the ecological literature, while Kulkarni et al. (2007) found that group authorship in medical journals increased citation counts by 11.1. However, a blog post by Moore (2010) suggested that it wasn’t the number of authors that was important, but their reputation. A recent study of the chemical literature that was able to account for article quality (as measured by reviewers’ ratings) found a correlation with author reputation but no correlation to the number of authors (Bornmann et al. 2012).
Industry relationship – Studying medical journals, Kulkarni et al. (2007) found that industry funded research that reported results beneficial to the industry (i.e. a medical device that worked or a drug that didn’t show harmful side effects) was more likely to be cited than non-industry funded, negative research.
Data sharing – Piwowar et al. (2007) found that within a specific scholarly community (cancer microarray clinical trial publications) free availability of research data led to a higher citation rate, independent of journal impact factor.
Open Access – Lots of studies have been done with mixed results. A slightly higher number of studies seem to suggest that open access leads to higher citations (See the excellent review article by Wagner (2010)).
Popular press coverage – It makes intuitive sense that journal articles spotlighted by the popular press might be cited more, but this is difficult to prove. Perhaps the press is merely good at identifying those articles that would be highly cited anyway. Phillips et al. (1991) were able to take advantage of an interesting situation when the New York Times went on strike in 1978 but continued to produce a “paper of record” that was never published. Phillips et al. (1991) found that items written about in the “paper of record” but not published were no more likely to be cited than other articles.
Length of your bibliography – A study by Webster et al. (2009) suggests a correlation between the length of an article’s bibliography and the number of times it is later cited. They suggest an “I’ll cite you since you cited me” mentality, but online commentators suggest that this is merely a specious relationship (See Corbyn, 2010, and comments therein).
So, if you want to publish a paper that gets the highest number of citations, what should you do? Do your study with a large number of prestigious co-authors. Submit your long article containing positive results and a big bibliography to an open access journal. Say something nice about a pharmaceutical company. Share your data and get the New York Times to write about it.
Oh, and it might be useful to have some interesting and solid science in there somewhere.
Really Long Bibliography:
Ball, R. (2009). Scholarly communication in transition: The use of question marks in the titles of scientific articles in medicine, life sciences and physics 1966–2005. Scientometrics, 79(3), 667–679. Retrieved from: http://www.akademiai.com/index/UH466Q5P3722N37L.pdf
Banobi, J. A., Branch, T. A., & Hilborn, R. (2011). Do rebuttals affect future science? Ecosphere, 2(3), art37. doi:10.1890/ES10-00142.1
Bornmann, L., Schier, H., Marx, W., & Daniel, H. D. (2012). What factors determine citation counts of publications in chemistry besides their quality? Journal of Informetrics, 6(1), 11-18. doi:10.1016/j.joi.2011.08.004
Habibzadeh, F., & Yadollahie, M. (2010). Are Shorter Article Titles More Attractive for Citations? Cross-sectional Study of 22 Scientific Journals. Croatian Medical Journal, 51(2), 165-170. doi:10.3325/cmj.2010.51.165
Jacques, T. S., & Sebire, N. J. (2010). The impact of article titles on citation hits: an analysis of general and specialist medical journals. JRSM short reports, 1(1), 2. doi:10.1258/shorts.2009.100020
Jamali, H. R., & Nikzad, M. (2011). Article title type and its relation with the number of downloads and citations. Scientometrics, (49), 653-661. doi:10.1007/s11192-011-0412-z
Kulkarni, A. V., Busse, J. W., & Shams, I. (2007). Characteristics associated with citation rate of the medical literature. PLoS ONE, 2(5), e403. doi:10.1371/journal.pone.0000403
Leimu, R., & Koricheva, J. (2005). What determines the citation frequency of ecological papers? Trends in ecology & evolution, 20(1), 28-32. doi:10.1016/j.tree.2004.10.010
Phillips, D. P., Kanter, E. J., Bednarczyk, B., & Tastad, P. L. (1991). Importance of the Lay Press in the Transmission of Medical Knowledge to the Scientific Community. The New England Journal of Medicine, 325(16), 1180-1183. Available via: http://www.ncbi.nlm.nih.gov/pubmed/1891034
Piwowar, H. A., Day, R. S., & Fridsma, D. B. (2007). Sharing detailed research data is associated with increased citation rate. PLoS ONE, 2(3), e308. doi:10.1371/journal.pone.0000308
Sagi, I., & Yechiam, E. (2008). Amusing titles in scientific journals and article citation. Journal of Information Science, 34(5), 680-687. doi:10.1177/0165551507086261
Webster, G. D., Jonason, P. K., & Schember, T. O. (2009). Hot Topics and Popular Papers in Evolutionary Psychology: Analyses of Title Words and Citation Counts in Evolution and Human Behavior, 1979–2008. Evolutionary Psychology, 7(3), 348-362. Retrieved from http://www.epjournal.net/filestore/ep07348362.pdf
And can librarians, scholars and publishers agree about it?
By value, I don’t just mean the impact factor and other metrics, or even the general prestige of a journal as measured by gut feeling. I mean the value of a publication (a single article, an entire journal, every journal a publisher publishes) in fiscal terms. Dollars and cents. Moolah. Benjamins.
Perhaps we can say that a more highly ranked journal (impact factor, eigenfactor, etc.) might be worth more in dollars. Some publishers would certainly like this to be true. Higher quality equals higher cost. After all, you want the $100 bottle of wine to taste better than the $5 bottle of wine. But this doesn’t seem to be how it works in reality, at least in some disciplines. Bergstrom and Bergstrom (2006) did a study of ecology journals, and examined cost versus impact factor. In general, they found that there was no correlation between the two. [Yes, yes, I know. Impact factor is just one poor way of measuring value.]
Perhaps value has something to do with reliability? One way to look at reliability is to examine retractions. Fang and Casadevall (2011) have shown that typically, high impact journals have more per-article retractions because they are at the cutting edge of research (also see this Retraction Watch post). Because they want to get things out fast, mistakes are sometimes made. Nature, Cell and Science all have a relatively high Retraction Index (see Fang and Casadevall, 2011). But does this make these journals any less valuable overall?
As the number of open access options increases, publishers (especially for profit publishers and academic societies that act like for profit publishers) make the argument that their editing, copy editing and page preparation services add significant value to their publications. How much should these services add to the total value of the publication? [Hint: not as much as the publisher would like.]
Then again, perhaps the value of a publication has less to do with the content and more to do with the audience. For example, the New England Journal of Medicine is probably more useful and valuable to a medical school than it is to my small liberal arts college, and more valuable to me than to a similarly sized school without a biology program.
When publishers assign a value to their publications, they typically take into account the size of the school or the types of degrees they award. Larger schools often pay more money for access to the same resources. Unfortunately, this doesn’t always take into account the details of who those folks are. Two schools with 5000 students will pay the same amount for a resource in chemistry, say, even though one school has 100 chemistry majors each year and the other has 10.
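To put rough numbers on that, here is a back-of-the-envelope sketch. The annual price is an entirely made-up figure used only for illustration; the enrollment numbers are the ones from the example above.

```python
def cost_per_major(annual_price, majors_per_year):
    """Price of a resource divided by the students most likely to use it."""
    return annual_price / majors_per_year


# Hypothetical subscription price; both 5000-student schools pay the same.
annual_price = 5000.0

print(cost_per_major(annual_price, 100))  # 50.0 per chemistry major
print(cost_per_major(annual_price, 10))   # 500.0 per chemistry major
```

Whatever the real price is, the school with 10 majors pays ten times as much per likely user as the school with 100.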
As journal costs keep rising, institutions must continuously evaluate value – does this journal provide enough value to my institution to justify the costs?
No matter how good the $100 bottle of wine is, I’ll need to keep drinking the $5 stuff. Or maybe the $10 stuff for Christmas.
Almost all academic databases these days will allow you to export a properly formatted citation (APA, MLA, etc.) for a book or journal article within that database. This is a wonderful feature for undergraduates that saves a lot of really annoying formatting. It is especially helpful for eliminating the annoyance of re-arranging author first names and last names and putting in appropriate punctuation.
Unfortunately, it doesn’t always come out perfectly.
For example, the citation database Scopus regularly produces a citation indicating that an article is “Available from http://www.scopus.com” which is completely incorrect. Only the citation is available from Scopus; the full text of the item is found elsewhere.
So in my library instruction sessions I regularly encourage students to double check the results of these citation generators (in databases, in web services like EasyBib and in programs like Mendeley and EndNote).
Because this is what happens when you don’t look things over:
So take a minute or two to look over your bibliography – you don’t want to look silly.
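If you keep your bibliography as plain text, part of that double check can be automated. This is a minimal sketch, assuming one citation per string; the function name and the list of hosts are my own, and the list holds only the Scopus address mentioned above, so you would need to extend it for other databases.

```python
import re

# Hosts that hold citation records rather than full text.
# Only scopus.com comes from the example above; add others as needed.
DATABASE_HOSTS = ("scopus.com",)

# Loose pattern for URLs embedded in a citation string.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)]+", re.IGNORECASE)


def suspicious_urls(citation):
    """Return any URLs in a citation that point at a citation database."""
    hits = []
    for url in URL_PATTERN.findall(citation):
        url = url.rstrip(".,;")  # drop trailing sentence punctuation
        if any(host in url.lower() for host in DATABASE_HOSTS):
            hits.append(url)
    return hits
```

Run it over each entry before you hand in a paper. Anything it returns deserves a second look, though it is no substitute for actually reading the bibliography.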
Philosophical Transactions is typically regarded as the first scientific journal and has been in continuous publication since it started in 1665. (A French journal, the Journal des sçavans, started publication three months prior to the Philosophical Transactions, but since it appealed to a wider audience and included a larger percentage of book reviews, many do not consider it the first real scientific journal.)
We’ve had access to this archive for a while now via JSTOR, and I love having the ability to see the very beginnings of the scientific journal article.
What intrigued me when I started digging into their now-open archive was the delightful juxtaposition of the 1665 publication date and the modern DOI.
Since these historical documents are available online, they are digital objects, and assigning DOIs makes a lot of sense. It also makes each individual article much easier to find.
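Part of what makes a DOI so findable is that any DOI can be turned into a link by putting it after the https://doi.org/ resolver address. A small sketch of that step follows; the function name is my own, and the sample DOI in the usage note comes from the bibliography of the citation post above, not from the Philosophical Transactions archive.

```python
from urllib.parse import quote


def doi_to_url(doi):
    """Turn a bare DOI (with or without a 'doi:' prefix) into a resolver link."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Percent-encode unusual characters but keep the slash between
    # the DOI prefix and suffix.
    return "https://doi.org/" + quote(doi, safe="/")
```

For example, `doi_to_url("doi:10.1371/journal.pone.0000308")` gives `https://doi.org/10.1371/journal.pone.0000308`.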
Tracking down a citation you already have should be a relatively simple task.
A colleague of mine asked for help the other day tracking down a citation. A variety of circumstances made it anything but straightforward and served to remind me about some of the confusing parts of the scholarly communication system (and that I really love my job).
A student had approached the reference desk looking for a citation to this article:
Tan, D. X.; Chen, L. D.; Poeggeler, B.; Manchester, L. C.; Reiter, R. J. (1993). “Melatonin: a potent, endogenous hydroxyl radical scavenger”. Endocrine J 1: 57–60.
The student had found the citation via the Wikipedia entry for Melatonin. My colleague started out with the usual process – look up the journal, find the right volume and go from there. Except when you look up Endocrine Journal, you find that the volume number doesn’t match the year, nor are there any articles with a similar title in the publication. Author searches in the same journal also yield nothing.
Since the citation came from Wikipedia, it seemed probable that there was an error. So she did a search on Google and Google Scholar to try to find a correct citation. Neither search turned up the article, but Google Scholar indicated that the article had been cited over 1000 times! The student found another article by some of the same authors on the topic and was content, but my colleague still wanted the answer. With other students waiting for reference help, she sent the question along to me.
I was checking my email after my kids went to bed and thought I’d poke around a little to see what I could find. I re-did the searches my colleague did so that I understood the problem. Theoretically, the article has to exist, since it has been cited so many times. So why couldn’t we find it? I tried Google Scholar, PubMed, Scopus and found nothing (we don’t have Web of Science here). I searched for additional publications by the same authors but I still didn’t find anything close to this one.
So I started looking for similarly named publications. The journal Endocrine Journal is published by the Japan Endocrine Society and the years don’t match up, so perhaps the abbreviation refers to something different? I located a journal called simply Endocrine (try finding that one in a Google search!) published by Springer. This started to look promising because the first volume of Endocrine was published in 1993, just what we want. But this volume isn’t available on the publisher’s website, so I couldn’t confirm my suspicions.
If Endocrine is the journal we want, why can’t I find it indexed in a database? I checked indexing information. PubMed only started indexing it in 1997. Scopus started indexing it in 1993, but only with the fifth issue, and we need issue 1. And Google Scholar won’t have it (other than the citation) because it isn’t on the Springer website or in PubMed.
I started to think that the citation really referred to an article in Endocrine, not Endocrine Journal. But Scopus has over 1000 folks citing Endocrine Journal. It seemed unlikely that so many people would make the same error.
I stayed up past my bedtime having fun tracking this down. I emailed my thoughts to my colleague and I wondered if perhaps Web of Science indexed this item from issue 1.
The next day, we asked a colleague at another institution to do a quick search for us in Web of Science. No hits on the article title. Perhaps Web of Science didn’t index it from issue 1 either, or perhaps I’m just wrong (it’s been known to happen).
From 1993 to 1994 there were two Endocrine Journals!
For a brief period of time (<2 years), Endocrine called itself Endocrine Journal. Perhaps they discovered the Japan Endocrine Society’s Endocrine Journal as the internet was making international collaboration easier.
Since I found the original ISSN (0969-711X), I submitted an ILL request to confirm my thoughts. Sure enough, here’s the article masthead, but with Macmillan Press as the publisher, not Springer. The early issues available on the Springer website have Stockton Press as the publisher in 1995. It seems to have changed publisher several times.
What’s the moral of this story?
Journals really need to select unique names. (Do new journals think about Google-ability of their names?)
I picked the right profession because I had fun chasing this down.
Given my difficulty tracking this down, I have to ask: How many of the 1000 folks that cited this article actually tracked it down? I bet there are some who never laid eyes on it.
More importantly, it can be very easy for valuable information to disappear entirely. We live in an era of information overload. Yes, people have been saying the same thing since the invention of the printing press, but these days it isn’t a matter of finding any information, it is a matter of sorting to find the right information. And even today, an item published just before the explosion of online scholarly information could almost disappear. Although it may seem like it, not everything is available in Google.