Alternative metrics at ScienceOnline2011 and beyond

After a very interesting session on alternative metrics at ScienceOnline2011, I have been trying to figure out if I should write anything about it here.  My thoughts are rather scattered, incomplete and probably lacking in originality, but I decided to put something up anyway to help me sort through things.

100 Feet of tape measure
CC image courtesy of Flickr user karindalziel

The first thing I’m thinking about related to alternative metrics is “What problems are we trying to solve?”  I am very familiar with the criticisms of the impact factor, but I’m interested in returning to the basic questions.

What are researchers doing that some kind of metrics could help them with?

  • Find a job
  • Get tenure
  • Get promoted
  • Have their research read by a lot of folks
  • Publish in places that make them look smart
  • Quickly evaluate how smart other people are
  • Convince people to give them money for their research
  • Do more research

At the moment, the impact factor can affect all of these things, although it wasn’t developed to do that (see this article for a history of its development and some defense of the number).

What kind of quantitative information do we need to help researchers accomplish the things they want to do?

Various metrics have been proposed to help researchers:

  • rate journals (average quality)
  • rate individual articles
  • rate authors

I think the fundamental question comes back to how these metrics help researchers make decisions about their research and publication.

The second big question I have is whether the problem lies in the limitations of the impact factor or in an academic culture that misuses existing metrics.  New metrics being proposed are not perfect, and I would argue that any quantitative measure of scholarship will have flaws (though perhaps not as many as sole reliance on the impact factor).

As more scholars do work outside of traditional peer-reviewed journals, I think tenure and promotion committees will feel more pressure to expand beyond simply looking at the impact factor.  We have some interesting examples of scholarship beyond the peer-reviewed literature, and several new journals making the case for assessing individual articles rather than the journal as a whole.  These developments will also put

But those of us who are very connected to the online scholarly community of blogs, Twitter feeds and social networks need to remember that most faculty are still not aware of or interested in what this community has to offer.  As a result, we can develop all of the alternative metrics we like, but until there is greater acceptance of these “alternative” forms of scholarship, I doubt that any alternative metric will be able to gain a prominent foothold.

At the ScienceOnline2011 conference, Jason Hoyt, Chief Scientist at Mendeley, made the argument that we should put our energies into whatever alternative metric will bring down the impact factor.  I wonder if that is putting the cart before the horse.  Perhaps we need to get faculty (even those who have no interest in the blogosphere) to acknowledge the limitations of the impact factor first.

Of course, it is much easier to develop a quantitative measure of scientific impact than to change the culture of scholarly disciplines.

Discovering the scientific conversation

I often like to think of science as a conversation.  It is a conversation that other folks need to be able to hear, so it needs to be discoverable.

We’ve come a long way since da Vinci wrote his notes in code.  Research results are regularly published as journal articles, and references and citations attempt to credit previous work.  The conversation of science could (at one point) be seen as the steady progression of peer-reviewed journal articles and technical comments, with some conference proceedings thrown in for good measure.

Conveniently, this was fairly easy (if expensive and time consuming) to access and preserve.  Publishers originally worked with print index makers and eventually digital database folks.  Conference abstracts were often preserved, even if the actual presentation wasn’t.  And each discipline typically had one primary source to find this information: GeoRef for the geologists or Chemical Abstracts for the chemists.

Things are changing.  And the ScienceOnline2011 conference provided a lot of examples of this new conversation in action.

The peer-reviewed journal article is no longer the only place where this conversation is taking place.  Scientists are commenting on and rating papers on publisher websites.  Scholars are making comments via Twitter and FriendFeed.  Bloggers are providing detailed (and informed) commentary on published papers, making suggestions for further research and trying to re-create published experiments.  Scientists are citing and archiving data that is stored all over the place.

So, how can researchers and students follow this conversation?

Just a few of the problems:

  • Comments, ratings and supplemental material are usually not indexed in the traditional research databases we point students to.
  • Google is great at uncovering conference presentations posted on SlideShare or Google Docs, but not so great at making the connection between the presentation and the conference abstract.
  • If researchers access a journal article via an aggregator (not through the publisher’s website), they probably won’t have access to the supplemental material.
  • Will the non-article material be preserved?
  • Will a published journal article link back to the Open Notebook that was used during the course of the experiment?  Will that notebook be preserved?
  • Most research databases and publisher websites don’t provide links to blog posts commenting about the article.

Is this a problem for researchers, or just for librarians and science historians?

I spend a lot of time in classrooms teaching students how to track citations forward and backward in time using tools such as Scopus and Google Scholar.  But if Scopus is stripping out citations to archived data, and if there is no connection to the blog post that sparked a whole new research direction, they aren’t seeing the whole story.

Is there a need for a more complicated discovery system that searches everything and makes the appropriate connections?  Is the semantic web a solution to these problems?

While I don’t know the answer, I will continue to look for ways to expose undergraduates to this exciting conversation of science.

Citations, Data and ScienceOnline2011

Despite the vast array of challenges and problems with creating and tracking citations to journal articles, the scholarly publishing realm has developed (over the past 350 years) standards to deal with these things.  New concepts such as DOIs, an increase in the number of providers who track citations (Web of Knowledge, Scopus, Google Scholar), and tools to easily format citations have made all of this a bit easier.

Scholars are now facing new challenges in creating and tracking citations.  The types of material being cited are probably more varied than ever.  Scholars are citing archived data sets, websites that may not exist in a few months (or years), multimedia, and perhaps even blog posts and tweets in addition to the traditional journal articles, books and technical reports.

At the ScienceOnline2011 conference, several speakers led discussions that focused on the challenges and possible solutions to some of these new issues.

Jason Hoyt, Chief Scientist at Mendeley, discussed some of their new initiatives to track citations based on user libraries.  Since I don’t want to spread misinformation about the nature of these initiatives and I’m not entirely clear about them, you’ll just have to stay tuned for more information.

Martin Fenner discussed his work with the ORCID project, which will be a publisher-independent tool to help with author disambiguation.

Overall, there was an interesting discussion about the nature of citation itself.  The way the metrics count it, a citation is a citation.  You get ‘credit’ for a citation even if the folks who cite you say that you are completely wrong.  Is there a way to use the semantic web to indicate how a citation is being used?  For example, Scopus indicates that Andrew Wakefield’s retracted paper about autism and vaccines has been cited 714 times since its publication, including almost 65 citations since the paper was retracted at the beginning of 2010.  Could there be a way to easily say how many of these citations say that Wakefield was wrong?
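
One way to picture this: if each citation carried a label describing how the cited work is being used, counting the disagreements would be simple.  The Python sketch below is only an illustration under that assumption.  The `Citation` record, the intent labels ("supports", "disputes", "mentions") and the sample data are all invented for this example; they are not taken from Scopus or from any existing citation standard.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    citing_work: str  # identifier of the paper doing the citing
    cited_work: str   # identifier of the paper being cited
    intent: str       # hypothetical label: "supports", "disputes" or "mentions"


def count_by_intent(citations, cited_work):
    """Tally citations to one work, grouped by how the work is being cited."""
    return Counter(c.intent for c in citations if c.cited_work == cited_work)


# Invented sample data, for illustration only.
citations = [
    Citation("paper-A", "retracted-paper", "disputes"),
    Citation("paper-B", "retracted-paper", "disputes"),
    Citation("paper-C", "retracted-paper", "supports"),
    Citation("paper-D", "retracted-paper", "mentions"),
    Citation("paper-E", "other-paper", "supports"),
]

tally = count_by_intent(citations, "retracted-paper")
print(tally["disputes"], "of", sum(tally.values()), "citations dispute the work")
```

The hard part, of course, is not the counting but getting someone (authors, publishers or indexers) to supply the labels in the first place.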

With all of these interesting advances, there are a lot of challenges.  Can the same set of metadata used to describe genetic data be used to describe high energy physics data?  Are we moving toward a future where scholarly metadata is exponentially more fuzzy than it is now?  Will standard procedures develop, and is there an incentive for standard procedures to develop?  Who will develop them?

I don’t know enough to even hazard a guess at the answer to these questions.  For at least a little while, before scientists, publishers and librarians work out the details, undergraduate students are going to be even more frustrated at citing material for their projects, especially due to varying faculty expectations.  The “How do you cite this?” questions at the reference desk will get much more complicated before they get any easier.