The really hard part about research isn’t the databases

This isn’t news to librarians, but I find that students (and occasionally faculty) get caught up in equating research with the research databases.

Take two recent examples.

This week I met with some math students who were looking for scholarly and non-scholarly articles and information to help them solve a particular mathematics problem.  They had been using MathSciNet and Google to help them find appropriate information, with mixed results.  I was able to provide them with a few technical tips on improving their searches (+ and – operators in Google, subject classifications in MathSciNet, etc.), but that wasn’t what they really needed.  What was most useful to them was an opportunity to think through all the different aspects of their problem.  In this case, I merely acted as a facilitator – my three semesters of calculus did not give me the required knowledge to help them come up with synonyms or alternative search terms.  But I could ask questions: What are the various aspects of your problem?  What are some alternative terms that would define these aspects?  What are the various approaches you’ve tried to solve the problem?

When their faculty adviser asked me to meet with them, he thought our discussion might focus more on the technical aspects of the databases.  He sat in on the session, and was vital in helping the students brainstorm their additional search terms.  I think he learned a bit about guiding students through the research process (plus a couple of tips about searching Google), and our collaboration really benefited the students.

In another case, I had a student who needed to write a paper describing some aspect of the effect of the hypothalamus on the pituitary gland.  For her, the massive quantity of information on this topic was overwhelming, and I was able to provide her with some guidance on how to focus her topic.  We looked at a Wikipedia article that listed some specific hormones and their specific effects.  We looked at some search results lists and picked out a few topic ideas.  We talked about how she could use the background information she was collecting from textbooks to select a focus.  And only then did we talk about taking that focus back into the databases to find relevant information for her project.

Overall, she came away with a much better strategy for completing her project.  It wasn’t about “click here, then click there”.

At the same time, most of our faculty-initiated requests for library instruction sessions start out with a database name.

I’m not saying that this isn’t important – it is – and there are some tricky technical issues involved in navigating our OpenURL system too.  I would make the argument that for many students, it is easier for them to learn a search interface on their own than it is to develop an overall strategy for completing the information gathering portion of their projects.

Teaching the Mechanics of Citation Styles

There are many, many parts of my job that I love.  Teaching students the mechanics of a citation style is not one of them.  I don’t mind teaching about many aspects of citations, including effective use of in-text citation, or even technology sessions on using tools like Mendeley.  But teaching the basic, “this is what an NLM article citation style looks like” is one of my least favorite parts of my job.

This is partly because I can completely sympathize with students when they complain about the preponderance of citation styles – it doesn’t make much practical sense.

It’s also probably because my style of teaching about citation styles isn’t very exciting.

My basic plan starts with a PowerPoint presentation in which I discuss the following:

  • Why we use specific styles – it isn’t just to annoy undergraduates, it is to facilitate clear communication among scholars.
  • Specific rules for articles, books or websites in the selected citation style – especially the bits that tend to mess with students
  • Resources to make all of this easier – “bibliography” output buttons in databases, reference managers like Mendeley and Zotero

This is normally followed by an in-class practice session where they are given a sample article, website, book, etc., and asked to create a citation.
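For the practice session, I sometimes find it helps to show students that an article citation is just a handful of metadata fields assembled in a fixed order.  Here is a minimal sketch of that idea in code – the `nlm_citation` helper is hypothetical, and real NLM style has many more rules (author limits, abbreviations, edge cases) than this toy covers:

```python
# Toy illustration: an NLM-style journal article citation is just
# metadata fields in a fixed order:
#   Authors. Title. Journal abbreviation. Year;Volume(Issue):Pages.

def nlm_citation(authors, title, journal, year, volume, issue, pages):
    """Assemble a basic NLM-style citation string from its parts."""
    author_str = ", ".join(authors)
    return f"{author_str}. {title}. {journal}. {year};{volume}({issue}):{pages}."

# Example, using the Robinson and Goodman article mentioned elsewhere
# in this post:
print(nlm_citation(
    authors=["Robinson KA", "Goodman SN"],
    title=("A systematic examination of the citation of prior research "
           "in reports of randomized, controlled trials"),
    journal="Ann Intern Med",
    year=2011, volume=154, issue=1, pages="50-5",
))
```

Seeing the citation decomposed this way tends to make the “which piece goes where” questions easier to answer than staring at a finished example does.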

I follow up with a homework assignment via the LMS in which I ask them to create a properly formatted citation for a resource they will use in an upcoming assignment.  I provide feedback and they should have at least one good citation for their project.

I believe that this information is useful to students, and the faculty who ask for such a session believe it is worth giving up class time, but it isn’t the most interesting.

So I put the question to the universe – what are some teaching strategies that can make the boring fundamentals of citation styles more engaging (to both me and my students)?

How will undergraduates navigate a post peer-review scholarly landscape?

For all of its flaws (and there are many), peer review currently gives faculty one crucial credibility test they can tell students to apply: find articles that have been peer reviewed.

This is pretty easy for students to do, given a bit of instruction.  Many databases will indicate if something is peer reviewed (although they don’t always get it right), and most primary research articles are peer reviewed – you just need to be able to recognize one.

But peer review is changing.  It isn’t going away anytime soon, but through a variety of trials and experiments and advocacy, it is changing.  Cameron Neylon has argued in favor of doing away with the current peer review system altogether.

This may require a more informed readership, readers who understand what various metrics mean, and a greater reliance on understanding the general reputation of a journal (however this is measured).

All of this creates problems for your typical undergraduate.

When they are just starting out, students don’t have the required scientific knowledge of concepts and methods to adequately evaluate the quality of a journal article on their own – that’s what they are in college to learn.

So when their professors ask them to write a paper or complete a project using high quality primary research articles, how will students filter the signal from the noise if the simple “peer-reviewed” credibility test no longer works?

I can think of a few things that may help them out, although it won’t be quite as simple as it’s made to seem now. This may also require a bit more instruction to bring students up to speed on these concepts.

  • Use the databases as a filtering tool.  Databases like Scopus, Web of Science, and SciFinder select which journals to include.  Theoretically, they wouldn’t include content from the really poor quality journals.  Of course, this doesn’t stop bad papers from appearing in good journals.  Faculty could limit students to articles found in a particular database.
  • Increased prevalence of article-level metrics on publisher websites.  Some journals already make this information prominent (like PLoS ONE) and more are doing so.  This would require more education (for both faculty and students) about what these metrics mean (and don’t mean).  Faculty could ask students to only use articles that meet some minimum threshold.
  • An expansion of rating networks like Faculty of 1000.  We don’t have access to this resource at my institution, but we may see undergraduates relying more on this (and similar networks) to help them get a sense of whether an article is “worthy” or not.  Students could be limited to using articles that had a minimum rating.

All of this is limiting.  Hopefully, by the time students reach their senior year, faculty could stop making arbitrary requirements and simply ask for high quality material, right?

What are some other techniques for evaluating scholarship that undergraduates may have to master as peer review changes?

Miscellaneous things I’ve learned lately

  • Genetics researchers at undergraduate institutions can have a hard time finding projects that are interesting enough to get funding, but not so interesting that a larger lab swoops in to do the work.
  • Our students got more books through ILL last year than they checked out from our own collection. (Special thanks to the IDS project for getting these books to students in 5 days on average)
  • Two of the undergraduates I’ve taught in information literacy sessions are interested in science librarianship.
  • This quote: “Facts are not science – as the dictionary is not literature.” -Martin Fisher, Fischerisms
  • My two year old will eat cucumbers, just not the skin
  • My organization is switching from Oracle calendar to Google calendar (Yay!) but we will have to run two systems simultaneously for a little while.

Talking with faculty

Over at Confessions of a Science Librarian, John Dupuis has set out a delightful “Stealth Librarianship Manifesto” that echoes many of the comments I have made about how librarians need to get out of the library (physically and virtually) and interact with our users in their spaces, including conferences and publications.

At my library, we are currently working through a big project to help us do that.  We have a relatively new “scholarly communications” team and our goal over the next 6 months or so is to talk to faculty members across campus to learn about what they are doing.  I’ve mentioned this project before, and noted that there are some resources available to help folks understand various disciplines.  It is vitally important for us to understand what is going on on our campus.  Our faculty are amazing, but they have different pressures than the folks at research universities.

So every week I meet with two or three faculty from the disciplines I serve and chat with them about their research and publication efforts:

  • What are they working on right now?
  • Are they incorporating undergraduates into their research?  Have they co-authored publications with these students? (Quite often)
  • How do they select which journal to publish in?  Do they pay attention to impact factors or not? (Although my faculty pay attention to general reputation, they rarely mention the metrics)
  • Have they posted a copy of one of the publications online?  Do they know if they kept the right to do so? (They have no idea what rights they have to their papers)
  • What kinds of data are they producing?  What do they do with it? (I’ve already learned a lot about the distinctions between the theorists and the applied folks in math and computer science)

The conversations I have had so far have been incredibly interesting and educational.  I serve six departments (Biology, Chemistry, Computer Science, Geological Sciences, Mathematics, and Physics & Astronomy).  My educational background is in Geology, so I don’t have a native understanding of what the mathematicians or physicists are doing, for example.  These conversations have given me remarkable glimpses into our faculty’s values, assumptions and goals.

One of the important distinctions I’ve noticed is the disconnect between the highly active online science community (bloggers and tweeters, etc.) and your average, run-of-the-mill faculty member.  Scholarly communication may be changing, but many of the faculty I’ve talked with (including those who are still publishing actively) are barely aware of some of the fascinating changes and experiments taking place.

So far, I’ve only had a chance to talk with 13% of the faculty I work with, and an upcoming maternity leave will delay my conversations with some, but it has been an incredible experience so far, and I look forward to the rest.

It isn’t just students: Medical researchers aren’t citing previous work either

One of the things that faculty often complain about is that students don’t adequately track down and cite enough relevant material for their term papers and projects.  This problem isn’t confined to undergraduates.  A study in the January 4, 2011 issue of the Annals of Internal Medicine by Karen Robinson and Steven Goodman finds that medical researchers aren’t doing a very good job of citing previous research either.

Specifically, Robinson and Goodman looked at reports of randomized, controlled trials to determine if the authors cited previous, related trials.  Citing previous trials is an important part of putting the results of the current trial in context, and in the case of medicine, may help save lives.

In order to do this study, the authors used meta-analyses to locate groups of related papers.  They reasoned that if the studies were similar enough to group mathematically, they were similar enough to cite each other.  They allowed for a 1-year gap between an original publication and a citation.

Overall, they found that only 25% of relevant papers were actually cited.

Why might a citation not be included?  I can think of a few reasons.

  • The authors couldn’t find the previous study
  • The authors found the previous study but didn’t think it was relevant enough to cite
  • The authors found the study and purposefully excluded it for some nefarious purpose

Robinson and Goodman seem to favor the first explanation most of all:

The obvious remedy – requiring a systematic review of relevant literature [before an RCT is funded] – is hampered by a lack of necessary skills and resources.

This obviously speaks to the importance of information literacy skills in both undergraduates and medical school students.  One of the most troubling things about the article’s results was Robinson and Goodman’s determination that a very simple PubMed search could locate most of the articles on one of the topics assessed.
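Those “very simple” searches are also simple to automate.  As a sketch, here is how a PubMed query could be assembled against NCBI’s E-utilities `esearch` endpoint – the query terms are made-up placeholders (Robinson and Goodman’s actual strategies are in their appendix), and this only builds the request URL rather than sending it:

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint for PubMed
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_search_url(term, retmax=20):
    """Return an esearch URL for a PubMed query (URL construction only)."""
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return EUTILS + "?" + urlencode(params)

# Illustrative query only; [pt] is PubMed's publication-type field tag
url = pubmed_search_url(
    'aspirin AND "myocardial infarction" AND randomized controlled trial[pt]'
)
print(url)
```

The point isn’t the code itself, but that locating prior trials on many of these topics requires nothing more exotic than a few well-chosen terms sent to PubMed.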

An interesting recommendation that Robinson and Goodman repeat throughout the article is to suggest that a description of the search strategy for prior results be included in the final published article (and they follow their own advice in an appendix to the article).

Robinson and Goodman's search strategy to find the meta-analyses used to locate the randomized control trials

Of course, it is hard to believe that this problem is limited to just the authors of randomized control trials in biomedicine.  It wouldn’t take much to convince me that this problem exists throughout scholarly work, restricting the speed at which new discoveries are made.  I would bet that the problem can get particularly difficult in interdisciplinary areas.

We need to start with our undergraduates and convince them that it isn’t enough just to find the minimum number of required sources, but to really get at the heart of previous work on a topic.   This leads naturally into the topic of getting students to pick manageable project topics.  Of course, undergraduates like clear guidelines (and for the most part this is good teaching strategy), but upper level undergraduates should be able to handle the requirement that they find most of the relevant literature on a topic.

Robinson KA & Goodman SN (2011). A systematic examination of the citation of prior research in reports of randomized, controlled trials. Annals of Internal Medicine, 154(1), 50-5. PMID: 21200038

Alternative metrics at ScienceOnline2011 and beyond

After a very interesting session on alternative metrics at ScienceOnline2011, I have been trying to figure out if I should write anything about it here.  My thoughts are rather scattered, incomplete and probably lacking in originality, but I decided to put something up anyway to help me sort through things.


The first thing I’m thinking about related to alternative metrics is “What problems are we trying to solve?”  I am very familiar with the criticisms of the impact factor, but I’m interested in returning to the basic questions.

What are researchers doing that some kind of metrics could help them with?

  • Find a job
  • Get tenure
  • Get promoted
  • Have their research read by a lot of folks
  • Publish in places that make them look smart
  • Quickly evaluate how smart other people are
  • Convince people to give them money for their research
  • Do more research

At the moment, the impact factor can affect all of these things, although it wasn’t developed to do that (see this article for a history of its development and some defense of the number).

What kind of quantitative information do we need to help researchers accomplish the things they want to do?

Various metrics have been proposed to help researchers:

  • rate journals (average quality)
  • rate individual articles
  • rate authors

I think the fundamental question comes back to how these metrics help researchers make decisions about their research and publication.

The second big question I have is whether the problem is really the limitations of the impact factor, or the academic culture that misuses existing metrics.  New metrics being proposed are not perfect, and I would argue that any quantitative measure of scholarship will have flaws (though perhaps not as many as a singular reliance on the IF).

As more scholars do work outside of traditional peer review journals, I think tenure and promotion committees will feel more pressure to expand beyond simply looking at the impact factor.  We have some interesting examples of scholarship beyond the peer reviewed literature, and several new journals making the case for assessing individual articles rather than the journal as a whole.  These developments will also put pressure on the status quo.

But those of us who are very connected to the online scholarly community of blogs, twitter feeds and social networks need to remember that most faculty are still not aware of or interested in what this community has to offer.  As a result, we can develop all of the alternative metrics we like, but until there is greater acceptance of these “alternative” forms of scholarship, I doubt that any alternative metric will be able to gain a prominent foothold.

At the ScienceOnline2011 conference, Jason Hoyt, Chief Scientist at Mendeley, made the argument that we should put our energies into whatever alternative metric will bring down the impact factor.  I wonder if that is putting the cart before the horse.  Perhaps we need to get faculty (even those who have no interest in the blogosphere) to acknowledge the limitations of the impact factor first.

Of course, it is much easier to develop a quantitative measure of scientific impact than to change the culture of scholarly disciplines.

Discovering the scientific conversation

I often like to think of science as a conversation.  It is a conversation that other folks need to be able to hear, so it needs to be discoverable.

We’ve come a long way since da Vinci wrote his notes in code.  Research results are regularly published as journal articles, and references and citations attempt to credit previous work.  The conversation of science could (at one point) be seen as the steady progression of peer-reviewed journal articles and technical comments, with some conference proceedings thrown in for good measure.

Conveniently, this was fairly easy (if expensive and time consuming) to access and preserve.  Publishers originally worked with print index makers and eventually digital database folks.  Conference abstracts were often preserved, even if the actual presentation wasn’t.  And each discipline typically had one primary source to find this information: GeoRef for the geologists or Chemical Abstracts for the chemists.

Things are changing.  And the ScienceOnline2011 conference provided a lot of examples of this new conversation in action.

The peer-reviewed journal article is no longer the only place where this conversation is taking place.  Scientists are commenting on and rating papers on publisher websites.  Scholars are making comments via twitter and friendfeed.  Bloggers are providing detailed (and informed) commentary on published papers, making suggestions for further research and trying to re-create published experiments.  Scientists are citing and archiving data that is stored all over the place.

So, how can researchers and students follow this conversation?

Just a few of the problems:

  • Comments, ratings and supplemental material are usually not indexed in the traditional research databases we point students to.
  • Google is great at uncovering conference presentations posted on SlideShare or Google Docs, but not so great at making the connection between the presentation and the conference abstract.
  • If researchers access a journal article via an aggregator (not through the publisher’s website), they probably won’t have access to the supplemental material.
  • Will the non-article material be preserved?
  • Will a published journal article link back to the Open Notebook that was used during the course of the experiment?  Will that notebook be preserved?
  • Most research databases and publisher websites don’t provide links to blog posts commenting about the article.

Is this a problem for researchers, or just for librarians and science historians?

I spend a lot of time in classrooms teaching students how to track citations forward and backward in time using tools such as Scopus and Google Scholar.  But if Scopus is stripping out citations to archived data, and if there is no connection to the blog post that sparked a whole new research direction, they aren’t seeing the whole story.

Is there a need for a more complicated discovery system that searches everything and makes the appropriate connections?  Is the semantic web a solution to these problems?

While I don’t know the answer, I will continue to look for ways to expose undergraduates to this exciting conversation of science.

Citations, Data and ScienceOnline2011

Despite the vast array of challenges and problems with creating and tracking citations to journal articles, the scholarly publishing realm has developed (over the past 350 years) standards to deal with these things.  New concepts such as DOIs, an increase in the number of providers who track citations (Web of Knowledge, Scopus, Google Scholar), and tools to easily format citations have made all of this a bit easier.

Scholars are now facing new challenges in creating and tracking citations.  The types of material being cited are probably more varied than ever.  Scholars are citing archived data sets, websites that may not exist in a few months (or years), multimedia, and perhaps even blog posts and tweets, in addition to the traditional journal articles, books and technical reports.

At the ScienceOnline2011 conference, several speakers led discussions that focused on the challenges and possible solutions to some of these new issues.

Jason Hoyt, Chief Scientist at Mendeley, discussed some of their new initiatives to track citations based on user libraries.  Since I don’t want to spread misinformation about the nature of these initiatives and I’m not entirely clear about them, you’ll just have to stay tuned for more information.

Martin Fenner discussed his work with project ORCID, which will be a publisher-independent tool to help with author disambiguation.

Overall, there was an interesting discussion about the nature of citation itself.  The way the metrics count it, a citation is a citation.  You get ‘credit’ for a citation even if the folks who cite you say that you are completely wrong.  Is there a way to use the semantic web to indicate how a citation is being used?  For example, Scopus indicates that Andrew Wakefield’s retracted paper about autism and vaccines has been cited 714 times since its publication, including almost 65 citations since the paper was retracted at the beginning of 2010.  Could there be a way to easily say how many of these citations say that Wakefield was wrong?

With all of these interesting advances, there are a lot of challenges.  Can the same set of metadata used to describe genetic data be used to describe high energy physics data?  Are we moving toward a future where scholarly metadata is exponentially more fuzzy than it is now?  Will standard procedures develop – and is there even an incentive for them to develop?  Who will develop them?

I don’t know enough to even hazard a guess at the answers to these questions.  For at least a little while, before scientists, publishers and librarians work out the details, undergraduate students are going to be even more frustrated when citing material for their projects, especially due to varying faculty expectations.  The “How do you cite this?” questions at the reference desk will get much more complicated before they get any easier.

Managing your scholarly identity

When someone Googles your name, do you know what they will find?  When a colleague, student or potential employer goes searching for your scholarly record, will they find accurate information?  When you are looking for a collaborator, a reviewer or a potential hire, what sources do you trust for reliable and up-to-date information about that scholar?

Have you Googled yourself lately?

Unfortunately, faculty websites and college faculty profiles can often be absent, out-of-date, or impossible to find.

Enter the database of scholars.  There are several types out there – those that require registration and constant maintenance by individual scholars, those that automatically pull data from other sources, and those that do a bit of both.

My college has recently acquired access to one of the latter, Scholar Universe.  SUNY has negotiated with Scholar Universe (normally a subscription database) to provide open searching of SUNY scholar profiles.  Check out my SUNY colleagues and especially my SUNY Geneseo colleagues.

Faculty at my institution are now confronted with their public profiles, and a renewed interest in making sure that the information available about them is accurate and complete.  Yesterday, in collaboration with the Office of Sponsored Research, we held a workshop for faculty on editing their Scholar Universe profiles and otherwise managing their scholarly identity.

So, what can an individual researcher do to take control of their scholarly identity?  Here are some of my thoughts:

First, know how others see you.  Google yourself.  Do vanity searches in the databases used in your discipline.  Are you happy with the results?  While a database might not list all of your publications (because of which journals they choose to include), is a list of your publications available online?

Second, if you see wrong information – correct it.  Is your webpage 8 years old?  Make a few updates.  Remove time sensitive stuff like office hours and course schedules so that it doesn’t get so easily out of date.  Add stuff that won’t get out of date like publications, current and prior affiliations, and expertise.  If you see wrong information in a database or on another website, try to correct it by contacting the editor of the site (of course, sometimes this just isn’t possible.)

Third, add to the body of scholarly information available about you.  Create profiles on Nature Network or Mendeley and include your list of publications.  Post a copy of your CV (if you don’t know how to post a document online, try using Google Docs to upload a copy to the web). Assuming you have permission to do so, upload a pre-print of your publications to your website, an institutional repository (ask your librarian) or a disciplinary repository.

Fourth, do what you can to help scholars find all of your publications in one place, especially if you have a common name.  Register with ResearcherID.com to collect your publications under a single identifier, and make sure that you only have one author identity in Scopus.

What else can a researcher do?  How do you manage your scholarly identity?