Post-Henderson Post

So You Think You Can Digitize had a bit of an unplanned hiatus; turns out that maintaining a blog while its authors take something like 15+ trips, attend to work/school responsibilities, and write gobs of papers is a bit trickier than anticipated.  One of those papers was born directly out of this blog: a paper submitted to ZooKeys about our work on the Henderson Field Notes Project.  Ironically, we spent so much time on the Henderson manuscript that we were forced to spend less time uploading Henderson’s notebooks for additional annotation.  That should be remedied soon!  With help from some of our still-anonymous friends, we’ve made a lot of progress, including the annotation and extraction of over a thousand species observations made by Henderson between 1905 and 1909, which we then packaged as a Darwin Core archive for publication along with the paper.  There are still 9 more notebooks to go, though.

We’ve written a lot about field notebooks, especially older notebooks, penned by pioneers, describing the Old West, but we do worry that this reverence for the past has the unintended consequence of diminishing the importance of field and lab data recorded in the exact same way in the present day.  Digitizing old things, showing off the labels written in India ink and cursive: the romance of this might reinforce unfortunate stereotypes about the “dustiness” and antiquity of museum collections.  In the here and now, field notes are still handwritten, and specimens are still collected and catalogued, often at higher rates than ever before.  What do we do with the analog present?

This very issue came up in a post on Andy Farke’s blog, The Open Source Paleontologist.  Dr. Farke is a paleontologist who recently published a paper describing a new species not found in the field, but rather, in Yale’s paleontology collection.  Kudos to him, because he decided that publishing his findings in PLoS ONE was not enough; he also published the lab notes on which his paper was based.  In his words,

“There isn’t really anything earthshaking in there… but in any case now other folks can use them. The sketches of real bone vs. reconstruction should be particularly useful.”

Earthshaking or no, we agree that there’s a huge need to openly archive this sort of documentation, both to support the reproducibility and replicability of scientific results, and to better describe the soon-to-be-historical use of the specimens in question.

Because repositories like Dryad aren’t intended to accept field or lab notes, Dr. Farke turned to Figshare as a place to deposit them.  While Figshare is great for its ease of use and flexibility in licensing options, it’s nevertheless not ideal for text; PDFs are not easily navigable, nor is there any support for transcription and annotation of notes.  This made us wonder: could Wikisource be a place to deposit these notes?

After Gaurav conferred with the Wiki-community, we found that Wikisource could indeed be used to provision more recent field or lab notes in addition to historical documents, provided they meet certain criteria (and the Wikipedians don’t eventually find this to be a violation of existing policies).  Dr. Farke’s notes on the Torosaurus are now available here, and in need of transcription too.

Why put lab notes on Wikisource?  Why not just leave them on Figshare?  Well:

  • Just as we were able to annotate the Henderson field notes for taxa, it’s easy to imagine notes like Dr. Farke’s being annotated with specimen catalog numbers, and even linked to other records describing the specimen in question.
  • Lots of Copies Keeps Stuff Safe — If either of these sites goes belly up, then a copy of the notes would still be available, with the same CC-BY license that Dr. Farke requires.
  • Publishing notebooks to a platform like Wikisource bridges gaps between formal and informal publication, not to mention institutional archiving and self-archiving (which more often than not is simply left-bottom-desk-drawer-archiving).  Although anyone can edit or post to the various Wikimedia sites, there are nevertheless quality and notability requirements that must be met for an article to be considered Wiki-worthy; e.g., Dr. Farke’s notes qualify because he (or his papers) meets that notability requirement, but generally speaking, Wikisource is not a place to “just put notes.”
  • Something worth noting and maybe exploring further: Deposition of notebooks post-publication is not quite the same thing as maintaining an Open Notebook, although clearly related.  Wikisource/Wikimedia aren’t intended to be means of making science transparent, and may balk at that level of repurposing of the platform — only time (and continued experimentation…) will tell.

We are curious to see what others think of the idea of using Wikisource as a repository not just for historical notes, but more recent notes as well (and we also wonder if there are any eager paleo people out there looking to help transcribe Dr. Farke’s notes).

Post-Henderson, and post-Wikisource: we do want to turn our attention back to digitization, natural history collections, and what is going on right now.  A lot has changed since we started this blog back in March 2011; a number of projects that were merely in the planning stages have not only been funded, but have actually started digitizing collections.  Some of that work putting project plans into practice has been happening right on Rob’s doorstep.  A few weeks ago, the Herbarium at the University of Colorado had a visit from the New York Botanical Garden’s traveling digitization set-up gurus, Melissa Tulig and Kim Watson, who were here to set up an imaging station for use in the Tri-Trophic Thematic Collections Network.  We hope to talk about that in our next post.

And hey — we’ll also be at SPNHC again this year, presenting on the Henderson project in the Archives and Special Collections session, so if you’re in New Haven this June, and wanna talk digitization, Wikisource, or anything else, do say hello!

About Andrea

Andrea is a Ph.D. student in Library and Information Science at the University of Illinois at Urbana-Champaign, and is supported by the Center for Informatics Research in Science and Scholarship. Her research interests include text mining; scholarly communication; data curation; biodiversity, phylogenetic and natural history museum informatics; and mining and making available undiscovered public knowledge. She is particularly interested in information extraction from natural history field notes and texts, and improving methods of digitizing and publishing data about the world's 3–4 billion museum specimen records so they can be used to better model evolutionary and ecological processes.
This entry was posted in field notes, Henderson Project, SPNHC.
