<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>So You Think You Can Digitize</title>
	<atom:link href="http://soyouthinkyoucandigitize.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://soyouthinkyoucandigitize.wordpress.com</link>
	<description>a blog about natural history museum digitization projects, data sharing, and informatics</description>
	<lastBuildDate>Tue, 07 May 2013 01:20:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='soyouthinkyoucandigitize.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>So You Think You Can Digitize</title>
		<link>http://soyouthinkyoucandigitize.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://soyouthinkyoucandigitize.wordpress.com/osd.xml" title="So You Think You Can Digitize" />
	<atom:link rel='hub' href='http://soyouthinkyoucandigitize.wordpress.com/?pushpress=hub'/>
		<item>
		<title>This week in digitization: The good, the buggy, and the curious</title>
		<link>http://soyouthinkyoucandigitize.wordpress.com/2013/04/30/this-week-in-digitization-the-good-the-buggy-and-the-curious/</link>
		<comments>http://soyouthinkyoucandigitize.wordpress.com/2013/04/30/this-week-in-digitization-the-good-the-buggy-and-the-curious/#comments</comments>
		<pubDate>Tue, 30 Apr 2013 15:56:38 +0000</pubDate>
		<dc:creator>Andrea</dc:creator>
				<category><![CDATA[citizen science]]></category>
		<category><![CDATA[crowdsourcing]]></category>
		<category><![CDATA[biodiversity informatics]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[natural history]]></category>

		<guid isPermaLink="false">http://soyouthinkyoucandigitize.wordpress.com/?p=365</guid>
		<description><![CDATA[This will be old news to many, but regardless: two big projects related to specimen digitization and biodiversity informatics launched in the past couple weeks.   Quick impressions on both below, focusing on the good, the buggy and a few items &#8230; <a href="http://soyouthinkyoucandigitize.wordpress.com/2013/04/30/this-week-in-digitization-the-good-the-buggy-and-the-curious/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=soyouthinkyoucandigitize.wordpress.com&#038;blog=22038354&#038;post=365&#038;subd=soyouthinkyoucandigitize&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p dir="ltr">This will be old news to many, but regardless: two big projects related to specimen digitization and biodiversity informatics launched in the past couple weeks.   Quick impressions on both below, focusing on the good, the buggy and a few items of curiosity.  Both projects are great, but &#8212; how will they fit into the broader landscape of existing resources, and into what niches?</p>
<p dir="ltr">1) <a title="Notes From Nature" href="http://notesfromnature.org" target="_blank">Notes From Nature</a> &#8212; a new Zooniverse project for the transcription of natural history collection ledgers.  This has been a long time in the making (more details <a href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3406478/">here</a>) and as of this writing, the two available collections (Herbarium specimens from SERNEC and insects from CALBUG) are already 26% and 21% transcribed, respectively.</p>
<p dir="ltr" style="padding-left:30px;"><strong>The Good</strong>: As always, clean and intuitive interfaces from the Zooniverse team make transcription fast and easy.  Data entry screens are customized to each type of collection (e.g. plant labels often contain more detailed locality descriptions than insect labels, whereas insect labels often contain data about what host-organism they were found on).  Awesomely, all the code is available on GitHub (<a href="https://github.com/zooniverse/notesFromNature">https://github.com/zooniverse/notesFromNature</a>) in case other museums want to set up their own transcription engines locally.  There is also an intriguing teaser buried at the very bottom of the Notes from Nature “<a href="http://www.notesfromnature.org/#/about">About</a>” page: “Interested in publishing your collection? Contact us.”</p>
<p dir="ltr" style="padding-left:30px;"><strong>The Buggy</strong>: Maaaan, I’ve transcribed around 40 labels and my total isn’t showing up under my user profile. This bothers me more than I care to admit, though it’s primarily out of worry that my transcriptions aren’t being saved.</p>
<p dir="ltr" style="padding-left:30px;"><strong>The Curious</strong>:  It would be great to learn more about how these data get back to the collections databases, and how exactly that handoff happens.  What do the transcribed files look like?  How is accuracy checked?  Do the museums have plans to make these records publicly available, or harvestable by aggregators like GBIF?</p>
<p>2) The patriotically-named <a href="http://bison.usgs.ornl.gov/" target="_blank">Biodiversity Information Serving Our Nation (AKA BISON) </a>biodiversity data portal out of USGS &#8211;  I know less about this project, other than what I’ve learned at various conference talks &#8212; however I’ve heard it referred to as the “federal version of iDigBio.”</p>
<p dir="ltr" style="padding-left:30px;"><strong>The Good</strong>: On first look, really nice integration of specimen occurrence data with USGS map layers, and as <a href="https://plus.google.com/117016856028818567812/posts/XBHDfEAkKkA">Hilmar Lapp pointed out</a>, there’s an API, which is great.</p>
<p dir="ltr" style="padding-left:30px;"><strong>The Buggy</strong>:  There are no identifiers on these specimens &#8212; not even their local catalog numbers.  Per Stinger Guala in the G+ thread linked above, the data is there &#8212; it’s just not yet visible (though it will be soon).  Perhaps there are reasons (a need for better formatting? a need for cleaner data?  a need for more server space?) that they’re not making this data visible yet &#8212; but it struck me as a pretty glaring omission.  While I realize that many researchers don’t spend a lot of time looking at catalog numbers, I imagine that they’d be absolutely critical if one were integrating BISON data with that from other sources (say, something from another portal like GBIF). Also, how could any of these records ever be linked back to the source data or any other data out there?  Provenance = important, no?</p>
<p dir="ltr" style="padding-left:30px;"><strong>The Curious</strong>: BISON is apparently the US node of GBIF &#8212; which I had assumed meant they would be providing GBIF with US data  &#8211; however, the data in BISON appears to invert that model and is a US-centric mirror of GBIF.  I hope that BISON becomes a platform through which US, <em>federally owned and managed</em> biocollections can be made publicly discoverable, and would be interested to hear from BISON reps if there are any plans in place to do this.</p>
]]></content:encoded>
			<wfw:commentRss>http://soyouthinkyoucandigitize.wordpress.com/2013/04/30/this-week-in-digitization-the-good-the-buggy-and-the-curious/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c8080798e336baca30da6f14204f848f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">akthom</media:title>
		</media:content>
	</item>
		<item>
		<title>What gets linked to global unique identifiers (GUIDs) in natural history collections digitization?</title>
		<link>http://soyouthinkyoucandigitize.wordpress.com/2013/01/28/what-gets-linked-to-global-unique-identifiers-guids-in-natural-history-collection-digitization/</link>
		<comments>http://soyouthinkyoucandigitize.wordpress.com/2013/01/28/what-gets-linked-to-global-unique-identifiers-guids-in-natural-history-collection-digitization/#comments</comments>
		<pubDate>Mon, 28 Jan 2013 21:58:27 +0000</pubDate>
		<dc:creator>Rob</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://soyouthinkyoucandigitize.wordpress.com/?p=315</guid>
		<description><![CDATA[Co-written with David Bloom. For as long as explorers have been collecting specimens and bringing them back to museums, collection managers and museum staff have been assigning unique (well, more or less unique, with a margin for human error) numbers &#8230; <a href="http://soyouthinkyoucandigitize.wordpress.com/2013/01/28/what-gets-linked-to-global-unique-identifiers-guids-in-natural-history-collection-digitization/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=soyouthinkyoucandigitize.wordpress.com&#038;blog=22038354&#038;post=315&#038;subd=soyouthinkyoucandigitize&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><em>Co-written with David Bloom.</em></p>
<p>For as long as explorers have been collecting specimens and bringing them back to museums, collection managers and museum staff have been assigning unique (well, more or less unique, with a margin for human error) numbers to them. Collections management isn’t just about conservation of physical objects – it’s also about care and support of an information retrieval system.  And while new technologies come and go, the maintenance of this system of locally unique identifiers remains at the core of collections management.</p>
<p>Recently, that aforementioned information retrieval system has been growing increasingly dispersed, and the ways in which information about specimens is disseminated, accessed, and manipulated have been changing rapidly.  The ability to place digital representations of specimens and their data <em>en masse</em> onto the World Wide Web is fundamentally changing collections-based scholarship.  Consequently, locating the right giant clam specimen in the University of Colorado Museum of Natural History (CU Museum) invertebrate collection is a very different task than locating a digital image of the same specimen on the Internet.  Tracking and connecting all of the digital representations (e.g., images, metadata records digitized from labels, tissue samples derived from the specimen, sound and video files) derived from that single specimen is even more difficult.  While local identifiers suffice to connect data and specimens within the CU Museum, global identifiers are needed to maintain these connections when content is released into the wilds of the Internet.  The challenge before us is how to best set up a system of globally unique identifiers (GUIDs) that work at Internet-scale.</p>
<p><a href="https://www.idigbio.org/">iDigBio</a>, an NSF-funded project tasked with coordinating the collections community’s digitization efforts, just <a href="https://www.idigbio.org/sites/default/files/iDigBioGuidGuideForProviders_v1.pdf">released</a> a GUID guide for data providers.  The document clarified the importance of GUIDs and recommended that iDigBio data providers adopt universally unique identifiers (UUIDs) as GUIDs.  It went further, however.  In the document, a call-out box (on page 3) states that “It has been agreed by the iDigBio community that the identifier represents the <em><strong>digital record</strong></em> (database record) of the specimen not the specimen itself. Unlike the barcode that would be on the physical specimen, for instance, the GUID uniquely represents the digital record only.”</p>
<p>In response, Rod Page wrote an (as always, entertaining and illuminating) <a href="http://iphylo.blogspot.com/2013/01/idigbio-you-are-putting-identifiers-on.html">iPhylo blog post</a>, “iDigBio: You are putting identifiers on the wrong thing,” in which he makes a strong case that a GUID must refer to the <em><strong>physical object</strong></em>.  “Surely,” writes Rod, “the key to integrating specimens with other biodiversity data is to have globally unique identifiers for the specimens.”</p>
<p>The disagreement above underscores our community’s need to be very clear about what GUIDs reference and how they resolve<sup><a href="#ftnt1" name="ftnt_ref1">[1]</a></sup>.  This seems simple, but it has been one of the most contentious issues <em>within our community</em>.  So who is right – should GUIDs point to digital records, or physical objects?</p>
<p>iDigBio has a clear mission to support the <em><strong>digitization</strong></em> of natural history specimens, and thus, deals exclusively with digital objects, which, as any database manager knows, need to have identifiers.  So, it does make sense that they would be concerned with identifiers for digital objects.  Those identifiers, however, absolutely must be as closely associated with the physical specimens as possible.  In particular, they need to be assigned and linked to the local identifiers stored in local databases managed by on-site collections staff.  If iDigBio is saying that only digitized objects that get passed to iDigBio from their data providers need GUIDs, not the original digital catalogs, we can’t agree with that.</p>
<p>On the other side, we’re not sure we agree with Rod either.  If Rod is suggesting that GUIDs replace, or serve as additions to the catalog numbers literally, physically attached to specimens or jars, we think that is simply impractical.  What is the incentive for putting a GUID on every single specimen in a collection, especially wet collections, from the point of view of a Collections Manager?  Does it help with loans?  Who is going to go into a collection and assign yet another number to all the objects in that collection (and how many institutions have the resources to make that happen)?</p>
<p>What we think is feasible and useful (and likely what Rod meant) is <em>to assign GUIDs to digital specimen records stored in local museum databases and linked to the local identifiers</em>.  When these data get published online, the associated GUID can be pushed downstream as well. Assigning GUIDs <em><strong>to the local, authoritative, electronic specimen records as they are digitized</strong></em> should be a mandatory step in the digitization process &#8212; a process that iDigBio is uniquely poised to support. This is the only way that GUIDs will be consistently propagated downstream to other data aggregators like VertNet, GBIF, and whatever else comes along fifty years from now (and fifty years after funding runs out on some existing projects).  Again, it’s important to remember that natural history collections management has always entailed the management of identifiers; the adoption of global identifiers will only increase the need for local identifier management.</p>
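<p>To make the idea in the paragraph above concrete, here is a minimal sketch of minting UUIDs for local digital records while keeping the link to the local catalog number.  It is an illustration under stated assumptions, not anyone’s actual workflow: the field names (<code>catalog_number</code>, <code>guid</code>), the sample records, and the function <code>assign_guids</code> are all hypothetical; only Python’s standard <code>uuid</code> module is real.</p>

```python
import uuid


def assign_guids(records):
    """Return copies of the records, adding a 'guid' field where one is missing.

    Each record is a dict; 'catalog_number' and 'guid' are hypothetical
    field names chosen for this sketch.
    """
    updated = []
    for record in records:
        record = dict(record)  # copy, so the caller's data is left untouched
        # Never overwrite an existing GUID: once minted, it must stay stable.
        if not record.get("guid"):
            record["guid"] = str(uuid.uuid4())  # random (version 4) UUID
        updated.append(record)
    return updated


# Hypothetical rows from a local collections database.
catalog = [
    {"catalog_number": "INV-0001", "taxon": "Tridacna gigas"},
    {
        "catalog_number": "INV-0002",
        "taxon": "Tridacna gigas",
        "guid": "00000000-0000-4000-8000-000000000001",
    },
]

for row in assign_guids(catalog):
    # The local identifier and the GUID travel together downstream.
    print(row["catalog_number"], row["guid"])
```

<p>The design choice worth noticing is that the GUID is stored alongside the local catalog number rather than replacing it, so either identifier can be used to find the record.</p>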
<p>Now, we can imagine one case in which GUIDs could conceivably serve as the originating catalog number: during field collection.  As biologists generate more and more digital content in the field (such as images and DNA collected in the field), minting GUIDs at the moment of collection (or during the review of daily collection events) and assigning them to samples and specimens directly could be quite useful.  As these physical objects make their way into collections, we anticipate that collections folks will still assign local identifiers.  Both have their uses and are made stronger and more useful when linked.</p>
<p>In summary: we are less worried about what exactly a GUID will point to (digital record vs physical object) as long as the content referenced is valuable to the biodiversity collections and science community.  However, we are more worried that we’re not explicitly identifying what we’re assigning identifiers to, and not discussing who will manage and integrate these identifiers, and how.  Our focus should be on developing trusted and well-understood GUID services that provide content resolution for the long (50-100 years) term.</p>
<hr />
<div>
<p><a href="#ftnt_ref1" name="ftnt1">[1]</a> To resolve a GUID, you dump it into a resolution service maintained by the naming authority that originally created that GUID, such as <a href="http://www.crossref.org/">CrossRef</a> or <a href="http://datacite.org/">DataCite</a>. That service then returns to you links and other information that point you to other content attached to the same identifier.  A great example of a resolvable GUID is a Digital Object Identifier (DOI).  In the case of journal articles, resolution of a DOI will usually direct you to the paper itself via hyperlink, but it could also be a web page with information about the resource, a sound file, or any other representation of the object associated with the GUID.</p>
</div>
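<p>As a small illustration of the resolution step described in the footnote, the sketch below turns a DOI into the URL you would hand to a resolver.  It assumes the public <code>https://doi.org/</code> proxy as the resolution service; the function name <code>doi_to_url</code> and the example DOI (a PLoS ONE article, used purely as sample input) are choices made for this sketch.  It only builds the URL and makes no network request.</p>

```python
from urllib.parse import quote

# Assumption for this sketch: the public DOI proxy acts as the resolver.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi):
    """Build the resolver URL for a DOI such as '10.1371/journal.pone.0016196'."""
    doi = doi.strip()
    # Every DOI starts with the directory indicator '10.'.
    if not doi.startswith("10."):
        raise ValueError("not a DOI: " + repr(doi))
    # Percent-encode characters that are unsafe in a URL path, but keep '/'.
    return DOI_RESOLVER + quote(doi, safe="/")


print(doi_to_url("10.1371/journal.pone.0016196"))
# prints https://doi.org/10.1371/journal.pone.0016196
```

<p>Following that URL in a browser is what “resolving” means in practice: the resolver redirects you to whatever the naming authority currently has on file for the identifier.</p>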
]]></content:encoded>
			<wfw:commentRss>http://soyouthinkyoucandigitize.wordpress.com/2013/01/28/what-gets-linked-to-global-unique-identifiers-guids-in-natural-history-collection-digitization/feed/</wfw:commentRss>
		<slash:comments>33</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/b67f826745311eced80f5f0da70b89b6?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">robgur</media:title>
		</media:content>
	</item>
		<item>
		<title>Post-Henderson Post</title>
		<link>http://soyouthinkyoucandigitize.wordpress.com/2012/06/04/post-henderson-post/</link>
		<comments>http://soyouthinkyoucandigitize.wordpress.com/2012/06/04/post-henderson-post/#comments</comments>
		<pubDate>Mon, 04 Jun 2012 23:50:53 +0000</pubDate>
		<dc:creator>Andrea</dc:creator>
				<category><![CDATA[field notes]]></category>
		<category><![CDATA[Henderson Project]]></category>
		<category><![CDATA[SPNHC]]></category>

		<guid isPermaLink="false">http://soyouthinkyoucandigitize.wordpress.com/?p=300</guid>
		<description><![CDATA[So You Think You Can Digitize had a bit of an unplanned hiatus; turns out that maintaining a blog while its authors take something like 15+ trips, attend to work/school responsibilities, and write gobs of papers is a bit &#8230; <a href="http://soyouthinkyoucandigitize.wordpress.com/2012/06/04/post-henderson-post/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=soyouthinkyoucandigitize.wordpress.com&#038;blog=22038354&#038;post=300&#038;subd=soyouthinkyoucandigitize&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>So You Think You Can Digitize had a bit of an unplanned hiatus; turns out that maintaining a blog while its authors take something like 15+ trips, attend to work/school responsibilities, and write gobs of papers is a bit trickier than anticipated.  One of those papers was born directly out of this blog: a paper submitted to Zookeys about our work on the <a href="http://soyouthinkyoucandigitize.wordpress.com/category/henderson-project/">Henderson Field Notes Project</a>.  Ironically, we spent so much time on the Henderson manuscript that we were forced to spend less time uploading Henderson’s notebooks for additional annotation.  That should be remedied soon!   With help from some of our still-anonymous friends, we’ve made a lot of progress, including the annotation and extraction of over a thousand species observations made by Henderson between 1905 and 1909, which we then packaged as a Darwin Core archive for publication along with the paper &#8212; but there are still 9 more notebooks to go.</p>
<p>We&#8217;ve written a lot about field notebooks, especially older notebooks, penned by pioneers, describing the Old West, but we do worry that this reverence for the past has the unintended consequence of diminishing the importance of field and lab data recorded in the exact same way in the present day.  Digitizing old things, showing off the labels written in India ink and cursive: the romance of this might reinforce unfortunate stereotypes about the “<a href="http://www.conservationmagazine.org/2011/05/the-macaque-shuffle/comment-page-1/#comment-9109">dustiness</a>” and antiquity of museum collections.  In the here and now, field notes are still handwritten, and specimens are still collected and catalogued, often at higher rates than ever before.  What do we do with the analog present?</p>
<p>This very issue came up in a post on Andy Farke’s blog, <a href="http://openpaleo.blogspot.com/2012/03/open-museum-notebook-torosaurus-style.html">The Open Source Paleontologist</a>.  Dr. Farke is a paleontologist who recently published a paper describing a new species not found in the field, but rather, in Yale’s paleontology collection.  Kudos to him, because he decided that <a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0016196">publishing his findings in PLoS ONE</a> was not enough; he also published the lab notes on which his paper was based.  In his words,</p>
<p style="padding-left:30px;">“<em>There isn&#8217;t really anything earthshaking in there&#8230; but in any case now other folks can use them. The sketches of real bone vs. reconstruction should be particularly useful</em>.”</p>
<p>Earthshaking or no, we agree that there’s a huge need to openly archive this sort of documentation, both to support the reproducibility and <a href="http://ivory.idyll.org/blog/apr-12/replication-i">replicability of scientific results</a>, and to better describe the soon-to-be-historical use of the specimens in question.</p>
<p>Because repositories like Dryad aren’t intended to accept field or lab notes, Dr. Farke turned to Figshare as a place to deposit them.  While Figshare is great for its ease of use and flexibility in licensing options, it’s nevertheless not ideal for text; PDFs are not easily navigable, nor is there any support for transcription and annotation of notes.  This made us wonder: could Wikisource be a place to deposit these notes?</p>
<p>After Gaurav <a href="http://en.wikisource.org/wiki/Wikisource:Scriptorium/Archives/2012-04#Scientists.2C_personal_notes_and_Wikisource">conferred with the Wiki-community</a>, we found that Wikisource could indeed be used to provision more recent field or lab notes in addition to historical documents, provided they meet certain criteria (and the Wikipedians don’t eventually find this to be a violation of existing policies).  Dr. Farke’s notes on the Torosaurus are <a href="http://en.wikisource.org/wiki/Index:Notes_and_Observations_on_Specimens_of_Torosaurus_at_the_Yale_Peabody_Museum_of_Natural_History.pdf">now available here</a>, and in need of transcription too.</p>
<p>Why put lab notes on Wikisource?  Why not just leave them on Figshare?  Well:</p>
<ul>
<li>Just as we were able to annotate the Henderson field notes for taxa, it’s easy to imagine notes like Dr. Farke’s being annotated with specimen catalog numbers, and even linked to other records describing the specimen in question.</li>
<li><a href="http://en.wikipedia.org/wiki/LOCKSS">Lots of Copies Keeps Stuff Safe</a> &#8212; If either of these sites goes belly up, then a copy of the notes would still be available, with the same CC-BY license that Dr. Farke requires.</li>
<li>Publishing notebooks to a platform like Wikisource bridges gaps between formal and informal publication, not to mention institutional archiving and self-archiving (which more often than not is simply left-bottom-desk-drawer-archiving).  Yes, though anyone can edit or post anything to the various Wikimedia sites, there are nevertheless quality and <a href="http://en.wikipedia.org/wiki/Wikipedia:Notability">notability requirements</a> that must be met for an article to be considered Wiki-worthy; e.g. Dr. Farke’s notes qualify because he (or his papers) meets that notability requirement, but generally speaking, Wikisource is not a place to “just put notes.”</li>
<li>Something worth noting and maybe exploring further: Deposition of notebooks post-publication is not quite the same thing as maintaining an Open Notebook, although clearly related.  Wikisource/Wikimedia aren’t intended to be means of making science transparent, and may balk at that level of repurposing of the platform &#8212; only time (and continued experimentation&#8230;) will tell.</li>
</ul>
<p>We are curious to see what others think of the idea of using Wikisource as a repository not just for historical notes, but more recent notes as well (and we also wonder if there are any eager paleo people out there looking to help transcribe Dr. Farke’s notes).</p>
<p>Post-Henderson, and post-Wikisource: we do want to turn our attention back to digitization, natural history collections, and what is going on right now.  A lot has changed since we started this blog back in March 2011; a <a href="https://www.idigbio.org/">number</a> <a href="http://2012.botanyconference.org/engine/search/index.php?func=detail&amp;aid=250">of</a> <a href="http://invertnet.org/">projects</a> that were merely in the planning stages have not only been funded, but have actually started digitizing collections.  Some of that work putting project plans into practice has been happening right on Rob’s doorstep.  A few weeks ago, the Herbarium at the University of Colorado had a visit from the New York Botanical Garden’s traveling digitization set-up gurus Melissa Tulig and Kim Watson, who were here to set up an imaging station for use in the <a href="http://tcn.amnh.org/">Tri-Trophic Thematic Collections Network</a> &#8212; we hope to talk about that in the next post.</p>
<p>And hey &#8212; we’ll also be at SPNHC again this year, presenting on the Henderson project in the Archives and Special Collections session, so if you’re in New Haven this June, and wanna talk digitization, Wikisource, or anything else, do say hello!</p>
]]></content:encoded>
			<wfw:commentRss>http://soyouthinkyoucandigitize.wordpress.com/2012/06/04/post-henderson-post/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c8080798e336baca30da6f14204f848f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">akthom</media:title>
		</media:content>
	</item>
		<item>
		<title>JHFNP, Post 4.5</title>
		<link>http://soyouthinkyoucandigitize.wordpress.com/2012/01/23/jhfnp-post-4-5/</link>
		<comments>http://soyouthinkyoucandigitize.wordpress.com/2012/01/23/jhfnp-post-4-5/#comments</comments>
		<pubDate>Mon, 23 Jan 2012 22:31:26 +0000</pubDate>
		<dc:creator>Andrea</dc:creator>
				<category><![CDATA[Henderson Project]]></category>

		<guid isPermaLink="false">http://soyouthinkyoucandigitize.wordpress.com/?p=287</guid>
		<description><![CDATA[A quick mini-post here, to tell of some interesting things: 1) Notebook 1 is DONE.  Fully annotated, and all within 3 days of our last post.  This represents many hours of work and the creation of hundreds of annotations: {{place&#124;&#8230;}} &#8230; <a href="http://soyouthinkyoucandigitize.wordpress.com/2012/01/23/jhfnp-post-4-5/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=soyouthinkyoucandigitize.wordpress.com&#038;blog=22038354&#038;post=287&#038;subd=soyouthinkyoucandigitize&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>A quick mini-post here, to tell of some interesting things:</p>
<p>1) Notebook 1 is DONE.  Fully annotated, and all within 3 days of our last post.  This represents many hours of work and the creation of hundreds of annotations:</p>
<p style="padding-left:30px;">{{place|&#8230;}} annotations: 218<br />
{{dated|&#8230;}} annotations: 64<br />
{{taxon|&#8230;}} annotations: 347</p>
<p>So total annotations = 629!</p>
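<p>For anyone curious how a tally like the one above could be produced, here is a rough sketch that counts template calls in a page of wikitext.  The template names (<code>place</code>, <code>dated</code>, <code>taxon</code>) come from this post; the sample sentence, the argument values inside the templates, and the function <code>count_annotations</code> are made up for illustration and are not how the numbers above were actually computed.</p>

```python
import re
from collections import Counter

# Matches the opening of a template call such as {{taxon|...}}
# and captures the template name.
TEMPLATE_RE = re.compile(r"\{\{\s*(place|dated|taxon)\s*\|")


def count_annotations(wikitext):
    """Count place, dated and taxon template calls in a string of wikitext."""
    return Counter(TEMPLATE_RE.findall(wikitext))


# Hypothetical snippet of an annotated notebook page.
sample = (
    "Walked north from {{place|Boulder}} on {{dated|1905-06-01}}; "
    "saw {{taxon|Pinus ponderosa}} and {{taxon|Populus}}."
)
print(count_annotations(sample))

# The per-template totals reported in this post add up as stated.
reported = {"place": 218, "dated": 64, "taxon": 347}
print(sum(reported.values()))  # prints 629
```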
<p>2) Furthermore, Notebook 1 was annotated almost entirely by 1 person&#8230; who we can’t thank by name because they didn’t make a Wikisource account.  So, thank you, IP address 173.69.207.29!  And also to: 207.145.38.73.  If you would like a “thank you” coffee mug please let us know in the comments or via email.</p>
<p>3)  We have set up a <a href="http://en.wikisource.org/wiki/Wikisource:WikiProject_Field_Notes">Wikisource Project page</a>.  We still need to write detailed, step by step descriptions for some pages (e.g. transcription and annotation), but we do have an initial <a href="http://en.wikisource.org/wiki/Wikisource:WikiProject_Field_Notes/For_uploaders">help page on upload</a> available, and more will be posted there soon.</p>
<p>4)  We’re putting new notebooks and transcriptions onto Wikisource as fast as we can get transcriptions and scans associated with one another; transcriptions of <a href="http://en.wikisource.org/wiki/Field_Notes_of_Junius_Henderson/Notebook_2">Notebook 2</a> are already up.  This notebook features more paleontology and geology than the last, and describes several trips to Northern Colorado: <em>“Instead of attending commencement and taking my B.A. degree I started north on foot, up the Lykins lateral valley NW of Colorado Sanitarium.”  </em></p>
<p>We are excited about, and appreciative of, the annotation help, and we really do want to acknowledge as many folks as possible in the upcoming Zookeys paper.  So once again, if you want to be credited by name, make sure to create (and log in to) a Wikisource account before annotating (though if you prefer public anonymity, just email us directly to say you are helping); we probably can’t acknowledge IP addresses, and we’re not sure if we can acknowledge aliases in a published paper (though how great would it be to say, “special thanks to Paul Flemons, Laura Russell and SquirrelFan23 for her or his help”).</p>
<p>Next, more substantial update: annotations and occurrences, as promised last time.</p>
]]></content:encoded>
			<wfw:commentRss>http://soyouthinkyoucandigitize.wordpress.com/2012/01/23/jhfnp-post-4-5/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c8080798e336baca30da6f14204f848f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">akthom</media:title>
		</media:content>
	</item>
		<item>
		<title>Field Notes Challenge Part 4: Help, ‘Cause We Need Somebod(y/ies)</title>
		<link>http://soyouthinkyoucandigitize.wordpress.com/2012/01/12/field-notes-challenge-part-4-help-cause-we-need-somebodies/</link>
		<comments>http://soyouthinkyoucandigitize.wordpress.com/2012/01/12/field-notes-challenge-part-4-help-cause-we-need-somebodies/#comments</comments>
		<pubDate>Thu, 12 Jan 2012 23:04:25 +0000</pubDate>
		<dc:creator>Andrea</dc:creator>
				<category><![CDATA[crowdsourcing]]></category>
		<category><![CDATA[field notes]]></category>
		<category><![CDATA[Henderson Project]]></category>

		<guid isPermaLink="false">http://soyouthinkyoucandigitize.wordpress.com/?p=277</guid>
		<description><![CDATA[Co-written once again with Gaurav Vaidya. Over the last week, Gaurav has continued to pull templates out of his hat (leaving rabbit pulling to Rob and his bunnies) and we now have templates for locations and dates.  The syntax for &#8230; <a href="http://soyouthinkyoucandigitize.wordpress.com/2012/01/12/field-notes-challenge-part-4-help-cause-we-need-somebodies/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><em>Co-written once again with Gaurav Vaidya.</em></p>
<p>Over the last week, Gaurav has continued to pull templates out of his hat (leaving rabbit pulling to Rob and his bunnies) and we now have templates for locations and dates.  The syntax for these templates is very similar to that of the “taxon” template we discussed <a href="http://soyouthinkyoucandigitize.wordpress.com/2012/01/06/field-notes-challenge-part-3-new-years-digital-resolutions/">in the last post</a> (and touch on below).</p>
<p>Let’s start with date.  The general syntax is:</p>
<p style="text-align:center;" dir="ltr">{{dated|&lt;date in YYYY-MM-DD Format&gt;|&lt;date as transcribed&gt;}}</p>
<p>So this example from <a href="http://en.wikisource.org/w/index.php?title=Page:Field_Notes_of_Junius_Henderson,_Notebook_1.djvu/14">Page 12 of Notebook 1</a> becomes, “{{dated|1905-08-04|Aug. 4}}.”</p>
<p>The Location template looks similar &#8211;</p>
<p style="text-align:center;" dir="ltr">{{place|Location Name|Location name as transcribed}}</p>
<p style="text-align:left;">&#8211; but does something particularly nifty: it creates a linkout to <a href="http://www.openstreetmap.org/">OpenStreetMap</a>, which immediately resolves the place name on a map, along with links to Wikipedia and Wikimedia Commons.  So this example on <a href="http://en.wikisource.org/wiki/Page%3AField_Notes_of_Junius_Henderson%2C_Notebook_1.djvu/8">Page 6 of Notebook 1</a> is annotated like this:<br />
“{{place|Florissant Lake|Florissant Lake basin}},” which creates a box like this: <a href="http://nominatim.openstreetmap.org/search?q=Florissant"><img class="aligncenter" src="https://lh3.googleusercontent.com/5I-vxSU-05RYtyTnhxKEdYhTTtWStWfWEVj1nToBWfyu-TqxwQk_PDADRTN4u-_JCqMuB6xRgSAI2XJYd8Ku0ZGbesjP3_HOPxcJkW2Gtm9r-75M2sQ" alt="" width="182px;" height="43px;" /></a><span style="text-align:center;">The little circled map in the image above is the link out.  YAY MAPS.</span></p>
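<p>Because both templates share the same two-argument shape, the annotations can in principle be pulled back out of a page's wikitext with very little code. What follows is only a minimal sketch in Python, not the code behind the templates themselves: the function names are hypothetical, the sample sentence is made up by stitching together the two examples above, and the pattern only handles the simple two-argument form.</p>

```python
import re

# Matches two-argument annotation templates such as {{dated|1905-08-04|Aug. 4}}.
# Only the template names mentioned in these posts are recognised (assumption).
TEMPLATE_RE = re.compile(r"\{\{(dated|place|taxon)\|([^{}|]*)\|([^{}|]*)\}\}")


def extract_annotations(wikitext):
    """Return (template, normalized value, text as transcribed) tuples."""
    return [
        (m.group(1), m.group(2).strip(), m.group(3).strip())
        for m in TEMPLATE_RE.finditer(wikitext)
    ]


def strip_annotations(wikitext):
    """Put back the text as transcribed, dropping the template markup."""
    return TEMPLATE_RE.sub(lambda m: m.group(3), wikitext)


# Made-up sample line combining the two examples from this post.
sample = (
    "{{dated|1905-08-04|Aug. 4}} Camped in "
    "{{place|Florissant Lake|Florissant Lake basin}}."
)

for annotation in extract_annotations(sample):
    print(annotation)
print(strip_annotations(sample))
```

<p>Keeping the transcribed text as the last argument is what makes this round trip possible: stripping the templates gives back exactly what Henderson wrote, while the first argument carries the normalized value.</p>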
<p>As always, the million dollar question is: what next?!?  The next step is to continue using these templates to fully annotate Henderson’s first notebook &#8212; <strong>which is where you guys come in</strong>.  All these experiments are well and good, but until the rubber hits the road (or the fingers hit the keyboard), it’s more theory than practice. <strong>We need your help</strong>.  If you have the time and inclination, please jump in and annotate.  We want to make it clear that you can’t hinder our progress, only help, and that this is <em>really easy</em>.  It’s a wiki, so hit “edit” on one of the pages (for example <a href="http://en.wikisource.org/wiki/Page%3AField_Notes_of_Junius_Henderson%2C_Notebook_1.djvu/33">this one</a>, which mentions <a href="http://cuteoverload.com/2009/08/23/this-just-in-american-pika/">pikas</a> (!)), and just try out a taxon or location annotation.  For example, on <a href="http://en.wikisource.org/wiki/Page%3AField_Notes_of_Junius_Henderson%2C_Notebook_1.djvu/33">Page 31</a>, you could hit edit and replace the word “chickadee” with this:  {{taxon|Paridae|chickadee}}, or even {{taxon|chickadee}} and VOILA &#8212; annotation!  Go back to the main index page (the up arrow next to the “Page | Discussion | Image” links) and you can see that the changes have also shown up in the main contents page.</p>
<p>We have a particular need to get this done quickly, because we have been asked to assemble our experiments into a peer-reviewed and hopefully published paper in a special issue of <a href="http://www.pensoft.net/journals/zookeys">ZooKeys</a>.  A manuscript is due in mid-March and, yeah, that isn’t very far away.  So we could really use your help with annotating this text. The rewards &#8212; apart from a general sense of well-being and the satisfaction of contributing to the furthering of knowledge about our planet &#8212; will be a direct mention of your help in the acknowledgements section of our paper.  If you are interested in making a more substantive contribution in terms of work and writing, we’d be pleased to chat more and possibly include you as a co-author.</p>
<p>We have been on the fence previously about the utility of prizes, and whether these are effective incentives, or a titch gimmicky.  In the past we’ve given (very small) Amazon gift cards as prizes, as a way to say thank you to those who took the time to comment on posts, but this time around we’re thinking of something different: Rob is happy to make a small donation from (very limited) personal funds to make <a href="http://www.cafepress.com/cp/customize/product.aspx?number=611355935">Junius Henderson coffee mugs</a> and give them away as prizes.  But we wanna know &#8212; do prizes help motivate you to get involved in annotating?  Or are they eye-roll-inducing?</p>
<p>COME ON who doesn’t want a coffee mug?</p>
<p>Next post: text mining, annotations and occurrences!</p>
]]></content:encoded>
			<wfw:commentRss>http://soyouthinkyoucandigitize.wordpress.com/2012/01/12/field-notes-challenge-part-4-help-cause-we-need-somebodies/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c8080798e336baca30da6f14204f848f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">akthom</media:title>
		</media:content>

		<media:content url="https://lh3.googleusercontent.com/5I-vxSU-05RYtyTnhxKEdYhTTtWStWfWEVj1nToBWfyu-TqxwQk_PDADRTN4u-_JCqMuB6xRgSAI2XJYd8Ku0ZGbesjP3_HOPxcJkW2Gtm9r-75M2sQ" medium="image" />
	</item>
		<item>
		<title>Field Notes Challenge Part 3: New Year&#8217;s Digital Resolutions</title>
		<link>http://soyouthinkyoucandigitize.wordpress.com/2012/01/06/field-notes-challenge-part-3-new-years-digital-resolutions/</link>
		<comments>http://soyouthinkyoucandigitize.wordpress.com/2012/01/06/field-notes-challenge-part-3-new-years-digital-resolutions/#comments</comments>
		<pubDate>Fri, 06 Jan 2012 15:22:08 +0000</pubDate>
		<dc:creator>Rob</dc:creator>
				<category><![CDATA[Henderson Project]]></category>

		<guid isPermaLink="false">http://soyouthinkyoucandigitize.wordpress.com/?p=241</guid>
		<description><![CDATA[“Resolution” is a word with many meanings. It can refer to the granularity of a digital image, or the solution to a problem, or a firm decision to do or not do something.  The Carefree Cogitation Coalition here at So &#8230; <a href="http://soyouthinkyoucandigitize.wordpress.com/2012/01/06/field-notes-challenge-part-3-new-years-digital-resolutions/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>“Resolution” is a word with many meanings. It can refer to the granularity of a digital image, or the solution to a problem, or a firm decision to do or not do something.  The Carefree Cogitation Coalition here at So You Think You Can Digitize has been thinking all about resolutions as we enter the new year. We left 2011 with some exciting developments and new challenges related to making digitized and transcribed field notes openly available.  We have resolved some of the issues mentioned in our last post, and are now resolved to tackle perhaps the biggest challenge yet: to find a flexible way to annotate these notes.</p>
<p>First, to recap BRIEFLY:<br />
1) We <a href="http://soyouthinkyoucandigitize.wordpress.com/2011/11/28/an-ode-to-founders-and-a-field-notes-challenge-part-1/">decided to use Wikisource</a> as a platform for providing scans of CU Museum founder Junius Henderson&#8217;s field notes along with transcriptions.<br />
2) The first notebook scans and transcriptions are now available.  Wikisource’s navigation can be less than intuitive, so here’s the <a href="http://en.wikisource.org/wiki/Index:Field_Notes_of_Junius_Henderson,_Notebook_1.djvu">index page</a>, which lists all the pages of the Field Notebook along with metadata.  Next, click the “Notebook 1” link on the upper right hand side of the screen to get to the <a href="http://en.wikisource.org/wiki/Field_Notes_of_Junius_Henderson/Notebook_1">contents</a> page for that notebook.  You’ll see a Table of Contents here and, as you scroll down, the full transcription, with page numbers listed along the left border of the page.   Click on any of these page numbers (for example, Page 5) and you will go <a href="http://en.wikisource.org/wiki/Page%3AField_Notes_of_Junius_Henderson%2C_Notebook_1.djvu/7">to a page displaying the scanned image of the Henderson notebook and transcription</a>.  This page is editable according to Wiki rules.<br />
3) We’d originally resolved to only spend 5 hours apiece TOTAL on this project.  Yeah, consider that resolution broken.  Maybe 5 hours apiece&#8230; per week?</p>
<p><a href="http://soyouthinkyoucandigitize.files.wordpress.com/2012/01/wikiview.jpg"><img class="size-full wp-image-242 aligncenter" title="Wikiview" src="http://soyouthinkyoucandigitize.files.wordpress.com/2012/01/wikiview.jpg?w=640&#038;h=425" alt="" width="640" height="425" /></a></p>
<address><strong>Figure 1. </strong> Page 5 of Henderson’s field notebook as shown on Wikisource</address>
<p>So far, most of our work has been focused on figuring out how to get Wikisource to represent these notes in a way that’s consistent with existing Wikisource standards and policy, while also serving our needs as field-note data-miners.  We think we’ve done this pretty well; Gaurav Vaidya has put a ton of work into developing templates for taxon annotations.  You see those little items in boxes up there in Figure 1? Gaurav’s template (which has its own <a href="http://en.wikisource.org/wiki/Template:Taxon">Wikisource page</a>) automagically creates both a direct hyperlink to the species page as well as the floating boxes that link out to Wikispecies and the Wikimedia Commons.  The markup itself looks like this:</p>
<p style="text-align:center;">{{taxon|&lt;taxon name&gt;|&lt;text as it appears in the transcription&gt;}}</p>
<p>So “Lark buntings” would become “{{taxon|Calamospiza melanocorys|Lark buntings}}.”</p>
<p>Pretty simple!  Our next steps are similarly simple.  We (read: Gaurav) will create annotation templates for “Dates” and “Locations,” and then start marking them up in the text.  Andrea has been linking together resources (<a href="http://www.ubio.org/tools/recognize.php">uBio’s FindIt</a>, <a href="http://europeana-geo.isti.cnr.it/geoparser/geoparsing">Europeana’s Geoparser</a>,  and her own rudimentary code) to automate this markup, so that the future notebooks we upload will be pre-loaded with links (we’ll talk about this more in our next post).  Locations, at least, will also be inter-wiki linked so interested readers can learn about the places Henderson visited during his journeys.  As soon as we have the templates for location and date done, we’ll post the syntax here and you can just jump in and try.  We’d love the help!</p>
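<p>To give a feel for what pre-loading a transcription with markup might involve, here is a minimal Python sketch. It is not Andrea's code and it does not call uBio or the Europeana Geoparser: it assumes a small hand-built lookup table (the hypothetical <code>KNOWN_TAXA</code>) in place of a name-recognition service, and the sample sentence is invented.</p>

```python
import re

# Hypothetical lookup table: text as transcribed (lowercase) mapped to a taxon name.
# The one pair below comes from the example above; a real run would get its
# names from a name-recognition service rather than a hand-written dictionary.
KNOWN_TAXA = {
    "lark buntings": "Calamospiza melanocorys",
}


def annotate_taxa(text, known_taxa=KNOWN_TAXA):
    """Wrap each known name in {{taxon|...|...}}, keeping the transcribed spelling."""
    if not known_taxa:
        return text
    # Longest names first, so a longer name wins over a shorter one it contains.
    names = sorted(known_taxa, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in names) + r")\b",
        re.IGNORECASE,
    )

    def wrap(match):
        verbatim = match.group(1)
        return "{{taxon|" + known_taxa[verbatim.lower()] + "|" + verbatim + "}}"

    return pattern.sub(wrap, text)


# Invented sample sentence, for illustration only.
print(annotate_taxa("Saw Lark buntings near camp."))
```

<p>One caveat with a sketch this simple: running it twice over the same text would wrap names that are already inside a template, so it is only safe on transcriptions that have not been annotated yet.</p>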
<p>So now we have annotated field notes online, readily and freely available to everybody!  Exciting!  But here is the really exciting part: we think we can push these annotations out of the World of Wikipedia and into the larger semantic web.  Our plan is to unlock those annotations from Wikisource and try to represent them as separate Darwin Core observation records; more on what those records look like <a href="http://code.google.com/p/darwincore/wiki/Example_RDFa#Example_3_Observation_with_Name_and_Location">here</a>.  We&#8230; aren’t entirely sure how we’re going to do this yet, but we’ll keep you posted.  We also have some interesting ideas about what to do with these:  <a href="http://commons.wikimedia.org/wiki/Category:Junius_Henderson">http://commons.wikimedia.org/wiki/Category:Junius_Henderson</a>.  Maybe you do too.  If you do, please comment.  We live for comments.  Be resolved to let us know what you think, and Happy New Year.</p>
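<p>As a thought experiment only, one way the annotations could be turned into observation-style records is sketched below in Python. This is an assumption-heavy illustration, not a settled design: it assumes every taxon mention inherits the most recent date and place annotated before it (a heuristic that will sometimes be wrong), it only handles two-argument templates, and the sample line is invented from examples in these posts. The dictionary keys are Darwin Core term names, but the output is a plain Python dict rather than RDFa.</p>

```python
import re

# Two-argument templates only; {{taxon|chickadee}}-style shortcuts are ignored.
TEMPLATE_RE = re.compile(r"\{\{(dated|place|taxon)\|([^{}|]*)\|([^{}|]*)\}\}")


def to_observation_records(wikitext):
    """Emit one dict per taxon annotation, carrying forward the latest date and place."""
    context = {}
    records = []
    for match in TEMPLATE_RE.finditer(wikitext):
        kind, value, verbatim = match.group(1), match.group(2), match.group(3)
        if kind == "dated":
            context["eventDate"] = value
            context["verbatimEventDate"] = verbatim
        elif kind == "place":
            context["locality"] = value
            context["verbatimLocality"] = verbatim
        else:
            # Heuristic (assumption): the sighting happened at the last date
            # and place mentioned before it in the text.
            record = {"basisOfRecord": "HumanObservation", "scientificName": value}
            record.update(context)
            records.append(record)
    return records


# Invented sample line built from template examples used in these posts.
sample = (
    "{{dated|1905-08-04|Aug. 4}} At {{place|Florissant Lake|Florissant Lake basin}} "
    "saw {{taxon|Calamospiza melanocorys|Lark buntings}}."
)

for record in to_observation_records(sample):
    print(record)
```

<p>Anything produced this way would still need a human pass before being treated as an occurrence record, since a name in a notebook is not always a sighting.</p>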
]]></content:encoded>
			<wfw:commentRss>http://soyouthinkyoucandigitize.wordpress.com/2012/01/06/field-notes-challenge-part-3-new-years-digital-resolutions/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/b67f826745311eced80f5f0da70b89b6?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">robgur</media:title>
		</media:content>

		<media:content url="http://soyouthinkyoucandigitize.files.wordpress.com/2012/01/wikiview.jpg" medium="image">
			<media:title type="html">Wikiview</media:title>
		</media:content>
	</item>
		<item>
		<title>Field Note Challenge Part 2: Veni, Vidi, Wiki</title>
		<link>http://soyouthinkyoucandigitize.wordpress.com/2011/12/05/field-note-challenge-part-2-veni-vidi-wiki/</link>
		<comments>http://soyouthinkyoucandigitize.wordpress.com/2011/12/05/field-note-challenge-part-2-veni-vidi-wiki/#comments</comments>
		<pubDate>Tue, 06 Dec 2011 03:50:48 +0000</pubDate>
		<dc:creator>Andrea</dc:creator>
				<category><![CDATA[crowdsourcing]]></category>
		<category><![CDATA[field notes]]></category>
		<category><![CDATA[Henderson Project]]></category>
		<category><![CDATA[projects]]></category>

		<guid isPermaLink="false">http://soyouthinkyoucandigitize.wordpress.com/?p=232</guid>
		<description><![CDATA[SYTYCD would like to welcome guest blog co-author Gaurav Vaidya. A week ago, we told you about our cunning plan to play around with annotating and publishing one  transcribed notebook of Junius Henderson’s field notes. We’ve had two big successes in the last &#8230; <a href="http://soyouthinkyoucandigitize.wordpress.com/2011/12/05/field-note-challenge-part-2-veni-vidi-wiki/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><em>SYTYCD would like to welcome guest blog co-author <a href="http://www.ggvaidya.com/">Gaurav Vaidya</a>.</em></p>
<p>A week ago, we told you about <a href="http://soyouthinkyoucandigitize.wordpress.com/2011/11/28/an-ode-to-founders-and-a-field-notes-challenge-part-1/">our cunning plan</a> to play around with annotating and publishing one  transcribed notebook of <a href="http://en.wikipedia.org/wiki/Junius_Henderson">Junius Henderson</a>’s field notes. We’ve had two big successes in the last seven days, which is not bad for <del>soul-crushing finals and project deadline week</del> the holiday season.</p>
<p>Success #1:  YOU GUYS, the internet is amazing.  Within half an hour of posting our last post, we were contacted by <a href="http://spot.colorado.edu/~dena/DMSmithHomepage.htm">Dena Smith</a> and <a href="http://paleobiology.si.edu/staff/individuals/hollis.html">Kathy Hollis</a>, who alerted us to the existence of scans of Henderson’s notebooks &#8212; remember, we only had transcribed text files when we started!  This started a chain of events that put us in touch with two folks from the <a href="http://nsidc.org/">National Snow and Ice Data Center</a> (NSIDC): <a href="http://nsidc.org/noaa/people.html">Allaina Wallace</a>, librarian and analog data archivist, and <a href="http://nsidc.org/about/bios/duerr.html">Ruth Duerr</a>, Manager of Data Stewardship.  Less than 24 hours later, Rob and Gaurav had a productive meeting at NSIDC offices in east Boulder, and a DVD containing all the scans.  This DVD included three notebooks we hadn’t known about, two of which cover Henderson’s travels between 1927 and 1936 &#8212; adding another decade to his life on the road &#8212; AND were accompanied by more of Peter Robinson’s transcriptions.</p>
<div class="wp-caption aligncenter" style="width: 509px"><a href="http://en.wikisource.org/wiki/Index:Field_Notes_of_Junius_Henderson,_Notebook_1.djvu"><img title="From the field notes of Junius Henderson, Notebook 1" src="http://upload.wikimedia.org/wikipedia/commons/thumb/7/7e/Field_Notes_of_Junius_Henderson%2C_Notebook_1.djvu/page110-500px-Field_Notes_of_Junius_Henderson%2C_Notebook_1.djvu.jpg" alt="From the field notes of Junius Henderson, Notebook 1" width="499" height="766" /></a><p class="wp-caption-text">From the field notes of Junius Henderson, Notebook 1</p></div>
<p>Success #2:  Having the scans made a huge impact on what we were able to do with the text.  In particular, Gaurav has made headway in using WikiSource as a platform for maximal use and re-use of Henderson’s notes. <a href="http://en.wikipedia.org/wiki/WikiSource">WikiSource</a> is “an online digital library of free content textual sources on a wiki, operated by the Wikimedia Foundation” (the organization that also runs Wikipedia). Uploading the scan of Henderson’s first notebook to the Wikimedia Commons was easy: it is now available as a <a href="http://commons.wikimedia.org/wiki/File:Field_Notes_of_Junius_Henderson,_Notebook_1.pdf">PDF</a> or <a href="http://commons.wikimedia.org/wiki/File:Field_Notes_of_Junius_Henderson,_Notebook_1.djvu">DjVu</a> file. Once the scans were in the Commons, Gaurav <a href="http://en.wikisource.org/wiki/Index:Field_Notes_of_Junius_Henderson,_Notebook_1.djvu">created an Index page</a> (following instructions on <a href="http://en.wikisource.org/wiki/Help:Beginner's_guide_to_Index:_files">the Beginner&#8217;s Guide to Index: files</a> and <a href="http://en.wikisource.org/wiki/Help:Proofread">the Introduction to Proofreading</a> on WikiSource). The Index maps pages from the scanned DjVu file to pages on WikiSource. Click on a yellow-coloured page number to proofread or edit an existing page (for example, <a href="http://en.wikisource.org/wiki/Page:Field_Notes_of_Junius_Henderson,_Notebook_1.djvu/5">page 3</a>), or on a red-coloured page to transcribe it.   Transcription itself is dead easy: the page image is displayed on the right, and a textbox (which accepts all MediaWiki syntax) is displayed on the left.  In our case, since we have the transcriptions “done”, it was mostly cutting and pasting sections of Peter’s transcribed text so that it aligned with Henderson’s scrawl on the scanned pages.</p>
<p>So yay, successes!!  The fruits of a week’s worth of work are available on <a href="http://en.wikisource.org/wiki/Field_Notes_of_Junius_Henderson/Notebook_1">the &#8220;Notebook 1&#8221; page</a> on WikiSource, where &#8212; using <a href="http://en.wikisource.org/wiki/Help:Proofread#Transclusion">WikiSource&#8217;s &lt;pages /&gt; command</a> &#8212; Gaurav transcluded the transcribed pages from the Index into a single continuous document.  Numbers along the left margin of the main page link back to the corresponding page from the Index, making it easy to verify or fix transcription errors.  Also, Gaurav compiled pages from the Index into sections representing field trips (just as Henderson did in his notes), and listed them in a “Contents” box at the top of the page.</p>
<p>Henderson’s field notes continue to be, first and foremost, a good read. “Notebook 1” features details from Henderson’s week-long trips to <a href="http://en.wikisource.org/wiki/Field_Notes_of_Junius_Henderson/Notebook_1#Florissant_trip">Florissant, Colorado</a> (August 1905) and <a href="http://en.wikisource.org/wiki/Field_Notes_of_Junius_Henderson/Notebook_1#Silver_Lake_Arapahoe_trip">Silver Lake Arapahoe</a> (September 1905). He keeps record of everything from the stamina of his comrades:</p>
<p style="padding-left:30px;"><em>“The party showed fatigue in the following order: Sievert least, I next, then Watts, Then Markman, then Frank.”</em> (<a href="http://en.wikisource.org/wiki/Page%3AField_Notes_of_Junius_Henderson%2C_Notebook_1.djvu/30">August 30</a>, 1905)</p>
<p>to train delays and opportunities for rumination:</p>
<p style="padding-left:30px;"><em>“Train again so late as to afford ample opportunity for philosophic meditation upon the motives which inspire railroad people to advertise time which they do not expect to make except under rare circumstances.”</em> (<a href="http://en.wikisource.org/wiki/Page:Field_Notes_of_Junius_Henderson,_Notebook_1.djvu/39">September 3</a>, 1905)</p>
<p>What next?  Our sense of what we want to do and what is possible is rapidly evolving.  Simply having the scanned field notebook pages completely changed our game plan.  Before Wednesday of this week, we just had transcriptions.  Now we have the <a href="http://holdmyticket.com/flyers2/d3c9f0a62c022743dc34c0523fda88d6.jpg">whole enchilada</a>.  What we currently want is a no-cost, minimal effort system that will make scans AND transcriptions AND annotations available, and that can facilitate text mining of the transcriptions.  Do we have that in WikiSource?  We will see.  More on annotations to follow in our next post but some <a href="http://en.wikipedia.org/wiki/Father_to_a_Sister_of_Thought">father to a sister of some thoughts</a> are already percolating and <a href="http://en.wikisource.org/wiki/Page%3AField_Notes_of_Junius_Henderson%2C_Notebook_1.djvu/7">we have even implemented some rudimentary examples</a>.</p>
<p>We’d like to encourage you to try your hand at <a href="http://en.wikisource.org/wiki/Index:Field_Notes_of_Junius_Henderson,_Notebook_1.djvu">transcribing</a> or <a href="http://en.wikisource.org/wiki/Field_Notes_of_Junius_Henderson/Notebook_1">annotating</a> this notebook along with us, and to let us know what you think about the process (reminder: Henderson’s first field notebook is still available <a href="http://goo.gl/5UxYt">as plain text</a> or <a href="https://docs.google.com/open?id=0B_srI9Pi83gYZDJhZWI3MWEtMzdkMy00ZjMyLWExMjAtNDY4Y2RkOGEwYWFh">as a Word document</a>).  As on Wikipedia, all edits are saved, so you can’t really mess up &#8211; <a href="http://en.wikipedia.org/wiki/WP:BOLD">be bold</a>, jump in (!) and tell us what you think.</p>
]]></content:encoded>
			<wfw:commentRss>http://soyouthinkyoucandigitize.wordpress.com/2011/12/05/field-note-challenge-part-2-veni-vidi-wiki/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c8080798e336baca30da6f14204f848f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">akthom</media:title>
		</media:content>

		<media:content url="http://upload.wikimedia.org/wikipedia/commons/thumb/7/7e/Field_Notes_of_Junius_Henderson%2C_Notebook_1.djvu/page110-500px-Field_Notes_of_Junius_Henderson%2C_Notebook_1.djvu.jpg" medium="image">
			<media:title type="html">From the field notes of Junius Henderson, Notebook 1</media:title>
		</media:content>
	</item>
		<item>
		<title>An Ode to Founders and a Field Notes Challenge: Part 1</title>
		<link>http://soyouthinkyoucandigitize.wordpress.com/2011/11/28/an-ode-to-founders-and-a-field-notes-challenge-part-1/</link>
		<comments>http://soyouthinkyoucandigitize.wordpress.com/2011/11/28/an-ode-to-founders-and-a-field-notes-challenge-part-1/#comments</comments>
		<pubDate>Mon, 28 Nov 2011 22:39:54 +0000</pubDate>
		<dc:creator>Rob</dc:creator>
				<category><![CDATA[Henderson Project]]></category>

		<guid isPermaLink="false">http://soyouthinkyoucandigitize.wordpress.com/?p=200</guid>
		<description><![CDATA[Junius Henderson was the founder and first curator of the University of Colorado (CU) Museum of Natural History where Rob works.  Because Rob is the Invertebrate Curator of Zoology, and his training is in malacology (not “bad ecology” or “evil” &#8230; <a href="http://soyouthinkyoucandigitize.wordpress.com/2011/11/28/an-ode-to-founders-and-a-field-notes-challenge-part-1/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.archive.org/stream/nautilus51amer#page/96/mode/2up">Junius Henderson</a> was the <a href="http://cumuseum.colorado.edu/About/history.html">founder and first curator</a> of the <a href="http://cumuseum.colorado.edu/">University of Colorado (CU) Museum of Natural History</a> where Rob works.  Because Rob is the Invertebrate Curator of Zoology, and his training is in malacology (not “bad ecology” or “evil” but the study of molluscs such as squids, clams, snails), he has always been pleased that he can trace a direct taxonomic line back to Henderson, who was first and foremost one of the great descriptive malacologists working in western North America.  One hundred years later, brick and mortar testaments to CU’s Founders remain throughout campus: the <a href="http://cumuseum.colorado.edu/Visit/directions.html">Henderson</a> building, where the CU Museum of Natural History exhibits are housed, and the <a href="http://www.colorado.edu/campusmap/map.html?bldg=RAMY">Ramaley</a> building (named after compatriot <a href="http://www.sciencemag.org/content/96/2483/102.extract.jpg">Francis Ramaley</a>), home to the majority of the ecology and evolutionary biology department.</p>
<p>Junius Henderson kept copious field notes describing his many collecting trips; these were compiled into eleven volumes, and are archived in the museum.  The notes start in 1905 with this entry:</p>
<p style="padding-left:30px;" dir="ltr"><em>“Boulder, Colorado. July 28, 1905. Saw Say Phoebe and Siskins, Robin, Flicker.”</em></p>
<p>Another very early entry reads,</p>
<p style="padding-left:30px;" dir="ltr"><em>“Expenses Florissant trip, 2 tickets to Denver Dr. Ramaley and I &#8212;-$2.00. Saw a Kingbird and Robin on way to depot&#8230;  Went to City Park and heard band and <strong>saw moving pictures including ‘Stage Robbery’ which, to say the least, was not an elevating spectacle, nor helpful to venturesome boys, apt to be carried away with the wildness of such a life.”</strong></em> [emphasis added for the benefit of any venturesome readers]</p>
<p>Twenty-two years and ten notebooks later, here is one of the last entries:</p>
<p style="padding-left:30px;" dir="ltr"><em>“Virginia Dale, Colo., Wednesday, June 15, 1927. Cloudy, foggy, rainy, cold morning, with a strong northwest wind. Started at 8 a.m. At edge of Laramie basin, speedometer 9728; Laramie 9747; Rock River 9787, at noon for lunch: Medicine Bow 9804; Ft. Steele 9848, about (speedometer slipped off just before reaching there); Rawlins 9864. Roads mostly gravelled and good; but in some places clay, and soft and slippery. Cleared about middle of afternoon and warmer this evening in camp at Rawlins.”</em></p>
<p>Fast forward another century (give or take): shortly after Rob’s arrival at CU in 2000, the now retired Curator of Paleontology, <a href="http://cumuseum.colorado.edu/About/newsdetail.php?newsID=3">Peter Robinson</a>, mentioned he had personally transcribed ALL ELEVEN VOLUMES and saved each notebook as a separate Word document.  This is a best-case scenario for transcription in many ways; Peter is an expert with deep experience in natural history and paleontology, so his transcriptions of esoteric species names and locations are likely as accurate as they could possibly be.  While there are no scans of Henderson’s notes (yet), Peter did add some annotations (always using double parentheses) such as, <em>“((at some later date Henderson wrote an emphatic ‘NO.’ at this place in the notebook))”</em> to let readers know where they should refer back to the original notes.  One disappointment is that Peter often added the annotation <em>“((Drawing in field book))”</em>, marking a drawing that one cannot (yet) view.</p>
<p>Rob has made use of these notes in his research at CU; in 2003, he headed out on a summer-long collecting expedition as part of a State of Colorado survey of molluscs and crayfish in Western Colorado. Henderson’s field notes provided invaluable context and information about past collecting trips.  Henderson&#8217;s notes aren’t just part of the scientific record, however; they’re also a vivid image of the American West in a moment of swift change, as his modes of transportation transition from stagecoach to trains to automobiles, and his travels take him along new routes and through new towns and cities.  In our <a href="http://soyouthinkyoucandigitize.wordpress.com/2011/11/16/where-do-the-digital-humanities-and-escience-intersect/">last post</a> we talked about how we can best do work at the intersection of the sciences and the humanities; rich corpora of field notes like Henderson’s are exactly the media that tie these seemingly disparate disciplines together.</p>
<p>So why are we telling you all this?  Because we think that:</p>
<p>a) Henderson’s meticulousness and Peter Robinson’s hard work provide a remarkable resource that should be publicly available, and;<br />
b) we’ve talked a lot about how to digitize, what to digitize, why to digitize,<em> but we haven’t done quite as much work discussing what to do once you’ve digitized</em>.  In other words, say you’ve transcribed 1000 pages of field notes.  Now what?</p>
<p>So over the last week we’ve been working on just this question of &#8220;Now What&#8221; using  Henderson&#8217;s field notes, with the following goals and caveats for this project:<br />
1) We want to make the notes publicly accessible, easily discoverable, and preferably bundled with appropriate descriptive, structural and preservation metadata;<br />
2) We want to do so using the least restrictive licensing available (and we appreciate the support and encouragement of CU Museum Director <a href="http://cumuseum.colorado.edu/About/Bios/kociolek.html">Patrick Kociolek</a> and Peter Robinson to do so);<br />
3) We want to make use of some of the automated data extraction tools we’ve stumbled across over the last couple of months to do things like link names of taxa, places, people and dates to other sources of biodiversity knowledge;<br />
4) We want to produce at least one Nifty Thing as a result of this project &#8212; like a map on Google Earth showing Henderson’s travels;<br />
5) <strong>We don’t want to spend more than five hours each on this.</strong>  This is because we’re both super busy, and we also like the idea of figuring out what substantial products can be produced on a budget of no money and close-to-no time.</p>
<p>Rob’s student <a href="http://www.ggvaidya.com/">Gaurav Vaidya</a> has also been working on this project with us, focusing on possible Wikipedia-oriented solutions, and we’re all nearing/exceeding the end of our respective five-hour allotments (even when excluding time spent looking up movies from the 1900s and pictures of <a href="http://www.bird-friends.com/pics/SaysPhoebe/SaysPhoebe4LR.jpg">Say&#8217;s Phoebe</a>).   In the interim, here (in <a href="http://goo.gl/5UxYt">text</a> and <a href="https://docs.google.com/open?id=0B_srI9Pi83gYZDJhZWI3MWEtMzdkMy00ZjMyLWExMjAtNDY4Y2RkOGEwYWFh">Word</a> formats) is the first notebook of Henderson’s for your perusal and to get you thinking and doing. In posts that follow, we will report some of our next steps with the full corpus along with releasing the other notebooks.  More soon!</p>
]]></content:encoded>
			<wfw:commentRss>http://soyouthinkyoucandigitize.wordpress.com/2011/11/28/an-ode-to-founders-and-a-field-notes-challenge-part-1/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/b67f826745311eced80f5f0da70b89b6?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">robgur</media:title>
		</media:content>
	</item>
		<item>
		<title>Where do the digital humanities and eScience intersect? &#8212; Crosspost with VertNet</title>
		<link>http://soyouthinkyoucandigitize.wordpress.com/2011/11/16/where-do-the-digital-humanities-and-escience-intersect/</link>
		<comments>http://soyouthinkyoucandigitize.wordpress.com/2011/11/16/where-do-the-digital-humanities-and-escience-intersect/#comments</comments>
		<pubDate>Wed, 16 Nov 2011 14:30:13 +0000</pubDate>
		<dc:creator>Andrea</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://soyouthinkyoucandigitize.wordpress.com/?p=182</guid>
		<description><![CDATA[This special post was co-written with David Bloom, VertNet Coordinator and crossposted (with some minor mods) at the Vertnet Blog.   First and foremost, digitization of natural history collections and tools to make these digitized records available, such as VertNet, support &#8230; <a href="http://soyouthinkyoucandigitize.wordpress.com/2011/11/16/where-do-the-digital-humanities-and-escience-intersect/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=soyouthinkyoucandigitize.wordpress.com&#038;blog=22038354&#038;post=182&#038;subd=soyouthinkyoucandigitize&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><em>This special post was co-written with <strong>David Bloom</strong>, <a href="http://vertnet.org/index.php">VertNet</a> Coordinator, and crossposted (with some minor mods) at the <a href="http://blog.vertnet.org/">VertNet Blog</a>.  </em></p>
<div class="wp-caption aligncenter" style="width: 607px"><a href="http://www.loc.gov/exhibits/treasures/"><img title="How is a range map like a writing desk?" src="http://www.loc.gov/exhibits/treasures/images/1831s.jpg" alt="" width="597" height="450" /></a><p class="wp-caption-text"><a href="http://www.loc.gov/exhibits/treasures/images/1831s.jpg" rel="nofollow">http://www.loc.gov/exhibits/treasures/images/1831s.jpg</a></p></div>
<p>First and foremost, digitization of natural history collections and tools to make these digitized records available, such as <a href="http://vertnet.org/index.php">VertNet</a>, support global biodiversity research.  We suspect that the majority of use of digitized records will be to generate products such as species distribution models and change assessments, and to answer questions about what is in any given museum collection.  However, in the broader context of academic endeavor, these data could also serve as a unique link between the digital sciences and the<a href="http://en.wikipedia.org/wiki/Digital_humanities"> digital humanities</a>.  Work in the digital humanities includes everything from<a href="http://www.ucl.ac.uk/transcribe-bentham/"> crowdsourcing manuscript transcription</a> to<a href="http://williamjturkel.net/fabrication/"> humanistic fabrication</a> to<a href="http://portal.tapor.ca/portal/portal"> data mining</a> &#8212; work that is not so dissimilar in method, description, or data type from that in the digital sciences.</p>
<p>Biological collections aren’t the only organizations engaged in massive digitization efforts; libraries and archives have been digitizing and making their materials discoverable and interoperable for decades as well.  As a result of these efforts, an unprecedented number of research materials from a wide range of domains are now available for free on the Web.  What VertNet does for biodiversity data, the<a href="http://imlsdcc.grainger.uiuc.edu/"> University of Illinois’ Digital Collections and Content</a> project does for cultural heritage records and the<a href="http://trove.nla.gov.au/"> National Library of Australia’s Trove</a> does for newspapers, articles, and music.  The <a href="http://www.hathitrust.org/">HathiTrust</a> makes more than 9 million books available &#8212; and the list goes on.  Digitization allows these materials to be recombined and analyzed quickly and (relatively) easily in new ways.</p>
<p>Our question is a simple one:  Where do the digital humanities and e-science overlap and interconnect?  One method of digital investigation that caught our attention is the<a href="http://etcpanel.princeton.edu/blog/blog/2011/04/08/mapping-novels-with-google-earth/"> mapping of novels</a> and<a href="http://mattwilkens.com/2011/03/28/maps-of-american-fiction/"> other historic texts</a>; researchers take prose text and mine it for mappable units.  Erin Sells and her students, for instance, have used this method to create dynamic maps of Virginia Woolf’s<a href="http://chronicle.com/blogs/profhacker/mapping-novels/32528"> Mrs. Dalloway</a>, which incorporate “pictures, sounds, videos, and the text itself into the map.”  Similarly, in the<a href="http://googleancientplaces.wordpress.com/"> Google Ancient Places</a> project, researchers mine archaeological and historical texts to create databases of georeferenced ancient locales which can then be mapped.  Though these researchers are working with novels, they’re producing data in formats similar to those used for species occurrence records in databases such as VertNet.</p>
<p>This made us think: what sorts of questions could we ask of a data set composed of all kinds of georeferences &#8212; not just species occurrence records, but locations from history or works of fiction as well?  If students of the humanities can create maps with such texture using similarly organized data sets, could they build on this richness by including analysis of the natural world as it existed at the time described in the novel?  Perhaps searching on the VertNet portal (or<a href="http://www.gbif.org/"> GBIF</a> or<a href="http://www.ala.org.au/"> ALA</a>) could provide a detailed list of vertebrate species and, with a little more work, the associated ranges of these species.  Suddenly, the map of Mrs. Dalloway’s world, and the atmosphere of Clarissa’s party, can be enriched not only with human influence and creation, but by the natural environment, too.  Conversely, data from diaries or other digitized sources could be mined for data about distributions of now-extinct species.  Could these data be used as observations and published as records along with those from natural history collections?</p>
<div class="wp-caption alignleft" style="width: 297px"><img title="Meriwether Lewis Can't Draw Birds" src="http://www.smithsonianeducation.org/images/educators/lesson_plan/lewis_and_clark/si_ci_bird_lg.jpg" alt="" width="287" height="440" /><p class="wp-caption-text">From Lewis and Clark&#039;s journals - <a href="http://www.smithsonianeducation.org/images/educators/lesson_plan/lewis_and_clark/si_ci_bird_lg.jpg" rel="nofollow">http://www.smithsonianeducation.org/images/educators/lesson_plan/lewis_and_clark/si_ci_bird_lg.jpg</a></p></div>
<p>We hope that VertNet will support interdisciplinary research in the sciences and the humanities by providing new avenues for deeper readings, and new ways to reconstruct real and imagined worlds.  Where are the specimens that Lewis and Clark found on their expeditions and how do those link up with their journals (<a href="http://lewisandclarkjournals.unl.edu/">online already!!</a>)?  What about whale species described by Melville?   How accurate are James Fenimore Cooper’s depictions of the animals Hawkeye and Cora encountered as they traveled through the Great Lakes?  What does this accuracy or inaccuracy tell you about Cooper as an author?  What about Thoreau’s notebooks of life at Walden Pond, and how have this iconic landscape and its animals and plants <a href="http://www.npr.org/templates/story/story.php?storyId=96206248%29">changed since his stay</a>?</p>
<p>We also hope that other folks have more ideas about what new combinations of data and domains of inquiry are possible now that so many different sources of knowledge have been digitized.  How can eScience support and enrich the digital humanities and vice-versa? What happens when <a href="http://gigapan.org/gigapans/71531/">images of specimens</a>* mix with <a href="http://www.flickr.com/photos/biodivlibrary/collections/">drawings from the literature</a>? Point-radius georeferences, for example, are easy enough to pull together from different sources &#8212; what further visualizations could be created with the combination of journals, books, and catalog ledgers?  What further ways can we use data and smarts to bridge gaps between the sciences and the humanities?</p>
<p>SYTYCD is offering the inaugural Thinky People’s Digitization Challenge (THIPDIC).   This first THIPDIC will go to the person or people who provide our favorite comment showing how digital science and the digital humanities intersect.  Any cool examples?  Any deeper thoughts about how this happens?  <a href="http://worldwidewhiskers.files.wordpress.com/2009/05/cat-reading-harry-potter.jpg">Any cute pictures of animals reading books</a>?  Winners will be celebrated the world over and will be eligible for a (modest) prize, offered by Rob (don&#8217;t worry, it&#8217;ll be something interesting and of actual value).  You may now talk amongst yourselves.</p>
<p>* gigapan snakes in jar!</p>
]]></content:encoded>
			<wfw:commentRss>http://soyouthinkyoucandigitize.wordpress.com/2011/11/16/where-do-the-digital-humanities-and-escience-intersect/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c8080798e336baca30da6f14204f848f?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">akthom</media:title>
		</media:content>

		<media:content url="http://www.loc.gov/exhibits/treasures/images/1831s.jpg" medium="image">
			<media:title type="html">How is a range map like a writing desk?</media:title>
		</media:content>

		<media:content url="http://www.smithsonianeducation.org/images/educators/lesson_plan/lewis_and_clark/si_ci_bird_lg.jpg" medium="image">
			<media:title type="html">Meriwether Lewis Can&#039;t Draw Birds</media:title>
		</media:content>
	</item>
		<item>
		<title>Zombies versus Unicorns at TDWG (or, a recap of citizen science talks)</title>
		<link>http://soyouthinkyoucandigitize.wordpress.com/2011/10/31/zombies-versus-unicorns-at-tdwg-or-a-recap-of-citizen-science-talks/</link>
		<comments>http://soyouthinkyoucandigitize.wordpress.com/2011/10/31/zombies-versus-unicorns-at-tdwg-or-a-recap-of-citizen-science-talks/#comments</comments>
		<pubDate>Mon, 31 Oct 2011 17:08:31 +0000</pubDate>
		<dc:creator>Rob</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://soyouthinkyoucandigitize.wordpress.com/?p=156</guid>
		<description><![CDATA[So You Think You Can Digitize was in the Big Easy for TDWG 2011 last week. Summarizing the whole meeting is best left for friends Nico and Gaurav, who have longer attention spans than us. Nor should you miss the &#8230; <a href="http://soyouthinkyoucandigitize.wordpress.com/2011/10/31/zombies-versus-unicorns-at-tdwg-or-a-recap-of-citizen-science-talks/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=soyouthinkyoucandigitize.wordpress.com&#038;blog=22038354&#038;post=156&#038;subd=soyouthinkyoucandigitize&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>So You Think You Can Digitize was in the Big Easy for <a href="http://www.tdwg.org/">TDWG</a> 2011 last week. Summarizing the whole meeting is best left for friends <a href="http://cellinese.blogspot.com/2011/10/tdwg-2011-my-post-partum-rant.html">Nico</a> and <a href="http://www.ggvaidya.com/">Gaurav</a>, who have longer attention spans than us. Nor should you miss the <a href="http://vimeo.com/30910279">Unicorn Magic</a> from friends at <a href="http://blog.vertnet.org/">VertNet</a>. Instead, we&#8217;ll focus our efforts on a set of talks in the citizen science session.</p>
<p><strong>Batting first: Enlisting the Use of Educated Volunteers at a Distance: Or, Why Crowdsourcing and Citizen Science Will NOT Create Nightmare Zombies That Will Destroy Us All.</strong></p>
<p>Presented by us! Slides are <a href="http://www.tdwg.org/fileadmin/2011conference/slides/Thomer-Guralnick-Zombies-Crowdsourcing.ppt">here</a>. This talk developed organically out of the last few SYTYCD posts, but also gave us an opportunity to push a bit further on some trickier concepts we’ve been cogitating on for the last few months.  Particularly:</p>
<p>1)  We presented some neat (and we think relevant) education literature that shows that knowledge may be constructed more quickly through peer discussion in the classroom. We argued that volunteers communicating and using existing resources to vet records is analogous to students talking to their neighbors in the classroom. What do you think? Discuss!</p>
<p>2)  We also argued that the creation of these large crowdsourcing interfaces and applications (e.g. Old Weather, Atlas of Living Australia) necessarily forces “<em>articulation work</em>” &#8212; that is, the work of explaining what one group of people wants done by another group of people (e.g. curators by web developers, collections managers by volunteers).  A fundamental concern of citizen science is how best to connect the people collecting or annotating data back to the scientists who use them.  Using web applications to facilitate this connection forces both the citizen scientists and the experts to understand the data and encode that understanding into those apps.  For a standards group like TDWG, this act of encoding is particularly important to consider and understand; we need to remember that standards aren’t just ways of passively creating databases with consistent field names, but are means of facilitating communication and a shared sense of mission between people as well.</p>
<p>Notes: We might still have some work articulating articulation work.  Also, our best intentions to collect data on how easily people can use existing web resources to more effectively digitize foundered on the rocks of too little time to get through some, uh, minor logistics issues (in particular, IRB Human Subject approvals &#8211; facepalm).  However, we still hope to do this in the future.</p>
<p><strong>Batting in the 2-Spot:  Crowd sourcing record transcription to unlock historical species data from natural history collections.</strong></p>
<p><a href="http://vizzuality.com/team/andrewhill">Andrew Hill</a>, <a href="http://vizzuality.com/">Vizzuality</a> wunderkind and semi-erstwhile PhD student at CU Boulder with Rob, discussed Vizzuality’s rapid development of citizen science projects like “Old Weather” and a new one for NASA called <a href="https://neemo.zooniverse.org/">NEEMO</a>. Andrew showed that citizen scientists work together in the spirit of both cooperation and competition by relating how he and company owner <a href="http://vizzuality.com/team/jatorre">Javi De La Torre</a> kept vying for the top scoring spot in NEEMO &#8212; only to be blown away by a NASA employee who was also working/playing. It is an interesting line, at least from our perspective, where elements of competition and collaboration can both be optimized in developing citizen science applications. We here at SYTYCD have tended to focus on cooperation and narrative &#8212; not on game-ification and competition &#8212; but maybe there is a middle ground that yields the best of both worlds, and maybe the broadest appeal. Perhaps competition works better for some demographics and cooperation for others. Andrew also announced that Vizzuality is likely going to be involved, in some capacity, in developing a citizen science project for natural history transcription. We love this plan and can’t wait to hear more.</p>
<p><strong>Batting Third:  Crowd-sourcing: perpetual valuable resource or a passing shower of dubious worth?</strong></p>
<p><a href="http://australianmuseum.net.au/staff/paul-flemons">Paul Flemons</a>, who Rob thinks looks just a teensy bit like Samuel Vimes (<a href="http://en.wikipedia.org/wiki/Sam_Vimes">famous fictional cop</a>), presented his work with the ALA’s “Australian Museum Cicada Expedition” while deftly weaving in musings about the long-term value of crowdsourcing as a digitization tool.  One thing we particularly liked seeing was a frequency plot showing the <a href="http://en.wikipedia.org/wiki/Long_Tail">“long tail”</a> of transcription efforts.  That is, most volunteers who drop by the site will only transcribe one or two records; however, there are a few extraordinarily dedicated folks who will transcribe much larger numbers &#8212; hundreds or thousands of records.  Why?  Well, this gets to incentives &#8212; really, all the talks in the session ultimately touched on this essential topic.  Is it possible to build a citizen science tool that shifts that long tail to be shorter and stouter so that more people are willing to transcribe more records?   Paul ended his talk saying he wasn’t entirely sure about the future of crowdsourced transcription for natural history collections &#8212; he is still not sure that we have the critical mass of volunteers needed to transcribe EVERYTHING, or that the links between the volunteer work and science are always fully exposed.</p>
<p>After seeing all the talks and the excellent demonstrations by Beth Mantle, Katja Schulz, and Tony Kirchgessner, we are more optimistic than Paul. One reason for optimism: we overheard comments like “Wow!  This session was amazingly well attended” and, to paraphrase, “this might actually work.” So, yeah, TDWG was indeed great, even if one of us who isn’t Rob did get suckered into Co-Chairing the Citizen Science Interest Group.  And yes, we do indeed still think we can digitize.</p>
<p>Speaking of digitization, we have been following the crowd-sourcing thread for a long time now, and our next posts may swing back around to other topics of interest in the broader realm of natural history digitization.  With the ramping up of <a href="http://idigbio.org/content/thematic-collections-networks">Thematic Collections Networks</a> and the <a href="http://idigbio.org/">iDigBio HUB</a>, the hard work of digitizing and the even harder work of innovating are just getting started&#8230;.</p>
]]></content:encoded>
			<wfw:commentRss>http://soyouthinkyoucandigitize.wordpress.com/2011/10/31/zombies-versus-unicorns-at-tdwg-or-a-recap-of-citizen-science-talks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/b67f826745311eced80f5f0da70b89b6?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">robgur</media:title>
		</media:content>
	</item>
	</channel>
</rss>
