One of the reasons we think we can digitize is that so many people are already digitizing in such a successful manner! Here’s a (hopefully oft-updated) list of some of the projects we’ve found, separated into four categories:
Natural History Projects
Digital Humanities Projects
Field Note Digitization Projects
Digitization Tools
If you know of a project that’s not listed here, PLEASE let us know, either by email or in comments!
Natural History Projects:
Atlas of Living Australia – Australian Museum Cicada Expedition
Project Summary: “Join the Cicada Transcription Expedition Team and help us capture information from cicada specimens up to 100 years old.” ALA uses a multi-step transcription process for Cicada specimen labels; volunteers first type labels as they appear, verbatim, then atomize data points into separate fields, and then georeference any available locality information if possible. Georeferencing is facilitated by a Google Maps plugin.
Our thoughts: ALA has created a sleek and simple interface, and seems to rely primarily on volunteers’ interest in helping with transcriptions rather than on excessive gamification of the task at hand. One concern: there are many fields to complete plus verbatim text, which means high quality data but some amount of effort to claim transcription “victory.”
Botanical Society of the British Isles - Herbaria@home
Summary: “The UK has the world’s largest and oldest collections of herbarium specimens held in trust by museums and universities. As a record of plant biodiversity this resource is unparalleled and could be vital for future studies of taxonomy, ecology, conservation and genetic biodiversity.” Simple registration. Herbaria sheets are allocated to users, and volunteers are asked to look for specific, atomized fields of data. Separate tables/look up lists are maintained of common names, locations, and species to prevent unnecessary duplication of effort.
Our thoughts: Although there were some problems with image viewing in Chrome, it worked in other browsers. Particularly great are the pull down menu choices that narrow the choices for you based on input. The pull downs helped to quickly make progress and to even validate entries (e.g. OH! Croydon does make sense!)
Global Plants Initiative – Resources for the Digitisation of Herbarium Specimens
Summary: “An international collaboration aiming to digitise and make available plant type specimens, together with other botanical resources, for scholarly purposes. The GPI network of content providers currently includes more than 166 partner herbaria representing over 57 countries.” Interestingly, it’s not herbaria sheets they’re crowdsourcing – it’s signatures and initials of botanists.
Our thoughts: Really neat application, but the explanation of what the site is all about could be clearer.
USGS Patuxent – North American Bird Phenology Program
Summary: Per commenter Sally Shelton, the “USGS Patuxent crew is running what appears to be a very successful crowdsourcing program for digitizing years’ worth of bird observations mailed in from all over the country on postcards for the first half of the 20th century.” Unfortunately we haven’t been able to find any further information on this – additional links would be welcome! Thanks to Jim for the link! We also found this which makes us smile.
Digital Humanities Projects:
University of Iowa Library – Civil War Diaries Transcription Project
Summary: Simple, straightforward call to aid in transcription of civil war diaries for inclusion in the university’s Digital Collection. Volunteers do not need to register, and progress bars are shown for individual diaries.
Our thoughts: We’d like to know more on rates of use for this project. Are they successfully recruiting volunteers with a relatively low level of user personalization? Zooming to help see individual words would be a nice additional function.
National Library of Australia – TROVE
Summary: “Digitized newspapers and more.” Crowdsourced correction of OCR’d newspaper articles. Users log in, search for articles, and are given the opportunity to correct mistakes in the OCR.
Our thoughts: This site has a great interface that helps users track and manage a lot of text both on scanned images and from initial OCR. The site is also a great marriage of users’ need for materials and for those materials to be fit for use. Takes advantage of users’ natural workflow: they search for documents and then are compelled to improve their own dataset for others’ future use. Clever! How could this be applied to natural history digitization?
New York Public Library – What’s on the Menu?
Summary: Help “…improve a unique collection! We’re transcribing our historical restaurant menus, dish by dish, so that they can be searched by what people were eating back in the day.”
Our Thoughts: Neat project! It is, though, a bit of a challenge to reference between transcribed parts of the menu and the scanned image, and the green “checkmarks” can obscure parts of the image. However, the first menu examined had “omelettes with ham, parsley, jelly or rum”. Rum? Really?
Field Note Digitization Projects:
This is of particular interest to both of us. Although none of the projects below are crowdsourced, we are still interested in field notes in general as perhaps being perfect for future crowdsourcing projects.
Cal Academy in conjunction with several other institutions - Connecting Content (see also this blog post)
Summary: “The project involves the digitization of field notebooks and natural history collections and the generation of metadata for these items. Six of the seven institutions will conduct pilot projects of varying scope and size. We will then develop the means to map and link these collections to one another and to published material in the Biodiversity Heritage Library.”
Smithsonian – The Field Book Project
Summary: Per PI Rusty Russell’s SPNHC 2011 talk, this is “A joint initiative between the National Museum of Natural History and the Smithsonian Institution Archives” whose “overall mission is to create one online location for field book content,” and “this process will begin as a Smithsonian-wide initiative and lay the foundation for an online Field Book Registry comprised of content contributed by the entire community.”
Missouri Botanical Garden - Digitizing Engelmann’s Legacy (see also this blog post)
Summary: This project aims to digitize the George Engelmann collection. These 8,000+ specimens were “gathered during pioneering expeditions into the American West following those of Lewis and Clark are the first scientific record of the plants growing in the vast wilderness west of the Mississippi River.” Once digitized, these records will be accessible via Tropicos, Botanicus, and customized web interfaces.
Digitization Tools:
In our earlier posts, we cogitated (in a carefree manner) about which steps in digitization workflows might apply to the widest range of natural history collections. Tools that help connect producers of images with citizen scientists, and that can be re-used across projects, strike me (us?) as a great example of a step that could “scale”. Here are some tools that show promise.
Arctos
Summary: “Arctos is an ongoing effort to integrate access to specimen data, collection-management tools, and external resources on the internet. Nearly all that is known about a specimen can be included in Arctos, and, except for some data encumbered for proprietary reasons, data are open to the public.”
Oh No Robot
Summary: A brilliant little widget intended for webcomics that allows users to transcribe comics as they are published.
Our thoughts: Great example of tapping into a dedicated “user” base to help with digitization!
FromThePage
Summary: “FromThePage is software that allows volunteers to transcribe handwritten documents online. Currently it hosts the Julia Brumfield Diaries, an incomplete collection of diaries written between 1915 and 1938 chronicling life on a tobacco farm in Pittsylvania County, Virginia. The FromThePage software is still under development, but we’d like to invite people to look around and send suggestions and bug reports to benwbrum@gmail.com. If anything looks broken, hard to understand, or just odd, please let us know! For a behind-the-scenes look at the development effort, check out the product development blog.”
Scripto
Summary: “a lightweight, open source tool that allows users to contribute transcriptions to online documentary projects… You provide the CMS and GUI; Scripto provides the engine for crowdsourcing the transcription of your content.” (Please see their blog post “Why Crowdsourcing? Why Scripto?” for a more thorough description: http://scripto.org/?p=77)
accessTEI
Summary: A vendor that will encode docs into TEI (Text Encoding Initiative; http://www.tei-c.org/index.xml).
Our thoughts: Not so useful for we-who-do-not-use-TEI, but a good idea to keep in mind.
Thanks to all who contributed links and project leads!

The USGS Patuxent crowdsourcing project is the North American Bird Phenology Program
http://www.pwrc.usgs.gov/bpp/
Props go to stellw@Maine, who has transcribed over 90,000 records!
JIM! You’re awesome. Thanks.
The San Diego Museum of Natural History is using FromThePage to digitize the field notes of herpetologist Laurence Klauber at http://fromthepage.bpoc.org
While they are not the same sort of projects as those listed above, I would like to add the other Zooniverse projects – Planet Hunters, Ancient Lives (transcribing Greek Papyri), Solar Stormwatch, etc. Finding a new planet would be one of the coolest things, I think, one could do. I, though, love Old Weather. The connection among transcribers, and between transcribers and the ships, is still both amazing and unexpected to me.
From the Claremont Colleges Digital Library: The Larry Oglesby Collection consists of (5,355 and counting) 35 mm color slides of flora and fauna taken by Professor Larry Oglesby, Professor Emeritus of Biology at Pomona College. Taken in the field, the slides depict plants such as Sky Lupine or Field Mustard and animals such as the Pacific Pond Turtle and Killdeer. The photographs were taken primarily in California and Oregon but photographs of the Shenandoah National Park in Virginia and other locations may also be found. Each item within the collection displays the color slide and Oglesby’s notations on the slide mount.