At the S2I2 workshop, we saw some great presentations on some awesome digitization projects and products that we think everyone in the world should know about (yes, EVERYONE in the WHOLE WORLD). An abridged list with short descriptions follows.
FilteredPush – http://etaxonomy.org/mw/FilteredPush
Current data sharing and publishing systems tend to be unidirectional in the collections community. Data goes out, but nothing comes back in — including corrections. Very inefficient! 😦 FilteredPush is a project aiming to solve that impediment. 🙂 See also: BiSciCol (http://biscicol.blogspot.com/), VertNet (http://vertnet.org).
Apiary – http://www.apiaryproject.org/
Apiary is a project to speed up digitization of herbarium specimens (but could likely be useful to other types of collections!). The simple and elegant idea behind Apiary is that an increase in speed can happen if digitizers first photograph the herbarium sheet, save the image, and then annotate and extract sections of the sheet containing label data in a later stage. This staging solution should lead to increases in digitization rates, which have been too slow. See also: HERBIS (http://www.herbis.org/), Paris Herbarium (below).
GeoLocate – http://www.museum.tulane.edu/geolocate/
GeoLocate is a software suite for georeferencing legacy records. It takes as input a text string describing a location on the planet (e.g. 6 miles southwest of Urbana-Champaign) and converts that description into a geospatial coordinate (e.g. 40.063905,-88.294979) (e.g. purveyors of delicious apple donuts). GeoLocate is developing the means to make the whole georeferencing process more collaborative (CoGE – http://www.museum.tulane.edu/coge/). See also: Biogeomancer and Best Practices in Georeferencing (http://www2.gbif.org/BioGeomancerGuide.pdf).
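To make the idea concrete, here is a minimal sketch of the simplest kind of georeferencing: parse an "N miles DIRECTION of PLACE" string, look the place up, and offset the point by that distance and bearing. This is our own toy illustration, not GeoLocate's actual method or API. The function names, the regular expression, and the one-entry gazetteer (with rough, rounded coordinates for Urbana-Champaign) are all assumptions made for the example.

```python
import math
import re

EARTH_RADIUS_KM = 6371.0088
KM_PER_MILE = 1.609344

# Compass words mapped to bearings in degrees clockwise from north.
BEARINGS = {
    "north": 0.0, "northeast": 45.0, "east": 90.0, "southeast": 135.0,
    "south": 180.0, "southwest": 225.0, "west": 270.0, "northwest": 315.0,
}

# Hypothetical pattern: handles only "<number> mile(s) <direction> of <place>".
PATTERN = re.compile(
    r"(?P<dist>\d+(?:\.\d+)?)\s*miles?\s+(?P<dir>[a-z]+)\s+of\s+(?P<place>.+)",
    re.IGNORECASE,
)


def offset_point(lat, lon, distance_miles, direction):
    """Move distance_miles from (lat, lon) along a compass direction.

    Uses the standard great-circle destination formula on a spherical Earth.
    """
    bearing = math.radians(BEARINGS[direction.lower()])
    angular = distance_miles * KM_PER_MILE / EARTH_RADIUS_KM
    lat1, lon1 = math.radians(lat), math.radians(lon)
    lat2 = math.asin(
        math.sin(lat1) * math.cos(angular)
        + math.cos(lat1) * math.sin(angular) * math.cos(bearing)
    )
    lon2 = lon1 + math.atan2(
        math.sin(bearing) * math.sin(angular) * math.cos(lat1),
        math.cos(angular) - math.sin(lat1) * math.sin(lat2),
    )
    return math.degrees(lat2), math.degrees(lon2)


def georeference(description, gazetteer):
    """Turn a locality string into (lat, lon) using a place-name lookup."""
    match = PATTERN.fullmatch(description.strip())
    if match is None:
        raise ValueError(f"cannot parse locality: {description!r}")
    lat, lon = gazetteer[match.group("place").strip().lower()]
    return offset_point(lat, lon, float(match.group("dist")), match.group("dir"))


# Illustrative gazetteer; the coordinates are rough placeholders, not survey data.
GAZETTEER = {"urbana-champaign": (40.11, -88.23)}

if __name__ == "__main__":
    print(georeference("6 miles southwest of Urbana-Champaign", GAZETTEER))
```

Real locality strings are far messier than this pattern allows (vague place names, multiple offsets, roads and rivers as references), which is where a dedicated tool and a collaborative review process earn their keep.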
Islandora – http://islandora.ca/
Islandora was something of a revelation to me, says Rob. Rob had heard about most of the other projects, but not this one. Islandora is a Drupal and Fedora-based system for storing and sharing digital archival data. Rob wonders if Islandora is more a solution for layering “on top of” collections databases that might help manage multiple related content streams (publications, field notes, specimen databases). Similar projects: Specify (http://specifysoftware.org/), Arctos (http://arctos.database.museum/home.cfm) (but these are very much collections database software, while Islandora is more for general digital assets).
Scatter, Gather, Reconcile.
No web presence we could find; perhaps there is still some gathering and reconciling happening before we see content online? The general idea, though, is to discover duplicates or near-duplicates across the network of data publishers so that digitizers don’t replicate data that is already out there. Similar projects: FilteredPush, above.
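For a rough feel of what near-duplicate discovery might involve, here is a small sketch that normalizes label text and flags pairs of records whose strings are very similar. We don't know how Scatter, Gather, Reconcile actually does it, so treat this purely as our own illustration: the function names, the 0.9 threshold, and the sample records are all made up for the example, and it uses Python's standard-library difflib rather than anything project-specific.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(record):
    """Lowercase, drop commas and periods, and collapse whitespace."""
    text = record.lower().replace(",", " ").replace(".", " ")
    return " ".join(text.split())


def near_duplicates(records, threshold=0.9):
    """Return (i, j, score) for each pair of records scoring at or above threshold.

    Compares every pair, so this is only sensible for small batches.
    """
    cleaned = [normalize(r) for r in records]
    pairs = []
    for i, j in combinations(range(len(cleaned)), 2):
        score = SequenceMatcher(None, cleaned[i], cleaned[j]).ratio()
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


# Made-up label transcriptions, for illustration only.
RECORDS = [
    "Quercus alba L., Champaign Co., Illinois, 12 May 1902, J. Smith 123",
    "Quercus alba L. Champaign County, Illinois, 12 May 1902, J Smith 123",
    "Acer rubrum L., Orleans Parish, Louisiana, 3 June 1950, A. Jones 7",
]

if __name__ == "__main__":
    for i, j, score in near_duplicates(RECORDS):
        print(f"records {i} and {j} look alike (similarity {score:.2f})")
```

Comparing every pair gets expensive fast, so anything working across a whole network of data publishers would presumably need to group candidate records first (by collector, date, or taxon, say) before comparing within groups.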
Paris Herbarium Assembly Line.
We’d love to post some photos or video of the awesome assembly line that the Paris Herbarium has been developing (so let us know, Marc Pignal! Pretty please?). Very informative PowerPoint made short: the Paris Natural History Museum had to move their herbarium collection from one building to another, and decided to take this as an opportunity to digitize the entire collection! Herbarium sheets are placed on conveyor belts and images snapped by overhead cameras as the sheets go past. The rate of digitization is really impressive – several thousand sheets get digitized at a cost below €1 per sheet. Similar projects: We don’t know anything quite like it. Maybe NYBG is getting close?
We should note that the above is a hodgepodge of sorts. Some of the presentations focused directly on digitization, some on how to store and mobilize digitized products, some on data quality enhancements once there is a datastore. We might have one more S2I2 report to file that is a bit more focused and directly addresses some of the discussions about tools, technologies, and methodologies that could be deployed to increase digitization rates.
In the meantime, we should point out that SPNHC 2011 (research.calacademy.org/spnhc) is having a Demo Camp (note: mistyped “Demon Camp” at first, but that is an entirely different thing), run by Amanda Neil. Many of the projects listed here have shown up at Demo Camp in past years, and we’ll let you know what might be happening at this year’s camp. Might even Live Blog! Do you have an innovative project that speeds up digitizing collections records? If so, we’re your blog! Leave a comment! Let us know!