First and foremost, digitization of natural history collections and tools to make these digitized records available, such as VertNet, support global biodiversity research. We suspect that most uses of digitized records will be to generate products such as species distribution models and change assessments, and to answer questions about what is in any given museum collection. However, in the broader context of academic endeavor, these data could also serve as a unique link between the digital sciences and the digital humanities. Work in the digital humanities includes everything from crowdsourcing manuscript transcription to humanistic fabrication to data mining, and it is not so dissimilar in method, description, or data type from work in the digital sciences.
Biological collections aren’t the only organizations engaged in massive digitization efforts; libraries and archives have been digitizing and making their materials discoverable and interoperable for decades as well. As a result of these efforts, an unprecedented number of research materials from a wide range of domains are now available for free on the Web. What VertNet does for biodiversity data, the University of Illinois’ Digital Collections and Content project does for cultural heritage records, and the National Library of Australia’s Trove does for newspapers, articles, and music. HathiTrust makes more than 9 million books available, and the list goes on. Digitization allows these materials to be recombined and analyzed quickly and (relatively) easily in new ways.
Our question is a simple one: Where do the digital humanities and e-science overlap and interconnect? One method of digital investigation that caught our attention is the mapping of novels and other historic texts; researchers take prose text and mine it for mappable units. Erin Sells and her students, for instance, have used this method to create dynamic maps of Virginia Woolf’s Mrs. Dalloway, which incorporate “pictures, sounds, videos, and the text itself into the map.” Similarly, in the Google Ancient Places project, researchers mine archaeological and historical texts to create databases of georeferenced ancient locales that can then be mapped. Though these researchers are working with novels and historical texts, they’re producing data in formats similar to those used for species occurrence records in databases such as VertNet.
This made us think: what sorts of questions could we ask of a data set composed of all kinds of georeferences, including not just species occurrence records but locations from history or works of fiction as well? If students of the humanities can create maps with such texture using similarly organized data sets, could they build on this richness by including analysis of the natural world as it existed at the time described in the novel? Perhaps searching on the VertNet portal (or GBIF or ALA) could provide a detailed list of vertebrate species and, with a little more work, the associated ranges of these species. Suddenly, the map of Mrs. Dalloway’s world, and the atmosphere of Clarissa’s party, can be enriched not only by human influence and creation, but by the natural environment, too. Conversely, diaries or other digitized sources could be mined for data about distributions of now-extinct species. Could these data be used as observations and published as records along with those from natural history collections?
We hope that VertNet will support interdisciplinary research in the sciences and the humanities by providing new avenues for deeper readings, and new ways to reconstruct real and imagined worlds. Where are the specimens that Lewis and Clark found on their expeditions and how do those link up with their journals (online already!!)? What about whale species described by Melville? How accurate are James Fenimore Cooper’s depictions of the animals Hawkeye and Cora encountered as they traveled through the Great Lakes? What does this accuracy or inaccuracy tell you about Cooper as an author? What about Thoreau’s notebooks of life at Walden Pond, and how have this iconic landscape and its animals and plants changed since his stay?
We also hope that other folks have more ideas about what new combinations of data and domains of inquiry are possible now that so many different sources of knowledge have been digitized. How can e-science support and enrich the digital humanities and vice versa? What happens when images of specimens* mix with drawings from the literature? Point-radius georeferences, for example, are easy enough to pull together from different sources; what further visualizations could be created with the combination of journals, books, and catalog ledgers? What further ways can we use data and smarts to bridge gaps between the sciences and the humanities?
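To make the point-radius idea a bit more concrete, here is a rough Python sketch of how georeferences from a novel and from a specimen catalog might be pooled and compared. Everything in it is an assumption for illustration: the PointRadius record, the function names, and the sample coordinates and radii are invented, and nothing here comes from the VertNet portal or any real data set. Two records are treated as "possibly the same place" when the distance between their points is no more than the sum of their uncertainty radii.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


@dataclass(frozen=True)
class PointRadius:
    """A hypothetical point-radius georeference: a point plus an uncertainty radius."""
    label: str
    source: str      # e.g. "novel" or "specimen"; labels are illustrative only
    lat: float       # decimal degrees
    lon: float       # decimal degrees
    radius_m: float  # uncertainty radius in meters


def haversine_m(a: PointRadius, b: PointRadius) -> float:
    """Great-circle distance in meters between the two center points."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))


def may_overlap(a: PointRadius, b: PointRadius) -> bool:
    """True when the two uncertainty circles touch or intersect."""
    return haversine_m(a, b) <= a.radius_m + b.radius_m


def cross_source_matches(records):
    """Pairs of records from different sources whose circles may overlap."""
    return [
        (a, b)
        for i, a in enumerate(records)
        for b in records[i + 1:]
        if a.source != b.source and may_overlap(a, b)
    ]


# Invented sample values, for illustration only (not real records).
records = [
    PointRadius("park scene in a novel", "novel", 51.5313, -0.1570, 800),
    PointRadius("made-up specimen locality", "specimen", 51.5290, -0.1500, 1000),
    PointRadius("made-up pond locality", "specimen", 42.4390, -71.3420, 500),
]

for a, b in cross_source_matches(records):
    print(f"{a.label} <-> {b.label}: {haversine_m(a, b):.0f} m apart")
```

The overlap test is deliberately generous: it only says two georeferences could describe the same place, which seems like the right starting point for texts and ledgers whose localities are often vague.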
SYTYCD is offering the inaugural Thinky People’s Digitization Challenge (THIPDIC). This first THIPDIC will go to the person or people who provide our favorite comment showing how digital science and the digital humanities intersect. Any cool examples? Any deeper thoughts about how this happens? Any cute pictures of animals reading books? Winners will be celebrated the world over and will be eligible for a (modest) prize, offered by Rob (don’t worry, it’ll be something interesting and of actual value). You may now talk amongst yourselves.
* gigapan snakes in jar!