So You Think You Can Digitize was in the Big Easy for TDWG 2011 last week. Summarizing the whole meeting is best left for friends Nico and Gaurav, who have longer attention spans than us. Nor should you miss the Unicorn Magic from friends at VertNet. Instead, we’ll focus our efforts on a set of talks in the citizen science session.
Batting first: Enlisting the Use of Educated Volunteers at a Distance: Or, Why Crowdsourcing and Citizen Science Will NOT Create Nightmare Zombies That Will Destroy Us All.
Presented by us! Slides are here. This talk developed organically out of the last few SYTYCD posts, but also gave us an opportunity to push a bit further on some trickier concepts we’ve been cogitating on for the last few months. Particularly:
1) We presented some neat (and we think relevant) education literature that shows that knowledge may be constructed more quickly through peer discussion in the classroom. We argued that volunteers communicating and using existing resources to vet records is analogous to students talking to their neighbors in the classroom. What do you think? Discuss!
2) We also argued that the creation of these large crowdsourcing interfaces and applications (e.g. Old Weather, Atlas of Living Australia) necessarily forces “articulation work”: that is, the work of explaining what one group of people wants done by another group of people (e.g. curators by web developers, collections managers by volunteers). A fundamental concern of citizen science is how best to connect the people collecting or annotating data back to the scientists who use them. Using web applications to facilitate this connection forces both the citizen scientists and the experts to understand the data and encode that understanding into those apps. For a standards group like TDWG, this act of encoding is particularly important to consider and understand; we need to remember that standards aren’t just ways of passively creating databases with consistent field names, but are also means of facilitating communication and a shared sense of mission between people.
Notes: We might still have some work articulating articulation work. Also, our best intentions to collect data on how easily people can use existing web resources to more effectively digitize foundered on the rocks of too little time to get through some, uh, minor logistics issues (in particular, IRB Human Subject approvals – facepalm). However, we still hope to do this in the future.
Batting in the 2-Spot: Crowd sourcing record transcription to unlock historical species data from natural history collections.
Andrew Hill, Vizzuality wunderkind and semi-erstwhile PhD student at CU Boulder with Rob, discussed Vizzuality’s rapid development of citizen science projects like “Old Weather” and a new one for NASA called NEEMO. Andrew showed that citizen scientists work together in the spirit of both cooperation and competition by relating how he and company owner Javi De La Torre kept vying for the top scoring spot in NEEMO, only to be blown away by a NASA employee who was also working/playing. Finding the line where competition and collaboration can both be optimized is, at least from our perspective, one of the more interesting questions in developing citizen science applications. We here at SYTYCD have tended to focus on cooperation and narrative (not on game-ification and competition), but maybe there is a middle ground that yields the best of both worlds, and maybe the broadest appeal. Perhaps competition works better for some demographics and cooperation for others. Andrew also announced that Vizzuality is likely going to be involved, in some capacity, in developing a citizen science project for natural history transcription. We love this plan and can’t wait to hear more.
Batting Third: Crowd-sourcing: perpetual valuable resource or a passing shower of dubious worth?
Paul Flemons, who Rob thinks looks just a teensy bit like Samuel Vimes (famous fictional cop), presented his work with the ALA’s “Australian Museum Cicada Expedition” while deftly weaving in musings about the long-term value of crowdsourcing as a digitization tool. One thing we particularly liked seeing was a frequency plot showing the “long tail” of transcription efforts. That is, most volunteers who drop by the site will only transcribe one or two records; however, there are a few extraordinarily dedicated folks who will transcribe much larger numbers: hundreds or thousands of records. Why? Well, this gets to incentives; really, all the talks in the session ultimately touched on this essential topic. Is it possible to build a citizen science tool that shifts that long tail to be shorter and stouter so that more people are willing to transcribe more records? Paul ended his talk saying he wasn’t entirely sure about the future of crowdsourced transcription for natural history collections: he is still not sure that we have the critical mass of volunteers needed to transcribe EVERYTHING, or that the links between the volunteer work and science are always fully exposed.
After seeing all the talks and the excellent demonstrations by Beth Mantle, Katja Schulz, and Tony Kirchgessner, we are more optimistic than Paul. One reason for optimism: we overheard comments like “Wow! this session was amazingly well attended” and, to paraphrase, “this might actually work.” So, yeah, TDWG was indeed great, even if one of us who isn’t Rob did get suckered into Co-Chairing the Citizen Science Interest Group. And yes, we do indeed still think we can digitize.
Speaking of digitization, we have been following the crowd-sourcing thread for a long time now, and the next posts may swing back around to other topics of interest in the broader realm of natural history digitization. With the ramping up of Thematic Collections Networks and the iDigBio HUB, the hard work of digitizing and the even harder work of innovating are just getting started….