JHFNP, Post 4.5

A quick mini-post here, to tell of some interesting things:

1) Notebook 1 is DONE.  Fully annotated, and all within 3 days of our last post.  This represents many hours of work and the creation of hundreds of annotations:

{{place|…}} annotations: 218
{{dated|…}} annotations: 64
{{taxon|…}} annotations: 347

So total annotations = 629!
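For anyone who wants to check tallies like these on their own pages, here is a minimal sketch of one way to count the three templates in a page's wikitext. It is not how we produced the numbers above: the function name `count_annotations` and the sample text are invented for illustration, and the sketch assumes each annotation is a simple `{{name|…}}` template call.

```python
import re
from collections import Counter

# Template names as they appear in the post.
TEMPLATE_NAMES = ("place", "dated", "taxon")

# Matches the opening of a template call such as {{taxon|...}} and captures its name.
TEMPLATE_RE = re.compile(
    r"\{\{\s*(" + "|".join(TEMPLATE_NAMES) + r")\s*\|", re.IGNORECASE
)


def count_annotations(wikitext: str) -> Counter:
    """Count place/dated/taxon template calls in a string of wikitext (hypothetical helper)."""
    counts = Counter({name: 0 for name in TEMPLATE_NAMES})
    for match in TEMPLATE_RE.finditer(wikitext):
        counts[match.group(1).lower()] += 1
    return counts


# Invented sample text, not a real notebook transcription.
sample_page = (
    "Walked from {{place|Example Town}} on {{dated|an example date}}. "
    "Saw {{taxon|Example species}} near {{place|Example Creek}}."
)

counts = count_annotations(sample_page)
print(dict(counts), "total =", sum(counts.values()))
```

Running this over every page of a notebook and adding up the per-page counters would give notebook-wide totals comparable to the ones listed above.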

2) Furthermore, Notebook 1 was annotated almost entirely by one person… whom we can’t thank by name because they didn’t make a WikiSource account.  So, thank you, IP address 173.69.207.29!  Thanks also to 207.145.38.73.  If either of you would like a “thank you” coffee mug, please let us know in the comments or via email.

3)  We have set up a Wikisource Project page.  We still need to write detailed, step-by-step descriptions for some tasks (e.g. transcription and annotation), but we do have an initial help page on uploading available, and more will be posted there soon.

4)  We’re putting new notebooks and transcriptions onto Wikisource as fast as we can get transcriptions and scans associated with one another; transcriptions of Notebook 2 are already up.  This notebook features more paleontology and geology than the last, and describes several trips to Northern Colorado: “Instead of attending commencement and taking my B.A. degree I started north on foot, up the Lykins lateral valley NW of Colorado Sanitarium.”  

We are excited about, and appreciative of, the annotation help, and we really do want to acknowledge as many folks as possible in the upcoming ZooKeys paper.  So once again, if you want to be credited by name, make sure to create (and log in to) a WikiSource account before annotating.  If you prefer public anonymity, just email us directly to say you are helping.  We probably can’t acknowledge IP addresses, and we’re not sure if we can acknowledge aliases in a published paper (though how great would it be to say, “special thanks to Paul Flemons, Laura Russell and SquirrelFan23 for her or his help”).

Next, a more substantial update: annotations and occurrences, as promised last time.

About Andrea

Andrea is a Ph.D. student in Library and Information Science at the University of Illinois at Urbana-Champaign, and is supported by the Center for Informatics Research in Science and Scholarship. Her research interests include text mining; scholarly communication; data curation; biodiversity, phylogenetic and natural history museum informatics; and mining and making available undiscovered public knowledge. She is particularly interested in information extraction from natural history field notes and texts, and improving methods of digitizing and publishing data about the world's 3–4 billion museum specimen records so they can be used to better model evolutionary and ecological processes.
This entry was posted in Henderson Project.

8 Responses to JHFNP, Post 4.5

  1. R.U. Testingme says:

    I don’t have time to really start on NB4 tonight. But I did clean up some pretty bad markup …. unless you are testing me. R U Testingme? This WILL get easier. I promise.

    • Rob says:

      Yeah, R.U. Not testing at all. I have been pretty much in charge of the upload of transcriptions, but we have a new helper who is learning the ropes, and although I am glad to have assistance, there is a definite learning curve. A bit psyched/thrilled to be maybe talking to one of our mystery annotator super-stars but respect the mystery too much to say more here.
      - I. M. A. Fan

      • R.U. Testingme says:

        This crowdsourcing transcription process can be improved by allowing transcribers to build field-book-specific lookup tables of common terms (taxa, places, etc.) as they go, create macros for the most common ones and, when saved, run a link test. Just wondering if Guarav is going to create templates for people, formations, or weather. Also, what if a link gave you a choice of destinations, e.g. a taxon link that lets you choose from Wikipedia, Namebank, TNRS, etc.? I’ll send your newbie a spreadsheet of Henderson’s bird/mammal/invert list.

  2. R.U. Testingme says:

    Check out page 15. I found a ghost town (Bath), a place that isn’t a place. How do we deal with that?

    • Rob says:

      R.U. — I can see a couple ways to deal with this, one lazy and immediate and one that I think we’d like to consider downstream. 1) We are not specifying that places have to still be extant, so just as an extinct taxon is still a taxon, an “extinct town” is still correctly called a “place” and annotated as such. 2) Down the road, it would be good to have a more detailed vocabulary for types of man-made places (towns, cities, ghost towns, ski resorts, haunted houses).

      I also want to comment on your post re: new templates and better link-outs (WordPress makes doing so in the right spot difficult), and just want to say that I love these ideas. I think a person-template is probably my highest personal priority. There has been some concern about homebrewing too many templates on Wikisource without understanding best practices, and so starting small and focusing on those mark-ups most related to documenting an occurrence seemed warranted. All good things to be considering!

  3. R.U. Testingme says:

    I.M.A.,
    Are there other rules? For example, when a place name or taxon is repeated multiple times on the same page, is it necessary to tag all of the occurrences? Is there a preference for using scientific or common names if both link to Wikipedia? How will maps, drawings, marginal notes, etc. be tagged so that they are both searchable and descriptive? Why can’t I use a coordinate in lieu of a place name (see ghost town comment)? More to come.
    R.U.

    • Rob says:

      U.R. — I do think we need to tag all occurrences and then it becomes a matter of interpretation whether an author was referring to the same occurrence twice or two different occurrences of the same species. Re: preference, it turns out we do have a strong preference but we didn’t know it until we tried to resolve taxon names later on and get a final “valid taxon”. Scientific name is much better for this than vernacular name. We will eventually specify more about preferred controlled vocabularies for annotations, and glad you asked about that. Regarding maps, drawings, marginal notes — GREAT questions and no simple answers there yet but I would like to come back to that too. I would love to see some mechanism by which tags could be proposed, much as terms in standards can, with some quick vetting and approval, to develop new tags but not let that proliferate out of control. Along with that could be best practices for “interpretations” that allow link-outs. Since georeferencing is an “act”, in some senses we want to capture the full aspect of that process so it is one reason I am glad we didn’t just allow coordinates to be used.

  4. R.U. Testingme says:

    Wikipedia has misspelled Acmaea palacea as Acmaea paleacea. Who do we lynch?
