Articles read:
- Nunberg, Geoff. "Google Books: A Metadata Train Wreck" Language Log blog, August 29, 2009. http://languagelog.ldc.upenn.edu/nll/?p=1701. Be sure to read through some of the comments, specifically the second comment left by Jon Orwant (Google engineer on the GBS team) on September 1, 2009 @ 1:51 am.
- Nunberg, Geoff. "Google's Book Search: A Disaster for Scholars," The Chronicle of Higher Education, August 31, 2009. http://chronicle.com/article/Googles-Book-Search-A/48245/
This month's Metadata Discussion Group began with a discussion of the tone of the blog post and article, and the tone of the rhetoric in the community at large around the Google Book Search project. Participants expressed support for the idea that discussion needs to be reasoned and civil - neither Google nor libraries are all wrong or all right. It is more important to fix identified problems than to point fingers. One participant noted that the difference in tone between the blog post and the slightly later Chronicle article was telling. Nunberg’s concern is clearly for scholars, but this is more obvious in the Chronicle article than in the blog post. The Chronicle article immediately sets up a "this service is bad" tone by listing Elsevier as the first possible future owner! The Chronicle version doesn’t even allow for the possibility of Google keeping this service as its own.
Thinking about how to solve these problems led to a theme common in the Metadata Discussion Group sessions - what if we were to open up metadata editing to users? Wikipedia isn’t consistent, surely - would that approach work here? A participant noted that OCLC itself is a cooperative venture and there are many inconsistencies there. Institutions futz with records locally and don’t send them back to OCLC. CONSER had a history of record edit wars, and catalogers decided they just have to grit their teeth and deal with it.
Participants then noted that scholars aren't the only or perhaps even the primary audience for GBS. But should they be? A great deal of the content (though not all - content comes from publishers too) comes from academic libraries that have built their collections primarily in support of scholarly activities. Shouldn't library partnerships come with some sort of responsibility on Google's part to pay attention to scholarly needs? For IU and the CIC and other academic libraries, HathiTrust is attempting to fulfill this role, but is that enough?
The next question the group considered was
Discussion then turned to some of the statistics presented by the Google engineer in a comment on the blog post, including the claim of “millions” of problems and a BISAC accuracy rate of 90%. Participants guessed we have fewer than 10% howlers for subject headings in our catalogs, but there certainly are lots of them in there. The considerable redundancy in the MARC record provides additional text that could be used to avoid this kind of obvious error. We wondered if Google is using any of this redundant information effectively.
The topic then turned back to whether Google Book Search should spend more effort meeting scholarly needs. What should they do differently to support this kind of user better? First, probably not use just a single classification scheme. They don't necessarily need to stop using BISAC, but they could also use alternatives - that's what Google is about, more information! They're definitely getting LCSH from MARC records, despite LCSH's limitations. The LCC class number could be used to devise a "primary" subject, and potentially to supply words that aren't elsewhere in the record. Participants noted that as the GBS database grows, each individual subject heading will return larger and larger result sets.
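As a rough illustration of the LCC idea, here is a minimal Python sketch that maps the leading letter of an LC call number (as it might appear in a MARC 050 field) to the corresponding top-level LCC class. The function name `primary_subject`, the lookup table name, and the sample call numbers are hypothetical and chosen only for illustration; nothing here reflects how Google or any library system actually assigns subjects, and a real implementation would want the much finer subclass ranges.

```python
import re

# Top-level Library of Congress Classification classes, keyed by first letter.
# (I, O, W, X and Y are not used as main classes.)
LCC_MAIN_CLASSES = {
    "A": "General Works",
    "B": "Philosophy, Psychology, Religion",
    "C": "Auxiliary Sciences of History",
    "D": "World History",
    "E": "History of the Americas",
    "F": "History of the Americas",
    "G": "Geography, Anthropology, Recreation",
    "H": "Social Sciences",
    "J": "Political Science",
    "K": "Law",
    "L": "Education",
    "M": "Music",
    "N": "Fine Arts",
    "P": "Language and Literature",
    "Q": "Science",
    "R": "Medicine",
    "S": "Agriculture",
    "T": "Technology",
    "U": "Military Science",
    "V": "Naval Science",
    "Z": "Bibliography, Library Science, Information Resources",
}


def primary_subject(call_number):
    """Return a broad 'primary' subject for an LC call number, or None.

    Hypothetical helper: it looks only at the first class letter and
    requires that one to three letters be followed by a digit, so that
    non-LCC strings such as 'Fiction' are rejected.
    """
    match = re.match(r"\s*([A-Za-z]{1,3})\s*\d", call_number or "")
    if not match:
        return None
    return LCC_MAIN_CLASSES.get(match.group(1)[0].upper())


if __name__ == "__main__":
    # Made-up sample call numbers, for illustration only.
    for sample in ["QA76.73 .P98", "PS3545 .H16", "Z695 .M3", "Fiction"]:
        print(sample, "->", primary_subject(sample))
```

Even this coarse mapping gives one broad label per record that could be cross-checked against a BISAC assignment, which is the sort of use of redundant MARC data the group had in mind.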
The session closed with some musings on how the Google and library communities might better learn from one another. The notion of constructive conversation rather than disdain was raised again. Then participants noted that the GBS engineer commenting on the blog post invited comment. Individuals can take advantage of this invitation, and IU as a GBS partner can provide information and start conversations at yet another level.