Sunday, September 12, 2010

The Trouble with Google Books


Writing for Salon, Laura Miller describes how rampant errors threaten the scholarly mission of Google's vast digital library.

From the piece...

Nunberg, a linguist interested in how word usage changes over time, noticed "endemic" errors in Google Books, especially when it comes to publication dates. A search for books published before 1950 and containing the word "Internet" turned up the unlikely bounty of 527 results. Woody Allen is mentioned in 325 books ostensibly published before he was born.

Other errors include misattributed authors -- Sigmund Freud is listed as a co-author of a book on the Mosaic Web browser and Henry James is credited with writing "Madame Bovary." Even more puzzling are the many subject misclassifications: an edition of "Moby Dick" categorized under "Computers," and "Jane Eyre" as "Antiques and Collectibles" ("Madame Bovary" got that label, too).

Although Google representatives did respond to Nunberg's article, blaming the bulk of the errors on outside contractors, much of the incorrect information remains in place. Looking at listings for "The Golden Bough" by James Frazer, a seminal work on comparative religion with a complex and fascinating publication history, I found one edition characterized as "Life Sciences." The 12 volumes of what is arguably the most authoritative edition of the book (published between 1910 and 1915) aren't grouped together or searchable as a whole, and the foremost search result is a dubious reprint of the bowdlerized 1922 edition with an introduction lifted from Wikipedia and a publication date of 1947, although the text itself claims a publication date of 2008.
