Google Books

I was pleased to discover today that Google Books can be searched in Cyrillic. In other words, the service lets researchers run keyword searches across its massive and expanding database, including tens of thousands of books published in the Soviet Union that North American research libraries collected over the decades and that have recently been scanned. Although the complete text is usually not available for viewing because of copyright restrictions, each result provides some useful information, including the number of times the search term appears in the book (up to thirty, it seems) and up to three page numbers where the term is mentioned.

My initial attempts to use this function to perform a keyword search of a Russian-language text were a little frustrating. I decided to test the keyword search using the multi-volume collection Tragediia Sovetskoi Derevni. An advanced search for a keyword along with a word from the title (трагедия) and the author (Данилов) came up with a couple of hits in volume II of the series. But at first I was unable to find my term on the pages where Google said it should be. To double-check, I searched the online text for a word that I found on p. 339 in the hard copy. Google Books did find that word, but only once, and only on p. 905. In the end, after some persistence, I determined that Google’s entry for volume II is incorrect: the online text that I was actually searching was volume III. Once I had matched the mislabeled Google text with the correct volume, I was able to make successful keyword searches.
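For anyone who wants to repeat this kind of search programmatically, here is a minimal sketch of how the same query might be built as a URL. It is not what I did above (I used the advanced search form on the website); it assumes the public Google Books API `volumes` endpoint and its `intitle:` and `inauthor:` search keywords. The function name `build_query_url` and the keyword колхоз are hypothetical, chosen only for illustration. The sketch only constructs the URL and does not send any request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: the public Google Books API search endpoint.
API_ENDPOINT = "https://www.googleapis.com/books/v1/volumes"


def build_query_url(keyword, title_word, author):
    """Build a search URL for a keyword restricted by a title word and an author.

    Hypothetical helper: it combines the keyword with the API's
    intitle: and inauthor: search keywords in a single q parameter.
    """
    query = f"{keyword} intitle:{title_word} inauthor:{author}"
    # urlencode percent-encodes the Cyrillic text as UTF-8,
    # so the resulting URL contains only ASCII characters.
    return API_ENDPOINT + "?" + urlencode({"q": query})


# The keyword here is a made-up example, not the term from my own search.
url = build_query_url("колхоз", "трагедия", "Данилов")
print(url)

# Decoding the query string gives back the original Cyrillic terms.
print(parse_qs(urlsplit(url).query)["q"][0])
```

Because the Cyrillic terms are percent-encoded, the URL can be pasted into a browser or passed to any HTTP client. Whether the results match what the website's advanced search returns is something I have not checked.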

Moreover, by searching the individual volumes directly rather than returning to the general advanced search, I was able to get hits for my keyword in the other volumes. Once I had matched each mislabeled online text with its respective hard copy, I was able to carry out fairly effective searches. It’s important to note, however, that the keyword searches are limited in significant ways. First, the search did not catch every instance of my keyword. This is not surprising, I guess, given the problems inherent in converting scanned page images into text (OCR). Second, and most frustrating, the keyword search only provides page numbers for the first three hits. It is thus useful for terms that show up very infrequently, but hard to use effectively when the keyword shows up more often.

There is a helpful discussion of some of the quality control issues in a 2007 article by Robert Townsend, “Google Books: What’s Not to Like,” on the AHA Today blog. Overall, despite its problems, Google Books is still a powerful tool given its scope.

Ben Zajicek Says:
April 27, 2009 at 12:15 pm

I’ve found some fabulous stuff on google print. Anything written before 1923 is available in Google books as a full-text downloadable pdf file. Suggestion: perhaps we could create a site with links to the permanent URLs for public domain books in Russian/related to Russia, and find a way to post them as we discover them.

On the downside, I’ve tried to use this in my dissertation research and run into a lot of disappointment. The entire run of the Soviet psychiatry journal “nevropatologiia i psikhiatriia” is available on google print. The problem is that they are all post 1923, and thus deemed to be in copyright. I could still search them for key words and then go look them up in the print version, but the identifying information (year, volume, number) is almost all wrong… Sigh.

A useful resource that I’ve found for public domain sources to use in Western Civ courses:

James Harvey Robinson, Readings in European History: A Collection of Extracts from the Sources Chosen with the Purpose of Illustrating the Progress of Culture in Western Europe Since the German Invasions, v. 1 (1904): books.google.com/books?id=4Z1FAAAAIAAJ

and v. 2 (1906): http://books.google.com/books?id=EDoNAAAAYAAJ

April 12, 2009. Full-text, Research.

3 Comments

    • auriberg replied:

      Thank you Ben, much belatedly. I’ve been collecting some other titles that are available. So far I haven’t come across any list of permanent URLs.

  2. auriberg replied:

    Dan Cohen presented a defense of google books for history during a 2010 panel of the AHA, which he posted on his blog: http://www.dancohen.org/2010/01/07/is-google-good-for-history/
