Cites & Insights: Crawford at Large
ISSN 1534-0937
Libraries · Policy · Technology · Media


Selection from Cites & Insights 5, Number 7: May 2005


Session Report: ACRL 2005

What’s Next? Academic Libraries in a Google Environment

Joy Weese Moll

This presentation on Google was a late addition to the ACRL Conference, announced on an addendum sheet rather than in the bound program. Adam Smith of Google described the current status of both the Google Scholar and the Google Print initiatives. John Price Wilkin of the University of Michigan discussed Google Print from the perspective of one of the participating libraries. Both presenters took questions at the end for about twenty minutes. People were still standing in line at the microphones to ask questions when the session ended.

Adam Smith briefly described Google, the leading search engine company with 52% market share. The core business is advertising. Google subscribes to a strict “separation of church and state,” putting the ads in the right column and not allowing them to pollute the search results. The Google philosophy is to develop products quickly and to push them out, marked as Beta tests. The products are then improved based on observations of users and on user feedback. Google wants input from the library community regarding both Google Print (print-support@google.com) and Google Scholar (scholar-support@google.com).

Adam Smith on Google Print

The Google Print project (print.google.com) has two components. Publisher partnerships form one component: Google indexes material that it receives directly from the publisher. The results of searches of these materials, which are covered by copyright, are displayed as three short snippets from the published material in a keyword-in-context (KWIC) format.
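The snippet display described above can be illustrated with a minimal sketch. This is not Google's implementation; the function name kwic_snippets, the 40-character context width, and the plain-string input are all assumptions made for illustration. Only the limit of three snippets comes from the presentation.

```python
import re


def kwic_snippets(text, keyword, width=40, max_snippets=3):
    """Return up to max_snippets keyword-in-context excerpts from text.

    Hypothetical helper: 'width' (characters of context on each side)
    is an assumed parameter, not something stated in the session.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    snippets = []
    for match in pattern.finditer(text):
        # Take a window of context around each occurrence of the keyword.
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        snippets.append(text[start:end].strip())
        if len(snippets) == max_snippets:
            break
    return snippets


if __name__ == "__main__":
    sample = "Call me Ishmael. The whale surfaced. No whale was seen again."
    for snippet in kwic_snippets(sample, "whale", width=12):
        print("...", snippet, "...")
```

The reader sees enough surrounding text to judge relevance without being able to page through the copyrighted work.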

The second component of the project consists of partnerships with libraries. Library books that remain in copyright are displayed in the same way as the publisher-supplied material. Library books that are in the public domain are displayed full text with no browsing restrictions. “No library books were harmed during the making of this project.” The libraries, not Google, choose what is digitized and in what order.

Smith acknowledged that determining whether a publication is in the public domain requires a lot of work. To start with, Google is using very blunt rules. In the U.S., if a work was published prior to 1923, it is treated as public domain. Otherwise, Google treats it as a copyrighted work. Other blunt rules apply to material that is governed by international copyright laws.
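The U.S. rule Smith described is simple enough to write down as a short sketch. The function name display_mode and the return labels are hypothetical. Because the rules for non-U.S. material were not spelled out in the session, this sketch assumes the conservative choice of treating everything else as copyrighted.

```python
PUBLIC_DOMAIN_CUTOFF_YEAR = 1923  # U.S. rule as stated in the presentation


def display_mode(publication_year, country="US"):
    """Decide how a scanned book would be displayed under the blunt rule.

    Returns "full_text" for works treated as public domain and "snippets"
    for everything else. Non-U.S. works and unknown dates fall back to
    "snippets"; that fallback is an assumption of this sketch.
    """
    if (
        country == "US"
        and publication_year is not None
        and publication_year < PUBLIC_DOMAIN_CUTOFF_YEAR
    ):
        return "full_text"
    return "snippets"


if __name__ == "__main__":
    for year in (1890, 1922, 1923, 1975):
        print(year, display_mode(year))
```

The rule errs on the side of restriction: a work published in 1923 or later is shown only as snippets even if its copyright has in fact lapsed.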

When books appear in Google search results, there is also a link to the record in Open WorldCat (www.oclc.org/worldcat/open/default.htm) to assist users in finding the book in local libraries. These results are most easily brought up by putting “book” as the first word in the search terms.

Smith made it very clear that this project is in its infancy. Google considers itself to be an international company and intends to participate in digitization projects in other countries and other languages. Smith acknowledged that Google cannot digitize everything. Rather, Google wants to be a catalyst for digitization efforts, not the only game in town. Google’s digitization project will help it build tools that will improve the searching of digital libraries created by universities, governments, and other organizations.

Google’s motivation for Google Print is to enhance the quality of the search. Smith believes that Google Print does not signal the beginning of the end for libraries, that the roles of Google and libraries are complementary, and that Google Print will help the user discover and use library resources.

Adam Smith on Google Scholar

The goal of Google Scholar (scholar.google.com) is to “create the best scholarly search experience” by providing an easy-to-use search in a single place. The relevancy rankings differ from those in the main Google search, placing a heavy emphasis on citations. The results display links to multiple versions of an article (including pre-prints and repository copies) grouped together, giving precedence to the publisher version. Google Scholar also displays results that represent off-line content, including books through Open WorldCat.
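The grouping of versions might be pictured along the following lines. This is a sketch only: matching versions by normalized title, the record fields, and the priority table are assumptions, and Google did not describe how it actually identifies versions of the same article.

```python
from collections import defaultdict

# Assumed ordering: lower number is listed first within a group.
SOURCE_PRIORITY = {"publisher": 0, "repository": 1, "preprint": 2}


def group_versions(records):
    """Group records that share a title and put the publisher copy first.

    Each record is a dict with hypothetical "title" and "source" keys.
    """
    groups = defaultdict(list)
    for record in records:
        # Normalize case and whitespace so trivially different titles match.
        key = " ".join(record["title"].lower().split())
        groups[key].append(record)
    for versions in groups.values():
        versions.sort(
            key=lambda r: SOURCE_PRIORITY.get(r["source"], len(SOURCE_PRIORITY))
        )
    return list(groups.values())


if __name__ == "__main__":
    found = [
        {"title": "Link Resolvers in Practice", "source": "preprint"},
        {"title": "link resolvers  in practice", "source": "publisher"},
        {"title": "Another Article", "source": "repository"},
    ]
    for versions in group_versions(found):
        print([v["source"] for v in versions])
```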

Current coverage of Google Scholar includes full text indexing of most scholarly publishers and societies. Google is still working to develop agreements with Elsevier and the American Chemical Society and would appreciate librarian encouragement to those two organizations. Google Scholar also indexes PubMed, institutional repositories, and more.

Google recognizes that Google Scholar will be most useful in academic environments if it works through OpenURL link resolvers so that users can easily access library-licensed resources available at their home campuses. That functionality has been implemented in a pilot program, done in cooperation with all the major link resolver vendors. Watch for it to roll over to Beta testing soon. Google is also interested in working with libraries that developed their own link resolvers (scholar-support@google.com).
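For readers unfamiliar with link resolvers, the idea is that a citation is encoded as query parameters on a URL pointing at the home library's resolver, which then works out where the licensed copy lives. The sketch below builds such a link using common OpenURL 0.1-style keys (genre, issn, volume, spage, date). The resolver address, the sid value, and the function name build_openurl are hypothetical, and nothing here reflects how the Google Scholar pilot actually constructs its links.

```python
from urllib.parse import urlencode


def build_openurl(resolver_base, citation):
    """Build an OpenURL-style link from a citation dict.

    'resolver_base' is the campus link resolver address (hypothetical in
    the example below). Empty citation fields are left out of the query.
    """
    params = {"sid": "example:sketch", "genre": "article"}
    params.update({key: value for key, value in citation.items() if value})
    return resolver_base + "?" + urlencode(params)


if __name__ == "__main__":
    link = build_openurl(
        "https://resolver.example.edu/openurl",
        {"issn": "1534-0937", "volume": "5", "issue": "7", "date": "2005"},
    )
    print(link)
```

Because the resolver address is the only part that varies by campus, the same search result can lead each user to the holdings of his or her own library.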

Google Scholar, like Google Print, is a new project with many challenges ahead. Currently, the database is not updated frequently enough. Since rankings depend on citations, important recent articles do not appear at the top of the listing, even when they probably should. Methods for disambiguating authors’ names have yet to be developed.

John Price Wilkin on the library perspective

John Price Wilkin, from the University of Michigan, began his presentation by stating, “Google has been a fantastic partner.” He confirmed that Google is using nondestructive methods of digitizing and that UM retains both the physical book and a digital copy, a preservation surrogate, of each book. Wilkin believes the primary responsibility of the library is to be the long term curator of the physical and digital material.

The University gets a digital copy, identified by barcode. The scan is 600 dpi for print and 300 dpi for color/grayscale. The library specified the naming convention. The files include optical character recognition (OCR) text. The quality is at least as good as what the University of Michigan had been producing for years in its own digitization projects.

The library can do whatever it likes with its digital copies. UM will put their copies on-line when they develop specialized tools that better serve the UM audience than the tools Google provides for a general audience.

The plan is to digitize all 7 million books in the University Library (at UM, this does not include the law and business libraries but those libraries may enter into their own agreements with Google). Google indemnified the University of Michigan against any legal issues that arise from copyrights.

The digitization project at the University of Michigan has begun with material stored in remote shelving. These books are organized by size which helps with the workflow. The preservation librarians are providing lots of guidance.

Wilkin asked that we begin to consider the transformative implications of the Google Print project. He wondered about broad social issues like the effect of wide, efficient, democratizing access to information. He said that the project has already proven to be a factor in driving clarification of intellectual property rights, including the orphan copyright issue.

Wilkin also wondered about the transformative implications of Google projects on libraries. What are the possibilities for a cooperative, universal library? What are the implications for library-as-place given the paradox of rising gate counts as more information goes on-line? If libraries cede the generalist role to Google, how can they facilitate specialization in service? How can Google Print and Google Scholar free up resources for related issues like institutional repositories and scholarly communication?

Questions and Answers

Is Google aware that U.S. government publications in federal depository libraries are copyright free and will they be digitizing them?

Yes and yes.

Is the quality of OCR good enough for voice output?

Google has an accessibility team and is aware of the difficulties involved. Solving it for Google Print is considered a long-term project.

What progress is being made on addressing the difficulties of determining if a work is protected by copyright?

All the participating libraries and Google sent letters to the U.S. legislature requesting resolution of the orphan copyright issues. Google is encouraging a database of copyrights so that libraries and others can determine easily if a work is protected by copyright.

How does one search material that is not in English, particularly if it uses Cyrillic or Asian characters?

Adam Smith: “We haven’t solved all of the world’s problems. We are very aware of the difficulties there.” He said they are working on it but the project is in the very early stages.

Google Print and Scholar may reinforce the notion that everything is on the web. What is Google doing to mitigate that impact?

Adam Smith: “Information literacy is critical, a critical skill going forward.”

John Price Wilkin: “What better way to deal with that perception than to make sure as much as possible is on the web?”

What impact will Google Print and Google Scholar have on Open Access?

Wilkin felt that what Google does is work with publishers and libraries in their current respective roles, so the Open Access issue is orthogonal to these projects. “It will be back in our court.”

Are the bindings of the books digitized in the Google Print project?

Yes.

Comments

The prevailing attitude of Adam Smith and John Price Wilkin was enthusiasm for the possibilities of two projects that have just started. They are aware of many challenges. Google moves quickly to solve the easy ones (like using link resolvers) and devotes some resources to work on more difficult ones (like searching with Chinese characters). While the audience displayed less enthusiasm, the questioners seemed guarded and curious rather than hostile. As a student librarian concerned about the future of my newly chosen profession, I was pleased to see the high level of cooperation and understanding between Google and the partner libraries. Although it is difficult to predict the impact of these projects, the potential for mutual benefit is greatest in an environment of mutual appreciation.

Joy Weese Moll, joy@mollprojects.com, will graduate in December 2005 from the School of Information Science and Learning Technologies at the University of Missouri. See her blog, Wanderings of a Student Librarian, at joy.mollprojects.com/blog.

Cites & Insights: Crawford at Large, Volume 5, Number 7, Whole Issue 63, ISSN 1534-0937, a journal of libraries, policy, technology and media, is written and produced by Walt Crawford, a senior analyst at RLG.

Cites & Insights is sponsored by YBP Library Services, http://www.ybp.com.

Hosting provided by Boise State University Libraries.

Opinions herein may not represent those of RLG, YBP Library Services, or Boise State University Libraries.

Comments should be sent to wcc@notes.rlg.org. Cites & Insights: Crawford at Large is copyright © 2005 by Walt Crawford: Some rights reserved.

All original material in this work is licensed under the Creative Commons Attribution-NonCommercial License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/1.0 or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.

URL: citesandinsights.info/civ5i7.pdf