Selection from Cites & Insights 5, Number 12: November 2005

The Library Stuff

Arnold, Stephen E., “Relevance and the end of objective hits,” Online 29:5 (September/October 2005): 16-21.

Information professionals expect search results to reflect their search query. This is what happens with traditional online search services.

That’s the blurb for this fascinating article. Or, as the first sentence says: “Ask LexisNexis, Factiva, Dialog, EBSCOhost, or ProQuest to return information on, say, Macedonian weapons, and that’s what you get.” An exact match—without broadening, autotruncation (usually), the service trying to “outguess the searcher,” or sponsored links.

That’s not the case with Web search engines—and the order in which results are displayed is unpredictable: “Relevance ranking replaces objectivity.” This encourages the “cottage industry of search optimizers.” This article discusses the extent to which search engine optimization works and how it may affect the validity and relevance of search results.

Arnold is a careful writer: He notes that “search” is a single syllable that “embraces a mind-boggling range of meaning,” and that “relevance” (in the Web search sense) is “another slippery fish.” Traditional information retrieval experts think of relevance in terms of precision and recall: how effectively the search and the engine reject material the searcher doesn’t want (precision) and include everything the searcher does want (recall). That’s not Web search “relevance.”
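The traditional measures can be made concrete with a small calculation. What follows is a minimal sketch, not anything taken from Arnold's article: the function name `precision_recall` and the sample document identifiers are hypothetical, and it assumes the standard set-based definitions (precision as the share of retrieved items the searcher wanted, recall as the share of wanted items that were retrieved).

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    retrieved: identifiers the search actually returned (hypothetical IDs)
    relevant:  identifiers the searcher really wanted
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # wanted items that were actually returned

    # Precision: how much of what came back was wanted.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: how much of what was wanted came back.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative data only: four results returned, three items truly wanted.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"])
print(f"precision={p:.2f} recall={r:.2f}")
```

In this made-up case two of the four results are wanted (precision 2/4) and two of the three wanted items are found (recall 2/3). A Web engine's "relevance" ranking, by contrast, orders results by signals that neither figure captures.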

I now understand the difference between search optimization “cheats” (hidden text, link farms, blog seeding/comment spam, metatag spamming, etc.) and what Arnold calls “organic optimization,” which includes “surprisingly common-sense actions.”

For example, dynamic URLs may interfere with page ranking; so may frames. Site maps may improve site indexing. Sites with current content tend to do better, as do sites with “thematically related content.” Links from reputable sites help; links from questionable sites may hurt. Good metadata can help.

There’s lots more here. Highly recommended.

Bell, Lori, and Tom Peters, “Digital library services for all,” American Libraries 36:8 (September 2005): 46-9.

“Brick-and-mortar libraries can be intimidating places for print-impaired people, including those who are blind or visually impaired, or who have reading disabilities.” That’s the lead sentence for this article, which considers a number of recent technological and programmatic innovations to improve access for “print-impaired” patrons.

The first discussion doesn’t seem to fit the overall topic. OPAL, Online Programming for All Libraries, is an interesting initiative to expand library programming through online collaboration and online programs. It’s not clear from the description that OPAL’s uses are limited to print-impaired patrons.

The others seem more specifically accessibility-oriented. MI-DTP, the Mid-Illinois Digital Talking Book Project, is a “year-long bakeoff” to test various downloadable digital audiobook systems and players. The Unabridged digital audiobook delivery service uses OverDrive’s downloadable digital audiobooks as the basis for a delivery system. InfoEyes uses QuestionPoint as the basis for “a virtual reference and information service for the visually impaired.”

A good article, including some useful concerns.

Cohen, Scott, ed., “Interviews: On the future of libraries,” Tennessee Libraries 55:2 (www.lib.utk.edu/~tla/TL/v55n2/interview552.htm)

This feature offers responses from 19 librarians to two questions: “What do you think libraries can do to remain viable in the age of the Internet?” and “What can libraries do to stay important to their patrons?”

Most respondents are from academic libraries; only three are from public libraries. That is, in some ways, a shame, particularly when you get comments such as this, from Rick Anderson’s response: “Besides, the [research] skills we teach [college students] aren’t going to have much applicability in the real world, where they won’t have access to the library’s resources.” Anderson is at the University of Nevada-Reno, and he seems to be dismissing the possibility that public libraries have licensed databases and interlibrary loan facilities, or that graduates would have access to publicly funded academic institutions that provide in-house access to their resources.

Responses vary considerably in length and tone. It’s an interesting collection. Recommended.

Gall, James E., “Dispelling five myths about e-books,” Information Technology and Libraries 24:1 (March 2005): 25-31.

What’s the difference between an opinion piece on ebooks and a scholarly article in a refereed journal? In this case, 45 footnotes and the writer’s Assistant Professor status—and the fact that it was submitted to a scholarly journal. It’s worth reading, but I find it puzzling in some areas, specifically “myths” that I don’t think are widely believed.

“Myth 1—E-books represent a new idea that has failed.” Ebooks certainly aren’t new and “failed” oversimplifies the complex marketplace. I’ll argue that dedicated ebook appliances have demonstrated astonishing degrees of marketplace failure (although they’re not all that new), but that’s quite a different story.

“Myth 2—E-books are easily defined.” That’s why I published a nine-part breakdown in American Libraries (not cited) and others (such as Donald Hawkins) have published similarly complex views of the ebook market. Who says ebooks are easy to define?

“Myth 3—E-books and printed books are competing media.” That’s only a myth because the asserted competition failed so badly. The first paragraph under that head asserts a “protagonist/antagonist” stance for most articles that I haven’t found in most informed discussions of ebooks over the past few years.

“Myth 4—E-books are expensive.” Again, who’s promulgating this so-called myth? Gall goes on to cite the huge costs of handling printed forms as a cost of “the printed page,” which may be true but has nothing to do with either print books or ebooks.

“Myth 5—E-books are a passing fad.” An odd fad, since they’re only successful in a range of niche markets. That discussion ends with an astonishing statement: “In a few years, we may find that nontechnology-related endeavors are no longer represented in our information landscape.” Say what?

As one who’s been accused of raising straw men, I’m reluctant to make that accusation—but it’s hard to avoid in this case. I found portions of the article interesting, but had to struggle against the urge to write detailed rejoinders. For example, he warns libraries that “committing to a technology that concurrently requires consumer success can be problematic”—but in most cases, committing to technologies that lack consumer success is either fatal or irrelevant. All in all, an odd, interesting, and frustrating discussion.

Mann, Thomas, “Google Print vs. onsite collections,” American Libraries 36:7 (August 2005): 45-6.

There are, I believe, two separate (if intertwined) themes in this article—and one of them has received much more critical commentary than the other, perhaps unfortunately. The first theme is stated in the subhead: “Don’t send your paper copies off to remote storage just yet.” Mann points up this theme by recounting a comment during a meeting about Google Print: “[One librarian’s] supervisor…looked forward to having 15 million electronic books so he could send to remote storage every paper copy with an online equivalent. That struck me as unwise…” In this theme, in what I regard as a valuable message, Mann points out the value of a physical collection shelved in subject-classified order.

I think Mann is right on the money here, quite apart from the ludicrousness of planning for big moves to remote storage based on the eventual possibility that some day you’ll be able to get 15 million books on Google. You won’t: Whatever the outcome of Google’s legal conflicts with authors and publishers, most of those 15 million books will show up only as tiny extracts, since the majority of books in the “G5” (the five university libraries involved in Google Print) are still under copyright. As his detailed discussion makes clear, a good researcher can find things through browsing a classified collection that would be far more difficult, or even impossible, to find through full-text keyword searching of the same materials.

The other theme has to do with general problems in full-text searching of book collections, and particularly what Mann states as limitations in Google Print. A number of critics have assailed Mann for his assertions about what Google can and cannot do, noting that some of the things Google can’t do now might be feasible in a future version. Here’s one case:

Google’s software can only manipulate results within each keyword-defined set; it cannot build bridges among multiple sets using different words for the same idea, or covering different aspects of the same subject.

I agree that Mann may overstate the case against digital retrieval. For one thing, keyword searches aren’t necessarily the only thing Google will be able to do in the future (or even the only thing it can do now); for another, a variety of techniques can enhance keyword searching to provide some of the power Mann asks for.

But that theme is basically two paragraphs out of a two-page article. Trashing the entire article because of those two paragraphs is unfortunate. I believe the more important theme is sound and deserves more attention. I’m afraid there’s more than a little truth in Mann’s possibly overstated final sentence, not only for book collections but also for retrieval in general:

Our profession is in the grip of an uncritical infatuation with keyword searching as the sole avenue of access to book collections; if this is not corrected and counterbalanced, scholars throughout the nation may soon find that we librarians have traded our birthright for a mess of pottage.

Report of the NISO “blue ribbon” strategic planning panel, May 3, 2005, 25pp. ( members/secure/BRPrpt05.pdf)

The NISO Board formed a blue ribbon panel to consider the future of NISO. This report is the outcome. If you care about technical standards in libraries and publishing, it’s worth reading—and hard to summarize, as it’s a thoughtfully written 25 pages. From the summary:

We believe that NISO must take three sets of actions, in this order:

1. Define the NISO constituency for the future and articulate the way that NISO will relate to that constituency.

2. Develop a well-synthesized framework that looks at the needs and priorities of that constituency, the technical standards landscape relevant to that constituency, and the ecology of other standards-related organizations relevant to that constituency. From this will follow a roadmap and priorities for standards development and for partnerships, collaborations, and other relationships with other players.

3. Deal with resource and funding constraints and needs.

NISO is unusual as ANSI-accredited standards developers go. Many (most) of its members are not major corporations. NISO standards and drafts are open standards: Not only are they developed in an appropriately open environment (while meeting ANSI requirements), the standards themselves are freely available as PDF downloads. That’s highly unusual for accredited standards agencies.

The report makes some tough calls and recommendations. Highly recommended if you care about the topic—and maybe you should.

