Cites & Insights: Crawford at Large
ISSN 1534-0937
Libraries · Policy · Technology · Media

Selection from Cites & Insights 6, Number 4: March 2006

The Library Stuff

The Straight Dope has done at least two remarkably good essays related to libraries: One on the history of public libraries in America and one (January 31, 2006) on the Dewey Decimal System. This is seriously good stuff at a popular but not dumbed-down level: the DDC essay prints as nine single-spaced pages.

Every article cited here comes with at least some degree of personal recommendation, even if it lacks a boldface note to that effect.

Bergstrom, Theodore and R. Preston McAfee, “End free ride for costly journals,” Library Journal (December 15, 2005).

Here’s a radical proposal: Recognize that for-profit journal publishers have disrupted the “symbiotic relationship” between scholars and scholarly publishers—and act accordingly. “Large for-profit publishers are gouging the academic community for as much as the market will bear.”

How to react? Universities should charge—for journal editing as a first step, although one might also suggest charging for refereeing (the article doesn’t go that far). The authors suggest that universities assess overhead charges for support services of editors for journals with library subscription prices higher than a certain threshold of price-per-article or price-per-citation. They’ve created a web site providing such information for 5,000 journals and offering summary tables.

Block, Marylaine, “Information literacy: Food for thought,” Ex Libris 271 (January 13, 2006).

Short and to the point, this is a “few leading questions to ask at the start of information literacy sessions that might force students to examine their assumptions.” For example, why is stuff on the web free? Given a set of items, what would you expect to find for free on the web—and what would you not expect to find? The last three questions push students toward the library’s licensed and offline resources. It’s a fine list, well worth reading and using.

Brown, Myra Michele, “Video libraries: More than a lure,” American Libraries 36:11 (December 2005): 41.

It’s just a one-page “On my mind”—an op-ed of sorts. Brown discusses her experience as video librarian at Texas Tech University, starting with “comments I heard questioning the academic authenticity of the video library.” She discusses the power of video and specifically TTU’s global collection as made available in film series on campus.

What doesn’t surprise me: That film and video serve academic purposes. What does surprise me: That this op-ed can be anything more than an example of good library programming in 2006. But it clearly is. I don’t doubt Brown’s finding that many academic (and public) librarians regard film and video as inferior to books. Heck, I regard video as inferior to books for book-length stories expressed as text, but far superior for other purposes, including different kinds of stories.

The pull quote: “It is counterproductive to stigmatize one format while deifying print.” I’m a “print person” by many measures, and I agree.

Bucknall, Tim, “Getting more from your electronic collections through studies of user behavior,” Against the Grain 17:5 (November 2005): 1, 18, 20.

“Libraries spend a lot of money on electronic resources and understandably want to get the best possible return—in other words, the most usage—on that investment.” This article discusses attempts at the University of North Carolina at Greensboro to measure the effects of expanded title- and article-level access (via OpenURL) on database usage. It’s an interesting article with results that might be predictable (or might not) but haven’t been measured and reported that often. For 10 of 12 full-text databases studied, there were more uses by way of a journal or article link than through the library’s database list. Deep linking has increased use, and OpenURL-compliant databases show greater usage increases than other databases.

There’s more to the article, including work on matching system behavior to patron behavior by studying failed searches. Well worth reading.

Carver, Blake, “10 ways to make the internet a better place,” LISNews, September 19, 2005.

“99% of the stuff on the Internet is trash. Don’t help make it 100%. Here’s some good ways to make the web a better place for all of us.” Carver offers a sprightly list of 10 topics with a brief expansion on each one: Contribute, maintain, promote web standards, mentor, support, promote civility, be a good neighbor, write right, “add to this list,” and unplug.

This is four pages with loads of white space; you can read it in two or three minutes, on paper or at LISNews. While the advice that “the Internet is permanent” is true enough as a warning, it’s unfortunately not true as a general precept: sites move, links break. The best advice regarding internet permanency: the internet is permanent when you screw up—but may be evanescent when you exhibit touches of genius.

That aside, this is good stuff, the last item being one I’ve been pushing for years: “Unplug” from time to time. Turn it all off. No cell phone. No pager. No PDA. Certainly don’t fill your ears with MP3. Go get a touch of nature, even if that means the sounds of the city. (Any other old KSFO fans out there hearing that magnificent custom-written theme song at this point?)

Eversberg, B., “On the theory of library catalogs and search engines.” tlcse.htm. (Version downloaded revised July 6, 2005.)

This paper bills itself as a supplement to a talk on “Principles and goals of cataloging” at the 2002 German Librarians’ Annual Conference. There’s also a German version—and if this is a translation, it’s an excellent one. Even though it’s four years old, the general comments and specific comparisons between catalogs and web search engines still largely ring true.

The starting sentence, labeled “a banal sentence”: “Nothing is more practical than a good theory.” I thought that was a good start—and the discussion that follows lives up to the start. But then, I would: One key sentence in “contents of libraries and Internet” is: “No one single method can serve all purposes and all searchers all the time—everybody will know this who has tried to find anything on more than one occasion.” Too many of today’s librarians seem ready to buy into the notion that the web search approach is the only kind of searching anyone needs, ever; that’s simply not true.

Eversberg posits a sensible distinction: Catalogs stink at factual searching, but they’re superior to web search engines for at least two of three broad categories of document searching: known item and collocation searching. (“Collocation searching”—whether for all items by an author or for other groups that collocate using cataloging tools—is overlooked by some, who seem to think that catalogs are only good for known-item searching.) Subject searching may be the toughest aspect of catalog searching: “‘What is this book about’ is a question that very often cannot be answered with a brief list of terms.” On the other hand, as Thomas Mann points out, catalogs with browsable subject headings provide powerful tools for finding related items once one subject is known.

The three-page tabular comparison of catalogs and search engines is first rate; commentary would be almost as long as the table itself. Take a look.

Mann, Thomas, “Research at risk,” Library Journal (July 15, 2005).

Mann sees library managers arguing that the profession should “capitulate” to searchers’ tendency to use keywords and nothing but. “In their view, we should abandon [LCSH] in our OPACs and scan in the table of contents of each book—or wait for Google Print to digitize ‘everything.’” Mann argues that no addition of keywords will be as effective for some purposes as efficient research using, among other tools, browsable subject headings.

Mann teaches research orientation classes and finds that students “are hungry to know how to do research more efficiently.” He shows one example of what happens without LCSH (an example that also shows the difficulty of LCSH): a researcher looking for linguistic studies of Cockney, who typed in “Cockney” as a keyword. That yields some juvenile fiction and other stuff, but not most of the linguistic studies, which can be rounded up under “English Language—Dialects—England—London.”

Just for interest, I tried this on RedLightGreen and on the RLG Union Catalog using Eureka. With RedLightGreen, “Cockney” yields 230 hits—and in the “refine by” sidebar, the first subject listing is the proper subject heading, which yields 19 hits. Via Eureka, “Cockney” yields 313 hits [note that RedLightGreen “frbrizes” material but is also less current than the RLG Union Catalog]; the second one shows the proper subject heading, which itself yields 104 titles from the kind of browse display Mann calls for, including lots of multiple editions.

Mann notes that scholars want and need comprehensive searches, not a strong suit of web search engines—and that web searches “fail miserably at keeping relevant uses of a term separate from irrelevant ones.”

His second example begins with Google and moves to a library catalog, assuming that a student wants to research Millard Fillmore’s foreign policy. “President Fillmore Foreign Policy” brings back a paragraph from MSN Encarta, a link to a term paper mill, a fifth-grader’s paper, and a brief speech from the Britannica, among others. By contrast, if you knew enough to only enter Fillmore’s name as a subject in a catalog with subject browsing (and, by the way, knew to invert the name), you might realize that choosing the “bibliography” heading would lead you to a book with loads of sources on Fillmore’s foreign policy.

Examples age. Right now (or, rather, as I wrote this on February 8, 2006), Encarta still comes up first, now followed by an odd “sciforum” posting. Third is a PDF copy of a speech by…well, you guessed it: Thomas Mann on “The future of cataloging” (the article I’m annotating now turns up sixth, with a posted “Japanese reply to Pres. Fillmore’s letter” and a link to a scholarly article behind a fee wall in between). While the term paper mill and grade-school article are further down, it’s still true that you wouldn’t find much about Fillmore’s foreign policy in the top results on Google. It’s just not the right tool for the job.

Examples also get more difficult. How would Google Book Search do with “Fillmore foreign policy”? Pretty well—but it still wouldn’t yield the results Mann describes. We need more than one tool.

“Rethinking how we provide bibliographic services for the University of California.” December 2005.

I haven’t read the full report from UC Libraries’ Bibliographic Services Task Force, but the four-page executive summary is dynamite. I won’t attempt to summarize what’s already a summary, and I might question a few of the details in this set of recommendations (some of which are for further consideration rather than flat recommendations), but it’s a remarkable starting point. I’m pleased that “abandoning the use of controlled vocabularies for topical subjects in bibliographic records” is a “consider” rather than a recommendation; for reasons such as those given by Thomas Mann, I think it’s a bad idea, even as badly as many subject searches work. On the whole, all I can say is go read this. It shouldn’t be hard to find.

Tennant, Roy, “Is metasearching dead?” Library Journal (July 15, 2005).

Tennant loves controversy, but this time he poses a question rather than overstating a case. He wonders whether Google Scholar could replace the need for library-based metasearch services, as some of his colleagues believe. He doesn’t, “no matter how good Scholar gets (and it will get better).” Why? Partly because “what you don’t search can be as important as what you do”—very broad databases tend to flood searches with inherently-irrelevant results, while good metasearch interfaces can be designed for specific audiences or purposes.

Ten Years of D-Lib Magazine

I’ve been a bit skeptical at times of “digital libraries” in general and the Digital Library Initiative(s) in particular. Early digital library work seemed to place all the emphasis on digital, with “library” being an afterthought—and maybe that’s not surprising for work originally funded by DARPA and NSF. I still get the sense that it’s mostly about being digital, with issues of librarianship a distant second, even when I attended one Digital Library Forum.

On the other hand, D-Lib Magazine has been a first-rate publication at least as long as I’ve been aware of it. The magazine began in 1995 and published a special ten-year anniversary issue in July/August 2005. What follows are brief annotations on some (not all) of the items in that special issue. My congratulations to the editors and authors for a decade of work that’s not only important but also interesting and readable. I don’t provide specific URLs; you can get to these and other pieces from the issue’s contents page. I’ve kept these remarks brief; all the articles are recommended.

Kahn, Robert E., “Ten years of D-Lib Magazine and counting.”

Kahn, president & CEO of the Corporation for National Research Initiatives (CNRI, home to D-Lib), kicks things off with this brief editorial. “The magazine has proven to be an important source of timely and relevant information about digital libraries in particular and, more generally, of information production, consumption and management.”

While D-Lib has proven its worth as a magazine, that doesn’t pay the bills; apparently that’s a problem:

Producing a high quality magazine on the net each month turned out to be somewhat less difficult than I would have expected, due almost entirely to the quality of the editorial staff and the willingness of the readership to contribute interesting articles. Funding the continued production of the magazine has been, perhaps, its biggest challenge…

Staffing is the equivalent of “a little over one full-time person.” Funding possibilities have been constrained by the basic (and, I believe, correct) decision to make D-Lib free and available without registration, and not to charge authors for publication. Advertising is one possibility, but the magazine is so widely mirrored that demonstrating readership may be difficult. Kahn estimates that ads might cover one-third to half the current costs. There’s always foundation funding—but they haven’t found that yet. (If you have great ideas, send them to

Thus, despite the strong past of D-Lib, its future isn’t quite assured:

I cannot say with any degree of certainty how this will all work out over the coming months and years. However, the need for the kind of information dissemination mechanism that D-Lib Magazine has shown for quality information covering electronic aspects of libraries, publishing, and information creation, dissemination and management will only increase as the technology for access and dissemination of digital information continues to evolve.

Wilson, Bonita, and Allison L. Powell, “A tenth anniversary for D-Lib Magazine.”

This article discusses the makeup of the special issue and some aspects of D-Lib’s first decade. It’s planned for 11 issues a year, around the 15th of the month (except August); 548 full-length features appeared in the first 111 issues, with 538 brief items in “In Brief” (beginning September 1999)—the only place I’ve appeared in D-Lib. Seventy-two “exemplary digital collections” have been featured since 1999.

There’s a good explanation of why D-Lib is not a refereed journal. The founders opted for “quick turnaround from submission to publication over peer review…” Despite its less formal status, D-Lib articles have been cited frequently, an average of nearly 118 citations per year.

I wonder about this statement: “Another indicator that authors appreciate the model we use is that 66 percent of the central authors in the field of Digital Libraries have either published in or cited an article from D-Lib Magazine.” That statement leads to an article note leading to a paper that can be read as identifying “central and/or frequently published authors” from digital library conferences—but these are entirely computer/engineering digital library conferences.

One slightly odd aspect of D-Lib is reader feedback—or, rather, the lack of reader feedback. For the first four years, the magazine used HyperNews to facilitate reader responses—but readers didn’t respond. Since then, the magazine has accepted letters to the editor, but “letters received have been few and far between.” This seems unfortunate. Perhaps, even though it’s a magazine, D-Lib has enough of a journal’s formality to discourage most reader feedback.

A table of journals or conferences that cite D-Lib articles does tend to support my sense that “digital libraries” are still more digital than library: the names are uniformly within technology or, at best, information processing/information science.

Friedlander, Amy, “Really 10 years old?”

Friedlander was the first editor of D-Lib, working with Bill Arms. She recounts production of the first issue and where things went from there. She explicitly thought and thinks of D-Lib as a magazine, not a journal. “[W]e were freed from the canons of peer review to engage in speculation that might eventually feed into the formal process of juried results.”

Friedlander “didn’t know squat about editing a magazine when we started D-Lib” but learned fast—by looking around at “publications I admired” and reading a couple of books about editing. She clearly saw articles as stories, an excellent starting point. She aimed for a combination of substantial research reporting and good writing—and good writing has been a hallmark of most D-Lib articles ever since.

Larsen, Ronald L., “Whence leadership?”

This brief article addresses the nature of leadership within DLI and issues a call of sorts for readers to serve as leaders, including temporary leadership posts for DLI and whatever replaces it. Larsen includes an interesting comment about the magazine as compared to the Digital Library Forum: “The community didn’t need the Forum as much as it needed the magazine.”

Lynch, Clifford, “Where do we go from here? The next decade for digital libraries.”

The field of digital libraries has always been poorly-defined, a “discipline” of amorphous borders and crossroads, but also of atavistic resonance and unreasonable inspiration. “Digital libraries”: this oxymoronic phrase has attracted dreamers and engineers, visionaries and entrepreneurs, a diversity of social scientists, lawyers, scientists and technicians. And even, ironically, librarians – though some would argue that digital libraries have very little to do with libraries as institutions or the practice of librarianship. Others would argue that the issue of the future of libraries as social, cultural and community institutions, along with related questions about the character and treatment of what we have come to call “intellectual property” in our society, form perhaps the most central of the core questions within the discipline of digital libraries – and that these questions are too important to be left to librarians, who should be seen as nothing more than one group among a broad array of stakeholders.

That last sentence is challenging (perhaps less so if you believe that physical libraries have a bright future as social, cultural and community institutions)—and, after all, you should expect to be challenged by a Lynch article.

Summarizing seven pages of Lynch’s idea-rich prose is beyond my talents. He believes that we won’t see much more governmental funding of digital libraries research and that digital libraries offer a “relatively mature set of tools, engineering approaches, and technologies.” He knows “digital preservation is going to be an enormous issue” for many parties and notes a broader set of stewardship issues. Lynch notes four areas for future research he finds “particularly compelling”: personal information management, long term relationships between humans and information collections and systems, the role of digital libraries (and related services) in supporting teaching, learning, and human development, and active environments for computer supported collaborative work.

Paepcke, Andreas, Hector Garcia-Molina and Rebecca Wesley, “Dewey meets Turing: Librarians, computer scientists, and the Digital Libraries Initiative.”

This article is unusual: It expressly argues that DLI did unite librarians and computer scientists. (It also asserts that Google emerged from DLI-funded work.) I admit that I haven’t followed DLI and DLI-2 carefully since the beginning, but this casual history doesn’t ring true with my own memory. The authors seem to say that the web disrupted the “reasonably comfortable nest for the emerging union between the two disciplines.” They go on to say that librarians perceive that “computer scientists have hijacked the [DLI] money and created an environment whose connection to librarianship is unclear,” while computer scientists don’t understand why librarians “couldn’t be, well, normal computer scientists.”

It’s an odd article. What are we to make of the statement that “the notion of collections is spontaneously re-emerging” when, in the real world of librarianship, the notion never departed? I suggest reading it (and, of course, all the other articles) yourself; those of you with more background in DLI may see a truth that I’m missing.

Weibel, Stuart L., “Border crossing: Reflections on a decade of metadata consensus building.”

Weibel recently left the Dublin Core Metadata Initiative management team—and DC is also celebrating its tenth anniversary. He reflects on “some of the achievements and lessons of that decade.” It’s a fascinating story (and I don’t doubt Weibel’s recollections for a moment). He raises a number of useful issues and points out what many of us know and believe: Most authors will not spend the time to create their own metadata (other than an article title). “Creating good quality metadata is challenging, and users are unlikely to have the knowledge or patience to do it very well, let alone fit it into an appropriate context with related resources.” Then Weibel agrees with Erik Duval’s statement, “Librarians don’t scale.” That may be true—but librarians (and their indexer colleagues) have, to date, created a whole bunch more and better metadata than anyone else, on the order of tens or hundreds of millions of records.

He notes the naïveté of the assumption that “metadata would be the primary key to discovery on the Web,” then goes on to discuss the question this leaves: “What is metadata for?” It’s a good discussion. Weibel seems to think we’ll get the “answer” to the question of whether full-text indexing of books and the like is “better” for retrieval than high-quality metadata (cataloging). I’m not sure that’s true, unless the answer is “It depends.”

There’s a lot here that I haven’t touched on, and it’s well worth reading.

Ten Years of Ariadne

A little more recent (the tenth anniversary issue is either January or February 2006, depending where you look) and a lot less frequent (quarterly; the special issue is #46), but then the UK is also a little smaller than the U.S. Ariadne isn’t a precise British equivalent to D-Lib Magazine but it’s close: The primary focus is digital libraries within the UK.

Lorcan Dempsey was co-director of the initial publication and notes in his section of the editorial introducing the decennial issue:

Ariadne first appeared in Web and print formats. A high-quality magazine-style publication appeared on people’s desks, and an extended version appeared on the Web. It met two needs: it provided both a general update on the progress of the eLib programme and related national information services, and a forum for reflection and discussion about changing times. It was an important community-building tool.

After a few years, funding problems eliminated the print publication. The web version remains. It’s strongly British and at times seems to assume an existing knowledge of the many programs under the UKOLN and JISC umbrellas. As with D-Lib, it’s generally worth reading. The founders lack false humility: Here’s how the other original co-director, John MacColl, puts it in his portion of the shared editorial: “Ariadne is ten years old, and she is still the best guide I know to what is going on in the digital library world.” Not, apparently, just in the UK digital library world, but in the entire digital library world.

I have just a few comments on four of the seven main articles in the special issue—which is not at all to say the others aren’t worth reading. You’ll find the issue at

Dempsey, Lorcan, “The (digital) library environment: Ten years after.”

Dempsey may be at OCLC now but was a figure in the UK digital library field in earlier years. As with Clifford Lynch, Lorcan Dempsey is a “thick” writer—he packs a lot of thought and ideas into every page and needs to be read with care to get the most benefit from his thinking. I respect that (and envy the thinking that enables such writing), but it makes it hard to comment on his articles—particularly long ones. This one’s 19 pages of small type (including two pages of endnotes), nearly 13,000 words in all.

What happened clearly in the mid-nineties was the convergence of the Web with more pervasive network connectivity, and this made our sense of the network as a shared space for research and learning, work and play, a more real and apparently achievable goal. What also emerged—at least in the library and research domains—was a sense that it was also a propitious time for digital libraries to move from niche to central role as part of the information infrastructure of this new shared space.

However, the story did not quite develop this way. We have built digital libraries and distributed information systems, but they are not necessarily central.

He considers the environment leading up to that convergence but spends most of the article “thinking about where we are today, and saying something about libraries, digital libraries and related issues in the context of current changes.” Dempsey sees another convergence today: “the convergence of communicating applications and more pervasive, broadband connectivity.” Note that Dempsey doesn’t say “ubiquitous computing”—he sticks with the more certain “more pervasive, broadband connectivity.”

He notes five general conclusions from his participation in and observation of the foundational UK digital library programs (excerpted):

1. They were major learning experiences for participants, many of whom have gone on to do important work in the library community.

2. They showed that the development of new services depends on organisational and business changes that it is difficult for project-based programmes to bring about. This is a continuing issue.

3. Many of the assumptions about technical architecture, service models and user behaviours in emerging digital library developments were formed pre-Web….

4. Their impact has been diffuse and indirect, and is difficult to assess. Compared to the overall number of deliverables, there is a small number of ongoing products or services which are direct project outcomes…

5. How does one transfer innovation into routine service? Certainly, alongside the project work there was service innovation in the JISC environment, but it flowed from central planning.

As promised, Dempsey spends much more time on the new environment and the challenge for libraries of “working in a flat world.” Despite my red underlines on the article, I find it impossible to provide useful summaries of that rich, complex discussion. The red marks weren’t points of disagreement; they were areas I wanted to point out as deserving special attention—and there are too many to jot down here.

It’s quite an article. I’d almost consider it required reading if you’re interested in the past and future of digital libraries.

Lynch, Clifford, “Research libraries engage the digital world: A US-UK comparative examination of recent history and future prospects.”

Speaking of thick writers… The title’s longer but the article is much shorter, a little over five pages (plus endnotes). Lynch writes in terms of transformations—the transformation of scholarship and teaching and the transformation of the library and the invention of digital libraries. He notes that UK work on digital libraries has a “much greater emphasis on the deployment of a national system of library services” than in the U.S.—and that libraries were “typically only peripherally involved” in the NSF-funded U.S. digital libraries program. Indeed, the short-term experimental nature of the projects did not fit well with the working models of U.S. research libraries.

But digital libraries were fashionable, they were well-funded, they generated great interest during the great ‘dot-com’ bubble, and they were frankly sometimes threatening (and sometimes deliberately used as a way of threatening) research libraries in the US—if these libraries were not on the road to becoming digital libraries, they were backwaters, obsolete, ‘book museums’; they were in danger of being supplanted or overtaken by commercial competitors. Much of this was, to be blunt, complete rubbish, at least in the near term, but the development of these information management and retrieval systems that were called ‘digital libraries’ and the confusion between these and what actual libraries as organisations do, and the systems that they might use to accomplish those missions, gave rise to a major problem in public perception.

Lynch notes the “great obsession” toward the end of the century within library and higher education communities to define digital libraries—and that, as funding for the prototype projects dried up, the discussion has become more constructive: “How research libraries could more effectively support teaching, learning and scholarship in a changing environment.”

That’s just a bit of another rich, dense article. He anticipates major changes in the way scholars use libraries and their resources. He notes that it was a mistake to think first about how libraries should change—rather than seeing how the library’s users and their needs were changing, and how the library could meet those needs.

“So what has happened to the digital library? At least as I define digital libraries, what happened was that we realised that they are just tools, a bundle of technologies and engineering techniques—that find applications in a surprisingly wide range of settings beyond higher education and research.”

Lynch includes speculations about public libraries and their future, and there I might take issue with him—but that paragraph is an admitted digression.

We are in the middle of a very large-scale shift. The nature of that shift is that we are at last building a real linkage between research libraries and the new processes of scholarly communication and scholarly practice, as opposed to just repackaging existing products and services of the traditional scholarly publishing system and the historic research library. In this shift we have left the debate about digital libraries behind, recognising this now as simply shorthand for just one set of technologies and systems among many that are likely to be important.

Well worth reading, a comment that usually applies to Clifford Lynch’s writing (as to Lorcan Dempsey’s).

MacColl, John, “Google challenges for academic libraries.”

“How should we understand Google? Libraries still feel like the batsman at whom something has been bowled which looks familiar, but then turns out to be a nasty threat.” MacColl offers a sprightly, relatively brief (4.5 pages plus notes) consideration of how libraries do and perhaps should feel about Google’s various initiatives. MacColl gets at least one minor thing wrong (he says the Google Library Program “has stalled while the [AAUP] law suit is pending” with the libraries not giving Google any in-copyright books, and that’s simply not true in 2006), and he says flatly that full-text book indexing offers power “much greater than that of indexes we are used to,” where I’d suggest full-text retrieval is different (better in some ways, worse in others).

All in all, I strongly recommend the article. MacColl notes that librarians are bothered by the opacity of Google Scholar and that “new technologies do not change principles.” He concludes:

As librarians, running pleasant study environments, containing expert staff, providing havens on our campus which are well respected, and building and running high-quality Web-based services, we will decide which of Google’s offerings we wish to promote, and which we are prepared to pay for. And we will stand up—no matter how wealthy we assume our students and academic users to be—for the principle of free and equal access to content, and for the principle of high-quality index provision, whether free or at a cost, because without those principles we are no longer running libraries.

Rusbridge, Chris, “Excuse me…Some digital preservation fallacies?”

This one’s just plain fascinating. Rusbridge was director of the eLib program when Ariadne began and now works for the Digital Curation Centre at the University of Edinburgh. He offers this set of six common assertions or assumptions about digital preservation:

1. Digital preservation is very expensive [because]

2. File formats become obsolete very rapidly [which means that]

3. Interventions must occur frequently, ensuring that continuing costs remain high.

4. Digital preservation repositories should have very long timescale aspirations,

5. ‘Internet-age’ expectations are such that the preserved object must be easily and instantly accessible in the format de jour, and

6. the preserved object must be faithful to the original in all respects.

He then proceeds to argue the case for each of the six being a fallacy—noting that in some cases he’s carrying on “an argument with myself!” The discussions are engaging and clear. #2 is particularly interesting: Rusbridge is almost certainly correct in stating that commercial file formats (those used in consumer-oriented software) become inaccessible far more slowly than we might have expected. The copy of Microsoft Word I’m writing this on will write to many different formats including Word all the way back to 2.0 (how long ago was that?), and translation software will handle all but the most obscure commercial formats. His argument is much more subtle than this brief description might indicate; he’s saying it takes a long time to totally lose information content (created using consumer software), although partial loss may be more rapid.

This paper is about as far from a doctrinaire list as you can get and is, to be sure, well worth reading. Rusbridge really is trying different notions on for size. I find myself agreeing more often than disagreeing. Here’s how Rusbridge restates the six “possible fallacies,” toward the conclusion that “lack of money is perhaps the biggest obstacle to effective digital preservation. Assumptions that make digital preservation more expensive reduce the likelihood of it happening at all”:

1. Digital preservation is comparatively inexpensive, compared to preservation in the print world,

2. File formats become obsolete rather more slowly than we thought

3. Interventions can occur rather infrequently, ensuring that continuing costs remain containable.

4. Digital preservation repositories should have timescale aspirations adjusted to their funding and business case, but should be prepared for their succession,

5. “Internet-age” expectations cannot be met by most digital repositories; and,

6. Only desiccated versions of the preserved object need be easily and instantly accessible in the format de jour, although the original bit-stream and good preservation metadata or documentation should be available for those who wish to invest in extracting extra information or capability.

Cites & Insights: Crawford at Large, Volume 6, Number 4, Whole Issue 74, ISSN 1534-0937, a journal of libraries, policy, technology and media, is written and produced by Walt Crawford, a senior analyst at RLG.

Cites & Insights is sponsored by YBP Library Services.

Hosting provided by Boise State University Libraries.

Opinions herein may not represent those of RLG, YBP Library Services, or Boise State University Libraries.

Comments should be sent to Comments specifically intended for publication should go to

Cites & Insights: Crawford at Large is copyright © 2006 by Walt Crawford: Some rights reserved.

All original material in this work is licensed under the Creative Commons Attribution-NonCommercial License. To view a copy of this license, visit or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.