Cites & Insights: Crawford at Large
ISSN 1534-0937
Libraries · Policy · Technology · Media

Selection from Cites & Insights 5, Number 6: April 2005

disContent Perspective


Here’s a quote from an April 2, 2002 AP story as carried on Yahoo! News: “Ryan has not bee with wrongdoing.” Here’s another, later in the same story: “Falwell and others were behind the creation of false documents in the secretary of state’s offic justify pay raises…”

I’m not picking on the Associated Press. Read on screen, those two sentences make perfectly good sense—but in Yahoo!’s “printer-friendly” format, something’s missing. More than you see here, actually: on the printed page, half of the second “e” in “bee” and half of the “c” in “office” are missing.

By the time this column appears, Yahoo! may have discovered that its “printer-friendly” format is oxymoronic—but that will leave too many other Web sites that get in the way of the most natural thing a reader does with good Web content: print it.

Some disContent columns cover oddities, exaggerate to make a point, or deal with issues I find amusing. This is a case where I find the behavior of content providers self-defeating in incomprehensible ways. I’m bemused by the ways professionally-designed Web sites get in the way of printing content so that it can be read offline. Bemused isn’t exactly the right word. As a user, frustrated, annoyed, even “mad as heck” all come to mind.

Why Printing Matters

I can think of three reasons why anyone would want to print Web content:

Ø    They want to read the content and it’s more than a few paragraphs long. The most optimistic claims I’ve seen are that people won’t read anything longer than 500 words online. As one content-oriented designer (at NUblog) puts it, “the only people who don’t print Web sites are those without printers.”

Ø    What you say is worth repeating. People want to save it to cite elsewhere.

Ø    What you say is valuable—interesting or lasting enough that people want to save it for future reference or rereading.

If your content is short and worthless, you can skip the rest of this column: you don’t need to worry about printing.

Worst-Case Scenarios

Yahoo! represents the worst case: offering a printer-friendly option that makes the text unreadable and ultimately ruins the content. I’ve run into a surprising number of other content-oriented sites that are nearly as bad, obstructing effective printing in one or more of these ways:

Ø    Running off the edge and having no printer-friendly option. That happens at Holt Uncensored, a text-oriented site about independent bookstores, publishing, and related topics. As with most such sites, it’s also hard to read onscreen unless you have a high-resolution display and turn off left-hand control panels. Just today, working on a copyright cluster for Cites & Insights, I wound up at digital media association, and there goes the text, right off the edge of the page (far enough that I can’t make sense of the printed results).

Ø    Dark backgrounds on printed versions, which not only waste toner or ink but also make reading difficult. More than one book-related site falls into this trap, even though the proprietors, of all people, should know better. The sites are hard to read on screen as well, so maybe these writers just don’t really want to be read.

Ø    Light text and forced-small text that prints out that way. I know designers don’t like normal-sized text to mess up the site’s look, but does that justify expecting readers to read text that prints at less than nine points?

When I encounter these problems, I can be charitable and assume that nobody at the site has ever tried printing any of it out. Or I can be less charitable (and more like the average browser) and assume that the site proprietors just don’t care about readers.

Major Annoyances and Minor Peculiarities

A professional society puts the entire text of a book on the Web, one chapter per file, ready for printing—but there’s a wide black bar down the left side of every printed page and the pages are in sans, even though serif text is known to be much more readable in print form.

“Printer-friendly” versions show up with huge color ads inserted in the midst of the text, just in case we didn’t see the ad on the screen. There goes half a buck worth of ink if we’re using inkjet printers, which will certainly inspire me to purchase said product.

Some multipage articles offer printer formats, but only if you request print for each online page, one at a time—even though the articles are ten pages or longer and require full reading to make sense.

Hello, Jakob Nielsen, usability guru: In your infinite wisdom, you must think we’re all children. Your Alertbox forces us to read oversize sans type, and we get the same ugly, paper-wasting type on the printed page. (Just one more reason not to be too concerned about your “authoritative announcements.”)

A beautifully-designed combination Webzine and blog prints out feature articles in clean, standard-size, justified serif type—but three-color (or gray and black on a laser printer) stripes across the top of each page obscure portions of the text.

Your site offers printer-friendly versions, and they work—but the resulting pages don’t show where the copy came from. The URL is obscure (because it’s a printable version) and you’ve left off clear source identification. That was a wonderful article I just reviewed a week later; too bad I’m not quite sure where it came from or when it originally appeared.

I could go on, and likely so could many readers. We all make little mistakes, but some of these also strike me as cases where the “content people” have never actually used the site or at least never actually printed from it.

Getting It Better; Getting It Right

At one point, the Journal of Electronic Publishing yielded printouts of its lengthy articles with light, hard-to-read text and some of the other problems noted above. There’s still a narrow gray bar down the left margin, but these days the printed articles are otherwise clean, clear, and carefully identified. A fair number of other sites have also cleaned up their acts. Of the content-heavy sites I visit frequently, more do it well than do it badly—which makes the mistakes stand out all the more.

The NUblog article mentioned earlier offers a few hints for good printable pages; it’s easy enough to find good advice elsewhere. I wouldn’t trust advice from sites that don’t yield good printable pages, but the fundamentals seem clear enough. Make the text (or printer-friendly version) monochrome (unless the color serves a specific purpose). Leave out the ads: we’ve seen them. Label pages clearly, with who you are, when the content appeared, and the original URL. Let the browser do text flow: turn off the special features that force long text lines.

For goodness’ sake, let body text be “normal” or “medium” size. And why not let the user’s preferred typeface prevail for printed versions? If the user hasn’t made a choice, the default’s probably Times New Roman, which works very well on the printed page. And if the user has made a choice of a font he or she finds highly readable (and that is the point, isn’t it?), he or she will appreciate having that choice honored.

Most of this boils down to “strip out the funny stuff.” There are few things simpler in HTML than creating a printable page. Why make it so difficult to use the content that everyone is so anxious to get up on the Web and in front of as many eyeballs as possible?
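To make "strip out the funny stuff" concrete, here is a minimal sketch in Python of a page builder that follows the fundamentals listed above: no colors, no ads, no forced fonts or line widths, and a clear source label. The function name, its parameters, and the sample values are all hypothetical, chosen only for illustration; nothing here describes how any particular site or blogging program works.

```python
from html import escape


def printable_page(title, site, published, url, paragraphs):
    """Return a bare-bones HTML page meant for printing.

    All names here are illustrative assumptions, not taken from any real site.
    """
    # Source label: who you are, when the content appeared, the original URL.
    source = f"{site}, {published}. Originally at {url}"
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    # No stylesheet, no typeface or size settings, no fixed widths, no ads:
    # the browser flows the text and applies the reader's own defaults.
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        f"<h1>{escape(title)}</h1>\n"
        f"<p>{escape(source)}</p>\n"
        f"{body}\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Hypothetical sample content, for demonstration only.
    print(printable_page(
        "Printing Matters",
        "Example Site",
        "July 2002",
        "http://example.com/article",
        ["First paragraph of the article.", "Second paragraph."],
    ))
```

Because the page carries no styling at all, the reader's preferred typeface and text size prevail, and the source label travels with every printout.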

This disContent column appeared in EContent 25:7 (July 2002), pp. 40-41—exactly as it appears here, including significant editorial improvements by Michelle Manafy.

Blog Printability: Bringing the Story Forward

Readers with long memories may note that the most recent disContent reprint-with-postscript (in Cites & Insights 4:14) was the February 2002 column—and that I’ve normally included these columns in chronological order, skipping those that don’t work well within Cites & Insights. I jumped to July 2002 because the printability problem hasn’t gone away—and it may be getting worse in the blog world. Why does that matter? To quote and expand:

I can think of [several] reasons why anyone would want to print [Weblog] content:

Ø    They want to read the content and it’s more than a few paragraphs long [possibly including comments on your entry].

Ø    What you say is worth repeating. People want to save it to cite elsewhere.

Ø    What you say is valuable—interesting or lasting enough that people want to save it for future reference or rereading.

Ø    They’ve been away from the blog for a while and would just as soon catch up in print form, reading a paper copy of recent entries.

Several of the weblogs I monitor do include essays more than a screen long. Some weblogs draw enough (and interesting enough) comments to make the entry plus comments worth printing, possibly running to several pages. And for secondary sources like me (as opposed to other blogs), printing and saving is the only way blog material will be mentioned.

The Problems and the Triumphs

Problems with printing at text-oriented websites (that is, articles, papers, arguments, etc.) include the ones mentioned in the column. I haven’t seen those problems that often in weblogs when prepared for printing or in text sites that use weblog-like tools for content management. I have seen a group of other problems, specifically the following:

Ø    Blogger’s moving strip: This phenomenon prints the body of a weblog in a narrow strip down the center of the page (sometimes with blog overhead such as archives and blogrolls on either side). Blogger’s special trick: the strip starts moving to the right on each successive page, until part or all of the copy simply disappears off the right-hand edge of the paper. I’ve seen that happen as early as the third page, as late as the seventh page. It doesn’t happen every time.

Ø    One-page wonders: Weblogs that stop after one page of the blog text. Period. Marking the text to “print selection” won’t help. The only way I’ve found to print longer entries from these weblogs is to email the entries or to copy all the text into some program that knows how to print, such as Word. Most one-page wonders use Movable Type, but I think I’ve seen one Blogger one-page wonder as well, and one or two where I wasn’t sure of the software.

Ø    The banner stands alone: Weblogs that print the blog’s banner or heading on a page that’s otherwise blank—and then print one page of the blog’s body and stop. In some cases, I’ve even seen a blank first page followed by a one-page wonder. This phenomenon seems to be a TypePad specialty, but I’ve seen examples from Movable Type and unknown software. Other weblogs using Movable Type and TypePad print the banner on an otherwise-blank page—but at least they let you print more than one page of postings.

At the other extreme, quite a few weblogs produce cleaned print versions: Printouts that omit weblog overhead and are clearly designed specifically for printing, using a separate stylesheet. These use paper efficiently and are easy to read. Cleaned-for-printing weblogs are a WordPress specialty, although I’ve seen a few using other blogging software.

The Numbers

I didn’t check eight million weblogs—I don’t even claim that this is a representative sample. I checked all of the weblogs in my Bloglines list and most (but not all) of the other weblogs at I also included seven “text-oriented” sites that I check separately, including two journalism magazines and a librarianship ejournal. In all, I looked at 177 websites.

Most websites clearly identify software, with four programs predominating:

Ø    Blogger: 55 sites, 31% of the test.

Ø    Movable Type: 39 sites, 22% of the test.

Ø    WordPress: 24 sites, 14% of the test.

Ø    TypePad: 12 sites, 7% of the test.

The other 47 sites (27%) either didn’t identify the software or used programs such as Slashcode, Zope, LiveJournal, Scoop, or IBlog.
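For anyone who wants to check the arithmetic, here is a small sketch in Python that recomputes each program's share from the site counts given above. The variable and function names are my own, and the rounding rule (nearest whole percent) is an assumption; rounded shares need not add up to exactly 100, and a figure computed this way can differ by a point from one rounded by hand.

```python
# Site counts per blogging program, as reported in the list above.
counts = {
    "Blogger": 55,
    "Movable Type": 39,
    "WordPress": 24,
    "TypePad": 12,
    "Other or unidentified": 47,
}

# The counts should add up to the 177 sites examined.
total = sum(counts.values())


def share(part, whole):
    """Percentage of whole, rounded to the nearest whole percent (assumed rule)."""
    return round(100 * part / whole)


if __name__ == "__main__":
    for name, n in counts.items():
        print(f"{name}: {n} sites, {share(n, total)}% of the test")
    print(f"Total: {total} sites")
```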

Here’s how I would judge the sites as letter grades—noting that this is only for printability, not for quality of content:

Ø    A (ideal printability, typically “cleaned” for printing): 30 sites or 17%.

Ø    B (good but not ideal—typically a strip wasting lots of paper): 87 sites or 49%.

Ø    C or D (significantly flawed): 9 sites or 5%.

Ø    F (impossible to print out the content in its entirety with readable results): 51 sites or 29%.

Nearly three out of ten sites tested were printer-hostile: That’s an awful track record, particularly given that most of the sites here are related to librarianship or copyright. Don’t people care about whether their words are read and retained?

Let’s break that down by software:

Ø    Blogger sites: one (2%) rated A, 23 (42%) rated F.

Ø    Movable Type: two A (5%) and 11 F (28%).

Ø    WordPress: 23 sites (96%) rated A; the other one was a solid B.

Ø    TypePad: no A, and nine F (75%).

Ø    Others: four A (9%) and eight F (17%).
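The breakdown can be cross-checked against the overall grades with a short Python sketch. It recomputes the A and F rates for each program and confirms that the per-program counts add up to the 30 A sites and 51 F sites reported earlier. The names are hypothetical, the rounding rule (nearest whole percent) is an assumption, and the WordPress F count of zero is inferred from the statement that its one non-A site was a solid B.

```python
# (sites, A grades, F grades) per program, from the breakdown above.
# WordPress F = 0 is inferred: 23 of its 24 sites rated A, the other a B.
by_software = {
    "Blogger": (55, 1, 23),
    "Movable Type": (39, 2, 11),
    "WordPress": (24, 23, 0),
    "TypePad": (12, 0, 9),
    "Others": (47, 4, 8),
}


def pct(part, whole):
    """Percentage of whole, rounded to the nearest whole percent (assumed rule)."""
    return round(100 * part / whole)


def summarize(table):
    """Return per-program (A%, F%) pairs plus the total A and F counts."""
    rates = {name: (pct(a, n), pct(f, n)) for name, (n, a, f) in table.items()}
    total_a = sum(a for _, a, _ in table.values())
    total_f = sum(f for _, _, f in table.values())
    return rates, total_a, total_f


if __name__ == "__main__":
    rates, total_a, total_f = summarize(by_software)
    for name, (a_rate, f_rate) in rates.items():
        print(f"{name}: {a_rate}% rated A, {f_rate}% rated F")
    print(f"All programs: {total_a} A sites, {total_f} F sites")
```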

Winners and Losers

I’m going to name the “A” and “F” sites—noting again that the grade only applies to printability! There are “F” sites that I like quite a bit; there may be “A” sites that I wouldn’t read on a bet. There is an order to each list, but it has nothing to do with excellence or awfulness. I’m not going to attempt to replicate the orthography and wordspacing of blog names.

Winners: All of these sites offered first-rate printability: Blog without a Library, Caveat Lector, Creative Librarian, Etc., Infomusings, Information Wants to be Free, Librarians Happen, Library Clips, Library Voice, Library Web Chic, Quædam Cuiusdam, Right-Wing Librarian, SciTech Library Question, Tangognat, Ten Thousand Year Blog, Furdlog, Blog of a Bookslut, Columbia Journalism Review, Web Pages that Suck, Bizgirl, Digitization Blog, Distance Education Library Services, Eng Lib, Hidden Peanuts, Kevin’s Worklog, Library Lovers’ LiveJournal, Library Monk, Library Grrrrrl, Linux Librarian, Netbib.

Losers: This list includes some of my favorite weblogs (and a lot of others)—but they sure do resist printing: A Wandering Eyre, Blog Driver’s Waltz, C&I Updates, Canuck Librarian, Dave’s Blog, Exploded Library, Icarus, Info Ediface, Infozo, ISBlogN, It’s All Good, Librarian Avengers, Librarian in Black, Library Technology in Texas, LibraryLaw Blog, Library Techtonics, Rabid Librarian, Rick Librarian, Tame the Web, Tinfoil + Racoon, Twisted Librarian, Via Proni, What’s New at OhioLink, Improbable Research, InfoThought, Kept-Up Academic Librarian, Mamamusings, Many-to-Many, Holt Uncensored, LIBRES articles, Online Journalism Review (blog and articles), Cog Sci Librarian, Collecting My Thoughts, Connie Crosby, Conversational Reading, Convivial Librarian, Distant Librarian, Drizzle, Feel-good Librarian, Lethal Librarian, Library Boy, Library Dust, Rambling Librarian, Schwagbag, Shush, Stephan Gallant Review, Teacher Librarian, Technogeekery for Librarians, Texadata, Unclassifiable Librarian.

I’d love to revisit the sites just mentioned in a few months and find the printability problems cleared up. It’s clear that all blogging software can yield full printability; it just seems to be easier (or more likely the default behavior) with WordPress.

Cites & Insights: Crawford at Large, Volume 5, Number 6, Whole Issue 62, ISSN 1534-0937, a journal of libraries, policy, technology and media, is written and produced by Walt Crawford, a senior analyst at RLG.

Cites & Insights is sponsored by YBP Library Services.

Hosting provided by Boise State University Libraries.

Opinions herein may not represent those of RLG, YBP Library Services, or Boise State University Libraries.

Comments should be sent to Cites & Insights: Crawford at Large is copyright © 2005 by Walt Crawford: Some rights reserved.

All original material in this work is licensed under the Creative Commons Attribution-NonCommercial License. To view a copy of this license, visit or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.