Bibs & Blather
The Liblog Landscape 2007-2008
The Liblog Landscape 2007-2008: A Lateral Look is now available from Lulu and CreateSpace/Amazon. The 285-page 6x9 trade paperback costs $35.00. If you’re reading this before January 15, 2009, you can take advantage of the early bird special: Order directly from lulu.com (www.lulu.com/content/4898086) for $22.50 plus shipping. The ISBN13 for the CreateSpace/Amazon version is 978-1440473845.
The Liblog Landscape 2007-2008 looks at 607 liblogs (mostly English-language) and, for most of them, how they’ve changed from 2007 to 2008. Eleven chapters consider the universe of liblogs (blogs by “library people” as opposed to blogs from libraries):
▪ Age, authorship, country of origin
▪ Number of posts during a three-month period and change from 2007 to 2008
▪ Word count and average post length; change
▪ Comments and comments per post; change
▪ Figures and figures per post; change
▪ Patterns of change from 2007 to 2008
▪ Correlations between pairs of metrics
▪ A look at 143 blogs from 2006 through 2008
▪ Interesting subgroups
▪ The visibility issue
▪ Liblogs and the larger blogosphere
The final chapter, just over half the book, provides a brief objective description and metrics for each blog. The book includes many tables and a fair number of graphs. There is an index of blogs and authors.
It’s the most comprehensive look at liblogs ever done—and the only one I know of that shows how they’re changing from year to year.
If you’ve been reading the series of posts on Walt at Random, you can skip this part—it’s the same text, but without the puzzles and segments of the list of liblogs.
The first chapter introduces naïve hypotheses on liblogs and how they’re changing (I was right and wrong), “typical” liblogs (there’s no such thing), metrics and quintiles used in the book, how I assembled the universe of liblogs—and some descriptive elements for the 607 blogs.
Descriptive elements? Things that aren’t part of the regular metrics but may be worth noting. What blog programs do bloggers use? (The top two are closer together than I would have thought.) How many bloggers provide full names–and how many group blogs are there? What about typography? How are liblogs distributed by affiliation? By country? By age?
One graphical note: Two figures show precisely the same data–the age of blogs within the study–but one is difficult to interpret while the other is crystal-clear. The difference? One graphs age by month, the other by year. (The peak year for new liblogs was 2005–not 2006, which is what I expected to find.)
Chapter 2 considers frequency—the number of posts in a blog and how that frequency changed from 2007 to 2008. As with most other metrics in this book, the analysis and comments are based on March, April and May 2007 and 2008.
The most prolific blog had 200 fewer posts in 2008 than the most prolific blog did in 2007, and there were fewer posts for 533 countable blogs in 2008 than for 523 countable blogs in 2007, even though more blogs were involved.
Indeed, of 523 blogs with countable posts for 2007, slightly more than 60% had at least 20% fewer posts in 2008–but slightly more than 20% had at least 20% more posts in 2008.
Chapter 3 deals with word count—for blogs over a three-month period and, more interesting, as average word counts per post within a blog. With overall lengths ranging from 26 words to 186,467 words in 2007—and from 39 to 204,517 in 2008!—there’s quite a range.
There’s no “right length” for a blog post. Some excellent blogs have very short posts; others consist entirely of long essays. This is one metric where both the longest and shortest posts stand out as unusual in positive, interesting ways. The chapter includes tables, charts comparing one year to another and considerable discussion.
Some of you can probably already guess the blog with the shortest average words per post; it’s also one of relatively few blogs with exactly the same number of posts in March-May 2007 and March-May 2008: 92, to be exact. See page 35.
Is a blog without comments really a blog? Of course it is–but comments are important to many, maybe most liblogs. This chapter looks at total comments per blog and the more interesting figure, conversational intensity: Average number of comments per post. We also look at how things change from 2007 to 2008.
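As a small illustration of conversational intensity, here is a minimal Python sketch. The function name and the sample numbers are hypothetical, not taken from the book, and the handling of a blog with no posts is an assumption.

```python
def conversational_intensity(total_comments, total_posts):
    """Average number of comments per post for one study period."""
    if total_posts == 0:
        # Assumption: with no posts, comments per post is undefined.
        return None
    return total_comments / total_posts


# Hypothetical blog: 45 posts and 180 comments during March-May.
example_intensity = conversational_intensity(180, 45)
print(example_intensity)
```

Dividing by posts rather than reporting raw comment totals lets a blog with a handful of posts be compared with a prolific one.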
One blog in 2007 had more than 1,000 (and more than 1,500) comments over three months. Two entirely different blogs had more than 1,000 (but fewer than 1,300) comments in the 2008 study period. And roughly two out of every five blogs had significantly higher conversational intensity in 2008 than in 2007.
There’s lots more about comments and conversational intensity in the book.
Chapter 5 is about visuals in liblogs (videos, drawings, charts, etc.). Many blogs don’t use them at all; many use very few. This is one metric that won’t be tracked in possible future updates, but I think you may find the brief chapter interesting.
Speaking of visuals, you should know that the wraparound cover photo was taken (by my wife, the talented one in the family) somewhere outside Christchurch, New Zealand.
By my lights, this is one of the most interesting chapters, one that combines facets of blogs to look at patterns. I look at change in number of posts, change in average post length and change in comments per post.
The chapter uses two models to describe change: A simple “up or down” model and one splitting metrics into three parts: Significant increase (20% or more), significant decrease (-20% or more) and “about the same” (+19% to -19%).
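The two models can be sketched in a few lines of Python. This is an illustration only, not the procedure used for the book: the function names are hypothetical, and two details are assumptions, namely that everything strictly between -20% and +20% counts as "about the same" and that a metric with a 2007 value of zero is left out of the comparison.

```python
def percent_change(value_2007, value_2008):
    """Percentage change from 2007 to 2008, or None if 2007 was zero."""
    if value_2007 == 0:
        # Assumption: change from zero is treated as not comparable.
        return None
    return (value_2008 - value_2007) * 100.0 / value_2007


def up_or_down(pct):
    """Simple model: did the metric go up or down?"""
    if pct is None:
        return "not comparable"
    if pct > 0:
        return "up"
    if pct < 0:
        return "down"
    return "unchanged"  # assumption: exact ties get their own label


def three_part(pct, threshold=20.0):
    """Three-part model using the 20% cutoffs described in the text."""
    if pct is None:
        return "not comparable"
    if pct >= threshold:
        return "significant increase"
    if pct <= -threshold:
        return "significant decrease"
    return "about the same"


# Hypothetical post counts for one blog: 40 posts in 2007, 44 in 2008.
change = percent_change(40, 44)
print(change, up_or_down(change), three_part(change))
```

In this made-up case the simple model says "up" while the three-part model says "about the same", which is the kind of difference the second model is meant to expose.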
I think you’ll find this an interesting and possibly revealing chapter. It’s also the chapter that convinces me that my naïve hypotheses are right in some ways, wrong in others…which can be said of almost any hypothesis regarding the overall liblog landscape!
When I was working on this study, colleagues offered suggestions on possible correlations–e.g., older liblogs might show larger decreases in posts than newer ones.
This chapter looks at a few dozen possible correlations between pairs of metrics, normalizing metrics and using Excel’s CORREL function (which appears to be identical to the PEARSON function, calculating Pearson’s product-moment coefficient, the only readily available measure of correlation between two sets of numbers that I could find).
For those cases where the correlation is medium (between 0.3 and 0.5 or -0.3 and -0.5) or strong (greater than 0.5 or less than -0.5), I note the correlation and include a scatterplot for the two values.
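For readers who want to try this outside Excel, here is a minimal Python sketch of Pearson's product-moment coefficient plus the strength bands described above. It is not the workbook used for the book: the function names and sample data are hypothetical, the normalizing step is not attempted, and placing exactly 0.5 in the "medium" band is an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson's product-moment correlation coefficient for two lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a list is constant")
    return covariance / math.sqrt(spread_x * spread_y)


def strength(r):
    """Label a coefficient using the 0.3 and 0.5 cutoffs from the text."""
    size = abs(r)
    if size > 0.5:
        return "strong"
    if size >= 0.3:
        return "medium"
    return "weak"


# Hypothetical data: blog age in months and posts per quarter.
ages = [6, 12, 24, 36, 48]
posts = [30, 45, 20, 50, 25]
r = pearson(ages, posts)
print(round(r, 3), strength(r))
```

Python 3.10 and later also include `statistics.correlation` in the standard library, which computes the same coefficient.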
Statistical extremists sometimes discuss weak correlations—those below 0.3. Fact is, almost any two sets of numbers will show some correlation (that is, will have a Pearson’s product-moment coefficient greater or less than 0.000)—but I see no reason to believe that weak correlations mean anything at all, other than that you’re comparing two sets of numbers. I do note some weak correlations, mostly to say there’s no significant correlation between the two metrics.
As to the age suggestion? I found no useful correlation between age of blogs and any other metric.
The Liblog Landscape 2007-2008 includes quite a few line graphs and a few scatterplots. I used Excel2007’s graphing functions and tuned the results for legibility. Most graphs and plots represent more than 400 data points. The only graphs and plots that use non-zero baselines are those dealing with change percentages, where the baseline is properly -100%.
Purists may object that the graphs and plots are chartjunk for either of two reasons:
▪ In most cases, the axes–while showing numbers–aren’t labeled (there are no words below or to the side of the axes).
▪ In some cases, one or both axes are logarithmic rather than linear.
I believe logarithmic axes are chartjunk only if there are no numbers on the axis. When you see evenly-spaced marks numbered “1 10 100 1,000” you’re dealing with a logarithmic axis–and I don’t believe that’s deceptive. Some sets of data simply require logarithmic charting to display meaningfully, and some data is logarithmic in character. For example, nearly all audio performance graphs are logarithmic in most scales–frequency, distortion percentage, power–simply because sound has logarithmic characteristics.
The first one’s simple enough. In most cases, it didn’t make sense to label the horizontal axis but not the vertical axis, and there’s a clear issue with labeling the vertical axis. That issue could be stated as “26 picas” or “4 1/3 inches.” Either way, it’s the width available between the margins of a typical 6×9″ book: The width of the text block. Make that block wider, and you either have problems with the binding margin or have too-narrow outer margins.
26 picas is a nearly ideal width for 11-point or 12-point text, within the 55- to 65-character range usually regarded as optimal for reading. But it’s a little narrow for a graph with a lot of information…particularly after you add numeric labels for the vertical axis and a little white space between the graph and border. That narrows the graph area to at most four inches and more typically around 3.5 inches.
What happens when you add a vertical axis label? You lose another half-inch or more.
I found that graphs were squeezed too tight as a result–they became even harder to interpret.
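The width arithmetic can be checked with a short Python sketch. The conversion of six picas to the inch is the standard one; the variable names are hypothetical, and the 3.5-inch and half-inch figures are the approximate values given in the text, not measurements.

```python
PICAS_PER_INCH = 6  # standard typographic conversion


def picas_to_inches(picas):
    """Convert a measure in picas to inches."""
    return picas / PICAS_PER_INCH


text_block_inches = picas_to_inches(26)  # full width between the margins
typical_graph_inches = 3.5               # after tick labels and white space
axis_label_cost_inches = 0.5             # "another half-inch or more"

# Graph width left over once a vertical axis label is added.
labeled_graph_inches = typical_graph_inches - axis_label_cost_inches

print(round(text_block_inches, 2), labeled_graph_inches)
```

On those figures, a labeled graph keeps roughly three inches of a text block that is a little over four inches wide.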
In the end, I eliminated most axis labels, stating them in the text that precedes or follows each graph instead. It was a tradeoff of proper graph presentation standards versus graph readability. (The other alternative–8.5×11" for the book, with a 6″ text block–is great for graphs but problematic for everything else.)
The last time I looked at a large number of liblogs was in the summer of 2006, considering 213 liblogs that seemed to be in “the great middle”–neither the most visible nor the least visible in the field.
This chapter looks at 143 of those blogs: Ones with at least two posts in each of the three March-May study periods. It’s a longer lateral study of a much smaller landscape–and a landscape that I don’t regard as necessarily typical of liblogs as a whole.
I believe there’s a significant conclusion from the subgroup, and that conclusion appears in the chapter–but it’s less firm than I’d like it to be, because the group may not be representative.
Do pseudonymous/anonymous blogs differ significantly from the liblog landscape as a whole? What about Canadian blogs—or blawgs?
Chapter 9 takes a dozen groups of blogs, including most groups with at least 15 blogs, and offers brief notes on how they differ (quantitatively) from the entire study. It's a short chapter (including a dozen figures and notes on each figure) and an interesting one.
Many blogging gurus (mostly outside the library arena) would say visibility is the most important thing for a blog—how many readers, how many ad impressions, how many links? In previous studies, I've looked at it as an interesting factor—but also one that's hard to judge externally.
This chapter discusses how I've looked at visibility in the past, what I did this time (and why it was only used as a lower limit for inclusion, not as an actual metric), why it's getting even more difficult—and what I'll do in future studies (if any).
This chapter looks at the 2008 Technorati State of the Blogosphere report and draws some comparisons between the liblog landscape and the larger blogosphere. Portions have appeared elsewhere.
This chapter offers a brief objective view of each liblog: Name (using the orthography of the blog itself), motto or subtitle if any, author, affiliation, country, start date, up to three of the most popular categories or tags (if obvious), and a set of metrics.
At least a couple of you have said you were looking forward to my comments about blogs—and I'm afraid you'll be disappointed.
The book includes lots of comments about how liblogs work in the aggregate and how they're changing. The first 11 chapters are very much in my voice and include my opinions.
But I don't attempt to discuss what bloggers are posting about—that's too complicated and too transitory. To be honest, with some of the more prolific blogs, I was just marking-and-counting: Adding up the number of posts, comments and figures, and measuring total word count, but not reading each post. (Hey, the blogs included more than 22,000 posts during March-May 2007 and more than 19,000 during March-May 2008. I'm a fast reader, but that's a lot of reading—more than 9.5 million words, or the equivalent of at least 95 good-size books.)
As I was building the preliminary version of Chapter 12, I was adding a brief evaluative comment for each blog in some cases: One or two sentences describing the blog's nature as I saw it during the 2008 period. I wound up stripping out all of those comments for four reasons:
1. In a few cases (maybe half a dozen?), I didn't feel I could include a comment because I really didn't like the blog (or some aspects of the blog, or the blogger)—and I'd already decided to follow the “grandmother rule” (If I couldn't say anything nice, I wouldn't say anything at all.)
2. In a lot of cases (scores of them), I didn't have anything useful to say, either because the blog was in an area I don't understand very well or for other reasons.
3. As I worked my way through, I found my comments becoming less and less useful.
4. The killer: Those comments would take up at least 100 pages of the book, probably more like 150 pages. I was hoping to keep the book under 300 pages (and succeeded, partly by using slightly smaller type) and certainly wanted to keep it under 400 pages.
Part of me wants to do the evaluative part—but I think it would be a separate book. Is that book worth doing? Am I the right one to do it? (Would I be able to keep on as even a part-time participant in the library field after doing it?)
Damned if I know. For now, I'm not sure how I'd go about it. The task of categorizing and judging 19,000 posts is far beyond me, I think. The task of providing useful evaluative comments on 500 or more blogs—possible, but I'm not sure how. We shall see.
Cites & Insights 8 (2008) is also available as a trade paperback, this one 8.5x11" and 346 pages long. All twelve issues of Cites & Insights 8 appear, plus the volume title sheet and indexes.
I’m assuming that the only likely customers for the bound volumes of C&I are people who want to show support for my ongoing work. I produced the volume primarily as a good way to have my own bound copy. Given that assumption, I’m pricing Volume 8 (and repricing Volumes 6 and 7) at $50. If you want to show support but have no interest in a big thick book with a really nice cover, taken in Scotland, I’m making the PDF download available for the same price.
The books will be available until June 2009 or two months past the last order received for any of them, whichever comes later.
C&I in book form is only available through Lulu. Volume 8 is at http://www.lulu.com/content/5014958; change “5014958” to “1526643” for Volume 7 (2007) and to “1738303” for Volume 6 (2006).
In the November 2008 Cites & Insights I said that Public Library Blogs: 252 Examples and Academic Library Blogs: 231 Examples would be going out of print around the beginning of 2009, given no sales of the first book since June 2008 and only two sales of the second book since June 2008.
I won’t say there’s been much change since then, but the picture has muddied. Here’s my current plan for changes, as soon on or after January 1, 2009 as I get around to them:
▪ The print versions at Lulu.com will be disabled—but the downloadable versions will still be available for $20.00. I’ll keep those available until at least two months go by with no sales at all.
▪ Print versions at CreateSpace (www.createspace.com/3330831 for Public Library Blogs and www.createspace.com/3333993 for Academic Library Blogs) will be available at least for a little while. You can get a 20% discount by entering the discount code KMM7J427 for the first and BABJDZAD for the second when you check out. I believe the Amazon conduit for the CreateSpace versions will be disabled, but I might be wrong. I’ll also keep those available until two months go by with no sales at all.
▪ Balanced Libraries: Thoughts on Continuity and Change continues to be available (and to sell, albeit very slowly). I’ll probably keep it in print until I decide whether to do a second edition.
One element of the Word2007 template for Cites & Insights changes with this issue. I believe most of you will find that it makes portions a little easier to read. It may also make some issues a little longer. The first person to send me email or otherwise note what the change is will earn my hearty congratulations.
So far, no major changes are planned for this volume. (“Planned”: what an interesting word.)
Yes, this issue is peculiar—maybe more peculiar than usual. That has to do with scheduling—wanting to get done with the Retrospective series during the calendar year (albeit not by formal publication date) and paying attention to the desires of some readers. Yes, I could deep-six the Offtopic Perspective—but there’s no My Back Pages and I need to have some fun.
Cites & Insights is sponsored by YBP Library Services, http://www.ybp.com.
Opinions herein may not represent those of PALINET or YBP Library Services.
Comments should be sent to email@example.com. Cites & Insights: Crawford at Large is copyright © 2009 by Walt Crawford: Some rights reserved.
All original material in this work is licensed under the Creative Commons Attribution-NonCommercial License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/1.0 or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.