Cites & Insights: Crawford at Large
ISSN 1534-0937
Libraries · Policy · Technology · Media

Selection from Cites & Insights 6, Number 4: March 2006

Perspective

Folksonomy and Dichotomy

You’ve probably heard of folksonomy, either under that neologism or as tagging. Some of you doubtless help create folksonomies as users of flickr or del.icio.us or Yahoo! My Web 2.0 or Technorati or…the list seems endless. Tag: you’re it—and you’re building folksonomies.

You may have heard that folksonomy will replace all traditional classification and taxonomy systems because it’s so much cheaper and so much more…well, fun…than cataloging and classification. At least that’s what some people seem to be saying. “Some people” may or may not be Clay Shirky. He’s probably the most prominent name in the “Folksonomy über alles” camp, but his statements on the subject vary a lot in the extent to which he sees folksonomy as a universal solution and wholesale replacement for traditional classification schemes.

An admission: I don’t tag (by that name), at least not yet. I haven’t used del.icio.us or Yahoo! My Web 2.0 or flickr (except to view linked photos) or Technorati (except for certain canned searches). But I’m not a cataloger either. I believe I understand the principles underlying Dewey Decimal, the LC call number system, and LCSH, but that’s as far as it goes.

I started collecting the occasional article and blog essay on folksonomy about a year ago, when I started hearing how revolutionary it was and how it was going to sweep away formal classification systems. That collection was never comprehensive and no essay ever got written. I think I now understand just enough to offer an opinion—and you can pick up part of it in the title of this Perspective. I considered “Ontology” (one of Shirky’s contrast words) or “Taxonomy” (a more plausible contrast to Folksonomy) or even “Classification” or “Cataloging.” The middle word could be “versus.”

The more I look at the situation, the more I see folksonomy and dichotomy—that is, false dichotomy thanks to unwarranted universalization. It’s yet another “and not or” situation: inclusionary thinking vs. exclusionary claims.

A second admission: The first admission above may be false, depending on your definitions of tagging and folksonomy. You could argue that WordPress “categories” are tags by another name, and I do provide categories for almost every Walt at Random post.

Note “categories,” plural. As in other tagging systems, I can and do use more than one for each post. Some commentators claim multiple topics as one distinction between folksonomy and traditional classification or taxonomic systems: Traditional systems require one and only one “name” for any item, where you can provide as many tags as an item seems to call for. That’s another false dichotomy. There is nothing in classification and taxonomy systems that inherently requires that they be “folders,” that items have one and only one name. Ever seen a cataloging record with more than one subject heading? Ever seen one with six? Sure you have.

Dichotomy is Overrated

I couldn’t resist that, and yes, it’s a play on one of Shirky’s most quoted articles, “Ontology is overrated: Categories, links, and tags” (www.shirky.com/writings/ontology_overrated.html). The article may be worth reading if you haven’t encountered much background on claims of folksonomy supporters. I’d also suggest Emanuele Quintarelli’s “Folksonomies: power to the people” (www.iskoi.org/doc/folksonomies.htm) and “Social bookmarking tools (I): A general review” by Tony Hammond, Timo Hannay, Ben Lund, and Joanna Scott (D-Lib Magazine 11:4, April 2005, www.dlib.org/dlib/april05/hammond/04hammond.html). Between those three papers and links within them, you should get a good overview of formal “pro-folksonomy” perspectives, noting that the authors don’t form a monolithic set of views. Hammond, Hannay, Lund and Scott see folksonomy as additive and complementary; Quintarelli seeks a “merged” middle ground. As for Shirky, well, I’m not quite sure what he really thinks about the ongoing role of traditional systems.

That’s the pro-folksonomy side. Naturally, some commentators view tagging and folksonomy with less enthusiasm, including Michael Wexler of The Net Takeaway (www.nettakeaway.com/tp/), Peter Merholz of Peterme.com (www.peterme.com; see particularly “Clay Shirky’s viewpoints are overrated” in the August 2005 archives), and others.

Some (including many librarian bloggers and some of Clay Shirky’s fellow posters at Many 2 many, www.corante.com/many/) are in the middle—most commonly because they see both the virtues and defects of tagging and because they see there’s no need to assume that one system or the other will or should become universal.

I’m with that group.

Some “pro-folksonomy” articles make erroneous assumptions about formal systems, perhaps in order to demonstrate the superiority of tagging. Yes, call number systems require that a book be assigned one and only one call number (it has to go on the shelf somewhere)—but subject cataloging never assumes that an item can have one and only one subject. There are faceted classification systems that inherently assign multiple facets or subject categories for an item.

Others assume that formal systems don’t scale—that the scope of the web is so much larger than anything in past history that tagging is the only solution. While it may be true that formal cataloging doesn’t make economic sense for every web page (although there are many possible levels of “formal cataloging,” some of which needn’t be all that expensive), it’s easy to underestimate the number of items that have received formal cataloging and classification, or at least have been assigned subjects using a thesaurus. Given the size of the RLG Union Catalog and WorldCat, plus the size of A&I databases that include subject headings, I’ll suggest that at least a quarter billion items have been formally classified (including duplicates, to be sure)—and once you include the taxonomies in use for species, and all the other formal taxonomies in use, I wouldn’t be surprised if the number was at least half a billion. Have half a billion websites been tagged? Possibly, but I’m a little doubtful.

There should be no dichotomy. “Popular tagging” has been part of the process of organizing and identifying items throughout history. The web makes it easier and some tagging applications make it fun. I wonder whether most web users are really interested in doing lots of tagging, but that issue will be settled over a few years.

Once I eliminate the dichotomy and think “and, not or,” I lose interest in trying to put down folksonomy or determine whether it really is a superior tool for all applications. More interesting questions are how tagging can be used effectively, and how tagging and formal systems can best complement one another. I’d like to think that people smarter than I am are working on those issues. I’m certain that people who are better informed on the topics involved, and far more likely to produce good results, are working on them.

Miscellaneous Grumbles

A few things about some tagging systems do bother me. Consider this list: blog, web, tools, blogs, search, fun, development, tech, tips, toread. How many of those would you use as a way to locate something on the web? Those are ten of the 50 most popular tags on del.icio.us on February 2, 2006. On their own, they’re largely useless—but in combination with other words, they might (or might not) be significant.

That list shows one potential problem with some “folksonomy” tools: Limiting tags to single words. Many concepts just don’t work as single words. “Toread” is, of course, “to read” without the space—just as “webdesign” and “web2.0” are phrases entered as words among the top 50. Why should such subterfuge be necessary? When did English become a language in which there are no nominative phrases?
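To make the single-word complaint concrete, here is a minimal Python sketch. It assumes a tag field that splits on spaces, which is my guess at why phrases get squeezed into forms like “toread”; it is not any tagging site’s actual code, and both function names are invented for illustration.

```python
def split_on_spaces(raw: str) -> list[str]:
    """Hypothetical single-word tag field: every space starts a new tag."""
    return raw.lower().split()


def split_on_commas(raw: str) -> list[str]:
    """Hypothetical phrase-friendly field: commas separate tags, spaces survive."""
    return [tag.strip().lower() for tag in raw.split(",") if tag.strip()]


# Typing the phrases naturally into a space-delimited field breaks them apart.
print(split_on_spaces("to read web design"))

# The workaround users fall back on: squeeze each phrase into one "word".
print(split_on_spaces("toread webdesign"))

# A comma-delimited field keeps the phrases intact.
print(split_on_commas("to read, web design"))
```

The comma-delimited version is only one possible design; quoting phrases would do as well. The point is that the single-word limit looks like a choice made by the tool, not something inherent in tagging.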

Shirky, for one, makes a point of Google’s success relative to Yahoo! (specifically, that Yahoo! failed because it attempted to classify sites using a taxonomy, while Google succeeded because it ignored formal structures). That analysis overlooks some messy truths, such as the reality that Yahoo! didn’t fail: It gets a lot more unique visitors each month than Google does, partly thanks to its combination of directory (formal taxonomy) and search (text-based retrieval). Sometimes, a directory is precisely what you need.

Those who believe folksonomy is the only future seem to believe we’re all hot to tag, or at least most of us are. That has yet to be demonstrated. I wouldn’t be surprised if it proves not to be the case. To some extent, it’s true that folksonomy doesn’t reduce the cost of identifying items so much as it shifts the cost, lowering it for those who might wish to identify items but increasing it (in time) for those searching. If, in the end, the population willing to keep tagging sites is only ten or twenty times the population of catalogers and indexers, the overall retrieval cost of an all-folksonomy universe might be considerably higher than the overall cost of an all-cataloging/classification/taxonomy universe. But that’s a silly dichotomy: An all-traditional-means universe is out of the question, and I believe an all-folksonomy universe is equally absurd.

Recent Recommended Reading

Marieke Guy and Emma Tonkin, both of UKOLN, wrote “Folksonomies: Tidying up tags?” in the January 2006 D-Lib Magazine (www.dlib.org/dlib/january06/guy/01guy.html). Guy and Tonkin are “and” thinkers: they regard tags as supplements to formal classification systems, not wholesale replacements. The article examines “sloppy” tags and questions the usefulness of attempting to do too much “tidying up.” It’s definitely worth reading.

A less formal “article”—technically, it’s a blog post, but it prints out as a 9-page single-spaced paper complete with 38 footnotes—is also well worth reading, even though I don’t care for the title: “The hive mind: Folksonomies and user-based tagging” by Ellyssa Kroski at Infotangle (infotangle.blogsome.com), posted December 7, 2005. (I find “hive mind” a dispiriting term, but that’s me.) Kroski’s overview of (some) tag-based applications and some of folksonomy’s strengths and weaknesses is good enough that it convinced me not to attempt such an overview: Why bother, when she’s done it so well? As a tease for the article, here are the boldface-italic introductory sentences for the “strengths” and “weaknesses” sections that take up 4.5 pages of the seven text pages of the article (the last two pages are endnotes):

Strengths: Folksonomies are inclusive. Folksonomies are current. Folksonomies offer discovery. Folksonomies are non-binary. Folksonomies are democratic and self-governing. Folksonomies follow “desire lines.” Folksonomies offer insight into user behavior. Folksonomies engender community. Folksonomies offer a low cost alternative. Folksonomies offer usability. Resistance is futile.

Weaknesses: Folksonomies have no synonym control. Folksonomies have a lack of precision. Folksonomies lack hierarchy. Folksonomies have a “basic level” problem. Folksonomies have a lack of recall. Folksonomies are susceptible to “gaming.”

Kroski loves folksonomies, but she does a good job of citing critics—and, of course, Shirky’s facile responses. You need to read the paragraphs that follow those sentences.

Tagging isn’t going away, nor should it. Neither are formal taxonomies and classification/cataloging systems going away. There’s room for both, and there should be ways to use each to enrich the other.

Cites & Insights: Crawford at Large, Volume 6, Number 4, Whole Issue 74, ISSN 1534-0937, a journal of libraries, policy, technology and media, is written and produced by Walt Crawford, a senior analyst at RLG.

Cites & Insights is sponsored by YBP Library Services, http://www.ybp.com.

Hosting provided by Boise State University Libraries.

Opinions herein may not represent those of RLG, YBP Library Services, or Boise State University Libraries.

Comments should be sent to waltcrawford@gmail.com. Comments specifically intended for publication should go to citesandinsights@gmail.com. Cites & Insights: Crawford at Large is copyright © 2006 by Walt Crawford: Some rights reserved.

All original material in this work is licensed under the Creative Commons Attribution-NonCommercial License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/1.0 or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.

URL: citesandinsights.info/civ6i4.pdf
