Cites & Insights: Crawford at Large
ISSN 1534-0937
Libraries · Policy · Technology · Media


Selection from Cites & Insights 7, Number 3: March 2007


Net Media Perspective

Wikipedia Revisited

In previous episodes…

OK, that opening phrase appeared out of nowhere, but this may be a good time to revisit past items regarding the project.

2002-2004

The first mention came in May 2002 (C&I 2:7, p. 18), in a rare self-referential The Good Stuff item citing a “PC Monitor” column in Online based on a survey done via C&I. (Circular enough for you?) After noting the article, I said:

If you read the print magazine (highly recommended) instead of an online version, look back two pages; Péter Jacsó’s “pan” of the month is one of those grotesque “let’s all make an encyclopedia” efforts (Wikipedia) that help some of us appreciate professional efforts. He is astonished that MIT’s Technology Review ran a serious interview with the CEO and that Peter Suber wrote a friendly notice. I am incapable of being surprised by Technology Review behaving like Wired, but the Suber notice does surprise me.

Jacsó was indeed unhappy with Wikipedia—which was probably less than a year old when he wrote the column and had 16,000 articles at the time. Jacsó had “been panning in this column and elsewhere projects that I usually refer to as another encyclopedia on your lunch break,” largely because the fancy titles “make even educated librarians and other decent people provide a link to these pathetic sites… Others will keep copying their links and thus adding clout to such sites ad nauseam—and that is not all right.” Regarding Wikipedia itself, Jacsó calls it “a joke at best” and notes, “It looks like a prank.” Of the 16,000 articles, he notes that many of them were a single sentence and that more than 10% referred to September 11, 2001, “playing the emotional card, which is totally out of place in a general encyclopedia, and include information about lists of victims and vigil sites, personal reports—and scams.”

Jacsó also mentions 25 articles about “the tiny island of Niue” and that most of them, like many other country entries, were lifted verbatim from a year-old edition of the CIA World Factbook (splitting each chapter of a profile into a separate article)—without crediting the source. He again called it “a prank” and included illustrations of portions of its “tips on contributing” and FAQ, showing “a lack of scholarly discipline” and a [casual] attitude toward quality information. He cites Peter Suber’s surprisingly favorable comment (Suber called Wikipedia “the ultimate development in dynamic, interactive, collaborative scholarship”) and concludes:

My, oh my, is this scholarship and ultimate? What would the naïve users say and think, who will soon become contributors after reading the tips for contributors? Jimbo expects advertisers by mid-2002, and then you know who is going to be laughing all the way to the bank.

I cite this not to embarrass Peter Suber or Péter Jacsó. Peter Suber’s comment included this key qualifier after “collaborative scholarship”: “if you can call anything scholarship that dispenses with editorial filters in the name of user freedom.”

Times change. Wikipedia has changed, leaving that 100,000-article goal in the dust. Some articles are impressively long. Some appear well based in cited sources. For many contemporary topics, Wikipedia is a wonderful starting point. The sin even back then in quoting big chunks of the CIA World Factbook was not copyright infringement (it’s government-prepared, public domain in the U.S.) but plagiarism: failure to cite the source. Such plagiarism continues, although many articles are heavy with footnotes. “Jimbo” decided not to run advertising on Wikipedia itself. The site no longer looks like a prank.

I’m nervous about calling Wikipedia “collaborative scholarship” or scholarship of any sort—particularly given Wikipedia’s rules, which forbid inclusion of original scholarship. It certainly has clout; Wikipedia articles appear at or near the top in many search engine results. (Second for “Niue” at Google, first numbered result on Yahoo! and Live—and, remarkably, tenth at Ask.) In 2007 it’s impossible to dismiss Wikipedia as a prank or joke—but it’s still controversial. Even Wikipedia’s entry on “Wikipedia” notes controversy, including this paragraph “above the fold” (before the outline of a fairly long article):

There has been controversy over Wikipedia's reliability and accuracy, with the site receiving criticism for its susceptibility to vandalism, uneven quality and consistency, systemic bias, and preference for consensus or popularity over credentials. Information is sometimes unconfirmed and questionable, lacking the reliable sources that, in the eyes of most regular contributors, are necessary for an article to be considered of high quality.

I didn’t mention Wikipedia at all during 2003. Then came October 2004 (C&I 4:12, pp. 2-4) and the relatively brief Perspective: Wikipedia and Worth.

Late summer saw a whole bunch of foofaraw about wikis and specifically Wikipedia. After one columnist suggested Wikipedia as a resource for computer history, other writers assaulted Wikipedia as worthless trash; at least one librarian made noises about the difference between online junk and authoritative sources; some wiki advocates pontificated about the awesome error correcting capabilities of community-based collaborative media. Alex Halavais of the School of Informatics at Buffalo University made 13 changes in the English language Wikipedia, “anticipating that most would remain intact and he’d have to remove them in two weeks.” Presumably, if that had happened, there would have been evidence that the ease of modifying Wikipedia makes it suspect as a resource.

I discussed the results of the Halavais test (which impressed Halavais himself) and two commentaries (one of which concerned wikis and scholarship but not Wikipedia itself). One commentary was by Ed Felten, who reviewed five entries on “things I know very well” and found four of them good—but the fifth “riddled with errors” of a sort that would “lead high-school report writers astray.” Felten mentioned high school; as with most other college faculty, he would (I believe) assume college students would never use Wikipedia or any other encyclopedia as a cited source. Given one of the issues raised later in this Perspective, it’s worth noting that the error-filled article related to Microsoft. I offered my own middle-of-the-road perspective, doubting Wikipedia would “eclipse” traditional encyclopedias and assuming it was neither worthless nor better than a traditional encyclopedia—and that entries should be used on a “trust but verify” basis. In the next issue (November 2004, C&I 4:13, pp. 14-15) I cited some direct feedback and related list discussion, including Michael Lorenzen’s cogent comments about Wikipedia’s extensive use of government-generated public domain information creating a pro-American bias. My take then was that triumphalism was a problem with Wikipedia and its advocates—the felt need to “sweep away” or be “better than” traditional encyclopedias. That’s still true.

2005-2006

I mentioned Wikipedia in five 2005 issues of Cites & Insights, but some of the mentions were trivial. February 2005 (C&I 5:3, pp. 11-19) featured a new and considerably longer Perspective: Wikipedia and Worth [Revisited]. In case you’ve forgotten the situation in late 2004 and early 2005, three substantial critical essays appeared:

• On November 15, 2004, Robert McHenry (former editor in chief of the Encyclopædia Britannica) posted “The faith-based encyclopedia” at Tech central station (www2.techcentralstation.com). He disagreed with the claims for collaborative editing and the methodology. He used one article as a case study, asserting numerous typographic, styling, grammatical and diction errors and calling it a “C [high school] paper at best”—despite more than 150 edits. He concluded: “The user who visits Wikipedia to learn about some subject, to confirm some matter of fact, is rather in the position of a visitor to a public restroom. It may be obviously dirty, so that he knows to exercise great care, or it may seem fairly clean, so that he may be lulled into a false sense of security. What he certainly does not know is who has used the facilities before him.”

• A few days later, Jason Scott posted “The great failure of Wikipedia” to one of his weblogs, ASCII by Jason Scott (ascii.textfiles.com). This computer history researcher “tried extended interaction with Wikipedia” and “consider[s] it a failure.” He says why in considerable detail. He argues with the project’s low barriers to entry, says it has “a small set of content generators, a massive amount of wonks and twiddlers, and then a heaping amount of procedural whackjobs” and comes to this judgment: “I’m sorry, but content creators are relatively rare in this world. Content commentators less so. Content critics are a dime a hundred, and content vandals lurk in every doorway. Wikipedia lets the vandals run loose on the creators, while the commentators fill the void with chatter. It is a failure.”

• On New Year’s Eve 2004, Larry Sanger published “Why Wikipedia must jettison its anti-elitism” on Kuro5hin (www.kuro5hin.org)—and all hell broke loose, or at least the kind of hell that happens on Kuro5hin and /. The five-page article generated hundreds of comments very rapidly, the total back-and-forth reaching book length within the first month. Sanger was a cofounder of Wikipedia and continues to admire the project, which put him in an unusual position to point out what, in Sanger’s mind, has gone wrong.

I’m not going to recount the responses and list comments; the Perspective is available at citesandinsights.info/v5i3d.htm. I was surprised by Clay Shirky’s attack on librarians, teachers and academics (I said he “lost it”)—but since then, as Shirky has dismissed taxonomy as worthless and said web video will kill HDTV, I’ve become less surprised: Shirky is a man of strong, even strident opinions. Web4lib posts included extreme positions as well as nuanced discussion, as did Publib posts. My conclusion at that point:

Maybe it’s human nature (for some humans, not all) to advocate your own preferred solution by putting down alternatives rather than by showing the virtues of your choice. That’s sad if true. Wikipedia can do just fine. So can Encarta. So can Britannica, back in print and still in digital form. And so, to be sure, can all of those books, journal articles, “vetted” websites and primary sources that encyclopedias of any nature should lead us to.

I discussed other commentaries in June 2005 (C&I 5:8, pp. 8-9), including a Wired essay with exactly the bias you’d expect, a disagreement between Many2many contributors and the continuing apparent need by some to create a zero-sum game, where Wikipedia can only “win” at the expense of traditional encyclopedias. A multitopic Net Media roundup in October 2005 (C&I 5:11, pp. 6-7 for Wikipedia stuff) included notes on a long Larry Sanger memoir about the early history of Wikipedia and Nupedia.

While I mentioned Wikipedia four times in 2006, there was only one substantial discussion—Net Media Perspective: What About Wikipedia? (C&I 6:13, November 2006, pp. 2-11). By then, I was using Wikipedia as a starting point in many cases (it’s one of several choices in my Firefox search-box menu, along with Worldcat.org, IMDB and the four major search engines)—and I noted a case where a Wikipedia article directed me to a verifiable answer for a Unix problem (where a book had provided bad information). “I don’t trust Wikipedia’s ‘neutral’ point of view and find many of the essays poorly written—but it’s great for what it is.”

The first major part of that discussion was Nature’s article comparing the accuracy of Wikipedia and Britannica. (I now see I managed to substitute Science for Nature once in that discussion, but I certainly don’t claim to be an authoritative source.) The study was, at best, too narrow to be conclusive—and at worst flawed both in its analysis and in the way it was reported. Perhaps the best commentary among those I discussed was Paula Berinstein’s “Wikipedia and Britannica: The kid’s all right (and so’s the old man)” in the March 2006 Searcher.

Then there’s notability and Wikipedia’s “inability to handle domain experts,” as danah boyd puts it. That discussion made one of Wikipedia’s true peculiarities clear (a peculiarity that’s mentioned in contemporary discussion, below): Living persons are not expected to edit their own entries except perhaps to correct factual errors. It’s “culturally inappropriate”—and although founder Jimmy Wales did edit his own article, he apologized for doing so. Seth Finkelstein discussed the impossibility of opting out: The astonishing case that, unlike (for example) Who’s Who in America, he can’t say “I don’t want to be in Wikipedia” even though he’s not a politician or otherwise so notable that he has no grounds for such a request. Others have noted the same problem.

I also discussed a lengthy New Yorker article on Wikipedia, a thoughtful piece that pointed out some of its strengths and weaknesses and included the pointed comment that “Wikipedia’s bureaucracy doesn’t necessarily favor truth.” A far less balanced piece appeared in the September 2006 Atlantic Monthly; it was a gushing, unbalanced tribute that approvingly says truth is whatever the community says it is. “Yes, that means that if the community changes its mind and decides that two plus two equals five, then two plus two does equal five.” As I noted, this model of “truth” would mean evolution is a myth, at least in America, since the majority of Americans (who respond to polls) apparently don’t believe in it. “And there’s no global warming and we’ll never run out of oil.” As long as we clap our hands long enough and loud enough, Tinker Bell can fly and our SUVs can keep getting bigger forever. A third mainstream press piece in the Wall Street Journal was an email debate between Jimmy Wales and Dale Hoiberg, current editor-in-chief of the Britannica. It’s an odd piece, considerably flawed by Wales’ absolutism and dismissal of expertise.

Then there’s Citizendium—Larry Sanger’s fledgling project to produce a more authoritative Wikipedia. I discussed the early plans in that November 2006 essay, along with attacks from Clay Shirky and commentary from Nicholas Carr (who wrote off Citizendium before it got started). That project has changed course. More on that later.

We Interrupt this Program

Before discussing current controversies and developments in Wikipedia and Citizendium, it’s worth noting again that very few wikis aim to be encyclopedic. A wiki is a tool—another kind of “free as in speech” content management system, a bit less lightweight than a blog but better for different purposes. Sure, some people get overenthusiastic about wikis, creating them when simpler tools might work better. Certainly, there’s a problem within librarianship in that people seem more inclined to create new wikis than to contribute to existing ones.

But they do work, sometimes exceptionally well. This Perspective isn’t about wikis in general, but I was taken with “wikilove,” a commentary posted January 24, 2007 by Jenica at a blog that’s either called Thinking out loud or Mermaid (jenica26.squarespace.com/mermaid/). Jenica “love[s] our (password protected and unshareable) library wiki.” Excerpts:

One of my continuing frustrations is how to keep organized. As an area coordinator and a team leader, I have to organize information for my own use, I have to organize information so that our director can access it at the moment of need, I have to organize information for communal access by a committee, I have to organize information for communal access by a leadership team, I have to organize information so that the whole staff can find it, I have to organize information for the use of two working groups I chair. Each one of those information needs is different, and each one produces different kinds of information to organize. In some cases, the clear and appropriate solution is to use our shared storage drive. In some cases, we need to use a paper distribution system. In some cases, emailing documents is best. But I really hate those solutions for reasons of findability, retrievability, and convenience.

Which is why I love the wiki. Click on edit; edit; click on save; done. Click click click, retrieved. Or, search, click, retrieved. It’s fantastic.

Sadly, it doesn’t work for everything or for everyone, but as a finding tool it’s fantastic…[Examples for some of the cases above.] And since we use MediaWiki, there’s a “watch this page” feature that sends me an email when my colleagues edit the pages I’m monitoring, which means that tracking the progress of a joint project just got effortless—the software does the checking for me…

It’s a tool. Sometimes it’s a remarkably effective tool. Jenica’s instance isn’t what some would call social software because it’s protected—it’s a shared resource for the staff.

Some wikis fail after initial excitement—just like most blogs. Some wikis generate new wikis more than they succeed on their own terms—just like some blogs. And some wikis become international phenomena and sources of ongoing controversy. But those are fringe cases that have little to do with the underlying uses and benefits of wikis.

In other words, criticisms levied at Wikipedia and Citizendium in the remainder of this essay are criticisms of those particular wikis, not of wikis as tools.

Recent Wikipedia Controversies

This one isn’t recent except to me. “Digital Maoism: The hazards of the new online collectivism” by Jaron Lanier appeared at Edge (www.edge.org) on May 30, 2006. The blurb above the essay sets the tone:

The hive mind is for the most part stupid and boring. Why pay attention to it?

The problem is in the way the Wikipedia has come to be regarded and used; how it's been elevated to such importance so quickly. And that is part of the larger pattern of the appeal of a new online collectivism that is nothing less than a resurgence of the idea that the collective is all-wise, that it is desirable to have influence concentrated in a bottleneck that can channel the collective with the most verity and force. This is different from representative democracy, or meritocracy. This idea has had dreadful consequences when thrust upon us from the extreme Right or the extreme Left in various historical periods. The fact that it's now being re-introduced today by prominent technologists and futurists, people who in many cases I know and like, doesn't make it any less dangerous.

I’m not fond of the “hive mind” concept. The “wisdom of the crowd” strikes me as wildly overrated given, oh, elections, many online discussions, the general level of IMDB user reviews and other counter-examples. So what does “digital visionary” Jaron Lanier say about all this?

He starts out by complaining that Wikipedia identifies him as a film director. He’s tried to fix that, but it doesn’t stick. Now reporters are asking him about his filmmaking career (which consisted of one experimental short film years ago). I won’t summarize the entire essay. I will note a few items. Lanier says, “Accuracy in a text is not enough. A desirable text is more than a collection of accurate references. It is also an expression of personality.” I couldn’t agree more. To me, this is the difference between a bunch of facts strung together and a story. You could argue that collective editing works against story telling because it diminishes individual voices.

[M]ost of the technical or scientific information that is in the Wikipedia was already on the Web before the Wikipedia was started… In some cases I have noticed specific texts get cloned from original sites at universities or labs onto wiki pages. And when that happens, each text loses part of its value. Since search engines are now more likely to point you to the wikified versions, the Web has lost some of its flavor in casual use.

When you see the context in which something was written and you know who the author was beyond just a name, you learn so much more than when you find the same text placed in the anonymous, faux-authoritative, anti-contextual brew of the Wikipedia. The question isn't just one of authentication and accountability, though those are important, but something more subtle. A voice should be sensed as a whole… Even Britannica has an editorial voice, which some people have criticized as being vaguely too "Dead White Men."

If an ironic Web site devoted to destroying cinema claimed that I was a filmmaker, it would suddenly make sense. That would be an authentic piece of text. But placed out of context in the Wikipedia, it becomes drivel.

Lanier discusses the problems with “Meta” sites, sites that base their content on collective algorithms, in essence trying to reflect the “hive mind.” Digg and Reddit are such sites, as is del.icio.us to some extent, with popurls.com aggregating other metasites. He notes that popurls tends to ignore major but serious news (earthquakes, new approaches to diabetes management) in favor of pop culture and other trivia.

There are notions here I’d agree with, but there are problems as well. Consider this extract:

[I]t must at least be pointed out that writing professionally and well takes time and that most authors need to be paid to take that time. In this regard, blogging is not writing. For example, it's easy to be loved as a blogger. All you have to do is play to the crowd. Or you can flame the crowd to get attention. Nothing is wrong with either of those activities. What I think of as real writing, however, writing meant to last, is something else. It involves articulating a perspective that is not just reactive to yesterday's moves in a conversation.

That’s not quite Gormanesque, but still devalues many first-rate writers who blog (and who prepare their posts seriously) and many skilled writers who do some or all of their writing without direct pay. “Blogging is not writing” is not thinking; it’s unworthy of a professional writer or thinker.

Are there authentic examples of collective intelligence? Of course, and Lanier discusses them. He also asserts, “Every authentic example of collective intelligence that I am aware of also shows how that collective was guided or inspired by well-meaning individuals.” In the end, Lanier’s not entirely opposed to collective intelligence:

The hive mind should be thought of as a tool. Empowering the collective does not empower individuals—just the reverse is true. There can be useful feedback loops set up between individuals and the hive mind, but the hive mind is too chaotic to be fed back into itself.

Edge engages in “the reality club,” where a bunch of Important Minds comment on something—in this case, Lanier’s essay. The essay runs 12 pages; the responses take up 26 pages. I should note something about Lanier’s essay that I probably wouldn’t notice if I was a regular Edge reader (I tried it and decided against it), or if I was part of the In Crowd involved with Edge. To wit, Jaron Lanier makes sure that we know he’s important. He’s a big shot. He matters. Reporters talk to him frequently. The “prominent technologists and futurists” he disagrees with are “people who in many cases I know and like”—Lanier moves in important circles. Kevin Kelly “is a friend.” Consider this sentence (after we learn that Lanier’s a well-paid consultant who finds he’s being paid just as much to do less these days): “I've participated in a number of elite, well-paid wikis and Meta-surveys lately and have had a chance to observe the results.” I find this distracting and offputting. Clearly Lanier’s prominent enough to be in Wikipedia. Does he really need to remind us so frequently that he’s a bigshot and friends with People Who Count? If the essay is really about Jaron Lanier, maybe so. If it’s about “digital Maoism,” probably not.

I’ll admit to ignorance here. Lanier is described as a “computer scientist, composer, visual artist, and author” (that description could be taken directly from his website). Worldcat.org shows zero books by Lanier, but that’s OK: Lots of authors never get around to booklength projects. He claims to have coined the term “virtual reality”—and Wikipedia currently calls him a virtual reality developer. I don’t doubt that Lanier is an important public intellectual. I do doubt the wisdom of his constant stressing of his importance and connectedness. Incidentally, if you read the Wikipedia article on Jaron Lanier, make sure to read the Discussion page: To my mind, it says a little too much about what’s wrong with Wikipedia and the way Wikipedians deal with true topic experts, including the legitimate expertise that living people have about their own lives.

I’m going to cop out. If only for reasons of space, I’m not going to attempt coherent comments on the responses from such luminaries as John Brockman, Clay Shirky, Douglas Rushkoff, Cory Doctorow, Kevin Kelly, Esther Dyson, Larry Sanger, Jimmy Wales, Dan Gillmor and Howard Rheingold (among others). Brockman informs us that we’re migrating “from individual mind to collective intelligence” and “witnessing the emergence of a new kind of person”—and tells us how important all the other commenters are, including Clay Shirky, than whom “no one is deeper, more thoughtful, on the social and economic effects of Internet technologies.” The commenters are “a ‘who’s who’ of the movers, shakers, and pundits of this new universe of collective intelligence.” Wow. Some of those movers and shakers write well; some, surprisingly badly. Some think well; some do not, at least in these commentaries. Some agree with Lanier (at least partially); some mock him and celebrate the “hive mind.” After reading through all of the commentaries, I wrote a rude comment on the last page; I won’t repeat it in this family-friendly publication. You may find the commentaries enlightening; I did not.

Plagiarism?

A November 6, 2006 Associated Press item by Anick Jesdanun notes a project by Daniel Brandt to check portions of 12,000 biographical articles in Wikipedia against other sources, looking for plagiarism—that is, uncredited copying. He brought 142 articles—just under 1.2%, for what that’s worth—to Wikipedia’s attention. Jimmy Wales called the findings “exaggerated” while admitting that plagiarism does happen. Frankly, if just over 1% of the articles in Wikipedia contain uncredited copying from other sources, I’d say it’s doing pretty well.

Brandt is another person unhappy with his biography in Wikipedia. He runs Wikipedia Watch, Google Watch and Yahoo! Watch. I suggest looking at any or all of these sites; that certainly helped me form an opinion as to how seriously I should take Brandt’s criticisms. Ad hominem may be a logical fallacy but it’s useful in real life.

Can Wikipedia ever make the grade?

That’s the title of an October 27, 2006 Chronicle of higher education article by Brock Read. Read covers some of the same ground I’ve noted and does so engagingly and skillfully, adding more sources and seemingly favoring neither Wikipedia nor its critics. I appreciate the comment of Roy Rosenzweig, a history professor at George Mason University: “Are Wikipedians good historians? As in the old tale of the blind men and the elephant, your assessment of Wikipedia as history depends a great deal on what part you touch.” He finds “thorough, fairly well-written essays” on topics such as Red Faber and “Postage stamps and postal history of the United States,” but “incomplete, almost capricious coverage” on possibly more important topics such as American history from 1918 to 1945 (which at the time omitted items such as female suffrage, the Ku Klux Klan and the rise of radio).

Read identifies at least three reasons scholars may be reluctant to add articles in areas where Wikipedia is currently weak: The difficulty of keeping scholarly work there when avid editors are quick to delete or dumb it down; the fact that anonymous articles in Wikipedia won’t do a thing for scholarly respect or career advancement (why write for Wikipedia when you can be working on refereed articles?); and Wikipedia’s preference for concise articles (which seems less in evidence for pop culture). He has three scholars grade individual Wikipedia entries, with results ranging from A (for “Flow cytometry”) through C (for “African-American civil rights movement”).

The article’s accompanied by a sidebar and, on CHE’s website, a “live discussion” involving Alex Halavais and a number of questioners. Halavais is good: He believes college students shouldn’t be allowed by their professors to cite Wikipedia in research papers—“but only because it is an encyclopedia” (a stance with which founder Jimmy Wales agrees). Halavais now seems to be a Wikipedia supporter, based on his responses in general. Marc Meola commented on the discussion in an October 27, 2006 post at ACRLog. He didn’t change his basic view, “which is that the errors are too random and the editing too chaotic.”

Wikipedia and the trust factor

Paul Vallely wrote this piece in The Independent (London) on October 22, 2006; I picked it up from TechNewsWorld. Vallely asserts bias in a number of Wikipedia entries because the work of “dedicated contributors with idiosyncratic beliefs” sticks around—“because no one has the time and energy to counteract them.” In some cases, “pages seem to have been taken over by fanatics and special interest groups.” Vallely also notes “disproportionate emphasis,” one of the more cogent criticisms. Vallely isn’t anti-Wikipedia but he is cautious. I like this interchange:

How unreliable is it?

How long is a string of clichés? That’s how a lot of Wikipedia entries read. But then others read as if they were written by people who know what they’re talking about. The problem is all the stuff in between, which looks reliable, but you never know. Using Wikipedia is like asking questions of a bloke you met in the pub. He might be a nuclear physicist. Or he might be a fruitcake.

Vallely’s conclusion: “Wikipedia’s premise—that continuous improvement will lead to perfection—is completely unproven.”

Knowledge and unknowledge

Nicholas Carr posted this on December 3, 2006 at Rough type. Carr blogs about Wikipedia a lot. This time he notes that “elite members” at Wikipedia increasingly spend their time removing stuff, including real content. An article on a Canadian band was removed because the powers that be decided the band was “non-notable,” drawing a razzberry from the bandleader. According to Carr (citing a Washington Post article), about a hundred entries a day bite the dust—and Carr wonders why.

Now, philosophically, I have no problem with this newfound desire to separate the wheat from the chaff. Encyclopedias have always had to decide what's worthy of being included and what isn't. Wikipedia is just following the fine old tradition of selectivity. But what puzzles me is this: I thought Wikipedia was about not following tradition. I thought it was about being freed from the old physical world's scarcity-imposing constraints, the constraints that forced us for millennia to live without easy access to “the sum of all human knowledge.” I thought the fact that Wikipedia didn't have to worry about ink and paper and printing meant that it could be radically inclusive—that it could put everything in and let readers decide what was worthy of their time and what wasn't. I thought Wikipedia was about the long tail of knowledge. I thought it was about abundance.

He sees a discrepancy between ideal and practice and notes a response by Mitch Kapor to a question about gatekeeping: “Who said that quality emerges out of gatekeeping?” Carr’s comment: “Who said that quality emerges out of gatekeeping? That’s precisely what Wikipedia is saying, about a hundred times a day.”

In comments, Seth Finkelstein thinks Carr has put Wikipedia in a no-win situation and notes there have always been some standards for inclusion, trying to avoid “things that nobody will care about.” Michael Moneur believes Wikipedia has “been publishing volumes upon volumes of things nobody cares about for years,” such as pages about characters in obscure videogames and fans of characters in obscure videogames. Anthony Cowley agrees Wikipedia is violating its own slogan, “the sum of all human knowledge.” It’s an interesting and inconclusive discussion—but, much as I admire Seth Finkelstein’s work, I come down on the other side here. Wikipedia is absurdly inclusive on things geeks care about—pages on individual Mutant Ninja Turtles—and far more selective on “minor” topics that its core contributors and activists don’t care about.

Wikipedia will fail in four years

That’s the somewhat startling title of a December 5, 2006 post by Eric Goldman in his Technology & marketing law blog—and I should note that Goldman does have a (stub) entry in Wikipedia, with its own typically-interesting discussion page regarding his notability. This essay is actually a year-later update of a prediction that it would fail in five years. His key points:

•  Growing traffic makes Wikipedia a target for marketers. Wikipedians are the only things keeping marketers from screwing up pages.

•  Marketers will use automated tools to attack pages, forcing Wikipedians to spend more time and energy combating them, leading to burnout and causing them to leave the project, leaving fewer people to pick up the load.

•  This will lead to increasingly junky pages, which may lead to more Wikipedians bailing out because it’s no longer worthwhile.

•  “Thus, Wikipedia will enter a death spiral where the rate of junkiness will increase rapidly until the site becomes a wasteland. Alternatively, to prevent this death spiral, Wikipedia will change its core open-access architecture, increasing the database’s vitality by changing its mission somewhat.”

Goldman uses the Open Directory Project as an analogy, noting that ODP in its heyday did an “amazing job of aggregating free labor to produce a valuable database”—but ODP is “now effectively worthless.”

How’s it going a year later? There are still relatively few active editors. Jimmy Wales says the vast majority of work is done by about 1,000 Wikipedians. Goldman suggests you could argue Wikipedia has already diverged from an open-access paradigm, becoming more insular and self-focused. There aren’t a lot of rewards for being an active Wikipedian: it’s free and largely anonymous labor. Goldman believes marketers are increasing their pressure on Wikipedia. He stands by his prediction.

The Microsoft controversy and Wikipedia’s expertise problem

Microsoft was unhappy about Wikipedia’s entry on the Office Open XML document format. Someone at Microsoft offered to pay an independent expert to edit the entry. The offer was not hidden. The expert was free to write anything he wanted. Here’s how it was described in AP coverage on January 23, 2007, quoting Microsoft spokesperson Catherine Brooker:

Brooker said Microsoft had gotten nowhere in trying to flag the purported mistakes to Wikipedia's volunteer editors, so it sought an independent expert who could determine whether changes were necessary and enter them on Wikipedia. Brooker said Microsoft believed that having an independent source would be key in getting the changes to stick — that is, to not have them just overruled by other Wikipedia writers.

Brooker said Microsoft and the writer, Rick Jelliffe, had not determined a price and no money had changed hands—but they had agreed that the company would not be allowed to review his writing before submission.

Jimmy Wales’ response? “Wales said the proper course would have been for Microsoft to write or commission a ‘white paper’ on the subject with its interpretation of the facts, post it to an outside Web site and then link to it in the Wikipedia articles' discussion forums.”

To which I can only say, “Huh?” It’s unacceptable for Microsoft to submit material on an area it should be expert on and it’s unacceptable to hire an expert to do so—but it would be acceptable to post the material somewhere else in article form, then link to it. After all, Wikipedia material can’t be “original”—but cited sources can be almost any web site, including blog posts and /. essays. Even then, the expert isn’t allowed to make corrections, only to call for discussion.

I think Nicholas Carr’s take on Wales’ comment is worth quoting:

That's kind of an odd suggestion from “the encyclopedia that anyone can edit.” It seems like we're getting to the point where anyone who has gained deep enough knowledge of a subject to have developed a point of view on it will be unwelcome to edit Wikipedia. Experts, automatically considered suspect, will be forced to go through some parody of a traditional editorial process.

Seth Finkelstein posted “Wikipedia articles can be a disaster waiting to happen” on January 24, 2007 at Infothought, noting the Microsoft controversy and the extent to which Wikipedia “can be a minefield of conflicting rules, administered by petty bureaucrats, with a collection of obscure policies that spawn the term ‘wikilawyer.’” As Finkelstein notes, this isn’t always about ego. In the Microsoft case, where data formats are being described, “there’s big bucks at stake.” I love the closing paragraph:

Maybe this specific argument just comes with the territory of money and power. But still, it’s quite a feat to make me feel sympathy for Microsoft.

Later that month (January 29, 2007), Finkelstein noted posts he calls “Wikipedia punditry,” most having to do with Wikipedia’s mysterious internal bureaucracies, problems with expertise and internal power-tripping. Chris Edwards notes that the Byzantine procedure for suggesting a change when you’re an expert is “surprisingly close to that used by traditional publishers… That is, it would be if the publisher had a bureaucratic system based on China’s [local bureaucracies]…each one does it differently, and attitudes can change dramatically in the space of days, although they will refer to the same rule book and come back with some obscure answer…”

I was particularly interested in Kathryn Cramer’s detailed proposal (January 25, 2007, www.kathryncramer.com/kathryn_cramer/): “SF author bios should be moved from Wikipedia to the ISFDB wiki.” Currently, the Internet Speculative Fiction Database (an essential resource for SF anthologists like Cramer) does not include author bios; it relies on Wikipedia for such bios. Cramer thinks that’s the wrong way around:

After a brief experience with Wikipedia, its editors strike me as a pack of officious trolls whose main concern is to make sure that you don't actually know the people you are writing about. The science fiction field doesn't work that way. I know hundreds (maybe over a thousand) science fiction writers, editors, and fans. Many, many of them could be described as my “associates.” Am I connected to most members of the professional science fiction community in some way? You bet.

I've helped run a Hugo-nominated SF semiprozine for a couple of decades, I edit two year's best volumes, and am married to one of the most eminent editors in the field. But this connectedness holds true of really a lot of the people doing the actual biographies: Perhaps their connections are not so visible or so obvious, but the SF field is like one big extended family. We've all slept on each other's couches. We've bought each other drinks. We marry each other's daughters… It's Clan Fandom.

And of those creating biographies that don't know their subjects, what they are mostly doing is lifting the ISFDB bibliographies wholesale and transplanting the content over to Wikipedia.

Cramer’s asserting, I believe correctly, that Wikipedia’s distaste for being too close to your subject just doesn’t work in science fiction and fantasy: The writers and fans have always intermingled, and as “semiprozines” show, the lines are fuzzy.

It’s a long post and makes an excellent case. The best critics in science fiction tend to know the authors. That doesn’t cause them to pull punches—but it does, apparently, make them ineligible to write Wikipedia bios. She cites examples of well-written bios deleted or rewritten to eliminate material based on personal knowledge—substituting the bland stuff any Googler can find.

The proposal’s legal, as long as the ISFDB wiki has an appropriate license, but it’s clearly just a starting point: ISFDB should have real bios written by knowledgeable people. In updates, Cramer notes the Microsoft controversy and Jimmy Wales’ astonishing letter saying that editing Wikipedia for pay is unethical and “a grave violation of community trust.” Cramer says the controversy “reflects more on problems with Wikipedia than with Microsoft; Wales’ own attitudes promote the kind of bureaucratic paranoia and suspicion of expertise I experienced… Truth is not the point. The point is control.” And this:

To me the biggest irony of the Microsoft controversy is that the material that the Wikipedians I talked to insisted was the only kind of material appropriate for sourcing was pretty much all written by people who were paid to write it. And edited by people who were paid to edit it.

Comments on the proposal begin with a charming irony: If ISFDB did create its own bios, “Wikipedia could then happily lift all the information from the ISFDB bios, and they would be properly ‘sourced.’”

Citizendium

Citizendium hasn’t disappeared; in fact, it’s now opening registration. But there’s been a big change in the project. Based on comments from early contributors, Larry Sanger concluded that forking Wikipedia in its entirety resulted in too much mediocre material, which contributors found tiresome to edit.

So the fork is being “unforked.” All Wikipedia articles in Citizendium that haven’t yet been edited by Citizendium contributors will be (or have been) deleted. That makes the site much smaller but should substantially improve the average quality of entries—if Sanger’s basic theses are correct. It’s far too early to tell whether that’s true.

Comments and Conclusions

When Jimmy Wales says college students shouldn’t cite Wikipedia in research papers because they shouldn’t cite any encyclopedia, I agree. When Jimmy Wales says, “One aspect of Jaron Lanier’s criticism had to do with the passionate, unique, individual voice he prefers, rather than this sort of bland, royal-we voice of Wikipedia. To that I’d say ‘yes, we plead guilty quite happily.’ We’re an encyclopedia,” I disagree. Lanier struck me as calling for voice—not necessarily “passionate” but coherent, turning sets of facts into stories. There is nothing about an encyclopedia that precludes coherent, well-written entries representing single voices with personality; groupthink and bland speech are not prerequisites for encyclopedia entries. (Remember the “scholars’ edition” of the Britannica?)

I looked up an article in a traditional encyclopedia (albeit one in DVD form): Encarta 2007. The article, “Pre-Columbian art and architecture,” is long, segmented, and interesting; it’s written in a clear voice that tells a story. It’s also signed, in this case by Robert J. Loscher of the Art Institute of Chicago, an expert in the field. So are many articles in many encyclopedias. Wales’ defense is simply nonsense.

One frequently cited issue, the uncertainty as to whether stuff in Wikipedia has any basis in fact, is to some extent being dealt with as articles show ever more footnotes. Unfortunately, that process seems to have two negative side effects: It makes the articles harder to read (when there are superscript numbers every sentence or two), and it may be making articles even less coherent and “voiced.”

There’s nothing earthshaking here, just an update on continuing issues in the wacky world of Wikipedia. I believe Wikipedia’s “all human knowledge” claim has been sufficiently revealed as hypocritical to be discarded. One editor’s “non-notable” is another person’s substantial. Will Wikipedia die (of bloat, marketing or other problems) by 2010? How should I know?

I’m happy I don’t have an article in the English Wikipedia (there is, as of this writing, a stub in the German version). I’ll do more notable people in the library field the favor of not creating articles on them either. Increasingly, the “solutions” to Wikipedia problems appear to be problematic—and for living subjects, inclusion in the project can be the kind of honor one might wish to duck. But while you can opt out of Who’s Who in America or Who’s Who in the World, you can’t opt out of Wikipedia—unless, of course, you’re part of the mostly-pseudonymous inner circle. And that’s just wrong, unless you’re a politician, Nobel laureate, before-the-title actor or similarly public person.

Cites & Insights: Crawford at Large, Volume 7, Number 3, Whole Issue 87, ISSN 1534-0937, a journal of libraries, policy, technology and media, is written and produced by Walt Crawford, a senior analyst at OCLC.

Cites & Insights is sponsored by YBP Library Services, http://www.ybp.com.

Opinions herein may not represent those of OCLC or YBP Library Services.

Comments should be sent to waltcrawford@gmail.com. Comments specifically intended for publication should go to citesandinsights@gmail.com. Cites & Insights: Crawford at Large is copyright © 2007 by Walt Crawford: Some rights reserved.

All original material in this work is licensed under the Creative Commons Attribution-NonCommercial License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/1.0 or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.

URL: citesandinsights.info/civ7i3.pdf