Coase's Penguin, or, Linux and the Nature of the Firm Yochai Benkler

NASA Clickworkers is “an experiment to see if public volunteers, each working for a few minutes here and there can do some routine science analysis that would normally be done by a scientist or graduate student working for months on end.”14 Currently a user can mark craters on maps of Mars, classify craters that have already been marked, or search the Mars landscape for “honeycomb” terrain. The project is “a pilot study with limited funding, run part-time by one software engineer, with occasional input from two scientists.”15 In its first six months of operation over 85,000 users visited the site, with many contributing to the effort, making over 1.9 million entries (including redundant entries of the same craters, used to average out errors). An analysis of the quality of markings showed “that the automatically-computed consensus of a large number of clickworkers is virtually indistinguishable from the inputs of a geologist with years of experience in identifying Mars craters.”16 The tasks performed by clickworkers (like marking craters) are discrete, and each iteration is easily performed in a matter of minutes. As a result users can choose to work for a minute doing one iteration or for hours by doing many. An early study of the project suggests that some clickworkers indeed work on the project for weeks, but that 37% of the work was done by one-time contributors.17 The clickworkers project is a perfect example of how complex and previously highly professional (though perhaps tedious) tasks, which required budgeting the full-time salaries of a number of highly trained individuals, can be reorganized so that they can be performed by tens of thousands of volunteers in increments so minute that the tasks can now be performed on a much lower budget. This low budget is devoted to coordinating the volunteer effort, but the raw human capital needed is contributed for the fun of it.
The professionalism of the original scientists is replaced by a combination of very high modularization of the task, coupled with redundancy and automated averaging out of both errors and purposeful defections (e.g., purposefully erroneous markings).18 What the NASA scientists running this experiment had tapped into was a vast pool of five-minute increments of human intelligence applied with motivation to a task that is unrelated to keeping together the bodies and souls of the agents.
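The combination of redundancy and automated averaging described above can be illustrated with a minimal sketch. It is not the Clickworkers project's actual algorithm, which the text does not specify. The function name `consensus_craters`, the `tolerance` radius, and the `min_votes` threshold are all hypothetical choices made for illustration: nearby markings are grouped, groups with too few independent markings are discarded (which drops isolated errors and defections), and the remainder are averaged.

```python
from statistics import mean


def consensus_craters(markings, tolerance=5.0, min_votes=3):
    """Average redundant (x, y) crater markings into consensus positions.

    Hypothetical illustration: `tolerance` is the maximum distance from a
    group's current centre at which a marking still joins that group, and
    `min_votes` is the number of markings a group needs to be kept.
    """
    clusters = []  # each cluster is a list of (x, y) markings
    for x, y in markings:
        for cluster in clusters:
            cx = mean(p[0] for p in cluster)
            cy = mean(p[1] for p in cluster)
            if (x - cx) ** 2 + (y - cy) ** 2 <= tolerance ** 2:
                cluster.append((x, y))
                break
        else:
            # No existing group is close enough, so start a new one.
            clusters.append([(x, y)])
    # Keep only groups confirmed by enough volunteers, then average them.
    return [
        (mean(p[0] for p in c), mean(p[1] for p in c))
        for c in clusters
        if len(c) >= min_votes
    ]


markings = [(10, 10), (11, 9), (9, 11), (50, 50),
            (100, 100), (101, 100), (99, 100)]
craters = consensus_craters(markings)
```

In this toy input the single marking at (50, 50) has no corroboration and is dropped, while the two groups of three markings each collapse to one averaged position.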

14 http://clickworkers.arc.nasa.gov/top
15 http://clickworkers.arc.nasa.gov/contact
16 Clickworkers Results: Crater Marking Activity, July 3, 2001, http://clickworkers.arc.nasa.gov/documents/crater-marking.pdf
17 B. Kanefsky, N.G. Barlow, and V.C. Gulick, Can Distributed Volunteers Accomplish Massive Data Analysis Tasks? http://clickworkers.arc.nasa.gov/documents/abstract.pdf
18 Clickworkers results, supra, para. 2.2.

COASE’S PENGUIN Oct. 2001

While clickworkers is a distinct, self-conscious experiment, it suggests characteristics of distributed production that are, in fact, quite widely observable.

Consider at the most general level the similarity between this project and the way the World Wide Web can be used to answer a distinct question that anyone can have at a given moment. Individuals put up web sites with all manner of information, of all kinds of quality and focus, for reasons that have nothing to do with external, well-defined economic motives, just like the individuals who identify craters on Mars. A user interested in information need only plug a search request into a search engine like Google, and dozens or hundreds of websites will appear. Now, there is a question of how to select among them: the question of relevance and accreditation. But that is for the next subpart. For now what is important to recognize is that the web is a global library produced by millions of people. Whenever I sit down to search for information, there is a very high likelihood that someone, somewhere, has produced a usable answer, for whatever reason: pleasure, self-advertising, or fulfilling some other public or private goal as a non-profit or for-profit that sustains itself by means other than selling the information it posted. The power of the web comes not from the fact that one particular site has all the great answers. It is not an Encyclopedia Britannica. The power comes from the fact that it allows a user looking for specific information at a given time to collect answers from a sufficiently large number of contributions. The task of sifting and accrediting falls to the user, motivated by the need to find an answer to the question posed. As long as there are tools to lower the cost of that task to a level acceptable to the user, the Web shall have “produced” the information content the user was looking for. These are not trivial considerations (though they are much more trivial today with high speed connections and substantially better search engines than those available a mere two or three years ago). But they are also not intractable. And, as we shall see, some of the solutions can themselves be peer produced.

Another important trend to look at is the emergence and rise of computer games, and in particular multi-player and online games. These fall in the same cultural “time slot” as television shows and movies of the 20th century. The interesting thing about them is that they are structurally different. In a game like Ultima Online, the role of the commercial provider is not to tell a finished, highly polished story to be consumed start to finish by passive consumers. Rather, the role of the game provider is to build tools with which users collaborate to tell a story. There have been observations about this approach for years, regarding MUDs and MOOs.

The point here is that there is a discrete element of the “content” that can be produced in a centralized professional manner (the screenwriter here replaces the scientist in the NASA clickworkers example) but that can also be organized, using the appropriate software platform, to allow the story to be written by the many users as they experience it. The individual contributions of the users/co-authors of the storyline are literally done for fun (they are playing a game), but they are spending a real economic good, their attention, on a form of entertainment that displaces what used to be passive reception of a finished, commercially and professionally manufactured good with a platform for active co-production of a storyline. The individual contributions are much more substantial than the time needed to mark craters, but then the contributors are having a whole lot more fun manipulating the intrigues of their imaginary Guild than poring over digitized images of faint craters on Mars.

2. Relevance/accreditation

Perhaps, you might say, many distributed individuals can produce content, but it’s gobbledygook. Who in their right mind wants to get answers to legal questions from a fifteen-year-old child who learned the answers from watching Court TV?19 The question, then, becomes whether relevance and accreditation of initial utterances of information can themselves be produced on a peer production model. At an initial intuitive level the answer is provided by commercial businesses that are breaking off precisely the “accreditation and relevance” piece of their product for peer production.

Amazon is a perfect example.

Amazon uses a mix of mechanisms to put in front of its buyers books and other products that they are likely to buy, and it produces relevance and accreditation by harnessing the users themselves. At the simplest level, the recommendation “customers who bought items you recently viewed also bought these items,” is a mechanical means of extracting judgments of relevance and accreditation from the collective actions of many individuals who produce the datum of relevance as a by-product of making their own purchasing decisions. At a more self-conscious level (self-conscious, that is, on the part of the user), Amazon allows users to create topical lists, and to track other users as their “friends and favorites,” whose decisions they have learned to trust. Amazon also provides users with the ability to rate books they buy, generating a peer-produced rating by averaging the ratings. The point to take home from Amazon is that a corporation that has done immensely well at acquiring and retaining customers harnesses peer production to provide one of its salient values: its ability to allow users to find things they want quickly and efficiently.
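To make the “by-product” point concrete, the two simplest mechanisms above can be sketched in a few lines. This is an illustration under stated assumptions, not a description of Amazon's actual system: the function names `also_bought` and `average_rating` and the sample orders are hypothetical. Relevance here is nothing more than a count of how often other items appear in the same purchase, and the peer-produced rating is a plain arithmetic mean.

```python
from collections import Counter


def also_bought(orders, item, top_n=3):
    """Rank other items by how often they were bought together with `item`.

    `orders` is a list of sets of item names (hypothetical purchase records).
    """
    counts = Counter()
    for basket in orders:
        if item in basket:
            # Every co-purchased item gets one implicit "vote" of relevance.
            counts.update(set(basket) - {item})
    return [other for other, _ in counts.most_common(top_n)]


def average_rating(ratings):
    """Peer-produced rating: the mean of individual ratings, or None if empty."""
    return sum(ratings) / len(ratings) if ratings else None


orders = [{"A", "B"}, {"A", "B", "C"}, {"A", "C"}, {"B", "D"}, {"A", "B"}]
recommended = also_bought(orders, "A")  # items most often bought with "A"
score = average_rating([5, 4, 3])
```

No buyer in this sketch sets out to produce a recommendation; the ranking falls out of purchases made for their own reasons.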

Similarly, Google, probably the most efficient general search engine currently in operation, introduced a crucial innovation into ranking results that made it substantially better than any of its competitors. While Google uses a text-based algorithm to retrieve a given universe of web pages initially, its PageRank software employs peer production of ranking in the following way.20 The engine treats links from other web sites pointing to a given website as votes of confidence. Whenever one person who has a page links to another page, that person has stated quite explicitly that the linked page is worth a visit. Google’s search engine counts these links as distributed votes of confidence in the quality of that page as among pages that fit the basic search algorithm. Pages that themselves are heavily linked-to count as more important votes of confidence, so if a highly linked-to site links to a given page, that vote counts for more than if an obscure site that no one else thinks is worth visiting links to it. The point here is that what Google did was precisely to harness the distributed judgments of many users, each made and expressed as a byproduct of making one’s own site useful, to produce a highly accurate relevance and accreditation algorithm.

19 NY Times Magazine, Sunday, July 15, 2001 cover story.
20 See description http://www.google.com/technology/index.html.
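The idea that votes from heavily linked-to pages count for more can be sketched as a simplified link-ranking iteration. It follows the general shape of the published PageRank idea, but it is only an illustration and not Google's production algorithm. The function name `link_rank`, the damping value of 0.85, the fixed iteration count, and the toy link graph are assumptions made for this example.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages so that a link from a high-scoring page is a heavier vote.

    `links` maps each page to the list of pages it links to (hypothetical data).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline score regardless of inbound links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Pages with no outbound links spread their score evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


links = {
    "hub": ["a"],        # "hub" is linked to by three other pages
    "x": ["hub"],
    "y": ["hub"],
    "z": ["hub"],
    "obscure": ["b"],    # nobody links to "obscure"
}
scores = link_rank(links)
```

Pages "a" and "b" each receive exactly one inbound link, yet "a" ends up with the higher score because its single vote comes from the heavily linked-to "hub" rather than from a page no one else links to.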

While Google is an automated mechanism of collecting human judgment as a by-product of some other activity (publishing a web page), there are also important examples of distributed projects self-consciously devoted to peer production of relevance. Most prominent among these is the Open Directory Project. The site relies on tens of thousands of volunteer editors to determine which links should be included in the directory. Acceptance as a volunteer requires application, and not all applicants are accepted. Quality relies on a peer review process based substantially on seniority as a volunteer and engagement. The site is hosted and administered by Netscape, which pays for server space and a small number of employees to administer the site and set up the initial guidelines, but licensing is free, and presumably adds value partly to AOL’s and Netscape’s commercial search engine/portal, and partly through goodwill. The volunteers are not affiliated with Netscape, receive no compensation, and manage the directory out of the joy of doing so, or for other internal or external motivations. Out of these motivations the volunteers spend time on selecting sites for inclusion in the directory (in small increments of perhaps 15 minutes per site reviewed), producing the most comprehensive, highest quality human-edited directory of the Web, competing with, and perhaps overtaking, Yahoo in this category. The result is a project that forms the basis of the directories for many of the leading commercial sites, as well as offering a free and advertising-free directory for all to use.

Perhaps the most elaborate mechanism for peer production of relevance and accreditation, at multiple layers, is Slashdot.21 Billed as “News for Nerds”, Slashdot primarily consists of users commenting on initial submissions that cover a variety of technology-related topics. The submissions are typically a link to a proprietary story coupled with some initial commentary from the person who submits the piece. Users follow up the initial submission with comments that often number in the hundreds.

The initial submissions themselves, and more importantly the approach to sifting through the comments of users for relevance and accreditation, provide a rich example of how this function can be performed on a distributed, peer production model.

First, it is important to understand that the function of posting a story from another site onto Slashdot, the first “utterance” in a chain of comments on Slashdot, is itself an act of relevance production. The person submitting the story is telling the community of Slashdot users “here is a story that people interested in ‘News for Nerds’ should be interested in.” This initial submission of a link is itself filtered by “authors” (really editors) who are largely paid employees of Open Source Development Network (OSDN), a corporation that sells advertising on Slashdot.

Stories are filtered out if they have technical formatting problems or, in principle, if they are poorly written or outdated. This segment of the service, then, seems mostly traditional: paid employees of the “publisher” decide which stories are, and which are not, interesting and of sufficient quality. The only “peer production” element here is the fact that the initial trolling of the web for interesting stories is itself performed in a distributed fashion. This characterization nonetheless must be tempered, because the filter is relatively coarse, as exemplified by the FAQ response to the question, “how do you verify the accuracy of Slashdot stories?” The answer is, “We don’t. You do. If something seems outrageous, we might look for some corroboration, but as a rule, we regard this as the responsibility of the submitter and the audience. This is why it's important to read comments. You might find something that refutes, or supports, the story in the main.”22 In other words, Slashdot very self-consciously is organized as a means of facilitating peer production of accreditation: it is at the comments stage that the story undergoes its most important form of accreditation, peer review ex post.

And things do get a lot more interesting as one looks at the comments. Here, what Slashdot allows is the production of commentary on a peer-based model. Users submit comments that are displayed together with the initial submission of a story.

