Coase’s Penguin, or, Linux and the Nature of the Firm
Yochai Benkler
Oct. 2001

Abstract: The emergence of GNU/Linux as a viable alternative to the Windows ...
Think of the “content” produced in these comments as a cross between academic peer review of journal submissions and a peer-produced substitute for television’s “talking heads.” It is in the means of accrediting and evaluating these comments that Slashdot’s system provides a comprehensive example of peer production of relevance and accreditation.
Slashdot implements an automated system to select moderators from the pool of its users. Moderators are selected according to several criteria: they must be logged in (not anonymous); they must be regular users (which selects users who use the site with average frequency, rather than one-time page loaders or compulsive users); they must have been using the site for a while (which defeats people who sign up just to moderate); they must be willing; and they must have positive “karma.” Karma is a number assigned to a user that primarily reflects whether the user has posted good or bad comments, according to ratings from other moderators (see http://slashdot.org/faq/editorial.shtml#ed230). If a user meets these criteria, the program assigns the user moderator status, and the user receives five “influence points” with which to review comments. The moderator rates a comment of his choice using a drop-down list of labels such as “flamebait” or “informative.” A positive label increases the rating of the comment by one point, and a negative label decreases the rating by one point.
Each time a moderator rates a comment it costs the moderator one influence point, so the moderator can rate only five comments in each moderation period. The period lasts three days, and any unused influence points expire at its end. The moderation setup is intentionally designed to give many users a small amount of power, thus decreasing the effect of rogue users with an axe to grind or with poor judgment.
The site also implements some automated “troll filters” which prevent users from sabotaging the system. The troll filters prevent users from posting more than once every 60 seconds, prevent identical posts, and will ban a user for 24 hours if the user has been moderated down several times within a short time frame.
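The influence-point mechanics described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not Slashdot’s actual code: the label sets beyond “flamebait” and “informative” are assumed, while the five-point budget and the –1 to 5 rating range come from the text.

```python
# Hypothetical sketch of Slashdot-style point-based moderation. Each
# moderator gets five influence points per period; each rating costs one.

POSITIVE_LABELS = {"informative", "insightful", "funny"}   # assumed set
NEGATIVE_LABELS = {"flamebait", "troll", "offtopic"}       # assumed set

class Comment:
    def __init__(self, score=1):
        self.score = score  # a registered user's comment starts at 1

class Moderator:
    def __init__(self):
        self.points = 5  # five influence points per moderation period

    def rate(self, comment, label):
        if self.points == 0:
            raise RuntimeError("influence points exhausted for this period")
        if label in POSITIVE_LABELS:
            comment.score = min(5, comment.score + 1)   # ratings cap at 5
        elif label in NEGATIVE_LABELS:
            comment.score = max(-1, comment.score - 1)  # and bottom out at -1
        else:
            raise ValueError("unknown label: " + label)
        self.points -= 1  # each rating costs one influence point
```

A moderator who rates five comments in a period exhausts the budget, which is precisely the “many users, small power” design the text describes.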
Slashdot provides users with a “threshold” filter that allows each user to block lower-quality comments. The scheme uses the numerical rating of the comment (ranging from –1 to 5). Comments start out at 0 for anonymous posters, 1 for registered users, and 2 for registered users with good “karma.” As a result, if a user sets the filter at 1, the user will not see any comments from anonymous posters unless the comments’ ratings were increased by a moderator. A user can set the filter anywhere from –1 (viewing all of the comments) to 5 (at which only posts that have been upgraded by several moderators will show up).
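The threshold mechanism amounts to a simple score filter. The sketch below is illustrative (the function name and data layout are mine); the starting scores of 0, 1, and 2 and the –1 to 5 range come from the text.

```python
# Illustrative sketch of threshold filtering: a reader sees only comments
# whose current rating meets the threshold he or she has chosen (-1 to 5).

def visible(comments, threshold):
    return [c for c in comments if c["score"] >= threshold]

comments = [
    {"author": "anonymous", "score": 0},    # anonymous posts start at 0
    {"author": "registered", "score": 1},   # registered users start at 1
    {"author": "good_karma", "score": 2},   # good karma earns a starting 2
]
```

With the threshold set at 1, the unmoderated anonymous comment is filtered out; at –1, every comment is shown.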
Relevance is also tied into the Slashdot scheme because off topic posts should receive an “off topic” rating by the moderators and sink below the threshold level (assuming the user has the threshold set above the minimum). However, the moderation system is limited to choices that sometimes are not mutually exclusive.
For instance, a moderator may have to choose between “funny” (+1) and “off topic” (–1) when a post is both funny and off topic. As a result, an irrelevant post can increase in ranking and rise above the threshold level because it is funny or informative. It is unclear, however, whether this is a limitation on relevance, or whether it in fact mimics our own normal behavior, say in reading a newspaper or browsing a library, where we might let our eyes linger longer on a funny or informative tidbit even after we have ascertained that it is not exactly relevant to what we were looking for.
If a user’s posts consistently receive high ratings, the user’s karma will increase. At a certain karma level, the user’s comments will start off with a rating of 2, giving the user a louder voice in the sense that readers with a threshold of 2 will now see those posts immediately. Likewise, a user with bad karma from consistently poorly rated comments can lose accreditation by having posts initially start off at 0 or –1. At the –1 level, the posts will probably not get moderated at all, effectively removing the opportunity for the “bad” poster to regain any karma.
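The relationship between karma and a post’s starting score can be sketched as follows. The output levels (–1, 0, 1, 2) come from the text; the numeric karma cutoffs are invented for illustration, since the article reports only the resulting levels.

```python
# Hypothetical mapping from a user's karma to the initial rating of a new
# post. The karma cutoffs (-10, 0, 25) are assumptions for illustration.

def initial_score(karma, anonymous=False):
    if anonymous:
        return 0      # anonymous posts always start at 0
    if karma <= -10:
        return -1     # consistently bad posters start below most filters
    if karma < 0:
        return 0
    if karma >= 25:
        return 2      # high karma buys a "louder voice"
    return 1          # the default for registered users
```

A poster starting at –1 sits below nearly every reader’s filter, which is why, as the text notes, such posts rarely get moderated back up.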
Together, these mechanisms allow for the distributed production of both relevance and accreditation. Because many moderators can moderate any given comment, and thanks to the mechanisms that explicitly limit the power of any one moderator to over-influence the aggregate judgment, the system evens out differences in evaluation by aggregating judgments. The system then allows each user to determine what level of accreditation, as pronounced by this aggregate system, fits his or her particular time and needs, by setting the filter to be more or less inclusive and thereby relying to a greater or lesser extent on the judgment of the moderators. By introducing “karma,” the system also allows users to build reputation over time and to gain greater control over the accreditation of their own work relative to the power of the critics, just as very famous authors or playwrights might hold such control over unforgiving critics. The mechanized means of preventing defection or gaming of the system, applied to both users and moderators, also serve to mediate some of the collective action problems one might expect in a joint effort involving many people.
In addition to the mechanized means of selecting moderators and minimizing their power to skew the aggregate judgment of the accreditation system, Slashdot implements a system of peer-review accreditation for the moderators themselves.
Slashdot implements meta-moderation by making any user whose account is among the first 90% of accounts created on the system eligible to moderate the moderations.
Each eligible user who opts to perform meta-moderation review is provided with 10 random moderator ratings of comments. The user/meta-moderator then rates the moderator’s rating as either unfair, fair, or neither. The meta-moderation process then affects the karma of the original moderator, which, when lowered sufficiently by cumulative judgments of unfair ratings, will remove the moderator from the moderation system.
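The meta-moderation loop can be sketched as follows. The ten-rating sample and the fair/unfair/neither verdicts come from the text; the one-point karma step and the removal threshold are assumptions, since the article says only that sufficiently lowered karma removes the moderator.

```python
# Sketch of meta-moderation: an eligible user reviews ten randomly chosen
# moderations, and "unfair" verdicts lower the original moderator's karma.
# The +/-1 karma step and the removal threshold of -5 are assumptions.

import random

class ModeratorRecord:
    def __init__(self):
        self.karma = 0
        self.eligible = True

def meta_moderate(moderations, judge, removal_threshold=-5):
    for m in random.sample(moderations, min(10, len(moderations))):
        verdict = judge(m)          # returns "fair", "unfair", or "neither"
        if verdict == "unfair":
            m["moderator"].karma -= 1
        elif verdict == "fair":
            m["moderator"].karma += 1
        # "neither" leaves karma unchanged
    # moderators whose karma sinks low enough lose moderation eligibility
    for m in moderations:
        if m["moderator"].karma <= removal_threshold:
            m["moderator"].eligible = False
```

The design choice worth noting is that the reviewers judge individual ratings, not moderators wholesale, so a moderator is removed only by the cumulative weight of many independent “unfair” verdicts.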
Users, moderators, and meta-moderators are all volunteers. Using sophisticated software to mediate the multiple judgments of average users regarding the postings of others, the system generates aggregate judgments as to the relevance and quality of user comments. The primary point to take from the Slashdot example is that the same dynamic that we saw used for peer production of content can be implemented to produce relevance and accreditation. Rather than using the full-time effort of professional accreditation experts (call them editors or experts), the system is designed to permit the aggregation of many small judgments, each of which entails a trivial effort for the contributor, regarding both relevance and accreditation of the materials sought to be accredited. Another important point is that the software that mediates the communication among the collaborating peers embeds both the means to facilitate participation and the means to defend the common effort from defection.
3. Value-added Distribution
Finally, when we speak of information or cultural goods that exist (content has been produced) and are made usable through some relevance and accreditation mechanisms, there remains the question of “distribution.” To some extent this is a non-issue on the Net: distribution is cheap; all one needs is a server and large pipes connecting one’s server to the world, and anyone, anywhere can get the information. I mention it here for two reasons. One, there are a variety of value-adding activities, like proofreading in print publication, that need to be done at the distribution stage. Again, as long as we are talking about individual web sites, the author who placed the content on the Web will likely, for the same motivations that caused him or her to put the materials together in the first place, seek to ensure these distribution values. Still, we have very good examples of precisely these types of value being produced on a peer production model. Two, as the Net is developing, the largest ISPs are trying to differentiate their services by providing certain distribution-related values. The most obvious examples are caching and mirroring: implementations by the ISP (caching) or a third party like Akamai (mirroring) that insert themselves into the distribution chain in order to make some material more easily accessible than other material. The question is the extent to which peer distribution can provide similar or substitute values.
The most notorious example is Napster. The point was that the collective availability of tens of millions of hard drives of individual users provided a substantially more efficient distribution system for a much wider variety of songs than the centralized (and hence easier to control) distribution systems preferred by the record industry. The aim here is not to sing the praises of the dearly departed (as of this writing) Napster. The point is that, setting aside the issues of content ownership, efficient distribution could be offered by individuals for individuals. Instead of any one corporation putting funds into building and maintaining a large server, end users opened part of their hard drives to make content available to others. And while Napster required a central addressing system to connect these hard drives, Gnutella does not. This is not the place to go into the debate over whether Gnutella has its own limitations, be they scalability or free riding. The point is that there are both volunteers and commercial software companies involved in developing software intended to allow users to set up a peer-based distribution system that will be independent of the more commercially controlled distribution systems, and will run from the edges of the network to its edges, rather than through a controlled middle.
Perhaps the most interesting, discrete, and puzzling (for anyone who dislikes proofreading) instantiation of the peer-based distribution function is Project Gutenberg (http://promo.net/pg/) and the site set up to support it, Distributed Proofreading. Project Gutenberg entails hundreds of volunteers who scan in and correct books so that they are freely available in digital form. Currently, Project Gutenberg has amassed around 3,500 “etexts” through the efforts of volunteers and makes the collection available to everyone for free. The vast majority of the etexts are offered as public domain materials. The etexts are offered in ASCII format, the lowest common denominator, which makes it possible to reach the widest audience. The site itself presents the etexts in ASCII but does not discourage volunteers from offering them in markup languages as well. It contains a search engine that allows a reader to search typical fields such as subject, author, and title. Distributed Proofreading is a site that supports Project Gutenberg by allowing volunteers to proofread an etext by comparing it to scanned images of the original book. The site is maintained and administered by one person.
Project Gutenberg volunteers can select any book that is in the public domain to transform into an etext. The volunteer submits a copy of the title page of the book to Michael Hart—who founded the project—for copyright research. The volunteer is notified to proceed if the book passes the copyright clearance. The decision on which book to convert to etext is left up to the volunteer, subject to copyright limitations.
Typically a volunteer converts a book to ASCII format using OCR (optical character recognition) and proofreads it once in order to screen it for major errors. The volunteer then passes the ASCII file to a volunteer proofreader. This exchange is orchestrated with very little supervision. The volunteers use a listserv mailing list and a bulletin board to initiate and supervise the exchange. In addition, books are labeled with a version number indicating how many times they have been proofed. The site encourages volunteers to select a book with a low number and proof it. The Project Gutenberg proofing process is simple and involves looking at the text itself and examining it for errors. The proofreaders (aside from the first pass) are not expected to have access to the book or the scanned images, but merely review the etext for self-evident errors.
Distributed Proofreading (http://charlz.dynip.com/gutenberg/), a site unaffiliated with Project Gutenberg, is devoted to proofing Project Gutenberg etexts more efficiently, by distributing the volunteer proofreading function into smaller and more information-rich modules. In the Distributed Proofreading process, scanned pages are stored on the site, and volunteers are shown a scanned page and a page of the etext simultaneously so that the volunteer can compare the etext to the original page. Because of this modularity, proofreaders can come to the site, proof one or a few pages, and submit them. By contrast, on the Project Gutenberg site the entire book, or at minimum a chapter, is typically exchanged. In this fashion, Distributed Proofreading clears the proofing of thousands of pages every month.
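The modularity Distributed Proofreading adds amounts to replacing whole-book hand-offs with a queue of single pages. The sketch below is a minimal illustration of that design; the class and field names are mine, not the site’s.

```python
# Minimal sketch of page-level modularity: rather than passing a whole book
# between volunteers, the site hands out one scanned page and its OCR text
# at a time. Names and data layout here are illustrative assumptions.

from collections import deque

class ProofreadingQueue:
    def __init__(self, pages):
        self.todo = deque(pages)   # pages awaiting a volunteer
        self.done = []             # pages already proofed and submitted

    def checkout(self):
        """Hand the next unproofed page to a volunteer (None when finished)."""
        return self.todo.popleft() if self.todo else None

    def submit(self, page, corrected_text):
        page["text"] = corrected_text
        self.done.append(page)

queue = ProofreadingQueue([
    {"image": "p001.png", "text": "It was the best of tirnes"},  # OCR error
    {"image": "p002.png", "text": "it was the worst of times"},
])
page = queue.checkout()
queue.submit(page, "It was the best of times")
```

Because each unit of work is one page rather than one book, a volunteer with five spare minutes can still contribute, which is how the site clears thousands of pages a month.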
What is particularly interesting about these sites is that they show that even the most painstaking, and some might say mundane, jobs can be performed on a distributed model. Here the motivation problem may be particularly salient, but it appears that a combination of bibliophilia and community ties suffices (both sites are much smaller and more tightly knit than the Linux development community or the tens of thousands of NASA clickworkers). The point is that individuals can self-identify as having a passion for a particular book, or as having the time and inclination to proofread as part of a broader project they perceive to be in the public good. By connecting a very large number of people to these potential opportunities to produce, Project Gutenberg, just like clickworkers, or Slashdot, or Amazon, can capitalize on an enormous pool of underutilized intelligent human creativity and willingness to engage in intellectual effort.
What I hope all these examples provide is a common set of mental pictures of what peer production looks like. In the remainder of the article I will