Information, Privacy, and the Internet: An Economic Perspective
Susan Athey
Stanford Graduate School of Business
Second, harming the efficiency of online advertising is typically regressive. Advertising supports free products, and low-income people benefit from ad-supported free products more than wealthy people do. For example, free productivity software such as Google Docs or Office Web Apps is especially beneficial to students, new or small businesses, and low-income users. (Note that these services do not currently show ads directly, but they tend to increase user engagement with related ad-funded services offered by the same firm.) In a more striking example, two studies by Miller and Tucker (2009, 2011) together imply that U.S. states adopting especially stringent privacy laws slowed the adoption of electronic medical records systems, which in turn increased infant mortality (which is very high in the U.S. among poor women, not all of whom have access to prenatal care). The evidence suggests that disadvantaged women were harmed by the unavailability of medical information when they came to the hospital. Certainly, those who designed the privacy policies did not anticipate that these policies would end up contributing to the deaths of economically disadvantaged babies; yet this example, while extreme, is not at all isolated. Data enables services that help the most vulnerable.
Third, past attempts at privacy regulation have resulted in privacy policies that are typically too difficult to read. There is little evidence that the way “notice and consent” has been implemented across a wide range of firms has had much impact on consumer behavior; indeed, only a tiny fraction of users read such notices, and an even smaller fraction understand them.
Fourth, as discussed above, it is difficult to measure the welfare benefits provided to consumers, and so it is hard to propose efficient strategies. In addition, views are changing, so it is hard to predict long-term benefits. Although expert policy-makers may do better at assessing the factual details of privacy policies and at understanding the various risks, they may not have much advantage at putting a dollar value on the benefits consumers get from feeling that their data is used fairly or kept private.
What kinds of policies, then, have some hope of balancing the costs and benefits appropriately? Burt (2013) reported on one kind of proposal, from Craig Mundie: “metadata” could incorporate user preferences, while users of the data would have freedom to develop new technology so long as they respected those preferences.
This proposal can be thought of in a broader context where regulation helps establish property rights. Property rights are a broad concept that can be applied even in a world of fast-changing technology and across many contexts. There is some hope that consumers can learn to understand what it means to own their data and allocate property rights.
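To make the metadata proposal concrete, the sketch below attaches usage preferences to a data record and has any would-be user of the data check those preferences before use. The record structure, preference names, and purposes are hypothetical, invented purely for illustration; the proposal itself does not prescribe a format.

```python
from dataclasses import dataclass, field

# Hypothetical preference flags a user might attach to their data.
# The names here are illustrative, not part of any real standard.
@dataclass
class UserPreferences:
    allow_ads: bool = False        # may the data be used for ad targeting?
    allow_research: bool = False   # may the data be used in aggregate research?

# A data record carries its owner's preferences as metadata.
@dataclass
class TaggedRecord:
    payload: dict
    prefs: UserPreferences = field(default_factory=UserPreferences)

def may_use(record: TaggedRecord, purpose: str) -> bool:
    """A data user consults the attached preferences before any use;
    unknown purposes are denied by default."""
    if purpose == "ads":
        return record.prefs.allow_ads
    if purpose == "research":
        return record.prefs.allow_research
    return False

record = TaggedRecord(payload={"query": "flu symptoms"},
                      prefs=UserPreferences(allow_research=True))
print(may_use(record, "ads"))       # False
print(may_use(record, "research"))  # True
```

The key design feature is that new uses of data remain permissible without new regulation, as long as the check against the owner's attached preferences passes; the preferences travel with the data rather than being negotiated separately with each data user.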
This kind of proposal can be contrasted with an approach of trying to ban particular technologies. Policy aimed primarily at, e.g., cookies can be undermined through the use of other technology that accomplishes a similar goal, and cookies may not even be relevant in new form factors or settings (like the “internet of things,” the “smart home,” “wearables,” or mobile).
Another type of policy attempts to provide broader protection through limits on data retention. Chiou and Tucker (2014) argue that data retention limits may pass the cost-benefit test, providing evidence that changes in retention policy did not change the quality of search engine results. Their finding is consistent with the general industry understanding that recent data is much more important for predicting what consumers want. A potential policy would limit the retention of data, requiring it to be anonymized and/or aggregated after a certain time period.
Although there is always some value to having older data, particularly for research and development and for analyzing trends over time, there are also large potential costs to keeping that data.
To see why, take the perspective that an individual values privacy because of the risk of economic or reputational harm from others discovering information about them. (Of course, there are many other perspectives on privacy, as outlined above.) Note that there may be many sources of information about an individual’s current behavior; one could observe their shopping in person, for example. Over time, however, a user is more likely to have changed their preferences and behavior, and thus to face some cost if their previous behavior were revealed. At the same time, as time passes, there are fewer and fewer ways for an outsider to find detailed data about a user’s past behavior, other than the digital data retained by online firms. Thus, eliminating that digital data has a material impact on the risk that the information is revealed.
A natural alternative is to regulate the use of old data rather than its retention. However, it is very difficult to anticipate or even understand how and why historical data is harmful, and thus difficult to regulate all the different uses that could be harmful.
Furthermore, a security breach might occur. If the data does not exist (or exists only in anonymized form), then a single security breach is less likely to expose harmful information.
Limits on retention are also easy for consumers to understand (though it may be more subtle to understand the residual risks of retaining “anonymized” or aggregated data). A consumer can have confidence that something that happened two years ago is more or less “gone” unless they have specifically opted in to retention (e.g. of old credit card or bank statements, or of historical orders on an e-commerce site, which are easier to remember than website viewing or shopping activity). Such limits give users a feeling of control, and may create more utility for consumers if part of the value to consumers is not having to worry about the unknown or about technologies they do not understand.
Limits on retention may seem like a blunt instrument, but such limits also provide blunt protection against a wide range of issues, including security breaches as well as unwanted use of data or government surveillance. Although historical data does have real value, and in some contexts (such as studying health conditions that develop over many years) it may be indispensable, in many online contexts, the benefit of long retention of non-anonymized historical data may not outweigh the privacy costs and risks. If limits on retention help consumers become more comfortable with richer uses of current data, and thus policy permits the use of current data to create more value and efficiency (for example in online advertising for small websites and apps), such a policy may have substantial welfare benefits.
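A retention rule of the kind discussed above can be sketched as a batch job that anonymizes and coarsens any record older than a cutoff. The field names and the roughly 18-month window below are assumptions chosen for illustration, echoing the search-log example later in this section; note also, as a caveat, that hashing an identifier is pseudonymization rather than true anonymization.

```python
import hashlib
from datetime import datetime, timedelta

# Illustrative cutoff: roughly 18 months.
RETENTION = timedelta(days=540)

def anonymize(record: dict) -> dict:
    """Replace the user identifier with a one-way hash, drop the raw query,
    and keep only coarse fields suitable for aggregate analysis."""
    return {
        "user": hashlib.sha256(record["user"].encode()).hexdigest()[:16],
        "day": record["timestamp"].date().isoformat(),
        "category": record.get("category", "unknown"),
    }

def apply_retention(log: list, now: datetime) -> list:
    """Keep recent records as-is; anonymize anything past the window."""
    out = []
    for rec in log:
        if now - rec["timestamp"] > RETENTION:
            out.append(anonymize(rec))
        else:
            out.append(rec)
    return out

now = datetime(2024, 6, 1)
log = [
    {"user": "alice", "timestamp": now - timedelta(days=700),
     "query": "flu symptoms", "category": "health"},
    {"user": "bob", "timestamp": now - timedelta(days=10),
     "query": "running shoes", "category": "shopping"},
]
cleaned = apply_retention(log, now)
print("query" in cleaned[0])  # False: old record was anonymized
print(cleaned[1]["user"])     # bob: recent record kept intact
```

The point of the sketch is the asymmetry the text describes: current data retains its full value for prediction and personalization, while old data survives only in a form whose link to the individual, and whose sensitive detail, has been removed.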
Case Study on Retention and User Awareness
It may also be important to have independent “auditors” help interpret the actual practices of internet firms. For example, Google “Web History” is turned on for many or most Gmail users.
Depending on the date on which a Gmail account was created, users were automatically opted into this service, and as of June 2014, the “About Google Web History” website states, “When you create a Google Account, Google Web History is automatically turned on.” Even though Google publicly states that it retains search logs for only a limited period (e.g. 18 months), Gmail users may discover that their entire search log history, as well as their entire internet browsing history from the creation of their account, is stored by Google (a user’s history can be found by logging into Gmail and then typing http://www.google.com/history/). This author’s history, for example, shows daily search activity in September 2006, and the set of political news articles read on a particular day several years ago in a different tab of a browser in which Gmail was open.
Google’s “About Google Web History” website also gives the following description: “Google Web History saves information about your activity on the web, as well as details about your browser, …”
Presumably government surveillance of data over such a long time frame could also be problematic. There is no public information about what fraction of Google users have disabled Web History, or about the average number of years of data retained per user.
Conclusions
This paper has attempted to survey a large literature in order to understand the role of data, gatekeepers, information, internet search, and advertising. It argues that bringing an economic framework to bear is essential for achieving beneficial policy outcomes.
More robust policies may include the establishment of property rights for data, which at least have the potential to allow the efficiency benefits of using data for personalization to be realized, as well as broad measures such as limits on retention, which are easy for consumers to understand and address a wide range of potential privacy and security concerns simultaneously without limiting technology. Even retention policies must be carefully considered in each domain, however, because in some domains (such as health), longer retention of data may be justified.
References Abraham, I., S. Athey, M. Babaioff, and M. Grubb, “Peaches, Lemons, and Cookies: Designing Auction Markets with Dispersed Information,” Working Paper, Stanford Graduate School of Business, 2014.
Acquisti, A., “The Economics of Personal Data and the Economics of Privacy,” Working Paper, CMU, 2010.
Aggarwal, G., A. Goel, and R. Motwani, “Truthful Auctions for Pricing Search Keywords,” ACM Conference on Electronic Commerce, 2006.
Armstrong, M., J. Vickers, and J. Zhou, “Prominence and Consumer Search,” The RAND Journal of Economics 40, no. 2 (2009):