Privacy in an era of 'big data'
Earlier this fall, I came across an essay, “Why privacy matters even if you have nothing to hide”. Taken from a book written by Daniel J. Solove, a law professor at George Washington University, the piece provides a somewhat wordy argument for a more encompassing view of how privacy might be compromised in an era of big data.
Before I read Solove’s essay, I’d been thinking a bit about Alistair Croll’s post, “Big data is our generation’s civil rights issue, and we don’t know it”. Croll makes a solid case for linking “what the data is” to “how it can be used”.
For his part, Solove feels the ‘nothing to hide’ argument blocks our ability to consider and evaluate four potentially problematic data-collection activities:
- Aggregation, the “fusion of small bits of seemingly innocuous data”
- Exclusion, blocking people from understanding data use or correcting errors in collected data
- Secondary use, information collected for one purpose and then used in other ways without consent
- Distortion, as any information collected captures only a subset of the full person
In his essay, Solove asserts:
“The difficulty is that commentators are trying to conceive of the problems caused by databases in terms of surveillance when, in fact, those problems are different… [T]he problem with the nothing-to-hide argument is the underlying assumption that privacy is about hiding bad things.”
Solove’s arguments focus on the potential misuse of data by governments, and that’s certainly an important and ongoing concern. His perspective came to mind as I read the coverage of the Journal News and its decision to map gun permit data in the wake of the school shootings in Newtown, Connecticut.
By law, the permit data is public; anyone can obtain it. Locating addresses using Google Maps is not particularly hard. But the act of publishing this information – in effect, the “fusion of small bits” – has provoked a firestorm of debate and criticism.
This controversy illustrates the concerns that Solove described two years ago. The map represents a combination of aggregation, secondary use and (at least in the eyes of permit holders) distortion. The end result: a lot of people feel their privacy was violated, even though they have nothing to hide.
The newspaper probably should have done (much) more than collect and map the data. Data is part of reporting, but it is not a substitute for reporting. As an example, Journal News staff could have talked with permit holders and presented their views on gun ownership.
But the data in question is available for anyone to collect, analyze and use as they see fit. This includes people who may want to know if their neighbors are licensed to own handguns as well as people with less benign interests.
Solove contends that privacy “is often threatened not by a single egregious act but by the slow accretion of a series of relatively minor acts.” Or, as Mathew Ingram wrote last week: “Online privacy is complicated”.
The Journal News didn’t create the privacy problem. I’d argue that it exposed it.
A personal addendum: I write this post with mixed emotions. I think that gun ownership in the United States is a significant problem. The laws in place are inadequate and at times bizarre. We provide permits in perpetuity, require licenses to buy or own handguns but not assault weapons, and allow individuals to stockpile weapons with few restrictions.
But, as I wrote in an August post, we’re now collecting data before we decide how it may be used. Although I would like to change the laws that govern gun ownership in the United States, in this setting it’s also fair to talk about the rules we want in place to govern secondary uses of the information our government is gathering.
Edited December 31 to add a link to a Los Angeles Times editorial about the Journal News and the privacy implications of its mapping effort. The editorial makes points similar to those raised here.