The Antidote for “Anecdata”: A Little Science Can Separate Data Privacy Facts from Folklore

Guest post by Daniel Barth-Jones

For anyone who follows the increasingly critical topic of data privacy closely, it would have been impossible to miss the remarkable chain reaction that followed the New York Taxi and Limousine Commission’s (TLC’s) recent release of data on more than 173 million taxi rides in response to a FOIL (Freedom of Information Law) request by urbanist and self-described “Data Junkie” Chris Whong. It wasn’t long after the data went public that the sharp eyes and keen wit of software engineer Vijay Pandurangan detected that taxi drivers’ license numbers and taxi plate (or medallion) numbers hadn’t been anonymized properly and could be decoded because the hashing used to obscure them was flawed.
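To see why hashing alone offers so little protection for short identifiers, consider a minimal sketch. It assumes the identifiers were obscured with plain, unsalted MD5, and it uses a hypothetical four-character ID format (digit, letter, digit, digit) chosen purely for illustration; the function names are likewise invented here. With so few possible values, an attacker can hash every candidate once and then reverse any published hash by lookup.

```python
import hashlib
from itertools import product
from string import ascii_uppercase, digits


def md5_hex(value: str) -> str:
    """Return the unsalted MD5 digest of an identifier as a hex string."""
    return hashlib.md5(value.encode("ascii")).hexdigest()


def build_lookup() -> dict:
    """Hash every candidate ID in the hypothetical format digit-letter-digit-digit.

    That format allows only 10 * 26 * 10 * 10 = 26,000 values, so the whole
    space can be enumerated almost instantly.
    """
    table = {}
    for d1, letter, d2, d3 in product(digits, ascii_uppercase, digits, digits):
        candidate = f"{d1}{letter}{d2}{d3}"
        table[md5_hex(candidate)] = candidate
    return table


lookup = build_lookup()

# What a "de-identified" record would contain for a made-up ID.
published_hash = md5_hex("5X55")

# Reversing the hash is a single dictionary lookup.
recovered = lookup[published_hash]
print(recovered)
```

The point of the sketch is that the hash function is not the weak link; the small input space is. A secret key (for example, a keyed hash) or random replacement identifiers would prevent this kind of enumeration.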

Soon after Pandurangan’s revelation of the botched unsalted MD5 cryptographic hash in the TLC data, Anthony Tockar, working on a summer Data Science internship with Neustar, posted his blog “Riding with the Stars: Passenger Privacy in the NYC Taxicab Dataset” with the aim of introducing the concept of “differential privacy” and announcing Neustar’s expertise in this area. (It’s well worth checking out both Tockar’s short, but informative, tutorial on differential privacy and his application of the method to the maps of the TLC taxi data, as his smartly designed graphics allow you to interactively adjust differential privacy’s “epsilon” parameter and see its impact on the results.)
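For readers who have not met the “epsilon” parameter before, here is a minimal sketch of how it trades privacy against accuracy. This is not Tockar’s implementation; it is the standard Laplace mechanism applied to a simple count, and the function names and trial counts are assumptions made for illustration. Smaller epsilon means more noise (stronger privacy, less accurate results); larger epsilon means the reverse.

```python
import random


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with Laplace noise calibrated to epsilon.

    A counting query changes by at most 1 when one person is added or removed
    (sensitivity 1), so the Laplace scale is 1 / epsilon. The difference of two
    independent exponential draws with rate epsilon is Laplace(0, 1 / epsilon).
    """
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


def mean_abs_error(epsilon: float, trials: int, seed: int) -> float:
    """Average absolute error of the noisy count over repeated releases."""
    rng = random.Random(seed)
    true_count = 100  # hypothetical number of taxi pickups in one map cell
    total = 0.0
    for _ in range(trials):
        total += abs(noisy_count(true_count, epsilon, rng) - true_count)
    return total / trials


for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps}: mean absolute error ~ {mean_abs_error(eps, 2000, 0):.2f}")
```

In theory the average absolute error is 1 / epsilon, which is why turning epsilon down on an interactive map makes the published counts visibly noisier.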

To illustrate possible rider privacy risks for the TLC taxi data, Tockar, armed with some celebrity paparazzi photos and some clever insights as to when, where and how to find potential vulnerabilities, produced a blog post replete with attention-grabbing tales of miserly celebrities who stiffed drivers on their tips and of cyber-stalked strip club patrons, which quickly went viral. And so as to up the fear, uncertainty, and dread (FUD) factors surrounding his attacks, Tockar further gravely warned us all in his post that:

Equipped with this [TLC Taxi] dataset, and just a little auxiliary information about you, it would be quite trivial for someone to follow your movements, collecting data on your whereabouts and habits, while you remain blissfully unaware. A stalker could find out where you live and work. Your partner may spy on you. A thief could work out when you’re away from home, based on your habits.

However, as I’ll explain in more detail, sorting out these quite concerning claims rationally, so that we can weigh the possible trade-offs between Freedom of Information and open government principles on the one hand and data privacy concerns on the other, requires that we move beyond mere citation of anecdotes (or worse, collections of anecdotes in which carefully targeted and especially vulnerable, non-representative cases have been repackaged as “anecdata”). Instead, we must ground our risk assessment in a systematic investigation appropriately founded on the principles of scientific study design and statistically representative samples. Regrettably, this wasn’t the case here, and it has quite often not been the case for the many headline-snatching re-identification attacks that have made the news in recent years.

Read more…

Big Pharma: the New Hustler

That’s the provocative thesis of Jane’s post over at Balkinization for the conference Public Health in the Shadow of the First Amendment. Worth a read! And here’s her second post.

The Cambridge University Press decision and Educational Fair Use

The Eleventh Circuit released its 129-page opinion in Cambridge University Press v. Patton (which most of us probably still think of as the Becker case) last Friday. Although the appeals court reversed what I thought was a pretty solid opinion of the district court upholding Georgia State University’s practice of distributing digital “course packs” of reading materials to its students, it is very far from a big win for the publishers who challenged the practice. There is a lot to like in the opinion for advocates of educational fair use, and it is difficult to imagine that the district court on remand will rule in favor of the publisher plaintiffs with respect to very many of the works at issue even though the appeals court directed changes in some aspects of its fair use analysis. Although it found some errors in the district court’s treatment of the second and third fair use factors, the appeals court sensibly and correctly rejected several arguments that would have materially constricted the scope of educational fair use in the digital arena. (Full disclosure: I joined Jason Schultz’s excellent amicus brief on behalf of Georgia State.)

Although the Court of Appeals’ opinion deserves a close look, I’ll confine myself here just to noting a few highlights. Read more…

Celebrities, Copyright, and Cybersecurity

The fall began with a wave of hacked nude celebrity photos (as Tim notes in his great post). The release generated attention to the larger problem of revenge porn – or, more broadly, the non-consensual sharing of intimate media. Legislators and scholars have moved to tackle the problem. Danielle Citron proposes a model statute for criminalizing revenge porn in a Slate article (excerpted from her new book), and California finally got around to dealing with the random selfie-only coverage of its law.

I’ve written an article that proposes using copyright law to address non-consensual sharing (but see Rebecca Tushnet’s critique). It’s worth noting that Reddit took down the illicit celebrity images after receiving a copyright claim – which sites have to respect, on pain of getting sued into oblivion (since Section 230’s immunity doesn’t apply to IP claims). Apparently others have the same idea – one attorney is threatening Google with a $100M lawsuit for failure, in his view, to comply with the DMCA’s takedown requirements. (The letter bloviates and any suit has as much chance of winning as this plaintiff did.) The revenge porn contretemps raises at least four issues:

1. Everyone does it - The sharing of intimate media (videos and images of people nude or engaged in sexual activity) is ubiquitous. Jennifer Lawrence, Kate Upton, Kirsten Dunst – somehow, it took leaks of celebrity intimate media to drive home this point. This has two helpful consequences (one hopes). First, “just say no” should go the way of Nancy Reagan’s campaign, since it had about the same efficacy. Partners sharing intimate media is the new normal, and it’s foolish to pretend otherwise. Second, the moral critique attached to the practice should fade. One common response to revenge porn is “He/she took the risk, so too bad.” That approach focuses culpability on the victim, not the offender. The risk is not in using intimate media – it’s in trusting the wrong person. Most of us have done that at some point.

2. Stupid is as stupid does - Regulating revenge porn properly matters. Here in Arizona, it’s only a matter of time before the state’s terribly drafted revenge porn bill is enjoined by a federal judge. (The ACLU is suing to block the bill, along with a coalition of bookstores, journalists, and others.) I, along with many others, pointed out that the bill was fatally flawed the moment it passed. This means that victims in Arizona are going to be without protection because their legislators failed them – and that all of us Arizonans are going to fund the state’s defense of a statute that is without hope. The Arizona legislature could have gotten it right – my understanding is that they consulted law professor and revenge porn expert Mary Anne Franks during the drafting – but they whiffed: the drafters apparently ignored Prof. Franks’s good advice. (It’s not as though she drafted a model statute they could have used.) So too with Texas, where the legislature messed up a statute that is arguably underinclusive. So too California, although the Golden State just fixed its law. The lesson is simple: legislators should take their time, get diverse input, and ask the experts.

3. Changing norms - One interesting and hopeful development with the celebrity revenge porn hack is a new wave of calls for people not to look at the pictures. Those calls aren’t likely to be highly effective; there are plenty of people all too eager to see Jennifer Lawrence nude. But this could herald a shift towards disapprobation not only of leaking intimate media, but also of viewing it if it’s shared without consent. Norms are powerful regulators, and this change would mark a useful riposte to the gleeful distribution of revenge porn.

4. Everyone needs cybersecurity - It appears that the celebrity photos were obtained through a combination of guessing security questions on Apple’s iCloud service and, perhaps, social engineering. Early reports also suggested that attackers may have simply used dictionary attacks to guess passwords on iCloud. The truth is probably a mix. But it means we all must start to care about cybersecurity. We all have something to hide: credit card numbers, trade secrets, job applications, nude selfies – the list goes on. We carry that information on an ever-increasing array of devices and Internet services. That means we have to invest some time and effort to do things like check privacy and security policies, figure out whether our smartphones encrypt their data, and use good passwords for the things we care about. Cybersecurity: it’s not just for geeks anymore.

This isn’t the last we’ll hear of this topic, unfortunately. But perhaps the discourse is shifting in a useful direction…


On Accuracy in Cybersecurity

I have a new article on how to address questions of accuracy in cybersecurity up on SSRN. It’s titled Schrödinger’s Cybersecurity; here’s the abstract:

Both law and cybersecurity prize accuracy. Cyberattacks, such as Stuxnet, demonstrate the risks of inaccurate data. An attack can trick computer programs into making changes to information that are technically authorized but incorrect. While computer science treats accuracy as an inherent quality of data, law recognizes that accuracy is fundamentally a socially constructed attribute. This Article argues that law has much to teach cybersecurity about accuracy. In particular, law’s procedural mechanisms and contextual analysis can define concepts such as authorization and correctness that are exogenous to code. The Article assesses why accuracy matters, and explores methods law and cybersecurity deploy to attain it. It argues both law and cybersecurity have but two paths to determining accuracy: hierarchy, and consensus. Then, it defends the controversial proposition that accuracy is constructed through social processes, rather than emerging from information itself. Finally, it offers a proposal styled on the common law to evaluate when accuracy matters, and suggests that regulation should bolster technological mechanisms through a combination of mandates and funding. Like the cat of Schrödinger’s famous thought experiment, information is neither accurate nor inaccurate until observed in social context.

Cite: Derek E. Bambauer, Schrödinger’s Cybersecurity, 48 UC Davis Law Review (forthcoming 2014).


ACLU Challenges Arizona Revenge Porn Law

The ACLU, ably assisted by Dentons US LLP, has filed a challenge to Arizona’s revenge porn law in federal district court (complaint, ACLU blog, WIRED story). This is great news for Arizonans: the bill was terribly drafted and unconstitutional from the moment it was signed into law. Fighting revenge porn is important, but as Arizona is about to learn, you don’t get to trample the Constitution even in the service of a good cause. (Here’s my earlier post on the law.)


Alan Trammell and I have a new article coming out on the problems of personal jurisdiction analysis when it involves Internet contacts. (The title is Personal Jurisdiction and “teh Interwebs”; I tried very hard to convince Alan to go with the title of this post, to no avail.) Abstract is below; we’d love your comments and thoughts.

For nearly twenty years, lower courts and scholars have struggled to figure out how personal jurisdiction doctrine should apply in the Internet age. When does virtual conduct make someone amenable to jurisdiction in any particular forum? The classic but largely discredited response by courts has been to give primary consideration to a commercial Web site’s interactivity. That approach distorts the current doctrine and is divorced from coherent jurisdictional principles. Moreover, scholars have not yielded satisfying answers. They typically have argued either that the Internet is thoroughly exceptional and requires its own rules, or that it is largely unexceptional and can be subject to current doctrinal tests. 

The difficult relationship between the Internet and modern personal jurisdiction doctrine is a symptom of a much larger problem. We argue that the Supreme Court’s current approach has bifurcated physical and intangible harm. Viewed through that lens, the overarching problem comes into focus because rules that sensibly govern the physical world apply awkwardly — sometimes incoherently — to intangible harm. Accordingly, we propose a return to personal jurisdiction’s first principles, particularly a concern for fairness and predictability. We argue that courts should dispense with the fiction that purely virtual conduct creates any meaningful contact with a particular forum. The narrow approach that we advocate likely will restrict the number of places where a plaintiff can sue for intangible harm, but through three test cases we demonstrate why such a rule will enhance fairness and predictability while also ensuring sufficient access to justice.

Cite: Alan M. Trammell & Derek E. Bambauer, Personal Jurisdiction and “teh Interwebs,” 100 Cornell Law Review (forthcoming 2015).

Why Aren’t “Hacked” Celebrities Filing Takedown Notices?

Writing today in Slate, Emily Bazelon complains that the law does not do enough to protect the privacy rights of celebrities whose accounts were illicitly “hacked” last weekend, resulting in the release of unauthorized nude photos the celebrities apparently took of themselves. Bazelon contrasts what she characterizes as the celebrities’ inability to remove their objectionable content from third-party Web sites with the much easier time that big movie studios have getting their works removed from YouTube. She writes:

Every day, movie and TV producers succeed in getting videos that have been posted without their consent taken down from major websites. Sure, you can still find pirated stuff if you look hard enough. But the big sites take down content once they know it’s been posted in violation of copyright. Because if they don’t, they’ll be sued—and no one will care if they defend the publication of stolen materials, in the name of free speech or otherwise.

Yet in the days since Jennifer Lawrence and other celebrities discovered that their nude images were stolen, and then posted without their consent on sites like Reddit and 4Chan, the stars can’t get the images taken down.

But that’s just not so. The law already provides precisely the same safeguards for the celebrities that it does for the movie and TV producers: as the creators (and copyright holders) of works posted online without their permission, they are statutorily entitled under 17 U.S.C. § 512(c) to insist that the hosting web sites remove, or disable access to, that content. Further legislation is unnecessary; all that is necessary for the injured parties to disable access to all the “hacked” photos is to follow the notice-and-takedown procedure specified in Section 512.

The problem is not, as Bazelon argues, that Section 230 of the Communications Decency Act (“CDA”) immunizes the web sites’ unauthorized display of the “hacked” photos. (To the contrary, those sites have apparently already removed some of the leaked content whose distribution violates federal law.) By its express terms [in paragraph (e)(2)], Section 230 provides absolutely no immunity to service providers accused of violating copyright law. Thus, the CDA interposes no bar to the use of Section 512’s notice-and-takedown regime under the present circumstances.

There are real issues raised by the “hacking” scandal, but the big ones are social/cultural, not legal. Posting content created by other people is already punishable both civilly and criminally, and the means to disable online access to such content are already in place. Whether it is fair to require individuals whose privacy has been invaded to avail themselves of their Section 512 rights in order to prevent further invasions is a separate question, but the problem is not, as Bazelon portrays it, the lack of appropriate legislation.

(Of course, Derek has already treated this issue, and responds presciently to the Section 230 objection, elsewhere. Happily for his analysis, the thornier IP problems involved in the repugnant “revenge porn” scenario, where the injured party and the copyright holder may not be the same person, are not present in the context of the hacked celebrity “selfies.”)

Mod a Game Console, Go to Jail

I’ve been puzzling over the 6th Circuit’s new opinion in United States v. Reichert (No. 13-3479, Mar. 28, 2014), in which a divided panel affirmed a defendant’s criminal conviction for violating the Digital Millennium Copyright Act’s anti-trafficking rule (17 U.S.C. § 1201(a)(2)) based on the defendant’s sale of a “modded” video game console to an undercover federal agent.

It’s a confusing opinion. Part of the confusion may have to do with my relative unfamiliarity with criminal law; most of the majority’s opinion is devoted to explaining why the defendant’s violation met the “willfulness” mens rea standard of 17 U.S.C. § 1204(a) despite the defendant’s apparent ignorance of the DMCA. I personally find troubling the court’s characterization of “willfulness” as permitting conviction based on proof that the defendant “deliberately ignored a high probability that he was trafficking in technology” that the DMCA in fact forbids (irrespective of whether the defendant knew that the DMCA forbade such trafficking), which seems to me inconsistent with United States v. Liu, 731 F.3d 982 (9th Cir. 2013), a case that engaged in a much more careful analysis of copyright’s criminal mens rea requirements than does the Reichert panel. But perhaps I’m misreading the court’s opinion on this point; criminal law isn’t my specialty.

My more basic problem is that, quite apart from the mens rea issue, I am having a hard time understanding the underlying DMCA claim. Read more…

Cybercrime’s International Challenges

Jane and I are in Cluj-Napoca, Romania, for a conference titled “Crimes, Criminals, and the New Criminal Codes: Assessing the Effectiveness of the Legal Response” at Babeș-Bolyai University. Jane is speaking on “Surveillance in a Technological Age: The Case of the NSA,” and I’m giving a talk based on my forthcoming article Ghost in the Network. I’ll post updates from time to time – one thing I’ve just learned is that Râmnicu Vâlcea, Romania, is known as “Hackerville” because it has become what Francesca Bosco of UNICRI calls the “Silicon Valley of organized crime.”


  • Straight out of “The Wire”: organized crime hacked systems at the Port of Antwerp to smuggle drugs into the country in shipping containers without detection.
  • Reminder from Jane: FISA doesn’t protect non-U.S. persons who are abroad, and never has.
  • Jane: there is exactly one criminal case where a defendant has received notice that information developed under a FISA order is being used against him/her. (Of course, that’s because the government routinely lies about this.)
  • Giovanni Ziccardi on hate speech: pure cybercrimes are unusual – the real action is in the linkage between “old” crimes and use of computers. In 20 years (1993 – 2013), Italy had 72 unauthorized access cases, 7 virus cases, and 17 about attacks on critical systems (at the country’s 1st and 2nd level courts).
  • Ziccardi: Italy’s draft law 2195 bans Internet anonymity. Appears to have been drafted by squirrels and starlets [my summary].
  • John Vervaele: The EU has only one mechanism for forcing member states to comply with basic standards for human rights – suspending that state’s membership rights. It’s too strong a remedy to be effective in ensuring the rule of law, since the EU is reluctant to deploy it.