- Reconciling lifestreaming and privacy: tech-facilitated negotiations
I’ve long thought that, as tough as privacy against government intrusion and corporate surveillance is, the most novel and complex privacy challenges will be peer-to-peer. With gov’t and corporate privacy issues, the players to be affected are more known and manageable, and impinging on their freedom to collect on us, or to report what they find, feels like “regular” regulation.
But what happens when the information being gathered about us is thanks to someone wearing a headset and simply streaming anything interesting that he or she sees, helpfully auto-tagged with our identities? Some bars and restaurants may try to ban Google Glass on the way in, but lessons from anything ranging from mobile phones to hats tell us who’s going to win that war in the longer term. Especially once the distribution of streaming devices has evened out, so it’s not just the occasional freak behaving anti-socially, but all of us doing so, we’ll need to look for other solutions if we don’t want to be stuck simply having to reconcile ourselves to no private moments in public.
One place to mine is the realm of digital rights management. DRM has not worked out so well for copyrighted material in the public mainstream, like movies and music. But what if the kind of tagging by which stuff can ask (if not require) “don’t copy me” could be deployed for privacy purposes, more in the spirit of Creative Commons than the ill-fated Macrovision VHS copy protection scheme?
How to do this? A start would be to allow people to set their expectations for a given environment, and to be able to broadcast them (without having to share their names, of course). If enough people in, say, a classroom, agree that the meeting is off the record, then recording devices will be alerted accordingly. They’ll still function, but they’ll show a message that the environment is expected to be off the record — and perhaps they’ll have a glowing LED or some other gentle indicator to tell others in the room that someone has chosen to record despite the norm. Perhaps, too, those recorded will be able to have some form of pseudonymous contact information embedded in the recording — so that if it should become public, they can choose to show that they were indeed the ones recorded (again without necessarily having to reveal identity) and then ask — not demand — some privilege in contextualizing or commenting upon the recording.
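The rule described above can be sketched in a few lines. This is only an illustration under assumptions of my own: the `Preference` record, the 50% quorum, and the field names are hypothetical, and nothing here reflects an existing device or protocol. The point it shows is that the device never blocks recording; it only surfaces the room's expectation and lights an indicator when someone records anyway.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preference:
    """One person's broadcast expectation (hypothetical record)."""
    pseudonym: str        # a random token, never a real name
    off_the_record: bool  # True if this person expects no recording


def room_is_off_the_record(prefs, quorum=0.5):
    """True when the share of off-the-record preferences reaches the quorum."""
    if not prefs:
        return False
    votes = sum(p.off_the_record for p in prefs)
    return votes / len(prefs) >= quorum


def recorder_state(prefs, user_wants_to_record, quorum=0.5):
    """Decide what a recording device shows; it warns but never refuses."""
    norm = room_is_off_the_record(prefs, quorum)
    return {
        "show_notice": norm,                             # tell the user about the norm
        "recording": user_wants_to_record,               # the device still functions
        "indicator_led": norm and user_wants_to_record,  # gentle signal to the room
    }
```

For example, with two of three people asking for an off-the-record room, `recorder_state(prefs, True)` keeps recording but turns on both the notice and the indicator.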
Many of us might appreciate an opportunity to know about others’ preferences and expectations in a quiet, low-impact way, and then to respect them — or if not, to realize that that choice entails overriding the preferences of others. The function of the technology is not to impede certain uses by fiat — the way the old DRM did — but rather to allow people to see that other people are implicated by what they do, permitting the moral dimension of our enthusiastic use of technology to become more apparent.
Update: PlaceAvoider appears to seek to implement some of this functionality.
- “The Big Brother Problem” WEF panel
“The Big Brother Problem” is a timely, difficult, and sweeping topic at WEF ’14, covering digital surveillance by both public and private actors and its implications for human rights. I’ll be moderating the session on it this week, and I thought I’d share my thoughts on both process and substance as I prepare.
We have one hour on a very broad topic, including audience and online questions, with six participants. How to use that hour to make progress? Let’s start with the lineup. Since there won’t be prepared remarks, I’ll be looking for lead-off questions that allow each participant to highlight what he or she finds most important, while also answering something that might be a bit off the beaten path.
There’s Salil Shetty, Secretary-General of the Nobel Peace Prize-winning Amnesty International, which has been a beacon for human rights since its inception in 1961. For someone concerned about human rights, it may not make much sense to try to rank abuses: too often the observation that rights are abused “worse” elsewhere is an excuse for a particular government not to clean up its own practices. But it may be fair to ask generally how informational privacy ranks next to possibly more fundamental concerns about physical integrity – like freedom from arbitrary detention or torture.
Indeed, surveillance might be best understood metaphorically as a precursor chemical to an undesirable concoction: spying can facilitate human rights abuses. But it’s dual- or multiple-use. Good digital surveillance, particularly of places where there aren’t easily independent boots on the ground to understand what’s going on, could be a powerful tool for good. Intelligence agencies around the world may be able to understand what’s going on in, say, Syria, thanks to the tools and practices that have been splashed across global headlines since last June.
If surveillance is to be cut back, what’s the right blend of restrictions on the collection end – trying to raise the cost of knowing something secret across the board – versus on use? What bulwarks against abuse, as compared to limits on collection, would be sufficient to the cause of human rights and dignity? (For example, someone against nuclear weapons proliferation might focus on a regime limiting the use and transfer of nuclear arms; their elimination entirely; and at the end of the spectrum a restriction on peaceful nuclear power for fear that enrichment technologies are just too easily dual-purpose.)
Here, too, it may be helpful to explore what one might want in theory, and what is judged politically attainable. Is there room for compromise and negotiation here, or is a role of an organization like Amnesty to anchor and stick to what it perceives as the purest of truths?
Next up is U.S. Senator Patrick Leahy. Leahy is no stranger to an issue that will no doubt be prominent in the discussion: the activities of the U.S. National Security Agency that have been the subject of ongoing leaks originating most recently with Edward Snowden, and the topic of a U.S. Presidential address covering reform last week. Senator Leahy chairs the Senate Judiciary Committee, which has been a focus for debate over U.S. legislation like the PATRIOT Act, and has often weighed in as one of the chamber’s civil libertarians. He’s also a former prosecutor. It will be interesting to see how he views the situation: will it lean towards “I told you so” on certain activities having gone too far – perhaps precisely because authorized by U.S. law from which he dissented rather than as freelancing by a rogue agency – or will it emphasize a realpolitik defense of countries gathering whatever information they can to protect their security, something every country does? Many U.S. officials have expressed bemusement that the NSA has been in the spotlight so singularly, when counterpart agencies around the world are thought to conduct parallel activities, and few if any states can offer a clear map for satisfying (much less transparent) intelligence oversight.
It might be good to ask Senator Leahy whether he thinks there’s been an impact not only on the global reputation of the U.S. government, but also on that of U.S. business. Should the fact that legal process can yield so much intelligence from “local” U.S. companies with significant global customer bases be a reason for forbearance on collection from them, lest that customer base go elsewhere? Even then, elsewhere is still somewhere: if not in the U.S., there will be some other government empowered by the new presence of bits and activities in its jurisdiction.
Raising the business question might be a good time to turn to Augie Fabela and Brad Smith. Mr. Fabela is the former chairman of VimpelCom, a worldwide telecommunications provider that originated in Russia in the early 90’s and is now based in the Netherlands, with over 200 million mobile customers worldwide. Telecoms providers are natural places to seek to spy, and indeed some governments – including in Europe – have imposed data retention requirements upon them, so as to facilitate law enforcement or national security-based investigations on subjects that have not yet been conceived. How should we think about those requirements? Mr. Fabela may have a particularly interesting view, as since leaving VimpelCom he’s honed an interest in public safety and law enforcement: in 2012 he was appointed to the rank of sheriff commander in Cook County, Illinois, USA, assisting with the restructuring of its intelligence center.
Brad Smith is the general counsel of Microsoft, and has been with the company for over twenty years. He’s contended with requests from law enforcement and intelligence agencies around the world, and will no doubt have strong and long-thought-through views on where to go from here. You can see some of his thinking in an interview he gave to Corporate Counsel magazine. There he makes the point that it’s difficult to ask companies to step in when governments fall short; governments are, he says, the “ultimate decisionmakers.” When we think of checks and balances in the gathering and use of data, what role, if any, can companies play versus public authorities exercising their power?
Orit Gadiesh is the chairman of global management consulting firm Bain & Company. She has some background in intelligence policy from a time before the Internet – and corresponding digital surveillance – had gone mainstream, but she may remind us that the Big Brother Problem, contra Orwell, is not limited to the State. Companies gather and maintain an enormous amount of data about us, and moreover have incentive – particularly if advertiser-supported – to use that data to influence our choices, even if those choices are as prosaic as what airline to fly or pet food to buy. As the cost of data gathering and processing goes down, what formerly was the province of massively-funded state intelligence operations can become the territory of “mere” companies – and ultimately, perhaps, individuals. Ms. Gadiesh’s participation on the panel may help round out the panel’s exploration to include private as well as public intelligence policy. Here, too, can we count on use restrictions as a hedge against abuse, or must safety lie in more limited collection?
Finally, what will happen as the power of surveillance devolves from governments and companies to individuals themselves? This very session, in an acknowledgement of the realities of the early 21st century, will be Webcast live. Others labeled off-the-record may seem increasingly quaint as anyone in the room is in a position to record – or stream live – what’s taking place. How can we contend with this when ubiquitous data gathering can happen not just at conferences, but truly anywhere? Shyam Sankar may have some thoughts on that. He’s a director of Palantir Technologies, which, in its own words, offers “a suite of software applications for integrating, visualizing and analyzing the world’s information.” Customers include the U.S. Government – which also invested in Palantir early on – and Palantir has made a point of emphasizing its claim that its data analytics are geared to recognize and implement privacy protections. Mr. Sankar will no doubt have much to say about the relationship between the private and public sectors in data gathering and processing, and lately he’s been speaking a lot – including at TED – about “human-computer cooperation”: how algorithms alone aren’t the key to big data – presumably whether used for good or ill – but rather a “symbiosis” between person and machine.
There might be some solace in the centrality of people, rather than algorithms, to the infrastructure of surveillance: Big Brother ultimately will comprise people; fallible, yes, but constitutions and the rule of law were framed with those fallible people in mind. That may give promise that the challenges today posed by the ocean of data about us and our activities are difficult and new, but not entirely alien to what has come before, and for which there are models of successful vindication of our rights and dignity.
I hope you’ll join this session, whether in person or online. It’ll be streamed live at 10:30 am CET — that’s GMT+1 — on Wednesday, January 22nd, 2014. …JZ
Update: The video of the panel can be found here. And a related Berkman Center panel here.
- Humanizing the Web
I wrote this in April of 2008 for The Times, and don’t think I ever posted it here –
Humanizing the Web
The Web’s design reflects the open ethos of its early users: it has no central managers, no main menu, and no investment in content – indeed, no business plan whatsoever. Instead, its framers assumed that people and institutions could put their own material online, and the Web would grow to whatever size it might. Users could surf from one site to the next, following links that Web site authors saw fit to place on their pages.
Then the first search engines sprang up. They sent digital robots crawling from one link to the next, copying everything they found, hoping to index the entire Web in one place by obsessively following every path from several starting points. The engines worked: soon one could search the Web not only by following links, but also by entering a search term and finding all the pages containing that term.
This short cut offered by search engines rankled some webmasters. They wanted to be found only the old fashioned way. Even though they’d chosen to put their data on the Web for all to see, they felt far more exposed once any words within – including proper names – could turn up their pages as search results.
But these webmasters didn’t turn to lawyers. They didn’t assert some right to be excluded from the robots’ indexing, even though they might well have had one. Instead, an informal discussion in 1994 produced an obscure standard called “robots.txt.” With it, a webmaster could place very basic requests in a corner of his or her Web site to tell visiting robots which pages to ignore. The standard worked: every major search engine, no matter how data hungry, will pass over pages marked by webmasters as not to be searched. (Try visiting your favorite web site while adding /robots.txt to its name, such as http://www.harvard.edu/robots.txt, and you’ll see what I mean.) The standard solved most of a problem before judges or legislators had to be called in, by deploying one of the fundamental units of a civilised society: an unelaborated request to be let alone. Robots.txt has become a crucial part of the scaffolding that keeps the Web functioning socially. It creates semi-private spaces where before the choice was between entirely private or entirely public. With the norm established, it might even have attained the force of law as the Web has expanded and not every search robot is feeling that civilised.
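To make the "unelaborated request" concrete, here is a minimal sketch of how a well-behaved robot might honor it, using Python's standard `urllib.robotparser`. The sample rules, the `ExampleBot` name, and the example.org URLs are made up for illustration; real sites publish their own rules.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt: every robot is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt, agent, url):
    """Return True if the given rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules we already hold in memory
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(allowed(ROBOTS_TXT, "ExampleBot", "http://example.org/index.html"))
    print(allowed(ROBOTS_TXT, "ExampleBot", "http://example.org/private/diary.html"))
```

Nothing in the code forces the robot to obey; the check is something the robot's author chooses to run, which is the whole point of the standard.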
The nerdy concordat of robots.txt offers an important lesson at a time when we ourselves are feeling exposed and vulnerable on the Web. Everyone has heard horror stories of invasions of privacy and cyber-bullying: photos taken out of context, embarrassing videos posted online, and a mob mentality where commenters can be vicious about what they see. In 2002, a Canadian boy filmed himself swinging a golf ball retriever as if he were a Jedi knight. He did it for his own amusement, and for a while the tape lay forgotten. Some of his friends saw it, and without his permission, though perhaps without malice, placed it online. Within two weeks the video had been reposted in many places and viewed millions of times. Spin-off videos were produced, adding soundtracks and extra graphics. Much of this creative activity was no doubt in fun – but the boy was mortified. He walked around his school to chants of “Star Wars Kid!” He was diagnosed with depression. To this day he deeply regrets the airing of the film.
The first instinct in cases like these is to look to the law. Indeed, the boy sued, and a settlement was reached with the families of the boys who initially put the video online. More generally, stories like these cause many to rue the Web and the collateral damage its openness can cause, and perhaps to retreat as much as possible – placing as little about themselves online as they can, hoping to remain a grain of sand on the digital beach, in comfortable anonymity.
The law may have a role here, but its cumbersome machinery should be a last resort. Instead, we should recognize that the Web’s technology for moving around bits of data has far outpaced its ability to move social information. A person of good will encountering a video of Star Wars kid might happily forward a link to it on to friends – not realizing that the subject of the video is wounded by it, and thus not having an opportunity to consider the ethical implications of the act.
As tons of personal data flood the Web, we can design ways to imbue it with social cues. What is today a disembodied photo on a Google image search could be tagged by its photographer and subjects to indicate just how far they’d like it to spread – and to request that major uses of it be cleared first, providing a way to reach them for consultation without their having to divulge their actual identities. That photo then becomes anchored to the people who made it and are in it, and those encountering it online can see it as the social artefact that it is – rather than just a funny image. I believe that if the Star Wars kid could have tagged his video as private at the time he made it – just in case it should escape its videocassette home – his friends might not have posted it. Even if they did post it, others might not have rushed to help it go viral.
Take the Star Wars Kid Test yourself: Imagine someone forwarded you a link to the video and you thought it was funny. You’re about to share it with friends. But then you see that it’s been marked by the Kid himself as private, and a desperate plea is attached to please not fan the flames, along with an explanation of what happened to get it online in the first place. Would you forward the link?
I’m hoping not. And if enough people – not everyone, of course, just enough – had decided to respect the Kid’s wishes, the video might never have reached the critical mass that took it viral. We know there are bad apples online; there are people who won’t respect reasonable requests made nicely. But it’s the vast middle, the rest of us, who transform run-of-the-mill privacy violations online into the truly awful phenomena that they can become. I don’t blame us – yet – because there’s no way to easily convey those requests and cues that make civilisation breathe. If we build those technologies and embed them in the Web, we’ll then be able to understand ourselves as facing true moral choices when we remix a video or redistribute information that might be all in fun – or might be personally devastating.
Hints of the possibility of a humanized Web abound. There is an extensive entry about Star Wars Kid within Wikipedia, the online encyclopedia that anyone can edit. The Kid’s name is not found in the article. On his article’s corresponding discussion page, a debate has raged about whether or not he should be identified. “He’s already been very prominently featured in a USA Today story, as well as being mentioned in many, many places on the Web,” writes one Wikipedian. Another says: “I read that his parents requested his last name to be kept confidential in future reproductions. Therefore, I think it would be wise if we deleted the … name, not in the fear that Wiki’ll be sued otherwise, just out of courtesy to the poor kid.”
Courtesy prevailed. Online masses 1, USA Today 0.
As I entered this here in November of ’13, I checked the Wiki page again, and Star Wars Kid is now named in the first sentence of the article. As best I can tell from the article’s talk and history pages it was added in June of ’13, with the justification that “… as per talk page discussion, he has explicitly associated himself with the viral video through various high profile media outlets.” I can’t tell if that puts the score back to zero for Wikipedia, or if genuinely changing circumstances simply point to the encyclopedia’s flexibility and timeliness. I think it’s more the latter; see if you think the reasoning is persuasive, or at least attentive to the earlier opposite consensus.
And an interesting related post, from someone who had her own picture go unpleasantly viral, describes what she did next.
- Joining Team Archive: Perma and the Ongoing Effort to Preserve the Web
The accessibility and flexibility of the Internet are a double-edged sword. A distributed web makes it easy to publish content and link to it, but it also means that this content is by no means permanent: any given server or page can disappear or change at any time. (For example, the U.S. federal government was partially shut down in late 2013, with thousands of formerly stable Web pages at .gov destinations no longer available.) When this happens, links that previously led users to those resources take them instead to error messages, unrelated content, or snarky commentaries on the transience of online content: this phenomenon is called linkrot.
As noted previously, the Harvard Law School Library, in conjunction with over 30 partnering libraries and non-profits, has developed Perma.cc to mitigate the impact of linkrot on scholarly citations. Perma is an archive with a relatively narrow scope. Rather than undertake the daunting task of archiving the entire Web – something that the Internet Archive, a Perma partner, has been doing amazingly well – Perma is designed to take particular instances of particular web pages at the specific request of an author and place them in the hands of a community of libraries for safe-keeping.
The battle against linkrot is not one that Perma fights alone. There are other organizations and initiatives attacking this problem from angles that differ both in scope and technical implementation.
The Internet Archive has developed an unparalleled tool for preserving online content called the Wayback Machine. The crawler-based archive attempts to document every page on the Web by routinely saving the content hosted at all the URLs its crawler can find. If a Web surfer wants to see what the Google homepage looked like in 2002, she can go to the Wayback Machine, and it will show her all cached versions of the Google homepage saved in 2002 — or any other year — by the Internet Archive crawler.
A slightly narrower approach to large-scale archiving has been implemented by SiteStory. Instead of crawling the web, or caching instances at the request of a user, SiteStory monitors the servers hosting particular content, and makes a record of that content every time information is requested from the server. While this method won’t cache websites that are not being visited, it can create detailed archives of frequently visited sites that capture almost every change made as the sites evolve.
Atop these archives operate a number of protocols and interfaces that assist users in accessing archived content in a targeted fashion. While the functionality of displaying what a particular author wanted a reader to see is not present, as it is in Perma, these services enable users to pinpoint the content for which they are looking across archives. An example of such a tool is Memento. Memento is a framework, with a Chrome plug-in, that adds a temporal dimension to HTTP. Designed to make old versions of web pages more accessible, Memento enables a user’s browser to time travel through many different archives, including the Internet Archive’s Wayback Machine, SiteStory, and Archive.is. Because of the user-friendly power of the Memento tool, the Perma team is seeking to make Perma Memento-compatible in a future release.
Narrowing the scope of an archival mission even further, services such as WAIL, Archive.is, and WebCite are designed to allow users to store a cache of a specific instance of a website, either on their own computers or with the service itself. WAIL serves as an interface for archiving tools, such as Heritrix and Apache Tomcat, giving users an easy way to use these technologies to create their own local repositories. WebCite and Archive.is also preserve user-directed copies of online content, but they host it themselves as part of centralized archives.
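The "temporal dimension" Memento adds is, in practice, a request header: a client asks a so-called TimeGate for a page as it existed around a given moment by sending `Accept-Datetime`. The sketch below only builds such a request and does not send it. The TimeGate address `http://timegate.example.org/` is a placeholder assumed for illustration, not a real service, and the helper name is our own.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request

# Placeholder TimeGate prefix, assumed for illustration only.
TIMEGATE = "http://timegate.example.org/"


def timegate_request(target_url, when, timegate=TIMEGATE):
    """Build (but do not send) a Memento-style request for `target_url`.

    `when` must be a timezone-aware datetime; it is converted to GMT,
    the form the Accept-Datetime header expects.
    """
    stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return Request(timegate + target_url, headers={"Accept-Datetime": stamp})


if __name__ == "__main__":
    req = timegate_request("http://www.google.com/",
                           datetime(2002, 6, 1, tzinfo=timezone.utc))
    print(req.full_url)
    print(req.header_items())
```

A Memento-aware archive would answer a request like this by pointing the client to the capture closest to the requested date.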
Perma enriches this community of archivists in several ways. One unprecedented feature is Perma’s institutional nature. Libraries are in the forever business. When a library promises to save something, it means it, and some of the libraries behind Perma go back hundreds of years. Libraries have ventured into digital archiving with tools such as Digital Object Identifiers (DOIs), but these are predominantly for uniquely identifying institutionally published materials, rather than archiving them in a particular place or manner. This blog post, for example, won’t readily have a DOI, but one could make a Perma link to it!
By bringing the institutional power and promise of libraries to bear on the transient content of the Internet, Perma aspires to create a hybrid of these two essential entities—and in doing so, to capture the best of both worlds. Perma is as accessible as the Internet, and meant to be as permanent as the libraries that stand behind it.
To fully realize the accessibility and durability to which Perma aspires, the Perma team is constructing an archive that spans a network of mirrored servers distributed throughout the consortium of partnering libraries. We aim to construct a network through which an independent copy of the cached content behind every Perma link will be stored on each of these servers. The federated nature of the archive will ensure that if the servers at any one institution go down, the content will still be accessible from the other mirroring partners. With each additional mirroring library, the probability that the Perma archive will remain accessible increases.
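The claim that each added mirror raises the odds of the archive staying reachable can be made concrete with a back-of-the-envelope model. The sketch assumes, purely for illustration, that every mirror is up with the same probability and that outages are independent; neither assumption is a measured property of the Perma network, and the 0.9 figure is made up.

```python
def availability(p_up, mirrors):
    """Probability that at least one of `mirrors` independent servers is up.

    Each server is assumed to be up with probability `p_up`; the archive is
    unreachable only if all of them are down at once.
    """
    if not 0.0 <= p_up <= 1.0:
        raise ValueError("p_up must be between 0 and 1")
    if mirrors < 0:
        raise ValueError("mirrors must be non-negative")
    return 1.0 - (1.0 - p_up) ** mirrors


if __name__ == "__main__":
    # Illustrative numbers only: each mirror up 90% of the time.
    for n in (1, 2, 3, 5):
        print(n, availability(0.9, n))
```

Under these assumptions the chance of a total outage shrinks geometrically with each additional mirroring library, though correlated failures (shared software, shared funding) would weaken that.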
The projects mentioned above constitute some of the individuals and organizations that have stepped up to help preserve online content. Given the herculean task that is archiving the Internet, no one method or institution could singlehandedly serve as a silver bullet against online transience. However, the past two decades have seen the burgeoning of this multi-dimensional, collective effort to archive the Internet. With some collaboration, and interoperability, these services can help systematically preserve the ephemeral, yet increasingly important, space the Internet has become.
—by Shailin Thomas, Jonathan Zittrain, and Ben Sobel
- Perma: Scoping and addressing the problem of “link rot”
Kendra Albert, Larry Lessig and I are finishing up a study of link rot, available at http://papers.ssrn.com/abstract=2329161. Link rot is the phenomenon by which material we link to on the distributed Web vanishes or changes beyond recognition over time. (Wiki discusses link rot here.) This is a particular problem for academic scholarship, which is increasingly linking out to the Web rather than more formal, library-curated sources. That kind of linking makes clear sense, but having materials easily accessible right until they vanish means that academic work (and government documents, such as judicial opinions) can end up with sources that can’t be checked or followed up upon by readers.
We found that half of the links in all Supreme Court opinions no longer work. And more than 70% of the links in such journals as the Harvard Law Review (in that case measured from 1999 to 2012) currently don’t work. As time passes, the number of non-working links increases.
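For a rough sense of how a non-working-link rate can be tallied, here is a minimal sketch. It is not the method used in our study; it only counts links that fail to load or return an error status, so it misses links that still resolve but whose content has changed beyond recognition. The function names and the 10-second timeout are arbitrary choices for illustration.

```python
from urllib.request import urlopen


def is_rotten(url, fetch=urlopen):
    """True if the link fails to load or answers with an HTTP error status."""
    try:
        with fetch(url, timeout=10) as resp:
            return resp.status >= 400
    except (OSError, ValueError):
        # urllib's URLError and HTTPError are OSError subclasses;
        # ValueError covers malformed URLs.
        return True


def rot_rate(urls, fetch=urlopen):
    """Fraction of `urls` that no longer work (0.0 for an empty list)."""
    urls = list(urls)
    if not urls:
        return 0.0
    return sum(is_rotten(u, fetch) for u in urls) / len(urls)
```

Passing the fetcher in as a parameter keeps the counting logic testable without touching the network; a real survey would also need a human or a content comparison to catch pages that load but no longer say what the citation relied on.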
Our work builds on other great link rot studies such as that by Raizel Liebler and June Liebert in the Yale Journal of Law and Technology, available here (PDF).
In response, the Harvard Library Innovation Lab has pioneered a project to unite libraries so that link rot can be mitigated. We are joined by about thirty law libraries around the world to start Perma.cc, which will allow those libraries on direction of authors and journal editors to store permanent caches of otherwise ephemeral links. Libraries are the ideal partners for this task: they think on a long timescale; they take user trust and service seriously; and they are non-commercial. You can see more about the system at perma.cc. The amazing Internet Archive has lent its archiving engine to the effort, and Instapaper has generously provided an alternative path to parse Web pages to be saved. CloudFlare has kindly ensured that the system at Perma.cc can scale with use.
We’re grateful to these many institutions and people who have come together to help make the Web work for the ages — the only way this can work is as a peer effort.
Perma’s founding partners are:
- Pence Law Library, Washington College of Law, American University
- Law Library at Boston College, Boston College of Law
- Pappas Law Library, Boston University School of Law
- Biddle Law Library, University of Pennsylvania Law School
- Charleston School of Law Library
- Arthur W. Diamond Law Library, Columbia Law School
- Digital Public Library of America
- J. Michael Goodson Law Library, Duke University School of Law
- Florida State Law Research Center, Florida State University College of Law
- The Leo T. Kissam Memorial Library, Fordham University School of Law
- Georgetown Law Library, Georgetown Law
- Internet Archive
- Harvard Law School Library
- Ruth Lilly Law Library, Robert H. McKinney School of Law, Indiana University
- Louis L. Biro Law Library, The John Marshall Law School
- Louisiana State University Law Library, LSU Law Center
- Thurgood Marshall Law Library, Francis King Carey School of Law, University of Maryland
- Melbourne Law School Law Library
- Bodleian Law Library, Bodleian Libraries, University of Oxford
- Harnish Law Library, Pepperdine University School of Law
- The Fred Parks Law Library, South Texas College of Law
- Robert Crown Law Library, Stanford Law School
- Hugh & Hazel Darling Law Library, UCLA School of Law
- Grisham Law Library, University of Mississippi School of Law
- Wiener-Rogers Law Library, UNLV William S. Boyd School of Law
- Tarleton Law Library, Jamail Center of Legal Research, The University of Texas School of Law
- Lillian Goldman Law Library, Yale Law School
NYT story here. And the Perma link to this very page can be found at http://perma.cc/0WNvsHVwhT5. (How’s that for recursive?)