Book Review published on SSRN

Three weeks ago I blogged about my recent review of “Pharmaceutical Innovation, Competition and Patent Law – a Trilateral Perspective” (Edward Elgar 2013). The full review, which is forthcoming in a spring issue of European Competition Law Review (Sweet & Maxwell), is now available on SSRN.

New paper on “Standardization, IPRs and Open Innovation in Synthetic Biology”

I am pleased to announce that we have today published the following paper:

Minssen, Timo and Wested, Jakob Blak, Standardization, IPRs and Open Innovation in Synthetic Biology (February 14, 2014). Available at SSRN.

This brief book contribution stems from a presentation given at the 2013 conference “Innovation, Competition, Collaboration” at Bucerius Law School, Hamburg, Germany. It is currently under review by Edward Elgar. A longer journal version will follow.


An effective and just sharing of resources for innovation needs a supportive infrastructure. One such infrastructure of both historic and contemporary significance is the development of standards. Considering recent developments within the software and ICT industries, it seems fair to assume that the process of standardization may also have significant impact on the development and adoption of Synthetic Biology (SB). Within SB different standardization efforts have been made, but few have assumed dominance or authority. Standardization efforts within SB may differ within various technical areas, and also the basic processes of standard creation can be divided into various categories. The different technical areas and processes for standardization differ in their speed, handling of interests and ability to dodge possible IPR concerns.

Out of this notion arise, inter alia, the following questions: How comparable is engineering in SB to more traditional fields of engineering? What types of standards have emerged, and what bearing do IPRs have on these? And how applicable are the approaches adopted by the standards-setting organizations in information and communication technology (ICT) to biological standards? These and further legal issues related to IP, regulation, standardization, competition law & open innovation require a careful consideration of new user-generated models and solutions.

Against this background, our paper seeks to describe IP and standardization aspects of SB in order to discuss them in the context of the “open innovation” discourse. We concentrate on describing the technology and identifying areas of particular relevance. Ultimately we also sketch out open questions and potential solutions requiring further research. However, given the limited scope of this paper, we do not aim to create elaborated theories or to propose solutions in more detail. Rather, this paper, which will be complemented by more extensive follow-up studies, provides a first overview of the complex questions that we are currently dealing with.

To achieve this modest goal, section 1 commences with a brief introduction to the fascinating science of SB and a description of recent technological advances and applications. This will lead us to section 2, in which we will address standard setting efforts in SB, as well as the relevance and governance of various IPRs for specific SB standards. This provides the basis for section 3, in which we debate problematic issues and summarize our conclusions.

Pit Crews with Computers: Can Health Information Technology Fix Fragmented Care?

I recently posted this draft on SSRN. Feedback much appreciated. Here is the abstract:

Fragmentation and lack of coordination remain some of the most intractable problems facing health care. Attention has often alighted on the promise of Health Information Technology (HIT), not least because IT has had such a positive impact on many other personal, professional and industrial domains. For at least two decades the HIT-panacea narrative has been persistent even though the context has shifted. At various times we have been promised that patient safety technologies would solve our medical error problems, that electronic transactions would simplify healthcare administration and insurance, and that clinical data would become interoperable courtesy of electronic medical records. Today the IoM is positioning HIT at the center of its new “continuously learning” health care model that is in large part aimed at solving our fragmentation and lack of coordination problems. While the consensus judgment that HIT can reduce fragmentation and increase coordination has intuitive force, the specifics are more complicated. First, the relationship between health care and IT has been both culturally and financially complex. Second, HIT has been overhyped as a solution for all of health care’s woes; it has its own problems. Third, the HIT-fragmentation solution presents a chicken-and-egg problem: can HIT solve health care fragmentation and lack of coordination problems, or must health care problems such as episodic care be solved prior to successful deployment of HIT? The article takes a critical look at both health care and HIT with those questions in mind before concluding with some admittedly difficult recommendations designed to break the chicken-and-egg deadlock.


Capturing Value in Advanced Medical Imaging

On December 12, a bipartisan bill entitled the Excellence in Diagnostic Imaging Utilization Act of 2013 (HR 3705) was introduced in the House of Representatives. It would require clinicians to use electronic clinical decision support (CDS) tools before ordering advanced diagnostic imaging tests for Medicare patients.  Structured around appropriate use criteria developed by professional medical societies, the tools would aim to increase the value of advanced imaging studies by informing and guiding practitioners’ decisions across a variety of clinical settings.

Such tools would provide active feedback on the appropriateness and evidence base of various imaging modalities, and would require physicians to furnish rationales for ordering tests that are inconsistent with appropriate use criteria.  The bill also envisions the creation of registries that document how diagnostic tests are used in order to facilitate research and to enable feedback to clinicians on metrics related to appropriate use criteria.  In a press release, the American College of Radiology lauded the proposed legislation, stating that it would “revolutionize the specialty of radiology.”

Mandating the use of electronic clinical decision support tools portends at least three key improvements in clinical workflows and healthcare quality more broadly.

Broader Lessons from the Insurance Exchange Fiasco

By Nicolas Terry

The political ripples from the poorly managed exchange roll-out likely will endure through at least one election cycle. Maybe late-night comedians will run out of material sooner. While criticism and inquiry are appropriate given the foreseeable nature of the problem (some months ago at SEALS even I was moved to highlight the OIG’s predictions that there would be little time for testing the data hub), mostly we will witness technical flaws being fashioned into a cudgel with which to beat the Affordable Care Act and its champion-in-chief.

As Ezra Klein has noted, “the politics here will be driven by the reality. If the policy continues to fail, then there’s nothing the White House can do to keep from being dragged down. Conversely, if the Web site is fixed come mid-December, and the policy begins working pretty well, then there’s no amount of Republican messaging that can make it a failure.”

Sitting here in mid-to-late November, it may be appropriate (or at least refreshing) to seek out some broader lessons that we may take away from this mess. In an illuminating post at the Commonwealth Fund blog, David Blumenthal contrasted his experiences inside and outside of government and concluded that the federal government needed to reform its IT procurement system. Extrapolating even further from the current disaster, Clay Shirky uses it to pose some fundamental questions about how managers communicate with technologists and how politicians approach Internet interaction with citizens. His “litmus test” for “whether our political class grasps the internet”? “Can anyone with authority over a new project articulate the tradeoff between features, quality, and time?” Those managing the roll-out failed that test.

Medical Advice and the Limits of Therapeutic Influence

By Michael Young

It is estimated that 500,000 patients are discharged from U.S. hospitals against the recommendations of medical staff each year.  This category of discharges, dubbed discharges against medical advice (DAMA), encompasses cases in which patients request to be discharged in spite of countervailing medical counsel to remain hospitalized.  Despite safeguards that exist to ensure that patients are adequately informed and competent to make such decisions, these cases can be ethically challenging for practitioners who may struggle to balance their commitments to patient-centered care with their impulse to accomplish what is in their view best for a patient’s health.

Writing in the most recent issue of JAMA, Alfandre et al. contend that “the term ['discharge against medical advice'] is an anachronism that has outlived its usefulness in an era of patient-centered care.”  They argue that the concept and category of DAMA “sends the undesirable message that physicians discount patients’ values in clinical decision making.  Accepting an informed patient’s values and preferences, even when they do not appear to coincide with commonly accepted notions of good decisions about health, is always part of patient-centered care.”  The driving assumption here seems to be that if physicians genuinely include patients’ interests and values in their assessments, then the possibility of “discharge against medical advice” is ruled out ab initio, since any medical advice issued would necessarily encapsulate and reflect patients’ preferences.  They therefore propose that “[f]or a profession accountable to the public and committed to patient-centered care, continued use of the discharged against medical advice designation is clinically and ethically problematic.”

While abandoning DAMA procedures may well augment patients’ sense of acceptance among medical providers and reduce deleterious effects on therapeutic relationships that may stem from having to sign DAMA forms, it leaves relatively unaddressed the broader question of how to mitigate health risks patients may experience following medically premature or unplanned discharge.  Alfandre and Schumann’s robust interpretation of patient-centeredness also raises the question of how to handle situations in which patients refuse medically appropriate discharge.  On this interpretation, can the ideal of patient-centered care be squared with concerns for optimizing the equity and efficiency of resource allocations more broadly?

Gregg Fields on the Failure of Public-Private Partnerships in Obamacare

Over at our sister blog for the Edmond J. Safra Center for Ethics, Gregg Fields has an insightful discussion of the way Obamacare has relied on private sector contractors to get its enrollment website up and running.  Gregg quotes Safra-affiliate Bill English, who explains the allure of public-private partnerships: they “enable the public sector to harness the expertise and efficiencies that the private sector can bring to the delivery of certain facilities and services traditionally procured and delivered by the public sector.” HHS Secretary Kathleen Sebelius gets the understatement of the year award: “Unfortunately, a subset of those contracts for HealthCare.gov have not met expectations.”

Disruptive Innovation and the Rise of the Retail Clinic

By Michael Young

The Association of American Medical Colleges (AAMC) projects that by 2025 the United States will face a shortage of 130,600 physicians, representing a near 18-fold increase from the deficit of 7,400 physicians in 2008.  The widening gap between physician supply and demand has grown out of a complex interplay of legal, political, and social factors, including a progressively aging population, Congressionally mandated caps on the number of Medicare-funded residency slots and funding for graduate medical education, and waning interest among medical school graduates in pursuing careers in primary care.

These issues generate unprecedented opportunities for healthcare innovators and entrepreneurs to design solutions that can effectively address widening disparities between healthcare supply and demand, particularly within vulnerable and underserved areas.

Ethical Concerns, Conduct and Public Policy for Re-Identification and De-identification Practice: Part 3 (Re-Identification Symposium)

This post is part of Bill of Health’s symposium on the Law, Ethics, and Science of Re-Identification Demonstrations. Background on the symposium is here. You can call up all of the symposium contributions by clicking here. —MM

By Daniel C. Barth-Jones

In Part 1 and Part 2 of this symposium contribution, I wrote about a number of re-identification demonstrations and their reporting, both by the popular press and in scientific communications. However, even beyond the ethical considerations that I’ve raised about the accuracy of some of these communications, there are additional ethical, “scientific ethos”, and pragmatic public policy considerations involved in the conduct of re-identification research and de-identification practice that warrant some more thorough discussion and debate.

First Do No Harm

Unless we believe that the ends always justify the means, even obtaining useful results for guiding public policy (as was the case with the PGP demonstration attack’s validation of “perfect population register” issues) doesn’t necessarily mean that the conduct of re-identification research is on solid ethical footing. Yaniv Erlich’s admonition in his “A Short Ethical Manifesto for the Privacy Researcher” blog post, contributed as part of this symposium, provides this wise advice: “Do no harm to the individuals in your study. If you can prove your point by a simulation on artificial data – do it.” This is very sound ethical advice in my opinion. I would argue that the re-identification risks for those individuals in the PGP study who had supplied 5-digit Zip Code and full date of birth were already understood to be unacceptably high (if these persons were concerned about being identified) and that no additional research whatsoever was needed to demonstrate this point. However, if additional arguments needed to be made about the precise levels of the risks, this could have been adequately addressed through the use of probability models. I’d also argue that “data intrusion scenario” uncertainty analyses which I discussed in Part 1 of this symposium contribution already accurately predicted the very small re-identification risks found for the sort of journalist and “nosy neighbor” attacks directed at the Washington hospital data. When strong probabilistic arguments can be made regarding potential re-identification risks, there is little possible purpose for undertaking actual re-identifications that can impact specific persons.

Looking more broadly, it seems more reasonably debatable whether the earlier January re-identification attacks by the Erlich lab on the CEPH – Utah Residents with Northern and Western European Ancestry (CEU) participants could have been warranted by virtue of the attack having exposed a previously underappreciated risk. However, I think an argument could likely be made that, given the prior work by Gitschier which had already revealed the re-identification vulnerabilities of CEU participants, the CEU portion of the Science paper also might not have served any additional purpose in directly advancing the science needed for development of good public policy. Without the CEU re-identifications though, it is unclear whether the surname inference paper would have been published (at least by a prominent journal like Science) and it also seems quite unlikely that it would have sustained nearly the level of media attention.

Press and Reporting Considerations for Recent Re-Identification Demonstration Attacks: Part 2 (Re-Identification Symposium)

This post is part of Bill of Health’s symposium on the Law, Ethics, and Science of Re-Identification Demonstrations. Background on the symposium is here. You can call up all of the symposium contributions by clicking here. —MM

Daniel C. Barth-Jones, M.P.H., Ph.D., is an HIV and infectious disease epidemiologist.  His work in the area of statistical disclosure control and implementation under the HIPAA Privacy Rule provisions for de-identification is focused on the importance of properly balancing competing goals of protecting patient privacy and preserving the accuracy of scientific research and statistical analyses conducted with de-identified data. You can follow him on Twitter at @dbarthjones.

Forecast for Re-identification: Media Storms Continue…

In Part 1 of this symposium contribution, I wrote about the re-identification “media storm” started in January by the Erlich lab’s “Y-STR” re-identifications which made use of the relationship between Short Tandem Repeats (STRs) on the Y chromosome and paternally inherited surnames. Within months of that attack, April and June brought additional re-identification media storms, this time surrounding re-identification of Personal Genome Project (PGP) participants and a separate attack matching 40 persons within the Washington State hospital discharge database to news reports. However, as I have written has sometimes been the case with past reporting on other re-identification risks, accurate and legitimate characterization of re-identification risks has, unfortunately, once again been overshadowed by distortive and exaggerated reporting on some aspects of these re-identification attacks. A careful review of both the popular press coverage and scientific communications for these recent re-identification demonstrations reveals some highly misleading communications, the most egregious of which incorrectly informs more than 112 million persons (more than one third of the U.S. population) that they are at potential risk of re-identification when they would not actually be unique and, therefore, re-identifiable. While each separate reporting concern that I’ve addressed here is important in and of itself, the broader pattern that can be observed for these communications about re-identification demonstrations raises some serious concerns about the impact that such distortive reporting could have on the development of sound and prudent public policy for the use of de-identified data.

Reporting Fail (and after-Fails)

University of Arizona law professor Jane Yakowitz Bambauer was the first to call out the distortive “reporting fail” for the PGP “re-identifications” in her blog post on the Harvard Law School Info/Law website. Bambauer pointed out that a Forbes article (written by Adam Tanner, a fellow at Harvard University’s Department of Government, and colleague of the re-identification scientist) covering the PGP re-identification demonstration was misleading with regard to a number of aspects of the actual research report released by Harvard’s Data Privacy Lab. The PGP re-identification study attempted to re-identify 579 persons in the PGP study by linking their “quasi-identifiers” {5-digit Zip Code, date of birth and gender} to both voter registration lists and an online public records database. The Forbes article led with the statement that “more than 40% of a sample of anonymous participants” had been re-identified. (This dubious claim was also repeated in subsequent reporting by the same author in spite of Bambauer’s “call out” of the inaccuracy explained below.) However, the mischaracterization of this data as “anonymous” really should not have fooled anyone beyond the most casual readers. In fact, approximately 80 individuals among the 579 were “re-identified” only because they had their actual names included within file names of the publicly available PGP data. Some two dozen additional persons had their names embedded within the PGP file names, but were also “re-identifiable” by matching to voter and online public records data. Bambauer points out that the inclusion of the named individuals was “not relevant to an assessment of re-identification risk because the participants were not de-identified,” and quite correctly adds that “Including these participants in the re-identification number inflates both the re-identification risk and the accuracy rate.”

As one observer humorously tweeted after reading Bambauer’s blog piece,

“It’s like claiming you ‘reidentified’ people from their high school yearbook.”
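
To make the linkage step described above concrete, here is a minimal sketch in Python. It uses entirely synthetic records, not PGP or voter data, and it assumes that a record counts as re-identifiable only when its quasi-identifiers {Zip Code, date of birth, gender} match exactly one entry in a public register. The record layouts, field names and the `link` helper are illustrative assumptions, not the method used in the Data Privacy Lab report.

```python
from collections import defaultdict

# Hypothetical, synthetic records for illustration only.
study_records = [
    {"id": "s1", "zip": "02138", "dob": "1970-01-15", "sex": "F"},
    {"id": "s2", "zip": "02138", "dob": "1982-06-30", "sex": "M"},
    {"id": "s3", "zip": "98101", "dob": "1965-11-02", "sex": "F"},
]
public_register = [
    {"name": "Alice Example", "zip": "02138", "dob": "1970-01-15", "sex": "F"},
    {"name": "Bob Example", "zip": "02138", "dob": "1982-06-30", "sex": "M"},
    {"name": "Bill Example", "zip": "02138", "dob": "1982-06-30", "sex": "M"},
]

QUASI_IDENTIFIERS = ("zip", "dob", "sex")


def key(record):
    """Return the tuple of quasi-identifier values for one record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(study, register):
    """Map each study record id to the register names sharing its quasi-identifiers."""
    index = defaultdict(list)
    for person in register:
        index[key(person)].append(person["name"])
    return {rec["id"]: index.get(key(rec), []) for rec in study}


matches = link(study_records, public_register)

# Only a one-to-one match supports a claimed re-identification.
unique = {sid: names[0] for sid, names in matches.items() if len(names) == 1}
```

In this toy data only `s1` links to a single name. `s2` shares its quasi-identifiers with two register entries, and `s3` does not appear in the register at all, so neither would count as a re-identification under the uniqueness assumption stated above.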

HBR/NEJM online forum on health care innovation

For those of you who haven’t seen it yet, there’s a great ongoing online forum over at the joint Harvard Business Review and New England Journal of Medicine Insight Center on Leading Health Care Innovation.  It’s online at HBR here, and will feature an ongoing series of posts about innovation in high-value health care through November 15.  Short articles from scholars in various fields will focus on three main areas: Big Ideas (foundational principles of high-value health care); Managing Innovations (organization and delivery); and From the Front Lines (stories of specific case solutions from practitioners).

They’re looking to host a lively forum, so comments seem both quite welcome and unusually thoughtful so far.


Of Data Challenges

Cross-posted from the HealthLawProfs blog.

Challenges designed to spur innovative uses of data are springing up frequently. These are contests, sponsored by a mix of government agencies, industry, foundations, a variety of not-for-profit groups, or even individuals. They offer prize money or other incentives for people or teams to come up with solutions to a wide range of problems. In addition to grand prizes, they often offer many smaller prizes or networking opportunities. The latest such challenge to come to my attention was announced August 19 by the Knight Foundation: $2 million for answers to the question “how can we harness data and information for the health of communities?” Companion prizes, of up to $200,000, are also being offered by the Robert Wood Johnson Foundation and the California Healthcare Foundation.

Such challenges are also a favorite of the Obama administration. From promoting Obamacare among younger Americans (over 100 prizes of up to $30,000)–now entered by Karl Rove’s Crossroads group–to arms control and identification of sewer overflows, the federal government has gone in for challenges big time. Check out to see the impressive list. Use of information and technological innovation feature prominently in the challenges, but there is also a challenge for “innovative communications strategies to target individuals who experience high levels of involuntary breaks (“churn”) in health insurance coverage” (from SAMHSA), a challenge to design posters to educate kids about concussions (from CDC), a challenge to develop a robot that can retrieve samples (from NASA), and a challenge to use technology for atrocity prevention (from USAID and Humanity United). All in all, some 285 challenges sponsored by the federal government are currently active, although for some the submission period has closed.

Thoughts on Myriad

While awaiting the torrent of academic commentary on this case that is no doubt forthcoming, for now I thought I’d highlight a few interesting aspects of today’s unanimous Supreme Court decision in Association for Molecular Pathology v. Myriad Genetics, 569 U. S. ____ (2013).

Briefly, this case concerned whether genes can be patented. The company Myriad Genetics held several patents related to two genes: BRCA1 and BRCA2. When mutations of these genes are present, it may indicate that a woman is at a high risk for getting breast or ovarian cancer. Myriad also held patents on a proprietary test to evaluate for the presence of BRCA gene mutations that costs over $3,000. This screening has been in the news recently due to Angelina Jolie’s decision to undergo preventive double mastectomy after testing positive for BRCA mutations.

In today’s case, Myriad’s patents were being challenged because they limited competition from other companies and researchers that could have independently tested for the same gene mutations. The outcome of this case has been critically anticipated for years because of its impact on patient access to medicines and funding medical research and development. Thousands of human genes have been patented in the U.S. over the past 30 years.

Before the case reached the Supreme Court, a U.S. District judge in New York invalidated Myriad’s patents in 2010, ruling that genes were ineligible for patent protection as “products of nature.” However, the Court of Appeals for the Federal Circuit disagreed, holding that genes were eligible for patent protection because DNA isolated from the body is “markedly different” in chemical structure from DNA as it exists inside the body. The Supreme Court remanded the case to the Federal Circuit in light of its decision in Prometheus, and the Federal Circuit reaffirmed its decision that DNA was patent eligible.

Public Policy Considerations for Recent Re-Identification Demonstration Attacks on Genomic Data Sets: Part 1 (Re-Identification Symposium)

This post is part of Bill of Health’s symposium on the Law, Ethics, and Science of Re-Identification Demonstrations. We’ll have more contributions throughout the week. Background on the symposium is here. You can call up all of the symposium contributions by clicking here. —MM

Daniel C. Barth-Jones, M.P.H., Ph.D., is an HIV and Infectious Disease Epidemiologist.  His work in the area of statistical disclosure control and implementation under the HIPAA Privacy Rule provisions for de-identification is focused on the importance of properly balancing competing goals of protecting patient privacy and preserving the accuracy of scientific research and statistical analyses conducted with de-identified data. You can follow him on Twitter at @dbarthjones.

Re-identification Rain-makers

The media’s “re-identification rain-makers” have been hard at work in 2013 ceremoniously drumming up the latest anxiety-inducing media storms. In January, a new re-identification attack providing “surname inferences” from genomic data was unveiled and the popular press and bloggers thundered, rattled and raged with headlines ranging from the more staid and trusted voices of major newspapers (like the Wall Street Journal’s: “A Little Digging Unmasks DNA Donor Names. Experts Identify People by Matching Y-Chromosome Markers to Genealogy Sites, Obits; Researchers’ Privacy Promises ‘Empty’”) to near “the-sky-is-falling” hysteria in the blogosphere where headlines screamed: “Your Biggest Genetic Secrets Can Now Be Hacked, Stolen, and Used for Target Marketing” and “DNA hack could make medical privacy impossible”. (Now, we all know that editors will sometimes write sensational headlines in order to draw in readers, but I have to just say “Please, Editors… Take a deep breath and maybe a Xanax”.)

The more complicated reality is that, while this recent re-identification demonstration provided some important warning signals for future potential health privacy concerns, it was not likely to have been implemented by anyone other than an academic re-identification scientist; nor would it have been nearly so successful if it had not carefully selected targets who were particularly susceptible to re-identification.

As I’ve written elsewhere, from a public policy standpoint, it is essential that the re-identification scientists and the media accurately communicate re-identification risk research; because public opinion should, and does, play an important role in setting priorities for policy-makers. There is no “free lunch”. Considerable costs come with incorrectly evaluating the true risks of re-identification, because de-identification practice importantly impacts the scientific accuracy and quality of the healthcare decisions made based on research using de-identified data. Properly balancing disclosure risks and statistical accuracy is crucial because some popular de-identification methods can unnecessarily, and often undetectably, degrade the accuracy of de-identified data for multivariate statistical analyses. Poorly conducted de-identification may fail to protect privacy, and the overuse of de-identification methods in cases where they do not produce meaningful privacy protections can quickly lead to undetected and life threatening distortions in research and produce damaging health policy decisions.

So, what is the realistic magnitude of re-identification risk posed by the “Y-STR” surname inference re-identification attack methods developed by Yaniv Erlich’s lab? Should *everyone* really be fearful that this “DNA Hack” has now made their “medical privacy impossible”?

An Open Letter From a Genomic Altruist to a Genomic Extrovert (Re-Identification Symposium)

This post is part of Bill of Health’s symposium on the Law, Ethics, and Science of Re-Identification Demonstrations. You can call up all of the symposium contributions here. We’ll continue to post contributions throughout the week. —MM

Dear Misha:

In your open letter to me, you write:

No one is asking you to be silent, blasé or happy about being cloned (your clone, however, tells me she is “totally psyched”).

First things first: I have an ever-growing list of things I wish I had done differently in life, so let me know when my clone has learned how to read, and I’ll send it on over; perhaps her path in life will be sufficiently similar to mine that she’ll benefit from at least a few items on the list.

Moving on to substance, here’s the thing: some people did say that PGP participants have no right to complain about being re-identified (and, by logical extension, about any of the other risks we assumed, including the risk of being cloned). It was my intention, in that post, to articulate and respond to three arguments that I’ve encountered, each of which suggests that re-identification demonstrations raise few or no ethical issues, at least in certain cases. To review, those arguments are:

  1. Participants who are warned by data holders of the risk of re-identification thereby consent to be re-identified by third parties.
  2. Participants who agree to provide data in an open access format for anyone to do with it whatever they like thereby gave blanket consent that necessarily included consent to using their data (combined with other data) to re-identify them.
  3. Re-identification is benign in the hands of scholars, as opposed to commercial or criminal actors.

I feel confident in rejecting the first and third arguments. (As you’ll see from the comments I left on your post, however, I struggled, and continue to struggle, with how to respond to the second argument; Madeleine also has some great thoughts.) Note, however, two things. First, none of my responses to these arguments was meant to suggest that I or anyone else had been “sold a bill of goods” by the PGP. I’m sorry that I must have written my post in such a way that it lent itself to that interpretation. All I intended to say was that, in acknowledging the PGP’s warning that re-identification by third parties is possible, participants did not give third parties permission to re-identify them. I was addressing the relationship between re-identification researchers and data providers more than that between data providers and data holders.

Second, even as to re-identification researchers, it doesn’t follow from my rejection of these three arguments that re-identification demonstrations are necessarily unethical, even when conducted without participant consent. Exploring that question is the aim, in part, of my next post. What I tried to do in the first post was clear some brush and push back against the idea that under the PGP model — a model that I think we both would like to see expand — participants have given permission to be re-identified, “end of [ethical] story.”

Reidentification as Basic Science (Re-Identification Symposium)

This post is part of Bill of Health‘s symposium on the Law, Ethics, and Science of Re-Identification Demonstrations. You can call up all of the symposium contributions here. We’ll continue to post contributions into next week. —MM

Arvind Narayanan (Ph.D. 2009) is an Assistant Professor of Computer Science at Princeton. He studies information privacy and security and has a side-interest in technology policy. His research has shown that data anonymization is broken in fundamental ways, for which he jointly received the 2008 Privacy Enhancing Technologies Award. Narayanan is one of the researchers behind the “Do Not Track” proposal. His most recent research direction is the use of Web measurement to uncover how companies are using our personal information.

Narayanan is an affiliated faculty member at the Center for Information Technology Policy at Princeton and an affiliate scholar at Stanford Law School’s Center for Internet and Society. You can follow him on Twitter at @random_walker.

By Arvind Narayanan

What really drives reidentification researchers? Do we publish these demonstrations to alert individuals to privacy risks? To shame companies? For personal glory? If our goal is to improve privacy, are we doing it in the best way possible?

In this post I’d like to discuss my own motivations as a reidentification researcher, without speaking for anyone else. Certainly I care about improving privacy outcomes, in the sense of making sure that companies, governments and others don’t get away with mathematically unsound promises about the privacy of consumers’ data. But there is a quite different goal I care about at least as much: reidentification algorithms. These algorithms are my primary object of study, and so I see reidentification research partly as basic science.

Continue reading

Reflections of a Re-Identification Target, Part I: Some Information Doesn’t Want To Be Free (Re-Identification Symposium)

This post is part of Bill of Health‘s symposium on the Law, Ethics, and Science of Re-Identification Demonstrations. You can call up all of the symposium contributions here. Please note that Bill of Health continues to have problems receiving some comments. If you post a comment to any symposium piece and do not see it within half an hour or so, please email your comment to me at mmeyer @ and I will post it. —MM

By Michelle N. Meyer

I wear several hats for purposes of this symposium, in addition to organizer. First, I’m trained as a lawyer and an ethicist, and one of my areas of scholarly focus is research regulation and ethics, so I see re-identification demonstrations through that lens. Second, as a member of the advisory board of the Social Science Genetics Association Consortium (SSGAC), I advise data holders about ethical and regulatory aspects of their research, including issues of re-identification. I may have occasion to reflect on this role later in the symposium. For now, however, I want to put on my third hat: that of data provider to (a.k.a. research participant in) the Personal Genome Project (PGP), the most recent target of a pair of re-identification “attacks,” as even re-identification researchers themselves seem to call them.

In this first post, I’ll briefly discuss my experience as a target of a re-identification attack. In my discussions elsewhere about the PGP demonstrations, some have suggested that re-identification requires little or no ethical justification where (1) participants have been warned about the risk of re-identification; (2) participants have given blanket consent to all research uses of the data they make publicly available; and/or (3) the re-identification researchers are scholars rather than commercial or criminal actors.

In explaining below why I think each of these arguments is mistaken, I focus on the PGP re-identification demonstrations. I choose the PGP demonstrations not to single them out, but rather for several other reasons. First, the PGP attacks are the case studies with which, for obvious reasons, I’m most familiar, and I’m fortunate to have convinced so many other stakeholders involved in those demonstrations to participate in the symposium and help me fill out the picture with their perspectives. I also focus on the PGP because some view it as an “easy” case for re-identification work, given the features I just described. Therefore, if nonconsensual re-identification attacks on PGP participants are ethically problematic, then much other nonconsensual re-identification work is likely to be as well. Finally, although today the PGP may be somewhat unusual in being so frank with participants about the risk of re-identification and in engaging in such open access data sharing, both of these features, and especially the first, shouldn’t be unusual in research. To the extent that we move towards greater frankness about re-identification risk and broader data sharing, trying to achieve clarity about what these features of a research project do — and do not — mean for the appropriateness of re-identification demonstrations will be important.

Having argued here about how not to think about the ethics of re-identification studies, in a later post, I plan to provide some affirmative thoughts about an ethical framework for how we should think about this work.

Continue reading

Data Sharing vs. Privacy: Cutting the Gordian Knot (Re-Identification Symposium)

PGP participants and staff at the 2013 GET Conference. Photo credit:, license CC-BY

This post is part of Bill of Health‘s symposium on the Law, Ethics, and Science of Re-Identification Demonstrations. You can call up all of the symposium contributions here. Please note that Bill of Health continues to have problems receiving some comments. If you post a comment to any symposium piece and do not see it within half an hour or so, please email your comment to me at mmeyer @ and I will post it. —MM

By Madeleine Ball

Scientists should share. Methods, samples, and data — sharing these is a foundational aspect of the scientific method. Sharing enables researchers to replicate, validate, and build upon the work of colleagues. As Isaac Newton famously wrote: “If I have seen further it is by standing on the shoulders of giants.”

When scientists study humans, however, this impulse to share runs into another motivating force — respect for individual privacy. Clinical research has traditionally been conducted using de-identified data, and participants have been assured privacy. As digital information and computational methods have increased the ability to re-identify participants, researchers have become correspondingly more restrictive with sharing. Solutions are proposed in an attempt to maximize research value while protecting privacy, but these can fail — and, as Gymrek et al. have recently confirmed, biological materials themselves contain highly identifying information through their genetic material alone.

When George Church proposed the Personal Genome Project in 2005, he recognized this inherent tension between privacy and data sharing. He proposed an extreme solution, cutting the Gordian knot by removing assurances of privacy:

If the study subjects are consented with the promise of permanent confidentiality of their records, then the exposure of their data could result in psychological trauma to the participants and loss of public trust in the project. On the other hand, if subjects are recruited and consented based on expectation of full public data release, then the above risks to the subjects and the project can be avoided.

Church, GM “The Personal Genome Project” Molecular Systems Biology (2005)

Thus, the first ten PGP participants — the PGP-10 — identified themselves publicly.

Continue reading

Re-Identification Is Not the Problem. The Delusion of De-Identification Is. (Re-Identification Symposium)

This is the second post in Bill of Health‘s symposium on the Law, Ethics, and Science of Re-Identification Demonstrations. We’ll have more contributions throughout the week, and extending at least into early next week. Background on the symposium is here. You can call up all of the symposium contributions by clicking here (or by clicking on the “Re-Identification Symposium” category link at the bottom of any symposium post).

Please note that Bill of Health continues to have problems receiving some comments. If you post a comment to any symposium piece and do not see it within half an hour or so, please email your comment to me at mmeyer @ and I will post it. —MM

By Jen Wagner, J.D., Ph.D.

Before I actually discuss my thoughts on the re-identification demonstrations, I think it would be useful to provide a brief background on my perspective.


My genome is an identifier. It can be used in lieu of my name, my visible appearance, or my fingerprints to describe me sufficiently for legal purposes (e.g. a “Jane Doe” search or arrest warrant specifying my genomic sequence). Nevertheless, my genome is not me. It is not the gist of who I am, past, present or future. In other words, I do not believe in genetic essentialism.

My genome is not my identity, though it contributes to my identity in varying ways (directly and indirectly; consciously and subconsciously; discretely and continuously). Not every individual defines his/her self the way I do. There are genomophobes who may shape their identity in the absence of their genomic information and even in denial of and/or contradiction to their genomic information. Likewise, there are genomophiles who may shape their identity with considerable emphasis on their genomic information, in the absence of non-genetic information and even in denial of and/or contradiction to their non-genetic information (such as genealogies and origin beliefs).

My genome can tell you probabilistic information about me, such as my superficial appearance, health conditions, and ancestry. But it won’t tell you how my phenotypes have developed over my lifetime or how they may have been altered (e.g. the health benefits I noticed when I became vegetarian, the scar I earned when I was a kid, or the dyes used to hide the grey hairs that seem proportional to time spent on the academic job market). I do not believe in genetic determinism. My genomic data is of little research value without me (i.e. a willing, able, and honest participant), my phenotypic information (e.g. anthropometric data and health status), and my environmental information (e.g. data about my residence, community, life exposures, etc). Quite simply, I make my genomic data valuable.

As a PGP participant, I did not detach my name from the genetic data I uploaded into my profile. In many ways, I feel that the value of my data is maximized and the integrity of my data is better ensured when my data is humanized.

Continue reading

Applying Information Privacy Norms to Re-Identification Demonstrations (Re-Identification Symposium)

This is the first post in Bill of Health‘s symposium on the Law, Ethics, and Science of Re-Identification Demonstrations. We’ll have more contributions throughout the week. Background on the symposium is here. You can call up all of the symposium contributions by clicking here (or by clicking on the “Re-Identification Symposium” category link at the bottom of any symposium post). —MM

By Stephen Wilson

I’m fascinated by the methodological intersections of technology and privacy – or rather the lack of intersection, for it appears that a great deal of technology development occurs in blissful ignorance of information privacy norms. By “norms” in the main I mean the widely legislated OECD Data Protection Principles (see Graham Greenleaf, Global data privacy laws: 89 countries, and accelerating, Privacy Laws & Business International Report, Issue 115, Special Supplement, February 2012).

Standard data protection and information privacy regulations world-wide are grounded by a reasonably common set of principles; these include, amongst other things, that personal information should not be collected if it is not needed for a core business function, and that personal information collected for one purpose should not be re-used for unrelated purposes without consent. These sorts of privacy formulations tend to be technology neutral; they don’t much care about the methods of collection but focus instead on the obligations of data custodians regardless of how personal information has come to be in their systems. That is, it does not matter if you collect personal information from the public domain, or from a third party, or if you synthesise it from other data sources, you are generally accountable under the Collection Limitation and Use Limitation principles in the same way as if you collect that personal information directly from the individuals concerned.

I am aware of two distinct re-identification demonstrations that have raised awareness of the issues recently. In the first, Yaniv Erlich used what I understand are new statistical techniques to re-identify a number of subjects who had donated genetic material anonymously to the 1000 Genomes project. He did this by correlating genes in the published anonymous samples with genes in named samples available from genealogical databases. The 1000 Genomes consent form reassured participants that re-identification would be “very hard”. In the second notable demo, Latanya Sweeney re-identified volunteers in the Personal Genome Project using her previously published method of using a few demographic values (such as date of birth, sex and postal code) extracted from the otherwise anonymous records.
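
To make the second kind of demonstration concrete, here is a minimal Python sketch of linking "anonymous" records to named ones on date of birth, sex and postal code. It is only an illustration under stated assumptions: the records, the field names and the `link_records` helper are all hypothetical, invented for this example, and it is not Sweeney's actual code, data or published method in any detail.

```python
from collections import defaultdict

# Hypothetical quasi-identifiers: fields that are not names but that,
# taken together, may single out one person.
QUASI_IDENTIFIERS = ("date_of_birth", "sex", "postal_code")


def link_records(anonymous, named, keys=QUASI_IDENTIFIERS):
    """Map each anonymous record id to a name when exactly one named
    record shares the same quasi-identifier values."""
    # Index the named records by their quasi-identifier combination.
    index = defaultdict(list)
    for person in named:
        index[tuple(person[k] for k in keys)].append(person["name"])

    matches = {}
    for record in anonymous:
        candidates = index.get(tuple(record[k] for k in keys), [])
        # Only an unambiguous match counts as a re-identification here.
        if len(candidates) == 1:
            matches[record["id"]] = candidates[0]
    return matches


# Invented toy data (not drawn from any real project or register).
anonymous_profiles = [
    {"id": "P1", "date_of_birth": "1970-01-01", "sex": "F", "postal_code": "00001"},
    {"id": "P2", "date_of_birth": "1980-02-02", "sex": "M", "postal_code": "00002"},
    {"id": "P3", "date_of_birth": "1990-03-03", "sex": "F", "postal_code": "00003"},
]
named_register = [
    {"name": "Alice Example", "date_of_birth": "1970-01-01", "sex": "F", "postal_code": "00001"},
    {"name": "Bob Example", "date_of_birth": "1980-02-02", "sex": "M", "postal_code": "00002"},
    {"name": "Bill Example", "date_of_birth": "1980-02-02", "sex": "M", "postal_code": "00002"},
    {"name": "Dana Example", "date_of_birth": "1991-04-04", "sex": "F", "postal_code": "00004"},
]

reidentified = link_records(anonymous_profiles, named_register)
print(reidentified)
```

In this toy data only P1 is linked to a name: P2 matches two people in the named list and P3 matches none. The point of the sketch is that the linking step records a name against a previously nameless record.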

A great deal of the debate around these cases has focused on the consent forms and the research subjects’ expectations of anonymity. These are important matters for sure, yet for me the ethical issue in re-identification demonstrations is more about the obligations of third parties doing the identification who had nothing to do with the original informed consent arrangements. The act of recording a person’s name against erstwhile anonymous data represents a collection of personal information. The implications for genomic data re-identification are clear.

Continue reading