Open Source and Cloud Computing

My friend and former Berkman co-worker Aaron Williamson, who is a lawyer at the Software Freedom Law Center, was kind enough to talk with my Internet Law class about how open source works in a cloud computing environment. Aaron was good enough to let me post my notes on his talk – with fervent apologies for any errors I make! Read on to learn about how open source can break down in the cloud, and how we might re-invent it…

  • The move to Web applications challenges the open source model – copyleft only works if you’re distributing software to your users. With network services, for example, the GPL becomes a permissive license – if you’re running a Web server that is under the GPL, you aren’t distributing the code, so there’s no obligation under the license to provide that code to your users. Web apps thus can undercut open source goals / obligations, and also have the effect of equalizing the various flavors of licenses (BSD, GPL, etc.).
  • The Affero GPL seeks to move network-based services closer to the PC-based model for open source licensing – the AGPL modifies the GPL’s bargain by linking the source provision requirement to modification of the underlying code and to user interaction over a network, rather than to distribution [Update 26 May: Aaron corrected me!]. Copyright remains the fulcrum: modification (creation of a derivative work) gives the license its grip, ensuring that users have access to source code. In addition, the AGPL reduces vendor lock-in: if the vendor goes out of business, or begins behaving badly, users have the code. However…
  • Data is the primary challenge to open source in cloud computing – access to source doesn’t help much if the data from a Web application remains inaccessible. Often, the only interface to a Web application’s data is via the site itself – if there’s no API, or a limited API, the transaction cost of shifting to a different vendor or application increases dramatically. This may be particularly acute for financial or business data.
  • The set of social relationships that is critical to the Web – think Facebook – isn’t yet addressed by the open source model. Being able to set up your own version of Facebook is effectively worthless if you can’t migrate the social connections that characterize social networking. It’s hard to replicate the value of a Web community by taking the underlying code and installing it on your computer or server. (How well would “Bambauer’s Book” fare if I decided I was sick of Facebook and wanted to start my own?) Network effects can thus create lock-in.
  • There are three key challenges in a world of cloud computing: data portability, privacy (typically governed by contract, but think also about Fourth Amendment issues), and compatibility (particularly protecting the integrity of social relationships during migration). Before networked apps, access to source took care of these concerns – you could examine both the data formats and how the data was processed by the code to address concerns. Terms of service – the parameters of the relationship between the user and the networked service – thus become critical in addressing these worries for cloud computing…
  • For the GPL model to migrate to networked services, copyright and licenses aren’t enough – we also need technological features that protect user freedoms. This becomes difficult to mandate, though, as the universal applicability of copyright no longer does this work for us. To enable user autonomy, for example, data has to be portable, which means that network services must provide APIs to communicate with other services. Take LiveJournal, for example – the code is under the GPL, but there’s no way to get access to all of the relevant data that a user creates and depends upon. In terms of data security, banks have become a model for why protection is needed.
  • To replicate the GPL model for Web services, we need three things: 1) access to source, 2) carefully designed terms of service, and 3) technological features (such as data APIs).
  • Aaron identified Identi.ca, a micro-blogging service (like Twitter), as a key proof of concept for open source Web services. It’s licensed under the AGPL v3 (#1 above). The service’s ToS specify which data is private and which is not (#2 above) – private data isn’t shared, but is only used to provide services to users, and Identi.ca will only turn data over to the government under a court order. The service also describes exactly what it does with the public data it stores, constraining its freedom with regard to that information. Finally, Identi.ca has an API (a clone of the Twitter API, evidently) that lets users get their data out of the service (#3 above). Users can also export relationships in a standardized format (Friend of a Friend). Identi.ca addresses the vendor lock-in concern by implementing the Open Authorization protocol, which allows separate instances of the network software to communicate with each other. This enables interoperability without exposing private data. If you want, you can have your own Identi.ca version – and it can talk with other versions! For Twitter addicts, Identi.ca will talk to Twitter (well, at Twitter) if you have an account for Tweeting.
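
To make the data-portability point (#3) a bit more concrete, here is a minimal sketch of what "getting your data out" might look like: posts exported as JSON, and relationships exported as Friend of a Friend (FOAF) RDF/XML. This is not Identi.ca's actual code or API. The `ACCOUNT` record and the function names `export_posts_json` and `export_foaf` are hypothetical, made up for illustration; only the FOAF and RDF namespace URIs and the `foaf:Person`, `foaf:nick`, and `foaf:knows` terms come from the real vocabulary.

```python
import json
import xml.etree.ElementTree as ET

# Namespace URIs from the RDF and FOAF vocabularies.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"

# Hypothetical in-memory account record (an assumption for this sketch);
# a real service would read this from its own database.
ACCOUNT = {
    "nick": "alice",
    "posts": [
        {"id": 1, "text": "Hello, open microblogging!"},
        {"id": 2, "text": "Exporting my data."},
    ],
    "follows": ["bob", "carol"],
}


def export_posts_json(account):
    """Return the user's posts as a JSON string the user can take elsewhere."""
    return json.dumps({"nick": account["nick"], "posts": account["posts"]}, indent=2)


def export_foaf(account):
    """Return the user's relationships as a FOAF (RDF/XML) string."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("foaf", FOAF_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    person = ET.SubElement(root, f"{{{FOAF_NS}}}Person")
    ET.SubElement(person, f"{{{FOAF_NS}}}nick").text = account["nick"]
    # Each followed user becomes a foaf:knows link to another foaf:Person.
    for friend in account["follows"]:
        knows = ET.SubElement(person, f"{{{FOAF_NS}}}knows")
        other = ET.SubElement(knows, f"{{{FOAF_NS}}}Person")
        ET.SubElement(other, f"{{{FOAF_NS}}}nick").text = friend
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(export_posts_json(ACCOUNT))
    print(export_foaf(ACCOUNT))
```

Because both exports use open, documented formats, a competing service could in principle import them, so the user can leave with both the content and the social graph. Whether a given service accepts such an import is, of course, up to that service.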

I thought Aaron’s talk was absolutely fascinating. The key worry that remained with me is that we’re really dependent on vendors to make the open source model work: if they don’t enable tech features, such as data APIs, or if they put together obnoxious terms of service, we won’t get the equivalent of the GPL’s freedoms in the networked services world. It’s not clear how to counteract this – Aaron is bullish about best practices and the example set by services such as Identi.ca – but at least, thanks to Aaron and SFLC, we have an accurate sense of the challenges.

2 Responses to “Open Source and Cloud Computing”

  1. Thanks for this, Derek. If you have time, you may want to tune into the Berkman luncheon today: Caught in the Cloud: Privacy, Encryption, and Government Back Doors in the Web 2.0 Era.

  2. Sounds like a fascinating talk, Derek — thanks a million for posting your summary! My recollection (which I am unable to back up with a citation, sorry; so take it for what it’s worth) is that Richard Stallman has argued against using cloud-based services for some of the very reasons you identify, namely that in doing so, you give up some of the control over your own information that is the central philosophical concern of the “free software” branch of the F/OSS movement. Myself, I’m so dependent on Gmail now that it would take a major catastrophe or security breach to make me revert to local e-mail storage, but it’s good to keep in mind the tradeoffs that my dependency entails.