Internet Censorship and Control

19-Jun-13

The Internet is and has always been a space where participants battle for control. The two core protocols that define the Internet – TCP and IP – are both designed to allow separate networks to connect to each other easily, so that networks that differ not only in hardware implementation (wired vs. satellite vs. radio networks) but also in their politics of control (consumer vs. research vs. military networks) can interoperate easily. It is a feature of the Internet, not a bug, that China – with its extensive, explicit censorship infrastructure – can interact with the rest of the Internet.

I’m proud to announce today the release of an open access collection of five peer-reviewed papers on the topic of Internet Censorship and Control. These papers appear in the May issue of IEEE Internet Computing magazine, but today we also make them available as an open access collection. The collection was edited by Steven Murdoch and me.

The topics of the papers include a broad look at information controls, censorship of microblogs in China, new modes of online censorship, the balance of power in Internet governance, and control in the certificate authority model. These papers make it clear that there is no global consensus on what mechanisms of control are best suited for managing conflicts on the Internet, just as there is none for other fields of human endeavour. That said, there is optimism that with vigilance and continuing efforts to maintain transparency the Internet can remain a force for increasing freedom rather than a tool for more efficient repression.

Our Circumvention Research Does Not Support SOPA

22-Dec-11

Daniel Castro of the Information Technology & Innovation Foundation recently published a paper supporting the Stop Online Piracy Act (SOPA) currently being debated in Congress. In that report, he claims that research performed by us supports the domain name system (DNS) filtering mechanisms mandated by SOPA. This claim is a distortion of our work. We disagree with the use of our study to make the point that DNS-based Internet filtering works and that we should therefore use it as a means of stopping websites from distributing copyrighted content. The data we collected answer a completely different set of questions in a completely different context.

Among other provisions that seek to control the sharing of copyrighted material on the Internet, SOPA, if enacted, would call upon the U.S. government to require that Internet service providers remove from their DNS servers the names of any sites that either infringe copyright directly or merely “facilitate” copyright infringement. So, for example, the government could require that ISPs remove the name “twitter.com” from their DNS servers if twitter.com was not being sufficiently aggressive in preventing its users from tweeting information about places to download copyrighted materials. This practice is known as DNS filtering. DNS filtering is one of the most common modes of Internet-based censorship. As we and our collaborators in the OpenNet Initiative have shown over the past decade, practices of this sort are used extensively in autocratic countries, including China and Iran, to prevent access to a range of sites offensive to the governments of those countries.

Opponents of SOPA have argued that the DNS filtering, even though it will have a number of harmful effects on the technical and political structure of the Internet, will not be effective in preventing users from accessing the blocked sites. Mr. Castro cites our research as evidence that SOPA’s mandate to filter DNS will be effective. He quotes our finding that at most 3% of users in certain countries that substantially filter the Internet use circumvention tools and asserts that “presumably the desire for access to essential political, historical, and cultural information is at least equal to, if not significantly stronger than, the desire to watch a movie without paying for it. Yet only a small fraction of Internet users employ circumvention tools to access blocked information, in part because many users simply lack the skills or desire to find, learn and use these tools.”

In our report, we looked at three sets of censorship circumvention tools: complex, client-based tools like Tor; paid VPNs; and web proxies. We estimated usage of those three classes of tools. We used reports from the client tool developers, a survey of VPN operators to gather usage data, and data from Google Analytics to estimate usage of web proxy tools. Counting all three classes of tools, we estimated as many as 19 million users a month of circumvention tools. Given the large number of users in China, Iran, Saudi Arabia and other states where filtering is endemic, this represents a fairly small percentage of Internet users in those countries; 19 million people represents about 3% of the users in countries where Internet filtering is pervasive. We actually believe that 3% figure is high, as some of the tools we study are used by users in open societies to evade corporate or university firewalls, not just to evade government censorship.
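The relationship between the two figures above can be checked with a line of arithmetic. The sketch below uses only the two numbers quoted in this post; the implied total (roughly 630 million users) is derived here for illustration and is not a figure reported in the study.

```python
# Figures quoted in the post. The implied total is derived, not reported.
circumvention_users = 19_000_000  # upper estimate of monthly users, all three tool classes
share_of_users = 0.03             # "about 3%" of users in pervasively filtered countries

# If 19 million users are about 3% of the base, the base is 19 million / 0.03.
implied_total_users = circumvention_users / share_of_users

print(f"Implied Internet users in filtered countries: {implied_total_users:,.0f}")
```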

We stand behind the findings in our study (with reservations that we detail in the paper), but we disagree with the way that Mr. Castro applies our findings to the SOPA debate. His presumption that people will work as hard or harder to access political content than they do to access entertainment content deeply misunderstands how and why most people use the Internet. Far more users in open societies use the Internet for entertainment than for political purposes; it is unreasonable to assume different behaviors in closed societies. Our research offers the depressing conclusion that comparatively few users are seeking blocked political information and suggests that the governments most successful in blocking political content ensure that entertainment and social media content is widely available online precisely because users get much more upset about blocking the ability to watch movies than they do about blocking specific pieces of political content.

Rather than using circumvention tool usage in closed societies to predict the behavior of a very different userbase, Mr. Castro would do better to consider the massive userbase of tools like BitTorrent clients, which would make for a far cleaner analogy to the problem at hand. Likewise, the long line of very popular peer-to-peer sharing tools that have been incrementally designed to circumvent the technical and political measures used to prevent sharing copyrighted materials is a stronger analogy than our study of users in authoritarian regimes seeking to access political content.

Second, our research has consistently shown that those who really wish to evade Internet filters can do so with relatively little effort. The problem is that these activities can be very dangerous in certain regimes. Even though our research shows that relatively few people in autocratic countries use circumvention tools, this does not mean that circumvention tools are not crucial to the dissident communities in those countries. 19 million people is not large in relation to the population of the Internet, but it is still a large number of people in absolute terms who have freer access to the Internet through the tools. We personally know many people in autocratic countries for whom these tools provide a crucial (though not perfect) layer of security for their activist work. Those people would be at much greater risk than they already are without access to the tools, but in addition to mandating DNS filtering, SOPA would make many circumvention tools illegal. The single biggest funder of circumvention tools has been and remains the U.S. government, precisely because of the role the tools play in online activism. It would be highly counter-productive for the U.S. government to both fund and outlaw the same set of tools.

Finally, our decade-long study of Internet filtering and circumvention has documented the many problems associated with Internet filtering, not its overall effectiveness. DNS filtering is by necessity either overbroad or underbroad; it either blocks too much or too little. Content on the Internet changes its place and nature rapidly, and DNS filtering is ineffective when it comes to keeping up with it. Worse, especially from a First Amendment perspective, DNS filtering ends up blocking access to enormous amounts of perfectly lawful information. We strongly resist the claim that our research, and that of our collaborators, makes the case in favor of DNS-based Internet filtering.

Links:

Mr. Castro’s report may be found here:

http://www.itif.org/publications/pipasopa-responding-critics-and-finding-path-forward

with the reference to our work on p. 8.

The study that is being misused by Mr. Castro is here:

http://cyber.law.harvard.edu/publications/2010/Circumvention_Tool_Usage

The findings of our decade-long studies are documented in three books, published by MIT Press and available freely online in their entirety at:

http://access.opennet.net/

- John Palfrey, Jillian York, Rob Faris, Ethan Zuckerman, and Hal Roberts

Local Control: About 95% of Chinese Web Traffic is Local

15-Aug-11

While exploring the structure of national networks through our Mapping Local Internet Control project, we decided to combine our national network data with Google’s AdPlanner data to estimate the overall locality of web site traffic in individual countries. The most interesting result so far is that we estimate that 96% of all page views in China are to web sites hosted within China. This is a very interesting finding because of its implications for how to understand Internet control in China.

There are lots of ways to control the Internet, including blocking local users from viewing objectionable remote content, flooding or hacking objectionable sites, and monitoring the Internet usage of activists. But in many cases the most effective forms of Internet control are offline — threatening, fining, arresting, or killing activists because of their activity online. These forms of control are especially effective against content that is hosted within a country. There is no need to launch a DDoS attack against a dissident site that is hosted within the offended country when agents of the offended country can simply knock on the door of either the individual activist publishing the content or of the hosting provider that is hosting the objectionable content and use traditional methods of the state (fines, closing of businesses, jail) to control the content.

The extremely high proportion of local web traffic in China may be the result of the success of the Chinese government in blocking the international sites, like Facebook, YouTube, and Blogger, that are generally the biggest destinations in other countries. Or it might be because Chinese people like to read content written in Chinese by other Chinese about Chinese topics on sites run by Chinese people. It is likely some combination of the two factors. But the end result is the same. The most direct battleground in the fight over control of the Internet in China is local: it’s happening on the local Chinese services that are the source of almost all Chinese web traffic but are required to censor content by the government.

Methods:

To generate this number, we took the existing database of countries to autonomous systems to IP address blocks from our Mapping Local Internet Control project (documented here) and combined it with Google AdPlanner’s list of the number of page views of the most popular 250 sites in China. We combined the datasets by looking up the IP address of each of the sites in the AdPlanner 250, looking up the autonomous system of each IP address in our database, and then looking up the country of registration for each of those autonomous systems. We then took the resulting list of the AdPlanner 250 sites and countries and computed the web locality number by dividing the total number of page views for sites hosted within the country by the total number of page views for all of the AdPlanner 250 sites. This approach is just an estimate. Some IP addresses may physically route to another country even though they are registered with a local autonomous system. The AdPlanner 250 sites are not necessarily representative of all web traffic. The AdPlanner stats themselves are only estimates, and they do not include numbers for google.com itself.
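The calculation described above can be sketched in a few lines of Python. This is a minimal illustration and not the project's actual code: the lookup tables, site names, addresses, AS numbers, and page view counts below are all made up (documentation-range IP addresses and private-use AS numbers), standing in for the AdPlanner list and the Mapping Local Internet Control database.

```python
# Hypothetical stand-ins for the three lookups: site -> IP, IP -> AS, AS -> country.
SITE_TO_IP = {
    "site-a.example": "192.0.2.10",
    "site-b.example": "198.51.100.20",
    "site-c.example": "203.0.113.30",
}
IP_TO_ASN = {"192.0.2.10": 64512, "198.51.100.20": 64513, "203.0.113.30": 64514}
ASN_TO_COUNTRY = {64512: "CN", 64513: "CN", 64514: "US"}

# Hypothetical page view counts for the top sites in the country being measured.
PAGE_VIEWS = {"site-a.example": 700, "site-b.example": 200, "site-c.example": 100}


def web_locality(country, page_views, site_to_ip, ip_to_asn, asn_to_country):
    """Share of page views going to sites whose AS is registered in `country`."""
    local_views = 0
    total_views = 0
    for site, views in page_views.items():
        total_views += views
        ip = site_to_ip.get(site)           # step 1: site -> IP address
        asn = ip_to_asn.get(ip)             # step 2: IP address -> autonomous system
        if asn_to_country.get(asn) == country:  # step 3: AS -> country of registration
            local_views += views
    return local_views / total_views if total_views else 0.0


locality = web_locality("CN", PAGE_VIEWS, SITE_TO_IP, IP_TO_ASN, ASN_TO_COUNTRY)
print(f"Estimated web locality: {locality:.0%}")
```

Sites that fail any lookup are counted in the total but not as local, which is one possible convention; the post does not say how such cases were handled.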

Tor and Journalism Vulnerabilities

27-Apr-11

I was recently quoted in a story in the New Scientist about a new attack on Tor. The quote was a combination of somewhat sloppy wording on my part and a lack of context on the reporter’s part, so I’d like to provide context and more precise wording here. The quote is:

“There are lots of vulnerabilities in Tor, and Tor has always been open about the various vulnerabilities in its system,” says Hal Roberts at Harvard University, who studies censorship and privacy technologies. “Tor is far from perfect but better than anything else widely available.”

The basic idea of the attack described in the article is to use a rogue Tor exit node to insert an address owned by the attacker into a BitTorrent stream to fool the client into connecting to that address via UDP, which is not anonymized by Tor. So when the BitTorrent client connects to the UDP address, the attacker can discover the user’s real IP address. This sort of attack on Tor is well known; the paper’s authors call it a ‘bad apple’ attack. Tor’s core job is just to provide a secure TCP tunnel, but most real world applications do much more than just communicating via a single TCP connection. For example, in addition to HTTP requests for web pages, web browsers make DNS requests to look up host names, so any end user packaging of Tor has to make sure that DNS lookups happen over the Tor tunnel (as does TorButton). Tor does not ultimately control the applications that use its tunnels but relies on those applications to use its TCP tunnel exclusively to maintain the privacy of the user.

Tor’s conundrum is that at the end of the day what end users need is anonymous communications through applications, not secure TCP tunnels. So even though Tor can’t be responsible for making every application in existence behave nicely with it, to be actually useful it has to take some responsibility for the most common end user applications. To this end, Tor works closely with the Firefox developers to make Firefox work as well as possible with Tor, and Tor and associated folks have invested lots of effort into tools that improve the interface between the browser, the user, and Tor. But there’s only so much that Tor can do here in the world of all applications.

These attacks might not be considered ‘vulnerabilities in Tor’, as I say above, so I should have been more careful with my language (though most folks who do these press interviews struggle with the danger of any given sentence out of an hour long conversation not having precise language that can stand out of context of the rest of the conversation). But the basic point remains — there are lots of ways to break through the privacy of Tor as it is used in the real world, and Tor has been completely open about those in an effort to educate its user base and provide ‘open research questions’ (Roger Dingledine’s favorite phrase!) for its developer community. Roger’s response to the specific BitTorrent problem is simply to tell Tor users not to use BitTorrent over Tor because there’s no way that Tor itself can fix all of the broken BitTorrent clients in the world, but one of the core findings of the above paper is that lots of people do use BitTorrent clients over Tor. So that’s a really hard problem.

The attack described in the paper has a second component that is more directly a vulnerability of Tor than a ‘bad apple’ application attack. The second component is that Tor does not create a new circuit of nodes for every connection, but instead re-uses the same circuit for several connections from the same client to improve performance. This behavior makes it possible to identify the origin IP address of not just the one ‘bad apple’ connection (the BitTorrent connection in the paper’s attack) but also the origin IP address of other current connections by the same user. So a user who is using BitTorrent and browsing the web at the same time exposes not just her BitTorrent activities but also her web browsing activities to the attacker (the paper’s authors say ‘one bad apple spoils the bunch’).
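The effect of circuit reuse can be shown with a toy model. The sketch below is not Tor's code and does not model the attack itself; it only illustrates the bookkeeping: if several streams share one circuit and one of those streams leaks the user's address, every other stream on that circuit can be attributed to the same user. The class, function, and host names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Circuit:
    """A toy circuit carrying several (application, destination) streams."""
    circuit_id: int
    streams: list = field(default_factory=list)


def linked_streams(circuits, leaking_app):
    """Return every stream sharing a circuit with a stream from the leaking application."""
    exposed = []
    for circuit in circuits:
        if any(app == leaking_app for app, _ in circuit.streams):
            # One de-anonymized stream ties the whole circuit to the same origin.
            exposed.extend(circuit.streams)
    return exposed


circuits = [
    Circuit(1, [("bittorrent", "tracker.example"), ("browser", "news.example")]),
    Circuit(2, [("browser", "mail.example")]),
]

for app, destination in linked_streams(circuits, "bittorrent"):
    print(f"exposed: {app} -> {destination}")
```

In this example the browser stream on circuit 1 is exposed along with the BitTorrent stream, while the browser stream on circuit 2 is not.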

This attack can be more traditionally described as a ‘vulnerability in Tor.’ Claiming ‘lots’ of these is sloppy language, but there is certainly a whole class of timing / tagging attacks that allow an attacker who has control of an entry and an exit node to identify users (and I think the risk of these attacks is more than theoretical in a world in which one ISP in China controls about 63% of the country’s IP addresses).

So to return to the quote and story, I spoke to the author of the piece for about an hour, most of which I spent trying to convince him not to write a ‘TOR IS BROKEN!’ piece that hyped this attack as the one, new chink in Tor’s otherwise pristine armor. I walked through the above, trying to explain that Tor is intended to do a single specific thing (anonymize communication through a TCP tunnel) but that there are various attacks that exploit the layer between Tor and the applications that use it. And there are also attacks like the circuit association described above that are more properly vulnerabilities in Tor itself. But many examples of both of these sorts of attacks have been around for as long as Tor has been around, and Tor has been very vocal about them.

I was trying (unsuccessfully!) to steer the reporter toward explaining the vulnerability as an example of how important it is that users understand that even a project like Tor that is very strongly focused on anonymity over other properties can’t provide perfect privacy for its users, that there are some things it does well but not perfectly (setting up anonymous TCP tunnels) and other things it does not do as well (automagically making any application using Tor anonymous). To borrow Roger’s favorite phrase, how to explain complex social / technical issues like this one to reporters is still an open research question for which I’m eager to hear solutions!

Update: The reporter who wrote the article reminded me nicely that the only contact he had with me for this article was a single email exchange, so evidently I made up the long conversation with the reporter in my mind. In my defense, I give a lot of interviews on circumvention related topics, and I can actually still (falsely!) remember standing in my house having this call with the reporter.

Independent Media Sites in Belarus Reportedly Hijacked During Election

19-Dec-10

Belarus is holding an election today. This election is particularly important because Aleksandr G. Lukashenko, sometimes referred to as the ‘last dictator of Europe,’ has allowed a fair degree of freedom throughout the campaign, including giving free airtime on national TV to opposition candidates, during which they were allowed to criticize him without censorship.

However, it appears that Belarus is continuing in its mixed record of allowing free access to opposition Internet sites during elections. I am getting reports from a digital activist whom I trust of DDoS attacks against a number of sites, which is common during times of crisis in authoritarian countries. I can verify that the following sites have been inaccessible at times this morning: charter97.org, belaruspartisan.org, ucpb.org. He is also reporting that international connections to ports 443 and 465 are being blocked, which will prevent users from securely posting content to international sites like facebook and twitter and from sending mail through international carriers like gmail (the blocking is apparently for all international sites, though, not just ones that may be offensive to the government).
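A report of blocked ports like the one above can be spot-checked from inside the affected network with a simple TCP connection test. The sketch below is a minimal example using Python's standard `socket` module; the function name and the `connect` parameter are illustrative choices, not part of any tool used for this post. A failed connection can have many causes besides deliberate blocking, so a result like this is a hint rather than proof.

```python
import socket


def check_ports(host, ports, timeout=5.0, connect=socket.create_connection):
    """Try a TCP connection to each port and report which ones succeed.

    `connect` is injectable so the function can be exercised without a network.
    """
    results = {}
    for port in ports:
        try:
            conn = connect((host, port), timeout)
        except OSError:
            # Covers refused connections, timeouts, and unreachable hosts.
            results[port] = False
        else:
            conn.close()
            results[port] = True
    return results
```

For example, `check_ports("example.org", [443, 465])` would test the HTTPS port (443) and the SMTP-over-TLS port (465) mentioned above, with `example.org` standing in for whichever international host is being tested.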

Most interestingly, he reports that BELPAK, the Belarussian national ISP, has been silently redirecting requests from independent media sites to copies of those sites presumably run by pro-government actors, if not the government itself. So when a user requests gazetaby.com, the ISP hijacks the request and instead of returning the requested page returns a redirect for gazetaby.in. The fake site is almost identical to the originally requested site, and as of this post each fake site appears to contain all of the same stories as the original site. Presumably as election day goes on, though, the government will use the fake site to prevent publication of stories that it does not like (by merely not mirroring them onto the fake site). My source observed this behavior repeatedly this morning, but it has since stopped, so requests from within Belarus are currently going to the original sites. This behavior was reported for the following sites, with the following faked mirrors (which can be accessed as confirmation):

original site          fake site
gazetaby.com           gazetaby.in
charter97.org          charter97.in
nn.by                  nnby.in
belaruspartisan.org    belaruspartisan.in
ucpb.org               ucpb.in
euroradio.fm           euroradio.in
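One rough way to notice the kind of redirect described above is to compare the host a user asked for with the host named in the redirect's Location header. The sketch below is a heuristic only, and the function names are hypothetical: plenty of legitimate sites redirect to other domains, so a positive result merely flags a response for a closer look.

```python
from urllib.parse import urlsplit


def _strip_www(host):
    """Lowercase a host name and drop a leading 'www.' for comparison."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def redirect_leaves_site(requested_host, location_header):
    """Return True if a redirect's Location header points at a different host."""
    target_host = urlsplit(location_header).hostname
    if target_host is None:
        # A relative Location stays on the requested host.
        return False
    return _strip_www(target_host) != _strip_www(requested_host)


print(redirect_leaves_site("gazetaby.com", "http://gazetaby.in/"))
print(redirect_leaves_site("gazetaby.com", "http://www.gazetaby.com/news"))
```

The first call reports a redirect away from the requested site and the second does not.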

Here’s a zip file of screenshots of each of the above sites, in case the fake sites are taken down.

I cannot verify that this activity was or is happening, but the mere presence of the mirrored sites under almost identical names is strong evidence of bad behavior by someone. My source is working directly with many of the sites listed above and so can verify that those mirrored sites are not being run by the site owners (running such mirrored sites under similar domain names is a very common form of DDoS resistance).

This practice of using a complex combination of different methods for controlling the Internet, particularly during times of crisis like an election or a protest, is very common (we will shortly release a report on DDoS attacks against independent media which includes the finding that independent media sites often suffer from a range of different types of control rather than just filtering, just DDoS, just hijacking, etc). Note above that several of the sites that have been subject to the hijacking described above have also been DDoS’d. It may or may not be the case that the actors DDoS’ing the sites are the same as the ones hijacking them (the hijacking is almost certainly the work of BELPAK, since they are the only ones with the ability to hijack requests as described above).

Update 2010-12-19:

All of the mirrors above are hosted on IP addresses owned by BELPAK:

gazetaby.in has address 194.158.211.74
nnby.in has address 194.158.214.60
charter97.in has address 194.158.214.58
bchdd.in has address 194.158.214.58
belaruspartisan.in has address 194.226.121.242
euroradio.in has address 194.158.211.74
ucpb.in has address 194.158.214.60
svaboda.in has address 194.158.194.2

This doesn’t necessarily mean that BELPAK itself is directly hosting the sites — it just means that BELPAK or one of its customers is hosting the mirrors sites within its network. Nonetheless, this is further evidence of bad behavior.
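The address list above also shows that several mirrors share a host. The short sketch below groups the mirror domains by the addresses listed in this update; it uses only that data, and the function name is just an illustrative choice.

```python
from collections import defaultdict

# Mirror domains and addresses exactly as listed in the update above.
MIRROR_ADDRESSES = {
    "gazetaby.in": "194.158.211.74",
    "nnby.in": "194.158.214.60",
    "charter97.in": "194.158.214.58",
    "bchdd.in": "194.158.214.58",
    "belaruspartisan.in": "194.226.121.242",
    "euroradio.in": "194.158.211.74",
    "ucpb.in": "194.158.214.60",
    "svaboda.in": "194.158.194.2",
}


def group_by_address(addresses):
    """Map each IP address to the sorted list of domains that resolve to it."""
    groups = defaultdict(list)
    for domain, ip in addresses.items():
        groups[ip].append(domain)
    return {ip: sorted(domains) for ip, domains in groups.items()}


groups = group_by_address(MIRROR_ADDRESSES)
shared = {ip: domains for ip, domains in groups.items() if len(domains) > 1}

for ip, domains in shared.items():
    print(ip, ", ".join(domains))
```

The eight mirrors sit on five addresses, and three of those addresses each serve two mirrors of unrelated original sites, which is consistent with a single operator running them.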

Update 2010-12-21:

Radio Free Europe / Radio Liberty is reporting that one of the site mirrors changed the location of a protest (presumably to misdirect protesters).

Amazon’s Wikileaks Takedown

03-Dec-10

For the past year, I’ve been working on a study on distributed denial of service (ddos) attacks against independent media and human rights sites with colleagues at the Berkman Center. The resulting report will be out shortly, but one of the main conclusions is that independent media sites are not capable of independently defending themselves against large, network based ddos attacks. There are many things an independent site can do to protect itself against smaller ddos attacks that target specific application vulnerabilities (including simply serving static content), but the problem with a large, network based attack is that it will flood the link between the targeted site and the rest of the Internet, usually causing the hosting ISP to take the targeted site down entirely to protect the rest of its network.

Defending against these large network attacks requires massive amounts of bandwidth, specific and deep technical experience, and often connections to the folks running the networks where the attacks are originating from. There are only a couple dozen organizations (ISPs, hypergiant websites, and content distribution networks) at the core of the Internet who have sufficient amounts of bandwidth, technical ability, and community connections to fight off the biggest of these attacks. Paying for services from those organizations is very expensive, though, starting at thousands of dollars per month without bandwidth costs and often going much, much higher. An alternative is to use one of a handful of hosting services like blogger that offer a high level of ddos protection at no financial cost. One of the recommendations we make in our report is for independent media sites that think they are likely to be attacked and want to be able to defend themselves to either find the resources to pay for a ddos protection service or accept the compromises of hosting on a service like blogger in return for the free ddos protection.

We make this recommendation with a great deal of caution, however, because moving independent media sites to these core network actors trades more freedom from ddos attacks for more control by one of these large companies. It’s great to be able to withstand a 10Gbps ddos attack on youtube, but it’s not so great for youtube to take down your video at its sole discretion for violation of its terms of service. In general, these core companies have struggled in this genuinely difficult role. How is youtube supposed to judge what to do when it receives complaints about a violent video in Arabic posted from Egypt? Do videos of police brutality qualify as the ‘graphic or gratuitous violence’ that youtube disallows in its terms of service?

So with this context, I’ve been watching the Wikileaks attack with great interest. It has been suffering a pretty big network attack (Wikileaks claims about 10Gbps, which is big enough to take down all but a couple dozen or fewer ISPs in the world; Arbor claims about 2-4 Gbps, which is still big enough to cause the vast majority of ISPs in the world major disruption). The attack successfully took its site offline at its main hosting ISP. Wikileaks’ textbook response was to move to Amazon’s web services, one of those core Internet services capable of defending against big network attacks.

The move seemed to work for a couple of days, but then Amazon exercised its control, shutting the site down. Joe Lieberman claimed responsibility for Amazon’s decision to take the site down. But Amazon responded with a message claiming that it made the decision to take the site down purely on its own, based on its terms of service. The core of their argument is that Wikileaks was hosting content that it did not own and that it was putting human rights workers at risk:

for example, our terms of service state that “you represent and warrant that you own or otherwise control all of the rights to the content… that use of the content you supply does not violate this policy and will not cause injury to any person or entity.” It’s clear that WikiLeaks doesn’t own or otherwise control all the rights to this classified content. Further, it is not credible that the extraordinary volume of 250,000 classified documents that WikiLeaks is publishing could have been carefully redacted in such a way as to ensure that they weren’t putting innocent people in jeopardy. Human rights organizations have in fact written to WikiLeaks asking them to exercise caution and not release the names or identities of human rights defenders who might be persecuted by their governments.

If this is really how they made their decision, this is a worse process than merely succumbing to the political pressure of the US government. At least Lieberman is an elected official and therefore to some degree beholden to his constituents. Amazon is instead arguing dismissively that it made the decision based on its own interpretation of its terms of service. Without getting into the merits of either side, the questions of whether Wikileaks has the rights to the content and especially of what level of risk of harm merits censorship are very, very difficult and should clearly be decided by some sort of deliberative jurisprudence rather than arbitrarily and dismissively decided by a private actor.

This need for careful, structured, and public deliberation on these questions is obviously balanced by Amazon’s right to decide what to do with its own property. But as a society, we have reached a place where the only way to protect some sorts of speech on the Internet is through one of only a couple dozen core Internet organizations. Totally ceding decisions about control of politically sensitive speech to that handful of actors, without any legal process or oversight, is a bad idea. The problem is that an even worse option is to cede these decisions about what content gets to stay up to the owners of the botnets capable of executing large ddos attacks.

Filtering and Circumvention in Iran

17-Jun-09

Here’s a guest post I wrote yesterday for the MIT Technology Review about filtering and circumvention during the protests in Iran.

China Bans the Letter ‘F’

12-Jun-09

China recently mandated that Green Dam, a client side application that filters pornography and political content, be installed on all computers manufactured in China starting July 1, 2009. One way the application blocks access to sites is to kill the browser window when it tries to visit an offending site. The above video demonstrates that the application is so poorly designed that it can end up killing the browser window every time the user types ‘F’ as the first letter in the location window.

downloadable version of the video

What’s happening in the video is that GD fails to block falundafa.org the first time it is loaded, so ‘falundafa.org’ gets into the history of the browser. Eventually GD recognizes the offensive content on the site and kills the whole browser after briefly flashing a ‘you have been filtered’ image. Any time GD flags a site as politically offensive, that url gets entered into a list of urls to trigger a kill-block whenever it is entered into the location bar or window. But that auto-kill-block applies to text brought up in the auto-complete list as well as text in the entry box proper. Since falundafa.org is in both the auto-complete list and in the auto-kill-block list, every time the user brings up the location window and types in ‘f’ to start a url, the location window and current tab are instantly killed.

Presto, China has banned the letter ‘f’!
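The interaction described above can be reproduced with a toy model. The sketch below is not Green Dam's implementation: the function names and the prefix-matching rule for auto-complete are assumptions made for illustration. It only shows why checking the auto-complete suggestions against the kill list, and not just the typed text, makes a single keystroke fatal.

```python
def autocomplete(history, typed):
    """Suggest urls from the browser history that start with the typed text."""
    return [url for url in history if url.startswith(typed)]


def browser_killed(typed, history, kill_list):
    """Return True if the typed text or any auto-complete suggestion is on the kill list."""
    visible_text = [typed] + autocomplete(history, typed)
    return any(text in kill_list for text in visible_text)


history = ["falundafa.org", "example.org"]  # the blocked site slipped into history once
kill_list = {"falundafa.org"}               # the filter later flagged it

print(browser_killed("f", history, kill_list))  # suggestion 'falundafa.org' matches
print(browser_killed("e", history, kill_list))  # only 'example.org' is suggested
```

Typing ‘f’ trips the kill because the suggestion list contains the flagged url, even though the user has typed only one letter.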

Full details in our just published ONI report on the tool.

Update: replaced youtube version with local version to support folks blocked from youtube.

Grey Surveillance Talk

20-Jan-09

Ethan Zuckerman wrote up a talk I recently gave on looking at Google’s AdWords as a network of grey surveillance.

Viral Conversations Updates Reviews Policy

20-Jan-09

Viral Conversations has updated its FAQ to no longer suggest that companies let reviewers keep products and to suggest that reviewers disclose within reviews if they are allowed to keep the products they are reviewing. In a previous post, I pointed out that the site’s policy of encouraging companies to gift reviewed products to reviewers and of not encouraging users to disclose those gifts was encouraging fraudulent reviews. The relevant sections of the FAQ now read:

Do I Have to Let the Bloggers Keep The Item?

No, you don’t have to let the bloggers keep the item, in the end it’s up to you. It’s going to depend on a number of factors such as cost and shipping difficulty. Letting the bloggers keep a $25 coffee maker is probably a no brainer, but you may feel a little differently about an $1500 espresso machine. Be as clear as possible in the beginning to avoid any confusion. Additionally if you are letting bloggers keep the item, expect that to be disclosed in the review.

Do I need a Disclaimer on My Post?

We really recommend you do it to be upfront and honest with your readers. It could be something as simple as “The John Smith Camera Company sent me their new ABC-123 DLSR camera to review”. If you do a lot of reviews on your website a more formal review policy should be something you should look into. If you are keeping the item, you should disclose that in your review.

Do I Get to Keep The Product I am Reviewing?

That’s going to vary from offer to offer. Sometimes you will sometimes you won’t. That information should be communicated to you before hand. If you do keep the item you are responsible for any tax liabilities that are incurred. If you do keep the item we recommend that you disclose that fact in your review.

These changes bring the site into line with the practices of mainstream media. There are still inherent biases in letting companies choose which bloggers will review their items and in not publishing negative reviews, as strongly suggested by the FAQ. But those practices closely parallel the practices of mainstream publications that vie for the advertising dollars of the same companies whose products they are reviewing and that avoid publishing negative reviews.