While exploring the structure of national networks through our Mapping Local Internet Control project, we decided to combine our national network data with Google’s AdPlanner data to estimate the overall locality of web site traffic in individual countries. The most interesting result so far is our estimate that 96% of all page views in China are to web sites hosted within China. The finding matters because of what it implies about how Internet control works in China.
There are lots of ways to control the Internet, including blocking local users from viewing objectionable remote content, flooding or hacking objectionable sites, and monitoring the Internet usage of activists. But in many cases the most effective forms of Internet control are offline: threatening, fining, arresting, or killing activists because of their activity online. These forms of control are especially effective against content that is hosted within a country. There is no need to launch a DDoS attack against a dissident site hosted within the offended country when agents of that country can simply knock on the door of the activist publishing the content, or of the provider hosting it, and use the traditional methods of the state (fines, closing of businesses, jail) to control the content.
The extremely high proportion of local web traffic in China may be the result of the Chinese government's success in blocking the international sites, like Facebook, YouTube, and Blogger, that are generally the biggest destinations in other countries. Or it might be because Chinese people like to read content written in Chinese by other Chinese people about Chinese topics, on sites run by Chinese people. It is likely some combination of the two factors. But the end result is the same. The most direct battleground in the fight over control of the Internet in China is local: it’s happening on the local Chinese services that are the source of almost all Chinese web traffic but are required by the government to censor content.
To generate this number, we took the existing database mapping countries to autonomous systems to IP address blocks from our Mapping Local Internet Control project (documented here) and combined it with Google AdPlanner’s list of page view counts for the 250 most popular sites in China. We combined the datasets by looking up the IP address of each of the sites in the AdPlanner 250, looking up the autonomous system of each IP address in our database, and then looking up the country of registration for each of those autonomous systems. We then took the resulting list of AdPlanner 250 sites and countries and computed the web locality number by dividing the total number of page views for sites hosted within the country by the total number of page views for all of the AdPlanner 250 sites. This approach is just an estimate. Some IP addresses may physically route to another country even though they are registered with a local autonomous system. The AdPlanner 250 sites are not necessarily representative of all web traffic. The AdPlanner stats themselves are only estimates, and they do not include numbers for google.com itself.
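The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code we actually ran: the lookup tables, site names, IP addresses, and page view counts below are made-up placeholders, and the function names `lookup_asn` and `web_locality` are hypothetical. In practice each site's IP address would first be resolved (for example with `socket.gethostbyname`), a step left out here so the sketch runs offline.

```python
import ipaddress

# Hypothetical sample data. The real project database and the AdPlanner
# figures are not reproduced here; these values are placeholders.
PREFIX_TO_ASN = {
    "203.0.113.0/24": 64500,
    "198.51.100.0/24": 64501,
}
ASN_TO_COUNTRY = {64500: "CN", 64501: "US"}

# (site name, already-resolved IP address, estimated page views)
SITES = [
    ("example-a.test", "203.0.113.10", 900),
    ("example-b.test", "198.51.100.7", 100),
]


def lookup_asn(ip, prefix_to_asn):
    """Return the ASN of the most specific prefix containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    best = None  # (prefix length, asn) of the longest match so far
    for prefix, asn in prefix_to_asn.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, asn)
    return best[1] if best else None


def web_locality(sites, prefix_to_asn, asn_to_country, country):
    """Share of page views going to sites whose ASN is registered in country."""
    local = 0
    total = 0
    for _name, ip, views in sites:
        total += views
        asn = lookup_asn(ip, prefix_to_asn)
        # Assumption: sites with no known ASN count as non-local.
        if asn_to_country.get(asn) == country:
            local += views
    return local / total if total else 0.0


locality = web_locality(SITES, PREFIX_TO_ASN, ASN_TO_COUNTRY, "CN")
print(f"{locality:.0%}")  # prints 90% for the placeholder data above
```

Counting sites with no known autonomous system as non-local is a choice made for this sketch. It pushes the estimate down rather than up, and a different rule (such as dropping those sites from both totals) would give a different number.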