For a long time, the only free source of data about site traffic online has been the Alexa Top Sites list, but the data for the Alexa list is based on the very skewed sample of folks who run the Alexa toolbar, and who the heck runs the Alexa toolbar these days? When I’ve needed data about the most popular sites in a country, I’ve had to use the Alexa data, but only while holding my nose, knowing that the data at best represents a wild guess. There have been better sources of data, but they were all closed, expensive, and generally collected in at least mildly sketchy ways.
Google’s ad planner tool moves dramatically toward filling this big hole in public knowledge about web site traffic. To try it out, visit the above url and click on the ‘Begin Research’ button.
The ad planner tool is:
a free media planning tool that can help you identify websites your audience is likely to visit so you can make better-informed advertising decisions.
With Google Ad Planner, you can:
* Define audiences by demographics and interests.
* Search for websites relevant to your audience.
* Access aggregated statistics on the number of unique visitors, page views, and other data for millions of websites from over 40 countries.
* Create lists of websites where you’d like to advertise and store them in a media plan.
* Generate aggregated website statistics for your media plan.
What the tool actually does is provide a list of total traffic numbers for the 250 most visited sites that meet a number of different demographic queries, including by country and by site type. This lets you, for instance, find out the 250 most visited sites in India, along with the total traffic and number of unique visitors for each site. Or the 250 sites most visited by women. Or by women between 25 and 34. Or by women between 25 and 34 who make more than $150,000 a year:
It’s hard to overstate the power of this tool and the orders of magnitude improvement it is over the Alexa data. You can filter the data by category (newspapers, liberal blogs, flower stores, etc., though the categories seem very poorly assigned). You can choose just sites that allow advertising or all sites (note that the tool shows just advertising sites by default). You can choose sites visited by users who have visited some other site. Or sites visited by users who have searched for some word.
Did you know that the New York Times has nearly twice as many visitors (21 million) as the next closest newspaper, the Washington Post (11 million)? That the Washington Post has half again as much traffic as the next newspaper? That the Huffington Post has basically as much traffic (6.8 million) as every newspaper but the New York Times and the Washington Post? That Daily Kos is less than a quarter of the size of the Huffington Post? That unlike in any of the other 20+ included countries, only 2 of the top 25 sites in China are U.S.-hosted sites (Yahoo at #8 and Microsoft at #23)?
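For anyone who wants to check my arithmetic, here is a quick sketch of the comparisons above, using only the three figures quoted in this post. It assumes all three numbers measure the same thing (visitors over the same period), which is my reading of the tool rather than something I have verified. The third-newspaper figure and the Daily Kos ceiling are implied by the ratios, not numbers read from the tool.

```python
# Visitor figures quoted in the post, in millions.
visitors = {
    "New York Times": 21.0,
    "Washington Post": 11.0,
    "Huffington Post": 6.8,
}

# "Twice as many visitors": the actual ratio is a bit under 2.
nyt_vs_wapo = visitors["New York Times"] / visitors["Washington Post"]

# "Half again as much traffic as the next newspaper" implies the
# third-ranked newspaper sits near WaPo / 1.5 (an inferred figure).
implied_third_newspaper = visitors["Washington Post"] / 1.5

# "Less than a quarter of the size" puts an upper bound on Daily Kos
# (again inferred from the ratio, not read from the tool).
daily_kos_upper_bound = visitors["Huffington Post"] / 4

print(f"NYT / WaPo ratio: {nyt_vs_wapo:.2f}")
print(f"Implied third newspaper: about {implied_third_newspaper:.1f} million")
print(f"Daily Kos: under {daily_kos_upper_bound:.1f} million")
```

The implied third-newspaper figure lands close to the Huffington Post number, which is what makes the "basically as much traffic" comparison hold.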
And the data used for the tool is the Good Stuff:
How is the data in Google Ad Planner generated?
Google Ad Planner combines information from a variety of sources, such as aggregated Google search data, opt-in anonymous Google Analytics data, opt-in external consumer panel data, and other third-party market research. The data is aggregated over millions of users and powered by computer algorithms; it doesn’t contain personally-identifiable information.
In other words, they use all of the very expensive, somewhat-to-very privacy-questionable methods that we privacy-interested folks worry about. They tap into their own extensive search logs, the even more extensive data from the AdWords system, the extensive data from their analytics tool, and “market research” companies that install spyware that is difficult to distinguish from malware.
But hey, now at least we get the data.
What’s fascinating about this tool is that it’s a market research tool for folks who want to figure out what list of sites to advertise on. It’s little known because it has not been marketed like Google Trends as a general use tool, even though it is hugely useful as such. In fact, the terms of service explicitly allow only the advertising use: “You may use the Program to choose sites on which to target ads” (oddly, the terms also mandate that “The existence of this Program will be deemed Confidential Information” and must be protected with stringent security safeguards, notwithstanding the publication of the tool by Google). The fantastic power of this tool for monitoring and understanding the Internet, and the wide and deep and invasive methods used to collect the data for the tool, point to the very strong connection between surveillance and advertising. The release of this tool and its data output as an ‘ad planner’ shows that in the world of AdWords, DoubleClick’s use of near universal third party cookies, and Phorm’s tapping of UK Internet connections, advertising has become very difficult to distinguish from surveillance.