Google: Tomorrow’s Silicon (not Crystal) Ball
July 15th, 2009, by Scott Hartley
Silicon Valley has yet to create true forecasting technology, but certain online tools give voyeurs the ability to interpret political events in terms of observable changes over time. In some cases, comparing relative change over time against an expected baseline of activity can reveal deviations with predictive value. Explosive growth in use of the term “SBY” across Internet platforms corroborated what polling said off-line: incumbent President Yudhoyono was bound for re-election. Google Trends data, however, stood in contrast to polling expectations that Jusuf Kalla would lead Megawati for second place. Google, and not polling data, corroborated the actual electoral ordering.
Last Wednesday, on July 8th, Indonesia swiftly completed the second direct democratic presidential election in the country’s history. According to a national polling group, Lembaga Survei Indonesia (LSI), incumbent Susilo Bambang Yudhoyono (popularly referred to as “SBY”) won 60.82 percent of the vote, with the opposing Indonesian Democratic Party of Struggle (PDI-P) candidate Megawati winning 26.57 percent, and the Golkar party candidate Jusuf Kalla taking 12.61 percent. Despite tepid claims that 5.9 million fictitious names had been included among the eligible voters (made by Megawati’s billionaire financial contributor, Hashim Djojohadikusumo), the election took place without incident.
Observation of relative trends over time has been used in many contexts. For example, Raymond Fisman, co-author of Economic Gangsters, observed corruption by monitoring stock prices and news. Under the Suharto regime in Indonesia, insider information about Suharto’s health moved the then-Jakarta Stock Exchange before the news became public. In this case, insider information, driven by concealed concern over political change, facilitated opportunistic buying and selling of stock that, in moving the market price, helped reveal corrupt practices.
Today, Internet users reveal themselves publicly in a variety of ways that collectively paint a picture of preferences and concerns that, if not generally applicable to the populace, are immediately reflective of the online demographic in the region. In Internet ecosystems such as the Netherlands, where 90 percent of the population is online, or the United States, where 72 percent of Americans have access, Internet trends can be more widely extrapolated to indicate the public ethos. In Indonesia, despite its low Internet penetration of roughly 5 percent, the Internet is still a useful tool for observing opinion on important issues as well as regional strongholds of support.
Over the course of the 90 days leading up to the election, top Google queries across Indonesia almost exclusively included references to popular networking sites such as Facebook and Friendster. This suggests that social networking platforms matter a great deal to connected Indonesians. “Facebook Lexicon,” a tool that allows one to observe trends in the terms or topics used in “wall posts” between friends, is therefore relevant for gauging a shifting ethos. Within the Indonesian Facebook demographic (admittedly a small and likely young group), Facebook Lexicon reveals that over the last year there has been significant change in the topics of on-site political discussion.
While Facebook is a networking and discussion platform and its use typically involves conversation, active use of the Google search engine indicates explicit interest. And over the same period, data from Google Insights for Search confirms a swelling Internet interest in incumbent candidate SBY. Since January 1, 2009, Google queries across Indonesia on “SBY” grew by 625 percent, compared with 40 percent on “Mega.” While relative search on “Jusuf Kalla” increased by 1100 percent, his absolute search volume was roughly 90 percent lower than SBY’s.
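The gap between relative growth and absolute volume is easy to see with a little arithmetic. The sketch below is only an illustration: it uses the growth figures quoted above, but the index value of 100 for SBY's final search volume is an arbitrary assumption, as is reading "90 percent lower" as a comparison of final volumes. The helper names are hypothetical, not part of any Google tool.

```python
def percent_change(start: float, end: float) -> float:
    """Percent change from a starting level to an ending level."""
    if start == 0:
        raise ValueError("start must be nonzero")
    return (end - start) / start * 100.0


def implied_start(end: float, growth_pct: float) -> float:
    """Back out the starting level from the ending level and percent growth."""
    return end / (1.0 + growth_pct / 100.0)


# Assumption: SBY's final search volume is normalized to an index of 100.
SBY_END = 100.0
# From the text: Kalla's volume was roughly 90 percent lower than SBY's
# (assumed here to describe the final level).
KALLA_END = SBY_END * (1.0 - 0.90)

# Growth figures quoted in the text: 625 percent (SBY), 1100 percent (Kalla).
sby_start = implied_start(SBY_END, 625.0)
kalla_start = implied_start(KALLA_END, 1100.0)

sby_gain = SBY_END - sby_start
kalla_gain = KALLA_END - kalla_start

print(f"SBY:   {sby_start:.2f} -> {SBY_END:.2f} (gain {sby_gain:.2f})")
print(f"Kalla: {kalla_start:.2f} -> {KALLA_END:.2f} (gain {kalla_gain:.2f})")
```

Under these assumptions, Kalla's larger percentage increase starts from a far smaller base, so his absolute gain in search interest remains a fraction of SBY's.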
Retrospective analysis is always problematic. One must be wary of ex-ante conditions observed ex-post, and of the ease of false attribution. And as was seen in the telephone polls of American voters prior to the 1948 presidential election, mistaking niche trends for shifts in public opinion can yield headlines such as “DEWEY DEFEATS TRUMAN,” perhaps the greatest Chicago Daily Tribune gaffe. But as Google.org and the CDC have shown, aggregated search engine query data can, by observing online health-seeking activity, “accurately estimate the current level of weekly influenza activity in each region of the United States.” In politics, if the frequency of Internet user queries about candidates correlates with the percentage of votes cast, then perhaps this information could also be applied to pre-election statistics, augmenting offline polling data on connected demographics.
Pre-election polling by LSI and other Indonesian polling groups showed SBY leading, followed by Jusuf Kalla. The prevailing belief was that Megawati was entirely out of the running. Pre-election Google search query data pointed to an online reality in which Megawati was still garnering much attention. In fact, nationally, the results for actual votes cast more closely followed the Internet search query data than the domestic polling numbers. Perhaps such an observation is mere coincidence. Or perhaps Internet search analytics is an increasingly important data point to combine with off-line demographic polling. Today the online information-seeking behavior of a geographically diverse sample of connected Indonesians is perhaps illustrative of broader pre-electoral interests. It’s not perfect, and its scope is limited by Internet penetration, but I’m betting that tomorrow’s crystal ball could be made of silicon.
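For readers who want to make the ordering comparison concrete, here is a minimal sketch. It uses only the vote shares reported above and the candidate orderings as described in this post; no actual polling or search figures are included, since none are quoted here. The function name `concordant_fraction` is a hypothetical helper that counts how many candidate pairs a forecast ranks in the same relative order as the result.

```python
from itertools import combinations


def concordant_fraction(predicted, actual):
    """Fraction of candidate pairs placed in the same relative order."""
    pos_pred = {name: i for i, name in enumerate(predicted)}
    pos_act = {name: i for i, name in enumerate(actual)}
    pairs = list(combinations(actual, 2))
    agree = sum(
        (pos_pred[a] < pos_pred[b]) == (pos_act[a] < pos_act[b])
        for a, b in pairs
    )
    return agree / len(pairs)


# Vote shares as reported by LSI in this post.
votes = {"SBY": 60.82, "Megawati": 26.57, "Kalla": 12.61}
actual_order = sorted(votes, key=votes.get, reverse=True)

# Orderings as described in the text (order only, no magnitudes).
polling_order = ["SBY", "Kalla", "Megawati"]
search_order = ["SBY", "Megawati", "Kalla"]

print("polling agreement:", concordant_fraction(polling_order, actual_order))
print("search agreement: ", concordant_fraction(search_order, actual_order))
```

With only three candidates this is a very coarse measure, and a single election cannot distinguish signal from coincidence; it simply restates the comparison in a form that could be repeated across more races.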