Show and Tell: Algorithmic Culture

March 25th, 2014 by Christian

(or, What you need to know about “puppy dog hate”)

(or, “It’s not that I’m uninterested in hygiene…”)

   

Last week I tried to get a group of random sophomores to care about algorithmic culture. I argued that software algorithms are transforming communication and knowledge. The jury is still out on my success at that, but in this post I’ll continue the theme by reviewing the interactive examples I used to make my point. I’m sharing them because they are fun to try. I’m also hoping the excellent readers of this blog can think of a few more.

I’ll call my three examples “puppy dog hate,” “top stories fail,” and “your DoubleClick cookie filling.”  They should highlight the ways in which algorithms online are selecting content for your attention. And ideally they will be good fodder for discussion. Let’s begin:

Three Ways to Demonstrate Algorithmic Culture

(1.) puppy dog hate (Google Instant)

You’ll want to read the instructions fully before trying this. Go to http://www.google.com/ and type “puppy”, then [space], then “dog”, then [space], but don’t hit [Enter].  That means you should have typed “puppy dog ” (with a trailing space). Results should appear without the need to press [Enter]. I got this:

Now repeat the above instructions but instead of “puppy” use the word “bitch” (so: “bitch dog ”).  Right now you’ll get nothing. I got nothing. (The blank area below is intentionally blank.) No matter how many words you type, if one of the words is “bitch” you’ll get no instant results.

What’s happening? Google Instant is the Google service that displays results while you are still typing your query. In the algorithm for Google Instant, it appears that your query is checked against a list of forbidden words. If the query contains one of the forbidden words (like “bitch”) no “instant” results will be shown, but you can still search Google the old-fashioned way by pressing [Enter].

This is an interesting example because it is incredibly mild censorship, and that is typical of algorithmic sorting on the Internet. Things aren’t made to be impossible, some things are just a little harder than others. We can discuss whether or not this actually matters to anyone. After all, you could still search for anything you wanted to, but some searches are made slightly more time-consuming because you will have to press [Enter] and you do not receive real-time feedback as you construct your search query.

It’s also a good example that makes clear how problematic algorithmic censorship can be. The hackers over at 2600 reverse engineered Google Instant’s blacklist (NSFW) and it makes absolutely no sense. The blocked words I tried (like “bitch”) produce perfectly inoffensive search results (sometimes because of other censorship algorithms, like Google SafeSearch). It is not clear to me why they should be blocked. For instance, anatomical terms for some parts of the female anatomy are blocked while other parts of the female anatomy are not blocked.

Some of the blocking is just silly. For instance, “hate” is blocked. This means you can make the Google Instant results disappear by adding “hate” to the end of an otherwise acceptable query. e.g., “puppy dog hate ” will make the search results I got earlier disappear as soon as I type the trailing space. (Remember not to press [Enter].)
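To make the behavior concrete, here is a minimal sketch of the kind of word-list check that Google Instant appears to perform. It is a guess based on the behavior described above, not Google's code. The names `BLOCKED_WORDS`, `completed_words`, and `instant_results_allowed` are hypothetical, the list holds only the two blocked words mentioned in this post, and the rule that a word counts only once a trailing space has been typed is my assumption from the “puppy dog hate ” example.

```python
# Hypothetical sketch of a word-list filter, NOT Google's implementation.
# Only the two blocked words named in this post are included.
BLOCKED_WORDS = {"bitch", "hate"}


def completed_words(query):
    """Return the words the user has finished typing.

    Assumption: a word is "finished" once it is followed by whitespace,
    which matches results vanishing when the trailing space is typed.
    """
    words = query.lower().split()
    if query and not query[-1].isspace():
        words = words[:-1]  # the last word is still being typed
    return words


def instant_results_allowed(query):
    """True if instant results would be shown for this partial query."""
    return not any(word in BLOCKED_WORDS for word in completed_words(query))


print(instant_results_allowed("puppy dog "))       # True
print(instant_results_allowed("puppy dog hate "))  # False
```

Note that nothing here stops the search itself. In this sketch, as in the real service, pressing [Enter] is simply a separate path that this check never touches.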

This is such a simple implementation that it barely qualifies as an algorithm. It also differs from my other examples because it appears that an actual human compiled this list of blocked words. That might be useful to highlight because we typically think that companies like Google do everything with complicated math and not site-by-site or word-by-word rules–they have claimed as much, but this example shows that in fact this crude sort of blacklist censorship still goes on.

Google does censor actual search results (what you get after pressing [Enter]) in a variety of ways but that is a topic for another time. This exercise with Google Instant at least gets us started thinking about algorithms, whose interests they are serving, and whether or not they are doing their job well.

(2.) Top Stories Fail (Facebook)

In this example, you’ll need a Facebook account.  Go to http://www.facebook.com/ and look for the tiny little toggle that appears under the text “News Feed.” This allows you to switch between two different sorting algorithms: the Facebook proprietary EdgeRank algorithm (this is the default), and “most recent.” (On my interface this toggle is in the upper left, but Facebook has multiple user interfaces at any given time and for some people it appears in the center of the page at the top.)

Switch this toggle back and forth and look at how your feed changes.

What’s happening? Okay, we know that among 18-29 year-old Facebook users the median number of friends is now 300. Even given that most people are not over-sharers, with some simple arithmetic it is clear that some of the things posted to Facebook may never be seen by anyone. A status update is certainly unlikely to be seen by anywhere near your entire friend network. Facebook’s “Top Stories” (EdgeRank) algorithm is the solution to the oversupply of status updates and the undersupply of attention to them: it determines what appears on your news feed and how it is sorted.
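The “simple arithmetic” can be sketched as follows. Only the figure of 300 friends comes from the paragraph above. The posting rate and the number of posts a person actually reads per day are made-up illustrative numbers, and `share_of_feed_seen` is a hypothetical helper name.

```python
def share_of_feed_seen(friends, posts_per_friend_per_day, posts_read_per_day):
    """Fraction of friends' daily posts that one reader could possibly see."""
    supply = friends * posts_per_friend_per_day
    if supply == 0:
        return 1.0  # nothing posted, so nothing is missed
    return min(1.0, posts_read_per_day / supply)


# 300 friends is the median cited above; the other two numbers are assumptions.
share = share_of_feed_seen(friends=300, posts_per_friend_per_day=1,
                           posts_read_per_day=60)
print(f"{share:.0%} of posts seen")  # 20% of posts seen
```

Under these assumed numbers, four out of five posts go unread no matter how they are ordered, which is why the choice of sorting algorithm matters so much.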

We know that Facebook’s “Top Stories” sorting algorithm uses a heavy hand. It is quite likely that you have people in your friend network that post to Facebook A LOT but that Facebook has decided to filter out ALL of their posts. These might be called your “silenced Facebook friends.” Sometimes when people do this toggling-the-algorithm exercise they exclaim: “Oh, I forgot that so-and-so was even on Facebook.”

Since we don’t know the exact details of EdgeRank, it isn’t clear exactly how Facebook is deciding which of your friends you should hear from and which should be ignored. Even though the algorithm might be well-constructed, it’s interesting that when I’ve done this toggling exercise with a large group a significant number of people say that Facebook’s algorithm produces a much more interesting list of posts than “Most Recent,” while a significant number of people say the opposite — that Facebook’s algorithm makes their news feed worse. (Personally, I find “Most Recent” produces a far more interesting news feed than “Top Stories.”)

It is an interesting intellectual exercise to try and reverse-engineer Facebook’s EdgeRank on your own by doing this toggling. Why is so-and-so hidden from you? What is it they are doing that Facebook thinks you wouldn’t like? For example, I think that EdgeRank doesn’t work well for me because I select my friends carefully, then I don’t provide much feedback that counts toward EdgeRank after that. So my initial decision about who to friend works better as a sort without further filtering (“most recent”) than Facebook’s decision about what to hide. (In contrast, some people I spoke with will friend anyone, and they do a lot more “liking” than I do.)

What does it mean that your relationship to your friends is mediated by this secret algorithm? A minor note: some people have reported that if you switch to “most recent,” Facebook will eventually switch you back to its “Top Stories” algorithm without asking.

There are deeper things to say about Facebook, but this is enough to start with. Onward. 

(3.) Your DoubleClick Cookie Filling (DoubleClick)

This example will only work if you browse the Web regularly from the same Web browser on the same computer and you have cookies turned on. (That describes most people.) Go to the Google Ads settings page — the URL is a mess so here’s a shortcut: http://bit.ly/uc256google

Look at the right column, headed “Google Ads Across The Web,” then scroll down and look for the section marked “Interests.” The other parts may be interesting too, such as Google’s estimate of your Gender, Age, and the language you speak — all of which may or may not be correct.  Here’s a screen shot:

If you have “interests” listed, click on “Edit” to see a list of topics.

What’s Happening? Google is the largest advertising clearinghouse on the Web. (It bought DoubleClick in 2007 for over $3 billion.) When you visit a Web site that runs Google Ads — this is likely quite common — your visit is noted and a pattern of all of your Web site visits is then compiled and aggregated with other personal information that Google may know about you.

What a big departure from some old media! In comparison, in most states it is illegal to gather a list of books you’ve read at the library because this would reveal too much information about you. Yet for Web sites this data collection is the norm.

This settings page won’t reveal Google’s ad placement algorithm, but it shows you part of the result: a list of the categories that the algorithm is currently using to choose advertising content to display to you. Your attention will be sold to advertisers in these categories and you will see ads that match these categories.

This list is quite volatile and this is linked to the way Google hopes to connect advertisers with people who are interested in a particular topic RIGHT NOW. Unlike demographics that are presumed to change slowly (age) or not to change at all (gender), Google appears to base a lot of its algorithm on your recent browsing history. That means if you browse the Web differently you can change this list fairly quickly (in a matter of days, at least).

Many people find the list uncannily accurate, while some are surprised at how inaccurate it is. Usually it is a mixture. Note that some categories are very specific (“Currency Exchange”), while others are very broad (“Humor”).  Right now it thinks I am interested in 27 things. Some of them are:

  • Standardized & Admissions Tests (Yes.)
  • Roleplaying Games (Yes.)
  • Dishwashers (No.)
  • Dresses (No.)

You can also type in your own interests to save Google the trouble of profiling you.

Again this is an interesting algorithm to speculate about. I’ve been checking this for a few years and I persistently get “Hygiene & Toiletries.” I am insulted by this. It’s not that I’m uninterested in hygiene but I think I am no more interested in hygiene than the average person. I don’t visit any Web sites about hygiene or toiletries. So I’d guess this means… what exactly? I must visit Web sites that are visited by other people who visit sites about hygiene and toiletries. Not a group I really want to be a part of, to be honest.

These were three examples of algorithm-ish activities that I’ve used. Any other ideas? I was thinking of trying something with an item-to-item recommender system but I could not come up with a great example. I tried anonymized vs. normal Web searching to highlight location-specific results but I could not think of a search term that did a great job showing a contrast.  I also tried personalized Twitter trends vs. location-based Twitter trends but the differences were quite subtle. Maybe you can do better.

In a future post I’ll write about how the students reacted to all this.

 

(This was also cross-posted to The Social Media Collective.)

 


Think About New Media Algorithmically

March 20th, 2014 by Christian

(or: How to Explain Yourself to a General Audience of Sophomores)

I recently gave a guest lecture to the University of Michigan sophomore special topics course “22 Ways to Think About New Media.”  This is a course intended for students who have not yet declared a major, where each week a faculty member from a different discipline describes a “way” that they think about “New Media.”  One goal of this is “a richer appreciation of the liberal arts and sciences,” and so I was asked to consider my remarks in the context of questions like: “What is the place of your work in society? What kinds of questions do you ask? How, in short, do you think?”

Wow, that’s a tall order. Explain and defend your field — communication and information studies — to people who have never encountered it before. Tell (for example) an undergraduate interested in chemistry why they should care about your work. And say something interesting about new media. Well, I’ll give it a shot. Here’s a summary of my attempt.

I decided that the way I want people to think about New Media is “algorithmically.” I meant that as a one-word shorthand for “I am interested in algorithms,” or “I think about new media algorithms and try to understand their implications,” and not “I am an algorithm.” (*)

A central question in the study of communication is this one: How do communication and information systems and institutions organize and shape what we know and think? That is, there is a great amount of material that could be watched, read, and heard but of course we each only have time to experience a small fraction of the whole. While we have some freedom to choose what we experience, there are also processes in media systems that shape what music, movies, news, and even conversations we pay attention to. This shaping ultimately helps to determine our shared culture, and new media are now transforming these processes — and therefore our shared culture.

(For instance, Twitter’s algorithms currently think I should pay attention to #NCAAMarchMadness2014 [which is trending]. They tell me this is a recommendation “just for me” [see below]. In fact I hate sports, so perhaps Twitter hates me.)

I used the example of trashy pop bands – a student suggested One Direction – to illustrate this. There may be a large number of musicians with enough skill to comprise a trashy pop band but only a few trashy pop bands are successful at any given time. Musical talent is far more widely distributed than attention to specific bands. Even a casual music listener will agree that talent does not necessarily determine popularity. So what does?

The same is true of more serious topics—consider news. There is enough serious news to fill many newspapers but somehow it comes to be that we hear about certain topics over and over again, while other topics are ignored. How is it that the same events might get more coverage at one moment but less at another moment? It does not seem to be about the “quality” of the news story or the importance of the events, taken in isolation. At this point I employed Ethan Zuckerman’s comparison of attention to Kim Kardashian vs. famine.

Google Trends: Interest in Kardashian vs. famine


Ultimately this shaping and organization of communication and information determines who we are as a collective, as a public, as a society. A central problem in the study of communication and information has been: how do communication and information systems and institutions shape our knowledge and attention?

This is a particularly interesting moment to consider this topic because, while this is a perennial research problem in the study of communication (cf. Gatekeeping Theory, Agenda-Setting Theory, Framing, Priming, Cultivation Theory, Theories of the Public Sphere, etc.), the new prevalence of attention sorting algorithms on the Internet is transforming the way that attention and knowledge are shaped. A useful phrase naming the overall phenomenon is “Algorithmic Culture,” coined by Alex Galloway.

Decades ago, decisions made by a few behind-the-scenes industry professionals like legendary music producer John Hammond would be instrumental in selecting and promoting specific media content (like the musical acts of Count Basie, Bob Dylan, and Aretha Franklin), and newspaper owners like Joseph Pulitzer decided what should be spread as news (such as color comic strips or crusading investigative reporting exposing government corruption).

They may or may not have done a good job, but it is interesting that today they do not wield power in the same way. Today on the Internet many decisions about media content and advertising are made by algorithms. An algorithm, or step-by-step procedure for accomplishing something, is typically a piece of computer software that uses some data about you to determine what you will watch, hear, or read. A simple algorithm might be “show the most recent thing any friend of mine has posted” — however most algorithms in use are much more complex.
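The simple algorithm quoted above fits in a few lines. This is only an illustration of that one sentence: the post format (a dict with `author`, `posted_at`, and `text`) and the function name `most_recent_post` are hypothetical, and no real service's data model is implied.

```python
def most_recent_post(posts, my_friends):
    """Show the most recent thing any friend of mine has posted."""
    from_friends = [p for p in posts if p["author"] in my_friends]
    if not from_friends:
        return None  # no friend has posted anything
    return max(from_friends, key=lambda p: p["posted_at"])


# Made-up posts; posted_at is just a comparable number (e.g. a Unix timestamp).
posts = [
    {"author": "ana", "posted_at": 100, "text": "lunch"},
    {"author": "stranger", "posted_at": 300, "text": "buy now"},
    {"author": "bo", "posted_at": 200, "text": "new puppy"},
]
print(most_recent_post(posts, my_friends={"ana", "bo"})["text"])  # new puppy
```

Even this tiny procedure makes choices: it uses data about you (your friend list) and it silently drops everything else.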

Algorithms sort both content and advertising. Older media industries often promoted content quite broadly, but now the resulting decisions may be individualized to you, meaning that no two people might see the same Web page. Although algorithms are written by people, they often have effects that are hard for any single person to anticipate.

To introduce this topic, I suggested two online readings that are intended to be accessible to a general audience. They both consider how new media are now re-shaping the selection of content online by focusing on the idea of the algorithm. I decided to forward these two from The Atlantic:

(1.) “The Algorithm Economy: Inside the Formulas of Facebook and Amazon,” by Derek Thompson, 12 March 2014, The Atlantic

This very short blog post introduces the idea that algorithms (meaning, a repeatable step-by-step procedure for accomplishing something) now drive much of our experience with new media. It contrasts two major algorithms that most people are familiar with: (1) Amazon.com product recommendations (technically called item-to-item collaborative filtering) and (2) the Facebook news feed (called EdgeRank). A key point is that not all algorithms are equal: these two implementations of algorithmic sorting of content are quite different in their implications and effects.

(2.) “A Guide to the Digital Advertising Industry That’s Watching Your Every Click,” by Joe Turow, 7 Feb 2012, The Atlantic

Most content on the Internet is available for free and supported by online advertising. This longer article is a book excerpt from the introduction of Turow’s book The Daily You. It introduces the new ways that the online advertising industry operates and describes the way that firms match customer data to online content and advertising. This article focuses on the data about audiences that must be gathered and analyzed in order to provide personalized advertising. It then raises the question of whether or not people know about this large-scale data collection about them and considers how they feel about it.

Optional extra: For a more in-depth treatment of the topic, see Tarleton Gillespie’s “The Relevance of Algorithms,” recently released in Media Technologies.
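For readers who want to see what the “item-to-item collaborative filtering” mentioned in the first reading looks like, here is a minimal sketch of the general technique. It is a textbook-style toy, not Amazon's production system. The function names and the shopping baskets are hypothetical, and cosine similarity over co-purchase counts is just one common choice of similarity measure.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarities(baskets):
    """Score how often pairs of items are bought together.

    Uses cosine similarity on co-purchase counts:
    sim(a, b) = baskets containing both / sqrt(baskets with a * baskets with b)
    """
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for basket in baskets:
        items = set(basket)
        for item in items:
            item_count[item] += 1
        for a, b in combinations(sorted(items), 2):
            pair_count[(a, b)] += 1

    sims = defaultdict(dict)
    for (a, b), together in pair_count.items():
        score = together / sqrt(item_count[a] * item_count[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


def recommend(item, sims, k=3):
    """Return up to k items most similar to `item`, best first."""
    ranked = sorted(sims.get(item, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:k]]


# Made-up purchase histories.
baskets = [["tent", "stove"], ["tent", "stove"], ["tent", "lantern"], ["novel"]]
sims = item_similarities(baskets)
print(recommend("tent", sims))  # ['stove', 'lantern']
```

The comparison is between items, not between people: the sketch never asks who you are, only what tends to be bought alongside the thing you are looking at. That is one reason its implications differ from a feed that ranks your friends.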

Okay, I’ll stop here for now. But in my next post, I’ll consider how to demonstrate the effects of algorithmic sorting in a simple and easy-to-understand way. Then I’ll tell you how the students reacted to all this.

(*) – Although some days I do feel like an algorithm.


What Came Before Social Media?

February 7th, 2014 by Christian

(or, Social Media circa 1994)

(or, Happy 20th Birthday, My Home Page!)

Thanks to the rigorous use of backups, I’ve just noticed that it is the twentieth anniversary of my personal home page. In the spirit of commemoration, I’ve uploaded the original version (c. 1994). For reasons I don’t remember now, I named it “booger.html.” A screenshot:

booger.html screenshot

I stumbled upon this file while looking through my backups for something else. I also found all kinds of other interesting stuff. For example, I found my personal list of “hotlinks” (as we called them then).

It’s very hard to reconstruct what the Web was like then. The Internet Archive had not begun operation yet. All of my old links to things are now dead, but it’s still interesting to try to remember how we were social with computers. Yes, there were “social media.” I’ll explain:

  • Apparently I was in a Webring.
  • I found my PGP Public Key. (No idea where the private key is.) I made my PGP public key available so people could send me a PGP encrypted message at any time. However, in ten years no one ever sent me a PGP encrypted message. But I was ready. (Take that NSA.) As long as I could find my PGP private key and remember the password from ten years ago, that is.
  • My preferred search engine was Web Crawler.
  • Later in the year I was very excited about Hot Wired, the first commercial magazine on the Web (an online version of Wired Magazine). It had its own URL then, which still works: http://www.hotwired.com  Everything was prefaced with “hot” back then. That is a hotlink to HotWired.
  • I spent a lot of time doing ytalk with my friends. Screenshot (found on the Internet — not mine):

ytalk

  • I exhorted people to look me up on whois and to “finger me.” I regularly updated my .plan and .project files, which were status updates. Yes, Mark Zuckerberg basically ripped off the finger protocol from 1971, then added a facility to help Harvard men look at Harvard women (the “Facebook”) and “poke” them. Great job. Here’s an example finger query (not mine, found on the Web):

finger protocol

A lot of being on the Web in 1994 seems to be about just being on the Web at all. For instance:

  • I used the HotDog Web Editor for my HTML. Apparently because the logo was so cool. (I don’t think I used it for my first Web page, booger.html, though, because the HTML is terrible.)


  • I appear to have been on an obsessive search for new “icons.” I bookmarked a bunch of icon sharing sites, all now defunct.
  • I was very interested in how to interlace GIFs.
  • Does anyone else remember Carlos’s Forms Tutorial at NCSA? I spent a huge amount of time there and looking at the CGI documentation on a server named hoohoo (the link is a capture from 1996). I spent so much time on it that I memorized the URL, and we didn’t believe in short URLs then. UIUC loomed large in my imagination purely because of its Web stuff. Little did I know I would go on to work there and genuflect at the monument to the Web Browser every single day.

The ephemera above remind me that the Web was so exciting that a friend went to the DMV and got the California personalized license plate “IDOWWW.” I thought this might be the coolest thing anyone had ever done. In fact, I still think it is.

It’s hard to believe twenty years have passed since booger.html. I want to keep the nostalgia going. Does anyone else remember anything about social media in 1994?


Reddit, Mathematically the Anti-Facebook (and other thoughts on algorithmic culture)

January 29th, 2014 by Christian

(or, Are We Social Insects?)

I worried that my last blog post was too short and intellectually ineffectual. But given the positive feedback I’ve received, my true calling may be to write top ten lists of other people’s ideas, based on conferences I attend. So here is another list like that.

These are my notes from my attendance at “Algorithmic Culture,” an event in the University of Michigan’s Digital Currents program. It featured a lecture by the amazing Ted Striphas. These notes also reflect discussion after the talk that included Megan Sapnar Ankerson, Mark Ackerman, John Cheney-Lippold and other people I didn’t write down.

Ted has made his work on historicizing the emergence of an “algorithmic culture” (Alex Galloway‘s term) available widely already, so my role here is really just to point at it and say: “Look!” (Then applaud.)

If you’re not familiar with this general topic area (“algorithmic culture”) see Tarleton Gillespie’s recent introduction The Relevance of Algorithms and then maybe my own writing posse’s Re-Centering the Algorithm. OK here we go:

Eight Questions About Algorithms and Culture

  1. Are algorithms centralizing? Algorithms, born from ideas of decentralized control and cybernetics, were once seen as basically anti-hierarchical. Fifty years ago we searched for algorithms in nature and found them decentralized — today engineers write them and we find them centralizing.
  2. OR, are algorithms fundamentally democratic? Even if Google and Facebook have centralized the logic, they claim “democracy!” because we provide the data. YouTube has no need of kings. The LOLcats and fail videos are there by our collective will.
  3. Many of today’s ideas about algorithms and culture can be traced to earlier ideas about social insects. Entomology once noted that termites “failed to evolve” because their algorithms, based on biology, were too inflexible. How do our algorithms work? Too inflexible? (and does this mean we are social insects?)
  4. The specific word “algorithm” is a recent phenomenon, but the idea behind it is not new. (Consider: plan, recipe, procedure, script, program, function, …) But do we think about these ideas differently now? If so, maybe it is who looks at them and where they look. In early algorithmic thinking people were the logic and housed the procedure. Now computers house the procedure and people are the operands.
  5. Can “algorithmic culture” be countercultural? Fred Turner and John Markoff have traced the links between the counterculture and computing. Striphas argued that counterculture-like influences on what would become modern computing came much earlier than the 60s: consider the influence of WWII and The Holocaust. For example, Talcott Parsons saw culture through the lens of anti-authoritarianism. He also saw culture as the opposite of state power. Is culture fundamentally anti-state? This also leads me to ask: Is everything always actually about Hitler in the end?
  6. Today, the computer science definition of “algorithm” is similar to anthropologist Clifford Geertz’s definition of culture in the 1970s — that is, a recipe, plan, etc. Why is this? Is this significant?
  7. Is Reddit the conceptual anti-Facebook? Reddit publicly discloses the algorithm that it uses to sort itself. There have been calls for Facebook algorithm transparency on normative grounds. What are the consequences of Reddit’s disclosure, if any? As Reddit’s algorithm is not driven by Facebook’s business model, does that mean these two social media platform sorting algorithms are mathematically (or more properly, procedurally) opposed?
  8. Are algorithms fundamentally about homeostasis? (That’s the idea, prevalent in cybernetics and 1950s social science, that the systems being described are stable.) In other words, when algorithms are used today is there an implicit drive toward stability, equilibrium, or some other similar implied goal or similar standard of beauty for a system?
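Since question 7 turns on Reddit disclosing its sorting algorithm, here is a sketch of the “hot” ranking as I recall it from the code Reddit published. Treat the two constants (the epoch offset and the 45000-second divisor) as recalled rather than verified, and check them against the published source before relying on them. The vote counts and dates in the example are made up.

```python
from datetime import datetime, timedelta, timezone
from math import log10

# Constants as recalled from Reddit's published ranking code (unverified).
REDDIT_EPOCH_OFFSET = 1134028003  # seconds since the Unix epoch
SECONDS_PER_POINT = 45000         # 12.5 hours


def hot(ups, downs, posted_at):
    """Rank score: log of net votes plus a bonus for being newer."""
    score = ups - downs
    order = log10(max(abs(score), 1))
    sign = (score > 0) - (score < 0)  # 1, 0, or -1
    seconds = posted_at.timestamp() - REDDIT_EPOCH_OFFSET
    return round(sign * order + seconds / SECONDS_PER_POINT, 7)


older = datetime(2014, 1, 29, tzinfo=timezone.utc)
newer = older + timedelta(seconds=SECONDS_PER_POINT)

# With these constants, a post 12.5 hours newer with a net score of 1
# ranks level with an older post whose net score is 10.
print(hot(10, 0, older), hot(1, 0, newer))
```

In this form the ranking depends only on votes and time, with nothing specific to the reader, which is one concrete way to read the “procedurally opposed” question.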

Whew, I’m done. What a great event!

I’m skeptical about that last point (algorithms = homeostasis) but the question reminds me of “The Use and Abuse of Vegetational Concepts,” part 2 of the 2011 BBC documentary/insane-music-video by Adam Curtis titled All Watched Over by Machines of Loving Grace. It is a favorite of mine. Although I think many of the implied claims are not true, it’s worth watching for the soundtrack and jump cuts alone.

It’s all about cybernetics and homeostasis. I’ll conclude with it… “THIS IS A STORY ABOUT THE RISE OF THE MACHINES”:

All Watched Over By Machines of Loving Grace 2 from SACPOP on Vimeo.

P.S.

Some of us also had an interesting side conversation about what job would be the “least algorithmic.” Presumably something that was not repeatable — it differs each time it is performed. Some form of performance art? This conversation led us to think that everything is actually algorithmic.


Are there feminist data? (+ other questions)

January 24th, 2014 by Christian

Here’s a quick post containing eight ideas that made it into my notes from today’s “Feminism, Technology, and the Body” FemTechNet dialogue at the University of Michigan. It featured Alondra Nelson, Jessie Daniels, Lisa Nakamura, Sidonie Smith, Carrie Rentschler, Sharon Irish, and a bunch of other people I didn’t write down. What a crew!

Eight Ideas About Feminism, Technology, and the Body:

1. Early ads for the Internet wouldn’t work today. We no longer aspire to leave our bodies behind. Or we can no longer imagine it.  Remember this ad?  (c. 1997)
 http://www.youtube.com/watch?v=ioVMoeCbr…

2. If we’ve theorized the Internet and the body, what about social media and the body?

3. Is the selfie inherently anti-feminist?

4. Are there “feminist data?” What are they?

5. “Just add women and stir” won’t work — mixing women and tech together is not in itself progressive. (cf. bell hooks)

6. Whatever happened to the emancipatory cyborg? (Haraway) Is a woman’s body still a trap?

7. Don’t forget where all this comes from. Facebook was born in a sexist moment. It was meant to make Harvard women available to the male gaze.

8. Forget the MOOC, it’s time for the DOCC.(*)

(* – Distributed Online Collaborative Course)

