The Last Post

September 1st, 2015 by Christian

(or, I’m Moving My Blogging to Other Platforms.)

After a great run of six full years, I’ve decided to retire this blog. It worked well, but increasingly I find that most of the readership for my writing comes from my blogging at The Social Media Collective and occasionally at other venues like The Huffington Post and Wired.

Thanks so much for reading this. I’ll still be blogging and I hope that you’ll keep reading after I move things over there.

In the unlikely event that I launch any new standalone blogs I’ll be sure to alert you via my homepage.

Posted in Research, Teaching, Uncategorized | Comments Off on The Last Post

2015 Advice For Your 856-Year-Old Ph.D.

August 5th, 2015 by Christian

(or, What’s New About Getting an Old Degree?)

I’m delighted to be teaching an intro seminar for all the new Ph.D. students in my department’s graduate program. One of my goals is to give these students a place to talk about the environment of graduate school itself. How does getting a Ph.D. work? What do you need to know?

This task has made me reflective. At first I thought I should pass along readings that had been inspirational for me during grad school. That sure didn’t work. Here is the advice I apparently once loved:

Once you have identified some [thesis] topics you are interested in, you can research them rapidly by spending a few hours on the telephone calling up experts in the field and pumping them for information…although it may cost you a few dollars in long-distance bills… —Getting What You Came For: The Smart Student’s Guide to Earning a Master’s or Ph.D., p. 182

Or:

I wrote the paper with which this book begins on a microcomputer. Though this first experience with one frightened me a little at first, writing soon seemed so much less work that I wondered how I had managed before. —Writing for Social Scientists, p. 151

Or:

Having surveyed the basics…it’s time to consider the role that electronic communication can play. The most important thing is to employ electronic media consciously and deliberately as part of a larger strategy for your career. —Networking on the Network: A Guide to Professional Skills for PhD Students

Or:

Fortunately, these days every legitimate library has a copy machine, and each copy costs about a dime. —How to Write a Thesis, p. 86

The process of getting a Ph.D. is very old. Wikipedia claims the first Ph.D. was awarded in Paris in 1150. I thought Ph.D. advice would be more likely to stand the test of time.

These days you’ll find better dissertation advice on Tumblr. Or at least you’ll find some comic relief from Tumblrs like When in Academia…

(That’s some great tagging.)

The upshot is that it looks like a fair amount of the advice about how to get a Ph.D. has to do with the available communication technology of the time. Both the stuff that’s in everyday use, and also the scholarly communication infrastructure (which I’ve also blogged about recently).

Has anyone reading this ever attended a conference paper sale? (No, that’s not about buying pre-written term papers.) Or have you ever received an academic journal article “preprint request postcard?” Here’s an image of one:

Source: Google Scholar Blog.

So far I’ve come up with a list of things that seem to still be helpful. Caveats: I’m aiming to help the social science and humanities students interested in communication and information. Our first-year students won’t be teaching yet, so I am not focusing on teaching with this list.

Hopefully there are some readers who will find this list useful too.

How to Get a Ph.D. — The Draft Reading List

Agre, P. (2002). Networking on the Network: A Guide to Professional Skills for PhD Students. http://vlsicad.ucsd.edu/Research/Advice/network.html I’ll excerpt the following sections:

Building a Professional Identity
- Socializing at Conferences
- Publication and Credit
- Recognizing Difference
Your Dissertation
Academic Language

anonymous. (ed.) (2015). “When in Academia.” http://wheninacademia.tumblr.com/

Becker, H. S. (2007). Writing for Social Scientists. Chicago: University of Chicago Press. — Don’t let the title of this book fool you, it is equally applicable to graduate students in the humanities and professional programs. I’m excerpting the following sections:

Freshman English for Graduate Students
Persona and Authority
Learning to Write as a Professional
Risk
Terrorized by the Literature

Cham, J. (2013, January 21). “Your Conference Presentation.” (image.) PhD Comics.

Edwards, P. N. (2014). “How to Give an Academic Talk.” http://pne.people.si.umich.edu/PDF/howtotalk.pdf (13 pp.)

Germano, W. (2013) From Dissertation to Book. (2nd ed.) Chicago: University of Chicago Press. — Note: “Passive Is Spoken Here” is a great section heading. I’ll excerpt the chapter:

Making Prose Speak

Sterne, J. (2014). How to Peer Review Something You Hate. ICA Newsletter. (2 pp.)

Shore, B. M. (2014). The Graduate Advisor Handbook. Chicago: University of Chicago Press. I’ll excerpt:

Mutual Expectations for Research Advising (pp. 143-146)

Strunk, W., Jr. & White, E. B. (2000). The Elements of Style. (4th ed.) New York: Longman. (Important: You must avoid any “Original Edition” or public domain reprint that does not include E. B. White as a co-author. The version without E. B. White is a different book.)

@yourpapersucks (ed.) (2015). “Shit My Reviewers Say.” http://shitmyreviewerssay.tumblr.com/

…

…however…

I see that it’s a list woefully lacking in anything like “social media savvy for Ph.D. students” or “How new forms of scholarly communication are changing the dissertation.” I’m sure there are other newish domains I’ve left out, too. What am I missing? Can anyone help me out? Please add a comment or e-mail me.

Yours in futurity.

(This blog post was cross-posted to The Social Media Collective.)

Posted in Living, Teaching | 1 Comment »

Accountable Algorithms: A Research Agenda

May 12th, 2015 by Christian

(or, Caillou Sucks)

What should people who are interested in accountability and algorithms be thinking about? Here is one answer: My eleven-minute remarks are now online from a recent event at NYU. I’ve edited them to intersperse my slides.

This talk was partly motivated by the ethics work being done in the machine learning community. That is very exciting and interesting work and I love, love, love it. My remarks are an attempt to think through the other things we might also need to do. Let me know how to replace the “??” in my slides with something more meaningful!

Preview: My remarks contain a minor attempt at a Michael Jackson joke.

Here is the video: https://www.youtube.com/embed/rJfDKx2fjdE

A number of fantastic Social Media Collective people were at this conference — you can hear Kate Crawford in the opening remarks. For more videos from the conference, see:

Algorithms and Accountability: http://www.law.nyu.edu/centers/ili/algorithmsconference

Thanks to Joris van Hoboken, Helen Nissenbaum and Elana Zeide for organizing such a fab event.

If you bought this 11-minute presentation you might also buy: Auditing Algorithms, a forthcoming workshop at Oxford.

http://auditingalgorithms.wordpress.com

(This post was cross-posted to The Social Media Collective.)

Posted in Research | 3 Comments »

The Facebook “It’s Not Our Fault” Study

May 7th, 2015 by Christian

Today in Science, members of the Facebook data science team released a provocative study about adult Facebook users in the US “who volunteer their ideological affiliation in their profile.” The study “quantified the extent to which individuals encounter comparatively more or less diverse” hard news “while interacting via Facebook’s algorithmically ranked News Feed.”*

1. The research found that the user’s click rate on hard news is affected by the positioning of the content on the page by the filtering algorithm. The same link placed at the top of the feed is about 10-15% more likely to get a click than a link at position #40 (figure S5).
2. The Facebook news feed curation algorithm, “based on many factors,” removes hard news from diverse sources that you are less likely to agree with but it does not remove the hard news that you are likely to agree with (S7). They call news from a source you are less likely to agree with “cross-cutting.”*
3. The study then found that the algorithm filters out 1 in 20 cross-cutting hard news stories that a self-identified conservative sees (or 5%) and 1 in 13 cross-cutting hard news stories that a self-identified liberal sees (8%).
4. Finally, the research then showed that “individuals’ choices about what to consume” further limit their “exposure to cross-cutting content.” Conservatives will click on ~~only 17%~~ a little less than 30% of cross-cutting hard news, while liberals will click on ~~7%~~ a little more than 20% (figure 3).
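
For anyone who wants to check the rounding in the third point, here is a quick sketch in Python. The function name is made up for this example; the only inputs are the "1 in 20" and "1 in 13" figures quoted above.

```python
def one_in_n_as_percent(n: int) -> float:
    """Convert a '1 in n' share into a percentage."""
    return 100.0 / n


# Shares of cross-cutting hard news filtered out, as quoted in the post.
conservative_pct = one_in_n_as_percent(20)
liberal_pct = one_in_n_as_percent(13)

print(f"1 in 20 is {conservative_pct:.1f}%")
print(f"1 in 13 is {liberal_pct:.1f}%")
```

So "1 in 13" is closer to 7.7%, which rounds to the 8% given above.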

My interpretation in three sentences:

We would expect that people who are given the choice of what news they want to read will select sources they tend to agree with–more choice leads to more selectivity and polarization in news sources.
Increasing political polarization is normatively a bad thing.
Selectivity and polarization are happening on Facebook, and the news feed curation algorithm acts to modestly accelerate selectivity and polarization.

I think this should not be hugely surprising. For example, what else would a good filter algorithm be doing other than filtering for what it thinks you will like?

But what’s really provocative about this research is the unusual framing. This may go down in history as the “it’s not our fault” study.

Facebook: It’s not our fault.

I carefully wrote the above based on my interpretation of the results. Now that I’ve got that off my chest, let me tell you about how the Facebook data science team interprets these results. To start, my assumption was that news polarization is bad. But the end of the Facebook study says:

“we do not pass judgment on the normative value of cross-cutting exposure”

This is strange, because there is a wide consensus that exposure to diverse news sources is foundational to democracy. Scholarly research about social media has–almost universally–expressed concern about the dangers of increasing selectivity and polarization. But it may be that you do not want to say that polarization is bad when you have just found that your own product increases it. (Modestly.)

And the sources cited just after this quote sure do say that exposure to diverse news sources is important. But the Facebook authors write:

“though normative scholars often argue that exposure to a diverse ‘marketplace of ideas’ is key to a healthy democracy (25), a number of studies find that exposure to cross-cutting viewpoints is associated with lower levels of political participation (22, 26, 27).”

So the authors present reduced exposure to diverse news as a “could be good, could be bad” but that’s just not fair. It’s just “bad.” There is no gang of political scientists arguing against exposure to diverse news sources.**

The Facebook study says it is important because:

“our work suggests that individuals are exposed to more cross-cutting discourse in social media than they would be under the digital reality envisioned by some”

Why so defensive? If you look at what is cited here, this quote is saying that this study showed that Facebook is better than a speculative dystopian future.*** Yet the people referred to by this word “some” didn’t provide any sort of point estimates that were meant to allow specific comparisons. On the subject of comparisons, the study goes on to say that:

“we conclusively establish that…individual choices more than algorithms limit exposure to attitude-challenging content.”

“compared to algorithmic ranking, individuals’ choices about what to consume had a stronger effect”

Alarm bells are ringing for me. The tobacco industry might once have funded a study that says that smoking is less dangerous than coal mining, but here we have a study about coal miners smoking. Probably while they are in the coal mine. What I mean to say is that there is no scenario in which “user choices” vs. “the algorithm” can be traded off, because they happen together (Fig. 3 [top]). Users select from what the algorithm already filtered for them. It is a sequence.**** I think the proper statement about these two things is that they’re both bad — they both increase polarization and selectivity. As I said above, the algorithm appears to modestly increase the selectivity of users.
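
To make the "it is a sequence" point concrete, here is a minimal sketch in Python. The starting count and both rates are illustrative assumptions that loosely echo the figures quoted earlier in this post; they are not values taken from the study's data, and the function name is made up for this example.

```python
def feed_pipeline(potential: int, algo_removal_rate: float, click_rate: float) -> dict:
    """Model the feed as two stages that run one after the other.

    Stage 1: the ranking algorithm removes a share of cross-cutting stories.
    Stage 2: the user clicks on a share of whatever stage 1 let through.
    """
    shown = potential * (1 - algo_removal_rate)  # output of the algorithm
    clicked = shown * click_rate                 # the user only sees `shown`
    return {"potential": potential, "shown": shown, "clicked": clicked}


# Hypothetical inputs: 1000 cross-cutting stories, 5% removed by the
# algorithm, and a click rate of roughly 30% on what remains.
result = feed_pipeline(potential=1000, algo_removal_rate=0.05, click_rate=0.30)

for stage, count in result.items():
    print(f"{stage}: {count:.0f}")
```

The user's stage never operates on the original pool, only on what the algorithm passed along. In this toy model the two stages are chained rather than set side by side, which is why asking which one matters "more" is an odd question.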

The only reason I can think of that the study is framed this way is as a kind of alibi. Facebook is saying: It’s not our fault! You do it too!

Are we the 4%?

In my summary at the top of this post, I wrote that the study was about people “who volunteer their ideological affiliation in their profile.” But the study also describes itself by saying:

“we utilize a large, comprehensive dataset from Facebook.”

“we examined how 10.1 million U.S. Facebook users interact”

These statements may be factually correct but I found them to be misleading. At first, I read this quickly and I took this to mean that out of the at least 200 million Americans who have used Facebook, the researchers selected a “large” sample that was representative of Facebook users, although this would not be representative of the US population. The “limitations” section discusses the demographics of “Facebook’s users,” as would be the normal thing to do if they were sampled. There is no information about the selection procedure in the article itself.

Instead, after reading down in the appendices, I realized that “comprehensive” refers to the survey research concept: “complete,” meaning that this was a non-probability, non-representative sample that included everyone on the Facebook platform. But out of hundreds of millions, we ended up with a study of 10.1m because users were excluded unless they met these four criteria:

1. “18 or older”
2. “log in at least 4/7 days per week”
3. “have interacted with at least one link shared on Facebook that we classified as hard news”
4. “self-report their ideological affiliation” in a way that was “interpretable”

That #4 is very significant. Who reports their ideological affiliation on their profile?

It turns out that only 9% of Facebook users do that. Of those that report an affiliation, only 46% reported an affiliation in a way that was “interpretable.” That means this is a study about the 4% of Facebook users unusual enough to want to tell people their political affiliation on the profile page. That is a rare behavior.
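
The 4% figure is just the product of the two percentages above. A short Python sketch of the arithmetic (the variable names are mine, not the study's):

```python
# Shares quoted in the post.
share_reporting_affiliation = 0.09  # users who list an ideological affiliation
share_interpretable = 0.46          # of those, affiliations that were "interpretable"

# Share of users who pass both filters.
share_in_study = share_reporting_affiliation * share_interpretable

print(f"{share_in_study:.1%}")  # prints 4.1%
```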

More important than the frequency, though, is the fact that this selection procedure confounds the findings. We would expect a small minority who publicly identify an interpretable political orientation to behave quite differently from the average person with respect to consuming ideological political news. The research claims just don’t stand up against the selection procedure.

But the study is at pains to argue that (italics mine):

“we conclusively establish that on average in the context of Facebook, individual choices more than algorithms limit exposure to attitude-challenging content.”

The italicized portion is incorrect because the appendices explain that this is actually a study of a specific, unusual group of Facebook users. The study is designed in such a way that the selection for inclusion in the study is related to the results. (“Conclusively” therefore also feels out of place.)

Algorithmium: A Natural Element?

Last year there was a tremendous controversy about Facebook’s manipulation of the news feed for research. In the fracas it was revealed by one of the controversial study’s co-authors that based on the feedback received after the event, many people didn’t realize that the Facebook news feed was filtered at all. We also recently presented research with similar findings.

I mention this because when the study states it is about selection of content, who does the selection is important. There is no sense in this study that a user who chooses something is fundamentally different from the algorithm hiding something from them. While in fact the filtering algorithm is driven by user choices (among other things), users don’t understand the relationship that their choices have to the outcome.

In other words, the article’s strange comparison between “individual’s choices” and “the algorithm” should be read as “things I choose to do” vs. the effect of “a process Facebook has designed without my knowledge or understanding.” Again, they can’t be compared in the way the article proposes because they aren’t equivalent.

I struggled with the framing of the article because the research talks about “the algorithm” as though it were an element of nature, or a naturally occurring process like convection or mitosis. There is also no sense that it changes over time or that it could be changed intentionally to support a different scenario.*****

Facebook is a private corporation with a terrible public relations problem. It is periodically rated one of the least popular companies in existence. It is currently facing serious government investigations into illegal practices in many countries, some of which stem from the manipulation of its news feed algorithm. In this context, I have to say that it doesn’t seem wise for these Facebook researchers to have spun these data so hard in this direction, which I would summarize as: the algorithm is less selective and less polarizing. Particularly when the research finding in their own study is actually that the Facebook algorithm is modestly more selective and more polarizing than living your life without it.

Update: (6pm Eastern)

Wow, if you think I was critical have a look at these. It turns out I am the moderate one.

Eszter Hargittai from Northwestern posted on Crooked Timber that we should “stop being mesmerized by large numbers and go back to taking the fundamentals of social science seriously.” And (my favorite): “I thought Science was a serious peer-reviewed publication.”

Nathan Jurgenson from Maryland and Snapchat wrote on Cyborgology (“in a fury”) that Facebook is intentionally “evading” its own role in the production of the news feed. “Facebook cannot take its own role in news seriously.” He accuses the authors of using the “Big-N trick” to intentionally distract from methodological shortcomings. He tweeted that “we need to discuss how very poor corporate big data research gets fast tracked into being published.”

Zeynep Tufekci from UNC wrote on Medium that “I cannot remember a worse apples to oranges comparison” and that the key take-away from the study is actually the ordering effects of the algorithm (which I did not address in this post). “Newsfeed placement is a profoundly powerful gatekeeper for click-through rates.”

Update: (5/10)

A comment helpfully pointed out that I used the wrong percentages in my fourth point when summarizing the piece. Fixed it, with changes marked.

Update: (5/15)

It’s now one week since the Science study. This post has now been cited/linked in The New York Times, Fortune, Time, Wired, Ars Technica, Fast Company, Engadget, and maybe even a few more. I am still getting emails. The conversation has fixated on the <4% sample, often saying something like: “So, Facebook said this was a study about cars, but it was actually only about blue cars.” That’s fine, but the other point in my post is about what is being claimed at all, no matter the sample.

I thought my “coal mine” metaphor about the algorithm would work but it has not always worked. So I’ve clamped my Webcam to my desk lamp and recorded a four-minute video to explain it again, this time with a drawing.******

Here’s the video:
https://www.youtube.com/watch?v=eBPntMSDGSs

If the coal mine metaphor failed me, what would be a better metaphor? I’m not sure. Suggestions?

Notes:

* Diversity in hard news, in their study, would be a self-identified liberal who receives a story from FoxNews.com, or a self-identified conservative who receives one from the HuffingtonPost.com, where the stories are about “national news, politics, [or] world affairs.” In more precise terms, for each user “cross-cutting content” was defined as stories that are more likely to be shared by partisans who do not have the same self-identified ideological affiliation that you do.

** I don’t want to make this even more nitpicky, so I’ll put this in a footnote. The paper’s use of citations to Mutz and Huckfeldt et al. to support the claim that “exposure to cross-cutting viewpoints is associated with lower levels of political participation” is just bizarre. I hope it is a typo. These authors don’t advocate against exposure to cross-cutting viewpoints.

*** Perhaps this could be a new Facebook motto used in advertising: “Facebook: Better than one speculative dystopian future!”

**** In fact, algorithm and user form a coupled system of at least two feedback loops. But that’s not helpful to measure “amount” in the way the study wants to, so I’ll just tuck it away down here.

***** Facebook is behind the algorithm but they are trying to peer-review research about it without disclosing how it works — which is a key part of the study. There is also no way to reproduce the research (or do a second study on a primary phenomenon under study, the algorithm) without access to the Facebook platform.

****** In this video, I intentionally conflate (1) the number of posts filtered and (2) the magnitude of the bias of the filtering. I did so because the difficulty with the comparison works the same way for both, and I was trying to make the example simpler. Thanks to Cedric Langbort for pointing out that “baseline error” is the clearest way of explaining this.
—

(This was cross-posted to The Social Media Collective and Wired.)

Posted in Research | 24 Comments »

Should You Boycott Traditional Journals?

March 30th, 2015 by Christian

(Or, Should I Stay or Should I Go?)

Is it time to boycott “traditional” scholarly publishing? Perhaps you are an academic researcher, just like me. Perhaps, just like me, you think that there are a lot of exciting developments in scholarly publishing thanks to the Internet. And you want to support them. And you also want people to read your research. But you also still need to be sure that your publication venues are held in high regard.

Or maybe you just receive research funding that is subject to new open access requirements.

Academia is a funny place. We are supposedly self-governing. So if we don’t like how our scholarly communications are organized we should be able to fix this ourselves. If we are dissatisfied with the journal system, we’re going to have to do something about it. The question of whether or not it is now time to eschew closed access journals is something that comes up a fair amount among my peers.

It comes up often enough that a group of us at Michigan decided to write an article on the topic. Here’s the article. It just came out yesterday (open access, of course):

Carl Lagoze, Paul Edwards, Christian Sandvig, & Jean-Christophe Plantin. (2015). Should I Stay or Should I Go? Alternative Infrastructures in Scholarly Publishing. International Journal of Communication 9: 1072-1081.

The article is intended for those who want some help figuring out the answer to the question the article title poses: Should I stay or should I go? It’s meant to help you decipher the unstable landscape of scholarly publishing these days. (Note that we restrict our topic to journal publishing.)

Researching it was a lot of fun, and I learned quite a bit about how scholarly communication works.

It contains a mention of the first journal. Yes, the first one that we would recognize as a journal in today’s terms. It’s Philosophical Transactions published by the Royal Society of London. It’s on Volume 373.
It should teach you about some of the recent goings-on in this area. Do you know what a green repository is? What about an overlay journal? Or the “serials crisis”?
It addresses a question I’ve had for a while: What the heck are those arXiv people up to? If it’s so great, why hasn’t it spread to all disciplines?
There’s some fun discussion of influential experiments in scholarly publishing. Remember the daring foundation of the Electronic Journal of Communication? Vectors? Were you around way-back-in-the-day when the pioneering, Web-based JCMC looked like this hot mess below? Little did we know that we were actually looking at the future.(*)

(JCMC circa 1995)

(*): Unless we were looking at the Gopher version, then in that case we were not looking at the future.

Ultimately, we adapt a framework from Hirschman that we found to be an aid to our thinking about what is going on today in scholarly communication. Feel free to play this song on a loop as you read it.

(This post has been cross-posted on The Social Media Collective.)

Posted in Research | 1 Comment »
