Wall Street Journal


In The Data Bubble, I told readers to mark the day: 31 July 2010. That’s when The Wall Street Journal published The Web’s Gold Mine: Your Secrets, subtitled A Journal investigation finds that one of the fastest-growing businesses on the Internet is the business of spying on consumers. First in a series. That same series is now nine stories long, not counting the introduction and a long list of related pieces. Here’s the current list:

  1. The Web’s Gold Mine: What They Know About You
  2. Microsoft Quashed Bid to Boost Web Privacy
  3. On the Web’s Cutting Edge: Anonymity in Name Only
  4. Stalking by Cell Phone
  5. Google Agonizes Over Privacy
  6. Kids Face Intensive Tracking on Web
  7. ‘Scrapers’ Dig Deep for Data on the Web
  8. Facebook in Privacy Breach
  9. A Web Pioneer Profiles Users By Name

Related pieces—

Two things I especially like about all this. First, Julia Angwin and her team are doing a terrific job of old-fashioned investigative journalism here. Kudos for that. Second, the whole series stands on the side of readers. The second person voice (you, your) is directed to individual persons—the same persons who do not sit at the tables of decision-makers in this crazy new hyper-personalized advertising business.

To measure the delta of change in that business, start with John Battelle’s Conversational Marketing series (post 1, post 2, post 3) from early 2007, and then his post Identity and the Independent Web, from last week. In the former he writes about how the need for companies to converse directly with customers and prospects is both inevitable and transformative. He even kindly links to The Cluetrain Manifesto (behind the phrase “brands are conversations”).

In his latest he observes some changes in the Web itself:

Here’s one major architectural pattern I’ve noticed: the emergence of two distinct territories across the web landscape. One I’ll call the “Dependent Web,” the other is its converse: The “Independent Web.”

The Dependent Web is dominated by companies that deliver services, content and advertising based on who that service believes you to be: What you see on these sites “depends” on their proprietary model of your identity, including what you’ve done in the past, what you’re doing right now, what “cohorts” you might fall into based on third- or first-party data and algorithms, and any number of other robust signals.

The Independent Web, for the most part, does not shift its content or services based on who you are. However, in the past few years, a large group of these sites have begun to use Dependent Web algorithms and services to deliver advertising based on who you are.

A Shift In How The Web Works?

And therein lies the itch I’m looking to scratch: With Facebook’s push to export its version of the social graph across the Independent Web; Google’s efforts to personalize display via AdSense and Doubleclick; AOL, Yahoo and Demand building search-driven content farms, and the rise of data-driven ad exchanges and “Demand Side Platforms” to manage revenue for it all, it’s clear that we’re in the early phases of a major shift in the texture and experience of the web.

He goes on to talk about how “these services match their model of your identity to an extraordinary machinery of marketing dollars”, and how

When we’re “on” Facebook, Google, or Twitter, we’re plugged into an infrastructure (in the case of the latter two, it may be a distributed infrastructure) that locks onto us, serving us content and commerce in an automated but increasingly sophisticated fashion. Sure, we navigate around, in control of our experience, but the fact is, the choices provided to us as we navigate are increasingly driven by algorithms modeled on the service’s understanding of our identity.

And here is where we get to the deepest, most critical problem: Their understanding of our identity is not the same as our understanding of our identity. What they have are a bunch of derived assumptions that may or may not be correct; and even if they are, they are not ours. This is a difference in kind, not degree. It doesn’t matter how personalized anybody makes advertising targeted at us. Who we are is something we possess and control—or would at least like to think we do—no matter how well some of us (such as advertisers) rationalize the “socially derived” natures of our identities in the world.

It is standard for people in the ad business to equate assent with approval, and John’s take on this is a good example of that. Sez he,

We know this, and we’re cool with the deal.

In fact we don’t know, we’re not cool with it, and it isn’t a deal.

If we knew, the Wall Street Journal wouldn’t have a reason to clue us in at such length.

We’re cool with it only to the degree that we are uncomplaining about it—so far.

And it isn’t a “deal” because nothing was ever negotiated.

On that last point, our “deals” with vendors on the Web are agreements in name only. Specifically, they are a breed of assent called contracts of adhesion. Also called standard form or boilerplate contracts, they are what you get when a dominant party sets all the terms, there is no room for negotiation, and the submissive party has a choice only to accept the terms or walk away. The term “adhesion” refers to the nailed-down nature of the submissive party’s position, while the dominant party is free to change the terms any time it wishes. Next time you “agree” to terms you haven’t read, go read them and see where it says the other party reserves the right to change the terms.

There is a good reason why we have had these kinds of agreements since the dawn of e-commerce. It’s because that’s the way the Web was built. Only one party—the one with the servers and the services—was in a position to say what was what. It’s still that way. The best slide I’ve seen in the last several years is one of Phil Windley’s. It says,


1995: Invention of the Cookie.

The End.

About all we’ve done since 1995 on the sell side is improve the cookie-based system of “relating” to users. This is a one-way take-it-or-leave-it system that has become lame and pernicious in the extreme. We can and should do better than that.

Phil’s own company, Kynetx, has come up with a whole new schema. Besides clients and servers (which don’t go away), you’ve got end points, events, rules and rules engines to execute the rules. David Siegel’s excellent book, The Power of Pull, describes how the Semantic Web also offers a rich and far more flexible and useful alternative to the Web’s old skool model. His post yesterday is a perfect example of liberated thinking and planning that transcends the old cookie-limited world. The man is on fire. Dig his first paragraph:

Monday I talked about the social networking bubble. Marketers are getting sucked into the social-networking vortex and can’t find their way out. The problem is that most companies are trying small tactical improvements, hoping to improve sales a bit and trying tactical savings programs, hoping to improve margins a bit. Yet there’s a whole new curve of efficiency waiting in the world of pull. It’s time to start talking about saving trillions, not millions. Companies should think in terms of big, strategic, double-digit improvements, new markets, and new ways to cooperate. Here is a road map

Read on. (I love that he calls social networking a “bubble”. I’m with that.)

This week at IIW in Mountain View, we’re going to be talking about, and working on, improving markets from the buyers’ side. (Through VRM and other means.) On the table will be whole new ways of relating, starting with systems by which users and customers can offer their own terms of engagement, their own policies, their own preferences (even their own prices and payment options)—and by which sellers and site operators can signal their openness to those terms (even if they’re not yet ready to accept them). The idea here is to get buyers out of their shells and sellers out of their silos, so they can meet and deal for real in a truly open marketplace. (This doesn’t have to be complicated. A lot of it can be automated. And, if we do it right, we can skip a lot of the pointless one-sided agreement-clicking friction we now take for granted.)

Right now it’s hard to argue against all the money being spent (and therefore made) in the personalized advertising business—just like it was hard to argue against the bubble in tech stock prices in 1999 and in home prices in 2004. But we need to come to our senses here, and develop new and better systems by which demand and supply can meet and deal with each other as equally powerful parties in the open marketplace. Some of the tech we need for that is coming into being right now. That’s what we should be following. Not just whether Google, Facebook or Twitter will do the best job of putting crosshairs on our backs.

John’s right that the split is between dependence and independence. But the split that matters most is between yesterday’s dependence and tomorrow’s independence—for ourselves. If we want a truly conversational economy, we’re going to need individuals who are independent and self-empowered. Once we have that, the level of economic activity that follows will be a lot higher, and a lot more productive, than we’re getting now just by improving the world’s biggest guesswork business.


Back on July 31 I posted The Data Bubble in response to the first of The Wall Street Journal’s landmark series of articles and Web postings on the topic of unwelcome (and, to their targets, mostly unknown) user tracking.

A couple days ago I began to get concerned about how much time had passed since the last posting, on August 12. So I tweeted, Hey @whattheyknow, is your Wall Street Journal series done? If not, when are we going to see more entries? Last I saw was >1 month ago.

Then yesterday @WhatTheyKnow tweeted back, @dsearls: Ask and ye shall receive: http://on.wsj.com/9DTpdP. Nice!

The piece is titled On the Web, Children Face Intensive Tracking, by Steve Stecklow, and it’s a good one indeed. To start,

The Journal examined 50 sites popular with U.S. teens and children to see what tracking tools they installed on a test computer. As a group, the sites placed 4,123 “cookies,” “beacons” and other pieces of tracking technology. That is 30% more than were found in an analysis of the 50 most popular U.S. sites overall, which are generally aimed at adults.

The most prolific site: Snazzyspace.com, which helps teens customize their social-networking pages, installed 248 tracking tools. Its operator described the site as a “hobby” and said the tracking tools come from advertisers.

Should we call cookies for kids “candy”? Hey, why not?

Once again we see the beginning of the end of unfettered user tracking. Such as right here:

Many kids’ sites are heavily dependent on advertising, which likely explains the presence of so many tracking tools. Research has shown children influence hundreds of billions of dollars in annual family purchases.

Google Inc. placed the most tracking files overall on the 50 sites examined. A Google spokesman said “a small proportion” of the files may be used to determine computer users’ interests. He also said Google doesn’t include “topics solely of interest to children” in its profiles.

Still, Google’s Ads Preferences page displays what Google has determined about web users’ interests. There, Google accurately identified a dozen pastimes of 10-year-old Jenna Maas—including pets, photography, “virtual worlds” and “online goodies” such as little animated graphics to decorate a website.

“It is a real eye opener,” said Jenna’s mother, Kate Maas, a schoolteacher in Charleston, S.C., viewing that data.

Jenna, now in fifth grade, said: “I don’t like everyone knowing what I’m doing and stuff.”

A Google spokesman said its preference lists are “based on anonymous browser activity. We don’t know if it’s one user or four using a particular browser, or who those users are.” He said users can adjust the privacy settings on their browser or use the Ads Preferences page to limit data collection.

I went and checked my own Ads Preferences page (http://www.google.com/ads/preferences) and found that I had opted out of Google’s interest-based advertising sometime in the past. I barely remember doing that, but I’m not surprised I did. On the whole I think most people would opt to turn that kind of stuff off, just to get a small measure of shelter amidst the advertising blizzard that the commercial Web has become.

Finding Google’s opt-out control box without a flashlight, however, is a bit of a chore. Worse, Google is just one company. The average user has to deal with dozens or hundreds of other (forgive me) cookie monsters, each with its own opt-out/in control boxes (or lack of them). And I suspect that most of those others are far less disclosing about their practices (and respectful of users) than Google is.

(But I have no research to back that up—yet. If anybody does, please let me have it. There’s a whole chapter in a book I’m writing that’s all about this kind of stuff.)

Meanwhile, says the Journal,

Parents hoping to let their kids use the Internet, while protecting them from snooping, are in a bind. That’s because many sites put the onus on visitors to figure out how data companies use the information they collect.

Exactly. And what are we to do? Depend on the site owners and their partners? Not in the absence of help, that’s for sure. The Journal again:

Gaiaonline.com—where teens hang out together in a virtual world—says in its privacy policy that it “cannot control the activities” of other companies that install tracking files on its users’ computers. It suggests that users consult the privacy policies of 11 different companies.

In a statement, gaiaonline.com said, “It is standard industry practice that advertisers and ad networks are bound by their own privacy policy, which is why we recommend that our users review those.” The Journal’s examination found that gaiaonline.com installed 131 tracking files from third parties, such as ad networks.

An executive at a company that installed several of those 131 files, eXelate Media Ltd., said in an email that his firm wasn’t collecting or selling teen-related data. “We currently are not specifically capturing or promoting any ‘teen’ oriented segments for marketing purposes,” wrote Mark S. Zagorski, eXelate’s chief revenue officer.

But the Journal found that eXelate was offering data for sale on 5.9 million people it described as “Age: 13-17.” In a later interview, Mr. Zagorski confirmed eXelate was selling teen data. He said it was a small part of its business and didn’t include personal details such as names.

BlueKai Inc., which auctions data on Internet users, also said it wasn’t offering for sale data on minors. “We are not selling data on kids,” chief executive Omar Tawakol wrote in an email. “Let there be no doubt on what we do.”

However, another data-collecting company, Lotame Solutions Inc., told the Journal that it was selling what it labeled “teeny bopper” data on kids age 13 to 19 via BlueKai’s auctions. “If you log into BlueKai, you’ll see ‘teeny boppers’ available for sale,” said Eric L. Porres, Lotame’s chief marketing officer.

Mr. Tawakol of BlueKai later confirmed the “teeny bopper” data had been for sale on BlueKai’s exchange but no one had ever bought it. He said as a result of the Journal’s inquiries, BlueKai had removed it.

The FTC is reviewing the only federal law that limits data collection about kids, the Children’s Online Privacy Protection Act, or Coppa. That law requires sites aimed at children under 13 to obtain parental permission before collecting, using or disclosing a child’s “personal information” such as name, home or email address, and phone and Social Security number. The law also applies to general-audience sites that knowingly collect personal information from kids.

So we have pots and kettles calling each other black while copping out of responsibility in any case—and then, naturally, turning toward government for help.

My own advice: let’s not be so fast with that. Let’s continue to expose bad practices, but let’s also fix the problem on the users’ end. Because what we really need here are tools by which individuals (including parents) can issue their own global preferences, their own terms of engagement,  their own controls, and their own ends of relationships with companies that serve them.

These tools need to be based on open standards, code and protocols, and independent of any seller. Where they require trusted intermediaries, those parties should be substitutable, so individuals are not locked in again.

And guess what? We’re working on those. Here’s what I wrote last month in Cooperation vs. Coercion:

What we need now is for vendors to discover that free customers are more valuable than captive ones. For that we need to equip customers with better ways to enjoy and express their freedom, including ways of engaging that work consistently for many vendors, rather than in as many different ways as there are vendors — which is the “system” (that isn’t) we have now.

There are lots of VRM development efforts working on both the customer and vendor sides of this challenge. In this post I want to draw attention to the symbols that represent those two sides, which we call r-buttons, two of which appear [in the example below]. Yours is the left one. The vendor’s is the right one. They face each other like magnets, and are open on the facing ends.

These are designed to support what Steve Gillmor calls gestures, which he started talking about back in 2005 or so. I paid some respect to gestures (though I didn’t yet understand what he meant) in The Intention Economy, a piece I wrote for Linux Journal in 2006. (That same title is also the one for the book I’m writing for Harvard Business Press. The subtitle is What happens when customers get real power.) On the sell side, in a browser environment, the vendor puts some RDFa in its HTML that says “We welcome free customers.” That can mean many things, but the most important is this: Free customers bring their own means of engagement. It also means they bring their own terms of engagement.

Being open to free customers doesn’t mean that a vendor has to accept the customer’s terms. It does mean that the vendor doesn’t believe it has to provide all those terms itself, through the currently defaulted contracts of adhesion that most of us click “accept” for, almost daily. We have those because from the dawn of e-commerce sellers have assumed that they alone have full responsibility for relationships with customers. Maybe now that dawn has passed, we can get some daylight on other ways of getting along in a free and open marketplace.

The gesture shown here —

— is the vendor (in this case the public radio station KQED, which I’m just using as an example here) expressing openness to the user, through that RDFa code in its HTML. Without that code, the right-side r-button would be gray. The red color on the left side shows that the user has his or her own code for engagement, ready to go. (I unpack some of this stuff here.)

Putting in that RDFa would be trivial for a CRM system. Or even for a CMS (content management system). Next step: (I have Craig Burton leading me on this… he’s on the phone with me right now…) RESTful APIs for customer data. Check slide 69 here. Also slides 98 and 99. And 122, 124, 133 and 153.

If I’m not mistaken, a little bit of RDFa can populate a pop-down menu on the site’s side that might look like this:

All the lower stuff is typical “here are our social links” jive. The important new one is that item at the top. It’s the new place for “legal” (the symbol is one side of a “scale of justice”) but it doesn’t say “these are our non-negotiable terms of service” (or privacy policies, or other contracts of adhesion). Just by appearing there it says “We’re open to what you bring to the table. Click here to see how.” This in turn opens the door to a whole new way for buyers and sellers to relate: one that doesn’t need to start with the buyer (or the user) just “accepting” terms he or she doesn’t bother to read because they give all advantages to the seller and are not negotiable. Instead it is an open door like one in a store. Much can be implicit, casual and free of obligation. No new law is required here. Just new practice. This worked for Creative Commons (which neither offered nor required new copyright law), and it can work for r-commerce (a term I just made up). As with Creative Commons, what happens behind that symbol can be machine, lawyer or human-readable. You don’t have to click on it. If your policy as a buyer is that you don’t want to be tracked by advertisers, you can specify that, and the site can hear and respond to it. The system is, as Renee Lloyd puts it, the difference between a handcuff and a handshake.
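To make the mechanics a little more concrete, here is a minimal sketch in Python of how a browser-side tool might notice a vendor’s “We welcome free customers” signal and pair it with a customer’s own terms. Everything specific in it is an assumption made for illustration: the vocabulary URL, the welcomesFreeCustomers property name, the sample HTML and the no_tracking term are all hypothetical, and none of them comes from any published r-button or VRM specification.

```python
from html.parser import HTMLParser

# Hypothetical vocabulary and property name, invented for this sketch.
VOCAB = "http://example.org/vrm#"
OPEN_PROPERTY = "welcomesFreeCustomers"

# Hypothetical vendor page carrying the RDFa-style signal.
VENDOR_HTML = """
<html>
  <body vocab="http://example.org/vrm#">
    <span property="welcomesFreeCustomers" content="true"></span>
    <p>Welcome to our site.</p>
  </body>
</html>
"""


class OpennessParser(HTMLParser):
    """Looks for the (assumed) property that signals vendor openness.

    Simplification: real RDFa scopes `vocab` to an element and its children.
    This sketch just remembers the most recent `vocab` it has seen.
    """

    def __init__(self):
        super().__init__()
        self.vocab = None
        self.vendor_open = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "vocab" in attributes:
            self.vocab = attributes["vocab"]
        if (
            self.vocab == VOCAB
            and attributes.get("property") == OPEN_PROPERTY
            and attributes.get("content") == "true"
        ):
            self.vendor_open = True


def vendor_is_open(html):
    """Return True if the page carries the openness signal."""
    parser = OpennessParser()
    parser.feed(html)
    return parser.vendor_open


def r_button_state(html, customer_terms):
    """Describe both sides of the r-button for one page visit.

    The customer side is ready when the customer brings any terms at all.
    The vendor side stays gray unless the page carries the signal.
    """
    return {
        "customer_ready": bool(customer_terms),
        "vendor_gray": not vendor_is_open(html),
    }


# A customer whose (hypothetical) policy is not to be tracked by advertisers.
my_terms = {"no_tracking": True}
state = r_button_state(VENDOR_HTML, my_terms)
```

The sketch only detects the signal. What the vendor then does with the customer’s terms is a separate matter, since being open to those terms does not oblige the vendor to accept them.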

Giving customers means for showing up in the marketplace with their own terms of engagement is a core job right now for VRM. Being ready to deal with customers who bring their own terms is equally important for CRM. What I wrote here goes into some of the progress being made for both. Much more is going on as well. (I’m writing about this stuff because these are the development projects I’m involved with personally. There are many others.)

You can check out some of those others here.

Bonus link: Tracking the Companies that Track You Online. That’s a Fresh Air interview by Dave Davies of Julia Angwin, senior technology editor of The Wall Street Journal and the lead reporter on the What They Know series.


“I make my living off the Evening News
Just give me something: something I can use
People love it when you lose
They love dirty laundry.”

Don Henley, “Dirty Laundry”

Look up “Wikipedia loses” (with the quotes) and you get 20,800 results. Look up “Wikipedia has lost” and you get 56,900. (Or at least that’s what I got this morning.) Most of those results tell a story, which is what news reports do. “What’s the story?” may be the most common question asked of reporters by their managing editors. As humans, we are interested in stories — even if they’re contrived, which is what we have with all “reality” television shows.

Lately Wikipedia itself is the subject of a story about losing editors. The coverage snowball apparently started rolling with Volunteers Log Off as Wikipedia Ages, by Julia Angwin and Geoffrey A. Fowler in The Wall Street Journal. It begins,

Wikipedia.org is the fifth-most-popular Web site in the world, with roughly 325 million monthly visitors. But unprecedented numbers of the millions of online volunteers who write, edit and police it are quitting.

That could have significant implications for the brand of democratization that Wikipedia helped to unleash over the Internet — the empowerment of the amateur.

Volunteers have been departing the project that bills itself as “the free encyclopedia that anyone can edit” faster than new ones have been joining, and the net losses have accelerated over the past year. In the first three months of 2009, the English-language Wikipedia …

That’s all you get without paying. Still, it’s enough.

Three elements make stories interesting: 1) a protagonist we know, or who is at least interesting; 2) a struggle of some kind; and 3) movement (or possible movement) toward a resolution. Struggle is at the heart of a story. There has to be a problem (what to do with Afghanistan), a conflict (a game between good teams, going to the final seconds), a mystery (wtf was Tiger Woods’ accident all about?), a wealth of complications (Brad and Angelina), a crazy success (the iPhone), failings of the mighty (Nixon and Watergate). The Journal’s Wikipedia story is of the Mighty Falling variety.

The Journal’s source is Wikipedia: A Quantitative Analysis, a doctoral thesis by José Felipe Ortega of Universidad Rey Juan Carlos in Madrid. (The graphic at the top of this post is one among many from the study.) In Wikipedia’s Volunteer Story, Erik Moeller and Erik Zachte of the Wikimedia Foundation write,

First, it’s important to note that Dr. Ortega’s study of editing patterns defines as an editor anyone who has made a single edit, however experimental. This results in a total count of three million editors across all languages.  In our own analytics, we choose to define editors as people who have made at least 5 edits. By our narrower definition, just under a million people can be counted as editors across all languages combined.  Both numbers include both active and inactive editors.  It’s not yet clear how the patterns observed in Dr. Ortega’s analysis could change if focused only on editors who have moved past initial experimentation.

Even more importantly, the findings reported by the Wall Street Journal are not a measure of the number of people participating in a given month. Rather, they come from the part of Dr. Ortega’s research that attempts to measure when individual Wikipedia volunteers start editing, and when they stop. Because it’s impossible to make a determination that a person has left and will never edit again, there are methodological challenges with determining the long term trend of joining and leaving: Dr. Ortega qualifies as the editor’s “log-off date” the last time they contributed. This is a snapshot in time and doesn’t predict whether the same person will make an edit in the future, nor does it reflect the actual number of active editors in that month.

Dr. Ortega supplements this research with data about the actual participation (number of changes, number of editors) in the different language editions of our projects. His findings regarding actual participation are generally consistent with our own, as well as those of other researchers such as Xerox PARC’s Augmented Social Cognition research group.

What do those numbers show?  Studying the number of actual participants in a given month shows that Wikipedia participation as a whole has declined slightly from its peak 2.5 years ago, and has remained stable since then. (See WikiStats data for all Wikipedia languages combined.) On the English Wikipedia, the peak number of active editors (5 edits per month) was 54,510 in March 2007. After a more significant decline by about 25%, it has been stable over the last year at a level of approximately 40,000. (See WikiStats data for the English Wikipedia.) Many other Wikipedia language editions saw a rise in the number of editors in the same time period. As a result the overall number of editors on all projects combined has been stable at a high level over recent years. We’re continuing to work with Dr. Ortega to specifically better understand the long-term trend in editor retention, and whether this trend may result in a decrease of the number of editors in the future.

They add details that amount to not much of a story, if you consider all the factors involved, including the maturity of Wikipedia itself.
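The definitions in that passage are easier to compare side by side in code. Here is a minimal sketch in Python, run on an invented edit log (the names and counts are made up, not data from the study). It shows the three measures being contrasted: an editor as anyone with at least one edit versus at least five, a “log-off date” as the month of an editor’s last contribution, and active editors as those with five or more edits in a given month. The last function just redoes the arithmetic on the two English Wikipedia figures quoted above.

```python
from collections import Counter

# Invented edit log of (editor, month) pairs, purely for illustration.
EDIT_LOG = (
    [("ann", "2009-01")] * 6
    + [("ann", "2009-02")] * 5
    + [("bob", "2009-01")] * 1
    + [("cy", "2009-01")] * 3
    + [("cy", "2009-02")] * 2
)


def editors_with_at_least(edit_log, min_edits):
    """Everyone whose lifetime edit count reaches the threshold."""
    totals = Counter(editor for editor, _ in edit_log)
    return {editor for editor, n in totals.items() if n >= min_edits}


def log_off_month(edit_log):
    """Month of each editor's most recent edit.

    This is a snapshot: it cannot tell whether the editor will come back.
    """
    last = {}
    for editor, month in edit_log:
        last[editor] = max(month, last.get(editor, month))
    return last


def active_editors(edit_log, month, min_edits=5):
    """Editors with at least `min_edits` edits within one month."""
    counts = Counter(editor for editor, m in edit_log if m == month)
    return {editor for editor, n in counts.items() if n >= min_edits}


def percent_decline(peak, later):
    """Percentage drop from a peak value to a later value."""
    return 100 * (peak - later) / peak


# Figures quoted above: peak of 54,510 and a later level of roughly 40,000.
english_decline = percent_decline(54510, 40000)
```

On the quoted figures the drop works out to a bit under 27%, which squares with the “about 25%” in the passage given that 40,000 is itself approximate.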

As it happens I’m an editor of Wikipedia, at least by the organization’s own definitions. I’ve made fourteen contributions, starting with one in April 2006, and ending, for the moment, with one I made this morning. Most involve a subject I know something about: radio. In particular, radio stations, and rules around broadcast engineering. The one this morning involved edits to the WQXR-FM entry. The edits took a lot longer than I intended — about an hour, total — and were less extensive than I would have made, had I given the job more time and had I been more adept at editing references and citations. (It’s pretty freaking complicated.) The preview method of copy editing is also time consuming as well as endlessly iterative. It was sobering to see how many times I needed to go back and forth between edits and previews before I felt comfortable that I had contributed accurate and well-written copy.

In fact, as I look back over my fourteen editing efforts, I can see that most of them were to some degree experimental. I wanted to see if I had what it took to be a dedicated Wikipedia editor, because I regard that as a High Calling. The answer so far is a qualified no. I’ll continue to help where I can. But on the whole my time is better spent doing other things, some of which also have leverage with Wikipedia, but not of the sort that Dr. Ortega measured in his study.

For example, photography.

As of today you can find 113 photos on Wikimedia Commons that I shot. Most of these have also found use in Wikipedia. (Click “Check Usage” at the top of any shot to see how it’s been used, and where.) I didn’t put any of these shots in Wikimedia Commons, nor have I put any of them in Wikipedia. Other people did all of that. To the limited degree I can bother to tell, I don’t know anybody who has done any of that work. All I do is upload shots to my Flickr site, caption and tag them as completely as time allows, and let nature take its course. I have confidence that at least some of the shots I take will be useful. And the labor involved on my part is low.

I also spent about half an hour looking through Dr. Ortega’s study. My take-away is that Wikipedia has reached a kind of maturity, and that the fall-off in participation is no big deal. This is not to say that Wikipedia doesn’t have problems. It has plenty. But I see most of those as features rather than as bugs, even if they sometimes manifest, at least superficially, as the latter. That’s not much of a story, but it’s a hell of an accomplishment.


I dunno why the New York Times appeared on my doorstep this morning, along with our usual Boston Globe (Sox lost, plus other news) — while our Wall Street Journal did not. (Was it a promo? There was no response envelope or anything. And none of the neighbors gets a paper at all, so it wasn’t a stray, I’m pretty sure.) Anyway, while I was paging through the Times over breakfast, I was thinking, “It’s good, but I’m not missing much here–” when I hit Hot Story to Has-Been: Tracking News via Cyberspace, by Patricia Cohen, on the front page of the Arts section. It’s about MediaCloud, a Berkman Center project, and features quotage from Ethan Zuckerman and Yochai Benkler (pictured above at last year’s Berkman@10).

The home page of MediaCloud explains,

The Internet is fundamentally altering the way that news is produced and distributed, but there are few comprehensive approaches to understanding the nature of these changes. Media Cloud automatically builds an archive of news stories and blog posts from the web, applies language processing, and gives you ways to analyze and visualize the data.

This is a cool thing. It also raises the same question that is asked far too often in other contexts: Why doesn’t Google do that? Here’s the short answer: Because the money’s not there. For Google, the money is in advertising.

Plain enough, but let’s go deeper.

It’s an interesting fact that Google’s index covers the present, but not the past. When somebody updates their home page, Google doesn’t remember the old one, except in cache, which gets wiped out after a period of time. It doesn’t remember the one before that, or the one before that. If it did it might look, at least conceptually, like Apple’s Time Machine:


If Google were a time machine, you could not only see what happened in the past, but do research against it. You could search for what’s changed. Not on Google’s terms, as you can, say, with Google Trends, but on your own, with an infinite variety of queries.

I don’t know if Google archives everything. I suspect not. I think they archive search and traffic histories (or they wouldn’t be able to do stuff like this), and other metadata. (Maybe a Googler can fill us in here.)

I do know that Technorati keeps (or used to keep) an archive of all blogs (or everything with an RSS feed). This was made possible by the nature of blogging, which is part of the Live Web. It comes time-stamped, and with the assumption that past posts will accumulate in a self-archiving way. Every blog has a virtual directory path that goes domainname/year/month/day/post. Stuff on the Static Web of sites (a real estate term) was self-replacing and didn’t keep archives on the Web. Not by design, anyway.
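That path convention is what makes a blog self-archiving: the date travels with the address. Here is a minimal sketch in Python of reading it back out, assuming the post URL ends in year/month/day/post as described. The example.com address and the function name are made up for illustration, and real blogs vary in how closely they follow the pattern.

```python
from datetime import date
from urllib.parse import urlparse


def parse_permalink(url):
    """Return (posting_date, post_name) for a year/month/day/post URL.

    Returns None if the last four path segments do not fit the pattern.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    if len(segments) < 4:
        return None
    year, month, day, post_name = segments[-4:]
    try:
        posted = date(int(year), int(month), int(day))
    except ValueError:
        # Non-numeric segments or an impossible calendar date.
        return None
    return posted, post_name


# Hypothetical post address following the convention.
example = parse_permalink("http://example.com/2009/11/27/some-post/")
```

A page on the Static Web has nothing comparable in its address, so an archive of it has to record the date from the outside.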

I used to be on the Technorati advisory board and talked with the company quite a bit about what to do with those archives. I thought there should be money to be found through making them searchable in some way, but I never got anywhere with that.

If there isn’t an advertising play, or a traffic-attraction play (same thing in most cases), what’s the point? So goes the common thinking about site monetization. And Google is in the middle of that.

So this got me to thinking about research vs. advertising.

If research wants to look back through time (and usually it does), it needs data from the past. That means the past has to be kept as a source. This is what MediaCloud does. For research on news topics, it does one of the many things I had hoped Technorati would do.

Advertising cares only about the future. It wants you to buy something, or to know about something so you can act on it at some future time.

So, while research’s time scope tends to start in the present and look back, advertising’s time scope tends to start in the present and look forward.

To be fair, I commend Google for all the stuff it does that is not advertising-related or -supported, and it’s plenty. And I commend Technorati for keeping archives, just in case some business model does finally show up.

But in the meantime I’m also wondering if advertising doesn’t have some influence on our sense of how much the past matters. And my preliminary response is, Yes, it does. It’s an accessory to forgetfulness. (Except, of course, to the degree it drives us to remember — through “branding” and other techniques — the name of a company or product.)

Just something to think about. And maybe research as well. If you can find the data.

I’ve been a Wall Street Journal subscriber since the 1970s. I still am. The paper shows up at my doorstep every day.

I’ve also been a subscriber to the Journal online. It costs extra. I’ve gladly paid it, even though I think the paper makes a mistake by locking its archives behind a paywall. (Sell the news, give away the olds, I say.)

I’d still be glad to pay it, if the Journal made it easy. But they don’t. No paper does, far as I know. In fact very few media make it easy at all to give them money for their online goods.

As it happens, my Journal online subscription just ran out. To fix matters, the paper’s site prompted me not to renew, but to update my credit card. So I went through the very complicated experience of updating that data, with the form losing most of the data each time I had to fill in a blank missed on the last try. (Why separate house number from street name?) In the midst of this it wouldn’t take my known password, and I had to have them do the email thing, through which I got to create a new password after clicking on a link in an email sent to me by the WSJ “system.” Even after doing that, getting the new credit card info in there, and seeing that everything seemed to be fine (no more mistakes noticed on the form)… I can’t get in.

Did the payment go through? I have no idea. The credit card, from Chase, also has an impossible website. I don’t even want to go there.

In any case, I can no longer get in. At the top of the login page, it says “Welcome, Doc Searls.” Below that it tells me to log out if I am not myself. And below that it says

Your Current Subscription(s)

I can still access my Personal Information, which includes rude questions about my income, the number of people in my organization and how many stock transactions my household made in the past 12 months. Earth to Journal: Readers hate filling out shit like that. Why put readers over a grill like that? Does it really help sales? Please.

Okay, between the last paragraph and this one I somehow got far enough into the site to actually read some stuff. Specifically, this Peggy Noonan piece, and this PJ O’Rourke piece. In the midst of hunting those down, search results that failed said this:

No Information Available

Your subscription does not include access to this service.

If you have any questions please call Customer Service at 800-369-2834 (or 609-514-0870) or contact us by e-mail at onlinejournal at wsj.com. Representatives are available Monday-Friday from 7 a.m. to 10 p.m. & Saturday from 8 a.m. to 3 p.m. (ET). Subscribers outside the United States, click here.

Good gawd.

Why put readers through #$%^& ordeals like these? Not to mention a website that’s already cluttered beyond endurance.

Because it’s always been done this way, they say. “Always” meaning “since 1995.”

Actually, it’s gotten worse in recent years, all the better to drag eyeballs across advertising, and to maximize the time readers spend on the site.

Hell, I’ve been on the WSJ site for the last hour, hating every second of it.

We can do better than this. I say we, because I have no faith at all that the Journal, or any of the papers, will ever fix problems that have been obvious for the duration. The readers are going to have to tell them what to do. And I mean all of them at once. We need one basic way to interact with media and their systems for accepting payments. Not as many different ways as there are media, all of them bad.
