We are what we do.

We are more than that, of course, but it helps to have answers to the questions “What do you do?” and “What have you done?”

One of the many notable things Persephone did was survive breast cancer. It was a subject that came up often during the year we shared as fellows at the Berkman Center. It may not have been a defining thing, but it helped build her already strong character. Persephone also said she knew that her personal war with the disease might not be over. The risks for survivors are always there.

So it was not just by awful chance that Persephone showed up at a Berkman event this Spring wearing a turban. She was on chemo, she said, but optimistic. Thin and frail, she was still pressing on with work, carrying the same good humor, toughness, intelligence and determination.

The next time I saw her, in early June, she looked worse. Then, on June 24, Ethan Zuckerman sent an email to Berkman friends, letting us know that Persephone’s health was diminishing quickly, and that she “probably will not live through July.” He also said that she had moved to a hospice, but was doing well enough to read email and accept a few visitors — and that he had hoped to visit her on July 6. Just five days later, Ethan wrote to say that Persephone had died the night before. I had been working in slow motion on an email to her — thinking, I guess, that Ethan’s July 6 date was an appointment she would keep. This post began as that email.

Persephone is gone, but her work isn’t, and that’s what I want to talk about. It’s a subject I wanted to bring up with her, and one I’m sure all her friends care about. We all should.

What I want to talk about is not “carrying on” the work of the deceased in the usual way that eulogizers do. What I’m talking about is keeping Persephone’s public archives in a published, accessible and easily found state. I fear that if we don’t make an effort to do that, for everybody, we’ll lose them.

The Web went commercial in 1995, and has only become more so since. Today it is a boundless live public marketplace, searched mostly through one company’s engine, which continues to adapt accordingly. While Google’s original mission (“to organize the world’s information and make it universally accessible and useful”) persists, its commercial imperatives cannot help but subordinate its noncommercial ones.

In my own case I’m finding it harder and harder to use Google (or any search engine) to find my own archived work, even if there are links to it. The Live Web, which I first wrote about in 2005, has come to be known as the “real time” Web, which is associated with Twitter and Facebook as well as Google. What’s live, what’s real time, is now. Not then.

Today almost no time passes between the publishing of anything and its indexing by Google. This is good, but it is also aligned with commercial imperatives that emphasize the present and dismiss the past. No seller has an interest in publishing last week’s offerings, much less last year’s or last decade’s. What would be the point?

It would help if there were competition among search engines, or more specialized ones, but there’s not much hope for that. Bing’s business model is the same as Google’s. And the original Live Web search engines (Technorati, PubSub and Blogpulse, among others) are gone or have moved on to other missions. Perhaps ironically, Technorati maintained an archive of all blogging for half a decade. But I’ve been told that’s gone. One engine is still there, but recast as a news engine. Only IceRocket persists as a straightforward Live Web engine, sustained, I suppose, by Mark Cuban’s largesse. (For which I thank him. IceRocket is outstanding.)

For archives we have two things, it seems. One is search engines concerned mostly about the here and now, and the other is the Internet Archive. The latter does an amazing job, but finding stuff there is a chore if you don’t start with a domain name.

Meanwhile I have no idea how long tweets last, and no expectation that Twitter (or anybody other than a few individuals) will maintain them for the long term. Nor do I have a sense of how long anything will (or should) last inside Facebook, Linkedin or any other commercial walled garden.

To be fair, everything on the Web is rented, starting with domain names. I “own” my domain name only for as long as I keep paying a domain registrar for the rights to use it. Will it stay around after I’m gone? For how long? All of us rent our servers, even if we own them, simply because they use electricity, take up space and need to be maintained. Who will do that after their paid-for purposes expire? Why? And again, for how long?

Persephone worked for years at Internews. I assume her work there will last as long as the organization does. Here’s the Google cache of her Key Staff bio. Her tweets (her last was June 9th) will persist as long as Twitter doesn’t bother to get rid of them, I suppose. Here’s a Google search for her name. Here’s her Berkman alum page. Here’s her Linkedin. Here are her Delicious bookmarks. More to the point of this post, here’s her Media Re:public blog, with many links out to other sources, including her own. Here’s the Media Re:public report she led. And here’s an Internews search for Persephone, which has five pages of results.

All of this urges us toward a topic and cause that was close to Persephone’s mind and heart: journalism. If we’re serious about practicing journalism on the Web, we need to preserve it at least as well as we publish it.

The comment thread in my last post was lengthened by Seth Finkelstein’s characterization of me as “basically a PR person”. I didn’t like that, and a helpful back-and-forth between the two of us (and others) followed. In the midst of the exchange I said I would unpack some of my points in a fresh post rather than branch off in the comment thread. So here we are.

We tend to be defined by what we do. Or, in some cases, what we’ve done. Many of our surnames describe the work of an ancestor. Carpenter. Baker. Weaver. Tanner. Of my own surname, it says here,

In his book, Surnames of the United Kingdom, Harrison writes that the surname Searle, Searls, Searles, Serle, Serles, Serrell, or Serrill is of Teutonic origin signifying “Armour or Arms”. It is derived from the Old Teutonic “Serlo, Sarla, Sarl, Sarilo, Serilo, Serli” and the Old English “Searo”; it is the equivalent of the Old High German “Saro”, which is the same as the Old Norse “Sorus”, meaning “armor, arms, skill or device.”

A soldier, I guess. My father, Allen H. Searls, was a soldier, both before and during WWII (he re-enlisted at age 36). But basically he was a carpenter: a builder. So was his father, George William Searls. Also George’s father, Allen Searls. Also Allen’s father, Samuell Searls. I’m not, but my daughter Colette married Todd Carpenter. So my grandson is a Carpenter too.

By the time I knew him, my father was an insurance agent. But he saw himself more as a builder of useful stuff. Thus our basement was a workshop. Pop’s brother-in-law, Archie Apgar, was a banker by day and a builder the rest of the time. In the summer of 1949, the two of them together built our summer house in the woods of South Jersey. (In a paradise of pine, oak and blueberries, now home to a shopping center.) My father was also a longshoreman, a cable-rigger on the George Washington Bridge, and a builder of railroad trestles. He did that in Alaska, where he met my mother, a social worker who had grown up in North Dakota. They married after the War and moved to New Jersey, where Mom worked for many years as a teacher. Her maiden name was Oman, borrowed by a grandfather from a fellow Swede on the boat over from Malmö (or maybe it was Göteborg… someplace with an umlaut).

Mom was a good writer, and in that respect I took more after her than Pop. I started writing in high school, covered sports for my college paper, and then wrote for a variety of newspapers and magazines across the many years since.

But I’ve done lots of other stuff too. I was a moving man. I drove an ice cream truck. I worked in frozen produce wholesaling (which consisted of moving skids of goods with forklifts and carrying clipboards in and out of freezing warehouses, railroad cars and tractor trailers). I worked in the fronts and the backs of restaurant kitchens, and waiting tables. I flipped burgers and worked counters in fast food joints. I worked in the kitchen at a hospital, and delivering food to patients. I worked in retail, both in sales and management. I worked as a community organizer in a social welfare project (a job that later gave me respect for Barack Obama’s work at the same job, especially since he was good at it and I was not). I worked in radio, doing everything from selling ads to spinning records to engineering, including maintaining transmitters and tower-climbing to change bulbs. I did site studies for FM stations, and made new facility applications to the FCC. I worked in academic parapsychology, helping with research and editing publications. I worked in a landlord’s sawmill when I couldn’t make the rent. And I worked in advertising and PR. Next to writing, that’s the job I held longest.

In 1978 I co-founded Hodskins Simone & Searls, an advertising agency in Durham, North Carolina. By 1980 we came to specialize in what was then called “high tech”. We did well and opened a second office in Palo Alto, moving there completely in 1986. A couple years later we created a division called The Searls Group, which specialized in PR, and eventually spun off on its own as a marketing consultancy. Our clients included Farallon, Symantec, The Burton Group, pieces of Apple and Motorola, Sun Microsystems, Hitachi Semiconductor, Zenith Data Systems and many more.

I had mixed feelings about doing PR, because I was still a journalist at heart, even though I was only freelancing at it during that time. And, while being a journalist made me a better flack, it didn’t make me less of one. I also found that PR folk had little leverage on corporate strategy. Their function was output, not input. So, after a while, I moved The Searls Group’s work up the client stack, to the point where we did consulting at the CXO level, helping clients understand and engage their markets, rather than helping them craft and send messages to those markets. You might say our job was delivering (often unwelcome) clues to the places where those clues were needed most. This shift started in the early ’90s and was done by the time Chris Locke, Rick Levine, David Weinberger and I wrote The Cluetrain Manifesto, in 1999.

Not long after Cluetrain came out as a book in early 2000, Jakob Nielsen noted the use of the first person plural voice in the original Manifesto. When we talked about “we”, as with this here…


… we were not speaking as marketers. We were speaking as human beings, out in the marketplace. What happened, Jakob said, was that “You guys defected from marketing, and sided with markets against marketing.”

He was right.

The great irony that followed was that Cluetrain was generally classified as a marketing book, and its closest followers have been in marketing as well. Many marketers have been inspired by Cluetrain to improve marketing, including the practices of advertising and PR. Along those same lines, Cluetrain has also been credited with foreseeing the “social” movement in computing and communications, and with inspiring and guiding that movement as well. Look up Cluetrain+social on Google and see what comes up. (Here’s a Twitter search for the same.)

I’m not proud of, or even happy with, either of those developments. Not long ago I even suggested that “social media” is a crock. My point was not to denigrate people doing good work in the social media space, but rather to point out that our collective vision of this space was wrongly limited to what could be done on Facebook, Twitter and other commercial “platforms”. Ignored were the freedom and independence granted by the Net’s own open and essentially ownerless platforms and protocols, and the need to equip individuals with their own instruments of independence and engagement: work that’s still mostly not done.

That’s why I welcomed the opportunity to add fresh chapters to Cluetrain for its 10th anniversary edition. For the last few years I’ve been working on Cluetrain’s unfinished (or unstarted) business, through ProjectVRM, at Harvard’s Berkman Center, and through its collection of allied efforts and volunteers, both around the Center and around the world. Thus my own chapter of the latest Cluetrain is titled Markets Are Relationships, and unpacks the ambitions behind VRM (which stands for Vendor Relationship Management):

  1. Provide tools for individuals to manage relationships with organizations. These tools are personal. That is, they belong to the individual, in the sense that they are under the individual’s control. They can also be social, in the sense that they can connect with others and support group formation and action. But they need to be personal first.
  2. Make individuals the collection centers for their own data, so that transaction histories, health records, membership details, service contracts, and other forms of personal data aren’t scattered throughout a forest of silos.
  3. Give individuals the ability to share data selectively, without disclosing more personal information than the individual allows.
  4. Give individuals the ability to control how their data is used by organizations, and for how long, including agreements requiring organizations to delete the individual’s data when the relationship ends.
  5. Give individuals the ability to assert their own “terms of service,” obviating the need for organization-written terms of service that nobody reads and everybody has to “accept” anyway.
  6. Give individuals means for expressing demand in the open market, outside any organizational silo, without disclosing any unnecessary personal information.
  7. Make individuals platforms for business, by opening the market to many kinds of third party services that serve buyers as well as sellers.
  8. Base relationship-managing tools on open standards, open APIs (application program interfaces) and open code. This will support a rising tide of activity that will lift an infinite variety of business boats, plus other social goods.

We don’t have those tools yet. When we do, they will change the way customers relate to companies, and therefore change the reverse as well. That will change the job of marketing, sales, and pretty much everything else a company does — so long as it responds to customers who are far better equipped to express demand, and otherwise relate, than they are today.

So, to sum up, there is a place where I stand in respect to all the above. That place is alongside customers, in the marketplace. Not alongside sellers, even when I’m consulting those sellers. My consulting hat is not a PR or a marketing one. It’s a customer hat. A user hat. (And, to the extent that I’m hired to help make sense of free and open source development, a geek hat.)

This is why I took offense at being labeled a “PR person.” I have no problem with good PR people. In fact I try to help them out, along with everybody else who’s interested in my input. But what makes me valuable, I believe, is where I stand in respect to customers. I’m on their side. I’m trying to help them out, and markets along with them. Maybe I’ll succeed, and maybe not. But I do believe that, in the long run, we will have VRM tools, and that these tools will make life better for everybody in the marketplace, including vendors.

Meanwhile, there is a temptation not only to confuse the past with the present, but the present with the future. We tend to assume that, as John Updike once said (at a time when copiers, answering machines and faxes seemed miraculous), “we live in the age of full convenience”. We don’t. The present is just a draft for the future. Our conveniences are just prototypes.

I’m glad Seth and others (Dave Rogers, where are you?) are out there, calling bullshit on techno-utopians like me. A lot of what Seth and others on that thread had to say was sobering stuff. The flywheels of Old Skool industrial practices, and thinking, have not gone away. They even spin inside “good” companies like Google.

Markets are different now that the Net runs beneath them. There are fewer secrets, and both good ideas and bad can spread with alarming speed. Lately the split between the static and the live web (which most of us call “real-time” and some of us saw coming half a decade and more ago) has become dramatic and confusing. So has the split between fixed and mobile computing and communications. One can get lost through enthusiasm, despair, or both. Hey, the iPhone is a wonderful thing, but — what next? And why? And how?

Markets are no better than we make them. I’m not sure what one should call a person who works on tools to make markets better. But hey, that’s my job.

Guess I’m a builder after all.

@robpatrob (Robert Paterson) asks (responding to this tweet and this post) “Why would GBH line up against BUR? Why have a war between 2 Pub stations in same city?” (In this tweet and this one, Dan Kennedy asks pretty much the same thing.)

The short answer is, Because it wouldn’t be a war. Boston is the world’s largest college town. There are already a pile of home-grown radio-ready program-filling goods here, if one bothers to dig and develop. The standard NPR line-up could also use a challenge from other producers. WGBH is already doing that in the mornings by putting The Takeaway up against Morning Edition. That succeeds for me because now I have more choices. I can jump back and forth between those two (which I do, and Howard Stern as well).

The longer answer is that it gives GBH a start on the inevitable replacement of signal-based radio by multiple streams and podcast line-ups. WGBH has an exemplary record as a producer of television programming, but it’s not setting the pace in other media, including radio. The story is apparent in the first four paragraphs of its About page (which is sure to change):

WGBH is PBS’s single largest producer of content for television (prime-time and children’s programs) and the Web. Some of your favorite series and websites — Nova, Masterpiece, Frontline, Antiques Roadshow, Curious George, Arthur, and The Victory Garden, to name a few — are produced here in our Boston studios.

WGBH also is a major supplier of programs heard nationally on public radio, including The World. And we’re a pioneer in educational multimedia and in media access technologies for people with hearing or vision loss.

Our community ties run deep. We’re a local public broadcaster serving southern New England, with 11 public television services and three public radio services — and productions (from Greater Boston to Jazz with Eric in the Evening) that reflect the issues and cultural riches of our region. We’re a member station of PBS and an affiliate of both NPR and PRI.

In today’s fast-changing media landscape, we’re making sure you can find our content when and where you choose — on TV, radio, the Web, podcasts, vodcasts, streaming audio and video, iPhone applications, groundbreaking teaching tools, and more. Our reach and impact keep growing.

Note the order: TV first, radio second, the rest of it third. But where WGBH needs to lead in the future is with #3: that last paragraph. Look at WGBH’s annual report. It’s very TV-heavy. Compare its radio productions to those of Chicago Public Radio or WNYC. Very strong in classical music (now moving over to WCRB, at least on the air), and okay-but-not-great in other stuff.

Public TV has already become a ghetto of geezers and kids, while the audience between those extremes is diffusing across cable TV and other media. An increasingly negligible number of people watch over-the-air (OTA) TV. Here WGBH lost out too. Its old signal on Channel 2 was huge, reaching more households than any other in New England. Now it’s just another UHF digital signal, like its own WGBX/44, with no special advantages. Public radio is in better shape, for now, because its band isn’t the ever-growing accordion file that cable TV has become; and because most of it still lives in a regulated protectorate at the bottom fifth of the FM band. It also helps public radio that the rest of both the FM and the AM bands suck so royally. (Only sports and political talk are holding their own. Music programming is losing to file sharing and iPods. All-news stations are yielding to iPhone programs that offer better news, weather and traffic reporting. In Boston WBZ is still a landmark news station, but it has to worry a bit with WGBH going in the same direction.)

So the timing is right. WGBH needs to start sinking new wells into the aquifer of smart, talented and original people and organizations here in the Boston area, and taking the lead in producing great new programming with what they find. I’ll put in another plug for Chris Lydon’s Open Source, which is currently available only in podcast/Web form. And there is much more, including Cambridge-based PRX’s enormous portfolio of goods. (Disclosure: my work with the Berkman Center is partially funded through PRX, and those folks, like Chris, are good friends.)

In the long run what will matter are sources, listeners, and the finite amount of time the latter can devote to the former. Not old-fashioned signals.

P.S. to Dan Kennedy’s tweeted question, “Is there another city in the country where two big-time public radio stations go head-to-head on news? Can’t think of one.” Here are a few (though I’d broaden the answer beyond “news,” since WBUR isn’t just that):

All with qualifications, of course. In some cases you can add in Pacifica (which, even though my hero Larry Josephson once called it a “foghorn for political correctness,” qualifies as competition). Still, my point is that there is room for more than one mostly-talk (or news) public radio station in most well-populated regions. Even in Boston, where WBUR has been king of the hill for many years. Hey, other things being equal (and they never are), the biggest signal still tends to win. And in Boston, WGBH has a bigger signal than WBUR: almost 100,000 watts vs. 12,000 watts. WBUR radiates from a higher elevation, but its signal is directional. On AM that means it’s stronger than the listed power in some directions and weaker in others; but on FM it means no more than the listed power in some directions and weaker in others. See the FCC’s relative field polar plot to see how WBUR’s signal is dented in every direction other than a stretch from just west of North to Southeast. In other words, toward all but about a third of its coverage area. To sum up, WGBH has a much punchier signal. I’m sure the GBH people also have this in mind when they think about how they’ll compete with BUR.

Dependence begets subservience and venality, suffocates the germ of virtue, and prepares fit tools for the designs of ambition. -- Thomas Jefferson


Near the start of his Institutional Corruption talk the other day, Larry Lessig sourced the quote above, from Thomas Jefferson. Larry was making a point: that the Framers were interested in personal independence, and not just that of a former colony. The Framers operated, however, in advance of the Industrial Revolution, which was won by Industry and lost by the rest of us — or at least by some of the roles we play in the marketplace.

Such as our roles as customers. While being customers gives us choices among products and services, many of the companies behind those products and services make us dependent on them, in ways we would not prefer if we had a choice. For a measure of how little choice we have, ask yourself how many times you’ve clicked “accept” to “Terms of Service” that typically give all advantages to the seller. Or look at the number of cookies stored in your browser.

Well, the tide is turning. We’re finally starting to see a few tools that give users control over how data is collected and used. We’re working on some of those in the VRM community. And they’re a subject of discussion at an event that starts at 9:30am on Tuesday, at Harvard Law School, with a panel. You can register here. Even if you show up only for the panel, it’ll help us know how many will be there.

There’s lots more about it at Civilizing the Personal Data Frontier, over at the ProjectVRM blog. Hope to see you there.

For the form of life we call business, we are at a boundary between eras. For biological forms of life, the most recent of these is the K-T boundary between the Mesozoic and the Cenozoic Eras. The Mesozoic Era ended when Earth was struck by an object that left a crater 110 miles wide and a world-wide layer of iridium-rich crud. Below that layer lies the Age of Dinosaurs, completed. Above that layer accumulate the fossils of life forms that survived the change, and took advantage of it. Notable among these is a branch of theropod dinosaurs we call birds.

In business we have the I-I boundary: the one between the Industrial and Information ages (which Alvin Toffler first observed in The Third Wave, published in 1980). Below that boundary we find a communications environment dominated by telecom and cablecom. Above it we find a radically different communications environment that still supports voice and video, but as just two among an endless variety of other applications. We call that environment the Internet.

At this moment in history most of us know the Internet as a tertiary service of telephone and cable companies, which still make most of their money selling telephone service and cable TV. Since those are highly regulated businesses, the Internet is subject to degrees of regulatory capture. Some of that capture is legal, but much of it is conceptual, for example when we see the Internet as a grace of telecom and cablecom — rather than as something that subsumes and obsoletes both of those Industrial Age frames.

Such is the risk with “broadband” — a term inherited by the Internet from both telecom and cablecom, and which is a subject of interest for both Congress and the FCC. In April of this year the FCC announced the development of a national broadband plan, subtitled “Seeks Public Input on Plan to Ensure Every American has Access to Broadband Capability”. In July the commission announced that Harvard’s Berkman Center would conduct “an independent review of broadband studies” to assist the FCC. Then yesterday the center put up a notice that it “is looking for a smart, effective fellow to join our broadband research team”. (This is more than close to home for me, since I am a fellow at Berkman. So I need to say that the broadband studies review is not my project — mine is this one — and that I am not speaking for the Berkman Center here, or even in my capacity as a fellow.)

The challenge here for everybody is to frame our understanding of the Net, and of research concerning the Net, in terms that are as native to the Net as possible, and not just those inherited from the Industrial Age businesses to which it presents both threats and promise (the former more obvious than the latter). This will be very hard, because the Internet conversation is still mostly a telecom and cablecom conversation. (It’s also an entertainment industry conversation, to the degree that streaming and sharing of audio and video files are captive to regulations driven by the recording and movie industries.)

This is the case especially for legislators and regulators, too few of whom are technologists. Some years ago Michael Powell, addressing folks pushing for network neutrality legislation, said that he had met with nearly every member of Congress during his tour of duty as FCC chairman, and that he could report that nearly all of them knew very little about two subjects. “One is technology, and the other is economics,” he said. “Now proceed.”

Here is what I am hoping for, as we proceed both within this study and beyond it to a greater understanding of the Internet and the new Age it brings on:

  • That “broadband” comes to mean the full scope of the Internet’s capabilities, and not just data speeds.
  • That we develop a native understanding of what the Internet really is, including the realization that what we know of it today is just an early iteration.
  • That telecom and cablecom companies not only see the writing on the wall for their old business models, but embrace other advantages of incumbency, including countless new uses and businesses that can flourish in an environment of wide-open and minimally encumbered connectivity — which they have a privileged ability to facilitate.
  • That the Net’s capacities are not only those provided from the inside out by “backbone” and other big “carriers”, but from the outside in by individuals, small and mid-size businesses (including other Internet service providers, such as WISPs) and municipalities.

That last item is important because carriers are the theropods of our time. To survive, and thrive, they need to adapt. The hardest challenge for them is to recognize that the money they leave on the shrinking Industrial Age table is peanuts next to the money that will appear on the Information Age table they are in a privileged position to help build.

I dunno why the New York Times appeared on my doorstep this morning, along with our usual Boston Globe (Sox lost, plus other news), while our Wall Street Journal did not. (Was it a promo? There was no response envelope or anything. And none of the neighbors gets a paper at all, so it wasn’t a stray, I’m pretty sure.) Anyway, while I was paging through the Times over breakfast, I was thinking, “It’s good, but I’m not missing much here–” when I hit Hot Story to Has-Been: Tracking News via Cyberspace, by Patricia Cohen, on the front page of the Arts section. It’s about MediaCloud, a Berkman Center project, and features quotage from Ethan Zuckerman and Yochai Benkler.

The home page of MediaCloud explains,

The Internet is fundamentally altering the way that news is produced and distributed, but there are few comprehensive approaches to understanding the nature of these changes. Media Cloud automatically builds an archive of news stories and blog posts from the web, applies language processing, and gives you ways to analyze and visualize the data.

This is a cool thing. It also raises the same question that is asked far too often in other contexts: Why doesn’t Google do that? Here’s the short answer: Because the money’s not there. For Google, the money is in advertising.

Plain enough, but let’s go deeper.

It’s an interesting fact that Google’s index covers the present, but not the past. When somebody updates their home page, Google doesn’t remember the old one, except in cache, which gets wiped out after a period of time. It doesn’t remember the one before that, or the one before that. If it did it might look, at least conceptually, like Apple’s Time Machine:


If Google were a time machine, you could not only see what happened in the past, but do research against it. You could search for what’s changed. Not on Google’s terms, as you can, say, with Google Trends, but on your own, with an infinite variety of queries.

I don’t know if Google archives everything. I suspect not. I think they archive search and traffic histories (or they wouldn’t be able to do stuff like this), and other metadata. (Maybe a Googler can fill us in here.)

I do know that Technorati keeps (or used to keep) an archive of all blogs (or everything with an RSS feed). This was made possible by the nature of blogging, which is part of the Live Web. It comes time-stamped, and with the assumption that past posts will accumulate in a self-archiving way. Every blog has a virtual directory path that goes domainname/year/month/day/post. Stuff on the Static Web of sites (a real estate term) was self-replacing and didn’t keep archives on the Web. Not by design, anyway.
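To make the self-archiving point concrete, here is a minimal sketch of how a time-stamped path of the form domainname/year/month/day/post could be read back into a date and filed into a monthly archive. It is an illustration only, not how Technorati or any real engine worked: the function names `parse_permalink` and `build_archive` and the example URLs are hypothetical, and real blogs vary in their path layouts.

```python
from collections import defaultdict
from datetime import date
from urllib.parse import urlparse


def parse_permalink(url):
    """Split a domainname/year/month/day/post URL into (domain, date, slug).

    Returns None when the path does not follow that layout.
    """
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) != 4:
        return None
    year, month, day, slug = parts
    try:
        posted = date(int(year), int(month), int(day))
    except ValueError:
        # Non-numeric or impossible dates are not dated permalinks.
        return None
    return parsed.netloc, posted, slug


def build_archive(urls):
    """Group post slugs by (year, month), as a blog's monthly archive would."""
    archive = defaultdict(list)
    for url in urls:
        result = parse_permalink(url)
        if result is None:
            continue  # e.g. a Static Web page with no date in its path
        _, posted, slug = result
        archive[(posted.year, posted.month)].append(slug)
    return dict(archive)


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    example_urls = [
        "http://blog.example.com/2009/07/04/first-post",
        "http://blog.example.com/2009/07/20/second-post",
        "http://blog.example.com/2009/08/01/third-post",
        "http://www.example.com/about.html",
    ]
    print(build_archive(example_urls))
```

The point of the sketch is that the date travels with the address itself, so an archive can be rebuilt from nothing but a list of permalinks. A Static Web page such as the last example carries no such information and simply drops out.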

I used to be on the Technorati advisory board and talked with the company quite a bit about what to do with those archives. I thought there should be money to be found through making them searchable in some way, but I never got anywhere with that.

If there isn’t an advertising play, or a traffic-attraction play (same thing in most cases), what’s the point? So goes the common thinking about site monetization. And Google is in the middle of that.

So this got me to thinking about research vs. advertising.

If research wants to look back through time (and usually it does), it needs data from the past. That means the past has to be kept as a source. This is what MediaCloud does. For research on news topics, it does one of the many things I had hoped Technorati would do.

Advertising cares only about the future. It wants you to buy something, or to know about something so you can act on it at some future time.

So, while research’s time scope tends to start in the present and look back, advertising’s time scope tends to start in the present and look forward.

To be fair, I commend Google for all the stuff it does that is not advertising-related or -supported, and it’s plenty. And I commend Technorati for keeping archives, just in case some business model does finally show up.

But in the meantime I’m also wondering if advertising doesn’t have some influence on our sense of how much the past matters. And my preliminary response is, Yes, it does. It’s an accessory to forgetfulness. (Except, of course, to the degree it drives us to remember — through “branding” and other techniques — the name of a company or product.)

Just something to think about. And maybe research as well. If you can find the data.

In the month since it hit the streets (at least here in the U.S.), I’ve been surprised at how little those who like Cluetrain know about the new, 10th anniversary edition of the book. Many assume that it’s a fancy new edition of the same old thing. That’s true to the degree that it comes with a hard cover and a nice design. But there are also five new chapters by the four original authors, plus three additional chapters: one each by Dan Gillmor, Jake McKee and JP Rangaswami. In other words, it’s a lot thicker and more substantial than the original.

So yeah, I’m promoting it a bit. I’ve done approximately none of that, and it deserves any plug it gets. A lot of good work went into it.

The shot above is from a Berkman YouTube video of a Cluetrain discussion at Harvard Law School, led by Jonathan Zittrain, and featuring Dr. Weinberger and myself.

Ten years ago The Cluetrain Manifesto was a website that had been up for a couple of months — long enough to create a stir and get its four authors a book deal. By early June we had begun work on the book, which would wrap in August and come out in January. So at the moment we’re past the website’s anniversary and shy of the book’s.

That’s close enough for the 10th Anniversary Edition of The Cluetrain Manifesto, which will hit the streets this month. The new book, which arrived at my house yesterday, is the same as the original (we didn’t change a word), but with the addition of a new introduction by David Weinberger, four new chapters, one by each of the four authors (Chris Locke and Rick Levine, in addition to Dr. Weinberger and myself), and one each by Dan Gillmor, Jake McKee and JP Rangaswami.

A lot has happened in the last decade. A lot hasn’t happened too. To reflect on both, the Berkman Center will host a conversation called Cluetrain at 10: So How’s Utopia Working Out for Ya? at Harvard Law School.

David Weinberger and I will be joined by Jonathan Zittrain, a Harvard Law professor and author of The Future of the Internet — and How to Stop It. “JZ” was a student at HLS when he co-founded the Berkman Center eleven years ago. David and I are both fellows at the center as well. The three of us will talk for a bit and then the rest of it will be open to the floor, both in the room and out on the IRC (and other backchannels), since the conversation will be webcast as well. It starts at 6:00 pm East Coast time.

Meet/meat space is the Austin East Classroom of Austin Hall at Harvard Law School. It’s free and open to everybody. Since it’s a classroom and expected to fill up, an RSVP is requested. To do that, go here.

So I’m walking across the Harvard campus, going from one Berkman office to another, listening to KCLU from Santa Barbara on my iPhone. The guest on the show is Berkman’s own John Palfrey. I think, that’s cool. What’s the show? The tuner doesn’t tell me, because (I assume) KCLU doesn’t provide that data along with the audio stream.

To find out, I just sat down on a bench, popped open the laptop and started looking around. KCLU’s site says what’s on now is OnPoint. That’s because the time on the schedule block says 9:00am. It’s currently 10:45am, Pacific. The next show block on the schedule is Fresh Air at 11:00am. John isn’t listed as an OnPoint guest, so… what is the show he’s on?

I wait until the interview with John ends, and then I learn that the show is Here & Now, which KCLU says comes on at 2pm. Here & Now has the JP segment listed. Says this:

More Countries Use Internet Censorship
We’ve heard about countries like China, Iran and North Korea censoring websites. But our guest, John Palfrey of Harvard’s Berkman Center for Internet and Society says the practice is becoming more widespread—more than three dozen countries do extensive censoring, even France, Australia and the U.S. engage in some type of censorship.

Now it’s 11:00am Pacific, and KCLU brings on Science Friday. Also at variance from the schedule.

I’m not sure how to fix the problem of not including show data in a stream (or, if included, getting it displayed on software tuners), though I am sure it’s fixable. More importantly, I am convinced of the need of listeners to know what they’re hearing, to bookmark it, and to find out more about it later. At the very least they should be able to find the answer to the “What was that?” question — without spending fifteen minutes surfing around a browser on a laptop.

Being able to know what you’re hearing would also inform decisions about, say, how much money you’d like to throw at the station or a program, if you’d like to do that. That’s what EmanciPay (which I wrote about yesterday) would help do.

Anyway, that’s why we’re working on Listen Log, as a variety of Media Logging. Input welcome.

New Innerface

Sorry I’ve been quiet. Let’s see… I’ve only blogged on 12 days this month. A new low for me, I’m sure. There are several reasons, all good. The new one, though, is that I’m hunkering down on a book. For the first time, ever. Not easy for me. I’m a sprinter, not a marathon runner. I’m also more distractable than a kitten. That’s good for blogging and tweeting, bad for book-writing. (Where would either blogging or tweeting be without sublimated ADHD? Dropped in half? More?)

I’ve also been awol during an overhaul of Berkman blogs. (Not all those at the last link are hosted by Berkman, but I can’t find another link at the moment, and I need to get back to work.)

In any case, there’s a new WordPress dashboard here, which I’m using for the first time. This little authoring section is called “QuickPress”. I’m writing in HTML, because I assume there’s no other way. At least within this section. Which is cool. I like writing in HTML.

Haven’t found the wysiwyg authoring thing yet. More importantly, I need to get my OPML editor working with the blog. That’s my main means, and that connection seems to be broken. Might be at this end, because I’ve been switching laptops around too. Miss it. I’m an outline-y kinda guy.

Anyway, just letting y’all know that I’m here. Just busy.

In a meeting yesterday, somebody on the IRC shared links to “Re-identification of home addresses from spatial locations anonymized by Gaussian skew” and “Bregman divergences in the (m x k)-partitioning problem”, from Science Digest. Sez the abstract of the latter,

A method of fixed cardinality partition is examined. This methodology can be applied on many problems, such as the confidentiality protection, in which the protection of confidential information has to be ensured, while preserving the information content of the data. The basic feature of the technique is to aggregate the data into m groups of small fixed size k, by minimizing Bregman divergences. It is shown that, in the case of non-uniform probability measures the groups of the optimal solution are not necessarily separated by hyperplanes, while with uniform they are. After the creation of an initial partition on a real data-set, an algorithm, based on two different Bregman divergences, is proposed and applied. This methodology provides us with a very fast and efficient tool to construct a near-optimum partition for the (m×k)-partitioning problem.

Keywords: Confidentiality; Data masking; Fixed cardinality partitioning; Fixed size micro-aggregation; Bregman divergences; Pythagorean property; Convex partition
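For those of us who think better in examples, here is a rough sketch of the general idea of fixed-size micro-aggregation, in Python. It is not the paper's algorithm. It assumes one-dimensional data and uses squared distance (one simple Bregman divergence), for which the best stand-in for a group is its mean. The function name microaggregate and the sample numbers are made up.

```python
def microaggregate(values, k):
    """Mask values by replacing each with the mean of its size-k group.

    Illustrative heuristic only: sort the values, cut them into
    consecutive runs of k, and average each run. Assumes len(values)
    is a multiple of k.
    """
    if k < 1 or len(values) % k != 0:
        raise ValueError("expects len(values) to be a positive multiple of k")
    # Indices of the records, ordered by value, so neighbors group together.
    order = sorted(range(len(values)), key=lambda i: values[i])
    masked = [0.0] * len(values)
    for start in range(0, len(order), k):
        group = order[start:start + k]
        mean = sum(values[i] for i in group) / k
        for i in group:
            masked[i] = mean
    return masked


# Made-up data: four records, groups of two.
print(microaggregate([1, 10, 2, 11], 2))  # [1.5, 10.5, 1.5, 10.5]
```

Each record ends up indistinguishable from the other k - 1 in its group, which is the confidentiality part. The group means stay close to the originals, which is the information-preserving part.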

What’s extra wacky is that I actually spent time diving into this stuff, even though it’s about forty thousand leagues over my head. Still, it was fun trying to remember all that math I barely learned too long ago.

As I recall, the highest grade I ever got in high school math was a C. That was in Geometry. (Hey, I’m a visual guy.) The only math course I took in college was Statistics. The teacher and I couldn’t stand each other, and I dropped out, or thought I did. Turns out I was too late doing that and the guy gave me an F.

But I kept the book, which served me well years later when I was studying Arbitron’s ratings for radio stations. To my surprise, I actually liked the subject, and used what I learned from the book to develop algorithms for factoring out seasonal variations in station AQH (average quarter hour) shares, to aid in predicting which stations would do what in the next “book”. In addition to racking up billable hours for my company, and helping our client station sell advertising, I was able to win bets with friends in the radio business.
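I no longer have those algorithms, but the basic move of factoring out seasonal variation can be sketched in Python. This is a textbook ratio-to-average approach, not necessarily what I did back then. The function names, the period of four books per cycle and the share numbers are all assumptions for illustration.

```python
def seasonal_indices(shares, period=4):
    """Ratio of each season's average share to the overall average share."""
    overall = sum(shares) / len(shares)
    indices = []
    for s in range(period):
        season = shares[s::period]  # every share that falls in season s
        indices.append((sum(season) / len(season)) / overall)
    return indices


def deseasonalize(shares, period=4):
    """Divide each share by its season's index to remove the seasonal swing."""
    idx = seasonal_indices(shares, period)
    return [share / idx[i % period] for i, share in enumerate(shares)]


# Hypothetical AQH shares over two cycles of four books each.
shares = [4.0, 5.0, 6.0, 5.0, 4.4, 5.5, 6.6, 5.5]
print([round(x, 2) for x in deseasonalize(shares)])
```

With the seasonal swing divided out, whatever movement is left is closer to the station's real trend, which is the part worth betting on.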

The biggest bet of all was that WFXC, the station with the weakest signal in the Raleigh-Durham metro, would kick ass in the first book after its programming went “urban” (that’s radio talk for “black”). The math was easy. The market was about 40% black, and no other FM stations addressed that population.

I won. Foxy was #1 in its first book. (And it’s still doing well, 2+ decades later.)

As it happens, WFXC “Foxy 107” (a name I suggested to the owners before they picked the call letters, though I don’t know if I was the first to come up with that) was consulted at the time by Dean Landsman, whom I didn’t know at the time. We became good friends years later when we both haunted the late Compuserve’s late Broadcast Professionals Forum, which was run by Mary Lu Wehmeier, now a friend as well. She was the “Sysop” for that forum, where I occasionally came off the bench to help. Running the Sysop Forum was Jonathan Zittrain, who later helped found the Berkman Center, and now stars as a professor at Harvard Law School. Making things even more circular, Dean is now a valuable and diligent contributor to ProjectVRM. Dean, a closet math whiz, made a living for many years doing in-depth work around radio station ratings. I’ll bet he knows, or could puzzle out, the quoted text at the top of this post.

By the way, my nickname is the fossil remnant of a radio persona called “Doctor Dave”, featured on WDBS, the prior incarnation of WFXC, which is still around (now with a somewhat better transmitter, and a second and much larger signal on another channel, covering the east side of the market). When I was there, in the mid-’70s, WDBS was owned by Duke University and had awful ratings to go with its awful signal. But it was a great little station. Still friends with folks from those days too.

Ah, I found the picture I was looking for, now at the top of this post. That was the WDBS staff in 1975, I’m guessing. I’m the guy with the wide tie and the narrow shoulders in the back row. There are many missing folks too. I’d love to follow this digressive path, but have too much work to do. At least I’ve left plenty of link and tag bait. :-)

