Posted on December 16th, 2005 by longestnow.
Categories: metrics.
Joho wrote a while ago about distributed authority, providing trusted
views of Wikipedia content. An excerpt from my reply follows; more relevant now that Wikipedia 0.5 takes form.
Distributed authority — in the ’stamp and seal’
sense — is not my idea. And what I would like to see happen with research groups has
been suggested by others before me; there is simply growing interest in
it now. I want to make it easy for people who already work on and
review content in a field to do so in a way that directly improves
Wikipedia.
At the moment, individual authors ‘adopt’ certain articles and try
to keep them fresh and free of errors. And various organizations
maintain their own internal knowledge-bases with content that overlaps
a good deal with relevant Wikipedia articles.
Rather than trying to hack an authority system into MediaWiki, you
can do something simpler to encourage both of the above: have groups
that maintain their own small clusters of articles (10 or 20 or 100)
on a local wiki, with its own portal page. Give them an easy way to
offer their work for merging with WP, without requiring them to all
join the site. The edits they make are implicitly ‘approved’ by them.
This is not a good verification method within WP, however;
software changes are required for that (and Seth’s suggestion is one
specific path one might take). At the moment, Nature can link to
revisions of 100 articles that they approve. But once you follow a link
through to a Nature-edited revision of [[DNA]], and follow a link to
another WP article, you’ve already returned to the realm of public
editing.
The motivation for this is a few professors and talented writers who
began editing on WP, but commented that editing Wikipedia directly can
be offensive and off-putting (they are readily offended by trolling,
and have no patience for even trivial wiki-lawyering).
We’re making progress towards Wikipedia 1.0, slowly but surely; I
think along the way we will improve both the default view of content
and the selection of optional views suggested above. Suggestions and improvements are welcome, as always.
Posted on December 7th, 2005 by longestnow.
Categories: metrics.
I have seen many estimates of the size
of Wikipedia’s community; all of them too low. And what surprises
me most of all is that no one cares much about the lack of real metrics
in their speech, their writing, their journalism, their research.
Okay, that last is going a bit far; many researchers are very careful
about defining their metrics and terms. But this is what makes
those who are not stick out so severely.
Here are some basic statistics, care of Erik Zachte’s scripts, the Wikimedia Foundation’s server farms, and over 100,000 active contributors over the past four years (user statistics often exclude the 15% of edits which come from editors without named accounts).
To the point of the user community:
Just to reiterate the casual power of thousands of zealous volunteers
with a variety of content-addictions, some of the scripted data above
has a hand-generated and hand-updated wiki cousin, with its own original additions.
As for where I personally draw the line at counting community size, I
would say the English Wikipedia has this year passed the
10,000-volunteer mark, and is currently around 20,000. We would
know better if we counted not only edits but page-views per user…
there are those who edit infrequently but keep up with all
aspects of the community; and also many who edit occasionally but
haven’t taken time to learn the community policies or norms, whom
one might discount.
I would estimate 60,000 in the ‘copyediting’ community (active
readers, familiar with the interface, acting as typo and vandalism
monitors; and anonymous contributors), and ten times again as many
regular readers – around 500,000.
For all languages combined: 40,000 volunteers, perhaps 120,000 in the
‘copyediting’ community (people in other languages are on average less
likely to understand that they can edit; an awareness which I would
expect to grow more than linearly
with the size of the community and press coverage in that language),
and some 2M active readers.
Posted on December 7th, 2005 by longestnow.
Categories: metrics.
There’s been some hubbub lately about the usefulness of anonymous
contributions to the information commons. In particular, Monday
saw a somewhat ad-hoc test of the effect of forcing account-creation on
the quality of contributions to the* English Wikipedia.
I have some statistics of my own to add about that particular
experiment. However, for the moment I would simply like to point
to a lovely Wikipedia contribution analysis, “Explaining Quality in Internet Collective Goods: Zealots and Good Samaritans in the case of Wikipedia” (pdf) by researcher Denise Anthony, who presented it this past Monday at MIT. Her research suggested to her that “the highest quality contributions come from the vast numbers of anonymous ‘Good Samaritans’ who contribute infrequently.”
* Note : the direct article is appropriate here because of the
“English” adjective before Wikipedia. For more detail, see my old reply
to JDL at Joho’s house.
Posted on October 28th, 2005 by longestnow.
Categories: metrics.
The size of downloads has been increasing at a record clip.
Downloads have been growing in size since the inception of the
concept… today I direct your attention to Wikipedia and Wikimedia
downloads. Unavailable as torrents, but rather only via http, the
full downloads at 2+GB each are unwieldy even for people who run
downloads over their broadband connections at night while they
sleep. Is WP dump size growing faster than average pipe
throughput to homes and workplaces? (yes) What can be done
about this?
How about… shipping hard drives to people who want them? Guaranteed 5-day delivery; for a reasonable fee (perhaps $80 for a drive + shipping + overhead?)…
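For a rough sense of when a shipped drive actually beats the wire, here is a back-of-the-envelope sketch. The link speeds are illustrative assumptions, not measurements, and the function names are made up for this example; only the 2 GB dump size and the 5-day delivery figure come from the post above.

```python
# Rough comparison of downloading a dump versus shipping it on a drive.
# The link speeds used below are illustrative assumptions, not measurements.

SECONDS_PER_DAY = 86_400


def download_hours(size_gb: float, throughput_mbps: float) -> float:
    """Hours to move size_gb (decimal gigabytes) at a sustained rate in megabits/s."""
    bits = size_gb * 8e9
    return bits / (throughput_mbps * 1e6) / 3600


def break_even_gb(shipping_days: float, throughput_mbps: float) -> float:
    """Data size at which a download takes exactly as long as the shipment."""
    seconds = shipping_days * SECONDS_PER_DAY
    return throughput_mbps * 1e6 * seconds / 8e9


if __name__ == "__main__":
    for mbps in (0.5, 1.0, 5.0):
        print(
            f"{mbps} Mbit/s: 2 GB takes {download_hours(2, mbps):.1f} h; "
            f"5-day shipping wins above {break_even_gb(5, mbps):.0f} GB"
        )
```

Under these assumptions a single 2 GB dump is still an overnight download on a sustained 1 Mbit/s link, so the drive only pays off for someone who wants many dumps at once, or whose connection is slower or less reliable than assumed here.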
Posted on September 29th, 2005 by longestnow.
Categories: metrics.
Everyone seems to think that developing tools around people’s daily
lives, on cleverly-designed platforms, is the Answer to lots of things
- the next iPod/computer/phone, new PCs for people in China’s urban
households, etc.
It doesn’t sound terribly innovative to me; am I just a stick in the
mud? How can anyone get excited about a PC-like platform when
there’s some real innovation being done for $100 PCs that totally
rethinks many layers in the development and distribution of
computing? Not that I think the $100 PC is the be-all or end-all
of what target consumers really need… I’m foolish enough to
think that most things that end-users really need don’t get developed
at all. A completely silly suggestion, I know.
Posted on September 28th, 2005 by longestnow.
Categories: metrics.
I’m blogging from MIT’s Emerging Technology conference. Earlier today,
there were some great keynotes and a remarkable panel on innovation; a
full report on those to come. Up next: a panel on Nuclear-Power Comeback, featuring support from former opponent (and personal hero) Stewart Brand.
Posted on September 22nd, 2005 by longestnow.
Categories: metrics.
I am amazed by the number of people who think that a perfectly acceptable response to an emergency is disruptive, individual flight. I can think of a number of positive responses to emergencies, but this is an entirely negative one. Roads jammed with uncoordinated traffic
and hotels overwhelmed in the absence of coordination; people
struggling alone to cope with traumatic decisions — what a gray joke.
A few positive alternatives:
And this business of stores and people ‘running out’ of key supplies in
the run-up to every disaster gets old fast. In the first place,
each neighborhood should maintain a decent supply of these
staples. In the second, if Wal*Mart can figure out how to alert
their suppliers to up production every time there’s a sale, surely
cities can find a way to alert the usual suspects every time there’s an
impending disaster-alert.
Posted on September 15th, 2005 by longestnow.
Categories: metrics.
I suppose that should be refined to “as long as we can choose what routes our traffic
takes…” — that is, which peers, what types of lines and routers, perhaps even what
last-mile providers. It should be possible to say “if
there’s no way to send the following content along routes I trust,
don’t send it.”
You don’t have to be paranoid to want this. You might distrust a
route because you expect it to attempt to reconstruct, alter, and
resend content; because you suspect it of not accepting content from
certain areas or sites, because you worry that it keeps track of what
you send when, without your permission… You might not want to
send content through any router that doesn’t respect the “return
receipt” flag which sends back information on how your packets
travelled on their way to a destination. Or you might just not want to
support in any way certain traffic providers, explicitly asking to
patronize other providers whenever possible.
Implementing this would seem to take significantly more intelligent routers and middleware than currently exist.
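As a toy illustration of the "trusted routes or nothing" rule, here is a minimal sketch. It assumes the sender can somehow see candidate routes as lists of provider names, which ordinary Internet routing does not offer today; the function names and the data shapes are hypothetical.

```python
from typing import Iterable, Optional, Sequence, Set


def pick_trusted_route(
    routes: Iterable[Sequence[str]], trusted: Set[str]
) -> Optional[Sequence[str]]:
    """Return the shortest route whose every hop is trusted, or None if there is none."""
    acceptable = [r for r in routes if all(hop in trusted for hop in r)]
    return min(acceptable, key=len) if acceptable else None


def send(payload: bytes, routes: Iterable[Sequence[str]], trusted: Set[str]) -> bool:
    """Send only if a fully trusted route exists; otherwise refuse."""
    route = pick_trusted_route(routes, trusted)
    if route is None:
        return False  # no trusted path: don't send it
    # A real sender would hand the payload to the network here; this sketch only reports.
    print(f"sending {len(payload)} bytes via {' -> '.join(route)}")
    return True
```

The hard part is everything this sketch takes for granted: learning which routes exist, and getting the routers along the way to honor the choice.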
For great coverage of some of the topics brought up at the Web of Ideas discussion tonight, see Geoff Huston’s killer essay on the finance of networks, with its diversity of options laid out in gory detail.
Posted on September 7th, 2005 by longestnow.
Categories: metrics.
Standards are sexy. Reuniting families is sexier. PFIF is worth the time it takes to read it.
In use by the grassroots Katrina PeopleFinder project (Katrina help wiki | search for refugees here).
Posted on September 5th, 2005 by longestnow.
Categories: metrics.
Public lists of “ways to help” with Katrina relief: a
retrospective. Below is a collection of links from the
past weeks, and some public timelines. How to do better next
time? Is a “Disaster 2.0” effort the answer?
Timelines: from TPM | from Wikipedia
Posted on September 5th, 2005 by longestnow.
Categories: metrics.
Ray in Austin
is my favorite blogger at the moment. He’s writing solely about NO; his anger is tangible and practical. He
provides a recap of Walter Maestri’s work in predicting hurricane
damage and evangelizing for preparedness, apparently in vain. An NPR story from last year describes how explicitly this very storm had been played out in the minds of people preparing for it.
Meanwhile, skilled volunteers are actively not being called in.
Chains of command are still worrying about looking good to others,
while the “related deaths” toll is steadily soaring. I’ve seen
this kind of careful negligence before, and cringe to observe it when
so much is at stake. 10,000 deaths doesn’t sound unlikely to me any more. According to
some sources (the NYT?), we’re up to 250k refugees in Texas, far more
than the 100k I predicted last week.
Meanwhile, the Army relief forces seem to have dived into NO from a standpoint of total war:
… next up : you didn’t know what when???
Posted on August 26th, 2005 by longestnow.
Categories: metrics.
There are an increasing number of articles and works published which
refer to Wikipedia as an implicitly reliable source, often in
inappropriate contexts. As its quality improves, Wikipedia seems to be
shirking a certain quiet duty to be modest; something which was not a
problem back when none would have mistaken it for a meticulously
edited compilation.
Example: Ann Simmons, writing in the LA Times on a matter of British
peerage earlier this summer, used the clause “according to Burke’s
and Wikipedia,” a snippet which should immediately give one pause.
For one thing, the two references have nothing in common. It seems
that an editor tacked on the clause, “, an online encyclopedia,” in a
vain effort at clarification. The full quote:
The 11th Earl is a bachelor and has no children.
With no other apparent successor in sight, Capell is the new heir to the earldom.
His aristocratic genealogy is documented in the 106th edition of “Burke’s Peerage & Baronetage.”
Please understand me; I will be the first to tell you that you can
find articles and collections on Wikipedia (including many on peerage
and royalty) which are among the best overviews in the English
language; if only you know where to look, and how to check the latest
revisions in each article’s history.
But the process for checking information added to Burke’s and that
for adding information to Wikipedia are vastly dissimilar. The
Wikipedia overview article on the Earl of Essex, for instance,
continues to list no references, two months after the above (widely
syndicated) article drew new attention to the wiki articles on
Frederick and Robert Capell.
It is embarrassing to imagine some newscaster, writer, lawyer,
politician, student, professor, or publicist citing a random article
from Wikipedia, on peerage or anything else, without somehow
verifying that the article had been carefully researched. So what can
be done? Short of the full-fledged drive for moderated or static
views of the project, that is.
What I would like to see is an internal quality review group that
issues regular recommendations to the rest of the world. At first
these recommendations would look like a brief whitelist of the
categories and subsubfields that are really top-notch and being
monitored by a healthy community of respected users. As content
improves, it would add various hard metrics for each of various
top-level categories: spot-check accuracy; vandalism
frequency/longevity; proportion/longevity of POV and other disputes;
rates of article creation, editing, and deletion; &c, &c.
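To make two of those metrics concrete, here is a minimal sketch of how vandalism frequency and longevity might be computed for one category. The record shape (a pair of timestamps per incident) and the function names are assumptions for illustration; this is not a MediaWiki data format, and deciding which edits count as vandalism is the hard part it leaves out.

```python
from datetime import datetime, timedelta
from statistics import median
from typing import List, Tuple

# Each record is (time the vandalism was introduced, time it was reverted).
# This shape is an assumption for illustration, not a MediaWiki format.
Incident = Tuple[datetime, datetime]


def vandalism_longevity(incidents: List[Incident]) -> timedelta:
    """Median time a vandalized revision stayed live before being reverted."""
    if not incidents:
        raise ValueError("no incidents to summarize")
    seconds = [(fixed - made).total_seconds() for made, fixed in incidents]
    return timedelta(seconds=median(seconds))


def vandalism_frequency(incidents: List[Incident], total_edits: int) -> float:
    """Share of edits in the sample that were vandalism."""
    if total_edits <= 0:
        raise ValueError("total_edits must be positive")
    return len(incidents) / total_edits
```

The median is used here rather than the mean so that one piece of vandalism that survives for weeks does not swamp the figure; a review group might well want to report both.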
The recommendations could go out to educational, librarian, and
research bodies, including some of you reading this. They would be
prominently linked to the sitewide disclaimer[s]. The metrics would
be available to anyone as feedback, including those working on
relevant WikiProjects.
What do you think? (… read the full essay) A tip o’ the cursor to
lotsofissues
(Update: quintupling of this post reverted. Now how did that happen? Rogue content editor alert…)
Posted on August 19th, 2005 by longestnow.
Categories: metrics.
Shako Mukulu is one of those people I think of every month, though we have not spoken since my last visit to his hometown of Kibwezi
over 5 years ago. He taught me many things while I stayed with
him, off and on, for two months. Most importantly, after “never blow your nose on anything but tissue paper (or you’ll get sick),” was his maxim not to mistrust anyone without very good reason.
I once came home after a day of packed matatu rides
with a neat rip through my pants pocket and missing a 1000 KSh note
that had been there the day before – my luxurious budget for my last week
in the country. I worried that it had been stolen;
Shako reproved me roundly, saying “don’t think such things if you don’t
know.” I checked through all of my belongings, and found I had
turned my pockets out into a small bag without remembering it.
Trust is a funny beast, but it is in many communities the right
default. This is a complex topic, worthy of a few chapters of a
book, but of particular relevance to travellers in strange lands with
professional pickpockets.
Differentiating between these pros and everyday people, and the milieus
each group prefers, is the difference between prudence and prejudice.
My mother left behind a makeup case – stolen? perhaps. I
left behind a stack of newspapers and a baseball cap (the Sox… you
had to ask?), and retrieved both of them once I found the right people
to ask.
Posted on August 1st, 2005 by longestnow.
Categories: metrics.
Day 1 of the Wikimania was a definite success. Everyone arrived
and found their way to the hostel (and, in Achal’s case, the hotel)
without incident; the hackers self-assembled to a midday start, and in
the process of discussing the first day’s topic, hacked out a first draft of a metadata solution.
After the day’s talks, and after what seemed like a fine dorm-style
meal, there were many good, quiet discussions and a viewing of Pi. Eugene and Sven and I talked about the active disinterest in HtmlArea
by Wiki programmers, including many of our friends. Without my
mentioning my interview with Ward Cunningham, Eugene commented that
Ward probably wouldn’t feel strongly about it.
When I pointed out that, in fact, Ward had twice listed “lack of WYSIWYG
editors” as the greatest remaining barrier to the general public
using wikis, Eugene was surprised, and commented that nobody had blogged
about it. Which was true! Mea maxima culpa.
So, I’m going to blog about it now; better late than never.
Posted on June 22nd, 2005 by longestnow.
Categories: metrics.
Jeff Young and Outgoing’s Thom Hickey are working on developing a
Metawiki to hold structured metadata along with each record.
Talis advisor Paul Miller (of Common Information Environment fame) comments:
It would be interesting – in the spirit of openness and cooperation – to understand any relationships between the [Silkworm] Directory and OCLC’s MetaWiki.
Contrast this with recent ideas about a WikiCite project for annotating all references that might be used in books or encyclopedia articles, and you can see a lovely tool just waiting to emerge.
The Wikimedians don’t care about the differences between the Silkworm Directory and the Metawiki and Wikidata; they just want to get down to creating annotations as soon as possible. People can argue over what format they should be in and how they should be propagated later…
Posted on April 27th, 2005 by .
Categories: metrics.
With bird song accompaniment, Jimmy Wales focused on international, multi-lingual Wikipedia efforts. An IRC transcript is available.
Transcript of the Queen of Engl^B^B^B^B^B Jimmy Wales’ Harvard Law School Talk …
Posted on April 25th, 2005 by longestnow.
Categories: metrics.
Jimmy Wales gave a presentation for the whole of Jonathan Zittrain’s penultimate class today. An IRC transcript is available. There was also an audiostream, which will probably be archived; links as they turn up.