You are viewing a read-only archive of the Blogs.Harvard network. Learn more.

The Longest Now


Blogs.harvard, wrapped: an ecosystem snapshot as the lights go out
Friday June 30th 2023, 1:37 pm
Filed under: citation needed,fly-by-wire,indescribable,meta,Not so popular,null

¡Blogs.harvard is closing its doors for good!

Today is nominally the last day it will be editable, though it will stay up for archiving and export for another month. The WordPress dashboard has lately had an expandable bar in the corner titled ‘Recent Updates’, but I’d never expanded it to see that it was local news about the platform, so this came as a surprise.

 

Checklist:

1)  ping people who still need to migrate
2)  draft final blog post, honoring the network

In the early days of blogging, Dave Winer was an energetic advocate of the form, as something important for writing and communication and not just another modern pastime.  He set up the first version of Blogs@Harvard while he was a Berkman fellow (a Manila instance hosted by the Berkman Center, at blogs.law.harvard.edu), and started blogging there as well as at Scripting News. It moved to WordPress in 2007. The community revisited it in 2011 to reaffirm the value in keeping it online. (JP, as the head of the center, warmly summarized the project history up to that point.)

Over the next decade, new blogs were only created by Harvard affiliates. In 2014, technical maintenance of the blogs moved to the Harvard Library’s Office for Scholarly Communication, and the domain changed to blogs.harvard.edu.  In 2018 its maintenance shifted to Harvard University Information Technology, and any old blogs run by authors who were not affiliates were closed [and taken offline, if they had not set up an archive]. This also affected a number of past affiliates who no longer had university or alum email addresses, including the pathbreaking info/law and j’s scratchpad, blog of the founding organizer of the Blogging Group.

Now the rest are being shut down.  While bloggers still at Harvard can migrate to the existing sites.harvard.edu, with a bit of effort, they are not being migrated by default, and most have not migrated.  Those without new posts in the past year were not notified of the change.  This also affects people like Doc Searls, a long-time pillar of free software and the open web who we’ve been lucky to have in the local eddy, whose active projects live on nearby.

There are plans for a full archive to be preserved; let’s make it one befitting this decentralized community, which has hosted many students and practitioners of digital creation and archiving.  Going through the archiving process myself reminds me of the [extraordinary, wonderful]  service of the Wayback Machine, which may also let us restore former blogs currently hidden behind its veil.

 

Checklist:

3)  Salvage old drafts
4)  Make a proper export

It is a curious sensation to revisit my old tempo of posting by seeing the proportionate tempo of unpublished drafts; some quite good and close to completion, but written in a week or month when many other works were going out.  These days I would publish a good three-section post without hesitation.  Most drafts removed or published; new “unfinished draft” category added.

I am also reminded that fully half of the links from over 5 years ago are no longer online; other websites have a much shorter time-to-linkrot than this blog family.  Again, Wayback is not only a default salvation but one of the only options; if it disappeared, readers, researchers, and historians would be entirely out of luck (short of bringing up one of the Wayback mirrors).  If you are in a position to host a full mirror (currently around 100PB), please get in touch with the archiveteam or the Internet Archive.
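For anyone auditing their own old links, here is a minimal sketch of how one might ask the Wayback Machine whether a page has a snapshot. It assumes the Internet Archive's public availability endpoint and its JSON response shape (an `archived_snapshots` object with a `closest` entry); the function names are my own, and the sketch only builds the query and reads a response, leaving the actual HTTP request to whatever client you prefer.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint: the Internet Archive's public availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url: str, timestamp: str = "") -> str:
    """Build the query URL asking for the snapshot closest to `timestamp`.

    `timestamp` is an optional YYYYMMDD string; when omitted, the API is
    expected to return the most recent snapshot it knows about.
    """
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text: str):
    """Return the archived URL from an availability response, or None.

    Assumes the response is JSON with an `archived_snapshots.closest`
    object carrying `available` and `url` fields.
    """
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None
```

Fetching `availability_query("blogs.harvard.edu/project-info/", "20230630")` with any HTTP client and passing the body to `closest_snapshot` would give either a Wayback URL to link to instead, or `None` for a link that is truly gone.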

Exports should be easy, though mine is not small.  Preserving the directory structure on import requires a target style that uses the same schema for dated posts.  Alternately, I could scrape the entire site into a .wacz file and restore its public appearance exactly as it stands today, then move to a different format for a future blog.  I’d like something more collaborative by nature; easy to have a cohort working together.  I have hopes that Tana could be turned towards this end, as shared writing is naturally a more social activity than just linking to one another’s blogs (and even here, some of the best outlier blogs have been multi-author, during times when many were active together).

https://blogs.harvard.edu/project-info/



Kostoff, reprised: peer review secured again, everything is fine.
Wednesday May 11th 2022, 10:18 am
Filed under: chain-gang,citation needed,metrics,poetic justice,unfinished draft

In the end, Elsevier retracted Kostoff’s anti-vax article, along with a pro-ivermectin study in the same issue that was similarly statistically-challenged.  (It was that ivermectin study that led me to discover the issue in the first place, via scite.ai)

But not before his article dominated media and social media references to the Journal for months; and the author parlayed his peer-reviewed work into a DailyClout essay that was even more extreme, and did a tour on the social media anti-vax circuit. Thousands of people spent time debunking this nonsense, including a dozen on PubPeer alone.  Millions of people saw references to it on social media.

The editor-in-chief who regularly published his own articles (or added himself as author to articles in his journal) stepped down as EIC, but continues to edit other toxicology journals and publish research at a healthy clip of three articles a month. Global understanding of COVID-19 is advancing steadily, with no further confusion or misdirection whatever. Everything is fine 🐶🔥

 



Forging Social Proof: the Networked Turing Test Rules the First AI War
Friday September 25th 2020, 2:51 pm
Filed under: citation needed,fly-by-wire,Uncategorized

A few years ago I wrote about how our civilization was forfeiting the zeroth AI war — allowing individual attention hacks, deployed at scale, to diminish and replace our natural innovation and productivity in every society.  We gained efficiency in every area of life, and then let our new wealth and spare time get absorbed by newly-efficient addictive spirals.

Exploit culture

This war for attention affects what sort of society we can hope to live in. Channeling so much wealth to attention-hackers, and the networks of crude AI tools and gambling analogs that support them, has strengthened an entire industry of exploiters, allowing a subculture of engineers and dealmakers to flourish.  That industry touches on fraud, propaganda, manipulation of elections and regulation, and more, all of which influence what social equilibria are stable.

The first real AI war

Now we are facing the first real artificial-intelligence war — dominated by entities that appear as avatars of independent, intelligent people, but are artificial, scripted, automated.  

What is new in this? Earlier low-tech versions of this required no machine learning or programming: they used the veil of pseudonymity to fake authorship, votes, and small-scale consensus.  In response, we developed layers of law and regulation around earlier attacks — fraud, impersonation, and scams are illegal.  AI can smoothly scale this to millions of comments on public bills, and to forging microtargeted social proof in millions of smaller group interactions online. And these scaled attacks are often still legal, or lightly penalized and enforced.
(more…)



Trump’s tee-totalling: why are so many meetings held on the golf course?
Sunday December 01st 2019, 6:17 pm
Filed under: %a la mod,chain-gang,citation needed,fly-by-wire,international

It is time we stop talking about “golf time” as leisure time away from the presidency, and start treating it as a primary channel for meetings, negotiations, and decision-making. (See for instance the last line of this remarkable story.)

Trump’s presidential schedule is full of empty days and golf weekends – roughly two days a week have been spent on his own resorts, throughout his presidency. Combined with his historically light work schedule, averaging under two hours of meetings per day, the majority of small-group meetings may be taking place at his resorts.

He has also directed hundreds of government groups, and countless diplomatic partners and allies, to stay at his resorts and properties.

On his properties, his private staff control the access list, security videos and other records.  They are also able to provide privacy from both press and government representatives that no federal property could match.

How might we address the issues involved with more clarity?

Paying himself with government funds

To start with, this is self-dealing on an astronomical scale: the 300+ days spent at his golf clubs and other properties have cost the US government, by conservative estimate, $110 million. The cost of encouraging the entire government to stay at Trump properties is greater still, if harder to estimate. (more…)



Anonymizing data on the users of Wikipedia
Wednesday July 25th 2018, 12:22 pm
Filed under: chain-gang,citation needed,Glory, glory, glory,wikipedia

Updated for the new year: with specific things we can all start doing 🙂

Wikipedia currently tracks and stores almost no data about its readers and editors.  This persistently foils researchers and analysts inside the WMF and its projects; and is largely unnecessary.

Not tracked last I checked: sessions, clicks, where on a page readers spend their time, time spent on page or site, returning users.  There is a small exception: data that can fingerprint a user’s use of the site is stored for a limited time, made visible only to developers and checkusers, in order to combat sockpuppets and spam.

This is all done in the spirit of preserving privacy: not gathering data that could be used by third parties to harm contributors or readers for reading or writing information that some nation or other powerful group might want to suppress.  That is an essential concern, and Wikimedia’s commitment to privacy and pseudonymity is wonderful and needed.

However, the data we need to improve the site and understand how it is used in aggregate doesn’t require storing personally identifiable data that can be meaningfully used to target editors in specific. Rather than throwing out data that we worry would expose users to risk, we should be fuzzing and hashing it to preserve the aggregates we care about.  Browser fingerprints, including the username or IP, can be hashed; timestamps and anything that could be interpreted as geolocation can have noise added to them.
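As a rough illustration of the hashing and fuzzing described above, here is a minimal Python sketch. Everything in it is an assumption made for the example, not a description of anything Wikimedia runs: the event fields (`region`, `fingerprint`), the function names, the monthly key, and the 30-minute jitter are all hypothetical. It uses a keyed hash (HMAC) instead of a bare hash, since IP addresses and usernames are easy to guess and a plain hash of them could be reversed by brute force.

```python
import hashlib
import hmac
import random
import secrets
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical secret key. Rotating it monthly and discarding the old one
# means hashes from different months cannot be linked to each other.
MONTHLY_KEY = secrets.token_bytes(32)


def pseudonymize(fingerprint: str, key: bytes = MONTHLY_KEY) -> str:
    """Replace a username / IP / browser fingerprint with a keyed hash."""
    return hmac.new(key, fingerprint.encode("utf-8"), hashlib.sha256).hexdigest()


def fuzz_timestamp(ts: datetime, max_jitter_minutes: int = 30, rng=random) -> datetime:
    """Add uniform noise so exact visit times cannot be matched to outside logs."""
    jitter = rng.uniform(-max_jitter_minutes, max_jitter_minutes)
    return ts + timedelta(minutes=jitter)


def distinct_users_by_region(events):
    """Count distinct pseudonymous visitors per coarse region.

    Only the (region, hash) pairs are kept; the raw fingerprints are
    never stored in the result.
    """
    seen = {(e["region"], pseudonymize(e["fingerprint"])) for e in events}
    return Counter(region for region, _ in seen)
```

The aggregate survives (two visits by the same fingerprint still count as one user) while the stored identifier is no longer a name or an address. Whether a given aggregate is actually safe to publish is a separate question that this sketch does not answer.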

We could then know things such as, for instance:

  • the number of distinct users in a month, by general region
  • how regularly each visitor comes to the projects; which projects + languages they visit [throwing away user and article-title data, but seeing this data across the total population of ~1B visitors]
  • particularly bounce rates and times: people finding the site, perhaps running one search, and leaving
  • the number of pages viewed in a session, its tempo, or the namespaces they are in [throwing away titles]
  • the reading + editing flows of visitors on any single page, aggregated by day or week
  • clickflows from the main page or from search results [this data is gathered to some degree; I don’t know how reusably]

These are just rough descriptions; great care must be taken to vet each aggregate for preserving privacy. But this is a known practice that we could carry out with expert attention.

What keeps us from doing this today?  Some aspects of this are surely discussed in places, but those discussions are hard to find.  Past discussions I recall were brought to an early end by [devs worrying about legal] or [legal worrying about what is technically possible].

Discussion of obstacles and negative-space is generally harder to find on wikis than discussion of works-in-progress and responses to them: a result of a noun-based document system that requires discussions to be attached to a clearly-named topic!

What we can do, both researchers and data fiduciaries:

  • As site-maintainers: Start gathering this data, and appoint a couple privacy-focused data analysts to propose how to share it.
    • Identify challenges, open problems, solved problems that need implementing.
  • Name the (positive, future-crafting, project-loving) initiative to do this at scale, and the reasons to do so.
    • By naming the positive aspect, distinguish this from a tentative caveat to a list of bad things to avoid, which leads to inaction.  (“never gather data!  unless you have extremely good reasons, someone else has done it before, it couldn’t possibly be dangerous, and no one could possibly complain.”)
  • As data analysts (internal and external): write about what better data enables.  Expand the list above, include real-world parallels.
    • How would this illuminate the experience of finding and sharing knowledge?
  • Invite other sociologists, historians of knowledge, and tool-makers to start working with stub APIs that at first may not return much data.

Without this we remain in the dark, and, like libraries who have found patrons leaving their privacy-preserving (but less helpful) environs for data-hoarding (and very handy) book-explorers, we remain vulnerable to disuse.



Psych statistics wars: new methods are shattering old-guard assumptions
Thursday October 20th 2016, 12:51 pm
Filed under: %a la mod,chain-gang,citation needed,Glory, glory, glory,knowledge,meta,metrics

Recently, statistician Andrew Gelman has been brilliantly breaking down the transformation of psychology (and social psych in particular) through its adoption and creative use of statistical methods, leading to an improved understanding of how statistics can be abused in any field, and of how empirical observations can be [unwittingly and unintentionally] flawed. This led to the concept of p-hacking and other methodological fallacies which can be observed in careless uses of statistics throughout scientific and public analyses. And, as these new tools were used to better understand psychology and improve its methods, existing paradigms and accepted truths have changed rapidly over the past 5 years. This shocks and anguishes researchers who are true believers in “hypotheses vague enough to support any evidence thrown at them”, and have built careers around work supporting those hypotheses.

Here is Gelman’s timeline of transformations in psychology and in statistics, from Paul Meehl’s argument in the 1960s that results in experimental psych may have no predictive power, to PubPeer, Brian Nosek’s reproducibility project, and the current sense that “the emperor has no clothes”.

Here is a beautiful discussion a week later, from Gelman, about how researchers respond to statistical errors or other disproofs of part of their work.  In particular, how co-authors handle such new discoveries, either together or separately.

At the end, the discussion turns up a striking example of someone taking these sorts of discoveries and updates to their work seriously: Dana Carney’s public CV includes inline notes next to each paper wherever significant methodological or statistical concerns were raised, or significant replications failed.

Carney makes an appearance in his examples because of her most controversially popular research, with Cuddy and Yap, on power posing.  A non-obvious result (that holding certain open physical poses leads to feeling and acting more powerfully) became extremely popular in the media, and generated a small following of dozens of related extensions and replication studies. Starting in 2015, these began to be run with large samples and at high power, at which point the effects disappeared.  Interest within social psychology in the phenomenon, as an outlier of “a popular but possibly imaginary effect”, is so great that the journal Comprehensive Results in Social Psychology has an entire issue devoted to power posing coming out this Fall.
Perhaps motivated by Gelman’s blog post, perhaps by knowledge of the results that will be coming out in this dedicated journal issue [which she suggests are negative], she put out a full two-page summary of her changing views on her own work over time, from conceiving of the experiment, to running it with the funds and time available, to now deciding there was no meaningful effect.  My hat is off to her.  We need this sort of relationship to data, analysis, and error to make sense of the world. But it is a pity that she had to publish such a letter alone, and that her co-authors didn’t feel they could sign onto it.

Update: Nosek also wrote a lovely paper in 2012 on Restructuring incentives to promote truth over publishability [with input from the estimable Victoria Stodden] that describes many points at which researchers have incentives to stop research and publish preliminary results as soon as they have something they could convince a journal to accept.



Real estate investors in today’s Chile
Sunday November 03rd 2013, 5:43 pm
Filed under: citation needed,Glory, glory, glory,ideonomy,international

In Puerto Varas, to be precise. An article by Sebastian.   ᔥ my mother.

There are extraordinary landscapes, I think, and then there is this one. Those fields and villages hold a century-old pride that moves you.



To “snub” you must find someone who can be made to feel inferior
Saturday October 19th 2013, 4:53 pm
Filed under: citation needed,Glory, glory, glory,poetic justice

“A snub,” defined Lady Roosevelt, “is the effort of a person who feels superior to make someone else feel inferior. To do so, he has to find someone who can be made to feel inferior.”

ᔥ Quote Investigator,  ↬ Meredith Patterson



A New ‘Pedia: planning for the future of Wikipedia
Saturday August 10th 2013, 2:58 am
Filed under: citation needed,Glory, glory, glory,Uncategorized,wikipedia

Wikipedia has gotten more elaborate and complex to use. Adding a reference, marking something for review, uploading a file or creating a new article now take many steps — and failing to follow them can lead to starting all over. The curators of the core projects are concerned with uniformly high quality, and impatient with contributors who don’t have the expertise and wiki-experience to create something according to policy. Good stubs or photos are deleted for failing to comply with one of a dozen policies, or for inadequate cites or license templates; even when they are in fact derived from reliable sources and freely licensed.

The Article Creation Wizard has a five-step process for drafting an article, after which it is submitted for review by a team of experienced editors, and finally moved to the article namespace. Seven steps for approval is too much overhead for many.  And the current notability guidelines on big Wikipedias exclude most local and specialist knowledge.

We need a simpler scratch-space to develop new material:

  • A place not designed to be high quality, where everything can be in flux, possibly wrong, in need of clarification and polishing and correction.
  • A place that can be used to build draft articles, images, and other media before posting them to Wikipedia
  • A place where everyone is welcome to start a new topic, and share what they know: relying on verifiability over time (but not requiring it immediately), and without any further standard for notability
  • A place with no requirements to edit: possibly style guidelines to aspire to, but where newbies who don’t know how the tools or system works are welcomed and encouraged to contribute more, and not chastised for getting things wrong.

Since this will be a new sort of compendium or comprehensive cyclopedia, covering all topics, it should have a new name. Something simple, say Newpedia. Scripts can be written to help editors work through the most polished Newpedia items and push them to Wikipedia and Wikisource and Commons. We could invite editors to start doing their rough work on Newpedia, to avoid the conflict and fast reversion on the larger wiki references that make it hard to use for quick new work.

Update: Mako discussed Newpedia (or double-plus-newpedia) in his panel about “Wikipedia in 2022”, and Erik Moeller talked about how the current focus on notability is keeping all of our projects from growing, in his “Ghosts of Wikipedia Future”.  I look forward to the video and transcripts.

What do you think?  I started a mailing list for people who are interested in developing such a knowledge-project.  I look forward to your thoughts, both serious and otherwise 😉



Plumpy’Nut Patent – Has their “patentleft” option seen wide use so far?
Monday July 15th 2013, 10:31 am
Filed under: citation needed,ideonomy,knowledge,metrics

In 1996, two French food scientists, André Briend and Michel Lescanne, developed a nut-based food formulation to serve as an emergency food relief product in famine-stricken areas.  The goal was to have a high-density balanced food with a long and robust shelf life – one which, unlike the previous standard of milk-based therapeutic food, could be taken at home rather than in a hospital.

They soon formed the company Nutriset to further develop and commercialize the idea.  Their most popular product, Plumpy’Nut, has shipped millions of units and currently makes up roughly 90% of UNICEF’s stocks of ready-to-use therapeutic foods [RUTFs] for famine relief.

In forming their company, they captured their idea in the form of a patent (a standard way to declare ownership of an idea and the investment in it) and went on to build a production chain around it.  This included tweaked formulas and a family of products; production and packaging factories; and grant-writing and research to get certification + field-feedback + approval from various UN bodies.  This involved a few years of up-front investment and reputation-building, and then ramping up mass production of millions of pounds of Plumpy’Nut and its derivatives. They later set up a novel “patentleft” process allowing companies in developing countries to use the patent commercially, and make derivatives from it, at no cost, after a brief online registration. This has received surprisingly little attention since, considering how simple and elegant their solution is. Read on for details! (more…)



Miscellaneous patterns : Link Dump Sunday from summer workshops
Sunday June 30th 2013, 6:45 pm
Filed under: Blogroll,citation needed,unfinished draft

tactics:

Principles for Open Contracting

Homepage

civics:

Swarmwise – The Tactical Manual To Changing The World. Chapter Six.


http://civic.mit.edu/conference2013/attendees
http://www.hurriyetdailynews.com/the-ert-saga-in-greece.aspx?pageID=449&nID=49316&NewsCatID=422
https://github.com/project-open-data/project-open-data.github.io/pulls?direction=desc&page=1&sort=created&state=closed

From my annotation talk:
http://prose.io/#project-open-data/project-open-data.github.io/edit/master/index.md

Nick Stenning’s Hilbert Problems of Annotation

My summary of Nick Stenning’s slides about open problems on the Web for annotation at iannotate.

  • bi-directional links: need to be able to discover when resources are annotated
  • annotation of documents, not formats: an annotation of a document in html should apply to the pdf, epub, etc
  • annotating dynamic content: content changes on the Web, and annotations need to be able to survive that (Memento, InternetArchive could help here)
  • persistent reference: for annotation Cool URIs is not enough. The Web isn’t cool enough.

http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3Atext%3A1999.01.0125%3Abook%3D4%3Achapter%3D137
http://blogs.ch.cam.ac.uk/pmr/

From Michael’s talk:
http://atls.io/ideas/knowledge-markets/index.txt/editor

More design patterns:
http://hackerspaces.org/wiki/Design_Patterns
http://redditgifts.com/exchanges/
http://www.dna.caltech.edu/~pwkr/

Freedom of panorama:
blacked out images from regions without it  



Annotation Notes from a recent discussion with this year’s Berkterns
Thursday June 13th 2013, 10:18 pm
Filed under: citation needed,knowledge,meta,popular demand,wikipedia

Anno-notes.  (thanks, piratepad)



Web <30 – the Future of the Web is Intertextual
Thursday March 28th 2013, 9:28 am
Filed under: citation needed,Glory, glory, glory,indescribable

From a recent discussion about Web 3.0 and the far future, on the AIR-L list:

In fact, the Web is currently developing Web <30, to be rolled out
with Chrome 25, Firefox 20, Opera 15, and IE 10 later this winter.

If you are interested in cutting-edge research and convolving
observation with participation, you can take part in the design of Web
<30 yourself. It is being developed through a massively
multistakeholder open online crowd-refined platform generation
(MMOOCRPG) design.
Building on the exponential success of past efforts, the development
mailing list includes a periodic
distributed auto-immolating critique of its own work, where the future
web is continuously redefined as its own dual.



Exploring science in ten hundred words or less, and similar gems
Tuesday January 29th 2013, 6:27 pm
Filed under: chain-gang,citation needed,indescribable,knowledge,meta,poetic justice,Uncategorized

try and grok science
try and make a gun
try Sheldrake’s homing dove thought experiments

For dessert, some fraud:
listed, retracted, pharmed, 11-jigen (x6),
chilled (snapshot, comments).



A Christmas Gift from Cards Against Humanity To the Wikimedes
Tuesday December 18th 2012, 1:20 am
Filed under: chain-gang,citation needed,gustatory,poetic justice,Seraphic,wikipedia

♡♡♡♡♡♡♡♡♡♡♡♡♡♡♡♡♡



Public Domain Day 2013: A moment for celebration
Saturday December 15th 2012, 2:03 pm
Filed under: citation needed,unfinished draft,wikipedia

Summaries of what is entering the public domain: The Public Domain Review class of 2013
(Geography: confirm where these artists’ work will enter PD)

pd-authors-2013 spreadsheet (from a Freebase query)
See also: Category:1942 deaths

Public Domain call for Arts

(more…)



That art makes me feel … uncomfortable.
Monday December 03rd 2012, 9:00 pm
Filed under: %a la mod,citation needed,Rogue content editor

Crash course in false equivalence.



