June 4th, 2011
Dictionary and red pencil, photo by novii, on Flickr
Sanford Thatcher has written a valuable, if anecdotal, analysis of some papers residing on Harvard’s DASH repository (Copyediting’s Role in an Open-Access World, Against the Grain, volume 23, number 2, April 2011, pages 30-34), in an effort to get at the differences between author manuscripts and the corresponding published versions that have benefited from copyediting.
“What may we conclude from this analysis?” he asks. “By and large, the copyediting did not result in any major improvements of the manuscripts as they appear at the DASH site.” He finds that “the vast majority of changes made were for the sake of enforcing a house formatting style and cleaning up a variety of inconsistencies and infelicities, none of which reached into the substance of the writing or affected the meaning other than by adding a bit more clarity here and there” and expects therefore that the DASH versions are “good enough” for many scholarly and educational uses.
Although more substantive errors did occur in the articles he examined, especially in the area of citation and quotation accuracy, they were typically carried over to the published versions as well. He notes that “These are just the kinds of errors that are seldom caught by copyeditors.”
One issue that goes unmentioned in the column is the occasional introduction of errors by the typesetting and copyediting process itself. This used to happen with great frequency in the bad old days when publishers rekeyed papers to typeset them. It was especially problematic in fields like my own, in which papers tend to have large amounts of mathematical notation whose niceties the typesetting staff had little clue about. These days more and more journals allow authors to submit LaTeX source for their articles, to which the publisher merely applies the house style file. This practice has been a tremendous boon to the accuracy and typesetting quality of mathematical articles. Still, copyediting can introduce substantive errors along the way. Here’s a nice example from a paper in the Communications of the ACM:
“Besides getting more data, faster, we also now use much more sophisticated learning algorithms. For instance, algorithms based on logistic regression and that support vector machines can reduce by half the amount of spam that evades filtering, compared to Naive Bayes.” (Joshua Goodman, Gordon V. Cormack, and David Heckerman, Spam and the ongoing battle for the inbox, Communications of the Association for Computing Machinery, volume 50, number 2, 2007, page 27. Emphasis added.)
Any computer scientist would immediately see that the sentence as published makes no sense. There is no such thing as a “vector machine” and in any case algorithms don’t support them. My guess is that the author manuscript had the sentence “For instance, algorithms based on logistic regression and support vector machines can reduce by half…”, without the word “that”. The copyeditor apparently didn’t realize that the noun phrase “support vector machine” is a term of art in the machine learning literature; the word “support” was not intended to be a verb here. (Do a Google search for “vector machine”. Every hit has the phrase in the context of the term “support vector machine”, at least for the pages I looked at before boredom set in.)
Presumably, the authors didn’t catch the error introduced by the copyeditor. The occurrence of errors of this sort is no argument against copyediting, but it does demonstrate that it should be viewed as a collaborative activity between copyeditors and authors, and better tools for collaboratively vetting changes would surely be helpful.
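To make the idea of vetting changes a little more concrete, here is a minimal sketch in Python of the kind of word-level comparison such a tool might start from. It uses only the standard library’s difflib module. The function name word_changes is hypothetical, and the “manuscript” sentence is merely the conjectured wording discussed above, not a verified original; the “published” sentence is the one that appeared in print.

```python
import difflib


def word_changes(manuscript: str, published: str):
    """Return (operation, manuscript_words, published_words) for each edited span.

    Both texts are split on whitespace, so the comparison is word by word
    rather than character by character.
    """
    a, b = manuscript.split(), published.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only the spans that actually differ
    ]


# Assumed wording of the author manuscript (a guess, as noted above).
manuscript = ("For instance, algorithms based on logistic regression and "
              "support vector machines can reduce by half")
# Wording as published.
published = ("For instance, algorithms based on logistic regression and that "
             "support vector machines can reduce by half")

changes = word_changes(manuscript, published)
for tag, before, after in changes:
    print(f"{tag}: {before!r} -> {after!r}")
```

Run on these two sentences, the sketch reports a single change, the insertion of the word “that”. A real tool would of course need to present each change to the author for approval or rejection, which is the collaborative part that a bare diff does not supply.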
In any case, back to Dr. Thatcher’s DASH study. Ellen Duranceau at MIT Libraries News views the study as “support for the MIT faculty’s approach to sharing their articles through their Open Access Policy”, and the same could be said for Harvard as well. However, before we declare victory, it’s worth noting that Dr. Thatcher did find differences between the versions, and in general the edits were beneficial.
The title of Dr. Thatcher’s column gets at the subtext of his conclusions, that in an open-access world, we’d have to live with whatever errors copyediting would have caught, since we’d be reading uncopyedited manuscripts. But open-access journals can and do provide copyediting as one of their services, and to the extent that doing so improves the quality of the articles they publish and thus the imprimatur of the journal, it has a secondary benefit to the journal of improving its brand and its attractiveness to authors.
I admit that I’m a bit of a grammar nerd (with what I think is a nuanced view that manages to be linguistically descriptivist and editorially prescriptivist at the same time) and so I think that copyediting can have substantial value. (My own writing was probably most improved by Savel Kliachko, an outstanding editor at my first employer, SRI International.) To my mind, the question is how to provide editing services in a rational way. Given that the costs of copyediting are independent of the number of accesses, and that the value accrues in large part to the author (by making him or her look like less of a halfwit for exhibiting “inconsistencies and infelicities” and occasionally more substantive errors), it seems reasonable that authors ought to pay publishers a fee for these services. And that is exactly what happens in open-access journals. Authors can decide if the bargain is a good one on the basis of the services that the publisher provides, including copyediting, relative to the fee the publisher charges. As a result, publishers are given an incentive to provide the best services for the dollar. A good deal all around.
Most importantly, in a world of open-access journals the issue of divergence between author manuscripts and publisher versions disappears, since readers are no longer denied access to the definitive published version. Dr. Thatcher concludes that the benefits of copyediting were not as large as he would have thought. Nonetheless, however limited the benefits might be, properly viewed, those benefits argue for open access.