One Man’s Quest to Rid Wikipedia of Exactly One Grammatical Mistake
No misuse of “comprise” will sneak past this WikiGnome.
--
On a Friday in July 2012, two employees of the Wikimedia Foundation gave a talk at Wikimania, their organization’s annual conference. Maryana Pinchuk and Steven Walling addressed a packed room as they answered a question that has likely popped into the minds of even the most casual users of Wikipedia: who the hell edits the site, and why do they do it?
Pinchuk and Walling conducted hundreds of interviews to find out. They learned that many serious contributors have an independent streak and thrive off the opportunity to work on any topic they like. Other prolific editors highlight the encyclopedia’s huge global audience or say they derive satisfaction from feeling that their work is of use to someone, no matter how arcane their interests. Then Walling lands on a slide entitled “perfectionism.” The bespectacled young man pauses, frowning.
“I feel sometimes that this motivation feels a little bit fuzzy, or a little bit negative in some ways… Like, one of my favorite Wikipedians of all time is this user called Giraffedata,” he says. “He has, like, 15,000 edits, and he’s done almost nothing except fix the incorrect use of ‘comprised of’ in articles.”
A couple of audience members applaud loudly.
“By hand, manually. No tools!” interjects Pinchuk, her green-painted fingernails fluttering as she gestures for emphasis.
“It’s not a bot!” adds Walling. “It’s totally contextual in every article. He’s, like, my hero!”
“If anybody knows him, get him to come to our office. We’ll give him a Barnstar in person,” says Pinchuk, referring to the coveted virtual medallion that Wikipedia editors award one another.
Walling continues: “I don’t think he wakes up in the morning and says, ‘I’m gonna serve widows in Africa with the sum of all human knowledge.’” He begins shaking his hands in mock frustration. “He wakes up and says, ‘Those fuckers – they messed it up again!’”
Giraffedata is something of a superstar among the tiny circle of people who closely monitor Wikipedia, one of the most popular websites in the English-speaking world. About 8 million English Wikipedia articles are visited every hour, yet only a tiny fraction of readers click the “edit” button in the top right corner of every page. And only 30,000 or so people make at least five edits per month to the quickly growing site.
Giraffedata, a 51-year-old software engineer named Bryan Henderson, is among the most prolific contributors, ranking in the top 1,000 most active editors. While some Wikipedia editors focus on adding content or vetting its accuracy, and others work to streamline the site’s grammar and style, generally few, if any, adopt Giraffedata’s approach to editing: an unrelenting, multi-year project to fix exactly one grammatical error.
Henderson has now made over 47,000 edits to the site since 2007, virtually all of them addressing this one linguistic pet peeve. Article by article, week by week, Henderson redacts imperfect sentences, tightening them almost imperceptibly. “I’m proud of it,” says Henderson of the project. “It’s just fun for me. I’m not doing it to have any impact on the world.”
Every Sunday night before going to bed, Henderson follows an editing routine that allows him to efficiently work on the approximately 70 to 80 new “comprised of” errors that appear on the encyclopedia each week. The entire process takes an hour, at most.
He begins by running a software program that he wrote himself, which sends a request to Wikipedia’s server for articles containing the phrase “comprised of.” His program parses the HTML code from the search results page to extract a list of dozens of article titles: “PlayStation 4,” for example, in addition to “High Court (Ireland),” and “British Armoured formations of World War II.” The program then compares these titles against an offline database of articles that Henderson has edited within the last six months. Any matches get removed from the list. (He does this to avoid hitting the same article too often and pissing off overprotective editors who claim “ownership” of certain articles.)
Next, a simple Web page is generated on the giraffe-data.com Web server, which contains a list of links to the edit page for each remaining article. Henderson can now easily click on each entry and make the necessary changes. Finally, the program updates the database of recently edited pages. “An edit typically takes about ten seconds, but that’s because I’ve gotten really, really good at it,” he says. “I’m actually putting a lot of thought into those ten seconds. Some of them take a lot longer; some of them take minutes.”
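For readers who think in code, the routine might be sketched roughly as follows. This is not Henderson's program, which the article does not reproduce; it is a minimal Python illustration that skips the search-and-parse step and starts from a list of titles. The function names, the 182-day stand-in for "six months," and the in-memory dictionary standing in for his offline database are all assumptions made for the example.

```python
from datetime import date, timedelta
from html import escape
from urllib.parse import quote

# Assumption: "six months" is approximated as 182 days.
COOLDOWN = timedelta(days=182)


def filter_recent(titles, last_edited, today):
    """Keep only titles that were not edited within the cooldown window."""
    return [
        t for t in titles
        if t not in last_edited or today - last_edited[t] >= COOLDOWN
    ]


def edit_url(title):
    """Build an edit-page URL in MediaWiki's index.php?title=...&action=edit form."""
    slug = quote(title.replace(" ", "_"), safe="")
    return f"https://en.wikipedia.org/w/index.php?title={slug}&action=edit"


def build_page(titles):
    """Render a bare HTML list with one edit link per remaining title."""
    items = "\n".join(
        f'<li><a href="{escape(edit_url(t))}">{escape(t)}</a></li>'
        for t in titles
    )
    return f"<ul>\n{items}\n</ul>"


def record_edits(titles, last_edited, today):
    """Mark the given titles as edited today (hypothetical database update)."""
    for t in titles:
        last_edited[t] = today


if __name__ == "__main__":
    found = ["PlayStation 4", "High Court (Ireland)"]
    history = {"PlayStation 4": date(2015, 1, 4)}  # edited a month ago
    today = date(2015, 2, 1)
    todo = filter_recent(found, history, today)
    print(build_page(todo))
    record_edits(todo, history, today)
```

In this toy run, “PlayStation 4” is held back because it was touched within the window, and only “High Court (Ireland)” makes it onto the page of links.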
In the interest of saving himself those precious minutes, Henderson is more than happy to explain the trouble with “comprised of.” Take the following sentence, for example:
The Wikipedia editorial community is comprised of many interesting people.
The problem is rooted in confusion over the verbs “to comprise” and “to compose.” Most style manuals advise against this usage. Better alternatives to the above example include the following:
The Wikipedia editorial community is composed of many interesting people.
Or:
The Wikipedia editorial community consists of many interesting people.
In a 6,000-word essay, Henderson lays out his case for why that phrase is ungrammatical. It is one of the top Google results for “comprised of.” “There’s nothing else that completely beats it to death like that article does,” Henderson says. Under the subheading “Pointlessness of caring about it,” he writes that some editors seem to think he’s wasting his time. “I won’t offer a rebuttal of that,” he writes, “because an individual editor’s allocation of his time shouldn’t be anyone else’s concern.”
Not everyone has welcomed his mission. In the essay he mentions that he once “attracted a stalker, a single editor who reverted about 30 [‘comprised of’ edits] in a row in the same order in which I made them.” On 15 June, 2009, an editor left a comment on the “Talk” page of Jimmy Wales, a founder of the encyclopedia. Under the heading “Intercession needed,” the writer began: “Please refer to user Talk:Giraffedata. Even though numerous editors have objected to his obsessive removal of the gramatically [sic] acceptable term ‘comprised of’ from hundreds of articles, he defiantly continues to do so. Your assistance here is appreciated.”
Wales replied later that day: “I believe that Giraffedata’s arguments against our using it are persuasive,” though he abstained from passing further judgment.
On his own “Talk” page, Henderson notes, “Dozens of editors have let me know that they learned of the grammatical issue from my edit, had consequently decided to avoid ‘comprised of’ in their writing, and thanked me.”
I am one such editor. As a freelance journalist I had occasionally used the phrase “comprised of” in my writing, most often when discussing musical acts. In a 2011 feature published in Rolling Stone Australia, for example, I wrote this sentence:
“A four-piece comprised of members from three Brisbane bands you’ve never heard of, Millions realised during their initial rehearsals that their sound might appeal to the national broadcaster.”
I discovered Henderson’s “comprised of” essay last March, while working on edits for my first book, Talking Smack. In the first draft, I wrote the following:
“Completely improvisational in nature, the band is comprised of a bassist, two rappers, three members poking at laptops, and, occasionally, a singer.”
My editor switched the verb to “composed” but gave no explanation. I googled my original phrase and discovered, to my horror, the prevalence of the error. I read Giraffedata’s essay. Thoroughly impressed, I shared it on Facebook. “Spectacular. A true hero,” one fellow writer commented in response.
As a stickler for correct grammar, I am appalled at the thought of incorrect English in my published work. So in March 2014, I thanked Henderson for saving me from further embarrassment by awarding him an “Original Barnstar.” “You’re a legend, Bryan,” I wrote on his “Talk” page. “Thanks for correcting my semi-regular use of ‘comprised of.’ Never again will I use it!”
Within an hour, Henderson had replied. “Thank you,” he wrote. “I love it when people are able to change their grammar based on a logical argument. I’m like that – in fact, I actually enjoy learning and adopting new grammar – but I frequently run into people so emotionally attached to their grammar that they will defend what ‘sounds right’ to the death.”
My curiosity piqued, I arranged a phone interview with Henderson the following week. I wondered how closely he fit the Wikipedia editor stereotype, which Steven Walling, in his Wikimania talk, had characterized as a loner living in his mother’s basement, with little more than an IV drip and a keyboard.
Henderson was born in Olympia, Washington, the middle child of a father who worked for the state government and a mother who taught math in middle school. He discovered an early affinity for computer science, and his first job out of college was working for IBM. He spent a decade working out of the company’s San Jose office before he felt the itch to do something different. He left the company at the end of 1995.
Swept up by the optimism of the dot-com boom, he decided to start his own company. Henderson purchased a neighborhood video store and, inspired by Apple, named it Giraffe Data Systems. “They picked a fruit; I picked an animal,” he says.
His idea was essentially what Netflix is now, except using the technology of the time, the VHS videotape. A customer would order a movie online, and perhaps a pizza, too, to be delivered to their house. “It would have worked, except that neighborhood video stores were on the way out, as the industry was being consumed by Blockbuster,” he says. He dissolved Giraffe Data Systems in 1999 but kept the company name and web domain, and he moved back home, to Olympia.
About a year later his former bosses at IBM heard that Henderson was unemployed. They lured him back to San Jose, and he soon met his partner, Chun Xue, online. “I basically ordered him from a catalogue,” Henderson jokes. “It was pretty much love at first sight.” The pair began cohabiting in 2001, and they now share a condo in San Jose.
Henderson first came across Wikipedia in 2004, when the site was three years old. By the time he made his first edit under the username Giraffedata in September 2004, the encyclopedia had amassed 323,000 articles hashed together by 10,885 contributors.
“I read everything on the Web and I’d say, ‘Jesus, this is written wrong,’” he recalls. “Suddenly, I was looking at Wikipedia and I said, ‘You know what? Rumor has it, I can fix this!’” He gives a short laugh. “I pressed ‘edit’ and, sure enough, it let me submit it, and nobody came back and scolded me for it, or changed it. It was still there a week later.”
His first “comprised of” edit took place on August 14, 2006, in the article “Central processing unit.” The next one was on January 4, 2007, in “Michigan Research Community.” Soon the edits flowed thick and fast; by March, he was zapping the phrase from the site on a regular basis. By the end of that year, English Wikipedia became the largest encyclopedia ever assembled, surpassing two million articles. Henderson narrowed his focus to just “comprised of.” The project had begun.
Before he developed his programmatic solution, he used Google to find the 15,000 or so instances of the phrase. “In the beginning, I marked them all as ‘minor edits,’” he says, “which is basically defined as, ‘Nobody could possibly disagree with this.’”
He was surprised, however, to find in the first three months that some people disagreed with his edit, sometimes vehemently. “When the first few people said, ‘Why did you do this?’ I said, ‘Well, it’s not grammatical. It’s not English at all.’ And then finally somebody came and said, ‘You jerk, it’s a matter of opinion! It’s completely valid, I looked it up in my dictionary! You have no right to mess with my article!’” Henderson laughs. “That came as quite a surprise.” He stopped checking the “minor edit” box, to acknowledge that some users might find the change controversial.
Eventually, Henderson discovered Wikipedia’s search function, and he wrote some code to compile a complete list of the unedited instances. Every Sunday night, he worked on his project. “Between two and three years in, I actually reached the end,” he says. “I was amazed when I got to the end of this list.” He pauses. “And then I started over again, because more had been added at the start.”
To meet its goal of encapsulating the sum of human knowledge, Wikipedia draws on the talents of different kinds of editors. Henderson is an archetypical WikiGnome, a contributor who specializes in fixing typos, repairing broken links, adding categories and, yes, correcting grammar.
Yet he is unique in that few, if any, editors devote themselves to one grammatical cause. “I’m definitely not the only one who does grammar edits; there must be people who spend ten hours a week on them,” he says. “But I’m the only one who concentrates on one aspect.”
Many contributors, of course, focus on adding or refining material. For example, the English encyclopedia’s most prolific editor is 32-year-old Indianapolis resident Justin Knapp (username: koavf), who has made 1.45 million edits so far. Knapp’s edits are sometimes assisted by semi-automated software that, within a few hours on January 30, 2015, allowed him to make nearly a thousand category tweaks to Pakistan-related articles. On the same day, he also removed “unsourced and redundant” information about songs on an upcoming Bob Dylan album and created a new section on the article for Glenda Ritz, the incumbent Superintendent of Public Instruction for Indiana.
The community has dubbed obsessively editing the encyclopedia Wikipediholism; an article for the term warns that, “like any behavioral addiction, Wikipedia overuse may lead to job loss, divorce, bankruptcy, or worse. … Remember, it’s your time and you are donating it to Wikipedia. It is healthy to donate what you can afford to donate, but no more.”
In May 2011, an editor wrote on Henderson’s “Talk” page: “Hi…..question…please don’t take it soo harsh because I don’t know you…but honestly, do you have a life?” The following day, the grammarian replied, “My life is rather full. I have a full time job and numerous hobbies in addition to copy editing Wikipedia. But not much of my non-job time is spent doing conventional pastimes (i.e. from the approved lifestyle list) such as attending baseball games, wine tasting, traveling, painting, and mountain biking.”
Henderson follows a strict schedule: cycle to work at 7:30 am, eat lunch in the company cafeteria, come home at 5:30 pm, eat dinner, indulge in some open source programming and television, and go to bed. “I really do like routine,” he says. He wears the same color and model of shirt each workday: a red, short-sleeve polo with a pocket. For a time, he bought all of them from a company that makes uniforms; more recently, he’s “going wild and loose” by buying several brands.
When Henderson one day revealed his ongoing editorial project to his older brother, Robin, his sibling soon joined the battle against imperfect English. Under the username Laodah, Robin, 52, edits instances of what he dubs the “lazy around,” wherein someone writes “based around” instead of “based on.” His first such edit took place on January 30, 2012, and he has since made dozens more.
“I fix the ‘arounds’ that I happen to come across in the course of my research,” says Robin. “Bryan is more ‘search and destroy’: he goes in there with his HAL 9000, combs Wikipedia for incidences, and torpedoes them. He’s neurotic that way: when he gets something under his bonnet that’s important to him, he has this laser concentration to it.”
When asked what motivates him, Henderson says he views his pursuit as similar to that of people who choose to spend their Saturdays picking up litter from the side of the road. “I really do think I’m doing a public service, but at the same time, I get something out of it myself. It’s hard to imagine doing it for the rest of my life,” he says with a laugh. “I don’t have any plans to quit, but I guess eventually, I’ll have to find a way. It’s hard to walk away, especially when I’ve actually accomplished something.”