Sunday, August 26, 2007

Wikipedia Unmasked

A new Web site reveals the sneak attacks and ego-fluffing of your friends and co-workers.

Wikipedia seems benign and geeky, so eager to share its awkwardly written knowledge. Let us tell you about Atari 2600, Impressionism, and orcs! The nerd orientation, however, distracts from what the site has become: an information battleground. Type "Exxon" into Google and the first two hits are official Exxon sites; the third is the company's Wikipedia entry. Wikipedia is the fourth hit for "Philip Morris," the third for "Starbucks," the fifth for "New York Times." The Wikipedia page has become a public face for corporations, and they have an incentive to polish and scrub their entries.

Enter Virgil Griffith, a self-described "disruptive technologist" and future Caltech graduate student. After reading about members of Congress who altered their Wikipedia entries, Griffith thought up a clever stink bomb. Wikipedia pages can be edited anonymously, but the anonymity is not total. When a computer connects to the Internet, it's assigned an IP address. This address can change each time you connect, but organizations typically have a defined range of IP addresses. When an anonymous edit is made to a Wikipedia page, the IP address that made the change is recorded. Griffith took the 34,417,493 anonymous edits made between February 2004 and August 2007 and matched them against the IP ranges of hedge funds, law firms, media companies, the CIA, and the rest of us. He dubbed the result Wikiscanner and launched it two weeks ago.
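
For the curious, the matching step can be sketched in a few lines of Python. This is only an illustration of the idea of checking an edit's recorded IP address against an organization's address range, not Griffith's actual code: the organization names, the edit list, and the helper names `find_org` and `edits_by_org` are all made up, and the address ranges are reserved documentation ranges rather than anyone's real network.

```python
import ipaddress
from collections import defaultdict

# Hypothetical organization-to-range table. The networks are reserved
# documentation ranges (RFC 5737), not real corporate allocations.
ORG_RANGES = {
    "Example Oil Co.": [ipaddress.ip_network("192.0.2.0/24")],
    "Example Law LLP": [ipaddress.ip_network("198.51.100.0/25")],
}

# Hypothetical anonymous edits: (page title, IP address recorded with the edit).
EDITS = [
    ("Oil spill", "192.0.2.17"),
    ("Arsenal F.C.", "198.51.100.42"),
    ("Larry Bird", "203.0.113.9"),
    ("Steampunk", "198.51.100.200"),
]


def find_org(ip_text):
    """Return the organization whose range contains ip_text, or None."""
    ip = ipaddress.ip_address(ip_text)
    for org, networks in ORG_RANGES.items():
        if any(ip in net for net in networks):
            return org
    return None


def edits_by_org(edits):
    """Group page titles by the organization whose range holds the edit's IP."""
    grouped = defaultdict(list)
    for page, ip_text in edits:
        org = find_org(ip_text)
        if org is not None:  # edits from unlisted ranges stay unattributed
            grouped[org].append(page)
    return dict(grouped)


if __name__ == "__main__":
    for org, pages in edits_by_org(EDITS).items():
        print(org, pages)
```

A match here says only that the edit came from inside a listed range. It says nothing about who was sitting at the keyboard.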

Disruption occurred. The site lit up the Boing-Boing-osphere, a story about corporate edits made the cover of the New York Times, and—what fellow hackers seemed most impressed by—Stephen Colbert used Wikiscanner as the centerpiece of a Word of the Day monologue on The Colbert Report. The Threat Level blog at Wired took the lead in gathering the most egregious edits, a dragnet that rounded up the usual suspects. A Scientology IP added a link to the Kurt Cobain page that suggests the singer's childhood Ritalin prescription led him to suicide. An Exxon IP cleaned up the section on the effects of the Valdez oil spill, cheerfully noting "six of the largest salmon harvests in history were recorded in the decade immediately following the spill." A Philip Morris IP deleted this sentence from a history paragraph of the "Marlboro (cigarette)" page: "It emerged as the number one youth-initiation brand."

While these edits are embarrassing, they are not exactly smoking guns. Just because someone with a Scientology-associated IP edited the Kurt Cobain entry, it doesn't mean a Scientology employee made the change. Here is how Griffith explains it: "Technically, we don't know if it came from an agent of that company. However, we do know that edit came from someone with access to their network. If the edit occurred during working hours, then we can reasonably assume that the person is either an employee of that company or a guest that was allowed access to their network." Even so, it makes sense to tread lightly with accusations. For example, a lot of people create and update their own Wikipedia pages—for ego, for networking, for the hell of it. When I did a wikiscan of the Washington Post IP range, I turned up a change to my colleague Jack Shafer's page. The sentence:

He is perhaps best known for his obituary for Walter Annenberg, entitled "Citizen Annenberg - So long you rotten bastard", a fine example of early twenty-first century American poison pen obituary writing.

had been amended to:

He is perhaps best known for his critical obituary of Walter Annenberg, entitled "Citizen Annenberg - So long you rotten bastard".

It would be easy to assume that this is an example of Wikiscanner capturing Jack's modesty in action. But when I asked him if he made the change, he pleaded not guilty. He also pointed out that the page has his alma mater wrong.

It's not news that Wikipedia is occasionally incorrect. It's also not a surprise that tobacco companies, the Mormon Church, and Scientology are altering pages to promote their products and worldviews. More interesting are the small fry who were also caught in the Wikiscanner net. Someone with a New York Times IP made a crucial edit to the Condoleezza Rice page, altering "pianist" to "penis." A Greenpeace IP sniped at Ted Nugent (the Nuge is a prominent pro-hunting spokesman) by claiming that he once had a 9-year-old Hawaiian girlfriend. A Republican Party IP ruined the sixth Harry Potter by blanking the entire entry and adding a spoiler.

For the past week, I've been running various large law firms, banks, and consulting groups through Wikiscanner. While I haven't uncovered any great gotchas, I have noticed a few trends. First, corporate America shows an unparalleled creativity when it comes to describing erotic activity. (I will never think of milkshakes the same way again.) Second, way too many of you are adding yourself to Wikipedia as "Notable Alumni" of your school or as a "Person" from your hometown. Third, most every investment bank seems to have some British guy who's obsessed with football and vandalizes the Arsenal page. Fourth, people who went to Yale cannot resist editing their Secret Society page. Fifth, there's still a lot of love out there for Larry Bird. And, memo to everyone: Adding your buddy's name to the entry about "Oral Sex" is not original.

Wikipedia vandalism is as old as Wikipedia itself. The Wikipedians have a whole section devoted to the most inspired damage, called "Bad Jokes and Other Deleted Nonsense." I especially liked the archive of hoax pages, including the justly celebrated Upper Peninsula War, which details (complete with maps and historical photos) a skirmish over Michigan's Upper Peninsula between Canadians and Americans during the spring of 1843. But even a hoax as convincing as this one lasted only two weeks before being found out. Wikiscanner, despite its litany of mischief, points to the success of Wikipedia. The egotistical edits, slurs, and blatant puffery eventually get re-edited and fixed by the community.

As I scanned away, I found devoted Wikipedians who corrected grammar, argued finer points of historical incidents, and updated entries relating to their catholic interests: Jacques Lacan, steampunk, the Rabbit tetralogy, cannabis, the Empire State Building, weather balloons. So, that's the image I'm left with after two weeks of Wikiscanner: a thousand Cliff Clavins, anonymously sculpting the knowledge of Wikipedia during their working hours. No doubt there are subtle Wikipedia vandalism and public-relations black ops waiting to be discovered, but, for the moment, the open-source encyclopedia seems to be holding the fort against the forces of idiocy and spin.

Bonus celebrity edit: A DreamWorks IP gives Seth Rogen an adopted Laotian child named Pingpong Applesauce Rogen.