Future Tense

Should ChatGPT Be Used to Write Wikipedia Articles?


Welcome to Source Notes, a Future Tense column about the internet’s information ecosystem.

Five years ago, I traveled to Stockholm to cover the annual convention for Wikipedia and related free knowledge projects. But it was not just wiki-interviews and chewy candy fish that occupied my time among the Swedes. During one fun evening, I came across a group playing a tabletop game envisioning what Wikipedia would be like in 2035. This futuristic Dungeons & Dragons-style role-playing game featured a cast of diverse characters like Yuki, an A.I. pop music composer and Wikipedia writer, and Levi, a passionate neo-Luddite who believed Wikipedia should be composed by humans only.

Back then, the game struck me as a creative but far-fetched thought experiment. But now Wikipedians are engaged in a heated debate about whether ChatGPT should be allowed for drafting articles. Ready or not, Wikipedians must answer the question of whether to allow generative artificial intelligence to cross the great encyclopedic threshold.

Back in November, OpenAI, the creator of ChatGPT, made a prototype of the chatbot available to the public. Since then, people have been using it for a wide variety of applications, including humanlike conversation, language translation, and debugging lines of code. The underlying GPT-3.5 language model has been trained on a vast swath of the internet, including websites, blogs, and Wikipedia itself. Despite being so well-read, the chatbot is prone to making scary blunders—see, for example, the deceptions that Slate contributor Charles Seife found when he asked it to compose his own obituary. Meanwhile, the technology is spurring difficult conversations about the future of knowledge work and education. TikTok is awash with videos of Gen-Zers asking ChatGPT to do their homework.

Given the hype, it was only a matter of time before someone would ask ChatGPT to write a Wikipedia article. On Dec. 6, Richard Knipel, a longtime Wikipedian who edits under the handle Pharos and is active in the Wikimedia New York City chapter, posted a new Wikipedia entry titled “Artwork title.” His edit summary explained: “draft via ChatGPT, will extensively modify!” The A.I. version, which Knipel said he lightly edited before copying it over, is a generic but grammatically correct overview that defines what an artwork title is and points to ancient and contemporary examples. Later, Knipel posted on the talk page that he believed this was the first time someone had used ChatGPT to draft the initial version of an article and been transparent about its use.

Wikipedians like Knipel imagine that ChatGPT could be used on Wikipedia as a tool without eliminating the role of humans. For them, the initial text generated by the chatbot is useful as a starting place or a skeletal outline. Then the human verifies that this information is supported by reliable sources and fleshes it out with improvements. This way, Wikipedia itself does not become machine-written. Humans remain the project’s special sauce.

I asked Knipel why he thought “Artwork title”—as in, the Mona Lisa—was a good topic for experimenting with the new technology. “It’s sort of a general subject. I was thinking about creating an article on it for a while, probably a couple of years,” Knipel said. “My experience suggests this is mostly useful as a tool for overcoming writer’s block.”

Andrew Lih, the Wikimedian-at-large at the Smithsonian Institution in Washington and a volunteer Wikipedia editor since 2003, agreed that much of the potential for ChatGPT lies in overcoming that initial inertia and finding the “activation energy” to write a new article for the encyclopedia. “Wikipedians are not lacking for motivation or passion, but just the time,” he said.

Lih views ChatGPT as presenting a new twist on Cunningham’s Law, an idea credited to the original wiki developer Ward Cunningham: The best way to get the right answer on the internet is not to ask a question; it’s to post the wrong answer. “In many ways that’s the analogue here,” Lih said. “The best way to get a [Wikipedia] article written is to post a really poor version of it generated by ChatGPT, but it’s not completely wrong. It’s enough to hang your hat on and say this is salvageable. This can be improved.”

To fix the rough draft, humans must be mindful of the problems inherent in the A.I.-produced version. After inspecting what the chatbot had spit out, Knipel noticed that ChatGPT tended to overgeneralize, generating text like: “Whether descriptive or abstract, the title of a work of art is a crucial element of the artistic process.” Knipel also felt obligated to dial down ChatGPT’s tone to align it with Wikipedia’s more neutral style. “It’s a very confident writer,” he said. “You need to, like, give some humility to it.”

There is also the problem of sourcing. Wikipedia’s verifiability policy means that readers should be able to check that the information comes from a reliable source, which is why articles often have dozens of reference links at the bottom of the page. But ChatGPT typically does not provide references for its responses, and prompting it for references leads to bizarre fabrications. For instance, Lih has been using ChatGPT to draft a potential new Wikipedia page on the concept of “weaponized incompetence.” He said ChatGPT’s overview of the topic was decent, but the sources it provided from Forbes, the Guardian, and Psychology Today were completely bogus. These articles never existed! Even the URLs were auto-generated fakes, leading to “page not found” errors.

While ChatGPT’s ability to invent plausible-sounding fiction has been helping prolific Kindle novelists, inventing facts is of course not the goal in constructing a reliable encyclopedia. No wonder some Wikipedians are concerned. “The risks are that Wikipedia editors will find it more difficult to patrol content added to the site,” said Heather Ford, an associate professor of digital and social media at the University of Technology Sydney and the author of Writing the Revolution: Wikipedia and the Survival of Facts in the Digital Age. “It will be easier for bad actors to create fictional content that is cited to yet more fictional sources.”

Then again, Ford noted that Wikipedians have long agreed to permit bots and automated content in certain cases. As far back as 2002, a Wikipedia contributor known as Ram-Man used the automated tool rambot to auto-generate Wikipedia pages for American cities based on U.S. Census data, starting with the rather basic article for Autaugaville, Alabama. Rambot’s work was controversial because the articles followed the same format, taking raw numbers and placing them into robotic-sounding prose, mass-producing pages that users described as mere “stubs.” Even today, English-language Wikipedians tend to oppose bots that simply data-dump or machine-translate content without a human’s supervision. On the other hand, bots that automatically detect vandalism and screen common swear words are now widely accepted.

So far, Wikipedians have not devised formal rules that speak to ChatGPT specifically, though editors told me that existing policies banning plagiarism and requiring verification might prohibit blindly pasting the A.I.-generated text. Still, some have argued that ChatGPT should be entirely verboten. “In my view, we should strongly advise against the use of AI tools to create article drafts, even if the articles are then reviewed by humans,” argued one editor. “ChatGPT is too good at introducing plausible-sounding falsehoods.”

Other Wikipedia users characterized this proposal as a reactionary prohibition—and potentially a real shame. Arguably, Wikipedia editors have skills that make them better at using ChatGPT than most people. “It’s a literary tool. That overlap of literacy and coding is one of the strengths of Wikipedians,” Knipel said. Some longtime contributors worry that Wikipedia is losing its original bold spirit and developing a knee-jerk resistance to change. “We’ve kind of closed ranks around Wikipedia being reliable, responsible, and verifiable—that’s great!—but I think that’s also carted us off from experimentation,” Lih said.

Rather than an outright ban, the better course may be to outline best practices. When Lih has experimented with ChatGPT, he has placed the initial A.I. version in his sandbox page (a Wikipedia user’s drafting space) rather than publishing it to the live, public encyclopedia. Lih also tags these edits with the hashtag #ChatGPT to make them more easily searchable. He even makes a record of the specific prompt he gave the chatbot, something like: Write an article that adheres to Wikipedia’s policy of neutral point of view. The overarching principle is transparency.

Meanwhile, a controversy has flared up on Knipel’s original page. Although the article is now much longer than the first A.I. version, one Wikipedia editor has added an orange “badge of shame” to the top. The notice warns readers that this page might contain “original research” that’s not appropriate for publication on the encyclopedia. The argument is that Knipel made a fatal error by constructing his article in reverse: first by using unsourced text written by ChatGPT and then adding real, nonfiction sources to verify the claims.

Knipel counters that this is how Wikipedia has always worked in practice. “We always have imperfect information, and then we correct it,” he said. “If the issue is the original sin of using A.I., well, I don’t believe in original sin.” Perhaps that’s not a bad way to conceptualize this issue overall. Because generative A.I. is here to stay, it makes sense to adopt best practices and to stress the need for human supervision—not ban it from the outset as the fruit of the poisonous tree.

Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.