Now Anyone Can Deploy Google’s Troll-Fighting AI

LAST SEPTEMBER, A Google offshoot called Jigsaw declared war on trolls, launching a project to defeat online harassment using machine learning. Now, the team is opening up that troll-fighting system to the world.

On Thursday, Jigsaw and its partners on Google’s Counter Abuse Technology Team released a new piece of code called Perspective, an API that gives any developer access to the anti-harassment tools that Jigsaw has worked on for over a year. Part of the team’s broader Conversation AI initiative, Perspective uses machine learning to automatically detect insults, harassment, and abusive speech online. Enter a sentence into its interface, and Jigsaw says its AI can immediately spit out an assessment of the phrase’s “toxicity” more accurately than any keyword blacklist, and faster than any human moderator.

The Perspective release brings Conversation AI a step closer to its goal of fostering troll-free discussion online and filtering out the abusive comments that silence vulnerable voices, or, as the project’s critics have less generously put it, of sanitizing public discussions based on algorithmic decisions.

An Internet Antitoxin

Conversation AI has always been an open source project. But by opening up that system further with an API, Jigsaw and Google can offer developers the ability to tap into that machine-learning-trained speech toxicity detector running on Google’s servers, whether for identifying harassment and abuse on social media or more efficiently filtering invective from the comments on a news website.

“We hope this is a moment where Conversation AI goes from being ‘this is interesting’ to a place where everyone can start engaging and leveraging these models to improve discussion,” says Conversation AI product manager CJ Adams. For anyone trying to rein in the comments on a news site or social media, Adams says, “the options have been upvotes, downvotes, turning off comments altogether or manually moderating. This gives them a new option: Take a bunch of collective intelligence—that will keep getting better over time—about what toxic comments people have said would make them leave, and use that information to help your community’s discussions.”

On a demonstration website launched today, Conversation AI will now let anyone type a phrase into Perspective’s interface to instantaneously see how it rates on the “toxicity” scale. Google and Jigsaw developed that measurement tool by taking millions of comments from Wikipedia editorial discussions, the New York Times and other unnamed partners—five times as much data, Jigsaw says, as when it debuted Conversation AI in September—and then showing every one of those comments to panels of ten people Jigsaw recruited online to state whether they found the comment “toxic.”

The resulting judgements gave Jigsaw and Google a massive set of training examples with which to teach their machine learning model, just as human children are largely taught by example what constitutes abusive language or harassment in the offline world. Type “you are not a nice person” into its text field, and Perspective will tell you it has an 8 percent similarity to phrases people consider “toxic.” Write “you are a nasty woman,” by contrast, and Perspective will rate it 92 percent toxic, and “you are a bad hombre” gets a 78 percent rating. If one of its ratings seems wrong, the interface offers an option to report a correction, too, which will eventually be used to retrain the machine learning model.
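To make the labeling step concrete, here is a minimal sketch of how a panel's votes could be collapsed into a single training label. It assumes the label is simply the fraction of panelists who called a comment toxic; the article does not say how Jigsaw actually aggregates the judgments, and the names `LabeledComment` and `toxicity_label` as well as the sample votes are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LabeledComment:
    """A comment plus the votes of its review panel (hypothetical structure)."""
    text: str
    votes: List[bool]  # one entry per panelist; True means "I find this toxic"


def toxicity_label(votes: List[bool]) -> float:
    """Return the fraction of panelists who marked the comment toxic.

    Assumption: a plain average of votes. The real aggregation rule
    is not described in the article.
    """
    if not votes:
        raise ValueError("a comment needs at least one vote")
    return sum(votes) / len(votes)


# Made-up votes from a ten-person panel, for illustration only.
sample = LabeledComment(
    text="example comment",
    votes=[True] * 9 + [False],
)
label = toxicity_label(sample.votes)  # 9 of 10 panelists said "toxic"
```

A model trained on pairs of comment text and labels like this one would learn to predict how likely a panel is to call a new comment toxic, which matches the way Perspective describes its score as similarity to phrases people consider toxic.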

The Perspective API will allow developers to access that test with automated code, providing answers quickly enough that publishers can integrate it into their websites to show toxicity ratings to commenters even as they’re typing. And Jigsaw has already partnered with online communities and publishers to implement that toxicity measurement system. Wikipedia used it to perform a study of its editorial discussion pages. The New York Times is planning to use it as a first pass of all its comments, automatically flagging abusive ones for its team of human moderators. And the Guardian and the Economist are now both experimenting with the system to see how they might use it to improve their comment sections, too. “Ultimately we want the AI to surface the toxic stuff to us faster,” says Denise Law, the Economist’s community editor. “If we can remove that, what we’d have left is all the really nice comments. We’d create a safe space where everyone can have intelligent debates.”
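The "first pass" workflow described above can be sketched in a few lines. This is an illustration under stated assumptions, not Perspective's actual client code: `score_fn` stands in for whatever call returns a toxicity score between 0 and 1, the 0.8 threshold is arbitrary, and `stub_score` is a toy placeholder so the example runs without network access.

```python
from typing import Callable, Iterable, List, Tuple


def flag_for_review(
    comments: Iterable[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.8,  # arbitrary cutoff chosen for illustration
) -> List[Tuple[str, float]]:
    """Return (comment, score) pairs at or above the threshold, worst first.

    Nothing is deleted here: flagged comments are only queued so that
    human moderators see the likeliest abuse before anything else.
    """
    flagged = []
    for comment in comments:
        score = score_fn(comment)
        if score >= threshold:
            flagged.append((comment, score))
    flagged.sort(key=lambda pair: pair[1], reverse=True)
    return flagged


def stub_score(comment: str) -> float:
    """Toy stand-in for a real toxicity model (hypothetical)."""
    return 0.9 if "idiot" in comment.lower() else 0.1


queue = flag_for_review(
    ["You idiot", "Thoughtful piece, thanks"],
    stub_score,
)
```

Passing the scorer in as a function keeps the moderation logic separate from the model, so a publisher could swap the stub for a real scoring call without touching the queueing code.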

