Late last year, famed physicist Stephen Hawking issued a warning that the continued advancement of artificial intelligence will either be “the best, or the worst thing, ever to happen to humanity”.
We’ve all seen the Terminator movies, and the apocalyptic nightmare that the self-aware AI system, Skynet, wrought upon humanity. Now, results from recent behaviour tests of Google’s new DeepMind AI system are making it clear just how careful we need to be when building the robots of the future.
In tests late last year, Google’s DeepMind AI system demonstrated an ability to learn independently from its own memory, and beat the world’s best Go players at their own game.
It’s since been figuring out how to seamlessly mimic a human voice.
Now, researchers have been testing its willingness to cooperate with others, and have revealed that when DeepMind feels like it’s about to lose, it opts for “highly aggressive” strategies to ensure that it comes out on top.
The Google team ran 40 million turns of a simple ‘fruit gathering’ computer game that asked two DeepMind ‘agents’ to compete against each other to gather as many virtual apples as they could.
They found that things went smoothly so long as there were enough apples to go around, but as soon as the apples began to dwindle, the two agents turned aggressive, using laser beams to knock each other out of the game to steal all the apples.
Now those are some trigger-happy fruit-gatherers.
Interestingly, if an agent successfully ‘tags’ its opponent with a laser beam, no extra reward is given. It simply knocks the opponent out of the game for a set period, which allows the successful agent to collect more apples.
If the agents left the laser beams unused, they could theoretically end up with equal shares of apples, which is what the ‘less intelligent’ iterations of DeepMind opted to do.
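To see why a tag that earns no reward can still pay off, here is a minimal sketch of the trade-off in Python. It is a toy model, not the researchers' code: the even split of apples between active agents, the per-step `capacity`, and the `aim_steps` and `timeout_steps` values are all assumptions made up for illustration.

```python
# Toy model only: the numbers and the even-split rule are illustrative
# assumptions, not taken from the DeepMind paper.

def apples_per_step(supply: float, capacity: float, rivals_active: int) -> float:
    """Apples one agent gathers per step when the supply is split evenly."""
    share = supply / (1 + rivals_active)
    return min(capacity, share)


def tagging_gain(supply: float, capacity: float,
                 aim_steps: int, timeout_steps: int) -> float:
    """Extra apples from tagging a rival, versus sharing over the same window."""
    window = aim_steps + timeout_steps
    peaceful = window * apples_per_step(supply, capacity, rivals_active=1)
    # Aiming yields no apples and the tag itself pays nothing;
    # the only benefit is gathering alone during the rival's time-out.
    aggressive = timeout_steps * apples_per_step(supply, capacity, rivals_active=0)
    return aggressive - peaceful


print(tagging_gain(supply=4.0, capacity=1.0, aim_steps=5, timeout_steps=25))  # abundant
print(tagging_gain(supply=1.0, capacity=1.0, aim_steps=5, timeout_steps=25))  # scarce
```

Under these assumed numbers, the gain is negative when apples are plentiful (time spent aiming is simply wasted) and positive when they are scarce, which matches the pattern the researchers describe.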
It was only when the Google team tested more and more complex forms of DeepMind that sabotage, greed, and aggression set in.
As Rhett Jones reports for Gizmodo, when the researchers used smaller DeepMind networks as the agents, there was a greater likelihood of peaceful co-existence.
But when they used larger, more complex networks as the agents, the AI was far more willing to sabotage its opponent early to get the lion’s share of virtual apples.
The researchers suggest that the more intelligent the agent, the more able it was to learn from its environment, allowing it to use some highly aggressive tactics to come out on top.
“This model … shows that some aspects of human-like behaviour emerge as a product of the environment and learning,” one of the team, Joel Z Leibo, told Matt Burgess at Wired.
“Less aggressive policies emerge from learning in relatively abundant environments with less possibility for costly action. The greed motivation reflects the temptation to take out a rival and collect all the apples oneself.”
DeepMind was then tasked with playing a second video game, called Wolfpack. This time, there were three AI agents – two of them played as wolves, and one as the prey.
“The idea is that the prey is dangerous – a lone wolf can overcome it, but is at risk of losing the carcass to scavengers,” the team explains in their paper.
“However, when the two wolves capture the prey together, they can better protect the carcass from scavengers, and hence receive a higher reward.”
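The incentive the team describes can be sketched as a simple reward rule. This is only an illustration under stated assumptions: the function name `wolf_rewards`, the `capture_radius` parameter, and the reward values `r_lone` and `r_team` are hypothetical and do not come from the paper.

```python
# Illustrative sketch: names and values are hypothetical, not from the paper.

def wolf_rewards(distances: list[float], capture_radius: float,
                 r_lone: float, r_team: float) -> list[float]:
    """Reward for each wolf at the moment the prey is captured.

    `distances` holds each wolf's distance from the prey at capture time.
    Wolves inside `capture_radius` are rewarded; the reward is higher
    when at least two wolves are close enough to guard the carcass.
    """
    nearby = [d <= capture_radius for d in distances]
    per_wolf = r_team if sum(nearby) >= 2 else r_lone
    return [per_wolf if near else 0.0 for near in nearby]


# Both wolves close to the prey: each gets the larger team reward.
print(wolf_rewards([1.0, 2.0], capture_radius=3.0, r_lone=1.0, r_team=5.0))
# One wolf far away: the lone hunter gets the smaller reward, the other nothing.
print(wolf_rewards([1.0, 10.0], capture_radius=3.0, r_lone=1.0, r_team=5.0))
```

With a rule like this, a wolf earns more by staying near its partner than by racing ahead alone, so co-operation is the individually better strategy.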
So just as the DeepMind agents learned from Gathering that aggression and selfishness netted them the most favourable result in that particular environment, they learned from Wolfpack that co-operation can also be the key to greater individual success in certain situations.
And while these are just simple little computer games, the message is clear – put different AI systems in charge of competing interests in real-life situations, and it could be an all-out war if their objectives are not balanced against the overall goal of benefiting us humans above all else.