Elon Musk’s Artificial Intelligence has shown that it is the best at playing Atari games; it is hard to think of a better show of strength.
Elon Musk’s interest in AI is nothing new, although it has always been a cautious one; not so long ago he said that developing AI was like “summoning the demon.”
The great fear of Musk and other experts is the damage that a misused AI could cause humanity. That is why, in late 2015, he co-founded OpenAI, a non-profit organization that researches and collaborates openly to create friendly AI.
Neuroevolution to create a friendly AI
To do this, OpenAI’s researchers have focused on a field that had some success in the 1980s but has since been displaced by other approaches.
It is called neuroevolution, and specifically they focused on evolution strategies: making an AI evolve based on the experiences it has accumulated, drawing inspiration from biological evolution.
Specifically, the key to an evolution strategy is that a successful system passes its characteristics on to its successors. From a pool of candidates, the best performers are selected; in this way, future versions are sure to be up to the task.
As we have said, this concept is decades old; the researchers’ achievement was to apply it to neural networks and distributed systems. But how do they know that this type of AI has so much potential?
The researchers were clear about who they had to beat: DeepMind, the Google company whose most recent achievement was AlphaGo, the program that beat the best Go player in the world.
And it is not just Go. DeepMind has taken on some of the most difficult Atari games and emerged victorious. So, at OpenAI, they decided that the contest would be settled with video games.
How Elon Musk’s Artificial Intelligence has learned to play Atari
Keep in mind that these AIs do not start out knowing how to play. Initially, the OpenAI system only received a random set of rules for how to get a high score.
From that set of rules, several hundred copies were made with random variations and tested in-game. Based on the results, the system recombined these rules, giving more weight to those that had achieved the highest scores.
This process was repeated constantly, until the system finally discovered a set of rules good enough to play well.
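The loop described above can be sketched in a few lines. This is a minimal toy version, not OpenAI’s actual code: the “game score” is replaced by a made-up fitness function, and the numbers (population size, noise scale, learning rate) are illustrative assumptions.

```python
import random

# Toy stand-in for a game score: how close a parameter vector (the
# "rule set") gets to a hidden target. In OpenAI's experiments the
# fitness would be the actual Atari score.
TARGET = [0.5, -1.2, 3.0]

def fitness(params):
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))

def evolve(generations=200, population=100, sigma=0.1, lr=0.05):
    # Start from a random rule set.
    params = [random.uniform(-1, 1) for _ in TARGET]
    for _ in range(generations):
        # Make perturbed copies with random variations and score each.
        noises = [[random.gauss(0, 1) for _ in params]
                  for _ in range(population)]
        scores = [fitness([p + sigma * n for p, n in zip(params, noise)])
                  for noise in noises]
        # Recombine: weight each variation by how well it scored
        # (scores are normalized so weights are comparable).
        mean = sum(scores) / population
        std = (sum((s - mean) ** 2 for s in scores) / population) ** 0.5 or 1.0
        advantages = [(s - mean) / std for s in scores]
        for i in range(len(params)):
            step = sum(a * noise[i] for a, noise in zip(advantages, noises))
            params[i] += lr / sigma * step / population
    return params
```

Repeating this perturb-score-recombine cycle is all there is to it: no gradients are computed through the game itself, which is what makes each evaluation independent.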
According to OpenAI, in just one hour the system became as good at playing as DeepMind’s system after a whole day of training.
The key to achieving these results is parallelism: because each rule set is evaluated independently, they were able to occupy 1,440 processors with them at the same time.
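Since candidates do not depend on each other, scoring a generation is embarrassingly parallel. A rough sketch of the idea using Python’s standard `multiprocessing` module (the `play_episode` function is a hypothetical stand-in for running one game):

```python
from multiprocessing import Pool
import random

def play_episode(params):
    # Stand-in for playing one Atari episode with the given rule set;
    # here the "score" is just a toy function of the parameters.
    return -sum(p * p for p in params)

def evaluate_generation(candidates, workers=4):
    # Every candidate is scored independently, so the whole generation
    # can be farmed out to a pool of workers -- the same idea OpenAI
    # scaled up to 1,440 processors.
    with Pool(workers) as pool:
        return pool.map(play_episode, candidates)

if __name__ == "__main__":
    population = [[random.gauss(0, 1) for _ in range(3)] for _ in range(8)]
    scores = evaluate_generation(population)
```

Because only small score numbers need to travel between machines (not the rule sets themselves, which each worker can reconstruct from a shared random seed), this approach scales across many processors with very little communication.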