On AlphaGo. Part II

NB! You can find the first part of the article by Thomas among the posts below.

The gap between computer ability in Chess and in Go comes from the nature of the two games. Go is the more strategic game, with many more possible moves in each position, while Chess has a stronger tactical element: overlook a single tactic and you can instantly lose the game against a computer, which, as explained, is very strong tactically. Because Chess has fewer possible moves, engines can also look deeper than they can in Go to find winning tactics. However, because these engines are unintelligent and rely on brute force, they lack the strategic foresight of human players, and this is what allowed humans to stay relevant despite our weaker calculating abilities.
A great example of these elements can be found in the games between Deep Blue and Garry Kasparov. Deep Blue was a brute force engine (in fact a computer purpose-built for the task) that challenged the world chess champion Garry Kasparov in two six-game matches, losing the first in 1996 and winning the second in 1997. The major upset came in the first game of the 1996 match, where Deep Blue, playing White, defeated Kasparov in a tactical game and showed an important advantage of the machine: its lack of fear. Kasparov had launched a late-game counterattack that many human opponents would have feared, thinking the World Champion had something up his sleeve. Deep Blue was under no such illusions, however; it simply calculated that there was no threat and continued playing until Kasparov resigned. After this, Kasparov realised he would have to use his uniquely human intuition to play a slow, positional game before executing any attack, rather than try his might tactically. This allowed Kasparov to rebound and win the 1996 match 4-2, with three wins, two draws and one loss.
Although Deep Blue lost the match, this was the first time a computer had ever beaten a Chess World Champion in a classical game, and so it was hailed as a victory for artificial intelligence, although chess engines of this kind are really little more than strong calculators, with no intelligence beyond following algorithms. The next year, however, an upgraded Deep Blue beat Kasparov 3.5-2.5. Its estimated 200 million positions evaluated per second are often seen as decisive in this match, although many claim it was a psychological defeat in which Kasparov lost his confidence, with Kasparov even accusing the Deep Blue team of cheating after the second game. The calculating power of Deep Blue had reached the point where Kasparov could no longer consistently keep up, although he did win the first game. Since then engines have only become better, and even the current Chess World Champion Magnus Carlsen would almost certainly lose a classical match to a modern chess program. So that was the situation: computers beating humans conclusively and improving their own abilities gradually.
Then AlphaZero appeared. Given no information other than the rules of chess, it was set to work on over 5000 Google TPUs (like a CPU or GPU, but built for neural networks). After 9 hours of playing games against itself, the neural network had been trained sufficiently to beat the best engine of the time, Stockfish 8, in a time-controlled match with 28 wins, 72 draws and 0 losses. The most impressive thing about this is that while Stockfish 8 could calculate 70 million positions per second, AlphaZero could calculate only 80,000, and yet it decisively defeated Stockfish. This clearly shows the potential of neural networks: the network of nodes and edges trained in those 9 hours was able to narrow down the selection of moves so greatly that 80,000 positions per second were enough to defeat Stockfish’s 70 million.
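To get a feel for how much narrowing those speed figures imply, here is a rough back-of-the-envelope sketch in Python. It treats the search as a uniform tree in which every position has the same number of candidate moves, which is a big simplification: real engines such as Stockfish prune heavily, and AlphaZero's search works differently. The two speeds come from the article. The figure of 35 moves per position is an assumption (a commonly quoted average for chess), and the function names are made up for this illustration.

```python
import math


def reachable_depth(positions: float, branching: float) -> float:
    """Depth d of a uniform tree for which branching ** d == positions."""
    return math.log(positions) / math.log(branching)


def branching_for_depth(positions: float, depth: float) -> float:
    """Branching factor b for which b ** depth == positions."""
    return positions ** (1.0 / depth)


STOCKFISH_PER_SECOND = 70_000_000  # figure quoted in the article
ALPHAZERO_PER_SECOND = 80_000      # figure quoted in the article
RAW_BRANCHING = 35                 # ASSUMPTION: rough average number of legal moves in chess

# How deep one second of unpruned brute force gets in this toy model.
depth = reachable_depth(STOCKFISH_PER_SECOND, RAW_BRANCHING)

# How few candidate moves per position the slower searcher may consider
# if it wants to reach the same depth in the same second.
needed = branching_for_depth(ALPHAZERO_PER_SECOND, depth)

print(f"Brute force depth in one second: {depth:.1f} moves")
print(f"Moves per position the slower searcher can afford: {needed:.1f}")
```

In this toy model the slower searcher has to cut the candidate moves in each position from about 35 to about 9 just to look as far ahead as the brute force engine. The exact numbers mean little, but they give a sense of the kind of selectivity the trained network has to supply.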
This ability to narrow the search is where the intelligence of the neural network really shows: with no external input, AlphaZero managed to teach itself which moves are worth considering in each position, without the same level of calculation as normal engines. The play style of AlphaZero thus feels less cold to us than that of normal engines, and has elements of strategy and intuition that engines lacked until now. This is presumably because, like humans, AlphaZero learned from experience that certain positions and patterns lead to success, and it used that experience to defeat Stockfish.
The reaction from chess players was mixed. Grandmaster Jon Ludvig Hammer believed AlphaZero played ‘insane attacking chess’, reminiscent of the chess of the 19th century, while even Kasparov saw it as a ‘remarkable achievement’. Others, however, believe Stockfish was at a disadvantage: Tord Romstad, a developer of Stockfish, argued that the software and time controls used during the match handicapped Stockfish because of the way it was programmed. Even he conceded, however, that he thought this would mainly have increased Stockfish’s drawing chances, which gives some credit to AlphaZero’s extraordinary play style, which GM Nielsen and DeepMind co-founder Demis Hassabis labelled ‘alien’. Either way, it was a success for Artificial Intelligence, which demonstrated the ability to learn from experience and displayed characteristics that we would call intuition and strategic thought. It is particularly interesting that only when humans stopped intervening, with opening books, tablebases and training on human games, did AlphaZero begin to reach its full potential, which in the end has, rather sadly, far eclipsed human ability.