To see the method in action, let's implement AlphaGo Zero for Connect4. This is a two-player game played on a 6 × 7 grid (six rows, seven columns). Each player has disks of one color, and the players take turns dropping a disk into any of the seven columns. A disk falls to the lowest free cell of its column, so disks stack vertically. The objective is to be the first to form a horizontal, vertical, or diagonal line of four disks of the same color. Two game situations are shown in the diagram: in the first, the red player has just won, while in the second, the blue player is about to form a group of four.
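To make the rules concrete, here is a minimal sketch of the drop and win-check logic in plain Python. It is not the code from Chapter18/lib/game.py; the names `new_board`, `drop`, and `wins`, and the list-of-lists board with 0 for an empty cell, are assumptions made for illustration only.

```python
from typing import List

ROWS, COLS = 6, 7   # board size given in the text
EMPTY = 0           # hypothetical encoding: 0 is empty, 1 and 2 are the players

Board = List[List[int]]


def new_board() -> Board:
    """Create an empty board; row 0 is the bottom row."""
    return [[EMPTY] * COLS for _ in range(ROWS)]


def drop(board: Board, col: int, player: int) -> int:
    """Drop a disk into `col` and return the row where it lands."""
    for row in range(ROWS):
        if board[row][col] == EMPTY:
            board[row][col] = player
            return row
    raise ValueError(f"column {col} is full")


def wins(board: Board, row: int, col: int) -> bool:
    """Check whether the disk at (row, col) is part of a line of four."""
    player = board[row][col]
    # Directions: horizontal, vertical, and the two diagonals.
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        count = 1
        # Walk both ways along the direction, counting matching disks.
        for sign in (1, -1):
            r, c = row + sign * dr, col + sign * dc
            while 0 <= r < ROWS and 0 <= c < COLS and board[r][c] == player:
                count += 1
                r += sign * dr
                c += sign * dc
        if count >= 4:
            return True
    return False


if __name__ == "__main__":
    board = new_board()
    # Player 1 stacks disks in column 3, player 2 answers in column 4.
    for turn in range(4):
        landed = drop(board, 3, 1)
        if wins(board, landed, 3):
            print("player 1 wins with a vertical group")
            break
        drop(board, 4, 2)
```

Checking only the lines that pass through the last placed disk is enough, because a move can complete a group only where the new disk lands.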
Despite its simplicity, this game has about 4.5 × 10^12 different game states, which makes it challenging for computers to solve by brute force. This example consists of several tools and library modules:
Chapter18/lib/game.py: Low-level game representation, which contains functions to make moves, encode and decode the game state, and other game-related utilities.
Chapter18/lib/mcts.py: MCTS implementation that allows GPU-expansion...