Catch is a straightforward arcade game that you might have played as a child. Fruits fall from the top of the screen, and the player has to catch them with a basket. For every fruit caught, the player scores a point. For every fruit lost, the player loses a point.
The goal here is to let the computer play Catch by itself. To make the task easier, we will use a simplified version of the game in this example.
While playing Catch, the player decides between three possible actions. They can move the basket to the left, to the right, or make it stay put.
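To make the three actions concrete, here is a minimal sketch of one game step. It is an illustration only, not the implementation used later: the names `CatchState`, `step`, `LEFT`, `STAY`, `RIGHT`, the 10-cell grid, the one-cell-wide basket, and the single falling fruit are all assumptions made for this example. The reward follows the rules described above: one point for a catch, minus one point for a miss.

```python
from dataclasses import dataclass

# Hypothetical action encoding, chosen for this sketch only.
LEFT, STAY, RIGHT = 0, 1, 2


@dataclass
class CatchState:
    fruit_row: int   # 0 is the top of the screen
    fruit_col: int
    basket_col: int  # the basket always sits on the bottom row


def step(state, action, grid_size=10):
    """Advance the game by one frame and return (next_state, reward, done)."""
    # Move the basket by at most one cell and keep it on the screen.
    move = {LEFT: -1, STAY: 0, RIGHT: 1}[action]
    basket_col = min(max(state.basket_col + move, 0), grid_size - 1)

    # The fruit falls one row per frame.
    fruit_row = state.fruit_row + 1
    next_state = CatchState(fruit_row, state.fruit_col, basket_col)

    # The episode ends when the fruit reaches the bottom row.
    if fruit_row == grid_size - 1:
        reward = 1 if basket_col == state.fruit_col else -1
        return next_state, reward, True
    return next_state, 0, False


# A hand-written player that simply moves the basket toward the fruit.
state = CatchState(fruit_row=0, fruit_col=3, basket_col=5)
done = False
while not done:
    if state.basket_col > state.fruit_col:
        action = LEFT
    elif state.basket_col < state.fruit_col:
        action = RIGHT
    else:
        action = STAY
    state, final_reward, done = step(state, action)
```

The hand-written player at the end only shows how the pieces fit together. The point of the example is to replace such a rule with a model that learns which action to take.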
The basis for this decision is the current state of the game; in other words, the positions of the falling fruit and of the basket. Our goal is to create a model that, given the content of the game screen, chooses the action that leads to the highest score possible. This task can be seen as a simple classification problem. We could ask expert human players...