Reinforcement Learning with TensorFlow

By: Sayon Dutta

Overview of this book

Reinforcement learning (RL) allows you to develop smart, quick, self-learning systems in your business surroundings. It is an effective method for training learning agents and solving a variety of problems in artificial intelligence, from games, self-driving cars, and robots to enterprise applications such as data center energy saving (cooling data centers) and smart warehousing solutions. The book covers major advancements and successes achieved in deep reinforcement learning by combining deep neural network architectures with reinforcement learning. You'll also be introduced to the concept of reinforcement learning, its advantages, and why it is gaining so much popularity. You'll explore MDPs, Monte Carlo tree search, dynamic programming methods such as policy and value iteration, and temporal difference learning methods such as Q-learning and SARSA. You will use TensorFlow and OpenAI Gym to build simple neural network models that learn from their own actions. You will also see how reinforcement learning algorithms play a role in games, image processing, and NLP. By the end of this book, you will have gained a firm understanding of what reinforcement learning is and how to put that knowledge to practical use by leveraging the power of TensorFlow and OpenAI Gym.

The Monte Carlo tree search algorithm

Monte Carlo Tree Search (MCTS) is a planning algorithm, a way of making optimal decisions in artificial narrow intelligence problems. MCTS takes a plan-ahead approach: it simulates possible futures from the current state before committing to a move.
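As a rough sketch of that planning loop, the following toy implementation runs the four standard MCTS phases (selection, expansion, simulation, backpropagation) on one-pile Nim, where players alternately remove 1 to 3 stones and whoever takes the last stone wins. The game choice and every name here are illustrative assumptions, not code from the book:

```python
import math
import random

random.seed(0)  # make the stochastic search reproducible

def moves(pile):
    """Legal moves: remove 1-3 stones, never more than remain."""
    return [m for m in (1, 2, 3) if m <= pile]

class Node:
    def __init__(self, pile, parent=None, move=None):
        self.pile = pile            # stones left; some player is to move here
        self.parent = parent
        self.move = move            # move that led from parent to this node
        self.children = []
        self.untried = moves(pile)  # moves not yet expanded
        self.visits = 0
        self.wins = 0.0             # wins for the player who moved INTO this node

    def uct_child(self, c=1.4):
        # Upper Confidence bound applied to Trees: exploitation + exploration
        return max(self.children,
                   key=lambda n: n.wins / n.visits
                                 + c * math.sqrt(math.log(self.visits) / n.visits))

def rollout(pile):
    """Random playout; return 1 if the player to move at `pile` wins."""
    turn = 0
    while pile > 0:
        pile -= random.choice(moves(pile))
        turn ^= 1
    return 1 if turn == 1 else 0  # the player who took the last stone won

def mcts(root_pile, iters=3000):
    root = Node(root_pile)
    for _ in range(iters):
        node = root
        # 1. Selection: walk down via UCT while the node is fully expanded
        while not node.untried and node.children:
            node = node.uct_child()
        # 2. Expansion: add one unexplored child
        if node.untried:
            m = node.untried.pop()
            child = Node(node.pile - m, parent=node, move=m)
            node.children.append(child)
            node = child
        # 3. Simulation: random rollout from the new state
        r = rollout(node.pile)
        # 4. Backpropagation: credit the player who just moved at each level
        while node is not None:
            node.visits += 1
            node.wins += 1 - r
            node, r = node.parent, 1 - r
    # The most-visited child of the root is the chosen move
    return max(root.children, key=lambda n: n.visits).move

# From a pile of 5 the winning move is to take 1, leaving a multiple of 4
print(mcts(5))
```

With enough iterations the visit counts concentrate on the game-theoretically best move, which is the sense in which MCTS "plans ahead" without searching the full tree exhaustively.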

The MCTS algorithm gained importance after earlier approaches such as minimax over game trees failed to scale to complex problems with large branching factors. So what makes MCTS different from, and better than, earlier decision-making algorithms such as minimax?

Let's first discuss what minimax is.

Minimax and game trees

Minimax was the algorithm used by IBM's Deep Blue to defeat the world champion Garry Kasparov in a game of chess on February 10, 1996 (Deep Blue went on to win a full match against Kasparov in their 1997 rematch). That win was a major milestone at the time. Minimax searches a game tree: a directed graph in which each node represents a game state, that is, a position in the game, as shown in the following diagram of a game of tic-tac-toe:

Game tree for tic-tac-toe. The top node represents the start position of the game. Following down...
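Since the diagram uses tic-tac-toe, a minimal minimax for that game can make the idea concrete. This is an illustrative sketch, not the book's code: each position is scored +1 if X wins, -1 if O wins, and 0 for a draw, with X maximizing and O minimizing over the children of the current node.

```python
# Board: a list of 9 cells, each 'X', 'O', or ' ' (empty), indexed 0-8.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Return (score, best_move) with score +1 (X wins), -1 (O wins), 0 (draw)."""
    w = winner(board)
    if w == 'X':
        return 1, None
    if w == 'O':
        return -1, None
    empty = [i for i, cell in enumerate(board) if cell == ' ']
    if not empty:
        return 0, None  # board full: draw
    best_move = None
    if player == 'X':            # maximizing player
        best = -2
        for i in empty:
            board[i] = 'X'
            score, _ = minimax(board, 'O')
            board[i] = ' '       # undo the move
            if score > best:
                best, best_move = score, i
    else:                        # minimizing player
        best = 2
        for i in empty:
            board[i] = 'O'
            score, _ = minimax(board, 'X')
            board[i] = ' '
            if score < best:
                best, best_move = score, i
    return best, best_move

# X has two in a row on the top line and wins immediately by playing cell 2
score, move = minimax(['X', 'X', ' ', 'O', 'O', ' ', ' ', ' ', ' '], 'X')
# score == 1, move == 2
```

Exhaustive search like this is feasible for tic-tac-toe's tiny tree, but the number of nodes explodes for games such as chess or Go, which is exactly the scaling problem that motivates MCTS.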