Deep Reinforcement Learning Hands-On

By: Maxim Lapan

Overview of this book

Deep Reinforcement Learning Hands-On is a comprehensive guide to the very latest DL tools and their limitations. You will evaluate methods including cross-entropy and policy gradients before applying them to real-world environments. Take on both the Atari set of virtual games and family favorites such as Connect4. The book provides an introduction to the basics of RL, giving you the know-how to code intelligent learning agents to take on a formidable array of practical tasks. Discover how to implement Q-learning on 'grid world' environments, teach your agent to buy and trade stocks, and find out how natural language models are driving the boom in chatbots.

Taxonomy of RL methods


The cross-entropy method falls into the model-free and policy-based category of methods. These notions are new, so let's spend some time exploring them. RL methods can be classified along several dimensions:

  • Model-free or model-based

  • Value-based or policy-based

  • On-policy or off-policy

There are other ways to categorize RL methods, but for now we're interested in these three. Let's define them, because the specifics of your problem can influence which method you choose.

The term model-free means that the method doesn't build a model of the environment or reward; it directly connects observations to actions (or to values related to actions). In other words, the agent takes the current observation, performs some computation on it, and the result is the action it should take. In contrast, model-based methods try to predict what the next observation and/or reward will be. Based on this prediction, the agent tries to choose the best possible...
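To make the contrast concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption made for illustration and is not taken from the book: the five-cell corridor environment, the hand-filled action-value table, and the names `true_step`, `learn_model`, `model_free_act`, and `model_based_act` are all hypothetical. The model-free agent maps an observation straight to an action through stored values, while the model-based agent first predicts the next observation and reward for each action and then picks the action with the best predicted reward.

```python
# Hypothetical toy environment: a corridor with cells 0..4 and a goal at cell 4.
ACTIONS = (-1, +1)  # step left or step right


def true_step(state, action):
    """Real environment dynamics; the agents below never call this directly."""
    next_state = max(0, min(4, state + action))
    reward = 1.0 if next_state == 4 else 0.0
    return next_state, reward


def model_free_act(q_table, state):
    """Model-free: go directly from the observation to an action via stored values."""
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


def learn_model(transitions):
    """Build a tabular model (state, action) -> (next_state, reward) from experience."""
    return {(s, a): (s2, r) for s, a, s2, r in transitions}


def model_based_act(model, state):
    """Model-based: predict the outcome of each action, then choose the best one."""
    def predicted_reward(action):
        # Unknown transitions are assumed to stay in place with zero reward.
        _next_state, reward = model.get((state, action), (state, 0.0))
        return reward
    return max(ACTIONS, key=predicted_reward)


# Experience gathered by trying every action in every cell once.
transitions = [(s, a, *true_step(s, a)) for s in range(5) for a in ACTIONS]
model = learn_model(transitions)

# Hand-filled action values standing in for what a model-free method would learn.
q_table = {(s, a): 0.0 for s in range(5) for a in ACTIONS}
q_table[(3, +1)] = 1.0

print(model_free_act(q_table, 3), model_based_act(model, 3))
```

Both agents choose to step right from cell 3, but they get there differently: the first only looks up values, while the second consults its predictions of what the environment will do. The one-step lookahead used here is the simplest possible planner; real model-based methods typically plan over longer horizons.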