Deep Reinforcement Learning Hands-On

By: Maxim Lapan

Overview of this book

Deep Reinforcement Learning Hands-On is a comprehensive guide to the very latest DL tools and their limitations. You will evaluate methods including cross-entropy and policy gradients before applying them to real-world environments. Take on both the Atari set of virtual games and family favorites such as Connect4. The book provides an introduction to the basics of RL, giving you the know-how to code intelligent learning agents that take on a formidable array of practical tasks. Discover how to implement Q-learning on 'grid world' environments, teach your agent to buy and trade stocks, and find out how natural language models are driving the boom in chatbots.
Distributional policy gradients

As the last method of this chapter, we'll take a look at a very recent paper by Gabriel Barth-Maron, Matthew W. Hoffman, and others, called Distributional Policy Gradients, published in 2018. At the time of writing, this paper hasn't been uploaded to arXiv yet, as it has only been submitted for review to the ICLR 2018 conference. It is available at

The full name of the method is Distributed Distributional Deep Deterministic Policy Gradients, or D4PG for short. The authors proposed several changes to the DDPG method we've just seen, aimed at improving stability, convergence, and sample efficiency.

First of all, they adapted the distributional representation of the Q-value proposed in the paper by Marc G. Bellemare and others, called A Distributional Perspective on Reinforcement Learning, published in 2017. We discussed this approach in Chapter 7, DQN Extensions, so refer to it or to the original...
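
To make the idea of a distributional Q-value concrete, here is a minimal sketch in plain Python. Instead of a single number, the critic is assumed to output one logit per "atom" on a fixed grid of possible returns, and the scalar Q-value is recovered as the mean of the resulting probability distribution. The range (-10 to 10), the number of atoms (51), and all function names are illustrative assumptions, not values taken from the D4PG paper or from this book's code.

```python
import math

# Hypothetical settings, chosen only for illustration
V_MIN, V_MAX, N_ATOMS = -10.0, 10.0, 51


def make_support(v_min, v_max, n_atoms):
    """Return evenly spaced atoms z_i covering [v_min, v_max]."""
    delta = (v_max - v_min) / (n_atoms - 1)
    return [v_min + i * delta for i in range(n_atoms)]


def softmax(logits):
    """Numerically stable softmax: raw critic outputs -> probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_q(probs, support):
    """Recover a scalar Q-value as the mean of the distribution."""
    return sum(p * z for p, z in zip(probs, support))


support = make_support(V_MIN, V_MAX, N_ATOMS)

# Stand-in for the critic's output for one (state, action) pair.
# Uniform logits are an assumption; a real critic would produce these.
logits = [0.0] * N_ATOMS
probs = softmax(logits)
q_value = expected_q(probs, support)
print(f"Expected Q-value: {q_value:.3f}")
```

In a DDPG-style setup, the actor would be trained against this expected value, while the critic itself is trained to match a whole target distribution rather than a single scalar.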