11. Policy-Based Methods for Reinforcement Learning
Overview
In this chapter, we will implement several policy-based methods of Reinforcement Learning (RL), such as policy gradients, Deep Deterministic Policy Gradient (DDPG), Trust Region Policy Optimization (TRPO), and Proximal Policy Optimization (PPO). You will be introduced to the math behind some of these algorithms and learn how to code policies for RL agents within the OpenAI Gym environment. By the end of this chapter, you will have a base-level understanding of policy-based RL methods and be able to create complete working prototypes with them.