Book Image

Deep Reinforcement Learning Hands-On

By : Maxim Lapan
Book Image

Deep Reinforcement Learning Hands-On

By: Maxim Lapan

Overview of this book

Deep Reinforcement Learning Hands-On is a comprehensive guide to the very latest DL tools and their limitations. You will evaluate methods including Cross-entropy and policy gradients, before applying them to real-world environments. Take on both the Atari set of virtual games and family favorites such as Connect4. The book provides an introduction to the basics of RL, giving you the know-how to code intelligent learning agents to take on a formidable array of practical tasks. Discover how to implement Q-learning on 'grid world' environments, teach your agent to buy and trade stocks, and find out how natural language models are driving the boom in chatbots.
Table of Contents (23 chapters)
Deep Reinforcement Learning Hands-On
Contributors
Preface
Other Books You May Enjoy
Index

ES on CartPole


The complete example is in Chapter16/01_cartpole_es.py. In this example, we use the single environment to check the fitness of the perturbed network weights. Our fitness function will be the undiscounted total reward for the episode:

#!/usr/bin/env python3
import gym
import time
import numpy as np

import torch
import torch.nn as nn

from tensorboardX import SummaryWriter

From the import statements, you can notice how self-contained our example is. We're not using PyTorch optimizers, as we do not perform backpropagation at all. In fact, we could avoid using PyTorch completely and work only with NumPy, as the only thing we use PyTorch for is to perform a forward pass and calculate the network's output.

MAX_BATCH_EPISODES = 100
MAX_BATCH_STEPS = 10000
NOISE_STD = 0.01
LEARNING_RATE = 0.001

The amount of hyperparameters is also small and includes the following values:

  • MAX_BATCH_EPISODES and MAX_BATCH_STEPS: The limit of episodes and steps we use for training

  • NOISE_STD: The standard...