Book Image

Mastering Reinforcement Learning with Python

By : Enes Bilgin
Book Image

Mastering Reinforcement Learning with Python

By: Enes Bilgin

Overview of this book

Reinforcement learning (RL) is a field of artificial intelligence (AI) used for creating self-learning autonomous agents. Building on a strong theoretical foundation, this book takes a practical approach and uses examples inspired by real-world industry problems to teach you about state-of-the-art RL. Starting with bandit problems, Markov decision processes, and dynamic programming, the book provides an in-depth review of the classical RL techniques, such as Monte Carlo methods and temporal-difference learning. After that, you will learn about deep Q-learning, policy gradient algorithms, actor-critic methods, model-based methods, and multi-agent reinforcement learning. Then, you'll be introduced to some of the key approaches behind the most successful RL implementations, such as domain randomization and curiosity-driven learning. As you advance, you’ll explore many novel algorithms with advanced implementations using modern Python libraries such as TensorFlow and Ray’s RLlib package. You’ll also find out how to implement RL in areas such as robotics, supply chain management, marketing, finance, smart cities, and cybersecurity while assessing the trade-offs between different approaches and avoiding common pitfalls. By the end of this book, you’ll have mastered how to train and deploy your own RL agents for solving RL problems.
Table of Contents (24 chapters)
Section 1: Reinforcement Learning Foundations
Section 2: Deep Reinforcement Learning
Section 3: Advanced Topics in RL
Section 4: Applications of RL

Setting up your RL environment

RL algorithms utilize state-of-the-art ML libraries that require some sophisticated hardware. To follow along the examples we will solve throughout the book, you will need to set up your computer environment. Let's go over the hardware and software you will need in your setup.

Hardware requirements

As mentioned previously, state-of-the-art RL models are usually trained on hundreds of GPUs and thousands of CPUs. We certainly don't expect you to have access to those resources. However, having multiple CPU cores will help you simultaneously simulate many agents and environments to collect data more quickly. Having a GPU will speed up training deep neural networks that are used in modern RL algorithms. In addition, to be able to efficiently process all that data, having enough RAM resources is important. But don't worry, work with what you have, and you will still get a lot out of this book. For your reference, here are some specifications of the desktop we used to run the experiments: 

  • AMD Ryzen Threadripper 2990WX CPU with 32 cores
  • NVIDIA GeForce RTX 2080 Ti GPU
  • 128 GB RAM

As an alternative to building a desktop with expensive hardware, you can use Virtual Machines (VM) with similar capabilities provided by various companies. The most famous ones are:

  • Amazon AWS
  • Microsoft Azure
  • Google Cloud

These cloud providers also provide data science images for your virtual machines during the setup and it saves the user from installing the necessary software for deep learning (CUDA, TensorFlow ). They also provide detailed guidelines on how to setup your VMs, to which we defer the details of the setup.

A final option that would allow small-scale deep learning experiments with TensorFlow is Google's Colab, which provides VM instances readily accessible from your browser with the necessary software installed. You can start experimenting on a Jupyter Notebook-like environment right away, which is a very convenient option for quick experimentation.

Operating system

When you develop data science models for educational purposes, there is often not a lot of difference between Windows, Linux or MacOS. However, we plan to do a bit more than that in this book with advanced RL libraries running on a GPU. This setting is best supported on Linux, of which we use Ubuntu 18.04.3 LTS distribution. Another option is macOS, but that often does not come with a GPU on the machine. Finally, although the setup could be a bit convoluted, Windows Subsystem for Linux (WSL) 2 is an option you could explore.

Software toolbox

One of the first things people do while setting up the software environment for data science projects is to install Anaconda, which gives you a Python platform along with many useful libraries.


The CLI tool called virtualenv is a lighter weight tool compared to Anaconda to create virtual environments for Python, and preferable in most production environments. We, too, will use it in certain chapters. You can find the installation instructions for virtualenv at

We will particularly need the following packages:

  • Python 3.7: Python is the lingua franca of data science today. We will use version 3.7.
  • NumPy: It is one of the most fundamental libraries used in scientific computing in Python.
  • Pandas: Pandas is a widely used library that provides powerful data structures and analysis tools.
  • Jupyter Notebook: This is a very convenient tool to run Python code especially for small-scale tasks. It usually comes with your Anaconda installation by default.
  • TensorFlow 2.x: This will be our choice as the deep learning framework. We use version 2.3.0 in the book. Occasionally, we will refer to repos that use TF 1.x as well.
  • Ray & RLlib: Ray is a framework for building and running distributed applications and it is getting increasingly popular. RLlib is a library running on Ray that includes many popular RL algorithms. At the time of writing this book, Ray supports only Linux and macOS for production, and Windows support is in alpha phase. We will use version 1.0.1, unless noted otherwise.
  • Gym: This is an RL framework created by OpenAI that you have probably interacted with before if you ever touched RL. It allows us to define RL environments in a standard way and let them communicate with algorithms in packages like RLlib.
  • OpenCV Python Bindings: We need this for some image processing tasks.
  • Plotly: This is a very convenient library for data visualization. We will use Plotly together with the Cufflinks package to bind it to pandas.

You can use one of the following commands on your terminal to install a specific package. With Anaconda:

conda install pandas==0.20.3

With virtualenv (Also works with Anaconda in most cases)

pip install pandas==0.20.3

Sometimes, you are flexible with the version of the package, in which case you can omit the equal sign and what comes after.


It is always a good idea to create a virtual environment specific to your experiments for this book and install all these packages in that environment. This way, you will not break dependencies for your other Python projects. There is a comprehensive online documentation on how to manage your environments provided by Anaconda available at

That's it! With that, you are ready to start coding RL!