Hands-On Reinforcement Learning for Games

By: Micheal Lanham

Overview of this book

With the increased presence of AI in the gaming industry, developers are challenged to create highly responsive and adaptive games by integrating artificial intelligence into their projects. This book is your guide to learning how various reinforcement learning techniques and algorithms play an important role in game development with Python. Starting with the basics, this book will help you build a strong foundation in reinforcement learning for game development. Each chapter will assist you in implementing different reinforcement learning techniques, such as Markov decision processes (MDPs), Q-learning, actor-critic methods, SARSA, and deterministic policy gradient algorithms, to build logical self-learning agents. Learning these techniques will enhance your game development skills and add a variety of features to improve your game agent's performance. As you advance, you'll understand how deep reinforcement learning (DRL) techniques can be used to devise strategies that help agents learn from their actions and build engaging games. By the end of this book, you'll be ready to apply reinforcement learning techniques to build a variety of projects and contribute to open source applications.
Table of Contents (19 chapters)

Section 1: Exploring the Environment
Section 2: Exploiting the Knowledge
Section 3: Reward Yourself

Extending replay with prioritized experience replay

So far, we've seen how a replay buffer, or experience replay mechanism, lets us pull stored experiences back in batches at a later time to train the network. Those batches were composed of uniformly random samples, which works well, but we can do better. Rather than treating every experience equally, we can make two decisions: which data to store and which data to prioritize when sampling. To keep things simple, we will look only at prioritizing the data we extract from the experience replay. By sampling the most informative experiences more often, we can hope to dramatically improve what the network learns from each batch and, in turn, the overall performance of the agent.
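To make the idea concrete, here is a minimal sketch of a proportional prioritized replay buffer in Python with NumPy. The class and method names are illustrative rather than the book's implementation, and the priorities are assumed to come from each transition's absolute TD error; the sketch also omits the importance-sampling weights that the full algorithm uses to correct for the bias that prioritized sampling introduces.

import numpy as np

class PrioritizedReplayBuffer:
    """Sketch of proportional prioritized experience replay.

    Transitions are sampled with probability proportional to
    priority**alpha, where the priority is the transition's
    absolute TD error plus a small epsilon.
    """

    def __init__(self, capacity, alpha=0.6, epsilon=1e-5):
        self.capacity = capacity    # maximum number of stored transitions
        self.alpha = alpha          # prioritization strength (0 = uniform)
        self.epsilon = epsilon      # keeps every priority strictly positive
        self.buffer = []            # stored (s, a, r, s_next, done) tuples
        self.priorities = []        # one priority per stored transition
        self.position = 0           # next write index (circular buffer)

    def add(self, transition, td_error):
        """Store a transition with a priority based on its TD error."""
        priority = (abs(td_error) + self.epsilon) ** self.alpha
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
            self.priorities.append(priority)
        else:
            self.buffer[self.position] = transition
            self.priorities[self.position] = priority
        self.position = (self.position + 1) % self.capacity

    def sample(self, batch_size):
        """Draw a batch with probability proportional to priority."""
        probs = np.array(self.priorities)
        probs = probs / probs.sum()
        indices = np.random.choice(len(self.buffer), batch_size, p=probs)
        batch = [self.buffer[i] for i in indices]
        return batch, indices

    def update_priorities(self, indices, td_errors):
        """Refresh priorities after the network has trained on a batch."""
        for i, td_error in zip(indices, td_errors):
            self.priorities[i] = (abs(td_error) + self.epsilon) ** self.alpha

In a training loop, the agent would sample a batch, train the network on it, recompute the TD errors for those transitions, and call update_priorities so that experiences the network already predicts well are sampled less often.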

Unfortunately, while the idea behind prioritizing the replay buffer is quite simple to grasp, it is far more difficult in practice to derive and...