Book Image

Hands-On Reinforcement Learning for Games

By : Micheal Lanham
Book Image

Hands-On Reinforcement Learning for Games

By: Micheal Lanham

Overview of this book

With the increased presence of AI in the gaming industry, developers are challenged to create highly responsive and adaptive games by integrating artificial intelligence into their projects. This book is your guide to learning how various reinforcement learning techniques and algorithms play an important role in game development with Python. Starting with the basics, this book will help you build a strong foundation in reinforcement learning for game development. Each chapter will assist you in implementing different reinforcement learning techniques, such as Markov decision processes (MDPs), Q-learning, actor-critic methods, SARSA, and deterministic policy gradient algorithms, to build logical self-learning agents. Learning these techniques will enhance your game development skills and add a variety of features to improve your game agent’s productivity. As you advance, you’ll understand how deep reinforcement learning (DRL) techniques can be used to devise strategies to help agents learn from their actions and build engaging games. By the end of this book, you’ll be ready to apply reinforcement learning techniques to build a variety of projects and contribute to open source applications.
Table of Contents (19 chapters)
1
Section 1: Exploring the Environment
7
Section 2: Exploiting the Knowledge
15
Section 3: Reward Yourself

Questions

Use these questions and exercises to reinforce the material you just learned. The exercises may be fun to attempt, so be sure to try atleast two to four questions/exercises:

Questions:

  1. What are the names of the main components of an RL system? Hint, the first one is Environment.
  2. Name the four elements of an RL system. Remember that one element is optional.
  3. Name the three main threads that compose modern RL.
  4. What makes a Markov state a Markov property?
  5. What is a policy?

Exercises:

  1. Using Chapter_1_2.py, alter the code so the agent pulls from a bandit with 1,000 arms. What code changes do you need to make?
  2. Using Chapter_1_3.py, alter the code so that the agent pulls from the average value, not greedy/max. How did this affect the agent's exploration?
  3. Using Chapter_1_3.py, alter the learning_rate variable to determine how fast or slow you can make the agent learn. How few episodes are you required to run for the agent to solve the problem?
  4. Using Chapter_1_5.py, alter the code so that the agent uses a different policy (either the greedy policy or something else). Take points off yourself if you look ahead in this book or online for solutions.
  5. Using Chapter_1_4.py, alter the code so that the bandits are connected. Hence, when an agent pulls an arm, they receive a reward and are transported to another specific bandit, no longer at random. Hint: This likely will require a new destination table to be built and you will now need to include the discounted reward term we removed.

Even completing a few of these questions and/or exercises will make a huge difference to your learning this material. This is a hands-on book after all.