Chapter 7: Policy-Based Methods | Mastering Reinforcement Learning with Python

Mastering Reinforcement Learning with Python

By: Enes Bilgin

Overview of this book

Reinforcement learning (RL) is a field of artificial intelligence (AI) used for creating self-learning autonomous agents. Building on a strong theoretical foundation, this book takes a practical approach and uses examples inspired by real-world industry problems to teach you about state-of-the-art RL. Starting with bandit problems, Markov decision processes, and dynamic programming, the book provides an in-depth review of the classical RL techniques, such as Monte Carlo methods and temporal-difference learning. After that, you will learn about deep Q-learning, policy gradient algorithms, actor-critic methods, model-based methods, and multi-agent reinforcement learning. Then, you’ll be introduced to some of the key approaches behind the most successful RL implementations, such as domain randomization and curiosity-driven learning. As you advance, you’ll explore many novel algorithms with advanced implementations using modern Python libraries such as TensorFlow and Ray’s RLlib package. You’ll also find out how to implement RL in areas such as robotics, supply chain management, marketing, finance, smart cities, and cybersecurity while assessing the trade-offs between different approaches and avoiding common pitfalls. By the end of this book, you’ll have mastered how to train and deploy your own RL agents for solving RL problems.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Section 1: Reinforcement Learning Foundations

Chapter 1: Introduction to Reinforcement Learning

Why reinforcement learning?

The three paradigms of ML

RL application areas and success stories

Elements of a RL problem

Setting up your RL environment

Summary

References

Chapter 2: Multi-Armed Bandits

Exploration-Exploitation Trade-Off

What is a MAB?

Case study: Online advertising

A/B/n testing

ε-greedy actions

Action selection using upper confidence bounds

Thompson (Posterior) sampling

Summary

References

Chapter 3: Contextual Bandits

Why we need function approximations

Using function approximation for context

Using function approximation for action

Other applications of multi-armed and contextual bandits

Summary

References

Chapter 4: Makings of a Markov Decision Process

Starting with Markov chains

Introducing the reward: Markov reward process

Bringing the action in: Markov decision process

Partially observable Markov decision process

Summary

Exercises

References

Chapter 5: Solving the Reinforcement Learning Problem

Exploring dynamic programming

Training your agent with Monte Carlo methods

Temporal-difference learning

Understanding the importance of the simulation in reinforcement learning

Summary

References

Section 2: Deep Reinforcement Learning

Chapter 6: Deep Q-Learning at Scale

From tabular Q-learning to deep Q-learning

Deep Q-networks

Extensions to DQN: Rainbow

Distributed deep Q-learning

Implementing scalable deep Q-learning algorithms using Ray

RLlib: Production-grade deep reinforcement learning

Summary

References

Chapter 7: Policy-Based Methods

Need for policy-based methods

Vanilla policy gradient

Actor-critic methods

Trust-region methods

Revisiting off-policy Methods

Comparison of the policy-based methods in Lunar Lander

How to pick the right algorithm?

Open source implementations of policy-gradient methods

Summary

References

Chapter 8: Model-Based Methods

Introducing model-based methods

Planning through a model

Learning a world model

Unifying model-based and model-free approaches

Summary

References

Chapter 9: Multi-Agent Reinforcement Learning

Introducing multi-agent reinforcement learning

Exploring the challenges in multi-agent reinforcement learning

Training policies in multi-agent settings

Training tic-tac-toe agents through self-play

Summary

References

Section 3: Advanced Topics in RL

Chapter 10: Introducing Machine Teaching

Introduction to machine teaching

Engineering the reward function

Curriculum learning

Warm starts with demonstrations

Action masking

Summary

References

Chapter 11: Achieving Generalization and Overcoming Partial Observability

Focusing on generalization in reinforcement learning

Enriching agent experience via domain randomization

Using memory to overcome partial observability

Quantifying generalization via CoinRun

Summary

References

Chapter 12: Meta-Reinforcement Learning

Introducing meta-reinforcement learning

Meta-reinforcement learning with recurrent policies

Gradient-based meta-reinforcement learning

Meta-reinforcement learning as partially observed reinforcement learning

Challenges in meta-reinforcement learning

Conclusion

References

Chapter 13: Exploring Advanced Topics

Diving deeper into distributed reinforcement learning

Exploring curiosity-driven reinforcement learning

Offline reinforcement learning

Summary

References

Section 4: Applications of RL

Chapter 14: Solving Robot Learning

Introducing PyBullet

Getting familiar with the Kuka environment

Developing strategies to solve the Kuka environment

Using curriculum learning to train the Kuka robot

Going beyond PyBullet into autonomous driving

Summary

References

Chapter 15: Supply Chain Management

Optimizing inventory procurement decisions

Modeling routing problems

Summary

References

Chapter 16: Personalization, Marketing, and Finance

Going beyond bandits for personalization

Developing effective marketing strategies using reinforcement learning

Applying reinforcement learning in finance

Summary

References

Chapter 17: Smart City and Cybersecurity

Controlling traffic lights to optimize vehicle flow

Providing ancillary service to power grid

Detecting cyberattacks in a smart grid

Summary

References

Chapter 18: Challenges and Future Directions in Reinforcement Learning

What you have achieved with this book

Challenges and future directions

Suggestions for aspiring reinforcement learning experts

Final words

References

Other Books You May Enjoy

Leave a review - let other readers know what you think

Actor-critic methods

Further reducing the variance in policy-based methods

Estimating the reward-to-go
