Learn Unity ML-Agents – Fundamentals of Unity Machine Learning

Overview of this book

Unity Machine Learning agents allow researchers and developers to create games and simulations using the Unity Editor, which serves as an environment where intelligent agents can be trained with machine learning methods through a simple-to-use Python API. This book takes you from the basics of Reinforcement and Q Learning to building Deep Recurrent Q-Network agents that cooperate or compete in a multi-agent ecosystem. You will start with the basics of Reinforcement Learning and how to apply it to problems. Then you will learn how to build self-learning advanced neural networks with Python and Keras/TensorFlow. From there you move on to more advanced training scenarios, where you will learn further innovative ways to train your network with A3C, imitation, and curriculum learning models. By the end of the book, you will have learned how to build more complex environments by building a cooperative and competitive multi-agent ecosystem.
Table of Contents (8 chapters)

Asynchronous actor–critic training

Thus far, we have assumed that the internal training structure of PPO mirrors what we learned when we first looked at neural networks and DQN. However, this isn't actually the case. Instead of using a single network to derive Q-values or some form of policy, the PPO algorithm uses a technique called actor–critic, which combines value estimation with policy learning. A3C (asynchronous advantage actor–critic) is one well-known algorithm in this family. In actor–critic training, we train two networks. One network, the critic, acts as a Q-value estimate, and the other, the actor, determines the policy, that is, the actions the agent takes.
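The two-network split can be illustrated with a minimal sketch. It is not the PPO or A3C implementation used by ML-Agents: the neural networks are replaced by small lookup tables, the environment is a hypothetical one-state, two-action task, and every name (`Actor`, `Critic`, `train`) is an assumption made for illustration. The critic estimates a value, and the actor's policy is nudged by how much better or worse the outcome was than that estimate.

```python
import math
import random


class Actor:
    """Tabular softmax policy (stand-in for the policy network)."""

    def __init__(self, n_states, n_actions):
        self.prefs = [[0.0] * n_actions for _ in range(n_states)]

    def probs(self, state):
        row = self.prefs[state]
        top = max(row)  # subtract the max for numerical stability
        exps = [math.exp(p - top) for p in row]
        total = sum(exps)
        return [e / total for e in exps]

    def act(self, state, rng):
        actions = range(len(self.prefs[state]))
        return rng.choices(actions, weights=self.probs(state))[0]

    def update(self, state, action, advantage, lr):
        # Softmax policy gradient: d log pi(a|s) / d pref[b] = 1[a == b] - pi(b|s)
        p = self.probs(state)
        for b in range(len(p)):
            grad = (1.0 if b == action else 0.0) - p[b]
            self.prefs[state][b] += lr * advantage * grad


class Critic:
    """Tabular state-value estimate (stand-in for the value network)."""

    def __init__(self, n_states):
        self.values = [0.0] * n_states

    def value(self, state):
        return self.values[state]

    def update(self, state, target, lr):
        # Move the estimate a small step toward the observed target.
        self.values[state] += lr * (target - self.values[state])


def train(episodes=2000, seed=0):
    """Hypothetical task: one state; action 1 pays 1.0, action 0 pays 0.0."""
    rng = random.Random(seed)
    actor, critic = Actor(1, 2), Critic(1)
    for _ in range(episodes):
        state = 0
        action = actor.act(state, rng)
        reward = 1.0 if action == 1 else 0.0
        advantage = reward - critic.value(state)  # better or worse than expected?
        actor.update(state, action, advantage, lr=0.1)
        critic.update(state, reward, lr=0.1)
    return actor, critic


if __name__ == "__main__":
    actor, critic = train()
    print("policy:", actor.probs(0), "value:", critic.value(0))
```

In this sketch the actor should come to prefer action 1, while the critic's estimate tracks the average reward the current policy earns. A real agent would swap both tables for neural networks and run many such workers, but the division of labor is the same.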

We compare these values in the following equation to determine the advantage:
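For reference, the advantage is commonly written as the difference between the action value and the state value. The symbols below are standard notation assumed here rather than taken from the text: Q(s, a) is the value of taking action a in state s, V(s) is the value of state s, and A(s, a) is the advantage.

```latex
A(s, a) = Q(s, a) - V(s)
```

A positive advantage means the action did better than the critic expected from that state, and a negative one means it did worse.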

However, the network is no longer calculating Q-values, so we replace the Q-value term with an estimate of the rewards:
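In the usual formulation, that estimate is the discounted return R, giving A = R - V(s). The sketch below shows one way this could be computed for a short trajectory. It is an illustration under stated assumptions, not the ML-Agents code: the function names, the discount factor `gamma`, and the optional `bootstrap_value` (the critic's estimate for the state after the last step) are all hypothetical.

```python
def discounted_returns(rewards, gamma=0.99, bootstrap_value=0.0):
    """Return R_t = r_t + gamma * R_{t+1} for each step, computed backwards."""
    returns = []
    running = bootstrap_value  # value assumed for whatever follows the last step
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards, values, gamma=0.99, bootstrap_value=0.0):
    """Advantage per step: discounted return minus the critic's estimate V(s)."""
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")
    returns = discounted_returns(rewards, gamma, bootstrap_value)
    return [ret - value for ret, value in zip(returns, values)]


if __name__ == "__main__":
    # Three steps with reward 1.0 each; the critic predicted 1.0 everywhere.
    print(advantages([1.0, 1.0, 1.0], [1.0, 1.0, 1.0], gamma=0.5))
```

With `gamma=0.5` the returns for this trajectory are 1.75, 1.5, and 1.0, so the critic underestimated the first two states and was exactly right about the last one.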

Now our environment looks like the following screenshot:



Diagram of actor–...