Learn Unity ML-Agents – Fundamentals of Unity Machine Learning

Overview of this book

Unity Machine Learning agents allow researchers and developers to create games and simulations using the Unity Editor, which serves as an environment where intelligent agents can be trained with machine learning methods through a simple-to-use Python API. This book takes you from the basics of Reinforcement Learning and Q-Learning to building Deep Recurrent Q-Network agents that cooperate or compete in a multi-agent ecosystem. You will start with the basics of Reinforcement Learning and how to apply it to problems. Then you will learn how to build advanced self-learning neural networks with Python and Keras/TensorFlow. From there you move on to more advanced training scenarios, where you will learn further ways to train your network with A3C, imitation, and curriculum learning models. By the end of the book, you will have learned how to build more complex environments by building a cooperative and competitive multi-agent ecosystem.
Table of Contents (8 chapters)

Running a sample

Unity ships the ML-Agents package with a number of prepared samples that demonstrate various learning and training scenarios. Let's open Unity, load a sample project, and get a feel for how ML-Agents run by following this exercise:

  1. Open the Unity editor and go to the starting Project dialog.
  2. Click the Open button at the top of the dialog and navigate to and select the ML-Agents/ml-agents/unity-environment folder, as shown in the following screenshot:
Loading the unity-environment project into the editor
  3. This will load the unity-environment project into the Unity editor. Depending on the Unity version you are using, you may get a warning that the version needs to be upgraded. As long as you are using a recent version of Unity, you can just click Continue. If you do experience problems, try upgrading or downgrading your version of Unity.
  4. Locate the Scene file in the Assets/ML-Agents/Examples/3DBall folder of the Project window, as shown in the following screenshot:
Locating the example scene file in the 3DBall folder
  5. Double-click the 3DBall scene file to open the scene in the editor.
  6. Press the Play button at the top center of the editor to run the scene. You will see that the scene starts running and that balls are being dropped, but the balls just fall off the platforms. This is because the scene starts up in Player mode, which means you can control the platforms with keyboard input. Try to balance the balls on the platform using the arrow keys on the keyboard.
  7. When you are done running the scene, click the Play button again to stop the scene.

Setting the agent Brain

As you witnessed, the scene is currently set for Player control, but obviously we want to see how some of this ML-Agents stuff works. In order to do that, we need to change the Brain type the agent is using. Follow along to switch the Brain type in the 3D Ball agent:

  1. Locate the Ball3DAcademy object in the Hierarchy window and expand it to reveal the Ball3DBrain object.
  2. Select the Ball3DBrain object and then look to the Inspector window, as shown in the following screenshot:
Switching the Brain on the Ball3DBrain object
  3. Switch the Brain component, as shown in the preceding excerpt, to the Heuristic setting. The Heuristic brain setting is for ML-Agents that are coded internally within Unity scripts in a heuristic manner. Heuristic programming simply means choosing a simpler, quicker solution when a more rigorous method (in our case, an ML algorithm) would take longer. Writing a Heuristic brain can often help you better define a problem, and it is a technique we will use later in this chapter. The majority of current game AIs fall within the category of Heuristic algorithms.
  4. Press Play to run the scene. Now you will see the platforms balancing each of the balls, which is very impressive for a heuristic algorithm. Next, we want to open the script with the heuristic brain and take a look at some of the code.
You may need to adjust the Rotation Speed property, up or down, on the Ball 3D Decision (Script). Try a Rotation Speed of 0.5 if the Heuristic brain seems unable to balance the balls effectively. The Rotation Speed property is hidden in the preceding screen excerpt.
  5. Click the Gear icon beside the Ball 3D Decision (Script), and from the context menu, select Edit Script, as shown in the following screenshot:
Editing the Ball 3D Decision script
  6. Take a look at the Decide method in the script as follows:
      public float[] Decide(
          List<float> vectorObs,
          List<Texture2D> visualObs,
          float reward,
          bool done,
          List<float> memory)
      {
          // Only act when the Brain's vector action space is continuous.
          if (gameObject.GetComponent<Brain>().brainParameters.vectorActionSpaceType
              == SpaceType.continuous)
          {
              List<float> act = new List<float>();

              // vectorObs[5] is the velocity of the ball in the x orientation.
              // We use this number to control the Platform's z axis rotation
              // so that the Platform is tilted in the x orientation.
              act.Add(vectorObs[5] * rotationSpeed);

              // vectorObs[7] is the velocity of the ball in the z orientation.
              // We use this number to control the Platform's x axis rotation
              // so that the Platform is tilted in the z orientation.
              act.Add(-vectorObs[7] * rotationSpeed);

              return act.ToArray();
          }

          // If the vector action space type is discrete, then we don't do anything.
          return new float[1] { 1f };
      }
  7. We will cover what the inputs and outputs of this method mean in more detail later. For now, look at how simple the code is. This is the heuristic brain that balances the balls on the platforms, which is fairly impressive given how little code it takes. The question that may hit you is: why bother with ML programming, then? The simple answer is that the 3D ball problem is deceptively simple and can be modeled with eight state values. Look at the code again: the observation vector holds eight values (indices 0 to 7), and the heuristic needs only two of them, the ball's velocity in the x and z orientations. This works well for this problem, but in more complex examples we may face millions or even billions of states, far too many to handle easily with heuristic methods.
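
To make the logic of the Decide method easier to experiment with outside Unity, here is a minimal Python sketch of the same rule. It is not part of the ML-Agents package: the names heuristic_decide and count_discrete_states are hypothetical, the observation indices 5 and 7 are taken from the comments in the listing, and the default rotation speed of 0.5 is simply the value suggested in the note above. The second function illustrates, under the assumption that each of the eight observations is split into a fixed number of bins, how quickly the number of distinct states grows.

```python
from typing import List, Sequence

# Assumed indices, taken from the comments in the Decide listing:
# observation 5 is the ball's x velocity, observation 7 its z velocity.
BALL_VELOCITY_X = 5
BALL_VELOCITY_Z = 7


def heuristic_decide(vector_obs: Sequence[float],
                     rotation_speed: float = 0.5,
                     continuous: bool = True) -> List[float]:
    """Mirror the two branches of the C# Decide method (illustrative only)."""
    if continuous:
        # Rotate around z in proportion to the ball's x velocity,
        # and around x in proportion to the negated z velocity.
        return [vector_obs[BALL_VELOCITY_X] * rotation_speed,
                -vector_obs[BALL_VELOCITY_Z] * rotation_speed]
    # Discrete action space: the listing returns a fixed placeholder action.
    return [1.0]


def count_discrete_states(num_observations: int,
                          bins_per_observation: int) -> int:
    """Distinct states if every observation is split into the same number of bins."""
    return bins_per_observation ** num_observations


# Hypothetical observation: ball moving at +2 along x and -1 along z.
obs = [0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 0.0, -1.0]
print(heuristic_decide(obs))         # [1.0, 0.5]
print(count_discrete_states(8, 10))  # 100000000
```

Even with only ten bins per observation, the eight observations already give 100 million combinations. The heuristic avoids this by reading two values directly, which is why it works here but does not generalize to harder problems.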

Heuristic brains should not be confused with Internal brains, which we will get to in Chapter 6, Terrarium Revisited – Building a Multi-Agent Ecosystem. While you could replace the heuristic code in the 3D ball example with an ML algorithm, that is not the best practice for running advanced ML such as deep learning algorithms, which we will explore in Chapter 3, Deep Reinforcement Learning with Python.

In the next section, we are going to modify the Basic example in order to get a better feel for how ML-Agents components work together.