Python Deep Learning Cookbook

By: Indra den Bakker

Overview of this book

Deep learning is revolutionizing a wide range of industries. For many applications, deep learning has proven to outperform humans by making faster and more accurate predictions. This book takes both a top-down and a bottom-up approach to demonstrate deep learning solutions to real-world problems in different areas, including Computer Vision, Natural Language Processing, Time Series, and Robotics. The Python Deep Learning Cookbook presents technical solutions to these problems, along with a detailed explanation of each solution. It also discusses the pros and cons of implementing each solution in one of the popular frameworks, such as TensorFlow, PyTorch, Keras, and CNTK. The book includes recipes related to the basic concepts of neural networks, as well as classical network topologies. The main purpose of this book is to provide Python programmers with a detailed list of recipes for applying deep learning to common and not-so-common scenarios.

Implementing policy gradients


In reinforcement learning, we cannot backpropagate the error in our network directly, because we don't have a truth set for each step. We only receive feedback now and then. This is why we need the policy gradient to propagate the rewards back to the network. The rules that determine the best action are called policies. The network that learns these policies is called a policy network. This can be any type of network, for example, a simple two-layer FNN or a CNN. The more complex the environment, the more you will benefit from a complex network. When using a policy gradient, we draw an action from the output distribution of our policy network. Because the reward is not always directly available, we provisionally treat the sampled action as correct. Later, we use the discounted reward as a scalar that weights the gradient for that action, and backpropagate it to the network weights.
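The idea can be sketched in a few lines of NumPy. This is only an illustration, not the recipe's TensorFlow implementation: it assumes a linear softmax policy instead of a two-layer FNN, and the names `discount_rewards`, `softmax`, and `policy_gradient_step`, as well as the values of `gamma` and `lr`, are hypothetical choices made for this example.

```python
import numpy as np


def discount_rewards(rewards, gamma=0.99):
    """Return the discounted cumulative reward for every time step."""
    discounted = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    # Walk backwards so each step accumulates all later rewards.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        discounted[t] = running
    return discounted


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()


def policy_gradient_step(weights, states, actions, rewards, gamma=0.99, lr=0.01):
    """One REINFORCE-style update for a linear softmax policy (assumed here)."""
    returns = discount_rewards(rewards, gamma)
    grad = np.zeros_like(weights)
    for state, action, ret in zip(states, actions, returns):
        probs = softmax(state @ weights)
        # Treat the sampled action as the "correct" label:
        # d log pi(action | state) / d logits = one_hot(action) - probs
        dlogits = -probs
        dlogits[action] += 1.0
        # Scale the gradient by the discounted reward for that step.
        grad += np.outer(state, dlogits) * ret
    # Gradient ascent on the expected reward.
    return weights + lr * grad


# Draw an action from the policy's output distribution.
rng = np.random.default_rng(0)
weights = np.zeros((4, 2))  # 4 state features, 2 actions (illustrative sizes)
state = rng.normal(size=4)
probs = softmax(state @ weights)
action = rng.choice(len(probs), p=probs)
```

Actions followed by a positive discounted reward become more likely after the update, and actions followed by a negative one become less likely. In practice, the discounted rewards are often normalized before the update to reduce variance.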

In the following recipe, we will teach an agent to play Pong from OpenAI by implementing a policy gradient in TensorFlow. Pong is a great game to start with, because it is simple...