TensorFlow Reinforcement Learning Quick Start Guide

By: Balakrishnan

Overview of this book

Advances in reinforcement learning algorithms have made it possible to use them for optimal control in a range of industrial applications. With this book, you will apply reinforcement learning to problems ranging from computer games to autonomous driving. The book starts by introducing essential reinforcement learning concepts such as agents, environments, rewards, and advantage functions. It then covers the distinctions between on-policy and off-policy algorithms, and between model-free and model-based algorithms. You will learn about several reinforcement learning algorithms, including SARSA, Deep Q-Networks (DQN), Deep Deterministic Policy Gradients (DDPG), Asynchronous Advantage Actor-Critic (A3C), Trust Region Policy Optimization (TRPO), and Proximal Policy Optimization (PPO). The book shows you how to code these algorithms in TensorFlow and Python and apply them to computer games from OpenAI Gym. Finally, you will learn how to train a car to drive autonomously in the TORCS racing car simulator. By the end of the book, you will be able to design, build, train, and evaluate feed-forward and convolutional neural networks, code state-of-the-art algorithms, and train agents for a variety of control problems.
Table of Contents (11 chapters)

The A2C algorithm

The difference between A2C and A3C is that A2C performs synchronous updates. In A2C, all workers wait until every worker has finished collecting its experiences and computing its gradients; only then are the global (or master) network's parameters updated. In A3C, by contrast, the update is asynchronous: the worker threads do not wait for one another to finish. A2C is easier to code than A3C, but it is not implemented here. If you are interested, you are encouraged to convert the preceding A3C code to A2C and then compare the performance of the two algorithms.
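
The difference in update scheduling can be illustrated with a minimal sketch. This is not the book's A3C code, and it is not a full actor-critic implementation: it assumes a toy one-parameter squared-error loss in place of the actor and critic losses, and it simulates the workers sequentially instead of using threads. All function names here (worker_gradient, a2c_style_update, a3c_style_update) are hypothetical and chosen only for illustration.

```python
import random


def worker_gradient(theta, samples):
    """Gradient of the toy loss mean(0.5 * (theta - x)^2) over one worker's samples."""
    return sum(theta - x for x in samples) / len(samples)


def a2c_style_update(theta, batches, lr):
    """Synchronous update: every worker computes its gradient against the same
    parameter value, and the global parameter is updated once with the average."""
    grads = [worker_gradient(theta, batch) for batch in batches]  # wait for all workers
    return theta - lr * sum(grads) / len(grads)


def a3c_style_update(theta, batches, lr):
    """Asynchronous update (simulated sequentially): each worker applies its own
    gradient as soon as it is ready, so later workers see changed parameters."""
    for batch in batches:
        theta = theta - lr * worker_gradient(theta, batch)
    return theta


# Four hypothetical workers, each holding 16 noisy samples (stand-ins for experience).
rng = random.Random(0)
batches = [[rng.gauss(3.0, 1.0) for _ in range(16)] for _ in range(4)]
batch_means = [sum(batch) / len(batch) for batch in batches]
target = sum(batch_means) / len(batch_means)

theta_sync = 0.0
theta_async = 0.0
for _ in range(200):
    theta_sync = a2c_style_update(theta_sync, batches, lr=0.1)
    theta_async = a3c_style_update(theta_async, batches, lr=0.1)

print("synchronous:", theta_sync, "asynchronous:", theta_async, "target:", target)
```

In this toy setting the synchronous variant settles on the average of the workers' batch means, while the sequential variant ends up at a weighted mix that depends on the order in which the workers apply their updates. A real conversion of the A3C code would keep the actor and critic losses unchanged and alter only when the gradients are applied to the global network.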
