
Reinforcement Learning with TensorFlow

By: Sayon Dutta

Overview of this book

Reinforcement learning (RL) allows you to develop smart, fast, self-learning systems for your business environment. It is an effective method for training learning agents and solving a variety of problems in artificial intelligence, from games, self-driving cars, and robots to enterprise applications such as data center energy saving (cooling data centers) and smart warehousing solutions. The book covers major advancements and successes achieved in deep reinforcement learning by combining deep neural network architectures with reinforcement learning. You'll also be introduced to the concept of reinforcement learning, its advantages, and the reasons why it's gaining so much popularity. You'll explore MDPs, Monte Carlo tree search, dynamic programming methods such as policy and value iteration, and temporal difference learning methods such as Q-learning and SARSA. You will use TensorFlow and OpenAI Gym to build simple neural network models that learn from their own actions. You will also see how reinforcement learning algorithms play a role in games, image processing, and NLP. By the end of this book, you will have a firm understanding of what reinforcement learning is and how to put your knowledge to practical use with TensorFlow and OpenAI Gym.

Problem definition

As we already know, portfolio management is the continuous reallocation of funds across multiple financial products (assets). In this work, time is divided into periods of equal length, where each period lasts T = 30 minutes. At the beginning of each period, the trading agent reallocates the funds across the different assets. The price of an asset fluctuates within a period, but four price metrics are taken into consideration, which are sufficient to characterize the price movement of an asset in that period. These price metrics are as follows:

  • Opening price
  • Highest price
  • Lowest price
  • Closing price
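
To make the four metrics concrete, here is a minimal sketch in plain Python that groups timestamped prices into 30-minute periods and computes the opening, highest, lowest, and closing price for each one. The function name `ohlc_by_period`, the `(timestamp_seconds, price)` tick format, and the sample prices are assumptions made for illustration; they are not taken from the book. As a simplification, the opening price here is the first price observed in the period.

```python
PERIOD_SECONDS = 30 * 60  # T = 30 minutes, as in the text


def ohlc_by_period(ticks, period_seconds=PERIOD_SECONDS):
    """Group (timestamp_seconds, price) ticks into fixed-length periods.

    Returns a dict mapping period index -> (open, high, low, close).
    """
    bars = {}
    # Sort by timestamp only, so ticks sharing a timestamp keep their input order.
    for ts, price in sorted(ticks, key=lambda tick: tick[0]):
        idx = int(ts // period_seconds)
        if idx not in bars:
            # First price seen in this period: it is open, high, low and close.
            bars[idx] = [price, price, price, price]
        else:
            bar = bars[idx]
            bar[1] = max(bar[1], price)  # highest price so far
            bar[2] = min(bar[2], price)  # lowest price so far
            bar[3] = price               # latest price becomes the close
    return {idx: tuple(bar) for idx, bar in bars.items()}


# Hypothetical ticks: four fall in period 0, two in period 1.
ticks = [(0, 100.0), (600, 103.0), (1200, 98.0), (1799, 101.0),
         (1800, 101.5), (2400, 99.0)]
bars = ohlc_by_period(ticks)
for idx in sorted(bars):
    print(idx, bars[idx])
```

Each period is reduced to a four-value summary, which is the level of detail the trading agent works with in this problem setup.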

For a continuous market (such as our test case), the opening price of an asset in period t is its closing price in the previous period, t-1. The portfolio consists of m assets. For a time period t, the closing prices of all m assets form the price vector $v_t$. Thus, the i-th element of $v_t$, that is $v_{i,t}$, is the closing price of the i-th asset in the t-th time period.
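
The closing-price vector and the continuous-market property can be sketched as follows. This is an illustrative example only: the nested list `closes`, the helper names `price_vector` and `opening_prices`, and all the numbers are hypothetical and chosen just to show the indexing.

```python
# Hypothetical data: closes[t][i] is the closing price of asset i in period t.
closes = [
    [100.0, 20.0, 5.0],  # period 0
    [101.0, 19.5, 5.2],  # period 1
    [99.0, 19.8, 5.1],   # period 2
]


def price_vector(closes, t):
    """Return the closing-price vector for period t (one entry per asset)."""
    return list(closes[t])


def opening_prices(closes, t):
    """In a continuous market, period t opens at the close of period t-1."""
    if t < 1:
        raise ValueError("period 0 has no previous period in this data")
    return list(closes[t - 1])


m = len(closes[0])             # number of assets in the portfolio
v_1 = price_vector(closes, 1)  # closing prices of all m assets in period 1
print(m, v_1, v_1[1])          # v_1[1] is the close of asset 1 in period 1
print(opening_prices(closes, 2) == v_1)
```

Because the opening prices of a period are just the previous period's closing prices, storing the closing-price vectors is enough to recover both.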

Similarly, we have vector...