Now that we have seen a few examples, let's dig into the building blocks of a reinforcement learning system. Apart from the interaction between the agent and the environment, there are other factors at play here:
A typical reinforcement learning agent goes through the following steps:
There is a set of states related to the agent and the environment. At a given point of time, the agent observes an input state to sense the environment.
There are policies that govern what action needs to be taken. These policies act as decision making functions. The action is determined based on the input state using these policies.
The agent takes the action based on the previous step.
The environment reacts in a particular way in response to that action. The agent receives reinforcement, also known as reward, from the environment.
The agent records the information about this reward. It's important to note that this reward is for this particular pair of state and action...