Introduction
This chapter focuses on policy-based methods for RL. Before introducing them formally, let's spend some time understanding the motivation behind them. Let's go back a few hundred years, to a time when much of the globe was still undiscovered and maps were incomplete. The brave sailors of that era crossed the great oceans with only indomitable courage and unyielding curiosity on their side. But they weren't completely blind in the vastness of the oceans. They looked up to the night sky for direction, and the stars and planets guided them to their destinations. The night sky looks different at different times of the year and from different parts of the globe. This information, along with highly accurate maps of the night sky, guided these brave explorers to their destinations and sometimes to unknown, uncharted lands.
Now, you might question what this story has to do with RL at all. A map of the night sky...