In this chapter, we'll take a look at another practical application of Deep Reinforcement Learning (Deep RL), which has become popular over the Past two years: the training of natural language models with RL methods. It started with a paper called Recurrent Models of Visual Attention, published in 2014, and has been successfully applied to a wide variety of problems from the Natural Language Processing (NLP) domain.
To understand the method, we will begin with a brief introduction to the NLP basics, including Recurrent Neural Networks (RNNs), word embeddings, and the seq2seq model. Then we'll discuss similarities between the NLP and RL problems and take a look at original ideas on how to improve NLP seq2seq training using RL methods. The core of the chapter is a dialogue system trained on the movie dialogues dataset.