In this chapter, we looked at the basic architecture of recurrent neural networks and why they are better suited than traditional feed-forward networks to sequence data. We saw how RNNs can be used to learn an author's writing style and generate text with the learned model. We also saw how this example can be extended to predicting stock prices or other time series, recognizing speech in noisy audio, and so on, as well as to generating music with a learned model.
We then looked at different ways of composing RNN units into topologies, and how these topologies can be used to model and solve specific problems such as sentiment analysis, machine translation, image captioning, and classification.
Next, we looked at one of the biggest drawbacks of the SimpleRNN architecture: vanishing and exploding gradients. We saw how the vanishing gradient problem is mitigated by the LSTM and GRU architectures, and examined both in some detail. We also saw two examples of predicting...
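As a quick refresher on why SimpleRNN gradients vanish and why the LSTM's cell state helps, the following is a minimal pure-Python sketch (not code from the chapter; the scalar weights and gate values are assumed for illustration). Backpropagation through a SimpleRNN multiplies the gradient at every timestep by the recurrent weight times the tanh derivative, so the product shrinks exponentially; the LSTM's cell state is updated additively, and the gradient along it is scaled only by the forget gate, which can stay close to 1.

```python
import math

def tanh_grad(z):
    # Derivative of tanh: 1 - tanh(z)^2, always <= 1
    return 1.0 - math.tanh(z) ** 2

T = 50        # number of timesteps to backpropagate through
w = 0.5       # recurrent weight (assumed scalar for illustration)
z = 1.0       # pre-activation at each step (assumed constant)

# SimpleRNN: gradient is multiplied by w * tanh'(z) at every step
rnn_grad = 1.0
for _ in range(T):
    rnn_grad *= w * tanh_grad(z)

# LSTM cell-state path: gradient is scaled only by the forget gate,
# which the network can learn to keep near 1 (assumed 0.99 here)
forget_gate = 0.99
lstm_grad = 1.0
for _ in range(T):
    lstm_grad *= forget_gate

print(f"SimpleRNN gradient factor after {T} steps: {rnn_grad:.2e}")
print(f"LSTM cell-state factor after {T} steps:    {lstm_grad:.2f}")
```

Running this shows the SimpleRNN factor collapsing toward zero while the LSTM's cell-state factor remains of order 1, which is the intuition behind the gated architectures discussed above.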