RNNs stand apart from traditional feed-forward deep neural networks in their ability to operate over long input sequences of vectors and to produce output sequences of vectors. An RNN is unfolded over time so that it behaves like a feed-forward neural network. Training is performed with backpropagation through time (BPTT), an extension of the standard backpropagation algorithm. A special recurrent unit, the Long Short-Term Memory (LSTM) cell, helps overcome the limitations of BPTT, such as vanishing and exploding gradients over long sequences.
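The unfolding idea can be sketched in a few lines. This is a minimal, scalar-valued illustration (not the chapter's Deeplearning4j implementation): the same recurrence is applied once per time step, so the unrolled loop is equivalent to a feed-forward network with shared weights. The weight values `w_xh` and `w_hh` are arbitrary assumptions chosen for the sketch.

```python
import math

def rnn_step(x, h_prev, w_xh, w_hh, b):
    # One recurrent step: h_t = tanh(w_xh * x_t + w_hh * h_{t-1} + b).
    return math.tanh(w_xh * x + w_hh * h_prev + b)

def rnn_forward(xs, w_xh=0.5, w_hh=0.8, b=0.0):
    # Unfold the recurrence over the input sequence: one "layer" per
    # time step, all sharing the same weights. Returns every hidden state.
    h, hs = 0.0, []
    for x in xs:
        h = rnn_step(x, h, w_xh, w_hh, b)
        hs.append(h)
    return hs

states = rnn_forward([1.0, 0.0, -1.0])
print(len(states))  # one hidden state per time step
```

Because tanh squashes each state into (-1, 1), repeated multiplication by `w_hh` through the unrolled steps is exactly where BPTT's vanishing-gradient problem arises.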
We also discussed the bidirectional RNN, an extension of the unidirectional RNN. Unidirectional RNNs sometimes fail to predict correctly because they lack access to future input information. Later, we covered distributed deep RNNs and their implementation with Deeplearning4j. Asynchronous stochastic gradient descent can be used to train a distributed RNN. In the next chapter, we will discuss another deep neural network model, called the Restricted...
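The bidirectional idea can be sketched in the same scalar style as above, assuming a toy recurrence rather than the Deeplearning4j layers used in the chapter: the same recurrence runs left-to-right and right-to-left, and the two hidden states at each step are paired, so every output can see both past and future context.

```python
import math

def rnn_pass(xs, w_xh=0.5, w_hh=0.8):
    # Plain unidirectional recurrence: h_t = tanh(w_xh*x_t + w_hh*h_{t-1}).
    # The weight values are arbitrary, chosen only for illustration.
    h, hs = 0.0, []
    for x in xs:
        h = math.tanh(w_xh * x + w_hh * h)
        hs.append(h)
    return hs

def birnn_pass(xs):
    # Run the recurrence in both directions, then pair the forward and
    # backward hidden states at each time step.
    fwd = rnn_pass(xs)
    bwd = rnn_pass(xs[::-1])[::-1]  # reverse input, then realign outputs
    return list(zip(fwd, bwd))

out = birnn_pass([1.0, 0.0, -1.0])
print(len(out))  # one (forward, backward) pair per time step
```

In a real bidirectional layer the two directions have separate weights and the paired states are concatenated before the output layer; the sketch keeps a single shared recurrence only for brevity.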