None of the deep learning architectures we have dealt with so far has any mechanism to memorize the input it received previously. For instance, if you feed a feed-forward neural network (FNN) a sequence of characters such as HELLO, by the time the network gets to E it has already forgotten that it just read H. This is a serious problem for sequence-based learning: with no memory of the characters it has already read, such a network is very difficult to train to predict the next character, which makes it poorly suited to applications such as language modeling, machine translation, speech recognition, and so on.
For this specific reason, we are going to introduce recurrent neural networks (RNNs), a family of deep learning architectures that preserve information and memorize what they have just encountered...
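To make the contrast concrete, here is a minimal sketch (not the book's code) of a vanilla RNN cell stepping over the characters of HELLO in plain NumPy; the weight names and sizes are hypothetical. The key point is the recurrence: each new hidden state is computed from the current input and the previous hidden state, so by the time the network reads E, its state still carries information about having seen H. A feed-forward layer has no such state and sees each character in isolation.

```python
import numpy as np

chars = sorted(set("HELLO"))               # vocabulary: E, H, L, O
char_to_idx = {c: i for i, c in enumerate(chars)}

vocab_size, hidden_size = len(chars), 8    # hypothetical sizes
rng = np.random.default_rng(0)
W_xh = rng.normal(scale=0.1, size=(hidden_size, vocab_size))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden -> hidden
b_h = np.zeros(hidden_size)

def one_hot(c):
    x = np.zeros(vocab_size)
    x[char_to_idx[c]] = 1.0
    return x

h = np.zeros(hidden_size)                  # initial memory: empty
for c in "HELLO":
    # The recurrence: the new state depends on the current input AND the
    # previous state, so the network "remembers" everything read so far.
    h = np.tanh(W_xh @ one_hot(c) + W_hh @ h + b_h)
    print(c, h[:3])                        # first few state values per step
```

Running this prints a different hidden state after each character; feeding the same letter at two different positions (the two Ls) produces different states, precisely because the state also encodes what came before. Training such a network is the subject of the rest of this chapter.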