Earlier, we discussed two important challenges in training a simple RNN: the exploding gradient and the vanishing gradient. We also saw that gradient explosion can be prevented with a simple trick such as gradient clipping, leading to more stable training. Solving the vanishing gradient, however, takes much more effort, because there is no simple scaling or clipping mechanism for it, as there is for gradient explosion. We therefore need to modify the structure of the RNN itself, explicitly giving it the ability to remember longer patterns in sequences of data. The RNN-CF, proposed in the paper Learning Longer Memory in Recurrent Neural Networks, Tomas Mikolov and others, International Conference on Learning Representations (2015), is one such modification to the standard RNN, helping it memorize patterns in sequences of data for longer.
An RNN-CF provides an improvement that reduces the vanishing gradient.
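To make the idea more concrete, below is a minimal NumPy sketch of a single RNN-CF forward step, assuming the formulation described in the paper: alongside the usual hidden state, a small set of context units is kept and updated with a fixed decay factor alpha close to 1, so information in those units fades far more slowly than in a standard RNN. All names, dimensions, and the value of alpha here are illustrative assumptions, not code from the paper or this book.

```python
import numpy as np

def rnn_cf_step(x_t, h_prev, s_prev, params, alpha=0.95):
    """One forward step of an RNN-CF-style cell (illustrative sketch).

    s (context state) is updated with a fixed decay alpha close to 1,
    so it changes slowly and can carry information across long spans,
    while h (hidden state) behaves like an ordinary RNN state.
    """
    A, B, R, P, U, V = (params[k] for k in ("A", "B", "R", "P", "U", "V"))
    # Context units: mostly keep the previous context, mix in a little input.
    s_t = (1.0 - alpha) * (B @ x_t) + alpha * s_prev
    # Hidden units: standard recurrent update, plus input from the context.
    h_t = np.tanh(A @ x_t + R @ h_prev + P @ s_t)
    # Output: softmax over a combination of hidden and context states.
    logits = U @ h_t + V @ s_t
    exp = np.exp(logits - logits.max())
    return exp / exp.sum(), h_t, s_t

# Hypothetical dimensions: 10-dim input, 16 hidden units, 8 context units, 10 outputs.
rng = np.random.default_rng(0)
in_dim, hid_dim, ctx_dim, out_dim = 10, 16, 8, 10
params = {
    "A": rng.normal(scale=0.1, size=(hid_dim, in_dim)),
    "B": rng.normal(scale=0.1, size=(ctx_dim, in_dim)),
    "R": rng.normal(scale=0.1, size=(hid_dim, hid_dim)),
    "P": rng.normal(scale=0.1, size=(hid_dim, ctx_dim)),
    "U": rng.normal(scale=0.1, size=(out_dim, hid_dim)),
    "V": rng.normal(scale=0.1, size=(out_dim, ctx_dim)),
}
h, s = np.zeros(hid_dim), np.zeros(ctx_dim)
for x in rng.normal(size=(5, in_dim)):  # run five time steps
    y, h, s = rnn_cf_step(x, h, s, params)
```

Because the context state is multiplied by a constant alpha near 1 at every step (rather than by a trained recurrent weight matrix), its gradient contribution decays slowly and predictably, which is what gives the cell its longer memory.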