In this recipe, we look at a code sample showing how to optimize with RMSprop.
RMSprop is an (unpublished) adaptive learning rate method proposed by Geoff Hinton. RMSprop and AdaDelta were both developed independently around the same time, stemming from the need to resolve AdaGrad's radically diminishing learning rates. RMSprop is identical to the first update vector of AdaDelta that we derived earlier:
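The update referred to here is the standard form of RMSprop, written with $E[g^2]_t$ as the exponentially decaying average of squared gradients:

```latex
E[g^2]_t = \gamma \, E[g^2]_{t-1} + (1 - \gamma)\, g_t^2
\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{E[g^2]_t + \epsilon}}\, g_t
```

Here $g_t$ is the gradient at step $t$, $\gamma$ is the decay rate, $\eta$ is the learning rate, and $\epsilon$ is a small constant for numerical stability.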
RMSprop divides the learning rate by an exponentially decaying average of squared gradients. Hinton suggests setting γ to 0.9, while a good default value for the learning rate η is 0.001.
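Before moving to the code sample, the update rule can be sketched as a minimal NumPy implementation; the function name `rmsprop_update` and the toy quadratic objective are illustrative choices, not part of the recipe's code:

```python
import numpy as np

def rmsprop_update(theta, grad, avg_sq_grad, lr=0.001, gamma=0.9, eps=1e-8):
    """One RMSprop step: decay the running average of squared
    gradients, then divide the learning rate by its square root."""
    avg_sq_grad = gamma * avg_sq_grad + (1 - gamma) * grad ** 2
    theta = theta - lr * grad / (np.sqrt(avg_sq_grad) + eps)
    return theta, avg_sq_grad

# Minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta = np.array([5.0])
avg_sq_grad = np.zeros_like(theta)
for _ in range(2000):
    grad = 2 * theta
    theta, avg_sq_grad = rmsprop_update(theta, grad, avg_sq_grad)

print(theta)  # the parameter has moved toward the minimum at 0
```

Because the gradient is normalized by its own running magnitude, each step has roughly the size of the learning rate, which is why RMSprop avoids AdaGrad's ever-shrinking updates.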
Import the relevant classes, methods, and so on, as specified in the preceding common code section.