A common response when a deep learning model fails to learn is to add more layers. As you add layers, model accuracy improves at first, then saturates, and eventually starts to degrade as the network grows deeper. Very deep networks also introduce challenges such as vanishing or exploding gradients, which are partially addressed by careful weight initialization and intermediate normalization layers. Modern architectures, such as residual networks (ResNet) and Inception, tackle the remaining degradation problem with techniques such as residual connections.
Modern network architectures
ResNet
ResNet solves these problems by explicitly letting the stacked layers fit a residual mapping rather than the desired underlying mapping directly. If the desired mapping is H(x), the layers learn the residual F(x) = H(x) - x, and a shortcut (identity) connection adds the input back, so the block outputs F(x) + x. Because the shortcut passes the input through unchanged, gradients can flow directly to earlier layers, which makes very deep networks trainable.
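To make the idea concrete, here is a minimal sketch of a residual block in PyTorch. The block name, channel count, and input shape are illustrative assumptions, not the exact implementation from any library; real ResNets also add a 1x1 convolution on the shortcut when the shapes change between input and output.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualBlock(nn.Module):
    """A minimal residual block: output = relu(F(x) + x).

    The channel count is kept constant so the identity shortcut
    needs no projection (hypothetical simplification).
    """

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        identity = x                                # shortcut connection
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))             # F(x): the residual mapping
        return F.relu(out + identity)               # add the input back: F(x) + x


# Usage: pass a batch of feature maps through the block.
block = ResidualBlock(channels=64)
x = torch.randn(1, 64, 32, 32)
print(block(x).shape)  # torch.Size([1, 64, 32, 32])
```

Note that the addition happens before the final ReLU, so if the convolutions learn F(x) = 0, the block reduces to the identity function, which is exactly what makes stacking many such blocks safe.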