You can refer to the following documents for more insights into the topics covered in this chapter:
Deeplearning.net Theano tutorials: Single layer (http://deeplearning.net/tutorial/logreg.html), MLP (http://deeplearning.net/tutorial/mlp.html), Convolutions (http://deeplearning.net/tutorial/lenet.html)
All loss functions: for classification, regression, and joint embedding (http://christopher5106.github.io/deep/learning/2016/09/16/about-loss-functions-multinomial-logistic-logarithm-cross-entropy-square-errors-euclidian-absolute-frobenius-hinge.html)
The last example corresponds to Yann LeCun's LeNet-5 network, as described in Gradient-Based Learning Applied to Document Recognition (http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf)
Understanding the difficulty of training deep feedforward neural networks, Xavier Glorot, Yoshua Bengio, 2010
Maxout Networks, Ian J. Goodfellow, David Warde-Farley, Mehdi Mirza, Aaron Courville, Yoshua Bengio, 2013
An overview of gradient descent algorithms...