## Two-layer neural networks

Let us look at the formal definition of a two-layer neural network. We follow the notations and description used by David MacKay (reference 1, 2, and 3 in the *References* section of this chapter). The input to the NN is given by . The input values are first multiplied by a set of weights to produce a weighted linear combination and then transformed using a nonlinear function to produce values of the state of neurons in the hidden layer:

A similar operation is done at the second layer to produce final output values :

The function is usually taken as either a
**sigmoid** function or . Another common function used for multiclass classification is **softmax** defined as follows:

This is a normalized exponential function.

All these are highly nonlinear functions exhibiting the property that the output value has a sharp increase as a function of the input. This nonlinear property gives neural networks more computational flexibility than standard linear or generalized linear models...