## Multi-layer feed-forward neural network

Historically, artificial neural networks have been largely identified with multi-layer feed-forward perceptrons, so we begin with the primitive elements of such networks, how to train them, the problem of overfitting, and techniques to address it.

### Inputs, neurons, activation function, and mathematical notation

A single neuron or perceptron is the same as the unit described in the Linear Regression topic in Chapter 2, *Practical Approach to Real-World Supervised Learning*. In this chapter, the data instance vector is represented by *x* and has *d* dimensions, with each dimension written as $x_i$. The weights associated with the dimensions form a weight vector *w*, also of *d* dimensions, with each dimension written as $w_i$. Each neuron has an extra input *b*, known as the bias, associated with it.

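Neuron pre-activation performs the linear transformation of the inputs, that is, the weighted sum of the inputs plus the bias, given by:

$$b + \sum_{i=1}^{d} w_i x_i = b + w^\top x$$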
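
As a minimal sketch of the pre-activation step, assuming the standard weighted-sum form with the inputs *x*, weights *w*, and bias *b* defined above, the computation for one neuron might look like the following. The function name `pre_activation` and the sample values are hypothetical and chosen only for illustration.

```python
def pre_activation(x, w, b):
    """Return the weighted sum of inputs plus bias for a single neuron.

    x: sequence of d input values (the data instance)
    w: sequence of d weights, one per input dimension
    b: scalar bias
    """
    if len(x) != len(w):
        raise ValueError("x and w must have the same number of dimensions")
    # Linear transformation: b + sum over i of w_i * x_i
    return b + sum(w_i * x_i for w_i, x_i in zip(w, x))


# Illustrative values only (not taken from the text).
x = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 0.25]
b = 0.1
z = pre_activation(x, w, b)
```

The result `z` is the value that is subsequently passed to the neuron's activation function.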

The activation function...