The final glue – loss functions and optimizers
The network that transforms input data into output is not the only thing we need for training. We also need to define our learning objective, which takes the form of a function that accepts two arguments: the network's output and the desired output. Its responsibility is to return a single number that tells us how close the network's prediction is to the desired result. This function is called the loss function, and its output is the loss value. Using the loss value, we calculate the gradients of the loss with respect to the network's parameters and adjust those parameters to decrease the loss, which pushes our model toward better results in the future. Both the loss function and the method of tweaking a network's parameters by gradient (the optimizer) are so common, and exist in so many forms, that they form a significant part of the PyTorch library. Let's start with loss functions.
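Before we look at what the library offers, it may help to see the whole cycle in miniature. The following is a minimal plain-Python sketch, not PyTorch code: the one-parameter "network" `predict`, the helper names `mse_loss` and `loss_gradient`, the toy data, and the learning rate are all assumptions made up for illustration. It only shows the two roles described above: a loss function that turns outputs and targets into one number, and an update rule that uses the gradient to reduce that number.

```python
# Illustrative sketch only: a one-parameter model trained by gradient descent.
# All names and values here are hypothetical, chosen for the example.

def predict(w, x):
    # A deliberately tiny "network": output = w * x
    return w * x

def mse_loss(outputs, targets):
    # Loss function: accepts the network's outputs and the desired outputs,
    # returns a single number (the mean squared error)
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)

def loss_gradient(w, xs, targets):
    # Analytic derivative of the MSE loss with respect to the parameter w
    return sum(2 * (w * x - t) * x for x, t in zip(xs, targets)) / len(xs)

xs = [1.0, 2.0, 3.0]
targets = [2.0, 4.0, 6.0]   # produced by a "true" weight of 2.0
w = 0.0                     # initial parameter value
learning_rate = 0.05        # assumed step size

losses = []
for step in range(50):
    outputs = [predict(w, x) for x in xs]
    losses.append(mse_loss(outputs, targets))
    # Optimizer role: move the parameter against the gradient to lower the loss
    w -= learning_rate * loss_gradient(w, xs, targets)
```

In this toy setup the loss shrinks at every step and `w` approaches 2.0. With a real network we would not derive gradients by hand; computing them automatically and applying the update are exactly the jobs that the library's loss and optimizer components take over.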