In the previous chapter, we introduced the gradient descent technique as a way to speed up model fitting. As we saw with Linear Regression, a model can be fitted in two ways: in closed form or iteratively. The closed-form solution reaches the optimum in a single step (but that step is computationally complex and time-demanding); iterative algorithms instead approach the minimum step by step, performing only a few calculations per update, and can be stopped at any time.
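To make the contrast concrete, here is a minimal sketch of both approaches on a synthetic Linear Regression problem. The data, learning rate, and iteration count are illustrative assumptions, not values from the text; the point is only that the single expensive normal-equation solve and the many cheap gradient steps land on the same coefficients.

```python
import numpy as np

# Synthetic regression data (illustrative): 100 samples, bias column + 2 features.
rng = np.random.default_rng(0)
X = np.c_[np.ones(100), rng.normal(size=(100, 2))]
true_w = np.array([1.0, 2.0, -3.0])
y = X @ true_w + rng.normal(scale=0.1, size=100)

# Closed form: one expensive step, solving the normal equation X^T X w = X^T y.
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Iterative form: many cheap gradient descent updates; could stop at any time.
w_iter = np.zeros(3)
lr = 0.01  # assumed step size
for _ in range(5000):
    grad = X.T @ (X @ w_iter - y) / len(y)  # gradient of the mean squared error
    w_iter -= lr * grad

print(np.allclose(w_closed, w_iter, atol=1e-2))  # both reach the same solution
```

Note that the iterative path never touches a matrix inverse or factorization: each update is a couple of matrix-vector products, which is what makes it attractive at scale.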
Gradient descent is a very popular choice for fitting the Logistic Regression model; however, it shares its popularity with Newton's methods. Since gradient descent is the basis of iterative optimization, and we've already introduced it, we will focus on it in this section. Don't worry, there is no winner or single best algorithm: all of them can eventually reach the very same model, following different paths through the coefficient space.
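As a preview of where this section is headed, the gradient descent loop for Logistic Regression looks just like the linear case, with the sigmoid applied to the linear score. This is a sketch on assumed synthetic data, not the book's own example; the gradient of the log-loss used here, X^T(sigmoid(Xw) - y) / n, is the standard result we are about to derive.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic binary classification data (illustrative assumption).
rng = np.random.default_rng(1)
X = np.c_[np.ones(200), rng.normal(size=(200, 2))]
true_w = np.array([0.5, 2.0, -2.0])
y = (sigmoid(X @ true_w) > rng.uniform(size=200)).astype(float)

# Gradient descent on the log-loss: w <- w - lr * X^T (sigmoid(Xw) - y) / n
w = np.zeros(3)
lr = 0.1  # assumed step size
for _ in range(3000):
    grad = X.T @ (sigmoid(X @ w) - y) / len(y)
    w -= lr * grad

accuracy = np.mean((sigmoid(X @ w) > 0.5) == (y == 1))
```

Newton's method would replace the fixed step `lr * grad` with a step scaled by the inverse Hessian, typically converging in far fewer (but more expensive) iterations.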
First, we should compute the derivative of the loss function. Let's make it a bit...