In this section, we will formally define what machine learning is and, specifically, what supervised machine learning is.

In the early days of AI, everything was a rules engine. The programmer wrote the function and the rules, and the computer simply followed them. Modern-day AI is more in line with machine learning, which teaches a computer to write its own functions. Some may contest that oversimplification of the concept, but, at its core, this is largely what machine learning is all about.

We're going to look at a quick example of what machine learning is and what it is not. Here, we're using scikit-learn's `datasets` submodule to create two objects, `X` and `y`. `X` holds the variables, also known as covariates or features, along its column axis. `y` is a vector with the same number of values as there are rows in `X`. In this case, `y` is a class label. For the sake of an example, `y` here could be a binary label corresponding to a real-world occurrence, such as the malignancy of a tumor. `X` is then a matrix of attributes that describe `y`. One feature could be the diameter of the tumor, and another could indicate its density. The preceding explanation can be seen in the following code:

    import numpy as np
    from sklearn.datasets import make_classification

    # Seed the random state so the generated data is reproducible
    rs = np.random.RandomState(42)
    X, y = make_classification(n_samples=10, random_state=rs)
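To make the roles of the two objects concrete, the following minimal sketch inspects their shapes. It assumes scikit-learn's defaults for `make_classification` (20 features and two classes), which the call above does not override.

```python
import numpy as np
from sklearn.datasets import make_classification

rs = np.random.RandomState(42)
X, y = make_classification(n_samples=10, random_state=rs)

# X has one row per sample and one column per feature
n_rows, n_features = X.shape

# y holds exactly one label per row of X
assert y.shape == (n_rows,)

print(X.shape)       # (10, 20) under the default n_features=20
print(np.unique(y))  # the distinct class labels present in y
```

Each row of `X` pairs with the entry of `y` at the same index, which is what lets a model relate attributes to labels later on.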

A rules engine, by our definition, is simply business logic. It can be as simple or as complex as you need it to be, but the programmer makes the rules. In this function, we're going to evaluate our `X` matrix by returning `1`, or `true`, where the sums over the rows are greater than `0`. Even though there's some math involved here, this is still a rules engine, because we, the programmers, defined the rule. We could theoretically get into a gray area, where the rule itself was discovered via machine learning. But, for the sake of argument, let's say that the head surgeon arbitrarily picks `0` as our threshold, and anything above that is deemed cancerous:

    def make_life_altering_decision(X):
        """Determine whether something big happens"""
        # Rule chosen by the programmer: flag rows whose sum exceeds 0
        row_sums = X.sum(axis=1)
        return (row_sums > 0).astype(int)

    make_life_altering_decision(X)

The output of the preceding code snippet is as follows:

array([0, 1, 0, 0, 1, 1, 1, 0, 1, 0])
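One way to see that this rule is a human choice rather than something learned is to expose the threshold as an argument. The sketch below is illustrative only: `rule_with_threshold` and the alternative threshold of `2.0` are hypothetical and do not come from the original text.

```python
import numpy as np
from sklearn.datasets import make_classification

rs = np.random.RandomState(42)
X, y = make_classification(n_samples=10, random_state=rs)

def rule_with_threshold(X, threshold=0.0):
    """Hypothetical helper: flag rows whose sum exceeds a hand-picked threshold."""
    return (X.sum(axis=1) > threshold).astype(int)

# The labels y are never consulted: the rule depends only on the chosen number
default_flags = rule_with_threshold(X)
stricter_flags = rule_with_threshold(X, threshold=2.0)

print(default_flags)
print(stricter_flags)

# Fraction of rows on which the hand-written rule happens to agree with y
print((default_flags == y).mean())
```

Whatever agreement the rule has with `y` is incidental, because nothing in the function ever looked at the labels.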

Now, as mentioned before, our rules engine can be as simple or as complex as we want it to be. Here, we're not only interested in `row_sums`; we have several criteria to meet in order to deem something cancerous. The minimum value in the row must be less than `-1.5`, in addition to one or more of the following three criteria:

- The row sum is greater than or equal to `0`
- The row sum is evenly divisible by `0.5`
- The maximum value of the row is greater than `1.5`

So, even though our math is a little more complex here, we're still just building a rules engine:

    def make_more_complex_life_altering_decision(X):
        """Make a more complicated decision about something big"""
        row_sums = X.sum(axis=1)
        # The row minimum must be below -1.5, and at least one of the
        # three remaining criteria must also hold
        return ((X.min(axis=1) < -1.5) &
                ((row_sums >= 0.) |
                 (row_sums % 0.5 == 0) |
                 (X.max(axis=1) > 1.5))).astype(int)

    make_more_complex_life_altering_decision(X)

The output of the preceding code is as follows:

array([0, 1, 1, 1, 1, 1, 0, 1, 1, 0])
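The compound rule can be easier to audit when each criterion is given its own name. The following sketch mirrors the comparisons used in the function above. The variable names are illustrative, not from the original. Note that the divisibility check relies on exact floating-point equality, which continuous data will rarely satisfy.

```python
import numpy as np
from sklearn.datasets import make_classification

rs = np.random.RandomState(42)
X, y = make_classification(n_samples=10, random_state=rs)

row_sums = X.sum(axis=1)

# The mandatory criterion: the smallest value in the row is below -1.5
has_low_minimum = X.min(axis=1) < -1.5

# The three optional criteria, at least one of which must hold
nonnegative_sum = row_sums >= 0.
divisible_sum = row_sums % 0.5 == 0
has_high_maximum = X.max(axis=1) > 1.5

any_optional = nonnegative_sum | divisible_sum | has_high_maximum
decision = (has_low_minimum & any_optional).astype(int)
print(decision)
```

Splitting the masks out changes nothing about the logic; it only makes clear that every comparison and constant was chosen by a person.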

Now, let's say that our surgeon realizes they're not the math or programming whiz that they thought they were, so they hire programmers to build them a machine learning model. The model itself is a function that discovers parameters that complement a decision function, which is essentially the function the machine itself learned. We'll discuss parameters in Chapter 2, *Implementing Parametric Models*, which covers parametric models. What's happening behind the scenes when we invoke the `fit` method is that the model learns the characteristics and patterns of the data, and how the `X` matrix describes the `y` vector. Then, when we call the `predict` method, it applies its learned decision function to the input data to make an educated guess:

    from sklearn.linear_model import LogisticRegression

    def learn_life_lesson(X, y):
        """Learn a lesson and apply it in a future situation"""
        # fit learns the parameters; the returned function applies them
        model = LogisticRegression().fit(X, y)
        return (lambda X: model.predict(X))

    educated_decision = learn_life_lesson(X, y)(X)
    educated_decision

The output of the preceding code is as follows:

array([1, 1, 0, 0, 0, 1, 1, 0, 1, 0])
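To see what "discovering parameters" means in practice, the sketch below inspects a fitted `LogisticRegression` and rebuilds its predictions by hand. It uses the estimator's `coef_`, `intercept_`, `classes_`, and `decision_function` attributes. The manual reconstruction is an illustration for the binary case, not how the original text implements anything.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rs = np.random.RandomState(42)
X, y = make_classification(n_samples=10, random_state=rs)

model = LogisticRegression().fit(X, y)

# Parameters discovered by fit: one weight per feature, plus an intercept
weights = model.coef_         # shape (1, n_features) for a binary problem
intercept = model.intercept_  # shape (1,)

# The learned decision function is a weighted sum of the features
scores = X @ weights.ravel() + intercept[0]
assert np.allclose(scores, model.decision_function(X))

# predict thresholds that score at 0, much like the hand-written rule did
manual_predictions = model.classes_[(scores > 0).astype(int)]
assert np.array_equal(manual_predictions, model.predict(X))
```

The structure resembles the surgeon's row-sum rule, with one difference: the weights and intercept were estimated from `X` and `y` rather than picked by hand.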

So, now we're at a point where we need to define specifically what supervised learning is. Supervised learning is precisely the example we just described. Given our matrix of examples, *X*, and a vector of corresponding labels, *y*, supervised learning learns a function, *f*, that approximates the value of *y*, written *ŷ*:

$$\hat{y} = f(X)$$

There are other forms of machine learning that are not supervised, known as **unsupervised machine learning**. These do not have labels and are more geared toward pattern recognition tasks. So, what makes something supervised is the presence of labeled data.
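The difference shows up directly in the call to `fit`. As a hedged illustration, the sketch below fits scikit-learn's `KMeans`, chosen here only as a familiar unsupervised estimator that the original text does not mention, on the feature matrix alone. The choice of two clusters is an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification

rs = np.random.RandomState(42)
X, y = make_classification(n_samples=10, random_state=rs)

# No y is passed to fit: the estimator sees only the feature matrix
clusterer = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Cluster assignments are arbitrary group IDs, not the class labels in y
print(clusterer.labels_)
```

Because no labels were supplied, there is nothing for the cluster IDs to approximate; they only describe structure found in `X`.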

Going back to our previous example, when we invoke the `fit` method, we learn our new decision function and then, when we call `predict`, we're approximating the new `y` values. So, the output is the approximation of `y` that we just looked at.

Supervised learning learns a function from labeled samples that approximates future `y` values. At this point, you should feel comfortable explaining the abstract concept, that is, the high-level idea of what supervised machine learning is.
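The word "future" can be made concrete by holding some rows back from `fit` and predicting them afterward. The following is a minimal sketch under stated assumptions: it uses 200 samples rather than the 10 used earlier, so that a split is meaningful, and the 25% hold-out size is arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Assumption: a larger sample than the 10 rows used earlier
X, y = make_classification(n_samples=200, random_state=42)

# Hold out a quarter of the rows to play the role of "future" data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Learn the function from the labeled training samples only
model = LogisticRegression().fit(X_train, y_train)

# Approximate the labels of rows the model has never seen
y_hat = model.predict(X_test)
print((y_hat == y_test).mean())  # fraction of held-out labels recovered
```

Comparing `y_hat` against `y_test` is only possible here because the held-out labels are known; for genuinely new data, the prediction is all we have.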