Let's start with the simple and beautiful nearest neighbor method from the previous chapter. Although it is not as advanced as other methods, it is very powerful: because it is not model-based, it can fit nearly any data. But this beauty comes with a clear disadvantage, which we will find out very soon.
This time, we won't implement it ourselves, but rather take it from the sklearn toolkit. There, the classifier resides in sklearn.neighbors. Let's start with a simple 2-Nearest Neighbor classifier:
>>> from sklearn import neighbors
>>> knn = neighbors.KNeighborsClassifier(n_neighbors=2)
>>> print(knn)
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski', n_neighbors=2, p=2, weights='uniform')
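The exact text printed for the classifier can differ between sklearn versions (newer releases tend to show only the parameters that differ from their defaults), so it may be more robust to inspect the settings through get_params(). A minimal sketch, assuming a reasonably recent sklearn installation; the variable name params is ours:

```python
from sklearn import neighbors

knn = neighbors.KNeighborsClassifier(n_neighbors=2)

# get_params() returns the constructor arguments as a plain dict,
# independent of how the estimator happens to be printed.
params = knn.get_params()
for name in ("n_neighbors", "weights", "metric", "p"):
    print(name, "=", params[name])
```

With metric='minkowski' and p=2, the distance used is the ordinary Euclidean distance, and weights='uniform' means every neighbor gets an equal vote.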
It provides the same interface as all other estimators in sklearn: we train it using fit(), after which we can predict the class of new data instances using predict():
>>> knn.fit([[1],[2],[3],[4],[5],[6...
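As a self-contained illustration of the fit() and predict() workflow, here is a small sketch on made-up toy data (two well-separated clusters on a line). The data and the names X_train, y_train, low, high, and proba are our own assumptions for illustration and are not taken from the listing above:

```python
from sklearn import neighbors

# Made-up toy data: one feature per sample, two clearly separated groups.
X_train = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
y_train = [0, 0, 0, 1, 1, 1]

knn = neighbors.KNeighborsClassifier(n_neighbors=2)
knn.fit(X_train, y_train)

# predict() expects a 2D array-like: one row per sample to classify.
low = knn.predict([[1.5]])    # the two nearest neighbors are 1.0 and 2.0
high = knn.predict([[10.5]])  # the two nearest neighbors are 10.0 and 11.0
print(low, high)

# With uniform weights, predict_proba() reports the fraction of the
# neighbors that voted for each class.
proba = knn.predict_proba([[1.5]])
print(proba)
```

Note that with an even number of neighbors, such as two, the votes can be split evenly for points between the clusters; how such ties are resolved is an implementation detail, so the sketch only queries points where both neighbors agree.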