In machine learning, classification is the problem of identifying the class (type) of a given input. Formally, the problem can be stated as follows. We have a set of classes (types) represented by:
C={t(1), t(2),…, t(m)}.
We have a set P of objects, each of which is described by a vector. Every object in P belongs to exactly one class from C. From P we are given n objects (that is, their representative vectors) p(1), p(2), …, p(n), where each p(i) is a d-dimensional vector, and for each p(i) its class c(i) is also given. These n vectors together with their classes, (p(i), c(i)), are called the training data. We are also given a distance measure d(p1, p2) that gives the distance between two vectors of P. Now we are presented with an arbitrary point p from P whose class is not known. The problem is to find the class of p using the given training data and the distance measure.
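As a rough illustration of this setup, the training data can be held as a list of (vector, class) pairs alongside a distance function. This is only a sketch: the sample points, the class labels "A" and "B", the name distance, and the choice of Euclidean distance are all assumptions made for the example and are not prescribed by the problem statement.

```python
import math

# Hypothetical training data: n = 3 pairs (p(i), c(i)) with 2-dimensional
# vectors and two made-up classes "A" and "B".
training_data = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((5.0, 5.0), "B"),
]


def distance(p1, p2):
    """Euclidean distance between two vectors of equal length.

    Assumption: any other distance measure on P could be used instead.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)))


# A point whose class is not known (also made up for illustration).
p = (2.0, 2.0)

# Distance from p to every training vector, paired with that vector's class.
distances = [(distance(p, q), c) for q, c in training_data]
```

Any function that takes two vectors and returns a non-negative number could stand in for distance here; the list distances holds exactly the information (training data plus distance) that the problem statement allows a classifier to use.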
To find the class of this point we use the following algorithm, called the k-nearest neighbors (k-NN) algorithm.
Fix a positive...