Data clustering
A cluster is a group of data elements that are close or similar to one another. For example, a group of people can be divided into clusters according to age, height, sex, social status, and so on. Clustering helps us understand the input data: if we know the properties of one element of a cluster, the other elements are likely to share those properties. Finding clusters is an unsupervised learning technique, meaning it needs no teacher (no labeled training data). It can be based on two functions: the distance function, which gives the distance between elements (the closer the elements are to each other, the greater the probability that they belong to the same cluster), and the dissimilarity function, which returns the degree of dissimilarity between elements.
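To make the idea of a distance function concrete, here is a minimal sketch using two built-in Wolfram Language distance functions. The points p and q are illustrative assumptions, not data from the text.

```wolfram
(* Two illustrative points in the plane (assumed sample values) *)
p = {0, 0};
q = {3, 4};

(* Straight-line (Euclidean) distance: Sqrt[3^2 + 4^2] *)
EuclideanDistance[p, q]

(* City-block (Manhattan) distance: Abs[3] + Abs[4] *)
ManhattanDistance[p, q]
```

The first call evaluates to 5 and the second to 7, so the same pair of points can look closer or farther apart depending on which distance function is chosen.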
To cluster data, we'll use the FindClusters function. First, let's consider its application in simple examples:
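A simple call might look like the following sketch. The sample list and the variable names are assumptions chosen for illustration and are not the book's original example.

```wolfram
(* Illustrative one-dimensional data with two visibly separated groups *)
data = {1, 2, 3, 10, 11, 12};

(* Let FindClusters decide how many clusters to form *)
clusters = FindClusters[data]

(* Pass a second argument to request a specific number of clusters *)
twoClusters = FindClusters[data, 2]
```

Each result is a list of sublists, one per cluster, and every input element appears in exactly one of them.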
By default, the FindClusters function finds clusters on the basis of the...