-
Book Overview & Buying
-
Table Of Contents
Machine Learning Techniques for Text
By :
The basic idea behind the density-based spatial clustering of applications with noise (DBSCAN) algorithm is that clusters are regions of high point density, separated from other clusters by low point density regions. The algorithm takes each point in the dataset to identify the high-density regions and checks whether its neighborhood contains a minimum number of points. Unlike K-means, DBSCAN does not require manually specifying the number of clusters; it is more immune to outliers and more appropriate when the clusters have complex shapes.
To employ the algorithm, we need to set two hyperparameters:
All the data points with less than minPts but more than one point in their neighborhood are called border points. Finally, data points without...