Unsupervised learning using outlier detection
Finding outliers, or anomalies, in data streams is an emerging field in machine learning. Researchers have explored this area less than classification and clustering problems. However, several interesting ideas extend the concepts of clustering to find outliers in data streams. We present some of the research that has proved effective in stream outlier detection.
Partition-based clustering for outlier detection
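The central idea here is to run an online partition-based clustering algorithm and then label whole clusters as outliers, based on a ranking of either cluster size or inter-cluster distance. Typically, clusters that are very small or that lie far from all other clusters are the candidates.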
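The idea of ranking clusters by size can be illustrated with a minimal sketch. It assumes a simple sequential k-Means that updates one centroid per arriving point; the class OnlineKMeans, the function outlier_clusters_by_size, and the min_fraction threshold are hypothetical names chosen for illustration and are not taken from any published algorithm.

```python
import math


class OnlineKMeans:
    """Sequential k-Means: processes one point at a time in a single pass."""

    def __init__(self, k):
        self.k = k
        self.centroids = []  # one list of floats per cluster
        self.counts = []     # number of points absorbed by each cluster

    def update(self, x):
        """Assign x to its nearest centroid and move that centroid toward x."""
        x = [float(v) for v in x]
        if len(self.centroids) < self.k:
            # Assumption: the first k points of the stream seed the centroids.
            self.centroids.append(x)
            self.counts.append(1)
            return len(self.centroids) - 1
        j = min(range(self.k), key=lambda i: math.dist(x, self.centroids[i]))
        self.counts[j] += 1
        step = 1.0 / self.counts[j]  # running-mean update
        self.centroids[j] = [c + step * (v - c)
                             for c, v in zip(self.centroids[j], x)]
        return j


def outlier_clusters_by_size(model, min_fraction=0.05):
    """Return indices of clusters holding less than min_fraction of all points."""
    total = sum(model.counts)
    return {j for j, n in enumerate(model.counts) if n / total < min_fraction}


# Toy stream: two dense groups and one isolated point at (100, 100).
stream = [(0, 0), (10, 10), (100, 100)]
stream += [(0.1 * i, 0.1 * i) for i in range(1, 11)]
stream += [(10 - 0.1 * i, 10 + 0.1 * i) for i in range(1, 11)]

model = OnlineKMeans(k=3)
for point in stream:
    model.update(point)

flagged = outlier_clusters_by_size(model)
print(model.counts, flagged)
```

In this toy stream the isolated point ends up alone in its own cluster, which falls below the size threshold and is flagged. A ranking by inter-cluster distance would follow the same pattern, scoring each centroid by its distance to the nearest other centroid instead of by its count.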
Here we present one such algorithm proposed by Koupaie et al., using incremental k-Means.
Inputs and outputs
Only numeric features are used, as in most k-Means algorithms. The number of clusters k and the number of windows of outliers n, on which offline clustering...