Unsupervised learning
AWS provides several unsupervised learning algorithms for the following tasks:
- Clustering:
  - K-means algorithm
- Dimension reduction:
  - Principal Component Analysis (PCA)
- Pattern recognition:
  - IP Insights
- Anomaly detection:
  - Random Cut Forest Algorithm (RCF)
Let's start with clustering and look at how K-means, the most popular clustering algorithm, works.
Clustering
Clustering algorithms are very popular in data science. They aim to identify groups of similar observations in a given dataset, and these groups are called clusters. Clustering algorithms belong to the field of unsupervised learning, which means that they don't need a label or response variable to be trained.
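To make the "no label needed" point concrete, here is a minimal sketch of K-means written in plain Python. It is only an illustration of the general idea (alternating assignment and centroid-update steps), not the implementation behind the AWS built-in algorithm. The `kmeans` function, its parameters, and the toy `points` data are all assumptions made up for this example.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Illustrative K-means on 2-D points; returns (centroids, assignments)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        for i, (x, y) in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, assignments


# Hypothetical toy data: two visibly separate groups and no label column.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (8.0, 8.1), (8.2, 7.9), (7.9, 8.3)]
centroids, assignments = kmeans(points, k=2)
print(assignments)  # one cluster index per point
```

Notice that the function only ever receives the feature values. The result is a cluster index per point, and which index a group receives depends on the random starting centroids.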
Not needing labels is a real advantage because labeled data is often scarce. However, it comes with some limitations. The main one is that clustering algorithms provide clusters for you, but not the meaning of each cluster. Thus, someone, as a...