The K-means and Mean Shift clustering algorithms place observations into distinct clusters: an observation can belong to one and only one cluster of similar samples. While this works well for datasets that separate cleanly, when some of the data overlaps it can be hard to justify placing an observation into only one bucket. After all, our world is not just black and white; our eyes can register millions of colors.
The c-means clustering model allows every observation to be a member of more than one cluster, and this membership is weighted: for each observation, the weights across all the clusters must sum to 1.
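To make the weighting concrete, here is a minimal sketch of how membership weights for a single observation could be computed from its distances to a set of cluster centers. It assumes the commonly used fuzzy c-means membership formula with a fuzzifier `m` (set to 2 here); the function name `fuzzy_memberships`, the example centers, and the example point are illustrative and are not part of the Scikit-Fuzzy API.

```python
import math


def fuzzy_memberships(point, centers, m=2.0):
    """Return one membership weight per cluster center for a single point.

    Assumes the usual fuzzy c-means rule:
    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with m > 1.
    """
    dists = [math.dist(point, c) for c in centers]

    # A point lying exactly on one or more centers belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]

    power = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** power for d_k in dists)
        for d_i in dists
    ]


# Hypothetical example: two centers on the x-axis and a point between them.
centers = [(0.0, 0.0), (4.0, 0.0)]
weights = fuzzy_memberships((1.0, 0.0), centers)
print(weights)       # the closer center receives the larger weight
print(sum(weights))  # the weights sum to 1, up to floating-point rounding
```

With these values the point is three times closer to the first center, so it receives roughly 0.9 of the membership there and roughly 0.1 in the second cluster.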
To execute this recipe, you will need pandas and the Scikit-Fuzzy module. The Scikit-Fuzzy module normally does not come preinstalled with Anaconda, so you will need to install it yourself.
In order to do so, clone the Scikit-Fuzzy repository to a local folder:
git clone https://github.com/scikit-fuzzy/scikit-fuzzy.git
On finishing...