Using KMeans for Image Clustering
KMeans can be useful for other tasks related to finding clusters.
Introduction
Clustering is an unsupervised machine learning technique. That means that your dataset does not carry a label, a target variable to be associated with the patterns found by the explanatory variables.
Unsupervised learning is all about finding patterns that look alike and putting them in the same bucket.
One of the most used unsupervised learning algorithms is KMeans, used for clustering. Using it, one can cluster regular data, but can also perform other tasks such as clustering colors in images and perform dimensionality reduction for further classification.
Let’s see more about that.
KMeans
KMeans is an algorithm of clustering based on centroids. In other words, when it is performing the job, it will base all the decisions on a central point for each cluster and the distances between each data point and the center of the cluster. The smaller this distance between the observation and the cluster center, the higher is the probability that it part of that cluster.
The algorithm calculates Euclidean distances between points and the clusters will have…