Using KMeans for Image Clustering

Gustavo R Santos
4 min read · Oct 27, 2022

KMeans can be useful for other tasks related to finding clusters.

Photo by Pawel Czerwinski on Unsplash

Introduction

Clustering is an unsupervised machine learning technique. That means your dataset does not carry a label: there is no target variable to associate with the patterns found in the explanatory variables.

Unsupervised learning is all about finding patterns that look alike and putting them in the same bucket.

One of the most widely used unsupervised learning algorithms is KMeans, a clustering algorithm. You can use it to cluster regular tabular data, but also for other tasks, such as clustering the colors in an image or reducing dimensionality ahead of a classification step.
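As a rough illustration of the color-clustering idea, here is a minimal sketch using scikit-learn's `KMeans`. The synthetic two-color image, the choice of `n_clusters=2`, and the variable names are assumptions made for this example and are not taken from the article; with a real photo you would load the pixels from a file instead.

```python
import numpy as np
from sklearn.cluster import KMeans

# Assumption: a synthetic 20x20 RGB image stands in for a real photo.
# Left half is reddish, right half is bluish, with a little noise added.
rng = np.random.default_rng(0)
image = np.zeros((20, 20, 3), dtype=np.float64)
image[:, :10] = [200, 30, 30]
image[:, 10:] = [30, 30, 200]
image = np.clip(image + rng.normal(0, 5, size=image.shape), 0, 255)

# Each pixel becomes one observation with three features (R, G, B).
pixels = image.reshape(-1, 3)

# Group the pixels into 2 color clusters (n_clusters is an assumed value).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pixels)

# The cluster centers act as a small color palette.
palette = kmeans.cluster_centers_

# Replace every pixel with the center of its cluster to rebuild the image.
quantized = palette[kmeans.labels_].reshape(image.shape)

print(palette.round())
```

The rebuilt image uses only as many colors as there are clusters, which is why the same idea is often described as color quantization.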

Let’s see more about that.

KMeans

KMeans is a centroid-based clustering algorithm. In other words, it bases every decision on a central point for each cluster and on the distance between each data point and that center. The smaller the distance between an observation and a cluster center, the stronger the case that it belongs to that cluster, so each observation is assigned to the cluster whose center is closest.
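The assignment step described above can be sketched with plain NumPy. This only shows how points are matched to their nearest center using Euclidean distance, not the full algorithm; the function name `assign_to_nearest_centroid` and the sample points and centroids are hypothetical, chosen for illustration.

```python
import numpy as np

def assign_to_nearest_centroid(points, centroids):
    """Return the index of the closest centroid for each point.

    Hypothetical helper for illustration; it covers only the
    assignment step, not the recomputation of the centroids.
    """
    # Broadcasting yields pairwise differences with shape
    # (n_points, n_centroids, n_features); the norm over the last
    # axis gives the Euclidean distance of every point to every centroid.
    distances = np.linalg.norm(
        points[:, None, :] - centroids[None, :, :], axis=2
    )
    # The smallest distance in each row decides the cluster.
    return distances.argmin(axis=1), distances

# Assumed sample data: two points near (1, 1.5) and two near (8.5, 9).
points = np.array([[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.5]])
centroids = np.array([[1.0, 1.5], [8.5, 9.0]])

labels, distances = assign_to_nearest_centroid(points, centroids)
print(labels)  # [0 0 1 1]
```

In a full KMeans run this step alternates with recomputing each centroid as the mean of its assigned points, until the assignments stop changing.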

The algorithm calculates Euclidean distances between points and the clusters will have…


Written by Gustavo R Santos

Data Scientist | I solve business challenges through the power of data. | Visit my site: https://gustavorsantos.me
