Clustering Plotted Data by Image Segmentation

April 26, 2023
Clustering is a popular approach to detecting patterns in unlabeled data. Existing clustering methods typically treat samples in a dataset as points in a metric space and compute distances to group similar points together. In this paper, we present a different way of clustering points in two-dimensional space, inspired by how humans cluster data: training neural networks to perform instance segmentation on plotted data. Our approach, Visual Clustering, has several advantages over traditional clustering algorithms: it is much faster than most existing clustering algorithms (making it suitable for very large datasets), it agrees strongly with human intuition about clusters, and it is hyperparameter-free by default (although additional steps with hyperparameters can be introduced for finer control of the algorithm). We describe the method and compare it to ten other clustering methods on synthetic data to illustrate its advantages and disadvantages. We then demonstrate how our approach can be extended to higher-dimensional data and illustrate its performance on real-world data. Our implementation of Visual Clustering is publicly available as a Python package that can be installed and used on any dataset in a few lines of code: https://github.com/tareknaous/visual-clustering. A demo on synthetic datasets is available at https://huggingface.co/spaces/CVPR/visual-clustering.
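
To make the "plot, then segment" idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation and does not use the visual-clustering package's API. The points are rasterized onto a binary grid (the "plot"), and a simple 8-connected flood fill stands in for the trained instance-segmentation network, which it only crudely approximates. The function names and the `size` grid-resolution parameter are assumptions introduced for illustration.

```python
from collections import deque


def rasterize(points, size=32):
    """Map 2D points onto a size x size binary grid (a crude 'plot')."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    # Fall back to 1.0 so a degenerate axis does not divide by zero.
    x_span = (max(xs) - x_min) or 1.0
    y_span = (max(ys) - y_min) or 1.0
    grid = [[0] * size for _ in range(size)]
    cells = []
    for x, y in points:
        col = min(int((x - x_min) / x_span * size), size - 1)
        row = min(int((y - y_min) / y_span * size), size - 1)
        grid[row][col] = 1
        cells.append((row, col))
    return grid, cells


def label_components(grid):
    """Assumed stand-in for the segmentation network: 8-connected flood fill."""
    size = len(grid)
    labels = [[-1] * size for _ in range(size)]
    next_label = 0
    for r in range(size):
        for c in range(size):
            if grid[r][c] == 0 or labels[r][c] != -1:
                continue
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        inside = 0 <= nr < size and 0 <= nc < size
                        if inside and grid[nr][nc] == 1 and labels[nr][nc] == -1:
                            labels[nr][nc] = next_label
                            queue.append((nr, nc))
            next_label += 1
    return labels


def cluster_by_plotting(points, size=32):
    """Hypothetical helper: give each point the label of its grid cell's segment."""
    grid, cells = rasterize(points, size)
    labels = label_components(grid)
    return [labels[r][c] for r, c in cells]


points = [(0, 0), (0.3, 0.2), (0.5, 0.5), (10, 10), (9.7, 9.8), (9.5, 9.5)]
print(cluster_by_plotting(points))
```

Once the grid is built, the labeling cost depends on the image size rather than on pairwise distances between points, which gives some intuition for the speed claim above. Unlike this sketch, a learned segmentation model can group points that are not strictly adjacent on the grid.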

Tarek Naous, Srinjay Sarkar, Abubakar Abid, James Zou

CVPR 2022
