Clustering the unlabeled data using a modified cat swarm optimization
This paper presents a modified version of the Cat Swarm Optimization (CSO) algorithm aimed at addressing the limitations of traditional clustering methods in handling complex, high-dimensional datasets. The primary objective of this research is to improve clustering accuracy and stability by elimina...
Published in: | Journal of Applied Data Sciences |
---|---|
Main Author: | Dewi D.A.; Kurniawan T.B.; Mohd Zakizakaria; Armoogum S. |
Format: | Article |
Language: | English |
Published: | Bright Publisher, 2024 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85210851113&doi=10.47738%2fjads.v5i3.349&partnerID=40&md5=b9bbaca40cab8a163ad06306ff30229f |
id |
2-s2.0-85210851113 |
---|---|
author |
Dewi D.A.; Kurniawan T.B.; Mohd Zakizakaria; Armoogum S. |
title |
Clustering the unlabeled data using a modified cat swarm optimization |
publishDate |
2024 |
container_title |
Journal of Applied Data Sciences |
container_volume |
5 |
container_issue |
3 |
doi_str_mv |
10.47738/jads.v5i3.349 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85210851113&doi=10.47738%2fjads.v5i3.349&partnerID=40&md5=b9bbaca40cab8a163ad06306ff30229f |
description |
This paper presents a modified version of the Cat Swarm Optimization (CSO) algorithm aimed at addressing the limitations of traditional clustering methods in handling complex, high-dimensional datasets. The primary objective of this research is to improve clustering accuracy and stability by eliminating the mixture ratio (MR), setting the counts of dimensions to change (CDC) to 100%, and incorporating a new search equation in the tracing mode of the CSO algorithm. To evaluate the performance of the modified algorithm, five classic datasets from the UCI Machine Learning Repository (Iris, Cancer, Glass, Wine, and Contraceptive Method Choice (CMC)) were used. The proposed algorithm was compared against K-Means and the original CSO. Intra-cluster distance, standard deviation, and F-measure were used to assess the quality of clustering. The results showed that the modified CSO consistently outperformed the competing algorithms. For example, on the Iris dataset, the modified CSO achieved a best intra-cluster distance of 96.78 and an F-measure of 0.786, compared to 97.12 and 0.781 for K-Means. Similarly, on the Wine dataset, the modified CSO reached a best intra-cluster distance of 16399, lower (better) than the 16768 recorded by K-Means. In conclusion, the modifications introduced to the CSO algorithm significantly enhance its clustering performance across diverse datasets, producing tighter and more accurate clusters with improved stability. These findings suggest that the modified CSO is a robust and effective tool for data clustering tasks, particularly in high-dimensional spaces. Future work will focus on dynamic parameter tuning and testing the scalability of the algorithm on larger and more complex datasets. © Authors retain all copyrights. |
publisher |
Bright Publisher |
issn |
27236471 |
language |
English |
format |
Article |
accesstype |
|
record_format |
scopus |
collection |
Scopus |
_version_ |
1820775433680977920 |
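
The description above names intra-cluster distance and F-measure as the evaluation metrics but does not give their formulas. The sketch below shows one common way these metrics are defined in the clustering literature. It is an assumption, not the authors' implementation: intra-cluster distance is taken as the sum of Euclidean distances from each point to its nearest centroid, and F-measure as the class-size-weighted best F-score over clusters. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def assign_to_nearest(points, centroids):
    """Return, for each point, the index of its nearest centroid (Euclidean)."""
    return [
        min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        for p in points
    ]


def intra_cluster_distance(points, centroids):
    """Assumed definition: sum of distances from each point to its nearest centroid.

    Lower values mean tighter clusters.
    """
    return sum(min(math.dist(p, c) for c in centroids) for p in points)


def clustering_f_measure(true_labels, cluster_ids):
    """Assumed definition: for each true class, take the best F-score over all
    clusters, then average those scores weighted by class size."""
    n = len(true_labels)
    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_ids)
    overlap = Counter(zip(true_labels, cluster_ids))

    score = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = overlap[(cls, clu)]
            if n_ij == 0:
                continue
            precision = n_ij / n_j
            recall = n_ij / n_i
            best = max(best, 2 * precision * recall / (precision + recall))
        score += (n_i / n) * best
    return score


# Toy data (hypothetical): two well-separated groups of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids = [(0.0, 1.0), (10.0, 1.0)]
labels = ["a", "a", "b", "b"]

cluster_ids = assign_to_nearest(points, centroids)
print(intra_cluster_distance(points, centroids))
print(clustering_f_measure(labels, cluster_ids))
```

Under these assumed definitions, a lower intra-cluster distance and a higher F-measure both indicate better clustering, which matches the direction of the Iris and Wine comparisons reported in the description.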