An optimal and stable algorithm for clustering numerical data

In the conventional k-means framework, seeding is the first step toward optimization before the objects are clustered. In random seeding, two main issues arise: the clustering results may be less than optimal and different clustering results may be obtained for every run. In real-world applications,...
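The two issues named above (sub-optimal results and run-to-run variation under random seeding) can be illustrated with a minimal sketch. It is not the algorithm described in this record: the toy data, the plain Lloyd-style `kmeans` helper, and the `fixed_seeds` rule (points picked by rank of distance from the origin) are all hypothetical stand-ins chosen only to contrast random seeding with one deterministic seeding rule.

```python
import random
import statistics


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, seeds, iters=50):
    """Plain Lloyd iterations from the given seeds; returns the final
    sum of squared errors (lower is better)."""
    centers = [tuple(s) for s in seeds]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            # Assign each point to its nearest center.
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[j].append(p)
        # Recompute each center as the mean of its members;
        # an empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def fixed_seeds(points, k):
    """Hypothetical deterministic seeding: rank points by distance from
    the origin and take k evenly spaced ranks (assumes k >= 2)."""
    ranked = sorted(points, key=lambda p: sq_dist(p, (0.0,) * len(p)))
    return [ranked[i * (len(ranked) - 1) // (k - 1)] for i in range(k)]


# Toy data (an assumption): three Gaussian blobs in two dimensions.
data_rng = random.Random(0)
points = [
    (cx + data_rng.gauss(0, 0.5), cy + data_rng.gauss(0, 0.5))
    for cx, cy in [(1, 1), (5, 5), (9, 1)]
    for _ in range(20)
]

k, runs = 3, 100
seed_rng = random.Random(1)
# Random seeding: a fresh random draw of k data points for every run.
random_scores = [kmeans(points, seed_rng.sample(points, k)) for _ in range(runs)]
# Deterministic seeding: the same seeds, hence the same result, every run.
fixed_scores = [kmeans(points, fixed_seeds(points, k)) for _ in range(runs)]

for name, scores in [("random", random_scores), ("fixed", fixed_scores)]:
    print(name, min(scores), statistics.mean(scores), max(scores),
          statistics.pstdev(scores))
```

Because the deterministic rule feeds identical seeds into an otherwise deterministic procedure, its minimum, mean, and maximum scores coincide across runs. Whether those scores are also close to optimal depends on the seeding rule itself, which this sketch does not address.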

Bibliographic Details
Published in:	Algorithms
Main Authors:	Seman A.; Sapawi A.M.
Format:	Article
Language:	English
Published:	MDPI AG 2021
Online Access:	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85109399419&doi=10.3390%2fa14070197&partnerID=40&md5=28e9e298aa7e11e0021cdaf07826ebad

id	2-s2.0-85109399419
author	Seman A.; Sapawi A.M.
title	An optimal and stable algorithm for clustering numerical data
publishDate	2021
container_title	Algorithms
container_volume	14
container_issue	7
doi_str_mv	10.3390/a14070197
url	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85109399419&doi=10.3390%2fa14070197&partnerID=40&md5=28e9e298aa7e11e0021cdaf07826ebad
description	In the conventional k-means framework, seeding is the first step toward optimization before the objects are clustered. Random seeding raises two main issues: the clustering results may be less than optimal, and different clustering results may be obtained for every run. In real-world applications, optimal and stable clustering is highly desirable. This report introduces a new clustering algorithm called the zero k-approximate modal haplotype (Zk-AMH) algorithm, which uses a simple and novel seeding mechanism known as zero-point multidimensional spaces. The Zk-AMH algorithm provides cluster optimality and stability, thereby resolving the aforementioned issues. Notably, the Zk-AMH algorithm yielded identical mean, maximum, and minimum scores over 100 runs, and the resulting zero standard deviation demonstrates its stability. Additionally, when the Zk-AMH algorithm was applied to eight datasets, it achieved the highest mean scores for four datasets, produced an approximately equal score for one dataset, and yielded marginally lower scores for the other three datasets. With its optimality and stability, the Zk-AMH algorithm could be a suitable alternative for developing future clustering tools. © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
publisher	MDPI AG
issn	19994893
language	English
format	Article
accesstype	All Open Access; Gold Open Access
record_format	scopus
collection	Scopus
_version_	1809677597535109120

Similar Items