A comparison study of clustering algorithms in grouping microarray data


Bibliographic Details
Published in: AIP Conference Proceedings
Main Authors: Zin S.H.H.M.; Moktar B.
Format: Conference paper
Language: English
Published: American Institute of Physics 2024
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85194167481&doi=10.1063%2f5.0208145&partnerID=40&md5=183e5ec3dbc96f1fb1585a5c363b49d4
Description
Summary: Clustering is the process of grouping related data items so that each group is distinct from the others. In microarray analysis, clustering makes it possible to find groups of genes that are co-expressed under various conditions. This study's primary goal is to compare the effectiveness of several clustering methods in grouping microarray data both with and without outliers. Six microarray datasets were analysed in the same way. The algorithms compared, all widely used by researchers, are agglomerative hierarchical clustering, K-means, Self-Organizing Maps (SOM) and Partitioning Around Medoids (PAM). Performance was assessed with internal validation measures (connectivity, Dunn index and silhouette width) and stability validation measures (Average Proportion of Non-overlap and Average Distance between Means). The findings show that different clustering techniques produce different numbers of clusters. According to the internal and stability criteria, agglomerative hierarchical clustering with two clusters produces the best results for datasets both with and without outliers. © 2024 Author(s).
ISSN: 0094243X
DOI: 10.1063/5.0208145
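
Two of the internal validation measures named in the summary, the Dunn index and silhouette width, can be illustrated with a short sketch. This is not the authors' code or data: the function names, the toy points and the two labelings are all hypothetical, Euclidean distance is assumed, and the functions expect at least two clusters with no duplicate points. Higher values of either measure indicate more compact, better separated clusters.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length points (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by the largest cluster diameter."""
    min_between = math.inf
    max_within = 0.0
    for i, j in combinations(range(len(points)), 2):
        d = euclidean(points[i], points[j])
        if labels[i] == labels[j]:
            max_within = max(max_within, d)
        else:
            min_between = min(min_between, d)
    return min_between / max_within


def silhouette_width(points, labels):
    """Mean silhouette over all points; a point alone in its cluster scores 0."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        # Mean distance from p to the other members of each cluster.
        mean_dist = {}
        for c in clusters:
            members = [q for j, q in enumerate(points) if labels[j] == c and j != i]
            if members:
                mean_dist[c] = sum(euclidean(p, q) for q in members) / len(members)
        own = labels[i]
        if own not in mean_dist:  # singleton cluster
            scores.append(0.0)
            continue
        a = mean_dist[own]                                      # cohesion
        b = min(v for c, v in mean_dist.items() if c != own)    # separation
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]   # matches the visible grouping
bad_labels = [0, 1, 0, 1, 0, 1]    # deliberately mixes the two groups

for name, labels in [("good", good_labels), ("bad", bad_labels)]:
    print(name, dunn_index(points, labels), silhouette_width(points, labels))
```

On this toy data the labeling that matches the visible grouping scores higher on both measures than the mixed labeling, which is the kind of comparison such measures support when ranking clustering results.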