A comparison study of clustering algorithms in grouping microarray data

Clustering is the process of grouping related data items so that each group is distinct from the others. In microarray data analysis, clustering makes it possible to find groups of genes that are co-expressed across various conditions. This study's primary goal is...

Bibliographic Details
Published in: AIP Conference Proceedings
Main Author: Zin S.H.H.M.; Moktar B.
Format: Conference paper
Language: English
Published: American Institute of Physics 2024
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85194167481&doi=10.1063%2f5.0208145&partnerID=40&md5=183e5ec3dbc96f1fb1585a5c363b49d4
id 2-s2.0-85194167481
author Zin S.H.H.M.; Moktar B.
author_facet Zin S.H.H.M.; Moktar B.
author_sort Zin S.H.H.M.; Moktar B.
title A comparison study of clustering algorithms in grouping microarray data
publishDate 2024
container_title AIP Conference Proceedings
container_volume 2850
container_issue 1
doi_str_mv 10.1063/5.0208145
url https://www.scopus.com/inward/record.uri?eid=2-s2.0-85194167481&doi=10.1063%2f5.0208145&partnerID=40&md5=183e5ec3dbc96f1fb1585a5c363b49d4
description Clustering is the process of grouping related data items so that each group is distinct from the others. In microarray data analysis, clustering makes it possible to find groups of genes that are co-expressed across various conditions. The primary goal of this study is to compare the effectiveness of various clustering methods in grouping microarray data both with and without outliers. Six microarray datasets were analysed in the same way. The clustering algorithms compared are widely used by researchers: agglomerative hierarchical clustering, the K-means algorithm, Self-Organizing Maps (SOM) and Partitioning Around Medoids (PAM). Performance was assessed with internal validation measures (connectivity, Dunn index and silhouette width) and stability validation measures (Average Proportion of Non-overlap and Average Distance between Means). The findings show that different clustering techniques produce different numbers of clusters. According to both the internal and the stability criteria, agglomerative hierarchical clustering with two clusters produces the best clustering results for data sets with and without outliers. © 2024 Author(s).
publisher American Institute of Physics
issn 0094243X
language English
format Conference paper
accesstype All Open Access; Bronze Open Access
record_format scopus
collection Scopus
_version_ 1818940552167555072
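
Two of the internal validation measures named in the description, silhouette width and the Dunn index, can be illustrated with a minimal sketch. This is not the authors' code and does not use their datasets: the toy points, the two label assignments, the Euclidean distance and the function names are all assumptions made for illustration. Higher values of both measures indicate more compact, better separated clusters.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_width(points, labels):
    """Average silhouette width over all points; needs at least two clusters."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Common convention: a singleton cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by the largest cluster diameter."""
    within, between = [], []
    for i, j in combinations(range(len(points)), 2):
        d = euclidean(points[i], points[j])
        (within if labels[i] == labels[j] else between).append(d)
    return min(between) / max(within)


# Hypothetical toy data: six two-dimensional profiles forming two tight groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the two visible groups
poor_labels = [0, 1, 0, 1, 0, 1]  # mixes the two groups

for name, labels in [("good", good_labels), ("poor", poor_labels)]:
    print(name, silhouette_width(points, labels), dunn_index(points, labels))
```

On this toy input the partition that matches the two visible groups scores higher on both measures than the mixed partition, which is the kind of ranking such measures provide when clustering results are compared.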