A comparison study of clustering algorithms in grouping microarray data
Clustering groups related data items so that each group is distinct from the others. In microarray analysis, clustering can identify groups of genes that are co-expressed under various conditions. This study's primary goal is...
Published in: | AIP Conference Proceedings |
---|---|
Main Author: | Zin S.H.H.M.; Moktar B. |
Format: | Conference paper |
Language: | English |
Published: | American Institute of Physics 2024 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85194167481&doi=10.1063%2f5.0208145&partnerID=40&md5=183e5ec3dbc96f1fb1585a5c363b49d4 |
id |
2-s2.0-85194167481 |
---|---|
author |
Zin S.H.H.M.; Moktar B. |
title |
A comparison study of clustering algorithms in grouping microarray data |
publishDate |
2024 |
container_title |
AIP Conference Proceedings |
container_volume |
2850 |
container_issue |
1 |
doi_str_mv |
10.1063/5.0208145 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85194167481&doi=10.1063%2f5.0208145&partnerID=40&md5=183e5ec3dbc96f1fb1585a5c363b49d4 |
description |
Clustering groups related data items so that each group is distinct from the others. In microarray analysis, clustering can identify groups of genes that are co-expressed under various conditions. The primary goal of this study is to compare the effectiveness of several clustering methods in grouping microarray data both with and without outliers. Six microarray datasets were analysed in the same way. The algorithms compared are widely used by researchers: agglomerative hierarchical clustering, K-means, Self-Organizing Maps (SOM) and Partitioning Around Medoids (PAM). Performance was assessed with internal validation measures (connectivity, Dunn index and silhouette width) and stability validation measures (Average Proportion of Non-overlap and Average Distance between Means). The findings show that different clustering techniques yield different numbers of clusters. According to the internal and stability criteria, agglomerative hierarchical clustering with two clusters produces the best results for data sets both with and without outliers. © 2024 Author(s). |
publisher |
American Institute of Physics |
issn |
0094243X |
language |
English |
format |
Conference paper |
accesstype |
All Open Access; Bronze Open Access |
record_format |
scopus |
collection |
Scopus |
_version_ |
1809678005745745920 |
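
The description names silhouette width and the Dunn index among the internal validation measures. As a rough illustration of what these two measures compute, the following is a minimal pure-Python sketch. It is not the authors' code: the function names, the toy 2-D points, and the two labelings are assumptions made for this example, and Euclidean distance is assumed throughout. Connectivity and the stability measures are not covered.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def group_by_label(points, labels):
    """Collect points into a dict keyed by cluster label."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def silhouette_width(points, labels):
    """Average silhouette width; ranges from -1 to 1, higher is better."""
    clusters = group_by_label(points, labels)
    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes a silhouette of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        total += (b - a) / max(a, b)
    return total / len(points)


def dunn_index(points, labels):
    """Smallest between-cluster distance over largest cluster diameter."""
    clusters = group_by_label(points, labels)
    min_separation = min(
        dist(p, q)
        for (_, c1), (_, c2) in combinations(clusters.items(), 2)
        for p in c1
        for q in c2
    )
    max_diameter = max(
        (dist(p, q) for c in clusters.values() for p, q in combinations(c, 2)),
        default=0.0,
    )
    return float("inf") if max_diameter == 0 else min_separation / max_diameter


# Hypothetical toy data: two tight pairs of points far apart from each other.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = [0, 0, 1, 1]  # labeling that respects the two pairs
bad = [0, 1, 0, 1]   # labeling that splits each pair

print(silhouette_width(points, good), dunn_index(points, good))
print(silhouette_width(points, bad), dunn_index(points, bad))
```

Both measures score the labeling that keeps the tight pairs together higher than the one that splits them, which is the sense in which a study like this one can rank clusterings without reference labels.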