A New Internal Validity Index for Fuzzy c-Means Algorithm
Fuzzy c-Means (FCM) is a popular clustering algorithm that partitions a set of objects into groups such that objects within a group are similar to each other and dissimilar to those in other groups. A validity index, either external or internal, is required to validate the quality of the clusters formed by...
Published in: | IEEE Access |
---|---|
Main Author: | Nurmazianna Ismail K. |
Format: | Article |
Language: | English |
Published: | Institute of Electrical and Electronics Engineers Inc., 2024 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85196079504&doi=10.1109%2fACCESS.2024.3414415&partnerID=40&md5=fbeabc0a53c4946804ed068fd989e5cb |
id |
2-s2.0-85196079504 |
---|---|
author |
Nurmazianna Ismail K.; Seman A.; Airin Fariza Abu Samah K. |
title |
A New Internal Validity Index for Fuzzy c-Means Algorithm |
publishDate |
2024 |
container_title |
IEEE Access |
container_volume |
12 |
doi_str_mv |
10.1109/ACCESS.2024.3414415 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85196079504&doi=10.1109%2fACCESS.2024.3414415&partnerID=40&md5=fbeabc0a53c4946804ed068fd989e5cb |
description |
Fuzzy c-Means (FCM) is a popular clustering algorithm that partitions a set of objects into groups such that objects within a group are similar to each other and dissimilar to those in other groups. A validity index, either external or internal, is required to validate the quality of the clusters formed by the FCM algorithm. External validation requires known class labels to measure cluster quality and serves as the clustering ground truth. In real-world data with unknown class labels, cluster quality can be validated only via internal validation. A variety of internal validation measures with different scoring models have been developed, including a minimum model, a maximum model, and a range model with minimum-to-maximum scores. No internal validation measure proposed thus far is associated with a model ranging from 0 to 1, like the clustering ground truth (external validation). Therefore, a new internal validation measure, namely the fuzzy validity index (FVI), is proposed. Experimental results based on several cluster properties demonstrated that the FVI is highly promising. Overall, the scores of the FVI were comparable to the scores obtained by an external validity index, i.e., the F-measure. Statistically, the correlation coefficient between the FVI and the F-measure was high (around 0.8 and above), indicating their similarity. Therefore, the FVI could potentially serve as the ground truth for measuring the cluster quality of FCM. © 2013 IEEE. |
publisher |
Institute of Electrical and Electronics Engineers Inc. |
issn |
21693536 |
language |
English |
format |
Article |
accesstype |
All Open Access; Gold Open Access |
record_format |
scopus |
collection |
Scopus |
_version_ |
1812871796519600128 |
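
The description above contrasts internal and external validation and mentions internal measures whose scores fall within a fixed range. As a rough illustration only, the Python sketch below computes the standard FCM membership update for fixed cluster centers and then Bezdek's partition coefficient, a classic internal index whose scores range from 1/c to 1 for c clusters. It is not the FVI proposed in the article, whose formula this record does not give. The function names, the 1-D toy data, the fixed centers, and the fuzzifier m = 2 are all assumptions made for the example.

```python
def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update for 1-D points and fixed centers.

    Hypothetical helper for illustration; m is the fuzzifier (m > 1).
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        zeros = dists.count(0.0)
        if zeros:
            # A point lying exactly on a center belongs fully to it
            # (shared equally if several centers coincide with the point).
            row = [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
        else:
            # u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1))
            row = [
                1.0 / sum((dj / dk) ** exponent for dk in dists)
                for dj in dists
            ]
        memberships.append(row)
    return memberships


def partition_coefficient(memberships):
    """Bezdek's partition coefficient: mean of squared memberships per object.

    Ranges from 1/c (maximally fuzzy) to 1 (crisp partition).
    """
    n = len(memberships)
    return sum(u * u for row in memberships for u in row) / n


# Toy data (assumed): two well-separated groups on a line.
points = [0.0, 1.0, 9.0, 10.0]
centers = [0.5, 9.5]

u = fcm_memberships(points, centers)
pc = partition_coefficient(u)
print("memberships:", u)
print("partition coefficient:", pc)
```

Because the two groups are far apart, each membership row is close to crisp and the coefficient lands near the upper end of its range. An internal index of this kind uses only the memberships, whereas an external measure such as the F-measure would additionally need the true class labels.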