A New Internal Validity Index for Fuzzy c-Means Algorithm

Bibliographic Details
Published in: IEEE Access
Main Author: Nurmazianna Ismail K.; Seman A.; Airin Fariza Abu Samah K.
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2024
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85196079504&doi=10.1109%2fACCESS.2024.3414415&partnerID=40&md5=fbeabc0a53c4946804ed068fd989e5cb
Description
Summary: Fuzzy c-Means (FCM) is a popular clustering algorithm that partitions a set of objects into groups such that objects within a group are similar to one another and dissimilar to those in other groups. A validity index, either external or internal, is required to assess the quality of the clusters formed by the FCM algorithm. External validation requires known class labels, which serve as the clustering ground truth when measuring cluster quality. In real-world data with unknown class labels, cluster quality can be validated only through internal validation. A variety of internal validation measures with different scoring models have been developed, including a minimum model, a maximum model, and a range model with scores running from minimum to maximum. No internal validation measure proposed thus far uses a scoring model ranging from 0 to 1, as the ground-truth-based external validation does. Therefore, a new internal validation measure, the fuzzy validity index (FVI), is proposed. Experimental results based on several cluster properties demonstrated that the FVI is highly promising. Overall, the FVI scores were comparable to those obtained with an external validity index, the F-measure. Statistically, the correlation coefficient between the FVI and the F-measure was high (around 0.8 and above), indicating their similarity. Therefore, the FVI could potentially serve as the ground truth for measuring the cluster quality of FCM. © 2013 IEEE.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3414415
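
Illustrative note: the summary refers to the partition produced by FCM without showing how it is computed. The following is a minimal sketch of the textbook FCM updates (membership update and weighted-mean center update) on one-dimensional data. It is not the authors' code and does not implement the FVI, whose formula is not given in this record. The function names, the fuzzifier default m=2.0, the fixed iteration count, and the sample points are all assumptions made for illustration.

```python
def fcm_memberships(points, centers, m=2.0):
    """Textbook FCM membership update: u[k][i] is the degree to which
    point k belongs to cluster i; each row sums to 1."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # Point coincides with one or more centers: share full
            # membership among those centers to avoid dividing by zero.
            hits = [d == 0.0 for d in dists]
            row = [1.0 / sum(hits) if h else 0.0 for h in hits]
        else:
            row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                   for d_i in dists]
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Textbook FCM center update: mean of the points weighted by u**m."""
    centers = []
    for i in range(len(memberships[0])):
        weights = [row[i] ** m for row in memberships]
        centers.append(sum(w * x for w, x in zip(weights, points)) / sum(weights))
    return centers


def fcm(points, init_centers, m=2.0, n_iter=50):
    """Alternate the two updates for a fixed number of iterations
    (a simplification; a convergence test is more usual in practice)."""
    centers = list(init_centers)
    for _ in range(n_iter):
        u = fcm_memberships(points, centers, m)
        centers = fcm_centers(points, u, m)
    return centers, fcm_memberships(points, centers, m)


# Hypothetical toy data: two well-separated groups on a line.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, u = fcm(points, init_centers=[0.0, 6.0])
```

An internal validity index such as the FVI would score a partition using only quantities like `points`, `centers`, and `u`, whereas an external index such as the F-measure would additionally need the true class labels.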