Summary: | Fuzzy c-Means (FCM) is a popular clustering algorithm that partitions a set of objects into groups such that objects within a group are similar to each other and dissimilar to those in other groups. A validity index, either external or internal, is required to assess the quality of the clusters formed by the FCM algorithm. External validation requires known class labels, which serve as the clustering ground truth, to measure cluster quality. For real-world data with unknown class labels, cluster quality can be assessed only through internal validation. A variety of internal validation measures with different scoring models have been developed, including a minimum model, a maximum model, and a range model with minimum to maximum scores. No internal validation measure proposed thus far uses a scoring model ranging from 0 to 1, as external validation against the clustering ground truth does. Therefore, a new internal validation measure, the fuzzy validity index (FVI), is proposed. Experimental results based on several cluster properties demonstrated that the FVI is highly promising. Overall, the FVI scores were comparable to those obtained by an external validity index, the F-measure. The correlation coefficient between the FVI and the F-measure was high (around 0.8 and above), indicating their similarity. Therefore, the FVI could potentially serve as the ground truth for measuring the cluster quality of FCM. © 2013 IEEE.
|