Summary: | This study examines how dataset size and hyperparameters affect the performance of clustering algorithms. The experiments cover Imstagrid, DBSCAN, and HDBSCAN. Reducing the dataset to 1500 rows had a significant impact, with different hyperparameter sets leading to different algorithmic results. The findings highlight the balance between data density and clustering granularity. The comparative analysis in Table 3 presents the top five results across these experiments and emphasizes the importance of parameter selection. Agglomerative Clustering with an optimized spatial parameter (L = 5.862) performed best, achieving a silhouette score of 92.95% with 48 clusters for Imstagrid. In contrast, DBSCAN and HDBSCAN performed worse, with silhouette scores ranging from 39% to 50%. These results offer guidance for selecting clustering algorithms and parameters in different scenarios. © 2023 IEEE.
|
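The summary reports silhouette scores as percentages. As a minimal sketch, assuming these are the standard mean silhouette coefficient (range -1 to 1) multiplied by 100 and computed with Euclidean distance, the metric can be written as follows. The function name `silhouette_score` and the toy points are illustrative only and do not come from the study; how noise points from DBSCAN or HDBSCAN were treated is not stated in the summary and is not modelled here.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance assumed)."""
    # Group points by their cluster label.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention: singleton clusters score 0
            continue
        # a: mean distance from p to the other members of its own cluster
        # (the sum includes dist(p, p) == 0, so divide by len(own) - 1).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance from p to the members of any other cluster.
        b = min(
            mean(dist(p, q) for q in other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


# Hypothetical toy data: two tight, well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])
bad = silhouette_score(points, [0, 1, 0, 1])
print(f"well-separated labelling: {100 * good:.2f}%")
print(f"mixed labelling: {100 * bad:.2f}%")
```

Values near 100% indicate compact, well-separated clusters, while values near zero or below indicate overlapping or misassigned points. This is the scale on which the reported 92.95% and the 39% to 50% range would be compared under the assumption above.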