Summary: | This study addresses class imbalance in dissolved gas analysis (DGA) data for transformer failure classification by assessing how data-level balancing techniques affect machine learning performance. Five data-level techniques were applied to balance the dataset: three under-sampling methods, Random Under-Sampling (RUS), Edited Nearest Neighbors (ENN), and NearMiss (NM), and two over-sampling methods, Random Over-Sampling (ROS) and ADASYN. The dataset comprises five key gas concentrations (H2, CH4, C2H6, C2H4, and C2H2) and a target defect variable (act). Three machine learning algorithms, Support Vector Machine (SVM), Decision Tree, and Random Forest, were tested. ENN combined with SVM achieved the highest classification performance: 88% accuracy, 89.89% precision, 88.00% recall, 86.64% F1-score, and a runtime of 0.21 s. These results indicate that data-level techniques can improve transformer fault diagnosis and thereby support the reliability of electrical power systems. Future research should refine these techniques and explore their integration with optimized models to further improve classification accuracy. © 2024 The Author(s)
|
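The summary names ENN as the best-performing balancing step but does not say how it was implemented. As an illustration only, the following is a minimal standard-library sketch of the Edited Nearest Neighbors rule in its majority-vote form: a sample from a non-minority class is dropped when its label disagrees with the majority label of its k nearest neighbors. The function name `edited_nearest_neighbours`, the choice k = 3, and the two-dimensional toy points are assumptions made for this example. They are not the study's code, data, or settings.

```python
import math
from collections import Counter


def edited_nearest_neighbours(X, y, k=3):
    """Illustrative ENN: drop non-minority samples whose label disagrees
    with the majority vote of their k nearest neighbours (Euclidean)."""
    counts = Counter(y)
    minority = min(counts, key=counts.get)  # the minority class is never edited
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi == minority:
            keep.append(i)
            continue
        # Indices of the k closest other samples.
        neighbours = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(xi, X[j]),
        )[:k]
        vote = Counter(y[j] for j in neighbours).most_common(1)[0][0]
        if vote == yi:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data (not DGA measurements): a "normal" cluster near the
# origin, a small "fault" cluster, and one "normal" point inside the fault cluster.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (5.0, 5.0),
     (5.0, 6.0), (6.0, 5.0), (6.0, 6.0)]
y = ["normal"] * 6 + ["fault"] * 3

X_res, y_res = edited_nearest_neighbours(X, y, k=3)
print(Counter(y_res))
```

In this toy set, the only sample removed is the "normal" point lying inside the "fault" cluster. ENN therefore cleans overlapping class regions and does not force equal class counts. The cleaned set would then be passed to a classifier such as an SVM.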