DBRF: Random Forest Optimization Algorithm Based on DBSCAN

Correlation and redundancy of features will directly affect the quality of randomly selected features, weakening the convergence of random forests (RF) and reducing the performance of random forest models. This paper introduces an improved random forest algorithm, a Random Forest Algorithm Based on D...


Bibliographic Details
Published in: INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS
Main Authors: Zhuo, Wang; Ahmad, Azlin
Format: Article
Language: English
Published: SCIENCE & INFORMATION SAI ORGANIZATION LTD 2024
Subjects: Computer Science
Online Access: https://www-webofscience-com.uitm.idm.oclc.org/wos/woscc/full-record/WOS:001344145700001
author Zhuo, Wang; Ahmad, Azlin
title DBRF: Random Forest Optimization Algorithm Based on DBSCAN
container_title INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS
language English
format Article
description Correlation and redundancy among features directly affect the quality of randomly selected features, weakening the convergence of random forests (RF) and reducing the performance of random forest models. This paper introduces an improved random forest algorithm, a Random Forest Algorithm Based on DBSCAN (DBRF). The algorithm uses DBSCAN (Density-Based Spatial Clustering of Applications with Noise) to improve the feature extraction process and obtain a more efficient feature set. It first uses DBSCAN to group all features by relevance, then selects features from each group in proportion to construct the feature subset for each decision tree, repeating this process until the random forest is built. This preserves the diversity of features in the random forest while eliminating, to some extent, the correlation and redundancy among them, thereby improving the quality of random feature selection. In the experiments, three classifiers (CART, RF, and DBRF) were compared through ten-fold cross-validation on six datasets of different sizes, using accuracy, precision, recall, F1, and running time as evaluation indicators. The experiments found that DBRF outperformed RF and improved prediction performance, especially in terms of time complexity. The algorithm is suitable for various fields and can effectively improve classification prediction performance at a lower complexity level.
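The feature-grouping and proportional-selection steps in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how "relevance" is measured, how noise features are handled, or what proportion is used. The sketch assumes a distance of 1 - |Pearson correlation| between features, treats all DBSCAN noise features as one extra group, and draws a fixed hypothetical `ratio` from every group. All function names and parameter values (`eps`, `min_pts`, `ratio`) are illustrative.

```python
import math
import random
from collections import defaultdict


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def dbscan_precomputed(dist, eps, min_pts):
    """Plain DBSCAN on a square distance matrix; returns labels, -1 meaning noise."""
    n = len(dist)
    labels = [None] * n  # None = not visited yet
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(n) if dist[i][j] <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = [k for k in range(n) if dist[j][k] <= eps]
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


def group_features(columns, eps=0.3, min_pts=2):
    """Cluster features (given as columns) by the assumed distance 1 - |correlation|."""
    dist = [[1.0 - abs(pearson(a, b)) for b in columns] for a in columns]
    return dbscan_precomputed(dist, eps, min_pts)


def sample_feature_subset(labels, ratio, rng):
    """Draw ceil(ratio * group size) feature indices from every group.

    Assumption: all noise features (label -1) are pooled into one group.
    """
    groups = defaultdict(list)
    for idx, lab in enumerate(labels):
        groups[lab].append(idx)
    chosen = []
    for members in groups.values():
        k = math.ceil(ratio * len(members))
        chosen.extend(rng.sample(members, k))
    return sorted(chosen)


# Toy data: features 0-2 are perfectly correlated, 3-4 are perfectly correlated,
# and feature 5 is only weakly related to the others.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, -1, 1, -1, 1, -1, 1, -1]
z = [1, 1, -1, -1, 1, 1, -1, -1]
columns = [x, [2 * v for v in x], x[::-1], y, [3 * v for v in y], z]

labels = group_features(columns)
rng = random.Random(0)
# One feature subset per tree; fitting the trees themselves is omitted here.
subsets = [sample_feature_subset(labels, 0.5, rng) for _ in range(3)]
print(labels)
print(subsets)
```

Each subset would then be handed to one decision tree, so that every tree sees features from every group rather than several near-duplicates from the same group.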
publisher SCIENCE & INFORMATION SAI ORGANIZATION LTD
issn 2158-107X; 2156-5570
publishDate 2024
container_volume 15
container_issue 9
doi_str_mv
topic Computer Science
topic_facet Computer Science
accesstype
id WOS:001344145700001
url https://www-webofscience-com.uitm.idm.oclc.org/wos/woscc/full-record/WOS:001344145700001
record_format wos
collection Web of Science (WoS)
_version_ 1818940497836638208