Performance Measurement: Machine Learning as a Complement to DEA for Continuous Efficiency Estimation

Bibliographic Details
Published in: MALAYSIAN JOURNAL OF FUNDAMENTAL AND APPLIED SCIENCES
Main Authors: Khoubrane, Yousef; Ramli, Noor Asiah; Khairi, Siti Shaliza Mohd
Format: Article
Language: English
Published: PENERBIT UTM PRESS 2024
Subjects:
Online Access: https://www-webofscience-com.uitm.idm.oclc.org/wos/woscc/full-record/WOS:001221789500014
Description
Summary: Data Envelopment Analysis (DEA) is a well-established non-parametric technique for assessing the efficiency of Decision-Making Units (DMUs). However, DEA cannot predict the efficiency of a new DMU without re-running the analysis on the entire dataset, and previous studies have integrated Machine Learning (ML) to address this limitation. Yet such integration often lacks a thorough evaluation of how well ML can stand in for the existing DEA process. This paper presents the results of an empirical study that employed eight ML models, two DEA variants, and a dataset of S&P 500 companies. The findings showed that ML predicted efficiency values derived from a single DEA run with remarkable precision and performed comparably when predicting the efficiency of new DMUs, thus eliminating the need for repeated DEA. This result highlights ML's robustness as a complementary tool for DEA in continuous efficiency estimation, rendering the practice of re-conducting DEA unnecessary. Notably, boosting models within the Ensemble Learning category consistently outperformed the other models, highlighting their effectiveness for DEA efficiency prediction. In particular, CatBoost was the top-performing model, followed by LightGBM in second position in most cases. When the study was extended to five enlarged datasets, the model exhibited superior R² values in the CRS scenario.
ISSN: 2289-5981, 2289-599X
DOI: 10.11113/mjfas.v20n2.3310
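
The workflow outlined in the summary (run DEA once, train a model on the resulting scores, then score new DMUs without repeating DEA) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are made-up toy numbers rather than S&P 500 figures, the DEA step uses the closed-form CRS score that holds only for a single input and a single output, and a plain k-nearest-neighbours regressor stands in for the boosting models named in the abstract. The function names `crs_efficiency` and `knn_predict` are hypothetical.

```python
def crs_efficiency(inputs, outputs):
    """CRS efficiency for single-input, single-output DMUs.

    Assumption: with one input and one output, the CRS score reduces to
    each DMU's output/input ratio divided by the best ratio in the sample.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


def knn_predict(train_X, train_y, query, k=2):
    """Average the targets of the k training points nearest to `query`.

    Stand-in learner (squared Euclidean distance), not one of the
    boosting models used in the paper.
    """
    by_distance = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), target)
        for row, target in zip(train_X, train_y)
    )
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Toy data (hypothetical): one input and one output per DMU.
inputs = [2.0, 4.0, 5.0, 8.0]
outputs = [2.0, 3.0, 5.0, 4.0]

# Step 1: a single DEA run produces the training targets.
scores = crs_efficiency(inputs, outputs)

# Step 2: fit on (input, output) -> efficiency; for k-NN, "fitting" is
# simply storing the training pairs.
features = list(zip(inputs, outputs))

# Step 3: estimate the efficiency of a new DMU without re-running DEA.
new_dmu = (4.0, 3.5)
predicted = knn_predict(features, scores, new_dmu, k=2)
print(scores, predicted)
```

With several inputs and outputs, the DEA step would instead solve one linear program per DMU, and the regressor could be replaced by any of the model families the study compares.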