Summary: | Airline customer satisfaction is an important indicator of whether passengers are satisfied with the services provided, such as in-flight and airport services. Although the topic has been widely researched, previous studies report inconsistent findings on which services most need attention, owing to differences in the data and methods used. In addition, only a few studies have applied data mining to classify airline customer satisfaction. This study therefore aimed to identify the service factors that airline service providers should prioritize and to highlight the contribution of three machine learning classifiers, Logistic Regression, Naïve Bayes, and Support Vector Machine (SVM), in classifying airline customer satisfaction. Logistic Regression was chosen because it is a commonly used classifier, Naïve Bayes because it has been reported to outperform other classifiers in the previous literature, and SVM because it optimizes the hyperplane that separates the dataset into classes. The dataset comprises 129,881 records and 23 attributes. Descriptive statistics were used to describe the mean values of all continuous attributes, and the target label, customer satisfaction, is balanced. To identify the influential attributes, namely the top 5 crucial services, feature weights from three filter methods were compared: Weight by Information Gain, Weight by Information Gain Ratio, and Weight by Chi-Squared Statistics. Predictive analysis was then performed using Logistic Regression, Naïve Bayes, and SVM. Among the three classifiers, Logistic Regression performed best in terms of overall accuracy, precision, sensitivity (recall), and specificity.
The top 5 crucial services identified were Ease of online booking, Online support, On-board service, Online boarding, and Legroom service. For Logistic Regression, the overall accuracy, precision, sensitivity, and specificity were 81.58%, 83.46%, 82.72%, and 80.21%, respectively. Logistic Regression also achieved the highest AUC (Area Under the ROC Curve) of the three classifiers (AUC = 0.894). The top 5 service factors identified can therefore serve as a key step in reforming airline services or in increasing customer retention and loyalty. Moreover, the best-performing model identified in this study can be used by decision-makers to define future strategies for running the airline business efficiently and progressively. © 2024 Author(s).
|
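The filter-based feature weighting described in the summary can be illustrated with a minimal sketch. It assumes discrete rating attributes and a binary satisfaction label, and it uses a small hypothetical toy dataset rather than the study's data. The function names (`entropy`, `information_gain`, `gain_ratio`) and the attribute names are illustrative assumptions, not the tooling used by the authors.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain normalized by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


# Hypothetical toy data, not taken from the study's dataset.
labels = ["satisfied", "satisfied", "dissatisfied", "dissatisfied"]
ratings = {
    "online_support": [5, 5, 1, 1],  # separates the two classes perfectly
    "gate_location": [3, 4, 3, 4],   # carries no information about the label
}

# Rank attributes by weight, highest first.
weights = {name: information_gain(col, labels) for name, col in ratings.items()}
ranked = sorted(weights, key=weights.get, reverse=True)
print(ranked)
```

Ranking attributes by such weights and keeping the highest-scoring ones is one way a "top 5" list could be derived. The gain ratio variant divides by the attribute's own entropy, which reduces the bias of information gain toward attributes with many distinct values.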