Analysis of airline customer satisfaction using data mining techniques

Bibliographic Details
Published in: AIP Conference Proceedings
Main Author: Mohd Rozaini N.A.A.; Abd Talib N.F.A.; Ul-Saufie A.Z.; Ibrahim N.; Jamil S.A.M.; Ul-Saufie A.Z.
Format: Conference paper
Language: English
Published: American Institute of Physics 2024
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85203168951&doi=10.1063%2f5.0223954&partnerID=40&md5=114dd38b7a270328fba3d4bc4a7df5e3
id 2-s2.0-85203168951
spelling 2-s2.0-85203168951
Mohd Rozaini N.A.A.; Abd Talib N.F.A.; Ul-Saufie A.Z.; Ibrahim N.; Jamil S.A.M.; Ul-Saufie A.Z.
Analysis of airline customer satisfaction using data mining techniques
2024
AIP Conference Proceedings
3123
1
10.1063/5.0223954
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85203168951&doi=10.1063%2f5.0223954&partnerID=40&md5=114dd38b7a270328fba3d4bc4a7df5e3
Airline customer satisfaction is an important indicator of whether passengers are satisfied with the services provided, such as inflight services and airport services. Extensive research has been done in the previous literature; however, the findings on which crucial services need attention are inconsistent because different data and methods were used. A further problem is that only a few studies have applied data mining to classify airline customer satisfaction. Therefore, this study aimed to explore more service factors that airline service providers need to pay attention to, and to highlight the contribution of applying the machine learning classifiers Logistic Regression, Naïve Bayes, and Support Vector Machine (SVM) to classify airline customer satisfaction. Logistic Regression was chosen because it is a commonly used classifier, Naïve Bayes because it is reported to outperform other classifiers in the previous literature, and SVM because it optimizes the hyperplane that separates the dataset into classes. The dataset in this study comprises 129881 records and 23 attributes. Descriptive statistics summarize the mean values of all continuous attributes, and the target label, customer satisfaction, is balanced. As the intention of this study is to identify the influential attributes based on the top 5 crucial services, the feature weights from three different filter methods, Weight by Information Gain, Weight by Information Gain Ratio, and Weight by Chi-Squared Statistics, were compared. The next methodological step is predictive analysis using Logistic Regression, Naïve Bayes, and SVM. Among the classifiers, Logistic Regression performs better in overall accuracy, precision, and recall values (sensitivity and specificity).
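The three filter weights named above can be sketched from their textbook definitions. The following is a minimal illustration for categorical attributes and a binary label, not the study's implementation: the function names, the toy attributes (`online_booking`, `gate_location`), and the toy values are all hypothetical, and the tool used in the study may discretize numeric attributes or normalize the weights differently.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Reduction in label entropy after splitting on a categorical attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def gain_ratio(values, labels):
    """Information gain divided by the entropy of the attribute itself."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0

def chi_squared(values, labels):
    """Pearson chi-squared statistic of the attribute-by-label contingency table."""
    n = len(labels)
    observed = Counter(zip(values, labels))
    value_totals = Counter(values)
    label_totals = Counter(labels)
    stat = 0.0
    for v, v_total in value_totals.items():
        for y, y_total in label_totals.items():
            expected = v_total * y_total / n
            stat += (observed.get((v, y), 0) - expected) ** 2 / expected
    return stat

# Hypothetical toy data, NOT taken from the study's dataset.
satisfied = [1, 1, 1, 1, 0, 0, 0, 0]
attributes = {
    "online_booking": ["high", "high", "high", "low", "low", "low", "low", "high"],
    "gate_location": ["a", "b", "a", "b", "a", "b", "a", "b"],
}

# One weight table per filter method, so the rankings can be compared.
weights = {
    name: {
        "info_gain": information_gain(vals, satisfied),
        "gain_ratio": gain_ratio(vals, satisfied),
        "chi_squared": chi_squared(vals, satisfied),
    }
    for name, vals in attributes.items()
}
ranking = sorted(weights, key=lambda a: weights[a]["info_gain"], reverse=True)
print(ranking)
```

In this toy case all three weights rank `online_booking` above `gate_location`, because the second attribute is distributed identically in both classes. On real data the three methods can disagree, which is why comparing them is informative.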
The top 5 crucial services identified are: Ease of online booking, Online support, On-board service, Online boarding, and Legroom service. The percentages of the performance metrics for Logistic Regression are 81.58%, 83.46%, 82.72%, and 80.21% respectively. Logistic Regression also has the highest AUC (Area under the ROC Curve), at 0.894. As such, the top 5 service factors identified can be used as a key step toward reforming airline services or increasing customer retention and loyalty. Moreover, the best model determined in this study can be used by decision-makers to define their future strategies to run the airline business efficiently and progressively. © 2024 Author(s).
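For readers unfamiliar with the reported metrics, the sketch below shows how accuracy, precision, sensitivity (recall), specificity, and AUC are conventionally computed for a binary classifier. It assumes the standard definitions and uses hypothetical labels and scores, so it does not reproduce the figures above; the helper names and the 0.5 decision threshold are illustrative choices, not details from the study.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity (recall) and specificity for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # also called recall
        "specificity": tn / (tn + fp),
    }

def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of the ROC area).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = satisfied) and predicted probabilities, NOT study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
area = auc(y_true, scores)
print(metrics, area)
```

The first four metrics depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why it is reported separately.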
American Institute of Physics
0094243X
English
Conference paper

author Mohd Rozaini N.A.A.; Abd Talib N.F.A.; Ul-Saufie A.Z.; Ibrahim N.; Jamil S.A.M.; Ul-Saufie A.Z.
title Analysis of airline customer satisfaction using data mining techniques
publishDate 2024
container_title AIP Conference Proceedings
container_volume 3123
container_issue 1
doi_str_mv 10.1063/5.0223954
url https://www.scopus.com/inward/record.uri?eid=2-s2.0-85203168951&doi=10.1063%2f5.0223954&partnerID=40&md5=114dd38b7a270328fba3d4bc4a7df5e3
publisher American Institute of Physics
issn 0094243X
language English
format Conference paper
record_format scopus
collection Scopus