An analysis on performance of different type classifiers in handling big data sets

Bibliographic Details
Published in: Frontiers in Artificial Intelligence and Applications
Main Author: Mohamad M.; Selamat A.
Format: Conference paper
Language: English
Published: IOS Press BV 2019
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85082075517&doi=10.3233%2fFAIA190057&partnerID=40&md5=467a692fad8c4642ac4725b87309e519
Description
Summary: Data analysis is one of the most important tasks in the decision-making process. It helps decision makers solve many problems, such as classification and regression. However, a poor choice of method produces inefficient solutions, especially when dealing with big data sets. In addition, a lack of information about data set characteristics can complicate the analysis and lower its performance. This study therefore conducted experiments that evaluate six common algorithms on big data sets. A standard data analysis framework was implemented, consisting of initial data processing, data analysis, and performance evaluation. The results show that each algorithm has its own capability in handling different types of multivariate big data sets. Naive Bayes is one of the algorithms that successfully classified all selected data sets. The Poker and Madelon data sets required a large amount of memory during analysis. It can be concluded that information about the data set characteristics and the capability of the assigned data analysis method should be specified before any decision is made. © 2019 The authors and IOS Press. All rights reserved.
ISSN: 0922-6389
DOI: 10.3233/FAIA190057
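
Illustration: the three-stage framework named in the summary (initial data processing, data analysis, performance evaluation) might be sketched roughly as follows. This is not the authors' implementation. The synthetic two-class data, the min-max scaling step, the 70/30 split, and every function and class name here are assumptions made for illustration, and only a small Gaussian Naive Bayes classifier stands in for the six algorithms the study compares.

```python
import math
import random
from collections import defaultdict


def fit_min_max(rows):
    """Initial processing, step 1: learn per-feature minima and maxima."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def apply_min_max(rows, lows, highs):
    """Initial processing, step 2: rescale features using the learned ranges."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


class GaussianNaiveBayes:
    """Analysis: a small Gaussian Naive Bayes classifier (hypothetical helper)."""

    def fit(self, X, y):
        grouped = defaultdict(list)
        for row, label in zip(X, y):
            grouped[label].append(row)
        self.stats = {}
        for label, rows in grouped.items():
            prior = len(rows) / len(X)
            params = []
            for col in zip(*rows):
                mean = sum(col) / len(col)
                # Small constant keeps the variance strictly positive.
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
                params.append((mean, var))
            self.stats[label] = (prior, params)
        return self

    def predict(self, X):
        return [self._predict_one(row) for row in X]

    def _predict_one(self, row):
        best_label, best_score = None, -math.inf
        for label, (prior, params) in self.stats.items():
            # Log prior plus the sum of per-feature Gaussian log likelihoods.
            score = math.log(prior)
            for v, (mean, var) in zip(row, params):
                score += -0.5 * math.log(2 * math.pi * var)
                score -= (v - mean) ** 2 / (2 * var)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(y_true, y_pred):
    """Performance evaluation: fraction of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def run_pipeline(seed=0):
    rng = random.Random(seed)
    # Synthetic data (an assumption): class 0 near (0, 0), class 1 near (5, 5).
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 5.0)):
        for _ in range(100):
            X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
            y.append(label)

    # Shuffle, then hold out 30% of the rows for evaluation.
    order = list(range(len(X)))
    rng.shuffle(order)
    split = int(0.7 * len(order))
    train, test = order[:split], order[split:]

    # Scale with ranges learned from the training rows only.
    lows, highs = fit_min_max([X[i] for i in train])
    X_train = apply_min_max([X[i] for i in train], lows, highs)
    X_test = apply_min_max([X[i] for i in test], lows, highs)

    model = GaussianNaiveBayes().fit(X_train, [y[i] for i in train])
    return accuracy([y[i] for i in test], model.predict(X_test))


if __name__ == "__main__":
    print(f"held-out accuracy: {run_pipeline():.3f}")
```

Comparing several classifiers, as the study does, would amount to swapping other models with the same `fit`/`predict` interface into `run_pipeline` and recording accuracy alongside run time and memory use for each data set.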