A Sentiment Analysis Framework on COVID-19 in Major Cities of Malaysia based on Tweets using Machine Learning Classification Model

Bibliographic Details
Published in: 2021 IEEE 11th International Conference on System Engineering and Technology, ICSET 2021 - Proceedings
Main Author: Aminuddin R.; Bistamam M.A.; Ibrahim S.; Abu Mangshor N.N.; Fesol S.F.A.; Wahab N.I.F.A.
Format: Conference paper
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2021
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85123368779&doi=10.1109%2fICSET53708.2021.9612527&partnerID=40&md5=2499ca2d1ebad930b80d633b9bfaaf33
Description
Summary: Twitter is a popular social media platform where people share their stories and opinions on any situation, such as the COVID-19 pandemic. Given the indirect influence of tweets on users and the rise in COVID-19 cases in Malaysia, it is important to monitor information related to the pandemic in order to avoid misinformation, panic, or confusion among the public. Because tweets are also a useful source of raw data for visualization, this project aims to design and develop a web-based system for visualizing the status of the pandemic in Malaysia based on data collected from Twitter. The methodology has four phases: (i) Planning, (ii) Analysis, (iii) Design and Development, and (iv) Testing and Documentation. In the planning and analysis phases, data will be collected from March 2020 to March 2021 and filtered using keywords and hashtags, such as #COVID19 and #Coronavirus, as well as the location tagged on the tweets. The collected data will then be pre-processed to remove unwanted text. The data are classified by sentiment using a machine learning model, the Support Vector Machine (SVM). The performance of the classification model will be evaluated using four metrics: (i) accuracy, (ii) recall, (iii) precision, and (iv) F1-measure. The final output of this project is a data visualization of the sentiment analysis on COVID-19 in Malaysia based on two of its major cities: Kuala Lumpur and Klang. © 2021 IEEE.
DOI: 10.1109/ICSET53708.2021.9612527
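
Illustrative sketch: the summary describes filtering tweets by hashtag and tagged location, pre-processing the text, and scoring the classifier with accuracy, recall, precision, and F1-measure. The Python below is a minimal sketch of those three steps and is not the authors' implementation. The function names, the cleaning rules (dropping URLs, mentions, and punctuation), the exact-match location test, and the label "positive" are assumptions made for illustration; the record does not specify them. The SVM itself is not shown; its predictions would be passed in as `y_pred`.

```python
import re

# Hashtags named in the summary; a real keyword list is likely longer (assumption).
KEYWORDS = ("#covid19", "#coronavirus")
# The two cities named in the summary.
CITIES = ("kuala lumpur", "klang")


def is_relevant(text, location):
    """Keep a tweet only if it has a tracked hashtag and a tracked city tag."""
    lowered = text.lower()
    has_keyword = any(keyword in lowered for keyword in KEYWORDS)
    in_city = location.strip().lower() in CITIES
    return has_keyword and in_city


def clean_tweet(text):
    """Assumed cleaning rules: drop URLs, mentions, '#' signs, and punctuation."""
    text = re.sub(r"https?://\S+", " ", text)    # URLs
    text = re.sub(r"@\w+", " ", text)            # user mentions
    text = text.replace("#", " ")                # keep the hashtag word itself
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)  # punctuation and symbols
    return " ".join(text.lower().split())        # lowercase, collapse whitespace


def evaluate(y_true, y_pred, positive="positive"):
    """Accuracy, plus precision, recall, and F1 for one label (one-vs-rest)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

For example, `clean_tweet("Stay safe! #COVID19 https://t.co/abc @user")` yields `"stay safe covid19"`. Calling `evaluate` once per sentiment label gives per-class scores if the classifier distinguishes more than two classes.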