Fake News Detection Regarding COVID-19 Tweets Using Machine Learning Approaches

Bibliographic Details
Published in: 8th International Conference on Recent Advances and Innovations in Engineering: Empowering Computing, Analytics, and Engineering Through Digital Innovation, ICRAIE 2023
Main Author: Hussin M.H.; Mahmud Y.; Mohd Hanafiah N.I.; Azma Nasruddin Z.; Mohd Ariffin N.H.; Ince M.; Senel F.A.
Format: Conference paper
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2023
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85189939276&doi=10.1109%2fICRAIE59459.2023.10468126&partnerID=40&md5=e7fcbe2274590bd27d72d1312dd144d7
Description
Summary: The pervasiveness of misinformation surrounding the COVID-19 pandemic has garnered heightened attention because a noteworthy proportion of the populace is being exposed to spurious and unsubstantiated narratives concerning the crisis. This research utilizes a dataset sourced from Codalab, comprising 8,560 tweets, with 4,480 labelled as real and 4,080 as fake. The research explores the effectiveness of classical machine learning models, namely logistic regression (LR) and random forest (RF), alongside deep learning models, namely Convolutional Neural Networks (CNN) and Bidirectional Long Short-Term Memory (Bi-LSTM). In addition to model comparison, experiments were conducted to analyze the impact of different data splits (70:30, 80:20, and 90:10), batch sizes (16, 32, and 64), and numbers of epochs (5, 10, and 15) on model performance. The experiments provided insights into the optimal configurations for the models. All four models achieved high accuracy: logistic regression 92%, random forest 91%, Bi-LSTM 93%, and CNN 95%. These findings highlight the potential of deep learning models, particularly CNN, in accurately detecting fake news from COVID-19-related tweets. © 2023 IEEE.
DOI: 10.1109/ICRAIE59459.2023.10468126
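
Note: The figures quoted in the summary can be put in context with a little arithmetic. The Python sketch below is illustrative only and is not taken from the paper. It computes the accuracy of a trivial classifier that always predicts the majority class, the train/test sizes implied by each split ratio, and the set of configurations obtained if the three hyperparameters were fully crossed. The summary does not say whether the authors ran a full grid, so the full crossing is an assumption, as are all function and variable names.

```python
from itertools import product
from math import ceil

# Figures quoted in the summary.
TOTAL_TWEETS = 8560
REAL_TWEETS = 4480
FAKE_TWEETS = 4080

# Settings listed in the summary (train share of each split, in percent).
TRAIN_PERCENTAGES = (70, 80, 90)
BATCH_SIZES = (16, 32, 64)
EPOCH_COUNTS = (5, 10, 15)


def majority_baseline_accuracy(real: int, fake: int) -> float:
    """Accuracy of a classifier that always predicts the larger class."""
    return max(real, fake) / (real + fake)


def split_sizes(total: int, train_pct: int) -> tuple[int, int]:
    """Train/test sizes for a given train percentage (integer arithmetic)."""
    train = total * train_pct // 100
    return train, total - train


def experiment_grid() -> list[dict]:
    """Every combination of split, batch size, and epoch count.

    Assumption: a full cross product. The summary does not state whether
    the settings were varied jointly or one at a time.
    """
    configs = []
    for train_pct, batch, epochs in product(
        TRAIN_PERCENTAGES, BATCH_SIZES, EPOCH_COUNTS
    ):
        train, test = split_sizes(TOTAL_TWEETS, train_pct)
        configs.append(
            {
                "train_pct": train_pct,
                "batch_size": batch,
                "epochs": epochs,
                "train_size": train,
                "test_size": test,
                # Mini-batch updates per epoch; the last batch may be partial.
                "steps_per_epoch": ceil(train / batch),
            }
        )
    return configs


if __name__ == "__main__":
    baseline = majority_baseline_accuracy(REAL_TWEETS, FAKE_TWEETS)
    print(f"Majority-class baseline accuracy: {baseline:.1%}")
    for pct in TRAIN_PERCENTAGES:
        print(pct, split_sizes(TOTAL_TWEETS, pct))
    print(f"Configurations in a full grid: {len(experiment_grid())}")
```

Because the two classes are nearly balanced, the majority-class baseline sits just above 52%, which is the reference point against which the reported 91% to 95% accuracies should be read.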