Climate Change Sentiment Analysis Using Lexicon, Machine Learning and Hybrid Approaches

Bibliographic Details
Published in: Sustainability (Switzerland)
Main Author: Sham N.M.; Mohamed A.
Format: Article
Language: English
Published: MDPI 2022
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85129172729&doi=10.3390%2fsu14084723&partnerID=40&md5=16cbf8bde5706041860eb5459124e2d9
id 2-s2.0-85129172729
spelling 2-s2.0-85129172729
Sham N.M.; Mohamed A.
Climate Change Sentiment Analysis Using Lexicon, Machine Learning and Hybrid Approaches
2022
Sustainability (Switzerland)
14
8
10.3390/su14084723
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85129172729&doi=10.3390%2fsu14084723&partnerID=40&md5=16cbf8bde5706041860eb5459124e2d9
Emissions of greenhouse gases such as carbon dioxide warm the planet and drive climate change. Sentiment analysis has been a popular research subject in recent decades, typically applied to social media platforms such as Twitter, where discussions on climate change generate large volumes of data. However, little research compares the performance of lexicon-based, machine learning and hybrid sentiment analysis approaches, particularly in this specific domain. This study aims to find the most effective sentiment analysis approach for climate change tweets and related domains through a comparative evaluation of these approaches. Seven lexicon-based approaches were used: SentiWordNet, TextBlob, VADER, SentiStrength, Hu and Liu, MPQA, and WKWSCI. Three machine learning classifiers were used, namely Support Vector Machine (SVM), Naïve Bayes (NB), and Logistic Regression, each with two feature extraction techniques, Bag-of-Words (BoW) and TF–IDF. Lexicon-based and machine learning-based approaches were then hybridized. The results indicate that the hybrid method outperformed the other two approaches, with hybrid TextBlob and Logistic Regression achieving an F1-score of 75.3%; it was therefore chosen as the most effective approach. The study also found that lemmatization improved the accuracy of machine learning and hybrid approaches by 1.6%. TF–IDF was slightly better than BoW, increasing the accuracy of the Logistic Regression classifier by 0.6%, whereas TF–IDF and BoW had an identical effect on SVM and NB. Future work will investigate the suitability of deep learning approaches for this domain-specific sentiment on social media platforms.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland.
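
The abstract names Bag-of-Words and TF–IDF features and a hybrid of lexicon-based and machine learning approaches, but this record does not say how the hybridization was done. The sketch below is an illustration under stated assumptions, not the authors' method: it assumes the lexicon polarity score is simply appended as one extra feature to a TF–IDF vector, which a classifier such as Logistic Regression could then consume. The lexicon entries, function names and the smoothed IDF formula are all hypothetical choices made for this example.

```python
import math
from collections import Counter

# Toy polarity lexicon: hypothetical entries, not taken from any real lexicon.
TOY_LEXICON = {"good": 1.0, "hope": 0.5, "bad": -1.0, "crisis": -0.8, "hoax": -0.6}


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real tweet preprocessing)."""
    return text.lower().split()


def lexicon_score(tokens, lexicon=TOY_LEXICON):
    """Mean polarity of the tokens found in the lexicon; 0.0 if none match."""
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


def bag_of_words(docs):
    """Return the sorted vocabulary and one raw term-count row per document."""
    vocab = sorted({t for d in docs for t in d})
    rows = []
    for d in docs:
        counts = Counter(d)
        rows.append([counts[t] for t in vocab])
    return vocab, rows


def tf_idf(docs):
    """Weight raw counts by a smoothed inverse document frequency."""
    vocab, counts = bag_of_words(docs)
    n = len(docs)
    # Document frequency: in how many documents each vocabulary term occurs.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    # Assumed smoothing: idf = ln((1 + n) / (1 + df)) + 1.
    idf = [math.log((1 + n) / (1 + d)) + 1 for d in df]
    return vocab, [[c * w for c, w in zip(row, idf)] for row in counts]


def hybrid_features(texts):
    """TF-IDF vector per text with the lexicon score appended as the last column."""
    docs = [tokenize(t) for t in texts]
    vocab, weighted = tf_idf(docs)
    return vocab, [row + [lexicon_score(d)] for row, d in zip(weighted, docs)]


vocab, rows = hybrid_features(["climate crisis is bad", "good hope for climate"])
```

Each row in `rows` has one column per vocabulary term plus a final lexicon-score column. Swapping `tf_idf` for `bag_of_words` inside `hybrid_features` would give the Bag-of-Words variant of the same hybrid representation.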
MDPI
20711050
English
Article
All Open Access; Gold Open Access
author Sham N.M.; Mohamed A.
title Climate Change Sentiment Analysis Using Lexicon, Machine Learning and Hybrid Approaches
publishDate 2022
container_title Sustainability (Switzerland)
container_volume 14
container_issue 8
doi_str_mv 10.3390/su14084723
url https://www.scopus.com/inward/record.uri?eid=2-s2.0-85129172729&doi=10.3390%2fsu14084723&partnerID=40&md5=16cbf8bde5706041860eb5459124e2d9
publisher MDPI
issn 20711050
language English
format Article
accesstype All Open Access; Gold Open Access
record_format scopus
collection Scopus
_version_ 1792585522950963200