Chinese Monolingual Sentiment Classifier for Social Media Data Using Corpus-Based Approach

Bibliographic Details
Published in: 2024 5th International Conference on Artificial Intelligence and Data Sciences, AiDAS 2024 - Proceedings
Main Author: Sia Abdullah N.A.; Low Cheng Cheng S.C.; Rosli M.M.
Format: Conference paper
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2024
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85209641239&doi=10.1109%2fAiDAS63860.2024.10730147&partnerID=40&md5=c9635986bd3ad579211ae17ce2106921
id 2-s2.0-85209641239
DOI: 10.1109/AiDAS63860.2024.10730147
Sentiment analysis, sometimes known as opinion mining, is a technique for finding and extracting information such as opinions and attitudes on a given topic from written and spoken language. Most netizens in Malaysia use distinctive dialects or slang and commonly abbreviate words when communicating. This makes the exchange of information more efficient, but it also makes the meaning of text messages hard to capture and, as a result, makes the polarity of the text more difficult to categorise. This paper develops a Chinese monolingual sentiment classifier for social media data using a corpus-based approach. It is implemented as a web-based system using the Flask framework, built with front-end and back-end technologies including HTML, CSS, and Python. A large corpus of Chinese texts is collected from social media, and an open-source dataset with sentiment labels is then utilised. The intended outcome is to detect and categorise sentiment expressions in Chinese text more effectively. The classifier, a Support Vector Machine model deployed within the Flask-based web application, achieved 90% accuracy on social media texts. Although it is proficient at determining positive and negative sentiments, it still needs additional refinement to interpret neutral tones and complex expressions. © 2024 IEEE.
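The abstract names a Support Vector Machine trained on labelled Chinese text but does not give the features or the training procedure. As a rough illustration only, the sketch below trains a linear SVM with a Pegasos-style sub-gradient update on character unigram and bigram counts. The toy sentences, the feature choice, the hyperparameters, and the function names (`features`, `train_linear_svm`, `predict`) are all assumptions made for this example and are not taken from the paper.

```python
from collections import Counter

# Hypothetical toy data, NOT the paper's corpus: +1 = positive, -1 = negative.
TRAIN = [
    ("我很喜欢这个产品", 1),
    ("服务太糟糕了", -1),
    ("质量非常好", 1),
    ("我很讨厌这个产品", -1),
    ("服务太棒了", 1),
    ("质量非常差", -1),
]


def features(text):
    """Count character unigrams and bigrams (assumed features, no word segmenter needed)."""
    chars = list(text)
    bigrams = [a + b for a, b in zip(chars, chars[1:])]
    return Counter(chars + bigrams)


def score(weights, feats):
    """Dot product between the sparse weight vector and a feature bag."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())


def train_linear_svm(data, lam=0.01, epochs=50):
    """Pegasos-style sub-gradient descent on the L2-regularised hinge loss."""
    weights = {}
    t = 0
    for _ in range(epochs):
        for text, label in data:
            t += 1
            feats = features(text)
            # Check the margin with the current weights before changing them.
            violated = label * score(weights, feats) < 1.0
            # Regularisation shrink: 1 - eta * lam, with step size eta = 1 / (lam * t).
            shrink = 1.0 - 1.0 / t
            for f in weights:
                weights[f] *= shrink
            if violated:
                eta = 1.0 / (lam * t)
                for f, v in feats.items():
                    weights[f] = weights.get(f, 0.0) + eta * label * v
    return weights


def predict(weights, text):
    """Map the sign of the decision score to a sentiment label."""
    return "positive" if score(weights, features(text)) > 0 else "negative"


weights = train_linear_svm(TRAIN)
print(predict(weights, "喜欢"))
print(predict(weights, "糟糕"))
```

In a web deployment such as the Flask application the abstract describes, a function like `predict` would typically be called from a request handler. A model with only two output labels like this one cannot express neutral sentiment, which is consistent with the limitation the abstract reports.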