Chinese Monolingual Sentiment Classifier for Social Media Data Using Corpus-Based Approach

Bibliographic Details
Published in: 2024 5th International Conference on Artificial Intelligence and Data Sciences, AiDAS 2024 - Proceedings
Main Author: Sia Abdullah N.A.; Low Cheng Cheng S.C.; Rosli M.M.
Format: Conference paper
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2024
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85209641239&doi=10.1109%2fAiDAS63860.2024.10730147&partnerID=40&md5=c9635986bd3ad579211ae17ce2106921
Description
Summary: Sentiment analysis, sometimes known as opinion mining, is a technique for finding and extracting information such as opinions and attitudes on a given topic from both written and spoken language. Most netizens in Malaysia use distinctive dialects or slang and commonly abbreviate words when communicating. This makes the exchange of information more efficient, but it also makes the meaning of the text messages hard to capture and, as a result, makes the polarity of the text more difficult to categorise. This paper develops a Chinese monolingual sentiment classifier for social media data using a corpus-based approach. It is implemented as a web-based system using the Flask framework, built with front-end and back-end languages including HTML, CSS, and Python. A large corpus of Chinese social media texts is collected, and an open-source dataset with sentiment labels is then used. The intended outcome is the effective detection and categorisation of sentiment expressions in Chinese text. The Chinese sentiment classifier for social media texts achieved an accuracy of 90%. The Support Vector Machine model is implemented within the Flask-based web application. Although the classifier is proficient at identifying positive and negative sentiment, it needs further refinement to interpret neutral tones and complex expressions. © 2024 IEEE.
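The summary does not give the paper's features, training procedure, or data, so the following is only a minimal sketch of the general idea of a linear Support Vector Machine trained on a labelled Chinese corpus. Everything in it is an assumption made for illustration: the character-bigram features, the plain sub-gradient training loop, the hyperparameters, the function names, and the six toy sentences. It uses only the Python standard library and is not the authors' implementation.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus (not from the paper): (text, label), +1 positive, -1 negative.
TOY_SAMPLES = [
    ("这个产品很好", 1),
    ("服务很好我喜欢", 1),
    ("非常满意", 1),
    ("质量太差了", -1),
    ("体验糟糕透了", -1),
    ("太差了不推荐", -1),
]


def char_bigrams(text):
    """Count overlapping character bigrams, so no Chinese word segmenter is needed."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def score(weights, text):
    """Linear decision value w . x for one text; unseen bigrams contribute 0."""
    return sum(weights.get(f, 0.0) * c for f, c in char_bigrams(text).items())


def train_linear_svm(samples, epochs=20, lr=0.1, reg=0.01):
    """Sub-gradient descent on the L2-regularised hinge loss (no bias term)."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for text, label in samples:
            feats = char_bigrams(text)
            margin = label * sum(weights[f] * c for f, c in feats.items())
            # Regularisation step: shrink every weight slightly.
            for f in list(weights):
                weights[f] *= 1 - lr * reg
            # Hinge-loss step: only samples inside the margin move the weights.
            if margin < 1:
                for f, c in feats.items():
                    weights[f] += lr * label * c
    return dict(weights)


def predict(weights, text):
    """Two-class decision; a score of 0 (no known bigrams) falls to 'negative'."""
    return "positive" if score(weights, text) > 0 else "negative"


if __name__ == "__main__":
    model = train_linear_svm(TOY_SAMPLES)
    print(predict(model, "很好"))    # positive
    print(predict(model, "太差了"))  # negative
```

This sketch has only two classes, so it has no way to express a neutral result, which mirrors the limitation the summary reports. In a web application of the kind described, a function like the hypothetical `predict` above would typically be called from a request handler that receives the submitted text.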
DOI:10.1109/AiDAS63860.2024.10730147