Chinese Monolingual Sentiment Classifier for Social Media Data Using Corpus-Based Approach
Sentiment analysis, sometimes known as opinion mining, is a technique or process for finding and obtaining information on a certain topic, such as opinions and attitudes, from both written and spoken language. Most of the netizens in Malaysia have unique dialects or slang and commonly use abbreviati...
Published in: | 2024 5th International Conference on Artificial Intelligence and Data Sciences, AiDAS 2024 - Proceedings |
---|---|
Main Author: | Sia Abdullah N.A.; Low Cheng Cheng S.C.; Rosli M.M. |
Format: | Conference paper |
Language: | English |
Published: | Institute of Electrical and Electronics Engineers Inc. 2024 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85209641239&doi=10.1109%2fAiDAS63860.2024.10730147&partnerID=40&md5=c9635986bd3ad579211ae17ce2106921 |
id |
2-s2.0-85209641239 |
---|---|
author |
Sia Abdullah N.A.; Low Cheng Cheng S.C.; Rosli M.M. |
title |
Chinese Monolingual Sentiment Classifier for Social Media Data Using Corpus-Based Approach |
publishDate |
2024 |
container_title |
2024 5th International Conference on Artificial Intelligence and Data Sciences, AiDAS 2024 - Proceedings |
container_volume |
|
container_issue |
|
doi_str_mv |
10.1109/AiDAS63860.2024.10730147 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85209641239&doi=10.1109%2fAiDAS63860.2024.10730147&partnerID=40&md5=c9635986bd3ad579211ae17ce2106921 |
description |
Sentiment analysis, sometimes known as opinion mining, is a technique for finding and extracting information on a given topic, such as opinions and attitudes, from written and spoken language. Most netizens in Malaysia use distinctive dialects or slang and commonly abbreviate when communicating. This makes the exchange of information more efficient but makes the text messages hard to capture and, as a result, their polarity more difficult to categorise. This paper develops a Chinese monolingual sentiment classifier for social media data using a corpus-based approach. It is implemented as a web-based system built on the Flask framework, with a front end in HTML and CSS and a back end in Python. A large corpus of Chinese texts from social media is collected, and an open-source dataset with sentiment labels is then utilised. The intended outcome is to detect and categorise sentiment expressions in Chinese text effectively. The classifier achieved 90% accuracy on Chinese social media texts. The Support Vector Machine model is deployed within the Flask-based web application. Although the classifier is proficient at determining positive and negative sentiments, it still needs additional refinement to interpret neutral tones and complex expressions. © 2024 IEEE. |
publisher |
Institute of Electrical and Electronics Engineers Inc. |
issn |
|
language |
English |
format |
Conference paper |
accesstype |
|
record_format |
scopus |
collection |
Scopus |
_version_ |
1820775439325462528 |
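
The abstract describes a corpus-trained Support Vector Machine that labels Chinese social media text as positive or negative. The record does not give the paper's features, training procedure, or data, so the following is only a minimal sketch of the general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the character n-gram features (chosen to avoid needing a word segmenter), the plain subgradient-descent trainer with no bias term, the hyperparameters, the function names, and the eight-sentence toy corpus, which is invented. The Flask web layer mentioned in the abstract is left out to keep the sketch dependency-free.

```python
from collections import Counter


def char_ngrams(text, n_max=2):
    """Count character 1..n_max-grams (assumed features, no word segmentation)."""
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(text) - n + 1):
            feats[text[i:i + n]] += 1
    return feats


def train_linear_svm(samples, epochs=200, lr=0.01, reg=0.01):
    """Fit a linear SVM by subgradient descent on the L2-regularised hinge loss."""
    weights = {}
    for _ in range(epochs):
        for text, label in samples:  # label is +1 (positive) or -1 (negative)
            feats = char_ngrams(text)
            score = sum(weights.get(f, 0.0) * v for f, v in feats.items())
            # The L2 penalty shrinks every weight on every step.
            for f in weights:
                weights[f] *= 1.0 - lr * reg
            # The hinge term contributes only when the sample is inside the margin.
            if label * score < 1.0:
                for f, v in feats.items():
                    weights[f] = weights.get(f, 0.0) + lr * label * v
    return weights


def predict(weights, text):
    """Return 'positive' or 'negative' from the sign of the linear score."""
    feats = char_ngrams(text)
    score = sum(weights.get(f, 0.0) * v for f, v in feats.items())
    return "positive" if score > 0 else "negative"


# Hypothetical toy corpus, invented for this sketch (not the paper's data).
TOY_CORPUS = [
    ("这个产品很好", 1),
    ("这个产品很差", -1),
    ("我喜欢它", 1),
    ("我讨厌它", -1),
    ("服务太棒了", 1),
    ("服务太糟了", -1),
    ("非常满意", 1),
    ("非常失望", -1),
]

weights = train_linear_svm(TOY_CORPUS)
print(predict(weights, "这个产品很好"))
```

A two-way decision rule like this has no neutral class, which is consistent with the limitation the abstract reports for neutral tones. In practice a library SVM and a much larger labelled corpus would replace the hand-written trainer and toy data.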