MyDAS Corpus: Malay Social Media Texts for Detecting Depression, Anxiety, and Stress on Facebook


Bibliographic Details
Published in: 2024 5th International Conference on Artificial Intelligence and Data Sciences, AiDAS 2024 - Proceedings
Main Author: Ahmad Z.; Mohamed A.; Conway M.; Zakaria R.; Teo N.H.I.; Maskat R.
Format: Conference paper
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2024
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85209658607&doi=10.1109%2fAiDAS63860.2024.10730385&partnerID=40&md5=abcccc169bf9db99d4e2264e40ef4e41
Scopus ID: 2-s2.0-85209658607
DOI: 10.1109/AiDAS63860.2024.10730385
The application of Natural Language Processing (NLP) in mental health monitoring has significantly expanded; however, the specific challenges of interpreting Depression, Anxiety, and Stress (DAS) in Malay language social media texts have not been adequately addressed. This gap underscores the need for NLP solutions that are sensitised to the linguistic and cultural specificities of Malay-speaking populations. This study develops and validates a specialised Malay language corpus from social media content, targeting DAS. Utilising a hybrid ground truth strategy that integrates self-reports with expert assessments, the research offers methodological refinements in the analysis of Malay linguistic patterns and the deployment of machine learning classifiers to efficiently identify mental health indicators. The paper reviews existing methodologies, outlines a novel corpus development strategy, and discusses classifier performance. The Decision Tree classifier achieved the highest F1 score of 0.75, followed by the Support Vector Machine (SVM) with an F1 score of 0.73, and Random Forest with 0.70. Multinomial Naive Bayes (MNB) and K-Nearest Neighbors (KNN) demonstrated lower performances with F1 scores of 0.55 and 0.52 respectively. Comprehensive analyses using bi-gram networks and t-SNE visualisations explore the nuanced linguistic indicators of mental health states, culminating in a discussion of the implications for future NLP applications in mental health monitoring. © 2024 IEEE.
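
The classifier comparison in the abstract rests on the F1 score, the harmonic mean of precision and recall. The abstract does not say how F1 was averaged across the Depression, Anxiety, and Stress classes, so the sketch below assumes macro-averaging (the unweighted mean of per-class F1). The function name macro_f1 and the toy labels are hypothetical and are not taken from the paper or the MyDAS corpus.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Macro-averaged F1: unweighted mean of the per-class F1 scores.

    Assumption: macro-averaging is used here for illustration only;
    the abstract does not state the averaging scheme.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic
        # mean of precision and recall; defined as 0 when undefined.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Invented toy labels, for illustration only (not corpus data).
y_true = ["depression", "anxiety", "stress", "stress", "anxiety", "depression"]
y_pred = ["depression", "stress", "stress", "stress", "anxiety", "anxiety"]
score = macro_f1(y_true, y_pred)
print(round(score, 2))
```

Under a different averaging scheme (micro or weighted) the same predictions would generally yield a different value, which is why the scheme matters when comparing reported scores.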