A bi-annotated Malay-English code-switching (Manglish) dataset of X posts for biological gender identification and authorship attribution
Low-resource languages, like Malay, face the threat of extinction when linguistic resources become scarce. This paper addresses the scarcity issue by contributing to the inventory of low-resource languages, specifically focusing on Malay-English, known as Manglish. Manglish speakers are primarily lo...
Published in: | Data in Brief |
---|---|
Main Author: | Maskat R.; Azman N.A.; Nulizairos N.S.S.; Zahidin N.A.; Mahadi A.H.; Norshamsul S.R.; Sharif M.M.M.; Mahdin H. |
Format: | Data paper |
Language: | English |
Published: | Elsevier Inc. 2024 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85182445049&doi=10.1016%2fj.dib.2024.110034&partnerID=40&md5=9f0750fb9318c39376abd8451a361b9e |
Similar Items
- A bi-annotated Malay-English code-switching (Manglish) dataset of X posts for biological gender identification and authorship attribution
  by: Maskat, et al.
  Published: (2024)
- Explaining the Diversity in Malay-English Code-Switching Patterns: The Contribution of Typological Similarity and Bilingual Optimization Strategies
  by: Treffers-Daller J.; Majid S.; Thai Y.N.; Flynn N.
  Published: (2022)
- Code-Switching and Code-Mixing in the Practice of Judgement Writing in Malaysia
  by: Md Zolkapli R.B.; Mohamad H.A.; Mohaini M.L.; Wahab N.H.A.; Nath P.R.
  Published: (2022)
- The research's knowledge transfer through co-authorship collaboration
  by: Rahman S.A.; Noordin S.A.; Rahmad F.; Mohamed A.N.; Abdullah H.; Salleh A.A.
  Published: (2017)
- SiulMalaya: an annotated bird audio dataset of Malaysia lowland forest birds for passive acoustic monitoring
  by: Jamil N.; Norali A.N.; Ramli M.I.; Shah A.K.M.K.; Mamat I.
  Published: (2023)