A bi-annotated Malay-English code-switching (Manglish) dataset of X posts for biological gender identification and authorship attribution
Low-resource languages, like Malay, face the threat of extinction when linguistic resources become scarce. This paper addresses the scarcity issue by contributing to the inventory of low-resource languages, specifically focusing on Malay-English, known as Manglish. Manglish speakers are primarily lo...
Published in: | Data in Brief |
---|---|
Main Author: | |
Format: | Data paper |
Language: | English |
Published: | Elsevier Inc. 2024 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85182445049&doi=10.1016%2fj.dib.2024.110034&partnerID=40&md5=9f0750fb9318c39376abd8451a361b9e |