A bi-annotated Malay-English code-switching (Manglish) dataset of X posts for biological gender identification and authorship attribution

Low-resource languages, like Malay, face the threat of extinction when linguistic resources become scarce. This paper addresses the scarcity issue by contributing to the inventory of low-resource languages, specifically focusing on Malay-English, known as Manglish. Manglish speakers are primarily lo...

Bibliographic Details
Published in: Data in Brief
Main Authors: Maskat R.; Azman N.A.; Nulizairos N.S.S.; Zahidin N.A.; Mahadi A.H.; Norshamsul S.R.; Sharif M.M.M.; Mahdin H.
Format: Data paper
Language: English
Published: Elsevier Inc. 2024
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85182445049&doi=10.1016%2fj.dib.2024.110034&partnerID=40&md5=9f0750fb9318c39376abd8451a361b9e