A bi-annotated Malay-English code-switching (Manglish) dataset of X posts for biological gender identification and authorship attribution

Low-resource languages, like Malay, face the threat of extinction when linguistic resources become scarce. This paper addresses the scarcity issue by contributing to the inventory of low-resource languages, specifically focusing on Malay-English, known as Manglish. Manglish speakers are primarily...

Bibliographic Details
Published in: DATA IN BRIEF
Main Authors: Maskat, Ruhaila; Azman, Norazmiera Ayunie; Nulizairos, Nur Shaheera Shastera; Zahidin, Nurul Athirah; Mahadi, Adibah Humairah; Norshamsul, Siti Rubaya; Sharif, Mohd Mukhlis Mohd; Mahdin, Hairulnizam
Format: Article; Data Paper; Early Access
Language: English
Published: ELSEVIER 2024
Online Access: https://www-webofscience-com.uitm.idm.oclc.org/wos/woscc/full-record/WOS:001157110000001