Summary: | Recognizing emotions from natural or spontaneous speech is extremely difficult compared to doing the same for acted or elicited speech. Speech emotion recognition for real conversation, such as spontaneous speech, requires the linguistic information of the speech to be included in the recognition component to achieve a high recognition rate. However, for an under-resourced language that lacks digital speech resources, this requirement poses a problem. In this paper, speech emotion recognition of spontaneous speech in the Malay language using prosodic features and a Random Forest classifier is presented. We also investigate the influence of age, categorized as children, young adults and middle-aged, on emotion recognition. Ninety spontaneous speech sentences from 30 native speakers of the Malay language are collected and classified into three emotions: happy, angry and sad. Results show that the spontaneous speech of the middle-aged group achieved the highest accuracy rate, followed by the children group and finally the young adults. While sad emotions are recognized satisfactorily across all age groups, confusions exist between happy and angry emotions. © 2017 IEEE.
|