Summary: | The ability to listen to web content is critical for an improved understanding of web information. The Speech Synthesis Markup Language (SSML) provides a link between the speech engine and HTML-formatted online content; however, it introduces the risks of inconsistency between the SSML and HTML5 content and of incorrect file linking between them. Because HTML5 is the container, annotating the SSML elements directly in the HTML5 is the logical solution to these challenges. In this paper, we therefore propose a methodology called the Position-based Markup Language Annotation Process (PMLAP), which aims to (1) streamline the annotation of SSML elements at specific positions in HTML5 elements through custom data attributes (data-*) and (2) offer web developers a gentle learning curve for the annotation process. The methodology consists of four distinct steps that produce an annotated HTML5 document. This output then serves as the input to the PMLAP transcoder, which extracts the relevant information and generates the corresponding SSML document automatically. We also present the design of the transcoder, which has been implemented in JavaScript. We illustrate the applicability of PMLAP with a running example and then validate the SSML generated by the transcoder using two available tools: ExtendClass Text Compare, a text comparison tool used to check string well-formedness, and AWS Polly TTS, a text-to-speech web service used to check the correctness of the generated speech. The validation results indicate that the transcoder is a viable means of achieving the research goal. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023.
|
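To make the idea of position-based annotation concrete, the sketch below shows one way a transcoder might turn annotated elements into SSML. It is only an illustration under stated assumptions and does not reproduce the PMLAP syntax or the authors' transcoder: the attribute name `data-ssml-break`, its `offset:duration` value format, and the helpers `escapeXml`, `annotateText`, and `toSsml` are all hypothetical. Plain objects stand in for DOM elements so that the example runs without a browser.

```javascript
// Hypothetical annotation format (an assumption, not the PMLAP syntax):
// data-ssml-break="offset:duration" means "insert an SSML <break> after
// `offset` characters of the element's text". In a browser this value
// would be read from element.dataset.ssmlBreak.

// Escape characters that are special in XML text and attribute values.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Insert a <break> at the annotated position; fall back to plain text
// when the annotation is missing or malformed.
function annotateText(text, breakSpec) {
  if (!breakSpec) return escapeXml(text);
  const [offsetStr, duration] = breakSpec.split(":");
  if (!duration) return escapeXml(text);
  const parsed = parseInt(offsetStr, 10) || 0;
  const offset = Math.min(Math.max(parsed, 0), text.length);
  const head = escapeXml(text.slice(0, offset));
  const tail = escapeXml(text.slice(offset));
  return `${head}<break time="${escapeXml(duration)}"/>${tail}`;
}

// Wrap each annotated element in <p> and the whole document in <speak>.
function toSsml(elements) {
  const body = elements
    .map((el) => `<p>${annotateText(el.text, el.dataset.ssmlBreak)}</p>`)
    .join("");
  return `<speak>${body}</speak>`;
}

// Plain objects standing in for annotated HTML5 elements.
const annotated = [
  { text: "Hello, world.", dataset: { ssmlBreak: "6:500ms" } },
  { text: "Plain paragraph.", dataset: {} },
];

const ssml = toSsml(annotated);
console.log(ssml);
```

Keeping the annotation on the HTML5 element itself means the spoken text and the displayed text come from a single source, which is the consistency argument the summary makes for embedding SSML in HTML5.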