Knowledge-Grounded Attention-Based Neural Machine Translation Model


Bibliographic Details
Published in: Applied Computational Intelligence and Soft Computing
Main Authors: Israr, Huma; Khan, Safdar Abbas; Tahir, Muhammad Ali; Shahzad, Muhammad Khuram; Ahmad, Muneer; Zain, Jasni Mohamad
Format: Article
Language: English
Published: Wiley, 2025
Online Access: https://www-webofscience-com.uitm.idm.oclc.org/wos/woscc/full-record/WOS:001397785100001
Description
Summary: Neural machine translation (NMT) models process sentences in isolation, ignoring contextual or side information beyond the sentence itself. The input text alone often provides limited knowledge for generating contextually correct and meaningful translations, and relying solely on it can yield translations that lack accuracy. Side information related to either the source or the target side is helpful in the context of NMT. In this study, we empirically show that training an NMT model with target-side additional information, used as knowledge, can significantly improve translation quality. The acquired knowledge is leveraged in an encoder-decoder model through a multi-encoder framework: an additional encoder converts the knowledge into a dense semantic representation via attention, and the attention over the input sentence and the attention over the additional knowledge are combined into a unified attention. The decoder then generates the translation conditioned on both the input text and the acquired knowledge. Evaluation of Urdu-to-English translation in a low-resource setting yields promising results in terms of both perplexity reduction and improved BLEU scores: the proposed models outperform their LSTM and GRU attention-based counterparts by +3.1 and +2.9 BLEU, respectively. Extensive analysis supports our claim that translations influenced by the additional information can contain rare, low-frequency words and be more faithful. Experimental results on a different language pair (DE-EN) demonstrate that the proposed method is efficient and general.
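
To make the described architecture concrete, below is a minimal PyTorch sketch of the multi-encoder idea from the abstract: one encoder reads the source sentence, a second encoder reads the target-side knowledge, and the decoder fuses the two attention contexts into a unified context at each step. This is an illustrative reconstruction, not the authors' code; the module name DualEncoderAttentionDecoder, the dot-product attention, the gated fusion, and all dimensions are assumptions, since the abstract only states that the two attentions are "combined into a unified attention".

# Illustrative sketch of a multi-encoder NMT model with unified attention.
# All names, sizes, and the gating-based fusion are assumptions for
# exposition; the paper does not specify its exact combination method.
import torch
import torch.nn as nn


class DualEncoderAttentionDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Separate encoders for the source sentence and the side knowledge.
        self.src_encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.know_encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRUCell(emb_dim + hid_dim, hid_dim)
        # Gate that mixes the two attention contexts into a unified one
        # (one plausible fusion; assumed, not taken from the paper).
        self.gate = nn.Linear(2 * hid_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    @staticmethod
    def attend(query, keys):
        # Dot-product attention: query (B, H), keys (B, T, H) -> context (B, H).
        scores = torch.bmm(keys, query.unsqueeze(2)).squeeze(2)  # (B, T)
        weights = torch.softmax(scores, dim=1)
        return torch.bmm(weights.unsqueeze(1), keys).squeeze(1)

    def forward(self, src, knowledge, tgt):
        src_states, _ = self.src_encoder(self.embed(src))          # (B, Ts, H)
        know_states, _ = self.know_encoder(self.embed(knowledge))  # (B, Tk, H)
        hidden = src_states.mean(dim=1)  # simple decoder initialization
        logits = []
        for t in range(tgt.size(1)):
            ctx_src = self.attend(hidden, src_states)
            ctx_know = self.attend(hidden, know_states)
            # Unified attention context conditioned on both inputs.
            g = torch.sigmoid(self.gate(torch.cat([ctx_src, ctx_know], dim=1)))
            context = g * ctx_src + (1 - g) * ctx_know
            hidden = self.decoder(
                torch.cat([self.embed(tgt[:, t]), context], dim=1), hidden
            )
            logits.append(self.out(hidden))
        return torch.stack(logits, dim=1)  # (B, Tt, vocab)

With src, knowledge, and tgt given as LongTensors of token IDs, forward returns per-step vocabulary logits suitable for teacher-forced training with cross-entropy loss.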
ISSN: 1687-9724, 1687-9732
DOI: 10.1155/acis/6234949