Knowledge-Grounded Attention-Based Neural Machine Translation Model

Neural machine translation (NMT) models process sentences in isolation and ignore additional contextual or side information beyond the sentence. The input text alone often provides limited knowledge for generating a contextually correct and meaningful translation, and relying solely on it can yield translations that lack accuracy. Side information related to either the source or the target side is helpful in the context of NMT. In this study, we show empirically that training an NMT model with target-side additional information, used as knowledge, can significantly improve translation quality. The acquired knowledge is leveraged in an encoder-decoder model built on a multi-encoder framework: an additional encoder converts the knowledge into a dense semantic representation over which attention is computed, and the attention derived from the input sentence and that derived from the additional knowledge are combined into a unified attention. The decoder then generates the translation conditioned on both the input text and the acquired knowledge. Evaluation of Urdu-to-English translation in a low-resource setting yields promising results in terms of both reduced perplexity and improved BLEU scores: the proposed models outperform their respective attention-based LSTM and GRU counterparts by +3.1 and +2.9 BLEU, respectively. Extensive analysis confirms that translations influenced by the additional information occasionally contain rare, low-frequency words and are more faithful. Experimental results on a different language pair, German-English (DE-EN), demonstrate that the proposed method is efficient and general.
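The abstract describes a multi-encoder design: one encoder reads the input sentence, a second encoder reads the target-side knowledge, the two attention contexts are merged into a unified attention, and the decoder conditions on both. The record carries no implementation details, so the sketch below is a minimal, hypothetical PyTorch rendering of that idea; the module names, dimensions, the dot-product attention, and the learned gate used to unify the two contexts are illustrative assumptions rather than the authors' actual architecture.

```python
# Minimal sketch of the multi-encoder, unified-attention idea from the abstract.
# All names, sizes, and the gating scheme are illustrative assumptions; the
# record does not specify the authors' exact design.
import torch
import torch.nn as nn


class KnowledgeGroundedDecoderStep(nn.Module):
    """One decoder step attending over a source encoder and a knowledge encoder."""

    def __init__(self, hidden=256):
        super().__init__()
        self.src_encoder = nn.GRU(hidden, hidden, batch_first=True)  # input sentence
        self.kn_encoder = nn.GRU(hidden, hidden, batch_first=True)   # side knowledge
        self.decoder_cell = nn.GRUCell(2 * hidden, hidden)
        self.gate = nn.Linear(2 * hidden, 1)  # learned mix of the two contexts

    @staticmethod
    def attend(query, keys):
        # Dot-product attention: query (B, H), keys (B, T, H) -> context (B, H).
        scores = torch.bmm(keys, query.unsqueeze(2)).squeeze(2)      # (B, T)
        weights = torch.softmax(scores, dim=1)
        return torch.bmm(weights.unsqueeze(1), keys).squeeze(1)      # (B, H)

    def forward(self, src_emb, kn_emb, prev_emb, prev_state):
        src_keys, _ = self.src_encoder(src_emb)  # encode the input sentence
        kn_keys, _ = self.kn_encoder(kn_emb)     # encode the additional knowledge
        c_src = self.attend(prev_state, src_keys)
        c_kn = self.attend(prev_state, kn_keys)
        # Unify the two attentional contexts with a learned scalar gate.
        g = torch.sigmoid(self.gate(torch.cat([c_src, c_kn], dim=1)))
        context = g * c_src + (1 - g) * c_kn
        # The decoder conditions on the previous token and the unified context.
        return self.decoder_cell(torch.cat([prev_emb, context], dim=1), prev_state)


if __name__ == "__main__":
    B, Ts, Tk, H = 2, 7, 5, 256
    step = KnowledgeGroundedDecoderStep(H)
    state = step(torch.randn(B, Ts, H), torch.randn(B, Tk, H),
                 torch.randn(B, H), torch.randn(B, H))
    print(state.shape)  # torch.Size([2, 256])
```

The scalar gate is only one plausible way to form a "unified attention"; concatenating the two contexts or summing the attention distributions would fit the abstract's description equally well.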


Bibliographic Details
Published in: APPLIED COMPUTATIONAL INTELLIGENCE AND SOFT COMPUTING, Volume 2025, Issue 1
Main Authors: Israr, Huma; Khan, Safdar Abbas; Tahir, Muhammad Ali; Shahzad, Muhammad Khuram; Ahmad, Muneer; Zain, Jasni Mohamad
Format: Article
Language: English
Published: WILEY, 2025
ISSN: 1687-9724; 1687-9732
DOI: 10.1155/acis/6234949
Subjects: Computer Science
Access Type: Gold open access
Record ID: WOS:001397785100001 (Web of Science)
Online Access: https://www-webofscience-com.uitm.idm.oclc.org/wos/woscc/full-record/WOS:001397785100001