Automated Generation of Chinese Text-Image Summaries Using Deep Learning Techniques

Bibliographic Details
Published in: TRAITEMENT DU SIGNAL
Main Authors: Xu, Meiling; Abd Rahman, Hayati; Li, Feng
Format: Article
Language: English
Published: INT INFORMATION & ENGINEERING TECHNOLOGY ASSOC 2023
Subjects: Computer Science; Engineering
Online Access: https://www-webofscience-com.uitm.idm.oclc.org/wos/woscc/full-record/WOS:001137494800030
author Xu, Meiling; Abd Rahman, Hayati; Li, Feng
title Automated Generation of Chinese Text-Image Summaries Using Deep Learning Techniques
container_title TRAITEMENT DU SIGNAL
language English
format Article
description In the internet era, Chinese text-image content is produced continuously and in large volumes, which calls for automated technologies to process and summarize it. Automatically generated Chinese text-image summaries let readers grasp key information quickly and so make information consumption more efficient. Because of the distinctive characteristics of the Chinese language, traditional automatic summarization techniques transfer poorly, which motivates text-image summary generation methods tailored to Chinese. Although existing natural language processing and deep learning techniques have advanced text summarization, they still fall short in deep semantic mining and in integrating text and image content. This study focuses on two approaches. The first is a generative approach based on an enhanced MaliGAN model, which uses deep learning to improve the quality of generated text. The second is a retrieval-based approach, which uses cross-modal similarity retrieval to extract the text most relevant to the image content and uses it to guide summary generation. The study also proposes a model architecture comprising segmentation, cross-modal retrieval, and adaptive fusion strategy modules, which the authors report significantly improves the accuracy and reliability of text-image summary generation.
publisher INT INFORMATION & ENGINEERING TECHNOLOGY ASSOC
issn 0765-0019; 1958-5608
publishDate 2023
container_volume 40
container_issue 6
doi_str_mv 10.18280/ts.400644
topic Computer Science; Engineering
topic_facet Computer Science; Engineering
accesstype
id WOS:001137494800030
url https://www-webofscience-com.uitm.idm.oclc.org/wos/woscc/full-record/WOS:001137494800030
record_format wos
collection Web of Science (WoS)
_version_ 1792031127280549888
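
Illustrative note: the description field mentions cross-modal similarity retrieval, that is, selecting the text most relevant to an image. The record does not say how the article computes similarity, so the following is only a minimal sketch under stated assumptions: the image and each candidate sentence are assumed to be already embedded in a shared vector space, cosine similarity is assumed as the scoring function, and the names cosine, retrieve_relevant, image_vec, and sentence_vecs are hypothetical. It is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_relevant(image_vec, sentence_vecs, k=2):
    """Return the indices of the k sentences most similar to the image.

    Assumption: image_vec and every entry of sentence_vecs live in the same
    embedding space; how those embeddings are produced is out of scope here.
    """
    scored = [(cosine(image_vec, vec), idx) for idx, vec in enumerate(sentence_vecs)]
    # Highest similarity first; ties are broken by the original sentence order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Toy vectors chosen by hand purely for illustration (not real embeddings).
image_vec = [1.0, 0.0, 1.0]
sentence_vecs = [
    [1.0, 0.0, 0.9],  # points in nearly the same direction as the image
    [0.0, 1.0, 0.0],  # orthogonal to the image
    [0.5, 0.5, 0.5],  # partially aligned with the image
]
top = retrieve_relevant(image_vec, sentence_vecs, k=2)
print(top)
```

In a pipeline like the one the description outlines, the retrieved sentences would then be passed on to guide summary generation; that step is not sketched here because the record gives no detail about it.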