Self-supervised opinion summarization with multi-modal knowledge graph

L Jin, J Chen - Journal of Intelligent Information Systems, 2024 - Springer
Multi-modal opinion summarization aims at automatically generating summaries of products
or businesses from multi-modal reviews containing text, image and table to present clear …

SMSMO: Learning to generate multimodal summary for scientific papers

X Zhong, Z Tan, S Gao, J Li, J Shen, J Ji, J Tang… - Knowledge-Based …, 2025 - Elsevier
Nowadays, publishers like Elsevier increasingly use graphical abstracts (i.e., a pictorial paper
summary) along with textual abstracts to facilitate scientific paper readings. In such a case …

DIUSum: Dynamic Image Utilization for Multimodal Summarization

M Xiao, J Zhu, F Zhai, Y Zhou, C Zong - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Existing multimodal summarization approaches focus on fusing image features in the
encoding process, ignoring the individualized needs for images when generating different …

Towards Informative Open-ended Text Generation with Dynamic Knowledge Triples

Z Ren, Y Zhao, C Zong - Findings of the Association for …, 2023 - aclanthology.org
Pretrained language models (PLMs), especially large language models (LLMs) demonstrate
impressive capabilities in open-ended text generation. While our statistical results show that …

Exploring the Trade-Off within Visual Information for MultiModal Sentence Summarization

M Yuan, S Cui, X Zhang, S Wang, H Xu… - Proceedings of the 47th …, 2024 - dl.acm.org
MultiModal Sentence Summarization (MMSS) aims to generate a brief summary based on
the given source sentence and its associated image. Previous studies on MMSS have …

Multimodal summarization with modality features alignment and features filtering

B Tang, B Lin, Z Chang, S Li - Neurocomputing, 2024 - Elsevier
Previous studies about MultiModal Summarization (MMS) mainly focus on effective
selection and filtering of visual features to assist in cross-modal fusion and text-based …

Enhancing Large Language Models for Scientific Multimodal Summarization with Multimodal Output

Z Tan, X Zhong, JY Ji, W Jiang… - Proceedings of the 31st …, 2025 - aclanthology.org
The increasing integration of multimedia such as videos and graphical abstracts in scientific
publications necessitates advanced summarization techniques. This paper introduces Uni …

Multization: Multi-Modal Summarization Enhanced by Multi-Contextually Relevant and Irrelevant Attention Alignment

H Rong, Z Chen, Z Lu, F Xu, VS Sheng - ACM Transactions on Asian and …, 2024 - dl.acm.org
This article focuses on the task of Multi-Modal Summarization with Multi-Modal Output for
China JD.COM e-commerce product description containing both source text and source …

Visual Enhanced Entity-Level Interaction Network for Multimodal Summarization

H Yan, B Tang, B Lin, G Zhao, S Li - Findings of the Association for …, 2024 - aclanthology.org
MultiModal Summarization (MMS) aims to generate a concise summary based on
multimodal data like texts and images and has wide application in multimodal fields …

CGSMP: Controllable Generative Summarization via Multimodal Prompt

Q Yong, J Wei, YR Zhang, XL Zhang, C Wei… - Proceedings of the 1st …, 2023 - dl.acm.org
Natural Language Generation (NLG) has improved exponentially in recent years thanks to
the development of a large language model (LLM); this advancement has resulted in more …