Abstractive text summarization: State of the art, challenges, and improvements

H Shakil, A Farooq, J Kalita - Neurocomputing, 2024 - Elsevier
Specifically focusing on the landscape of abstractive text summarization, as opposed to
extractive techniques, this survey presents a comprehensive overview, delving into state-of …

Summary-oriented vision modeling for multimodal abstractive summarization

Y Liang, F Meng, J Xu, J Wang, Y Chen… - arxiv preprint arxiv …, 2022 - arxiv.org
Multimodal abstractive summarization (MAS) aims to produce a concise summary given the
multimodal data (text and vision). Existing studies mainly focus on how to effectively use the …

Recent trends of multimodal affective computing: A survey from NLP perspective

G Hu, Y Xin, W Lyu, H Huang, C Sun, Z Zhu… - arxiv preprint arxiv …, 2024 - arxiv.org
Multimodal affective computing (MAC) has garnered increasing attention due to its broad
applications in analyzing human behaviors and intentions, especially in text-dominated …

UniSA: Unified generative framework for sentiment analysis

Z Li, TE Lin, Y Wu, M Liu, F Tang, M Zhao… - Proceedings of the 31st …, 2023 - dl.acm.org
Sentiment analysis is a crucial task that aims to understand people's emotional states and
predict emotional categories based on multimodal information. It consists of several …

Multi-task hierarchical heterogeneous fusion framework for multimodal summarization

L Zhang, X Zhang, L Han, Z Yu, Y Liu, Z Li - Information Processing & …, 2024 - Elsevier
With the rise of multimedia content on the internet, Multimodal Summarization has become a
challenging task to help individuals grasp vital information fast. However, previous methods …

Exploiting pseudo image captions for multimodal summarization

C Jiang, R Xie, W Ye, J Sun, S Zhang - arxiv preprint arxiv:2305.05496, 2023 - arxiv.org
Cross-modal contrastive learning in vision language pretraining (VLP) faces the challenge
of (partial) false negatives. In this paper, we study this problem from the perspective of …

DTV: Dual Knowledge Distillation and Target-oriented Vision Modeling for Many-to-Many Multimodal Summarization

Y Liang, F Meng, J Wang, J Xu, Y Chen… - arxiv preprint arxiv …, 2023 - arxiv.org
Many-to-many multimodal summarization (M$^3$S) task aims to generate summaries in
any language with document inputs in any language and the corresponding image …

HyperPELT: Unified parameter-efficient language model tuning for both language and vision-and-language tasks

Z Zhang, W Guo, X Meng, Y Wang, Y Wang… - arxiv preprint arxiv …, 2022 - arxiv.org
The workflow of pretraining and fine-tuning has emerged as a popular paradigm for solving
various NLP and V&L (Vision-and-Language) downstream tasks. With the capacity of …

Evaluating and improving factuality in multimodal abstractive summarization

D Wan, M Bansal - arxiv preprint arxiv:2211.02580, 2022 - arxiv.org
Current metrics for evaluating factuality for abstractive document summarization have
achieved high correlations with human judgment, but they do not account for the vision …

UniMEEC: Towards unified multimodal emotion recognition and emotion cause

G Hu, Z Zhu, D Hershcovich, L Hu, H Seifi… - arxiv preprint arxiv …, 2024 - arxiv.org
Multimodal emotion recognition in conversation (MERC) and multimodal emotion-cause pair
extraction (MECPE) have recently garnered significant attention. Emotions are the …