A comprehensive survey on applications of transformers for deep learning tasks

S Islam, H Elmekki, A Elsebai, J Bentahar… - Expert Systems with …, 2024 - Elsevier
Abstract Transformers are Deep Neural Networks (DNNs) that utilize a self-attention
mechanism to capture contextual relationships within sequential data. Unlike traditional …
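The snippet above names the self-attention mechanism the survey covers; below is a minimal single-head scaled dot-product sketch in NumPy. The array names, shapes, and single-head formulation are illustrative assumptions, not code from the survey.

```python
# Minimal single-head scaled dot-product self-attention (illustrative sketch,
# not taken from the survey). Variable names and shapes are assumptions.
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_q, w_k, w_v: (d_model, d_k) projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # queries, keys, values
    scores = q @ k.T / np.sqrt(k.shape[-1])          # scaled pairwise compatibilities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ v                               # context-mixed token representations

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                          # 4 tokens, d_model = 8
out = self_attention(x, *(rng.normal(size=(8, 8)) for _ in range(3)))
print(out.shape)                                     # (4, 8)
```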

Knowledge graphs meet multi-modal learning: A comprehensive survey

Z Chen, Y Zhang, Y Fang, Y Geng, L Guo… - arXiv preprint arXiv …, 2024 - arxiv.org
Knowledge Graphs (KGs) play a pivotal role in advancing various AI applications, with the
semantic web community's exploration into multi-modal dimensions unlocking new avenues …

A Fine‐Tuned BERT‐Based Transfer Learning Approach for Text Classification

R Qasim, WH Bangyal, MA Alqarni… - Journal of Healthcare …, 2022 - Wiley Online Library
The text classification problem has been thoroughly studied in information retrieval and
data mining. It is beneficial in multiple tasks, including medical diagnosis, health …
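As a rough illustration of the transfer-learning setup this paper describes, the sketch below runs one fine-tuning step of a pretrained BERT classifier with the Hugging Face Transformers library; the toy texts, label count, and learning rate are placeholder assumptions rather than the paper's configuration.

```python
# Hedged sketch of BERT fine-tuning for text classification (Hugging Face
# Transformers). Toy inputs, num_labels, and hyperparameters are assumptions.
import torch
from transformers import BertTokenizerFast, BertForSequenceClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

texts = ["patient reports persistent chest pain", "routine follow-up, no concerns"]
labels = torch.tensor([1, 0])                        # placeholder class ids

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
loss = model(**batch, labels=labels).loss            # cross-entropy computed internally
loss.backward()
optimizer.step()
print(float(loss))
```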

Knowledge graph augmented network towards multiview representation learning for aspect-based sentiment analysis

Q Zhong, L Ding, J Liu, B Du, H **… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Aspect-based sentiment analysis (ABSA) is a fine-grained task of sentiment analysis. To
better comprehend long complicated sentences and obtain accurate aspect-specific …

Vision-language pre-training for multimodal aspect-based sentiment analysis

Y Ling, J Yu, R Xia - arXiv preprint arXiv:2204.07955, 2022 - arxiv.org
As an important task in sentiment analysis, Multimodal Aspect-Based Sentiment Analysis
(MABSA) has attracted increasing attention in recent years. However, previous approaches …

Few-shot adaptation of multi-modal foundation models: A survey

F Liu, T Zhang, W Dai, C Zhang, W Cai, X Zhou… - Artificial Intelligence …, 2024 - Springer
Abstract Multi-modal (vision-language) models, such as CLIP, are replacing traditional
supervised pre-training models (e.g., ImageNet-based pre-training) as the new generation of …

Multi-source semantic graph-based multimodal sarcasm explanation generation

L Jing, X Song, K Ouyang, M Jia, L Nie - arXiv preprint arXiv:2306.16650, 2023 - arxiv.org
Multimodal Sarcasm Explanation (MuSE) is a new yet challenging task that aims to
generate a natural language sentence for a multimodal social post (an image as well as its …

Summary-oriented vision modeling for multimodal abstractive summarization

Y Liang, F Meng, J Xu, J Wang, Y Chen… - arXiv preprint arXiv …, 2022 - arxiv.org
Multimodal abstractive summarization (MAS) aims to produce a concise summary given the
multimodal data (text and vision). Existing studies mainly focus on how to effectively use the …

Unisa: Unified generative framework for sentiment analysis

Z Li, TE Lin, Y Wu, M Liu, F Tang, M Zhao… - Proceedings of the 31st …, 2023 - dl.acm.org
Sentiment analysis is a crucial task that aims to understand people's emotional states and
predict emotional categories based on multimodal information. It consists of several …

A survey on knowledge-enhanced multimodal learning

M Lymperaiou, G Stamou - Artificial Intelligence Review, 2024 - Springer
Multimodal learning has been a field of increasing interest, aiming to combine various
modalities in a single joint representation. Especially in the area of visiolinguistic (VL) …