A comprehensive survey on applications of transformers for deep learning tasks
Abstract Transformers are Deep Neural Networks (DNN) that utilize a self-attention
mechanism to capture contextual relationships within sequential data. Unlike traditional …
mechanism to capture contextual relationships within sequential data. Unlike traditional …
Knowledge graphs meet multi-modal learning: A comprehensive survey
Knowledge Graphs (KGs) play a pivotal role in advancing various AI applications, with the
semantic web community's exploration into multi-modal dimensions unlocking new avenues …
semantic web community's exploration into multi-modal dimensions unlocking new avenues …
A Fine‐Tuned BERT‐Based Transfer Learning Approach for Text Classification
Text Classification problem has been thoroughly studied in information retrieval problems
and data mining tasks. It is beneficial in multiple tasks including medical diagnose health …
and data mining tasks. It is beneficial in multiple tasks including medical diagnose health …
Knowledge graph augmented network towards multiview representation learning for aspect-based sentiment analysis
Aspect-based sentiment analysis (ABSA) is a fine-grained task of sentiment analysis. To
better comprehend long complicated sentences and obtain accurate aspect-specific …
better comprehend long complicated sentences and obtain accurate aspect-specific …
Vision-language pre-training for multimodal aspect-based sentiment analysis
As an important task in sentiment analysis, Multimodal Aspect-Based Sentiment Analysis
(MABSA) has attracted increasing attention in recent years. However, previous approaches …
(MABSA) has attracted increasing attention in recent years. However, previous approaches …
Few-shot adaptation of multi-modal foundation models: A survey
F Liu, T Zhang, W Dai, C Zhang, W Cai, X Zhou… - Artificial Intelligence …, 2024 - Springer
Abstract Multi-modal (vision-language) models, such as CLIP, are replacing traditional
supervised pre-training models (eg, ImageNet-based pre-training) as the new generation of …
supervised pre-training models (eg, ImageNet-based pre-training) as the new generation of …
Multi-source semantic graph-based multimodal sarcasm explanation generation
Multimodal Sarcasm Explanation (MuSE) is a new yet challenging task, which aims to
generate a natural language sentence for a multimodal social post (an image as well as its …
generate a natural language sentence for a multimodal social post (an image as well as its …
Summary-oriented vision modeling for multimodal abstractive summarization
Multimodal abstractive summarization (MAS) aims to produce a concise summary given the
multimodal data (text and vision). Existing studies mainly focus on how to effectively use the …
multimodal data (text and vision). Existing studies mainly focus on how to effectively use the …
Unisa: Unified generative framework for sentiment analysis
Sentiment analysis is a crucial task that aims to understand people's emotional states and
predict emotional categories based on multimodal information. It consists of several …
predict emotional categories based on multimodal information. It consists of several …
A survey on knowledge-enhanced multimodal learning
Multimodal learning has been a field of increasing interest, aiming to combine various
modalities in a single joint representation. Especially in the area of visiolinguistic (VL) …
modalities in a single joint representation. Especially in the area of visiolinguistic (VL) …