A survey on multi-modal summarization
The new era of technology has brought us to the point where it is convenient for people to
share their opinions over an abundance of platforms. These platforms have a provision for …
A general survey on attention mechanisms in deep learning
G Brauwers, F Frasincar - IEEE Transactions on Knowledge …, 2021 - ieeexplore.ieee.org
Attention is an important mechanism that can be employed for a variety of deep learning
models across many different domains and tasks. This survey provides an overview of the …
Good visual guidance makes a better extractor: Hierarchical visual prefix for multimodal entity and relation extraction
Multimodal named entity recognition and relation extraction (MNER and MRE) is a
fundamental and crucial branch in information extraction. However, existing approaches for …
RpBERT: a text-image relation propagation-based BERT model for multimodal NER
Recently multimodal named entity recognition (MNER) has utilized images to improve the
accuracy of NER in tweets. However, most of the multimodal methods use attention …
MNER-QG: An end-to-end MRC framework for multimodal named entity recognition with query grounding
Multimodal named entity recognition (MNER) is a critical step in information extraction,
which aims to detect entity spans and classify them to corresponding entity types given a …
Query prior matters: A MRC framework for multimodal named entity recognition
Multimodal named entity recognition (MNER) is a vision-language task where the system is
required to detect entity spans and corresponding entity types given a sentence-image pair …
Multimodal aspect-based sentiment analysis: a survey of tasks, methods, challenges and future directions
With the development of social media, users increasingly tend to express their sentiments
(broadly including sentiment polarities, emotions and sarcasm, etc.) associated with fine …
Multimodal named entity recognition with image attributes and image knowledge
Multimodal named entity extraction is an emerging task which uses both textual and visual
information to detect named entities and identify their entity types. The existing efforts are …
A large-scale Chinese multimodal NER dataset with speech clues
In this paper, we aim to explore an uncharted territory, which is Chinese multimodal named
entity recognition (NER) with both textual and acoustic contents. To achieve this, we …
UMIE: Unified multimodal information extraction with instruction tuning
Multimodal information extraction (MIE) gains significant attention as the popularity of
multimedia content increases. However, current MIE methods often resort to using task …