Google Scholar

CLVIN: Complete language-vision interaction network for visual question answering

C Chen, D Han, X Shen - Knowledge-Based Systems, 2023 - Elsevier

The emergence of the Transformer optimizes the interactive modeling of multimodal
information in visual question answering (VQA) tasks, helping machines better understand …

Cited by 90

[PDF] arxiv.org

SGUIE-Net: Semantic attention guided underwater image enhancement with multi-scale perception

Q Qi, K Li, H Zheng, X Gao, G Hou… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Due to the wavelength-dependent light attenuation, refraction and scattering, underwater
images usually suffer from color distortion and blurred details. However, due to the limited …

Cited by 142

[PDF] aber.ac.uk

Region-object relation-aware dense captioning via transformer

Z Shao, J Han, D Marnerides… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Dense captioning provides detailed captions of complex visual scenes. While a number of
successes have been achieved in recent years, there are still two broad limitations: 1) most …

Cited by 137

Deep fuzzy hashing network for efficient image retrieval

H Lu, M Zhang, X Xu, Y Li… - IEEE transactions on fuzzy …, 2020 - ieeexplore.ieee.org

Hashing methods for efficient image retrieval aim at learning hash functions that map similar
images to semantically correlated binary codes in the Hamming space with similarity well …

Cited by 322

[PDF] thecvf.com

Fashionvlp: Vision language transformer for fashion retrieval with feedback

S Goenka, Z Zheng, A Jaiswal… - Proceedings of the …, 2022 - openaccess.thecvf.com

Fashion image retrieval based on a query pair of reference image and natural language
feedback is a challenging task that requires models to assess fashion related information …

Cited by 102

[PDF] arxiv.org

An overview of recent work in media forensics: Methods and threats

K Bhagtani, AKS Yadav, ER Bartusiak, Z Xiang… - arxiv preprint arxiv …, 2022 - arxiv.org

In this paper, we review recent work in media forensics for digital images, video, audio
(specifically speech), and documents. For each data modality, we discuss synthesis and …

Cited by 34

[PDF] arxiv.org

Towards multi-modal sarcasm detection via hierarchical congruity modeling with knowledge enhancement

H Liu, W Wang, H Li - arxiv preprint arxiv:2210.03501, 2022 - arxiv.org

Sarcasm is a linguistic phenomenon indicating a discrepancy between literal meanings and
implied intentions. Due to its sophisticated nature, it is usually challenging to be detected …

Cited by 82

[PDF] arxiv.org

Nearest neighbor-based contrastive learning for hyperspectral and LiDAR data classification

M Wang, F Gao, J Dong, HC Li… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

The joint hyperspectral image (HSI) and light detection and ranging (LiDAR) data
classification aims to interpret ground objects at more detailed and precise level. Although …

Cited by 68

Sarcasm driven by sentiment: A sentiment-aware hierarchical fusion network for multimodal sarcasm detection

H Liu, R Wei, G Tu, J Lin, C Liu, D Jiang - Information Fusion, 2024 - Elsevier

Sarcasm is a form of sentiment expression that highlights the disparity between a person's
true intentions and the content they explicitly present. With the exponential increase in …

Cited by 19

[PDF] thecvf.com

Cosmo: Content-style modulation for image retrieval with text feedback

S Lee, D Kim, B Han - … of the IEEE/CVF Conference on …, 2021 - openaccess.thecvf.com

We tackle the task of image retrieval with text feedback, where a reference image and
modifier text are combined to identify the desired target image. We focus on designing an …

Cited by 134

Cross-modal attention with semantic consistence for image–text matching

CLVIN: Complete language-vision interaction network for visual question answering
