Google Scholar

D Yao, Z Li, B Li, C Zhang, H Ma - Expert Systems with Applications, 2024 - Elsevier

Existing cross-modal hash retrieval methods can simultaneously enhance retrieval speed
and reduce storage space. However, these methods face a major challenge in determining …

Cited by 17. All 2 versions.

[PDF] cnr.it

VISIONE at video browser showdown 2023

G Amato, P Bolettieri, F Carrara, F Falchi… - … on multimedia modeling, 2023 - Springer

In this paper, we present the fourth release of VISIONE, a tool for fast and effective video
search on a large-scale dataset. It includes several search functionalities like text search …

Cited by 49. All 7 versions.

[PDF] acm.org

Text-to-motion retrieval: Towards joint understanding of human motion data and natural language

N Messina, J Sedmidubsky, F Falchi… - Proceedings of the 46th …, 2023 - dl.acm.org

Due to recent advances in pose-estimation methods, human motion can be extracted from a
common video in the form of 3D skeleton sequences. Despite wonderful application …

Cited by 13. All 5 versions.

[PDF] acm.org

Towards Retrieval-Augmented Architectures for Image Captioning

S Sarto, M Cornia, L Baraldi, A Nicolosi… - ACM Transactions on …, 2024 - dl.acm.org

The objective of image captioning models is to bridge the gap between the visual and
linguistic modalities by generating natural language descriptions that accurately reflect the …

Cited by 5. All 5 versions.

[PDF] cnr.it

Visione: a large-scale video retrieval system with advanced search functionalities

G Amato, P Bolettieri, F Carrara, F Falchi… - Proceedings of the …, 2023 - dl.acm.org

VISIONE is a large-scale video retrieval system that integrates multiple search
functionalities, including free text search, spatial color and object search, visual and …

Cited by 7. All 4 versions.

[HTML] mdpi.com

Image–Text Matching Model Based on CLIP Bimodal Encoding

Y Zhu, H Xu, A Du, B Wang - Applied Sciences, 2024 - mdpi.com

Image–text matching is a fundamental task in the multimodal research field, connecting
computer vision and natural language processing by aligning visual content with …

Cited by 1. All 2 versions.

[PDF] mdpi.com

Fine-Grained Cross-Modal Semantic Consistency in Natural Conservation Image Data from a Multi-Task Perspective

R Tao, M Zhu, H Cao, H Ren - Sensors, 2024 - mdpi.com

Fine-grained representation is fundamental to species classification based on deep
learning, and in this context, cross-modal contrastive learning is an effective method. The …

Cited by 2. All 9 versions.

[PDF] acm.org

VISIONE for newbies: an easier-to-use video retrieval system

G Amato, P Bolettieri, F Carrara, F Falchi… - Proceedings of the 20th …, 2023 - dl.acm.org

This paper presents a revised version of the VISIONE video retrieval system, which offers a
wide range of search functionalities, including free text search, spatial color and object …

Cited by 6. All 4 versions.

[PDF] springer.com

Cascaded transformer-based networks for wikipedia large-scale image-caption matching

N Messina, DA Coccomini, A Esuli, F Falchi - Multimedia Tools and …, 2024 - Springer

With the increasing importance of multimedia and multilingual data in online encyclopedias,
novel methods are needed to fill domain gaps and automatically connect different modalities …

All 5 versions.

[PDF] ieee.org

Evaluating Performance and Trends in Interactive Video Retrieval: Insights from the 12th VBS Competition

L Vadicamo, R Arnold, W Bailer, F Carrara… - IEEE …, 2024 - ieeexplore.ieee.org

This paper conducts a thorough examination of the 12th Video Browser Showdown (VBS)
competition, a well-established international benchmarking campaign for interactive video …

Cited by 12. All 2 versions.

Aladin: distilling fine-grained alignment scores for efficient image-text matching and retrieval

Similarity Graph-correlation Reconstruction Network for unsupervised cross-modal hashing
