- Academic Search

Cross-Modal Retrieval: A Review of Methodologies, Datasets, and Future Perspectives

Z Han, A Azman, MR Mustaffa, FB Khalid - IEEE Access, 2024 - ieeexplore.ieee.org

With the rapid development of science and technology, all types of mixed media contain
large amounts of data. Traditional single multimedia data can no longer satisfy daily …

Cited by 3

[PDF] aaai.org

Multi-modal knowledge hypergraph for diverse image retrieval

Y Zeng, Q Jin, T Bao, W Li - Proceedings of the AAAI Conference on …, 2023 - ojs.aaai.org

The task of keyword-based diverse image retrieval has received considerable attention due
to its wide demand in real-world scenarios. Existing methods either rely on a multi-stage re …

Cited by 22

Temporally language grounding with multi-modal multi-prompt tuning

Y Zeng, N Han, K Pan, Q Jin - IEEE Transactions on Multimedia, 2023 - ieeexplore.ieee.org

The task of temporally language grounding (TLG), aiming to locate a video moment within
an untrimmed video that matches a given textual query, has attracted considerable research …

Cited by 7

[PDF] acm.org

LLM-enhanced Composed Image Retrieval: An Intent Uncertainty-aware Linguistic-Visual Dual Channel Matching Model

H Ge, Y Jiang, J Sun, K Yuan, Y Liu - ACM Transactions on Information …, 2024 - dl.acm.org

Composed image retrieval (CoIR) involves a multi-modal query of the reference image and
modification text describing the desired changes, allowing users to express image retrieval …

Cited by 2

[PDF] archive.org

Point prompt tuning for temporally language grounding

Y Zeng - Proceedings of the 45th international ACM SIGIR …, 2022 - dl.acm.org

The task of temporally language grounding (TLG) aims to locate a video moment from an
untrimmed video that matches a given textual query, which has attracted considerable …

Cited by 18

[PDF] ssrn.com

Contrastive topic-enhanced network for video captioning

Y Zeng, Y Wang, D Liao, G Li, J Xu, H Man… - Expert Systems with …, 2024 - Elsevier

In the field of video captioning, recent works usually focus on multi-modal video content
understanding, in which transcripts are extracted from speech and are often adopted as an …

Cited by 6

Probabilistic keyphrase generation from copy and generating spaces

Y Yao, P Yang, G Zhao, Y Ge… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Keyphrase generation is one of the most fundamental tasks in natural language processing
(NLP). Most existing works on keyphrase generation mainly focus on using holistic …

Cited by 1

[PDF] arxiv.org

Data-driven knowledge fusion for deep multi-instance learning

YX Zhang, Z Zhou, X He, AR Adhikary… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Multi-instance learning (MIL) is a widely applied technique in practical applications that
involve complex data structures. MIL can be broadly categorized into two types: traditional …

Cited by 4

[PDF] mdpi.com

Enhancing document image retrieval in education: Leveraging ensemble-based document image retrieval systems for improved precision

YI Alzoubi, AE Topcu, E Ozdemir - Applied Sciences, 2024 - mdpi.com

Document image retrieval (DIR) systems simplify access to digital data within printed
documents by capturing images. These systems act as bridges between print and digital …

Cited by 2

[PDF] arxiv.org

Globally Correlation-Aware Hard Negative Generation

W Peng, H Huang, T Chen, Q Ke, G Dai… - International Journal of …, 2024 - Springer

Hard negative generation aims to generate informative negative samples that help to
determine the decision boundaries and thus facilitate advancing deep metric learning …

Keyword-based diverse image retrieval with variational multiple instance graph

Cross-Modal Retrieval: A Review of Methodologies, Datasets, and Future Perspectives
