Google Scholar

Cross-modal retrieval: a systematic review of methods and future directions

T Wang, F Li, L Zhu, J Li, Z Zhang… - Proceedings of the …, 2025 - ieeexplore.ieee.org

With the exponential surge in diverse multimodal data, traditional unimodal retrieval
methods struggle to meet the needs of users seeking access to data across various …‏

Cited by 24 - All 3 versions

A survey of audio-based music classification and annotation‏

Z Fu, G Lu, KM Ting, D Zhang - IEEE transactions on …, 2010‏ - ieeexplore.ieee.org‏

Music information retrieval (MIR) is an emerging research area that receives growing
attention from both the research community and music industry. It addresses the problem of …‏

Cited by 632 - All 8 versions

[PDF] arxiv.org

Use what you have: Video retrieval using representations from collaborative experts‏

Y Liu, S Albanie, A Nagrani, A Zisserman - arXiv preprint arXiv …, 2019 - arxiv.org

The rapid growth of video on the internet has made searching for video content using natural
language queries a significant challenge. Human-generated queries for video datasets in the …

Cited by 463 - All 10 versions

[PDF] arxiv.org

Learning audio-video modalities from image captions‏

A Nagrani, PH Seo, B Seybold, A Hauth… - … on Computer Vision, 2022‏ - Springer‏

There has been a recent explosion of large-scale image-text datasets, as images with alt-
text captions can be easily obtained online. Obtaining large-scale, high quality data for video …‏

Cited by 97 - All 8 versions

[PDF] arxiv.org

Audio retrieval with natural language queries: A benchmark study‏

AS Koepke, AM Oncescu, JF Henriques… - IEEE Transactions …, 2022‏ - ieeexplore.ieee.org‏

The objectives of this work are cross-modal text-audio and audio-text retrieval, in which the
goal is to retrieve the audio content from a pool of candidates that best matches a given …‏

Cited by 117 - All 10 versions

[PDF] ucl.ac.uk

DeepEar: robust smartphone audio sensing in unconstrained acoustic environments using deep learning

ND Lane, P Georgiev, L Qendro - … of the 2015 ACM international joint …, 2015‏ - dl.acm.org‏

Microphones are remarkably powerful sensors of human behavior and context. However,
audio sensing is highly susceptible to wild fluctuations in accuracy when used in diverse …‏

Cited by 429 - All 7 versions

[PDF] kent.ac.uk

Robust sound event classification using deep neural networks‏

I McLoughlin, H Zhang, Z Xie, Y Song… - IEEE/ACM Transactions …, 2015 - ieeexplore.ieee.org

The automatic recognition of sound events by computers is an important aspect of emerging
applications such as automated surveillance, machine hearing and auditory scene …‏

Cited by 315 - All 6 versions

[PDF] arxiv.org

Audio retrieval with natural language queries‏

AM Oncescu, A Koepke, JF Henriques, Z Akata… - arXiv preprint arXiv …, 2021 - arxiv.org

We consider the task of retrieving audio using free-form natural language queries. To study
this problem, which has received limited attention in the existing literature, we introduce …‏

Cited by 90 - All 13 versions

[PDF] thecvf.com

Improving cross-modal retrieval with set of diverse embeddings‏

D Kim, N Kim, S Kwak - … of the IEEE/CVF Conference on …, 2023‏ - openaccess.thecvf.com‏

Cross-modal retrieval across image and text modalities is a challenging task due to its
inherent ambiguity: An image often exhibits various situations, and a caption can be coupled …‏

Cited by 41 - All 7 versions

[PDF] arxiv.org

On metric learning for audio-text cross-modal retrieval‏

X Mei, X Liu, J Sun, MD Plumbley, W Wang - arXiv preprint arXiv …, 2022 - arxiv.org

Audio-text retrieval aims at retrieving a target audio clip or caption from a pool of candidates
given a query in another modality. Solving such cross-modal retrieval task is challenging …‏

Cited by 72 - All 9 versions

Large-scale content-based audio retrieval from text queries

Cross-modal retrieval: a systematic review of methods and future directions‏
