- Academic Search

Automated audio captioning: An overview of recent progress and new challenges

X Mei, X Liu, MD Plumbley, W Wang - … journal on audio, speech, and music …, 2022 - Springer

Automated audio captioning is a cross-modal translation task that aims to generate natural
language descriptions for given audio clips. This task has received increasing attention with …

Cited by 61 · All 11 versions

Separate anything you describe

X Liu, Q Kong, Y Zhao, H Liu, Y Yuan… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org

Language-queried audio source separation (LASS) is a new paradigm for computational
auditory scene analysis (CASA). LASS aims to separate a target sound from an audio …

Cited by 42 · All 8 versions

Improving text-audio retrieval by text-aware attention pooling and prior matrix revised loss

Y **, T Virtanen - arXiv preprint arXiv:2206.06108, 2022 - arxiv.org

Language-based audio retrieval is a task, where natural language textual captions are used
as queries to retrieve audio signals from a dataset. It has been first introduced into DCASE …

Cited by 24 · All 8 versions

Improving audio-text retrieval via hierarchical cross-modal interaction and auxiliary captions

Y Xin, Y Zou - arXiv preprint arXiv:2307.15344, 2023 - arxiv.org

Most existing audio-text retrieval (ATR) methods focus on constructing contrastive pairs
between whole audio clips and complete caption sentences, while ignoring fine-grained …

Cited by 9 · All 6 versions

Cooperative game modeling with weighted token-level alignment for audio-text retrieval

Y Xin, B Wang, L Shang - IEEE Signal Processing Letters, 2023 - ieeexplore.ieee.org

Previous audio-text retrieval (ATR) methods primarily concentrate on constructing
contrastive pairs between entire audio clips and full caption sentences, while neglecting fine …

Cited by 7 · All 3 versions

CP-JKU's submission to Task 6b of the DCASE2023 challenge: Audio retrieval with PaSST and GPT-augmented captions

P Primus, K Koutini, G Widmer - tech. rep., DCASE2023 …, 2023 - dcase.community

This technical report describes CP-JKU's submission to the natural-language-based audio
retrieval task of the 2023 DCASE Challenge (Task 6b). Our proposed system uses …

Cited by 11
