- Academic Search

Audio-visual cross-attention network for robotic speaker tracking

X Qian, Z Wang, J Wang, G Guan… - IEEE/ACM Transactions …, 2022 - ieeexplore.ieee.org

Audio-visual signals can be used jointly for robotic perception as they complement each
other. Such multi-modal sensory fusion has a clear advantage, especially under noisy …

Cited by 36

Transfer learning of wav2vec 2.0 for automatic lyric transcription

L Ou, X Gu, Y Wang - arXiv preprint arXiv:2207.09747, 2022 - arxiv.org

Automatic speech recognition (ASR) has progressed significantly in recent years due to the
emergence of large-scale datasets and the self-supervised learning (SSL) paradigm …

Cited by 30

LyricWhiz: Robust multilingual zero-shot lyrics transcription by whispering to ChatGPT

L Zhuo, R Yuan, J Pan, Y Ma, Y Li, G Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org

We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic lyrics transcription
method achieving state-of-the-art performance on various lyrics transcription datasets, even …

Cited by 19

Generate, discriminate and contrast: A semi-supervised sentence representation learning framework

Y Chen, Y Zhang, B Wang, Z Liu, H Li - arXiv preprint arXiv:2210.16798, 2022 - arxiv.org

Most sentence embedding techniques heavily rely on expensive human-annotated
sentence pairs as the supervised signals. Despite the use of large-scale unlabeled data, the …

Cited by 22

Few-shot class-incremental audio classification via discriminative prototype learning

W Xie, Y Li, Q He, W Cao - Expert Systems with Applications, 2023 - Elsevier

In real-world scenarios, new audio classes with insufficient samples usually emerge
continually, which motivates the study of few-shot class-incremental audio classification …

Cited by 10

Predict-and-update network: Audio-visual speech recognition inspired by human speech perception

J Wang, X Qian, H Li - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org

Audio and visual signals complement each other in human speech perception, and the
same applies to automatic speech recognition. The visual signal is less evident than the …

Cited by 15

Dynamic transformers provide a false sense of efficiency

Y Chen, S Chen, Z Li, W Yang, C Liu, RT Tan… - arXiv preprint arXiv …, 2023 - arxiv.org

Despite much success in natural language processing (NLP), pre-trained language models
typically lead to a high computational cost during inference. Multi-exit is a mainstream …

Cited by 9

Wagner Ring Dataset: A complex opera scenario for music processing and computational musicology

C Weiß, V Arifi-Müller, M Krause… - Transactions of the …, 2023 - transactions.ismir.net

This paper introduces the Wagner Ring Dataset (WRD), a multi-modal and multi-version
resource on the large-scale opera cycle Der Ring des Nibelungen by Richard Wagner. The …

Cited by 5

Polyscriber: Integrated fine-tuning of extractor and lyrics transcriber for polyphonic music

X Gao, C Gupta, H Li - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org

Lyrics transcription of polyphonic music is challenging as the background music affects lyrics
intelligibility. Typically, lyrics transcription can be performed by a two-step pipeline, i.e., a …

Cited by 11

Elucidate gender fairness in singing voice transcription

X Gu, W Zeng, Y Wang - Proceedings of the 31st ACM International …, 2023 - dl.acm.org

It is widely known that males and females typically possess different sound characteristics
when singing, such as timbre and pitch, but it has never been explored whether these …

Cited by 4

Genre-conditioned acoustic models for automatic lyrics transcription of polyphonic music

Audio-visual cross-attention network for robotic speaker tracking
