- Academic Search

DPCCN: Densely-connected pyramid complex convolutional network for robust speech separation and extraction

J Han, Y Long, L Burget… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org

In recent years, a number of time-domain speech separation methods have been proposed.
However, most of them are very sensitive to the environments and wide domain coverage …

Cited by 24. All 4 versions.

Learning-based personal speech enhancement for teleconferencing by exploiting spatial-spectral features

Y Hsu, Y Lee, MR Bai - ICASSP 2022-2022 IEEE International …, 2022 - ieeexplore.ieee.org

Teleconferencing is becoming essential during the COVID-19 pandemic. However, in real-world
applications, speech quality can deteriorate due to, for example, background …

Cited by 11. All 5 versions.

Attention-based scaling adaptation for target speech extraction

J Han, W Rao, Y Long, J Liang - 2021 IEEE Automatic Speech …, 2021 - ieeexplore.ieee.org

The target speech extraction has attracted widespread attention in recent years. In this work,
we focus on investigating the dynamic interaction between different mixtures and the target …

Cited by 15. All 3 versions.

Heterogeneous separation consistency training for adaptation of unsupervised speech separation

J Han, Y Long - EURASIP Journal on Audio, Speech, and Music …, 2023 - Springer

Recently, supervised speech separation has made great progress. However, limited by the
nature of supervised training, most existing separation methods require ground-truth …

Cited by 6. All 9 versions.

Learning-based robust speaker counting and separation with the aid of spatial coherence

Y Hsu, MR Bai - EURASIP Journal on Audio, Speech, and Music …, 2023 - Springer

A three-stage approach is proposed for speaker counting and speech separation in noisy
and reverberant environments. In the spatial feature extraction, a spatial coherence matrix …

Cited by 2. All 9 versions.

Multi-channel target speech enhancement based on ERB-scaled spatial coherence features

Y Hsu, Y Lee, MR Bai - arXiv preprint arXiv:2207.08126, 2022 - arxiv.org

Recently, speech enhancement technologies that are based on deep learning have
received considerable research attention. If the spatial information in microphone signals is …

Cited by 2. All 2 versions.

Array configuration-agnostic personalized speech enhancement using long-short-term spatial coherence

Y Hsu, Y Lee, MR Bai - The Journal of the Acoustical Society of …, 2023 - pubs.aip.org

Personalized speech enhancement (PSE) has been a field of active research for
suppression of speech-like interferers, such as competing speakers or television (TV) …

Cited by 1. All 6 versions.

Spatial-temporal activity-informed diarization and separation

Y Hsu, S Chen, Y Lai, C Wang, MR Bai - The Journal of the Acoustical …, 2025 - pubs.aip.org

A robust multichannel speaker diarization and separation system is proposed by exploiting
the spatiotemporal activity of the speakers. The system is realized in a hybrid architecture …

All 5 versions.

VocEmb4SVS: Improving singing voice separation with vocal embeddings

C Li, Y Li, X Du, Y Ju, S Hu, Z Wu - 2022 Asia-Pacific Signal …, 2022 - ieeexplore.ieee.org

Deep learning-based methods have shown promising performance on singing voice
separation (SVS). Recently, embeddings related to lyrics and voice activities have been …

Cited by 1. All 4 versions.

Boosting the Performance of SpEx+ by Attention and Contextual Mechanism

C Li, Z Wu, W Rao, Y Wang… - 2022 13th International …, 2022 - ieeexplore.ieee.org

Target speaker extraction (TSE) aims to mimic human selective attention to extracting our
interested voice from the multi-talker environment. Time-domain methods represented by …

All 2 versions.

Improving channel decorrelation for multi-channel target speech extraction
