Real-time target sound extraction

B Veluri, J Chan, M Itani, T Chen… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
We present the first neural network model to achieve real-time and streaming target sound
extraction. To accomplish this, we propose Waveformer, an encoder-decoder architecture …

SLNet: A Spectrogram Learning Neural Network for Deep Wireless Sensing

Z Yang, Y Zhang, K Qian, C Wu - 20th USENIX Symposium on …, 2023 - usenix.org
Advances in wireless technologies have transformed wireless networks from a pure
communication medium to a pervasive sensing platform, enabling many sensorless and …

ClearBuds: wireless binaural earbuds for learning-based speech enhancement

I Chatterjee, M Kim, V Jayaram, S Gollakota… - Proceedings of the 20th …, 2022 - dl.acm.org
We present ClearBuds, the first hardware and software system that utilizes a neural network
to enhance speech streamed from two wireless earbuds. Real-time speech enhancement for …

Creating speech zones with self-distributing acoustic swarms

M Itani, T Chen, T Yoshioka, S Gollakota - Nature Communications, 2023 - nature.com
Imagine being in a crowded room with a cacophony of speakers and having the ability to
focus on or remove speech from a specific 2D region. This would require understanding and …

Hearable devices with sound bubbles

T Chen, M Itani, SE Eskimez, T Yoshioka… - Nature Electronics, 2024 - nature.com
The human auditory system has a limited ability to perceive distance and distinguish
speakers in crowded settings. A headset technology that can create a sound bubble in …

Target conversation extraction: Source separation using turn-taking dynamics

T Chen, Q Wang, B Wu, M Itani, SE Eskimez… - arXiv preprint arXiv …, 2024 - arxiv.org
Extracting the speech of participants in a conversation amidst interfering speakers and noise
presents a challenging problem. In this paper, we introduce the novel task of target …

Voicefind: Noise-resilient speech recovery in commodity headphones

I Shahid, Y Bai, N Garg, N Roy - … of the 1st ACM International Workshop …, 2022 - dl.acm.org
Robust speech enhancement is a key requirement for many emerging applications. It is
challenging to recover clear speech in commodity devices, especially in noisy real-world …

Cleanformer: A multichannel array configuration-invariant neural enhancement frontend for ASR in smart speakers

J Caroselli, A Narayanan, N Howard… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
This work introduces Cleanformer—a streaming multichannel neural enhancement frontend
for automatic speech recognition (ASR). This model has a Conformer-based architecture …

Spatial-Aware Multi-Task Learning Based Speech Separation

W Sun, M Wang, L Qiu - … Conference on Mobile Ad-Hoc and …, 2024 - ieeexplore.ieee.org
Online meetings have become an indispensable part of our lives. However, background
noise from other family members, roommates, and office mates not only degrades the voice …

Enhancing Situational Awareness with VAS-Compass Net for the Recognition of Directional Vehicle Alert Sounds

CL Chin, JR Chen, WX Lin, HC Hung… - Sensors (Basel …, 2024 - pmc.ncbi.nlm.nih.gov
People with hearing impairments often face increased risks related to traffic accidents due to
their reduced ability to perceive surrounding sounds. Given the cost and usage limitations of …