SSDM: Scalable speech dysfluency modeling

J Lian, X Zhou, Z Ezzes, J Vonk… - Advances in neural …, 2025 - proceedings.neurips.cc
Speech dysfluency modeling is the core module for spoken language learning and speech
therapy. However, there are three challenges. First, current state-of-the-art solutions …

Sequence-based data-constrained deep learning framework to predict spider dragline mechanical properties

A Pandey, W Chen, S Keten - Communications Materials, 2024 - nature.com
Spider dragline silk is known for its exceptional strength and toughness; hence
understanding the link between its primary sequence and mechanics is crucial. Here, we …

Stutter-solver: End-to-end multi-lingual dysfluency detection

X Zhou, CJ Cho, A Sharma, B Morin… - 2024 IEEE Spoken …, 2024 - ieeexplore.ieee.org
Current de-facto dysfluency modeling methods [1, 2] utilize template matching algorithms
which are not generalizable to out-of-domain real-world dysfluencies across languages, and …

Missingness-resilient video-enhanced multimodal disfluency detection

P Mohapatra, S Likhite, S Biswas, B Islam… - arXiv preprint arXiv …, 2024 - arxiv.org
Most existing speech disfluency detection techniques only rely upon acoustic data. In this
work, we present a practical multimodal disfluency detection approach that leverages …

Self-Supervised Speech Models For Word-Level Stuttered Speech Detection

YJ Shih, Z Gkalitsiou, AG Dimakis… - 2024 IEEE Spoken …, 2024 - ieeexplore.ieee.org
Clinical diagnosis of stuttering requires an assessment by a licensed speech-language
pathologist. However, this process is time-consuming and requires clinicians with training …

Phase-driven domain generalizable learning for nonstationary time series

P Mohapatra, L Wang, Q Zhu - arXiv preprint arXiv:2402.05960, 2024 - arxiv.org
Monitoring and recognizing patterns in continuous sensing data is crucial for many practical
applications. These real-world time-series data are often nonstationary, characterized by …

Effect of attention and self-supervised speech embeddings on non-semantic speech tasks

P Mohapatra, A Pandey, Y Sui, Q Zhu - Proceedings of the 31st ACM …, 2023 - dl.acm.org
Human emotion understanding is pivotal in making conversational technology mainstream.
We view speech emotion understanding as a perception task which is a more realistic …

Non-verbal Hands-free Control for Smart Glasses using Teeth Clicks

P Mohapatra, A Aroudi, A Kumar… - arXiv preprint arXiv …, 2024 - arxiv.org
Smart glasses are emerging as a popular wearable computing platform potentially
revolutionizing the next generation of human-computer interaction. The widespread …

Individual-independent and cross-language detection of speech disfluencies in stuttering based on multi-adversarial tasks and self-training

J Shen, X Zhang - Biomedical Signal Processing and Control, 2025 - Elsevier
Stuttering is a complex speech disorder that affects people's fluent expression. People who
stutter may exhibit various types of speech disfluencies. Speech-language pathologists …

DACR: Distribution-Augmented Contrastive Reconstruction for Time-Series Anomaly Detection

L Wang, S Xu, X Du, Q Zhu - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
Anomaly detection in time-series data is crucial for identifying faults, failures, threats, and
outliers across a range of applications. Recently, deep learning techniques have been …