Speechformer++: A hierarchical efficient framework for paralinguistic speech processing

W Chen, X Xing, X Xu, J Pang… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org
Paralinguistic speech processing is important in addressing many issues, such as sentiment
and neurocognitive disorder analyses. Recently, Transformer has achieved remarkable …

Computational Architecture of Speech Comprehension in the Human Brain

L Gwilliams, I Bhaya-Grossman, Y Zhang… - Annual Review of …, 2025 - annualreviews.org
Understanding the computational algorithm that gives rise to human language is a shared
endeavor among neuroscience, linguistics, and machine learning. We propose a conceptual …

Large language models transition from integrating across position-yoked, exponential windows to structure-yoked, power-law windows

D Skrill, S Norman-Haignere - Advances in neural …, 2023 - proceedings.neurips.cc
Modern language models excel at integrating across long temporal scales needed to
encode linguistic meaning and show non-trivial similarities to biological neural systems …

Neural timescales from a computational perspective

R Zeraati, A Levina, JH Macke, R Gao - arXiv preprint arXiv:2409.02684, 2024 - arxiv.org
Timescales of neural activity are diverse across and within brain areas, and experimental
observations suggest that neural timescales reflect information in dynamic environments …

DeepSpeech models show Human-like Performance and Processing of Cochlear Implant Inputs

CR Steinhardt, M Keshishian, N Mesgarani… - arXiv preprint arXiv …, 2024 - arxiv.org
Cochlear implants (CIs) are arguably the most successful neural implant, having restored
hearing to over one million people worldwide. While CI research has focused on modeling …

Continuous Arabic Speech Recognition Model with N-gram Generation Using Deep Speech

FS Al-Anzi, STB Shalini - 2024 International Congress on …, 2024 - ieeexplore.ieee.org
A speech recognition system aims to translate audio input into a string of words. Deep
Speech is an end-to-end programmed communication acknowledgment engine that has …

Revealing the Next Word and Character in Arabic: An Effective Blend of Long Short-Term Memory Networks and ARABERT

FS Al-Anzi, STB Shalini - Applied Sciences, 2024 - mdpi.com
Arabic raw audio datasets were initially gathered to produce a corresponding signal
spectrum, which was further used to extract the Mel-Frequency Cepstral Coefficients …

Temporal integration in human auditory cortex is predominantly yoked to absolute time, not structure duration

SV Norman-Haignere, MK Keshishian, O Devinsky… - …, 2024 - pmc.ncbi.nlm.nih.gov
Sound structures such as phonemes and words have highly variable durations. Thus, there
is a fundamental difference between integrating across absolute time (e.g., 100 ms) vs. sound …

Parallel hierarchical encoding of linguistic representations in the human auditory cortex and recurrent automatic speech recognition systems

M Keshishian, G Mischler, S Thomas, B Kingsbury… - bioRxiv, 2025 - biorxiv.org
The human brain's ability to transform acoustic speech signals into rich linguistic
representations has inspired advancements in automatic speech recognition (ASR) systems …

Novel Methods for Understanding the Neural Encoding of Natural Stimuli in the Human Brain

D Skrill - 2024 - search.proquest.com
The human brain's ability to understand complex natural stimuli, such as speech, music,
language, and visual scenes and objects, is remarkable for its accuracy, efficiency, and …