Review of methods for coding of speech signals

D O'Shaughnessy - EURASIP Journal on Audio, Speech, and Music …, 2023 - Springer
Speech is the most common form of human communication, and many conversations use
digital communication links. For efficient transmission, acoustic speech waveforms are …

Capitalization and punctuation restoration: a survey

V Păiş, D Tufiş - Artificial Intelligence Review, 2022 - Springer
Ensuring proper punctuation and letter casing is a key pre-processing step towards applying
complex natural language processing algorithms. This is especially significant for textual …

Revisiting the Boundary between ASR and NLU in the Age of Conversational Dialog Systems

M Faruqui, D Hakkani-Tür - Computational Linguistics, 2022 - direct.mit.edu
As more users across the world are interacting with dialog agents in their daily life, there is a
need for better speech understanding that calls for renewed attention to the dynamics …

NMT-Based Segmentation and Punctuation Insertion for Real-Time Spoken Language Translation

E Cho, J Niehues, A Waibel - Interspeech, 2017 - isca-archive.org
Insertion of proper segmentation and punctuation into an ASR transcript is crucial not only
for the performance of subsequent applications but also for the readability of the text. In a …

Bilingual experiments on automatic recovery of capitalization and punctuation of automatic speech transcripts

F Batista, H Moniz, I Trancoso… - IEEE Transactions on …, 2012 - ieeexplore.ieee.org
This paper focuses on the tasks of recovering capitalization and punctuation marks from
texts without that information, such as spoken transcripts, produced by automatic speech …

Disfluency detection using a noisy channel model and a deep neural language model

PJ Lou, M Johnson - arXiv preprint arXiv:1808.09091, 2018 - arxiv.org
This paper presents a model for disfluency detection in spontaneous speech transcripts
called LSTM Noisy Channel Model. The model uses a Noisy Channel Model (NCM) to …

Describing lexical patterns in simultaneously interpreted discourse in a parallel aligned corpus of Russian-English interpreting (SIREN)

D Dayter - Forum, 2018 - jbe-platform.com
The paper introduces a corpus of simultaneous interpretation, SIREN. SIREN is a parallel
aligned bidirectional corpus of original and simultaneously interpreted speech in Russian …

Towards better subtitles: A multilingual approach for punctuation restoration of speech transcripts

NM Guerreiro, R Rei, F Batista - Expert Systems with Applications, 2021 - Elsevier
This paper proposes a flexible approach for punctuation prediction that can be used to
produce state-of-the-art results in a multilingual scenario. We have performed experiments …

Speech-centric information processing: An optimization-oriented approach

X He, L Deng - Proceedings of the IEEE, 2013 - ieeexplore.ieee.org
Automatic speech recognition (ASR) is a central and common component of voice-driven
information processing systems in human language technology, including spoken language …

Augmenting translation models with simulated acoustic confusions for improved spoken language translation

Y Tsvetkov, F Metze, C Dyer - … of the 14th Conference of the …, 2014 - aclanthology.org
We propose a novel technique for adapting text-based statistical machine translation to deal
with input from automatic speech recognition in spoken language translation tasks. We …