WavLLM: Towards robust and adaptive speech large language model
The recent advancements in large language models (LLMs) have revolutionized the field of
natural language processing, progressively broadening their scope to multimodal …
Generalizing across domains via cross-gradient training
S Shankar, V Piratla, S Chakrabarti… - arXiv preprint arXiv …, 2018 - arxiv.org
We present CROSSGRAD, a method to use multi-domain training data to learn a classifier
that generalizes to new domains. CROSSGRAD does not need an adaptation phase via …
Light gated recurrent units for speech recognition
M Ravanelli, P Brakel, M Omologo… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
A field that has directly benefited from the recent advances in deep learning is automatic
speech recognition (ASR). Despite the great achievements of the past decades, however, a …
An analysis of environment, microphone and data simulation mismatches in robust speech recognition
Speech enhancement and automatic speech recognition (ASR) are most often evaluated in
matched (or multi-condition) settings where the acoustic conditions of the training data …
Speech and language processing
D Jurafsky - 2000 - academia.edu
" This book is an absolute necessity for instructors at all levels, as well as an indispensible
reference for researchers. Introducing NLP, computational linguistics, and speech …
Domain adaptation with structural correspondence learning
Discriminative learning methods are widely used in natural language processing. These
methods work best when their training and test data are drawn from the same distribution …
Weakly supervised learning with multi-stream CNN-LSTM-HMMs to discover sequential parallelism in sign language videos
In this work we present a new approach to the field of weakly supervised learning in the
video domain. Our method is relevant to sequence learning problems which can be split up …
The 2005 music information retrieval evaluation exchange (MIREX 2005): Preliminary overview
This paper is an extended abstract which provides a brief preliminary overview of the 2005
Music Information Retrieval Evaluation eXchange (MIREX 2005). The MIREX organizational …
Shallow parsing with conditional random fields
Conditional random fields for sequence labeling offer advantages over both generative
models like HMMs and classifiers applied at each sequence position. Among sequence …
MCYT baseline corpus: a bimodal biometric database
J Ortega-Garcia, J Fierrez-Aguilar, D Simon… - IEE Proceedings-Vision …, 2003 - IET
The current need for large multimodal databases to evaluate automatic biometric recognition
systems has motivated the development of the MCYT bimodal database. The main purpose …