Speaker adaptation of neural network acoustic models using i-vectors
We propose to adapt deep neural network (DNN) acoustic models to a target speaker by
supplying speaker identity vectors (i-vectors) as input features to the network in parallel with …
SpeakerBeam: Speaker aware neural network for target speaker extraction in speech mixtures
The processing of speech corrupted by interfering overlapping speakers is one of the
challenging problems with regards to today's automatic speech recognition systems …
Method and apparatus for automated speaker parameters adaptation in a deployed speaker verification system
DE Colibro, C Vair, KR Farrell - US Patent 9,865,266, 2018 - Google Patents
Typical speaker verification systems usually employ speakers' audio data collected during
an enrollment phase when users enroll with the system and provide respective voice …
Language recognition in iVectors space
The concept of so-called iVectors, where each utterance is represented by a fixed-length
low-dimensional feature vector, has recently become very successful in speaker verification. In …
Bi-modal person recognition on a mobile phone: using mobile phone data
This paper presents a novel fully automatic bi-modal, face and speaker, recognition system
which runs in real-time on a mobile phone. The implemented system runs in real-time on a …
Neural Network Bottleneck Features for Language Identification.
This paper presents the application of Neural Network Bottleneck (BN) features in Language
Identification (LID). BN features are generally used for Large Vocabulary Speech …
Mobile biometrics (MoBio): Joint face and voice verification for a mobile platform
ALIZE 3.0 - open source toolkit for state-of-the-art speaker recognition
ALIZE is an open-source platform for speaker recognition. The ALIZE library implements a
low-level statistical engine based on the well-known Gaussian mixture modelling. The toolkit …
A small footprint i-vector extractor.
P Kenny - Odyssey, 2012 - pdfs.semanticscholar.org
In the case $c = 1$ and $N_c = 1$ this is the same as the PPCA posterior calculation (Bishop).
Accumulate the matrix $\sum_c N_c V_c^* V_c$ for each utterance. The matrices $V_c^* V_c$ are typically …
Towards speaker adaptive training of deep neural network acoustic models.
We investigate the concept of speaker adaptive training (SAT) in the context of deep neural
network (DNN) acoustic models. Previous studies have shown success of performing …