A deep neural network integrated with filterbank learning for speech recognition
Deep neural networks (DNN) have achieved significant success in the field of speech
recognition. One of the main advantages of the DNN is automatic feature extraction without …
A robust/fast spoken term detection method based on a syllable n-gram index with a distance metric
For spoken document retrieval, it is crucial to consider out-of-vocabulary (OOV) and the
mis-recognition of spoken words. Consequently, sub-word unit based recognition and retrieval …
Class-based n-gram language model for new words using out-of-vocabulary to in-vocabulary similarity
Out-of-vocabulary (OOV) words create serious problems for automatic speech recognition
(ASR) systems. Not only are they misrecognized as in-vocabulary (IV) words with similar …
Topic-Dependent-Class-Based n-Gram Language Model
A topic-dependent-class (TDC)-based n-gram language model (LM) is a topic-based LM that
employs a semantic extraction method to reveal latent topic information extracted from noun …
Robust speech recognition using DNN-HMM acoustic model combining noise-aware training with spectral subtraction.
Recently, acoustic models based on deep neural networks (DNNs) have been introduced
and have shown dramatic improvements over acoustic models based on GMM in a variety of …
Comparison of syllable-based and phoneme-based DNN-HMM in Japanese speech recognition
H Seki, K Yamamoto… - … International Conference of …, 2014 - ieeexplore.ieee.org
Japanese is a syllabic language. Additionally, we have studied syllable-based GMM-HMM for
Japanese speech recognition. In this paper, we investigate the differences of recognition …
Automatic Explanation Spot Estimation Method Targeted at Text and Figures in Lecture Slides.
Because of the spread of the Internet in recent years, e-learning, which is a form of learning
through the Internet, has been used in school education. Many lecture videos delivered at …
High speed spoken term detection by combination of n-gram array of a syllable lattice and LVCSR result for NTCIR-SpokenDoc.
For spoken document retrieval, it is very important to consider Out-of-Vocabulary (OOV) and
mis-recognition of spoken words. Therefore, sub-word unit based recognition and retrieval …
Soft-clustering technique for training data in age- and gender-independent speech recognition
D Enami, F Zhu, K Yamamoto… - Proceedings of The …, 2012 - ieeexplore.ieee.org
In this paper, we propose approaches for the Gaussian mixture model (GMM) based soft
clustering of training data and the GMM- and/or hidden Markov model (HMM)-based cluster …
Combination of syllable based N-gram search and word search for spoken term detection through spoken queries and IV/OOV classification
N Sakamoto, K Yamamoto… - 2015 IEEE Workshop on …, 2015 - ieeexplore.ieee.org
This paper presents a Japanese spoken term detection method for spoken queries using a
combination of word-based search and syllable-based N-gram search with in …