Improved mixed language speech recognition using asymmetric acoustic model and language model with code-switch inversion constraints
Y Li, P Fung - 2013 IEEE International Conference on Acoustics …, 2013 - ieeexplore.ieee.org
We propose an integrated framework for large vocabulary continuous mixed language
speech recognition that handles the accent effect in the bilingual acoustic model and the …
End-to-end keywords spotting based on connectionist temporal classification for mandarin
Traditional hybrid DNN-HMM based ASR system for keywords spotting which models HMM
states are not flexible to optimize for a specific language. In this paper, we construct an end …
Cross-lingual language modeling for low-resource speech recognition
P Xu, P Fung - IEEE transactions on audio, speech, and …, 2013 - ieeexplore.ieee.org
This paper proposes using cross-lingual language modeling with syntactic information for
low-resource speech recognition. We propose phrase-level transduction and syntactic …
A study of large vocabulary speech recognition decoding using finite-state graphs
Z Ou, J Xiao - 2010 7th International Symposium on Chinese …, 2010 - ieeexplore.ieee.org
The use of weighted finite-state transducers (WFSTs) has become an attractive technique for
building large vocabulary continuous speech recognition decoders. Conventionally, the …
Expanding functionality of the autonomous voice control system
V Zhigalov - E3S Web of Conferences, 2023 - e3s-conferences.org
The article describes the main advantages of voice control systems, notes the shortcomings
of the first implemented voice control systems, their functional limitations, and limited use in …
[PDF][PDF] Improvements to the pruning behavior of DNN acoustic models.
M Paulik - Interspeech, 2015 - isca-archive.org
This paper examines two strategies that improve the beam pruning behavior of DNN
acoustic models with only a negligible increase in model complexity. By augmenting the …
Phrase-level transduction model with reordering for spoken to written language transformation
P Xu, P Fung, R Chan - 2012 IEEE International Conference on …, 2012 - ieeexplore.ieee.org
This paper proposes a first-ever phrase-level transduction model with reordering to
transform colloquial speech directly to written-style transcription. This model is capable of …
[PDF][PDF] Cross-lingual language modeling with syntactic reordering for low-resource speech recognition
P Xu, P Fung - Proceedings of the 2012 Joint Conference on …, 2012 - aclanthology.org
This paper proposes cross-lingual language modeling for transcribing source resource-poor
languages and translating them into target resource-rich languages if necessary. Our focus …
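Efforts toward the practical application of speech recognition technology: 8. The WFST-based T3 speech recognition decoder
大西翼, 古井貞熙 - 情報処理, 2010 - ipsj.ixsq.nii.ac.jp
Advances in information processing technology and speech recognition technology have made
speech recognition using large-scale data feasible. Today's speech recognition technology is based on machine learning, and the larger the training data, the …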