Xiaofei Wang
Microsoft
Verified email at jhu.edu
Title
Cited by
Year
A comparative study on Transformer vs RNN in speech applications
S Karita, N Chen, T Hayashi, T Hori, H Inaguma, Z Jiang, M Someki, ...
2019 IEEE automatic speech recognition and understanding workshop (ASRU …, 2019
Cited by: 899; Year: 2019
Serialized output training for end-to-end overlapped speech recognition
N Kanda, Y Gaur, X Wang, Z Meng, T Yoshioka
Interspeech 2020, 2797-2801, 2020
Cited by: 130; Year: 2020
Joint speaker counting, speech recognition, and speaker identification for overlapped speech of any number of speakers
N Kanda, Y Gaur, X Wang, Z Meng, Z Chen, T Zhou, T Yoshioka
Interspeech 2020, 36-40, 2020
Cited by: 90; Year: 2020
SpeechX: Neural codec language model as a versatile speech transformer
X Wang, M Thakker, Z Chen, N Kanda, SE Eskimez, S Chen, M Tang, ...
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
Cited by: 71; Year: 2024
Personalized speech enhancement: New models and comprehensive evaluation
SE Eskimez, T Yoshioka, H Wang, X Wang, Z Chen, X Huang
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 68; Year: 2022
Speech enhancement using end-to-end speech recognition objectives
AS Subramanian, X Wang, MK Baskar, S Watanabe, T Taniguchi, D Tran, ...
2019 IEEE Workshop on Applications of Signal Processing to Audio and …, 2019
Cited by: 68; Year: 2019
Streaming multi-talker ASR with token-level serialized output training
N Kanda, J Wu, Y Wu, X Xiao, Z Meng, X Wang, Y Gaur, Z Chen, J Li, ...
Interspeech 2022, 3774-3778, 2022
Cited by: 63; Year: 2022
The Hitachi/JHU CHiME-5 system: Advances in speech recognition for everyday home environments using multiple microphone arrays
N Kanda, R Ikeshita, S Horiguchi, Y Fujita, K Nagamatsu, X Wang, ...
Proc. CHiME-5, 6-10, 2018
Cited by: 54; Year: 2018
End-to-end speaker-attributed ASR with transformer
N Kanda, G Ye, Y Gaur, X Wang, Z Meng, Z Chen, T Yoshioka
Interspeech 2021, 4413-4417, 2021
Cited by: 53; Year: 2021
Investigation of end-to-end speaker-attributed ASR for continuous multi-talker recordings
N Kanda, X Chang, Y Gaur, X Wang, Z Meng, Z Chen, T Yoshioka
2021 IEEE Spoken Language Technology Workshop (SLT), 809-816, 2021
Cited by: 50; Year: 2021
Large-scale pre-training of end-to-end multi-talker ASR for meeting transcription with single distant microphone
N Kanda, G Ye, Y Wu, Y Gaur, X Wang, Z Meng, Z Chen, T Yoshioka
Interspeech 2021, 3430-3434, 2021
Cited by: 44; Year: 2021
Transcribe-to-diarize: Neural speaker diarization for unlimited number of speakers using end-to-end speaker-attributed ASR
N Kanda, X Xiao, Y Gaur, X Wang, Z Meng, Z Chen, T Yoshioka
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 42; Year: 2022
VarArray: Array-geometry-agnostic continuous speech separation
T Yoshioka, X Wang, D Wang, M Tang, Z Zhu, Z Chen, N Kanda
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 36; Year: 2022
ELLA-V: Stable neural codec language modeling with alignment-guided sequence reordering
Y Song, Z Chen, X Wang, Z Ma, X Chen
The 39th Annual AAAI Conference on Artificial Intelligence, 2024
Cited by: 31; Year: 2024
Streaming speaker-attributed ASR with token-level speaker embeddings
N Kanda, J Wu, Y Wu, X Xiao, Z Meng, X Wang, Y Gaur, Z Chen, J Li, ...
Interspeech 2022, 3774-3778, 2022
Cited by: 30; Year: 2022
Multi-stream end-to-end speech recognition
R Li, X Wang, SH Mallidi, S Watanabe, T Hori, H Hermansky
IEEE/ACM Transactions on Audio, Speech, and Language Processing 28, 646-655, 2019
Cited by: 30; Year: 2019
Improving noise robustness of contrastive speech representation learning with speech reconstruction
H Wang, Y Qian, X Wang, Y Wang, C Wang, S Liu, T Yoshioka, J Li, ...
ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by: 29; Year: 2022
Oracle performance investigation of the ideal masks
Z Wang, X Wang, X Li, Q Fu, Y Yan
2016 IEEE International Workshop on Acoustic Signal Enhancement (IWAENC), 1-5, 2016
Cited by: 29; Year: 2016
Stream attention-based multi-array end-to-end speech recognition
X Wang, R Li, SH Mallidi, T Hori, S Watanabe, H Hermansky
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 27; Year: 2019
An investigation of end-to-end multichannel speech recognition for reverberant and mismatch conditions
AS Subramanian, X Wang, S Watanabe, T Taniguchi, D Tran, Y Fujita
arXiv preprint arXiv:1904.09049, 2019
Cited by: 27; Year: 2019