Byoung Jin Choi
Verified email at hi.snu.ac.kr
Title
Cited by
Year
Diff-tts: A denoising diffusion model for text-to-speech
M Jeong, H Kim, SJ Cheon, BJ Choi, NS Kim
arXiv preprint arXiv:2104.01409, 2021
Cited by: 210, Year: 2021
Expressive text-to-speech using style tag
M Kim, SJ Cheon, BJ Choi, JJ Kim, NS Kim
arXiv preprint arXiv:2104.00436, 2021
Cited by: 60, Year: 2021
Transfer learning framework for low-resource text-to-speech using a large-scale unlabeled speech corpus
M Kim, M Jeong, BJ Choi, S Ahn, JY Lee, NS Kim
arXiv preprint arXiv:2203.15447, 2022
Cited by: 29, Year: 2022
SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech
BJ Choi, M Jeong, JY Lee, NS Kim
IEEE Signal Processing Letters 29, 2502-2506, 2022
Cited by: 17, Year: 2022
WaveNODE: A continuous normalizing flow for speech synthesis
H Kim, H Lee, WH Kang, SJ Cheon, BJ Choi, NS Kim
arXiv preprint arXiv:2006.04598, 2020
Cited by: 14, Year: 2020
Reformer-TTS: Neural Speech Synthesis with Reformer Network.
HR Ihm, JY Lee, BJ Choi, SJ Cheon, NS Kim
INTERSPEECH, 2012-2016, 2020
Cited by: 12, Year: 2020
Transduce and speak: Neural transducer for text-to-speech with semantic token prediction
M Kim, M Jeong, BJ Choi, D Lee, NS Kim
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-7, 2023
Cited by: 11, Year: 2023
Acoustic Modeling Using Adversarially Trained Variational Recurrent Neural Network for Speech Synthesis.
JY Lee, SJ Cheon, BJ Choi, NS Kim, E Song
INTERSPEECH, 917-921, 2018
Cited by: 8, Year: 2018
Adversarial speaker-consistency learning using untranscribed speech data for zero-shot multi-speaker text-to-speech
BJ Choi, M Jeong, M Kim, SH Mun, NS Kim
2022 Asia-Pacific Signal and Information Processing Association Annual …, 2022
Cited by: 5, Year: 2022
Gated recurrent attention for multi-style speech synthesis
SJ Cheon, JY Lee, BJ Choi, H Lee, NS Kim
Applied Sciences 10 (15), 5325, 2020
Cited by: 5, Year: 2020
Memory attention: Robust alignment using gating mechanism for end-to-end speech synthesis
JY Lee, SJ Cheon, BJ Choi, NS Kim
IEEE Signal Processing Letters 27, 2004-2008, 2020
Cited by: 4, Year: 2020
Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction
M Kim, M Jeong, BJ Choi, S Kim, JY Lee, NS Kim
arXiv preprint arXiv:2401.01498, 2024
Cited by: 3, Year: 2024
Transfer Learning for Low-Resource, Multi-Lingual, and Zero-Shot Multi-Speaker Text-to-Speech
M Jeong, M Kim, BJ Choi, J Yoon, W Jang, NS Kim
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
Cited by: 2, Year: 2024
A controllable multi-lingual multi-speaker multi-style text-to-speech synthesis with multivariate information minimization
SJ Cheon, BJ Choi, M Kim, H Lee, NS Kim
IEEE Signal Processing Letters 29, 55-59, 2021
Cited by: 2, Year: 2021
MakeSinger: A Semi-Supervised Training Method for Data-Efficient Singing Voice Synthesis via Classifier-free Diffusion Guidance
S Kim, M Jeong, H Lee, M Kim, BJ Choi, NS Kim
arXiv preprint arXiv:2406.05965, 2024
Cited by: 1, Year: 2024
Variable-Length Speaker Conditioning in Flow-Based Text-to-Speech
BJ Choi, M Jeong, M Kim, NS Kim
IEEE Signal Processing Letters, 2024
Cited by: 1, Year: 2024
Articles 1–16