Title | Authors | Venue | Cited by | Year |
SyncTalkFace: Talking Face Generation with Precise Lip-Syncing via Audio-Lip Memory | SJ Park, M Kim, J Hong, J Choi, YM Ro | AAAI 2022 | 80 | 2022 |
Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring | J Hong, M Kim, J Choi, YM Ro | CVPR 2023 | 39 | 2023 |
Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation | M Kim, J Choi, D Kim, YM Ro | IEEE/ACM Transactions on Audio, Speech, and Language Processing | 19* | 2024 |
Intelligible Lip-to-Speech Synthesis with Speech Units | J Choi, M Kim, YM Ro | Interspeech 2023 | 19 | 2023 |
AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model | JH Yeo, M Kim, J Choi, DH Kim, YM Ro | IEEE Transactions on Multimedia | 18 | 2024 |
Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge | M Kim, JH Yeo, J Choi, YM Ro | ICCV 2023 | 14 | 2023 |
DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding | J Choi, J Hong, YM Ro | ICCV 2023 | 10 | 2023 |
Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens | M Kim, J Choi, S Maiti, JH Yeo, S Watanabe, YM Ro | ICASSP 2024 | 8 | 2024 |
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation | Z Li, S Hu, S Liu, L Zhou, J Choi, L Meng, X Guo, J Li, H Ling, F Wei | ICLR 2025 | 4 | 2024 |
Exploring Phonetic Context-Aware Lip-Sync for Talking Face Generation | SJ Park, M Kim, J Choi, YM Ro | ICASSP 2024 | 4 | 2024 |
AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation | J Choi, SJ Park, M Kim, YM Ro | CVPR 2024 | 3 | 2024 |
Text-Driven Talking Face Synthesis by Reprogramming Audio-Driven Models | J Choi, M Kim, SJ Park, YM Ro | ICASSP 2024 | 2* | 2024 |
V2SFlow: Video-to-Speech Generation with Speech Decomposition and Rectified Flow | J Choi, JH Kim, J Li, JS Chung, S Liu | ICASSP 2025 | | 2024 |
Accelerating Codec-based Speech Synthesis with Multi-Token Prediction and Speculative Decoding | TD Nguyen, JH Kim, J Choi, S Choi, J Park, Y Lee, JS Chung | ICASSP 2025 | | 2024 |