SyncTalkFace: Talking face generation with precise lip-syncing via audio-lip memory SJ Park, M Kim, J Hong, J Choi, YM Ro Proceedings of the AAAI Conference on Artificial Intelligence 36 (2), 2062-2070, 2022 | 81 | 2022 |
Distinguishing homophenes using multi-head visual-audio memory for lip reading M Kim, JH Yeo, YM Ro Proceedings of the AAAI Conference on Artificial Intelligence 36 (1), 1174-1182, 2022 | 57 | 2022 |
Multi-modality associative bridging through memory: Speech sound recollected from face video M Kim, J Hong, SJ Park, YM Ro Proceedings of the IEEE/CVF International Conference on Computer Vision, 296-306, 2021 | 52 | 2021 |
Lip to speech synthesis with visual context attentional GAN M Kim, J Hong, YM Ro Advances in Neural Information Processing Systems 34, 2758-2770, 2021 | 48 | 2021 |
Watch or listen: Robust audio-visual speech recognition with visual corruption modeling and reliability scoring J Hong, M Kim, J Choi, YM Ro Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 38 | 2023 |
CroMM-VSR: Cross-modal memory augmented visual speech recognition M Kim, J Hong, SJ Park, YM Ro IEEE Transactions on Multimedia 24, 4342-4355, 2021 | 33 | 2021 |
Speech reconstruction with reminiscent sound via visual voice memory J Hong, M Kim, SJ Park, YM Ro IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 3654-3667, 2021 | 25 | 2021 |
Speaker-adaptive lip reading with user-dependent padding M Kim, H Kim, YM Ro European Conference on Computer Vision, 576-593, 2022 | 24 | 2022 |
Visual context-driven audio feature enhancement for robust end-to-end audio-visual speech recognition J Hong, M Kim, D Yoo, YM Ro INTERSPEECH 2022, 2022 | 24 | 2022 |
Lip-to-speech synthesis in the wild with multi-task learning M Kim, J Hong, YM Ro ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 22 | 2023 |
Prompt tuning of deep neural networks for speaker-adaptive visual speech recognition M Kim, HI Kim, YM Ro IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024 | 18 | 2024 |
Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation M Kim, J Choi, D Kim, YM Ro IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 18* | 2024 |
AKVSR: Audio knowledge empowered visual speech recognition by compressing audio knowledge of a pretrained model JH Yeo, M Kim, J Choi, DH Kim, YM Ro IEEE Transactions on Multimedia 26, 6462-6474, 2024 | 18 | 2024 |
Intelligible Lip-to-Speech Synthesis with Speech Units J Choi, M Kim, YM Ro INTERSPEECH 2023, 2023 | 17 | 2023 |
Visual Speech Recognition for Languages with Limited Labeled Data using Automatic Labels from Whisper JH Yeo, M Kim, S Watanabe, YM Ro ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 15* | 2024 |
Lip reading for low-resource languages by learning and combining general speech knowledge and language-specific knowledge M Kim, JH Yeo, J Choi, YM Ro Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 13 | 2023 |
Multi-temporal lip-audio memory for visual speech recognition JH Yeo, M Kim, YM Ro ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 11 | 2023 |
Interpretation of lesional detection via counterfactual generation J Kim, M Kim, YM Ro 2021 IEEE International Conference on Image Processing (ICIP), 96-100, 2021 | 9 | 2021 |
Where visual speech meets language: VSP-LLM framework for efficient and context-aware visual speech processing JH Yeo, S Han, M Kim, YM Ro arXiv preprint arXiv:2402.15151, 2024 | 8 | 2024 |
Towards practical and efficient image-to-speech captioning with vision-language pre-training and multi-modal tokens M Kim, J Choi, S Maiti, JH Yeo, S Watanabe, YM Ro ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 7 | 2024 |