Digestive organ recognition in video capsule endoscopy based on temporal segmentation network. Y Shin, T Eo, H Rha, DJ Oh, G Son, J An, YJ Kim, D Hwang, YJ Lim. International Conference on Medical Image Computing and Computer-Assisted …, 2022 | 8 | 2022 |
Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation. SJ Park, CW Kim, H Rha, M Kim, J Hong, JH Yeo, YM Ro. arXiv preprint arXiv:2406.07867, 2024 | 6 | 2024 |
TMT: Tri-modal translation between speech, image, and text by processing different modalities as different languages. M Kim, J Jung, H Rha, S Maiti, S Arora, X Chang, S Watanabe, YM Ro. arXiv preprint arXiv:2402.16021, 2024 | 5 | 2024 |
Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation. M Kim, J Yeo, SJ Park, H Rha, YM Ro. Proceedings of the 32nd ACM International Conference on Multimedia, 1311-1320, 2024 | 2 | 2024 |
AV-EmoDialog: Chat with Audio-Visual Users Leveraging Emotional Cues. SJ Park, Y Kim, H Rha, B Godiva, YM Ro. arXiv preprint arXiv:2412.17292, 2024 | | 2024 |
Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language. JH Yeo, CW Kim, H Kim, H Rha, S Han, WH Cheng, YM Ro. arXiv preprint arXiv:2409.00986, 2024 | | 2024 |