Title | Authors | Venue | Cited by | Year |
FlashSpeech: Efficient Zero-Shot Speech Synthesis | Z Ye, Z Ju, H Liu, X Tan, J Chen, Y Lu, P Sun, J Pan, W Bian, S He, Q Liu, ... | ACM MM 2024, 2024 | 12 | 2024 |
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model | Z Ye, P Sun, J Lei, H Lin, X Tan, Z Dai, Q Kong, J Chen, J Pan, Q Liu, ... | AAAI 2025, 2024 | 5 | 2024 |
Ref-AVS: Refer and segment objects in audio-visual scenes | Y Wang*, P Sun*, D Zhou*, G Li, H Zhang, D Hu | ECCV 2024, 2024 | 5 | 2024 |
Stepping stones: A progressive training strategy for audio-visual semantic segmentation | J Ma, P Sun, Y Wang, D Hu | ECCV 2024, 2024 | 4 | 2024 |
Can Textual Semantics Mitigate Sounding Object Segmentation Preference? | Y Wang*, P Sun*, Y Li, H Zhang, D Hu | ECCV 2024, 2024 | 4 | 2024 |
A method of audio-visual person verification by mining connections between time series | P Sun, S Zhang, Z Liu, Y Yuan, T Zhang, H Zhang, P Hu | Proc. Interspeech, 3227-3231, 2023 | 4 | 2023 |
Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation | P Sun, S Cheng, X Li, Z Ye, H Liu, H Zhang, W Xue, Y Guo | arXiv preprint arXiv:2410.10676, 2024 | 1 | 2024 |
Predicting central cervical lymph node metastasis of papillary thyroid carcinomas using multi-view ultrasound images | Z Liu, P Sun, D Chen, H Zhang, Y Li | International Conference on Medical Imaging and Computer-Aided Diagnosis, 83-91, 2023 | 1 | 2023 |
Unveiling and Mitigating Bias in Audio Visual Segmentation | P Sun, H Zhang, D Hu | ACM MM 2024, 2024 | | 2024 |
Enhancing Few-shot Classification through Token Selection for Balanced Learning | W Zeng, P Sun, H Zhang | 2024 International Joint Conference on Neural Networks (IJCNN), 1-9, 2024 | | 2024 |