Investigating local and global information for automated audio captioning with transfer learning X Xu, H Dinkel, M Wu, Z Xie, K Yu ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 69 | 2021 |
Can audio captions be evaluated with image caption metrics? Z Zhou, Z Zhang, X Xu, Z Xie, M Wu, KQ Zhu ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 63 | 2022 |
Predicting tensile properties of AZ31 magnesium alloys by machine learning X Xu, L Wang, G Zhu, X Zeng JOM 72 (11), 3935-3942, 2020 | 60 | 2020 |
Voice activity detection in the wild: A data-driven approach using teacher-student training H Dinkel, S Wang, X Xu, M Wu, K Yu IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 1542-1555, 2021 | 55 | 2021 |
A CRNN-GRU Based Reinforcement Learning Approach to Audio Captioning. X Xu, H Dinkel, M Wu, K Yu DCASE, 225-229, 2020 | 52 | 2020 |
The SJTU system for DCASE2022 challenge task 6: Audio captioning with audio-text retrieval pre-training X Xu, Z Xie, M Wu, K Yu Tech. Rep., DCASE2022 Challenge, 2022 | 38 | 2022 |
Audio-text retrieval in context S Lou, X Xu, M Wu, K Yu ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 38 | 2022 |
Beyond the status quo: A contemporary survey of advances and challenges in audio captioning X Xu, Z Xie, M Wu, K Yu IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023 | 28* | 2023 |
Text-to-audio grounding: Building correspondence between captions and sound events X Xu, H Dinkel, M Wu, K Yu ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 28 | 2021 |
Audio caption in a car setting with a sentence-level loss X Xu, H Dinkel, M Wu, K Yu 2021 12th International Symposium on Chinese Spoken Language Processing …, 2021 | 23 | 2021 |
A large-scale dataset for audio-language representation learning L Sun, X Xu, M Wu, W Xie arXiv preprint arXiv:2309.11500, 2023 | 22* | 2023 |
SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound H Liu, X Xu, Y Yuan, M Wu, W Wang, MD Plumbley arXiv preprint arXiv:2405.00233, 2024 | 20 | 2024 |
Sound-based construction activity monitoring with deep learning W Xiong, X Xu, L Chen, J Yang Buildings 12 (11), 1947, 2022 | 20 | 2022 |
The SJTU system for DCASE2021 challenge task 6: Audio captioning based on encoder pre-training and reinforcement learning X Xu, Z Xie, M Wu, K Yu Proc. Conf. Detection Classification Acoust. Scenes Events, 1-4, 2021 | 18 | 2021 |
Automatic detection pipeline for accessing the motor severity of Parkinson’s disease in finger tapping and postural stability N Yang, DF Liu, T Liu, T Han, P Zhang, X Xu, S Lou, HG Liu, AC Yang, ... IEEE Access 10, 66961-66973, 2022 | 17 | 2022 |
BLAT: Bootstrapping language-audio pre-training based on audioset tag-guided synthetic data X Xu, Z Zhang, Z Zhou, P Zhang, Z Xie, M Wu, KQ Zhu Proceedings of the 31st ACM International Conference on Multimedia, 2756-2764, 2023 | 14 | 2023 |
Enhance temporal relations in audio captioning with sound event detection Z Xie, X Xu, M Wu, K Yu arXiv preprint arXiv:2306.01533, 2023 | 12 | 2023 |
PicoAudio: Enabling precise timestamp and frequency controllability of audio events in text-to-audio generation Z Xie, X Xu, Z Wu, M Wu arXiv preprint arXiv:2407.02869, 2024 | 11 | 2024 |
Towards Weakly Supervised Text-to-Audio Grounding X Xu, Z Ma, M Wu, K Yu arXiv preprint arXiv:2401.02584, 2024 | 9 | 2024 |
A Lightweight Framework for Online Voice Activity Detection in the Wild. X Xu, H Dinkel, M Wu, K Yu Interspeech, 371-375, 2021 | 9 | 2021 |