Zeyu Xie
Verified email at sjtu.edu.cn
Title
Cited by
Year
Investigating local and global information for automated audio captioning with transfer learning
X Xu, H Dinkel, M Wu, Z Xie, K Yu
ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and …, 2021
Cited by 68, 2021
Can audio captions be evaluated with image caption metrics?
Z Zhou, Z Zhang, X Xu, Z Xie, M Wu, KQ Zhu
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and …, 2022
Cited by 63, 2022
The SJTU system for DCASE2022 challenge task 6: Audio captioning with audio-text retrieval pre-training
X Xu, Z Xie, M Wu, K Yu
Tech. Rep., DCASE2022 Challenge, 2022
Cited by 37, 2022
The SJTU system for DCASE2021 challenge task 6: Audio captioning based on encoder pre-training and reinforcement learning
X Xu, Z Xie, M Wu, K Yu
Proc. Conf. Detection Classification Acoust. Scenes Events, 1-4, 2021
Cited by 18, 2021
Beyond the status quo: A contemporary survey of advances and challenges in audio captioning
X Xu, Z Xie, M Wu, K Yu
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023
Cited by 15, 2023
BLAT: Bootstrapping language-audio pre-training based on AudioSet tag-guided synthetic data
X Xu, Z Zhang, Z Zhou, P Zhang, Z Xie, M Wu, KQ Zhu
Proceedings of the 31st ACM International Conference on Multimedia, 2756-2764, 2023
Cited by 14, 2023
Enhance temporal relations in audio captioning with sound event detection
Z Xie, X Xu, M Wu, K Yu
arXiv preprint arXiv:2306.01533, 2023
Cited by 12, 2023
PicoAudio: Enabling precise timestamp and frequency controllability of audio events in text-to-audio generation
Z Xie, X Xu, Z Wu, M Wu
arXiv preprint arXiv:2407.02869, 2024
Cited by 11, 2024
A Detailed Audio-Text Data Simulation Pipeline Using Single-Event Sounds
X Xu, X Xu, Z Xie, P Zhang, M Wu, K Yu
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by 5, 2024
AudioTime: A Temporally-aligned Audio-text Benchmark Dataset
Z Xie, X Xu, Z Wu, M Wu
arXiv preprint arXiv:2407.02857, 2024
Cited by 4, 2024
Enhancing Audio Generation Diversity with Visual Information
Z Xie, B Li, X Xu, M Wu, K Yu
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by 4, 2024
FakeSound: Deepfake General Audio Detection
Z Xie, B Li, X Xu, Z Liang, K Yu, M Wu
arXiv preprint arXiv:2406.08052, 2024
Cited by 3, 2024
Phonetic and Lexical Discovery of a Canine Language using HuBERT
X Li, S Wang, Z Xie, M Wu, KQ Zhu
arXiv preprint arXiv:2402.15985, 2024
Cited by 1, 2024
The X-LANCE system for DCASE2023 challenge task 7: Foley sound synthesis track b
Z Xie, X Xu, B Li, M Wu, K Yu
Tech. Rep., June, 2023
Cited by 1, 2023
Overview of the Amphion Toolkit (v0.2)
J Li, X Zhang, Y Wang, H He, C Wang, L Wang, H Liao, J Ao, Z Xie, ...
arXiv preprint arXiv:2501.15442, 2025
2025
DiveSound: LLM-Assisted Automatic Taxonomy Construction for Diverse Audio Generation
B Li, Z Xie, X Xu, Y Guo, M Yan, J Zhang, K Yu, M Wu
arXiv preprint arXiv:2407.13198, 2024
2024
Improving Audio Caption Fluency with Automatic Error Correction
H Zhang, Z Xie, X Xu, M Wu, K Yu
arXiv preprint arXiv:2306.10090, 2023
2023