Dong Zhang
Verified email at m.fudan.edu.cn
Title
Cited by
Year
SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities
D Zhang, S Li, X Zhang, J Zhan, P Wang, Y Zhou, X Qiu
EMNLP 2023 (Findings), 2023
Cited by 252 (2023)
SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models
X Zhang*, D Zhang*, S Li, Y Zhou, X Qiu
ICLR 2024, 2023
Cited by 109* (2023)
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling
J Zhan, J Dai, J Ye, Y Zhou, D Zhang, Z Liu, X Zhang, R Yuan, G Zhang, ...
ACL 2024, 2024
Cited by 84 (2024)
SeqXGPT: Sentence-Level AI-Generated Text Detection
P Wang, L Li, K Ren, B Jiang, D Zhang, X Qiu
EMNLP 2023, 2023
Cited by 47 (2023)
GroundingGPT: Language Enhanced Multi-modal Grounding Model
Z Li, Q Xu, D Zhang, H Song, Y Cai, Q Qi, R Zhou, J Pan, Z Li, VT Vu, ...
ACL 2024, 2024
Cited by 40* (2024)
InferAligner: Inference-Time Alignment for Harmlessness through Cross-Model Guidance
P Wang, D Zhang, L Li, C Tan, X Wang, K Ren, B Jiang, X Qiu
EMNLP 2024, 2024
Cited by 30 (2024)
DUB: Discrete Unit Back-translation for Speech Translation
D Zhang, R Ye, T Ko, M Wang, Y Zhou
ACL 2023 (Findings), 2023
Cited by 25 (2023)
SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation
D Zhang, X Zhang, J Zhan, S Li, Y Zhou, X Qiu
arXiv preprint arXiv:2401.13527, 2024
Cited by 14 (2024)
SpeechAlign: Aligning Speech Generation to Human Preferences
D Zhang, Z Li, S Li, X Zhang, P Wang, Y Zhou, X Qiu
NeurIPS 2024, 2024
Cited by 12 (2024)
GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators
Y Hu, C Chen, CHH Yang, R Li, D Zhang, Z Chen, ES Chng
ACL 2024, 2024
Cited by 11 (2024)
ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech
J Shi, J Tian, Y Wu, J Jung, JQ Yip, Y Masuyama, W Chen, Y Wu, Y Tang, ...
SLT 2024, 2024
Cited by 6 (2024)
UnifiedMLLM: Enabling Unified Representation for Multi-Modal Multi-Tasks with Large Language Model
Z Li, W Wang, YQ Cai, X Qi, P Wang, D Zhang, H Song, B Jiang, Z Huang, ...
NAACL 2025 (Findings), 2024
Cited by 6 (2024)
SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems
D Zhang, Z Li, P Wang, X Zhang, Y Zhou, X Qiu
arXiv preprint arXiv:2401.03945, 2024
Cited by 5 (2024)
Automatic Audio Captioning with Encoder Fusion, Multi-Layer Aggregation, and Large Language Model Enriched Summarization
J Jung, D Zhang, HCH Yang, SL Wu, DM Chan, Z Kong, D Ruifan, ...
DCASE Challenge, Tech. Rep., 2024
Cited by 2 (2024)
BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments
X Wang, P Wang, B Wang, D Zhang, Y Zhou, X Qiu
ICLR 2025, 2024
Cited by 1 (2024)
IntrinsicVoice: Empowering LLMs with Intrinsic Real-Time Voice Interaction Abilities
X Zhang, X Lyu, Z Du, Q Chen, D Zhang, H Hu, C Tan, T Zhao, Y Wang, ...
arXiv preprint arXiv:2410.08035, 2024
Cited by 1 (2024)
MetaAlign: Align Large Language Models with Diverse Preferences during Inference Time
M Zhang, P Wang, C Tan, M Huang, D Zhang, Y Zhou, X Qiu
NAACL 2025 (Findings), 2024
2024
Articles 1–17