Yueze Wang
Beijing Academy of Artificial Intelligence (BAAI)
Verified email at baai.ac.cn
Title
Cited by
Year
Emu: Generative pretraining in multimodality
Q Sun, Q Yu, Y Cui, F Zhang, X Zhang, Y Wang, H Gao, J Liu, T Huang, ...
arXiv preprint arXiv:2307.05222, 2023
Cited by: 230; Year: 2023
Generative multimodal models are in-context learners
Q Sun, Y Cui, X Zhang, F Zhang, Q Yu, Y Wang, Y Rao, J Liu, T Huang, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 208; Year: 2024
Efficient multimodal learning from data-centric perspective
M He, Y Liu, B Wu, J Yuan, Y Wang, T Huang, B Zhao
arXiv preprint arXiv:2402.11530, 2024
Cited by: 88; Year: 2024
Emu3: Next-token prediction is all you need
X Wang, X Zhang, Z Luo, Q Sun, Y Cui, J Wang, F Zhang, Y Wang, Z Li, ...
arXiv preprint arXiv:2409.18869, 2024
Cited by: 76; Year: 2024
Fine-grained visual prompting
L Yang, Y Wang, X Li, X Wang, J Yang
Advances in Neural Information Processing Systems 36, 24993-25006, 2023
Cited by: 56; Year: 2023
Omnigen: Unified image generation
S Xiao, Y Wang, J Zhou, H Yuan, X Xing, R Yan, S Wang, T Huang, Z Liu
arXiv preprint arXiv:2409.11340, 2024
Cited by: 28; Year: 2024
Densefusion-1m: Merging vision experts for comprehensive multimodal perception
X Li, F Zhang, H Diao, Y Wang, X Wang, LY Duan
arXiv preprint arXiv:2407.08303, 2024
Cited by: 20; Year: 2024
DSMENet: Detail and structure mutually enhancing network for under-sampled MRI reconstruction
Y Wang, Y Pang, C Tong
Computers in Biology and Medicine 154, 106204, 2023
Cited by: 20; Year: 2023
Unveiling encoder-free vision-language models
H Diao, Y Cui, X Li, Y Wang, H Lu, X Wang
arXiv preprint arXiv:2406.11832, 2024
Cited by: 19; Year: 2024
Universal prompt optimizer for safe text-to-image generation
Z Wu, H Gao, Y Wang, X Zhang, S Wang
Proceedings of the 2024 Conference of the North American Chapter of the …, 2024
Cited by: 13; Year: 2024
HIWDNet: A hybrid image-wavelet domain network for fast magnetic resonance image reconstruction
C Tong, Y Pang, Y Wang
Computers in Biology and Medicine 151, 105947, 2022
Cited by: 12; Year: 2022
Seeing clearly, answering incorrectly: A multimodal robustness benchmark for evaluating mllms on leading questions
Y Liu, Z Liang, Y Wang, M He, J Li, B Zhao
arXiv preprint arXiv:2406.10638, 2024
Cited by: 7; Year: 2024
Generative Pretraining in Multimodality. CoRR abs/2307.05222 (2023)
Q Sun, Q Yu, Y Cui, F Zhang, X Zhang, Y Wang, H Gao, J Liu, T Huang, ...
Cited by: 4; Year: 2023
Generative Multimodal Models are In-Context Learners. CoRR abs/2312.13286 (2023)
Q Sun, Y Cui, X Zhang, F Zhang, Q Yu, Z Luo, Y Wang, Y Rao, J Liu, ...
Cited by: 4; Year: 2023
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval
J Zhou, Z Liu, Z Liu, S Xiao, Y Wang, B Zhao, CJ Zhang, D Lian, Y Xiong
arXiv preprint arXiv:2412.14475, 2024
Cited by: 3; Year: 2024
Emu: Generative Pretraining in Multimodality
Q Sun, Q Yu, Y Cui, F Zhang, X Zhang, Y Wang, H Gao, J Liu, T Huang, ...
URL http://arxiv.org/abs/2307.05222, 2024
Cited by: 3*; Year: 2024
Generative pretraining in multimodality (2023)
Q Sun, Q Yu, Y Cui, F Zhang, X Zhang, Y Wang, H Gao, J Liu, T Huang
arXiv preprint arXiv:2307.05222, 2023
Cited by: 2*; Year: 2023
EVEv2: Improved Baselines for Encoder-Free Vision-Language Models
H Diao, X Li, Y Cui, Y Wang, H Deng, T Pan, W Wang, H Lu, X Wang
arXiv preprint arXiv:2502.06788, 2025
Year: 2025
Fine-Grained Visual Text Prompting
L Yang, X Li, Y Wang, X Wang, J Yang
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024
Year: 2024
Generative Pretraining in Multimodality
Q Sun, Q Yu, Y Cui, F Zhang, X Zhang, Y Wang, H Gao, J Liu, T Huang, ...
The Twelfth International Conference on Learning Representations, 2023
Year: 2023