Pan Zhang

Cited by

	All	Since 2020
Citations	4570	4565
h-index	24	24
i10-index	33	33

Citations per year: 2020: 12, 2021: 130, 2022: 433, 2023: 778, 2024: 2711, 2025: 497

Public access

6 articles available, 0 articles not available

Based on funding mandates

Co-authors

Jiaqi Wang, Shanghai AI Laboratory (verified email at pjlab.org.cn)
Xiaoyi Dong, Shanghai AI Laboratory (verified email at mail.ustc.edu.cn)
Dahua Lin, The Chinese University of Hong Kong (verified email at ie.cuhk.edu.hk)
Yuhang Zang, Shanghai AI Laboratory (verified email at pjlab.org.cn)
Conghui He, Shanghai AI Laboratory (verified email at pjlab.org.cn)
Yuhang Cao, MMLab, The Chinese University of Hong Kong (verified email at ie.cuhk.edu.hk)
Bo Zhang, Zhejiang University, ZJU100 Young Professor (verified email at zju.edu.cn)
Dong Chen, Principal Research Manager, Microsoft Research Asia (verified email at microsoft.com)
Ting Zhang, Associate Professor, Beijing Normal University (verified email at bnu.edu.cn)
Bin Wang (王斌), Shanghai AI Laboratory (verified email at pjlab.org.cn)
Tong Wu, Stanford University (verified email at stanford.edu)
Ziyu Wan, City University of Hong Kong (verified email at cs.stanford.edu)
Jianmin Bao, Microsoft Research (verified email at microsoft.com)
Dongdong Chen, Principal Research Manager, GenAI, Microsoft (verified email at mail.ustc.edu.cn)
Lu Yuan, Research Scientist, GenAI, Meta
Hao Yang, Moonshot AI

Pan Zhang

Shanghai AI Laboratory

Verified email at mail.ustc.edu.cn

Multimodal LLM, Image Synthesis, Video Synthesis


Title	Cited by	Year
Prototypical pseudo label denoising and target structure learning for domain adaptive semantic segmentation. P Zhang, B Zhang, T Zhang, D Chen, Y Wang, F Wen. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021	598	2021
Cross-domain correspondence learning for exemplar-based image translation. P Zhang, B Zhang, D Chen, L Yuan, F Wen. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020	505	2020
ShareGPT4V: Improving large multi-modal models with better captions. L Chen, J Li, X Dong, P Zhang, C He, J Wang, F Zhao, D Lin. European Conference on Computer Vision, 370-387, 2024	482	2024
CoCosNet v2: Full-resolution correspondence learning for image translation. X Zhou, B Zhang, T Zhang, P Zhang, J Bao, D Chen, Z Zhang, F Wen. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021	364	2021
Bringing old photos back to life. Z Wan, B Zhang, D Chen, P Zhang, D Chen, J Liao, F Wen. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020	252	2020
VLMEvalKit: An open-source toolkit for evaluating large multi-modality models. H Duan, J Yang, Y Qiao, X Fang, L Chen, Y Liu, X Dong, Y Zang, P Zhang, ... Proceedings of the 32nd ACM International Conference on Multimedia, 11198-11201, 2024	251*	2024
InternLM2 technical report. Z Cai, M Cao, H Chen, K Chen, K Chen, X Chen, X Chen, Z Chen, Z Chen, ... arXiv preprint arXiv:2403.17297, 2024	243	2024
InternLM-XComposer2: Mastering free-form text-image composition and comprehension in vision-language large model. X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, X Wei, S Zhang, ... arXiv preprint arXiv:2401.16420, 2024	229	2024
InternLM: A multilingual language model with progressively enhanced capabilities. ILM Team	193	2023
InternLM-XComposer: A vision-language large model for advanced text-image comprehension and composition. P Zhang, X Dong, B Wang, Y Cao, C Xu, L Ouyang, Z Zhao, H Duan, ... arXiv preprint arXiv:2309.15112, 2023	192	2023
Are we on the right way for evaluating large vision-language models? L Chen, J Li, X Dong, P Zhang, Y Zang, Z Chen, H Duan, J Wang, Y Qiao, ... arXiv preprint arXiv:2403.20330, 2024	163	2024
OPERA: Alleviating hallucination in multi-modal large language models via over-trust penalty and retrospection-allocation. Q Huang, X Dong, P Zhang, B Wang, C He, J Wang, D Lin, W Zhang, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024	143	2024
InternLM-XComposer2-4KHD: A pioneering large vision-language model handling resolutions from 336 pixels to 4K HD. X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, S Zhang, H Duan, ... Advances in Neural Information Processing Systems 37, 42566-42592, 2025	110	2025
ShareGPT4Video: Improving video understanding and generation with better captions. L Chen, X Wei, J Li, X Dong, P Zhang, Y Zang, Z Chen, H Duan, B Lin, ... arXiv preprint arXiv:2406.04325, 2024	100	2024
Old photo restoration via deep latent space translation. Z Wan, B Zhang, D Chen, P Zhang, F Wen, J Liao. IEEE Transactions on Pattern Analysis and Machine Intelligence 45 (2), 2071-2087, 2022	85	2022
Long-CLIP: Unlocking the long-text capability of CLIP. B Zhang, P Zhang, X Dong, Y Zang, J Wang. European Conference on Computer Vision, 310-325, 2024	83	2024
InternLM-XComposer-2.5: A versatile large vision language model supporting long-contextual input and output. P Zhang, X Dong, Y Zang, Y Cao, R Qian, L Chen, Q Guo, H Duan, ... arXiv preprint arXiv:2407.03320, 2024	79	2024
VIGC: Visual instruction generation and correction. B Wang, F Wu, X Han, J Peng, H Zhong, P Zhang, X Dong, W Li, W Li, ... Proceedings of the AAAI Conference on Artificial Intelligence 38 (6), 5309-5317, 2024	68	2024
V3Det: Vast vocabulary visual detection dataset. J Wang, P Zhang, T Chu, Y Cao, Y Zhou, T Wu, B Wang, C He, D Lin. Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023	57	2023
Alpha-CLIP: A CLIP model focusing on wherever you want. Z Sun, Y Fang, T Wu, P Zhang, Y Zang, S Kong, Y Xiong, D Lin, J Wang. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2024	56	2024

Articles 1–20
