Cong Wei
Verified email at uwaterloo.ca
Title
Cited by
Year
MMMU: A massive multi-discipline multimodal understanding and reasoning benchmark for expert AGI
X Yue, Y Ni, K Zhang, T Zheng, R Liu, G Zhang, S Stevens, D Jiang, ...
CVPR 2024 (Oral); Best Paper Candidate, 2024
Cited by: 585; Year: 2024
Mantis: Interleaved multi-image instruction tuning
D Jiang, X He, H Zeng, C Wei, M Ku, Q Liu, W Chen
TMLR 2024, 2024
Cited by: 77; Year: 2024
ConsistI2V: Enhancing visual consistency for image-to-video generation
W Ren, H Yang, G Zhang, C Wei, X Du, W Huang, W Chen
TMLR 2024, 2024
Cited by: 48; Year: 2024
UniIR: Training and benchmarking universal multimodal information retrievers
C Wei, Y Chen, H Chen, H Hu, G Zhang, J Fu, A Ritter, W Chen
ECCV 2024 (Oral), 2025
Cited by: 38; Year: 2025
AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks
M Ku*, C Wei*, W Ren*, H Yang, W Chen
TMLR 2024 (Reproducibility Certification), 2024
Cited by: 37*; Year: 2024
VIEScore: Towards explainable metrics for conditional image synthesis evaluation
M Ku, D Jiang, C Wei, X Yue, W Chen
ACL, 2023
Cited by: 36; Year: 2023
DreamEdit: Subject-driven image editing
T Li, M Ku*, C Wei*, W Chen
TMLR, 2023
Cited by: 26; Year: 2023
Sparsifiner: Learning sparse instance-dependent attention for efficient vision transformers
C Wei*, B Duke*, R Jiang, P Aarabi, GW Taylor, F Shkurti
CVPR 2023, 2023
Cited by: 16; Year: 2023
MMMU: A massive multi-discipline multimodal understanding and reasoning benchmark for expert AGI, 2024
X Yue, Y Ni, K Zhang, T Zheng, R Liu, G Zhang, S Stevens, D Jiang, ...
URL https://arxiv.org/abs/2311.16502 18, 0
Cited by: 12
VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation
W Ren, H Yang, J Min, C Wei, W Chen
arXiv preprint arXiv:2412.00927, 2024
Year: 2024
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision
C Wei*, Z Xiong*, W Ren, X Du, G Zhang, W Chen
ICLR 2025, 2024
Year: 2024