Lin Chen
Title
Cited by
Year
Sharegpt4v: Improving large multi-modal models with better captions
L Chen, J Li, X Dong, P Zhang, C He, J Wang, F Zhao, D Lin
ECCV 2024, 2024
Cited by: 449, Year: 2024
VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models
H Duan, J Yang, Y Qiao, X Fang, L Chen, Y Liu, X Dong, Y Zang, P Zhang, ...
MMOpen 2024, 2024
Cited by: 234*, Year: 2024
Reusing the task-specific classifier as a discriminator: Discriminator-free adversarial domain adaptation
L Chen, H Chen, Z Wei, X Jin, X Tan, Y Jin, E Chen
CVPR 2022, 2022
Cited by: 179, Year: 2022
Are We on the Right Way for Evaluating Large Vision-Language Models?
L Chen, J Li, X Dong, P Zhang, Y Zang, Z Chen, H Duan, J Wang, Y Qiao, ...
NeurIPS 2024, 2024
Cited by: 142, Year: 2024
Sharegpt4video: Improving video understanding and generation with better captions
L Chen, X Wei, J Li, X Dong, P Zhang, Y Zang, Z Chen, H Duan, B Lin, ...
NeurIPS 2024, 2024
Cited by: 87, Year: 2024
Deliberated Domain Bridging for Domain Adaptive Semantic Segmentation
L Chen, Z Wei, X Jin, H Chen, M Zheng, K Chen, Y Jin
NeurIPS 2022, 2022
Cited by: 45, Year: 2022
Freedrag: Point tracking is not you need for interactive point-based image editing
P Ling, L Chen, P Zhang, H Chen, Y Jin
CVPR 2024, 2023
Cited by: 36*, Year: 2023
Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation
Z Wei, L Chen, Y Jin, X Ma, T Liu, P Lin, B Wang, H Chen, J Zheng
CVPR 2024, 2023
Cited by: 29, Year: 2023
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs
Y Qiao, H Duan, X Fang, J Yang, L Chen, S Zhang, J Wang, D Lin, ...
NeurIPS 2024, 2024
Cited by: 10, Year: 2024
Open-sora plan: Open-source large video generation model
B Lin, Y Ge, X Cheng, Z Li, B Zhu, S Wang, X He, Y Ye, S Yuan, L Chen, ...
arXiv preprint arXiv:2412.00131, 2024
Cited by: 9, Year: 2024
Disentangle then Parse: Night-time Semantic Segmentation with Illumination Disentanglement
Z Wei, L Chen, T Tu, H Chen, P Ling, Y Jin
ICCV 2023, 2023
Cited by: 9, Year: 2023
Internlm-xcomposer2.5-omnilive: A comprehensive multimodal system for long-term streaming video and audio interactions
P Zhang, X Dong, Y Cao, Y Zang, R Qian, X Wei, L Chen, Y Li, J Niu, ...
arXiv preprint arXiv:2412.09596, 2024
Cited by: 2, Year: 2024
Internlm-xcomposer-2.5: A versatile large vision language model supporting long-contextual input and output
P Zhang, X Dong, Y Zang, Y Cao, R Qian, L Chen, Q Guo, H Duan, ...
arXiv preprint arXiv:2407.03320, 2024
Year: 2024