Conghui He
Shanghai AI Laboratory
Verified email at pjlab.org.cn
Title
Cited by
Year
MMBench: Is your multi-modal model an all-around player?
Y Liu, H Duan, Y Zhang, B Li, S Zhang, W Zhao, Y Yuan, J Wang, C He, ...
European Conference on Computer Vision, 216-233, 2024
Cited by: 768, Year: 2024
LLaMA-Adapter V2: Parameter-efficient visual instruction model
P Gao, J Han, R Zhang, Z Lin, S Geng, A Zhou, W Zhang, P Lu, C He, ...
arXiv preprint arXiv:2304.15010, 2023
Cited by: 530, Year: 2023
ShareGPT4V: Improving large multi-modal models with better captions
L Chen, J Li, X Dong, P Zhang, C He, J Wang, F Zhao, D Lin
European Conference on Computer Vision, 370-387, 2024
Cited by: 476, Year: 2024
How far are we to GPT-4V? Closing the gap to commercial multimodal models with open-source suites
Z Chen, W Wang, H Tian, S Ye, Z Gao, E Cui, W Tong, K Hu, J Luo, Z Ma, ...
Science China Information Sciences 67 (12), 220101, 2024
Cited by: 380, Year: 2024
Semantic segmentation-based building footprint extraction using very high-resolution satellite images and multi-source GIS data
W Li, C He, J Fang, J Zheng, H Fu, L Yu
Remote Sensing 11 (4), 403, 2019
Cited by: 249, Year: 2019
InternLM2 technical report
Z Cai, M Cao, H Chen, K Chen, K Chen, X Chen, X Chen, Z Chen, Z Chen, ...
arXiv preprint arXiv:2403.17297, 2024
Cited by: 238, Year: 2024
InternLM-XComposer2: Mastering free-form text-image composition and comprehension in vision-language large model
X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, X Wei, S Zhang, ...
arXiv preprint arXiv:2401.16420, 2024
Cited by: 227, Year: 2024
InternVid: A large-scale video-text dataset for multimodal understanding and generation
Y Wang, Y He, Y Li, K Li, J Yu, X Ma, X Li, G Chen, X Chen, Y Wang, C He, ...
arXiv preprint arXiv:2307.06942, 2023
Cited by: 227, Year: 2023
InternLM-XComposer: A vision-language large model for advanced text-image comprehension and composition
P Zhang, X Dong, B Wang, Y Cao, C Xu, L Ouyang, Z Zhao, H Duan, ...
arXiv preprint arXiv:2309.15112, 2023
Cited by: 191, Year: 2023
PersFormer: 3D lane detection via perspective transformer and the OpenLane benchmark
L Chen, C Sima, Y Li, Z Zheng, J Xu, X Geng, H Li, C He, J Shi, Y Qiao, ...
European Conference on Computer Vision, 550-567, 2022
Cited by: 177, Year: 2022
9-Pflops nonlinear earthquake simulation on Sunway TaihuLight: enabling depiction of 18-Hz and 8-meter scenarios
H Fu, C He, B Chen, Z Yin, Z Zhang, W Zhang, T Zhang, W Xue, W Liu, ...
Proceedings of the International Conference for High Performance Computing …, 2017
Cited by: 147, Year: 2017
OPERA: Alleviating hallucination in multi-modal large language models via over-trust penalty and retrospection-allocation
Q Huang, X Dong, P Zhang, B Wang, C He, J Wang, D Lin, W Zhang, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 140, Year: 2024
InternLM-XComposer2-4KHD: A pioneering large vision-language model handling resolutions from 336 pixels to 4K HD
X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, S Zhang, H Duan, ...
Advances in Neural Information Processing Systems 37, 42566-42592, 2025
Cited by: 109, Year: 2025
Influence selection for active learning
Z Liu, H Ding, H Zhong, W Li, J Dai, C He
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021
Cited by: 101, Year: 2021
SPHINX-X: Scaling data and parameters for a family of multi-modal large language models
D Liu, R Zhang, L Qiu, S Huang, W Lin, S Zhao, S Geng, Z Lin, P Jin, ...
arXiv preprint arXiv:2402.05935, 2024
Cited by: 97, Year: 2024
Beyond hallucinations: Enhancing LVLMs through hallucination-aware direct preference optimization
Z Zhao, B Wang, L Ouyang, X Dong, J Wang, C He
arXiv preprint arXiv:2311.16839, 2023
Cited by: 87, Year: 2023
Think twice before driving: Towards scalable decoders for end-to-end autonomous driving
X Jia, P Wu, L Chen, J Xie, C He, J Yan, H Li
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 80, Year: 2023
InternLM-XComposer-2.5: A versatile large vision language model supporting long-contextual input and output
P Zhang, X Dong, Y Zang, Y Cao, R Qian, L Chen, Q Guo, H Duan, ...
arXiv preprint arXiv:2407.03320, 2024
Cited by: 79, Year: 2024
VIGC: Visual instruction generation and correction
B Wang, F Wu, X Han, J Peng, H Zhong, P Zhang, X Dong, W Li, W Li, ...
Proceedings of the AAAI Conference on Artificial Intelligence 38 (6), 5309-5317, 2024
Cited by: 67, Year: 2024
V3Det: Vast vocabulary visual detection dataset
J Wang, P Zhang, T Chu, Y Cao, Y Zhou, T Wu, B Wang, C He, D Lin
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 56, Year: 2023