Haotian Zhang

Cited by

	All	Since 2020
Citations	3364	3312
h-index	18	18
i10-index	26	25

Citations per year: 2019: 44; 2020: 130; 2021: 196; 2022: 282; 2023: 694; 2024: 1695; 2025: 313

Public access

2 articles

1 article

available

not available

Based on funding mandates

Co-authors

Jenq-Neng Hwang, University of Washington. Verified email at u.washington.edu
Yinfei Yang, Apple. Verified email at apple.com
Zhe Gan, Research Scientist, Apple. Verified email at apple.com
Bowen Zhang, Apple. Verified email at apple.com
Yizhou Wang, NVIDIA; University of Washington. Verified email at nvidia.com
Jianfeng Gao, Microsoft Research, Redmond. Verified email at microsoft.com
Pengchuan Zhang, Meta AI. Verified email at fb.com
Lijuan Wang, Microsoft GenAI. Verified email at microsoft.com
Liunian Harold Li, OpenAI. Verified email at cs.ucla.edu
Xianzhi Du, Research Scientist, Apple AI/ML. Verified email at apple.com
Gaoang Wang, Zhejiang University / University of Illinois Urbana-Champaign Institute. Verified email at intl.zju.edu.cn
Haoxuan You, Columbia University. Verified email at columbia.edu
Lei Zhang, International Digital Economy Academy (IDEA). Verified email at idea.edu.cn
Jianwei Yang, Principal Researcher, Microsoft Research, Redmond. Verified email at microsoft.com
Chunyuan Li, xAI. Verified email at x.ai
Yanghao Li, Facebook AI Research (FAIR). Verified email at fb.com


Haotian Zhang

Research Scientist, Apple

Verified email at apple.com

Deep Learning, Computer Vision, Vision + Language


Title	Cited by	Year
Grounded language-image pre-training; LH Li, P Zhang, H Zhang*, J Yang, C Li, Y Zhong, L Wang, L Yuan, ...; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022	1180	2022
Glipv2: Unifying localization and vision-language understanding; H Zhang, P Zhang, X Hu, YC Chen, LH Li, X Dai, L Wang, L Yuan, ...; NeurIPS, 2022	308	2022
Ferret: Refer and ground anything anywhere at any granularity; H You, H Zhang, Z Gan, X Du, B Zhang, Z Wang, L Cao, SF Chang, ...; ICLR, 2023	256	2023
Simple applications of BERT for ad hoc document retrieval; W Yang, H Zhang, J Lin; arXiv preprint arXiv:1903.10972, 2019	242	2019
Exploit the connectivity: Multi-object tracking with trackletnet; G Wang, Y Wang, H Zhang, R Gu, JN Hwang; Proceedings of the 27th ACM international conference on multimedia, 482-490, 2019	239	2019
Transmvsnet: Global context-aware multi-view stereo network with transformers; Y Ding, W Yuan, Q Zhu, H Zhang, X Liu, Y Wang, X Liu; Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022	215	2022
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training; B McKinzie, Z Gan, JP Fauconnier, S Dodge, B Zhang, P Dufter, D Shah, ...; ECCV, 2024	208	2024
An internal learning approach to video inpainting; H Zhang, L Mai, N Xu, Z Wang, J Collomosse, H Jin; Proceedings of the IEEE/CVF international conference on computer vision …, 2019	97	2019
Eye in the sky: Drone-based object tracking and 3d localization; H Zhang, G Wang, Z Lei, JN Hwang; Proceedings of the 27th ACM international conference on multimedia, 899-907, 2019	92	2019
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs; K You, H Zhang, E Schoop, F Weers, A Swearngin, J Nichols, Y Yang, ...; ECCV, 2024	80	2024
Visdrone-mot2019: The vision meets drone multiple object tracking challenge results; L Wen, P Zhu, D Du, X Bian, H Ling, Q Hu, J Zheng, T Peng, X Wang, ...; Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019	63	2019
VisDrone-SOT2019: The vision meets drone single object tracking challenge results; D Du, P Zhu, L Wen, X Bian, H Ling, Q Hu, J Zheng, T Peng, X Wang, ...; Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019	61	2019
Apple intelligence foundation language models; T Gunter, Z Wang, C Wang, R Pang, A Narayanan, A Zhang, B Zhang, ...; arXiv preprint arXiv:2407.21075, 2024	40	2024
Ferret-v2: An improved baseline for referring and grounding with large language models; H Zhang; COLM, 2024	29*	2024
How easy is it to fool your multimodal llms? an empirical analysis on deceptive prompts; Y Qian, H Zhang, Y Yang, Z Gan; arXiv preprint arXiv:2402.13220 2 (7), 2024	28	2024
From scarcity to efficiency: Improving clip training via visual-enriched captions; Z Lai, H Zhang, W Wu, H Bai, A Timofeev, X Du, Z Gan, J Shan, ...; ECCV2024, 2023	27	2023
Bundle adjustment for monocular visual odometry based on detections of traffic signs; Y Zhang, H Zhang, G Wang, J Yang, JN Hwang; IEEE transactions on vehicular technology 69 (1), 151-162, 2019	21	2019
From scarcity to efficiency: Improving clip training via visual-enriched captions; Z Lai, H Zhang, B Zhang, W Wu, H Bai, A Timofeev, X Du, Z Gan, J Shan, ...; European Conference on Computer Vision, 111-127, 2025	20*	2025
Empowering unsupervised domain adaptation with large-scale pre-trained vision-language models; Z Lai, H Bai, H Zhang, X Du, J Shan, Y Yang, CN Chuah, M Cao; Proceedings of the ieee/cvf winter conference on applications of computer …, 2024	18	2024
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning; H Zhang, M Gao, Z Gan*, P Dufter, N Wenzel, F Huang, D Shah, X Du, ...; ICLR2025, 2024	17	2024
