Xiaotian Han

Cited by

	All	Since 2020
Citations	234	227
h-index	7	7
i10-index	6	6

Citations per year

Year	Citations
2017	1
2018	1
2019	4
2020	8
2021	6
2022	25
2023	24
2024	134
2025	29

Public access

2 articles	0 articles
available	not available

Based on funding mandates

Co-authors

Quanzeng You - TikTok - Verified email at microsoft.com
Hongxia Yang - Professor, HK Polytechnic University - Verified email at polyu.edu.hk
Bohan Zhai - GenAI at Snowflake, UC Berkeley - Verified email at berkeley.edu
Jianbo Yuan - Principal Scientist, Amazon AGI - Verified email at amazon.com
Yongfei Liu - Bytedance - Verified email at bytedance.com
Houdong Hu - Microsoft, Principal Engineering Manager - Verified email at microsoft.com
Haogeng Liu - Master's student, University of Chinese Academy of Sciences - Verified email at ia.ac.cn
Yunzhe Tao - Research Scientist, ByteDance - Verified email at bytedance.com
Jianghao Xiong 熊江浩 - Beijing Institute of Technology - Verified email at bit.edu.cn
Mingshu Zhao - Verified email at umd.edu
Lei Zhang - International Digital Economy Academy (IDEA) - Verified email at idea.edu.cn
Jianwei Yang - Principal Researcher, Microsoft Research, Redmond - Verified email at microsoft.com
Pengchuan Zhang - Meta AI - Verified email at fb.com
Jianfeng Gao - Microsoft Research, Redmond - Verified email at microsoft.com
Jiang Wang - Google - Verified email at google.com
Zicheng Liu - Microsoft - Verified email at microsoft.com
Peng Chu - Microsoft - Verified email at microsoft.com
Chunyu Wang - Microsoft Research - Verified email at microsoft.com
Zhizheng Zhang (张直政) - VP of Large Models at Galbot << Microsoft Research - Verified email at microsoft.com
Yiren Jian - TikTok, ByteDance, Dartmouth College - Verified email at dartmouth.edu

Xiaotian Han

TikTok

Verified email at bytedance.com

Machine learning, Computer Vision, Multimodal, GenAI, LLM


Title	Authors	Venue	Cited by	Year
Exploring the reasoning abilities of multimodal large language models (mllms): A comprehensive survey on emerging trends in multimodal reasoning	Y Wang, W Chen, X Han, X Lin, H Zhao, Y Liu, B Zhai, J Yuan, Q You, ...	arXiv preprint arXiv:2401.06805, 2024	72*	2024
Real-time micro-scale temperature imaging at low cost based on fluorescent intensity ratio	J Xiong, M Zhao, X Han, Z Cao, X Wei, Y Chen, C Duan, M Yin	Scientific Reports 7 (1), 41311, 2017	37	2017
Mmptrack: Large-scale densely annotated multi-camera multiple people tracking benchmark	X Han, Q You, C Wang, Z Zhang, P Chu, H Hu, J Wang, Z Liu	Proceedings of the IEEE/CVF Winter Conference on Applications of Computer …, 2023	36*	2023
Image scene graph generation (sgg) benchmark	X Han, J Yang, H Hu, L Zhang, J Gao, P Zhang	arXiv preprint arXiv:2107.12604, 2021	34	2021
InfiMM-Eval: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models	X Han, Q You, Y Liu, W Chen, H Zheng, K Mrini, X Lin, Y Wang, B Zhai, ...	arXiv e-prints, arXiv:2311.11567, 2023	14*	2023
Vitar: Vision transformer with any resolution	Q Fan, Q You, X Han, Y Liu, Y Tao, H Huang, R He, H Yang	arXiv preprint arXiv:2403.18361, 2024	11	2024
Infimm-webmath-40b: Advancing multimodal pre-training for enhanced mathematical reasoning	X Han, Y Jian, X Hu, H Liu, Y Wang, Q Fan, Y Ai, H Huang, R He, Z Yang, ...	arXiv preprint arXiv:2409.12568, 2024	7	2024
InfiMM: Advancing Multimodal Understanding with an Open-Sourced Visual Language Model	H Liu, Q You, Y Wang, X Han, B Zhai, Y Liu, W Chen, Y Jian, Y Tao, ...	Findings of the Association for Computational Linguistics ACL 2024, 485-492, 2024	7*	2024
Infimm-hd: A leap forward in high-resolution multimodal understanding	H Liu, Q You, X Han, Y Wang, B Zhai, Y Liu, Y Tao, H Huang, R He, ...	arXiv preprint arXiv:2403.01487, 2024	7*	2024
InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection	Y Liu, P Li, Z Wei, C Xie, X Hu, X Xu, S Zhang, X Han, H Yang, F Wu	arXiv preprint arXiv:2501.04575, 2025	3	2025
Dreamclear: High-capacity real-world image restoration with privacy-safe dataset curation	Y Ai, X Zhou, H Huang, X Han, Z Chen, Q You, H Yang	NeurIPS 5 (6), 7, 2024	3	2024
Visual Anchors Are Strong Information Aggregators For Multimodal Large Language Model	H Liu, Q You, X Han, Y Liu, H Huang, R He, H Yang	Advances in Neural Information Processing Systems 37, 17696-17718, 2025	2	2025
COCO is “ALL” You Need for Visual Instruction Fine-tuning	X Han, Y Wang, B Zhai, Q You, H Yang	2024 IEEE International Conference on Multimedia and Expo (ICME), 1-5, 2024	1	2024
InfiR: Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning	C Xie, S Cai, W Wang, P Li, Z Sang, K Yang, Y Zhang, Z Li, G Zhu, Z Liu, ...	arXiv preprint arXiv:2502.11573, 2025		2025
BabelBench: An Omni Benchmark for Code-Driven Analysis of Multimodal and Multistructured Data	X Wang, Q Cui, Y Tao, Y Wang, Z Chai, X Han, B Liu, J Yuan, J Su, ...	arXiv preprint arXiv:2410.00773, 2024		2024
