Le Xue

Cited by

	All	Since 2020
Citations	619	610
h-index	8	8
i10-index	8	7

Citations per year (chart): 2017: 2, 2018: 5, 2019: 2, 2020: 3, 2021: 3, 2022: 6, 2023: 91, 2024: 461, 2025: 43

Public access

0 articles

1 article

available

not available

Based on funding mandates

Co-authors

Caiming Xiong; Salesforce Research; verified email at salesforce.com
Ran Xu; Salesforce Research; verified email at salesforce.com
Silvio Savarese; Associate Professor of Computer Science at Stanford University; verified email at stanford.edu
Juan Carlos Niebles; Research Director (Salesforce) & Adjunct Professor (Stanford University); verified email at cs.stanford.edu
Zeyuan Chen; Salesforce; verified email at salesforce.com
Roberto Martín-Martín; The University of Texas at Austin; verified email at cs.utexas.edu
Mingfei Gao; Apple Inc.; verified email at apple.com
Chen Xing (星辰); Scale AI; verified email at scale.com
Jiajun Wu; Stanford University; verified email at cs.stanford.edu
Weiran Yao; Research Scientist, Salesforce AI Research; verified email at cmu.edu

Le Xue

Senior Applied Scientist, Salesforce Research

Verified email at salesforce.com

Multimodal Foundation Models


Title	Authors	Venue	Cited by	Year
ULIP: Learning a unified representation of language, images, and point clouds for 3D understanding	L Xue, M Gao, C Xing, R Martín-Martín, J Wu, C Xiong, R Xu, JC Niebles, ...	Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2023	235	2023
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding	L Xue, N Yu, S Zhang, J Li, R Martín-Martín, J Wu, C Xiong, R Xu, ...	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023	99	2023
BOLAA: Benchmarking and orchestrating LLM-augmented autonomous agents	Z Liu, W Yao, J Zhang, L Xue, S Heinecke, R Murthy, Y Feng, Z Chen, ...	arXiv preprint arXiv:2308.05960, 2023	74	2023
Retroformer: Retrospective large language agents with policy gradient optimization	W Yao, S Heinecke, JC Niebles, Z Liu, Y Feng, L Xue, R Murthy, Z Chen, ...	arXiv preprint arXiv:2308.02151, 2023	58	2023
X-InstructBLIP: A framework for aligning X-modal instruction-aware representations to LLMs and emergent cross-modal reasoning	A Panagopoulou, L Xue, N Yu, J Li, D Li, S Joty, R Xu, S Savarese, ...	arXiv preprint arXiv:2311.18799, 2023	41	2023
xGen-MM (BLIP-3): A family of open large multimodal models	L Xue, M Shu, A Awadalla, J Wang, A Yan, S Purushwalkam, H Zhou, ...	arXiv preprint arXiv:2408.08872, 2024	40	2024
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens	A Awadalla, L Xue, O Lo, M Shu, H Lee, EK Guha, M Jordan, S Shen, ...	arXiv preprint arXiv:2406.11271, 2024	20	2024
Directed weighted network structure analysis of complex impedance measurements for characterizing oil-in-water bubbly flow	ZK Gao, WD Dang, L Xue, SS Zhang	Chaos: An Interdisciplinary Journal of Nonlinear Science 27 (3), 2017	15	2017
REX: Rapid exploration and exploitation for AI agents	R Murthy, S Heinecke, JC Niebles, Z Liu, L Xue, W Yao, Y Feng, Z Chen, ...	arXiv preprint arXiv:2307.08962, 2023	8	2023
Robustness evaluation of transformer-based form field extractors via form attacks	L Xue, M Gao, Z Chen, C Xiong, R Xu	International Conference on Document Analysis and Recognition, 167-184, 2023	6	2023
xGen-MM-Vid (BLIP-3-Video): You only need 32 tokens to represent a video even in VLMs	MS Ryoo, H Zhou, S Kendre, C Qin, L Xue, M Shu, S Savarese, R Xu, ...	arXiv preprint arXiv:2410.16267, 2024	5	2024
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition	L Xue, M Gao, C Xing, R Martín-Martín, J Wu, C Xiong, R Xu, JC Niebles, ...		5	2023
DocQueryNet: Value retrieval with arbitrary queries for form-like documents	M Gao, L Xue, C Ramaiah, C Xing, R Xu, C Xiong	Proceedings of the 29th International Conference on Computational …, 2022	5*	2022
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations	C Qin, C Xia, K Ramakrishnan, M Ryoo, L Tu, Y Feng, M Shu, H Zhou, ...	arXiv preprint arXiv:2408.12590, 2024	2	2024
Image analysis based document processing for inference of key-value pairs in non-fixed digital documents	M Gao, C Zeyuan, L Xue, R Xu, C Xiong	US Patent 11,699,297, 2023	2	2023
Model-Agnostic Hierarchical Attention for 3D Object Detection	M Shu, L Xue, N Yu, R Martín-Martín, JC Niebles, C Xiong, R Xu	arXiv preprint arXiv:2301.02650, 2023	2	2023
ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models	J Zhang, L Xue, L Song, J Wang, W Huang, M Shu, A Yan, Z Ma, ...	arXiv preprint arXiv:2412.07012, 2024	1	2024
X-InstructBLIP: A Framework for Aligning Image, 3D, Audio, Video to LLMs and its Emergent Cross-Modal Reasoning	A Panagopoulou, L Xue, N Yu, J Li, D Li, S Joty, R Xu, S Savarese, ...	European Conference on Computer Vision, 177-197, 2024	1	2024
BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions	A Awadalla, L Xue, M Shu, A Yan, J Wang, S Purushwalkam, S Shen, ...	arXiv preprint arXiv:2411.07461, 2024		2024
Systems and methods for multi-modal language models	A Panagopoulou, L Xue, N Yu, LI Junnan, D Li, S Savarese, SR Joty, ...	US Patent App. 18/400,477, 2024		2024
