Difei Gao

Cited by

	All	Since 2020
Citations	1252	1246
h-index	17	17
i10-index	22	22

Citations per year: 2021: 29; 2022: 66; 2023: 267; 2024: 777; 2025: 102

Public access

15 articles available
2 articles not available

Based on funding mandates

Co-authors

Mike Z. SHOU, National U. of Singapore; Facebook AI; Columbia University (verified email at columbia.edu)
Kevin Qinghong Lin, National University of Singapore (verified email at u.nus.edu)
Joya Chen, National University of Singapore (verified email at u.nus.edu)
Ruiping Wang, Professor, Institute of Computing Technology, Chinese Academy of Sciences (verified email at ict.ac.cn)
Xilin Chen, Institute of Computing Technology, Chinese Academy of Sciences (verified email at ict.ac.cn)
Shiguang Shan, Professor, Institute of Computing Technology, Chinese Academy of Sciences (verified email at ict.ac.cn)
Luowei Zhou, Research Scientist, Google DeepMind (verified email at google.com)
Mengmi Zhang, Assistant Professor and PI of Deep NeuroCognition Lab, NTU and A*STAR (verified email at ntu.edu.sg)
Kenneth Li, Harvard University (verified email at g.harvard.edu)
Lili Pan, Associate Professor, University of Electronic Science and Technology of China (verified email at uestc.edu.cn)
Rui Chen, University of Cambridge (verified email at cam.ac.uk)


Difei Gao

National U. of Singapore; Institute of Computing Technology, Chinese Academy of Sciences

Verified email at nus.edu.sg

Artificial Intelligence, AI Agent, Vision and Language


Title	Cited by	Year
Egocentric video-language pretraining. KQ Lin, AJ Wang, M Soldan, M Wray, R Yan, EZ Xu, D Gao, R Tu, W Zhao. Neural Information Processing Systems (NeurIPS) 2 (3), 2022	185	2022
Show-1: Marrying pixel and latent diffusion models for text-to-video generation. DJ Zhang, JZ Wu, JW Liu, R Zhao, L Ran, Y Gu, D Gao, MZ Shou. International Journal of Computer Vision, 1-15, 2024	168	2024
Multi-modal graph neural network for joint reasoning on vision and scene text. D Gao, K Li, R Wang, S Shan, X Chen. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 12746 …, 2020	141	2020
UniVTG: Towards Unified Video-Language Temporal Grounding. KQ Lin, P Zhang, J Chen, S Pramanick, D Gao, AJ Wang, R Yan, MZ Shou. IEEE/CVF International Conference on Computer Vision (ICCV), 2023	125	2023
MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering. D Gao, L Zhou, L Ji, L Zhu, Y Yang, MZ Shou. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 14773 …, 2023	99	2023
AssistGPT: A general multi-modal assistant that can plan, execute, inspect, and learn. D Gao, L Ji, L Zhou, KQ Lin, J Chen, Z Fan, MZ Shou. arXiv preprint arXiv:2306.08640, 2023	77	2023
CRIC: A VQA dataset for compositional reasoning on vision and commonsense. D Gao, R Wang, S Shan, X Chen. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022	36*	2022
Symbolic replay: Scene graph as prompt for continual learning on VQA task. SW Lei, D Gao, JZ Wu, Y Wang, W Liu, M Zhang, MZ Shou. The AAAI Conference on Artificial Intelligence (AAAI), 2023	33	2023
Env-QA: A Video Question Answering Benchmark for Comprehensive Understanding of Dynamic Environments. D Gao, R Wang, Z Bai, X Chen. IEEE/CVF International Conference on Computer Vision (ICCV), 1675-1685, 2021	33	2021
CVPR 2023 text guided video editing competition. JZ Wu, X Li, D Gao, Z Dong, J Bai, A Singh, X Xiang, Y Li, Z Huang, Y Sun, ... arXiv preprint arXiv:2310.16003, 2023	31	2023
CONE: An efficient coarse-to-fine alignment framework for long video temporal grounding. Z Hou, W Zhong, L Ji, D Gao, K Yan, WK Chan, CW Ngo, Z Shou, N Duan. Annual Meeting of the Association for Computational Linguistics (ACL), 2022	29	2022
AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant. B Wong, J Chen, Y Wu, SW Lei, D Mao, D Gao, MZ Shou. European Conference on Computer Vision (ECCV), 2022	29	2022
AssistGUI: Task-oriented PC graphical user interface automation. D Gao, L Ji, Z Bai, M Ouyang, P Li, D Mao, Q Wu, W Zhang, P Wang, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024	28*	2024
Affordance grounding from demonstration video to target image. J Chen, D Gao, KQ Lin, MZ Shou. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 6799-6808, 2023	28	2023
VideoLLM-online: Online video large language model for streaming video. J Chen, Z Lv, S Wu, KQ Lin, C Song, D Gao, JW Liu, Z Gao, D Mao, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024	26	2024
Weijie Kong, et al. KQ Lin, AJ Wang, M Soldan, M Wray, R Yan, EZ Xu, D Gao, R Tu, W Zhao. Egocentric video-language pretraining. NeurIPS 6, 26-50, 2022	24	2022
GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval. Y Wang, D Gao, L Yu, W Lei, M Feiszli, MZ Shou. European Conference on Computer Vision (ECCV), 2022	22	2022
KMIR: A benchmark for evaluating knowledge memorization, identification and reasoning abilities of language models. D Gao, Y Jia, L Li, C Fu, Z Dou, H Jiang, X Zhang, L Chen, Z Cao. arXiv preprint arXiv:2202.13529, 2022	14	2022
ViT-Lens: Towards omni-modal representations. W Lei, Y Ge, K Yi, J Zhang, D Gao, D Sun, Y Ge, Y Shan, MZ Shou. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024	13	2024
GroundNLQ @ Ego4D Natural Language Queries Challenge 2023. Z Hou, L Ji, D Gao, W Zhong, K Yan, C Li, WK Chan, CW Ngo, N Duan, ... arXiv preprint arXiv:2306.15255, 2023	13	2023
