Difei Gao
National U. of Singapore; Institute of Computing Technology, Chinese Academy of Sciences
Verified email at nus.edu.sg
Title
Cited by
Year
Egocentric video-language pretraining
KQ Lin, AJ Wang, M Soldan, M Wray, R Yan, EZ Xu, D Gao, R Tu, W Zhao
Neural Information Processing Systems (NeurIPS) 2 (3), 2022
Cited by 185 · 2022
Show-1: Marrying pixel and latent diffusion models for text-to-video generation
DJ Zhang, JZ Wu, JW Liu, R Zhao, L Ran, Y Gu, D Gao, MZ Shou
International Journal of Computer Vision, 1-15, 2024
Cited by 168 · 2024
Multi-modal graph neural network for joint reasoning on vision and scene text
D Gao, K Li, R Wang, S Shan, X Chen
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 12746 …, 2020
Cited by 141 · 2020
UniVTG: Towards Unified Video-Language Temporal Grounding
KQ Lin, P Zhang, J Chen, S Pramanick, D Gao, AJ Wang, R Yan, MZ Shou
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
Cited by 125 · 2023
MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering
D Gao, L Zhou, L Ji, L Zhu, Y Yang, MZ Shou
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 14773 …, 2023
Cited by 99 · 2023
AssistGPT: A general multi-modal assistant that can plan, execute, inspect, and learn
D Gao, L Ji, L Zhou, KQ Lin, J Chen, Z Fan, MZ Shou
arXiv preprint arXiv:2306.08640, 2023
Cited by 77 · 2023
CRIC: A VQA dataset for compositional reasoning on vision and commonsense
D Gao, R Wang, S Shan, X Chen
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Cited by 36* · 2022
Symbolic replay: Scene graph as prompt for continual learning on VQA task
SW Lei, D Gao, JZ Wu, Y Wang, W Liu, M Zhang, MZ Shou
The AAAI Conference on Artificial Intelligence (AAAI), 2023
Cited by 33 · 2023
Env-QA: A Video Question Answering Benchmark for Comprehensive Understanding of Dynamic Environments
D Gao, R Wang, Z Bai, X Chen
IEEE/CVF International Conference on Computer Vision (ICCV), 1675-1685, 2021
Cited by 33 · 2021
CVPR 2023 text guided video editing competition
JZ Wu, X Li, D Gao, Z Dong, J Bai, A Singh, X Xiang, Y Li, Z Huang, Y Sun, ...
arXiv preprint arXiv:2310.16003, 2023
Cited by 31 · 2023
Cone: An efficient coarse-to-fine alignment framework for long video temporal grounding
Z Hou, W Zhong, L Ji, D Gao, K Yan, WK Chan, CW Ngo, Z Shou, N Duan
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Cited by 29 · 2022
AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant
B Wong, J Chen, Y Wu, SW Lei, D Mao, D Gao, MZ Shou
European Conference on Computer Vision (ECCV), 2022
Cited by 29 · 2022
AssistGUI: Task-oriented PC graphical user interface automation
D Gao, L Ji, Z Bai, M Ouyang, P Li, D Mao, Q Wu, W Zhang, P Wang, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by 28* · 2024
Affordance grounding from demonstration video to target image
J Chen, D Gao, KQ Lin, MZ Shou
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 6799-6808, 2023
Cited by 28 · 2023
VideoLLM-online: Online video large language model for streaming video
J Chen, Z Lv, S Wu, KQ Lin, C Song, D Gao, JW Liu, Z Gao, D Mao, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by 26 · 2024
Weijie Kong, et al
KQ Lin, AJ Wang, M Soldan, M Wray, R Yan, EZ Xu, D Gao, R Tu, W Zhao
Egocentric video-language pretraining. NeurIPS 6, 26-50, 2022
Cited by 24 · 2022
GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval
Y Wang, D Gao, L Yu, W Lei, M Feiszli, MZ Shou
European Conference on Computer Vision (ECCV), 2022
Cited by 22 · 2022
KMIR: a benchmark for evaluating knowledge memorization, identification and reasoning abilities of language models
D Gao, Y Jia, L Li, C Fu, Z Dou, H Jiang, X Zhang, L Chen, Z Cao
arXiv preprint arXiv:2202.13529, 2022
Cited by 14 · 2022
ViT-Lens: Towards omni-modal representations
W Lei, Y Ge, K Yi, J Zhang, D Gao, D Sun, Y Ge, Y Shan, MZ Shou
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by 13 · 2024
GroundNLQ@ Ego4D Natural Language Queries Challenge 2023
Z Hou, L Ji, D Gao, W Zhong, K Yan, C Li, WK Chan, CW Ngo, N Duan, ...
arXiv preprint arXiv:2306.15255, 2023
Cited by 13 · 2023