Gengyuan Zhang
Title
Cited by
Year
A systematic survey of prompt engineering on vision-language foundation models
J Gu, Z Han, S Chen, A Beirami, B He, G Zhang, R Liao, Y Qin, V Tresp, ...
arXiv preprint arXiv:2307.12980, 2023
140
2023
Time-dependent entity embedding is not all you need: A re-evaluation of temporal knowledge graph completion models under a unified framework
Z Han*, G Zhang*, Y Ma, V Tresp
Proceedings of the 2021 Conference on Empirical Methods in Natural Language …, 2021
23
2021
Multi-event Video-Text Retrieval
G Zhang, J Ren, J Gu, V Tresp
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
12
2023
CL-CrossVQA: A continual learning benchmark for cross-domain visual question answering
Y Zhang, H Chen, A Frikha, Y Yang, D Krompass, G Zhang, J Gu, V Tresp
arXiv preprint arXiv:2211.10567, 2022
11
2022
Can Vision-Language Models be a Good Guesser? Exploring VLMs for Times and Location Reasoning
G Zhang, Y Zhang, K Zhang, V Tresp
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer …, 2024
9
2024
VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs
R Liao, M Erler, H Wang, G Zhai, G Zhang, Y Ma, V Tresp
arXiv preprint arXiv:2409.20365, 2024
5
2024
A systematic survey of prompt engineering on vision-language foundation models. arXiv
J Gu, Z Han, S Chen, A Beirami, B He, G Zhang, P Torr
arXiv preprint arXiv:2307.12980, 2023
5
2023
Multimodal pragmatic jailbreak on text-to-image models
T Liu, Z Lai, G Zhang, P Torr, V Demberg, V Tresp, J Gu
arXiv preprint arXiv:2409.19149, 2024
3
2024
Localizing Events in Videos with Multimodal Queries
G Zhang, MLA Fok, Y Xia, Y Tang, D Cremers, P Torr, V Tresp, J Gu
arXiv preprint arXiv:2406.10079, 2024
1
2024
SPOT! Revisiting Video-Language Models for Event Understanding
G Zhang, J Bi, J Gu, Y Chen, V Tresp
arXiv preprint arXiv:2311.12919, 2023
1
2023
Perceive, Query & Reason: Enhancing Video QA with Question-Guided Temporal Queries
R Amoroso*, G Zhang*, R Koner, L Baraldi, R Cucchiara, V Tresp
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer …, 2025
2025
FedBiP: Heterogeneous One-Shot Federated Learning with Personalized Latent Diffusion Models
H Chen, H Li, Y Zhang, G Zhang, J Bi, P Torr, J Gu, D Krompass, V Tresp
arXiv preprint arXiv:2410.04810, 2024
2024
Can Vision-Language Models be a Good Guesser? Exploring VLMs for Times and Location Reasoning Supplementary Materials
G Zhang, Y Zhang, K Zhang, V Tresp, AD WikiTiLo
Middle East 11, 16, 0