VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs | Z Cheng, S Leng, H Zhang, Y Xin, X Li, G Chen, Y Zhu, W Zhang, Z Luo, ... | arXiv preprint arXiv:2406.07476, 2024 | 142 | 2024 |
EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding | Y Wu, X Cheng, R Zhang, Z Cheng, J Zhang | CVPR 2023, 19231-19242, 2023 | 85 | 2023 |
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model | P Jin, H Li, Z Cheng, K Li, X Ji, C Liu, L Yuan, J Chen | ICCV 2023, 2023 | 59 | 2023 |
ACSeg: Adaptive Conceptualization for Unsupervised Semantic Segmentation | K Li, Z Wang, Z Cheng, R Yu, Y Zhao, G Song, C Liu, L Yuan, J Chen | CVPR 2023, 7162-7172, 2023 | 51* | 2023 |
Out-of-Candidate Rectification for Weakly-supervised Semantic Segmentation | Z Cheng, P Qiao, K Li, S Li, P Wei, X Ji, L Yuan, C Liu, J Chen | CVPR 2023, 23673-23684, 2023 | 48 | 2023 |
Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment | P Jin, H Li, Z Cheng, J Huang, Z Wang, L Yuan, C Liu, J Chen | IJCAI 2023, 2023 | 37 | 2023 |
Parallel Vertex Diffusion for Unified Visual Grounding | Z Cheng, K Li, P Jin, X Ji, L Yuan, C Liu, J Chen | AAAI 2024, 2023 | 22 | 2023 |
TG-VQA: Ternary Game of Video Question Answering | H Li, P Jin, Z Cheng, S Zhang, K Chen, Z Wang, C Liu, J Chen | IJCAI 2023, 2023 | 16 | 2023 |
AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models | S Li, C Yang, Y Yin, X Zhu, Z Cheng, L Shang, X Jiang, Q Liu, Y Yang | ACL 2023 short, 2023 | 13 | 2023 |
Integrating multiple MRI sequences for pelvic organs segmentation via the attention mechanism | S Huang, Z Cheng, L Lai, W Zheng, M He, J Li, T Zeng, X Huang, X Yang | Medical Physics 48 (12), 7930-7945, 2021 | 12 | 2021 |
FreestyleRet: Retrieving Images from Style-Diversified Queries | H Li, C Jia, P Jin, Z Cheng, K Li, J Sui, C Liu, L Yuan | ECCV 2024, 2024 | 9 | 2024 |
Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation | K Li, Y Zhao, Z Wang, Z Cheng, P Jin, X Ji, L Yuan, C Liu, J Chen | ICCV 2023, 2023 | 8 | 2023 |
GraCo: Granularity-Controllable Interactive Segmentation | Y Zhao, K Li, Z Cheng, P Qiao, X Zheng, R Ji, C Liu, L Yuan, J Chen | CVPR 2024 Highlight, 2024 | 6 | 2024 |
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio | S Leng, Y Xing, Z Cheng, Y Zhou, H Zhang, X Li, D Zhao, S Lu, C Miao, ... | arXiv preprint arXiv:2410.12787, 2024 | 4 | 2024 |
A Survey on the Honesty of Large Language Models | S Li, C Yang, T Wu, C Shi, Y Zhang, X Zhu, Z Cheng, D Cai, M Yu, L Liu, ... | arXiv preprint arXiv:2409.18786, 2024 | 4 | 2024 |
NewsDialogues: Towards Proactive News Grounded Conversation | S Li, Y Yin, C Yang, W Jiang, Y Li, Z Cheng, L Shang, X Jiang, Q Liu, ... | ACL 2023 findings, 2023 | 4 | 2023 |
WiCo: Win-win Cooperation of Bottom-up and Top-down Referring Image Segmentation | Z Cheng, P Jin, H Li, K Li, S Li, X Ji, C Liu, J Chen | IJCAI 2023, 2023 | 4 | 2023 |
Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation | P Jin, H Li, Z Cheng, K Li, R Yu, C Liu, X Ji, L Yuan, J Chen | ECCV 2024, 2024 | 2 | 2024 |
Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation | Z Cheng, K Li, H Li, P Jin, C Liu, X Zheng, R Ji, J Chen | AAAI 2025, 2024 | 2 | 2024 |
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM | Y Yuan, H Zhang, W Li, Z Cheng, B Zhang, L Li, X Li, D Zhao, W Zhang, ... | arXiv preprint arXiv:2501.00599, 2024 | 1 | 2024 |