Peng Jin
PhD student, Peking University
Verified email at stu.pku.edu.cn
Title
Cited by
Year
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
B Lin, Y Ye, B Zhu, J Cui, M Ning, P Jin, L Yuan
EMNLP 2024, 2024
Cited by: 438 | Year: 2024
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
P Jin, R Takanobu, W Zhang, X Cao, L Yuan
CVPR 2024 Highlight, 13700-13710, 2024
Cited by: 173 | Year: 2024
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
B Lin, Z Tang, Y Ye, J Cui, B Zhu, P Jin, J Zhang, M Ning, L Yuan
arXiv preprint arXiv:2401.15947, 2024
Cited by: 170 | Year: 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
D Liu, R Zhang, L Qiu, S Huang, W Lin, S Zhao, S Geng, Z Lin, P Jin, ...
ICML 2024, 2024
Cited by: 96* | Year: 2024
Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations
P Jin, J Huang, F Liu, X Wu, S Ge, G Song, D Clifton, J Chen
NeurIPS 2022 Spotlight 35, 30291-30306, 2022
Cited by: 69 | Year: 2022
Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
P Jin, J Huang, P Xiong, S Tian, C Liu, X Ji, L Yuan, J Chen
CVPR 2023 Highlight, 2472-2482, 2023
Cited by: 66 | Year: 2023
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model
P Jin, H Li, Z Cheng, K Li, X Ji, C Liu, L Yuan, J Chen
ICCV 2023, 2470-2481, 2023
Cited by: 60 | Year: 2023
Weakly-Supervised 3D Spatial Reasoning for Text-based Visual Question Answering
H Li, J Huang, P Jin, G Song, Q Wu, J Chen
IEEE Transactions on Image Processing, 2023
Cited by: 40* | Year: 2023
Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment
P Jin, H Li, Z Cheng, J Huang, Z Wang, L Yuan, C Liu, J Chen
IJCAI 2023, 938-946, 2023
Cited by: 35 | Year: 2023
Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs
P Jin, Y Wu, Y Fan, Z Sun, W Yang, L Yuan
NeurIPS 2023, 2023
Cited by: 26 | Year: 2023
LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference
Z Wan, Z Wu, C Liu, J Huang, Z Zhu, P Jin, L Wang, L Yuan
EMNLP 2024 Findings, 2024
Cited by: 22 | Year: 2024
Parallel Vertex Diffusion for Unified Visual Grounding
Z Cheng, K Li, P Jin, X Ji, L Yuan, C Liu, J Chen
AAAI 2024, 1326-1334, 2024
Cited by: 22 | Year: 2024
Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting
J Zhang, Z Tang, Y Pang, X Cheng, P Jin, Y Wei, W Yu, M Ning, L Yuan
ECCV 2024, 2024
Cited by: 20 | Year: 2024
TG-VQA: Ternary Game of Video Question Answering
H Li, P Jin, Z Cheng, S Zhang, K Chen, Z Wang, C Liu, J Chen
IJCAI 2023, 1044-1052, 2023
Cited by: 16 | Year: 2023
LLaVA-o1: Let Vision Language Models Reason Step-by-Step
G Xu, P Jin, L Hao, Y Song, L Sun, L Yuan
arXiv preprint arXiv:2411.10440, 2024
Cited by: 14 | Year: 2024
RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter
M Cao, H Tang, J Huang, P Jin, C Zhang, R Liu, L Chen, X Liang, L Yuan, ...
ACL 2024 Findings, 2024
Cited by: 10 | Year: 2024
FreestyleRet: Retrieving Images from Style-Diversified Queries
H Li, C Jia, P Jin, Z Cheng, K Li, J Sui, C Liu, L Yuan
ECCV 2024, 2024
Cited by: 9 | Year: 2024
Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation
K Li, Y Zhao, Z Wang, Z Cheng, P Jin, X Ji, L Yuan, C Liu, J Chen
ICCV 2023, 666-676, 2023
Cited by: 8 | Year: 2023
MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval
H Tang, M Cao, J Huang, R Liu, P Jin, G Li, X Liang
AAAI 2025, 2025
Cited by: 6 | Year: 2025
LLMBind: A Unified Modality-Task Integration Framework
B Zhu, P Jin, M Ning, B Lin, J Huang, Q Song, M Pan, L Yuan
arXiv preprint arXiv:2402.14891, 2024
Cited by: 6 | Year: 2024