Jinguo Zhu
Title
Cited by
Year
Uni-Perceiver: Pre-training unified architecture for generic perception for zero-shot and few-shot tasks
X Zhu, J Zhu, H Li, X Wu, H Li, X Wang, J Dai
CVPR 2022, 16804-16815, 2022
136 2022
Complementary relation contrastive distillation
J Zhu, S Tang, D Chen, S Yu, Y Liu, M Rong, A Yang, X Wang
CVPR 2021, 9260-9269, 2021
105 2021
Layerwise optimization by gradient decomposition for continual learning
S Tang, D Chen, J Zhu, S Yu, W Ouyang
CVPR 2021, 9634-9643, 2021
76 2021
Uni-Perceiver-MoE: Learning sparse generalist models with conditional MoEs
J Zhu, X Zhu, W Wang, X Wang, H Li, X Wang, J Dai
NeurIPS 2022, 2022
64 2022
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks
H Li, J Zhu, X Jiang, X Zhu, H Li, C Yuan, X Wang, Y Qiao, X Wang, ...
CVPR 2023, 2022
56 2022
SEED-X: Multimodal models with unified multi-granularity comprehension and generation
Y Ge, S Zhao, J Zhu, Y Ge, K Yi, L Song, C Li, X Ding, Y Shan
arXiv preprint arXiv:2404.14396, 2024
54 2024
A deep learning method to detect foreign objects for inspecting power transmission lines
J Zhu, Y Guo, F Yue, H Yuan, A Yang, X Wang, M Rong
IEEE Access 8, 94065-94075, 2020
41 2020
VL-GPT: A generative pre-trained transformer for vision and language understanding and generation
J Zhu, X Ding, Y Ge, Y Ge, S Zhao, H Zhao, X Wang, Y Shan
arXiv preprint arXiv:2312.09251, 2023
29 2023
Expanding performance boundaries of open-source multimodal models with model, data, and test-time scaling
Z Chen, W Wang, Y Cao, Y Liu, Z Gao, E Cui, J Zhu, S Ye, H Tian, Z Liu, ...
arXiv preprint arXiv:2412.05271, 2024
23 2024
Multiple domain experts collaborative learning: Multi-source domain generalization for person re-identification
S Yu, F Zhu, D Chen, R Zhao, H Chen, S Tang, J Zhu, Y Qiao
arXiv preprint arXiv:2105.12355, 2021
23 2021
VLATTACK: Multimodal adversarial attacks on vision-language tasks via pre-trained models
Z Yin, M Ye, T Zhang, T Du, J Zhu, H Liu, J Chen, T Wang, F Ma
Advances in Neural Information Processing Systems 36, 2024
21 2024
Enhanced sensing of sulfur hexafluoride decomposition components based on noble-metal-functionalized cerium oxide
A Yang, W Li, J Chu, D Wang, H Yuan, J Zhu, X Wang, M Rong
Materials & Design 187, 108391, 2020
20 2020
Crowded human detection via an anchor-pair network
J Zhu, Z Yuan, C Zhang, W Chi, Y Ling
WACV 2020, 1391-1399, 2020
9 2020
Mini-InternVL: a flexible-transfer pocket multi-modal model with 5% parameters and 90% performance
Z Gao, Z Chen, E Cui, Y Ren, W Wang, J Zhu, H Tian, S Ye, J He, X Zhu, ...
Visual Intelligence 2 (1), 1-17, 2024
8 2024
Enhancing the reasoning ability of multimodal large language models via mixed preference optimization
W Wang, Z Chen, W Wang, Y Cao, Y Liu, Z Gao, J Zhu, X Zhu, L Lu, ...
arXiv preprint arXiv:2411.10442, 2024
6 2024
Welding joints inspection via residual attention network
J Zhu, Z Yuan, T Liu
2019 16th International Conference on Machine Vision Applications (MVA), 1-5, 2019
3 2019
Power-LLaVA: Large language and vision assistant for power transmission line inspection
J Wang, M Li, H Luo, J Zhu, A Yang, M Rong, X Wang
2024 IEEE International Conference on Image Processing (ICIP), 963-969, 2024
2 2024
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding
H Li, C Tian, J Shao, X Zhu, Z Wang, J Zhu, W Dou, X Wang, H Li, L Lu, ...
arXiv preprint arXiv:2412.09604, 2024
1 2024
V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding
J Ge, Z Chen, J Lin, J Zhu, X Liu, J Dai, X Zhu
arXiv preprint arXiv:2412.09616, 2024
2024
Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning
C Yang, X Zhu, J Zhu, W Su, J Wang, X Dong, W Wang, L Lu, B Li, J Zhou, ...
arXiv preprint arXiv:2406.07543, 2024
2024