Liang Zhao
StepFun
Verified email at smail.nju.edu.cn
Title
Cited by
Year
DreamLLM: Synergistic multimodal comprehension and creation
R Dong, C Han, Y Peng, Z Qi, Z Ge, J Yang, L Zhao, J Sun, H Zhou, H Wei, ...
ICLR 2024 (Spotlight), 2023
Cited by: 162; Year: 2023
Task-specific inconsistency alignment for domain adaptive object detection
L Zhao, L Wang
CVPR 2022, 2022
Cited by: 112; Year: 2022
Vary: Scaling up the vision vocabulary for large vision-language model
H Wei, L Kong, J Chen, L Zhao, Z Ge, J Yang, J Sun, C Han, X Zhang
ECCV 2024, 2024
Cited by: 82; Year: 2024
ChatSpot: Bootstrapping multimodal LLMs via precise referring instruction tuning
L Zhao, E Yu, Z Ge, J Yang, H Wei, H Zhou, J Sun, Y Peng, R Dong, ...
IJCAI 2024 (Long Oral), 2023
Cited by: 47; Year: 2023
Small Language Model Meets with Reinforced Vision Vocabulary
H Wei, L Kong, J Chen, L Zhao, Z Ge, E Yu, J Sun, C Han, X Zhang
arXiv preprint arXiv:2401.12503, 2024
Cited by: 33; Year: 2024
Unified density-aware image dehazing and object detection in real-world hazy scenes
Z Zhang, L Zhao, Y Liu, S Zhang, J Yang
ACCV 2020, 2020
Cited by: 31; Year: 2020
Merlin: Empowering multimodal LLMs with foresight minds
E Yu, L Zhao, Y Wei, J Yang, D Wu, L Kong, H Wei, T Wang, Z Ge, ...
ECCV 2024, 2024
Cited by: 24; Year: 2024
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
H Wei, C Liu, J Chen, J Wang, L Kong, Y Xu, Z Ge, L Zhao, J Sun, Y Peng, ...
arXiv preprint arXiv:2409.01704, 2024
Cited by: 22; Year: 2024
Hybrid resolution network using edge guided region mutual information loss for human parsing
Y Liu, L Zhao, S Zhang, J Yang
ACM MM 2020, 2020
Cited by: 18; Year: 2020
OneChart: Purify the chart structural extraction via one auxiliary token
J Chen, L Kong, H Wei, C Liu, Z Ge, L Zhao, J Sun, C Han, X Zhang
ACM MM 2024 (Oral), 2024
Cited by: 17; Year: 2024
Focus Anywhere for Fine-grained Multi-page Document Understanding
C Liu, H Wei, J Chen, L Kong, Z Ge, Z Zhu, L Zhao, J Sun, C Han, ...
arXiv preprint arXiv:2405.14295, 2024
Cited by: 16; Year: 2024
Logit normalization for long-tail object detection
L Zhao, Y Teng, L Wang
IJCV 132 (6), 2114-2134, 2024
Cited by: 9; Year: 2024
Self-supervised visual preference alignment
K Zhu, L Zhao, Z Ge, X Zhang
ACM MM 2024 (Oral), 2024
Cited by: 7; Year: 2024
Slow Perception: Let's Perceive Geometric Figures Step-by-step
H Wei, Y Yin, Y Li, J Wang, L Zhao, J Sun, Z Ge, X Zhang
arXiv preprint arXiv:2412.20631, 2024
Cited by: 1; Year: 2024
Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction
A Huang, B Wu, B Wang, C Yan, C Hu, C Feng, F Tian, F Shen, J Li, ...
arXiv preprint arXiv:2502.11946, 2025
Year: 2025
Unhackable Temporal Rewarding for Scalable Video MLLMs
E Yu, K Lin, L Zhao, Y Wei, Z Zhu, H Wei, J Sun, Z Ge, X Zhang, J Wang, ...
ICLR 2025, 2025
Year: 2025
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model
G Ma, H Huang, K Yan, L Chen, N Duan, S Yin, C Wan, R Ming, X Song, ...
arXiv preprint arXiv:2502.10248, 2025
Year: 2025
PerPO: Perceptual Preference Optimization via Discriminative Rewarding
Z Zhu, L Zhao, K Lin, J Yang, E Yu, C Liu, H Wei, J Sun, Z Ge, X Zhang
arXiv preprint arXiv:2502.04371, 2025
Year: 2025
Articles 1–18