Jianjian Sun
Researcher at StepFun
Verified email at megvii.com
Title
Cited by
Year
BEVDepth: Acquisition of reliable depth for multi-view 3D object detection
Y Li, Z Ge, G Yu, J Yang, Z Wang, Y Shi, J Sun, Z Li
Proceedings of the AAAI Conference on Artificial Intelligence 37 (2), 1477-1485, 2023
Cited by: 625, Year: 2023
BEVStereo: Enhancing depth estimation in multi-view 3D object detection with temporal stereo
Y Li, H Bao, Z Ge, J Yang, J Sun, Z Li
Proceedings of the AAAI Conference on Artificial Intelligence 37 (2), 1486-1494, 2023
Cited by: 223, Year: 2023
DreamLLM: Synergistic multimodal comprehension and creation
R Dong, C Han, Y Peng, Z Qi, Z Ge, J Yang, L Zhao, J Sun, H Zhou, H Wei, ...
arXiv preprint arXiv:2309.11499, 2023
Cited by: 150, Year: 2023
Cross modal transformer: Towards fast and robust 3D object detection
J Yan, Y Liu, J Sun, F Jia, S Li, T Wang, X Zhang
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 132*, Year: 2023
Autoencoders as cross-modal teachers: Can pretrained 2D image transformers help 3D representation learning?
R Dong, Z Qi, L Zhang, J Zhang, J Sun, Z Ge, L Yi, K Ma
arXiv preprint arXiv:2212.08320, 2022
Cited by: 97, Year: 2022
Reversible column networks
Y Cai, Y Zhou, Q Han, J Sun, X Kong, J Li, X Zhang
arXiv preprint arXiv:2212.11696, 2022
Cited by: 84, Year: 2022
Vary: Scaling up the vision vocabulary for large vision-language model
H Wei, L Kong, J Chen, L Zhao, Z Ge, J Yang, J Sun, C Han, X Zhang
European Conference on Computer Vision, 408-424, 2024
Cited by: 75, Year: 2024
Exploring recurrent long-term temporal fusion for multi-view 3D perception
C Han, J Yang, J Sun, Z Ge, R Dong, H Zhou, W Mao, Y Peng, X Zhang
IEEE Robotics and Automation Letters, 2024
Cited by: 58, Year: 2024
ChatSpot: Bootstrapping multimodal LLMs via precise referring instruction tuning
L Zhao, E Yu, Z Ge, J Yang, H Wei, H Zhou, J Sun, Y Peng, R Dong, ...
arXiv preprint arXiv:2307.09474, 2023
Cited by: 50, Year: 2023
Small language model meets with reinforced vision vocabulary
H Wei, L Kong, J Chen, L Zhao, Z Ge, E Yu, J Sun, C Han, X Zhang
arXiv preprint arXiv:2401.12503, 2024
Cited by: 31, Year: 2024
General OCR theory: Towards OCR-2.0 via a unified end-to-end model
H Wei, C Liu, J Chen, J Wang, L Kong, Y Xu, Z Ge, L Zhao, J Sun, Y Peng, ...
Cited by: 20, Year: 2024
Focus Anywhere for Fine-grained Multi-page Document Understanding
C Liu, H Wei, J Chen, L Kong, Z Ge, Z Zhu, L Zhao, J Sun, C Han, ...
arXiv preprint arXiv:2405.14295, 2024
Cited by: 15, Year: 2024
OneChart: Purify the chart structural extraction via one auxiliary token
J Chen, L Kong, H Wei, C Liu, Z Ge, L Zhao, J Sun, C Han, X Zhang
Proceedings of the 32nd ACM International Conference on Multimedia, 147-155, 2024
Cited by: 13, Year: 2024
The 1st-place solution for CVPR 2023 OpenLane topology in autonomous driving challenge
D Wu, F Jia, J Chang, Z Li, J Sun, C Han, S Li, Y Liu, Z Ge, T Wang
arXiv preprint arXiv:2306.09590, 2023
Cited by: 12, Year: 2023
DistTrain: Addressing model and data heterogeneity with disaggregated training for multimodal large language models
Z Zhang, Y Zhong, R Ming, H Hu, J Sun, Z Ge, Y Zhu, X Jin
arXiv preprint arXiv:2408.04275, 2024
Cited by: 4, Year: 2024
BEVStereo++: Accurate depth estimation in multi-view 3D object detection via dynamic temporal stereo
Y Li, J Yang, J Sun, H Bao, Z Ge, L Xiao
arXiv preprint arXiv:2304.04185, 2023
Cited by: 4, Year: 2023
Slow Perception: Let's Perceive Geometric Figures Step-by-step
H Wei, Y Yin, Y Li, J Wang, L Zhao, J Sun, Z Ge, X Zhang
arXiv preprint arXiv:2412.20631, 2024
Year: 2024
First Place Solution to the 3D Object Detection of the SSLAD2022 Challenge
T Huang, Z Yao, L Liu, B Wang, T Jiang, J Sun, X Wang, Z Li, H Yao