Lewei Lu
Research Director (We're Hiring, luotto@sensetime.com) @ SenseTime Research
Verified email at sensetime.com
Title
Cited by
Year
Deformable DETR: Deformable Transformers for End-to-End Object Detection
X Zhu, W Su, L Lu, B Li, X Wang, J Dai
The International Conference on Learning Representations (ICLR), 2021
Cited by 6288, 2021
VL-BERT: Pre-Training of Generic Visual-Linguistic Representations
W Su, X Zhu, Y Cao, B Li, L Lu, F Wei, J Dai
The International Conference on Learning Representations (ICLR), 2020
Cited by 1948, 2020
Internimage: Exploring large-scale vision foundation models with deformable convolutions
W Wang, J Dai, Z Chen, Z Huang, Z Li, X Zhu, X Hu, T Lu, L Lu, H Li, ...
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2023
Cited by 851, 2023
Internvl: Scaling up vision foundation models and aligning for generic visual-linguistic tasks
Z Chen, J Wu, W Wang, W Su, G Chen, S Xing, M Zhong, Q Zhang, X Zhu, ...
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2024
Cited by 683*, 2024
Planning-oriented autonomous driving
Y Hu, J Yang, L Chen, K Li, C Sima, X Zhu, S Chai, S Du, T Lin, W Wang, ...
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2023
Cited by 678*, 2023
Visionllm: Large language model is also an open-ended decoder for vision-centric tasks
W Wang, Z Chen, X Chen, J Wu, X Zhu, G Zeng, P Luo, T Lu, J Zhou, ...
Advances in Neural Information Processing Systems 36, 61501-61513, 2023
Cited by 448, 2023
How far are we to gpt-4v? closing the gap to commercial multimodal models with open-source suites
Z Chen, W Wang, H Tian, S Ye, Z Gao, E Cui, W Tong, K Hu, J Luo, Z Ma, ...
Science China Information Sciences 67 (12), 220101, 2024
Cited by 394, 2024
Bevformer v2: Adapting modern image backbones to bird's-eye-view recognition via perspective supervision
C Yang, Y Chen, H Tian, C Tao, X Zhu, Z Zhang, G Huang, H Li, Y Qiao, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by 271, 2023
Ghost in the minecraft: Generally capable agents for open-world environments via large language models with text-based knowledge and memory
X Zhu, Y Chen, H Tian, C Tao, W Su, C Yang, G Huang, B Li, L Lu, ...
arXiv preprint arXiv:2305.17144, 2023
Cited by 195*, 2023
Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, and Jifeng Dai
X Zhu
Deformable DETR: deformable transformers for end-to-end object detection …, 2020
Cited by 168, 2020
FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting
R Liu, H Deng, Y Huang, X Shi, L Lu, W Sun, X Wang, J Dai, H Li
International Conference on Computer Vision (ICCV), 2021
Cited by 166, 2021
Scene as occupancy
W Tong, C Sima, T Wang, L Chen, S Wu, H Deng, Y Gu, L Lu, P Luo, ...
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by 146*, 2023
Delving into the devils of bird’s-eye-view perception: A review, evaluation and recipe
H Li, C Sima, J Dai, W Wang, L Lu, H Wang, J Zeng, Z Li, J Yang, H Deng, ...
IEEE Transactions on Pattern Analysis and Machine Intelligence 46 (4), 2151-2170, 2023
Cited by 144, 2023
Drivemlm: Aligning multi-modal large language models with behavioral planning states for autonomous driving
W Wang, J Xie, CY Hu, H Zou, J Fan, W Tong, Y Wen, S Wu, H Deng, Z Li, ...
arXiv preprint arXiv:2312.09245, 2023
Cited by 100, 2023
Decoupled spatial-temporal transformer for video inpainting
R Liu, H Deng, Y Huang, X Shi, L Lu, W Sun, X Wang, J Dai, H Li
arXiv preprint arXiv:2104.06637, 2021
Cited by 72, 2021
Efficient deformable convnets: Rethinking dynamic and sparse operator for vision applications
Y Xiong, Z Li, Y Chen, F Wang, X Zhu, J Luo, W Wang, T Lu, H Li, Y Qiao, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by 68, 2024
Towards all-in-one pre-training via maximizing multi-modal mutual information
W Su, X Zhu, C Tao, L Lu, B Li, G Huang, Y Qiao, X Wang, J Zhou, J Dai
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by 48, 2023
Weijie Su, Chenyu Yang, Gao Huang, Bin Li, Lewei Lu, Xiaogang Wang, Yu Qiao, Zhaoxiang Zhang, and Jifeng Dai
X Zhu, Y Chen, H Tian, C Tao
Ghost in the minecraft: Generally capable agents for open-world environments …, 2023
Cited by 43, 2023
Expanding performance boundaries of open-source multimodal models with model, data, and test-time scaling
Z Chen, W Wang, Y Cao, Y Liu, Z Gao, E Cui, J Zhu, S Ye, H Tian, Z Liu, ...
arXiv preprint arXiv:2412.05271, 2024
Cited by 42, 2024
Vision-rwkv: Efficient and scalable visual perception with rwkv-like architectures
Y Duan, W Wang, Z Chen, X Zhu, L Lu, T Lu, Y Qiao, H Li, J Dai, W Wang
arXiv preprint arXiv:2403.02308, 2024
Cited by 39, 2024
Articles 1–20