Yucheng Zhao
MEGVII Technology
Verified email at mail.ustc.edu.cn
Title
Cited by
Year
OmniVL: One foundation model for image-language and video-language tasks
J Wang, D Chen, Z Wu, C Luo, L Zhou, Y Zhao, Y Xie, C Liu, YG Jiang, ...
Advances in neural information processing systems 35, 5696-5710, 2022
Cited by: 152; Year: 2022
A battle of network structures: An empirical study of CNN, Transformer, and MLP
Y Zhao, G Wang, C Tang, C Luo, W Zeng, ZJ Zha
arXiv preprint arXiv:2108.13002, 2021
Cited by: 116; Year: 2021
Sparse MLP for image recognition: Is self-attention really necessary?
C Tang, Y Zhao, G Wang, C Luo, W Xie, W Zeng
Proceedings of the AAAI conference on artificial intelligence 36 (2), 2344-2351, 2022
Cited by: 112; Year: 2022
When shift operation meets vision transformer: An extremely simple alternative to attention mechanism
G Wang, Y Zhao, C Tang, C Luo, W Zeng
Proceedings of the AAAI conference on artificial intelligence 36 (2), 2423-2430, 2022
Cited by: 74; Year: 2022
Look before you match: Instance understanding matters in video object segmentation
J Wang, D Chen, Z Wu, C Luo, C Tang, X Dai, Y Zhao, Y Xie, L Yuan, ...
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2023
Cited by: 55; Year: 2023
ADriver-I: A general world model for autonomous driving
F Jia, W Mao, Y Liu, Y Zhao, Y Wen, C Zhang, X Zhang, T Wang
arXiv preprint arXiv:2311.13549, 2023
Cited by: 47; Year: 2023
Self-supervised visual representations learning by contrastive mask prediction
Y Zhao, G Wang, C Luo, W Zeng, ZJ Zha
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021
Cited by: 46; Year: 2021
Panacea: Panoramic and controllable video generation for autonomous driving
Y Wen, Y Zhao, Y Liu, F Jia, Y Wang, C Luo, C Zhang, T Wang, X Sun, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 38; Year: 2024
Peripheral vision transformer
J Min, Y Zhao, C Luo, M Cho
Advances in Neural Information Processing Systems 35, 32097-32111, 2022
Cited by: 37; Year: 2022
Streaming video model
Y Zhao, C Luo, C Tang, D Chen, N Codella, ZJ Zha
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2023
Cited by: 14; Year: 2023
Multi-scale group transformer for long sequence modeling in speech separation
Y Zhao, C Luo, ZJ Zha, W Zeng
Proceedings of the Twenty-Ninth International Conference on International …, 2021
Cited by: 14; Year: 2021
Stream Query Denoising for Vectorized HD-Map Construction
S Wang, F Jia, W Mao, Y Liu, Y Zhao, Z Chen, T Wang, C Zhang, X Zhang, ...
European Conference on Computer Vision, 203-220, 2024
Cited by: 13; Year: 2024
RetrieverTTS: Modeling decomposed factors for text-based speech insertion
D Yin, C Tang, Y Liu, X Wang, Z Zhao, Y Zhao, Z Xiong, S Zhao, C Luo
arXiv preprint arXiv:2206.13865, 2022
Cited by: 12; Year: 2022
Zero-shot text-to-speech for text-based insertion in audio narration
C Tang, C Luo, Z Zhao, D Yin, Y Zhao, W Zeng
arXiv preprint arXiv:2109.05426, 2021
Cited by: 9; Year: 2021
General-purpose speech representation learning through a self-supervised multi-granularity framework
Y Zhao, D Yin, C Luo, Z Zhao, C Tang, W Zeng, ZJ Zha
arXiv preprint arXiv:2102.01930, 2021
Cited by: 8; Year: 2021
Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving?
Y Bai, D Wu, Y Liu, F Jia, W Mao, Z Zhang, Y Zhao, J Shen, X Wei, ...
arXiv preprint arXiv:2405.18361, 2024
Cited by: 7; Year: 2024
SubjectDrive: Scaling generative data in autonomous driving via subject control
B Huang, Y Wen, Y Zhao, Y Hu, Y Liu, F Jia, W Mao, T Wang, C Zhang, ...
arXiv preprint arXiv:2403.19438, 2024
Cited by: 6; Year: 2024
VLM-Eval: a general evaluation on video large language models
S Li, Y Zhang, Y Zhao, Q Wang, F Jia, Y Liu, T Wang
arXiv preprint arXiv:2311.11865, 2023
Cited by: 4; Year: 2023
Reconstructive visual instruction tuning
H Wang, A Zheng, Y Zhao, T Wang, Z Ge, X Zhang, Z Zhang
arXiv preprint arXiv:2410.09575, 2024
Cited by: 2; Year: 2024
Attention-guided contrastive masked image modeling for transformer-based self-supervised learning
Y Zhan, Y Zhao, C Luo, Y Zhang, X Sun
2023 IEEE International Conference on Image Processing (ICIP), 2490-2494, 2023
Cited by: 1; Year: 2023
Articles 1–20