Sinan Tan
Alibaba Group; Tsinghua University
Verified email at tinytangent.com
Title
Cited by
Year
Qwen technical report
J Bai, S Bai, Y Chu, Z Cui, K Dang, X Deng, Y Fan, W Ge, Y Han, F Huang, ...
arXiv preprint arXiv:2309.16609, 2023
Cited by: 1880 · Year: 2023
Qwen-vl: A frontier large vision-language model with versatile abilities
J Bai, S Bai, S Yang, S Wang, S Tan, P Wang, J Lin, C Zhou, J Zhou
arXiv preprint arXiv:2308.12966, 2023
Cited by: 1537* · Year: 2023
Qwen2-vl: Enhancing vision-language model's perception of the world at any resolution
P Wang, S Bai, S Tan, S Wang, Z Fan, J Bai, K Chen, X Liu, J Wang, W Ge, ...
arXiv preprint arXiv:2409.12191, 2024
Cited by: 375* · Year: 2024
Mixed neural voxels for fast multi-view video synthesis
F Wang, S Tan, X Li, Z Tian, Y Song, H Liu
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 78 · Year: 2023
Qwen2 technical report, 2024
A Yang, B Yang, B Hui, B Zheng, B Yu, C Zhou, C Li, C Li, D Liu, F Huang, ...
URL https://arxiv.org/abs/2407.10671
Cited by: 52
Multi-agent embodied question answering in interactive environments
S Tan, W Xiang, H Liu, D Guo, F Sun
Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020
Cited by: 26 · Year: 2020
Knowledge-based embodied question answering
S Tan, M Ge, D Guo, H Liu, F Sun
IEEE Transactions on Pattern Analysis and Machine Intelligence 45 (10 …, 2023
Cited by: 22 · Year: 2023
Ofasys: A multi-modal multi-task learning system for building generalist models
J Bai, R Men, H Yang, X Ren, K Dang, Y Zhang, X Zhou, P Wang, S Tan, ...
arXiv preprint arXiv:2212.04408, 2022
Cited by: 15 · Year: 2022
Self-Supervised 3-D Semantic Representation Learning for Vision-and-Language Navigation
S Tan, K Sima, D Wang, M Ge, D Guo, H Liu
IEEE Transactions on Neural Networks and Learning Systems, 2024
Cited by: 14 · Year: 2024
Towards embodied scene description
S Tan, H Liu, D Guo, X Zhang, F Sun
arXiv preprint arXiv:2004.14638, 2020
Cited by: 13 · Year: 2020
Embodied Multi-Agent Task Planning from Ambiguous Instruction.
X Liu, X Li, D Guo, S Tan, H Liu, F Sun
Robotics: Science and Systems, 2022
Cited by: 12 · Year: 2022
Embodied referring expression for manipulation question answering in interactive environment
Q Sima, S Tan, H Liu, F Sun, W Xu, L Fu
2023 IEEE International Conference on Robotics and Automation (ICRA), 7635-7641, 2023
Cited by: 6 · Year: 2023
Embodied scene description
S Tan, D Guo, H Liu, X Zhang, F Sun
Autonomous robots, 1-23, 2022
Cited by: 6 · Year: 2022
A spark of vision-language intelligence: 2-dimensional autoregressive transformer for efficient finegrained image generation
L Chen, S Tan, Z Cai, W Xie, H Zhao, Y Zhang, J Lin, J Bai, T Liu, ...
arXiv preprint arXiv:2410.01912, 2024
Cited by: 3 · Year: 2024
Depth-aware vision-and-language navigation using scene query attention network
S Tan, M Ge, D Guo, H Liu, F Sun
2022 International Conference on Robotics and Automation (ICRA), 9390-9396, 2022
Cited by: 3 · Year: 2022
An Automated Question-Answering Framework Based on Evolution Algorithm
S Tan, H Xue, Q Ren, H Liu, J Bai
arXiv preprint arXiv:2201.10797, 2022
Year: 2022
Articles 1–16