Yuxin Song
Verified email at baidu.com
Title
Cited by
Year
UATVR: Uncertainty-adaptive text-video retrieval
B Fang, W Wu, C Liu, Y Zhou, Y Song, W Wang, X Shu, X Ji, J Wang
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by 48, 2023
GPT4Vis: what can GPT-4 do for zero-shot visual recognition?
W Wu, H Yao, M Zhang, Y Song, W Ouyang, J Wang
arXiv preprint arXiv:2311.15732, 2023
Cited by 26, 2023
GRATIS: Deep learning graph representation with task-specific topology and multi-dimensional edge features
S Song, Y Song, C Luo, Z Song, S Kuzucu, X Jia, Z Guo, W Xie, L Shen, ...
arXiv preprint arXiv:2211.12482, 2022
Cited by 26, 2022
Transferring vision-language models for visual recognition: A classifier perspective
W Wu, Z Sun, Y Song, J Wang, W Ouyang
International Journal of Computer Vision 132 (2), 392-409, 2024
Cited by 19, 2024
DALG: Deep attentive local and global modeling for image retrieval
Y Song, R Zhu, M Yang, D He
arXiv preprint arXiv:2207.00287, 2022
Cited by 15, 2022
Dense Connector for MLLMs
H Yao, W Wu, T Yang, YX Song, M Zhang, H Feng, Y Sun, Z Li, W Ouyang, ...
arXiv preprint arXiv:2405.13800, 2024
Cited by 11, 2024
What Can Simple Arithmetic Operations Do for Temporal Modeling?
W Wu, Y Song, Z Sun, J Wang, C Xu, W Ouyang
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by 10, 2023
It takes two: Masked appearance-motion modeling for self-supervised video transformer pre-training
Y Song, M Yang, W Wu, D He, F Li, J Wang
arXiv preprint arXiv:2210.05234, 2022
Cited by 10, 2022
MonoFormer: One transformer for both diffusion and autoregression
C Zhao, Y Song, W Wang, H Feng, E Ding, Y Sun, X Xiao, J Wang
arXiv preprint arXiv:2409.16280, 2024
Cited by 8, 2024
Automated multi-level preference for MLLMs
M Zhang, W Wu, Y Lu, Y Song, K Rong, H Yao, J Zhao, F Liu, Y Sun, ...
arXiv preprint arXiv:2405.11165, 2024
Cited by 5, 2024
MERG: Multi-Dimensional Edge Representation Generation Layer for Graph Neural Networks
Y Song, C Luo, A Jackson, X Jia, W Xie, L Shen, H Gunes, S Song
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by 3, 2024
Mulberry: Empowering MLLM with o1-like reasoning and reflection via collective Monte Carlo tree search
H Yao, J Huang, W Wu, J Zhang, Y Wang, S Liu, Y Wang, Y Song, H Feng, ...
arXiv preprint arXiv:2412.18319, 2024
Cited by 2, 2024
Multi-level graph learning for audio event classification and human-perceived annoyance rating prediction
Y Hou, Q Ren, S Song, Y Song, W Wang, D Botteldooren
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by 1, 2024
The Key of Understanding Vision Tasks: Explanatory Instructions
Y Shen, XS Wei, Y Sun, Y Song, T Yuan, J Jin, H Xu, Y Yao, E Ding
arXiv preprint arXiv:2412.18525, 2024
2024
Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization
Y Shen, XS Wei, Y Sun, Y Song, T Yuan, J Jin, H Xu, Y Yao, E Ding
arXiv preprint arXiv:2412.18525, 2024
2024
DistinctAD: Distinctive Audio Description Generation in Contexts
B Fang, W Wu, Q Wu, Y Song, AB Chan
arXiv preprint arXiv:2411.18180, 2024
2024
Octopus: A Multi-modal LLM with Parallel Recognition and Sequential Understanding
YS Chuyang Zhao, YuXin Song, Junru Chen, Kang Rong, Haocheng Feng, Gang ...
Advances in Neural Information Processing Systems (NeurIPS), 2024
2024
Octopus: A Multi-modal LLM with Parallel Recognition and Sequential Understanding
C Zhao, YX Song, J Chen, K Rong, H Feng, G Zhang, S Ji, J Wang, ...
The Thirty-eighth Annual Conference on Neural Information Processing Systems