Yunlong Tang
Verified email at rochester.edu
Title
Cited by
Year
Caption Anything: Interactive Image Description with Diverse Multimodal Controls
T Wang*, J Zhang*, J Fei*, H Zheng, Y Tang, Z Li, M Gao, S Zhao
arXiv preprint arXiv:2305.02677, 2023
Cited by: 88, Year: 2023
Video Understanding with Large Language Models: A Survey
Y Tang*, J Bi*, S Xu*, L Song, S Liang, T Wang, D Zhang, J An, J Lin, ...
arXiv preprint arXiv:2312.17432, 2023
Cited by: 58, Year: 2023
Empowering LLMs with Pseudo-Untrimmed Videos for Audio-Visual Temporal Understanding
Y Tang, D Shimada, J Bi, M Feng, H Hua, C Xu
AAAI Conference on Artificial Intelligence (AAAI), 2025
Cited by: 18*, Year: 2025
V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning
H Hua*, Y Tang*, C Xu, J Luo
AAAI Conference on Artificial Intelligence (AAAI), 2025
Cited by: 16, Year: 2025
LLMVA-GEBC: Large Language Model with Video Adapter for Generic Event Boundary Captioning
Y Tang, J Zhang, X Wang, T Wang, F Zheng
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2023
Cited by: 9, Year: 2023
Multi-modal Segment Assemblage Network for Ad Video Editing with Importance-Coherence Reward
Y Tang, S Xu, T Wang, Q Lin, Q Lu, F Zheng
Proceedings of the Asian Conference on Computer Vision (ACCV), 3519-3535, 2022
Cited by: 8, Year: 2022
AIM 2024 Challenge on Video Saliency Prediction: Methods and Results
A Moskalenko, A Bryncev, D Vatolin, R Timofte, G Zhan, L Yang, Y Tang, ...
Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 2024
Cited by: 6, Year: 2024
GaussianStyle: Gaussian Head Avatar via StyleGAN
P Liu, L Song, D Zhang, H Hua, Y Tang, H Tu, J Luo, C Xu
International Conference on 3D Vision (3DV), 2025
Cited by: 4*, Year: 2025
LaunchpadGPT: Language Model as Music Visualization Designer on Launchpad
S Xu*, Y Tang*, F Zheng
Proceedings of the International Computer Music Conference (ICMC), 213-217, 2023
Cited by: 4, Year: 2023
Do More Details Always Introduce More Hallucinations in LVLM-based Image Captioning?
M Feng, Y Tang, Z Zhang, C Xu
arXiv preprint arXiv:2406.12663, 2024
Cited by: 3, Year: 2024
CaRDiff: Video Salient Object Ranking Chain of Thought Reasoning for Saliency Prediction with Diffusion
Y Tang, G Zhan, L Yang, Y Liao, C Xu
AAAI Conference on Artificial Intelligence (AAAI), 2025
Cited by: 2, Year: 2025
MMCOMPOSITION: Revisiting the Compositionality of Pre-trained Vision-Language Models
H Hua*, Y Tang*, Z Zeng*, L Cao, Z Yang, H He, C Xu, J Luo
arXiv preprint arXiv:2410.09733, 2024
Cited by: 2, Year: 2024
EAGLE: Egocentric AGgregated Language-video Engine
J Bi, Y Tang, L Song, A Vosoughi, N Nguyen, C Xu
Proceedings of the 32nd ACM International Conference on Multimedia (ACM MM …, 2024
Cited by: 2, Year: 2024
VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?
Y Tang*, J Guo*, H Hua, S Liang, M Feng, X Li, R Mao, C Huang, J Bi, ...
arXiv preprint arXiv:2411.10979, 2024
Cited by: 1, Year: 2024
Scaling Concept with Text-Guided Diffusion Models
C Huang, S Liang, Y Tang, Y Tian, A Kumar, C Xu
arXiv preprint arXiv:2410.24151, 2024
Cited by: 1, Year: 2024
Generative AI for Cel-Animation: A Survey
Y Tang, J Guo, P Liu, Z Wang, H Hua, JX Zhong, Y Xiao, C Huang, L Song, ...
arXiv preprint arXiv:2501.06250, 2025
Year: 2025
Unveiling Visual Perception in Language Models: An Attention Head Analysis Approach
J Bi, J Guo, Y Tang, LB Wen, Z Liu, C Xu
arXiv preprint arXiv:2412.18108, 2024
Year: 2024
Video Editing Method and Device, Electronic Equipment and Storage Medium
Q Lin, Y Tang, Q Lu, N Pang, W Jiang, F Zheng
CN Patent 115,883,878, 2024
Year: 2024