Yunlong Tang

Cited by

	All	Since 2020
Citations	222	222
h-index	6	6
i10-index	4	4

Citations per year: 2022 (1), 2023 (29), 2024 (172), 2025 (20)

Public access

1 article available
0 articles not available

Based on funding mandates

Co-authors

Chenliang Xu, Associate Professor of Computer Science, University of Rochester. Verified email at rochester.edu
Feng Zheng, Southern University of Science and Technology. Verified email at sustech.edu.cn
Teng Wang, Tencent. Verified email at connect.hku.hk
Jing Bi, University of Rochester. Verified email at rochester.edu
Jiebo Luo, Fellow of ACM/AAAI/IEEE/..., Albert Arendt Hopeman Professor of Engineering, University of Rochester. Verified email at cs.rochester.edu
Siting Xu, Southern University of Science and Technology. Verified email at mail.sustech.edu.cn
Hang Hua, University of Rochester. Verified email at cs.rochester.edu
Ping Luo (羅平), Associate Professor, The University of Hong Kong; MMLAB@HKU. Verified email at hku.hk
Daiki Shimada, Sony Group Corp. Verified email at sony.com
Zhengyuan Yang, Principal Researcher, Microsoft. Verified email at microsoft.com
Liangliang Cao, Google DeepMind, IEEE Fellow. Verified email at google.com
Pooyan Fazli, Arizona State University. Verified email at asu.edu
Yapeng Tian, Assistant Professor, University of Texas at Dallas. Verified email at utdallas.edu
Yiting Liao, Staff Research Scientist at Wireless Communications Research, Intel Labs. Verified email at intel.com

Yunlong Tang

University of Rochester

Verified email at rochester.edu

Multimodal Learning, Video Understanding


Title	Cited by	Year
Caption Anything: Interactive Image Description with Diverse Multimodal Controls. T Wang, J Zhang, J Fei*, H Zheng, Y Tang, Z Li, M Gao, S Zhao. arXiv preprint arXiv:2305.02677, 2023	88	2023
Video Understanding with Large Language Models: A Survey. Y Tang, J Bi, S Xu*, L Song, S Liang, T Wang, D Zhang, J An, J Lin, ... arXiv preprint arXiv:2312.17432, 2023	58	2023
Empowering LLMs with Pseudo-Untrimmed Videos for Audio-Visual Temporal Understanding. Y Tang, D Shimada, J Bi, M Feng, H Hua, C Xu. AAAI Conference on Artificial Intelligence (AAAI), 2025	18*	2025
V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning. H Hua, Y Tang, C Xu, J Luo. AAAI Conference on Artificial Intelligence (AAAI), 2025	16	2025
LLMVA-GEBC: Large Language Model with Video Adapter for Generic Event Boundary Captioning. Y Tang, J Zhang, X Wang, T Wang, F Zheng. IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2023	9	2023
Multi-modal Segment Assemblage Network for Ad Video Editing with Importance-Coherence Reward. Y Tang, S Xu, T Wang, Q Lin, Q Lu, F Zheng. Proceedings of the Asian Conference on Computer Vision (ACCV), 3519-3535, 2022	8	2022
AIM 2024 Challenge on Video Saliency Prediction: Methods and Results. A Moskalenko, A Bryncev, D Vatolin, R Timofte, G Zhan, L Yang, Y Tang, ... Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 2024	6	2024
GaussianStyle: Gaussian Head Avatar via StyleGAN. P Liu, L Song, D Zhang, H Hua, Y Tang, H Tu, J Luo, C Xu. International Conference on 3D Vision (3DV), 2025	4*	2025
LaunchpadGPT: Language Model as Music Visualization Designer on Launchpad. S Xu, Y Tang, F Zheng. Proceedings of the International Computer Music Conference (ICMC), 213-217, 2023	4	2023
Do More Details Always Introduce More Hallucinations in LVLM-based Image Captioning? M Feng, Y Tang, Z Zhang, C Xu. arXiv preprint arXiv:2406.12663, 2024	3	2024
CaRDiff: Video Salient Object Ranking Chain of Thought Reasoning for Saliency Prediction with Diffusion. Y Tang, G Zhan, L Yang, Y Liao, C Xu. AAAI Conference on Artificial Intelligence (AAAI), 2025	2	2025
MMCOMPOSITION: Revisiting the Compositionality of Pre-trained Vision-Language Models. H Hua, Y Tang, Z Zeng*, L Cao, Z Yang, H He, C Xu, J Luo. arXiv preprint arXiv:2410.09733, 2024	2	2024
EAGLE: Egocentric AGgregated Language-video Engine. J Bi, Y Tang, L Song, A Vosoughi, N Nguyen, C Xu. Proceedings of the 32nd ACM International Conference on Multimedia (ACM MM …, 2024	2	2024
VidComposition: Can MLLMs Analyze Compositions in Compiled Videos? Y Tang, J Guo, H Hua, S Liang, M Feng, X Li, R Mao, C Huang, J Bi, ... arXiv preprint arXiv:2411.10979, 2024	1	2024
Scaling Concept with Text-Guided Diffusion Models. C Huang, S Liang, Y Tang, Y Tian, A Kumar, C Xu. arXiv preprint arXiv:2410.24151, 2024	1	2024
Generative AI for Cel-Animation: A Survey. Y Tang, J Guo, P Liu, Z Wang, H Hua, JX Zhong, Y Xiao, C Huang, L Song, ... arXiv preprint arXiv:2501.06250, 2025		2025
Unveiling Visual Perception in Language Models: An Attention Head Analysis Approach. J Bi, J Guo, Y Tang, LB Wen, Z Liu, C Xu. arXiv preprint arXiv:2412.18108, 2024		2024
Video Editing Method and Device, Electronic Equipment and Storage Medium. Q Lin, Y Tang, Q Lu, N Pang, W Jiang, F Zheng. CN Patent 115,883,878, 2024		2024
