Haonan Zhang
Other names: 张 浩楠
UESTC | Alibaba TongYi Laboratory
Verified email at std.uestc.edu.cn
Title
Cited by
Year
S2 Transformer for Image Captioning
P Zeng, H Zhang, J Song, L Gao
Proceedings of the Thirty-First International Joint Conference on Artificial …, 2022
60 citations, 2022
Video Question Answering with Prior Knowledge and Object-sensitive Learning
P Zeng, H Zhang, L Gao, J Song, HT Shen
IEEE Transactions on Image Processing, 5936-5948, 2022
34 citations, 2022
Memory-based augmentation network for video captioning
S Jing, H Zhang, P Zeng, L Gao, J Song, HT Shen
IEEE Transactions on Multimedia, 2023
27 citations, 2023
Learning visual question answering on controlled semantic noisy labels
H Zhang, P Zeng, Y Hu, J Qian, J Song, L Gao
Pattern Recognition 138, 109339, 2023
23 citations, 2023
A differentiable semantic metric approximation in probabilistic embedding for cross-modal retrieval
H Li, J Song, L Gao, P Zeng, H Zhang, G Li
Advances in Neural Information Processing Systems 35, 11934-11946, 2022
17 citations, 2022
Visual Commonsense-aware Representation Network for Video Captioning
P Zeng, H Zhang, L Gao, X Li, J Qian, HT Shen
IEEE Transactions on Neural Networks and Learning Systems, 2023
16 citations, 2023
SPT: Spatial pyramid transformer for image captioning
H Zhang, P Zeng, L Gao, X Lyu, J Song, HT Shen
IEEE Transactions on Circuits and Systems for Video Technology, 2023
14 citations, 2023
Depth-aware sparse transformer for video-language learning
H Zhang, L Gao, P Zeng, A Hanjalic, HT Shen
Proceedings of the 31st ACM International Conference on Multimedia, 4778-4787, 2023
12 citations, 2023
MMEvol: Empowering multimodal large language models with evol-instruct
R Luo, H Zhang, L Chen, TE Lin, X Liu, Y Wu, M Yang, M Wang, P Zeng, ...
arXiv preprint arXiv:2409.05840, 2024
7 citations, 2024
You should know more: Learning external knowledge for visual dialog
L Zhao, H Zhang, X Li, S Yang, Y Song
Neurocomputing 488, 54-65, 2022
3 citations, 2022
OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis
R Luo, TE Lin, H Zhang, Y Wu, X Liu, M Yang, Y Li, L Chen, J Li, L Zhang, ...
arXiv preprint arXiv:2501.04561, 2025
1 citation, 2025
UMP: Unified Modality-aware Prompt Tuning for Text-Video Retrieval
H Zhang, P Zeng, L Gao, J Song, HT Shen
IEEE Transactions on Circuits and Systems for Video Technology, 2024
1 citation, 2024
Text-Video Retrieval with Global-Local Semantic Consistent Learning
H Zhang, P Zeng, L Gao, J Song, Y Duan, X Lyu, H Shen
arXiv preprint arXiv:2405.12710, 2024
1 citation, 2024
Pedestrian Attributes Recognition for UAV-Human
H Ni, P Lai, Y Li, P Zeng, H Zhang, J Song
2024 IEEE International Conference on Multimedia and Expo Workshops (ICMEW), 1-5, 2024
2024
MPT: Multi-grained Prompt Tuning for Text-Video Retrieval
H Zhang, P Zeng, L Gao, J Song, HT Shen
ACM Multimedia 2024, 2024
2024