Yuxuan Wang
Verified email at stu.pku.edu.cn
Title
Cited by
Year
HawkEye: Training video-text LLMs for grounding text in videos
Y Wang, X Meng, J Liang, Y Wang, Q Liu, D Zhao
arXiv preprint arXiv:2403.10228, 2024
Cited by 23 · 2024
VideoHallucer: Evaluating intrinsic and extrinsic hallucinations in large video-language models
Y Wang, Y Wang, D Zhao, C Xie, Z Zheng
arXiv preprint arXiv:2406.16338, 2024
Cited by 11 · 2024
VSTAR: A Video-grounded Dialogue Dataset for Situated Semantic Understanding with Scene and Topic Transitions
Y Wang, Z Zheng, X Zhao, J Li, Y Wang, D Zhao
Proc. of ACL 2023 (long paper), 2023
Cited by 8 · 2023
Efficient Temporal Extrapolation of Multimodal Large Language Models with Temporal Grounding Bridge
Y Wang, Y Wang, P Wu, J Liang, D Zhao, Y Liu, Z Zheng
Proceedings of the 2024 Conference on Empirical Methods in Natural Language …, 2024
Cited by 7* · 2024
Collaborative Reasoning on Multi-Modal Semantic Graphs for Video-Grounded Dialogue Generation
X Zhao*, Y Wang*, C Tao, C Wang, D Zhao
Findings of EMNLP 2022 (long paper), 2022
Cited by 5 · 2022
VideoLLaMB: Long-context video understanding with recurrent memory bridges
Y Wang, C Xie, Y Liu, Z Zheng
arXiv preprint arXiv:2409.01071, 2024
Cited by 4 · 2024
Rethinking dictionaries and glyphs for Chinese language pre-training
Y Wang, J Wang, D Zhao, Z Zheng
Findings of the Association for Computational Linguistics: ACL 2023, 1089-1101, 2023
Cited by 4* · 2023
MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning
J Wang*, Y Wang*, D Zhao, Z Zheng
arXiv preprint arXiv:2306.02252, 2023
Cited by 3 · 2023
Overview of the NLPCC 2022 shared task: multi-modal dialogue understanding and generation
Y Wang, X Zhao, D Zhao
CCF International Conference on Natural Language Processing and Chinese …, 2022
Cited by 3 · 2022
Overview of the NLPCC 2023 Shared Task 10: Learn to Watch TV: Multimodal Dialogue Understanding and Response Generation
Y Wang, Y Wang, D Zhao
CCF International Conference on Natural Language Processing and Chinese …, 2023
Cited by 2 · 2023
STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering
Y Wang, Y Wang, K Chen, D Zhao
Proceedings of the AAAI Conference on Artificial Intelligence 38 (17), 19215 …, 2024
Cited by 1 · 2024
Teaching Text-to-Image Models to Communicate
X Sun, J Feng, Y Wang, Y Lai, X Shen, D Zhao
arXiv preprint arXiv:2309.15516, 2023
Cited by 1 · 2023
LongViTU: Instruction Tuning for Long-Form Video Understanding
R Wu, X Ma, H Ci, Y Fan, Y Wang, H Zhao, Q Li, Y Wang
arXiv preprint arXiv:2501.05037, 2025
2025
Friends-MMC: A Dataset for Multi-modal Multi-party Conversation Understanding
Y Wang, X Meng, Y Wang, J Liang, Q Liu, D Zhao
arXiv preprint arXiv:2412.17295, 2024
2024
VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format
Y Wang, X Meng, Y Wang, J Liang, J Wei, H Zhang, D Zhao
arXiv preprint arXiv:2411.17991, 2024
2024
Understanding Multimodal Hallucination with Parameter-Free Representation Alignment
Y Wang, J Liang, Y Wang, H Zhang, D Zhao
arXiv preprint arXiv:2409.01151, 2024
2024
ExoViP: Step-by-step Verification and Exploration with Exoskeleton Modules for Compositional Visual Reasoning
Y Wang, A Yuille, Z Li, Z Zheng
COLM 2024, 2024
2024