Fewer Steps, Better Performance: Efficient Cross-Modal Clip Trimming for Video Moment Retrieval Using Language | X Fang*, D Liu*, W Fang*, P Zhou, Z Xu, W Xu, J Chen, R Li | Association for the Advancement of Artificial Intelligence (AAAI), 2024 | 14 | 2024 |
Annotations Are Not All You Need: A Cross-modal Knowledge Transfer Network for Unsupervised Temporal Sentence Grounding | X Fang*, D Liu*, W Fang*, P Zhou, Y Cheng, K Tang, K Zou | Findings of the Association for Computational Linguistics: EMNLP 2023, 8721-8733, 2023 | 11 | 2023 |
Rethinking Weakly-supervised Video Temporal Grounding From a Game Perspective | X Fang*, Z Xiong*, W Fang*, X Qu, C Chen, J Dong, K Tang, P Zhou, ... | European Conference on Computer Vision, 2024 | 10 | 2024 |
Not All Inputs Are Valid: Towards Open-Set Video Moment Retrieval using Language | X Fang*, W Fang*, D Liu, X Qu, J Dong, P Zhou, R Li, Z Xu, L Chen, ... | ACM Multimedia 2024, 2024 | 7 | 2024 |
Towards Robust Temporal Activity Localization Learning with Noisy Labels | D Liu, X Qu, X Fang, J Dong, P Zhou, G Nan, K Tang, W Fang, Y Cheng | International Conference on Computational Linguistics 2024, 2024 | 5 | 2024 |
Multi-Pair Temporal Sentence Grounding via Multi-Thread Knowledge Transfer Network | X Fang, W Fang, C Wang, D Liu, K Tang, J Dong, P Zhou, B Li | Association for the Advancement of Artificial Intelligence (AAAI), 2025 | 1 | 2025 |