| Title | Authors | Venue | Cited by | Year |
| --- | --- | --- | --- | --- |
| Towards open vocabulary learning: A survey | J Wu, X Li, S Xu, H Yuan, H Ding, Y Yang, X Li, J Zhang, Y Tong, X Jiang, ... | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024 | 125 | 2024 |
| Panoptic-PartFormer: Learning a unified model for panoptic part segmentation | X Li, S Xu, Y Yang, G Cheng, Y Tong, D Tao | European Conference on Computer Vision, 729-747, 2022 | 47 | 2022 |
| Fashionformer: A simple, effective and unified baseline for human fashion segmentation and recognition | S Xu, X Li, J Wang, G Cheng, Y Tong, D Tao | European Conference on Computer Vision, 545-563, 2022 | 35 | 2022 |
| An open and comprehensive pipeline for unified object grounding and detection | X Zhao, Y Chen, S Xu, X Li, X Wang, Y Li, H Huang | arXiv preprint arXiv:2401.02361, 2024 | 23 | 2024 |
| Panoptic-PartFormer++: A unified and decoupled view for panoptic part segmentation | X Li, S Xu, Y Yang, H Yuan, G Cheng, Y Tong, Z Lin, MH Yang, D Tao | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024 | 21 | 2024 |
| DST-Det: Simple dynamic self-training for open-vocabulary object detection | S Xu, X Li, S Wu, W Zhang, Y Li, G Cheng, Y Tong, K Chen, CC Loy | arXiv preprint arXiv:2310.01393, 2023 | 11 | 2023 |
| RAP-SAM: Towards Real-Time All-Purpose Segment Anything | S Xu, H Yuan, Q Shi, L Qi, J Wang, Y Yang, Y Li, K Chen, Y Tong, ... | arXiv preprint arXiv:2401.10228, 2024 | 5 | 2024 |
| Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs | Y Zhou, T Zhang, S Xu, S Chen, Q Zhou, Y Tong, S Ji, J Zhang, X Li, L Qi | arXiv preprint arXiv:2501.04670, 2025 | | 2025 |
| Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos | H Yuan, X Li, T Zhang, Z Huang, S Xu, S Ji, Y Tong, L Qi, J Feng, ... | arXiv preprint arXiv:2501.04001, 2025 | | 2025 |
| DST-Det: Open-Vocabulary Object Detection via Dynamic Self-Training | S Xu, X Li, S Wu, W Zhang, Y Tong, CC Loy | IEEE Transactions on Circuits and Systems for Video Technology, 2024 | | 2024 |
| RLRF4Rec: Reinforcement Learning from Recsys Feedback for Enhanced Recommendation Reranking | C Sun, Y Liang, Y Yang, S Xu, T Yang, Y Tong | arXiv preprint arXiv:2410.05939, 2024 | | 2024 |
| LLAVADI: What Matters For Multimodal Large Language Models Distillation | S Xu, X Li, H Yuan, L Qi, Y Tong, MH Yang | arXiv preprint arXiv:2407.19409, 2024 | | 2024 |
| Query Learning of Both Thing and Stuff for Panoptic Segmentation | S Xu, X Li, Y Yang, H Li, G Cheng, Y Tong | 2022 IEEE International Conference on Image Processing (ICIP), 716-720, 2022 | | 2022 |