Linguistic more: Taking a further step toward efficient and accurate scene text recognition B Zhang, H Xie, Y Wang, J Xu, Y Zhang arXiv preprint arXiv:2305.05140, 2023 | 29 | 2023 |
Symmetrical linguistic feature distillation with CLIP for scene text recognition Z Wang, H Xie, Y Wang, J Xu, B Zhang, Y Zhang Proceedings of the 31st ACM International Conference on Multimedia, 509-518, 2023 | 22 | 2023 |
Tampered text detection via RGB and frequency relationship modeling Y Wang, B Zhang, H Xie, Y Zhang Chinese Journal of Network and Information Security 8 (3), 29-40, 2022 | 11 | 2022 |
Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing B Zhang, H Xie, Z Gao, Y Wang Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 10 | 2024 |
Chain of ideas: Revolutionizing research via novel idea development with LLM agents L Li, W Xu, J Guo, R Zhao, X Li, Y Yuan, B Zhang, Y Jiang, Y Xin, R Dang, ... arXiv preprint arXiv:2410.13185, 2024 | 4 | 2024 |
How Control Information Influences Multilingual Text Image Generation and Editing? B Zhang, Z Gao, Y Qu, H Xie arXiv preprint arXiv:2407.11502, 2024 | 4 | 2024 |
Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition Z Gao, Y Wang, Y Qu, B Zhang, Z Wang, J Xu, H Xie arXiv preprint arXiv:2405.05841, 2024 | 4 | 2024 |
VideoRefer Suite: Advancing spatial-temporal object understanding with Video LLM Y Yuan, H Zhang, W Li, Z Cheng, B Zhang, L Li, X Li, D Zhao, W Zhang, ... arXiv preprint arXiv:2501.00599, 2024 | 1 | 2024 |
Focus on the whole character: discriminative character modeling for scene text recognition B Zhou, Y Qu, Z Wang, Z Li, B Zhang, H Xie arXiv preprint arXiv:2407.05562, 2024 | 1 | 2024 |
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding B Zhang, K Li, Z Cheng, Z Hu, Y Yuan, G Chen, S Leng, Y Jiang, H Zhang, ... arXiv preprint arXiv:2501.13106, 2025 | | 2025 |
ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark R Dang, Y Yuan, W Zhang, Y Xin, B Zhang, L Li, L Wang, Q Zeng, X Li, ... arXiv preprint arXiv:2501.05031, 2025 | | 2025 |