Mingxin Huang
Verified email at mail.scut.edu.cn
Title
Cited by
Year
On the hidden mystery of OCR in large multimodal models
Y Liu, Z Li, M Huang, B Yang, W Yu, C Li, XC Yin, CL Liu, L Jin, X Bai
arXiv preprint arXiv:2305.07895, 2023
Cited by: 174; Year: 2023
SwinTextSpotter: Scene text spotting via better synergy between text detection and text recognition
M Huang, Y Liu, Z Peng, C Liu, D Lin, S Zhu, N Yuan, K Ding, L Jin
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 140; Year: 2022
SPTS: single-point text spotting
D Peng, X Wang, Y Liu, J Zhang, M Huang, S Lai, J Li, S Zhu, D Lin, ...
Proceedings of the 30th ACM International Conference on Multimedia, 4272-4281, 2022
Cited by: 63; Year: 2022
SPTS v2: single-point scene text spotting
Y Liu, J Zhang, D Peng, M Huang, X Wang, J Tang, C Huang, D Lin, ...
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023
Cited by: 51; Year: 2023
ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer
M Huang, J Zhang, D Peng, H Lu, C Huang, Y Liu, X Bai, L Jin
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 32; Year: 2023
OCRBench: on the hidden mystery of OCR in large multimodal models
Y Liu, Z Li, M Huang, B Yang, W Yu, C Li, XC Yin, CL Liu, L Jin, X Bai
Science China Information Sciences 67 (12), 220102, 2024
Cited by: 11; Year: 2024
Mini-Monkey: Alleviating the semantic sawtooth effect for lightweight MLLMs via complementary image pyramid
M Huang, Y Liu, D Liang, L Jin, X Bai
arXiv preprint arXiv:2408.02034, 2024
Cited by: 11*; Year: 2024
Hierarchical side-tuning for vision transformers
W Lin, Z Wu, W Yang, M Huang, J Huang, L Jin
arXiv preprint arXiv:2310.05393, 2023
Cited by: 10; Year: 2023
SwinTextSpotter v2: Towards better synergy for scene text spotting
M Huang, D Peng, H Li, Z Peng, C Liu, D Lin, Y Liu, X Bai, L Jin
arXiv preprint arXiv:2401.07641, 2024
Cited by: 3; Year: 2024
DTDT: Highly Accurate Dense Text Line Detection in Historical Documents via Dynamic Transformer
H Li, C Liu, J Wang, M Huang, W Zhou, L Jin
International Conference on Document Analysis and Recognition, 381-396, 2023
Cited by: 2; Year: 2023
Progressive Evolution from Single-Point to Polygon for Scene Text
L Deng, M Huang, X Xie, Y Liu, L Jin, X Bai
International Conference on Document Analysis and Recognition, 111-128, 2024
Cited by: 1; Year: 2024
VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization
Y Liu, M Huang, H Yan, L Deng, W Wu, H Lu, C Shen, L Jin, X Bai
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025
Year: 2025
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning
L Fu, B Yang, Z Kuang, J Song, Y Li, L Zhu, Q Luo, X Wang, H Lu, ...
arXiv preprint arXiv:2501.00321, 2024
Year: 2024
Bridging the Gap Between End-to-End and Two-Step Text Spotting
M Huang, H Li, Y Liu, X Bai, L Jin
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Year: 2024