Jingqun Tang
ByteDance Inc.
Verified email at bytedance.com
Title
Cited by
Year
Few could be better than all: Feature sampling and grouping for scene text detection
J Tang, W Zhang, H Liu, MK Yang, B Jiang, G Hu, X Bai
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by 100 · 2022
Spts v2: single-point scene text spotting
Y Liu, J Zhang, D Peng, M Huang, X Wang, J Tang, C Huang, D Lin, ...
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023
Cited by 51 · 2023
Docpedia: Unleashing the power of large multimodal model in the frequency domain for versatile document understanding
H Feng, Q Liu, H Liu, J Tang, W Zhou, H Li, C Huang
Science China Information Sciences 2024, 2023
Cited by 42 · 2023
Unidoc: A universal large multimodal model for simultaneous text detection, recognition, spotting and understanding
H Feng, Z Wang, J Tang, J Lu, W Zhou, H Li, C Huang
arXiv preprint arXiv:2308.11592, 2023
Cited by 36 · 2023
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering
J Tang, Q Liu, Y Ye, J Lu, S Wei, C Lin, W Li, MFFB Mahmood, H Feng, ...
arXiv preprint arXiv:2405.11985, 2024
Cited by 25 · 2024
You can even annotate text with voice: Transcription-only-supervised text spotting
J Tang, S Qiao, B Cui, Y Ma, S Zhang, D Kanoulas
Proceedings of the 30th ACM International Conference on Multimedia, 4154-4163, 2022
Cited by 22 · 2022
TextSquare: Scaling up Text-Centric Visual Instruction Tuning
J Tang, C Lin, Z Zhao, S Wei, B Wu, Q Liu, H Feng, Y Li, S Wang, L Liao, ...
arXiv preprint arXiv:2404.12803, 2024
Cited by 20 · 2024
Optimal boxes: boosting end-to-end scene text recognition by adjusting annotated bounding boxes via reinforcement learning
J Tang, W Qian, L Song, X Dong, L Li, X Bai
European Conference on Computer Vision, 233-248, 2022
Cited by 17 · 2022
Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer
Z Zhao, J Tang, B Wu, C Lin, H Liu, Z Zhang, X Tan, C Huang, Y Xie
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by 15 · 2023
Tabpedia: Towards comprehensive visual table understanding with concept synergy
W Zhao, H Feng, Q Liu, J Tang, S Wei, B Wu, L Liao, Y Ye, H Liu, W Zhou, ...
arXiv preprint arXiv:2406.01326 (NeurIPS 2024), 2024
Cited by 11 · 2024
Character recognition competition for street view shop signs
J Tang, W Du, B Wang, W Zhou, S Mei, T Xue, X Xu, H Zhang
National Science Review 10 (6), nwad141, 2023
Cited by 10 · 2023
Harmonizing Visual Text Comprehension and Generation
Z Zhao, J Tang, B Wu, C Lin, S Wei, H Liu, X Tan, Z Zhang, C Huang, ...
arXiv preprint arXiv:2407.16364 (NeurIPS 2024), 2024
Cited by 9 · 2024
Cell-cell contact-driven EphB1 cis-and trans-signalings regulate cancer stem cells enrichment after chemotherapy
L Wang, Q Peng, Y Xie, N Yin, J Xu, A Chen, J Yi, W Shi, J Tang, J Xiang
Cell Death & Disease 13 (11), 980, 2022
Cited by 9 · 2022
Pargo: Bridging vision-language with partial and global views
AL Wang, B Shan, W Shi, KY Lin, X Fei, G Tang, L Liao, J Tang, C Huang, ...
arXiv preprint arXiv:2408.12928 (AAAI 2025), 2024
Cited by 7 · 2024
Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance
W Sun, B Cui, J Tang, XM Dong
arXiv preprint arXiv:2412.12974 (AAAI 2025), 2024
Cited by 6 · 2024
A bounding box is worth one token: Interleaving layout and text in a large language model for document understanding
J Lu, H Yu, Y Wang, Y Ye, J Tang, Z Yang, B Wu, Q Liu, H Feng, H Wang, ...
arXiv preprint arXiv:2407.01976, 2024
Cited by 5 · 2024
Mctbench: Multimodal cognition towards text-rich visual scenes benchmark
B Shan, X Fei, W Shi, AL Wang, G Tang, L Liao, J Tang, X Bai, C Huang
arXiv preprint arXiv:2410.11538, 2024
Cited by 4 · 2024
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning
L Fu, B Yang, Z Kuang, J Song, Y Li, L Zhu, Q Luo, X Wang, H Lu, ...
arXiv preprint arXiv:2501.00321, 2024
2024