Hao Feng
Ph.D., University of Science and Technology of China; Researcher, ByteDance
Verified email at mail.ustc.edu.cn
Title
Cited by
Year
DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction
H Feng, Y Wang, W Zhou, J Deng, H Li
ACM International Conference on Multimedia (ACM MM), 2021
Cited by: 64, Year: 2021
DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding
H Feng, Q Liu, H Liu, W Zhou, H Li, C Huang
Science China Information Sciences (SCIS), 2024, 2023
Cited by: 48, Year: 2023
UniDoc: A Universal Large Multimodal Model for Simultaneous Text Detection, Recognition, Spotting and Understanding
H Feng, Z Wang, J Tang, J Lu, W Zhou, H Li, C Huang
arXiv preprint arXiv:2308.11592, 2023
Cited by: 38, Year: 2023
Geometric Representation Learning for Document Image Rectification
H Feng, W Zhou, J Deng, Y Wang, H Li
European Conference on Computer Vision (ECCV), 2022, 475-492, 2022
Cited by: 31, Year: 2022
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering
J Tang, Q Liu, Y Ye, J Lu, S Wei, C Lin, W Li, MFFB Mahmood, H Feng, ...
arXiv preprint arXiv:2405.11985, 2024
Cited by: 30, Year: 2024
TextSquare: Scaling up Text-Centric Visual Instruction Tuning
J Tang, C Lin, Z Zhao, S Wei, B Wu, Q Liu, H Feng, Y Li, S Wang, L Liao, ...
arXiv preprint arXiv:2404.12803, 2024
Cited by: 23, Year: 2024
DocScanner: Robust Document Image Rectification with Progressive Learning
H Feng, W Zhou, J Deng, Q Tian, H Li
arXiv preprint arXiv:2110.14968, 2021
Cited by: 21, Year: 2021
Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs
Y Wang, W Zhou, H Feng, K Zhou, H Li
arXiv preprint arXiv:2311.13194, 2023
Cited by: 17, Year: 2023
Recurrent Generic Contour-based Instance Segmentation with Progressive Learning
H Feng, K Zhou, W Zhou, Y Yin, J Deng, Q Sun, H Li
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2024, 2023
Cited by: 17*, Year: 2023
Deep Unrestricted Document Image Rectification
H Feng, S Liu, J Deng, W Zhou, H Li
IEEE Transactions on Multimedia (TMM), 2023
Cited by: 17, Year: 2023
Sign Language Translation with Iterative Prototype
H Yao, W Zhou, H Feng, H Hu, H Zhou, H Li
International Conference on Computer Vision (ICCV), 2023
Cited by: 16, Year: 2023
TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
W Zhao, H Feng, Q Liu, J Tang, S Wei, B Wu, L Liao, Y Ye, H Liu, H Li, ...
Neural Information Processing Systems (NeurIPS), 2024
Cited by: 14, Year: 2024
SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning
H Feng, W Wang, J Deng, W Zhou, L Li, H Li
International Conference on Computer Vision (ICCV), 2023
Cited by: 12, Year: 2023
A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding
J Lu, H Yu, Y Wang, Y Ye, J Tang, Z Yang, B Wu, Q Liu, H Feng, H Wang, ...
arXiv preprint arXiv:2407.01976, 2024
Cited by: 8, Year: 2024
Progressive Recurrent Network for Shadow Removal
Y Wang, W Zhou, H Feng, L Li, H Li
Computer Vision and Image Understanding (CVIU), 2023, 103861, 2023
Cited by: 7, Year: 2023
Model-aware Pre-training for Radial Distortion Rectification
W Wang, H Feng, W Zhou, Z Liao, H Li
IEEE Transactions on Image Processing (TIP), 2023
Cited by: 5, Year: 2023
TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding
B Luan, H Feng, H Chen, Y Wang, W Zhou, H Li
arXiv preprint arXiv:2404.09797, 2024
Cited by: 4, Year: 2024
DocMAE: Document Image Rectification via Self-supervised Representation Learning
S Liu, H Feng, W Zhou, H Li, C Liu, F Wu
International Conference on Multimedia and Expo (ICME), 2023
Cited by: 4, Year: 2023
DeepEraser: Deep Iterative Context Mining for Generic Text Eraser
H Feng, W Wang, S Liu, J Deng, W Zhou, H Li
IEEE Transactions on Multimedia (TMM), 2024
Cited by: 2, Year: 2024
Progressive Multi-modal Conditional Prompt Tuning
X Qiu, H Feng, Y Wang, W Zhou, H Li
International Conference on Multimedia Retrieval (ICMR), 2024, 46-54, 2024
Cited by: 1, Year: 2024