Multimodal pre-training based on graph attention network for document understanding Z Zhang, J Ma, J Du, L Wang, J Zhang IEEE Transactions on Multimedia 25, 6743-6755, 2022 | 47 | 2022 |
HRDoc: Dataset and Baseline Method Toward Hierarchical Reconstruction of Document Structures J Ma, J Du, P Hu, Z Zhang, J Zhang, H Zhu, C Liu AAAI 2023, 2023 | 13 | 2023 |
GMN: Generative Multi-modal Network for Practical Document Information Extraction H Cao, J Ma, A Guo, Y Hu, H Liu, D Jiang, Y Liu, B Ren NAACL 2022, 2022 | 12 | 2022 |
SEMv2: Table separation line detection based on instance segmentation Z Zhang, P Hu, J Ma, J Du, J Zhang, B Yin, B Yin, C Liu Pattern Recognition 149, 110279, 2024 | 11 | 2024 |
Query-driven Generative Network for Document Information Extraction in the Wild H Cao, X Li, J Ma, D Jiang, A Guo, Y Hu, H Liu, Y Liu, B Ren ACM-MM 2022, 4261-4271, 2022 | 11 | 2022 |
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition Y Dai, H Chen, J Du, R Wang, S Chen, J Ma, H Wang, CH Lee CVPR 2024, 2024 | 5 | 2024 |
An Open-Source Library of 2D-GMM-HMM Based on Kaldi Toolkit and Its Application to Handwritten Chinese Character Recognition J Ma, Z Wang, J Du ICIG 2021, 235-244, 2021 | 5 | 2021 |
Generate, transform, and clean: the role of GANs and transformers in palm leaf manuscript generation and enhancement N Thuon, J Du, Z Zhang, J Ma, P Hu International Journal on Document Analysis and Recognition (IJDAR) 27 (3 …, 2024 | 4 | 2024 |
SEMv2: Table separation line detection based on conditional convolution Z Zhang, P Hu, J Ma, J Du, J Zhang, H Zhu, B Yin, B Yin, C Liu arXiv e-prints, arXiv:2303.04384, 2023 | 4 | 2023 |
Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023 H Wang, Y Xi, H Chen, J Du, Y Song, Q Wang, H Zhou, C Wang, J Ma, ... ACM-MM 2023, 9531-9535, 2023 | 3 | 2023 |
Count, decode and fetch: A new approach to handwritten Chinese character error correction P Hu, J Ma, Z Zhang, J Du, J Zhang arXiv preprint arXiv:2307.16253, 2023 | 3 | 2023 |
USTC-iFLYTEK at DocILE: A Multi-modal Approach Using Domain-specific GraphDoc Y Wang, J Du, J Ma, P Hu, Z Zhang, J Zhang CLEF (Working Notes), 598-610, 2023 | 3 | 2023 |
SEMv3: A Fast and Robust Approach to Table Separation Line Detection C Qin, Z Zhang, P Hu, C Liu, J Ma, J Du IJCAI 2024, 2024 | 2 | 2024 |
Enhancing math word problem solving through salient clue prioritization: a joint token-phrase-level feature integration approach J Xie, J Ma, X Zhang, J Zhang, J Du 2023 International Conference on Asian Language Processing (IALP), 252-257, 2023 | 2 | 2023 |
Group, contrast and recognize: a self-supervised method for Chinese character recognition X Jiang, J Du, P Hu, M Xue, J Ma, J Wu, J Zhang International Conference on Document Analysis and Recognition, 411-427, 2023 | 2 | 2023 |
Bidirectional trained tree-structured decoder for handwritten mathematical expression recognition H Cheng, C Liu, P Hu, Z Zhang, J Ma, J Du arXiv preprint arXiv:2401.00435, 2023 | 1 | 2023 |
Count, decompose and correct: A new approach to handwritten Chinese character error correction P Hu, J Ma, Z Zhang, J Du, J Zhang Pattern Recognition 160, 111110, 2025 | | 2025 |
Latent Swap Joint Diffusion for Long-Form Audio Generation Y Dai, C Wang, C Li, C Wang, J Du, K Li, R Wang, J Ma, L Sun, J Gao arXiv preprint arXiv:2502.05130, 2025 | | 2025 |
Skeleton and Font Generation Network for Zero-shot Chinese Character Generation M Xue, J Du, Z Zhang, J Ma, Q Chang, P Hu, J Zhang, Y Hu arXiv preprint arXiv:2501.08062, 2025 | | 2025 |
RFL: Simplifying Chemical Structure Recognition with Ring-Free Language Q Chang, M Chen, C Pi, P Hu, Z Zhang, J Ma, J Du, B Yin, J Hu arXiv preprint arXiv:2412.07594, 2024 | | 2024 |