Document image machine translation with dynamic multi-pre-trained models assembling

Y Liang, Y Zhang, C Ma, Z Zhang, Y Zhao… - Proceedings of the …, 2024 - aclanthology.org
Text image machine translation (TIMT) is a task that translates source texts embedded in the
image to target translations. The existing TIMT task mainly focuses on text-line-level images …

Translatotron-V(ison): An end-to-end model for in-image machine translation

Z Lan, L Niu, F Meng, J Zhou, M Zhang, J Su - arXiv preprint arXiv …, 2024 - arxiv.org
In-image machine translation (IIMT) aims to translate an image containing texts in source
language into an image containing translations in target language. In this regard …

LayoutDIT: Layout-aware end-to-end document image translation with multi-step conductive decoder

Z Zhang, Y Zhang, Y Liang, L Xiang… - Findings of the …, 2023 - aclanthology.org
Document image translation (DIT) aims to translate text embedded in images from one
language to another. It is a challenging task that needs to understand visual layout with text …

A survey on multi-modal machine translation: Tasks, methods and challenges

H Shen, L Shao, W Li, Z Lan, Z Liu, J Su - arXiv preprint arXiv:2405.12669, 2024 - arxiv.org
In recent years, multi-modal machine translation has attracted significant interest in both
academia and industry due to its superior performance. It takes both textual and visual …

CCIM: cross-modal cross-lingual interactive image translation

C Ma, Y Zhang, M Tu, Y Zhao, Y Zhou… - Findings of the …, 2023 - aclanthology.org
Text image machine translation (TIMT) which translates source language text images into
target language texts has attracted intensive attention in recent years. Although the end-to …

MIT-10M: A Large Scale Parallel Corpus of Multilingual Image Translation

B Li, S Zhu, L Wen - arXiv preprint arXiv:2412.07147, 2024 - arxiv.org
Image Translation (IT) holds immense potential across diverse domains, enabling the
translation of textual content within images into various languages. However, existing …

Understand Layout and Translate Text: Unified Feature-Conductive End-to-End Document Image Translation

Z Zhang, Y Zhang, Y Liang, C Ma… - … on Pattern Analysis …, 2025 - ieeexplore.ieee.org
Document Image Translation (DIT) aims to translate texts on document images from one
language to another. It is a multi-modal task involving cooperation of text and layout. Current …

Vector Quantization Knowledge Transfer for End-to-End Text Image Machine Translation

C Ma, Y Zhang, Y Zhao, Y Zhou… - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
End-to-end text image machine translation (TIMT) aims at translating source language
embedded in images into target language without recognizing intermediate texts in images …

CCIM: Cross-modal Cross-lingual Interactive Image Translation

C Ma, Y Zhang, M Tu, Y Zhao, Y Zhou… - The 2023 Conference on … - openreview.net
Text image machine translation (TIMT) which translates source language text images into
target language texts has attracted intensive attention in recent years. Although the end-to …