Integrating Multi-scale Contextualized Information for Byte-based Neural Machine Translation

L Huang, Y Feng - arXiv preprint arXiv:2405.19290, 2024 - arxiv.org
Subword tokenization is a common method for vocabulary building in Neural Machine
Translation (NMT) models. However, increasingly complex tasks have revealed its …

Progressive and Consistent Subword Regularization for Neural Machine Translation

Y Gao, Y Luo, Q Zhang, H Shao, T Xiao… - … Conference on Natural …, 2024 - Springer
Despite the prevalence of subword tokenization, its deterministic nature, splitting words into
unique output tokens, may limit models from fully exploiting the intricate semantic …

Enhancing Medium-Sized Sentence Translation In English-Hindi NMT Using Clause-Based Approach

S Thakur, J Srivastava - 2024 15th International Conference on …, 2024 - ieeexplore.ieee.org
The issues of structural divergence, ambiguity, low resource language, and lack of training
data are present in neural machine translation from English-to-Hindi and Hindi-to-English …

Artificial Intelligence in Translation: The Menace, Promise, and Response to Technology and Superseded Practice

B Budiharjo - … , Language, Literature, and Culture (ICCoLliC 2024 …, 2024 - atlantis-press.com
The integration of artificial intelligence (AI) into translation practice, teaching, and research
has transformed the domain of translation. In translation practice, AI-driven tools such as …