A survey of mix-based data augmentation: Taxonomy, methods, applications, and explainability

C Cao, F Zhou, Y Dai, J Wang, K Zhang - ACM Computing Surveys, 2024 - dl.acm.org
Data augmentation (DA) is indispensable in modern machine learning and deep neural
networks. The basic idea of DA is to construct new training data to improve the model's …
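The "mix" family this survey covers centres on mixup-style interpolation of training pairs. As a minimal illustrative sketch (not the survey's own code), the original mixup rule can be written as follows; the Beta parameter alpha=0.2 is an assumed, commonly used default:

```python
import numpy as np
import torch

def mixup(x, y, alpha=0.2):
    """Mix a batch with a shuffled copy of itself (mixup).

    x: (B, ...) inputs; y: (B, C) one-hot labels.
    alpha=0.2 is an assumed, commonly used default.
    """
    lam = float(np.random.beta(alpha, alpha))  # mixing coefficient from Beta(alpha, alpha)
    idx = torch.randperm(x.size(0))            # random pairing within the batch
    x_mixed = lam * x + (1.0 - lam) * x[idx]   # interpolate inputs
    y_mixed = lam * y + (1.0 - lam) * y[idx]   # interpolate labels the same way
    return x_mixed, y_mixed
```

Training on (x_mixed, y_mixed) instead of (x, y) is the basic recipe; the surveyed methods vary in what gets mixed (inputs, hidden states, or labels) and how.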

LauraGPT: Listen, attend, understand, and regenerate audio with GPT

Z Du, J Wang, Q Chen, Y Chu, Z Gao, Z Li, K Hu… - arXiv preprint arXiv …, 2023 - arxiv.org
Generative Pre-trained Transformer (GPT) models have achieved remarkable performance
on various natural language processing tasks, and have shown great potential as …

On compositional generalization of transformer-based neural machine translation

Y Yin, L Fu, Y Li, Y Zhang - Information Fusion, 2024 - Elsevier
Neural networks have been shown to be deficient in compositional generalization, and existing work has generally targeted semantic parsing tasks. In this …

On the complementarity between pre-training and random-initialization for resource-rich machine translation

C Zan, L Ding, L Shen, Y Cao, W Liu, D Tao - arXiv preprint arXiv …, 2022 - arxiv.org
Pre-Training (PT) of text representations has been successfully applied to low-resource
Neural Machine Translation (NMT). However, it usually fails to achieve notable gains …

Causal document-grounded dialogue pre-training

Y Zhao, B Yu, H Yu, B Li, J Li, C Wang, F Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
The goal of document-grounded dialogue (DocGD) is to generate a response by grounding
the evidence in a supporting document in accordance with the dialogue context. This …

EMMA-X: an EM-like multilingual pre-training algorithm for cross-lingual representation learning

P Guo, X Wei, Y Hu, B Yang, D Liu… - Advances in Neural …, 2023 - proceedings.neurips.cc
Expressing universal semantics common to all languages is helpful to understand the
meanings of complex and culture-specific sentences. The research theme underlying this …

LAE-ST-MoE: Boosted language-aware encoder using speech translation auxiliary task for E2E code-switching ASR

G Ma, W Wang, Y Li, Y Yang, B Du… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
Recently, to mitigate the confusion between different languages in code-switching (CS) automatic speech recognition (ASR), conditionally factorized models, such as the …

Bridging the Gap between Decision and Logits in Decision-based Knowledge Distillation for Pre-trained Language Models

Q Zhou, Z Yang, P Li, Y Liu - arXiv preprint arXiv:2306.08909, 2023 - arxiv.org
Conventional knowledge distillation (KD) methods require access to the internal information of teachers, e.g., logits. However, such information may not always be accessible for large pre …
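The snippet contrasts with conventional, logit-based KD, which is exactly what becomes unavailable when only the teacher's final decision can be observed. Below is a minimal sketch of that conventional baseline (Hinton-style softened-logit matching), not the decision-based method of the paper itself; the temperature T and weight alpha are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Conventional logit-based KD loss.

    KL divergence between temperature-softened teacher and student
    distributions, mixed with cross-entropy on the gold labels.
    T=4.0 and alpha=0.5 are assumed defaults for illustration.
    """
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradient magnitude is roughly temperature-independent
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

Decision-based KD, as targeted by the paper above, must work without teacher_logits, observing only the teacher's predicted class.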

Research on the Development of Data Augmentation Techniques in the Field of Machine Translation

Z Zhipeng, P Aleksey - International Journal of Open Information …, 2023 - cyberleninka.ru
Neural machine translation usually requires a large bilingual parallel corpus for training and easily overfits when the training set is small. Through a large …

Alibaba-Translate China's Submission for WMT 2022 Metrics Shared Task

Y Wan, K Bao, D Liu, B Yang, DF Wong… - arXiv preprint arXiv …, 2022 - arxiv.org
In this report, we present our submission to the WMT 2022 Metrics Shared Task. We build
our system based on the core idea of UNITE (Unified Translation Evaluation), which unifies …