Findings of the first WMT shared task on sign language translation (WMT-SLT22)

M Müller, S Ebling, E Avramidis, A Battisti… - Proceedings of the …, 2022 - aclanthology.org
This paper presents the results of the First WMT Shared Task on Sign Language Translation
(WMT-SLT22). This shared task is concerned with automatic translation between signed and …

Machine translation with large language models: Prompting, few-shot learning, and fine-tuning with QLoRA

X Zhang, N Rajabi, K Duh, P Koehn - Proceedings of the Eighth …, 2023 - aclanthology.org
While large language models have made remarkable advancements in natural language
generation, their potential in machine translation, especially when fine-tuned, remains under …

Exploiting biased models to de-bias text: A gender-fair rewriting model

C Amrhein, F Schottmann, R Sennrich… - arXiv preprint arXiv …, 2023 - arxiv.org
Natural language generation models reproduce and often amplify the biases present in their
training data. Previous research explored using sequence-to-sequence rewriting models to …

MT-GenEval: A counterfactual and contextual dataset for evaluating gender accuracy in machine translation

A Currey, M Nădejde, R Pappagari, M Mayer… - arXiv preprint arXiv …, 2022 - arxiv.org
As generic machine translation (MT) quality has improved, the need for targeted
benchmarks that explore fine-grained aspects of quality has increased. In particular, gender …

Self-attention mechanism at the token level: gradient analysis and algorithm optimization

L Liu, X Xu - Knowledge-Based Systems, 2023 - Elsevier
The self-attention mechanism is a feature processing mechanism for structured data in deep
learning models. It has been widely used in transformer-based deep learning models and …

Metric score landscape challenge (MSLC23): Understanding metrics' performance on a wider landscape of translation quality

C Lo, S Larkin, R Knowles - … of the Eighth Conference on Machine …, 2023 - aclanthology.org
The Metric Score Landscape Challenge (MSLC23) dataset aims to gain insight into
metric scores on a broader/wider landscape of machine translation (MT) quality. It provides a …

An open-source gloss-based baseline for spoken to signed language translation

A Moryossef, M Müller, A Göhring… - Proceedings of the …, 2023 - assets.pubpub.org
Sign language translation systems are complex and require many components. As a result,
it is very hard to compare methods across publications. We present an open-source …

Findings of the WMT 2023 shared task on parallel data curation

S Sloto, B Thompson, H Khayrallah… - Proceedings of the …, 2023 - aclanthology.org
Building upon prior WMT shared tasks in document alignment and sentence filtering, we
posed the open-ended shared task of finding the best subset of possible training data from a …

A sentence alignment approach to document alignment and multi-faceted filtering for curating parallel sentence pairs from web-crawled data

S Steingrímsson - Proceedings of the Eighth Conference on …, 2023 - aclanthology.org
This paper describes the AST submission to the WMT23 Shared Task on Parallel Data
Curation. We experiment with two approaches for curating data from the provided web …

PyMarian: Fast Neural Machine Translation and Evaluation in Python

T Gowda, R Grundkiewicz, E Rippeth, M Post… - arXiv preprint arXiv …, 2024 - arxiv.org
The deep learning language of choice these days is Python; measured by factors such as
available libraries and technical support, it is hard to beat. At the same time, software written …