Where's the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation

B Minixhofer, J Pfeiffer, I Vulić - arXiv preprint arXiv:2305.18893, 2023 - arxiv.org
Many NLP pipelines split text into sentences as one of the crucial preprocessing steps. Prior
sentence segmentation tools either rely on punctuation or require a considerable amount of …

Do online machine translation systems care for context? What about a GPT model?

S Castilho, C Mallon, R Meister, S Yue - 2023 - doras.dcu.ie
This paper addresses the challenges of evaluating document-level machine translation (MT)
in the context of recent advances in context-aware neural machine translation (NMT). It …

Lightweight Audio Segmentation for Long-form Speech Translation

J Lee, S Kim, H Kim, JS Chung - arXiv preprint arXiv:2406.10549, 2024 - arxiv.org
Speech segmentation is an essential part of speech translation (ST) systems in real-world
scenarios. Since most ST models are designed to process speech segments, long-form …

À propos des difficultés de traduire automatiquement de longs documents

Z Peng, R Bawden, F Yvon - Actes de JEP-TALN-RECITAL 2024 …, 2024 - inria.hal.science
New machine translation architectures can process long segments and outperform the
translation of isolated sentences, hinting at the …

Manipulating Data Representations for Neural Machine Translation

C Amrhein - 2023 - zora.uzh.ch
In natural language processing, much current research focuses on training larger and larger
models on more and more data. In this thesis, we argue that how data is represented can …

Is Sentence Splitting a Solved Task? Experiments to the Intersection Between NLP and Italian Linguistics

A Redaelli, R Sprugnoli - … of the 10th Italian Conference on …, 2024 - researchgate.net
Sentence splitting, that is the segmentation of the raw input text into sentences, is a
fundamental step in text processing. Although it is considered a solved task for texts such as …

A survey of context in neural machine translation and its evaluation

S Castilho, R Knowles - Natural Language Processing - cambridge.org
The question of context in neural machine translation often focuses on topics related to
document-level translation or intersentential context. However, there is a wide range of other …