A comprehensive survey of grammatical error correction
Grammatical error correction (GEC) is an important application aspect of natural language
processing techniques, and GEC system is a kind of very important intelligent system that …
The United Nations Parallel Corpus v1.0
M Ziemski, M Junczys-Dowmunt… - Proceedings of the …, 2016 - aclanthology.org
This paper describes the creation process and statistics of the official United Nations Parallel
Corpus, the first parallel corpus composed from United Nations documents published by the …
On the impact of various types of noise on neural machine translation
We examine how various types of noise in the parallel training data impact the quality of
neural machine translation systems. We create five types of artificial noise and analyze how …
A comprehensive survey of grammar error correction
Y Wang, Y Wang, J Liu, Z Liu - arXiv preprint arXiv:2005.06600, 2020 - arxiv.org
Grammar error correction (GEC) is an important application aspect of natural language
processing techniques. The past decade has witnessed significant progress achieved in …
Is neural machine translation ready for deployment? A case study on 30 translation directions
In this paper we provide the largest published comparison of translation quality for phrase-
based SMT and neural machine translation across 30 translation directions. For ten …
N-gram counts and language models from the common crawl
We contribute 5-gram counts and language models trained on the Common Crawl corpus, a
collection over 9 billion web pages. This release improves upon the Google n-gram counts …
Self-attention with cross-lingual position representation
Position encoding (PE), an essential part of self-attention networks (SANs), is used to
preserve the word order information for natural language processing tasks, generating fixed …
Incremental decoding and training methods for simultaneous translation in neural machine translation
We address the problem of simultaneous translation by modifying the Neural MT decoder to
operate with dynamically built encoder and attention. We propose a tunable agent which …
Phrase-based machine translation is state-of-the-art for automatic grammatical error correction
In this work, we study parameter tuning towards the M^2 metric, the standard metric for
automatic grammar error correction (GEC) tasks. After implementing M^2 as a scorer in the …
A neural approach to source dependence based context model for statistical machine translation
K Chen, T Zhao, M Yang, L Liu… - … on Audio, Speech …, 2017 - ieeexplore.ieee.org
In statistical machine translation, translation prediction considers not only the aligned source
word itself but also its source contextual information. Learning context representation is a …