Adversarial training for unsupervised bilingual lexicon induction

M Zhang, Y Liu, H Luan, M Sun - … of the 55th Annual Meeting of …, 2017 - aclanthology.org
Word embeddings are well known to capture linguistic regularities of the language on which
they are trained. Researchers also observe that these regularities can transfer across …

Improving machine translation performance by exploiting non-parallel corpora

DS Munteanu, D Marcu - Computational Linguistics, 2005 - direct.mit.edu
We present a novel method for discovering parallel sentences in comparable, non-parallel
corpora. We train a maximum entropy classifier that, given a pair of sentences, can reliably …

Aligning sentences from standard Wikipedia to simple Wikipedia

W Hwang, H Hajishirzi, M Ostendorf… - Proceedings of the 2015 …, 2015 - aclanthology.org
This work improves monolingual sentence alignment for text simplification, specifically for
text in standard and simple Wikipedia. We introduce a method that improves over past efforts …

Extracting parallel sub-sentential fragments from non-parallel corpora

DS Munteanu, D Marcu - … of the 21st international conference on …, 2006 - aclanthology.org
We present a novel method for extracting parallel sub-sentential fragments from
comparable, non-parallel bilingual corpora. By analyzing potentially similar sentence pairs …

Extracting lexically divergent paraphrases from Twitter

W Xu, A Ritter, C Callison-Burch… - Transactions of the …, 2014 - direct.mit.edu
We present MultiP (Multi-instance Learning Paraphrase Model), a new model
suited to identify paraphrases within the short messages on Twitter. We jointly model …

Clear-simple corpus for medical French

N Grabar, R Cardon - ATA, 2018 - shs.hal.science
Availability of corpora with technical and simplified contents is crucial for the development
and test of methods for text simplification. We describe this kind of corpus for the French …

On the use of comparable corpora to improve SMT performance

S Abdul-Rauf, H Schwenk - Proceedings of the 12th Conference of …, 2009 - aclanthology.org
We present a simple and effective method for extracting parallel sentences from comparable
corpora. We employ a statistical machine translation (SMT) system built from small amounts …

MT-based sentence alignment for OCR-generated parallel texts

R Sennrich, M Volk - 9th Conference of the Association for …, 2010 - research.ed.ac.uk
The performance of current sentence alignment tools varies according to the to-be-aligned
texts. We have found existing tools unsuitable for hard-to-align parallel texts and describe an …

Towards Zero Unknown Word in Neural Machine Translation.

X Li, J Zhang, C Zong - IJCAI, 2016 - nlpr.ia.ac.cn
Neural machine translation (NMT) has shown promising results in recent years. In order to control
the computational complexity, NMT has to employ a small vocabulary, and massive rare …

Statistical machine translation enhancements through linguistic levels: A survey

MR Costa-Jussá, M Farrús - ACM Computing Surveys (CSUR), 2014 - dl.acm.org
Machine translation can be considered a highly interdisciplinary and multidisciplinary field
because it is approached from the point of view of human translators, engineers, computer …