Optimizing statistical machine translation for text simplification

W Xu, C Napoles, E Pavlick, Q Chen… - Transactions of the …, 2016 - direct.mit.edu
Most recent sentence simplification systems use basic machine translation models to learn
lexical and syntactic paraphrases from a manually simplified parallel corpus. These methods …

Findings of the 2011 workshop on statistical machine translation

C Callison-Burch, P Koehn, C Monz… - Proceedings of the sixth …, 2011 - aclanthology.org
This paper presents the results of the WMT11 shared tasks, which included a translation
task, a system combination task, and a task for machine translation evaluation metrics. We …

Text rewriting improves semantic role labeling

K Woodsend, M Lapata - Journal of Artificial Intelligence Research, 2014 - jair.org
Large-scale annotated corpora are a prerequisite to developing high-performance NLP
systems. Such corpora are expensive to produce, limited in size, often demanding linguistic …

Crowdsourcing high-quality parallel data extraction from Twitter

W Ling, L Marujo, C Dyer, AW Black… - Proceedings of the …, 2014 - aclanthology.org
High-quality parallel data is crucial for a range of multilingual applications, from tuning and
evaluating machine translation systems to cross-lingual annotation projection. Unfortunately …

Joshua 5.0: Sparser, better, faster, server

M Post, J Ganitkevitch, L Orland, J Weese… - Proceedings of the …, 2013 - aclanthology.org
We describe improvements made over the past year to Joshua, an open-source translation
system for parsing-based machine translation. The main contributions this past year are …

Joshua 6: A phrase-based and hierarchical statistical machine translation system.

M Post, Y Cao, G Kumar - Prague Bull. Math. Linguistics, 2015 - archive.sciendo.com
We describe the version six release of Joshua, an open-source statistical machine
translation toolkit. The main difference from release five is the introduction of a simple …

Sentential paraphrasing as black-box machine translation

C Napoles, C Callison-Burch… - Proceedings of the 2016 …, 2016 - aclanthology.org
We present a simple, prepackaged solution to generating paraphrases of English
sentences. We use the Paraphrase Database (PPDB) for monolingual sentence rewriting …

One system, many domains: Open-domain statistical machine translation via feature augmentation

JH Clark, A Lavie, C Dyer - Proceedings of the 10th Conference of …, 2012 - aclanthology.org
In this paper, we introduce a simple technique for incorporating domain information into a
statistical machine translation system that significantly improves translation quality when test …

HLTCOE participation at TAC 2012: Entity linking and cold start knowledge base construction

P McNamee, J Mayfield, T Finin, T Oates… - Proceedings of the …, 2012 - ebiquity.umbc.edu
Our team from the JHU HLTCOE participated in the Entity Linking and Cold Start Knowledge
Base tasks in this year's Text Analysis Conference Knowledge Base Population evaluation …

Dual subtitles as parallel corpora

S Zhang, W Ling, C Dyer - 2014 - kilthub.cmu.edu
In this paper, we leverage the existence of dual subtitles as a source of parallel data. Dual
subtitles present viewers with two languages simultaneously, and are generally aligned in …