Optimizing statistical machine translation for text simplification
Most recent sentence simplification systems use basic machine translation models to learn
lexical and syntactic paraphrases from a manually simplified parallel corpus. These methods …
Findings of the 2011 workshop on statistical machine translation
This paper presents the results of the WMT11 shared tasks, which included a translation
task, a system combination task, and a task for machine translation evaluation metrics. We …
Text rewriting improves semantic role labeling
Large-scale annotated corpora are a prerequisite to developing high-performance NLP
systems. Such corpora are expensive to produce, limited in size, often demanding linguistic …
Crowdsourcing high-quality parallel data extraction from twitter
High-quality parallel data is crucial for a range of multilingual applications, from tuning and
evaluating machine translation systems to cross-lingual annotation projection. Unfortunately …
Joshua 5.0: Sparser, better, faster, server
We describe improvements made over the past year to Joshua, an open-source translation
system for parsing-based machine translation. The main contributions this past year are …
Joshua 6: A phrase-based and hierarchical statistical machine translation system
We describe the version six release of Joshua, an open-source statistical machine
translation toolkit. The main difference from release five is the introduction of a simple …
Sentential paraphrasing as black-box machine translation
We present a simple, prepackaged solution to generating paraphrases of English
sentences. We use the Paraphrase Database (PPDB) for monolingual sentence rewriting …
One system, many domains: Open-domain statistical machine translation via feature augmentation
In this paper, we introduce a simple technique for incorporating domain information into a
statistical machine translation system that significantly improves translation quality when test …
HLTCOE participation at TAC 2012: Entity linking and cold start knowledge base construction
Our team from the JHU HLTCOE participated in the Entity Linking and Cold Start Knowledge
Base tasks in this year's Text Analysis Conference Knowledge Base Population evaluation …
Dual subtitles as parallel corpora
In this paper, we leverage the existence of dual subtitles as a source of parallel data. Dual
subtitles present viewers with two languages simultaneously, and are generally aligned in …