WebBootCaT. instant domain-specific corpora to support human translators

M Baroni, A Kilgarriff, J Pomikálek… - Proceedings of the 11th …, 2006 - aclanthology.org
We present a web service to aid translators by quickly producing corpora for specialist
areas, in any of a range of languages, from the web. The underlying BootCaT tools have …

Building a web-based parallel corpus and filtering out machine-translated text

A Antonova, A Misyurev - Proceedings of the 4th Workshop on …, 2011 - aclanthology.org
We describe a set of techniques that have been developed while collecting parallel texts for
Russian-English language pair and building a corpus of parallel sentences for training a …

PaCo2: A Fully Automated tool for gathering Parallel Corpora from the Web.

I San Vicente, I Manterola - LREC, 2012 - mt-archive.net
The importance of parallel corpora in the NLP field is well known. This paper presents a tool
that can build parallel corpora given just a seed word list and a pair of languages. Our …

Automatic identification of parallel documents with light or without linguistic resources

A Patry, P Langlais - Advances in Artificial Intelligence: 18th Conference of …, 2005 - Springer
Parallel corpora are playing a crucial role in multilingual natural language processing.
Unfortunately, the availability of such a resource is the bottleneck in most applications of …

PEN: Parallel English-Persian news corpus

MA Farajian - Proceedings on the International Conference on …, 2011 - world-comp.org
Parallel corpora are the necessary resources in many multilingual natural language
processing applications, including machine translation and cross-lingual information …

A fast and accurate method for detecting English-Japanese parallel texts

K Fukushima, K Taura… - Proceedings of the …, 2006 - aclanthology.org
Parallel corpus is a valuable resource used in various fields of multilingual natural language
processing. One of the most significant problems in using parallel corpora is the lack of their …

Paradocs: un système d'identification automatique de documents parallèles

A Patry, P Langlais - Actes de la 12ème conférence sur le …, 2005 - aclanthology.org
Parallel corpora are of paramount importance for multilingual natural language
processing applications. Unfortunately, their scarcity is the weak link of …

PARADOCS: l'entremetteur de documents parallèles indépendant de la langue [PARADOCS: A Language Independant Go-Between for Mating Parallel …

A Patry, P Langlais - … des Langues, Volume 51, Numéro 2 …, 2010 - aclanthology.org
Parallel corpora are the cornerstone of several machine translation
technologies, and substantial efforts are regularly made to gather …

Les traitements documentaires automatiques et le passage du temps

L Da Sylva - Le numérique: impact sur le cycle de vie du …, 2004 - papyrus.bib.umontreal.ca
In this article, we examine the fate of documents that are not meant to live
long and that therefore do not warrant any traditional document processing. We …

Building parallel corpora from the web

J Pumikalek - 2007 - Citeseer
Parallel corpora are a valuable resource for many fields in computational linguistics, e.g.
machine translation, cross language information retrieval (CLIR), lexicography …