An efficient machine translation model for Dravidian language

PK Pareek, K Swathi… - 2017 2nd IEEE …, 2017 - ieeexplore.ieee.org
In a multilingual diversity country like India, language translation plays a significant factor in
the area of text processing application such as information extraction, machine learning …

Integrated parallel sentence and fragment extraction from comparable corpora: A case study on Chinese-Japanese Wikipedia

C Chu, T Nakazawa, S Kurohashi - ACM Transactions on Asian and …, 2015 - dl.acm.org
Parallel corpora are crucial for statistical machine translation (SMT); however, they are quite
scarce for most language pairs and domains. As comparable corpora are far more available …

Automatic building and using parallel resources for SMT from comparable corpora

S Pal, P Pakray, SK Naskar - Proceedings of the 3rd Workshop on …, 2014 - aclanthology.org
Building parallel resources for corpus based machine translation, especially Statistical
Machine Translation (SMT), from comparable corpora has recently received wide attention …

Extracting parallel phrases from comparable data for machine translation

S Hewavitharana, S Vogel - Natural Language Engineering, 2016 - cambridge.org
Mining parallel data from comparable corpora is a promising approach for overcoming the
data sparseness in statistical machine translation and other natural language processing …

A subject identification method based on term frequency technique

NS Jamil, KR Ku-Mahamud, AM Din, F Ahmad… - International Journal of …, 2017 - core.ac.uk
The analyzing and extracting important information from a text document is crucial and has
produced interest in the area of text mining and information retrieval. This process is used in …

Sentence alignment using local and global information

H Zamani, H Faili, A Shakery - Computer Speech & Language, 2016 - Elsevier
Parallel corpora are essential resources for statistical machine translation (SMT) and cross
language information retrieval (CLIR) systems. Creating parallel corpora is highly expensive …

Chinese-Khmer parallel fragments extraction from comparable corpus based on Dirichlet process

S Ning, X Yan, Y Nuo, F Zhou, Q Xie… - Procedia Computer …, 2020 - Elsevier
Aiming at the problems existing in the Chinese-Khmer parallel corpus, such as single field,
small scale, and poor timeliness, a method of Chinese-Khmer parallel fragment extraction …

Domain adaptation in MT using titles in Wikipedia as a parallel corpus: Resources and evaluation

G Labaka, I Alegria, K Sarasola - Proceedings of the Tenth …, 2016 - aclanthology.org
This paper presents how a state-of-the-art SMT system is enriched by using an extra in-
domain parallel corpus extracted from Wikipedia. We collect corpora from parallel titles and …

A hybrid machine translation framework for an improved translation workflow

S Pal - 2018 - publikationen.sulb.uni-saarland.de
Over the past few decades, due to a continuing surge in the amount of content being
translated and ever increasing pressure to deliver high quality and high throughput …

Integrated parallel data extraction from comparable corpora for statistical machine translation

C Chu - 2015 - repository.kulib.kyoto-u.ac.jp
Abstract Machine translation (MT), as a high level application of natural language
processing (NLP), is a powerful tool to improve the efficiency and reduce the cost of …