IndicNLPSuite: Monolingual corpora, evaluation benchmarks and pre-trained multilingual language models for Indian languages

D Kakwani, A Kunchukuttan, S Golla… - Findings of the …, 2020 - aclanthology.org
In this paper, we introduce NLP resources for 11 major Indian languages from two major
language families. These resources include: (a) large-scale sentence-level monolingual …

Findings of the 2014 workshop on statistical machine translation

O Bojar, C Buck, C Federmann, B Haddow… - Proceedings of the …, 2014 - aclanthology.org
This paper presents the results of the WMT14 shared tasks, which included a standard news
translation task, a separate medical translation task, a task for run-time estimation of …

The IIT Bombay English-Hindi parallel corpus

A Kunchukuttan, P Mehta, P Bhattacharyya - arXiv preprint arXiv …, 2017 - arxiv.org
We present the IIT Bombay English-Hindi Parallel Corpus. The corpus is a compilation of
parallel corpora previously available in the public domain as well as new parallel corpora …

Overview of the 8th workshop on Asian translation

T Nakazawa, H Nakayama, C Ding… - Proceedings of the …, 2021 - aclanthology.org
This paper presents the results of the shared tasks from the 8th workshop on Asian
translation (WAT2021). For the WAT2021, 28 teams participated in the shared tasks and 24 …

AI4Bharat-IndicNLP corpus: Monolingual corpora and word embeddings for Indic languages

A Kunchukuttan, D Kakwani, S Golla… - arXiv preprint arXiv …, 2020 - arxiv.org
We present the IndicNLP corpus, a large-scale, general-domain corpus containing 2.7
billion words for 10 Indian languages from two language families. We share pre-trained …

Recent advances of low-resource neural machine translation

R Haque, CH Liu, A Way - Machine Translation, 2021 - Springer
In recent years, neural network-based machine translation (MT) approaches have steadily
superseded the statistical MT (SMT) methods, and represents the current state-of-the-art in …

Neural machine translation: English to Hindi

SR Laskar, A Dutta, P Pakray… - 2019 IEEE conference …, 2019 - ieeexplore.ieee.org
Machine Translation (MT) attempts to minimize the communication gap among people from
various linguistic backgrounds. Automatic translation between pair of different natural …

A new language-independent deep CNN for scene text detection and style transfer in social media images

P Shivakumara, A Banerjee, U Pal… - … on Image Processing, 2023 - ieeexplore.ieee.org
Due to the adverse effect of quality caused by different social media and arbitrary languages
in natural scenes, detecting text from social media images and transferring its style is …

A comparative analysis on Hindi and English extractive text summarization

P Verma, S Pal, H Om - ACM Transactions on Asian and Low-Resource …, 2019 - dl.acm.org
Text summarization is the process of transfiguring a large documental information into a
clear and concise form. In this article, we present a detailed comparative study of various …

Universal Dependency parsing for Hindi-English code-switching

IA Bhat, RA Bhat, M Shrivastava… - arXiv preprint arXiv …, 2018 - arxiv.org
Code-switching is a phenomenon of mixing grammatical structures of two or more
languages under varied social constraints. The code-switching data differ so radically from …