AMMUS: A survey of transformer-based pretrained models in natural language processing
KS Kalyan, A Rajasekharan, S Sangeetha - arXiv preprint arXiv …, 2021 - arxiv.org
Transformer-based pretrained language models (T-PTLMs) have achieved great success in
almost every NLP task. The evolution of these models started with GPT and BERT. These …
Survey of low-resource machine translation
We present a survey covering the state of the art in low-resource machine translation (MT)
research. There are currently around 7,000 languages spoken in the world and almost all …
A voyage on neural machine translation for Indic languages
With the invention of deep learning concepts, Machine Translation (MT) migrated towards
Neural Machine Translation (NMT) architectures, eventually from Statistical Machine …
AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation
This paper proposes a novel direct Audio-Visual Speech to Audio-Visual Speech
Translation (AV2AV) framework where the input and output of the system are multimodal (i.e. …
Aksharantar: Towards building open transliteration tools for the next billion users
We introduce Aksharantar, the largest publicly available transliteration dataset for 21 Indic
languages containing 26 million transliteration pairs. We build this dataset by mining …
English–Assamese neural machine translation using prior alignment and pre-trained language model
In a multilingual country like India, automatic natural language translation plays a key role in
building a community with different linguistic people. Many researchers have explored and …
Using natural language prompts for machine translation
X Garcia, O Firat - arXiv preprint arXiv:2202.11822, 2022 - arxiv.org
We explore the use of natural language prompts for controlling various aspects of the
outputs generated by machine translation models. We demonstrate that natural language …
Effectiveness of mining audio and text pairs from public data for improving ASR systems for low-resource languages
Collecting labelled datasets for speech recognition systems for low-resource languages on
a diverse set of domains and speakers is expensive. In this work, we demonstrate an …
Aksharantar: Open Indic-language transliteration datasets and models for the next billion users
Transliteration is very important in the Indian language context due to the usage of multiple
scripts and the widespread use of romanized inputs. However, few training and evaluation …