Google Scholar

Beyond english-centric multilingual machine translation

A Fan, S Bhosale, H Schwenk, Z Ma, A El-Kishky… - Journal of Machine …, 2021 - jmlr.org

Existing work in translation demonstrated the potential of massively multilingual machine
translation by training a single model able to translate between any pair of languages …

Cited by 874, 9 versions

[PDF] psu.edu

Probabilistic topic modeling in multilingual settings: An overview of its methodology and applications

I Vulić, W De Smet, J Tang, MF Moens - Information Processing & …, 2015 - Elsevier

Probabilistic topic models are unsupervised generative models which model document
content as a two-step generation process, that is, documents are observed as mixtures of …

Cited by 143, 8 versions

[PDF] arxiv.org

WikiMatrix: Mining 135M parallel sentences in 1620 language pairs from Wikipedia

H Schwenk, V Chaudhary, S Sun, H Gong… - arxiv preprint arxiv …, 2019 - arxiv.org

We present an approach based on multilingual sentence embeddings to automatically
extract parallel sentences from the content of Wikipedia articles in 85 languages, including …

Cited by 374, 7 versions

[PDF] strath.ac.uk

ParaCrawl: Web-scale acquisition of parallel corpora

M Bañón, P Chen, B Haddow, K Heafield, H Hoang… - 2020 - strathprints.strath.ac.uk

We report on methods to create the largest publicly available parallel corpora by crawling
the web, using open source software. We empirically compare alternative methods and …

Cited by 275, 17 versions

Bitext alignment

J Tiedemann - 2011 - books.google.com

This book provides an overview of various techniques for the alignment of bitexts. It
describes general concepts and strategies that can be applied to map corresponding parts …

Cited by 154, 8 versions

[PDF] jst.go.jp

A survey of domain adaptation for machine translation

C Chu, R Wang - Journal of information processing, 2020 - jstage.jst.go.jp

Neural machine translation (NMT) is a deep learning based approach for machine
translation, which outperforms traditional statistical machine translation (SMT) and yields the …

Cited by 339, 5 versions

[PDF] arxiv.org

CCMatrix: Mining billions of high-quality parallel sentences on the web

H Schwenk, G Wenzek, S Edunov, E Grave… - arxiv preprint arxiv …, 2019 - arxiv.org

We show that margin-based bitext mining in a multilingual sentence space can be applied to
monolingual corpora of billions of sentences. We are using ten snapshots of a curated …

Cited by 244, 6 versions

[PDF] arxiv.org

Margin-based parallel corpus mining with multilingual sentence embeddings

M Artetxe, H Schwenk - arxiv preprint arxiv:1811.01136, 2018 - arxiv.org

Machine translation is highly sensitive to the size and quality of the training data, which has
led to an increasing interest in collecting and filtering large parallel corpora. In this paper, we …

Cited by 229, 5 versions

Crowdsourcing and online collaborative translations

MA Jiménez-Crespo - 2017 - torrossa.com

We control the world basically because we are the only animals that can cooperate flexibly
in very large numbers […] This is something very unique to us, perhaps the most unique …

Cited by 351, 6 versions

[PDF] arxiv.org

Improving multilingual sentence embedding using bi-directional dual encoder with additive margin softmax

Y Yang, GH Abrego, S Yuan, M Guo, Q Shen… - arxiv preprint arxiv …, 2019 - arxiv.org

In this paper, we present an approach to learn multilingual sentence embeddings using a
bi-directional dual-encoder with additive margin softmax. The embeddings are able to achieve …

Cited by 130, 7 versions

Reliable measures for aligning Japanese-English news articles and sentences

Beyond english-centric multilingual machine translation
