Efficient methods for natural language processing: A survey

M Treviso, JU Lee, T Ji, B Aken, Q Cao… - Transactions of the …, 2023 - direct.mit.edu
Recent work in natural language processing (NLP) has yielded appealing results from
scaling model parameters and training data; however, using only scale to improve …

Accelerating transformer inference for translation via parallel decoding

A Santilli, S Severino, E Postolache, V Maiorca… - arXiv preprint arXiv …, 2023 - arxiv.org
Autoregressive decoding limits the efficiency of transformers for Machine Translation (MT).
The community proposed specific network architectures and learning-based methods to …

Deep encoder, shallow decoder: Reevaluating non-autoregressive machine translation

J Kasai, N Pappas, H Peng, J Cross… - arXiv preprint arXiv …, 2020 - arxiv.org
Much recent effort has been invested in non-autoregressive neural machine translation,
which appears to be an efficient alternative to state-of-the-art autoregressive machine …

One country, 700+ languages: NLP challenges for underrepresented languages and dialects in Indonesia

AF Aji, GI Winata, F Koto, S Cahyawijaya… - arXiv preprint arXiv …, 2022 - arxiv.org
NLP research is impeded by a lack of resources and awareness of the challenges presented
by underrepresented languages and dialects. Focusing on the languages spoken in …

Fully non-autoregressive neural machine translation: Tricks of the trade

J Gu, X Kong - arXiv preprint arXiv:2012.15833, 2020 - arxiv.org
Fully non-autoregressive neural machine translation (NAT) is proposed to simultaneously
predict tokens with a single forward pass of the neural network, which significantly reduces the …

MulDA: A multilingual data augmentation framework for low-resource cross-lingual NER

L Liu, B Ding, L Bing, S Joty, L Si… - Proceedings of the 59th …, 2021 - aclanthology.org
Named Entity Recognition (NER) for low-resource languages is both a practical and
challenging research problem. This paper addresses zero-shot transfer for cross-lingual …

Imitation attacks and defenses for black-box machine translation systems

E Wallace, M Stern, D Song - arXiv preprint arXiv:2004.15015, 2020 - arxiv.org
Adversaries may look to steal or attack black-box NLP systems, either for financial gain or to
exploit model errors. One setting of particular interest is machine translation (MT), where …

Losing Heads in the Lottery: Pruning Transformer Attention in Neural Machine Translation

M Behnke, K Heafield - The 2020 Conference on Empirical …, 2020 - research.ed.ac.uk
The attention mechanism is the crucial component of the transformer architecture. Recent
research shows that most attention heads are not confident in their decisions and can be …

When attention meets fast recurrence: Training language models with reduced compute

T Lei - arXiv preprint arXiv:2102.12459, 2021 - arxiv.org
Large language models have become increasingly difficult to train because of the growing
computation time and cost. In this work, we present SRU++, a highly efficient architecture …

Finetuning pretrained transformers into RNNs

J Kasai, H Peng, Y Zhang, D Yogatama… - arXiv preprint arXiv …, 2021 - arxiv.org
Transformers have outperformed recurrent neural networks (RNNs) in natural language
generation. But this comes with a significant computational cost, as the attention …