MASS: Masked sequence to sequence pre-training for language generation

K Song, X Tan, T Qin, J Lu, TY Liu - arXiv preprint arXiv:1905.02450, 2019 - arxiv.org
Pre-training and fine-tuning, e.g., BERT, have achieved great success in language
understanding by transferring knowledge from rich-resource pre-training task to the low/zero …

Learning deep transformer models for machine translation

Q Wang, B Li, T Xiao, J Zhu, C Li, DF Wong… - arXiv preprint arXiv …, 2019 - arxiv.org
Transformer is the state-of-the-art model in recent machine translation evaluations. Two
strands of research are promising to improve models of this kind: the first uses wide …

Leveraging pre-trained checkpoints for sequence generation tasks

S Rothe, S Narayan, A Severyn - Transactions of the Association for …, 2020 - direct.mit.edu
Unsupervised pre-training of large neural models has recently revolutionized Natural
Language Processing. By warm-starting from the publicly released checkpoints, NLP …

Sparse is enough in scaling transformers

S Jaszczur, A Chowdhery… - Advances in …, 2021 - proceedings.neurips.cc
Large Transformer models yield impressive results on many tasks, but are expensive to
train, or even fine-tune, and so slow at decoding that their use and study becomes out of …

Exploring versatile generative language model via parameter-efficient transfer learning

Z Lin, A Madotto, P Fung - arXiv preprint arXiv:2004.03829, 2020 - arxiv.org
Fine-tuning pre-trained generative language models to down-stream language generation
tasks has shown promising results. However, this comes with the cost of having a single …

Very deep transformers for neural machine translation

X Liu, K Duh, L Liu, J Gao - arXiv preprint arXiv:2008.07772, 2020 - arxiv.org
We explore the application of very deep Transformer models for Neural Machine Translation
(NMT). Using a simple yet effective initialization technique that stabilizes training, we show …

Decoder-only or encoder-decoder? Interpreting language model as a regularized encoder-decoder

Z Fu, W Lam, Q Yu, AMC So, S Hu, Z Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
The sequence-to-sequence (seq2seq) task aims at generating the target sequence based
on the given input source sequence. Traditionally, most of the seq2seq task is resolved by …

Explicit sparse transformer: Concentrated attention through explicit selection

G Zhao, J Lin, Z Zhang, X Ren, Q Su, X Sun - arXiv preprint arXiv …, 2019 - arxiv.org
Self-attention based Transformer has demonstrated the state-of-the-art performances in a
number of natural language processing tasks. Self-attention is able to model long-term …

UniTE: Unified translation evaluation

Y Wan, D Liu, B Yang, H Zhang, B Chen… - arXiv preprint arXiv …, 2022 - arxiv.org
Translation quality evaluation plays a crucial role in machine translation. According to the
input format, it is mainly separated into three tasks, i.e., reference-only, source-only and …

Multilingual neural machine translation with language clustering

X Tan, J Chen, D He, Y Xia, T Qin, TY Liu - arXiv preprint arXiv …, 2019 - arxiv.org
Multilingual neural machine translation (NMT), which translates multiple languages using a
single model, is of great practical importance due to its advantages in simplifying the training …