AMMUS: A survey of transformer-based pretrained models in natural language processing

KS Kalyan, A Rajasekharan, S Sangeetha - arXiv preprint arXiv …, 2021 - arxiv.org
Transformer-based pretrained language models (T-PTLMs) have achieved great success in
almost every NLP task. The evolution of these models started with GPT and BERT. These …

Dawn of the transformer era in speech emotion recognition: closing the valence gap

J Wagner, A Triantafyllopoulos… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Recent advances in transformer-based architectures have shown promise in several
machine learning tasks. In the audio domain, such architectures have been successfully …

Structured pruning learns compact and accurate models

M Xia, Z Zhong, D Chen - arXiv preprint arXiv:2204.00408, 2022 - arxiv.org
The growing size of neural language models has led to increased attention in model
compression. The two predominant approaches are pruning, which gradually removes …
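
As a rough illustration of what structured pruning means in this setting: whole units such as attention heads are scored and removed, rather than individual weights. The sketch below uses a simple magnitude-style head score on an attention output projection; the scoring heuristic, shapes, and names are illustrative assumptions, not this paper's method, which learns masks jointly with a distillation objective.

import torch

def head_scores(w_o: torch.Tensor, num_heads: int) -> torch.Tensor:
    # w_o: (hidden, hidden) attention output projection; score each head by the
    # L2 norm of its block of input columns (a simple magnitude-style proxy).
    hidden = w_o.shape[1]
    head_dim = hidden // num_heads
    blocks = w_o.view(hidden, num_heads, head_dim)  # split columns per head
    return blocks.norm(dim=(0, 2))                  # one score per head

def head_mask(w_o: torch.Tensor, num_heads: int, keep: int) -> torch.Tensor:
    # Structured pruning: keep the `keep` highest-scoring heads, zero out the rest.
    scores = head_scores(w_o, num_heads)
    mask = torch.zeros(num_heads)
    mask[scores.topk(keep).indices] = 1.0
    return mask  # multiply each head's output by its mask entry at inference

w_o = torch.randn(768, 768)                  # e.g. a BERT-base output projection
print(head_mask(w_o, num_heads=12, keep=8))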

MiniLM: Deep self-attention distillation for task-agnostic compression of pre-trained transformers

W Wang, F Wei, L Dong, H Bao… - Advances in neural …, 2020 - proceedings.neurips.cc
Pre-trained language models (e.g., BERT (Devlin et al., 2018) and its variants) have achieved
remarkable success in varieties of NLP tasks. However, these models usually consist of …
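
The "deep self-attention distillation" in the title trains a small student so that its self-attention distributions mimic the teacher's. Below is a minimal sketch of only the attention-distribution term (the paper also distills value relations and applies the loss to the last layer only); tensor names and shapes are assumptions.

import torch
import torch.nn.functional as F

def attention_distill_loss(teacher_attn: torch.Tensor,
                           student_attn: torch.Tensor) -> torch.Tensor:
    # teacher_attn, student_attn: (batch, heads, seq, seq) attention probabilities
    # from one layer. KL(teacher || student), summed over heads/queries/keys and
    # averaged over the batch (PyTorch's "batchmean").
    t = teacher_attn.clamp_min(1e-9)
    s = student_attn.clamp_min(1e-9)
    return F.kl_div(s.log(), t, reduction="batchmean")

# toy usage with random attention distributions over a length-4 sequence
t = torch.softmax(torch.randn(2, 12, 4, 4), dim=-1)
s = torch.softmax(torch.randn(2, 12, 4, 4), dim=-1)
print(attention_distill_loss(t, s))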

TinyBERT: Distilling BERT for natural language understanding

X Jiao, Y Yin, L Shang, X Jiang, X Chen, L Li… - arXiv preprint arXiv …, 2019 - arxiv.org
Language model pre-training, such as BERT, has significantly improved the performance of
many natural language processing tasks. However, pre-trained language models are …

Block pruning for faster transformers

F Lagunas, E Charlaix, V Sanh, AM Rush - arXiv preprint arXiv …, 2021 - arxiv.org
Pre-training has improved model accuracy for both classification and generation tasks at the
cost of introducing much larger and slower models. Pruning methods have proven to be an …

Movement pruning: Adaptive sparsity by fine-tuning

V Sanh, T Wolf, A Rush - Advances in neural information …, 2020 - proceedings.neurips.cc
Magnitude pruning is a widely used strategy for reducing model size in pure supervised
learning; however, it is less effective in the transfer learning regime that has become …
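
The contrast drawn here is between magnitude pruning (keep the weights with the largest absolute value) and movement pruning (keep the weights that move away from zero during fine-tuning, measured by an accumulated gradient-times-weight score). A rough sketch of both scores follows; the accumulated-gradient bookkeeping is an assumption, and the paper's actual scores are learned end-to-end with a straight-through estimator.

import torch

def magnitude_mask(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    # Magnitude pruning: keep the largest |w|, ignoring how fine-tuning moved them.
    k = max(1, int(weight.numel() * (1 - sparsity)))
    threshold = weight.abs().flatten().topk(k).values.min()
    return (weight.abs() >= threshold).float()

def movement_scores(weight: torch.Tensor, grad_sum: torch.Tensor) -> torch.Tensor:
    # Movement-style importance: S_i ~ -sum_t (dL/dw_i) * w_i, which is large for
    # weights that moved away from zero during fine-tuning. `grad_sum` is an
    # assumed running sum of gradients collected over training steps.
    return -(grad_sum * weight)

w = torch.randn(4, 4)
g = torch.randn(4, 4)                         # stand-in for accumulated gradients
print(magnitude_mask(w, sparsity=0.5))
print(movement_scores(w, g).flatten().topk(8).indices)   # indices of kept weights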

Parameter-efficient transfer learning with diff pruning

D Guo, AM Rush, Y Kim - arXiv preprint arXiv:2012.07463, 2020 - arxiv.org
While task-specific finetuning of pretrained networks has led to significant empirical
advances in NLP, the large size of networks makes finetuning difficult to deploy in multi-task …
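
Diff pruning keeps the pretrained parameters frozen and learns a sparse task-specific difference vector on top of them, so only the small diff needs to be stored per task. A minimal sketch below, using a plain L1 penalty as a stand-in for the paper's relaxed L0 regularizer; the module name and penalty weight are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DiffPrunedLinear(nn.Module):
    # Frozen pretrained weight plus a learnable task-specific diff; only the diff
    # is trained and sparsified (here with an L1 penalty as a simplification).
    def __init__(self, pretrained: nn.Linear):
        super().__init__()
        self.weight = pretrained.weight.detach()     # frozen copy
        self.bias = pretrained.bias.detach() if pretrained.bias is not None else None
        self.diff = nn.Parameter(torch.zeros_like(self.weight))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(x, self.weight + self.diff, self.bias)

    def sparsity_penalty(self) -> torch.Tensor:
        return self.diff.abs().sum()

layer = DiffPrunedLinear(nn.Linear(768, 768))
x = torch.randn(2, 768)
loss = layer(x).pow(2).mean() + 1e-4 * layer.sparsity_penalty()
loss.backward()   # gradients reach only the diff, not the frozen weight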

Full stack optimization of transformer inference: a survey

S Kim, C Hooper, T Wattanawong, M Kang… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advances in state-of-the-art DNN architecture design have been moving toward
Transformer models. These models achieve superior accuracy across a wide range of …

The optimal BERT surgeon: Scalable and accurate second-order pruning for large language models

E Kurtic, D Campos, T Nguyen, E Frantar… - arXiv preprint arXiv …, 2022 - arxiv.org
Transformer-based language models have become a key building block for natural
language processing. While these models are extremely accurate, they can be too large and …
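
Second-order pruning here means scoring each weight by the loss increase predicted from a quadratic (Optimal Brain Surgeon style) approximation, rho_i = w_i^2 / (2 [H^-1]_ii), with the Hessian estimated from gradients. The sketch below uses only a diagonal empirical-Fisher approximation as an assumed simplification; the paper's contribution is making block-wise inverse estimates scale to large models, which this does not capture.

import torch

def obs_saliency(weight: torch.Tensor, grad_samples: torch.Tensor,
                 damp: float = 1e-4) -> torch.Tensor:
    # OBS-style saliency rho_i = w_i^2 / (2 * [H^-1]_ii) with H approximated by a
    # *diagonal* empirical Fisher built from per-example gradients (grad_samples:
    # (num_samples, *weight.shape), assumed to be collected elsewhere).
    fisher_diag = grad_samples.pow(2).mean(dim=0) + damp    # H ~ diag(Fisher)
    h_inv_diag = 1.0 / fisher_diag                          # diagonal inverse
    return weight.pow(2) / (2.0 * h_inv_diag)

w = torch.randn(16)
g = torch.randn(32, 16)                      # stand-in for 32 per-example gradients
scores = obs_saliency(w, g)
mask = (scores >= scores.topk(8).values.min()).float()      # keep the top half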