Probing classifiers: Promises, shortcomings, and advances

Y Belinkov - Computational Linguistics, 2022 - direct.mit.edu
Probing classifiers have emerged as one of the prominent methodologies for interpreting
and analyzing deep neural network models of natural language processing. The basic idea …
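
The snippet above names the basic probing-classifier recipe: extract representations from a frozen model and train a small classifier to predict a linguistic property, reading the property's decodability off the probe's accuracy. A minimal sketch of that recipe, assuming scikit-learn as the probe; the random arrays are hypothetical stand-ins for frozen model representations and gold labels, not anything from the paper.

```python
# Minimal probing-classifier sketch: fit a simple classifier on frozen
# representations to test whether a linguistic property is linearly decodable.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-ins for representations extracted from a frozen NLP model
# (e.g. one 768-d vector per token) and for gold part-of-speech labels.
X = rng.normal(size=(2000, 768))      # hypothetical frozen embeddings
y = rng.integers(0, 5, size=2000)     # hypothetical 5-way POS labels

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

probe = LogisticRegression(max_iter=1000)  # the probe itself stays simple
probe.fit(X_train, y_train)

# Probing accuracy: how well the property can be read off the representations.
print("probe accuracy:", probe.score(X_test, y_test))
```
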

Pre-trained models for natural language processing: A survey

X Qiu, T Sun, Y Xu, Y Shao, N Dai, X Huang - Science China …, 2020 - Springer
Recently, the emergence of pre-trained models (PTMs) has brought natural language
processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs …

Modern language models refute Chomsky's approach to language

ST Piantadosi - From fieldwork to linguistic theory: A tribute to …, 2023 - books.google.com
Modern machine learning has subverted and bypassed the theoretical framework of
Chomsky's generative approach to linguistics, including its core claims to particular insights …

Pre-trained models: Past, present and future

X Han, Z Zhang, N Ding, Y Gu, X Liu, Y Huo, J Qiu… - AI Open, 2021 - Elsevier
Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved
great success and become a milestone in the field of artificial intelligence (AI). Owing to …

AutoPrompt: Eliciting knowledge from language models with automatically generated prompts

T Shin, Y Razeghi, RL Logan IV, E Wallace… - arXiv preprint arXiv …, 2020 - arxiv.org
The remarkable success of pretrained language models has motivated the study of what
kinds of knowledge these models learn during pretraining. Reformulating tasks as fill-in-the …
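
The snippet mentions reformulating tasks as fill-in-the-blank queries to a masked language model. The sketch below illustrates only that reformulation, assuming Hugging Face's fill-mask pipeline and a hand-written template; AutoPrompt's actual contribution, a gradient-guided search for trigger tokens, is not shown.

```python
# Fill-in-the-blank querying of a masked language model, the reformulation
# the AutoPrompt snippet refers to (the paper additionally searches for
# trigger tokens with gradients, which this sketch does not do).
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

# A hand-written template; AutoPrompt would augment it with automatically
# generated trigger tokens instead of relying on manual wording.
template = "The capital of France is [MASK]."

for candidate in fill(template, top_k=3):
    print(candidate["token_str"], round(candidate["score"], 3))
```
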

Transformer feed-forward layers are key-value memories

M Geva, R Schuster, J Berant, O Levy - arXiv preprint arXiv:2012.14913, 2020 - arxiv.org
Feed-forward layers constitute two-thirds of a transformer model's parameters, yet their role
in the network remains under-explored. We show that feed-forward layers in transformer …
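
The key-value-memory reading of a feed-forward block can be made concrete: with FFN(x) = W2 f(W1 x), each row of W1 acts as a key matched against the input, and the paired column of W2 acts as the value mixed in proportion to that match. A small PyTorch sketch under assumed shapes, not the paper's code:

```python
# A transformer feed-forward block FFN(x) = W2 @ f(W1 @ x) viewed as a
# key-value memory: rows of W1 are keys, columns of W2 are values.
# Shapes here are illustrative only.
import torch

d_model, d_ff = 8, 32
x = torch.randn(d_model)

W1 = torch.randn(d_ff, d_model)   # each row is a key
W2 = torch.randn(d_model, d_ff)   # each column is the value paired with that key

coeffs = torch.relu(W1 @ x)       # memory coefficients: how strongly each key fires
out = W2 @ coeffs                 # weighted sum of value vectors

# The same output written explicitly as a sum over (key, value) memory cells.
out_as_memory = sum(coeffs[i] * W2[:, i] for i in range(d_ff))
print(torch.allclose(out, out_as_memory))
```
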

SPoT: Better frozen model adaptation through soft prompt transfer

T Vu, B Lester, N Constant, R Al-Rfou, D Cer - arXiv preprint arXiv …, 2021 - arxiv.org
There has been growing interest in parameter-efficient methods to apply pre-trained
language models to downstream tasks. Building on the Prompt Tuning approach of Lester et …
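
Prompt Tuning, which SPoT builds on, trains only a short sequence of soft prompt embeddings prepended to the input while the backbone stays frozen. The following is a minimal PyTorch sketch under assumed shapes, with a toy encoder standing in for a pre-trained LM; SPoT's transfer of prompts learned on source tasks is not shown.

```python
# Minimal prompt-tuning sketch: the backbone stays frozen and only a small
# matrix of "soft prompt" embeddings, prepended to the input sequence, is
# trained (plus a small task head in this sketch).
import torch
import torch.nn as nn

d_model, prompt_len, seq_len = 16, 4, 10

# Stand-in frozen backbone (a real setup would freeze a pre-trained LM).
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True),
    num_layers=2,
)
for p in backbone.parameters():
    p.requires_grad_(False)

soft_prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)
head = nn.Linear(d_model, 2)

optimizer = torch.optim.Adam([soft_prompt, *head.parameters()], lr=1e-3)

x = torch.randn(8, seq_len, d_model)   # hypothetical input embeddings
y = torch.randint(0, 2, (8,))          # hypothetical binary labels

prompts = soft_prompt.unsqueeze(0).expand(x.size(0), -1, -1)
hidden = backbone(torch.cat([prompts, x], dim=1))
logits = head(hidden.mean(dim=1))

loss = nn.functional.cross_entropy(logits, y)
loss.backward()
optimizer.step()
print("one prompt-tuning step, loss:", loss.item())
```
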

A primer in BERTology: What we know about how BERT works

A Rogers, O Kovaleva, A Rumshisky - Transactions of the Association …, 2021 - direct.mit.edu
Transformer-based models have pushed the state of the art in many areas of NLP, but our
understanding of what is behind their success is still limited. This paper is the first survey of …

A survey of knowledge enhanced pre-trained language models

L Hu, Z Liu, Z Zhao, L Hou, L Nie… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Pre-trained Language Models (PLMs), which are trained on large text corpora via self-
supervised learning, have yielded promising performance on various tasks in …

TERA: Self-supervised learning of transformer encoder representation for speech

AT Liu, SW Li, H Lee - IEEE/ACM Transactions on Audio …, 2021 - ieeexplore.ieee.org
We introduce a self-supervised speech pre-training method called TERA, which stands for
Transformer Encoder Representations from Alteration. Recent approaches often learn by …
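
TERA-style pre-training reconstructs acoustic features from deliberately altered inputs. A generic masked-reconstruction sketch follows; the shapes, masking rate, and loss normalization are assumptions rather than the paper's exact recipe.

```python
# Generic masked-reconstruction pre-training in the spirit of the TERA
# snippet: portions of an acoustic feature sequence are altered (here,
# zeroed time frames) and an encoder is trained to reconstruct them.
import torch
import torch.nn as nn

n_frames, n_mels, batch = 100, 80, 4
features = torch.randn(batch, n_frames, n_mels)   # hypothetical log-Mel features

# Alter the input: zero out a random 15% of time frames per utterance.
mask = torch.rand(batch, n_frames, 1) < 0.15
altered = features.masked_fill(mask, 0.0)

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=n_mels, nhead=8, batch_first=True),
    num_layers=3,
)
reconstruct = nn.Linear(n_mels, n_mels)

optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(reconstruct.parameters()), lr=1e-4
)

pred = reconstruct(encoder(altered))
# L1 reconstruction loss, computed only on the altered frames.
loss = (pred - features).abs().mul(mask).sum() / mask.sum().clamp(min=1)
loss.backward()
optimizer.step()
print("reconstruction loss:", loss.item())
```
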