AMMUS: A survey of transformer-based pretrained models in natural language processing

KS Kalyan, A Rajasekharan, S Sangeetha - arXiv preprint arXiv …, 2021 - arxiv.org
Transformer-based pretrained language models (T-PTLMs) have achieved great success in
almost every NLP task. The evolution of these models started with GPT and BERT. These …

Pre-trained language models in biomedical domain: A systematic survey

B Wang, Q **e, J Pei, Z Chen, P Tiwari, Z Li… - ACM Computing …, 2023 - dl.acm.org
Pre-trained language models (PLMs) have been the de facto paradigm for most natural
language processing tasks. This also benefits the biomedical domain: researchers from …

Memorization without overfitting: Analyzing the training dynamics of large language models

K Tirumala, A Markosyan… - Advances in …, 2022 - proceedings.neurips.cc
Despite their wide adoption, the underlying training and memorization dynamics of very
large language models are not well understood. We empirically study exact memorization in …
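
A rough sketch of how exact memorization is typically operationalized in this line of work: prompt the model with a prefix taken from a training sequence and count how many of its greedy (argmax) next-token predictions reproduce the original continuation. The helper below is an illustrative assumption, not the paper's released code; model(input_ids) is assumed to return logits of shape (batch, seq_len, vocab).

    import torch

    @torch.no_grad()
    def exact_memorization_rate(model, token_ids, context_len=32):
        # token_ids: 1-D LongTensor holding one training sequence
        input_ids = token_ids.unsqueeze(0)        # (1, seq_len)
        logits = model(input_ids)                 # (1, seq_len, vocab), assumed
        preds = logits.argmax(dim=-1)[0]          # greedy prediction for the next token at each position
        targets = token_ids[context_len:]         # ground-truth continuation after the prefix
        guesses = preds[context_len - 1:-1]       # predictions aligned with those targets
        return (guesses == targets).float().mean().item()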

Do vision transformers see like convolutional neural networks?

M Raghu, T Unterthiner, S Kornblith… - Advances in neural …, 2021 - proceedings.neurips.cc
Convolutional neural networks (CNNs) have so far been the de-facto model for visual data.
Recent work has shown that (Vision) Transformer models (ViT) can achieve comparable or …

The neural architecture of language: Integrative modeling converges on predictive processing

M Schrimpf, IA Blank, G Tuckute… - Proceedings of the …, 2021 - National Acad Sciences
The neuroscience of perception has recently been revolutionized with an integrative
modeling approach in which computation, brain function, and behavior are linked across …

Revisiting few-sample BERT fine-tuning

T Zhang, F Wu, A Katiyar, KQ Weinberger… - arXiv preprint arXiv …, 2020 - arxiv.org
This paper is a study of fine-tuning of BERT contextual representations, with a focus on
commonly observed instabilities in few-sample scenarios. We identify several factors that …
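
One of the stabilization strategies examined in this paper is re-initializing the top encoder layers before fine-tuning. A minimal sketch for a Hugging Face BertModel follows; the function name, the number of layers, and std=0.02 (BERT's default initializer range) are illustrative choices, not the paper's exact recipe.

    import torch
    from transformers import BertModel

    def reinit_top_layers(model: BertModel, num_layers: int = 2, std: float = 0.02):
        # Re-initialize the weights of the top `num_layers` transformer blocks
        for layer in model.encoder.layer[-num_layers:]:
            for module in layer.modules():
                if isinstance(module, torch.nn.Linear):
                    module.weight.data.normal_(mean=0.0, std=std)
                    if module.bias is not None:
                        module.bias.data.zero_()
                elif isinstance(module, torch.nn.LayerNorm):
                    module.weight.data.fill_(1.0)
                    module.bias.data.zero_()

    model = BertModel.from_pretrained("bert-base-uncased")
    reinit_top_layers(model, num_layers=2)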

Achieving forgetting prevention and knowledge transfer in continual learning

Z Ke, B Liu, N Ma, H Xu, L Shu - Advances in Neural …, 2021 - proceedings.neurips.cc
Continual learning (CL) learns a sequence of tasks incrementally with the goal of achieving
two main objectives: overcoming catastrophic forgetting (CF) and encouraging knowledge …

On the effectiveness of adapter-based tuning for pretrained language model adaptation

R He, L Liu, H Ye, Q Tan, B Ding, L Cheng… - arXiv preprint arXiv …, 2021 - arxiv.org
Adapter-based tuning has recently arisen as an alternative to fine-tuning. It works by adding
light-weight adapter modules to a pretrained language model (PrLM) and only updating the …
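
A minimal sketch of the bottleneck adapter design this family of methods builds on: a down-projection, nonlinearity, and up-projection with a residual connection, inserted after a transformer sub-layer while the PrLM weights stay frozen. The hidden and bottleneck sizes and the near-identity initialization are illustrative assumptions, not values taken from the paper.

    import torch
    import torch.nn as nn

    class Adapter(nn.Module):
        def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
            super().__init__()
            self.down = nn.Linear(hidden_size, bottleneck)
            self.up = nn.Linear(bottleneck, hidden_size)
            self.act = nn.GELU()
            # near-identity init so training starts from the frozen PrLM's behavior
            nn.init.normal_(self.down.weight, std=1e-3)
            nn.init.zeros_(self.down.bias)
            nn.init.zeros_(self.up.weight)
            nn.init.zeros_(self.up.bias)

        def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
            return hidden_states + self.up(self.act(self.down(hidden_states)))

During adaptation, only these adapter parameters (and typically layer norms) are marked trainable; the rest of the PrLM is frozen, e.g. with requires_grad_(False).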

All bark and no bite: Rogue dimensions in transformer language models obscure representational quality

W Timkey, M Van Schijndel - arXiv preprint arXiv:2109.04404, 2021 - arxiv.org
Similarity measures are a vital tool for understanding how language models represent and
process language. Standard representational similarity measures such as cosine similarity …
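
A small sketch of the issue raised here: plain cosine similarity can be dominated by a handful of high-magnitude "rogue" dimensions. Standardizing each dimension against a sample of representations before comparing is one common correction in this line of work, not necessarily the paper's exact procedure.

    import numpy as np

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    def standardized_cosine(u, v, reps):
        # reps: (n_tokens, hidden_size) sample of representations used to estimate
        # per-dimension mean and std, so no single dimension dominates the comparison
        mu, sigma = reps.mean(axis=0), reps.std(axis=0) + 1e-8
        return cosine((u - mu) / sigma, (v - mu) / sigma)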

Semantic structure in deep learning

E Pavlick - Annual Review of Linguistics, 2022 - annualreviews.org
Deep learning has recently come to dominate computational linguistics, leading to claims of
human-level performance in a range of language processing tasks. Like much previous …