A survey on stability of learning with limited labelled data and its sensitivity to the effects of randomness
Learning with limited labelled data, such as prompting, in-context learning, fine-tuning, meta-
learning, or few-shot learning, aims to effectively train a model using only a small amount of …
Pythia: A suite for analyzing large language models across training and scaling
How do large language models (LLMs) develop and evolve over the course of training?
How do these patterns change as models scale? To answer these questions, we introduce …
Deep reinforcement learning at the edge of the statistical precipice
Deep reinforcement learning (RL) algorithms are predominantly evaluated by comparing
their relative performance on a large suite of tasks. Most published results on deep RL …
A toy model of universality: Reverse engineering how networks learn group operations
Universality is a key hypothesis in mechanistic interpretability: that different models learn
similar features and circuits when trained on similar tasks. In this work, we study the …
Cramming: Training a language model on a single GPU in one day
Recent trends in language modeling have focused on increasing performance through
scaling, and have resulted in an environment where training language models is out of …
Datamodels: Predicting predictions from training data
We present a conceptual framework, datamodeling, for analyzing the behavior of a model
class in terms of the training data. For any fixed "target" example $x$, training set $S$, and …
Analyzing transformers in embedding space
Understanding Transformer-based models has attracted significant attention, as they lie at
the heart of recent technological advances across machine learning. While most …
AdaMix: Mixture-of-adaptations for parameter-efficient model tuning
Standard fine-tuning of large pre-trained language models (PLMs) for downstream tasks
requires updating hundreds of millions to billions of parameters, and storing a large copy of …
All bark and no bite: Rogue dimensions in transformer language models obscure representational quality
Similarity measures are a vital tool for understanding how language models represent and
process language. Standard representational similarity measures such as cosine similarity …