Post-hoc interpretability for neural NLP: A survey

A Madsen, S Reddy, S Chandar - ACM Computing Surveys, 2022 - dl.acm.org
Neural networks for NLP are becoming increasingly complex and widespread, and there is a
growing concern if these models are responsible to use. Explaining models helps to address …

Probing classifiers: Promises, shortcomings, and advances

Y Belinkov - Computational Linguistics, 2022 - direct.mit.edu
Probing classifiers have emerged as one of the prominent methodologies for interpreting
and analyzing deep neural network models of natural language processing. The basic idea …
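The "basic idea" the snippet truncates is standard: freeze a model's representations and train a small supervised classifier (the probe) to predict a linguistic property from them; above-baseline probe accuracy is taken as evidence the property is encoded. A minimal sketch with synthetic stand-in data (the representations, labels, and probe here are all illustrative assumptions, not from the survey itself):

```python
import numpy as np

# Hypothetical setup: "reps" stands in for frozen hidden states from a
# pretrained model; "labels" encodes some linguistic property we probe for.
rng = np.random.default_rng(0)
n, d = 200, 16
reps = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
labels = (reps @ w_true > 0).astype(float)  # property is linearly encoded here

def train_probe(X, y, lr=0.1, steps=500):
    """Fit a logistic-regression probe by plain gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))   # sigmoid predictions
        w -= lr * X.T @ (p - y) / len(y)     # gradient of log loss
    return w

w = train_probe(reps, labels)
acc = ((reps @ w > 0) == labels).mean()
# High probe accuracy suggests the property is linearly decodable from the
# representations; the paper surveys why this inference needs care.
```

Belinkov's survey discusses exactly the caveats this sketch glosses over, e.g. that a sufficiently powerful probe can decode properties the model never uses.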

BLOOM: A 176B-parameter open-access multilingual language model

T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow… - 2023 - inria.hal.science
Large language models (LLMs) have been shown to be able to perform new tasks based on
a few demonstrations or natural language instructions. While these capabilities have led to …

Locating and editing factual associations in GPT

K Meng, D Bau, A Andonian… - Advances in Neural …, 2022 - proceedings.neurips.cc
We analyze the storage and recall of factual associations in autoregressive transformer
language models, finding evidence that these associations correspond to localized, directly …

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Do vision transformers see like convolutional neural networks?

M Raghu, T Unterthiner, S Kornblith… - Advances in Neural …, 2021 - proceedings.neurips.cc
Convolutional neural networks (CNNs) have so far been the de-facto model for visual data.
Recent work has shown that (Vision) Transformer models (ViT) can achieve comparable or …

Fast model editing at scale

E Mitchell, C Lin, A Bosselut, C Finn… - arXiv preprint arXiv …, 2021 - arxiv.org
While large pre-trained models have enabled impressive results on a variety of downstream
tasks, the largest existing models still make errors, and even accurate predictions may …

Interpretability at scale: Identifying causal mechanisms in Alpaca

Z Wu, A Geiger, T Icard, C Potts… - Advances in Neural …, 2023 - proceedings.neurips.cc
Obtaining human-interpretable explanations of large, general-purpose language models is
an urgent goal for AI safety. However, it is just as important that our interpretability methods …

Pre-trained models: Past, present and future

X Han, Z Zhang, N Ding, Y Gu, X Liu, Y Huo, J Qiu… - AI Open, 2021 - Elsevier
Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved
great success and become a milestone in the field of artificial intelligence (AI). Owing to …

Physics of language models: Part 3.1, knowledge storage and extraction

Z Allen-Zhu, Y Li - arXiv preprint arXiv:2309.14316, 2023 - arxiv.org
Large language models (LLMs) can store a vast amount of world knowledge, often
extractable via question-answering (e.g., "What is Abraham Lincoln's birthday?"). However …