Recent advances in natural language processing via large pre-trained language models: A survey
Large, pre-trained language models (PLMs) such as BERT and GPT have drastically
changed the Natural Language Processing (NLP) field. For numerous NLP tasks …
Post-hoc interpretability for neural NLP: A survey
Neural networks for NLP are becoming increasingly complex and widespread, and there is
growing concern about whether these models are responsible to use. Explaining models helps to address …
Explainability for large language models: A survey
Large language models (LLMs) have demonstrated impressive capabilities in natural
language processing. However, their internal mechanisms are still unclear and this lack of …
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model
Pre-trained language models can be surprisingly adept at tasks they were not explicitly
trained on, but how they implement these capabilities is poorly understood. In this paper, we …
Representation engineering: A top-down approach to AI transparency
A Zou, L Phan, S Chen, J Campbell, P Guo…
Quantizable transformers: Removing outliers by helping attention heads do nothing
Transformer models have been widely adopted in various domains in recent years, and
especially large language models have advanced the field of AI significantly. Due to their …
Predictability and surprise in large generative models
Large-scale pre-training has recently emerged as a technique for creating capable, general-
purpose, generative models such as GPT-3, Megatron-Turing NLG, Gopher, and many …
CogView: Mastering text-to-image generation via transformers
Text-to-Image generation in the general domain has long been an open problem, which
requires both a powerful generative model and cross-modal understanding. We propose …
NÜWA: Visual synthesis pre-training for neural visual world creation
This paper presents a unified multimodal pre-trained model called NÜWA that can generate
new or manipulate existing visual data (ie, images and videos) for various visual synthesis …
GPT understands, too
Prompting a pretrained language model with natural language patterns has proven
effective for natural language understanding (NLU). However, our preliminary study reveals …