Dissociating language and thought in large language models
Large language models (LLMs) have come closest among all models to date to mastering
human language, yet opinions about their linguistic and cognitive capabilities remain split …
Recent advances in natural language processing via large pre-trained language models: A survey
Large, pre-trained language models (PLMs) such as BERT and GPT have drastically
changed the Natural Language Processing (NLP) field. For numerous NLP tasks …
Inference-time intervention: Eliciting truthful answers from a language model
Abstract: We introduce Inference-Time Intervention (ITI), a technique designed to enhance
the "truthfulness" of large language models (LLMs). ITI operates by shifting model activations …
Modern language models refute Chomsky's approach to language
ST Piantadosi - From fieldwork to linguistic theory: A tribute to …, 2023 - books.google.com
Modern machine learning has subverted and bypassed the theoretical framework of
Chomsky's generative approach to linguistics, including its core claims to particular insights …
Explainability for large language models: A survey
Large language models (LLMs) have demonstrated impressive capabilities in natural
language processing. However, their internal mechanisms are still unclear and this lack of …
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model
Pre-trained language models can be surprisingly adept at tasks they were not explicitly
trained on, but how they implement these capabilities is poorly understood. In this paper, we …
ZeroQuant: Efficient and affordable post-training quantization for large-scale transformers
How to efficiently serve ever-larger trained natural language models in practice has become
exceptionally challenging even for powerful cloud servers due to their prohibitive …
Socratic models: Composing zero-shot multimodal reasoning with language
Large pretrained (e.g., "foundation") models exhibit distinct capabilities depending on the
domain of data they are trained on. While these domains are generic, they may only barely …
LEACE: Perfect linear concept erasure in closed form
Abstract: Concept erasure aims to remove specified features from a representation. It can
improve fairness (e.g., preventing a classifier from using gender or race) and interpretability …
Design guidelines for prompt engineering text-to-image generative models
Text-to-image generative models are a new and powerful way to generate visual artwork.
However, the open-ended nature of text as interaction is double-edged; while users can …