A brief overview of ChatGPT: The history, status quo and potential future development
ChatGPT, an artificial intelligence generated content (AIGC) model developed by OpenAI,
has attracted worldwide attention for its capability of dealing with challenging language …
Recent advances in natural language processing via large pre-trained language models: A survey
Large, pre-trained language models (PLMs) such as BERT and GPT have drastically
changed the Natural Language Processing (NLP) field. For numerous NLP tasks …
Nucleotide Transformer: building and evaluating robust foundation models for human genomics
The prediction of molecular phenotypes from DNA sequences remains a longstanding
challenge in genomics, often driven by limited annotated data and the inability to transfer …
Evolutionary-scale prediction of atomic-level protein structure with a language model
Recent advances in machine learning have leveraged evolutionary information in multiple
sequence alignments to predict protein structure. We demonstrate direct inference of full …
Modern language models refute Chomsky's approach to language
ST Piantadosi - From fieldwork to linguistic theory: A tribute to …, 2023
Modern machine learning has subverted and bypassed the theoretical framework of
Chomsky's generative approach to linguistics, including its core claims to particular insights …
Explainability for large language models: A survey
Large language models (LLMs) have demonstrated impressive capabilities in natural
language processing. However, their internal mechanisms are still unclear and this lack of …
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model
Pre-trained language models can be surprisingly adept at tasks they were not explicitly
trained on, but how they implement these capabilities is poorly understood. In this paper, we …
Language models of protein sequences at the scale of evolution enable accurate structure prediction
Large language models have recently been shown to develop emergent capabilities with
scale, going beyond simple pattern matching to perform higher level reasoning and …
On the opportunities and risks of foundation models
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …
Does localization inform editing? Surprising differences in causality-based localization vs. knowledge editing in language models
Language models learn a great quantity of factual information during pretraining,
and recent work localizes this information to specific model weights like mid-layer MLP …