Challenges and applications of large language models
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …
Pre-trained language models and their applications
Pre-trained language models have achieved striking success in natural language
processing (NLP), leading to a paradigm shift from supervised learning to pre-training …
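To make the described shift from supervised learning to pre-training concrete, here is a minimal fine-tuning sketch; the Hugging Face transformers library and the bert-base-uncased checkpoint are illustrative assumptions, not details from the entry above:

```python
# Minimal sketch of the pre-train-then-fine-tune paradigm (assumes the
# Hugging Face `transformers` library; the model name is illustrative).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Reuse pre-trained weights; only the classification head starts untrained.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

batch = tokenizer(["a great movie", "a dull movie"],
                  return_tensors="pt", padding=True)
labels = torch.tensor([1, 0])

# One supervised step adapts the pre-trained encoder to the downstream task.
loss = model(**batch, labels=labels).loss
loss.backward()
```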
Scaling speech technology to 1,000+ languages
Expanding the language coverage of speech technology has the potential to improve
access to information for many more people. However, current speech technology is …
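As a rough illustration of how such massively multilingual speech models are typically queried, the sketch below runs CTC transcription through Hugging Face transformers; the facebook/mms-1b-all checkpoint and the 16 kHz mono input are assumptions, not details from the snippet:

```python
# Hedged sketch: CTC transcription with a massively multilingual ASR model.
# The checkpoint name and 16 kHz mono input are assumptions for illustration.
import torch
from transformers import AutoProcessor, Wav2Vec2ForCTC

processor = AutoProcessor.from_pretrained("facebook/mms-1b-all")
model = Wav2Vec2ForCTC.from_pretrained("facebook/mms-1b-all")

waveform = torch.zeros(16000)  # stand-in: 1 second of 16 kHz audio
inputs = processor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits      # (1, frames, vocab)
ids = torch.argmax(logits, dim=-1)       # greedy CTC decoding
print(processor.batch_decode(ids)[0])
```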
BioGPT: generative pre-trained transformer for biomedical text generation and mining
Pre-trained language models have attracted increasing attention in the biomedical domain,
inspired by their great success in the general natural language domain. Among the two main …
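For reference, a minimal generation call against the publicly released BioGPT weights; the microsoft/biogpt model ID and the prompt are assumptions for illustration:

```python
# Minimal sketch: biomedical text generation with a BioGPT checkpoint.
# The `microsoft/biogpt` model ID is an assumption, not from the snippet.
from transformers import BioGptForCausalLM, BioGptTokenizer

tokenizer = BioGptTokenizer.from_pretrained("microsoft/biogpt")
model = BioGptForCausalLM.from_pretrained("microsoft/biogpt")

inputs = tokenizer("COVID-19 is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```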
GPT3.int8(): 8-bit matrix multiplication for transformers at scale
Large language models have been widely adopted but require significant GPU memory for
inference. We develop a procedure for Int8 matrix multiplication for feed-forward and …
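The core idea can be sketched in a few lines of NumPy: quantize activations per row and weights per column to int8 with absmax scaling, accumulate the product in int32, then dequantize. This is a simplified sketch in the spirit of the paper's Int8 inference, omitting its mixed-precision outlier decomposition:

```python
# Sketch of absmax int8 quantized matmul with fp32 dequantization
# (simplified: no outlier decomposition as used in the paper).
import numpy as np

def absmax_quantize(x, axis):
    # Scale each row/column so the largest magnitude maps to 127.
    scale = 127.0 / np.maximum(np.abs(x).max(axis=axis, keepdims=True), 1e-8)
    return np.round(x * scale).astype(np.int8), scale

def int8_matmul(a, b):
    a_q, a_scale = absmax_quantize(a, axis=1)   # per-row of activations
    b_q, b_scale = absmax_quantize(b, axis=0)   # per-column of weights
    # Accumulate in int32, then undo both scales to recover fp32.
    c = a_q.astype(np.int32) @ b_q.astype(np.int32)
    return c / (a_scale * b_scale)

a = np.random.randn(4, 8).astype(np.float32)
b = np.random.randn(8, 3).astype(np.float32)
print(np.abs(int8_matmul(a, b) - a @ b).max())  # small quantization error
```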
Impact of code language models on automated program repair
Automated program repair (APR) aims to help developers improve software reliability by
generating patches for buggy programs. Although many code language models (CLM) are …
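As a hedged illustration of how a code language model can be asked for candidate patches, the sketch below masks a suspected buggy line and lets a CodeT5 checkpoint infill replacements; the Salesforce/codet5-base model ID and the <extra_id_0> sentinel convention come from CodeT5's pre-training format and are assumptions, not details from the entry above:

```python
# Hedged sketch: asking a code LM for candidate patches by masking the
# suspected buggy line and generating infills.
from transformers import RobertaTokenizer, T5ForConditionalGeneration

tokenizer = RobertaTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

buggy = """def is_even(n):
    return n % 2 == 1  # bug: wrong remainder check
"""
masked = buggy.replace("return n % 2 == 1", "<extra_id_0>")

inputs = tokenizer(masked, return_tensors="pt")
candidates = model.generate(**inputs, max_new_tokens=20,
                            num_beams=5, num_return_sequences=5)
for c in candidates:
    print(tokenizer.decode(c, skip_special_tokens=True))
```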
Learning inverse folding from millions of predicted structures
We consider the problem of predicting a protein sequence from its backbone atom
coordinates. Machine learning approaches to this problem to date have been limited by the …
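To make the task's interface concrete, here is a toy sketch mapping backbone atom coordinates to per-residue amino-acid logits. This is not the paper's model; every layer choice below is an illustrative assumption:

```python
# Toy sketch of the inverse-folding interface: backbone coordinates in,
# per-residue amino-acid logits out. All layer choices are assumptions.
import torch
import torch.nn as nn

NUM_AMINO_ACIDS = 20

class ToyInverseFolder(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        # Each residue contributes N, CA, C backbone atoms: 3 atoms x 3 coords.
        self.encode = nn.Sequential(nn.Linear(9, hidden), nn.ReLU())
        self.mix = nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True)
        self.head = nn.Linear(hidden, NUM_AMINO_ACIDS)

    def forward(self, coords):                 # (batch, length, 3, 3)
        h = self.encode(coords.flatten(2))     # (batch, length, hidden)
        return self.head(self.mix(h))          # (batch, length, 20)

logits = ToyInverseFolder()(torch.randn(1, 50, 3, 3))
print(logits.shape)  # torch.Size([1, 50, 20])
```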
data2vec: A general framework for self-supervised learning in speech, vision and language
While the general idea of self-supervised learning is identical across modalities, the actual
algorithms and objectives differ widely because they were developed with a single modality …
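The shared objective across modalities can be sketched in a few lines: a student network regresses the representations an exponential-moving-average teacher produces for the unmasked input, with the loss applied only at masked positions. The encoder choice, EMA rate, masking scheme, and smooth-L1 loss below are illustrative assumptions consistent with that idea:

```python
# Sketch of a data2vec-style objective: the student sees a masked input and
# regresses the EMA teacher's representations of the unmasked input at the
# masked positions. Encoder, EMA rate, and loss are illustrative assumptions.
import copy
import torch
import torch.nn.functional as F

encoder = torch.nn.TransformerEncoderLayer(64, nhead=4, batch_first=True)
teacher = copy.deepcopy(encoder)             # weights track the student by EMA
for p in teacher.parameters():
    p.requires_grad_(False)

x = torch.randn(8, 32, 64)                   # (batch, timesteps, features)
mask = torch.rand(8, 32) < 0.15              # positions the student must predict

student_in = x.clone()
student_in[mask] = 0.0                       # crude masking for illustration

pred = encoder(student_in)
with torch.no_grad():
    target = teacher(x)                      # targets come from the full input

loss = F.smooth_l1_loss(pred[mask], target[mask])
loss.backward()

# EMA update of the teacher after each optimizer step.
tau = 0.999
for p_t, p_s in zip(teacher.parameters(), encoder.parameters()):
    p_t.mul_(tau).add_(p_s.detach(), alpha=1 - tau)
```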
FLAVA: A foundational language and vision alignment model
State-of-the-art vision and vision-and-language models rely on large-scale visio-linguistic
pretraining for obtaining good performance on a variety of downstream tasks. Generally …
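For orientation, the released model exposes unimodal and multimodal embeddings in a single forward pass; this sketch assumes the facebook/flava-full checkpoint in Hugging Face transformers:

```python
# Sketch: unimodal and multimodal embeddings from a FLAVA checkpoint.
# The `facebook/flava-full` model ID is an assumption for illustration.
from PIL import Image
from transformers import FlavaModel, FlavaProcessor

processor = FlavaProcessor.from_pretrained("facebook/flava-full")
model = FlavaModel.from_pretrained("facebook/flava-full")

image = Image.new("RGB", (224, 224))         # stand-in image
inputs = processor(text=["a photo of a cat"], images=[image],
                   return_tensors="pt", padding=True)

outputs = model(**inputs)
print(outputs.image_embeddings.shape)        # patch-level image features
print(outputs.text_embeddings.shape)         # token-level text features
print(outputs.multimodal_embeddings.shape)   # fused cross-modal features
```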
XLS-R: Self-supervised cross-lingual speech representation learning at scale
This paper presents XLS-R, a large-scale model for cross-lingual speech representation
learning based on wav2vec 2.0. We train models with up to 2B parameters on nearly half a …
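A minimal way to extract cross-lingual speech representations from the released checkpoints; the facebook/wav2vec2-xls-r-300m model ID and 16 kHz mono input are assumptions for illustration:

```python
# Sketch: extracting speech representations from an XLS-R checkpoint.
# Model ID and 16 kHz mono input are assumptions for illustration.
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-xls-r-300m")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-xls-r-300m")

waveform = torch.zeros(16000)                # stand-in: 1 s of 16 kHz audio
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    hidden = model(**inputs).last_hidden_state
print(hidden.shape)                          # (1, frames, hidden_dim)
```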