On the opportunities and risks of foundation models
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …
Finetuned language models are zero-shot learners
This paper explores a simple method for improving the zero-shot learning abilities of
language models. We show that instruction tuning--finetuning language models on a …
End-to-end transformer-based models in textual-based NLP
Transformer architectures are highly expressive because they use self-attention
mechanisms to encode long-range dependencies in the input sequences. In this paper, we …
Language models are multilingual chain-of-thought reasoners
We evaluate the reasoning abilities of large language models in multilingual settings. We
introduce the Multilingual Grade School Math (MGSM) benchmark, by manually translating …
Linguistically inspired roadmap for building biologically reliable protein language models
Deep neural-network-based language models (LMs) are increasingly applied to large-scale
protein sequence data to predict protein function. However, being largely black-box models …
MuRIL: Multilingual representations for Indian languages
India is a multilingual society with 1369 rationalized languages and dialects being spoken
across the country (INDIA, 2011). Of these, the 22 scheduled languages have a staggering …
Med-UniC: Unifying cross-lingual medical vision-language pre-training by diminishing bias
The scarcity of data presents a critical obstacle to the efficacy of medical vision-language pre-
training (VLP). A potential solution lies in the combination of datasets from various language …
mGPT: Few-shot learners go multilingual
Recent studies report that autoregressive language models can successfully solve many
NLP tasks via zero- and few-shot learning paradigms, which opens up new possibilities for …
LLM-powered data augmentation for enhanced cross-lingual performance
This paper explores the potential of leveraging Large Language Models (LLMs) for data
augmentation in multilingual commonsense reasoning datasets where the available training …
Language models are few-shot multilingual learners
General-purpose language models have demonstrated impressive capabilities, performing
on par with state-of-the-art approaches on a range of downstream natural language …