Cramming: Training a Language Model on a single GPU in one day.
Recent trends in language modeling have focused on increasing performance through
scaling, and have resulted in an environment where training language models is out of …
scaling, and have resulted in an environment where training language models is out of …
No train no gain: Revisiting efficient training algorithms for transformer-based language models
The computation necessary for training Transformer-based language models has
skyrocketed in recent years. This trend has motivated research on efficient training …
skyrocketed in recent years. This trend has motivated research on efficient training …
Less: Selecting influential data for targeted instruction tuning
Instruction tuning has unlocked powerful capabilities in large language models (LLMs),
effectively using combined datasets to develop generalpurpose chatbots. However, real …
effectively using combined datasets to develop generalpurpose chatbots. However, real …
Compute-efficient deep learning: Algorithmic trends and opportunities
Although deep learning has made great progress in recent years, the exploding economic
and environmental costs of training neural networks are becoming unsustainable. To …
and environmental costs of training neural networks are becoming unsustainable. To …