A survey on federated learning for resource-constrained IoT devices
Federated learning (FL) is a distributed machine learning strategy that generates a global
model by learning from multiple decentralized edge clients. FL enables on-device training …
Demystifying parallel and distributed deep learning: An in-depth concurrency analysis
Deep Neural Networks (DNNs) are becoming an important tool in modern computing
applications. Accelerating their training is a major challenge and techniques range from …
QLoRA: Efficient finetuning of quantized LLMs
We present QLoRA, an efficient finetuning approach that reduces memory usage enough to
finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit …
A survey of large language models
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …
Efficient memory management for large language model serving with PagedAttention
High throughput serving of large language models (LLMs) requires batching sufficiently
many requests at a time. However, existing systems struggle because the key-value cache …
Learning skillful medium-range global weather forecasting
Global medium-range weather forecasting is critical to decision-making across many social
and economic domains. Traditional numerical weather prediction uses increased compute …
LlamaFactory: Unified efficient fine-tuning of 100+ language models
Efficient fine-tuning is vital for adapting large language models (LLMs) to downstream tasks.
However, it requires non-trivial efforts to implement these methods on different models. We …
EVA: Exploring the limits of masked visual representation learning at scale
We launch EVA, a vision-centric foundation model to explore the limits of visual
representation at scale using only publicly accessible data. EVA is a vanilla ViT pre-trained …
Scaling language-image pre-training via masking
We present Fast Language-Image Pre-training (FLIP), a simple and more efficient
method for training CLIP. Our method randomly masks out and removes a large portion of …
LightGlue: Local feature matching at light speed
We introduce LightGlue, a deep neural network that learns to match local features across
images. We revisit multiple design decisions of SuperGlue, the state of the art in sparse …