Communication-efficient federated learning via knowledge distillation

C Wu, F Wu, L Lyu, Y Huang, X Xie - Nature Communications, 2022 - nature.com
Federated learning is a privacy-preserving machine learning technique to train intelligent
models from decentralized data, which enables exploiting private data by communicating …
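The snippet describes the basic federated setting in which clients exchange model updates rather than raw data. Below is a minimal, hypothetical sketch of one such communication round using plain FedAvg-style weight averaging on a linear model; it illustrates the generic FL loop only, not the knowledge-distillation protocol proposed in this paper.

```python
import numpy as np

def client_update(weights, data, lr=0.1):
    """Hypothetical local step: one gradient update on private data (linear model, squared loss)."""
    X, y = data
    grad = X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def server_round(global_w, client_datasets):
    """One communication round: clients train locally, the server averages the updates."""
    local = [client_update(global_w.copy(), d) for d in client_datasets]
    return np.mean(local, axis=0)  # only model updates cross the network, never raw data

rng = np.random.default_rng(0)
clients = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]
w = np.zeros(5)
for _ in range(10):
    w = server_round(w, clients)
```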

Dense: Data-free one-shot federated learning

J Zhang, C Chen, B Li, L Lyu, S Wu… - Advances in …, 2022 - proceedings.neurips.cc
One-shot Federated Learning (FL) has recently emerged as a promising approach,
which allows the central server to learn a model in a single communication round. Despite …
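In one-shot FL each client uploads its locally trained model exactly once. The sketch below is a hedged stand-in for the server side: it simply ensembles the uploaded models' predictions, whereas DENSE additionally distills such an ensemble into a single global model using synthetically generated data; the model shapes and the ensembling rule here are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical client models: one logistic-regression weight matrix each,
# trained locally and uploaded in a single communication round.
rng = np.random.default_rng(1)
client_models = [rng.normal(size=(5, 3)) for _ in range(4)]   # 5 features, 3 classes

def ensemble_predict(models, X):
    """Server-side ensemble of the uploaded models (a stand-in for distilling
    this ensemble into one global model with synthetic data, as DENSE does)."""
    probs = np.mean([softmax(X @ W) for W in models], axis=0)
    return probs.argmax(axis=1)

X_query = rng.normal(size=(8, 5))
print(ensemble_predict(client_models, X_query))
```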

Fedpara: Low-rank hadamard product for communication-efficient federated learning

N Hyeon-Woo, M Ye-Bin, TH Oh - arXiv preprint arXiv:2108.06098, 2021 - arxiv.org
In this work, we propose a communication-efficient parameterization, FedPara, for federated
learning (FL) to overcome the burdens on frequent model uploads and downloads. Our …
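The title's low-rank Hadamard product can be sketched directly: the effective weight is the element-wise product of two rank-r outer-product factorizations, so its rank can reach r*r while only the thin factors need to be uploaded and downloaded. The dimensions and parameter counts below are illustrative; FedPara's exact construction and training details may differ.

```python
import numpy as np

def fedpara_weight(X1, Y1, X2, Y2):
    """Low-rank Hadamard-product parameterization: element-wise product of two
    rank-r factorizations, giving effective rank up to r*r from thin factors."""
    return (X1 @ Y1.T) * (X2 @ Y2.T)

m, n, r = 64, 32, 4
rng = np.random.default_rng(2)
X1, X2 = rng.normal(size=(m, r)), rng.normal(size=(m, r))
Y1, Y2 = rng.normal(size=(n, r)), rng.normal(size=(n, r))

W = fedpara_weight(X1, Y1, X2, Y2)
full_params = m * n                      # parameters of a dense layer
factor_params = 2 * r * (m + n)          # parameters actually communicated
print(np.linalg.matrix_rank(W), full_params, factor_params)
```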

What kinds of functions do deep neural networks learn? Insights from variational spline theory

R Parhi, RD Nowak - SIAM Journal on Mathematics of Data Science, 2022 - SIAM
We develop a variational framework to understand the properties of functions learned by
fitting deep neural networks with rectified linear unit (ReLU) activations to data. We propose …

Llm360: Towards fully transparent open-source llms

Z Liu, A Qiao, W Neiswanger, H Wang, B Tan… - arXiv preprint arXiv …, 2023 - arxiv.org
The recent surge in open-source Large Language Models (LLMs), such as LLaMA, Falcon,
and Mistral, provides diverse options for AI practitioners and researchers. However, most …

Deep learning meets sparse regularization: A signal processing perspective

R Parhi, RD Nowak - IEEE Signal Processing Magazine, 2023 - ieeexplore.ieee.org
Deep learning (DL) has been wildly successful in practice, and most of the state-of-the-art
machine learning methods are based on neural networks (NNs). Lacking, however, is a …

Low-rank lottery tickets: finding efficient low-rank neural networks via matrix differential equations

S Schotthöfer, E Zangrando, J Kusch… - Advances in …, 2022 - proceedings.neurips.cc
Neural networks have achieved tremendous success in a large variety of applications.
However, their memory footprint and computational demand can render them impractical in …
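The paper trains the factors of a low-rank factorization directly through a matrix differential-equation (gradient-flow) formulation. The sketch below shows only the simplest related building block, assumed here for illustration: truncating a dense weight matrix to its best rank-r approximation via SVD and storing two thin factors instead.

```python
import numpy as np

def truncate_rank(W, r):
    """Project a dense weight matrix onto its best rank-r approximation via SVD.
    (The paper trains low-rank factors directly via a matrix ODE; this SVD
    truncation is only the simplest related ingredient.)"""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :r] * s[:r], Vt[:r, :]          # store two thin factors instead of W

rng = np.random.default_rng(3)
W = rng.normal(size=(256, 128))
A, B = truncate_rank(W, r=16)
print(W.size, A.size + B.size)                   # 32768 vs 6144 stored entries
print(np.linalg.norm(W - A @ B) / np.linalg.norm(W))
```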

Optimus-CC: Efficient large NLP model training with 3D parallelism aware communication compression

J Song, J Yim, J Jung, H Jang, HJ Kim, Y Kim… - Proceedings of the 28th …, 2023 - dl.acm.org
In training of modern large natural language processing (NLP) models, it has become a
common practice to split models using 3D parallelism to multiple GPUs. Such technique …
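Optimus-CC compresses the communication generated by 3D-parallel training. As a loosely related, heavily hedged illustration, the sketch below shows generic top-k gradient sparsification, one common way to shrink inter-GPU traffic; it is not the paper's specific compression scheme, and the tensor sizes and k are arbitrary assumptions.

```python
import numpy as np

def topk_compress(grad, k):
    """Generic top-k gradient sparsification: transmit only the k largest-magnitude
    entries and their indices (not Optimus-CC's 3D-parallelism-aware scheme)."""
    flat = grad.ravel()
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    return idx, flat[idx], grad.shape

def topk_decompress(idx, vals, shape):
    out = np.zeros(int(np.prod(shape)))
    out[idx] = vals
    return out.reshape(shape)

rng = np.random.default_rng(4)
g = rng.normal(size=(1024, 1024))
idx, vals, shape = topk_compress(g, k=10_000)
g_hat = topk_decompress(idx, vals, shape)
print(vals.size / g.size)     # fraction of entries actually communicated
```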

Layer-wise adaptive model aggregation for scalable federated learning

S Lee, T Zhang, AS Avestimehr - … of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org
In Federated Learning (FL), a common approach for aggregating local solutions
across clients is periodic full model averaging. It is, however, known that different layers of …
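Since different layers need not be synchronized equally often, per-layer aggregation schedules can cut communication relative to full-model averaging. The sketch below is a minimal illustration with a fixed, hypothetical per-layer sync period; the paper's adaptive criterion for choosing these periods is not reproduced here.

```python
import numpy as np

def aggregate_layerwise(global_model, client_models, round_idx, sync_every):
    """Average each layer across clients only when that layer's sync period is due;
    layers synced less often are neither uploaded nor re-broadcast this round."""
    new_model = {}
    for name, g_w in global_model.items():
        if round_idx % sync_every[name] == 0:
            new_model[name] = np.mean([c[name] for c in client_models], axis=0)
        else:
            new_model[name] = g_w          # keep the stale global copy
    return new_model

rng = np.random.default_rng(5)
layers = {"conv1": (3, 3, 16), "fc": (16, 10)}
global_model = {k: rng.normal(size=s) for k, s in layers.items()}
clients = [{k: rng.normal(size=s) for k, s in layers.items()} for _ in range(4)]
sync_every = {"conv1": 1, "fc": 2}         # hypothetical schedule: sync the head half as often
global_model = aggregate_layerwise(global_model, clients, round_idx=1, sync_every=sync_every)
```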

Fedhm: Efficient federated learning for heterogeneous models via low-rank factorization

D Yao, W Pan, MJ O'Neill, Y Dai, Y Wan, H Jin… - arXiv preprint arXiv …, 2021 - arxiv.org
One underlying assumption of recent federated learning (FL) paradigms is that all local
models usually share the same network architecture and size, which becomes impractical …
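Low-rank factorization lets clients with different capacities train differently sized versions of the same layer. The sketch below is a hedged illustration: the server factorizes a global weight at a client-specific rank via SVD, and later recombines the returned factors by simple averaging; the ranks, shapes, and aggregation rule are assumptions, not FedHM's exact procedure.

```python
import numpy as np

def factorize_for_client(W, rank):
    """Hand a client a rank-`rank` factorization of the global layer, sized to its budget."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank] * s[:rank], Vt[:rank, :]

rng = np.random.default_rng(6)
W_global = rng.normal(size=(128, 64))
client_ranks = [8, 16, 32]                         # heterogeneous capacity budgets
client_factors = [factorize_for_client(W_global, r) for r in client_ranks]

# After local training, each client returns its (A, B) factors; the server can
# reconstruct full-size layers and average them (one simple, hypothetical aggregation).
W_new = np.mean([A @ B for A, B in client_factors], axis=0)
print([A.size + B.size for A, B in client_factors], W_global.size)
```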