The internet of federated things (IoFT)

R Kontar, N Shi, X Yue, S Chung, E Byon… - IEEE …, 2021 - ieeexplore.ieee.org
The Internet of Things (IoT) is on the verge of a major paradigm shift. In the IoT system of the
future, IoFT, the “cloud” will be substituted by the “crowd” where model training is brought to …
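For context on the federated setting this survey describes, here is a minimal federated averaging sketch in which training happens on simulated edge clients and a server only averages their models; this is a generic illustration under assumed toy least-squares clients, not the specific algorithms surveyed in the paper.

```python
import numpy as np

def local_steps(w, data, lr=0.1, steps=5):
    """Run a few full-batch gradient steps on one client's local data (toy least squares)."""
    X, y = data
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # gradient of mean squared error
        w = w - lr * grad
    return w

def federated_averaging(clients, w, rounds=20):
    """Each round: clients train locally, the server averages the returned weights."""
    for _ in range(rounds):
        local_models = [local_steps(w.copy(), data) for data in clients]
        w = np.mean(local_models, axis=0)        # training stays with the "crowd" of devices
    return w

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
clients = []
for _ in range(5):                               # five simulated edge devices
    X = rng.normal(size=(20, 2))
    clients.append((X, X @ w_true + 0.1 * rng.normal(size=20)))
print(federated_averaging(clients, np.zeros(2)))
```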

Distributed learning systems with first-order methods

J Liu, C Zhang - Foundations and Trends® in Databases, 2020 - nowpublishers.com
Scalable and efficient distributed learning is one of the main driving forces behind the recent
rapid advancement of machine learning and artificial intelligence. One prominent feature of …

Adaptive weight decay for deep neural networks

K Nakamura, BW Hong - IEEE Access, 2019 - ieeexplore.ieee.org
Regularization in the optimization of deep neural networks is often critical to avoid
undesirable over-fitting, leading to better generalization of the model. One of the most popular …
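As background for this entry, the sketch below shows the standard fixed weight-decay term inside a plain SGD update; it is not the paper's adaptive scheme, and the objective and hyperparameters are illustrative assumptions.

```python
import numpy as np

def sgd_with_weight_decay(w, grad_fn, lr=0.01, weight_decay=1e-4, steps=500):
    """Plain SGD with a fixed L2 penalty: w <- w - lr * (grad + weight_decay * w)."""
    for _ in range(steps):
        g = grad_fn(w)
        w = w - lr * (g + weight_decay * w)   # the decay term shrinks weights toward zero
    return w

# toy quadratic objective f(w) = ||w - 3||^2
w = sgd_with_weight_decay(np.zeros(4), lambda w: 2 * (w - 3.0))
print(w)   # near 3, pulled very slightly toward zero by the decay term
```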

Non-monotone submodular maximization in exponentially fewer iterations

E Balkanski, A Breuer, Y Singer - Advances in Neural …, 2018 - proceedings.neurips.cc
In this paper we consider parallelization for applications whose objective can be expressed
as maximizing a non-monotone submodular function under a cardinality constraint. Our …
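To make the problem setting concrete, here is the classical sequential greedy baseline for submodular maximization under a cardinality constraint; it is not the paper's low-adaptivity parallel algorithm, and the coverage objective is an assumed toy example.

```python
def greedy_submodular(ground_set, f, k):
    """Classical greedy: repeatedly add the element with the largest marginal gain,
    stopping after k elements (the cardinality constraint)."""
    S = set()
    for _ in range(k):
        gains = {e: f(S | {e}) - f(S) for e in ground_set - S}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:          # for non-monotone f, adding can hurt; stop early
            break
        S.add(best)
    return S

# toy coverage objective: f(S) = number of items covered by the chosen sets
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 5}}
f = lambda S: len(set().union(*(sets[i] for i in S))) if S else 0
print(greedy_submodular(set(sets), f, k=2))
```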

Let's Make Block Coordinate Descent Converge Faster: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence

J Nutini, I Laradji, M Schmidt - arXiv preprint arXiv:1712.08859, 2017 - arxiv.org
Block coordinate descent (BCD) methods are widely used for large-scale numerical
optimization because of their cheap iteration costs, low memory requirements, amenability to …
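For reference, this is a minimal cyclic block coordinate descent sketch on a least-squares problem, exactly minimizing over one block of coordinates at a time; it illustrates the generic BCD iteration only, not the paper's greedy selection rules, and the problem data are assumed.

```python
import numpy as np

def block_coordinate_descent(A, b, block_size=2, sweeps=50):
    """Cyclic BCD for min_x ||Ax - b||^2: each step exactly minimizes over one block
    of coordinates while the other coordinates are held fixed."""
    n = A.shape[1]
    x = np.zeros(n)
    blocks = [np.arange(i, min(i + block_size, n)) for i in range(0, n, block_size)]
    for _ in range(sweeps):
        for idx in blocks:
            r = b - A @ x + A[:, idx] @ x[idx]        # residual with this block's contribution removed
            x[idx] = np.linalg.lstsq(A[:, idx], r, rcond=None)[0]
    return x

rng = np.random.default_rng(0)
A, x_true = rng.normal(size=(30, 6)), rng.normal(size=6)
print(block_coordinate_descent(A, A @ x_true))        # approximately recovers x_true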

Addressing budget allocation and revenue allocation in data market environments using an adaptive sampling algorithm

B Zhao, B Lyu, RC Fernandez… - … Conference on Machine …, 2023 - proceedings.mlr.press
High-quality machine learning models depend on access to high-quality training
data. When the data are not already available, obtaining them is tedious and costly. Data …

Fast and accurate stochastic gradient estimation

B Chen, Y Xu, A Shrivastava - Advances in Neural …, 2019 - proceedings.neurips.cc
Stochastic Gradient Descent, or SGD, is the most popular optimization algorithm for
large-scale problems. SGD estimates the gradient by uniform sampling with sample size …
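The snippet refers to the standard uniform-sampling gradient estimate; the sketch below shows that baseline minibatch SGD estimator on an assumed least-squares problem, not the paper's proposed estimator.

```python
import numpy as np

def minibatch_sgd(X, y, lr=0.05, batch_size=8, steps=500, seed=0):
    """SGD for least squares: each step estimates the full gradient from a
    uniformly sampled minibatch of the data."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        idx = rng.choice(len(y), size=batch_size, replace=False)  # uniform sampling
        Xb, yb = X[idx], y[idx]
        grad = 2 * Xb.T @ (Xb @ w - yb) / batch_size              # unbiased gradient estimate
        w -= lr * grad
    return w

rng = np.random.default_rng(1)
X, w_true = rng.normal(size=(200, 3)), np.array([0.5, -1.0, 2.0])
print(minibatch_sgd(X, X @ w_true + 0.05 * rng.normal(size=200)))
```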

Adam with bandit sampling for deep learning

R Liu, T Wu, B Mozafari - Advances in Neural Information …, 2020 - proceedings.neurips.cc
Adam is a widely used optimization method for training deep learning models. It computes
individual adaptive learning rates for different parameters. In this paper, we propose a …
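To make the "individual adaptive learning rates" concrete, here is the standard Adam update with per-parameter step sizes from first- and second-moment estimates; it is the plain Adam baseline on an assumed toy objective, not the bandit-sampling extension proposed in the paper.

```python
import numpy as np

def adam(grad_fn, w, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Standard Adam: moment estimates give each parameter its own effective step size."""
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g              # biased first-moment estimate
        v = beta2 * v + (1 - beta2) * g**2           # biased second-moment estimate
        m_hat = m / (1 - beta1**t)                   # bias correction
        v_hat = v / (1 - beta2**t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter adaptive step
    return w

print(adam(lambda w: 2 * (w - 5.0), np.zeros(3)))    # converges toward 5 on a toy quadratic
```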

Efficiency ordering of stochastic gradient descent

J Hu, V Doshi, DY Eun - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We consider the stochastic gradient descent (SGD) algorithm driven by a general stochastic
sequence, including iid noise and random walk on an arbitrary graph, among others; and …
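The contrast the abstract draws, between iid sampling and a random walk as the sequence driving which data point SGD visits, can be sketched as follows; the ring graph, step sizes, and least-squares model are illustrative assumptions, not the paper's efficiency-ordering analysis.

```python
import numpy as np

def sgd_with_index_sequence(X, y, index_iter, lr=0.01, steps=2000):
    """SGD where the data index at each step is produced by a driving sequence
    (iid sampling, or a random walk over a graph on the data points)."""
    w = np.zeros(X.shape[1])
    for _, i in zip(range(steps), index_iter):
        w -= lr * 2 * X[i] * (X[i] @ w - y[i])       # per-sample gradient step
    return w

def iid_indices(n, seed=0):
    rng = np.random.default_rng(seed)
    while True:
        yield rng.integers(n)

def random_walk_indices(neighbors, seed=0):
    """Simple random walk: the next index is a uniformly chosen neighbor of the current one."""
    rng = np.random.default_rng(seed)
    i = 0
    while True:
        yield i
        i = rng.choice(neighbors[i])

rng = np.random.default_rng(2)
X, w_true = rng.normal(size=(50, 2)), np.array([1.0, -1.0])
y = X @ w_true
ring = {i: [(i - 1) % 50, (i + 1) % 50] for i in range(50)}   # ring graph over the samples
print(sgd_with_index_sequence(X, y, iid_indices(50)))
print(sgd_with_index_sequence(X, y, random_walk_indices(ring)))
```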

Walk for learning: A random walk approach for federated learning from heterogeneous data

G Ayache, V Dassari… - IEEE Journal on Selected …, 2023 - ieeexplore.ieee.org
We consider the problem of a Parameter Server (PS) that wishes to learn a model that fits
data distributed on the nodes of a graph. We focus on Federated Learning (FL) as a …
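As a rough illustration of random-walk-based federated learning over a graph, the sketch below passes the model from node to node, with each visited node applying one gradient step on its local data before forwarding the model to a neighbor; the graph, data, and walk are assumed, and this is not the paper's adaptive walk design.

```python
import numpy as np

def random_walk_learning(node_data, neighbors, lr=0.02, hops=3000, seed=0):
    """The model hops along a random walk over the graph; each visited node applies
    one gradient step using only its local data, then forwards the model to a neighbor."""
    rng = np.random.default_rng(seed)
    w = np.zeros(node_data[0][0].shape[1])
    node = 0
    for _ in range(hops):
        X, y = node_data[node]
        w -= lr * 2 * X.T @ (X @ w - y) / len(y)     # local gradient step on this node's data
        node = rng.choice(neighbors[node])           # forward the model to a random neighbor
    return w

rng = np.random.default_rng(3)
w_true = np.array([2.0, -1.0])
node_data = []
for shift in range(4):                               # four nodes with heterogeneous data
    X = rng.normal(loc=shift, size=(25, 2))
    node_data.append((X, X @ w_true))
ring = {i: [(i - 1) % 4, (i + 1) % 4] for i in range(4)}
print(random_walk_learning(node_data, ring))
```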