On hyperparameter optimization of machine learning algorithms: Theory and practice
Machine learning algorithms have been widely used in various applications and
areas. To fit a machine learning model into different problems, its hyper-parameters must be …
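The snippet above concerns tuning hyper-parameters per problem. As a minimal illustration of one standard approach, here is a random-search sketch (the objective and search space are hypothetical placeholders, not from the paper):

```python
import random

def random_search(objective, space, n_trials=50, seed=0):
    """Minimize `objective` by sampling hyper-parameters uniformly from `space`.

    space: dict mapping hyper-parameter name -> (low, high) range.
    Returns (best_score, best_params).
    """
    rng = random.Random(seed)
    best_score, best_params = float("inf"), None
    for _ in range(n_trials):
        params = {k: rng.uniform(lo, hi) for k, (lo, hi) in space.items()}
        score = objective(params)
        if score < best_score:
            best_score, best_params = score, params
    return best_score, best_params

# Toy objective: pretend validation loss is minimized at learning rate 0.1.
score, params = random_search(lambda p: (p["lr"] - 0.1) ** 2,
                              {"lr": (1e-4, 1.0)})
```

Random search is only a baseline; the surveyed methods (Bayesian optimization, bandit-based search, etc.) replace the uniform sampling step with something smarter.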
Artificial neural networks-based machine learning for wireless networks: A tutorial
In order to effectively provide ultra-reliable, low-latency communications and pervasive
connectivity for Internet of Things (IoT) devices, next-generation wireless networks can …
Training compute-optimal large language models
We investigate the optimal model size and number of tokens for training a transformer
language model under a given compute budget. We find that current large language models …
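The headline result of this work (the "Chinchilla" analysis) is that model size and token count should be scaled roughly in equal proportion with compute; the commonly cited rule of thumb distilled from it is about 20 training tokens per parameter, with training compute approximated as C ≈ 6ND. A sketch under those assumptions:

```python
def train_flops(n_params, n_tokens):
    # Standard approximation: ~6 FLOPs per parameter per training token.
    return 6 * n_params * n_tokens

def compute_optimal_tokens(n_params):
    # Rule of thumb from the compute-optimal analysis:
    # roughly 20 tokens per parameter at the optimum.
    return 20 * n_params

# Chinchilla itself: 70B parameters trained on 1.4T tokens.
n = 70e9
d = compute_optimal_tokens(n)
```

The exact exponents in the paper's fitted scaling laws differ slightly from this rule of thumb; the 20:1 ratio is the widely quoted approximation.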
Transformers as statisticians: Provable in-context learning with in-context algorithm selection
Neural sequence models based on the transformer architecture have demonstrated
remarkable in-context learning (ICL) abilities, where they can perform new tasks …
Adaptive federated learning in resource constrained edge computing systems
Emerging technologies and applications including Internet of Things, social networking, and
crowd-sourcing generate large amounts of data at the network edge. Machine learning …
[BOOK] High-dimensional probability: An introduction with applications in data science
R Vershynin - 2018 - books.google.com
High-dimensional probability offers insight into the behavior of random vectors, random
matrices, random subspaces, and objects used to quantify uncertainty in high dimensions …
Personalized federated learning with moreau envelopes
Federated learning (FL) is a decentralized and privacy-preserving machine learning
technique in which a group of clients collaborate with a server to learn a global model …
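The Moreau envelope in the title can be written out explicitly. In the pFedMe-style formulation (notation here is mine), each client $i$ smooths its local loss $F_i$, and the server minimizes the average of the smoothed losses:

```latex
% Per-client Moreau envelope of the local loss F_i:
f_i(w) = \min_{\theta} \Big\{ F_i(\theta) + \frac{\lambda}{2}\,\|\theta - w\|^2 \Big\}
% Global problem solved by the server:
\min_{w} \; \frac{1}{N} \sum_{i=1}^{N} f_i(w)
```

The inner minimizer serves as client $i$'s personalized model, $w$ as the shared global model, and $\lambda$ controls how far personalization may drift from the global model.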
An empirical analysis of compute-optimal large language model training
We investigate the optimal model size and number of tokens for training a transformer
language model under a given compute budget. We find that current large language models …
Byzantine-robust distributed learning: Towards optimal statistical rates
In this paper, we develop distributed optimization algorithms that are provably robust against
Byzantine failures—arbitrary and potentially adversarial behavior, in distributed computing …
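A central primitive in this line of work is robust gradient aggregation, such as the coordinate-wise median (the paper also analyzes trimmed means). A minimal sketch with hypothetical worker gradients:

```python
import numpy as np

def coordinate_median(grads):
    """Aggregate worker gradients by taking the median of each coordinate.

    A single Byzantine worker can move the mean arbitrarily far, but the
    coordinate-wise median stays within the range of the honest majority.
    """
    return np.median(np.stack(grads), axis=0)

# Four honest workers and one Byzantine worker sending a huge gradient.
honest = [np.array([1.0, 2.0]) for _ in range(4)]
byzantine = np.array([1e9, -1e9])
agg = coordinate_median(honest + [byzantine])  # -> [1.0, 2.0]
```

Replacing the mean with the median is what buys robustness; the paper's contribution is showing such rules achieve order-optimal statistical rates.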
QSGD: Communication-efficient SGD via gradient quantization and encoding
Parallel implementations of stochastic gradient descent (SGD) have received significant
research attention, thanks to its excellent scalability properties. A fundamental barrier when …
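QSGD's core primitive is an unbiased stochastic quantizer that rounds each gradient coordinate onto a grid of s levels of the gradient's norm. A minimal sketch (variable names are mine, and the paper's Elias coding step is omitted):

```python
import numpy as np

def qsgd_quantize(v, s, rng=np.random.default_rng(0)):
    """Stochastically quantize v onto multiples of ||v||/s per coordinate.

    Each |v_i| / ||v|| is rounded up or down to a multiple of 1/s with
    probabilities chosen so the quantizer is unbiased: E[Q(v)] = v.
    """
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return np.zeros_like(v)
    scaled = np.abs(v) / norm * s                    # in [0, s]
    lower = np.floor(scaled)
    lower += rng.random(v.shape) < (scaled - lower)  # round up w.p. frac part
    return np.sign(v) * norm * lower / s

g = np.array([0.3, -0.4, 0.5])
q = qsgd_quantize(g, s=4)
```

Each worker then transmits only the norm, the signs, and the small integer levels, which is what makes the scheme communication-efficient.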