Stability of stochastic gradient descent on nonsmooth convex losses

R Bassily, V Feldman, C Guzmán… - Advances in Neural …, 2020 - proceedings.neurips.cc
Uniform stability is a notion of algorithmic stability that bounds the worst-case change in the
model output by the algorithm when a single data point in the dataset is replaced. An …
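
For orientation, a minimal sketch of the definition this abstract paraphrases (Bousquet–Elisseeff style; the notation below is ours, not necessarily the paper's):

```latex
% \varepsilon-uniform stability: a randomized algorithm A is \varepsilon-uniformly
% stable if, for all datasets S, S' of size n that differ in exactly one example
% and every test point z,
\[
  \sup_{z}\,\bigl|\,\mathbb{E}_{A}[\ell(A(S);z)] - \mathbb{E}_{A}[\ell(A(S');z)]\,\bigr|
  \;\le\; \varepsilon ,
\]
% in which case the expected generalization gap of A is also at most \varepsilon.
```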

On Efficient Training of Large-Scale Deep Learning Models

L Shen, Y Sun, Z Yu, L Ding, X Tian, D Tao - ACM Computing Surveys, 2024 - dl.acm.org
The field of deep learning has witnessed significant progress in recent times, particularly in
areas such as computer vision (CV), natural language processing (NLP), and speech. The …

On convergence of FedProx: Local dissimilarity invariant bounds, non-smoothness and beyond

X Yuan, P Li - Advances in Neural Information Processing …, 2022 - proceedings.neurips.cc
The FedProx algorithm is a simple yet powerful distributed proximal point optimization
method widely used for federated learning (FL) over heterogeneous data. Despite its …
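
As a sketch of the mechanism (notation ours): in round t, each client k approximately minimizes its local loss plus a proximal term anchored at the current global model.

```latex
% FedProx local subproblem at round t for client k (\mu > 0 weights the prox term):
\[
  w_k^{t+1} \;\approx\; \arg\min_{w}\; F_k(w) + \tfrac{\mu}{2}\lVert w - w^{t}\rVert^{2},
\]
% the server then averages the returned w_k^{t+1} into the next global model w^{t+1}.
```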

On the algorithmic stability of adversarial training

Y Xing, Q Song, G Cheng - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Adversarial training is a popular tool to remedy the vulnerability of deep learning models
against adversarial attacks, and there is rich theoretical literature on the training loss of …
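
For context, the standard min-max formulation of adversarial training (notation ours); the inner maximization is typically approximated by a few projected-gradient (PGD) steps.

```latex
% Adversarial training: fit parameters \theta against worst-case
% \ell_p-bounded perturbations \delta_i of radius \epsilon:
\[
  \min_{\theta}\; \frac{1}{n}\sum_{i=1}^{n}
    \max_{\lVert \delta_i \rVert_p \le \epsilon}
    \ell\bigl(f_\theta(x_i + \delta_i),\, y_i\bigr).
\]
```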

Information-theoretic generalization bounds for stochastic gradient descent

G Neu, GK Dziugaite, M Haghifam… - … on Learning Theory, 2021 - proceedings.mlr.press
We study the generalization properties of the popular stochastic optimization method known
as stochastic gradient descent (SGD) for optimizing general non-convex loss functions. Our …
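
A prototypical mutual-information bound (Xu–Raginsky style) that this line of work tightens for SGD; the symbols below are our notation, not the paper's refined bound:

```latex
% For a \sigma-sub-Gaussian loss, training set S of n samples, and output weights W,
\[
  \bigl|\,\mathbb{E}[\mathrm{gen}(W,S)]\,\bigr|
  \;\le\; \sqrt{\tfrac{2\sigma^{2}}{n}\, I(W;S)} ,
\]
% where I(W;S) is the mutual information between the training data and the weights.
```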

Topology-aware generalization of decentralized SGD

T Zhu, F He, L Zhang, Z Niu… - … on Machine Learning, 2022 - proceedings.mlr.press
This paper studies the algorithmic stability and generalizability of decentralized stochastic
gradient descent (D-SGD). We prove that the consensus model learned by D-SGD is …
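
As a sketch of the update being analyzed (notation ours): each node averages with its neighbors through a doubly stochastic mixing matrix and then takes a local stochastic gradient step.

```latex
% One D-SGD step at node i, with mixing matrix W and step size \eta:
\[
  x_i^{t+1} \;=\; \sum_{j=1}^{m} W_{ij}\, x_j^{t} \;-\; \eta\, \nabla f_i(x_i^{t};\xi_i^{t}),
\]
% so the communication topology enters the dynamics only through the spectrum of W.
```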

On the optimization and generalization of multi-head attention

P Deora, R Ghaderi, H Taheri… - arXiv preprint arXiv …, 2023 - arxiv.org
The training and generalization dynamics of the Transformer's core mechanism, namely the
Attention mechanism, remain under-explored. Moreover, existing analyses primarily focus on …
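
For reference, the standard multi-head attention map whose training dynamics are studied here (notation ours):

```latex
% Input X in R^{n \times d}, per-head width d_h, projections W_Q^h, W_K^h, W_V^h, output W_O:
\[
  \mathrm{head}_h(X)
  = \mathrm{softmax}\!\Bigl(\tfrac{X W_Q^{h} (X W_K^{h})^{\top}}{\sqrt{d_h}}\Bigr) X W_V^{h},
  \qquad
  \mathrm{MHA}(X) = \bigl[\mathrm{head}_1(X),\ldots,\mathrm{head}_H(X)\bigr] W_O .
\]
```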

Algorithmic stability of heavy-tailed SGD with general loss functions

A Raj, L Zhu, M Gurbuzbalaban… - … on Machine Learning, 2023 - proceedings.mlr.press
Heavy-tail phenomena in stochastic gradient descent (SGD) have been reported in several
empirical studies. Experimental evidence in previous works suggests a strong interplay …
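
A common way this literature models the reported heavy tails (our sketch, not necessarily the paper's exact assumption) is gradient noise with power-law tails:

```latex
% Gradient noise with tail index \alpha: heavier tails as \alpha decreases,
\[
  \mathbb{P}\bigl(\lVert \nabla f(w_t;\xi_t) - \nabla F(w_t) \rVert > u\bigr)
  \;\sim\; u^{-\alpha}, \qquad \alpha \in (1,2],
\]
% so the noise variance may be infinite and bounds are stated in terms of \alpha.
```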

Stability-based generalization analysis of the asynchronous decentralized SGD

X Deng, T Sun, S Li, D Li - Proceedings of the AAAI Conference on …, 2023 - ojs.aaai.org
The generalization ability often determines the success of machine learning algorithms in
practice. Therefore, it is of great theoretical and practical importance to understand and …

Three-way trade-off in multi-objective learning: Optimization, generalization and conflict-avoidance

L Chen, H Fernando, Y Ying… - Advances in Neural …, 2024 - proceedings.neurips.cc
Multi-objective learning (MOL) often arises in emerging machine learning problems when
multiple learning criteria or tasks need to be addressed. Recent works have developed …
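
As a sketch of the setting (notation ours): MOL minimizes several objectives jointly, and conflict-avoidant methods follow a common-descent direction, e.g. an MGDA-style weighting over the simplex.

```latex
% M objectives; \Delta^M is the probability simplex over objective weights:
\[
  \min_{w}\;\bigl(f_1(w),\ldots,f_M(w)\bigr), \qquad
  d_t = -\sum_{m=1}^{M}\lambda_m^{\star}\nabla f_m(w_t),
\]
\[
  \lambda^{\star} \in \arg\min_{\lambda\in\Delta^{M}}
    \Bigl\lVert \sum_{m=1}^{M}\lambda_m \nabla f_m(w_t)\Bigr\rVert^{2}.
\]
```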