GaLore: Memory-efficient LLM training by gradient low-rank projection

J Zhao, Z Zhang, B Chen, Z Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Gradient descent with early stopping is provably robust to label noise for overparameterized neural networks

M Li, M Soltanolkotabi, S Oymak - … conference on artificial …, 2020 - proceedings.mlr.press
Modern neural networks are typically trained in an over-parameterized regime where the
parameters of the model far exceed the size of the training data. Such neural networks in …

Accelerating dataset distillation via model augmentation

L Zhang, J Zhang, B Lei, S Mukherjee… - Proceedings of the …, 2023 - openaccess.thecvf.com
Dataset Distillation (DD), a newly emerging field, aims at generating much smaller but
efficient synthetic training datasets from large ones. Existing DD methods based on gradient …

An investigation into neural net optimization via Hessian eigenvalue density

B Ghorbani, S Krishnan, Y Xiao - International conference on machine learning, 2019 - proceedings.mlr.press

Understanding gradient clipping in private SGD: A geometric perspective

X Chen, SZ Wu, M Hong - Advances in Neural Information …, 2020 - proceedings.neurips.cc
Deep learning models are increasingly popular in many machine learning applications
where the training data may contain sensitive information. To provide formal and rigorous …