On efficient training of large-scale deep learning models: A literature review

L Shen, Y Sun, Z Yu, L Ding, X Tian, D Tao - arXiv preprint arXiv …, 2023 - arxiv.org
The field of deep learning has witnessed significant progress, particularly in computer vision
(CV), natural language processing (NLP), and speech. The use of large-scale models …

Large-scale methods for distributionally robust optimization

D Levy, Y Carmon, JC Duchi… - Advances in Neural …, 2020 - proceedings.neurips.cc
We propose and analyze algorithms for distributionally robust optimization of convex losses
with conditional value at risk (CVaR) and $\chi^2$ divergence uncertainty sets. We prove …

Prioritized training on points that are learnable, worth learning, and not yet learnt

S Mindermann, JM Brauner… - International …, 2022 - proceedings.mlr.press
Training on web-scale data can take months. But much computation and time is wasted on
redundant and noisy points that are already learnt or not learnable. To accelerate training …

No train no gain: Revisiting efficient training algorithms for transformer-based language models

J Kaddour, O Key, P Nawrot… - Advances in Neural …, 2024 - proceedings.neurips.cc
The computation necessary for training Transformer-based language models has
skyrocketed in recent years. This trend has motivated research on efficient training …

ACPL: Anti-curriculum pseudo-labelling for semi-supervised medical image classification

F Liu, Y Tian, Y Chen, Y Liu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Effective semi-supervised learning (SSL) in medical image analysis (MIA) must address two
challenges: 1) work effectively on both multi-class (e.g., lesion classification) and multi-label …

When do curricula work?

X Wu, E Dyer, B Neyshabur - arXiv preprint arXiv:2012.03107, 2020 - arxiv.org
Inspired by human learning, researchers have proposed ordering examples during training
based on their difficulty. Both curriculum learning, exposing a network to easier examples …

Chaos as an interpretable benchmark for forecasting and data-driven modelling

W Gilpin - arXiv preprint arXiv:2110.05266, 2021 - arxiv.org
The striking fractal geometry of strange attractors underscores the generative nature of
chaos: like probability distributions, chaotic systems can be repeatedly measured to produce …

On Efficient Training of Large-Scale Deep Learning Models

L Shen, Y Sun, Z Yu, L Ding, X Tian, D Tao - ACM Computing Surveys, 2024 - dl.acm.org
The field of deep learning has witnessed significant progress in recent times, particularly in
areas such as computer vision (CV), natural language processing (NLP), and speech. The …

Rank-based decomposable losses in machine learning: A survey

S Hu, X Wang, S Lyu - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Recent works have revealed an essential paradigm in designing loss functions that
differentiate individual losses versus aggregate losses. The individual loss measures the …

Calibrated selective classification

A Fisch, T Jaakkola, R Barzilay - arXiv preprint arXiv:2208.12084, 2022 - arxiv.org
Selective classification allows models to abstain from making predictions (e.g., say "I don't
know") when in doubt in order to obtain better effective accuracy. While typical selective …