Spanning training progress: Temporal dual-depth scoring (TDDS) for enhanced dataset pruning
Dataset pruning aims to construct a coreset capable of achieving performance comparable
to the original full dataset. Most existing dataset pruning methods rely on snapshot-based …
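To make the contrast with snapshot-based scoring concrete, here is a minimal sketch that aggregates per-sample importance scores logged at several checkpoints across training and keeps the highest-scoring samples. The plain mean, the score matrix, and the function name are illustrative assumptions, not the paper's actual TDDS weighting.

```python
import numpy as np

def prune_by_temporal_scores(score_snapshots: np.ndarray, keep_fraction: float) -> np.ndarray:
    """Select a coreset by aggregating per-sample importance scores
    recorded at several points during training, instead of relying on
    a single snapshot.

    score_snapshots: shape (num_checkpoints, num_samples), e.g.
        per-sample loss changes logged at each checkpoint.
    Returns the indices of the kept samples.
    """
    # Aggregate across the training trajectory (a plain mean here;
    # TDDS itself uses a more elaborate dual-depth weighting).
    aggregated = score_snapshots.mean(axis=0)
    k = int(keep_fraction * score_snapshots.shape[1])
    # Keep the samples with the highest aggregated importance.
    return np.argsort(aggregated)[-k:]

# Toy usage: 5 checkpoints over 1000 samples, keep 10%.
rng = np.random.default_rng(0)
snapshots = rng.random((5, 1000))
coreset_idx = prune_by_temporal_scores(snapshots, keep_fraction=0.1)
print(coreset_idx.shape)  # (100,)
```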
Selectivity drives productivity: efficient dataset pruning for enhanced transfer learning
Massive data is often considered essential for deep learning applications, but it also incurs
significant computational and infrastructural costs. Therefore, dataset pruning (DP) has …
Data distillation can be like vodka: Distilling more times for better quality
Dataset distillation aims to minimize the time and memory needed for training deep networks
on large datasets, by creating a small set of synthetic images that has a similar …
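For readers unfamiliar with the underlying mechanics, the sketch below shows dataset distillation in its simplest gradient-matching form: synthetic images are optimized so the gradient they induce matches the gradient from real data. This is a generic illustration, not this paper's repeated-distillation scheme; the tiny linear model, fixed parameters, and one-image-per-class setup are assumptions made for brevity.

```python
import torch

torch.manual_seed(0)
real_x, real_y = torch.randn(256, 32), torch.randint(0, 10, (256,))
syn_x = torch.randn(10, 32, requires_grad=True)   # one synthetic "image" per class
syn_y = torch.arange(10)
model = torch.nn.Linear(32, 10)
loss_fn = torch.nn.CrossEntropyLoss()
opt = torch.optim.Adam([syn_x], lr=0.1)

for step in range(100):
    # Gradient of the training loss on real data ...
    g_real = torch.autograd.grad(loss_fn(model(real_x), real_y), model.parameters())
    # ... and on the synthetic set (kept in the graph so we can backprop into syn_x).
    g_syn = torch.autograd.grad(loss_fn(model(syn_x), syn_y),
                                model.parameters(), create_graph=True)
    # Move the synthetic images so the two gradients match.
    match = sum(((a - b.detach()) ** 2).sum() for a, b in zip(g_syn, g_real))
    opt.zero_grad(); match.backward(); opt.step()
```

In full distillation pipelines the model is re-initialized and updated between matching steps; it stays fixed here only to keep the sketch short.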
A framework for measuring the training efficiency of a neural architecture
Measuring efficiency in neural network system development is an open research problem.
This paper presents an experimental framework to measure the training efficiency of a …
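The snippet cuts off before the framework itself, so purely as an illustration of one common efficiency measure, the sketch below times how long training takes to first reach a target validation accuracy. All names and the stand-in callables are hypothetical.

```python
import time

def time_to_accuracy(train_one_epoch, evaluate, target_acc: float, max_epochs: int = 50):
    """Measure a simple training-efficiency metric: wall-clock seconds
    until the model first reaches `target_acc` validation accuracy."""
    start = time.perf_counter()
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        acc = evaluate()
        if acc >= target_acc:
            return epoch, time.perf_counter() - start
    return None, time.perf_counter() - start  # target never reached

# Toy usage with stand-in callables (replace with real training code).
acc_state = {"acc": 0.0}
epochs, seconds = time_to_accuracy(
    train_one_epoch=lambda: acc_state.__setitem__("acc", acc_state["acc"] + 0.1),
    evaluate=lambda: acc_state["acc"],
    target_acc=0.5,
)
print(epochs, round(seconds, 4))
```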
Active Data Collection and Management for Real-World Continual Learning via Pretrained Oracle
Incremental Learning (IL) deals with learning from continuous streams of data while
minimising catastrophic forgetting. This field of Machine Learning (ML) research has …
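The abstract truncates before describing the method, so the following is only a guess at the title's idea: a pretrained oracle's prediction confidence decides which stream samples are worth storing for continual training. The threshold, model, and selection rule are all assumptions.

```python
import torch

def select_with_oracle(oracle: torch.nn.Module, stream_batch: torch.Tensor,
                       confidence_threshold: float = 0.9) -> torch.Tensor:
    """Keep only the stream samples the pretrained oracle is *unsure*
    about, on the assumption that confidently-handled samples add
    little to a continually-trained model."""
    with torch.no_grad():
        probs = oracle(stream_batch).softmax(dim=-1)
    confidence = probs.max(dim=-1).values
    return stream_batch[confidence < confidence_threshold]

# Toy usage: a random linear oracle over a batch of 64 feature vectors.
oracle = torch.nn.Linear(16, 5)
kept = select_with_oracle(oracle, torch.randn(64, 16))
print(kept.shape)
```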
KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training
This paper proposes a method for hiding the least-important samples during the training of
deep neural networks to increase efficiency, i.e., to reduce the cost of training. Using …
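As a rough sketch of the stated idea (hiding the least-important samples), the code below ranks examples by their most recent loss and skips the easiest fraction for the next epoch. The static fraction and plain loss-based ranking are simplifications; KAKURENBO's actual rule is adaptive.

```python
import numpy as np

def visible_indices(last_losses: np.ndarray, hide_fraction: float) -> np.ndarray:
    """Return the indices of samples to *keep* this epoch, hiding the
    `hide_fraction` of samples with the smallest recent loss (the ones
    the model already handles well)."""
    n_hide = int(hide_fraction * len(last_losses))
    order = np.argsort(last_losses)   # ascending: easiest samples first
    return np.sort(order[n_hide:])    # train on everything else

losses = np.random.default_rng(1).random(10_000)
idx = visible_indices(losses, hide_fraction=0.3)
print(len(idx))  # 7000 samples actually used this epoch
```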
Optimizing Data Acquisition to Enhance Machine Learning Performance
In this paper, we study how to acquire labeled data points from a large data pool to enrich a
training set for enhancing supervised machine learning (ML) performance. The state-of-the …
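A standard instance of this acquisition problem is uncertainty sampling, sketched below: score every unlabeled pool point by predictive entropy and acquire the most uncertain ones for labeling. This is a generic baseline, not necessarily the strategy the paper proposes.

```python
import torch

def acquire_by_entropy(model: torch.nn.Module, pool: torch.Tensor, budget: int) -> torch.Tensor:
    """Pick `budget` pool points whose predictions are most uncertain
    (highest entropy); these are the candidates to label and add to
    the training set."""
    with torch.no_grad():
        probs = model(pool).softmax(dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    return entropy.topk(budget).indices

# Toy usage: a random linear classifier scoring a pool of 500 points.
model = torch.nn.Linear(8, 3)
picked = acquire_by_entropy(model, torch.randn(500, 8), budget=20)
print(picked.shape)  # torch.Size([20])
```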
Data optimization in deep learning: A survey
O Wu, R Yao - IEEE Transactions on Knowledge and Data …, 2025 - ieeexplore.ieee.org
Large-scale, high-quality data are considered an essential factor for the successful
application of many deep learning techniques. Meanwhile, numerous real-world deep …
CADC: Encoding user-item interactions for compressing recommendation model training data
HE Zarch, A Alshabanah, C Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep learning recommendation models (DLRMs) are at the heart of the current e-commerce
industry. However, the amount of training data used to train these large models is growing …
Data-Efficient Contrastive Language-Image Pretraining: Prioritizing Data Quality over Quantity
Contrastive Language-Image Pre-training (CLIP) on large-scale image-caption
datasets learns representations that can achieve remarkable zero-shot generalization …
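A common quality-over-quantity heuristic, shown below as a hedged illustration rather than this paper's criterion, scores each image-caption pair by the cosine alignment of its embeddings and keeps only the best-aligned fraction. The random embeddings stand in for outputs of a pretrained encoder.

```python
import torch

def keep_high_quality_pairs(img_emb: torch.Tensor, txt_emb: torch.Tensor,
                            keep_fraction: float) -> torch.Tensor:
    """Rank image-caption pairs by embedding cosine similarity and
    return the indices of the best-aligned `keep_fraction`."""
    img = torch.nn.functional.normalize(img_emb, dim=-1)
    txt = torch.nn.functional.normalize(txt_emb, dim=-1)
    alignment = (img * txt).sum(dim=-1)   # per-pair cosine similarity
    k = int(keep_fraction * len(alignment))
    return alignment.topk(k).indices

# Stand-in embeddings for 10k pairs; in practice these come from a pretrained encoder.
img_emb, txt_emb = torch.randn(10_000, 512), torch.randn(10_000, 512)
subset = keep_high_quality_pairs(img_emb, txt_emb, keep_fraction=0.25)
print(subset.shape)  # torch.Size([2500])
```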