InfoBatch: Lossless training speed up by unbiased dynamic data pruning

Z Qin, K Wang, Z Zheng, J Gu, X Peng, Z Xu… - arXiv preprint arXiv …, 2023 - arxiv.org
Data pruning aims to obtain lossless performance with less overall cost. A common
approach is to filter out samples that contribute less to training. This could lead to …

Generative dataset distillation based on diffusion model

D Su, J Hou, G Li, R Togo, R Song, T Ogawa… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper presents our method for the generative track of The First Dataset Distillation
Challenge at ECCV 2024. Since diffusion models have become the mainstay of generative …

Unlocking the potential of federated learning: The symphony of dataset distillation via deep generative latents

Y Jia, S Vahidian, J Sun, J Zhang, V Kungurtsev… - … on Computer Vision, 2024 - Springer
Data heterogeneity presents significant challenges for federated learning (FL). Recently,
dataset distillation techniques have been introduced and applied at the client level to …

Dataset distillation from first principles: Integrating core information extraction and purposeful learning

V Kungurtsev, Y Peng, J Gu, S Vahidian… - arXiv preprint arXiv …, 2024 - arxiv.org
Dataset distillation (DD) is an increasingly important technique that focuses on constructing
a synthetic dataset capable of capturing the core information in training data to achieve …

Self-supervised Dataset Distillation: A Good Compression Is All You Need

M Zhou, Z Yin, S Shao, Z Shen - arXiv preprint arXiv:2404.07976, 2024 - arxiv.org
Dataset distillation aims to compress information from a large-scale original dataset into a new
compact dataset while striving to preserve the utmost degree of the original data …

DD-RobustBench: An adversarial robustness benchmark for dataset distillation

Y Wu, J Du, P Liu, Y Lin, W Xu, W Cheng - arXiv preprint arXiv:2403.13322, 2024 - arxiv.org
Dataset distillation is an advanced technique aimed at compressing datasets into
significantly smaller counterparts while preserving strong training performance …

Emphasizing discriminative features for dataset distillation in complex scenarios

K Wang, Z Li, ZQ Cheng, S Khaki, A Sajedi… - arXiv preprint arXiv …, 2024 - arxiv.org
Dataset distillation has demonstrated strong performance on simple datasets like CIFAR,
MNIST, and TinyImageNet but struggles to achieve similar results in more complex …

Group Distributionally Robust Dataset Distillation with Risk Minimization

S Vahidian, M Wang, J Gu, V Kungurtsev… - arXiv preprint arXiv …, 2024 - arxiv.org
Dataset distillation (DD) has emerged as a widely adopted technique for crafting a synthetic
dataset that captures the essential information of a training dataset, facilitating the training of …

Efficient Dataset Distillation via Diffusion-Driven Patch Selection for Improved Generalization

X Zhong, S Sun, X Gu, Z Xu, Y Wang, J Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Dataset distillation offers an efficient way to reduce memory and computational costs by
optimizing a smaller dataset with performance comparable to the full-scale original …

Dataset distillation via curriculum data synthesis in large data era

Z Yin, Z Shen - Transactions on Machine Learning Research, 2024 - openreview.net
Dataset distillation or condensation aims to generate a smaller but representative subset
from a large dataset, which allows a model to be trained more efficiently, meanwhile …