Towards lossless dataset distillation via difficulty-aligned trajectory matching

Z Guo, K Wang, G Cazenavette, H Li, K Zhang… - arxiv preprint arxiv …, 2023 - arxiv.org
The ultimate goal of Dataset Distillation is to synthesize a small synthetic dataset such that a
model trained on this synthetic set will perform equally well as a model trained on the full …
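As a rough illustration of the trajectory-matching family this entry builds on (a generic MTT-style loss, not the paper's difficulty-aligned variant; all function and variable names are hypothetical), the sketch below measures how far the parameters reached by training on the synthetic set land from an expert checkpoint trained on real data, normalized by how far the expert moved over the matched segment.

```python
import torch

def trajectory_matching_loss(student_params, expert_start, expert_end):
    # Distance between the student parameters (obtained by training on the
    # synthetic set) and the expert endpoint (trained on real data),
    # normalized by the length of the matched expert segment.
    num = sum(((s - e).pow(2)).sum() for s, e in zip(student_params, expert_end))
    den = sum(((a - b).pow(2)).sum() for a, b in zip(expert_start, expert_end))
    return num / (den + 1e-12)

# Toy usage with random parameter tensors (shapes are illustrative only).
p_student = [torch.randn(64, 32), torch.randn(64)]
p_start   = [torch.randn(64, 32), torch.randn(64)]
p_end     = [torch.randn(64, 32), torch.randn(64)]
loss = trajectory_matching_loss(p_student, p_start, p_end)
```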

Distill gold from massive ores: Bi-level data pruning towards efficient dataset distillation

Y Xu, YL Li, K Cui, Z Wang, C Lu, YW Tai… - European Conference on …, 2024 - Springer
Data-efficient learning has garnered significant attention, especially given the current trend
of large multi-modal models. Recently, dataset distillation has become an effective approach …

Prioritize Alignment in Dataset Distillation

Z Li, Z Guo, W Zhao, T Zhang, ZQ Cheng… - arxiv preprint arxiv …, 2024 - arxiv.org
Dataset Distillation aims to compress a large dataset into a significantly more compact,
synthetic one without compromising the performance of the trained models. To achieve this …
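The compression goal stated in this snippet is commonly written as a bi-level problem; as a generic reminder (notation assumed here, not taken from this paper), with synthetic set S, real training set T, and training/evaluation losses L:

```latex
% Generic bi-level dataset distillation objective (notation assumed).
S^{\ast} = \operatorname*{arg\,min}_{S} \; \mathcal{L}_{\mathcal{T}}\!\left(\theta^{S}\right),
\qquad
\theta^{S} = \operatorname*{arg\,min}_{\theta} \; \mathcal{L}_{S}(\theta)
```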

ATOM: Attention Mixer for Efficient Dataset Distillation

S Khaki, A Sajedi, K Wang, LZ Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent works in dataset distillation seek to minimize training expenses by generating a
condensed synthetic dataset that encapsulates the information present in a larger real …
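For context on what attention-based matching typically looks like in this line of work, here is a minimal sketch of spatial-attention matching between real and synthetic batches (a generic formulation, not the specific ATOM mixer; all names are hypothetical):

```python
import torch
import torch.nn.functional as F

def spatial_attention(feat, p=2):
    # Collapse a conv feature map (B, C, H, W) into a spatial attention map
    # by summing the p-th power of channel activations, then L2-normalizing.
    att = feat.abs().pow(p).sum(dim=1).flatten(1)  # (B, H*W)
    return F.normalize(att, dim=1)

def attention_matching_loss(real_feats, syn_feats):
    # Match per-layer spatial attention of real and synthetic batches
    # (averaged over the batch), summed across layers.
    loss = 0.0
    for fr, fs in zip(real_feats, syn_feats):
        loss = loss + (spatial_attention(fr).mean(0)
                       - spatial_attention(fs).mean(0)).pow(2).sum()
    return loss

# Toy usage with one layer of random features (shapes are illustrative only).
real = [torch.randn(8, 16, 14, 14)]
syn  = [torch.randn(8, 16, 14, 14)]
loss = attention_matching_loss(real, syn)
```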

Emphasizing discriminative features for dataset distillation in complex scenarios

K Wang, Z Li, ZQ Cheng, S Khaki, A Sajedi… - arxiv preprint arxiv …, 2024 - arxiv.org
Dataset distillation has demonstrated strong performance on simple datasets like CIFAR,
MNIST, and TinyImageNet but struggles to achieve similar results in more complex …

Going Beyond Feature Similarity: Effective Dataset distillation based on Class-aware Conditional Mutual Information

X Zhong, B Chen, H Fang, X Gu, ST Xia… - arxiv preprint arxiv …, 2024 - arxiv.org
Dataset distillation (DD) aims to minimize the time and memory consumption needed for
training deep neural networks on large datasets, by creating a smaller synthetic dataset that …
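The class-aware criterion named in this entry builds on conditional mutual information; for reference, the standard discrete definition, with X an input, Z a representation, and Y the class label (symbols assumed here, not taken from the paper), is:

```latex
% Conditional mutual information (standard discrete form; symbols assumed).
I(X; Z \mid Y)
  = \sum_{y} p(y) \sum_{x, z} p(x, z \mid y)
    \log \frac{p(x, z \mid y)}{p(x \mid y)\, p(z \mid y)}
```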

Diffusion-Augmented Coreset Expansion for Scalable Dataset Distillation

A Abbasi, S Imani, C An, G Mahalingam… - arxiv preprint arxiv …, 2024 - arxiv.org
With the rapid scaling of neural networks, data storage and communication demands have
intensified. Dataset distillation has emerged as a promising solution, condensing information …

DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation

Z Shen, A Sherif, Z Yin, S Shao - arxiv preprint arxiv:2411.19946, 2024 - arxiv.org
Recent advances in dataset distillation have led to solutions in two main directions. The
conventional batch-to-batch matching mechanism is ideal for small-scale datasets and …

FairDD: Fair Dataset Distillation via Synchronized Matching

Q Zhou, S Fang, S He, W Meng, J Chen - arxiv preprint arxiv:2411.19623, 2024 - arxiv.org
Condensing large datasets into smaller synthetic counterparts has demonstrated its promise
for image classification. However, previous research has overlooked a crucial concern in …

Dataset Distillers Are Good Label Denoisers In the Wild

L Cheng, K Chen, J Li, S Tang, S Zhang… - arxiv preprint arxiv …, 2024 - arxiv.org
Learning from noisy data has become essential for adapting deep learning models to real-
world applications. Traditional methods often involve first evaluating the noise and then …