Dataset distillation: A comprehensive review

R Yu, S Liu, X Wang - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Recent success of deep learning is largely attributed to the sheer amount of data used for
training deep neural networks. Despite the unprecedented success, the massive data …

Minimizing the accumulated trajectory error to improve dataset distillation

J Du, Y Jiang, VYF Tan, JT Zhou… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Model-based deep learning has achieved astounding successes due in part to the
availability of large-scale real-world data. However, processing such massive amounts of …

Importance-aware co-teaching for offline model-based optimization

Y Yuan, CS Chen, Z Liu… - Advances in Neural …, 2024 - proceedings.neurips.cc
Offline model-based optimization aims to find a design that maximizes a property of interest
using only an offline dataset, with applications in robot, protein, and molecule design …

Dataset distillation by automatic training trajectories

D Liu, J Gu, H Cao, C Trinitis, M Schulz - European Conference on …, 2024 - Springer
Dataset Distillation is used to create a concise, yet informative, synthetic dataset that can
replace the original dataset for training purposes. Some leading methods in this domain …

Graph data condensation via self-expressive graph structure reconstruction

Z Liu, C Zeng, G Zheng - Proceedings of the 30th ACM SIGKDD …, 2024 - dl.acm.org
With the increasing demands of training graph neural networks (GNNs) on large-scale
graphs, graph data condensation has emerged as a critical technique to relieve the storage …

Sparse parameterization for epitomic dataset distillation

X Wei, A Cao, F Yang, Z Ma - Advances in Neural …, 2024 - proceedings.neurips.cc
The success of deep learning relies heavily on large and diverse datasets, but the storage,
preprocessing, and training of such data present significant challenges. To address these …

Dataset condensation for time series classification via dual domain matching

Z Liu, K Hao, G Zheng, Y Yu - Proceedings of the 30th ACM SIGKDD …, 2024 - dl.acm.org
Time series data has been demonstrated to be crucial in various research fields. The
management of large quantities of time series data presents challenges in terms of deep …

Dataset quantization with active learning based adaptive sampling

Z Zhao, Y Shang, J Wu, Y Yan - European Conference on Computer …, 2024 - Springer
Deep learning has made remarkable progress recently, largely due to the availability of
large, well-labeled datasets. However, the training on such datasets elevates costs and …

Multisize dataset condensation

Y He, L Xiao, JT Zhou, I Tsang - arXiv preprint arXiv:2403.06075, 2024 - arxiv.org
While dataset condensation effectively enhances training efficiency, its application in
on-device scenarios brings unique challenges. 1) Due to the fluctuating computational …

Boosting automatic COVID-19 detection performance with self-supervised learning and batch knowledge ensembling

G Li, R Togo, T Ogawa, M Haseyama - Computers in biology and medicine, 2023 - Elsevier
Problem: Detecting COVID-19 from chest X-ray (CXR) images has become one of the fastest
and easiest methods for detecting COVID-19. However, the existing methods usually use …