Multisize dataset condensation

Y He, L Xiao, JT Zhou, I Tsang - arXiv preprint arXiv:2403.06075, 2024 - arxiv.org
While dataset condensation effectively enhances training efficiency, its application in on-
device scenarios brings unique challenges. 1) Due to the fluctuating computational …

Curriculum dataset distillation

Z Ma, A Cao, F Yang, X Wei - arXiv preprint arXiv:2405.09150, 2024 - arxiv.org
Most dataset distillation methods struggle to accommodate large-scale datasets due to their
substantial computational and memory requirements. In this paper, we present a curriculum …

Improve cross-architecture generalization on dataset distillation

B Zhou, L Zhong, W Chen - arXiv preprint arXiv:2402.13007, 2024 - arxiv.org
Dataset distillation, a pragmatic approach in machine learning, aims to create a smaller
synthetic dataset from a larger existing dataset. However, existing distillation methods …

Distill gold from massive ores: Bi-level data pruning towards efficient dataset distillation

Y Xu, YL Li, K Cui, Z Wang, C Lu, YW Tai… - European Conference on …, 2024 - Springer
Data-efficient learning has garnered significant attention, especially given the current trend
of large multi-modal models. Recently, dataset distillation has become an effective approach …

Distill gold from massive ores: Efficient dataset distillation via critical samples selection

Y Xu, YL Li, K Cui, Z Wang, C Lu, MX Liu, YW Tai… - 2023 - openreview.net
Data-efficient learning has drawn significant attention, especially given the current trend of
large multi-modal models, where dataset distillation can be an effective solution. However …

Adaptive batch sizes for active learning: A probabilistic numerics approach

M Adachi, S Hayakawa, M Jørgensen… - International …, 2024 - proceedings.mlr.press
Active learning parallelization is widely used, but typically relies on fixing the batch size
throughout experimentation. This fixed approach is inefficient because of a dynamic trade-off …

Low-rank similarity mining for multimodal dataset distillation

Y Xu, Z Lin, Y Qiu, C Lu, YL Li - arXiv preprint arXiv:2406.03793, 2024 - arxiv.org
Though dataset distillation has witnessed rapid development in recent years, the distillation
of multimodal data, e.g., image-text pairs, poses unique and under-explored challenges …

Bayesian Pseudo-Coresets via Contrastive Divergence

P Tiwary, K Shubham, VV Kashyap - arXiv preprint arXiv:2303.11278, 2023 - arxiv.org
Bayesian methods provide an elegant framework for estimating parameter posteriors and
quantification of uncertainty associated with probabilistic models. However, they often suffer …

Function space Bayesian pseudocoreset for Bayesian neural networks

B Kim, H Lee, J Lee - Advances in Neural Information …, 2023 - proceedings.neurips.cc
A Bayesian pseudocoreset is a compact synthetic dataset summarizing essential information
of a large-scale dataset and thus can be used as a proxy dataset for scalable Bayesian …

One-Shot Federated Learning with Bayesian Pseudocoresets

T d'Hondt, M Pechenizkiy, R Peharz - arXiv preprint arXiv:2406.02177, 2024 - arxiv.org
Optimization-based techniques for federated learning (FL) often come with prohibitive
communication cost, as high dimensional model parameters need to be communicated …