Dataset distillation: A comprehensive review

R Yu, S Liu, X Wang - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Recent success of deep learning is largely attributed to the sheer amount of data used for
training deep neural networks. Despite the unprecedented success, the massive data …

A comprehensive survey of dataset distillation

S Lei, D Tao - IEEE Transactions on Pattern Analysis and …, 2023 - ieeexplore.ieee.org
Deep learning technology has developed unprecedentedly in the last decade and has
become the primary choice in many application domains. This progress is mainly attributed …

Meta-learning in neural networks: A survey

T Hospedales, A Antoniou, P Micaelli… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
The field of meta-learning, or learning-to-learn, has seen a dramatic rise in interest in recent
years. Contrary to conventional approaches to AI where tasks are solved from scratch using …

Dataset distillation by matching training trajectories

G Cazenavette, T Wang, A Torralba… - Proceedings of the …, 2022 - openaccess.thecvf.com
Dataset distillation is the task of synthesizing a small dataset such that a model trained on
the synthetic set will match the test accuracy of the model trained on the full dataset. In this …

MetaMath: Bootstrap your own mathematical questions for large language models

L Yu, W Jiang, H Shi, J Yu, Z Liu, Y Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have pushed the limits of natural language understanding
and exhibited excellent problem-solving ability. Despite the great success, most existing …

Dataset distillation via factorization

S Liu, K Wang, X Yang, J Ye… - Advances in neural …, 2022 - proceedings.neurips.cc
In this paper, we study dataset distillation (DD) from a novel perspective and introduce
a dataset factorization approach, termed HaBa, which is a plug-and-play …

D4: Improving LLM pretraining via document de-duplication and diversification

K Tirumala, D Simig, A Aghajanyan… - Advances in Neural …, 2023 - proceedings.neurips.cc
Over recent years, an increasing amount of compute and data has been poured into training
large language models (LLMs), usually by doing one-pass learning on as many tokens as …

Fake it till you make it: Learning transferable representations from synthetic ImageNet clones

MB Sarıyıldız, K Alahari, D Larlus… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent image generation models such as Stable Diffusion have exhibited an impressive
ability to generate fairly realistic images starting from a simple text prompt. Could such …

CAFE: Learning to condense dataset by aligning features

K Wang, B Zhao, X Peng, Z Zhu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Dataset condensation aims at reducing the network training effort through condensing a
cumbersome training set into a compact synthetic one. State-of-the-art approaches largely …

Data-efficient Fine-tuning for LLM-based Recommendation

X Lin, W Wang, Y Li, S Yang, F Feng, Y Wei… - Proceedings of the 47th …, 2024 - dl.acm.org
Leveraging Large Language Models (LLMs) for recommendation has recently garnered
considerable attention, where fine-tuning plays a key role in LLMs' adaptation. However, the …