Dataset distillation: A comprehensive review
Recent success of deep learning is largely attributed to the sheer amount of data used for
training deep neural networks. Despite the unprecedented success, the massive data …
A comprehensive survey of dataset distillation
Deep learning technology has advanced at an unprecedented pace over the last decade and has
become the primary choice in many application domains. This progress is mainly attributed …
Meta-learning in neural networks: A survey
The field of meta-learning, or learning-to-learn, has seen a dramatic rise in interest in recent
years. Contrary to conventional approaches to AI where tasks are solved from scratch using …
Dataset distillation by matching training trajectories
Dataset distillation is the task of synthesizing a small dataset such that a model trained on
the synthetic set will match the test accuracy of the model trained on the full dataset. In this …
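As a rough illustration (notation assumed here, not taken from the cited abstract), dataset distillation is commonly posed as a bilevel optimization over the synthetic set:

\[
\mathcal{S}^{*} = \arg\min_{\mathcal{S}} \, \mathcal{L}_{\mathcal{T}}\bigl(\theta^{*}(\mathcal{S})\bigr)
\quad \text{where} \quad
\theta^{*}(\mathcal{S}) = \arg\min_{\theta} \, \mathcal{L}_{\mathcal{S}}(\theta),
\]

with \(\mathcal{T}\) the full training set, \(\mathcal{S}\) the small synthetic set, and \(\mathcal{L}\) the training loss; trajectory-matching methods replace the outer objective with a distance between the parameter trajectories obtained by training on \(\mathcal{S}\) and on \(\mathcal{T}\).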
MetaMath: Bootstrap your own mathematical questions for large language models
Large language models (LLMs) have pushed the limits of natural language understanding
and exhibited excellent problem-solving ability. Despite the great success, most existing …
Dataset distillation via factorization
In this paper, we study dataset distillation (DD) from a novel perspective and introduce
a dataset factorization approach, termed HaBa, which is a plug-and-play …
D4: Improving LLM pretraining via document de-duplication and diversification
Over recent years, an increasing amount of compute and data has been poured into training
large language models (LLMs), usually by doing one-pass learning on as many tokens as …
Fake it till you make it: Learning transferable representations from synthetic ImageNet clones
Recent image generation models such as Stable Diffusion have exhibited an impressive
ability to generate fairly realistic images starting from a simple text prompt. Could such …
CAFE: Learning to condense dataset by aligning features
Dataset condensation aims at reducing the network training effort through condensing a
cumbersome training set into a compact synthetic one. State-of-the-art approaches largely …
Data-efficient Fine-tuning for LLM-based Recommendation
Leveraging Large Language Models (LLMs) for recommendation has recently garnered
considerable attention, where fine-tuning plays a key role in LLMs' adaptation. However, the …