Does graph distillation see like vision dataset counterpart?
Training on large-scale graphs has achieved remarkable results in graph representation
learning, but its cost and storage overhead have attracted increasing concern. Existing graph …
Dataset regeneration for sequential recommendation
The sequential recommender (SR) system is a crucial component of modern recommender
systems, as it aims to capture the evolving preferences of users. Significant efforts have …
Expanding small-scale datasets with guided imagination
The power of DNNs relies heavily on the quantity and quality of training data. However,
collecting and annotating data on a large scale is often expensive and time-consuming. To …
Unleashing the power of data tsunami: A comprehensive survey on data assessment and selection for instruction tuning of language models
Instruction tuning plays a critical role in aligning large language models (LLMs) with human
preferences. Despite the vast number of open instruction datasets, naively training an LLM on …
Towards lossless dataset distillation via difficulty-aligned trajectory matching
The ultimate goal of Dataset Distillation is to synthesize a small synthetic dataset such that a
model trained on this synthetic set will perform as well as a model trained on the full …
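
For context on the mechanism this entry builds on, below is a minimal sketch of the standard trajectory-matching objective used in this line of work (a generic illustration, not the paper's difficulty-aligned variant; the function and parameter names are hypothetical):

    import torch

    def trajectory_matching_loss(student_params, expert_start, expert_end):
        # Generic matching-training-trajectories loss: after training a
        # student for a few steps on the synthetic set starting from the
        # expert checkpoint `expert_start`, penalize its distance to the
        # later expert checkpoint `expert_end`, normalized by how far the
        # expert itself moved between the two checkpoints.
        num = sum(((s - e) ** 2).sum() for s, e in zip(student_params, expert_end))
        den = sum(((a - b) ** 2).sum() for a, b in zip(expert_start, expert_end))
        return num / (den + 1e-12)

The normalization by the expert's own displacement keeps the loss scale comparable across early and late stages of the expert trajectory.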
Spanning training progress: Temporal dual-depth scoring (TDDS) for enhanced dataset pruning
Dataset pruning aims to construct a coreset capable of achieving performance comparable
to the original full dataset. Most existing dataset pruning methods rely on snapshot-based …
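
To make the snapshot-versus-temporal distinction concrete, here is an illustrative temporal scoring sketch (a deliberate simplification, not the TDDS formula): rather than reading importance off a single checkpoint, it aggregates per-epoch loss changes across the whole training run:

    import numpy as np

    def temporal_scores(loss_history):
        # loss_history: array of shape (epochs, num_samples) holding each
        # sample's loss at every epoch. A snapshot-based score would use a
        # single row; here we average the absolute epoch-to-epoch change,
        # so samples that keep moving during training score higher.
        return np.abs(np.diff(loss_history, axis=0)).mean(axis=0)

    def prune(loss_history, keep_ratio=0.3):
        # Retain the highest-scoring fraction of samples as the coreset.
        scores = temporal_scores(loss_history)
        k = int(len(scores) * keep_ratio)
        return np.argsort(scores)[-k:]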
M3D: Dataset condensation by minimizing maximum mean discrepancy
Training state-of-the-art (SOTA) deep models often requires extensive data, resulting in
substantial training and storage costs. To address these challenges, dataset condensation …
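
Since the objective named in this entry is a standard statistical distance, a short sketch may help (a textbook MMD estimate with a Gaussian kernel; the paper's exact kernels and feature spaces may differ):

    import torch

    def gaussian_kernel(x, y, sigma=1.0):
        # Pairwise RBF kernel between two batches of feature vectors.
        d2 = torch.cdist(x, y).pow(2)
        return torch.exp(-d2 / (2 * sigma ** 2))

    def mmd2(real_feats, synth_feats, sigma=1.0):
        # Squared maximum mean discrepancy between the real and synthetic
        # feature distributions; condensation methods in this family
        # update the synthetic set to drive this quantity toward zero.
        k_rr = gaussian_kernel(real_feats, real_feats, sigma).mean()
        k_ss = gaussian_kernel(synth_feats, synth_feats, sigma).mean()
        k_rs = gaussian_kernel(real_feats, synth_feats, sigma).mean()
        return k_rr + k_ss - 2 * k_rs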
FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models
In recent years, there has been significant progress in the development of text-to-image
generative models. Evaluating the quality of these models is an essential step in …
Dataset quantization with active learning based adaptive sampling
Deep learning has made remarkable progress recently, largely due to the availability of
large, well-labeled datasets. However, training on such datasets elevates costs and …
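
As a reference point for the active-learning component, here is the classic uncertainty-sampling step in its simplest form (a generic illustration only, not this paper's adaptive sampler):

    import numpy as np

    def entropy_select(probs, budget):
        # probs: (num_samples, num_classes) predicted class probabilities.
        # Select the `budget` samples whose predictions are most uncertain
        # (highest entropy) as the next batch to label or retain.
        entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
        return np.argsort(entropy)[-budget:]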
Navigating complexity: Toward lossless graph condensation via expanding window matching
Graph condensation aims to reduce the size of a large-scale graph dataset by synthesizing
a compact counterpart without sacrificing the performance of Graph Neural Networks …