Deep learning on a data diet: Finding important examples early in training
Recent success in deep learning has partially been driven by training increasingly
overparametrized networks on ever larger datasets. It is therefore natural to ask: how much …
Unleashing the power of data tsunami: A comprehensive survey on data assessment and selection for instruction tuning of language models
Instruction tuning plays a critical role in aligning large language models (LLMs) with human
preference. Despite the vast amount of open instruction datasets, naively training a LLM on …
Modyn: A platform for model training on dynamic datasets with sample-level data selection
Machine learning training data is often dynamic in real-world use cases, i.e., data is added or
removed and may experience distribution shifts over time. Models must incorporate this …
Advancing deep active learning & data subset selection: Unifying principles with information-theory intuitions
A. Kirsch - arXiv preprint arXiv:2401.04305, 2024 - arxiv.org
At its core, this thesis aims to enhance the practicality of deep learning by improving the
label and training efficiency of deep learning models. To this end, we investigate data subset …
Efficient and Robust Quantization-aware Training via Adaptive Coreset Selection
Quantization-aware training (QAT) is a representative model compression method to reduce
redundancy in weights and activations. However, most existing QAT methods require end-to …
Modyn: Data-Centric Machine Learning Pipeline Orchestration
In real-world machine learning (ML) pipelines, datasets are continuously growing. Models
must incorporate this new training data to improve generalization and adapt to potential …
Robust and Efficient Quantization-aware Training via Coreset Selection
Quantization-aware training (QAT) is a representative model compression method to reduce
redundancy in weights and activations. However, most existing QAT methods require end-to …