TRAK: Attributing model behavior at scale
The goal of data attribution is to trace model predictions back to training data. Despite a long
line of work towards this goal, existing approaches to data attribution tend to force users to …
Data Banzhaf: A robust data valuation framework for machine learning
Data valuation has wide use cases in machine learning, including improving data quality
and creating economic incentives for data sharing. This paper studies the robustness of data …
Training data influence analysis and estimation: A survey
Good models require good training data. For overparameterized deep models, the causal
relationship between training data and model predictions is increasingly opaque and poorly …
Data-OOB: Out-of-bag estimate as a simple and efficient data value
Data valuation is a powerful framework for providing statistical insights into which data are
beneficial or detrimental to model training. Many Shapley-based data valuation methods …
OpenDataVal: a unified benchmark for data valuation
Assessing the quality and impact of individual data points is critical for improving model
performance and mitigating undesirable biases within the training dataset. Several data …
Unleashing the power of data tsunami: A comprehensive survey on data assessment and selection for instruction tuning of language models
Instruction tuning plays a critical role in aligning large language models (LLMs) with human
preference. Despite the vast amount of open instruction datasets, naively training a LLM on …
A privacy-friendly approach to data valuation
Data valuation, a growing field that aims at quantifying the usefulness of individual data
sources for training machine learning (ML) models, faces notable yet often overlooked …
Rethinking backdoor attacks
In a backdoor attack, an adversary inserts maliciously constructed backdoor examples into a
training set to make the resulting model vulnerable to manipulation. Defending against such …
Open problems in technical AI governance
AI progress is creating a growing range of risks and opportunities, but it is often unclear how
they should be navigated. In many cases, the barriers and uncertainties faced are at least …
Intriguing properties of data attribution on diffusion models
Data attribution seeks to trace model outputs back to training data. With the recent
development of diffusion models, data attribution has become a desired module to properly …