MUSE: Machine Unlearning Six-Way Evaluation for Language Models
Language models (LMs) are trained on vast amounts of text data, which may include private
and copyrighted content. Data owners may request the removal of their data from a trained …
Evaluating copyright takedown methods for language models
Language models (LMs) derive their capabilities from extensive training on diverse data,
including potentially copyrighted material. These models can memorize and generate …
An economic solution to copyright challenges of generative AI
Generative artificial intelligence (AI) systems are trained on large data corpora to generate
new pieces of text, images, videos, and other media. There is growing concern that such …
On memorization of large language models in logical reasoning
Large language models (LLMs) achieve good performance on challenging reasoning
benchmarks, yet can also make basic reasoning mistakes. This contrasting behavior is …
CopyBench: Measuring literal and non-literal reproduction of copyright-protected text in language model generation
Evaluating the degree of reproduction of copyright-protected content by language models
(LMs) is of significant interest to the AI and legal communities. Although both literal and non …
Tackling GenAI copyright issues: Originality estimation and genericization
The rapid progress of generative AI technology has sparked significant copyright concerns,
leading to numerous lawsuits filed against AI developers. While various techniques for …
Unlearn and Burn: Adversarial Machine Unlearning Requests Destroy Model Accuracy
Machine unlearning algorithms, designed for selective removal of training data from models,
have emerged as a promising approach to growing privacy concerns. In this work, we …
Provable unlearning in topic modeling and downstream tasks
Machine unlearning algorithms are increasingly important as legal concerns arise around
the provenance of training data, but verifying the success of unlearning is often difficult …
Semantic to Structure: Learning Structural Representations for Infringement Detection
C Huang, Z Jia, H Fei, Y Zhu, Z Yuan, J Zhang… - arXiv preprint arXiv …, 2025 - arxiv.org
Structural information in images is crucial for aesthetic assessment, and it is widely
recognized in the artistic field that imitating the structure of other works significantly infringes …
The Pitfalls of "Security by Obscurity" and What They Mean for Transparent AI
P Hall, O Mundahl, S Park - arXiv preprint arXiv:2501.18669, 2025 - arxiv.org
Calls for transparency in AI systems are growing in number and urgency from diverse
stakeholders ranging from regulators to researchers to users (with a comparative absence of …