Google Scholar

An Adversarial Perspective on Machine Unlearning for AI Safety

J Łucki, B Wei, Y Huang, P Henderson… - arXiv preprint arXiv …, 2024 - arxiv.org

Large language models are finetuned to refuse questions about hazardous knowledge, but
these protections can often be bypassed. Unlearning methods aim at completely removing …

Cited by: 12

Adversarial ML Problems Are Getting Harder to Solve and to Evaluate

J Rando, J Zhang, N Carlini, F Tramèr - arXiv preprint arXiv:2502.02260, 2025 - arxiv.org

In the past decade, considerable research effort has been devoted to securing machine
learning (ML) models that operate in adversarial settings. Yet, progress has been slow even …

CopyrightMeter: Revisiting Copyright Protection in Text-to-image Models

N Xu, C Li, T Du, M Li, W Luo, J Liang, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org

Text-to-image diffusion models have emerged as powerful tools for generating high-quality
images from textual descriptions. However, their increasing popularity has raised significant …

AdvI2I: Adversarial Image Attack on Image-to-Image Diffusion models

Y Zeng, Y Cao, B Cao, Y Chang, J Chen… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advances in diffusion models have significantly enhanced the quality of image
synthesis, yet they have also introduced serious safety concerns, particularly the generation …

Revisiting the Robust Alignment of Circuit Breakers

L Schwinn, S Geisler - arXiv preprint arXiv:2407.15902, 2024 - arxiv.org

Over the past decade, adversarial training has emerged as one of the few reliable methods
for enhancing model robustness against adversarial attacks [Szegedy et al., 2014, Madry et …

SongBsAb: A Dual Prevention Approach against Singing Voice Conversion based Illegal Song Covers

G Chen, Y Zhang - CoRR, vol. abs/2401.17133, 2024 - researchgate.net

Singing voice conversion (SVC) automates song covers by converting a source singing
voice from a source singer into a new singing voice with the same lyrics and melody as the …

Certifiable AI Security against Localized Corruption Attacks

C Xiang - 2025 - search.proquest.com

Building secure and robust AI models has proven to be difficult. Nearly all defenses,
including those published at top-tier venues and recognized with prestigious awards, can be …

AdvPaint: Protecting Images from Inpainting Manipulation via Adversarial Attention Disruption

J Jeon, WJ Kim, S Ha, S Son, S Yoon - The Thirteenth International … - openreview.net

The outstanding capability of diffusion models in generating high-quality images poses
significant threats when misused by adversaries. In particular, we assume malicious …

Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI

An Adversarial Perspective on Machine Unlearning for AI Safety
