Adversarial attacks and defenses in deep learning: From a perspective of cybersecurity
The outstanding performance of deep neural networks has promoted deep learning
applications in a broad set of domains. However, the potential risks caused by adversarial …
Open problems and fundamental limitations of reinforcement learning from human feedback
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems
to align with human goals. RLHF has emerged as the central method used to finetune state …
“Real attackers don't compute gradients”: Bridging the gap between adversarial ML research and practice
Recent years have seen a proliferation of research on adversarial machine learning.
Numerous papers demonstrate powerful algorithmic attacks against a wide variety of …
PolicyCleanse: Backdoor detection and mitigation for competitive reinforcement learning
While real-world applications of reinforcement learning (RL) are becoming popular, the
security and robustness of RL systems are worthy of more attention and exploration. In …
Adversarial policies beat superhuman go AIs
We attack the state-of-the-art Go-playing AI system KataGo by training adversarial policies
against it, achieving a >97% win rate against KataGo running at superhuman settings …
SoK: Explainable machine learning for computer security applications
Explainable Artificial Intelligence (XAI) aims to improve the transparency of machine
learning (ML) pipelines. We systematize the rapidly growing (but fragmented) …
RACE: Robust adversarial concept erasure for secure text-to-image diffusion model
In the evolving landscape of text-to-image (T2I) diffusion models, the remarkable capability
to generate high-quality images from textual descriptions faces challenges with the potential …
Adversarial Machine Learning Attacks and Defences in Multi-Agent Reinforcement Learning
M. Standen, J. Kim, C. Szabo, ACM Computing Surveys, 2023
Multi-Agent Reinforcement Learning (MARL) is susceptible to Adversarial Machine Learning
(AML) attacks. Execution-time AML attacks against MARL are complex due to effects that …
" Get in Researchers; We're Measuring Reproducibility": A Reproducibility Study of Machine Learning Papers in Tier 1 Security Conferences
Reproducibility is crucial to the advancement of science; it strengthens confidence in
seemingly contradictory results and expands the boundaries of known discoveries …
Curiosity-driven and victim-aware adversarial policies
Recent years have witnessed great potential in applying Deep Reinforcement Learning
(DRL) to various challenging applications, such as autonomous driving, nuclear fusion …