Grounding and evaluation for large language models: Practical challenges and lessons learned (survey)

K Kenthapadi, M Sameki, A Taly - Proceedings of the 30th ACM SIGKDD …, 2024 - dl.acm.org
With the ongoing rapid adoption of Artificial Intelligence (AI)-based systems in high-stakes
domains, ensuring the trustworthiness, safety, and observability of these systems has …

Threats, attacks, and defenses in machine unlearning: A survey

Z Liu, H Ye, C Chen, Y Zheng, KY Lam - arXiv preprint arXiv:2403.13682, 2024 - arxiv.org
Machine Unlearning (MU) has recently gained considerable attention due to its potential to
achieve Safe AI by removing the influence of specific data from trained Machine Learning …

Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arXiv preprint arXiv …, 2024 - arxiv.org
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

On protecting the data privacy of large language models (LLMs): A survey

B Yan, K Li, M Xu, Y Dong, Y Zhang, Z Ren… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are complex artificial intelligence systems capable of
understanding, generating and translating human language. They learn language patterns …

Privacy in large language models: Attacks, defenses and future directions

H Li, Y Chen, J Luo, J Wang, H Peng, Y Kang… - arXiv preprint arXiv …, 2023 - arxiv.org
The advancement of large language models (LLMs) has significantly enhanced the ability to
effectively tackle various downstream NLP tasks and unify these tasks into generative …

MUSE: Machine unlearning six-way evaluation for language models

W Shi, J Lee, Y Huang, S Malladi, J Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org
Language models (LMs) are trained on vast amounts of text data, which may include private
and copyrighted content. Data owners may request the removal of their data from a trained …

Guardrail baselines for unlearning in LLMs

P Thaker, Y Maurya, S Hu, ZS Wu, V Smith - arXiv preprint arXiv …, 2024 - arxiv.org
Recent work has demonstrated that finetuning is a promising approach to 'unlearn' concepts
from large language models. However, finetuning can be expensive, as it requires both …

Tamper-resistant safeguards for open-weight LLMs

R Tamirisa, B Bharathi, L Phan, A Zhou, A Gatti… - arXiv preprint arXiv …, 2024 - arxiv.org
Rapid advances in the capabilities of large language models (LLMs) have raised
widespread concerns regarding their potential for malicious use. Open-weight LLMs present …

Challenging forgets: Unveiling the worst-case forget sets in machine unlearning

C Fan, J Liu, A Hero, S Liu - European Conference on Computer Vision, 2024 - Springer
The trustworthy machine learning (ML) community is increasingly recognizing the crucial
need for models capable of selectively 'unlearning' data points after training. This leads to the …

An adversarial perspective on machine unlearning for AI safety

J Łucki, B Wei, Y Huang, P Henderson… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models are finetuned to refuse questions about hazardous knowledge, but
these protections can often be bypassed. Unlearning methods aim at completely removing …