- Academic Search

Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arXiv preprint arXiv …, 2024 - arxiv.org

This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

Cited by 136 - All 7 versions

[PDF] arxiv.org

Prompting a pretrained transformer can be a universal approximator

A Petrov, PHS Torr, A Bibi - arXiv preprint arXiv:2402.14753, 2024 - arxiv.org

Despite the widespread adoption of prompting, prompt tuning and prefix-tuning of
transformer models, our theoretical understanding of these fine-tuning methods remains …

Cited by 11 - All 8 versions

[PDF] arxiv.org

Pseudorandom error-correcting codes

M Christ, S Gunn - Annual International Cryptology Conference, 2024 - Springer

We construct pseudorandom error-correcting codes (or simply pseudorandom codes), which
are error-correcting codes with the property that any polynomial number of codewords are …

Cited by 17 - All 7 versions

[PDF] arxiv.org

Advancing beyond identification: Multi-bit watermark for large language models

KY Yoo, W Ahn, N Kwak - arXiv preprint arXiv:2308.00221, 2023 - arxiv.org

We show the viability of tackling misuses of large language models beyond the identification
of machine-generated text. While existing zero-bit watermark methods focus on detection …

Cited by 18 - All 5 versions

[PDF] neurips.cc

Injecting Undetectable Backdoors in Obfuscated Neural Networks and Language Models

A Kalavasis, A Karbasi, A Oikonomou… - Advances in …, 2025 - proceedings.neurips.cc

As ML models become increasingly complex and integral to high-stakes domains such as
finance and healthcare, they also become more susceptible to sophisticated adversarial …

Cited by 1 - All 2 versions

Provably secure public-key steganography based on elliptic curve cryptography

X Zhang, K Chen, J Ding, Y Yang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Steganography is the technique of hiding secret messages within seemingly harmless
covers to elude examination by censors. Despite having been proposed several decades …

Cited by 8 - All 3 versions

[PDF] arxiv.org

Exploring the relevance of data privacy-enhancing technologies for AI governance use cases

E Bluemke, T Collins, B Garfinkel, A Trask - arXiv preprint arXiv …, 2023 - arxiv.org

The development of privacy-enhancing technologies has made immense progress in
reducing trade-offs between privacy and performance in data exchange and analysis …

Cited by 11 - All 3 versions

[PDF] mlr.press

Minimum-entropy coupling approximation guarantees beyond the majorization barrier

S Compton, D Katz, B Qi… - International …, 2023 - proceedings.mlr.press

Given a set of discrete probability distributions, the minimum entropy coupling is the
minimum entropy joint distribution that has the input distributions as its marginals. This has …

Cited by 11 - All 8 versions

[PDF] arxiv.org

Excuse me, sir? your language model is leaking (information)

O Zamir - arXiv preprint arXiv:2401.10360, 2024 - arxiv.org

We introduce a cryptographic method to hide an arbitrary secret payload in the response of
a Large Language Model (LLM). A secret key is required to extract the payload from the …

Cited by 5 - All 2 versions

[PDF] arxiv.org

Hidden in plain text: Emergence & mitigation of steganographic collusion in LLMs

Y Mathew, O Matthews, R McCarthy, J Velja… - arXiv preprint arXiv …, 2024 - arxiv.org

The rapid proliferation of frontier model agents promises significant societal advances but
also raises concerns about systemic risks arising from unsafe interactions. Collusion to the …

Cited by 4 - All 6 versions
