Foundational challenges in assuring alignment and safety of large language models
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …
Prompting a pretrained transformer can be a universal approximator
Despite the widespread adoption of prompting, prompt tuning and prefix-tuning of
transformer models, our theoretical understanding of these fine-tuning methods remains …
Pseudorandom error-correcting codes
We construct pseudorandom error-correcting codes (or simply pseudorandom codes), which
are error-correcting codes with the property that any polynomial number of codewords are …
Advancing beyond identification: Multi-bit watermark for large language models
We show the viability of tackling misuses of large language models beyond the identification
of machine-generated text. While existing zero-bit watermark methods focus on detection …
Injecting Undetectable Backdoors in Obfuscated Neural Networks and Language Models
As ML models become increasingly complex and integral to high-stakes domains such as
finance and healthcare, they also become more susceptible to sophisticated adversarial …
Provably secure public-key steganography based on elliptic curve cryptography
Steganography is the technique of hiding secret messages within seemingly harmless
covers to elude examination by censors. Despite having been proposed several decades …
Exploring the relevance of data privacy-enhancing technologies for AI governance use cases
The development of privacy-enhancing technologies has made immense progress in
reducing trade-offs between privacy and performance in data exchange and analysis …
Minimum-entropy coupling approximation guarantees beyond the majorization barrier
Given a set of discrete probability distributions, the minimum entropy coupling is the
minimum entropy joint distribution that has the input distributions as its marginals. This has …
Excuse me, sir? Your language model is leaking (information)
O Zamir - arXiv preprint arXiv:2401.10360, 2024 - arxiv.org
We introduce a cryptographic method to hide an arbitrary secret payload in the response of
a Large Language Model (LLM). A secret key is required to extract the payload from the …
Hidden in plain text: Emergence & mitigation of steganographic collusion in LLMs
The rapid proliferation of frontier model agents promises significant societal advances but
also raises concerns about systemic risks arising from unsafe interactions. Collusion to the …