The rise and potential of large language model based agents: A survey
For a long time, researchers have sought artificial intelligence (AI) that matches or exceeds
human intelligence. AI agents, which are artificial entities capable of sensing the …
Recent advances in adversarial training for adversarial robustness
Adversarial training is one of the most effective approaches for defending against adversarial
examples for deep learning models. Unlike other defense strategies, adversarial training …
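The core idea the abstract refers to — training the model on adversarially perturbed inputs rather than clean ones — can be sketched in a few lines. The following is a minimal, illustrative NumPy example of FGSM-style adversarial training for a linear classifier; the function name, toy data, and hyperparameters are assumptions for illustration, not the paper's recipe.

```python
import numpy as np

rng = np.random.default_rng(0)

def fgsm_adversarial_training(X, y, eps=0.1, lr=0.5, steps=300):
    """Minimal adversarial-training sketch: at each step, perturb the
    inputs with FGSM (sign of the input gradient of the loss), then take
    a gradient step on the perturbed batch. Illustrative only."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        # logistic loss log(1 + exp(-y * <w, x>)); gradient w.r.t. the margin
        g = -y / (1.0 + np.exp(y * (X @ w)))
        # FGSM: move each input in the direction that increases its own loss
        X_adv = X + eps * np.sign(g[:, None] * w[None, :])
        g_adv = -y / (1.0 + np.exp(y * (X_adv @ w)))
        w -= lr * (g_adv[:, None] * X_adv).mean(axis=0)
    return w

# toy separable data: the label picks the cluster mean
y = rng.choice([-1.0, 1.0], size=200)
X = y[:, None] * np.ones(2) + 0.3 * rng.normal(size=(200, 2))
w = fgsm_adversarial_training(X, y)
acc = (np.sign(X @ w) == y).mean()
```

In practice, multi-step PGD attacks are usually preferred over single-step FGSM for generating the training-time perturbations.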
Cross-entropy loss functions: Theoretical analysis and applications
Cross-entropy is a widely used loss function in applications. It coincides with the logistic loss
applied to the outputs of a neural network when the softmax is used. But what guarantees …
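The coincidence the abstract mentions — cross-entropy applied to softmax outputs equals the logistic loss — can be checked numerically: in the two-class case, the softmax cross-entropy of the logits equals the binary logistic loss of their difference. A small NumPy check (the helper names are illustrative):

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    # -log softmax(logits)[label], computed stably via a max shift
    z = logits - logits.max()
    return np.log(np.exp(z).sum()) - z[label]

def logistic_loss(margin):
    # binary logistic loss: log(1 + exp(-margin))
    return np.log1p(np.exp(-margin))

logits = np.array([2.0, -1.0])
# With two classes, the softmax cross-entropy for label 0 equals the
# logistic loss evaluated at the margin z0 - z1:
# -log(e^z0 / (e^z0 + e^z1)) = log(1 + e^-(z0 - z1))
ce = softmax_cross_entropy(logits, 0)
ll = logistic_loss(logits[0] - logits[1])
```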
Test-time prompt tuning for zero-shot generalization in vision-language models
Pre-trained vision-language models (e.g., CLIP) have shown promising zero-shot
generalization in many downstream tasks with properly designed text prompts. Instead of …
Foundational challenges in assuring alignment and safety of large language models
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …
Improving robustness using generated data
Recent work argues that robust training requires substantially larger datasets than those
required for standard classification. On CIFAR-10 and CIFAR-100, this translates into a …
Smoothllm: Defending large language models against jailbreaking attacks
Despite efforts to align large language models (LLMs) with human values, widely-used
LLMs such as GPT, Llama, Claude, and PaLM are susceptible to jailbreaking attacks …
Robustbench: a standardized adversarial robustness benchmark
As a research community, we still lack a systematic understanding of the progress on
adversarial robustness, which often makes it hard to identify the most promising ideas in …
Reconstructing training data from trained neural networks
Understanding to what extent neural networks memorize training data is an intriguing
question with practical and theoretical implications. In this paper we show that in some …
Information fusion as an integrative cross-cutting enabler to achieve robust, explainable, and trustworthy medical artificial intelligence
Medical artificial intelligence (AI) systems have been remarkably successful, even
outperforming humans at certain tasks. There is no doubt that AI is important to …