SafetyPrompts: a systematic review of open datasets for evaluating and improving large language model safety

P Röttger, F Pernisi, B Vidgen, D Hovy - arXiv preprint arXiv:2404.05399, 2024 - arxiv.org
The last two years have seen a rapid growth in concerns around the safety of large
language models (LLMs). Researchers and practitioners have met these concerns by …

A survey on large language models for software engineering

Q Zhang, C Fang, Y Xie, Y Zhang, Y Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Software Engineering (SE) is the systematic design, development, maintenance, and
management of software applications underpinning the digital infrastructure of our modern …

A comprehensive survey of small language models in the era of large language models: Techniques, enhancements, applications, collaboration with LLMs, and …

F Wang, Z Zhang, X Zhang, Z Wu, T Mo, Q Lu… - arXiv preprint arXiv …, 2024 - ai.radensa.ru
Large language models (LLMs) have demonstrated emergent abilities in text generation,
question answering, and reasoning, facilitating various tasks and domains. Despite their …

Eureka: Evaluating and understanding large foundation models

V Balachandran, J Chen, N Joshi, B Nushi… - arXiv preprint arXiv …, 2024 - arxiv.org
Rigorous and reproducible evaluation is critical for assessing the state of the art and for
guiding scientific advances in Artificial Intelligence. Evaluation is challenging in practice due …

CLAVE: An adaptive framework for evaluating values of LLM generated responses

J Yao, X Yi, X Xie - arXiv preprint arXiv:2407.10725, 2024 - arxiv.org
The rapid progress in Large Language Models (LLMs) poses potential risks such as
generating unethical content. Assessing LLMs' values can help expose their misalignment …

Trustworthy, responsible, and safe ai: A comprehensive architectural framework for ai safety with challenges and mitigations

C Chen, Z Liu, W Jiang, SQ Goh, KKY Lam - arXiv preprint arXiv …, 2024 - arxiv.org
AI Safety is an emerging area of critical importance to the safe adoption and deployment of
AI systems. With the rapid proliferation of AI and especially with the recent advancement of …

ASTRAL: Automated safety testing of large language models

M Ugarte, P Valle, JA Parejo, S Segura… - arXiv preprint arXiv …, 2025 - arxiv.org
Large Language Models (LLMs) have recently gained attention due to their ability to
understand and generate sophisticated human-like content. However, ensuring their safety …

Value compass leaderboard: A platform for fundamental and validated evaluation of LLMs values

J Yao, X Yi, S Duan, J Wang, Y Bai, M Huang… - arXiv preprint arXiv …, 2025 - arxiv.org
As Large Language Models (LLMs) achieve remarkable breakthroughs, aligning their
values with humans has become imperative for their responsible development and …

Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation

A Arrieta, M Ugarte, P Valle, JA Parejo… - arXiv preprint arXiv …, 2025 - arxiv.org
Large Language Models (LLMs) have become an integral part of our daily lives. However,
they impose certain risks, including those that can harm individuals' privacy, perpetuate …

BenchAgents: Automated Benchmark Creation with Agent Interaction

N Butt, V Chandrasekaran, N Joshi, B Nushi… - arXiv preprint arXiv …, 2024 - arxiv.org
Evaluations are limited by benchmark availability. As models evolve, there is a need to
create benchmarks that can measure progress on new generative capabilities. However …