SafetyPrompts: a systematic review of open datasets for evaluating and improving large language model safety
The last two years have seen a rapid growth in concerns around the safety of large
language models (LLMs). Researchers and practitioners have met these concerns by …
A survey on large language models for software engineering
Software Engineering (SE) is the systematic design, development, maintenance, and
management of software applications underpinning the digital infrastructure of our modern …
A comprehensive survey of small language models in the era of large language models: Techniques, enhancements, applications, collaboration with LLMs, and …
Large language models (LLM) have demonstrated emergent abilities in text generation,
question answering, and reasoning, facilitating various tasks and domains. Despite their …
Eureka: Evaluating and understanding large foundation models
Rigorous and reproducible evaluation is critical for assessing the state of the art and for
guiding scientific advances in Artificial Intelligence. Evaluation is challenging in practice due …
CLAVE: An adaptive framework for evaluating values of LLM generated responses
The rapid progress in Large Language Models (LLMs) poses potential risks such as
generating unethical content. Assessing LLMs' values can help expose their misalignment …
Trustworthy, responsible, and safe AI: A comprehensive architectural framework for AI safety with challenges and mitigations
AI Safety is an emerging area of critical importance to the safe adoption and deployment of
AI systems. With the rapid proliferation of AI and especially with the recent advancement of …
ASTRAL: Automated safety testing of large language models
Large Language Models (LLMs) have recently gained attention due to their ability to
understand and generate sophisticated human-like content. However, ensuring their safety …
Value compass leaderboard: A platform for fundamental and validated evaluation of LLMs' values
As Large Language Models (LLMs) achieve remarkable breakthroughs, aligning their
values with humans has become imperative for their responsible development and …
Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation
Large Language Models (LLMs) have become an integral part of our daily lives. However,
they impose certain risks, including those that can harm individuals' privacy, perpetuate …
BENCHAGENTS: Automated Benchmark Creation with Agent Interaction
Evaluations are limited by benchmark availability. As models evolve, there is a need to
create benchmarks that can measure progress on new generative capabilities. However …