From generation to judgment: Opportunities and challenges of LLM-as-a-judge
Assessment and evaluation have long been critical challenges in artificial intelligence (AI)
and natural language processing (NLP). However, traditional methods, whether matching …
Generative language models exhibit social identity biases
Social identity biases, particularly the tendency to favor one's own group (ingroup solidarity)
and derogate other groups (outgroup hostility), are deeply rooted in human psychology and …
Reinforcement Learning Enhanced LLMs: A Survey
This paper surveys research in the rapidly growing field of enhancing large language
models (LLMs) with reinforcement learning (RL), a technique that enables LLMs to improve …
Thinking LLMs: General instruction following with thought generation
LLMs are typically trained to answer user questions or follow instructions similarly to how
human experts respond. However, in the standard alignment framework they lack the basic …
Preference tuning with human feedback on language, speech, and vision tasks: A survey
Preference tuning is a crucial process for aligning deep generative models with human
preferences. This survey offers a thorough overview of recent advancements in preference …
LMUnit: Fine-grained evaluation with natural language unit tests
As language models become integral to critical workflows, assessing their behavior remains
a fundamental challenge--human evaluation is costly and noisy, while automated metrics …
How Reliable Is Human Feedback For Aligning Large Language Models?
Most alignment research today focuses on designing new learning algorithms using
datasets like Anthropic-HH, assuming human feedback data is inherently reliable. However …
Fanar: An Arabic-Centric Multimodal Generative AI Platform
We present Fanar, an Arabic-centric multimodal generative AI platform that
supports language, speech, and image generation tasks. At the heart of Fanar are Fanar Star …
Qtok: A Comprehensive Framework for Evaluating Multilingual Tokenizer Quality in Large Language Models
In the development of Large Language Models (LLMs), considerable attention has been
given to the quality of training datasets. However, the role of tokenizers in the LLM training …
Cross-lingual Transfer of Reward Models in Multilingual Alignment
Reinforcement learning with human feedback (RLHF) is shown to largely benefit from
precise reward models (RMs). However, recent studies in reward modeling schemes are …