Chain of Thoughtlessness? An Analysis of CoT in Planning
K Stechly, K Valmeekam… - Advances in Neural …, 2025 - proceedings.neurips.cc
Large language model (LLM) performance on reasoning problems typically does not
generalize out of distribution. Previous work has claimed that this can be mitigated with …
On the self-verification limitations of large language models on reasoning and planning tasks
There has been considerable divergence of opinion on the reasoning abilities of Large
Language Models (LLMs). While the initial optimism that reasoning might emerge …
Eureka: Evaluating and understanding large foundation models
Rigorous and reproducible evaluation is critical for assessing the state of the art and for
guiding scientific advances in Artificial Intelligence. Evaluation is challenging in practice due …
“I Want It That Way”: Enabling Interactive Decision Support Using Large Language Models and Constraint Programming
A critical factor in the success of many decision support systems is the accurate modeling of
user preferences. Psychology research has demonstrated that users often develop their …
BENCHAGENTS: Automated Benchmark Creation with Agent Interaction
Evaluations are limited by benchmark availability. As models evolve, there is a need to
create benchmarks that can measure progress on new generative capabilities. However …
From instructions to constraints: Language model alignment with automatic constraint verification
User alignment is crucial for adapting general-purpose language models (LMs) to
downstream tasks, but human annotations are often not available for all types of instructions …
Recursive Decomposition of Logical Thoughts: Framework for Superior Reasoning and Knowledge Propagation in Large Language Models
Enhancing the reasoning capabilities of Large Language Models remains a critical
challenge in artificial intelligence. We introduce RDoLT, Recursive Decomposition of Logical …
The Ability of Large Language Models to Evaluate Constraint-satisfaction in Agent Responses to Open-ended Requests
Generative AI agents are often expected to respond to complex user requests that have No
One Right Answer (NORA), e.g., "design a vegetarian meal plan below 1800 calories". Such …
[HTML] Aligning to constraints for data-efficient language model customization
General-purpose language models (LMs) are aligned to diverse user intents, but fall short
when it comes to specific applications. While finetuning is the default method for customized …
[BOOK] Towards Trustworthy Machine Learning: An Integer Programming Approach
CA Lawless - 2024 - search.proquest.com
Despite the proliferation of machine learning (ML) in a multitude of applications, current
black-box models, such as deep learning, remain hard to understand, critique, and judge by …