Uncertainty-Aware Step-wise Verification with Generative Reward Models

Z Ye, LC Melo, Y Kaddar, P Blunsom, S Staton… - arXiv preprint arXiv …, 2025 - arxiv.org
Complex multi-step reasoning tasks, such as solving mathematical problems, remain
challenging for large language models (LLMs). While outcome supervision is commonly …

PredictaBoard: Benchmarking LLM Score Predictability

L Pacchiardi, K Voudouris, B Slater… - arXiv preprint arXiv …, 2025 - arxiv.org
Despite possessing impressive skills, Large Language Models (LLMs) often fail
unpredictably, demonstrating inconsistent success in even basic common sense reasoning …

Bridging the Gap Between LLMs and Human Intentions: Progresses and Challenges in Instruction Understanding, Intention Reasoning, and Reliable Generation

Z Chang, F Lu, Z Zhu, Q Li, C Ji, Z Chen, Y Liu… - arXiv preprint arXiv …, 2025 - arxiv.org
Large language models (LLMs) have demonstrated exceptional capabilities in
understanding and generation. However, when interacting with human instructions in real …

Uncertainty-Aware Adaptation of Large Language Models for Protein-Protein Interaction Analysis

S Jantre, T Wang, G Park, K Chopra, N Jeon… - arXiv preprint arXiv …, 2025 - arxiv.org
Identification of protein-protein interactions (PPIs) helps derive cellular mechanistic
understanding, particularly in the context of complex conditions such as neurodegenerative …