Training large language models to reason in a continuous latent space

S Hao, S Sukhbaatar, DJ Su, X Li, Z Hu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are restricted to reason in the "language space", where they
typically express the reasoning process with a chain-of-thought (CoT) to solve a complex …

Weak-to-strong reasoning

Y Yang, Y Ma, P Liu - arXiv preprint arXiv:2407.13647, 2024 - arxiv.org
When large language models (LLMs) exceed human-level capabilities, it becomes
increasingly challenging to provide full-scale and accurate supervision for these models …

Can a Bayesian Oracle Prevent Harm from an Agent?

Y Bengio, MK Cohen, N Malkin, M MacDermott… - arXiv preprint arXiv …, 2024 - arxiv.org
Is there a way to design powerful AI systems based on machine learning methods that would
satisfy probabilistic safety guarantees? With the long-term goal of obtaining a probabilistic …

ControlAgent: Automating Control System Design via Novel Integration of LLM Agents and Domain Expertise

X Guo, D Keivan, U Syed, L Qin, H Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Control system design is a crucial aspect of modern engineering with far-reaching
applications across diverse sectors including aerospace, automotive systems, power grids …

GFlowNet Fine-tuning for Diverse Correct Solutions in Mathematical Reasoning Tasks

R Takase, M Tsunokake, Y Tsuchiya… - arXiv preprint arXiv …, 2024 - arxiv.org
Mathematical reasoning problems are among the most challenging, as they typically require
an understanding of fundamental laws to solve. The laws are universal, but the derivation of …