From decoding to meta-generation: Inference-time algorithms for large language models
One of the most striking findings in modern research on large language models (LLMs) is
that scaling up compute during training leads to better results. However, less attention has …
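A minimal sketch of one such inference-time strategy, best-of-N sampling (not from the paper); `generate` and `score` are hypothetical stand-ins for a model's sampler and a verifier or reward model:

```python
from typing import Callable, List

def best_of_n(
    generate: Callable[[str], str],      # hypothetical: sample one candidate answer
    score: Callable[[str, str], float],  # hypothetical: verifier / reward-model score
    prompt: str,
    n: int = 8,
) -> str:
    """Best-of-N sampling: spend extra compute at inference time by
    drawing several candidates and keeping the highest-scoring one."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))
```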
Scaling of search and learning: A roadmap to reproduce o1 from reinforcement learning perspective
OpenAI o1 represents a significant milestone in Artificial Intelligence, achieving expert-
level performance on many challenging tasks that require strong reasoning ability. OpenAI …
Aligning large language models via self-steering optimization
Automated alignment develops alignment systems with minimal human intervention. The
key to automated alignment lies in providing learnable and accurate preference signals for …
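As a hedged illustration of where such preference signals could come from, a toy sketch that builds a (chosen, rejected) pair without human labels; `sample` and `judge` are hypothetical placeholders, not the paper's method:

```python
from typing import Callable, List, Tuple

def build_preference_pair(
    sample: Callable[[str], str],        # hypothetical: draw one response from the policy
    judge: Callable[[str, str], float],  # hypothetical: automated scalar preference signal
    prompt: str,
    n: int = 4,
) -> Tuple[str, str]:
    """Rank several sampled responses with an automated judge and pair
    the best against the worst, yielding a preference-optimization
    training example with no human annotation."""
    responses: List[str] = [sample(prompt) for _ in range(n)]
    ranked = sorted(responses, key=lambda r: judge(prompt, r), reverse=True)
    return ranked[0], ranked[-1]
```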
Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models
Synthetic data generation with Large Language Models is a promising paradigm for
augmenting natural data over a nearly infinite range of tasks. Given this variety, direct …
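A toy curation pass illustrating the quality and diversity axes the survey covers (not the paper's algorithm); `quality` and `too_similar` are hypothetical scoring helpers:

```python
from typing import Callable, List

def curate_synthetic_data(
    samples: List[str],
    quality: Callable[[str], float],                # hypothetical: per-sample quality score
    too_similar: Callable[[str, List[str]], bool],  # hypothetical: near-duplicate check
    min_quality: float = 0.5,
) -> List[str]:
    """Keep high-quality samples, then drop near-duplicates so the
    retained set stays diverse as well as clean."""
    kept: List[str] = []
    for s in sorted(samples, key=quality, reverse=True):
        if quality(s) < min_quality:
            break  # samples are sorted, so the rest score even lower
        if not too_similar(s, kept):
            kept.append(s)
    return kept
```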
Unlocking the Mysteries of OpenAI o1: A Survey of the Reasoning Abilities of Large Language Models
The release of OpenAI's o1 marks a significant milestone in AI, achieving proficiency
comparable to PhD-level expertise in mathematics and coding. While o1 excels at solving …
Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement
The rapid advancement of large language models (LLMs) has significantly enhanced the
capabilities of AI-driven agents across various tasks. However, existing agentic systems …
Improving the Efficiency of Test-Time Search in LLMs with Backtracking
Solving reasoning problems is an iterative multi-step computation, where a reasoning agent
progresses through a sequence of steps, with each step logically building upon the previous …
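A minimal depth-first sketch of test-time search with backtracking, in the spirit of the abstract (the concrete functions `propose_steps`, `is_valid`, and `is_solution` are hypothetical):

```python
from typing import Callable, List, Optional

def backtracking_search(
    propose_steps: Callable[[List[str]], List[str]],  # hypothetical: candidate next steps
    is_valid: Callable[[List[str]], bool],            # hypothetical: step-level check
    is_solution: Callable[[List[str]], bool],         # hypothetical: goal test
    trace: Optional[List[str]] = None,
    max_depth: int = 10,
) -> Optional[List[str]]:
    """Extend the reasoning trace one step at a time; when a step fails
    the check, discard only that step and try a sibling, instead of
    regenerating the whole chain from scratch."""
    trace = trace if trace is not None else []
    if is_solution(trace):
        return trace
    if len(trace) >= max_depth:
        return None
    for step in propose_steps(trace):
        if not is_valid(trace + [step]):
            continue  # prune this step, try the next candidate
        result = backtracking_search(propose_steps, is_valid,
                                     is_solution, trace + [step], max_depth)
        if result is not None:
            return result
    return None  # every candidate failed: the caller backtracks further
```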
Improving Language Model Self-Correction Capability with Meta-Feedback
X Li, Y Zhang, L Wang - openreview.net
Large language models (LLMs) are capable of self-correcting their responses by generating
feedback and refining the initial output. However, their performance may sometimes decline …
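A minimal generate-critique-refine loop sketching the self-correction setting the abstract describes; all four callables are hypothetical placeholders, and the `accept` stop criterion guards against the performance decline the abstract mentions:

```python
from typing import Callable

def self_correct(
    draft: Callable[[str], str],             # hypothetical: initial answer
    critique: Callable[[str, str], str],     # hypothetical: model-written feedback
    revise: Callable[[str, str, str], str],  # hypothetical: refinement given feedback
    accept: Callable[[str, str], bool],      # hypothetical: stop criterion
    prompt: str,
    max_rounds: int = 3,
) -> str:
    """Iterate generate -> critique -> refine. Stopping once the answer
    is accepted matters: unbounded refinement can turn a correct answer
    into a wrong one."""
    answer = draft(prompt)
    for _ in range(max_rounds):
        if accept(prompt, answer):
            break
        feedback = critique(prompt, answer)
        answer = revise(prompt, answer, feedback)
    return answer
```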