PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking
MJ Buehler - arXiv preprint arXiv:2410.12375, 2024 - arxiv.org
PRefLexOR (Preference-based Recursive Language Modeling for Exploratory Optimization
of Reasoning) combines preference optimization with concepts from Reinforcement …
In-situ graph reasoning and knowledge expansion using Graph-PReFLexOR
MJ Buehler - arXiv preprint arXiv:2501.08120, 2025 - arxiv.org
The pursuit of automated scientific discovery has fueled progress from symbolic logic to
modern AI, forging new frontiers in reasoning and pattern recognition. Transformers function …
[PDF] Alphathon 2024 Submission: Training a Time-Series LLM for Portfolio Optimization with Parameter-Efficient Online Learning
R Engel, M Xerri, M Gannon - ryanengel.info
In this submission for the SQA 2024 Alphathon, we chose to work on the LLM problem
sponsored by AllianceBernstein. We address this problem by formulating it as a long-term …