Reasoning abilities of large language models: In-depth analysis on the abstraction and reasoning corpus
The existing methods for evaluating the inference abilities of Large Language Models
(LLMs) have been predominantly results-centric, making it challenging to assess the …
Multimodal self-instruct: Synthetic abstract image and visual reasoning instruction using language model
Although most current large multimodal models (LMMs) can already understand photos of
natural scenes and portraits, their understanding of abstract images, e.g., charts, maps, or …
Think twice before assure: Confidence estimation for large language models through reflection on multiple answers
Confidence estimation, which aims to evaluate the trustworthiness of model outputs, is crucial for the application of
large language models (LLMs), especially black-box ones. Existing confidence estimation …
AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML
Automated machine learning (AutoML) accelerates AI development by automating tasks in
the development pipeline, such as optimal model search and hyperparameter tuning …
Agent-Pro: Learning to evolve via policy-level reflection and optimization
Large Language Models exhibit robust problem-solving capabilities for diverse tasks.
However, most LLM-based agents are designed as specific task solvers with sophisticated …
Training language models to critique with multi-agent feedback
Critique ability, a meta-cognitive capability of humans, presents significant challenges for
LLMs to improve. Recent works primarily rely on supervised fine-tuning (SFT) using critiques …
Reasoning and planning with large language models in code development
Large Language Models (LLMs) are revolutionizing the field of code development by
leveraging their deep understanding of code patterns, syntax, and semantics to assist …
Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation
Self-consistency (SC), leveraging multiple samples from LLMs, shows significant gains on
various reasoning tasks but struggles with free-form generation due to the difficulty of …
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models
Large Language Models (LLMs) are often described as being instances of foundation
models, that is, models that transfer strongly across various tasks and conditions in few-shot …
Fine-tuning with divergent chains of thought boosts reasoning through self-correction in language models
Requiring a large language model to generate intermediary reasoning steps has been
shown to be an effective way of boosting performance. In fact, instruction tuning on these …