Reasoning abilities of large language models: In-depth analysis on the Abstraction and Reasoning Corpus

S Lee, W Sim, D Shin, W Seo, J Park, S Lee… - ACM Transactions on …, 2024 - dl.acm.org
The existing methods for evaluating the inference abilities of Large Language Models
(LLMs) have been predominantly results-centric, making it challenging to assess the …

Multimodal self-instruct: Synthetic abstract image and visual reasoning instruction using language model

W Zhang, Z Cheng, Y He, M Wang, Y Shen… - arXiv preprint arXiv …, 2024 - arxiv.org
Although most current large multimodal models (LMMs) can already understand photos of
natural scenes and portraits, their understanding of abstract images, e.g., charts, maps, or …

Think twice before assure: Confidence estimation for large language models through reflection on multiple answers

M Li, W Wang, F Feng, F Zhu, Q Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Confidence estimation, which aims to evaluate output trustability, is crucial for the application of
large language models (LLMs), especially the black-box ones. Existing confidence estimation …

AutoML-Agent: A multi-agent LLM framework for full-pipeline AutoML

P Trirat, W Jeong, SJ Hwang - arXiv preprint arXiv:2410.02958, 2024 - arxiv.org
Automated machine learning (AutoML) accelerates AI development by automating tasks in
the development pipeline, such as optimal model search and hyperparameter tuning …

Agent-Pro: Learning to evolve via policy-level reflection and optimization

W Zhang, K Tang, H Wu, M Wang, Y Shen… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models exhibit robust problem-solving capabilities for diverse tasks.
However, most LLM-based agents are designed as specific task solvers with sophisticated …

Training language models to critique with multi-agent feedback

T Lan, W Zhang, C Lyu, S Li, C Xu, H Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
Critique ability, a meta-cognitive capability of humans, presents significant challenges for
LLMs to improve. Recent works primarily rely on supervised fine-tuning (SFT) using critiques …

Reasoning and planning with large language models in code development

H Ding, Z Fan, I Guehring, G Gupta, W Ha… - Proceedings of the 30th …, 2024 - dl.acm.org
Large Language Models (LLMs) are revolutionizing the field of code development by
leveraging their deep understanding of code patterns, syntax, and semantics to assist …

Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation

X Wang, Y Li, S Feng, P Yuan, B Pan, H Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Self-consistency (SC), leveraging multiple samples from LLMs, shows significant gains on
various reasoning tasks but struggles with free-form generation due to the difficulty of …

Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models

M Nezhurina, L Cipolina-Kun, M Cherti… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) are often described as being instances of foundation
models, that is, models that transfer strongly across various tasks and conditions in few-shot …

Fine-tuning with divergent chains of thought boosts reasoning through self-correction in language models

H Puerto, T Chubakov, X Zhu, HT Madabushi… - 2024 - openreview.net
Requiring a large language model to generate intermediary reasoning steps has been
shown to be an effective way of boosting performance. In fact, instruction tuning on these …