DART-Math: Difficulty-aware rejection tuning for mathematical problem-solving

Y Tong, X Zhang, R Wang, R Wu… - Advances in Neural …, 2025 - proceedings.neurips.cc
Solving mathematical problems requires advanced reasoning abilities and presents notable
challenges for large language models. Previous works usually synthesize data from …

MAmmoTH2: Scaling instructions from the web

X Yue, T Zheng, G Zhang, W Chen - arxiv preprint arxiv:2405.03548, 2024 - arxiv.org
Instruction tuning improves the reasoning abilities of large language models (LLMs), with
data quality and scalability being the crucial factors. Most instruction tuning data come from …

Internal consistency and self-feedback in large language models: A survey

X Liang, S Song, Z Zheng, H Wang, Q Yu, X Li… - arxiv preprint arxiv …, 2024 - arxiv.org
Large language models (LLMs) often exhibit deficient reasoning or generate hallucinations.
To address these, studies prefixed with" Self-" such as Self-Consistency, Self-Improve, and …

A survey on data synthesis and augmentation for large language models

K Wang, J Zhu, M Ren, Z Liu, S Li, Z Zhang… - arxiv preprint arxiv …, 2024 - arxiv.org
The success of Large Language Models (LLMs) is inherently linked to the availability of vast,
diverse, and high-quality data for training and evaluation. However, the growth rate of high …

A theoretical understanding of self-correction through in-context alignment

Y Wang, Y Wu, Z Wei, S Jegelka, Y Wang - arxiv preprint arxiv …, 2024 - arxiv.org
Going beyond mimicking limited human experiences, recent studies show initial evidence
that, like humans, large language models (LLMs) are capable of improving their abilities …

Exploring automated energy optimization with unstructured building data: A multi-agent based framework leveraging large language models

T Xiao, P Xu - Energy and Buildings, 2024 - Elsevier
The building sector is a significant energy consumer, making building energy optimization
crucial for reducing energy demand. Automating energy optimization tasks eases the …

Self-generated critiques boost reward modeling for language models

Y Yu, Z Chen, A Zhang, L Tan, C Zhu, RY Pang… - arxiv preprint arxiv …, 2024 - arxiv.org
Reward modeling is crucial for aligning large language models (LLMs) with human
preferences, especially in reinforcement learning from human feedback (RLHF). However …

Do not think that much for 2+3=? On the overthinking of o1-like LLMs

X Chen, J Xu, T Liang, Z He, J Pang, D Yu… - arxiv preprint arxiv …, 2024 - arxiv.org
The remarkable performance of models like the OpenAI o1 can be attributed to their ability to
emulate human-like long-time thinking during inference. These models employ extended …

PTD-SQL: Partitioning and targeted drilling with LLMs in text-to-SQL

R Luo, L Wang, B Lin, Z Lin, Y Yang - arxiv preprint arxiv:2409.14082, 2024 - arxiv.org
Large Language Models (LLMs) have emerged as powerful tools for Text-to-SQL tasks,
exhibiting remarkable reasoning capabilities. Different from tasks such as math word …

Evaluating the Evaluator: Measuring LLMs' Adherence to Task Evaluation Instructions

B Murugadoss, C Poelitz, I Drosos, V Le… - arxiv preprint arxiv …, 2024 - arxiv.org
LLMs-as-a-judge is a recently popularized method which replaces human judgements in
task evaluation (Zheng et al. 2024) with automatic evaluation using LLMs. Due to …