- Academic Search

From generation to judgment: Opportunities and challenges of LLM-as-a-judge

D Li, B Jiang, L Huang, A Beigi, C Zhao, Z Tan… - arXiv preprint arXiv …, 2024 - arxiv.org

Assessment and evaluation have long been critical challenges in artificial intelligence (AI)
and natural language processing (NLP). However, traditional methods, whether matching …

Cited by: 11

Multi-modal and multi-agent systems meet rationality: A survey

B Jiang, Y Xie, X Wang, WJ Su, CJ Taylor… - ICML 2024 Workshop …, 2024 - openreview.net

Rationality is characterized by logical thinking and decision-making that align with evidence
and logical rules. This quality is essential for effective problem-solving, as it ensures that …

Cited by: 12

DebUnc: mitigating hallucinations in large language model agent communication with uncertainty estimations

L Yoffe, A Amayuelas, WY Wang - arXiv preprint arXiv:2407.06426, 2024 - arxiv.org

To enhance Large Language Model (LLM) capabilities, multi-agent debates have been
introduced, where multiple LLMs discuss solutions to a problem over several rounds of …

Cited by: 8

A Survey on Uncertainty Quantification of Large Language Models: Taxonomy, Open Research Challenges, and Future Directions

O Shorinwa, Z Mei, J Lidard, AZ Ren… - arXiv preprint arXiv …, 2024 - arxiv.org

The remarkable performance of large language models (LLMs) in content generation,
coding, and common-sense reasoning has spurred widespread integration into many facets …

Do LLMs know when to not answer? Investigating abstention abilities of large language models

N Madhusudhan, ST Madhusudhan, V Yadav… - arXiv preprint arXiv …, 2024 - arxiv.org

Abstention Ability (AA) is a critical aspect of Large Language Model (LLM) reliability,
referring to an LLM's capability to withhold responses when uncertain or lacking a definitive …

Cited by: 2

Deliberate reasoning for LLMs as structure-aware planning with accurate world model

S Xiong, A Payani, Y Yang, F Fekri - arXiv preprint arXiv:2410.03136, 2024 - arxiv.org

Enhancing the reasoning capabilities of large language models (LLMs) remains a key
challenge, especially for tasks that require complex, multi-step decision-making. Humans …

Cited by: 2

Understanding the relationship between prompts and response uncertainty in large language models

ZY Zhang, A Verma, F Doshi-Velez… - arXiv preprint arXiv …, 2024 - arxiv.org

Large language models (LLMs) are widely used in decision-making, but their reliability,
especially in critical tasks like healthcare, is not well-established. Therefore, understanding …

Cited by: 1

Cross-Refine: Improving Natural Language Explanation Generation by Learning in Tandem

Q Wang, T Anikina, N Feldhus, S Ostermann… - arXiv preprint arXiv …, 2024 - arxiv.org

Natural language explanations (NLEs) are vital for elucidating the reasoning behind large
language model (LLM) decisions. Many techniques have been developed to generate NLEs …

FactTest: Factuality Testing in Large Language Models with Statistical Guarantees

F Nie, X Hou, S Lin, J Zou, H Yao, L Zhang - arXiv preprint arXiv …, 2024 - arxiv.org

The propensity of Large Language Models (LLMs) to generate hallucinations and non-factual
content undermines their reliability in high-stakes domains, where rigorous control …

MACAROON: Training Vision-Language Models To Be Your Engaged Partners

S Wu, YR Fung, S Li, Y Wan, KW Chang… - arXiv preprint arXiv …, 2024 - arxiv.org

Large vision-language models (LVLMs), while proficient in following instructions and
responding to diverse questions, invariably generate detailed responses even when …

Cited by: 3

SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales

From generation to judgment: Opportunities and challenges of LLM-as-a-judge
