- Academic Search

How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments

J Huang, EJ Li, MH Lam, T Liang, W Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Decision-making is a complex process requiring diverse abilities, making it an excellent
framework for evaluating Large Language Models (LLMs). Researchers have examined …

Cited by 41 · All 3 versions

LLM as a mastermind: A survey of strategic reasoning with large language models

Y Zhang, S Mao, T Ge, X Wang, A de Wynter… - arXiv preprint arXiv …, 2024 - arxiv.org

This paper presents a comprehensive survey of the current status and opportunities for
Large Language Models (LLMs) in strategic reasoning, a sophisticated form of reasoning …

Cited by 28 · All 3 versions

Put your money where your mouth is: Evaluating strategic planning and execution of LLM agents in an auction arena

J Chen, S Yuan, R Ye, BP Majumder… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent advancements in Large Language Models (LLMs) showcase advanced reasoning,
yet NLP evaluations often depend on static benchmarks. Evaluating this necessitates …

Cited by 46 · All 4 versions

Generative AI for game theory-based mobile networking

L He, G Sun, D Niyato, H Du, F Mei… - IEEE Wireless …, 2025 - ieeexplore.ieee.org

With the continuous advancement of network technology, various emerging complex
networking optimization problems have created a wide range of applications utilizing game …

Cited by 4 · All 4 versions

Suspicion-agent: Playing imperfect information games with theory of mind aware GPT-4

J Guo, B Yang, P Yoo, BY Lin, Y Iwasawa… - arXiv preprint arXiv …, 2023 - arxiv.org

Unlike perfect information games, where all elements are known to every player, imperfect
information games emulate the real-world complexities of decision-making under uncertain …

Cited by 30 · All 4 versions

Cooperate or collapse: Emergence of sustainable cooperation in a society of LLM agents

G Piatti, Z Jin, M Kleiman-Weiner… - Advances in …, 2025 - proceedings.neurips.cc

As AI systems pervade human life, ensuring that large language models (LLMs) make safe
decisions remains a significant challenge. We introduce the Governance of the Commons …

Cited by 8 · All 5 versions

Shifting attention to relevance: Towards the predictive uncertainty quantification of free-form large language models

J Duan, H Cheng, S Wang, A Zavalny, C Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) show promising results in language generation and
instruction following but frequently "hallucinate", making their outputs less reliable. Despite …

Cited by 22 · All 6 versions

ReTA: Recursively thinking ahead to improve the strategic reasoning of large language models

J Duan, S Wang, J Diffenderfer, L Sun… - Proceedings of the …, 2024 - aclanthology.org

Current logical reasoning evaluations of Large Language Models (LLMs) primarily focus on
single-turn and static environments, such as arithmetic problems. The crucial problem of …

Cited by 7 · All 2 versions

Nicer Than Humans: How do Large Language Models Behave in the Prisoner's Dilemma?

N Fontana, F Pierri, LM Aiello - arXiv preprint arXiv:2406.13605, 2024 - arxiv.org

The behavior of Large Language Models (LLMs) as artificial social agents is largely
unexplored, and we still lack extensive evidence of how these agents react to simple social …

Cited by 11 · All 2 versions

Multi-agent software development through cross-team collaboration

Z Du, C Qian, W Liu, Z Xie, Y Wang, Y Dang… - arXiv preprint arXiv …, 2024 - arxiv.org

The latest breakthroughs in Large Language Models (LLMs), e.g., ChatDev, have catalyzed
profound transformations, particularly through multi-agent collaboration for software …

Cited by 8 · All 3 versions

GTBench: Uncovering the strategic reasoning limitations of LLMs via game-theoretic evaluations
