Google Scholar

The rise and potential of large language model based agents: A survey

Z **, W Chen, X Guo, W He, Y Ding, B Hong… - Science China …, 2025 - Springer

For a long time, researchers have sought artificial intelligence (AI) that matches or exceeds
human intelligence. AI agents, which are artificial entities capable of sensing the …

Cited by 730; 4 versions

Reinforcement learning algorithms: A brief survey

AK Shakya, G Pillai, S Chakrabarty - Expert Systems with Applications, 2023 - Elsevier

Reinforcement Learning (RL) is a machine learning (ML) technique to learn sequential
decision-making in complex problems. RL is inspired by trial-and-error based human/animal …

Cited by 210; 2 versions

Human-level play in the game of Diplomacy by combining language models with strategic reasoning

Meta Fundamental AI Research Diplomacy Team … - Science, 2022 - science.org

Despite much progress in training artificial intelligence (AI) systems to imitate human
language, building agents that use language to communicate intentionally with humans in …

Cited by 243; 4 versions

Language instructed reinforcement learning for human-AI coordination

H Hu, D Sadigh - International Conference on Machine …, 2023 - proceedings.mlr.press

One of the fundamental quests of AI is to produce agents that coordinate well with humans.
This problem is challenging, especially in domains that lack high quality human behavioral …

Cited by 63; 7 versions

Avalon's game of thoughts: Battle against deception through recursive contemplation

S Wang, C Liu, Z Zheng, S Qi, S Chen, Q Yang… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent breakthroughs in large language models (LLMs) have brought remarkable success
in the field of LLM-as-Agent. Nevertheless, a prevalent assumption is that the information …

Cited by 53; 3 versions

Polynomial-time linear-swap regret minimization in imperfect-information sequential games

G Farina, C Pipis - Advances in Neural Information …, 2024 - proceedings.neurips.cc

No-regret learners seek to minimize the difference between the loss they cumulated through
the actions they played, and the loss they would have cumulated in hindsight had they …

Cited by 10; 5 versions

Evaluating superhuman models with consistency checks

L Fluri, D Paleka, F Tramèr - 2024 IEEE Conference on Secure …, 2024 - ieeexplore.ieee.org

If machine learning models were to achieve superhuman abilities at various reasoning or
decision-making tasks, how would we go about evaluating such models, given that humans …

Cited by 27; 6 versions

Learning to Drive via Asymmetric Self-Play

C Zhang, S Biswas, K Wong, K Fallah, L Zhang… - … on Computer Vision, 2024 - Springer

Large-scale data is crucial for learning realistic and capable driving policies. However, it can
be impractical to rely on scaling datasets with real data alone. The majority of driving data is …

Cited by 1; 8 versions

Minimum coverage sets for training robust ad hoc teamwork agents

M Rahman, J Cui, P Stone - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org

Robustly cooperating with unseen agents and human partners presents significant
challenges due to the diverse cooperative conventions these partners may adopt. Existing …

Cited by 7; 7 versions

The consensus game: Language model generation via equilibrium search

AP Jacob, Y Shen, G Farina, J Andreas - arXiv preprint arXiv:2310.09139, 2023 - arxiv.org

When applied to question answering and other text generation tasks, language models
(LMs) may be queried generatively (by sampling answers from their output distribution) or …

Cited by 15; 5 versions

Mastering the game of no-press Diplomacy via human-regularized reinforcement learning and planning
