- Academic Search

Data Interpreter: An LLM agent for data science

S Hong, Y Lin, B Liu, B Liu, B Wu, C Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Large Language Model (LLM)-based agents have shown effectiveness across many
applications. However, their use in data science scenarios requiring solving long-term …

Cited by 49

Automated program repair via conversation: Fixing 162 out of 337 bugs for $0.42 each using ChatGPT

CS Xia, L Zhang - Proceedings of the 33rd ACM SIGSOFT International …, 2024 - dl.acm.org

Automated Program Repair (APR) aims to automatically generate patches for buggy
programs. Traditional APR techniques suffer from a lack of patch variety as they rely heavily …

Cited by 13

Agent-as-a-judge: Evaluate agents with agents

M Zhuge, C Zhao, D Ashley, W Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Contemporary evaluation techniques are inadequate for agentic systems. These
approaches either focus exclusively on final outcomes--ignoring the step-by-step nature of …

Cited by 15

AgentHarm: A benchmark for measuring harmfulness of LLM agents

M Andriushchenko, A Souly, M Dziemian… - arXiv preprint arXiv …, 2024 - arxiv.org

The robustness of LLMs to jailbreak attacks, where users design prompts to circumvent
safety measures and misuse model capabilities, has been studied primarily for LLMs acting …

Cited by 9

OpenCoder: The open cookbook for top-tier code large language models

S Huang, T Cheng, JK Liu, J Hao, L Song, Y Xu… - arXiv preprint arXiv …, 2024 - arxiv.org

Large language models (LLMs) for code have become indispensable in various domains,
including code generation, reasoning tasks and agent systems. While open-access code …

Cited by 11

MarsCode Agent: AI-native automated bug fixing

Y Liu, P Gao, X Wang, J Liu, Y Shi, Z Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advances in large language models (LLMs) have shown significant potential to
automate various software development tasks, including code completion, test generation …

Cited by 15

SpecRover: Code intent extraction via LLMs

H Ruan, Y Zhang, A Roychoudhury - arXiv preprint arXiv:2408.02232, 2024 - arxiv.org

Autonomous program improvement typically involves automatically producing bug fixes and
feature additions. Such program improvement can be accomplished by a combination of …

Cited by 8

RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph

S Ouyang, W Yu, K Ma, Z Xiao, Z Zhang, M Jia… - arXiv preprint arXiv …, 2024 - arxiv.org

Large Language Models (LLMs) excel in code generation yet struggle with modern AI
software engineering tasks. Unlike traditional function-level or file-level coding tasks, AI …

Cited by 5

AutoGLM: Autonomous foundation agents for GUIs

X Liu, B Qin, D Liang, G Dong, H Lai, H Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

We present AutoGLM, a new series in the ChatGLM family, designed to serve as foundation
agents for autonomous control of digital devices through Graphical User Interfaces (GUIs) …

Cited by 3

Diversity empowers intelligence: Integrating expertise of software engineering agents

K Zhang, W Yao, Z Liu, Y Feng, Z Liu, R Murthy… - arXiv preprint arXiv …, 2024 - arxiv.org

Large language model (LLM) agents have shown great potential in solving real-world
software engineering (SWE) problems. The most advanced open-source SWE agent can …

Cited by 7
