Google Scholar

T GLM, A Zeng, B Xu, B Wang, C Zhang, D Yin… - ar…

… over time. This report primarily focuses on the GLM-4 language series, which …

Cited by 290 · All 5 versions

Where Are Large Language Models for Code Generation on GitHub?

X Yu, L Liu, X Hu, JW Keung, J Liu, X Xia - arxiv preprint arxiv:2406.19544, 2024 - arxiv.org

The increasing use of Large Language Models (LLMs) in software development has
garnered significant attention from researchers assessing the quality of the code they …

Cited by 8 · All 3 versions

AutoGLM: Autonomous foundation agents for GUIs

X Liu, B Qin, D Liang, G Dong, H Lai, H Zhang… - arxiv preprint arxiv …, 2024 - arxiv.org

We present AutoGLM, a new series in the ChatGLM family, designed to serve as foundation
agents for autonomous control of digital devices through Graphical User Interfaces (GUIs) …

Cited by 4 · All 2 versions

Diagnosing and Remedying Knowledge Deficiencies in LLMs via Label-free Curricular Meaningful Learning

K Xiong, X Ding, L Du, J Ying, T Liu, B Qin… - arxiv preprint arxiv …, 2024 - arxiv.org

Large Language Models (LLMs) are versatile and demonstrate impressive generalization
ability by mining and learning information from extensive unlabeled text. However, they still …

Cited by 1 · All 3 versions

On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective

Y Huang, C Gao, S Wu, H Wang, X Wang… - arxiv preprint arxiv …, 2025 - arxiv.org

Generative Foundation Models (GenFMs) have emerged as transformative tools. However,
their widespread adoption raises critical concerns regarding trustworthiness across …

All 2 versions

How Well Do LLMs Generate Code for Different Application Domains? Benchmark and Evaluation

D Zheng, Y Wang, E Shi, H Zhang, Z Zheng - arxiv preprint arxiv …, 2024 - arxiv.org

Recently, an increasing number of AI-driven programming assistants powered by code
LLMs have been integrated into various real-world software development environments …

All 3 versions

FVEval: Understanding Language Model Capabilities in Formal Verification of Digital Hardware

M Kang, M Liu, GB Hamad, S Suhaib, H Ren - arxiv preprint arxiv …, 2024 - arxiv.org

The remarkable reasoning and code generation capabilities of large language models
(LLMs) have spurred significant interest in applying LLMs to enable task automation in …

All 2 versions

DataSciBench: An LLM Agent Benchmark for Data Science

D Zhang, S Zhoubian, M Cai, F Li, L Yang… - arxiv preprint arxiv …, 2025 - arxiv.org

This paper presents DataSciBench, a comprehensive benchmark for evaluating Large
Language Model (LLM) capabilities in data science. Recent related benchmarks have …

All 2 versions

LLMs for malware offense and defense

K Roach - kellyroach.com

This survey paper explores the use of large language models (LLMs) in generating and
defending against malware. The paper summarizes news reports and research papers on …

Cited by 1

NaturalCodeBench: Examining coding performance mismatch on HumanEval and natural user prompts

ChatGLM: A family of large language models from GLM-130B to GLM-4 All Tools
