- Academic Search

LiveCodeBench: Holistic and contamination free evaluation of large language models for code

N Jain, K Han, A Gu, WD Li, F Yan, T Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Large Language Models (LLMs) applied to code-related applications have emerged as a
prominent field, attracting significant interest from both academia and industry. However, as …

Cited by 117 · All 5 versions

CRUXEval: A benchmark for code reasoning, understanding and execution

A Gu, B Rozière, H Leather, A Solar-Lezama… - arXiv preprint arXiv …, 2024 - arxiv.org

We present CRUXEval (Code Reasoning, Understanding, and eXecution Evaluation), a
benchmark consisting of 800 Python functions (3-13 lines). Each function comes with an …

Cited by 60 · All 8 versions

Transformers in source code generation: A comprehensive survey

H Ghaemi, Z Alizadehsani, A Shahraki… - Journal of Systems …, 2024 - Elsevier

Transformers have revolutionized natural language processing (NLP) and have had a huge
impact on automating tasks. Recently, transformers have led to the development of powerful …

Cited by 5 · All 2 versions

Multilingual training for software engineering

T Ahmed, P Devanbu - Proceedings of the 44th International Conference …, 2022 - dl.acm.org

Well-trained machine-learning models, which leverage large amounts of open-source
software data, have now become an interesting approach to automating many software …

Cited by 80 · All 7 versions

A catalog of data smells for coding tasks

A Vitale, R Oliveto, S Scalabrino - ACM Transactions on Software …, 2024 - dl.acm.org

Large Language Models (LLMs) are increasingly becoming fundamental in supporting
software developers in coding tasks. The massive datasets used for training LLMs are often …

Cited by 1 · All 3 versions

Formal specifications from natural language

C Hahn, F Schmitt, JJ Tillman, N Metzger… - arXiv preprint arXiv …, 2022 - arxiv.org

We study the generalization abilities of language models when translating natural language
into formal specifications with complex semantics. In particular, we fine-tune language …

Cited by 39 · All 4 versions

EffiBench: Benchmarking the efficiency of automatically generated code

D Huang, Y Qing, W Shang, H Cui… - arXiv preprint arXiv …, 2024 - arxiv.org

Code generation models have increasingly become integral to aiding software
development. Although current research has thoroughly examined the correctness of the …

Cited by 18 · All 5 versions

The counterfeit conundrum: Can code language models grasp the nuances of their incorrect generations?

A Gu, WD Li, N Jain, TX Olausson, C Lee, K Sen… - arXiv preprint arXiv …, 2024 - arxiv.org

While language models are increasingly more proficient at code generation, they still
frequently generate incorrect programs. Many of these programs are obviously wrong, but …

Cited by 10 · All 6 versions

MHPP: Exploring the capabilities and limitations of language models beyond basic code generation

J Dai, J Lu, Y Feng, D Huang, G Zeng, R Ruan… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advancements in large language models (LLMs) have greatly improved code
generation, specifically at the function level. For instance, GPT-4o has achieved a 91.0 …

Cited by 6 · All 3 versions

The Vault: A comprehensive multilingual dataset for advancing code understanding and generation

DN Manh, NL Hai, ATV Dau, AM Nguyen… - arXiv preprint arXiv …, 2023 - arxiv.org

We present The Vault, a dataset of high-quality code-text pairs in multiple programming
languages for training large language models to understand and generate code. We present …

Cited by 14 · All 8 versions
