Qwen2.5-Coder technical report
In this report, we introduce the Qwen2.5-Coder series, a significant upgrade from its
predecessor, CodeQwen1.5. This series includes six models: Qwen2.5-Coder-(0.5B/1.5 …
StarCoder 2 and The Stack v2: The next generation
The BigCode project, an open-scientific collaboration focused on the responsible
development of Large Language Models for Code (Code LLMs), introduces StarCoder2. In …
Evaluating language models for efficient code generation
We introduce Differential Performance Evaluation (DPE), a framework designed to reliably
evaluate Large Language Models (LLMs) for efficient code generation. Traditional coding …
The Mamba in the Llama: Distilling and accelerating hybrid models
Linear RNN architectures, like Mamba, can be competitive with Transformer models in
language modeling while having advantageous deployment characteristics. Given the focus …
CodeMind: A framework to challenge large language models for code reasoning
Solely relying on test passing to evaluate Large Language Models (LLMs) for code
synthesis may result in unfair assessment or promoting models with data leakage. As an …
MHPP: Exploring the capabilities and limitations of language models beyond basic code generation
Recent advancements in large language models (LLMs) have greatly improved code
generation, specifically at the function level. For instance, GPT-4o has achieved a 91.0 …
Evaluating and aligning CodeLLMs on human preference
Code large language models (codeLLMs) have made significant strides in code generation.
Most previous code-related benchmarks, which consist of various programming exercises …
SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning
Code Large Language Models (Code LLMs) have excelled at tasks like code
completion but often miss deeper semantics such as execution effects and dynamic states …
R2E: Turning any Github Repository into a Programming Agent Environment
While Large Language Models' (LLMs) coding capabilities have advanced rapidly,
corresponding evaluation benchmarks on real-world programming setups are yet to catch …
TestGenEval: A real world unit test generation and test completion benchmark
Code generation models can help improve many common software tasks ranging from code
completion to defect prediction. Most of the existing benchmarks for code generation LLMs …