Explainable generative AI (GenXAI): A survey, conceptualization, and research agenda

J Schneider - Artificial Intelligence Review, 2024 - Springer
Generative AI (GenAI) represents a shift from AI's ability to “recognize” to its ability to
“generate” solutions for a wide range of tasks. As generated solutions and applications grow …

Not all tokens are what you need for pretraining

Z Lin, Z Gou, Y Gong, X Liu, R Xu… - Advances in …, 2025 - proceedings.neurips.cc
Previous language model pre-training methods have uniformly applied a next-token
prediction loss to all training tokens. Challenging this norm, we posit that "Not all tokens in a …

Rho-1: Not all tokens are what you need

Z Lin, Z Gou, Y Gong, X Liu, Y Shen, R Xu, C Lin… - arXiv preprint arXiv …, 2024 - arxiv.org
Previous language model pre-training methods have uniformly applied a next-token
prediction loss to all training tokens. Challenging this norm, we posit that "Not all tokens in a corpus are equally important for language model training". Our …

GTBench: Uncovering the strategic reasoning limitations of LLMs via game-theoretic evaluations

J Duan, R Zhang, J Diffenderfer, B Kailkhura… - arXiv preprint arXiv …, 2024 - arxiv.org
As Large Language Models (LLMs) are integrated into critical real-world applications, their
strategic and logical reasoning abilities are increasingly crucial. This paper evaluates LLMs' …

A peek into token bias: Large language models are not yet genuine reasoners

B Jiang, Y Xie, Z Hao, X Wang, T Mallick, WJ Su… - arXiv preprint arXiv …, 2024 - arxiv.org
This study introduces a hypothesis-testing framework to assess whether large language
models (LLMs) possess genuine reasoning abilities or primarily depend on token bias. We …

[PDF] A comprehensive survey of small language models in the era of large language models: Techniques, enhancements, applications, collaboration with LLMs, and …

F Wang, Z Zhang, X Zhang, Z Wu, T Mo, Q Lu… - arXiv preprint arXiv …, 2024 - ai.radensa.ru
Large language models (LLMs) have demonstrated emergent abilities in text generation,
question answering, and reasoning, facilitating various tasks and domains. Despite their …

InstructGraph: Boosting large language models via graph-centric instruction tuning and preference alignment

J Wang, J Wu, Y Hou, Y Liu, M Gao… - arXiv preprint arXiv …, 2024 - arxiv.org
Do current large language models (LLMs) better solve graph reasoning and generation
tasks with parameter updates? In this paper, we propose InstructGraph, a framework that …

SelfPiCo: Self-guided partial code execution with LLMs

Z Xue, Z Gao, S Wang, X Hu, X Xia, S Li - Proceedings of the 33rd ACM …, 2024 - dl.acm.org
Code executability plays a vital role in software debugging and testing (e.g., detecting runtime
exceptions or assertion violations). However, code execution, especially partial or arbitrary …

Can Large Language Models Understand Symbolic Graphics Programs?

Z Qiu, W Liu, H Feng, Z Liu, TZ Xiao, KM Collins… - arXiv preprint arXiv …, 2024 - arxiv.org
Against the backdrop of enthusiasm for large language models (LLMs), there is an urgent
need to scientifically assess their capabilities and shortcomings. This is nontrivial in part …

Mitigating catastrophic forgetting in language transfer via model merging

A Alexandrov, V Raychev, MN Müller, C Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
As open-weight large language models (LLMs) achieve ever more impressive performances
across a wide range of tasks in English, practitioners aim to adapt these models to different …