Explainable generative AI (GenXAI): A survey, conceptualization, and research agenda

J Schneider - Artificial Intelligence Review, 2024 - Springer
Generative AI (GenAI) represents a shift from AI's ability to “recognize” to its ability to
“generate” solutions for a wide range of tasks. As generated solutions and applications grow …

Not all tokens are what you need for pretraining

Z Lin, Z Gou, Y Gong, X Liu, R Xu… - Advances in …, 2025 - proceedings.neurips.cc
Previous language model pre-training methods have uniformly applied a next-token
prediction loss to all training tokens. Challenging this norm, we posit that "Not all tokens in a …

Rho-1: Not all tokens are what you need

Z Lin, Z Gou, Y Gong, X Liu, Y Shen, R Xu, C Lin… - arXiv preprint arXiv …, 2024 - arxiv.org
Previous language model pre-training methods have uniformly applied a next-token
prediction loss to all training tokens. Challenging this norm, we posit that "Not all tokens in a corpus are equally important for language model training". Our …

GTBench: Uncovering the strategic reasoning limitations of LLMs via game-theoretic evaluations

J Duan, R Zhang, J Diffenderfer, B Kailkhura… - arXiv preprint arXiv …, 2024 - arxiv.org
As Large Language Models (LLMs) are integrated into critical real-world applications, their
strategic and logical reasoning abilities are increasingly crucial. This paper evaluates LLMs' …

A peek into token bias: Large language models are not yet genuine reasoners

B Jiang, Y Xie, Z Hao, X Wang, T Mallick, WJ Su… - arXiv preprint arXiv …, 2024 - arxiv.org
This study introduces a hypothesis-testing framework to assess whether large language
models (LLMs) possess genuine reasoning abilities or primarily depend on token bias. We …

[PDF] A comprehensive survey of small language models in the era of large language models: Techniques, enhancements, applications, collaboration with LLMs, and …

F Wang, Z Zhang, X Zhang, Z Wu, T Mo, Q Lu… - arXiv preprint arXiv …, 2024 - ai.radensa.ru
Large language models (LLMs) have demonstrated emergent abilities in text generation,
question answering, and reasoning, facilitating various tasks and domains. Despite their …

InstructGraph: Boosting large language models via graph-centric instruction tuning and preference alignment

J Wang, J Wu, Y Hou, Y Liu, M Gao… - arXiv preprint arXiv …, 2024 - arxiv.org
Do current large language models (LLMs) better solve graph reasoning and generation
tasks with parameter updates? In this paper, we propose InstructGraph, a framework that …

SelfPiCo: Self-guided partial code execution with LLMs

Z Xue, Z Gao, S Wang, X Hu, X Xia, S Li - Proceedings of the 33rd ACM …, 2024 - dl.acm.org
Code executability plays a vital role in software debugging and testing (e.g., detecting runtime
exceptions or assertion violations). However, code execution, especially partial or arbitrary …

Can Large Language Models Understand Symbolic Graphics Programs?

Z Qiu, W Liu, H Feng, Z Liu, TZ Xiao, KM Collins… - arXiv preprint arXiv …, 2024 - arxiv.org
Against the backdrop of enthusiasm for large language models (LLMs), there is an urgent
need to scientifically assess their capabilities and shortcomings. This is nontrivial in part …

Mitigating catastrophic forgetting in language transfer via model merging

A Alexandrov, V Raychev, MN Müller, C Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
As open-weight large language models (LLMs) achieve ever more impressive performances
across a wide range of tasks in English, practitioners aim to adapt these models to different …