Clinical insights: A comprehensive review of language models in medicine

N Neveditsin, P Lingras, V Mago - arXiv preprint arXiv:2408.11735, 2024 - arxiv.org
This paper provides a detailed examination of the advancements and applications of large
language models in the healthcare sector, with a particular emphasis on clinical …

Against The Achilles' Heel: A Survey on Red Teaming for Generative Models

L Lin, H Mu, Z Zhai, M Wang, Y Wang, R Wang… - Journal of Artificial …, 2025 - jair.org
Generative models are rapidly gaining popularity and being integrated into everyday
applications, raising concerns over their safe use as various vulnerabilities are exposed. In …

Large language models as surrogate models in evolutionary algorithms: A preliminary study

H Hao, X Zhang, A Zhou - Swarm and Evolutionary Computation, 2024 - Elsevier
Large Language Models (LLMs) have demonstrated remarkable advancements
across diverse domains, manifesting considerable capabilities in evolutionary computation …

Cut your losses in large-vocabulary language models

E Wijmans, B Huval, A Hertzberg, V Koltun… - arXiv preprint arXiv …, 2024 - arxiv.org
As language models grow ever larger, so do their vocabularies. This has shifted the memory
footprint of LLMs during training disproportionately to one single layer: the cross-entropy in …

When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training

H Wang, Q Liu, C Du, T Zhu, C Du… - arXiv preprint arXiv …, 2024 - arxiv.org
Extending context window sizes allows large language models (LLMs) to process longer
sequences and handle more complex tasks. Rotary Positional Embedding (RoPE) has …

Not All LLM Reasoners Are Created Equal

A Hosseini, A Sordoni, D Toyama, A Courville… - arXiv preprint arXiv …, 2024 - arxiv.org
We study the depth of grade-school math (GSM) problem-solving capabilities of LLMs. To
this end, we evaluate their performance on pairs of existing math word problems together so …

Steel design based on a large language model

S Tian, X Jiang, W Wang, Z Jing, C Zhang, C Zhang… - Acta Materialia, 2025 - Elsevier
The success of artificial intelligence (AI) in materials research heavily relies on the integrity
of structured data and the construction of precise descriptors. In this study, we present an …

High-dimensional analysis of knowledge distillation: Weak-to-strong generalization and scaling laws

ME Ildiz, HA Gozeten, EO Taga, M Mondelli… - arXiv preprint arXiv …, 2024 - arxiv.org
A growing number of machine learning scenarios rely on knowledge distillation where one
uses the output of a surrogate model as labels to supervise the training of a target model. In …

Transfer Learning for Finetuning Large Language Models

T Strangmann, L Purucker, JKH Franke… - arXiv preprint arXiv …, 2024 - arxiv.org
As the landscape of large language models expands, efficiently finetuning for specific tasks
becomes increasingly crucial. At the same time, the landscape of parameter-efficient …

NatureLM: Deciphering the Language of Nature for Scientific Discovery

Y Xia, P Jin, S Xie, L He, C Cao, R Luo, G Liu… - arXiv preprint arXiv …, 2025 - arxiv.org
Foundation models have revolutionized natural language processing and artificial
intelligence, significantly enhancing how machines comprehend and generate human …