Toward expert-level medical question answering with large language models

K Singhal, T Tu, J Gottweis, R Sayres, E Wulczyn… - Nature Medicine, 2025 - nature.com
Large language models (LLMs) have shown promise in medical question answering, with
Med-PaLM being the first to exceed a 'passing' score in United States Medical Licensing …

Gemma Scope: Open sparse autoencoders everywhere all at once on Gemma 2

T Lieberum, S Rajamanoharan, A Conmy… - arXiv preprint arXiv …, 2024 - arxiv.org
Sparse autoencoders (SAEs) are an unsupervised method for learning a sparse
decomposition of a neural network's latent representations into seemingly interpretable …

A survey of robot intelligence with large language models

H Jeong, H Lee, C Kim, S Shin - Applied Sciences, 2024 - mdpi.com
Since the emergence of ChatGPT, research on large language models (LLMs) has actively
progressed across various fields. LLMs, pre-trained on vast text datasets, have exhibited …

Generative verifiers: Reward modeling as next-token prediction

L Zhang, A Hosseini, H Bansal, M Kazemi… - arXiv preprint arXiv …, 2024 - arxiv.org
Verifiers or reward models are often used to enhance the reasoning performance of large
language models (LLMs). A common approach is the Best-of-N method, where N candidate …

Large language model inference acceleration: A comprehensive hardware perspective

J Li, J Xu, S Huang, Y Chen, W Li, J Liu, Y Lian… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities across various
fields, from natural language understanding to text generation. Compared to non-generative …

Smaller, weaker, yet better: Training LLM reasoners via compute-optimal sampling

H Bansal, A Hosseini, R Agarwal, VQ Tran… - arXiv preprint arXiv …, 2024 - arxiv.org
Training on high-quality synthetic data from strong language models (LMs) is a common
strategy to improve the reasoning performance of LMs. In this work, we revisit whether this …

miRTarBase 2025: updates to the collection of experimentally validated microRNA–target interactions

S Cui, S Yu, HY Huang, YCD Lin, Y Huang… - Nucleic Acids …, 2025 - academic.oup.com
MicroRNAs (miRNAs) are small non-coding RNAs (18–26 nucleotides) that regulate gene
expression by interacting with target mRNAs, affecting various physiological and …

Out-of-distribution generalization via composition: a lens through induction heads in transformers

J Song, Z Xu, Y Zhong - Proceedings of the National Academy of Sciences, 2025 - pnas.org
Large language models (LLMs) such as GPT-4 sometimes appear to be creative, solving
novel tasks often with a few demonstrations in the prompt. These tasks require the models to …

Skywork-Reward: Bag of tricks for reward modeling in LLMs

CY Liu, L Zeng, J Liu, R Yan, J He, C Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
In this report, we introduce a collection of methods to enhance reward modeling for LLMs,
focusing specifically on data-centric techniques. We propose effective data selection and …

Evaluating copyright takedown methods for language models

B Wei, W Shi, Y Huang, NA Smith, C Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Language models (LMs) derive their capabilities from extensive training on diverse data,
including potentially copyrighted material. These models can memorize and generate …