ReST-MCTS*: LLM self-training via process reward guided tree search

D Zhang, S Zhoubian, Z Hu, Y Yue… - Advances in Neural …, 2025 - proceedings.neurips.cc
Recent methodologies in LLM self-training mostly rely on the LLM generating responses and
filtering those with correct output answers as training data. This approach often yields a low …

Benchmarking Human–AI collaboration for common evidence appraisal tools

T Woelfle, J Hirt, P Janiaud, L Kappos… - Journal of Clinical …, 2024 - Elsevier
Background: It is unknown whether large language models (LLMs) may facilitate time- and
resource-intensive text-related processes in evidence appraisal. Objectives: To quantify the …

Is a picture worth a thousand words? Delving into spatial reasoning for vision language models

J Wang, Y Ming, Z Shi, V Vineet… - Advances in Neural …, 2025 - proceedings.neurips.cc
Large language models (LLMs) and vision-language models (VLMs) have demonstrated
remarkable performance across a wide range of tasks and domains. Despite this promise …

Spider2-V: How far are multimodal agents from automating data science and engineering workflows?

R Cao, F Lei, H Wu, J Chen, Y Fu… - Advances in …, 2025 - proceedings.neurips.cc
Data science and engineering workflows often span multiple stages, from warehousing to
orchestration, using tools like BigQuery, dbt, and Airbyte. As vision language models (VLMs) …

Tensor attention training: Provably efficient learning of higher-order transformers

Y Liang, Z Shi, Z Song, Y Zhou - arXiv preprint arXiv:2405.16411, 2024 - arxiv.org
Tensor Attention, a multi-view attention that is able to capture high-order correlations among
multiple modalities, can overcome the representational limitations of classical matrix …

Large language model inference acceleration: A comprehensive hardware perspective

J Li, J Xu, S Huang, Y Chen, W Li, J Liu, Y Lian… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities across various
fields, from natural language understanding to text generation. Compared to non-generative …

RouteLLM: Learning to Route LLMs from Preference Data

I Ong, A Almahairi, V Wu, WL Chiang, T Wu… - The Thirteenth …, 2024 - openreview.net
Large language models (LLMs) excel at a wide range of tasks, but choosing the right model
often involves balancing performance and cost. Powerful models offer better results but are …

Conv-Basis: A new paradigm for efficient attention inference and gradient computation in transformers

Y Liang, H Liu, Z Shi, Z Song, Z Xu, J Yin - arXiv preprint arXiv:2405.05219, 2024 - arxiv.org
The self-attention mechanism is the key to the success of transformers in recent Large
Language Models (LLMs). However, the quadratic computational cost $O(n^2)$ in the …

Navigating the safety landscape: Measuring risks in finetuning large language models

SY Peng, PY Chen, M Hull, DH Chau - arXiv preprint arXiv:2405.17374, 2024 - arxiv.org
Safety alignment is crucial to ensure that large language models (LLMs) behave in ways that
align with human preferences and prevent harmful actions during inference. However …

Harmful fine-tuning attacks and defenses for large language models: A survey

T Huang, S Hu, F Ilhan, SF Tekin, L Liu - arXiv preprint arXiv:2409.18169, 2024 - arxiv.org
Recent research demonstrates that the nascent fine-tuning-as-a-service business model
exposes serious safety concerns: fine-tuning on a few harmful examples uploaded by users …