A survey on transformer compression

Y Tang, Y Wang, J Guo, Z Tu, K Han, H Hu… - arXiv preprint arXiv…, 2024 - arxiv.org
Transformers play a vital role in the realms of natural language processing (NLP) and computer vision (CV), especially for constructing large language models (LLMs) and large …

Reasoning with large language models, a survey

A Plaat, A Wong, S Verberne, J Broekens… - arXiv preprint arXiv…, 2024 - arxiv.org
Scaling up language models to billions of parameters has opened up possibilities for in-context learning, allowing instruction tuning and few-shot learning on tasks that the model …

Metamath: Bootstrap your own mathematical questions for large language models

L Yu, W Jiang, H Shi, J Yu, Z Liu, Y Zhang… - arXiv preprint arXiv…, 2023 - arxiv.org
Large language models (LLMs) have pushed the limits of natural language understanding and exhibited excellent problem-solving ability. Despite their great success, most existing …

A survey on model compression for large language models

X Zhu, J Li, Y Liu, C Ma, W Wang - Transactions of the Association for …, 2024 - direct.mit.edu
Large Language Models (LLMs) have successfully transformed natural language processing tasks. Yet, their large size and high computational needs pose challenges for …

Towards reasoning in large language models: A survey

J Huang, KCC Chang - arXiv preprint arXiv:2212.10403, 2022 - arxiv.org
Reasoning is a fundamental aspect of human intelligence that plays a crucial role in
activities such as problem solving, decision making, and critical thinking. In recent years …

Specializing smaller language models towards multi-step reasoning

Y Fu, H Peng, L Ou, A Sabharwal… - … on Machine Learning, 2023 - proceedings.mlr.press
The surprising ability of Large Language Models (LLMs) to perform well on complex
reasoning with only few-shot chain-of-thought prompts is believed to emerge only in very …

Large language models are reasoning teachers

N Ho, L Schmid, SY Yun - arXiv preprint arXiv:2212.10071, 2022 - arxiv.org
Recent works have shown that chain-of-thought (CoT) prompting can lead language models to solve complex reasoning tasks step by step. However, prompt-based CoT methods are …

Reasoning with language model prompting: A survey

S Qiao, Y Ou, N Zhang, X Chen, Y Yao, S Deng… - arXiv preprint arXiv…, 2022 - arxiv.org
Reasoning, as an essential ability for complex problem-solving, can provide back-end
support for various real-world applications, such as medical diagnosis, negotiation, etc. This …

Teaching small language models to reason

LC Magister, J Mallinson, J Adamek, E Malmi… - arXiv preprint arXiv…, 2022 - arxiv.org
Chain-of-thought prompting successfully improves the reasoning capabilities of large language models, achieving state-of-the-art results on a range of datasets. However, these …

Quiet-STaR: Language models can teach themselves to think before speaking

E Zelikman, GR Harik, Y Shao, V Jayasiri… - First Conference on …, 2024 - openreview.net
When writing and talking, people sometimes pause to think. Although reasoning-focused
works have often framed reasoning as a method of answering questions or completing …