Transformers learn shortcuts to automata

B Liu, JT Ash, S Goel, A Krishnamurthy… - arXiv preprint arXiv …, 2022 - arxiv.org
Algorithmic reasoning requires capabilities which are most naturally understood through
recurrent models of computation, like the Turing machine. However, Transformer models …

Open-world story generation with structured knowledge enhancement: A comprehensive survey

Y Wang, J Lin, Z Yu, W Hu, BF Karlsson - Neurocomputing, 2023 - Elsevier
Storytelling and narrative are fundamental to human experience, intertwined with our social
and cultural engagement. As such, researchers have long attempted to create systems that …

Skeleton-of-thought: Large language models can do parallel decoding

X Ning, Z Lin, Z Zhou, Z Wang, H Yang… - Proceedings ENLSP …, 2023 - lirias.kuleuven.be
This work aims at decreasing the end-to-end generation latency of large language models
(LLMs). One of the major causes of the high generation latency is the sequential decoding …

Medusa: Simple LLM inference acceleration framework with multiple decoding heads

T Cai, Y Li, Z Geng, H Peng, JD Lee, D Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
The inference process in Large Language Models (LLMs) is often limited due to the absence
of parallelism in the auto-regressive decoding process, resulting in most operations being …
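The bottleneck the Medusa snippet refers to can be seen in a minimal sketch of greedy auto-regressive decoding: step t consumes the token produced at step t-1, so the steps form a serial chain. The toy bigram table below stands in for the model; all names here are illustrative, not Medusa's API.

```python
# Toy stand-in for a language model: maps the previous token to the
# single most likely next token (a bigram lookup, not a real LLM).
TOY_MODEL = {
    "<s>": "the",
    "the": "cat",
    "cat": "sat",
    "sat": "</s>",
}

def autoregressive_decode(start: str, max_steps: int = 10) -> list[str]:
    """Generate tokens one at a time; each step depends on the last output."""
    tokens = []
    prev = start
    for _ in range(max_steps):
        nxt = TOY_MODEL[prev]  # input here is the previous step's output,
        if nxt == "</s>":      # so the loop iterations cannot run in parallel
            break
        tokens.append(nxt)
        prev = nxt
    return tokens

print(autoregressive_decode("<s>"))  # ['the', 'cat', 'sat']
```

Approaches like Medusa's extra decoding heads, skeleton-of-thought, and Jacobi-style parallel decoding all attack this same serial dependency by proposing several tokens per model invocation and then verifying them.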

Towards efficient generative large language model serving: A survey from algorithms to systems

X Miao, G Oliaro, Z Zhang, X Cheng, H **… - arXiv preprint arXiv …, 2023 - arxiv.org
In the rapidly evolving landscape of artificial intelligence (AI), generative large language
models (LLMs) stand at the forefront, revolutionizing how we interact with our data. However …

Boosting consistency in story visualization with rich-contextual conditional diffusion models

F Shen, H Ye, S Liu, J Zhang, C Wang, X Han… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent research showcases the considerable potential of conditional diffusion models for
generating consistent stories. However, current methods, which predominantly generate …

Retrosynthesis prediction with an iterative string editing model

Y Han, X Xu, CY Hsieh, K Ding, H Xu, R Xu… - Nature …, 2024 - nature.com
Retrosynthesis is a crucial task in drug discovery and organic synthesis, where artificial
intelligence (AI) is increasingly employed to expedite the process. However, existing …

Text generation with diffusion language models: A pre-training approach with continuous paragraph denoise

Z Lin, Y Gong, Y Shen, T Wu, Z Fan… - International …, 2023 - proceedings.mlr.press
In this paper, we introduce a novel dIffusion language modEl pre-training framework for text
generation, which we call GENIE. GENIE is a large-scale pre-trained diffusion language …

Accelerating transformer inference for translation via parallel decoding

A Santilli, S Severino, E Postolache, V Maiorca… - arXiv preprint arXiv …, 2023 - arxiv.org
Autoregressive decoding limits the efficiency of transformers for Machine Translation (MT).
The community proposed specific network architectures and learning-based methods to …

Model-enhanced vector index

H Zhang, Y Wang, Q Chen, R Chang… - Advances in …, 2024 - proceedings.neurips.cc
Embedding-based retrieval methods construct vector indices to search for document
representations that are most similar to the query representations. They are widely used in …