An overview on language models: Recent developments and outlook

C Wei, YC Wang, B Wang, CCJ Kuo - arXiv preprint arXiv:2303.05759, 2023 - arxiv.org
Language modeling studies the probability distributions over strings of text. It is one of the
most fundamental tasks in natural language processing (NLP). It has been widely used in …
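
For a concrete sense of what "probability distributions over strings" means here, a minimal sketch (my toy example, not taken from the survey) scores a sentence with the chain rule p(x) = p(x_1) p(x_2 | x_1) ... p(x_T | x_<T), using a hand-written bigram table as the conditional model:

```python
# Toy illustration of chain-rule sequence scoring (hypothetical example, not from the paper).
import math

# Toy bigram conditional probabilities p(next | previous); "<s>" marks sentence start.
bigram = {
    ("<s>", "the"): 0.6, ("<s>", "a"): 0.4,
    ("the", "cat"): 0.5, ("the", "dog"): 0.5,
    ("a", "cat"): 0.3, ("a", "dog"): 0.7,
    ("cat", "sleeps"): 1.0, ("dog", "sleeps"): 1.0,
}

def sequence_log_prob(tokens):
    """Sum log p(x_t | x_{t-1}) over the sequence (bigram approximation of the chain rule)."""
    log_p = 0.0
    prev = "<s>"
    for tok in tokens:
        log_p += math.log(bigram.get((prev, tok), 1e-12))  # tiny floor for unseen pairs
        prev = tok
    return log_p

print(sequence_log_prob(["the", "cat", "sleeps"]))  # log(0.6 * 0.5 * 1.0)
```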

A contrastive framework for neural text generation

Y Su, T Lan, Y Wang, D Yogatama… - Advances in Neural …, 2022 - proceedings.neurips.cc
Text generation is of great importance to many natural language processing applications.
However, maximization-based decoding methods (e.g., beam search) of neural language …
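
The decoding algorithm this paper proposes, contrastive search, balances model confidence against a degeneration penalty when picking among the top-k candidates. Below is a simplified sketch of that scoring rule only; the standalone-step framing, array shapes, and names are my assumptions rather than the authors' released implementation:

```python
# Hypothetical sketch of a contrastive-search selection step (simplified).
import numpy as np

def contrastive_search_step(cand_probs, cand_hiddens, prev_hiddens, alpha=0.6):
    """Pick the next token among top-k candidates.

    cand_probs:   (k,)   model confidence p(v | x_<t) for each candidate v
    cand_hiddens: (k, d) hidden state of each candidate if it were appended
    prev_hiddens: (t, d) hidden states of the tokens generated so far
    Score = (1 - alpha) * confidence - alpha * degeneration penalty,
    where the penalty is the max cosine similarity to any previous token's state.
    """
    cand_norm = cand_hiddens / np.linalg.norm(cand_hiddens, axis=1, keepdims=True)
    prev_norm = prev_hiddens / np.linalg.norm(prev_hiddens, axis=1, keepdims=True)
    penalty = (cand_norm @ prev_norm.T).max(axis=1)   # (k,) max cosine similarity
    scores = (1 - alpha) * cand_probs - alpha * penalty
    return int(np.argmax(scores))

# Toy usage: 3 candidates, 4 previously generated tokens, 8-dim hidden states.
rng = np.random.default_rng(0)
best = contrastive_search_step(
    cand_probs=np.array([0.5, 0.3, 0.2]),
    cand_hiddens=rng.normal(size=(3, 8)),
    prev_hiddens=rng.normal(size=(4, 8)),
)
print("chosen candidate index:", best)
```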

A survey on non-autoregressive generation for neural machine translation and beyond

Y Xiao, L Wu, J Guo, J Li, M Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Non-autoregressive (NAR) generation, which was first proposed in neural machine translation
(NMT) to speed up inference, has attracted much attention in both machine learning and …
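
To make the autoregressive vs. non-autoregressive contrast concrete, here is a toy sketch (my illustration, not any specific model from the survey): the AR decoder emits tokens one at a time, each step conditioned on its own previous outputs, while the NAR decoder predicts every target position in one parallel step from the source representation alone:

```python
# Toy contrast between autoregressive and non-autoregressive decoding (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
VOCAB, TGT_LEN, D = 50, 6, 16
source_encoding = rng.normal(size=(D,))   # stand-in for an encoder output
W = rng.normal(size=(D, VOCAB))           # toy output projection

def autoregressive_decode():
    """Each token depends on the previously generated ones: TGT_LEN sequential steps."""
    tokens, state = [], source_encoding.copy()
    for _ in range(TGT_LEN):
        tok = int(np.argmax(state @ W))
        tokens.append(tok)
        state = state + W[:, tok] * 0.01  # fold the emitted token back into the state (toy update)
    return tokens

def non_autoregressive_decode():
    """All target positions are predicted in parallel from the source alone: one step."""
    position_states = source_encoding + rng.normal(size=(TGT_LEN, D)) * 0.01  # toy position signal
    logits = position_states @ W          # (TGT_LEN, VOCAB) computed in one shot
    return [int(t) for t in np.argmax(logits, axis=-1)]

print(autoregressive_decode())
print(non_autoregressive_decode())
```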

Language models can see: Plugging visual controls in text generation

Y Su, T Lan, Y Liu, F Liu, D Yogatama, Y Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Generative language models (LMs) such as GPT-2/3 can be prompted to generate text with
remarkable quality. While they are designed for text-prompted generation, it remains an …

Text generation with diffusion language models: A pre-training approach with continuous paragraph denoise

Z Lin, Y Gong, Y Shen, T Wu, Z Fan… - International …, 2023 - proceedings.mlr.press
In this paper, we introduce a novel dIffusion language modEl pre-training framework for text
generation, which we call GENIE. GENIE is a large-scale pre-trained diffusion language …

Recent advances in neural text generation: A task-agnostic survey

C Tang, F Guerin, C Lin - arXiv preprint arXiv:2203.03047, 2022 - arxiv.org
In recent years, considerable research has been dedicated to the application of neural
models in the field of natural language generation (NLG). The primary objective is to …

Plan-then-generate: Controlled data-to-text generation via planning

Y Su, D Vandyke, S Wang, Y Fang, N Collier - arXiv preprint arXiv …, 2021 - arxiv.org
Recent developments in neural networks have led to advances in data-to-text generation.
However, the lack of ability of neural models to control the structure of generated output can …

Future lens: Anticipating subsequent tokens from a single hidden state

K Pal, J Sun, A Yuan, BC Wallace, D Bau - arXiv preprint arXiv …, 2023 - arxiv.org
We conjecture that hidden state vectors corresponding to individual input tokens encode
information sufficient to accurately predict several tokens ahead. More concretely, in this …
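
One way to test a conjecture of this kind is a probing setup: collect (hidden state at position t, token at position t+2) pairs from a transformer and fit a small predictor on them. The sketch below is a generic linear-probe illustration with placeholder data, not the paper's exact method or evaluation protocol:

```python
# Hypothetical linear-probe sketch: predict the token two positions ahead from a single hidden state.
import torch
import torch.nn as nn

d_model, vocab_size, n_examples = 64, 1000, 512

# Placeholder data standing in for (hidden state at t, token id at t+2) pairs that
# would normally be collected from a real transformer's activations.
hidden_states = torch.randn(n_examples, d_model)
future_tokens = torch.randint(0, vocab_size, (n_examples,))

probe = nn.Linear(d_model, vocab_size)    # linear "future lens" style probe
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    logits = probe(hidden_states)
    loss = loss_fn(logits, future_tokens)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Accuracy on the training pairs; with real activations this would be measured on
# held-out states to gauge how much future-token information they actually encode.
accuracy = (probe(hidden_states).argmax(-1) == future_tokens).float().mean().item()
print(f"probe accuracy: {accuracy:.3f}")
```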

Break the sequential dependency of LLM inference using lookahead decoding

Y Fu, P Bailis, I Stoica, H Zhang - arXiv preprint arXiv:2402.02057, 2024 - arxiv.org
Autoregressive decoding of large language models (LLMs) is memory bandwidth bound,
resulting in high latency and significant waste of the parallel processing power of modern …
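
Schemes in this family regain parallelism by proposing several future tokens at once and then checking them against the model in a single batched forward pass, keeping only the prefix that greedy decoding would have produced anyway. The sketch below illustrates just that verification step with a toy next-token function; it is my simplification, not the paper's full lookahead algorithm (which also generates candidate n-grams via Jacobi-style parallel iterations):

```python
# Simplified sketch of the accept-the-matching-prefix verification idea (not the full algorithm).
from typing import Callable, List

def verify_candidate(context: List[int],
                     candidate: List[int],
                     greedy_next: Callable[[List[int]], int]) -> List[int]:
    """Accept the longest prefix of `candidate` that matches what greedy decoding
    would have produced token by token. In a real system the per-position greedy
    predictions come from one batched forward pass, which is what recovers the
    parallelism lost to sequential decoding."""
    accepted: List[int] = []
    for tok in candidate:
        if greedy_next(context + accepted) == tok:
            accepted.append(tok)
        else:
            break
    return accepted

# Toy "model": always predicts previous token + 1.
def toy_greedy_next(seq: List[int]) -> int:
    return seq[-1] + 1

print(verify_candidate([1, 2, 3], [4, 5, 9, 10], toy_greedy_next))  # -> [4, 5]
```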

TaCL: Improving BERT pre-training with token-aware contrastive learning

Y Su, F Liu, Z Meng, T Lan, L Shu, E Shareghi… - arXiv preprint arXiv …, 2021 - arxiv.org
Masked language models (MLMs) such as BERT and RoBERTa have revolutionized the
field of Natural Language Understanding in the past few years. However, existing pre …