Grounding language models to images for multimodal inputs and outputs

JY Koh, R Salakhutdinov… - … Conference on Machine …, 2023 - proceedings.mlr.press
We propose an efficient method to ground pretrained text-only language models to the
visual domain, enabling them to process arbitrarily interleaved image-and-text data, and …

Open-world story generation with structured knowledge enhancement: A comprehensive survey

Y Wang, J Lin, Z Yu, W Hu, BF Karlsson - Neurocomputing, 2023 - Elsevier
Storytelling and narrative are fundamental to human experience, intertwined with our social
and cultural engagement. As such, researchers have long attempted to create systems that …

Calibrated language models must hallucinate

AT Kalai, SS Vempala - Proceedings of the 56th Annual ACM …, 2024 - dl.acm.org
Recent language models generate false but plausible-sounding text with surprising
frequency. Such “hallucinations” are an obstacle to the usability of language-based AI …

AI Chains: Transparent and controllable human-AI interaction by chaining large language model prompts

T Wu, M Terry, CJ Cai - Proceedings of the 2022 CHI conference on …, 2022 - dl.acm.org
Although large language models (LLMs) have demonstrated impressive potential on simple
tasks, their breadth of scope, lack of transparency, and insufficient controllability can make …

Conditional generation with a question-answering blueprint

S Narayan, J Maynez, RK Amplayo… - Transactions of the …, 2023 - direct.mit.edu
The ability to convey relevant and faithful information is critical for many tasks in conditional
generation and yet remains elusive for neural seq-to-seq models whose outputs often reveal …

RecurrentGPT: Interactive generation of (arbitrarily) long text

W Zhou, YE Jiang, P Cui, T Wang, Z Xiao… - arXiv preprint arXiv …, 2023 - arxiv.org
The fixed-size context of Transformer makes GPT models incapable of generating arbitrarily
long text. In this paper, we introduce RecurrentGPT, a language-based simulacrum of the …

A comprehensive survey of mamba architectures for medical image analysis: Classification, segmentation, restoration and beyond

S Bansal, S Madisetty, MZU Rehman… - arXiv preprint arXiv …, 2024 - arxiv.org
Mamba, a special case of the State Space Model, is gaining popularity as an alternative to
template-based deep learning approaches in medical image analysis. While transformers …

PLANET: Dynamic content planning in autoregressive transformers for long-form text generation

Z Hu, HP Chan, J Liu, X Xiao, H Wu… - arXiv preprint arXiv …, 2022 - arxiv.org
Despite recent progress of pre-trained language models on generating fluent text, existing
methods still suffer from incoherence problems in long-form text generation tasks that …

Parallel refinements for lexically constrained text generation with BART

X He - arXiv preprint arXiv:2109.12487, 2021 - arxiv.org
Lexically constrained text generation aims to control the generated text by incorporating
some pre-specified keywords into the output. Previous work injects lexical constraints into …

Variational autoencoder for design of synthetic viral vector serotypes

S Lyu, S Sowlati-Hashjin, M Garton - Nature Machine Intelligence, 2024 - nature.com
Recent rapid advances in deep generative models for protein design have focused on small
proteins with lots of data. Such models perform poorly on large proteins with limited natural …