Grounding language models to images for multimodal inputs and outputs
We propose an efficient method to ground pretrained text-only language models to the
visual domain, enabling them to process arbitrarily interleaved image-and-text data, and …
Open-world story generation with structured knowledge enhancement: A comprehensive survey
Storytelling and narrative are fundamental to human experience, intertwined with our social
and cultural engagement. As such, researchers have long attempted to create systems that …
Calibrated language models must hallucinate
Recent language models generate false but plausible-sounding text with surprising
frequency. Such “hallucinations” are an obstacle to the usability of language-based AI …
AI Chains: Transparent and controllable human-AI interaction by chaining large language model prompts
Although large language models (LLMs) have demonstrated impressive potential on simple
tasks, their breadth of scope, lack of transparency, and insufficient controllability can make …
Conditional generation with a question-answering blueprint
The ability to convey relevant and faithful information is critical for many tasks in conditional
generation and yet remains elusive for neural seq-to-seq models whose outputs often reveal …
RecurrentGPT: Interactive generation of (arbitrarily) long text
The fixed-size context of Transformer makes GPT models incapable of generating arbitrarily
long text. In this paper, we introduce RecurrentGPT, a language-based simulacrum of the …
A comprehensive survey of Mamba architectures for medical image analysis: Classification, segmentation, restoration and beyond
Mamba, a special case of the State Space Model, is gaining popularity as an alternative to
template-based deep learning approaches in medical image analysis. While transformers …
PLANET: Dynamic content planning in autoregressive transformers for long-form text generation
Despite recent progress of pre-trained language models on generating fluent text, existing
methods still suffer from incoherence problems in long-form text generation tasks that …
Parallel refinements for lexically constrained text generation with BART
X He - arXiv preprint arXiv:2109.12487, 2021 - arxiv.org
Lexically constrained text generation aims to control the generated text by incorporating
some pre-specified keywords into the output. Previous work injects lexical constraints into …
Variational autoencoder for design of synthetic viral vector serotypes
Recent, rapid advances in deep generative models for protein design have focused on small
proteins with lots of data. Such models perform poorly on large proteins with limited natural …