Parameter-efficient fine-tuning for large models: A comprehensive survey
Large models represent a groundbreaking advancement in multiple application fields,
enabling remarkable achievements across various tasks. However, their unprecedented …
State of the art on diffusion models for visual computing
The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …
Text2video-zero: Text-to-image diffusion models are zero-shot video generators
L Khachatryan, A Movsisyan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent text-to-video generation approaches rely on computationally heavy training and
require large-scale video datasets. In this paper, we introduce a new task, zero-shot text-to …
Adding conditional control to text-to-image diffusion models
We present ControlNet, a neural network architecture to add spatial conditioning controls to
large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large …
Tune-a-video: One-shot tuning of image diffusion models for text-to-video generation
To replicate the success of text-to-image (T2I) generation, recent works employ large-scale
video datasets to train a text-to-video (T2V) generator. Despite their promising results, such …
Uni-controlnet: All-in-one control to text-to-image diffusion models
Text-to-Image diffusion models have made tremendous progress over the past two years,
enabling the generation of highly realistic images based on open-domain text descriptions …
Masactrl: Tuning-free mutual self-attention control for consistent image synthesis and editing
Despite the success in large-scale text-to-image generation and text-conditioned image
editing, existing methods still struggle to produce consistent generation and editing results …
Fatezero: Fusing attentions for zero-shot text-based video editing
The diffusion-based generative models have achieved remarkable success in text-based
image generation. However, since it contains enormous randomness in generation …
Videocomposer: Compositional video synthesis with motion controllability
The pursuit of controllability as a higher standard of visual content creation has yielded
remarkable progress in customizable image synthesis. However, achieving controllable …
NExT-GPT: Any-to-any multimodal LLM
While recently Multimodal Large Language Models (MM-LLMs) have made exciting strides,
they mostly fall prey to the limitation of only input-side multimodal understanding, without the …