Recent advances in 3D Gaussian splatting
The emergence of 3D Gaussian splatting (3DGS) has greatly accelerated rendering in novel
view synthesis. Unlike neural implicit representations like neural radiance fields (NeRFs) …
Adversarial diffusion distillation
We introduce Adversarial Diffusion Distillation (ADD), a novel training approach that
efficiently samples large-scale foundational image diffusion models in just 1–4 steps while …
LaVie: High-quality video generation with cascaded latent diffusion models
This work aims to learn a high-quality text-to-video (T2V) generative model by leveraging a
pre-trained text-to-image (T2I) model as a basis. It is a highly desirable yet challenging task …
Align your Gaussians: Text-to-4D with dynamic 3D Gaussians and composed diffusion models
Text-guided diffusion models have revolutionized image and video generation and have
also been successfully used for optimization-based 3D object synthesis. Here we instead …
Fast high-resolution image synthesis with latent adversarial diffusion distillation
Diffusion models are the main driver of progress in image and video synthesis, but suffer
from slow inference speed. Distillation methods, like the recently introduced adversarial …
Evaluating text-to-visual generation with image-to-text generation
Despite significant progress in generative AI, comprehensive evaluation remains
challenging because of the lack of effective metrics and standardized benchmarks. For …
SparseCtrl: Adding sparse controls to text-to-video diffusion models
The development of text-to-video (T2V), i.e., generating videos from a given text prompt, has
been significantly advanced in recent years. However, relying solely on text prompts often …
Factorizing text-to-video generation by explicit image conditioning
We present Emu Video, a text-to-video generation model that factorizes the
generation into two steps: first generating an image conditioned on the text, and then …
Emu Edit: Precise image editing via recognition and generation tasks
Instruction-based image editing holds immense potential for a variety of applications as it
enables users to perform any editing operation using a natural language instruction …
Diffusion model alignment using direct preference optimization
Large language models (LLMs) are fine-tuned using human comparison data with
Reinforcement Learning from Human Feedback (RLHF) methods to make them better …