BrushNet: A plug-and-play image inpainting model with decomposed dual-branch diffusion

X Ju, X Liu, X Wang, Y Bian, Y Shan, Q Xu - European Conference on …, 2024 - Springer
Image inpainting, the process of restoring corrupted images, has seen significant
advancements with the advent of diffusion models (DMs). Despite these advancements …
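
The snippet above only names the task, so here is a minimal sketch of one common way diffusion models are applied to inpainting (RePaint-style blending of re-noised known pixels). This is generic background, not BrushNet's decomposed dual-branch design; `model`, `alphas_cumprod`, and the simplified sampler update are assumptions for illustration.

```python
import torch

def inpaint_sample(model, x_known, mask, alphas_cumprod, num_steps):
    """Diffusion inpainting via known-region blending (generic sketch).

    x_known:        corrupted/original image, shape (B, C, H, W).
    mask:           1.0 where content must be generated, 0.0 where pixels are known.
    alphas_cumprod: 1-D tensor of cumulative noise-schedule terms.
    model(x_t, t) is assumed to predict the clean image x0 at step t.
    """
    x_t = torch.randn_like(x_known)                  # start from pure Gaussian noise
    for t in reversed(range(num_steps)):
        a_t = alphas_cumprod[t]
        # Re-noise the known pixels to the current noise level and paste them in,
        # so the model only has to synthesize the masked region.
        known_t = a_t.sqrt() * x_known + (1 - a_t).sqrt() * torch.randn_like(x_known)
        x_t = mask * x_t + (1 - mask) * known_t
        # One reverse step (simplified update around the predicted x0).
        x0_pred = model(x_t, t)
        if t > 0:
            a_prev = alphas_cumprod[t - 1]
            x_t = a_prev.sqrt() * x0_pred + (1 - a_prev).sqrt() * torch.randn_like(x_known)
        else:
            x_t = x0_pred
    # Final composite: generated content inside the mask, original pixels outside.
    return mask * x_t + (1 - mask) * x_known
```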

Self-rectifying diffusion sampling with perturbed-attention guidance

D Ahn, H Cho, J Min, W Jang, J Kim, SH Kim… - … on Computer Vision, 2024 - Springer
Recent studies have demonstrated that diffusion models can generate high-quality samples,
but their quality heavily depends on sampling guidance techniques, such as classifier …
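
Since the snippet mentions sampling guidance, the sketch below shows how guidance terms are typically combined into a single noise prediction. The first term is standard classifier-free guidance; the optional second term follows the general pattern of guiding away from a deliberately weakened ("perturbed") forward pass. The exact perturbation used by perturbed-attention guidance is defined in the paper; the scales and names here are illustrative assumptions.

```python
def guided_eps(eps_cond, eps_uncond, eps_perturbed=None, cfg_scale=7.5, pag_scale=3.0):
    """Combine denoiser outputs into one guided noise prediction.

    eps_cond / eps_uncond: conditional and unconditional predictions (CFG).
    eps_perturbed:         prediction from a weakened pass (PAG-style term, optional).
    Scales are arbitrary illustrative values, not the paper's settings.
    """
    eps = eps_uncond + cfg_scale * (eps_cond - eps_uncond)       # classifier-free guidance
    if eps_perturbed is not None:
        eps = eps + pag_scale * (eps_cond - eps_perturbed)       # push away from weak pass
    return eps

# Usage inside a sampling loop, given three forward passes of the same denoiser:
# eps = guided_eps(eps_text, eps_null, eps_perturbed=eps_weakened)
```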

Getting it Right: Improving Spatial Consistency in Text-to-Image Models

A Chatterjee, GBM Stan, E Aflalo, S Paul… - … on Computer Vision, 2024 - Springer
One of the key shortcomings in current text-to-image (T2I) models is their inability to
consistently generate images which faithfully follow the spatial relationships specified in the …

RenAIssance: A survey into AI text-to-image generation in the era of large model

F Bie, Y Yang, Z Zhou, A Ghanem… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Text-to-image generation (TTI) refers to the use of models that process text input and
generate high-fidelity images based on text descriptions. Text-to-image generation …

Deep compression autoencoder for efficient high-resolution diffusion models

J Chen, H Cai, J Chen, E **e, S Yang, H Tang… - arxiv preprint arxiv …, 2024 - arxiv.org
We present Deep Compression Autoencoder (DC-AE), a new family of autoencoder models
for accelerating high-resolution diffusion models. Existing autoencoder models have …
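
To make the compression idea concrete, the toy autoencoder below shows why a higher spatial downsampling factor accelerates high-resolution diffusion: the diffusion model then operates on a much smaller latent, and its cost (roughly quadratic in the number of spatial tokens) drops sharply. This is an illustrative stand-in, not the DC-AE architecture; all layer choices and channel counts are assumptions.

```python
import math
import torch
import torch.nn as nn

class ToyLatentAutoencoder(nn.Module):
    """Toy convolutional autoencoder with a configurable spatial downsampling factor.

    With factor=32, a 1024x1024 image maps to a 32x32 latent, versus 128x128 at the
    factor of 8 used by typical latent-diffusion autoencoders.
    """
    def __init__(self, in_ch=3, latent_ch=16, factor=32, width=64):
        super().__init__()
        n_down = int(math.log2(factor))
        enc, ch = [], in_ch
        for _ in range(n_down):
            enc += [nn.Conv2d(ch, width, 4, stride=2, padding=1), nn.SiLU()]
            ch = width
        enc += [nn.Conv2d(ch, latent_ch, 3, padding=1)]
        self.encoder = nn.Sequential(*enc)

        dec, ch = [nn.Conv2d(latent_ch, width, 3, padding=1), nn.SiLU()], width
        for _ in range(n_down):
            dec += [nn.ConvTranspose2d(ch, width, 4, stride=2, padding=1), nn.SiLU()]
            ch = width
        dec += [nn.Conv2d(ch, in_ch, 3, padding=1)]
        self.decoder = nn.Sequential(*dec)

    def forward(self, x):
        z = self.encoder(x)            # (B, latent_ch, H/factor, W/factor)
        return self.decoder(z), z

x = torch.randn(1, 3, 1024, 1024)
recon, z = ToyLatentAutoencoder()(x)
print(z.shape)                         # torch.Size([1, 16, 32, 32])
```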

Diffh2o: Diffusion-based synthesis of hand-object interactions from textual descriptions

S Christen, S Hampali, F Sener, E Remelli… - SIGGRAPH Asia 2024 …, 2024 - dl.acm.org
We introduce DiffH2O, a new diffusion-based framework for synthesizing realistic, dexterous
hand-object interactions from natural language. Our model employs a temporal two-stage …

SATO: Stable Text-to-Motion Framework

W Chen, H Xiao, E Zhang, L Hu, L Wang… - Proceedings of the …, 2024 - dl.acm.org
Is the text-to-motion model robust? Recent advancements in text-to-motion models primarily
stem from more accurate predictions of specific actions. However, the text modality typically …

Erasing concepts from text-to-image diffusion models with few-shot unlearning

M Fuchi, T Takagi - arXiv preprint arXiv:2405.07288, 2024 - bmva-archive.org.uk
Generating images from text has become easier because of the scaling of diffusion models
and advancements in the field of vision and language. These models are trained using vast …

Revisit large-scale image-caption data in pre-training multimodal foundation models

Z Lai, V Saveris, C Chen, HY Chen, H Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in multimodal models highlight the value of rewritten captions for
improving performance, yet key challenges remain. For example, while synthetic captions …

Bigger is not always better: Scaling properties of latent diffusion models

K Mei, Z Tu, M Delbracio, H Talebi… - … on Machine Learning …, 2024 - openreview.net
We study the scaling properties of latent diffusion models (LDMs) with an emphasis on their
sampling efficiency. While improved network architecture and inference algorithms have …
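
The snippet frames scaling in terms of sampling efficiency; as a back-of-the-envelope illustration (assumed numbers, not figures from the paper), if per-step cost grows roughly with parameter count, a fixed inference compute budget buys a smaller latent diffusion model proportionally more denoising steps than a larger one.

```python
def affordable_steps(budget_gflops: float, gflops_per_step: float) -> int:
    """Number of denoising steps that fit in a fixed per-image compute budget."""
    return int(budget_gflops // gflops_per_step)

budget = 50_000.0  # hypothetical total GFLOPs allowed per generated image
for name, cost in [("0.4B-param LDM", 500.0), ("2B-param LDM", 2500.0)]:
    print(f"{name}: {affordable_steps(budget, cost)} denoising steps within budget")
# 0.4B-param LDM: 100 denoising steps within budget
# 2B-param LDM: 20 denoising steps within budget
```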