Google Scholar

Evolutionary computation in the era of large language model: Survey and roadmap

X Wu, S Wu, J Wu, L Feng, KC Tan - arXiv preprint arXiv:2401.10034, 2024 - arxiv.org

Large Language Models (LLMs), built upon Transformer-based architectures with massive
pretraining on diverse data, have not only revolutionized natural language processing but …

Cited by 56

MasaCtrl: Tuning-free mutual self-attention control for consistent image synthesis and editing

M Cao, X Wang, Z Qi, Y Shan… - Proceedings of the …, 2023 - openaccess.thecvf.com

Despite the success in large-scale text-to-image generation and text-conditioned image
editing, existing methods still struggle to produce consistent generation and editing results …

Cited by 362

T2I-CompBench: A comprehensive benchmark for open-world compositional text-to-image generation

K Huang, K Sun, E Xie, Z Li… - Advances in Neural …, 2023 - proceedings.neurips.cc

Despite the stunning ability to generate high-quality images by recent text-to-image models,
current approaches often struggle to effectively compose objects with different attributes and …

Cited by 174

LayoutGPT: Compositional visual planning and generation with large language models

W Feng, W Zhu, T Fu, V Jampani… - Advances in …, 2023 - proceedings.neurips.cc

Attaining a high degree of user controllability in visual generation often requires intricate,
fine-grained inputs like layouts. However, such inputs impose a substantial burden on users …

Cited by 183

SVDiff: Compact parameter space for diffusion fine-tuning

L Han, Y Li, H Zhang, P Milanfar… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recently, diffusion models have achieved remarkable success in text-to-image generation,
enabling the creation of high-quality images from text prompts and various conditions …

Cited by 222

FastComposer: Tuning-free multi-subject image generation with localized attention

G Xiao, T Yin, WT Freeman, F Durand… - International Journal of …, 2024 - Springer

Diffusion models excel at text-to-image generation, especially in subject-driven generation
for personalized images. However, existing methods are inefficient due to the subject …

Cited by 173

Multimodal foundation models: From specialists to general-purpose assistants

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com

This paper presents a comprehensive survey of the taxonomy and evolution of multimodal
foundation models that demonstrate vision and vision-language capabilities, focusing on …

Cited by 219

BoxDiff: Text-to-image synthesis with training-free box-constrained diffusion

J Xie, Y Li, Y Huang, H Liu, W Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recent text-to-image diffusion models have demonstrated an astonishing capacity to
generate high-quality images. However, researchers mainly studied the way of synthesizing …

Cited by 164

TokenFlow: Consistent diffusion features for consistent video editing

M Geyer, O Bar-Tal, S Bagon, T Dekel - arXiv preprint arXiv:2307.10373, 2023 - arxiv.org

The generative AI revolution has recently expanded to videos. Nevertheless, current
state-of-the-art video models are still lagging behind image models in terms of visual quality and …

Cited by 217

A survey on generative diffusion models

H Cao, C Tan, Z Gao, Y Xu, G Chen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Deep generative models have unlocked another profound realm of human creativity. By
capturing and generalizing patterns within data, we have entered the epoch of all …

Cited by 420
