Google Scholar

Vision-language pre-training: Basics, recent advances, and future trends

Z Gan, L Li, C Li, L Wang, Z Liu… - Foundations and Trends …, 2022 - nowpublishers.com

This monograph surveys vision-language pre-training (VLP) methods for multimodal
intelligence that have been developed in the last few years. We group these approaches …

Cited by 198; 7 versions

[PDF] arxiv.org

Large-scale text-to-image generation models for visual artists' creative works

HK Ko, G Park, H Jeon, J Jo, J Kim, J Seo - Proceedings of the 28th …, 2023 - dl.acm.org

Large-scale Text-to-image Generation Models (LTGMs) (e.g., DALL-E), self-supervised deep
learning models trained on a huge dataset, have demonstrated the capacity for generating …

Cited by 145; 11 versions

[PDF] thecvf.com

Videocrafter2: Overcoming data limitations for high-quality video diffusion models

H Chen, Y Zhang, X Cun, M …

… counterfactuals for photorealistic object removal and insertion

D Winter, M Cohen, S Fruchter, Y Pritch… - … on Computer Vision, 2024 - Springer

Diffusion models have revolutionized image editing but often generate images that violate
physical laws, particularly the effects of objects on the scene, e.g., occlusions, shadows, and …

Cited by 13; 8 versions

[PDF] arxiv.org

Generative disco: Text-to-video generation for music visualization

V Liu, T Long, N Raw, L Chilton - arXiv preprint arXiv:2304.08551, 2023 - arxiv.org

Visuals can enhance our experience of music, owing to the way they can amplify the
emotions and messages conveyed within it. However, creating music visualization is a …

Cited by 37; 3 versions

Saved to "My library"

Nuwa-infinity: Autoregressive over autoregressive generation for infinite visual synthesis

Vision-language pre-training: Basics, recent advances, and future trends

Large-scale text-to-image generation models for visual artists' creative works

Videocrafter2: Overcoming data limitations for high-quality video diffusion models

Generative disco: Text-to-video generation for music visualization