- Academic Search

C Zhang, C Zhang, S Zheng, Y Qiao, C Li… - arXiv preprint arXiv …, 2023 - arxiv.org

As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines
everywhere because of its ability to analyze and create text, images, and beyond. With such …

Cited by: 206 - 4 versions

[PDF] nowpublishers.com

Vision-language pre-training: Basics, recent advances, and future trends

Z Gan, L Li, C Li, L Wang, Z Liu… - Foundations and Trends …, 2022 - nowpublishers.com

This monograph surveys vision-language pre-training (VLP) methods for multimodal
intelligence that have been developed in the last few years. We group these approaches …

[PDF] thecvf.com

MasaCtrl: Tuning-free mutual self-attention control for consistent image synthesis and editing

M Cao, X Wang, Z Qi, Y Shan… - Proceedings of the …, 2023 - openaccess.thecvf.com

Despite the success in large-scale text-to-image generation and text-conditioned image
editing, existing methods still struggle to produce consistent generation and editing results …

Cited by: 358 - 5 versions

[HTML] sciencedirect.com

[HTML] Deep learning in food category recognition

Y Zhang, L Deng, H Zhu, W Wang, Z Ren, Q Zhou… - Information …, 2023 - Elsevier

Integrating artificial intelligence with food category recognition has been a field of interest for
research for the past few decades. It is potentially one of the next steps in revolutionizing …

Cited by: 279 - 4 versions

[PDF] 3dvar.com

[PDF] Scaling autoregressive models for content-rich text-to-image generation

J Yu, Y Xu, JY Koh, T Luong, G Baid, Z Wang… - arXiv preprint arXiv …, 2022 - 3dvar.com

We present the Pathways [1] Autoregressive Text-to-Image (Parti) model, which
generates high-fidelity photorealistic images and supports content-rich synthesis involving …

Cited by: 1093 - 5 versions

[PDF] thecvf.com

SpaText: Spatio-textual representation for controllable image generation

O Avrahami, T Hayes, O Gafni… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recent text-to-image diffusion models are able to generate convincing results of
unprecedented quality. However, it is nearly impossible to control the shapes of different …

Cited by: 199 - 5 versions

[PDF] acm.org

Blended latent diffusion

O Avrahami, O Fried, D Lischinski - ACM Transactions on Graphics (TOG), 2023 - dl.acm.org

The tremendous progress in neural image generation, coupled with the emergence of
seemingly omnipotent vision-language models has finally enabled text-based interfaces for …

Cited by: 362 - 3 versions

[PDF] thecvf.com

MAXIM: Multi-axis MLP for image processing

Z Tu, H Talebi, H Zhang, F Yang… - Proceedings of the …, 2022 - openaccess.thecvf.com

Recent progress on Transformers and multi-layer perceptron (MLP) models provide new
network architectural designs for computer vision tasks. Although these models proved to be …

Cited by: 563 - 10 versions

[PDF] thecvf.com

Blended diffusion for text-driven editing of natural images

O Avrahami, D Lischinski… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

Natural language offers a highly intuitive interface for image editing. In this paper, we
introduce the first solution for performing local (region-based) edits in generic natural …

Cited by: 932 - 6 versions

[PDF] thecvf.com

Vector quantized diffusion model for text-to-image synthesis

S Gu, D Chen, J Bao, F Wen, B Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com

We present the vector quantized diffusion (VQ-Diffusion) model for text-to-image generation.
This method is based on a vector quantized variational autoencoder (VQ-VAE) whose latent …

Cited by: 858 - 10 versions

StackGAN++: Realistic image synthesis with stacked generative adversarial networks

A complete survey on generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 all you need?
