- Academic Search

B Song, R Zhou, F Ahmed - … of Computing and …, 2024 - asmedigitalcollection.asme.org

In the rapidly advancing field of multi-modal machine learning (MMML), the convergence of
multiple data modalities has the potential to reshape various applications. This paper …

Cited by 56 (4 versions)

A survey on multimodal bidirectional machine learning translation of image and natural language processing

W Nam, B Jang - Expert Systems with Applications, 2024 - Elsevier

Advances in multimodal machine learning help artificial intelligence to resemble human
intellect more closely, which perceives the world from multiple modalities. We surveyed state …

Cited by 21 (3 versions)

[PDF] thecvf.com

Inversion-based style transfer with diffusion models

Y Zhang, N Huang, F Tang, H Huang… - Proceedings of the …, 2023 - openaccess.thecvf.com

The artistic style within a painting is the means of expression, which includes not only the
painting material, colors, and brushstrokes, but also the high-level attributes, including …

Cited by 256 (6 versions)

[PDF] arxiv.org

Long-CLIP: Unlocking the long-text capability of CLIP

B Zhang, P Zhang, X Dong, Y Zang, J Wang - European Conference on …, 2024 - Springer

Contrastive Language-Image Pre-training (CLIP) has been the cornerstone for zero-shot
classification, text-image retrieval, and text-image generation by aligning image and …

Cited by 83 (7 versions)

[PDF] thecvf.com

Iterative prompt learning for unsupervised backlit image enhancement

Z Liang, C Li, S Zhou, R Feng… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

We propose a novel unsupervised backlit image enhancement method, abbreviated as CLIP-LIT,
by exploring the potential of Contrastive Language-Image Pre-Training (CLIP) for pixel …

Cited by 116 (5 versions)

[PDF] thecvf.com

High-resolution image synthesis with latent diffusion models

R Rombach, A Blattmann, D Lorenz… - Proceedings of the …, 2022 - openaccess.thecvf.com

By decomposing the image formation process into a sequential application of denoising
autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image …

Cited by 15729 (13 versions)

Text2LIVE: Text-driven layered image and video editing

O Bar-Tal, D Ofri-Amar, R Fridman, Y Kasten… - European conference on …, 2022 - Springer

We present a method for zero-shot, text-driven editing of natural images and videos. Given
an image or a video and a text prompt, our goal is to edit the appearance of existing objects …

Cited by 337 (4 versions)

[PDF] thecvf.com

Zero-shot text-guided object generation with Dream Fields

A Jain, B Mildenhall, JT Barron… - Proceedings of the …, 2022 - openaccess.thecvf.com

We combine neural rendering with multi-modal image and text representations to synthesize
diverse 3D objects solely from natural language descriptions. Our method, Dream Fields …

Cited by 588 (7 versions)

[PDF] arxiv.org

AvatarCLIP: Zero-shot text-driven generation and animation of 3D avatars

F Hong, M Zhang, L Pan, Z Cai, L Yang… - arXiv preprint arXiv …, 2022 - arxiv.org

3D avatar creation plays a crucial role in the digital age. However, the whole production
process is prohibitively time-consuming and labor-intensive. To democratize this technology …

Cited by 296 (3 versions)

[PDF] arxiv.org

MotionCLIP: Exposing human motion generation to CLIP space

G Tevet, B Gordon, A Hertz, AH Bermano… - … on Computer Vision, 2022 - Springer

We introduce MotionCLIP, a 3D human motion auto-encoder featuring a latent embedding
that is disentangled, well behaved, and supports highly semantic textual descriptions …

Cited by 317 (10 versions)

CLIPDraw: Exploring text-to-drawing synthesis through language-image encoders

Multi-modal machine learning in engineering design: A review and future directions