- Academic Search

Large motion model for unified multi-modal motion generation

M Zhang, D Jin, C Gu, F Hong, Z Cai, J Huang… - … on Computer Vision, 2024 - Springer

Human motion generation, a cornerstone technique in animation and video production, has
widespread applications in various tasks like text-to-motion and music-to-dance. Previous …

Cited by 15 · All 2 versions

Exploiting Diffusion Prior for Generalizable Dense Prediction

HY Lee, HY Tseng, MH Yang - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Contents generated by recent advanced Text-to-Image (T2I) diffusion models are sometimes
too imaginative for existing off-the-shelf dense predictors to estimate due to the immitigable …

Cited by 9

Generative models: What do they know? do they know things? let's find out!

X Du, N Kolkin, G Shakhnarovich, A Bhattad - arXiv preprint arXiv …, 2023 - arxiv.org

Generative models excel at mimicking real scenes, suggesting they might inherently encode
important intrinsic scene properties. In this paper, we aim to explore the following key …

Cited by 17 · All 2 versions

Analogist: Out-of-the-box visual in-context learning with image diffusion model

Z Gu, S Yang, J Liao, J Huo, Y Gao - ACM Transactions on Graphics …, 2024 - dl.acm.org

Visual In-Context Learning (ICL) has emerged as a promising research area due to its
capability to accomplish various tasks with limited example pairs through analogical …

Cited by 3 · All 2 versions

InstructGIE: Towards generalizable image editing

Z Meng, C Yang, J Liu, H Tang, P Zhao… - European Conference on …, 2024 - Springer

Recent advances in image editing have been driven by the development of denoising
diffusion models, marking a significant leap forward in this field. Despite these advances, the …

Cited by 6 · All 2 versions

MEVG: Multi-event video generation with text-to-video models

G Oh, J Jeong, S Kim, W Byeon, J Kim, S Kim… - European Conference on …, 2024 - Springer

We introduce a novel diffusion-based video generation method, generating a video showing
multiple events given multiple individual sentences from the user. Our method does not …

Cited by 6 · All 6 versions

MTVG: Multi-text video generation with text-to-video models

G Oh, J Jeong, S Kim, W Byeon, J Kim, S Kim… - arXiv preprint arXiv …, 2023 - arxiv.org

Recently, video generation has attracted massive attention and yielded noticeable
outcomes. Concerning the characteristics of video, multi-text conditioning incorporating …

Cited by 9 · All 2 versions

A survey on data augmentation in large model era

Y Zhou, C Guo, X Wang, Y Chang, Y Wu - arXiv preprint arXiv:2401.15422, 2024 - arxiv.org

Large models, encompassing large language and diffusion models, have shown
exceptional promise in approximating human-level intelligence, garnering significant …

Cited by 21 · All 2 versions

Edit One for All: Interactive Batch Image Editing

T Nguyen, U Ojha, Y Li, H Liu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

In recent years image editing has advanced remarkably. With increased human control it is
now possible to edit an image in a plethora of ways; from specifying in text what we want to …

Cited by 3 · All 3 versions

LLMs Meet Multimodal Generation and Editing: A Survey

Y He, Z Liu, J Chen, Z Tian, H Liu, X Chi, R Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

With the recent advancement in large language models (LLMs), there is a growing interest in
combining LLMs with multimodal learning. Previous surveys of multimodal large language …

Cited by 14 · All 2 versions

Visual instruction inversion: Image editing via image prompting

Large motion model for unified multi-modal motion generation
