- Academic Search

CamI2V: Camera-controlled image-to-video diffusion model

G Zheng, T Li, R Jiang, Y Lu, T Wu, X Li - arXiv preprint arXiv:2410.15957, 2024 - arxiv.org

Recently, camera pose, as a user-friendly and physics-related condition, has been
introduced into text-to-video diffusion model for camera control. However, existing methods …

Cited by 6

CreatiLayout: Siamese Multimodal Diffusion Transformer for Creative Layout-to-Image Generation

H Zhang, D Hong, T Gao, Y Wang, J Shao… - arXiv preprint arXiv …, 2024 - arxiv.org

Diffusion models have been recognized for their ability to generate images that are not only
visually appealing but also of high artistic quality. As a result, Layout-to-Image (L2I) …

Cited by 1

LAION-SG: An Enhanced Large-Scale Dataset for Training Complex Image-Text Models with Structural Annotations

Z Li, C Meng, Y Li, L Yang, S Zhang, J Ma, J Li… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advances in text-to-image (T2I) generation have shown remarkable success in
producing high-quality images from text. However, existing T2I models show decayed …

Cited by 1

CC-Diff: Enhancing Contextual Coherence in Remote Sensing Image Synthesis

M Zhang, Y Liu, Y Liu, H Yu, Q Ye - arXiv preprint arXiv:2412.08464, 2024 - arxiv.org

Accurately depicting real-world landscapes in remote sensing (RS) images requires precise
alignment between objects and their environment. However, most existing synthesis …

RealCam-I2V: Real-World Image-to-Video Generation with Interactive Complex Camera Control

T Li, G Zheng, R Jiang, T Wu, Y Lu, Y Lin, X Li - arXiv preprint arXiv …, 2025 - arxiv.org

Recent advancements in camera-trajectory-guided image-to-video generation offer higher
precision and better support for complex camera control compared to text-based …

EliGen: Entity-Level Controlled Image Generation with Regional Attention

H Zhang, Z Duan, X Wang, Y Chen, Y Zhang - arXiv preprint arXiv …, 2025 - arxiv.org

Recent advancements in diffusion models have significantly advanced text-to-image
generation, yet global text prompts alone remain insufficient for achieving fine-grained …

3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering

D Zhou, J Xie, Z Yang, Y Yang - arXiv preprint arXiv:2501.05131, 2025 - arxiv.org

The growing demand for controllable outputs in text-to-image generation has driven
significant advancements in multi-instance generation (MIG), enabling users to define both …

Saved to My library

IFAdapter: Instance feature control for grounded text-to-image generation

CamI2V: Camera-controlled image-to-video diffusion model

CreatiLayout: Siamese Multimodal Diffusion Transformer for Creative Layout-to-Image Generation

LAION-SG: An Enhanced Large-Scale Dataset for Training Complex Image-Text Models with Structural Annotations

CC-Diff: Enhancing Contextual Coherence in Remote Sensing Image Synthesis

RealCam-I2V: Real-World Image-to-Video Generation with Interactive Complex Camera Control

EliGen: Entity-Level Controlled Image Generation with Regional Attention

3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering