- Academic Search

Is sora a world simulator? a comprehensive survey on general world models and beyond

Z Zhu, X Wang, W Zhao, C Min, N Deng, M Dou… - arXiv preprint arXiv …, 2024 - arxiv.org

General world models represent a crucial pathway toward achieving Artificial General
Intelligence (AGI), serving as the cornerstone for various applications ranging from virtual …

Cited by: 33; 3 versions

Representation alignment for generation: Training diffusion transformers is easier than you think

S Yu, S Kwak, H Jang, J Jeong, J Huang, J Shin… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent studies have shown that the denoising process in (generative) diffusion models can
induce meaningful (discriminative) representations inside the model, though the quality of …

Cited by: 24; 2 versions

Self-rectifying diffusion sampling with perturbed-attention guidance

D Ahn, H Cho, J Min, W Jang, J Kim, SH Kim… - … on Computer Vision, 2024 - Springer

Recent studies have demonstrated that diffusion models can generate high-quality samples,
but their quality heavily depends on sampling guidance techniques, such as classifier …

Cited by: 14; 2 versions

Diffusion models and representation learning: A survey

M Fuest, P Ma, M Gui, JS Fischer, VT Hu… - arXiv preprint arXiv …, 2024 - arxiv.org

Diffusion Models are popular generative modeling methods in various vision tasks, attracting
significant attention. They can be considered a unique instance of self-supervised learning …

Cited by: 12; 3 versions

Visual autoregressive modeling: Scalable image generation via next-scale prediction

K Tian, Y Jiang, Z Yuan, B Peng, L Wang - arXiv preprint arXiv:2404.02905, 2024 - arxiv.org

We present Visual AutoRegressive modeling (VAR), a new generation paradigm that
redefines the autoregressive learning on images as coarse-to-fine "next-scale prediction" or "…

Cited by: 142; 3 versions

Disco-diff: Enhancing continuous diffusion models with discrete latents

Y Xu, G Corso, T Jaakkola, A Vahdat… - arXiv preprint arXiv …, 2024 - arxiv.org

Diffusion models (DMs) have revolutionized generative learning. They utilize a diffusion
process to encode data into a simple Gaussian distribution. However, encoding a complex …

Cited by: 8

Cross-conditioned diffusion model for medical image to image translation

Z Xing, S Yang, S Chen, T Ye, Y Yang, J Qin… - … Conference on Medical …, 2024 - Springer

Multi-modal magnetic resonance imaging (MRI) provides rich, complementary information
for analyzing diseases. However, the practical challenges of acquiring multiple MRI …

Cited by: 5; 5 versions

Metamorph: Multimodal understanding and generation via instruction tuning

S Tong, D Fan, J Zhu, Y Xiong, X Chen, K Sinha… - arXiv preprint arXiv …, 2024 - arxiv.org

In this work, we propose Visual-Predictive Instruction Tuning (VPiT), a simple and effective
extension to visual instruction tuning that enables a pretrained LLM to quickly morph into an …

Cited by: 2; 2 versions

Bigr: Harnessing binary latent codes for image generation and improved visual representation capabilities

S Hao, X Liu, X Qi, S Zhao, B Zi, R Xiao, K Han… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce BiGR, a novel conditional image generation model using compact binary
latent codes for generative training, focusing on enhancing both generation and …

Cited by: 3; 2 versions

Contrastive learning with synthetic positives

D Zeng, Y Wu, X Hu, X Xu, Y Shi - European Conference on Computer …, 2024 - Springer

Contrastive learning with the nearest neighbor has proved to be one of the most efficient
self-supervised learning (SSL) techniques by utilizing the similarity of multiple instances within …

Cited by: 1; 10 versions

Saved in "My Library"

Is sora a world simulator? a comprehensive survey on general world models and beyond

Representation alignment for generation: Training diffusion transformers is easier than you think

Self-rectifying diffusion sampling with perturbed-attention guidance

Diffusion models and representation learning: A survey

Visual autoregressive modeling: Scalable image generation via next-scale prediction

Disco-diff: Enhancing continuous diffusion models with discrete latents

Cross-conditioned diffusion model for medical image to image translation

Metamorph: Multimodal understanding and generation via instruction tuning

Bigr: Harnessing binary latent codes for image generation and improved visual representation capabilities

Contrastive learning with synthetic positives