Multimodal pretraining, adaptation, and generation for recommendation: A survey

Q Liu, J Zhu, Y Yang, Q Dai, Z Du, XM Wu… - Proceedings of the 30th …, 2024 - dl.acm.org
Personalized recommendation serves as a ubiquitous channel for users to discover
information tailored to their interests. However, traditional recommendation models primarily …

Efficient diffusion models: A comprehensive survey from principles to practices

Z Ma, Y Zhang, G Jia, L Zhao, Y Ma, M Ma… - arXiv preprint arXiv …, 2024 - arxiv.org
As one of the most popular and sought-after generative models in recent years, diffusion
models have sparked the interest of many researchers and steadily shown excellent …

Multimodal pretraining and generation for recommendation: A tutorial

J Zhu, X Zhou, C Wu, R Zhang, Z Dong - Companion Proceedings of the …, 2024 - dl.acm.org
Personalized recommendation stands as a ubiquitous channel for users to explore
information or items aligned with their interests. Nevertheless, prevailing recommendation …

Retrieval-augmented layout transformer for content-aware layout generation

D Horita, N Inoue, K Kikuchi… - Proceedings of the …, 2024 - openaccess.thecvf.com
Content-aware graphic layout generation aims to automatically arrange visual elements
along with a given content such as an e-commerce product image. In this paper, we argue …

DreamStruct: Understanding slides and user interfaces via synthetic data generation

YH Peng, F Huq, Y Jiang, J Wu, XY Li… - … on Computer Vision, 2024 - Springer
Enabling machines to understand structured visuals like slides and user interfaces is
essential for making them accessible to people with disabilities. However, achieving such …

Visual layout composer: Image-vector dual diffusion model for design layout generation

MA Shabani, Z Wang, D Liu, N Zhao… - Proceedings of the …, 2024 - openaccess.thecvf.com
This paper proposes an image-vector dual diffusion model for generative layout design.
Distinct from prior efforts that mostly ignore element-level visual information, our approach …

Graphic design with large multimodal model

Y Cheng, Z Zhang, M Yang, H Nie, C Li, X Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
In the field of graphic design, automating the integration of design elements into a cohesive
multi-layered artwork not only boosts productivity but also paves the way for the …

Desigen: A pipeline for controllable design template generation

H Weng, D Huang, Y Qiao, Z Hu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Templates serve as a good starting point to implement a design (e.g., banner, slide), but it takes
great effort from designers to manually create them. In this paper, we present Desigen, an …

LayoutFlow: Flow matching for layout generation

JJA Guerreiro, N Inoue, K Masui, M Otani… - … on Computer Vision, 2024 - Springer
Finding a suitable layout represents a crucial task for diverse applications in graphic design.
Motivated by simpler and smoother sampling trajectories, we explore the use of Flow …

PosterLLaVA: Constructing a unified multi-modal layout generator with LLM

T Yang, Y Luo, Z Qi, Y Wu, Y Shan… - arXiv preprint arXiv …, 2024 - arxiv.org
Layout generation is the keystone in achieving automated graphic design, requiring
arranging the position and size of various multi-modal design elements in a visually …