Towards unified multimodal editing with enhanced knowledge collaboration

K Pan, Z Fan, J Li, Q Yu, H Fei, S Tang, R Hong… - arXiv preprint arXiv …, 2024 - arxiv.org
The swift advancement in Multimodal LLMs (MLLMs) also presents significant challenges for
effective knowledge editing. Current methods, including intrinsic knowledge editing and …

SPDiffusion: Semantic Protection Diffusion for Multi-concept Text-to-image Generation

Y Zhang, R Zhang, X Nie, H Li, J Chen, Y Hao… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent text-to-image models have achieved remarkable success in generating high-quality
images. However, when tasked with multi-concept generation which creates images …

RelationBooth: Towards Relation-Aware Customized Object Generation

Q Shi, L Qi, J Wu, J Bai, J Wang, Y Tong, X Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Customized image generation is crucial for delivering personalized content based on user-
provided image prompts, aligning large-scale text-to-image diffusion models with individual …

: Exploring Embodied Emotion Through A Large-Scale Egocentric Video Dataset

W Lin, Y Feng, WK Han, T Jin, Z Zhao, F Wu… - The Thirty-eight … - openreview.net
Understanding human emotions is fundamental to enhancing human-computer interaction,
especially for embodied agents that mimic human behavior. Traditional emotion analysis …