A review of modern recommender systems using generative models (gen-recsys)

Y Deldjoo, Z He, J McAuley, A Korikov… - Proceedings of the 30th …, 2024 - dl.acm.org
Traditional recommender systems typically use user-item rating histories as their main data
source. However, deep generative models now have the capability to model and sample …

Ladi-vton: Latent diffusion textual-inversion enhanced virtual try-on

D Morelli, A Baldrati, G Cartella, M Cornia… - Proceedings of the 31st …, 2023 - dl.acm.org
The rapidly evolving fields of e-commerce and metaverse continue to seek innovative
approaches to enhance the consumer experience. At the same time, recent advancements …

Language-only training of zero-shot composed image retrieval

G Gu, S Chun, W Kim, Y Kang… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Composed image retrieval (CIR) task takes a composed query of image and text aiming to
search relative images for both conditions. Conventional CIR approaches need a training …

Knowledge-enhanced dual-stream zero-shot composed image retrieval

Y Suo, F Ma, L Zhu, Y Yang - Proceedings of the IEEE/CVF …, 2024 - openaccess.thecvf.com
We study the zero-shot Composed Image Retrieval (ZS-CIR) task which is to retrieve the
target image given a reference image and a description without training on the triplet …

You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval

S Koley, AK Bhunia, A Sain… - Proceedings of the …, 2024 - openaccess.thecvf.com
Two primary input modalities prevail in image retrieval: sketch and text. While text is widely
used for inter-category retrieval tasks sketches have been established as the sole preferred …

CoVR: Learning composed video retrieval from web video captions

L Ventura, A Yang, C Schmid, G Varol - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Composed Image Retrieval (CoIR) has recently gained popularity as a task that considers
both text and image queries together, to search for relevant images in a database. Most …

Magiclens: Self-supervised image retrieval with open-ended instructions

K Zhang, Y Luan, H Hu, K Lee, S Qiao, W Chen… - arxiv …

Context-i2w: Mapping images to context-dependent words for accurate zero-shot composed image retrieval

Y Tang, J Yu, K Gai, J Zhuang, G Xiong, Y Hu… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Different from the Composed Image Retrieval task that requires expensive labels for training
task-specific models, Zero-Shot Composed Image Retrieval (ZS-CIR) involves diverse tasks …

Myvlm: Personalizing vlms for user-specific queries

Y Alaluf, E Richardson, S Tulyakov, K Aberman… - … on Computer Vision, 2024 - Springer
Recent large-scale vision-language models (VLMs) have demonstrated remarkable
capabilities in understanding and generating textual descriptions for visual content …

Compodiff: Versatile composed image retrieval with latent diffusion

G Gu, S Chun, W Kim, HJ Jun, Y Kang… - arxiv preprint arxiv …, 2023 - arxiv.org
This paper proposes a novel diffusion-based model, CompoDiff, for solving zero-shot
Composed Image Retrieval (ZS-CIR) with latent diffusion. This paper also introduces a new …