- Academic Search

Z Gan, L Li, C Li, L Wang, Z Liu… - Foundations and Trends …, 2022 - nowpublishers.com

This monograph surveys vision-language pre-training (VLP) methods for multimodal
intelligence that have been developed in the last few years. We group these approaches …

Cited by 197

[PDF] arxiv.org

A content-driven micro-video recommendation dataset at scale

Y Ni, Y Cheng, X Liu, J Fu, Y Li, X He, Y Zhang…

… text content-based recommendation models. Meanwhile, although image …

Cited by 12

[HTML] sciencedirect.com

Partial visual-semantic embedding: Fine-grained outfit image representation with massive volumes of tags via angular-based contrastive learning

R Shimizu, T Nakamura, M Goto - Knowledge-Based Systems, 2023 - Elsevier

A novel technology named fashion intelligence system, which quantifies ambiguous
expressions unique to fashion, such as “casual,” “adult-casual,” and “office-casual,” was …

Cited by 5

Debiased momentum contrastive learning for multimodal video similarity measures

K Liu, J Wang, X Zhang - Neurocomputing, 2024 - Elsevier

The growing potential of multimodal short videos has contributed to a new type of
recommendation. It depends on effectively measuring the similarities between the short …

Cited by 3

[PDF] arxiv.org

Revisiting pre-training in audio-visual learning

R Feng, W Xia, D Hu - arXiv preprint arXiv:2302.03533, 2023 - arxiv.org

Pre-training technique has gained tremendous success in enhancing model performance on
various tasks, but found to perform worse than training from scratch in some uni-modal …

Cited by 2

[PDF] arxiv.org

Partial Visual-Semantic Embedding: Fashion Intelligence System with Sensitive Part-by-Part Learning

R Shimizu, T Nakamura, M Goto - arXiv preprint arXiv:2211.06688, 2022 - arxiv.org

In this study, we propose a technology called the Fashion Intelligence System based on the
visual-semantic embedding (VSE) model to quantify abstract and complex expressions …
