Google Scholar

A Multimodal Single-Branch Embedding Network for Recommendation in Cold-Start and Missing Modality Scenarios

C Ganhör, M Moscati, A Hausberger, S Nawaz… - Proceedings of the 18th …, 2024 - dl.acm.org

Most recommender systems adopt collaborative filtering (CF) and provide recommendations
based on past collective interactions. Therefore, the performance of CF algorithms degrades …

Cited by 3 · All 5 versions

[PDF] github.io

Attribute-guided cross-modal interaction and enhancement for audio-visual matching

J Wang, A Zheng, Y Yan, R He… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Audio-visual matching is an essential task that measures the correlation between audio clips
and visual images. However, current methods rely solely on the joint embedding of global …

Cited by 2 · All 3 versions

[PDF] acm.org

Multi-stage Face-voice Association Learning with Keynote Speaker Diarization

R Tao, Z Shi, Y Jiang, DT Truong, ES Chng… - Proceedings of the …, 2024 - dl.acm.org

The human brain has the capability to associate the unknown person's voice and face by
leveraging their general relationship, referred to as "cross-modal speaker verification". This …

Cited by 2 · All 3 versions

[PDF] arxiv.org

DCTM: Dilated convolutional transformer model for multimodal engagement estimation in conversation

VN Tu, VT Huynh, HJ Yang, SH Kim, S Nawaz… - Proceedings of the 31st …, 2023 - dl.acm.org

Conversational engagement estimation is posed as a regression problem, entailing the
identification of the favorable attention and involvement of the participants in the …

Cited by 7 · All 6 versions

Multimodal Representation Learning for High-Quality Recommendations in Cold-Start and Beyond-Accuracy

M Moscati - Proceedings of the 18th ACM Conference on …, 2024 - dl.acm.org

Recommender systems (RS) traditionally leverage the large amount of user–item interaction
data. This exposes RS to a lower recommendation quality in cold-start scenarios, as well as …

Cited by 1

[PDF] arxiv.org

Face-voice Association in Multilingual Environments (FAME) Challenge 2024 Evaluation Plan

MS Saeed, S Nawaz, MS Tahir, RK Das… - arXiv preprint arXiv …, 2024 - arxiv.org

The advancements of technology have led to the use of multimodal systems in various real-
world applications. Among them, the audio-visual systems are one of the widely used …

Cited by 4 · All 3 versions

[PDF] ieee.org

One Model to Rule Them All: A Universal Transformer for Biometric Matching

M Abdrakhmanova, A Yermekova, Y Barko… - IEEE …, 2024 - ieeexplore.ieee.org

This study introduces the first single branch network designed to tackle a spectrum of
biometric matching scenarios, including unimodal, multimodal, cross-modal, and missing …

Cited by 2 · All 3 versions

Multimodal pre-train then transfer learning approach for speaker recognition

S Jabeen, MS Amin, X Li - Multimedia Tools and Applications, 2024 - Springer

Cognitive science has well-established the correlation between faces and voices because
neuro-cognitive pathways of both information share the same structure. Recently, the task …

Cited by 1

[PDF] arxiv.org

Modality Invariant Multimodal Learning to Handle Missing Modalities: A Single-Branch Approach

MS Saeed, S Nawaz, MZ Zaheer, MH Khan… - arXiv preprint arXiv …, 2024 - arxiv.org

Multimodal networks have demonstrated remarkable performance improvements over their
unimodal counterparts. Existing multimodal networks are designed in a multi-branch fashion …

Cited by 1 · All 2 versions

Public-Private Attributes-Based Variational Adversarial Network for Audio-Visual Cross-Modal Matching

A Zheng, F Yuan, H Zhang, J Wang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Existing audio-visual cross-modal matching methods focus on mitigating cross-modal
heterogeneity but ignore the impact of intra-class discrepancy of the same identity in …

All 2 versions

Single-branch network for multimodal training