Self-supervised speech representation learning: A review
Although supervised deep learning has revolutionized speech and audio processing, it has
necessitated the building of specialist models for individual tasks and application scenarios …
A survey of multimodal deep generative models
Multimodal learning is a framework for building models that make predictions based on
different types of modalities. Important challenges in multimodal learning are the inference of …
Trusted multi-view classification with dynamic evidential fusion
Existing multi-view classification algorithms focus on promoting accuracy by exploiting
different views, typically integrating them into common representations for follow-up tasks …
Deep multimodal representation learning: A survey
W Guo, J Wang, S Wang - IEEE Access, 2019 - ieeexplore.ieee.org
Multimodal representation learning, which aims to narrow the heterogeneity gap among
different modalities, plays an indispensable role in the utilization of ubiquitous multimodal …
Variational mixture-of-experts autoencoders for multi-modal deep generative models
Learning generative models that span multiple data modalities, such as vision and
language, is often motivated by the desire to learn more useful, generalisable …
Multimodal generative models for scalable weakly-supervised learning
Multiple modalities often co-occur when describing natural phenomena. Learning a joint
representation of these modalities should yield deeper and more useful representations …
Deep partial multi-view learning
Although multi-view learning has made significant progress over the past few decades, it is
still challenging due to the difficulty in modeling complex correlations among different views …
The human tumor atlas network: charting tumor transitions across space and time at single-cell resolution
O Rozenblatt-Rosen, A Regev, P Oberdoerffer, T Nawy… - Cell, 2020 - cell.com
Crucial transitions in cancer—including tumor initiation, local expansion, metastasis, and
therapeutic resistance—involve complex interactions between cells within the dynamic …
Learning modality-specific and -agnostic representations for asynchronous multimodal language sequences
Understanding human behaviors and intents from videos is a challenging task. Video flows
usually involve time-series data from different modalities, such as natural language, facial …
Gaussian process prior variational autoencoders
Variational autoencoders (VAE) are a powerful and widely-used class of models to learn
complex data distributions in an unsupervised fashion. One important limitation of VAEs is …