Google Scholar

Data augmentation for audio-visual emotion recognition with an efficient multimodal conditional GAN

F Ma, Y Li, S Ni, SL Huang, L Zhang - Applied Sciences, 2022 - mdpi.com

Audio-visual emotion recognition is the research of identifying human emotional states by
combining the audio modality and the visual modality simultaneously, which plays an …

Cited by 55

On universal features for high-dimensional learning and inference

SL Huang, A Makur, GW Wornell, L Zheng - arXiv preprint arXiv …, 2019 - arxiv.org

We consider the problem of identifying universal low-dimensional features from high-
dimensional data for inference tasks in settings involving learning. For such problems, we …

Cited by 62

Learning better representations for audio-visual emotion recognition with common information

F Ma, W Zhang, Y Li, SL Huang, L Zhang - Applied Sciences, 2020 - mdpi.com

Audio-visual emotion recognition aims to distinguish human emotional states by integrating
the audio and visual data acquired in the expression of emotions. It is crucial for facilitating …

Cited by 32

HGR Correlation Pooling Fusion Framework for Recognition and Classification in Multimodal Remote Sensing Data

H Zhang, SL Huang, EE Kuruoglu - Remote Sensing, 2024 - mdpi.com

This paper investigates remote sensing data recognition and classification with multimodal
data fusion. Aiming at the problems of low recognition and classification accuracy and the …

Cited by 3

Robust cross-modal remote sensing image retrieval via maximal correlation augmentation

Z Wang, X Wang, G Li, C Li - IEEE Transactions on Geoscience …, 2024 - ieeexplore.ieee.org

Most of the existing studies regarding cross-modal content-based remote sensing image
retrieval (CM-CBRSIR) focus on reducing/enlarging the Euclidean distances of cross-modal …

Cited by 1

A method of audio-visual person verification by mining connections between time series

P Sun, S Zhang, Z Liu, Y Yuan, T Zhang… - Proc …, 2023 - isca-archive.org

It has already been observed that audio-visual embedding is more robust than uni-modality
embedding for person verification. But the relationship of keyframes in time series between …

Cited by 4

Learning Audio-Visual embedding for Person Verification in the Wild

P Sun, S Zhang, Z Liu, Y Yuan, T Zhang… - arXiv preprint arXiv …, 2022 - arxiv.org

It has already been observed that audio-visual embedding is more robust than uni-modality
embedding for person verification. Here, we proposed a novel audio-visual strategy that …

Cited by 3

Generalized product-of-experts for learning multimodal representations in noisy environments

A Joshi, N Gupta, J Shah, B Bhattarai, A Modi… - Proceedings of the …, 2022 - dl.acm.org

A real-world application or setting involves interaction between different modalities (e.g.,
video, speech, text). In order to process the multimodal information automatically and use it …

Cited by 3

More than Vanilla Fusion: a Simple, Decoupling-free, Attention Module for Multimodal Fusion Based on Signal Theory

P Sun, Y Zhang, Z Liu, D Chen, H Zhang - arXiv preprint arXiv:2312.07212, 2023 - arxiv.org

The vanilla fusion methods still dominate a large percentage of mainstream audio-visual
tasks. However, the effectiveness of vanilla fusion from a theoretical perspective is still worth …

A semi-supervised learning approach for visual question answering based on maximal correlation

S Yin, F Ma, SL Huang - 2021 IEEE International Conference …, 2021 - ieeexplore.ieee.org

In this paper, we propose a semi-supervised learning approach for the Visual Question
Answering (VQA) task based on maximal correlation. Instead of training the VQA model with …

Cited by 2

Person recognition with HGR maximal correlation on multimodal data