Data augmentation for audio-visual emotion recognition with an efficient multimodal conditional GAN
Audio-visual emotion recognition is the research of identifying human emotional states by
combining the audio modality and the visual modality simultaneously, which plays an …
On universal features for high-dimensional learning and inference
We consider the problem of identifying universal low-dimensional features from high-
dimensional data for inference tasks in settings involving learning. For such problems, we …
Learning better representations for audio-visual emotion recognition with common information
F Ma, W Zhang, Y Li, SL Huang, L Zhang - Applied Sciences, 2020 - mdpi.com
Audio-visual emotion recognition aims to distinguish human emotional states by integrating
the audio and visual data acquired in the expression of emotions. It is crucial for facilitating …
HGR Correlation Pooling Fusion Framework for Recognition and Classification in Multimodal Remote Sensing Data
This paper investigates remote sensing data recognition and classification with multimodal
data fusion. Aiming at the problems of low recognition and classification accuracy and the …
Robust cross-modal remote sensing image retrieval via maximal correlation augmentation
Most of the existing studies regarding cross-modal content-based remote sensing image
retrieval (CM-CBRSIR) focus on reducing/enlarging the Euclidean distances of cross-modal …
A method of audio-visual person verification by mining connections between time series
P Sun, S Zhang, Z Liu, Y Yuan, T Zhang… - Proc …, 2023 - isca-archive.org
It has already been observed that audio-visual embedding is more robust than uni-modality
embedding for person verification. But the relationship of keyframes in time series between …
Learning Audio-Visual embedding for Person Verification in the Wild
P Sun, S Zhang, Z Liu, Y Yuan, T Zhang… - arXiv preprint arXiv …, 2022 - arxiv.org
It has already been observed that audio-visual embedding is more robust than uni-modality
embedding for person verification. Here, we proposed a novel audio-visual strategy that …
Generalized product-of-experts for learning multimodal representations in noisy environments
A real-world application or setting involves interaction between different modalities (e.g.,
video, speech, text). In order to process the multimodal information automatically and use it …
More than Vanilla Fusion: a Simple, Decoupling-free, Attention Module for Multimodal Fusion Based on Signal Theory
P Sun, Y Zhang, Z Liu, D Chen, H Zhang - arXiv preprint arXiv:2312.07212, 2023 - arxiv.org
The vanilla fusion methods still dominate a large percentage of mainstream audio-visual
tasks. However, the effectiveness of vanilla fusion from a theoretical perspective is still worth …
A semi-supervised learning approach for visual question answering based on maximal correlation
S Yin, F Ma, SL Huang - 2021 IEEE International Conference …, 2021 - ieeexplore.ieee.org
In this paper, we propose a semi-supervised learning approach for the Visual Question
Answering (VQA) task based on maximal correlation. Instead of training the VQA model with …