Deep multimodal representation learning: A survey
Multimodal representation learning, which aims to narrow the heterogeneity gap among
different modalities, plays an indispensable role in the utilization of ubiquitous multimodal …
Ternary adversarial networks with self-supervision for zero-shot cross-modal retrieval
Given a query instance from one modality (e.g., image), cross-modal retrieval aims to find
semantically similar instances from another modality (e.g., text). To perform cross-modal …
Graph embedding contrastive multi-modal representation learning for clustering
Multi-modal clustering (MMC) aims to explore complementary information from diverse
modalities for clustering performance facilitating. This article studies challenging problems in …
Dual alignment unsupervised domain adaptation for video-text retrieval
Video-text retrieval is an emerging stream in both computer vision and natural language
processing communities, which aims to find relevant videos given text queries. In this paper …
MHTN: Modal-adversarial hybrid transfer network for cross-modal retrieval
Cross-modal retrieval has drawn wide interest for retrieval across different modalities (such
as text, image, video, audio, and 3-D model). However, existing methods based on a deep …
Unsupervised domain adaptative temporal sentence localization with mutual information maximization
Temporal sentence localization (TSL) aims to localize a target segment in a video according
to a given sentence query. Though respectable works have made decent achievements in …
Multi-modality associative bridging through memory: Speech sound recollected from face video
In this paper, we introduce a novel audio-visual multi-modal bridging framework that can
utilize both audio and visual information, even with uni-modal inputs. We exploit a memory …
Joint feature synthesis and embedding: Adversarial cross-modal retrieval revisited
Recently, generative adversarial network (GAN) has shown its strong ability on modeling
data distribution via adversarial learning. Cross-modal GAN, which attempts to utilize the …
AKVSR: Audio knowledge empowered visual speech recognition by compressing audio knowledge of a pretrained model
Visual Speech Recognition (VSR) is the task of predicting spoken words from silent lip
movements. VSR is regarded as a challenging task because of the insufficient information …
Learning cross-modal common representations by private–shared subspaces separation
Due to the inconsistent distributions and representations of different modalities (e.g., images
and texts), it is very challenging to correlate such heterogeneous data. A standard solution is …