Video frame interpolation: A comprehensive survey
Video Frame Interpolation (VFI) is a fascinating and challenging problem in the computer
vision (CV) field, aiming to generate non-existing frames between two consecutive video …
Uniformer: Unifying convolution and self-attention for visual recognition
It is a challenging task to learn discriminative representation from images and videos, due to
large local redundancy and complex global dependency in these visual data. Convolution …
A survey on vision transformer
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …
A survey on visual transformer
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …
Facial expression recognition with visual transformers and attentional selective fusion
Facial Expression Recognition (FER) in the wild is extremely challenging due to occlusions,
variant head poses, face deformation and motion blur under unconstrained conditions …
Oadtr: Online action detection with transformers
Most recent approaches for online action detection tend to apply Recurrent Neural Network
(RNN) to capture long-range temporal structure. However, RNN suffers from non-parallelism …
A transformer-based feature segmentation and region alignment method for UAV-view geo-localization
M Dai, J Hu, J Zhuang, E Zheng - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Cross-view geo-localization is a task of matching the same geographic image from different
views, e.g., unmanned aerial vehicle (UAV) and satellite. The most difficult challenges are the …
Neural video depth stabilizer
Video depth estimation aims to infer temporally consistent depth. Some methods achieve
temporal consistency by finetuning a single-image depth model during test time using …
CCTNet: Coupled CNN and transformer network for crop segmentation of remote sensing images
Semantic segmentation by using remote sensing images is an efficient method for
agricultural crop classification. Recent solutions in crop segmentation are mainly deep …
Spike transformer: Monocular depth estimation for spiking camera
Spiking camera is a bio-inspired vision sensor that mimics the sampling mechanism of the
primate fovea, which has shown great potential for capturing high-speed dynamic scenes …