DeepMAD: Mathematical architecture design for deep convolutional neural network
The rapid advances in Vision Transformer (ViT) refresh the state-of-the-art performances in
various vision tasks, overshadowing the conventional CNN-based models. This ignites a few …
Diversifying spatial-temporal perception for video domain generalization
Video domain generalization aims to learn generalizable video classification models for
unseen target domains by training in a source domain. A critical challenge of video domain …
Temporally-adaptive models for efficient video understanding
Spatial convolutions are extensively used in numerous deep video models. It fundamentally
assumes spatio-temporal invariance, i.e., using shared weights for every location in different …
Multimodal cross-domain few-shot learning for egocentric action recognition
We address a novel cross-domain few-shot learning task (CD-FSL) with multimodal input
and unlabeled target data for egocentric action recognition. This paper simultaneously …
Privacy-safe Action Recognition via Cross-Modality Distillation
Human action recognition systems enhance public safety by detecting abnormal behavior
autonomously. RGB sensors commonly used in such systems capture personal information …
Dynamical semantic enhancement network for continuous sign language recognition
S Wang, L Guo, W Xue - Multimedia Systems, 2024 - Springer
In the field of sign language recognition, effective interpretation of semantic information,
which is primarily conveyed through facial and hand gestures, poses significant challenges …
STAN: Spatio-Temporal Analysis Network for efficient video action recognition
Action recognition, whose goal is identifying and extracting spatio-temporal features from
video content, is a foundation of work in video understanding. However, current methods for …
Privacy-enhanced zero-shot learning via data-free knowledge transfer
Considering the increasing concerns about data copyright and sensitivity issues, we present
a novel Privacy-Enhanced Zero-Shot Learning (PE-ZSL) paradigm. The key innovation is to …
Zero-Shot Proxy with Incorporated-Score for Lightweight Deep Neural Architecture Search
TT Nguyen, JH Han - Electronics, 2024 - mdpi.com
Designing a high-performance neural network is a difficult task. Neural architecture search
(NAS) methods aim to solve this process. However, the construction of a high-quality …
No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding
Y Zhai, W Li, Y Tang, X Chen, Y Wang - arXiv preprint arXiv:2405.08344, 2024 - arxiv.org
Current architectures for video understanding mainly build upon 3D convolutional blocks or
2D convolutions with additional operations for temporal modeling. However, these methods …