Rewrite the stars

X Ma, X Dai, Y Bai, Y Wang… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Recent studies have drawn attention to the untapped potential of the "star
operation" (element-wise multiplication) in network design. While intuitive explanations …

CVT-SLR: Contrastive visual-textual transformation for sign language recognition with variational alignment

J Zheng, Y Wang, C Tan, S Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
Sign language recognition (SLR) is a weakly supervised task that annotates sign videos as
textual glosses. Recent studies show that insufficient training caused by the lack of large …

OpenSTL: A comprehensive benchmark of spatio-temporal predictive learning

C Tan, S Li, Z Gao, W Guan, Z Wang… - Advances in …, 2023 - proceedings.neurips.cc
Spatio-temporal predictive learning is a learning paradigm that enables models to learn
spatial and temporal patterns by predicting future frames from given past frames in an …

Temporal attention unit: Towards efficient spatiotemporal predictive learning

C Tan, Z Gao, L Wu, Y Xu, J Xia… - Proceedings of the …, 2023 - openaccess.thecvf.com
Spatiotemporal predictive learning aims to generate future frames by learning from historical
frames. In this paper, we investigate existing methods and present a general framework of …

Masked modeling for self-supervised representation learning on vision and beyond

S Li, L Zhang, Z Wang, D Wu, L Wu, Z Liu, J Xia… - arXiv preprint arXiv …, 2023 - arxiv.org
As the deep learning revolution marches on, self-supervised learning has garnered
increasing attention in recent years thanks to its remarkable representation learning ability …

CF-ViT: A general coarse-to-fine method for vision transformer

M Chen, M Lin, K Li, Y Shen, Y Wu, F Chao… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Vision Transformers (ViT) have made many breakthroughs in computer vision tasks.
However, considerable redundancy arises in the spatial dimension of an input image …

SemiReward: A general reward model for semi-supervised learning

S Li, W Jin, Z Wang, F Wu, Z Liu, C Tan… - arXiv preprint arXiv …, 2023 - arxiv.org
Semi-supervised learning (SSL) has witnessed great progress with various improvements in
the self-training framework with pseudo labeling. The main challenge is how to distinguish …

Lightweight image super-resolution based multi-order gated aggregation network

G Gendy, N Sabor, G He - Neural Networks, 2023 - Elsevier
Recently, Transformer-based models are taken much focus on solving the task of image
super-resolution (SR) due to their ability to achieve better performance. However, these …

SUMix: Mixup with semantic and uncertain information

H Qin, X Jin, H Zhu, H Liao, MA El-Yacoubi… - European Conference on …, 2024 - Springer
Mixup data augmentation approaches have been applied for various tasks of deep learning
to improve the generalization ability of deep neural networks. Some existing approaches …

Wavelet-driven spatiotemporal predictive learning: bridging frequency and time variations

X Nie, Y Yan, S Li, C Tan, X Chen, H Jin… - Proceedings of the …, 2024 - ojs.aaai.org
Spatiotemporal predictive learning is a learning paradigm that enables models to learn
spatial and temporal patterns by predicting future frames from given past frames in an …