A review of convolutional neural networks in computer vision

X Zhao, L Wang, Y Zhang, X Han, M Deveci… - Artificial Intelligence …, 2024 - Springer
In computer vision, a series of exemplary advances have been made in several areas
involving image classification, semantic segmentation, object detection, and image super …

CVT-SLR: Contrastive visual-textual transformation for sign language recognition with variational alignment

J Zheng, Y Wang, C Tan, S Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
Sign language recognition (SLR) is a weakly supervised task that annotates sign videos as
textual glosses. Recent studies show that insufficient training caused by the lack of large …

Temporal attention unit: Towards efficient spatiotemporal predictive learning

C Tan, Z Gao, L Wu, Y Xu, J Xia… - Proceedings of the …, 2023 - openaccess.thecvf.com
Spatiotemporal predictive learning aims to generate future frames by learning from historical
frames. In this paper, we investigate existing methods and present a general framework of …

UniST: A prompt-empowered universal model for urban spatio-temporal prediction

Y Yuan, J Ding, J Feng, D Jin, Y Li - Proceedings of the 30th ACM …, 2024 - dl.acm.org
Urban spatio-temporal prediction is crucial for informed decision-making, such as traffic
management, resource optimization, and emergency response. Despite remarkable …

OpenSTL: A comprehensive benchmark of spatio-temporal predictive learning

C Tan, S Li, Z Gao, W Guan, Z Wang… - Advances in …, 2023 - proceedings.neurips.cc
Spatio-temporal predictive learning is a learning paradigm that enables models to learn
spatial and temporal patterns by predicting future frames from given past frames in an …

ProteinInvBench: Benchmarking protein inverse folding on diverse tasks, models, and metrics

Z Gao, C Tan, Y Zhang, X Chen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Protein inverse folding has attracted increasing attention in recent years. However, we
observe that current methods are usually limited to the CATH dataset and the recovery …

PiFold: Toward effective and efficient protein inverse folding

Z Gao, C Tan, P Chacón, SZ Li - arXiv preprint arXiv:2209.12643, 2022 - arxiv.org
How can we design protein sequences folding into the desired structures effectively and
efficiently? AI methods for structure-based protein design have attracted increasing attention …

ExtDM: Distribution extrapolation diffusion model for video prediction

Z Zhang, J Hu, W Cheng, D Paudel… - Proceedings of the …, 2024 - openaccess.thecvf.com
Video prediction is a challenging task due to its nature of uncertainty especially for
forecasting a long period. To model the temporal dynamics advanced methods benefit from …

TSANet: Forecasting traffic congestion patterns from aerial videos using graphs and transformers

KN Kumar, D Roy, TA Suman, C Vishnu, CK Mohan - Pattern Recognition, 2024 - Elsevier
Forecasting traffic congestion patterns in lane-less traffic scenarios is a complex task
because of the combination of high & irregular vehicle densities, fluctuating speeds, and the …

SimVP: Towards simple yet powerful spatiotemporal predictive learning

C Tan, Z Gao, S Li, SZ Li - arXiv preprint arXiv:2211.12509, 2022 - arxiv.org
Recent years have witnessed remarkable advances in spatiotemporal predictive learning,
incorporating auxiliary inputs, elaborate neural architectures, and sophisticated training …