Google Scholar

X Ma, X Dai, Y Bai, Y Wang… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Recent studies have drawn attention to the untapped potential of the "star
operation" (element-wise multiplication) in network design. While intuitive explanations …

Cited by: 93

Dynamic and static mutual fitting for action recognition

W Liu, X Jia, X Zhong, K Jiang, X Yu, M Ye - Pattern Recognition, 2025 - Elsevier

Action recognition is intended to classify a video into a certain category by aggregating and
summarizing its temporal and spatial information. Existing methods have achieved …

Cited by: 3

Optimizing Factorized Encoder Models: Time and Memory Reduction for Scalable and Efficient Action Recognition

SN Gowda, A Arnab, J Huang - European Conference on Computer Vision, 2024 - Springer

In this paper, we address the challenges posed by the substantial training time and memory
consumption associated with video transformers, focusing on the ViViT (Video Vision …

Cited by: 1

Gbc: Guided alignment and adaptive boosting clip bridging vision and language for robust action recognition

Z Yang, G An, Z Zheng, S Cao… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

The Contrastive Language-Image Pre-training (CLIP) model achieves strong generalization
by using a large number of text-image pairs for contrastive learning. However, when it is …

Cited by: 3

SOAP: Enhancing Spatio-Temporal Relation and Motion Information Capturing for Few-Shot Action Recognition

W Huang, J Zhang, X Qian, Z Wu, M Wang… - Proceedings of the 32nd …, 2024 - dl.acm.org

High frame-rate (HFR) videos of action recognition improve fine-grained expression while
reducing the spatio-temporal relation and motion information density. Thus, large amounts of …

Cited by: 2

Distillation-free Scaling of Large SSMs for Images and Videos

H Suleman, ST Wasim, M Naseer, J Gall - arXiv preprint arXiv:2409.11867, 2024 - arxiv.org

State-space models (SSMs), exemplified by S4, have introduced a novel context modeling
method by integrating state-space techniques into deep learning. However, they struggle …

Cited by: 1

RaSTFormer: region-aware spatiotemporal transformer for visual homogenization recognition in short videos

S Zhang, J Zhang, H Zhang, L Zhuo - Neural Computing and Applications, 2024 - Springer

With the surge in network traffic, the homogenization of short video content is becoming
increasingly prominent, resulting in low-quality entertainment due to proliferation and …

Cited by: 1

Focal modulation networks for interpretable sound classification

L Della Libera, C Subakan… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org

The increasing success of deep neural networks has raised concerns about their inherent
black-box nature, posing challenges related to interpretability and trust. While there has …

Cited by: 3

VT-Grapher: Video Tube Graph Network with Self-Distillation for Human Action Recognition

X Liu, J Liu, X Cheng, J Li, W Wan… - IEEE Sensors Journal, 2024 - ieeexplore.ieee.org

The proliferation of videos captured by sensor-based cameras has driven the application of
human action recognition (HAR) task. As the fundamental video application in human …

Cited by: 1

Focal-TSMP: deep learning for vegetation health prediction and agricultural drought assessment from a regional climate simulation

MH Shams Eddin, J Gall - Geoscientific Model Development, 2024 - gmd.copernicus.org

Satellite-derived agricultural drought indices can provide a complementary perspective of
terrestrial vegetation trends. In addition, their integration for drought assessments under …

Cited by: 1

Video-focalnets: Spatio-temporal focal modulation for video action recognition

Rewrite the stars