A survey on video diffusion models

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2024 - dl.acm.org
The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

SimDA: Simple diffusion adapter for efficient video generation

Z Xing, Q Dai, H Hu, Z Wu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The recent wave of AI-generated content has witnessed the great development and success
of Text-to-Image (T2I) technologies. By contrast, Text-to-Video (T2V) still falls short of …

Masked video distillation: Rethinking masked feature modeling for self-supervised video representation learning

R Wang, D Chen, Z Wu, Y Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Benefiting from masked visual modeling, self-supervised video representation learning has
achieved remarkable progress. However, existing methods focus on learning …

Prototypical residual networks for anomaly detection and localization

H Zhang, Z Wu, Z Wang, Z Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Anomaly detection and localization are widely used in industrial manufacturing for their
efficiency and effectiveness. Anomalies are rare and hard to collect, and supervised models …

Implicit temporal modeling with learnable alignment for video recognition

S Tu, Q Dai, Z Wu, ZQ Cheng, H Hu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Contrastive language-image pretraining (CLIP) has demonstrated remarkable success in
various image tasks. However, how to extend CLIP with effective temporal modeling is still …

Open-VCLIP: Transforming CLIP to an open-vocabulary video model via interpolated weight optimization

Z Weng, X Yang, A Li, Z Wu… - … Conference on Machine …, 2023 - proceedings.mlr.press
Contrastive Language-Image Pretraining (CLIP) has demonstrated impressive zero-
shot learning abilities for image understanding, yet limited effort has been made to …

MotionEditor: Editing video motion via content-aware diffusion

S Tu, Q Dai, ZQ Cheng, H Hu, X Han… - Proceedings of the …, 2024 - openaccess.thecvf.com
Existing diffusion-based video editing models have made notable advances in editing
attributes of a source video over time but struggle to manipulate the motion information while …

XVO: Generalized visual odometry via cross-modal self-training

L Lai, Z Shangguan, J Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
We propose XVO, a semi-supervised learning method for training generalized monocular
Visual Odometry (VO) models with robust off-the-shelf operation across diverse datasets and …

vid-TLDR: Training-free token merging for light-weight video transformer

J Choi, S Lee, J Chu, M Choi… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Video Transformers have become the prevalent solution for various video downstream tasks
with superior expressive power and flexibility. However, these video transformers suffer from …

CLIP-TSA: CLIP-assisted temporal self-attention for weakly-supervised video anomaly detection

HK Joo, K Vo, K Yamazaki, N Le - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Video anomaly detection (VAD), commonly formulated as a multiple-instance learning
problem in a weakly-supervised manner due to its labor-intensive nature, is a challenging …