A survey on video diffusion models
The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …
A survey on self-supervised learning: Algorithms, applications, and future trends
Deep supervised learning algorithms typically require a large volume of labeled data to
achieve satisfactory performance. However, the process of collecting and labeling such data …
DINOv2: Learning robust visual features without supervision
The recent breakthroughs in natural language processing for model pretraining on large
quantities of data have opened the way for similar foundation models in computer vision …
Align your latents: High-resolution video synthesis with latent diffusion models
Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding
excessive compute demands by training a diffusion model in a compressed lower …
Stable video diffusion: Scaling latent video diffusion models to large datasets
We present Stable Video Diffusion, a latent video diffusion model for high-resolution,
state-of-the-art text-to-video and image-to-video generation. Recently, latent diffusion models trained …
VideoMAE V2: Scaling video masked autoencoders with dual masking
Scale is the primary factor for building a powerful foundation model that could well
generalize to a variety of downstream tasks. However, it is still challenging to train video …
Vision-language models for vision tasks: A survey
Most visual recognition studies rely heavily on crowd-labelled data in deep neural networks
(DNNs) training, and they usually train a DNN for each single visual recognition task …
VBench: Comprehensive benchmark suite for video generative models
Video generation has witnessed significant advancements, yet evaluating these models
remains a challenge. A comprehensive evaluation benchmark for video generation is …
Panda-70M: Captioning 70M videos with multiple cross-modality teachers
The quality of the data and annotation upper-bounds the quality of a downstream model.
While there exist large text corpora and image-text pairs, high-quality video-text data is much …
VideoMamba: State space model for efficient video understanding
Addressing the dual challenges of local redundancy and global dependencies in video
understanding, this work innovatively adapts the Mamba to the video domain. The proposed …