A survey on video diffusion models
The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …
Sora: A review on background, technology, limitations, and opportunities of large vision models
Sora is a text-to-video generative AI model, released by OpenAI in February 2024. The
model is trained to generate videos of realistic or imaginative scenes from text instructions …
Patch diffusion: Faster and more data-efficient training of diffusion models
Diffusion models are powerful, but they require a lot of time and data to train. We propose
Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training …
FACT: Frame-action cross-attention temporal modeling for efficient action segmentation
We study supervised action segmentation whose goal is to predict framewise action labels
of a video. To capture temporal dependencies over long horizons, prior works either improve …
Progress-aware online action segmentation for egocentric procedural task videos
We address the problem of online action segmentation for egocentric procedural task
videos. While previous studies have mostly focused on offline action segmentation where …
Temporal action segmentation: An analysis of modern techniques
Temporal action segmentation (TAS) in videos aims at densely identifying video frames in
minutes-long videos with multiple action classes. As a long-range video understanding task …
Learning to schedule in diffusion probabilistic models
Recently, the field of generative models has seen a significant advancement with the
introduction of Diffusion Probabilistic Models (DPMs). The Denoising Diffusion Implicit Model …
Action Detection via an Image Diffusion Process
Action detection aims to localize the starting and ending points of action instances in
untrimmed videos and predict the classes of those instances. In this paper we make the …
Rethinking conditional diffusion sampling with progressive guidance
This paper tackles two critical challenges encountered in classifier guidance for diffusion
generative models, i.e., the lack of diversity and the presence of adversarial effects. These …
ActSonic: Recognizing Everyday Activities from Inaudible Acoustic Wave Around the Body
We present ActSonic, an intelligent, low-power active acoustic sensing system integrated
into eyeglasses that can recognize 27 different everyday activities (e.g., eating, drinking …