Scientific discovery in the age of artificial intelligence
Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment
and accelerate research, helping scientists to generate hypotheses, design experiments …
Diffusion models: A comprehensive survey of methods and applications
Diffusion models have emerged as a powerful new family of deep generative models with
record-breaking performance in many applications, including image synthesis, video …
VideoMAE V2: Scaling video masked autoencoders with dual masking
Scale is the primary factor for building a powerful foundation model that could well
generalize to a variety of downstream tasks. However, it is still challenging to train video …
Scaling language-image pre-training via masking
We present Fast Language-Image Pre-training (FLIP), a simple and more efficient
method for training CLIP. Our method randomly masks out and removes a large portion of …
Prompt, generate, then cache: Cascade of foundation models makes strong few-shot learners
Visual recognition in low-data regimes requires deep neural networks to learn generalized
representations from limited training samples. Recently, CLIP-based methods have shown …
Photorealistic video generation with diffusion models
We present WALT, a diffusion transformer for photorealistic video generation from text
prompts. Our approach has two key design decisions. First, we use a causal encoder to …
Sequential modeling enables scalable learning for large vision models
We introduce a novel sequential modeling approach which enables learning a Large Vision
Model (LVM) without making use of any linguistic data. To do this we define a common …
Your diffusion model is secretly a zero-shot classifier
The recent wave of large-scale text-to-image diffusion models has dramatically increased
our text-based image generation abilities. These models can generate realistic images for a …
Self-supervised learning for medical image classification: a systematic review and implementation guidelines
Advancements in deep learning and computer vision provide promising solutions for
medical image analysis, potentially improving healthcare and patient outcomes. However …
Masked autoencoders as spatiotemporal learners
This paper studies a conceptually simple extension of Masked Autoencoders (MAE) to
spatiotemporal representation learning from videos. We randomly mask out spacetime …