Switching temporary teachers for semi-supervised semantic segmentation
The teacher-student framework, prevalent in semi-supervised semantic segmentation,
mainly employs the exponential moving average (EMA) to update a single teacher's weights …
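The abstract above refers to updating a teacher's weights by exponential moving average (EMA) of the student's. A minimal sketch of one EMA step, operating on plain lists of parameters (the function name and `decay` value are illustrative, not from the paper):

```python
def ema_update(teacher_params, student_params, decay=0.999):
    """One EMA step: teacher <- decay * teacher + (1 - decay) * student.

    Both arguments are flat lists of scalar parameters; real frameworks
    apply the same rule tensor-by-tensor.
    """
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher_params, student_params)]
```

With a high decay the teacher changes slowly, giving a smoothed, more stable target than the raw student weights.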
Learning with unmasked tokens drives stronger vision learners
Masked image modeling (MIM) has become a leading self-supervised learning strategy.
MIMs such as Masked Autoencoder (MAE) learn strong representations by randomly …
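The snippet above mentions MAE-style random masking of image tokens. A minimal sketch of selecting which tokens remain visible under a given mask ratio (function name, seeding, and the 75% ratio are illustrative defaults, not details from this paper):

```python
import random

def random_visible_tokens(num_tokens, mask_ratio=0.75, seed=0):
    """Pick the indices of tokens left visible after MAE-style random masking.

    Shuffles all token indices and keeps the first (1 - mask_ratio) fraction;
    the remaining indices are treated as masked and must be reconstructed.
    """
    rng = random.Random(seed)
    idx = list(range(num_tokens))
    rng.shuffle(idx)
    num_visible = int(num_tokens * (1.0 - mask_ratio))
    return sorted(idx[:num_visible])
```

In MAE the encoder only sees these visible tokens, which is what makes the pre-training both effective and computationally cheap.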
POA: Pre-training Once for Models of All Sizes
Large-scale self-supervised pre-training has paved the way for one foundation model to
handle many different vision tasks. Most pre-training methodologies train a single model of a …
Adaptive depth networks with skippable sub-paths
Predictable adaptation of network depths can be an effective way to control inference
latency and meet the resource condition of various devices. However, previous adaptive …
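The abstract above describes controlling inference latency via skippable sub-paths in a residual network. A minimal sketch of the general idea, with each block's residual branch optionally skipped at inference time (the function shape and boolean mask are an assumed illustration, not the paper's exact mechanism):

```python
def forward_adaptive(x, blocks, skip_mask):
    """Run a stack of residual blocks, skipping the sub-paths flagged in skip_mask.

    `blocks` is a list of callables computing residual branches f(x);
    skipped blocks reduce depth, trading accuracy for lower latency.
    """
    for block, skip in zip(blocks, skip_mask):
        if not skip:
            x = x + block(x)  # standard residual update
        # if skipped, x passes through unchanged (identity shortcut only)
    return x
```

Because skipped blocks degenerate to the identity, the same weights can serve several depth configurations chosen to fit a device's resource budget.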
CR-CTC: Consistency regularization on CTC for improved speech recognition
Connectionist Temporal Classification (CTC) is a widely used method for automatic speech
recognition (ASR), renowned for its simplicity and computational efficiency. However, it often …
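The entry above concerns consistency regularization on CTC outputs. A common building block for such losses is a symmetric KL divergence between the output distributions of two augmented views of the same input; the sketch below shows that generic ingredient only (it is an assumption for illustration, not CR-CTC's exact loss):

```python
import math

def kl(p, q):
    """KL(p || q) for two discrete distributions given as probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def symmetric_kl(p, q):
    """Symmetrized KL, often used to pull two predictions toward each other."""
    return 0.5 * (kl(p, q) + kl(q, p))
```

Averaging such a term over frames penalizes disagreement between the two views, encouraging predictions that are stable under augmentation.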
MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation
Referring Image Segmentation (RIS) is an advanced vision-language task that involves
identifying and segmenting objects within an image as described by free-form text …
Masked Image Modeling via Dynamic Token Morphing
Masked Image Modeling (MIM) arises as a promising option for Vision Transformers among
various self-supervised learning (SSL) methods. The essence of MIM lies in token-wise …
Longer-range Contextualized Masked Autoencoder
Masked image modeling (MIM) has emerged as a promising self-supervised learning (SSL)
strategy. The MIM pre-training facilitates learning powerful representations using an encoder …
Augmenting Sub-model to Improve Main Model
Image classification has improved with the development of training techniques. However,
these techniques often require careful parameter tuning to balance the strength of …
Elevating Augmentation: Boosting Performance via Sub-Model Training
Image classification has improved with the development of training techniques. However,
these techniques often require careful parameter tuning to balance the strength of …