Switching temporary teachers for semi-supervised semantic segmentation

J Na, JW Ha, HJ Chang, D Han… - Advances in Neural …, 2024 - proceedings.neurips.cc
The teacher-student framework, prevalent in semi-supervised semantic segmentation,
mainly employs the exponential moving average (EMA) to update a single teacher's weights …
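The EMA teacher update mentioned in this abstract is a standard mechanism; a minimal sketch of it follows. This is not the paper's switching-teacher method, just the baseline single-teacher EMA rule; the function name and the decay value `alpha` are illustrative.

```python
# Hedged sketch: classic EMA teacher update in a teacher-student framework.
# The teacher's weights are a slow-moving average of the student's.
def ema_update(teacher_weights, student_weights, alpha=0.999):
    """Blend student weights into the teacher: w_t <- a*w_t + (1 - a)*w_s."""
    return [alpha * wt + (1.0 - alpha) * ws
            for wt, ws in zip(teacher_weights, student_weights)]

teacher = [0.0, 1.0]
student = [1.0, 1.0]
teacher = ema_update(teacher, student, alpha=0.9)
# teacher[0] moves a small step (1 - alpha) toward the student
```

With `alpha` close to 1 the teacher changes slowly, which is what makes its pseudo-labels stable targets for the student.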

Learning with unmasked tokens drives stronger vision learners

T Kim, S Chun, B Heo, D Han - European Conference on Computer Vision, 2024 - Springer
Masked image modeling (MIM) has become a leading self-supervised learning strategy.
MIMs such as Masked Autoencoder (MAE) learn strong representations by randomly …
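The random masking that MAE-style MIM relies on can be sketched in a few lines. This is a generic illustration, not this paper's method; the 75% ratio is the commonly cited MAE default, and the helper name is an assumption.

```python
import random

# Hedged sketch: MAE-style random token masking. Only the visible tokens
# are fed to the encoder; the masked ones become reconstruction targets.
def random_mask(num_tokens, mask_ratio=0.75, seed=0):
    """Return (visible, masked) index lists over a token sequence."""
    rng = random.Random(seed)
    order = list(range(num_tokens))
    rng.shuffle(order)
    n_masked = int(num_tokens * mask_ratio)
    return sorted(order[n_masked:]), sorted(order[:n_masked])

visible, masked = random_mask(16, mask_ratio=0.75)
# 4 visible tokens, 12 masked tokens, together covering all 16 positions
```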

POA: Pre-training Once for Models of All Sizes

Y Zhang, X Guo, J Lao, L Yu, L Ru, J Wang… - … on Computer Vision, 2024 - Springer
Large-scale self-supervised pre-training has paved the way for one foundation model to
handle many different vision tasks. Most pre-training methodologies train a single model of a …

Adaptive depth networks with skippable sub-paths

W Kang, H Lee - arXiv preprint arXiv:2312.16392, 2023 - arxiv.org
Predictable adaptation of network depths can be an effective way to control inference
latency and meet the resource condition of various devices. However, previous adaptive …
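The core idea the abstract names, skipping sub-paths of a residual network to control depth at inference time, can be illustrated with a toy forward pass. The block functions and skip policy below are purely illustrative assumptions, not the paper's architecture.

```python
# Hedged sketch: in a residual network, a skipped block degenerates to the
# identity shortcut, so depth (and latency) can be reduced on the fly.
def forward(x, blocks, skip=frozenset()):
    """Apply residual blocks; skipped indices pass x through unchanged."""
    for i, block in enumerate(blocks):
        if i in skip:
            continue  # skippable sub-path: take the shortcut only
        x = x + block(x)
    return x

blocks = [lambda x: 0.1 * x, lambda x: 0.1 * x, lambda x: 0.1 * x]
full = forward(1.0, blocks)               # all three blocks applied
shallow = forward(1.0, blocks, skip={1})  # middle block skipped
```

The point of the sketch is that the same weights serve every depth setting, so one trained network can meet different latency budgets.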

CR-CTC: Consistency regularization on CTC for improved speech recognition

Z Yao, W Kang, X Yang, F Kuang, L Guo, H Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
Connectionist Temporal Classification (CTC) is a widely used method for automatic speech
recognition (ASR), renowned for its simplicity and computational efficiency. However, it often …
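A generic consistency-regularization term between the per-frame output distributions of two augmented views can be sketched as below. CR-CTC's actual loss combines such a term with CTC objectives; the symmetric-KL form and all names here are illustrative assumptions.

```python
import math

# Hedged sketch: symmetric KL consistency between two views' frame-wise
# output distributions, averaged over frames.
def kl(p, q):
    """KL divergence between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def consistency_loss(dist_a, dist_b):
    """Mean symmetric KL over aligned frames of the two views."""
    return sum(0.5 * (kl(pa, pb) + kl(pb, pa))
               for pa, pb in zip(dist_a, dist_b)) / len(dist_a)

view_a = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
view_b = [[0.6, 0.3, 0.1], [0.6, 0.3, 0.1]]
loss = consistency_loss(view_a, view_b)  # positive when the views disagree
```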

MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation

M Lee, S Lee, S Park, D Han, B Heo, H Shim - arXiv preprint arXiv …, 2024 - arxiv.org
Referring Image Segmentation (RIS) is an advanced vision-language task that involves
identifying and segmenting objects within an image as described by free-form text …

Masked Image Modeling via Dynamic Token Morphing

T Kim, D Han, B Heo - arXiv preprint arXiv:2401.00254, 2023 - arxiv.org
Masked Image Modeling (MIM) has emerged as a promising option for Vision Transformers among
various self-supervised learning (SSL) methods. The essence of MIM lies in token-wise …

Longer-range Contextualized Masked Autoencoder

T Kim, S Chun, B Heo, D Han - arXiv preprint arXiv:2310.13593, 2023 - arxiv.org
Masked image modeling (MIM) has emerged as a promising self-supervised learning (SSL)
strategy. The MIM pre-training facilitates learning powerful representations using an encoder …

Augmenting Sub-model to Improve Main Model

B Heo, T Kim, S Yun, D Han - arXiv preprint arXiv:2306.11339, 2023 - arxiv.org
Image classification has improved with the development of training techniques. However,
these techniques often require careful parameter tuning to balance the strength of …

Elevating Augmentation: Boosting Performance via Sub-Model Training

B Heo, T Kim, S Yun, D Han - openreview.net
Image classification has improved with the development of training techniques. However,
these techniques often require careful parameter tuning to balance the strength of …