Switching temporary teachers for semi-supervised semantic segmentation

J Na, JW Ha, HJ Chang, D Han… - Advances in Neural …, 2024 - proceedings.neurips.cc
The teacher-student framework, prevalent in semi-supervised semantic segmentation,
mainly employs the exponential moving average (EMA) to update a single teacher's weights …
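The EMA teacher update mentioned in this abstract is a standard mechanism; a minimal sketch of it follows. This is not the paper's switching-teacher method, just the baseline single-teacher EMA rule; the function name and the decay value `alpha` are illustrative.

```python
# Hedged sketch: classic EMA teacher update in a teacher-student framework.
# The teacher's weights are a slow-moving average of the student's.
def ema_update(teacher_weights, student_weights, alpha=0.999):
    """Blend student weights into the teacher: w_t <- a*w_t + (1 - a)*w_s."""
    return [alpha * wt + (1.0 - alpha) * ws
            for wt, ws in zip(teacher_weights, student_weights)]

teacher = [0.0, 1.0]
student = [1.0, 1.0]
teacher = ema_update(teacher, student, alpha=0.9)
# teacher[0] moves a small step (1 - alpha) toward the student
```

With `alpha` close to 1 the teacher changes slowly, which is what makes its pseudo-labels stable targets for the student.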

Learning with unmasked tokens drives stronger vision learners

T Kim, S Chun, B Heo, D Han - European Conference on Computer Vision, 2024 - Springer
Masked image modeling (MIM) has become a leading self-supervised learning strategy.
MIMs such as Masked Autoencoder (MAE) learn strong representations by randomly …
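The random masking that MAE-style MIM relies on can be sketched in a few lines. This is a generic illustration, not this paper's method; the 75% ratio is the commonly cited MAE default, and the helper name is an assumption.

```python
import random

# Hedged sketch: MAE-style random token masking. Only the visible tokens
# are fed to the encoder; the masked ones become reconstruction targets.
def random_mask(num_tokens, mask_ratio=0.75, seed=0):
    """Return (visible, masked) index lists over a token sequence."""
    rng = random.Random(seed)
    order = list(range(num_tokens))
    rng.shuffle(order)
    n_masked = int(num_tokens * mask_ratio)
    return sorted(order[n_masked:]), sorted(order[:n_masked])

visible, masked = random_mask(16, mask_ratio=0.75)
# 4 visible tokens, 12 masked tokens, together covering all 16 positions
```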

POA: Pre-training Once for Models of All Sizes

Y Zhang, X Guo, J Lao, L Yu, L Ru, J Wang… - … on Computer Vision, 2024 - Springer
Large-scale self-supervised pre-training has paved the way for one foundation model to
handle many different vision tasks. Most pre-training methodologies train a single model of a …

Adaptive depth networks with skippable sub-paths

W Kang, H Lee - arXiv preprint arXiv:2312.16392, 2023 - arxiv.org
Predictable adaptation of network depths can be an effective way to control inference
latency and meet the resource condition of various devices. However, previous adaptive …
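The core idea the abstract names, skipping sub-paths of a residual network to control depth at inference time, can be illustrated with a toy forward pass. The block functions and skip policy below are purely illustrative assumptions, not the paper's architecture.

```python
# Hedged sketch: in a residual network, a skipped block degenerates to the
# identity shortcut, so depth (and latency) can be reduced on the fly.
def forward(x, blocks, skip=frozenset()):
    """Apply residual blocks; skipped indices pass x through unchanged."""
    for i, block in enumerate(blocks):
        if i in skip:
            continue  # skippable sub-path: take the shortcut only
        x = x + block(x)
    return x

blocks = [lambda x: 0.1 * x, lambda x: 0.1 * x, lambda x: 0.1 * x]
full = forward(1.0, blocks)               # all three blocks applied
shallow = forward(1.0, blocks, skip={1})  # middle block skipped
```

The point of the sketch is that the same weights serve every depth setting, so one trained network can meet different latency budgets.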

CR-CTC: Consistency regularization on CTC for improved speech recognition

Z Yao, W Kang, X Yang, F Kuang, L Guo, H Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
Connectionist Temporal Classification (CTC) is a widely used method for automatic speech
recognition (ASR), renowned for its simplicity and computational efficiency. However, it often …
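A generic consistency-regularization term between the per-frame output distributions of two augmented views can be sketched as below. CR-CTC's actual loss combines such a term with CTC objectives; the symmetric-KL form and all names here are illustrative assumptions.

```python
import math

# Hedged sketch: symmetric KL consistency between two views' frame-wise
# output distributions, averaged over frames.
def kl(p, q):
    """KL divergence between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def consistency_loss(dist_a, dist_b):
    """Mean symmetric KL over aligned frames of the two views."""
    return sum(0.5 * (kl(pa, pb) + kl(pb, pa))
               for pa, pb in zip(dist_a, dist_b)) / len(dist_a)

view_a = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
view_b = [[0.6, 0.3, 0.1], [0.6, 0.3, 0.1]]
loss = consistency_loss(view_a, view_b)  # positive when the views disagree
```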

MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation

M Lee, S Lee, S Park, D Han, B Heo, H Shim - arXiv preprint arXiv …, 2024 - arxiv.org
Referring Image Segmentation (RIS) is an advanced vision-language task that involves
identifying and segmenting objects within an image as described by free-form text …

Masked Image Modeling via Dynamic Token Morphing

T Kim, D Han, B Heo - arXiv preprint arXiv:2401.00254, 2023 - arxiv.org
Masked Image Modeling (MIM) has emerged as a promising option for Vision Transformers among
various self-supervised learning (SSL) methods. The essence of MIM lies in token-wise …

Longer-range Contextualized Masked Autoencoder

T Kim, S Chun, B Heo, D Han - arXiv preprint arXiv:2310.13593, 2023 - arxiv.org
Masked image modeling (MIM) has emerged as a promising self-supervised learning (SSL)
strategy. The MIM pre-training facilitates learning powerful representations using an encoder …

Augmenting Sub-model to Improve Main Model

B Heo, T Kim, S Yun, D Han - arXiv preprint arXiv:2306.11339, 2023 - arxiv.org
Image classification has improved with the development of training techniques. However,
these techniques often require careful parameter tuning to balance the strength of …

Elevating Augmentation: Boosting Performance via Sub-Model Training

B Heo, T Kim, S Yun, D Han - openreview.net
Image classification has improved with the development of training techniques. However,
these techniques often require careful parameter tuning to balance the strength of …