Weight-sharing neural architecture search: A battle to shrink the optimization gap

L Xie, X Chen, K Bi, L Wei, Y Xu, L Wang… - ACM Computing …, 2021 - dl.acm.org
Neural architecture search (NAS) has attracted increasing attention. In recent years,
individual search methods have been replaced by weight-sharing search methods for higher …

TrOCR: Transformer-based optical character recognition with pre-trained models

M Li, T Lv, J Chen, L Cui, Y Lu, D Florencio… - Proceedings of the …, 2023 - ojs.aaai.org
Text recognition is a long-standing research problem for document digitalization. Existing
approaches are usually built based on CNN for image understanding and RNN for char …

Scene text recognition with permuted autoregressive sequence models

D Bautista, R Atienza - European conference on computer vision, 2022 - Springer
Context-aware STR methods typically use internal autoregressive (AR) language models
(LM). Inherent limitations of AR models motivated two-stage methods which employ an …

SVTR: Scene text recognition with a single visual model

Y Du, Z Chen, C Jia, X Yin, T Zheng, C Li, Y Du… - arXiv preprint arXiv …, 2022 - arxiv.org
Dominant scene text recognition models commonly contain two building blocks, a visual
model for feature extraction and a sequence model for text transcription. This hybrid …

Scene text telescope: Text-focused scene image super-resolution

J Chen, B Li, X Xue - … of the IEEE/CVF Conference on …, 2021 - openaccess.thecvf.com
Image super-resolution, which is often regarded as a preprocessing procedure of scene text
recognition, aims to recover the realistic features from a low-resolution text image. It has …

Reading and writing: Discriminative and generative modeling for self-supervised text recognition

M Yang, M Liao, P Lu, J Wang, S Zhu, H Luo… - Proceedings of the 30th …, 2022 - dl.acm.org
Existing text recognition methods usually need large-scale training data. Most of them rely
on synthetic training data due to the lack of annotated real images. However, there is a …

Primitive representation learning for scene text recognition

R Yan, L Peng, S Xiao, G Yao - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Scene text recognition is a challenging task due to diverse variations of text instances in
natural scene images. Conventional methods based on CNN-RNN-CTC or encoder …

Searching to exploit memorization effect in learning with noisy labels

Q Yao, H Yang, B Han, G Niu… - … on Machine Learning, 2020 - proceedings.mlr.press
Sample selection approaches are popular in robust learning from noisy labels. However,
how to properly control the selection process so that deep networks can benefit from the …

Multi-modal text recognition networks: Interactive enhancements between visual and semantic features

B Na, Y Kim, S Park - European Conference on Computer Vision, 2022 - Springer
Linguistic knowledge has brought great benefits to scene text recognition by providing
semantics to refine character sequences. However, since linguistic knowledge has been …

PIMNet: A parallel, iterative and mimicking network for scene text recognition

Z Qiao, Y Zhou, J Wei, W Wang, Y Zhang… - Proceedings of the 29th …, 2021 - dl.acm.org
Nowadays, scene text recognition has attracted more and more attention due to its various
applications. Most state-of-the-art methods adopt an encoder-decoder framework with …