Weight-sharing neural architecture search: A battle to shrink the optimization gap
Neural architecture search (NAS) has attracted increasing attention. In recent years,
individual search methods have been replaced by weight-sharing search methods for higher …
individual search methods have been replaced by weight-sharing search methods for higher …
Trocr: Transformer-based optical character recognition with pre-trained models
Text recognition is a long-standing research problem for document digitalization. Existing
approaches are usually built based on CNN for image understanding and RNN for char …
approaches are usually built based on CNN for image understanding and RNN for char …
Scene text recognition with permuted autoregressive sequence models
Context-aware STR methods typically use internal autoregressive (AR) language models
(LM). Inherent limitations of AR models motivated two-stage methods which employ an …
(LM). Inherent limitations of AR models motivated two-stage methods which employ an …
Svtr: Scene text recognition with a single visual model
Dominant scene text recognition models commonly contain two building blocks, a visual
model for feature extraction and a sequence model for text transcription. This hybrid …
model for feature extraction and a sequence model for text transcription. This hybrid …
Scene text telescope: Text-focused scene image super-resolution
Image super-resolution, which is often regarded as a preprocessing procedure of scene text
recognition, aims to recover the realistic features from a low-resolution text image. It has …
recognition, aims to recover the realistic features from a low-resolution text image. It has …
Reading and writing: Discriminative and generative modeling for self-supervised text recognition
Existing text recognition methods usually need large-scale training data. Most of them rely
on synthetic training data due to the lack of annotated real images. However, there is a …
on synthetic training data due to the lack of annotated real images. However, there is a …
Primitive representation learning for scene text recognition
R Yan, L Peng, S **ao, G Yao - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Scene text recognition is a challenging task due to diverse variations of text instances in
natural scene images. Conventional methods based on CNN-RNN-CTC or encoder …
natural scene images. Conventional methods based on CNN-RNN-CTC or encoder …
Searching to exploit memorization effect in learning with noisy labels
Sample selection approaches are popular in robust learning from noisy labels. However,
how to properly control the selection process so that deep networks can benefit from the …
how to properly control the selection process so that deep networks can benefit from the …
Multi-modal text recognition networks: Interactive enhancements between visual and semantic features
Linguistic knowledge has brought great benefits to scene text recognition by providing
semantics to refine character sequences. However, since linguistic knowledge has been …
semantics to refine character sequences. However, since linguistic knowledge has been …
Pimnet: a parallel, iterative and mimicking network for scene text recognition
Nowadays, scene text recognition has attracted more and more attention due to its various
applications. Most state-of-the-art methods adopt an encoder-decoder framework with …
applications. Most state-of-the-art methods adopt an encoder-decoder framework with …