- Academic Search

Trocr: Transformer-based optical character recognition with pre-trained models

M Li, T Lv, J Chen, L Cui, Y Lu, D Florencio… - Proceedings of the …, 2023 - ojs.aaai.org

Text recognition is a long-standing research problem for document digitalization. Existing
approaches are usually built based on CNN for image understanding and RNN for char …

Cited by 443

Scene text recognition with permuted autoregressive sequence models

D Bautista, R Atienza - European conference on computer vision, 2022 - Springer

Context-aware STR methods typically use internal autoregressive (AR) language models
(LM). Inherent limitations of AR models motivated two-stage methods which employ an …

Cited by 198

Abinet++: Autonomous, bidirectional and iterative language modeling for scene text spotting

S Fang, Z Mao, H Xie, Y Wang, C Yan… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Scene text spotting is of great importance to the computer vision community due to its wide
variety of applications. Recent methods attempt to introduce linguistic knowledge for …

Cited by 55

Continuous human action recognition for human-machine interaction: a review

H Gammulle, D Ahmedt-Aristizabal, S Denman… - ACM Computing …, 2023 - dl.acm.org

With advances in data-driven machine learning research, a wide variety of prediction
models have been proposed to capture spatio-temporal features for the analysis of video …

Cited by 29

Reading and writing: Discriminative and generative modeling for self-supervised text recognition

M Yang, M Liao, P Lu, J Wang, S Zhu, H Luo… - Proceedings of the 30th …, 2022 - dl.acm.org

Existing text recognition methods usually need large-scale training data. Most of them rely
on synthetic training data due to the lack of annotated real images. However, there is a …

Cited by 66

Human–AI Collaboration for Remote Sighted Assistance: Perspectives from the LLM Era

R Yu, S Lee, J Xie, SM Billah, JM Carroll - Future Internet, 2024 - mdpi.com

Remote sighted assistance (RSA) has emerged as a conversational technology aiding
people with visual impairments (VI) through real-time video chat communication with sighted …

Cited by 3

Sketch2Saliency: learning to detect salient objects from human drawings

AK Bhunia, S Koley, A Kumar, A Sain… - Proceedings of the …, 2023 - openaccess.thecvf.com

Human sketch has already proved its worth in various visual understanding tasks (e.g.,
retrieval, segmentation, image-captioning, etc.). In this paper, we reveal a new trait of …

Cited by 22

Multi-modal text recognition networks: Interactive enhancements between visual and semantic features

B Na, Y Kim, S Park - European Conference on Computer Vision, 2022 - Springer

Linguistic knowledge has brought great benefits to scene text recognition by providing
semantics to refine character sequences. However, since linguistic knowledge has been …

Cited by 74

Cdistnet: Perceiving multi-domain character distance for robust text recognition

T Zheng, Z Chen, S Fang, H Xie, YG Jiang - International Journal of …, 2024 - Springer

The transformer-based encoder-decoder framework is becoming popular in scene text
recognition, largely because it naturally integrates recognition clues from both visual and …

Cited by 66

Dtrocr: Decoder-only transformer for optical character recognition

M Fujitake - Proceedings of the IEEE/CVF Winter …, 2024 - openaccess.thecvf.com

Typical text recognition methods rely on an encoder-decoder structure, in which the encoder
extracts features from an image, and the decoder produces recognized text from these …

Cited by 42

Joint visual semantic reasoning: Multi-stage decoder for text recognition

Trocr: Transformer-based optical character recognition with pre-trained models
