ESTextSpotter: Towards better scene text spotting with explicit synergy in transformer
In recent years, end-to-end scene text spotting approaches are evolving to the Transformer-
based framework. While previous studies have shown the crucial importance of the intrinsic …
Empowering agrifood system with artificial intelligence: A survey of the progress, challenges and opportunities
With the world population rapidly increasing, transforming our agrifood systems to be more
productive, efficient, safe, and sustainable is crucial to mitigate potential food shortages …
Exploring OCR capabilities of GPT-4V(ision): A quantitative and in-depth evaluation
This paper presents a comprehensive evaluation of the Optical Character Recognition
(OCR) capabilities of the recently released GPT-4V(ision), a Large Multimodal Model …
Parrot captions teach CLIP to spot text
Despite CLIP being the foundation model in numerous vision-language applications, CLIP
suffers from a severe text spotting bias. Such bias causes CLIP models to 'Parrot' the visual …
OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition
Recently visually-situated text parsing (VsTP) has experienced notable advancements
driven by the increasing demand for automated document understanding and the …
Turning a CLIP model into a scene text spotter
We exploit the potential of the large-scale Contrastive Language-Image Pretraining (CLIP)
model to enhance scene text detection and spotting tasks, transforming it into a robust …
DNTextSpotter: Arbitrary-shaped scene text spotting via improved denoising training
More and more end-to-end text spotting methods based on Transformer architecture have
demonstrated superior performance. These methods utilize a bipartite graph matching …
Platypus: A generalized specialist model for reading text in various forms
Reading text from images (either natural scenes or documents) has been a long-standing
research topic for decades, due to the high technical challenge and wide application range …
Hyper-local deformable transformers for text spotting on historical maps
Text on historical maps contains valuable information providing georeferenced historical,
political, and cultural contexts. However, text extraction from historical maps has been …
Bridging the Gap Between End-to-End and Two-Step Text Spotting
Modularity plays a crucial role in the development and maintenance of complex systems.
While end-to-end text spotting efficiently mitigates the issues of error accumulation and sub …