A comprehensive survey on applications of transformers for deep learning tasks

S Islam, H Elmekki, A Elsebai, J Bentahar… - Expert Systems with …, 2024 - Elsevier
Transformers are Deep Neural Networks (DNN) that utilize a self-attention
mechanism to capture contextual relationships within sequential data. Unlike traditional …

Text recognition in the wild: A survey

X Chen, L Jin, Y Zhu, C Luo, T Wang - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
The history of text can be traced back over thousands of years. Rich and precise semantic
information carried by text is important in a wide range of vision-based application …

Learning transferable visual models from natural language supervision

A Radford, JW Kim, C Hallacy… - International …, 2021 - proceedings.mlr.press
State-of-the-art computer vision systems are trained to predict a fixed set of predetermined
object categories. This restricted form of supervision limits their generality and usability since …

Fourier contour embedding for arbitrary-shaped text detection

Y Zhu, J Chen, L Liang, Z Kuang… - Proceedings of the …, 2021 - openaccess.thecvf.com
One of the main challenges for arbitrary-shaped text detection is to design a good text
instance representation that allows networks to learn diverse text geometry variances. Most …

DeepSolo: Let transformer decoder with explicit points solo for text spotting

M Ye, J Zhang, S Zhao, J Liu, T Liu… - Proceedings of the …, 2023 - openaccess.thecvf.com
End-to-end text spotting aims to integrate scene text detection and recognition into a unified
framework. Dealing with the relationship between the two sub-tasks plays a pivotal role in …

SwinTextSpotter: Scene text spotting via better synergy between text detection and text recognition

M Huang, Y Liu, Z Peng, C Liu, D Lin… - Proceedings of the …, 2022 - openaccess.thecvf.com
End-to-end scene text spotting has attracted great attention in recent years due to the
success of excavating the intrinsic synergy of the scene text detection and recognition …

Text spotting transformers

X Zhang, Y Su, S Tripathi, Z Tu - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
In this paper, we present TExt Spotting TRansformers (TESTR), a generic end-to-end text
spotting framework using Transformers for text detection and recognition in the wild. TESTR …

TextOCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text

A Singh, G Pang, M Toh, J Huang… - Proceedings of the …, 2021 - openaccess.thecvf.com
A crucial component for the scene text based reasoning required for TextVQA and TextCaps
datasets involves detecting and recognizing text present in the images using an optical …

ABCNet v2: Adaptive Bezier-curve network for real-time end-to-end text spotting

Y Liu, C Shen, L Jin, T He, P Chen… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
End-to-end text-spotting, which aims to integrate detection and recognition in a unified
framework, has attracted increasing attention due to its simplicity of the two complementary …

OmniParser: A unified framework for text spotting, key information extraction and table recognition

J Wan, S Song, W Yu, Y Liu, W Cheng… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recently visually-situated text parsing (VsTP) has experienced notable advancements
driven by the increasing demand for automated document understanding and the …