Text detection, tracking and recognition in video: a comprehensive survey

XC Yin, ZY Zuo, S Tian, CL Liu - IEEE Transactions on Image …, 2016 - ieeexplore.ieee.org
The intelligent analysis of video data is currently in wide demand because a video is a major
source of sensory data in our lives. Text is a prominent and direct source of information in …

From object detection to text detection and recognition: A brief evolution history of optical character recognition

H Wang, C Pan, X Guo, C Ji… - Wiley Interdisciplinary …, 2021 - Wiley Online Library
Text detection and recognition, which is also known as optical character recognition (OCR),
is an active research area under quick development with a lot of exciting applications. Deep …

RoadText-1K: Text detection & recognition dataset for driving videos

S Reddy, M Mathew, L Gomez… - … on Robotics and …, 2020 - ieeexplore.ieee.org
Perceiving text is crucial to understand semantics of outdoor scenes and hence is a critical
requirement to build intelligent systems for driver assistance and self-driving. Most of the …

End-to-end video text detection with online tracking

H Yu, Y Huang, L Pi, C Zhang, X Li, L Wang - Pattern Recognition, 2021 - Elsevier
Text in videos usually acts as important semantic cues, which is helpful to video analysis.
Video text detection is considered as one of the most difficult tasks in document analysis due …

Semantic-aware video text detection

W Feng, F Yin, XY Zhang… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Most existing video text detection methods track texts with appearance features, which are
easily influenced by the change of perspective and illumination. Compared with appearance …

A bilingual, openworld video text dataset and end-to-end video text spotter with transformer

W Wu, Y Cai, D Zhang, S Wang, Z Li, J Li… - arXiv preprint arXiv …, 2021 - arxiv.org
Most existing video text spotting benchmarks focus on evaluating a single language and
scenario with limited data. In this work, we introduce a large-scale, Bilingual, Open World …

End-to-end video text spotting with transformer

W Wu, Y Cai, C Shen, D Zhang, Y Fu, H Zhou… - International Journal of …, 2024 - Springer
Recent video text spotting methods usually require the three-staged pipeline, i.e., detecting
text in individual images, recognizing localized text, tracking text streams with post …

FREE: A fast and robust end-to-end video text spotter

Z Cheng, J Lu, B Zou, L Qiao, Y Xu, S Pu… - … on Image Processing, 2020 - ieeexplore.ieee.org
Currently, video text spotting tasks usually fall into the four-staged pipeline: detecting text
regions in individual images, recognizing localized text regions frame-wisely, tracking text …

A unified framework for tracking based text detection and recognition from web videos

S Tian, XC Yin, Y Su, HW Hao - IEEE Transactions on Pattern …, 2017 - ieeexplore.ieee.org
Video text extraction plays an important role for multimedia understanding and retrieval.
Most previous research efforts are conducted within individual frames. A few of recent …

T-HOG: An effective gradient-based descriptor for single line text regions

R Minetto, N Thome, M Cord, NJ Leite, J Stolfi - Pattern Recognition, 2013 - Elsevier
We discuss the use of histogram of oriented gradients (HOG) descriptors as an effective tool
for text description and recognition. Specifically, we propose a HOG-based texture descriptor …