Boundary-induced and scene-aggregated network for monocular depth prediction

F Xue, J Cao, Y Zhou, F Sheng, Y Wang, A Ming - Pattern Recognition, 2021 - Elsevier
Monocular depth prediction is an important task in scene understanding. It aims to predict
the dense depth of a single RGB image. With the development of deep learning, the …

MsMED-Net: An Optimized Multi-scale Mirror Connected Encoder-Decoder Network for Multilingual Natural Scene Text Recognition

K Dutta, SG Dastidar, M Kundu, M Nasipuri… - Frontiers of ICT in …, 2023 - Springer
End-to-end multi-script text recognition from natural scene text images is a difficult task as
compared to document text recognition due to the complex background semantics, distinct …

A deep learning framework for handwritten Ol Chiki character recognition

D Barman, T Bhattacharya, N Chowdhury - Journal of Scientific Research, 2022 - bhu.ac.in
Ol Chiki is an Austroasiatic-Santali language used by the Santhal tribe of India. Despite
being one of the official languages of India, Ol Chiki language still remains marginalized …

Deep-TDRS: An Integrated System for Handwritten Text Detection-Recognition and Conversion to Speech Using Deep Learning

B Mondal, SG Dastidar, N Das - International Conference on Computer …, 2021 - Springer
Development of complete OCR for handwritten document (HOCR) is a challenging
task due to a wide variation in writing styles, cursiveness, and contrasts in captured text …

Recent Challenges and Opportunities of Multilingual Natural Scene Text Recognition and Its Real World Deployment

K Dutta, S Chowdhury, M Kundu, M Nasipuri… - … Conference on Data …, 2022 - Springer
Multilingual natural scene text recognition is difficult due to its complex text font style, difficult
image background, multilingual text formats, etc. In vision-based applications, natural scene …

Special issue on deep learning for video text analysis

S Basu, U Maulik, U Pal - Pattern Recognition Letters, 2020 - ui.adsabs.harvard.edu
We are living in a world seamlessly surrounded by multimedia content, such as text, image,
audio, video etc. Much of it is due to the advancement in multimodal sensor technology. For …