- Academic Search

Text recognition in the wild: A survey

X Chen, L Jin, Y Zhu, C Luo, T Wang - ACM Computing Surveys (CSUR), 2021 - dl.acm.org

The history of text can be traced back over thousands of years. Rich and precise semantic
information carried by text is important in a wide range of vision-based application …

Cited by 256

How far are we to GPT-4V? Closing the gap to commercial multimodal models with open-source suites

Z Chen, W Wang, H Tian, S Ye, Z Gao, E Cui… - Science China …, 2024 - Springer

In this paper, we introduce InternVL 1.5, an open-source multimodal large language model
(MLLM) to bridge the capability gap between open-source and proprietary commercial …

Cited by 348

DeepSeek-VL: Towards real-world vision-language understanding

H Lu, W Liu, B Zhang, B Wang, K Dong, B Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

We present DeepSeek-VL, an open-source Vision-Language (VL) Model designed for real-
world vision and language understanding applications. Our approach is structured around …

Cited by 191

Scene text recognition with permuted autoregressive sequence models

D Bautista, R Atienza - European conference on computer vision, 2022 - Springer

Context-aware STR methods typically use internal autoregressive (AR) language models
(LM). Inherent limitations of AR models motivated two-stage methods which employ an …

Cited by 198

BoxInst: High-performance instance segmentation with box annotations

Z Tian, C Shen, X Wang… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

We present a high-performance method that can achieve mask-level instance segmentation
with only bounding-box annotations for training. While this setting has been studied in the …

Cited by 298

SwinTextSpotter: Scene text spotting via better synergy between text detection and text recognition

M Huang, Y Liu, Z Peng, C Liu, D Lin… - Proceedings of the …, 2022 - openaccess.thecvf.com

End-to-end scene text spotting has attracted great attention in recent years due to the
success of excavating the intrinsic synergy of the scene text detection and recognition …

Cited by 140

On the hidden mystery of OCR in large multimodal models

Y Liu, Z Li, B Yang, C Li, X Yin, C Liu, L Jin… - arXiv preprint arXiv …, 2023 - arxiv.org

Large models have recently played a dominant role in natural language processing and
multimodal vision-language learning. However, their effectiveness in text-related visual …

Cited by 174

ESTextSpotter: Towards better scene text spotting with explicit synergy in transformer

M Huang, J Zhang, D Peng, H Lu… - Proceedings of the …, 2023 - openaccess.thecvf.com

In recent years, end-to-end scene text spotting approaches are evolving to the Transformer-
based framework. While previous studies have shown the crucial importance of the intrinsic …

Cited by 32

ABCNet v2: Adaptive Bezier-curve network for real-time end-to-end text spotting

Y Liu, C Shen, L Jin, T He, P Chen… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

End-to-end text-spotting, which aims to integrate detection and recognition in a unified
framework, has attracted increasing attention due to its simplicity of the two complementary …

Cited by 160

RTFN: A robust temporal feature network for time series classification

Z Xiao, X Xu, H Xing, S Luo, P Dai, D Zhan - Information Sciences, 2021 - Elsevier

Time series data usually contains local and global patterns. Most of the existing feature
networks focus on local features rather than the relationships among them. The latter is also …

Cited by 165

Saved to My library

ICDAR 2019 robust reading challenge on reading Chinese text on signboard

Text recognition in the wild: A survey