Google Scholar

Transformers in remote sensing: A survey

AA Aleissaee, A Kumar, RM Anwer, S Khan… - Remote Sensing, 2023 - mdpi.com

Deep learning-based algorithms have seen a massive popularity in different areas of remote
sensing image analysis over the past decade. Recently, transformer-based architectures …

Cited by: 200. All versions: 7.

[PDF] arxiv.org

Object detection in 20 years: A survey

Z Zou, K Chen, Z Shi, Y Guo, J Ye - Proceedings of the IEEE, 2023 - ieeexplore.ieee.org

Object detection, as one of the most fundamental and challenging problems in computer
vision, has received great attention in recent years. Over the past two decades, we have …

Cited by: 3534. All versions: 6.

[PDF] ieee.org

LVLM-eHub: A comprehensive evaluation benchmark for large vision-language models

P Xu, W Shao, K Zhang, P Gao, S Liu… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Large Vision-Language Models (LVLMs) have recently played a dominant role in
multimodal vision-language learning. Despite the great success, it lacks a holistic evaluation …

Cited by: 177. All versions: 6.

[PDF] arxiv.org

GIT: A generative image-to-text transformer for vision and language

J Wang, Z Yang, X Hu, L Li, K Lin, Z Gan, Z Liu… - arXiv preprint arXiv …, 2022 - arxiv.org

In this paper, we design and train a Generative Image-to-text Transformer, GIT, to unify
vision-language tasks such as image/video captioning and question answering. While …

Cited by: 562. All versions: 4.

[PDF] neurips.cc

TextDiffuser: Diffusion models as text painters

J Chen, Y Huang, T Lv, L Cui… - Advances in Neural …, 2023 - proceedings.neurips.cc

Diffusion models have gained increasing attention for their impressive generation abilities
but currently struggle with rendering accurate and coherent text. To address this issue, we …

Cited by: 101. All versions: 5.

[PDF] thecvf.com

Adaptive rotated convolution for rotated object detection

Y Pu, Y Wang, Z Xia, Y Han, Y Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Rotated object detection aims to identify and locate objects in images with arbitrary
orientation. In this scenario, the oriented directions of objects vary considerably across …

Cited by: 103. All versions: 7.

[PDF] arxiv.org

OCR-free document understanding transformer

G Kim, T Hong, M Yim, JY Nam, J Park, J Yim… - … on Computer Vision, 2022 - Springer

Understanding document images (e.g., invoices) is a core but challenging task since it
requires complex functions such as reading text and a holistic understanding of the …

Cited by: 381. All versions: 7.

[PDF] arxiv.org

Real-time scene text detection with differentiable binarization and adaptive scale fusion

M Liao, Z Zou, Z Wan, C Yao… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Recently, segmentation-based scene text detection methods have drawn extensive attention
in the scene text detection field, because of their superiority in detecting the text instances of …

Cited by: 329. All versions: 7.

[PDF] arxiv.org

Scene text recognition with permuted autoregressive sequence models

D Bautista, R Atienza - European conference on computer vision, 2022 - Springer

Context-aware STR methods typically use internal autoregressive (AR) language models
(LM). Inherent limitations of AR models motivated two-stage methods which employ an …

Cited by: 206. All versions: 8.

[PDF] aaai.org

TrOCR: Transformer-based optical character recognition with pre-trained models

M Li, T Lv, J Chen, L Cui, Y Lu, D Florencio… - Proceedings of the …, 2023 - ojs.aaai.org

Text recognition is a long-standing research problem for document digitalization. Existing
approaches are usually built based on CNN for image understanding and RNN for char …

Cited by: 468. All versions: 6.
