- Academic Search

From vision to text: A comprehensive review of natural image captioning in medical diagnosis and radiology report generation

G Reale-Nosei, E Amador-Domínguez… - Medical Image Analysis, 2024 - Elsevier

Natural Image Captioning (NIC) is an interdisciplinary research area that lies within
the intersection of Computer Vision (CV) and Natural Language Processing (NLP). Several …

Cited by 7

Interactive and explainable region-guided radiology report generation

T Tanida, P Müller, G Kaissis… - Proceedings of the …, 2023 - openaccess.thecvf.com

The automatic generation of radiology reports has the potential to assist radiologists in the
time-consuming task of report writing. Existing methods generate the full report from image …

Cited by 134

GRiT: A generative region-to-text transformer for object understanding

J Wu, J Wang, Z Yang, Z Gan, Z Liu, J Yuan… - European Conference on …, 2024 - Springer

This paper presents a Generative RegIon-to-Text transformer, GRiT, for object
understanding. The spirit of GRiT is to formulate object understanding as <region, text> …

Cited by 110

Dual-level representation enhancement on characteristic and context for image-text retrieval

S Yang, Q Li, W Li, X Li, AA Liu - IEEE Transactions on Circuits …, 2022 - ieeexplore.ieee.org

Image-text retrieval is a fundamental and vital task in multi-media retrieval and has received
growing attention since it connects heterogeneous data. Previous methods that perform well …

Cited by 106

Caption anything: Interactive image description with diverse multimodal controls

T Wang, J Zhang, J Fei, H Zheng, Y Tang, Z Li… - arXiv preprint arXiv …, 2023 - arxiv.org

Controllable image captioning is an emerging multimodal topic that aims to describe the
image with natural language following human purpose, e.g., looking at the …

Cited by 88

Evolution of visual data captioning Methods, Datasets, and evaluation Metrics: A comprehensive survey

D Sharma, C Dhiman, D Kumar - Expert Systems with Applications, 2023 - Elsevier

Automatic Visual Captioning (AVC) generates syntactically and semantically correct
sentences by describing important objects, attributes, and their relationships with each other …

Cited by 15

Long dialogue emotion detection based on commonsense knowledge graph guidance

W Nie, Y Bao, Y Zhao, A Liu - IEEE Transactions on Multimedia, 2023 - ieeexplore.ieee.org

Dialogue emotion detection is always challenging due to human subjectivity and the
randomness of dialogue content. In a conversation, the emotion of each person often …

Cited by 66

CoF-Net: A progressive coarse-to-fine framework for object detection in remote-sensing imagery

C Zhang, KM Lam, Q Wang - IEEE Transactions on Geoscience …, 2023 - ieeexplore.ieee.org

Object detection in remote-sensing images is a crucial task in the fields of Earth observation
and computer vision. Despite impressive progress in modern remote-sensing object …

Cited by 76

Deep unsupervised part-whole relational visual saliency

Y Liu, X Dong, D Zhang, S Xu - Neurocomputing, 2024 - Elsevier

Deep Supervised Salient Object Detection (SSOD) excessively relies on large-scale
annotated pixel-level labels which consume intensive labour acquiring high quality …

Cited by 46

Textual context-aware dense captioning with diverse words

Z Shao, J Han, K Debattista… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Dense captioning generates more detailed spoken descriptions for complex visual scenes.
Despite several promising leads, existing methods still have two broad limitations: 1) The …

Cited by 53

Region-object relation-aware dense captioning via transformer
