- Academic Search

From show to tell: A survey on deep learning-based image captioning

M Stefanini, M Cornia, L Baraldi… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Connecting Vision and Language plays an essential role in Generative Intelligence. For this
reason, large research efforts have been devoted to image captioning, i.e., describing images …

Cited by 396

Visual clues: Bridging vision and language foundations for image paragraph captioning

Y Xie, L Zhou, X Dai, L Yuan, N Bach… - Advances in Neural …, 2022 - proceedings.neurips.cc

People say, "A picture is worth a thousand words." Then how can we get the rich information
out of the image? We argue that by using visual clues to bridge large pretrained vision …

Cited by 27

Chinese image caption generation via visual attention and topic modeling

M Liu, H Hu, L Li, Y Yu, W Guan - IEEE transactions on …, 2020 - ieeexplore.ieee.org

Automatic image captioning is to conduct the cross-modal conversion from image visual
content to natural language text. Involving computer vision (CV) and natural language …

Cited by 63

DialogueTRM: Exploring the intra- and inter-modal emotional behaviors in the conversation

Y Mao, Q Sun, G Liu, X Wang, W Gao, X Li… - arXiv preprint arXiv …, 2020 - arxiv.org

Emotion Recognition in Conversations (ERC) is essential for building empathetic human-
machine systems. Existing studies on ERC primarily focus on summarizing the context …

Cited by 57

Intention oriented image captions with guiding objects

Y Zheng, Y Li, S Wang - … of the IEEE/CVF conference on …, 2019 - openaccess.thecvf.com

Although existing image caption models can produce promising results using recurrent
neural networks (RNNs), it is difficult to guarantee that an object we care about is contained …

Cited by 64

Effective multimodal encoding for image paragraph captioning

TS Nguyen, B Fernando - IEEE Transactions on Image …, 2022 - ieeexplore.ieee.org

In this paper, we present a regularization-based image paragraph generation method. We
propose a novel multimodal encoding generator (MEG) to generate effective multimodal …

Cited by 12

Dual-CNN: A Convolutional language decoder for paragraph image captioning

R Li, H Liang, Y Shi, F Feng, X Wang - Neurocomputing, 2020 - Elsevier

The task of paragraph image captioning aims to generate a coherent paragraph describing
a given image. However, due to their limited ability to capture long-term dependency …

Cited by 45

Curiosity-driven reinforcement learning for diverse visual paragraph generation

Y Luo, Z Huang, Z Zhang, Z Wang, J Li… - Proceedings of the 27th …, 2019 - dl.acm.org

Visual paragraph generation aims to automatically describe a given image from different
perspectives and organize sentences in a coherent way. In this paper, we address three …

Cited by 49

Image captioning with novel topics guidance and retrieval-based topics re-weighting

M Al-Qatf, X Wang, A Hawbani… - IEEE Transactions …, 2022 - ieeexplore.ieee.org

Topic modelling (TM) has shown significant progress in boosting the effectiveness of image
captioning in the last few years. Although important improvements have been shown in …

Cited by 15

Exploring global and local linguistic representations for text-to-image synthesis

R Li, N Wang, F Feng, G Zhang… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org

The task of text-to-image synthesis is to generate photographic images conditioned on given
textual descriptions. This challenging task has recently attracted considerable attention from …

Cited by 48

Show and Tell More: Topic-Oriented Multi-Sentence Image Captioning.

From show to tell: A survey on deep learning-based image captioning
