Deep learning approaches on image captioning: A review

T Ghandi, H Pourreza, H Mahyar - ACM Computing Surveys, 2023 - dl.acm.org
Image captioning is a research area of immense importance, aiming to generate natural
language descriptions for visual content in the form of still images. The advent of deep …

Neural attention for image captioning: review of outstanding methods

Z Zohourianshahzadi, JK Kalita - Artificial Intelligence Review, 2022 - Springer
Image captioning is the task of automatically generating sentences that describe an input
image in the best way possible. The most successful techniques for automatically generating …

Task-adaptive attention for image captioning

C Yan, Y Hao, L Li, J Yin, A Liu, Z Mao… - … on Circuits and …, 2021 - ieeexplore.ieee.org
Attention mechanisms are now widely used in image captioning models. However, most
attention models only focus on visual features. When generating syntax related words, little …

End-to-end dense video captioning with masked transformer

L Zhou, Y Zhou, JJ Corso… - Proceedings of the …, 2018 - openaccess.thecvf.com
Dense video captioning aims to generate text descriptions for all events in an untrimmed
video. This involves both detecting and describing events. Therefore, all previous methods …

State2Explanation: Concept-based explanations to benefit agent learning and user understanding

D Das, S Chernova, B Kim - Advances in Neural …, 2023 - proceedings.neurips.cc
As more non-AI experts use complex AI systems for daily tasks, there has been an
increasing effort to develop methods that produce explanations of AI decision making that …

Towards automatic learning of procedures from web instructional videos

L Zhou, C Xu, J Corso - Proceedings of the AAAI Conference on …, 2018 - ojs.aaai.org
The potential for agents, whether embodied or software, to learn by observing other agents
performing procedures involving objects and actions is rich. Current research on automatic …

Recurrent fusion network for image captioning

W Jiang, L Ma, YG Jiang, W Liu… - Proceedings of the …, 2018 - openaccess.thecvf.com
Recently, much advance has been made in image captioning, and an encoder-decoder
framework has been adopted by all the state-of-the-art models. Under this framework, an …

Chinese image captioning via fuzzy attention-based DenseNet-BiLSTM

H Lu, R Yang, Z Deng, Y Zhang, G Gao… - ACM Transactions on …, 2021 - dl.acm.org
Chinese image description generation tasks usually have some challenges, such as
single-feature extraction, lack of global information, and lack of detailed description of the image …

Adversarial attack and defense: A survey

H Liang, E He, Y Zhao, Z Jia, H Li - Electronics, 2022 - mdpi.com
In recent years, artificial intelligence technology represented by deep learning has achieved
remarkable results in image recognition, semantic analysis, natural language processing …

Grounded video description

L Zhou, Y Kalantidis, X Chen… - Proceedings of the …, 2019 - openaccess.thecvf.com
Video description is one of the most challenging problems in vision and language
understanding due to the large variability both on the video and language side. Models …