Attention, please! A survey of neural attention models in deep learning

A de Santana Correia, EL Colombini - Artificial Intelligence Review, 2022 - Springer
In humans, attention is a core property of all perceptual and cognitive operations. Given our
limited ability to process competing sources, attention mechanisms select, modulate, and …

Attention-based interpretable neural network for building cooling load prediction

A Li, F Xiao, C Zhang, C Fan - Applied Energy, 2021 - Elsevier
Machine learning has gained increasing popularity in building energy management
due to its powerful capability and flexibility in model development as well as the rich data …

Video captioning by adversarial LSTM

Y Yang, J Zhou, J Ai, Y Bin, A Hanjalic… - … on Image Processing, 2018 - ieeexplore.ieee.org
In this paper, we propose a novel approach to video captioning based on adversarial
learning and long short-term memory (LSTM). With this solution concept, we aim at …

Hierarchically structured reinforcement learning for topically coherent visual story generation

Q Huang, Z Gan, A Celikyilmaz, D Wu… - Proceedings of the AAAI …, 2019 - ojs.aaai.org
We propose a hierarchically structured reinforcement learning approach to address the
challenges of planning for generating coherent multi-sentence stories for the visual …

Violin: A large-scale dataset for video-and-language inference

J Liu, W Chen, Y Cheng, Z Gan, L Yu… - Proceedings of the …, 2020 - openaccess.thecvf.com
We introduce a new task, Video-and-Language Inference, for joint multimodal
understanding of video and text. Given a video clip with aligned subtitles as premise, paired …

Video captioning: a comparative review of where we are and which could be the route

D Moctezuma, T Ramírez-delReal, G Ruiz… - Computer Vision and …, 2023 - Elsevier
Video captioning is the process of describing the content of a sequence of images capturing
its semantic relationships and meanings. Dealing with this task with a single image is …

Denoising-based multiscale feature fusion for remote sensing image captioning

W Huang, Q Wang, X Li - IEEE Geoscience and Remote …, 2020 - ieeexplore.ieee.org
With the benefits from deep learning technology, generating captions for remote sensing
images has become achievable, and great progress has been made in this field in the …

Adaptive hierarchical graph reasoning with semantic coherence for video-and-language inference

J Li, S Tang, L Zhu, H Shi, X Huang… - Proceedings of the …, 2021 - openaccess.thecvf.com
Video-and-Language Inference is a recently proposed task for joint video-and-language
understanding. This new task requires a model to draw inference on whether a …

Study on key factors affecting the high-order building model order reduction for model predictive control application

Q Chen, N Li - Energy and Buildings, 2023 - Elsevier
The reduced-order model (ROM) can highly reduce computation costs while maintaining
high-fidelity performance for model predictive control's application by applying the …

SCA-Net: A spatial and channel attention network for medical image segmentation

T Shan, J Yan - IEEE Access, 2021 - ieeexplore.ieee.org
Automatic medical image segmentation is a critical tool for medical image analysis and
disease treatment. In recent years, convolutional neural networks (CNNs) have played an …