The multi-modal fusion in visual question answering: a review of attention mechanisms

S Lu, M Liu, L Yin, Z Yin, X Liu, W Zheng - PeerJ Computer Science, 2023 - peerj.com
Visual Question Answering (VQA) is a significant cross-disciplinary issue in the
fields of computer vision and natural language processing that requires a computer to output …

A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Obtaining genetics insights from deep learning via explainable artificial intelligence

G Novakovsky, N Dexter, MW Libbrecht… - Nature Reviews …, 2023 - nature.com
Artificial intelligence (AI) models based on deep learning now represent the state of the art
for making functional predictions in genomics research. However, the underlying basis on …

Transformers in time series: A survey

Q Wen, T Zhou, C Zhang, W Chen, Z Ma, J Yan… - ar…
… new medical image processing algorithms, and deep learning based models have been remarkably successful …