Multimodal intelligence: Representation learning, information fusion, and applications
Deep learning methods have revolutionized speech recognition, image recognition, and
natural language processing since 2010. Each of these tasks involves a single modality in …
Predicting industrial building energy consumption with statistical and machine-learning models informed by physical system parameters
The industrial sector consumes about one-third of global energy, making it a frequent
target for energy use reduction. Variation in energy usage is observed with weather …
Fine-grained image analysis with deep learning: A survey
Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer
vision and pattern recognition, and underpins a diverse set of real-world applications. The …
Mixed high-order attention network for person re-identification
Attention has become more attractive in person re-identification (ReID) as it is capable of
biasing the allocation of available resources towards the most informative parts of an input …
Randomized numerical linear algebra: Foundations and algorithms
This survey describes probabilistic algorithms for linear algebraic computations, such as
factorizing matrices and solving linear systems. It focuses on techniques that have a proven …
Deep multimodal representation learning: A survey
W Guo, J Wang, S Wang - IEEE Access, 2019 - ieeexplore.ieee.org
Multimodal representation learning, which aims to narrow the heterogeneity gap among
different modalities, plays an indispensable role in the utilization of ubiquitous multimodal …
Part-aligned bilinear representations for person re-identification
Comparing the appearance of corresponding body parts is essential for person re-
identification. As body parts are frequently misaligned between the detected human boxes …
Multimodal compact bilinear pooling for visual question answering and visual grounding
Modeling textual or visual information with vector representations trained from large
language or visual datasets has been successfully explored in recent years. However, tasks …
Hadamard product for low-rank bilinear pooling
Bilinear models provide rich representations compared with linear models. They have been
applied in various visual tasks, such as object recognition, segmentation, and visual …
Hierarchical bilinear pooling for fine-grained visual recognition
Fine-grained visual recognition is challenging because it highly relies on the modeling of
various semantic parts and fine-grained feature learning. Bilinear pooling based models …