Capsule networks with residual pose routing

Y Liu, D Cheng, D Zhang, S Xu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Capsule networks (CapsNets) have been known difficult to develop a deeper architecture,
which is desirable for high performance in the deep learning era, due to the complex …

Broadcasted residual learning for efficient keyword spotting

B Kim, S Chang, J Lee, D Sung - arXiv preprint arXiv:2106.04140, 2021 - arxiv.org
Keyword spotting is an important research field because it plays a key role in device wake-up
and user interaction on smart devices. However, it is challenging to minimize errors while …

Language-conditioned graph networks for relational reasoning

R Hu, A Rohrbach, T Darrell… - Proceedings of the …, 2019 - openaccess.thecvf.com
Solving grounded language tasks often requires reasoning about relationships between
objects in the context of a given task. For example, to answer the question "What color is the …

Trends in integration of vision and language research: A survey of tasks, datasets, and methods

A Mogadala, M Kalimuthu, D Klakow - Journal of Artificial Intelligence …, 2021 - jair.org
Interest in Artificial Intelligence (AI) and its applications has seen unprecedented
growth in the last few years. This success can be partly attributed to the advancements made …

Multimodal graph networks for compositional generalization in visual question answering

R Saqur, K Narasimhan - Advances in Neural Information …, 2020 - proceedings.neurips.cc
Compositional generalization is a key challenge in grounding natural language to visual
perception. While deep learning models have achieved great success in multimodal tasks …

Relational reasoning using neural networks: a survey

AA Pise, H Vadapalli, I Sanders - International Journal of Uncertainty …, 2021 - World Scientific
Relational Networks (RN), as one of the most widely used relational reasoning techniques,
have achieved great success in many applications such as action and image analysis …

Improving the robustness of capsule networks to image affine transformations

J Gu, V Tresp - Proceedings of the IEEE/CVF conference on …, 2020 - openaccess.thecvf.com
Convolutional neural networks (CNNs) achieve translational invariance by using pooling
operations. However, the operations do not preserve the spatial relationships in the learned …

Introducing routing uncertainty in capsule networks

F De Sousa Ribeiro, G Leontidis… - Advances in neural …, 2020 - proceedings.neurips.cc
Rather than performing inefficient local iterative routing between adjacent capsule layers,
we propose an alternative global view based on representing the inherent uncertainty in part …

Multi-scale deep relational reasoning for facial kinship verification

H Yan, C Song - Pattern Recognition, 2021 - Elsevier
In this paper, we propose a deep relational network which exploits multi-scale information of
facial images for kinship verification. Unlike most existing deep learning based facial kinship …

KM4: Visual reasoning via knowledge embedding memory model with mutual modulation

W Zheng, L Yan, C Gou, FY Wang - Information Fusion, 2021 - Elsevier
Visual reasoning is a special kind of visual question answering, which is essentially multi-step
and compositional, and also requires intensive text-visual interaction. The most …