Multimodal logical inference system for visual-textual entailment

R Suzuki, H Yanaka, M Yoshikawa… - arXiv preprint arXiv …, 2019 - arxiv.org
A large amount of research about multimodal inference across text and vision has been
recently developed to obtain visually grounded word and sentence representations. In this …

Functional distributional semantics: Learning linguistically informed representations from a precisely annotated corpus

G Emerson - 2018 - repository.cam.ac.uk
The aim of distributional semantics is to design computational techniques that can
automatically learn the meanings of words from a body of text. The twin challenges are: how …

Points, paths, and playscapes: Large-scale spatial language understanding tasks set in the real world

J Baldridge, T Bedrax-Weiss, D Luong… - Proceedings of the …, 2018 - aclanthology.org
Spatial language understanding is important for practical applications and as a building
block for better abstract language understanding. Much progress has been made through …

Learning to generate descriptions of visual data anchored in spatial relations

A Muscat, A Belz - IEEE Computational Intelligence Magazine, 2017 - ieeexplore.ieee.org
The explosive growth of visual data both online and offline in private and public repositories
has led to urgent requirements for better ways to index, search, retrieve, process and …

Transfer of ISOSpace into a 3D environment for annotations and applications

A Henlein, G Abrami, A Kett… - … of the 16th Joint ACL-ISO …, 2020 - aclanthology.org
People's visual perception is very pronounced and therefore it is usually no problem for
them to describe the space around them in words. Conversely, people also have no …

Natural language semantics with pictures: Some language & vision datasets and potential uses for computational semantics

D Schlangen - arXiv preprint arXiv:1904.07318, 2019 - arxiv.org
Propelling, and propelled by, the "deep learning revolution", recent years have seen the
introduction of ever larger corpora of images annotated with natural language expressions …

Natural language generation with computational intelligence [guest editorial]

JM Alonso, A Bugarin, E Reiter - IEEE Computational …, 2017 - ieeexplore.ieee.org
The articles in this special section focus on using natural language generation techniques
(NLG) and natural language processing (NLP) to build computational systems that generate …

The clarity and correctness of visualized thrust actions: a description and insights from users and experts

I Van der Sluis, G Matoušková… - Visual …, 2022 - journals.sagepub.com
This article presents three studies that evaluate the effectiveness of instructional pictures that
visualize Heimlich maneuver thrusts. Firstly, a corpus study is used to describe a collection …

Visual-Textual Entailment with Quantities Using Model Checking and Knowledge Injection

N Iokawa, H Yanaka - Proceedings of the 2024 Joint International …, 2024 - aclanthology.org
In recent years, there has been great interest in multimodal inference. We concentrate on
visual-textual entailment (VTE), a critical task in multimodal inference. VTE is the task of …

What did this castle look like before? Exploring referential relations in naturally occurring multimodal texts

R Utescher, S Zarrieß - Proceedings of the Third Workshop on …, 2021 - aclanthology.org
Multi-modal texts are abundant and diverse in structure, yet Language & Vision research of
these naturally occurring texts has mostly focused on genres that are comparatively light on …