A survey of evaluation metrics used for NLG systems

AB Sai, AK Mohankumar, MM Khapra - ACM Computing Surveys (CSUR …, 2022 - dl.acm.org
In the last few years, a large number of automatic evaluation metrics have been proposed for
evaluating Natural Language Generation (NLG) systems. The rapid development and …

Multimodal machine learning: A survey and taxonomy

T Baltrušaitis, C Ahuja… - IEEE transactions on …, 2018 - ieeexplore.ieee.org
Our experience of the world is multimodal - we see objects, hear sounds, feel texture, smell
odors, and taste flavors. Modality refers to the way in which something happens or is …

Survey of the state of the art in natural language generation: Core tasks, applications and evaluation

A Gatt, E Krahmer - Journal of Artificial Intelligence Research, 2018 - jair.org
This paper surveys the current state of the art in Natural Language Generation (NLG),
defined as the task of generating text or speech from non-linguistic input. A survey of NLG is …

SPICE: Semantic propositional image caption evaluation

P Anderson, B Fernando, M Johnson… - Computer Vision–ECCV …, 2016 - Springer
There is considerable interest in the task of automatically generating image captions.
However, evaluation is challenging. Existing automatic evaluation metrics are primarily …

Why we need new evaluation metrics for NLG

J Novikova, O Dušek, AC Curry, V Rieser - arXiv preprint arXiv …, 2017 - arxiv.org
The majority of NLG evaluation relies on automatic metrics, such as BLEU. In this paper, we
motivate the need for novel, system- and data-independent automatic evaluation methods …

VQA: Visual question answering

S Antol, A Agrawal, J Lu, M Mitchell… - Proceedings of the …, 2015 - openaccess.thecvf.com
We propose the task of free-form and open-ended Visual Question Answering (VQA). Given
an image and a natural language question about the image, the task is to provide an …

Microsoft COCO captions: Data collection and evaluation server

X Chen, H Fang, TY Lin, R Vedantam, S Gupta… - arXiv preprint arXiv …, 2015 - arxiv.org
In this paper we describe the Microsoft COCO Caption dataset and evaluation server. When
completed, the dataset will contain over one and a half million captions describing over …

CIDEr: Consensus-based image description evaluation

R Vedantam, CL Zitnick… - Proceedings of the …, 2015 - openaccess.thecvf.com
Automatically describing an image with a sentence is a long-standing challenge in computer
vision and natural language processing. Due to recent progress in object detection, attribute …

From captions to visual concepts and back

H Fang, S Gupta, F Iandola… - Proceedings of the …, 2015 - openaccess.thecvf.com
This paper presents a novel approach for automatically generating image descriptions:
visual detectors, language models, and multimodal similarity models learnt directly from a …

Framing image description as a ranking task: Data, models and evaluation metrics

M Hodosh, P Young, J Hockenmaier - Journal of Artificial Intelligence …, 2013 - jair.org
The ability to associate images with natural language sentences that describe what is
depicted in them is a hallmark of image understanding, and a prerequisite for applications …