Storytelling with image data: A systematic review and comparative analysis of methods and tools

F Lotfi, A Beheshti, H Farhood, M Pooshideh… - Algorithms, 2023 - mdpi.com
In our digital age, data are generated constantly from public and private sources, social
media platforms, and the Internet of Things. A significant portion of this information comes in …

Revamping cross-modal recipe retrieval with hierarchical transformers and self-supervised learning

A Salvador, E Gundogdu, L Bazzani… - Proceedings of the …, 2021 - openaccess.thecvf.com
Cross-modal recipe retrieval has recently gained substantial attention due to the importance
of food in people's lives, as well as the availability of vast amounts of digital cooking recipes …

Grounding 'grounding' in NLP

KR Chandu, Y Bisk, AW Black - arXiv preprint arXiv:2106.02192, 2021 - arxiv.org
The NLP community has seen substantial recent interest in grounding to facilitate interaction
between language technologies and the world. However, as a community, we use the term …

Visual writing prompts: Character-grounded story generation with curated image sequences

X Hong, A Sayeed, K Mehra, V Demberg… - Transactions of the …, 2023 - direct.mit.edu
Current work on image-based story generation suffers from the fact that the existing image
sequence collections do not have coherent plots behind them. We improve visual story …

Evaluating document coherence modeling

A Shen, M Mistica, B Salehi, H Li… - Transactions of the …, 2021 - direct.mit.edu
While pretrained language models (LMs) have driven impressive gains over morpho-syntactic
and semantic tasks, their ability to model discourse and pragmatic phenomena is …

Learning to substitute ingredients in recipes

B Fatemi, Q Duval, R Girdhar, M Drozdzal… - arXiv preprint arXiv …, 2023 - arxiv.org
Recipe personalization through ingredient substitution has the potential to help people meet
their dietary needs and preferences, avoid potential allergens, and ease culinary exploration …

Structure-aware procedural text generation from an image sequence

T Nishimura, A Hashimoto, Y Ushiku, H Kameko… - IEEE …, 2020 - ieeexplore.ieee.org
It is an important activity for our society to create new value by combining materials. From
daily cooking to manufacturing for industry, we often describe the way to do it as a …

COM Kitchens: An Unedited Overhead-view Video Dataset as a Vision-Language Benchmark

K Maeda, T Hirasawa, A Hashimoto… - … on Computer Vision, 2024 - Springer
Procedural video understanding is gaining attention in the vision and language community.
Deep learning-based video analysis requires extensive data. Consequently, existing works …

Visual grounding annotation of recipe flow graph

T Nishimura, S Tomori, H Hashimoto… - Proceedings of the …, 2020 - aclanthology.org
In this paper, we provide a dataset that gives visual grounding annotations to recipe flow
graphs. A recipe flow graph is a representation of the cooking workflow, which is designed …

Local and global context-based pairwise models for sentence ordering

RR Manku, AJ Paul - Knowledge-Based Systems, 2022 - Elsevier
Sentence Ordering refers to the task of rearranging a set of sentences into the appropriate
coherent order. For this task, most previous approaches have explored global context-based …