CIC-BART-SSA: Controllable Image Captioning with Structured Semantic Augmentation

K Basioti, MA Abdelsalam, F Fancellu… - … on Computer Vision, 2024 - Springer
Abstract: Controllable Image Captioning (CIC) aims at generating natural language
descriptions for an image, conditioned on information provided by end users, e.g., regions …

Fine-grained length controllable video captioning with ordinal embeddings

T Nitta, T Fukuzawa, T Tamaki - IEEE Access, 2024 - ieeexplore.ieee.org
This paper proposes a method for video captioning that controls the length of generated
captions. Previous work on length control often had few levels for expressing length. In this …

Mobile Manipulation Instruction Generation From Multiple Images With Automatic Metric Enhancement

K Katsumata, M Kambara, D Yashima… - IEEE Robotics and …, 2025 - ieeexplore.ieee.org
We consider the problem of generating free-form mobile manipulation instructions based on
a target object image and receptacle image. Conventional image captioning models are not …

Curriculum Learning for Cross-Lingual Data-to-Text Generation With Noisy Data

KA Hari, M Gupta, V Varma - arXiv preprint arXiv:2412.13484, 2024 - arxiv.org
Curriculum learning has been used to improve the quality of text generation systems by
ordering the training samples according to a particular schedule in various tasks. In the …

[PDF] Learning from Noisy Data for Cross Lingual Text Generation in Low-Resource Languages

KA Hari - 2024 - web2py.iiit.ac.in
Abstract: With Large Language Models (LLMs) and language models in general becoming a
more significant part of our daily content consumption, it is paramount to ensure that …