Enhancing traditional museum fruition: current state and emerging tendencies

R Furferi, L Di Angelo, M Bertini, P Mazzanti… - Heritage Science, 2024 - Springer
Galleries, libraries, archives, and museums are nowadays striving to implement innovative
approaches to adequately use and distribute the wealth of knowledge found in cultural …

Taming CLIP for Fine-Grained and Structured Visual Understanding of Museum Exhibits

AA Balauca, DP Paudel, K Toutanova… - European Conference on …, 2024 - Springer
CLIP is a powerful and widely used tool for understanding images in the context of natural
language descriptions to perform nuanced tasks. However, it does not offer application …

Diffusion Based Augmentation for Captioning and Retrieval in Cultural Heritage

D Cioni, L Berlincioni, F Becattini… - Proceedings of the …, 2023 - openaccess.thecvf.com
Cultural heritage applications and advanced machine learning models are creating a fruitful
synergy to provide effective and accessible ways of interacting with artworks. Smart audio …

A dataset of synthetic art dialogues with ChatGPT

M Gil-Martín, C Luna-Jiménez, S Esteban-Romero… - Scientific Data, 2024 - nature.com
This paper introduces Art_GenEvalGPT, a novel dataset of synthetic dialogues
centered on art generated through ChatGPT. Unlike existing datasets focused on …

Understanding the World's Museums through Vision-Language Reasoning

AA Balauca, S Garai, S Balauca, RU Shetty… - arXiv preprint arXiv …, 2024 - arxiv.org
Museums serve as vital repositories of cultural heritage and historical artifacts spanning
diverse epochs, civilizations, and regions, preserving well-documented collections. Data …

Language-guided Bias Generation Contrastive Strategy for Visual Question Answering

E Zhao, N Song, Z Zhang, J Nie, X Liang… - ACM Transactions on …, 2025 - dl.acm.org
Visual question answering (VQA) is a challenging task that requires models to understand
both visual and linguistic inputs and produce accurate answers. However, VQA models often …

SANet: Selective Aggregation Network for unsupervised object re-identification

M Lin, J Tang, L Fu, Z Zuo - Computer Vision and Image Understanding, 2025 - Elsevier
Recent advancements in unsupervised object re-identification have witnessed remarkable
progress, which usually focuses on capturing fine-grained semantic information through …

Creating a Lens of Chinese Culture: A Multimodal Dataset for Chinese Pun Rebus Art Understanding

T Zhang, T Feng, Y Ni, M Cao, R Liu, K Butler… - arXiv preprint arXiv …, 2024 - arxiv.org
Large vision-language models (VLMs) have demonstrated remarkable abilities in
understanding everyday content. However, their performance in the domain of art …

Exploring the Synergy Between Vision-Language Pretraining and ChatGPT for Artwork Captioning: A Preliminary Study

G Castellano, N Fanelli, R Scaringi… - … Conference on Image …, 2023 - Springer
While AI techniques have enabled automated analysis and interpretation of visual content,
generating meaningful captions for artworks presents unique challenges. These include …

Computer Vision and AI Tools for Enhancing User Experience in the Cultural Heritage Domain

PK Rachabathuni, P Mazzanti, F Principi… - … Conference on Human …, 2025 - Springer
To enhance the museum experience for visitors, it's crucial to adopt a people-centered
approach. This means engaging with visitors dynamically, providing transformative learning …