Enhancing traditional museum fruition: current state and emerging tendencies
Galleries, libraries, archives, and museums are nowadays striving to implement innovative
approaches to adequately use and distribute the wealth of knowledge found in cultural …
Taming CLIP for Fine-Grained and Structured Visual Understanding of Museum Exhibits
CLIP is a powerful and widely used tool for understanding images in the context of natural
language descriptions to perform nuanced tasks. However, it does not offer application …
Diffusion Based Augmentation for Captioning and Retrieval in Cultural Heritage
Cultural heritage applications and advanced machine learning models are creating a fruitful
synergy to provide effective and accessible ways of interacting with artworks. Smart audio …
A dataset of synthetic art dialogues with ChatGPT
This paper introduces Art_GenEvalGPT, a novel dataset of synthetic dialogues
centered on art generated through ChatGPT. Unlike existing datasets focused on …
Understanding the World's Museums through Vision-Language Reasoning
AA Balauca, S Garai, S Balauca, RU Shetty… - arXiv preprint arXiv …, 2024 - arxiv.org
Museums serve as vital repositories of cultural heritage and historical artifacts spanning
diverse epochs, civilizations, and regions, preserving well-documented collections. Data …
Language-guided Bias Generation Contrastive Strategy for Visual Question Answering
Visual question answering (VQA) is a challenging task that requires models to understand
both visual and linguistic inputs and produce accurate answers. However, VQA models often …
SANet: Selective Aggregation Network for unsupervised object re-identification
M Lin, J Tang, L Fu, Z Zuo - Computer Vision and Image Understanding, 2025 - Elsevier
Recent advancements in unsupervised object re-identification have witnessed remarkable
progress, which usually focuses on capturing fine-grained semantic information through …
Creating a Lens of Chinese Culture: A Multimodal Dataset for Chinese Pun Rebus Art Understanding
Large vision-language models (VLMs) have demonstrated remarkable abilities in
understanding everyday content. However, their performance in the domain of art …
Exploring the Synergy Between Vision-Language Pretraining and ChatGPT for Artwork Captioning: A Preliminary Study
While AI techniques have enabled automated analysis and interpretation of visual content,
generating meaningful captions for artworks presents unique challenges. These include …
Computer Vision and AI Tools for Enhancing User Experience in the Cultural Heritage Domain
PK Rachabathuni, P Mazzanti, F Principi… - … Conference on Human …, 2025 - Springer
To enhance the museum experience for visitors, it's crucial to adopt a people-centered
approach. This means engaging with visitors dynamically, providing transformative learning …