The Revolution of Multimodal Large Language Models: A Survey | D Caffagni, F Cocchi, L Barsellotti, N Moratelli, S Sarto, L Baraldi, ... | Findings of the Association for Computational Linguistics (ACL), 2024 | 46 | 2024 |
Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs | D Caffagni, F Cocchi, N Moratelli, S Sarto, M Cornia, L Baraldi, ... | IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops 2024, 2024 | 29 | 2024 |
Fashion-oriented image captioning with external knowledge retrieval and fully attentive gates | N Moratelli, M Barraco, D Morelli, M Cornia, L Baraldi, R Cucchiara | Sensors 23, 2023 | 18 | 2023 |
Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization | N Moratelli, D Caffagni, M Cornia, L Baraldi, R Cucchiara | British Machine Vision Conference 2024 (BMVC Oral), 2024 | 4 | 2024 |
Positive-Augmented Contrastive Learning for Vision-and-Language Evaluation and Training | S Sarto, N Moratelli, M Cornia, L Baraldi, R Cucchiara | arXiv preprint arXiv:2410.07336, 2024 | 3 | 2024 |
Personalizing Multimodal Large Language Models for Image Captioning: An Experimental Analysis | D Bucciarelli, N Moratelli, M Cornia, L Baraldi, R Cucchiara | European Conference on Computer Vision Workshops 2024, 2024 | 3 | 2024 |
Are learnable prompts the right way of prompting? Adapting vision-and-language models with memory optimization | N Moratelli, M Barraco, M Cornia, L Baraldi, R Cucchiara | IEEE Intelligent Systems, 2024 | 2 | 2024 |
Fluent and Accurate Image Captioning with a Self-Trained Reward Model | N Moratelli, M Cornia, L Baraldi, R Cucchiara | International Conference on Pattern Recognition (ICPR Oral), 2024 | 1 | 2024 |
Causal Graphical Models for Vision-Language Compositional Understanding | F Parascandolo, N Moratelli, E Sangineto, L Baraldi, R Cucchiara | The Thirteenth International Conference on Learning Representations (ICLR), 2024 | | 2024 |
Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering | F Cocchi, N Moratelli, M Cornia, L Baraldi, R Cucchiara | IEEE/CVF Conference on Computer Vision and Pattern Recognition 2025 (CVPR), 2024 | | 2024 |
Descrizione di immagini in linguaggio naturale utilizzando un nuovo meccanismo di attenzione e conoscenza esterna [Natural-language image description using a new attention mechanism and external knowledge] | N Moratelli | | | 2022 |