The (r)evolution of multimodal large language models: A survey D Caffagni, F Cocchi, L Barsellotti, N Moratelli, S Sarto, L Baraldi, ... Findings of the Association for Computational Linguistics: ACL, 2024 | 18 | 2024 |
Fashion-oriented image captioning with external knowledge retrieval and fully attentive gates N Moratelli, M Barraco, D Morelli, M Cornia, L Baraldi, R Cucchiara Sensors 23 (3), 1286, 2023 | 13 | 2023 |
Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs D Caffagni, F Cocchi, N Moratelli, S Sarto, M Cornia, L Baraldi, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 11 | 2024 |
Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization N Moratelli, D Caffagni, M Cornia, L Baraldi, R Cucchiara arXiv preprint arXiv:2408.14547, 2024 | 2 | 2024 |
Personalizing Multimodal Large Language Models for Image Captioning: An Experimental Analysis D Bucciarelli, N Moratelli, M Cornia, L Baraldi, R Cucchiara ECCV Workshops, 2024 | 2 | 2024 |
Are Learnable Prompts the Right Way of Prompting? Adapting Vision-and-Language Models with Memory Optimization N Moratelli, M Barraco, M Cornia, L Baraldi, R Cucchiara IEEE Intelligent Systems, 2024 | 1 | 2024 |
Fluent and Accurate Image Captioning with a Self-Trained Reward Model N Moratelli, M Cornia, L Baraldi, R Cucchiara arXiv preprint arXiv:2408.16827, 2024 | | 2024 |
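Natural-language image description using a new attention mechanism and external knowledge (original Italian title: Descrizione di immagini in linguaggio naturale utilizzando un nuovo meccanismo di attenzione e conoscenza esterna) N Moratelli | | 2022 |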