Vision-language pre-training: Basics, recent advances, and future trends
This monograph surveys vision-language pre-training (VLP) methods for multimodal
intelligence that have been developed in the last few years. We group these approaches …
intelligence that have been developed in the last few years. We group these approaches …
Llmscore: Unveiling the power of large language models in text-to-image synthesis evaluation
Existing automatic evaluation on text-to-image synthesis can only provide an image-text
matching score, without considering the object-level compositionality, which results in poor …
matching score, without considering the object-level compositionality, which results in poor …