Lvlm-ehub: A comprehensive evaluation benchmark for large vision-language models
Large Vision-Language Models (LVLMs) have recently played a dominant role in
multimodal vision-language learning. Despite the great success, it lacks a holistic evaluation …
multimodal vision-language learning. Despite the great success, it lacks a holistic evaluation …
Crosslingual generalization through multitask finetuning
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …
A comprehensive survey on applications of transformers for deep learning tasks
Abstract Transformers are Deep Neural Networks (DNN) that utilize a self-attention
mechanism to capture contextual relationships within sequential data. Unlike traditional …
mechanism to capture contextual relationships within sequential data. Unlike traditional …