TinyGPT-V: Efficient multimodal large language model via small backbones

Z Yuan, Z Li, W Huang, Y Ye, L Sun - arXiv preprint arXiv:2312.16862, 2023 - arxiv.org
In recent years, multimodal large language models (MLLMs) such as GPT-4V have
demonstrated remarkable advancements, excelling in a variety of vision-language tasks …

Smaller, weaker, yet better: Training LLM reasoners via compute-optimal sampling

H Bansal, A Hosseini, R Agarwal, VQ Tran… - arXiv preprint arXiv …, 2024 - arxiv.org
Training on high-quality synthetic data from strong language models (LMs) is a common
strategy to improve the reasoning performance of LMs. In this work, we revisit whether this …

TinyLLaVA: A framework of small-scale large multimodal models

B Zhou, Y Hu, X Weng, J Jia, J Luo, X Liu, J Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
We present the TinyLLaVA framework that provides a unified perspective in designing and
analyzing the small-scale Large Multimodal Models (LMMs). We empirically study the effects …