A survey on LoRA of large language models

Y Mao, Y Ge, Y Fan, W Xu, Y Mi, Z Hu… - Frontiers of Computer …, 2025 - Springer
Abstract: Low-Rank Adaptation (LoRA), which updates the dense neural network layers with
pluggable low-rank matrices, is one of the best-performing parameter-efficient fine-tuning …
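
The snippet above describes LoRA's core mechanism: a frozen dense layer augmented with a small, pluggable low-rank update. The following is a rough, non-authoritative sketch of that idea; the rank, scaling factor, and class name are illustrative assumptions, not details taken from the surveyed paper.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen dense layer plus a pluggable low-rank update (illustrative sketch)."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pretrained dense weights
            p.requires_grad = False
        # Trainable low-rank factors: A is (d_in x r), B is (r x d_out).
        self.lora_A = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(rank, base.out_features))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Dense output plus the low-rank correction x @ A @ B, scaled by alpha / r.
        return self.base(x) + (x @ self.lora_A @ self.lora_B) * self.scaling

# Usage: wrap an existing dense layer; only lora_A and lora_B receive gradients.
layer = LoRALinear(nn.Linear(768, 768), rank=8)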

Toward general-purpose robots via foundation models: A survey and meta-analysis

Y Hu, Q Xie, V Jain, J Francis, J Patrikar… - arXiv preprint arXiv …, 2023 - arxiv.org
Building general-purpose robots that operate seamlessly in any environment, with any
object, and utilizing various skills to complete diverse tasks has been a long-standing goal in …

Dense and aligned captions (DAC) promote compositional reasoning in VL models

S Doveh, A Arbelle, S Harary… - Advances in …, 2023 - proceedings.neurips.cc
Vision and Language (VL) models offer an effective method for aligning the representation
spaces of images and text, allowing for numerous applications such as cross-modal retrieval …

Going beyond nouns with vision & language models using synthetic data

P Cascante-Bonilla, K Shehada… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large-scale pre-trained Vision & Language (VL) models have shown remarkable
performance in many applications, enabling the replacement of a fixed set of supported classes with …

Mind the interference: Retaining pre-trained knowledge in parameter efficient continual learning of vision-language models

L Tang, Z Tian, K Li, C He, H Zhou, H Zhao, X Li… - … on Computer Vision, 2024 - Springer
This study addresses the Domain-Class Incremental Learning problem, a realistic but
challenging continual learning scenario where both the domain distribution and target …

Synthesize, diagnose, and optimize: Towards fine-grained vision-language understanding

W Peng, S Xie, Z You, S Lan… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Vision-language models (VLMs) have demonstrated remarkable performance across various
downstream tasks. However, understanding fine-grained visual-linguistic concepts such as …

A Practitioner's Guide to Continual Multimodal Pretraining

K Roth, V Udandarao, S Dziadzio, A Prabhu… - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal foundation models serve numerous applications at the intersection of vision and
language. Still, despite being pretrained on extensive data, they become outdated over time …

Continual diffusion with STAMINA: Stack-and-mask incremental adapters

JS Smith, YC Hsu, Z Kira, Y Shen… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent work has demonstrated a remarkable ability to customize text-to-image diffusion
models to multiple fine-grained concepts in a sequential (i.e., continual) manner while only …

Dynamic V2X perception from road-to-vehicle vision

J Tan, F Lyu, L Li, F Hu, T Feng, F Xu… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
Vehicle-to-everything (V2X) perception is an innovative technology that enhances vehicle
perception accuracy, thereby elevating the security and reliability of autonomous systems …

Does continual learning meet compositionality? New benchmarks and an evaluation framework

W Liao, Y Wei, M Jiang, Q Zhang… - Advances in Neural …, 2023 - proceedings.neurips.cc
Compositionality facilitates the comprehension of novel objects using acquired concepts
and the maintenance of a knowledge pool. This is particularly crucial for continual learners …