A Practitioner's Guide to Continual Multimodal Pretraining
Multimodal foundation models serve numerous applications at the intersection of vision and
language. Still, despite being pretrained on extensive data, they become outdated over time …
Improving intervention efficacy via concept realignment in concept bottleneck models
Abstract Concept Bottleneck Models (CBMs) ground image classification on human-
understandable concepts to allow for interpretable model decisions as well as human …
UNIC: Universal classification models via multi-teacher distillation
Pretrained models have become a commodity and offer strong results on a broad range of
tasks. In this work, we focus on classification and seek to learn a unique encoder able to …
Active data curation effectively distills large-scale multimodal models
Knowledge distillation (KD) is the de facto standard for compressing large-scale models into
smaller ones. Prior works have explored ever more complex KD strategies involving different …
Towards a Decentralized Collaborative Framework for Scalable Edge AI
Edge Intelligence has seen unprecedented growth across everyday applications. Traditionally, most applications have required significant effort in data collection …
How to Merge Your Multimodal Models Over Time?
Model merging combines multiple expert models, finetuned from a base foundation model
on diverse tasks and domains, into a single, more capable model. However, most existing …
ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections
Parameter-efficient finetuning (PEFT) has become ubiquitous to adapt foundation models to
downstream task requirements while retaining their generalization ability. However, the …
Weak-to-Strong Enhanced Vision Model
Recent advances in large language and vision models have demonstrated
extraordinary capabilities, driving researchers to train increasingly larger models in pursuit …