Boosting continual learning of vision-language models via mixture-of-experts adapters

J Yu, Y Zhuge, L Zhang, P Hu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Continual learning can empower vision-language models to continuously acquire new
knowledge without the need for access to the entire historical dataset. However, mitigating …
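The mechanism this title names, attaching a pool of small adapter experts to a frozen backbone and mixing them with a learned router, can be sketched roughly as follows. This is a minimal illustration only; the class name, sizes, and routing scheme are assumptions, not the paper's implementation.

```python
# Minimal sketch, not the paper's code: a mixture-of-experts adapter layer.
# A frozen backbone feature is refined by a router-weighted sum of small
# bottleneck adapters; all names and sizes here are illustrative assumptions.
import torch
import torch.nn as nn

class MoEAdapter(nn.Module):
    def __init__(self, dim: int, num_experts: int = 4, bottleneck: int = 16):
        super().__init__()
        # Each expert is a lightweight down-project / nonlinearity / up-project adapter.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, bottleneck), nn.ReLU(), nn.Linear(bottleneck, dim))
            for _ in range(num_experts)
        )
        self.router = nn.Linear(dim, num_experts)  # learned mixing weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, dim)
        weights = torch.softmax(self.router(x), dim=-1)          # (batch, experts)
        outs = torch.stack([e(x) for e in self.experts], dim=1)  # (batch, experts, dim)
        # Residual connection keeps the frozen backbone feature intact.
        return x + (weights.unsqueeze(-1) * outs).sum(dim=1)
```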

A survey on mixture of experts

W Cai, J Jiang, F Wang, J Tang, S Kim… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have achieved unprecedented advances across
diverse fields, ranging from natural language processing to computer vision and beyond …

Multi-task dense prediction via mixture of low-rank experts

Y Yang, PT Jiang, Q Hou, H Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Previous multi-task dense prediction methods based on the Mixture of Experts (MoE) have
achieved strong performance, but they neglect the importance of explicitly modeling the global …
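A rough sketch of the "mixture of low-rank experts" idea this title names: each expert is a rank-constrained convolutional branch, and a per-task gate computed from globally pooled features mixes the expert outputs. The shapes, names, and gating scheme below are assumptions for illustration, not the paper's architecture.

```python
# Illustrative sketch: low-rank convolutional experts mixed by per-task gates.
import torch
import torch.nn as nn

class LowRankExpert(nn.Module):
    def __init__(self, channels: int, rank: int = 8):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, rank, kernel_size=1),         # project to low rank
            nn.Conv2d(rank, rank, kernel_size=3, padding=1),  # cheap spatial mixing
            nn.Conv2d(rank, channels, kernel_size=1),         # project back up
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)

class MixtureOfLowRankExperts(nn.Module):
    def __init__(self, channels: int, num_experts: int = 4, num_tasks: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(LowRankExpert(channels) for _ in range(num_experts))
        # One gating head per task, fed the globally pooled feature map.
        self.gates = nn.ModuleList(nn.Linear(channels, num_experts) for _ in range(num_tasks))

    def forward(self, x: torch.Tensor, task: int) -> torch.Tensor:  # x: (B, C, H, W)
        pooled = x.mean(dim=(2, 3))                               # (B, C)
        w = torch.softmax(self.gates[task](pooled), dim=-1)       # (B, E)
        outs = torch.stack([e(x) for e in self.experts], dim=1)   # (B, E, C, H, W)
        return x + (w[:, :, None, None, None] * outs).sum(dim=1)
```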

A survey of reasoning with foundation models

J Sun, C Zheng, E Xie, Z Liu, R Chu, J Qiu, J Xu… - arXiv preprint arXiv …, 2023 - arxiv.org
Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-
world settings such as negotiation, medical diagnosis, and criminal investigation. It serves …

Generative AI agents with large language model for satellite networks via a mixture of experts transmission

R Zhang, H Du, Y Liu, D Niyato, J Kang… - IEEE Journal on …, 2024 - ieeexplore.ieee.org
In response to the needs of 6G global communications, satellite communication networks
have emerged as a key solution. However, the large-scale development of satellite …

TaskExpert: Dynamically assembling multi-task representations with memorial mixture-of-experts

H Ye, D Xu - Proceedings of the IEEE/CVF international …, 2023 - openaccess.thecvf.com
Learning discriminative task-specific features simultaneously for multiple distinct tasks is a
fundamental problem in multi-task learning. Recent state-of-the-art models consider directly …

Psychometry: An omnifit model for image reconstruction from human brain activity

R Quan, W Wang, Z Tian, F Ma… - Proceedings of the …, 2024 - openaccess.thecvf.com
Reconstructing the viewed images from human brain activity bridges human and computer
vision through the Brain-Computer Interface. The inherent variability in brain function …

Mixtures of experts unlock parameter scaling for deep RL

J Obando-Ceron, G Sokar, T Willi, C Lyle… - arXiv preprint arXiv …, 2024 - arxiv.org
The recent rapid progress in (self) supervised learning models is in large part predicted by
empirical scaling laws: a model's performance scales proportionally to its size. Analogous …

SiRA: Sparse mixture of low rank adaptation

Y Zhu, N Wichers, CC Lin, X Wang, T Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
Parameter-efficient tuning has been a prominent approach to adapting large language
models to downstream tasks. Most previous works consider adding dense trainable …
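This title names a concrete mechanism: instead of one dense adapter, several low-rank (LoRA) experts sit beside a frozen weight and a top-k router activates only a few of them per input. A minimal sketch under those assumptions follows; the names and hyperparameters are illustrative, not SiRA's actual implementation.

```python
# Illustrative sketch: sparse top-k routing over LoRA experts beside a frozen layer.
import torch
import torch.nn as nn

class SparseLoRAMixture(nn.Module):
    def __init__(self, dim: int, num_experts: int = 8, rank: int = 4, top_k: int = 2):
        super().__init__()
        self.base = nn.Linear(dim, dim)  # stands in for a frozen pretrained weight
        for p in self.base.parameters():
            p.requires_grad_(False)
        self.down = nn.Parameter(torch.randn(num_experts, dim, rank) * 0.02)
        self.up = nn.Parameter(torch.zeros(num_experts, rank, dim))  # LoRA-style zero init
        self.router = nn.Linear(dim, num_experts)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, dim)
        scores, idx = self.router(x).topk(self.top_k, dim=-1)  # keep only k experts
        weights = torch.softmax(scores, dim=-1)                # renormalize over the k
        delta = torch.zeros_like(x)
        for slot in range(self.top_k):
            d = self.down[idx[:, slot]]                        # (batch, dim, rank)
            u = self.up[idx[:, slot]]                          # (batch, rank, dim)
            h = torch.bmm(x.unsqueeze(1), d)                   # (batch, 1, rank)
            delta = delta + weights[:, slot:slot + 1] * torch.bmm(h, u).squeeze(1)
        return self.base(x) + delta                            # frozen path + sparse update
```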

DiffusionMTL: Learning multi-task denoising diffusion model from partially annotated data

H Ye, D Xu - Proceedings of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Recently, there has been increased interest in the practical problem of learning multiple
dense scene understanding tasks from partially annotated data, where each training sample …