Adapting neural networks at runtime: Current trends in at-runtime optimizations for deep learning

M Sponner, B Waschneck, A Kumar - ACM Computing Surveys, 2024 - dl.acm.org
Adaptive optimization methods for deep learning adjust the inference task to the current
circumstances at runtime to reduce the resource footprint while maintaining the model's …

Recent advances in generative AI and large language models: Current status, challenges, and perspectives

DH Hagos, R Battle, DB Rawat - IEEE Transactions on Artificial …, 2024 - ieeexplore.ieee.org
The emergence of generative artificial intelligence (AI) and large language models (LLMs)
has marked a new era of natural language processing (NLP), introducing unprecedented …

Scaling vision with sparse mixture of experts

C Riquelme, J Puigcerver, B Mustafa… - Advances in …, 2021 - proceedings.neurips.cc
Sparsely-gated Mixture of Experts networks (MoEs) have demonstrated excellent
scalability in Natural Language Processing. In Computer Vision, however, almost all …
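
For orientation alongside this entry, a minimal PyTorch sketch of the sparsely-gated top-k routing that such MoE layers build on. Everything here (layer sizes, expert count, the SparseMoE name) is illustrative, not taken from the paper.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SparseMoE(nn.Module):
        def __init__(self, dim=64, num_experts=8, k=2):
            super().__init__()
            self.k = k
            self.gate = nn.Linear(dim, num_experts)      # router scores each expert
            self.experts = nn.ModuleList(
                [nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                               nn.Linear(4 * dim, dim)) for _ in range(num_experts)])

        def forward(self, x):                            # x: (tokens, dim)
            weights, idx = self.gate(x).topk(self.k, dim=-1)
            weights = F.softmax(weights, dim=-1)         # mix only the top-k experts
            out = torch.zeros_like(x)
            for slot in range(self.k):
                for e, expert in enumerate(self.experts):
                    mask = idx[:, slot] == e             # tokens routed to expert e
                    if mask.any():
                        out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
            return out

    tokens = torch.randn(16, 64)
    print(SparseMoE()(tokens).shape)                     # torch.Size([16, 64])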

Dynamic neural networks: A survey

Y Han, G Huang, S Song, L Yang… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Dynamic neural networks are an emerging research topic in deep learning. Compared to static
models, which have fixed computational graphs and parameters at the inference stage …
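
A common instance of this survey's theme is early exiting: intermediate classifiers stop computation once a prediction looks confident, so easy inputs use fewer layers. A toy sketch with a hypothetical four-block network and a fixed confidence threshold:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class EarlyExitNet(nn.Module):
        def __init__(self, dim=32, num_classes=10, threshold=0.9):
            super().__init__()
            self.blocks = nn.ModuleList([nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
                                         for _ in range(4)])
            self.heads = nn.ModuleList([nn.Linear(dim, num_classes) for _ in range(4)])
            self.threshold = threshold

        @torch.no_grad()
        def forward(self, x):                        # x: (dim,), one sample
            for block, head in zip(self.blocks, self.heads):
                x = block(x)
                probs = F.softmax(head(x), dim=-1)
                if probs.max() >= self.threshold:    # confident -> skip deeper blocks
                    break
            return probs                             # deepest head is the fallback

    print(EarlyExitNet()(torch.randn(32)).argmax().item())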

A survey on mixture of experts

W Cai, J Jiang, F Wang, J Tang, S Kim… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have achieved unprecedented advancements across
diverse fields, ranging from natural language processing to computer vision and beyond …

AdaMV-MoE: Adaptive multi-task vision mixture-of-experts

T Chen, X Chen, X Du, A Rashwan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Sparsely activated Mixture-of-Experts (MoE) is becoming a promising paradigm for
multi-task learning (MTL). Instead of compressing multiple tasks' knowledge into a single …
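
The sketch below captures only the adaptive idea this entry describes: a shared router whose number of activated experts depends on the task. The per-task counts here are made-up constants; the paper adjusts expert counts automatically during training rather than fixing them by hand.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    dim, num_experts = 32, 6
    experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_experts)])
    router = nn.Linear(dim, num_experts)
    task_k = {"segmentation": 3, "depth": 1}    # hypothetical per-task expert counts

    def forward(x, task):
        k = task_k[task]                        # the task decides how many experts fire
        weights, idx = router(x).topk(k)
        weights = F.softmax(weights, dim=-1)
        return sum(w * experts[int(e)](x) for w, e in zip(weights, idx))

    print(forward(torch.randn(dim), "segmentation").shape)   # torch.Size([32])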

Multi-site fMRI analysis using privacy-preserving federated learning and domain adaptation: ABIDE results

X Li, Y Gu, N Dvornek, LH Staib, P Ventola… - Medical image …, 2020 - Elsevier
Deep learning models have shown their advantage in many different tasks, including
neuroimage analysis. However, to effectively train a high-quality deep learning model, the …
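
To make the setup concrete, a bare-bones federated-averaging sketch: each site fits the model on its private data and only the weights travel to the server for averaging, never the raw scans. The Gaussian noise on the aggregate merely gestures at privacy preservation; the paper's actual privacy mechanism and its domain-adaptation component are more involved.

    import copy
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def local_step(model, x, y, lr=0.01):
        """One SGD step on a site's private data."""
        F.mse_loss(model(x), y).backward()
        with torch.no_grad():
            for p in model.parameters():
                p -= lr * p.grad
                p.grad = None
        return model

    def fed_avg(global_model, sites, noise_std=1e-3):
        local_models = [local_step(copy.deepcopy(global_model), x, y) for x, y in sites]
        with torch.no_grad():
            for name, p in global_model.named_parameters():   # server-side averaging
                stacked = torch.stack([dict(m.named_parameters())[name]
                                       for m in local_models])
                p.copy_(stacked.mean(0) + noise_std * torch.randn_like(p))
        return global_model

    model = nn.Linear(8, 1)
    sites = [(torch.randn(4, 8), torch.randn(4, 1)) for _ in range(3)]
    fed_avg(model, sites)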

M³ViT: Mixture-of-experts vision transformer for efficient multi-task learning with model-accelerator co-design

Z Fan, R Sarkar, Z Jiang, T Chen… - Advances in …, 2022 - proceedings.neurips.cc
Multi-task learning (MTL) encapsulates multiple learned tasks in a single model and often
lets those tasks learn better jointly. Multi-tasking models have become successful and often …

MetaBEV: Solving sensor failures for 3D detection and map segmentation

C Ge, J Chen, E Xie, Z Wang, L Hong… - Proceedings of the …, 2023 - openaccess.thecvf.com
Perception systems in modern autonomous driving vehicles typically take inputs from
complementary multi-modal sensors, e.g., LiDAR and cameras. However, in real-world …
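
As a loose illustration of surviving a sensor dropout, the sketch below pools whatever modality features are available and substitutes a learned fallback embedding when every sensor fails. MetaBEV's real mechanism (cross-modal attention against a shared meta-BEV query) is considerably more sophisticated; names and dimensions here are invented.

    import torch
    import torch.nn as nn

    class RobustFusion(nn.Module):
        def __init__(self, dim=64):
            super().__init__()
            self.fallback = nn.Parameter(torch.zeros(dim))  # learned stand-in feature
            self.proj = nn.Linear(dim, dim)

        def forward(self, lidar=None, camera=None):
            feats = [f for f in (lidar, camera) if f is not None]
            fused = torch.stack(feats).mean(0) if feats else self.fallback
            return self.proj(fused)

    fusion = RobustFusion()
    print(fusion(camera=torch.randn(64)).shape)             # LiDAR dropped out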

Generalizable person re-identification with relevance-aware mixture of experts

Y Dai, X Li, J Liu, Z Tong… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Domain generalizable (DG) person re-identification (ReID) is a challenging
problem because we cannot access any unseen target domain data during training. Almost …