From Google Gemini to OpenAI Q* (Q-Star): A survey of reshaping the generative artificial intelligence (AI) research landscape
This comprehensive survey explored the evolving landscape of generative Artificial
Intelligence (AI), with a specific focus on the transformative impacts of Mixture of Experts …
Recent advances in generative AI and large language models: Current status, challenges, and perspectives
The emergence of generative artificial intelligence (AI) and large language models (LLMs)
has marked a new era of natural language processing (NLP), introducing unprecedented …
Language is not all you need: Aligning perception with language models
A big convergence of language, multimodal perception, action, and world modeling is a key
step toward artificial general intelligence. In this work, we introduce KOSMOS-1, a …
Efficient large language models: A survey
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …
Retentive network: A successor to transformer for large language models
In this work, we propose Retentive Network (RetNet) as a foundation architecture for large
language models, simultaneously achieving training parallelism, low-cost inference, and …
Language models are general-purpose interfaces
Foundation models have received much attention due to their effectiveness across a broad
range of downstream applications. Though there is a big convergence in terms of …
A survey on mixture of experts
Large language models (LLMs) have garnered unprecedented advancements across
diverse fields, ranging from natural language processing to computer vision and beyond …
LoRAMoE: Alleviating world knowledge forgetting in large language models via MoE-style plugin
Supervised fine-tuning (SFT) is a crucial step for large language models (LLMs), enabling
them to align with human instructions and enhance their capabilities in downstream tasks …
Tutel: Adaptive mixture-of-experts at scale
Sparsely-gated mixture-of-experts (MoE) has been widely adopted to scale deep learning
models to trillion-plus parameters with fixed computational cost. The algorithmic …
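The fixed-compute property noted in this abstract comes from routing each token to only a small number of experts. The following is a minimal sketch of top-k sparse gating in PyTorch, assuming a simple feed-forward expert; the names (SparseMoE, num_experts, top_k) are illustrative and do not reflect Tutel's actual API.

# Minimal sketch of sparsely-gated mixture-of-experts routing (top-k gating).
# Each token is processed by only top_k experts, so per-token compute stays
# roughly fixed even as num_experts (and total parameters) grows.
# All names here are illustrative assumptions, not Tutel's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, num_experts: int, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)  # router producing per-expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d_model)
        scores = self.gate(x)                             # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)    # keep only k experts per token
        weights = F.softmax(weights, dim=-1)               # renormalize the kept scores
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                   # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

# Example: 8 experts, but each token activates only 2 of them.
moe = SparseMoE(d_model=16, d_hidden=32, num_experts=8, top_k=2)
y = moe(torch.randn(4, 16))
print(y.shape)  # torch.Size([4, 16])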
AdaMV-MoE: Adaptive multi-task vision mixture-of-experts
Sparsely activated Mixture-of-Experts (MoE) is becoming a promising paradigm for
multi-task learning (MTL). Instead of compressing multiple tasks' knowledge into a single …