From Google Gemini to OpenAI Q* (Q-Star): A survey of reshaping the generative artificial intelligence (AI) research landscape
This comprehensive survey explored the evolving landscape of generative Artificial
Intelligence (AI), with a specific focus on the transformative impacts of Mixture of Experts …
Recent advances in generative AI and large language models: Current status, challenges, and perspectives
The emergence of generative artificial intelligence (AI) and large language models (LLMs)
has marked a new era of natural language processing (NLP), introducing unprecedented …
Language is not all you need: Aligning perception with language models
A big convergence of language, multimodal perception, action, and world modeling is a key
step toward artificial general intelligence. In this work, we introduce KOSMOS-1, a …
Efficient large language models: A survey
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …
Retentive network: A successor to transformer for large language models
In this work, we propose Retentive Network (RetNet) as a foundation architecture for large
language models, simultaneously achieving training parallelism, low-cost inference, and …
Language models are general-purpose interfaces
Foundation models have received much attention due to their effectiveness across a broad
range of downstream applications. Though there is a big convergence in terms of …
A survey on mixture of experts
Large language models (LLMs) have garnered unprecedented advancements across
diverse fields, ranging from natural language processing to computer vision and beyond …
LoRAMoE: Alleviating world knowledge forgetting in large language models via MoE-style plugin
Supervised fine-tuning (SFT) is a crucial step for large language models (LLMs), enabling
them to align with human instructions and enhance their capabilities in downstream tasks …
Tutel: Adaptive mixture-of-experts at scale
Sparsely-gated mixture-of-experts (MoE) has been widely adopted to scale deep learning
models to trillion-plus parameters with fixed computational cost. The algorithmic …
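The fixed-compute property noted in this abstract comes from routing each token to only a small number of experts. The following is a minimal sketch of top-k sparse gating in PyTorch, assuming a simple feed-forward expert; the names (SparseMoE, num_experts, top_k) are illustrative and do not reflect Tutel's actual API.

# Minimal sketch of sparsely-gated mixture-of-experts routing (top-k gating).
# Each token is processed by only top_k experts, so per-token compute stays
# roughly fixed even as num_experts (and total parameters) grows.
# All names here are illustrative assumptions, not Tutel's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, num_experts: int, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)  # router producing per-expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d_model)
        scores = self.gate(x)                             # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)    # keep only k experts per token
        weights = F.softmax(weights, dim=-1)               # renormalize the kept scores
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                   # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

# Example: 8 experts, but each token activates only 2 of them.
moe = SparseMoE(d_model=16, d_hidden=32, num_experts=8, top_k=2)
y = moe(torch.randn(4, 16))
print(y.shape)  # torch.Size([4, 16])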
AdaMV-MoE: Adaptive multi-task vision mixture-of-experts
Sparsely activated Mixture-of-Experts (MoE) is becoming a promising paradigm for
multi-task learning (MTL). Instead of compressing multiple tasks' knowledge into a single …