A Survey of Multimodal Large Language Models

Z Liang, Y Xu, Y Hong, P Shang, Q Wang… - Proceedings of the 3rd …, 2024 - dl.acm.org
With the widespread application of the Transformer architecture in various modalities,
including vision, the technology of large language models is evolving from a single modality …

From Google Gemini to OpenAI Q* (Q-Star): A survey of reshaping the generative artificial intelligence (AI) research landscape

TR McIntosh, T Susnjak, T Liu, P Watters… - arXiv preprint arXiv …, 2023 - arxiv.org
This comprehensive survey explored the evolving landscape of generative Artificial
Intelligence (AI), with a specific focus on the transformative impacts of Mixture of Experts …

Efficient large language models: A survey

Z Wan, X Wang, C Liu, S Alam, Y Zheng, J Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …

LLaMA-MoE: Building mixture-of-experts from LLaMA with continual pre-training

T Zhu, X Qu, D Dong, J Ruan, J Tong… - Proceedings of the …, 2024 - aclanthology.org
Mixture-of-Experts (MoE) has gained increasing popularity as a promising
framework for scaling up large language models (LLMs). However, training MoE from …

A survey on mixture of experts

W Cai, J Jiang, F Wang, J Tang, S Kim… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have achieved unprecedented advancements across
diverse fields, ranging from natural language processing to computer vision and beyond …

Scaling vision-language models with sparse mixture of experts

S Shen, Z Yao, C Li, T Darrell, K Keutzer… - arXiv preprint arXiv …, 2023 - arxiv.org
The field of natural language processing (NLP) has made significant strides in recent years,
particularly in the development of large-scale vision-language models (VLMs). These …

ConPET: Continual parameter-efficient tuning for large language models

C Song, X Han, Z Zeng, K Li, C Chen, Z Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Continual learning necessitates the continual adaptation of models to newly emerging tasks
while minimizing the catastrophic forgetting of old ones. This is extremely challenging for …

Dialogue summarization with mixture of experts based on large language models

Y Tian, F Xia, Y Song - Proceedings of the 62nd Annual Meeting of …, 2024 - aclanthology.org
Dialogue summarization is an important task that requires generating highlights for a
conversation from different aspects (e.g., content of various speakers). While several studies …

Enable language models to implicitly learn self-improvement from data

Z Wang, L Hou, T Lu, Y Wu, Y Li, H Yu, H Ji - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities in open-ended
text generation tasks. However, the inherent open-ended nature of these tasks implies that …

AI safety in generative AI large language models: A survey

J Chua, Y Li, S Yang, C Wang, L Yao - arXiv preprint arXiv:2407.18369, 2024 - arxiv.org
Large Language Models (LLMs) such as ChatGPT that exhibit generative AI capabilities are
facing accelerated adoption and innovation. The increased presence of Generative AI (GAI) …