From Google Gemini to OpenAI Q* (Q-Star): A survey of reshaping the generative artificial intelligence (AI) research landscape

TR McIntosh, T Susnjak, T Liu, P Watters… - arXiv preprint arXiv …, 2023 - arxiv.org
This comprehensive survey explored the evolving landscape of generative Artificial
Intelligence (AI), with a specific focus on the transformative impacts of Mixture of Experts …

A survey on LoRA of large language models

Y Mao, Y Ge, Y Fan, W Xu, Y Mi, Z Hu… - Frontiers of Computer …, 2025 - Springer
Low-Rank Adaptation (LoRA), which updates the dense neural network layers with
pluggable low-rank matrices, is one of the best-performing parameter-efficient fine-tuning …
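
To make the mechanism concrete, here is a minimal PyTorch sketch of a dense layer with a pluggable low-rank update. The rank, scaling factor, and layer sizes are illustrative assumptions, not the settings of any surveyed method:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen dense layer plus a pluggable low-rank update (LoRA sketch)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # pretrained weights stay frozen
        # B @ A is the low-rank approximation of the weight update Delta W.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = base(x) + scale * x A^T B^T, i.e. W is effectively W + scale * B A
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 768))  # only A and B receive gradients
```

Because B starts at zero, the adapter initially leaves the base model unchanged; only the small A and B factors are trained, which is what makes the module pluggable and parameter-efficient.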

Mixture of cluster-conditional LoRA experts for vision-language instruction tuning

Y Gou, Z Liu, K Chen, L Hong, H Xu, A Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Instruction tuning of Large Vision-language Models (LVLMs) has revolutionized the
development of versatile models with zero-shot generalization across a wide range of …

Interactive AI with retrieval-augmented generation for next generation networking

R Zhang, H Du, Y Liu, D Niyato, J Kang, S Sun… - IEEE …, 2024 - ieeexplore.ieee.org
With the advance of artificial intelligence (AI), the concept of interactive AI (IAI) has been
introduced: AI that can interactively understand and respond not only to human user input …
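
As a rough, self-contained illustration of the retrieval-augmented generation pattern underlying such a system, the sketch below uses a toy bag-of-words retriever; embed, retrieve, and build_prompt are hypothetical helpers, and a real pipeline would use a neural encoder and an LLM to produce the final answer:

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a neural encoder.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    # Retrieved passages ground the generator's answer in external knowledge.
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = ["Network slicing allocates isolated virtual networks per service.",
          "Routing tables map destination prefixes to next hops."]
print(build_prompt("How does network slicing work?", corpus))
```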

MineDreamer: Learning to follow instructions via chain-of-imagination for simulated-world control

E Zhou, Y Qin, Z Yin, Y Huang, R Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
It is a long-standing goal to design a generalist embodied agent that can follow diverse
instructions in human-like ways. However, existing approaches often fail to steadily follow …

MiniGPT-3D: Efficiently aligning 3D point clouds with large language models using 2D priors

Y Tang, X Han, X Li, Q Yu, Y Hao, L Hu… - Proceedings of the 32nd …, 2024 - dl.acm.org
Large 2D vision-language models (2D-LLMs) have gained significant attention by bridging
Large Language Models (LLMs) with images using a simple projector. Inspired by their …
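
The "simple projector" mentioned here can be sketched in a few lines: encoder features are linearly mapped into the LLM's embedding space and prepended to the text tokens. All dimensions below are illustrative assumptions (real models use specific encoders and sometimes MLP projectors):

```python
import torch
import torch.nn as nn

VISION_DIM, LLM_DIM = 1024, 4096  # assumed encoder and LLM hidden sizes

# The projector bridges the modality gap between encoder features and the LLM.
projector = nn.Linear(VISION_DIM, LLM_DIM)

vision_tokens = torch.randn(1, 256, VISION_DIM)      # e.g. patch or point features
prefix = projector(vision_tokens)                    # now shaped like LLM embeddings
text_embeds = torch.randn(1, 32, LLM_DIM)            # stand-in for embedded text tokens
llm_input = torch.cat([prefix, text_embeds], dim=1)  # multimodal input sequence
```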

Harmonizing visual text comprehension and generation

Z Zhao, J Tang, B Wu, C Lin, S Wei, H Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we present TextHarmony, a unified and versatile multimodal generative model
proficient in comprehending and generating visual text. Simultaneously generating images …

LoRAMoE: Alleviating world knowledge forgetting in large language models via MoE-style plugin

S Dou, E Zhou, Y Liu, S Gao, W Shen… - Proceedings of the …, 2024 - aclanthology.org
Supervised fine-tuning (SFT) is a crucial step for large language models (LLMs), enabling
them to align with human instructions and enhance their capabilities in downstream tasks …
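
A hedged sketch of the MoE-style plugin named in the title: several low-rank experts attached to a frozen dense layer and mixed by a learned router, so new task ability is added without overwriting the frozen weights that carry world knowledge. The class name, expert count, and soft routing are illustrative assumptions, not the paper's exact design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRAMoELayer(nn.Module):
    """A frozen dense layer plus router-mixed low-rank experts (sketch)."""

    def __init__(self, base: nn.Linear, n_experts: int = 4, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # world knowledge stays in the frozen weights
        self.A = nn.Parameter(torch.randn(n_experts, rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(n_experts, base.out_features, rank))
        self.router = nn.Linear(base.in_features, n_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gates = F.softmax(self.router(x), dim=-1)             # (batch, n_experts)
        # Each expert applies its own low-rank update; outputs are gate-weighted.
        expert_out = torch.einsum("bi,eri->ber", x, self.A)   # (batch, e, rank)
        expert_out = torch.einsum("ber,eor->beo", expert_out, self.B)
        return self.base(x) + (gates.unsqueeze(-1) * expert_out).sum(dim=1)

layer = LoRAMoELayer(nn.Linear(512, 512))
y = layer(torch.randn(3, 512))
```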

Mixture of insightful experts (MoTE): The synergy of thought chains and expert mixtures in self-alignment

Z Liu, Y Gou, K Chen, L Hong, J Gao, F Mi… - arXiv preprint arXiv …, 2024 - arxiv.org
As the capabilities of large language models (LLMs) have expanded dramatically, aligning
these models with human values presents a significant challenge. Traditional alignment …

MoME: Mixture of multimodal experts for generalist multimodal large language models

L Shen, G Chen, R Shao, W Guan, L Nie - arXiv preprint arXiv:2407.12709, 2024 - arxiv.org
Multimodal large language models (MLLMs) have demonstrated impressive capabilities
across various vision-language tasks. However, a generalist MLLM typically underperforms …