A survey of multimodal large language models from a data-centric perspective

T Bai, H Liang, B Wan, Y Xu, X Li, S Li, L Yang… - arXiv preprint arXiv…, 2024 - arxiv.org
Multimodal large language models (MLLMs) enhance the capabilities of standard large
language models by integrating and processing data from multiple modalities, including text …

From Instructions to Intrinsic Human Values--A Survey of Alignment Goals for Big Models

J Yao, X Yi, X Wang, J Wang, X Xie - arXiv preprint arXiv:2308.12014, 2023 - arxiv.org
Big models, exemplified by Large Language Models (LLMs), are models typically pre-
trained on massive data and comprised of enormous parameters, which not only obtain …

Towards tracing trustworthiness dynamics: Revisiting pre-training period of large language models

C Qian, J Zhang, W Yao, D Liu, Z Yin, Y Qiao… - arXiv preprint arXiv…, 2024 - arxiv.org
Ensuring the trustworthiness of large language models (LLMs) is crucial. Most studies
concentrate on fully pre-trained LLMs to better understand and improve LLMs' …

Learning or self-aligning? Rethinking instruction fine-tuning

M Ren, B Cao, H Lin, C Liu, X Han, K Zeng… - arXiv preprint arXiv…, 2024 - arxiv.org
Instruction Fine-tuning (IFT) is a critical phase in building large language models (LLMs).
Previous works mainly focus on IFT's role in the transfer of behavioral norms and the …

Foundation models for recommender systems: A survey and new perspectives

C Huang, T Yu, K Xie, S Zhang, L Yao… - arXiv preprint arXiv…, 2024 - arxiv.org
Recently, Foundation Models (FMs), with their extensive knowledge bases and complex
architectures, have offered unique opportunities within the realm of recommender systems …

LLM-assisted code cleaning for training accurate code generators

N Jain, T Zhang, WL Chiang, JE Gonzalez… - arXiv preprint arXiv…, 2023 - arxiv.org
Natural language to code generation is an important application area of LLMs and has
received wide attention from the community. The majority of relevant studies have …

WaveCoder: Widespread and versatile enhancement for code large language models by instruction tuning

Z Yu, X Zhang, N Shang, Y Huang, C Xu… - arXiv preprint arXiv…, 2023 - arxiv.org
Recent work demonstrates that, after instruction tuning, Code Large Language Models
(Code LLMs) can obtain impressive capabilities to address a wide range of code-related …