- Academic Search

A survey on evaluation of large language models

Y Chang, X Wang, J Wang, Y Wu, L Yang… - ACM Transactions on …, 2024 - dl.acm.org

Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …

Cited by 2106 (4 versions)

[PDF] acm.org

Scientific large language models: A survey on biological & chemical domains

Q Zhang, K Ding, T Lv, X Wang, Q Yin, Y Zhang… - ACM Computing …, 2024 - dl.acm.org

Large Language Models (LLMs) have emerged as a transformative power in enhancing
natural language comprehension, representing a significant stride toward artificial general …

Cited by 44 (2 versions)

[PDF] arxiv.org

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org

Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

Cited by 3554 (4 versions)

[PDF] arxiv.org

Inadequacies of large language model benchmarks in the era of generative artificial intelligence

TR McIntosh, T Susnjak, N Arachchilage, T Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

The rapid rise in popularity of Large Language Models (LLMs) with emerging capabilities
has spurred public curiosity to evaluate and compare different LLMs, leading many …

Cited by 105 (3 versions)

[PDF] thecvf.com

EvalCrafter: Benchmarking and evaluating large video generation models

Y Liu, X Cun, X Liu, X Wang, Y Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com

The vision and language generative models have been overgrown in recent years. For
video generation various open-sourced models and public-available services have been …

Cited by 89 (3 versions)

[PDF] arxiv.org

SuperCLUE: A comprehensive Chinese large language model benchmark

L Xu, A Li, L Zhu, H Xue, C Zhu, K Zhao, H He… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) have shown the potential to be integrated into human daily
lives. Therefore, user preference is the most critical criterion for assessing LLMs' …

Cited by 48 (2 versions)

[PDF] arxiv.org

Learning or self-aligning? rethinking instruction fine-tuning

M Ren, B Cao, H Lin, C Liu, X Han, K Zeng… - arXiv preprint arXiv …, 2024 - arxiv.org

Instruction Fine-tuning (IFT) is a critical phase in building large language models (LLMs).
Previous works mainly focus on the IFT's role in the transfer of behavioral norms and the …

Cited by 17 (2 versions)

[PDF] arxiv.org

SciAssess: Benchmarking LLM proficiency in scientific literature analysis

H Cai, X Cai, J Chang, S Li, L Yao, C Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent breakthroughs in Large Language Models (LLMs) have revolutionized scientific
literature analysis. However, existing benchmarks fail to adequately evaluate the proficiency …

Cited by 17 (2 versions)

[PDF] aaai.org

Can Large Language Models Understand Real-World Complex Instructions?

Q He, J Zeng, W Huang, L Chen, J Xiao, Q He… - Proceedings of the …, 2024 - ojs.aaai.org

Large language models (LLMs) can understand human instructions, showing their potential
for pragmatic applications beyond traditional NLP tasks. However, they still struggle with …

Cited by 48 (4 versions)

[PDF] arxiv.org

SeaEval for multilingual foundation models: From cross-lingual alignment to cultural reasoning

B Wang, Z Liu, X Huang, F Jiao, Y Ding, AT Aw… - arXiv preprint arXiv …, 2023 - arxiv.org

We present SeaEval, a benchmark for multilingual foundation models. In addition to
characterizing how these models understand and reason with natural language, we also …

Cited by 19 (4 versions)

Saved to My library

Xiezhi: An ever-updating benchmark for holistic domain knowledge evaluation

A survey on evaluation of large language models

Scientific large language models: A survey on biological & chemical domains

A survey of large language models

Inadequacies of large language model benchmarks in the era of generative artificial intelligence

EvalCrafter: Benchmarking and evaluating large video generation models

SuperCLUE: A comprehensive Chinese large language model benchmark

Learning or self-aligning? rethinking instruction fine-tuning

SciAssess: Benchmarking LLM proficiency in scientific literature analysis

Can Large Language Models Understand Real-World Complex Instructions?

SeaEval for multilingual foundation models: From cross-lingual alignment to cultural reasoning