A survey on evaluation of large language models

Y Chang, X Wang, J Wang, Y Wu, L Yang… - ACM transactions on …, 2024 - dl.acm.org
Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …

A systematic survey and critical review on evaluating large language models: Challenges, limitations, and recommendations

MTR Laskar, S Alqahtani, MS Bari… - Proceedings of the …, 2024 - aclanthology.org
Large Language Models (LLMs) have recently gained significant attention due to
their remarkable capabilities in performing diverse tasks across various domains. However …

Multilingual large language model: A survey of resources, taxonomy and frontiers

L Qin, Q Chen, Y Zhou, Z Chen, Y Li, L Liao… - ar…
… and deploying large language models (LLMs).
However, previous safety benchmarks only concern the safety in one language, e.g., the …

ArabicMMLU: Assessing massive multitask language understanding in Arabic

F Koto, H Li, S Shatnawi, J Doughman… - arXiv preprint arXiv …, 2024 - arxiv.org
The focus of language model evaluation has transitioned towards reasoning and knowledge-
intensive tasks, driven by advancements in pretraining large models. While state-of-the-art …

A fast optimization view: Reformulating single layer attention in LLM based on tensor and SVM trick, and solving it in matrix multiplication time

Y Gao, Z Song, W Wang, J Yin - arXiv preprint arXiv:2309.07418, 2023 - arxiv.org
Large language models (LLMs) have played a pivotal role in revolutionizing various facets
of our daily existence. Solving attention regression is a fundamental task in optimizing LLMs …

Can GPT-4 identify propaganda? Annotation and detection of propaganda spans in news articles

M Hasanain, F Ahmed, F Alam - arXiv preprint arXiv:2402.17478, 2024 - arxiv.org
The use of propaganda has spiked on mainstream and social media, aiming to manipulate
or mislead users. While efforts to automatically detect propaganda techniques in textual …

Adversarial attacks and defenses for large language models (LLMs): methods, frameworks & challenges

P Kumar - International Journal of Multimedia Information …, 2024 - Springer
Large language models (LLMs) have exhibited remarkable efficacy and proficiency in a
wide array of NLP endeavors. Nevertheless, concerns are growing rapidly regarding the …

ArMeme: Propagandistic content in Arabic memes

F Alam, A Hasnat, F Ahmed, MA Hasan… - arXiv preprint arXiv …, 2024 - arxiv.org
With the rise of digital communication, memes have become a significant medium for cultural
and political expression that is often used to mislead audiences. Identification of such …

Taqyim: Evaluating Arabic NLP tasks using ChatGPT models

Z Alyafeai, MS Alshaibani, B AlKhamissi… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated impressive performance on various
downstream tasks without requiring fine-tuning, including ChatGPT, a chat-based model …