Unifying large language models and knowledge graphs: A roadmap

S Pan, L Luo, Y Wang, C Chen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Large language models (LLMs), such as ChatGPT and GPT4, are making new waves in the
field of natural language processing and artificial intelligence, due to their emergent ability …

Are Large Language Models a Good Replacement of Taxonomies?

Y Sun, H Xin, K Sun, YE Xu, X Yang, XL Dong… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) demonstrate an impressive ability to internalize knowledge
and answer natural language questions. Although previous studies validate that LLMs …

Direct evaluation of chain-of-thought in multi-hop reasoning with knowledge graphs

MV Nguyen, L Luo, F Shiri, D Phung, YF Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) demonstrate strong reasoning abilities when prompted to
generate chain-of-thought (CoT) explanations alongside answers. However, previous …

Assessing how accurately large language models encode and apply the common European framework of reference for languages

L Benedetto, G Gaudeau, A Caines, P Buttery - Computers and Education …, 2025 - Elsevier
Large Language Models (LLMs) can have a transformative effect on a variety of
domains, including education, and it is therefore pressing to understand whether these …

Large Language Models, scientific knowledge and factuality: A framework to streamline human expert evaluation

M Wysocka, O Wysocki, M Delmas, V Mutel… - Journal of Biomedical …, 2024 - Elsevier
Objective: The paper introduces a framework for the evaluation of the encoding of factual
scientific knowledge, designed to streamline the manual evaluation process typically …

KGPA: Robustness Evaluation for Large Language Models via Cross-Domain Knowledge Graphs

A Pei, Z Yang, S Zhu, R Cheng, J Jia… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing frameworks for assessing robustness of large language models (LLMs) overly
depend on specific benchmarks, increasing costs and failing to evaluate performance of …

Factual confidence of LLMs: On reliability and robustness of current estimators

M Mahaut, L Aina, P Czarnowska, M Hardalov… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) tend to be unreliable in the factuality of their answers. To
address this problem, NLP researchers have proposed a range of techniques to estimate …

Incentive Distributed Knowledge Graph Market for Generative Artificial Intelligence in IoT

G Hao, Q Pan, J Wu - IEEE Internet of Things Journal, 2024 - ieeexplore.ieee.org
Generative artificial intelligence (GAI) models are pre-trained using extensive public data.
However, in the Internet of Things (IoT) domain, distributed and heterogeneous data from …

SelfPrompt: Autonomously Evaluating LLM Robustness via Domain-Constrained Knowledge Guidelines and Refined Adversarial Prompts

A Pei, Z Yang, S Zhu, R Cheng, J Jia - arXiv preprint arXiv:2412.00765, 2024 - arxiv.org
Traditional methods for evaluating the robustness of large language models (LLMs) often
rely on standardized benchmarks, which can escalate costs and limit evaluations across …

Benchmarking Biomedical Relation Knowledge in Large Language Models

F Zhang, K Yang, C Zhao, H Li, X Dong, H Tian… - International Symposium …, 2024 - Springer
As a special kind of knowledge base (KB), a large language model (LLM) stores a great deal of
knowledge in the parameters of a deep neural network, and evaluating the accuracy …