Struc-bench: Are large language models really good at generating complex structured data?

X Tang, Y Zong, J Phang, Y Zhao, W Zhou… - arXiv preprint arXiv …, 2023 - arxiv.org
Despite the power of Large Language Models (LLMs) like GPT-4, they still struggle with
tasks that require generating complex, structured outputs. In this study, we assess the …

PRobELM: Plausibility ranking evaluation for language models

Z Yuan, E Chamoun, R Aly, C Whitehouse… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces PRobELM (Plausibility Ranking Evaluation for Language Models), a
benchmark designed to assess language models' ability to discern more plausible from less …

InstructIE: A Bilingual Instruction-based Information Extraction Dataset

H Gui, S Qiao, J Zhang, H Ye, M Sun, L Liang… - International Semantic …, 2024 - Springer
Large language models can perform well on general natural language tasks, but their
effectiveness is still suboptimal for information extraction (IE). Recent works indicate that the …

DocTabQA: Answering Questions from Long Documents Using Tables

H Wang, K Hu, H Dong, L Gao - International Conference on Document …, 2024 - Springer
We study a new problem setting of question answering (QA), referred to as DocTabQA.
Within this setting, given a long document, the goal is to respond to questions by organizing …

Towards Knowledge-Grounded Natural Language Understanding and Generation

C Whitehouse - arXiv preprint arXiv:2403.15364, 2024 - arxiv.org
This thesis investigates how natural language understanding and generation with
transformer models can benefit from grounding the models with knowledge representations …

UniKG: A Benchmark and Universal Embedding for Large-Scale Knowledge Graphs

Y Qiu, S Ling, T Zhang, B Huang, Z Cui - arXiv preprint arXiv:2309.05269, 2023 - arxiv.org
Irregular data in real-world are usually organized as heterogeneous graphs (HGs)
consisting of multiple types of nodes and edges. To explore useful knowledge from real …

Low-Rank Adaptation for Multilingual Summarization: An Empirical Study

C Whitehouse, F Huot, J Bastings… - Findings of the …, 2024 - aclanthology.org
Although the advancements of pre-trained Large Language Models have significantly
accelerated recent progress in NLP, their ever-increasing size poses significant challenges …

[BOOK] Document Analysis and Recognition-ICDAR 2024: 18th International Conference, Athens, Greece, August 30–September 4, 2024, Proceedings, Part I

EHB Smith - 2024 - books.google.com
This six-volume set LNCS 14804-14809 constitutes the proceedings of the 18th International
Conference on Document Analysis and Recognition, ICDAR 2024, held in Athens, Greece …

[PDF] Beyond boundaries: Towards generalizable information extraction frameworks

Z Wang - 2024 - staff.fnwi.uva.nl
Information Extraction (IE) is a core area of natural language processing focused on
identifying structured information, such as named entities and relationships, within plain text …

[PDF] Application of GenIR Models in Complex Information Retrieval Tasks

W Zhang - Academic Journal of Computing & Information …, 2024 - francis-press.com
The field of information retrieval (IR) has evolved significantly with the advent of Generative
Information Retrieval (GenIR) models, which leverage advancements in large language …