Large language models with controllable working memory

D Li, AS Rawat, M Zaheer, X Wang, M Lukasik… - arXiv preprint arXiv …, 2022 - arxiv.org
Large language models (LLMs) have led to a series of breakthroughs in natural language
processing (NLP), owing to their excellent understanding and generation abilities …

Unsupervised commonsense question answering with self-talk

V Shwartz, P West, RL Bras, C Bhagavatula… - arXiv preprint arXiv …, 2020 - arxiv.org
Natural language understanding involves reading between the lines with implicit
background knowledge. Current systems either rely on pre-trained language models as the …

A survey of knowledge enhanced pre-trained language models

J Yang, X Hu, G Xiao, Y Shen - ACM Transactions on Asian and Low …, 2024 - dl.acm.org
Pre-trained language models learn informative word representations on a large-scale text
corpus through self-supervised learning, which has achieved promising performance in …

BLADE: Enhancing black-box large language models with small domain-specific models

H Li, Q Ai, J Chen, Q Dong, Z Wu, Y Liu, C Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) like ChatGPT and GPT-4 are versatile and capable of
addressing a diverse range of tasks. However, general LLMs, which are developed on open …

A comparative analysis of knowledge injection strategies for large language models in the scholarly domain

A Cadeddu, A Chessa, V De Leo, G Fenu… - … Applications of Artificial …, 2024 - Elsevier
In recent years, transformer-based models have emerged as powerful tools for natural
language processing tasks, demonstrating remarkable performance in several domains …

Self-supervised knowledge triplet learning for zero-shot question answering

P Banerjee, C Baral - arXiv preprint arXiv:2005.00316, 2020 - arxiv.org
The aim of all Question Answering (QA) systems is to be able to generalize to unseen
questions. Current supervised methods are reliant on expensive data annotation. Moreover …

BERT-kNN: Adding a kNN search component to pretrained language models for better QA

N Kassner, H Schütze - arXiv preprint arXiv:2005.00766, 2020 - arxiv.org
Khandelwal et al. (2020) use a k-nearest-neighbor (kNN) component to improve language
model performance. We show that this idea is beneficial for open-domain question …
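
The snippet refers to the kNN-LM mechanism of Khandelwal et al. (2020). As a rough illustration of that interpolation idea (not BERT-kNN's actual pipeline), the sketch below mixes a base next-token distribution with one induced by the k nearest entries of a (context embedding, next token) datastore; the embeddings, datastore contents, and the mixing weight LAMBDA are all invented stand-ins.

```python
# Minimal kNN-LM-style sketch: interpolate the language model's
# next-token distribution with a distribution induced by the k nearest
# neighbours in a datastore of (context embedding, next token) pairs.
# All data below is random filler purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM, K, LAMBDA = 100, 32, 4, 0.25  # hypothetical sizes and mixing weight

# Datastore: one row per training context, plus the token that followed it.
keys = rng.normal(size=(1000, DIM))          # stand-in context embeddings
values = rng.integers(0, VOCAB, size=1000)   # observed next tokens

def knn_distribution(query, temperature=1.0):
    """Distribution over the vocabulary from the k nearest datastore keys,
    weighted by a softmax over negative distances."""
    dists = np.linalg.norm(keys - query, axis=1)
    nearest = np.argsort(dists)[:K]
    weights = np.exp(-dists[nearest] / temperature)
    weights /= weights.sum()
    p = np.zeros(VOCAB)
    for idx, w in zip(nearest, weights):
        p[values[idx]] += w
    return p

query = rng.normal(size=DIM)          # embedding of the test-time context
p_lm = rng.dirichlet(np.ones(VOCAB))  # stand-in for the LM's softmax output
p_final = LAMBDA * knn_distribution(query) + (1 - LAMBDA) * p_lm
print("predicted token id:", p_final.argmax())
```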

KALM: Knowledge-aware integration of local, document, and global contexts for long document understanding

S Feng, Z Tan, W Zhang, Z Lei, Y Tsvetkov - arXiv preprint arXiv …, 2022 - arxiv.org
With the advent of pretrained language models (LMs), increasing research efforts have been
focusing on infusing commonsense and domain-specific knowledge to prepare LMs for …

A novel self-attention enriching mechanism for biomedical question answering

Z Kaddari, T Bouchentouf - Expert Systems with Applications, 2023 - Elsevier
Biomedical question answering is a subtask of the more general question
answering task that is concerned only with biomedical questions. The current state-of-the …

Distilling hypernymy relations from language models: On the effectiveness of zero-shot taxonomy induction

D Jain, LE Anke - arXiv preprint arXiv:2202.04876, 2022 - arxiv.org
In this paper, we analyze zero-shot taxonomy learning methods which are based on
distilling knowledge from language models via prompting and sentence scoring. We show …
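
Sentence scoring here typically means ranking verbalised candidate relations by how plausible a language model finds them. A minimal sketch of that idea, assuming GPT-2 as the scorer and a Hearst-style "is a kind of" template (both illustrative choices, not necessarily the paper's exact setup):

```python
# Score candidate (hyponym, hypernym) pairs by verbalising them with a
# template and computing the language model's average per-token negative
# log-likelihood; lower scores mean the model finds the sentence more
# plausible. GPT-2 and the template are illustrative assumptions.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sentence_score(sentence: str) -> float:
    """Average negative log-likelihood per token under the LM."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # shifted cross-entropy
    return loss.item()

pairs = [("dog", "animal"), ("dog", "vehicle")]
for hypo, hyper in pairs:
    sentence = f"A {hypo} is a kind of {hyper}."
    print(f"{hypo} -> {hyper}: {sentence_score(sentence):.3f}")
```

On this template, the true pair (dog, animal) should receive a lower score than the distractor (dog, vehicle), which is the signal such zero-shot taxonomy induction methods exploit.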