A survey of knowledge enhanced pre-trained language models

L Hu, Z Liu, Z Zhao, L Hou, L Nie… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Pre-trained Language Models (PLMs), which are trained on large text corpora via self-
supervised learning, have yielded promising performance on various tasks in …

A comprehensive study of knowledge editing for large language models

N Zhang, Y Yao, B Tian, P Wang, S Deng… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have shown extraordinary capabilities in understanding
and generating text that closely mirrors human communication. However, a primary …

Mass-editing memory in a transformer

K Meng, AS Sharma, A Andonian, Y Belinkov… - arXiv preprint arXiv …, 2022 - arxiv.org
Recent work has shown exciting promise in updating large language models with new
memories, so as to replace obsolete information or add specialized knowledge. However …

Editing large language models: Problems, methods, and opportunities

Y Yao, P Wang, B Tian, S Cheng, Z Li, S Deng… - arXiv preprint arXiv …, 2023 - arxiv.org
Despite the ability to train capable LLMs, the methodology for maintaining their relevance
and rectifying errors remains elusive. To this end, the past few years have witnessed a surge …

Evaluating the ripple effects of knowledge editing in language models

R Cohen, E Biran, O Yoran, A Globerson… - Transactions of the …, 2024 - direct.mit.edu
Modern language models capture a large body of factual knowledge. However, some facts
can be incorrectly induced or become obsolete over time, resulting in factually incorrect …

Transformer feed-forward layers build predictions by promoting concepts in the vocabulary space

M Geva, A Caciularu, KR Wang, Y Goldberg - arXiv preprint arXiv …, 2022 - arxiv.org
Transformer-based language models (LMs) are at the core of modern NLP, but their internal
prediction construction process is opaque and largely not understood. In this work, we make …

Calibrating factual knowledge in pretrained language models

Q Dong, D Dai, Y Song, J Xu, Z Sui, L Li - arXiv preprint arXiv:2210.03329, 2022 - arxiv.org
Previous literature has shown that Pretrained Language Models (PLMs) can store factual
knowledge. However, we find that facts stored in PLMs are not always correct. It …

OntoProtein: Protein pretraining with gene ontology embedding

N Zhang, Z Bi, X Liang, S Cheng, H Hong… - arXiv preprint arXiv …, 2022 - arxiv.org
Self-supervised protein language models have proven their effectiveness in learning
protein representations. With increasing computational power, current protein language …

A study on ReLU and softmax in Transformer

K Shen, J Guo, X Tan, S Tang, R Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
The Transformer architecture consists of self-attention and feed-forward networks (FFNs),
which can be viewed as key-value memories according to previous work. However, FFN …

Human parity on CommonsenseQA: Augmenting self-attention with external attention

Y Xu, C Zhu, S Wang, S Sun, H Cheng, X Liu… - arXiv preprint arXiv …, 2021 - arxiv.org
Most of today's AI systems focus on using self-attention mechanisms and transformer
architectures on large amounts of diverse data to achieve impressive performance gains. In …