A survey of knowledge enhanced pre-trained language models

L Hu, Z Liu, Z Zhao, L Hou, L Nie… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Pre-trained Language Models (PLMs), which are trained on large text corpora via self-
supervised learning methods, have yielded promising performance on various tasks in …

Probing classifiers: Promises, shortcomings, and advances

Y Belinkov - Computational Linguistics, 2022 - direct.mit.edu
Probing classifiers have emerged as one of the prominent methodologies for interpreting
and analyzing deep neural network models of natural language processing. The basic idea …
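
A minimal sketch of the probing setup the snippet refers to: a simple classifier is trained on frozen model representations to test whether some linguistic property is decodable from them. The feature matrix, labels, and property below are illustrative placeholders, not material from the paper.

```python
# Probing-classifier sketch: train a simple classifier on frozen hidden
# states to check whether a property (e.g., a binary linguistic feature)
# is linearly decodable from them. Representations are random stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(1000, 768))  # stand-in for frozen model representations
labels = rng.integers(0, 2, size=1000)        # stand-in for the probed property

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, labels, test_size=0.2, random_state=0
)

probe = LogisticRegression(max_iter=1000)  # the probe itself is kept deliberately simple
probe.fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))
```

High probe accuracy is usually read as evidence that the property is encoded in the representations, which is exactly the inference whose promises and shortcomings the survey examines.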

Modern language models refute Chomsky's approach to language

ST Piantadosi - From fieldwork to linguistic theory: A tribute to …, 2023 - books.google.com
Modern machine learning has subverted and bypassed the theoretical framework of
Chomsky's generative approach to linguistics, including its core claims to particular insights …

Leveraging large language models for multiple choice question answering

J Robinson, CM Rytting, D Wingate - arXiv preprint arXiv:2210.12353, 2022 - arxiv.org
While large language models (LLMs) like GPT-3 have achieved impressive results on
multiple choice question answering (MCQA) tasks in the zero-, one-, and few-shot settings …
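
A hedged sketch of one common way to apply an LLM to MCQA: score each candidate answer by its log-likelihood as a continuation of the question and pick the highest. The model choice (gpt2), prompt format, and token-boundary handling below are assumptions for illustration, not details taken from the paper.

```python
# MCQA-by-scoring sketch: rank answer options by the causal LM's
# log-likelihood of each option as a continuation of the question.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

question = "Question: What is the capital of France? Answer:"
options = [" Paris", " London", " Berlin", " Madrid"]

def option_logprob(question, option):
    """Sum of token log-probabilities of `option` given `question`."""
    q_ids = tokenizer(question, return_tensors="pt").input_ids
    full_ids = tokenizer(question + option, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = full_ids[:, 1:]
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # keep only the positions that belong to the answer option
    return token_lp[:, q_ids.shape[1] - 1:].sum().item()

scores = {opt.strip(): option_logprob(question, opt) for opt in options}
print(max(scores, key=scores.get), scores)
```

An alternative the paper contrasts with this style is to present all labeled options in one prompt and have the model emit the answer letter directly.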

Towards understanding cross and self-attention in stable diffusion for text-guided image editing

B Liu, C Wang, T Cao, K Jia… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Deep Text-to-Image Synthesis (TIS) models such as Stable Diffusion have recently
gained significant popularity for creative text-to-image generation. However, for domain …
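
For context on the mechanism the title refers to, here is a generic cross-attention sketch, not Stable Diffusion's actual implementation: image-latent tokens supply the queries while text-encoder tokens supply keys and values, which is how the prompt steers generation. All module names and dimensions are illustrative assumptions.

```python
# Generic cross-attention sketch: queries from image latents,
# keys/values from text embeddings. Dimensions are illustrative.
import torch
import torch.nn as nn

class CrossAttention(nn.Module):
    def __init__(self, latent_dim=320, text_dim=768, head_dim=64):
        super().__init__()
        self.scale = head_dim ** -0.5
        self.to_q = nn.Linear(latent_dim, head_dim, bias=False)  # queries from image latents
        self.to_k = nn.Linear(text_dim, head_dim, bias=False)    # keys from text tokens
        self.to_v = nn.Linear(text_dim, head_dim, bias=False)    # values from text tokens
        self.to_out = nn.Linear(head_dim, latent_dim)

    def forward(self, latents, text):
        q, k, v = self.to_q(latents), self.to_k(text), self.to_v(text)
        attn = torch.softmax(q @ k.transpose(-1, -2) * self.scale, dim=-1)
        return self.to_out(attn @ v)  # text-conditioned update of the latents

latents = torch.randn(1, 64 * 64, 320)  # flattened spatial latent tokens
text = torch.randn(1, 77, 768)          # CLIP-style text embeddings
print(CrossAttention()(latents, text).shape)  # torch.Size([1, 4096, 320])
```

Self-attention in the same network mixes the latent tokens with each other; the paper studies how these two attention types behave during text-guided editing.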

Pre-trained models: Past, present and future

X Han, Z Zhang, N Ding, Y Gu, X Liu, Y Huo, J Qiu… - AI Open, 2021 - Elsevier
Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved
great success and become a milestone in the field of artificial intelligence (AI). Owing to …

AutoPrompt: Eliciting knowledge from language models with automatically generated prompts

T Shin, Y Razeghi, RL Logan IV, E Wallace… - arXiv preprint arXiv …, 2020 - arxiv.org
The remarkable success of pretrained language models has motivated the study of what
kinds of knowledge these models learn during pretraining. Reformulating tasks as fill-in-the …
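
A small sketch of the fill-in-the-blank style of knowledge elicitation this snippet mentions: a cloze prompt is handed to a masked language model and the top predictions for the mask are read off as the model's "knowledge". AutoPrompt itself searches for trigger tokens automatically via gradients; the hand-written prompt and model choice below are only illustrative.

```python
# Cloze-style knowledge elicitation sketch: query a masked language model
# with a fill-in-the-blank prompt and inspect its top predictions.
# (AutoPrompt additionally learns trigger tokens; this prompt is manual.)
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
prompt = "The capital of France is [MASK]."

for prediction in fill_mask(prompt, top_k=3):
    print(f"{prediction['token_str']:>10}  {prediction['score']:.3f}")
```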

A comprehensive study of knowledge editing for large language models

N Zhang, Y Yao, B Tian, P Wang, S Deng… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have shown extraordinary capabilities in understanding
and generating text that closely mirrors human communication. However, a primary …

SPoT: Better frozen model adaptation through soft prompt transfer

T Vu, B Lester, N Constant, R Al-Rfou, D Cer - arXiv preprint arXiv …, 2021 - arxiv.org
There has been growing interest in parameter-efficient methods to apply pre-trained
language models to downstream tasks. Building on the Prompt Tuning approach of Lester et …
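
A minimal sketch of the soft-prompt mechanism that SPoT builds on: a small matrix of learnable prompt embeddings is prepended to the frozen model's input embeddings, and only those prompt parameters would be trained. SPoT additionally transfers prompts learned on source tasks, which this sketch does not show; the model choice and shapes are illustrative assumptions.

```python
# Soft prompt tuning sketch: prepend learnable prompt embeddings to the
# frozen backbone's input embeddings; only the prompt would be trained.
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
for p in model.parameters():  # freeze the backbone
    p.requires_grad = False

n_prompt, d_model = 20, model.config.n_embd
soft_prompt = nn.Parameter(torch.randn(n_prompt, d_model) * 0.02)

ids = tokenizer("great movie, would watch again", return_tensors="pt").input_ids
tok_emb = model.transformer.wte(ids)                          # (1, seq, d_model)
inputs = torch.cat([soft_prompt.unsqueeze(0), tok_emb], dim=1)

out = model(inputs_embeds=inputs)  # during training, only soft_prompt would receive gradients
print(out.logits.shape)
```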

Transformer feed-forward layers are key-value memories

M Geva, R Schuster, J Berant, O Levy - arXiv preprint arXiv:2012.14913, 2020 - arxiv.org
Feed-forward layers constitute two-thirds of a transformer model's parameters, yet their role
in the network remains under-explored. We show that feed-forward layers in transformer …
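
The key-value reading of a feed-forward layer, sketched numerically under illustrative dimensions: the rows of the first projection act as keys matched against the input, and the resulting activations weight the rows of the second projection as values. The equivalence check below is a generic illustration, not code from the paper.

```python
# Feed-forward layer viewed as key-value memory: FFN(x) = f(x @ K.T) @ V,
# where K holds one key per memory slot and V one value per slot.
import torch
import torch.nn as nn

d_model, d_ff = 512, 2048
x = torch.randn(1, d_model)

fc1 = nn.Linear(d_model, d_ff, bias=False)
fc2 = nn.Linear(d_ff, d_model, bias=False)
standard = fc2(torch.relu(fc1(x)))          # usual two-layer feed-forward computation

keys = fc1.weight                           # (d_ff, d_model): one key per memory slot
values = fc2.weight.T                       # (d_ff, d_model): one value per memory slot
coefficients = torch.relu(x @ keys.T)       # how strongly the input matches each key
memory_view = coefficients @ values         # weighted sum of the retrieved values

print(torch.allclose(standard, memory_view, atol=1e-5))  # True: same computation
```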