Pre-trained models: Past, present and future

X Han, Z Zhang, N Ding, Y Gu, X Liu, Y Huo, J Qiu… - AI Open, 2021 - Elsevier
Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved
great success and become a milestone in the field of artificial intelligence (AI). Owing to …

A primer in BERTology: What we know about how BERT works

A Rogers, O Kovaleva, A Rumshisky - Transactions of the Association …, 2021 - direct.mit.edu
Transformer-based models have pushed the state of the art in many areas of NLP, but our
understanding of what is behind their success is still limited. This paper is the first survey of …

Bidirectional language modeling: a systematic literature review

M Shah Jahan, HU Khan, S Akbar… - Scientific …, 2021 - Wiley Online Library
In transfer learning, two major activities, i.e., pretraining and fine-tuning, are carried out to
perform downstream tasks. The advent of transformer architecture and bidirectional …

Investigating the difference of fake news source credibility recognition between ANN and BERT algorithms in artificial intelligence

THC Chiang, CS Liao, WC Wang - Applied Sciences, 2022 - mdpi.com
Fake news, permeating everyday life through various channels, misleads people into disinformation. To reduce
the harm of fake news and provide multiple and effective news credibility channels, the …

GiBERT: Introducing Linguistic Knowledge into BERT through a Lightweight Gated Injection Method

N Peinelt, M Rei, M Liakata - arXiv preprint arXiv:2010.12532, 2020 - arxiv.org
Large pre-trained language models such as BERT have been the driving force behind
recent improvements across many NLP tasks. However, BERT is only trained to predict …
