Pre-trained language models and their applications

H Wang, J Li, H Wu, E Hovy, Y Sun - Engineering, 2023 - Elsevier
Pre-trained language models have achieved striking success in natural language
processing (NLP), leading to a paradigm shift from supervised learning to pre-training …

A survey on contrastive self-supervised learning

A Jaiswal, AR Babu, MZ Zadeh, D Banerjee… - Technologies, 2020 - mdpi.com
Self-supervised learning has gained popularity because of its ability to avoid the cost of
annotating large-scale datasets. It is capable of adopting self-defined pseudolabels as …

FLAVA: A foundational language and vision alignment model

A Singh, R Hu, V Goswami… - Proceedings of the …, 2022 - openaccess.thecvf.com
State-of-the-art vision and vision-and-language models rely on large-scale visio-linguistic
pretraining for obtaining good performance on a variety of downstream tasks. Generally …

VLMo: Unified vision-language pre-training with mixture-of-modality-experts

H Bao, W Wang, L Dong, Q Liu… - Advances in …, 2022 - proceedings.neurips.cc
We present a unified Vision-Language pretrained Model (VLMo) that jointly learns a dual
encoder and a fusion encoder with a modular Transformer network. Specifically, we …

Pre-trained models: Past, present and future

X Han, Z Zhang, N Ding, Y Gu, X Liu, Y Huo, J Qiu… - AI Open, 2021 - Elsevier
Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved
great success and become a milestone in the field of artificial intelligence (AI). Owing to …

COMET-22: Unbabel-IST 2022 submission for the metrics shared task

R Rei, JGC De Souza, D Alves, C Zerva… - Proceedings of the …, 2022 - aclanthology.org
In this paper, we present the joint contribution of Unbabel and IST to the WMT 2022 Metrics
Shared Task. Our primary submission, dubbed COMET-22, is an ensemble between a …

mT5: A massively multilingual pre-trained text-to-text transformer

L Xue - arXiv preprint arXiv:2010.11934, 2020 - fq.pkwyx.com
The recent" Text-to-Text Transfer Transformer"(T5) leveraged a unified text-to-text format and
scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this …

Contrastive representation learning: A framework and review

PH Le-Khac, G Healy, AF Smeaton - IEEE Access, 2020 - ieeexplore.ieee.org
Contrastive Learning has recently received interest due to its success in self-supervised
representation learning in the computer vision domain. However, the origins of Contrastive …

Knowledge neurons in pretrained transformers

D Dai, L Dong, Y Hao, Z Sui, B Chang, F Wei - arXiv preprint arXiv …, 2021 - arxiv.org
Large-scale pretrained language models are surprisingly good at recalling factual
knowledge presented in the training corpus. In this paper, we present preliminary studies on …

AMMUS: A survey of transformer-based pretrained models in natural language processing

KS Kalyan, A Rajasekharan, S Sangeetha - arXiv preprint arXiv …, 2021 - arxiv.org
Transformer-based pretrained language models (T-PTLMs) have achieved great success in
almost every NLP task. The evolution of these models started with GPT and BERT. These …