Pre-trained language models and their applications

H Wang, J Li, H Wu, E Hovy, Y Sun - Engineering, 2023 - Elsevier
Pre-trained language models have achieved striking success in natural language
processing (NLP), leading to a paradigm shift from supervised learning to pre-training …

A survey of transformers

T Lin, Y Wang, X Liu, X Qiu - AI Open, 2022 - Elsevier
Transformers have achieved great success in many artificial intelligence fields, such as
natural language processing, computer vision, and audio processing. Therefore, it is natural …

ERNIE 3.0: Large-scale knowledge enhanced pre-training for language understanding and generation

Y Sun, S Wang, S Feng, S Ding, C Pang… - arXiv preprint arXiv …, 2021 - arxiv.org
Pre-trained models have achieved state-of-the-art results in various Natural Language
Processing (NLP) tasks. Recent works such as T5 and GPT-3 have shown that scaling up …

Recurrent memory transformer

A Bulatov, Y Kuratov, M Burtsev - Advances in Neural …, 2022 - proceedings.neurips.cc
Transformer-based models show their effectiveness across multiple domains and tasks. Self-attention
makes it possible to combine information from all sequence elements into context-aware …
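
For context, a minimal NumPy sketch of the plain single-head self-attention this snippet refers to, where every output row mixes information from all sequence positions (the shapes, random weights, and function name are illustrative assumptions; the paper's recurrent memory tokens are not shown):

    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        # X: (seq_len, d_model). Each output row is a weighted mixture
        # of value vectors drawn from every position in the sequence.
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[-1])        # (seq_len, seq_len)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)             # row-wise softmax
        return w @ V                                   # context-aware representations

    rng = np.random.default_rng(0)
    d = 16
    X = rng.normal(size=(8, d))                        # 8 tokens, d_model = 16
    Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
    out = self_attention(X, Wq, Wk, Wv)                # shape (8, 16)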

A survey on text classification algorithms: From text to predictions

A Gasparetto, M Marcuzzo, A Zangari, A Albarelli - Information, 2022 - mdpi.com
In recent years, the exponential growth of digital documents has been met by rapid progress
in text classification techniques. Newly proposed machine learning algorithms leverage the …

Scaling Transformer to 1M tokens and beyond with RMT

A Bulatov, Y Kuratov, Y Kapushev… - arXiv preprint arXiv …, 2023 - arxiv.org
A major limitation for the broader scope of problems solvable by transformers is the
quadratic scaling of computational complexity with input size. In this study, we investigate …
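
A back-of-the-envelope illustration of that quadratic scaling, assuming a single float32 attention-score matrix of shape seq_len x seq_len per head per layer (the constants are an assumption for illustration, not figures from the paper):

    for n in (1_000, 10_000, 100_000, 1_000_000):
        entries = n * n                      # attention scores for one head
        gib = entries * 4 / 2**30            # float32 bytes -> GiB
        print(f"n={n:>9,}: {gib:>12,.1f} GiB per head per layer")

At n = 1,000,000 the score matrix alone would occupy roughly 3.6 TiB per head, which is why approaches like RMT process the input as segments with a fixed-size recurrent memory rather than attending over the full sequence at once.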

Museformer: Transformer with fine- and coarse-grained attention for music generation

B Yu, P Lu, R Wang, W Hu, X Tan… - Advances in …, 2022 - proceedings.neurips.cc
Symbolic music generation aims to generate music scores automatically. A recent trend is to use the
Transformer or its variants for music generation; this is, however, suboptimal because …

Advancing transformer architecture in long-context large language models: A comprehensive survey

Y Huang, J Xu, J Lai, Z Jiang, T Chen, Z Li… - arXiv preprint arXiv …, 2023 - arxiv.org
With the explosion of interest ignited by ChatGPT, Transformer-based Large Language Models (LLMs)
have paved a revolutionary path toward Artificial General Intelligence (AGI) and have been …

A survey on long text modeling with transformers

Z Dong, T Tang, L Li, WX Zhao - arXiv preprint arXiv:2302.14502, 2023 - arxiv.org
Modeling long texts has been an essential technique in the field of natural language
processing (NLP). With the ever-growing number of long documents, it is important to …

Realistic morphology-preserving generative modelling of the brain

PD Tudosiu, WHL Pinaya… - Nature Machine …, 2024 - nature.com
Medical imaging research is often limited by data scarcity and availability. Governance,
privacy concerns and the cost of acquisition all restrict access to medical imaging data …