Diffusion models: A comprehensive survey of methods and applications
Diffusion models have emerged as a powerful new family of deep generative models with
record-breaking performance in many applications, including image synthesis, video …
Machine-generated text: A comprehensive survey of threat models and detection methods
Machine-generated text is increasingly difficult to distinguish from text authored by humans.
Powerful open-source models are freely available, and user-friendly tools that democratize …
Review of image classification algorithms based on convolutional neural networks
L Chen, S Li, Q Bai, J Yang, S Jiang, Y Miao - Remote Sensing, 2021 - mdpi.com
Image classification has always been a hot research direction in the world, and the
emergence of deep learning has promoted the development of this field. Convolutional …
R-drop: Regularized dropout for neural networks
Dropout is a powerful and widely used technique to regularize the training of deep neural
networks. Though effective and performing well, the randomness introduced by dropout …
High-performance large-scale image recognition without normalization
Batch normalization is a key component of most image classification models, but it has many
undesirable properties stemming from its dependence on the batch size and interactions …
Sophia: A scalable stochastic second-order optimizer for language model pre-training
Given the massive cost of language model pre-training, a non-trivial improvement of the
optimization algorithm would lead to a material reduction in the time and cost of training …
How can we know what language models know?
Recent work has presented intriguing results examining the knowledge contained in
language models (LMs) by having the LM fill in the blanks of prompts such as “Obama is a …
Ctrl: A conditional transformer language model for controllable generation
Large-scale language models show promising text generation capabilities, but users cannot
easily control particular aspects of the generated text. We release CTRL, a 1.63 billion …
Random feature attention
Transformers are state-of-the-art models for a variety of sequence modeling tasks. At their
core is an attention function which models pairwise interactions between the inputs at every …
ERNIE: Enhanced language representation with informative entities
Neural language representation models such as BERT pre-trained on large-scale corpora
can well capture rich semantic patterns from plain text, and be fine-tuned to consistently …