Diffusion models: A comprehensive survey of methods and applications

L Yang, Z Zhang, Y Song, S Hong, R Xu, Y Zhao… - ACM Computing …, 2023 - dl.acm.org
Diffusion models have emerged as a powerful new family of deep generative models with
record-breaking performance in many applications, including image synthesis, video …
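
The mechanics behind this model family are compact enough to sketch. Below is a minimal illustration of the DDPM forward (noising) process, assuming PyTorch; the schedule values and tensor shapes are illustrative, not taken from the survey:

```python
# Minimal sketch of the DDPM forward (noising) process that diffusion
# models build on. Schedule and shapes are illustrative.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative product \bar{alpha}_t

def q_sample(x0: torch.Tensor, t: int) -> torch.Tensor:
    """Sample x_t ~ q(x_t | x_0) in closed form."""
    eps = torch.randn_like(x0)
    a = alphas_bar[t]
    return a.sqrt() * x0 + (1.0 - a).sqrt() * eps

x0 = torch.randn(1, 3, 32, 32)   # stand-in for a training image
x_t = q_sample(x0, t=500)        # partially noised sample
```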

Machine-generated text: A comprehensive survey of threat models and detection methods

EN Crothers, N Japkowicz, HL Viktor - IEEE Access, 2023 - ieeexplore.ieee.org
Machine-generated text is increasingly difficult to distinguish from text authored by humans.
Powerful open-source models are freely available, and user-friendly tools that democratize …
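
One detection family such surveys cover scores text by its likelihood under a reference LM. A hedged sketch, assuming the Hugging Face transformers package and GPT-2; the threshold is a made-up illustration, not a recommended value:

```python
# Hedged sketch of likelihood-based detection: text that is unusually
# probable under a reference LM is flagged as machine-generated.
# Assumes the `transformers` package; the threshold is illustrative.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss  # mean token cross-entropy
    return float(torch.exp(loss))

def looks_machine_generated(text: str, threshold: float = 30.0) -> bool:
    return perplexity(text) < threshold  # low perplexity -> suspicious
```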

Review of image classification algorithms based on convolutional neural networks

L Chen, S Li, Q Bai, J Yang, S Jiang, Y Miao - Remote Sensing, 2021 - mdpi.com
Image classification has long been an active research direction worldwide, and the
emergence of deep learning has accelerated progress in the field. Convolutional …
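
For orientation, a minimal convolutional classifier of the kind the review covers, assuming PyTorch; layer sizes are illustrative and not drawn from any surveyed architecture:

```python
# Minimal sketch of a convolutional image classifier (sizes illustrative).
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                 # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),         # global average pooling
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x).flatten(1)
        return self.classifier(h)

logits = TinyCNN()(torch.randn(4, 3, 32, 32))  # -> shape (4, 10)
```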

R-drop: Regularized dropout for neural networks

L Wu, J Li, Y Wang, Q Meng, T Qin… - Advances in …, 2021 - proceedings.neurips.cc
Dropout is a powerful and widely used technique to regularize the training of deep neural
networks. Though effective and performing well, the randomness introduced by dropout …
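
The R-Drop objective is simple to sketch: run each batch through the model twice (dropout yields two different predictions) and add a symmetric KL term to the usual cross-entropy. A minimal PyTorch sketch; the weight alpha is illustrative:

```python
# Hedged sketch of the R-Drop objective: two stochastic forward passes,
# cross-entropy on both, plus a symmetric KL regularizer between them.
import torch
import torch.nn.functional as F

def r_drop_loss(model, x, labels, alpha: float = 1.0):
    logits1, logits2 = model(x), model(x)   # dropout makes these differ
    ce = 0.5 * (F.cross_entropy(logits1, labels) +
                F.cross_entropy(logits2, labels))
    p1, p2 = F.log_softmax(logits1, -1), F.log_softmax(logits2, -1)
    kl = 0.5 * (F.kl_div(p1, p2, log_target=True, reduction="batchmean") +
                F.kl_div(p2, p1, log_target=True, reduction="batchmean"))
    return ce + alpha * kl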

High-performance large-scale image recognition without normalization

A Brock, S De, SL Smith… - … conference on machine …, 2021 - proceedings.mlr.press
Batch normalization is a key component of most image classification models, but it has many
undesirable properties stemming from its dependence on the batch size and interactions …
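
A key ingredient the paper pairs with normalizer-free networks is adaptive gradient clipping (AGC), which rescales a gradient when it is large relative to the weight it updates. A simplified per-tensor sketch in PyTorch; the paper clips unit-wise (per output channel):

```python
# Hedged sketch of adaptive gradient clipping (AGC), simplified to
# per-tensor norms. Clip threshold and eps are illustrative.
import torch

def agc_(params, clip: float = 0.01, eps: float = 1e-3):
    for p in params:
        if p.grad is None:
            continue
        w_norm = p.detach().norm().clamp_min(eps)
        g_norm = p.grad.detach().norm()
        max_norm = clip * w_norm
        if g_norm > max_norm:               # gradient too large relative
            p.grad.mul_(max_norm / g_norm)  # to the weight: rescale it
```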

Sophia: A scalable stochastic second-order optimizer for language model pre-training

H Liu, Z Li, D Hall, P Liang, T Ma - arXiv preprint arXiv:2305.14342, 2023 - arxiv.org
Given the massive cost of language model pre-training, a non-trivial improvement of the
optimization algorithm would lead to a material reduction on the time and cost of training …
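
The shape of one Sophia update is easy to sketch: an EMA of gradients is preconditioned by a diagonal Hessian estimate (refreshed only every few steps in the paper) and then clipped element-wise. A hedged PyTorch sketch with illustrative hyperparameters:

```python
# Hedged sketch of one Sophia parameter update. `h` stands for the
# diagonal Hessian estimate, which the paper refreshes infrequently
# (e.g. via a Gauss-Newton-Bartlett estimator); values are illustrative.
import torch

def sophia_step(p, m, h, lr=1e-4, beta1=0.9, gamma=0.05, eps=1e-12):
    m.mul_(beta1).add_(p.grad, alpha=1 - beta1)            # EMA of gradients
    update = (m / torch.clamp_min(gamma * h, eps)).clamp_(-1.0, 1.0)
    p.data.add_(update, alpha=-lr)                         # clipped, preconditioned step
```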

How can we know what language models know?

Z Jiang, FF Xu, J Araki, G Neubig - Transactions of the Association for …, 2020 - direct.mit.edu
Recent work has presented intriguing results examining the knowledge contained in
language models (LMs) by having the LM fill in the blanks of prompts such as “Obama is a …
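
The probing setup is a cloze prompt answered by a masked LM. A minimal sketch assuming the Hugging Face transformers package; the prompt is illustrative:

```python
# Minimal sketch of cloze-style knowledge probing: ask a masked LM to
# fill the blank and read off its top predictions with probabilities.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for cand in fill("Obama is a [MASK] by profession."):
    print(f'{cand["token_str"]:>12}  p={cand["score"]:.3f}')
```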

Ctrl: A conditional transformer language model for controllable generation

NS Keskar, B McCann, LR Varshney, C Xiong… - arXiv preprint arXiv …, 2019 - arxiv.org
Large-scale language models show promising text generation capabilities, but users cannot
easily control particular aspects of the generated text. We release CTRL, a 1.63 billion …
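
CTRL's conditioning is literally a prefix: a control code prepended to the prompt steers the domain and style of the continuation. A hedged sketch assuming the Hugging Face port of the model; "Links" is one of CTRL's published control codes:

```python
# Hedged sketch of CTRL-style conditioning: the control code is simply
# prepended to the prompt. Assumes the Hugging Face port of CTRL.
from transformers import pipeline

gen = pipeline("text-generation", model="Salesforce/ctrl")
out = gen("Links My favorite optimizer is", max_new_tokens=40)
print(out[0]["generated_text"])
```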

Random feature attention

H Peng, N Pappas, D Yogatama, R Schwartz… - arXiv preprint arXiv …, 2021 - arxiv.org
Transformers are state-of-the-art models for a variety of sequence modeling tasks. At their
core is an attention function which models pairwise interactions between the inputs at every …
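
The paper's core move is to replace exp(q·k) with an inner product of random Fourier features, making attention linear in sequence length. A hedged sketch following the classic Rahimi-Recht construction; details such as gating and exact scaling differ from the paper's recipe:

```python
# Hedged sketch of random-feature attention: softmax attention is
# approximated with random Fourier features, so the key-value summary
# can be computed once and reused, linear in sequence length.
import torch
import torch.nn.functional as F

def phi(x, W):
    # x: (n, d), W: (d, D) with Gaussian entries -> features of size 2D
    proj = x @ W
    return torch.cat([proj.sin(), proj.cos()], dim=-1) / W.shape[1] ** 0.5

def rfa(q, k, v, num_features: int = 64):
    d = q.shape[-1]
    q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)  # paper l2-normalizes q, k
    W = torch.randn(d, num_features)
    fq, fk = phi(q, W), phi(k, W)                # (n, 2D)
    num = fq @ (fk.T @ v)                        # (n, d_v), linear in n
    den = (fq @ fk.sum(0).unsqueeze(-1)).clamp_min(1e-6)  # guard tiny estimates
    return num / den
```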

ERNIE: Enhanced language representation with informative entities

Z Zhang, X Han, Z Liu, X Jiang, M Sun, Q Liu - arXiv preprint arXiv …, 2019 - arxiv.org
Neural language representation models such as BERT pre-trained on large-scale corpora
can well capture rich semantic patterns from plain text, and be fine-tuned to consistently …
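
The core idea is to fuse token representations with embeddings of knowledge-graph entities aligned to them. A hedged sketch of such a fusion, assuming PyTorch; the simple gated sum here is a stand-in for the paper's per-layer aggregator:

```python
# Hedged sketch of entity-informed representation: project aligned
# entity embeddings into the token space and add them where present.
# Shapes and the fusion rule are illustrative, not the paper's exact design.
import torch
import torch.nn as nn

class EntityFusion(nn.Module):
    def __init__(self, hidden: int = 768, ent_dim: int = 100):
        super().__init__()
        self.proj = nn.Linear(ent_dim, hidden)  # entity space -> token space

    def forward(self, tok_h, ent_emb, ent_mask):
        # tok_h: (n, hidden); ent_emb: (n, ent_dim); ent_mask: (n, 1) in {0, 1}
        return tok_h + ent_mask * torch.tanh(self.proj(ent_emb))
```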