Representation learning: A review and new perspectives

Y Bengio, A Courville, P Vincent - IEEE transactions on pattern …, 2013 - ieeexplore.ieee.org
The success of machine learning algorithms generally depends on data representation, and
we hypothesize that this is because different representations can entangle and hide more or …

[PDF] Unsupervised feature learning and deep learning: A review and new perspectives

Y Bengio, AC Courville, P Vincent - CoRR, abs/1206.5538, 2012 - docs.huihoo.com
The success of machine learning algorithms generally depends on data representation, and
we hypothesize that this is because different representations can entangle and hide more or …

Loss functions and metrics in deep learning

J Terven, DM Cordova-Esparza… - arXiv preprint arXiv …, 2023 - arxiv.org
When training or evaluating deep learning models, two essential parts are picking the
proper loss function and deciding on performance metrics. In this paper, we provide a …

BERT has a mouth, and it must speak: BERT as a Markov random field language model

A Wang, K Cho - arXiv preprint arXiv:1902.04094, 2019 - arxiv.org
We show that BERT (Devlin et al., 2018) is a Markov random field language model. This
formulation gives way to a natural procedure to sample sentences from BERT. We generate …

[BOOK] Deep learning

I Goodfellow, Y Bengio, A Courville - 2016 - synapse.koreamed.org
Kwang Gi Kim https://doi.org/10.4258/hir.2016.22.4.351 ing those who are beginning their
careers in deep learning and artificial intelligence research. The other target audience …

Generative flow networks for discrete probabilistic modeling

D Zhang, N Malkin, Z Liu, A Volokhova… - International …, 2022 - proceedings.mlr.press
We present energy-based generative flow networks (EB-GFN), a novel probabilistic
modeling algorithm for high-dimensional discrete data. Building upon the theory of …

Neural autoregressive distribution estimation

B Uria, MA Côté, K Gregor, I Murray… - Journal of Machine …, 2016 - jmlr.org
We present Neural Autoregressive Distribution Estimation (NADE) models, which are neural
network architectures applied to the problem of unsupervised distribution and density …

Deep learning of representations: Looking forward

Y Bengio - International conference on statistical language and …, 2013 - Springer
Deep learning research aims at discovering learning algorithms that discover multiple levels
of distributed representations, with higher levels representing more abstract concepts …

Training restricted Boltzmann machines: An introduction

A Fischer, C Igel - Pattern Recognition, 2014 - Elsevier
Restricted Boltzmann machines (RBMs) are probabilistic graphical models that can
be interpreted as stochastic neural networks. They have attracted much attention as building …

Efficient learning of deep Boltzmann machines

R Salakhutdinov, H Larochelle - Proceedings of the …, 2010 - proceedings.mlr.press
We present a new approximate inference algorithm for Deep Boltzmann Machines (DBM's),
a generative model with many layers of hidden variables. The algorithm learns a separate …