A survey on model-based reinforcement learning
Reinforcement learning (RL) interacts with the environment to solve sequential decision-
making problems via a trial-and-error approach. Errors are always undesirable in real-world …
A review of the Gumbel-max trick and its extensions for discrete stochasticity in machine learning
The Gumbel-max trick is a method to draw a sample from a categorical distribution, given by
its unnormalized (log-) probabilities. Over the past years, the machine learning community …
Training diffusion models with reinforcement learning
Diffusion models are a class of flexible generative models trained with an approximation to
the log-likelihood objective. However, most use cases of diffusion models are not concerned …
Recurrent neural network wave functions
A core technology that has emerged from the artificial intelligence revolution is the recurrent
neural network (RNN). Its unique sequence-based architecture provides a tractable …
Learning generalisable omni-scale representations for person re-identification
An effective person re-identification (re-ID) model should learn feature representations that
are both discriminative, for distinguishing similar-looking people, and generalisable, for …
Global optimality guarantees for policy gradient methods
Policy gradient methods apply to complex, poorly understood control problems by
performing stochastic gradient descent over a parameterized class of policies. Unfortunately …
Optimal experimental design: Formulations and computations
Questions of 'how best to acquire data' are essential to modelling and prediction in the
natural and social sciences, engineering applications, and beyond. Optimal experimental …
Differentiable automatic data augmentation
Data augmentation (DA) techniques aim to increase data variability, and thus train deep
networks with better generalisation. The pioneering AutoAugment automated the search for …
Differentiable quantum architecture search
Quantum architecture search (QAS) is the process of automating architecture engineering of
quantum circuits. It has been desired to construct a powerful and general QAS platform …
Tighter risk certificates for neural networks
This paper presents an empirical study regarding training probabilistic neural networks
using training objectives derived from PAC-Bayes bounds. In the context of probabilistic …