- Academic Search

A survey on model-based reinforcement learning

FM Luo, T Xu, H Lai, XH Chen, W Zhang… - Science China Information …, 2024 - Springer

Reinforcement learning (RL) interacts with the environment to solve sequential decision-
making problems via a trial-and-error approach. Errors are always undesirable in real-world …

Cited by 110

[PDF] arxiv.org

A review of the Gumbel-max trick and its extensions for discrete stochasticity in machine learning

IAM Huijben, W Kool, MB Paulus… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

The Gumbel-max trick is a method to draw a sample from a categorical distribution, given by
its unnormalized (log-) probabilities. Over the past years, the machine learning community …

Cited by 107

[PDF] arxiv.org

Training diffusion models with reinforcement learning

K Black, M Janner, Y Du, I Kostrikov… - arXiv preprint arXiv …, 2023 - arxiv.org

Diffusion models are a class of flexible generative models trained with an approximation to
the log-likelihood objective. However, most use cases of diffusion models are not concerned …

Cited by 232

[PDF] aps.org

Recurrent neural network wave functions

M Hibat-Allah, M Ganahl, LE Hayward, RG Melko… - Physical Review …, 2020 - APS

A core technology that has emerged from the artificial intelligence revolution is the recurrent
neural network (RNN). Its unique sequence-based architecture provides a tractable …

Cited by 277

[PDF] arxiv.org

Learning generalisable omni-scale representations for person re-identification

K Zhou, Y Yang, A Cavallaro… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

An effective person re-identification (re-ID) model should learn feature representations that
are both discriminative, for distinguishing similar-looking people, and generalisable, for …

Cited by 304

[HTML] informs.org

Global optimality guarantees for policy gradient methods

J Bhandari, D Russo - Operations Research, 2024 - pubsonline.informs.org

Policy gradient methods apply to complex, poorly understood control problems by
performing stochastic gradient descent over a parameterized class of policies. Unfortunately …

Cited by 286

[PDF] cambridge.org

Optimal experimental design: Formulations and computations

X Huan, J Jagalur, Y Marzouk - Acta Numerica, 2024 - cambridge.org

Questions of 'how best to acquire data' are essential to modelling and prediction in the
natural and social sciences, engineering applications, and beyond. Optimal experimental …

Cited by 19

[PDF] arxiv.org

Differentiable automatic data augmentation

Y Li, G Hu, Y Wang, T Hospedales… - Computer Vision–ECCV …, 2020 - Springer

Data augmentation (DA) techniques aim to increase data variability, and thus train deep
networks with better generalisation. The pioneering AutoAugment automated the search for …

Cited by 203

[PDF] arxiv.org

Differentiable quantum architecture search

SX Zhang, CY Hsieh, S Zhang… - Quantum Science and …, 2022 - iopscience.iop.org

Quantum architecture search (QAS) is the process of automating architecture engineering of
quantum circuits. It has been desired to construct a powerful and general QAS platform …

Cited by 153

[PDF] jmlr.org

Tighter risk certificates for neural networks

M Pérez-Ortiz, O Rivasplata, J Shawe-Taylor… - Journal of Machine …, 2021 - jmlr.org

This paper presents an empirical study regarding training probabilistic neural networks
using training objectives derived from PAC-Bayes bounds. In the context of probabilistic …

Cited by 132
