A review of the Gumbel-max trick and its extensions for discrete stochasticity in machine learning

IAM Huijben, W Kool, MB Paulus… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
The Gumbel-max trick is a method to draw a sample from a categorical distribution, given by
its unnormalized (log-) probabilities. Over the past years, the machine learning community …

LambdaBeam: Neural program search with higher-order functions and lambdas

K Shi, H Dai, WD Li, K Ellis… - Advances in Neural …, 2023 - proceedings.neurips.cc
Search is an important technique in program synthesis that allows for adaptive strategies
such as focusing on particular search directions based on execution results. Several prior …

Ancestral Gumbel-Top-k sampling for sampling without replacement

W Kool, H Van Hoof, M Welling - Journal of Machine Learning Research, 2020 - jmlr.org
We develop ancestral Gumbel-Top-k sampling: a generic and efficient method for sampling
without replacement from discrete-valued Bayesian networks, which includes multivariate …

Predictive querying for autoregressive neural sequence models

A Boyd, S Showalter, S Mandt… - Advances in Neural …, 2022 - proceedings.neurips.cc
In reasoning about sequential events it is natural to pose probabilistic queries such as
“when will event A occur next” or “what is the probability of A occurring before B”, with …

Conditional Poisson stochastic beams

CI Meister, A Amini, T Vieira… - Proceedings of the …, 2021 - research-collection.ethz.ch
Beam search is the default decoding strategy for many sequence generation tasks in NLP.
The set of approximate K-best items returned by the algorithm is a useful summary of the …

Determinantal beam search

C Meister, M Forster, R Cotterell - arXiv preprint arXiv:2106.07400, 2021 - arxiv.org
Beam search is a go-to strategy for decoding neural sequence models. The algorithm can
naturally be viewed as a subset optimization problem, albeit one where the corresponding …

GraphXForm: Graph transformer for computer-aided molecular design with application to extraction

J Pirnay, JG Rittig, AB Wolf, M Grohe, J Burger… - arXiv preprint arXiv …, 2024 - arxiv.org
Generative deep learning has become pivotal in molecular design for drug discovery and
materials science. A widely used paradigm is to pretrain neural networks on string …

Scaling neural program synthesis with distribution-based search

N Fijalkow, G Lagarde, T Matricon, K Ellis… - Proceedings of the …, 2022 - ojs.aaai.org
We consider the problem of automatically constructing computer programs from input-output
examples. We investigate how to augment probabilistic and neural program synthesis …

Take a step and reconsider: Sequence decoding for self-improved neural combinatorial optimization

J Pirnay, DG Grimm - arXiv preprint arXiv:2407.17206, 2024 - ebooks.iospress.nl
The constructive approach within Neural Combinatorial Optimization (NCO) treats a
combinatorial optimization problem as a finite Markov decision process, where solutions are …

Self-Improvement for Neural Combinatorial Optimization: Sample without Replacement, but Improvement

J Pirnay, DG Grimm - arXiv preprint arXiv:2403.15180, 2024 - arxiv.org
Current methods for end-to-end constructive neural combinatorial optimization usually train
a policy using behavior cloning from expert solutions or policy gradient methods from …