Distributed optimization and statistical learning via the alternating direction method of multipliers
Many problems of recent interest in statistics and machine learning can be posed in the
framework of convex optimization. Due to the explosion in size and complexity of modern …
Theseus: A library for differentiable nonlinear optimization
We present Theseus, an efficient application-agnostic open source library for differentiable
nonlinear least squares (DNLS) optimization built on PyTorch, providing a common …
A tutorial on dual decomposition and Lagrangian relaxation for inference in natural language processing
AM Rush, MJ Collins - Journal of Artificial Intelligence Research, 2012 - jair.org
Dual decomposition, and more generally Lagrangian relaxation, is a classical method for
combinatorial optimization; it has recently been applied to several inference problems in …
Structured pruning of large language models
Large language models have recently achieved state-of-the-art performance across a wide
variety of natural language tasks. Meanwhile, the size of these models and their latency …
Joint extraction of events and entities within a document context
B Yang, T Mitchell - arXiv preprint arXiv:1609.03632, 2016 - arxiv.org
Events and entities are closely related; entities are often actors or participants in events and
events without entities are uncommon. The interpretation of events and entities is highly …
Frame-semantic parsing
Frame semantics is a linguistic theory that has been instantiated for English in the FrameNet
lexicon. We solve the problem of frame-semantic parsing using a two-stage statistical model …
A comparative study of modern inference techniques for structured discrete energy minimization problems
Szeliski et al. published an influential study in 2006 on energy minimization methods for
Markov random fields. This study provided valuable insights in choosing the best …
Online alternating direction method (longer version)
H Wang, A Banerjee - arXiv preprint arXiv:1306.3721, 2013 - arxiv.org
Online optimization has emerged as a powerful tool in large-scale optimization. In this paper,
we introduce efficient online optimization algorithms based on the alternating direction …
End-to-end learning for structured prediction energy networks
D Belanger, B Yang… - … Conference on Machine …, 2017 - proceedings.mlr.press
Structured Prediction Energy Networks (SPENs) are a simple, yet expressive family
of structured prediction models (Belanger and McCallum, 2016). An energy function over …
A comparative study of modern inference techniques for discrete energy minimization problems
Seven years ago, Szeliski et al. published an influential study on energy minimization
methods for Markov random fields (MRF). This study provided valuable insights in choosing …