A survey of deep learning techniques for neural machine translation
In recent years, natural language processing (NLP) has got great development with deep
learning techniques. In the sub-field of machine translation, a new approach named Neural …
Hierarchical Bayesian nonparametric models with applications
Hierarchical modeling is a fundamental concept in Bayesian statistics. The basic idea is that
parameters are endowed with distributions which may themselves introduce new …
Experience grounds language
Language understanding research is held back by a failure to relate language to the
physical world it describes and to the social interactions it facilitates. Despite the incredible …
One billion word benchmark for measuring progress in statistical language modeling
We propose a new benchmark corpus to be used for measuring progress in statistical
language modeling. With almost one billion words of training data, we hope this benchmark …
Unsupervised grouped axial data modeling via hierarchical Bayesian nonparametric models with Watson distributions
This paper aims at proposing an unsupervised hierarchical nonparametric Bayesian
framework for modeling axial data (i.e., observations are axes of direction) that can be …
Between words and characters: A brief history of open-vocabulary modeling and tokenization in NLP
What are the units of text that we want to model? From bytes to multi-word expressions, text
can be analyzed and generated at many granularities. Until recently, most natural language …
Discriminative clustering by regularized information maximization
Is there a principled way to learn a probabilistic discriminative classifier from an unlabeled
data set? We present a framework that simultaneously clusters the data and trains a …
Dirichlet process.
YW Teh - Encyclopedia of machine learning, 2010 - Citeseer
The Dirichlet process is a stochastic process used in Bayesian nonparametric models of data,
particularly in Dirichlet process mixture models (also known as infinite mixture models). It is …
Mondrian forests: Efficient online random forests
Ensembles of randomized decision trees, usually referred to as random forests, are widely
used for classification and regression tasks in machine learning and statistics. Random …
Distance dependent Chinese restaurant processes.
We develop the distance dependent Chinese restaurant process, a flexible class of
distributions over partitions that allows for dependencies between the elements. This class …