Avoiding inferior clusterings with misspecified Gaussian mixture models
Clustering is a fundamental tool for exploratory data analysis, and is ubiquitous across
scientific disciplines. Gaussian Mixture Model (GMM) is a popular probabilistic and …
Parsimonious mixtures of multivariate contaminated normal distributions
A mixture of multivariate contaminated normal distributions is developed for model‐based
clustering. In addition to the parameters of the classical normal mixture, our contaminated …
Robust improper maximum likelihood: tuning, computation, and a comparison with other methods for robust Gaussian clustering
P Coretto, C Hennig - Journal of the American Statistical …, 2016 - Taylor & Francis
The two main topics of this article are the introduction of the “optimally tuned robust improper
maximum likelihood estimator” (OTRIMLE) for robust clustering based on the multivariate …
A multivariate hidden Markov model for the identification of sea regimes from incomplete skewed and circular time series
The identification of sea regimes from environmental multivariate time series is complicated
by the mixed linear–circular support of the data, by the occurrence of missing values, by the …
Addressing overfitting and underfitting in Gaussian model-based clustering
JL Andrews - Computational Statistics & Data Analysis, 2018 - Elsevier
The expectation–maximization (EM) algorithm is a common approach for parameter
estimation in the context of cluster analysis using finite mixture models. This approach …
A globally convergent algorithm for lasso-penalized mixture of linear regression models
Variable selection is an old and pervasive problem in regression analysis. One solution is to
impose a lasso penalty to shrink parameter estimates toward zero and perform continuous …
Anomaly and Novelty detection for robust semi-supervised learning
Three important issues are often encountered in Supervised and Semi-Supervised
Classification: class memberships are unreliable for some training units (label noise), a …
A hidden Markov approach to the analysis of space–time environmental data with linear and circular components
The analysis of bivariate space–time series with linear and circular components is
complicated by (1) multiple correlations, across time, space and between variables, (2) …
Consistency, breakdown robustness, and algorithms for robust improper maximum likelihood clustering
P Coretto, C Hennig - Journal of Machine Learning Research, 2017 - jmlr.org
The robust improper maximum likelihood estimator (RIMLE) is a new method for robust
multivariate clustering finding approximately Gaussian clusters. It maximizes a …
A general hidden state random walk model for animal movement
A general hidden state random walk model is proposed to describe the movement of an
animal that takes into account movement taxis with respect to features of the environment. A …