Transforming variables to central normality
Many real data sets contain numerical features (variables) whose distribution is far from
normal (Gaussian). Instead, their distribution is often skewed. In order to handle such data it …
The cellwise minimum covariance determinant estimator
The usual Minimum Covariance Determinant (MCD) estimator of a covariance
matrix is robust against casewise outliers. These are cases (that is, rows of the data matrix) …
GenerativeMTD: A deep synthetic data generation framework for small datasets
J Sivakumar, K Ramamurthy, M Radhakrishnan… - Knowledge-Based …, 2023 - Elsevier
Synthetic data generation for tabular data, unlike computer vision, is an emerging challenge.
When tabular data needs to be synthesized, it either faces a small dataset problem or …
The R Package Ecosystem for Robust Statistics
V Todorov - Wiley Interdisciplinary Reviews: Computational …, 2024 - Wiley Online Library
In the last few years, the number of R packages implementing different robust statistical
methods has increased substantially. There are now numerous packages for computing …
Robust discriminant analysis
Discriminant analysis (DA) is one of the most popular methods for classification due to its
conceptual simplicity, low computational cost, and often solid performance. In its standard …
Challenges of cellwise outliers
It is well-known that real data often contain outliers. The term outlier usually refers to a case,
usually denoted by a row of the n × d data matrix. In recent times a different type has come …
Fast robust correlation for high-dimensional data
The product moment covariance matrix is a cornerstone of multivariate data analysis, from
which one can derive correlations, principal components, Mahalanobis distances and many …
MacroPCA: An all-in-one PCA method allowing for missing values as well as cellwise and rowwise outliers
Multivariate data are typically represented by a rectangular matrix (table) in which the rows
are the objects (cases) and the columns are the variables (measurements). When there are …
Multivariate outlier detection in applied data analysis: global, local, compositional and cellwise outliers
Outliers are encountered in all practical situations of data analysis, regardless of the
discipline of application. However, the term outlier is not uniformly defined across all these …
Noise simulation in classification with the noisemodel R package: Applications analyzing the impact of errors with chemical data
JA Sáez - Journal of Chemometrics, 2023 - Wiley Online Library
Classification datasets created from chemical processes can be affected by errors, which
impair the accuracy of the models built. This fact highlights the importance of analyzing the …