Recent advances and emerging challenges of feature selection in the context of big data

V Bolón-Canedo, N Sánchez-Maroño… - Knowledge-based …, 2015 - Elsevier
In an era of growing data complexity and volume and the advent of big data, feature
selection has a key role to play in helping reduce high-dimensionality in machine learning …

Feature selection for clustering: A review

S Alelyani, J Tang, H Liu - Data Clustering, 2018 - taylorfrancis.com
Dimensionality reduction techniques can be categorized mainly into feature extraction and
feature selection. In the feature extraction approach, features are projected into a new space …

Whale optimization approaches for wrapper feature selection

M Mafarja, S Mirjalili - Applied Soft Computing, 2018 - Elsevier
Classification accuracy depends heavily on the nature of the features in a dataset, which
may contain irrelevant or redundant data. The main aim of feature selection is to eliminate …

Hybrid whale optimization algorithm with simulated annealing for feature selection

MM Mafarja, S Mirjalili - Neurocomputing, 2017 - Elsevier
Hybrid metaheuristics are among the most interesting recent trends in optimization and memetic
algorithms. In this paper, two hybridization models are used to design different feature …

Explainable k-means and k-medians clustering

M Moshkovitz, S Dasgupta… - … on machine learning, 2020 - proceedings.mlr.press
Many clustering algorithms lead to cluster assignments that are hard to explain, partially
because they depend on all the features of the data in a complicated way. To improve …

Randomized algorithms for matrices and data

MW Mahoney - Foundations and Trends® in Machine …, 2011 - nowpublishers.com
Randomized algorithms for very large matrix problems have received a great deal of
attention in recent years. Much of this work was motivated by problems in large-scale data …

Turning Big Data Into Tiny Data: Constant-Size Coresets for k-Means, PCA, and Projective Clustering

D Feldman, M Schmidt, C Sohler - SIAM Journal on Computing, 2020 - SIAM
We develop and analyze a method to reduce the size of a very large set of data points in a
high-dimensional Euclidean space R^d to a small set of weighted points such that the result …

Dimensionality reduction for k-means clustering and low rank approximation

MB Cohen, S Elder, C Musco, C Musco… - Proceedings of the forty …, 2015 - dl.acm.org
We show how to approximate a data matrix A with a much smaller sketch Ã that can be
used to solve a general class of constrained k-rank approximation problems to within (1+ε) …

An efficient approximation to the K-means clustering for massive data

M Capó, A Pérez, JA Lozano - Knowledge-Based Systems, 2017 - Elsevier
Due to the progressive growth of the amount of data available in a wide variety of scientific
fields, it has become more difficult to manipulate and analyze such information. In spite of its …