Privacy-preserving data publishing: A survey of recent developments

BCM Fung, K Wang, R Chen, PS Yu - ACM Computing Surveys (CSUR), 2010 - dl.acm.org
The collection of digital information by governments, corporations, and individuals has
created tremendous opportunities for knowledge- and information-based decision making …

A hybrid particle swarm optimization for feature subset selection by integrating a novel local search strategy

P Moradi, M Gholampour - Applied Soft Computing, 2016 - Elsevier
Feature selection has been widely used in data mining and machine learning tasks to make
a model with a small number of features which improves the classifier's accuracy. In this …

[BOOK] Evolutionary algorithms for solving multi-objective problems

CAC Coello - 2007 - Springer
Problems with multiple objectives arise in a natural fashion in most disciplines and their
solution has been a challenge to researchers for a long time. Despite the considerable …

Discrimination-aware data mining

D Pedreshi, S Ruggieri, F Turini - Proceedings of the 14th ACM SIGKDD …, 2008 - dl.acm.org
In the context of civil rights law, discrimination refers to unfair or unequal treatment of people
based on membership to a category or a minority, without regard to individual merit. Rules …

A methodology for direct and indirect discrimination prevention in data mining

S Hajian, J Domingo-Ferrer - IEEE Transactions on Knowledge …, 2012 - ieeexplore.ieee.org
Data mining is an increasingly important technology for extracting useful knowledge hidden
in large collections of data. There are, however, negative social perceptions about data …

An interior-point method for large-scale l1-regularized logistic regression

K Koh, SJ Kim, S Boyd - Journal of Machine Learning Research, 2007 - jmlr.org
Logistic regression with l1 regularization has been proposed as a promising method for
feature selection in classification problems. In this paper we describe an efficient interior …

Super learner in prediction

EC Polley, MJ Van der Laan - 2010 - biostats.bepress.com
Super learning is a general loss based learning method that has been proposed and
analyzed theoretically in van der Laan et al. (2007). In this article we consider super learning …

A new local search based hybrid genetic algorithm for feature selection

MM Kabir, M Shahjahan, K Murase - Neurocomputing, 2011 - Elsevier
This paper presents a new hybrid genetic algorithm (HGA) for feature selection (FS), called
as HGAFS. The vital aspect of this algorithm is the selection of salient feature subset within a …

K-means clustering versus validation measures: a data distribution perspective

H Xiong, J Wu, J Chen - Proceedings of the 12th ACM SIGKDD …, 2006 - dl.acm.org
K-means is a widely used partitional clustering method. While there are considerable
research efforts to characterize the key features of K-means clustering, further investigation …

Cohen's kappa coefficient as a performance measure for feature selection

SM Vieira, U Kaymak… - International conference on …, 2010 - ieeexplore.ieee.org
Measuring the performance of a given classifier is not a straightforward or easy task.
Depending on the application, the overall classification rate may not be sufficient if one, or …