An overview of classification algorithms for imbalanced datasets

V Ganganwar - International Journal of Emerging Technology and …, 2012 - researchgate.net
Unbalanced data set, a problem often found in real world application, can cause seriously
negative effect on classification performance of machine learning algorithms. There have …

A survey of cost-sensitive decision tree induction algorithms

S Lomax, S Vadera - ACM Computing Surveys (CSUR), 2013 - dl.acm.org
The past decade has seen a significant interest on the problem of inducing decision trees
that take account of costs of misclassification and costs of acquiring the features used for …

Mahakil: Diversity based oversampling approach to alleviate the class imbalance issue in software defect prediction

KE Bennin, J Keung, P Phannachitta… - IEEE Transactions …, 2017 - ieeexplore.ieee.org
Highly imbalanced data typically make accurate predictions difficult. Unfortunately, software
defect datasets tend to have fewer defective modules than non-defective modules. Synthetic …

SMOTE: synthetic minority over-sampling technique

NV Chawla, KW Bowyer, LO Hall… - Journal of artificial …, 2002 - jair.org
An approach to the construction of classifiers from imbalanced datasets is described. A
dataset is imbalanced if the classification categories are not approximately equally …

The class imbalance problem: A systematic study

N Japkowicz, S Stephen - Intelligent data analysis, 2002 - content.iospress.com
In machine learning problems, differences in prior class probabilities--or class imbalances--
have been reported to hinder the performance of some standard classifiers, such as …

Addressing the curse of imbalanced training sets: one-sided selection

M Kubat, S Matwin - ICML, 1997 - Citeseer
Adding examples of the majority class to the training set can have a detrimental effect on the
learner's behavior: noisy or otherwise unreliable examples from the majority class can …

[BOOK] Data mining with decision trees: theory and applications

OZ Maimon, L Rokach - 2014 - books.google.com
Decision trees have become one of the most powerful and popular approaches in
knowledge discovery and data mining; it is the science of exploring large and complex …

An empirical comparison of voting classification algorithms: Bagging, boosting, and variants

E Bauer, R Kohavi - Machine learning, 1999 - Springer
Methods for voting classification algorithms, such as Bagging and AdaBoost, have been
shown to be very successful in improving the accuracy of certain classifiers for artificial and …

Safe-level-smote: Safe-level-synthetic minority over-sampling technique for handling the class imbalanced problem

C Bunkhumpornpat, K Sinapiromsaran… - Advances in knowledge …, 2009 - Springer
The class imbalanced problem occurs in various disciplines when one of target classes has
a tiny number of instances comparing to other classes. A typical classifier normally ignores …

C4.5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling

C Drummond, RC Holte - … on learning from imbalanced datasets II, 2003 - eiti.uottawa.ca
This paper takes a new look at two sampling schemes commonly used to adapt machine
learning algorithms to imbalanced classes and misclassification costs. It uses a performance …