An overview of classification algorithms for imbalanced datasets
V Ganganwar - International Journal of Emerging Technology and …, 2012 - researchgate.net
Unbalanced data set, a problem often found in real world application, can cause seriously
negative effect on classification performance of machine learning algorithms. There have …
A survey of cost-sensitive decision tree induction algorithms
S Lomax, S Vadera - ACM Computing Surveys (CSUR), 2013 - dl.acm.org
The past decade has seen a significant interest on the problem of inducing decision trees
that take account of costs of misclassification and costs of acquiring the features used for …
Mahakil: Diversity based oversampling approach to alleviate the class imbalance issue in software defect prediction
Highly imbalanced data typically make accurate predictions difficult. Unfortunately, software
defect datasets tend to have fewer defective modules than non-defective modules. Synthetic …
SMOTE: synthetic minority over-sampling technique
An approach to the construction of classifiers from imbalanced datasets is described. A
dataset is imbalanced if the classification categories are not approximately equally …
The class imbalance problem: A systematic study
N Japkowicz, S Stephen - Intelligent data analysis, 2002 - content.iospress.com
In machine learning problems, differences in prior class probabilities--or class imbalances--
have been reported to hinder the performance of some standard classifiers, such as …
Addressing the curse of imbalanced training sets: one-sided selection
M Kubat, S Matwin - ICML, 1997 - Citeseer
Adding examples of the majority class to the training set can have a detrimental effect on the
learner's behavior: noisy or otherwise unreliable examples from the majority class can …
[BOOK][B] Data mining with decision trees: theory and applications
Decision trees have become one of the most powerful and popular approaches in
knowledge discovery and data mining; it is the science of exploring large and complex …
An empirical comparison of voting classification algorithms: Bagging, boosting, and variants
E Bauer, R Kohavi - Machine learning, 1999 - Springer
Methods for voting classification algorithms, such as Bagging and AdaBoost, have been
shown to be very successful in improving the accuracy of certain classifiers for artificial and …
Safe-level-smote: Safe-level-synthetic minority over-sampling technique for handling the class imbalanced problem
The class imbalanced problem occurs in various disciplines when one of target classes has
a tiny number of instances comparing to other classes. A typical classifier normally ignores …
C4.5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling
This paper takes a new look at two sampling schemes commonly used to adapt machine
learning algorithms to imbalanced classes and misclassification costs. It uses a performance …