SMOTE for learning from imbalanced data: progress and challenges, marking the 15-year anniversary

A Fernández, S Garcia, F Herrera, NV Chawla - Journal of artificial …, 2018 - jair.org
The Synthetic Minority Oversampling Technique (SMOTE) preprocessing algorithm is
considered "de facto" standard in the framework of learning from imbalanced data. This is …

A survey of predictive modeling on imbalanced domains

P Branco, L Torgo, RP Ribeiro - ACM computing surveys (CSUR), 2016 - dl.acm.org
Many real-world data-mining applications involve obtaining predictive models using
datasets with strongly imbalanced distributions of the target variable. Frequently, the least …

Predictive performance of presence‐only species distribution models: a benchmark study with reproducible code

R Valavi, G Guillera‐Arroita… - Ecological …, 2022 - Wiley Online Library
Species distribution modeling (SDM) is widely used in ecology and conservation. Currently,
the most available data for SDM are species presence‐only records (available through …

The class imbalance problem in deep learning

K Ghosh, C Bellinger, R Corizzo, P Branco… - Machine Learning, 2024 - Springer
Deep learning has recently unleashed the ability for machine learning (ML) to make
unparalleled strides. It did so by confronting and successfully addressing, at least to a …

On the class overlap problem in imbalanced data classification

P Vuttipittayamongkol, E Elyan, A Petrovski - Knowledge-based systems, 2021 - Elsevier
Class imbalance is an active research area in the machine learning community. However,
existing and recent literature showed that class overlap had a higher negative impact on the …

A hybrid method with dynamic weighted entropy for handling the problem of class imbalance with overlap in credit card fraud detection

Z Li, M Huang, G Liu, C Jiang - Expert Systems with Applications, 2021 - Elsevier
Class imbalance with overlap is a very challenging problem in electronic fraud transaction
detection. Fraudsters have racked their brains to make a fraud transaction as similar as a …

GHOST: adjusting the decision threshold to handle imbalanced data in machine learning

C Esposito, GA Landrum, N Schneider… - Journal of Chemical …, 2021 - ACS Publications
Machine learning classifiers trained on class imbalanced data are prone to overpredict the
majority class. This leads to a larger misclassification rate for the minority class, which in …

A systematic review on imbalanced learning methods in intelligent fault diagnosis

Z Ren, T Lin, K Feng, Y Zhu, Z Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
The theoretical developments of data-driven fault diagnosis methods have yielded fruitful
achievements and significantly benefited industry practices. However, most methods are …

Modelling species presence‐only data with random forests

R Valavi, J Elith, JJ Lahoz‐Monfort… - Ecography, 2021 - Wiley Online Library
The random forest (RF) algorithm is an ensemble of classification or regression trees and is
widely used, including for species distribution modelling (SDM). Many researchers use …

On the joint-effect of class imbalance and overlap: a critical review

MS Santos, PH Abreu, N Japkowicz… - Artificial Intelligence …, 2022 - Springer
Current research on imbalanced data recognises that class imbalance is aggravated by
other data intrinsic characteristics, among which class overlap stands out as one of the most …