SMOTE for learning from imbalanced data: progress and challenges, marking the 15-year anniversary

A Fernández, S Garcia, F Herrera, NV Chawla - Journal of artificial …, 2018 - jair.org
The Synthetic Minority Oversampling Technique (SMOTE) preprocessing algorithm is
considered "de facto" standard in the framework of learning from imbalanced data. This is …

Statistical and machine learning models in credit scoring: A systematic literature survey

X Dastile, T Celik, M Potsane - Applied Soft Computing, 2020 - Elsevier
In practice, as a well-known statistical method, the logistic regression model is used to
evaluate the credit-worthiness of borrowers due to its simplicity and transparency in …

Improving imbalanced learning through a heuristic oversampling method based on k-means and SMOTE

G Douzas, F Bacao, F Last - Information Sciences, 2018 - Elsevier
Learning from class-imbalanced data continues to be a common and challenging problem in
supervised learning as standard classification algorithms are designed to handle balanced …

Effective data generation for imbalanced learning using conditional generative adversarial networks

G Douzas, F Bacao - Expert Systems with Applications, 2018 - Elsevier
Learning from imbalanced datasets is a frequent but challenging task for standard
classification algorithms. Although there are different strategies to address this problem …

An empirical comparison and evaluation of minority oversampling techniques on a large number of imbalanced datasets

G Kovács - Applied Soft Computing, 2019 - Elsevier
Learning and mining from imbalanced datasets gained increased interest in recent years.
One simple but efficient way to increase the performance of standard machine learning …

Cross-validation for imbalanced datasets: avoiding overoptimistic and overfitting approaches [research frontier]

MS Santos, JP Soares, PH Abreu… - IEEE Computational …, 2018 - ieeexplore.ieee.org
Although cross-validation is a standard procedure for performance evaluation, its joint
application with oversampling remains an open question for researchers farther from the …

A cluster-based oversampling algorithm combining SMOTE and k-means for imbalanced medical data

Z Xu, D Shen, T Nie, Y Kou, N Yin, X Han - Information Sciences, 2021 - Elsevier
The algorithm of C4.5 decision tree has the advantages of high classification accuracy, fast
calculation speed and comprehensible classification rules, so it is widely used for medical …

Stop oversampling for class imbalance learning: A review

AS Tarawneh, AB Hassanat, GA Altarawneh… - IEEE …, 2022 - ieeexplore.ieee.org
For the last two decades, oversampling has been employed to overcome the challenge of
learning from imbalanced datasets. Many approaches to solving this challenge have been …

Geometric SMOTE a geometrically enhanced drop-in replacement for SMOTE

G Douzas, F Bacao - Information Sciences, 2019 - Elsevier
Classification of imbalanced datasets is a challenging task for standard algorithms. Although
many methods exist to address this problem in different ways, generating artificial data for …

GAN-based synthetic brain PET image generation

J Islam, Y Zhang - Brain Informatics, 2020 - Springer
In recent days, deep learning technologies have achieved tremendous success in computer
vision-related tasks with the help of large-scale annotated dataset. Obtaining such dataset …