A review of ensemble learning and data augmentation models for class imbalanced problems: Combination, implementation and evaluation

AA Khan, O Chaudhari, R Chandra - Expert Systems with Applications, 2024 - Elsevier
Class imbalance (CI) in classification problems arises when the number of observations
belonging to one class is lower than the other. Ensemble learning combines multiple models …

Converting nanotoxicity data to information using artificial intelligence and simulation

X Yan, T Yue, DA Winkler, Y Yin, H Zhu… - Chemical …, 2023 - ACS Publications
Decades of nanotoxicology research have generated extensive and diverse data sets.
However, data is not equal to information. The question is how to extract critical information …

A comparison of undersampling, oversampling, and SMOTE methods for dealing with imbalanced classification in educational data mining

T Wongvorachan, S He, O Bulut - Information, 2023 - mdpi.com
Educational data mining is capable of producing useful data-driven applications (e.g., early
warning systems in schools or the prediction of students' academic achievement) based on …

A survey on imbalanced learning: latest research, applications and future directions

W Chen, K Yang, Z Yu, Y Shi, CLP Chen - Artificial Intelligence Review, 2024 - Springer
Imbalanced learning constitutes one of the most formidable challenges within data mining
and machine learning. Despite continuous research advancement over the past decades …

DeepSMOTE: Fusing deep learning and SMOTE for imbalanced data

D Dablain, B Krawczyk… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Despite over two decades of progress, imbalanced data is still considered a significant
challenge for contemporary machine learning models. Modern advances in deep learning …

SHAP and LIME: an evaluation of discriminative power in credit risk

A Gramegna, P Giudici - Frontiers in Artificial Intelligence, 2021 - frontiersin.org
In credit risk estimation, the most important element is obtaining a probability of default as
close as possible to the effective risk. This effort quickly prompted new, powerful algorithms …

Effect of dataset size and train/test split ratios in QSAR/QSPR multiclass classification

A Rácz, D Bajusz, K Héberger - Molecules, 2021 - mdpi.com
Applied datasets can vary from a few hundred to thousands of samples in typical quantitative
structure-activity/property (QSAR/QSPR) relationships and classification. However, the size …

Review of classification methods on unbalanced data sets

L Wang, M Han, X Li, N Zhang, H Cheng - IEEE Access, 2021 - ieeexplore.ieee.org
This paper studies the classification of unbalanced data sets. First, this kind of data sets is
briefly introduced, and then the classification methods of unbalanced data sets are analyzed …

Impact of SMOTE on imbalanced text features for toxic comments classification using RVVC model

V Rupapara, F Rustam, HF Shahzad… - IEEE …, 2021 - ieeexplore.ieee.org
Social media platforms and microblogging websites have gained accelerated popularity
during the past few years. These platforms are used for expressing views and opinions …

Improving imbalanced learning through a heuristic oversampling method based on k-means and SMOTE

G Douzas, F Bacao, F Last - Information Sciences, 2018 - Elsevier
Learning from class-imbalanced data continues to be a common and challenging problem in
supervised learning as standard classification algorithms are designed to handle balanced …