SMOTE for learning from imbalanced data: progress and challenges, marking the 15-year anniversary

A Fernández, S Garcia, F Herrera, NV Chawla - Journal of artificial …, 2018 - jair.org
The Synthetic Minority Oversampling Technique (SMOTE) preprocessing algorithm is
considered "de facto" standard in the framework of learning from imbalanced data. This is …

A review on classification of imbalanced data for wireless sensor networks

H Patel, D Singh Rajput… - International …, 2020 - journals.sagepub.com
Classification of imbalanced data is a vastly explored issue of the last and present decade
and still keeps the same importance because data are an essential term today and it …

On the performance of Matthews correlation coefficient (MCC) for imbalanced dataset

Q Zhu - Pattern Recognition Letters, 2020 - Elsevier
The Matthews Correlation Coefficient (MCC) is one of the popular measurements
for classification accuracy. It has been generally regarded as a balanced measure which …

Classification with class imbalance problem

A Ali, SM Shamsuddin, AL Ralescu - Int. J. Advance Soft Compu …, 2013 - researchgate.net
Most existing classification approaches assume the underlying training set is evenly
distributed. In class imbalanced classification, the training set for one class (majority) far …

A novel ensemble method for classifying imbalanced data

Z Sun, Q Song, X Zhu, H Sun, B Xu, Y Zhou - Pattern Recognition, 2015 - Elsevier
The class imbalance problems have been reported to severely hinder classification
performance of many standard learning algorithms, and have attracted a great deal of …

An improved and random synthetic minority oversampling technique for imbalanced data

G Wei, W Mu, Y Song, J Dou - Knowledge-based systems, 2022 - Elsevier
Imbalanced data learning has become a major challenge in data mining and machine
learning. Oversampling is an effective way to re-achieve the balance by generating new …

Understanding the apparent superiority of over-sampling through an analysis of local information for class-imbalanced data

V García, JS Sánchez, AI Marqués, R Florencia… - Expert Systems with …, 2020 - Elsevier
Data plays a key role in the design of expert and intelligent systems and therefore, data
preprocessing appears to be a critical step to produce high-quality data and build accurate …

ForesTexter: An efficient random forest algorithm for imbalanced text categorization

Q Wu, Y Ye, H Zhang, MK Ng, SS Ho - Knowledge-Based Systems, 2014 - Elsevier
In this paper, we propose a new random forest (RF) based ensemble method, ForesTexter,
to solve the imbalanced text categorization problems. RF has shown great success in many …

Data imbalance: effects and solutions for classification of large and highly imbalanced data

A Somasundaram, US Reddy - International Conference on …, 2016 - researchgate.net
Big Data and Big Data Analytics has gained huge prominence over the recent years. Their
ability to handle massive amounts of data at ease has made them the preferred choice for …

Discovery of urinary biosignatures for tuberculosis and nontuberculous mycobacteria classification using metabolomics and machine learning

NK Anh, NK Phat, NQ Thu, NTN Tien, C Eunsu… - Scientific Reports, 2024 - nature.com
Nontuberculous mycobacteria (NTM) infection diagnosis remains a challenge due to its
overlapping clinical symptoms with tuberculosis (TB), leading to inappropriate treatment …