Imbalanced-learn: A python toolbox to tackle the curse of imbalanced datasets in machine learning

G Lemaître, F Nogueira, CK Aridas - Journal of machine learning …, 2017 - jmlr.org
imbalanced-learn is an open-source python toolbox aiming at providing a wide range of
methods to cope with the problem of imbalanced dataset frequently encountered in machine …

A machine learning method for classification of cervical cancer

JJ Tanimu, M Hamada, M Hassan, H Kakudi… - Electronics, 2022 - mdpi.com
Cervical cancer is one of the leading causes of premature mortality among women
worldwide and more than 85% of these deaths are in developing countries. There are …

Data sampling methods to deal with the big data multi-class imbalance problem

E Rendon, R Alejo, C Castorena, FJ Isidro-Ortega… - Applied Sciences, 2020 - mdpi.com
The class imbalance problem has been a hot topic in the machine learning community in
recent years. Nowadays, in the time of big data and deep learning, this problem remains in …

Training and evaluating machine learning algorithms for ocean microplastics classification through vibrational spectroscopy

H de Medeiros Back, ECV Junior, OE Alarcon… - Chemosphere, 2022 - Elsevier
Microplastics are contaminants of emerging concern, not only environmental, but also to
human health. Characterizing them is of fundamental importance to evaluate their potential …

Measuring data

M Mitchell, AS Luccioni, N Lambert, M Gerchick… - arxiv preprint arxiv …, 2022 - arxiv.org
We identify the task of measuring data to quantitatively characterize the composition of
machine learning data and datasets. Similar to an object's height, width, and volume, data …

Uncertainty based under-sampling for learning naive bayes classifiers under imbalanced data sets

CK Aridas, S Karlos, VG Kanas, N Fazakis… - IEEE …, 2019 - ieeexplore.ieee.org
In many real world classification tasks, all data classes are not represented equally. This
problem, known also as the curse of class imbalanced in data sets, has a potential impact in …

The proposal of undersampling method for learning from imbalanced datasets

M Bach, A Werner, M Palt - Procedia Computer Science, 2019 - Elsevier
Highly imbalanced data, which occurs in many real-world applications, often makes
machine-based processing difficult or even impossible. The over-and under-sampling …

A survey of machine learning methods and challenges for windows malware classification

E Raff, C Nicholas - arxiv preprint arxiv:2006.09271, 2020 - arxiv.org
Malware classification is a difficult problem, to which machine learning methods have been
applied for decades. Yet progress has often been slow, in part due to a number of unique …

Combining over-sampling and under-sampling techniques for imbalance dataset

N Junsomboon, T Phienthrakul - … of the 9th international conference on …, 2017 - dl.acm.org
An important problem in medical data analysis is imbalance dataset. This problem is a
cause of diagnostic mistake. The results of diagnostic affect to life of patients. If a doctor fails …

Type 2 diabetes mellitus screening and risk factors using decision tree: results of data mining

S Habibi, M Ahmadi, S Alizadeh - Global journal of health science, 2015 - ncbi.nlm.nih.gov
Objectives: The aim of this study was to examine a predictive model using features related to
the diabetes type 2 risk factors. Methods: The data were obtained from a database in a …