On the joint-effect of class imbalance and overlap: a critical review
Current research on imbalanced data recognises that class imbalance is aggravated by
other data intrinsic characteristics, among which class overlap stands out as one of the most …
How complex is your classification problem? a survey on measuring classification complexity
Characteristics extracted from the training datasets of classification problems have proven to
be effective predictors in a number of meta-analyses. Among them, measures of …
A review of microarray datasets and applied feature selection methods
Microarray data classification is a difficult challenge for machine learning researchers due to
its high number of features and the small sample sizes. Feature selection has been soon …
Microarray cancer feature selection: Review, challenges and research directions
Microarray technology has become an emerging trend in the domain of genetic research in
which many researchers employ to study and investigate the levels of genes' expression in a …
Impact of missing data imputation methods on gene expression clustering and classification
Background Several missing value imputation methods for gene expression data have been
proposed in the literature. In the past few years, researchers have been putting a great deal …
Predicting noise filtering efficacy with data complexity measures for nearest neighbor classification
Classifier performance, particularly of instance-based learners such as k-nearest neighbors,
is affected by the presence of noisy data. Noise filters are traditionally employed to remove …
Centralized vs. distributed feature selection methods based on data complexity measures
In the era of Big Data, many datasets have a common characteristic, the large number of
features. As a result, selecting the relevant features and ignoring the irrelevant and …
A framework model using multifilter feature selection to enhance colon cancer classification
Gene expression profiles can be utilized in the diagnosis of critical diseases such as cancer.
The selection of biomarker genes from these profiles is significant and crucial for cancer …
Redundancy and complexity metrics for big data classification: towards smart data
It is recognized the importance of knowing the descriptive properties of a dataset when
tackling a data science problem. Having information about the redundancy, complexity and …
A novel ECOC algorithm for multiclass microarray data classification based on data complexity analysis
MX Sun, KH Liu, QQ Wu, QQ Hong, BZ Wang… - Pattern Recognition, 2019 - Elsevier
Nowadays, a lot of new classification and clustering techniques have been proposed for
microarray data analysis. However, the multiclass microarray data classification is still …