Dealing with noise problem in machine learning data-sets: A systematic review
The occurrences of noisy data in data set can significantly impact prediction of any
meaningful information. Many empirical studies have shown that noise in data set …
How complex is your classification problem? a survey on measuring classification complexity
Characteristics extracted from the training datasets of classification problems have proven to
be effective predictors in a number of meta-analyses. Among them, measures of …
Transforming big data into smart data: An insight on the use of the k‐nearest neighbors algorithm to obtain quality data
The k‐nearest neighbors algorithm is characterized as a simple yet effective data mining
technique. The main drawback of this technique appears when massive amounts of data …
Dynamic selection of normalization techniques using data complexity measures
S Jain, S Shukla, R Wadhvani - Expert Systems with Applications, 2018 - Elsevier
Data preprocessing is an important step for designing classification model. Normalization is
one of the preprocessing techniques used to handle the out-of-bounds attributes. This work …
Effect of training class label noise on classification performances for land cover mapping with satellite image time series
Supervised classification systems used for land cover mapping require accurate reference
databases. These reference data come generally from different sources such as field …
Machine learning algorithms for smart data analysis in internet of things environment: taxonomies and research trends
Machine learning techniques will contribute towards making Internet of Things (IoT)
symmetric applications among the most significant sources of new data in the future. In this …
Combined cleaning and resampling algorithm for multi-class imbalanced data with label noise
The imbalanced data classification is one of the most crucial tasks facing modern data
analysis. Especially when combined with other difficulty factors, such as the presence of …
Application of hybrid artificial neural networks for predicting rate of penetration (ROP): A case study from Marun oil field
Rate of Penetration (ROP) can be considered as a crucial factor in optimization and cost
minimization of drilling operations. In order to predict ROP with satisfactory precision, some …
The impact of inconsistent human annotations on AI driven clinical decision making
In supervised learning model development, domain experts are often used to provide the
class labels (annotations). Annotation inconsistencies commonly occur when even highly …
Active learning for network traffic classification: a technical study
Network Traffic Classification (NTC) has become an important feature in various network
management operations, e.g., Quality of Service (QoS) provisioning and security services …