Big data preprocessing: methods and prospects
The massive growth in the scale of data has been observed in recent years being a key
factor of the Big Data scenario. Big Data can be defined as high volume, velocity and variety …
Pattern classification with missing data: a review
Pattern classification has been successfully applied in many problem domains, such as
biometric recognition, document classification or medical diagnosis. Missing or unknown …
PAPILA: Dataset with fundus images and clinical data of both eyes of the same patient for glaucoma assessment
Glaucoma is one of the ophthalmological diseases that frequently causes loss of vision in
today's society. Previous studies assess which anatomical parameters of the optic nerve can …
Handling data irregularities in classification: Foundations, trends, and future challenges
Most of the traditional pattern classifiers assume their input data to be well-behaved in terms
of similar underlying class distributions, balanced size of classes, the presence of a full set of …
[BOOK][B] Dimensionality reduction with unsupervised nearest neighbors
O Kramer - 2013 - Springer
The growing information infrastructure in a variety of disciplines involves an increasing
requirement for efficient data mining techniques. Fast dimensionality reduction methods are …
Estimating conversion rate in display advertising from past performance data
In targeted display advertising, the goal is to identify the best opportunities to display a
banner ad to an online user who is most likely to take a desired action such as purchasing a …
Hybrid prediction model with missing value imputation for medical data
Accurate prediction in the presence of large number of missing values in the data set has
always been a challenging problem. Most of hybrid models to address this challenge have …
Anchors bring ease: An embarrassingly simple approach to partial multi-view clustering
Clustering on multi-view data has attracted much more attention in the past decades. Most
previous studies assume that each instance appears in all views, or there is at least one …
Identifying depression in the National Health and Nutrition Examination Survey data using a deep learning algorithm
Background As depression is the leading cause of disability worldwide, large-scale surveys
have been conducted to establish the occurrence and risk factors of depression. However …
Network-based high level data classification
Traditional supervised data classification considers only physical features (e.g., distance or
similarity) of the input data. Here, this type of learning is called low level classification. On …