A review of feature selection methods for machine learning-based disease risk prediction

N Pudjihartono, T Fadason, AW Kempa-Liehr… - Frontiers in …, 2022 - frontiersin.org
Machine learning has shown utility in detecting patterns within large, unstructured, and
complex datasets. One of the promising applications of machine learning is in precision …

Variable importance analysis: A comprehensive review

P Wei, Z Lu, J Song - Reliability Engineering & System Safety, 2015 - Elsevier
Measuring variable importance for computational models or measured data is an important
task in many applications. It has drawn our attention that the variable importance analysis …

Using recursive feature elimination in random forest to account for correlated variables in high dimensional data

BF Darst, KC Malecki, CD Engelman - BMC Genetics, 2018 - Springer
Background: Random forest (RF) is a machine-learning method that generally works well
with high-dimensional problems and allows for nonlinear relationships between predictors; …

Comparing methods for detecting multilocus adaptation with multivariate genotype–environment associations

BR Forester, JR Lasky, HH Wagner… - Molecular …, 2018 - Wiley Online Library
Identifying adaptive loci can provide insight into the mechanisms underlying local
adaptation. Genotype–environment association (GEA) methods, which identify these loci …

Detecting epistasis in human complex traits

WH Wei, G Hemani, CS Haley - Nature Reviews Genetics, 2014 - nature.com
Genome-wide association studies (GWASs) have become the focus of the statistical analysis
of complex traits in humans, successfully shedding light on several aspects of genetic …

Beyond TreeSHAP: Efficient computation of any-order Shapley interactions for tree ensembles

M Muschalik, F Fumagalli, B Hammer… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
While shallow decision trees may be interpretable, larger ensemble models like
gradient-boosted trees, which often set the state of the art in machine learning problems involving …

A new variable selection approach using random forests

A Hapfelmeier, K Ulm - Computational Statistics & Data Analysis, 2013 - Elsevier
Random Forests are frequently applied as they achieve a high prediction accuracy and have
the ability to identify informative variables. Several approaches for variable selection have …

Do little interactions get lost in dark random forests?

MN Wright, A Ziegler, IR König - BMC Bioinformatics, 2016 - Springer
Background: Random forests have often been claimed to uncover interaction effects.
However, if and how interaction effects can be differentiated from marginal effects remains …

Statistically reinforced machine learning for nonlinear patterns and variable interactions

M Ryo, MC Rillig - Ecosphere, 2017 - Wiley Online Library
Most statistical models assume linearity and few variable interactions, even though
real-world ecological patterns often result from nonlinear and highly interactive processes. We …

Integrating Metal–Phenolic Networks-Mediated Separation and Machine Learning-Aided Surface-Enhanced Raman Spectroscopy for Accurate Nanoplastics …

H Ye, S Jiang, Y Yan, B Zhao, ER Grant, DD Kitts… - ACS …, 2024 - ACS Publications
Increasing accumulation of nanoplastics across ecosystems poses a significant threat to
both terrestrial and aquatic life. Surface-enhanced Raman scattering (SERS) is an emerging …