A random forest guided tour

G Biau, E Scornet - Test, 2016 - Springer
The random forest algorithm, proposed by L. Breiman in 2001, has been extremely
successful as a general-purpose classification and regression method. The approach, which …

Unrestricted permutation forces extrapolation: variable importance requires at least one more model, or there is no free variable importance

G Hooker, L Mentch, S Zhou - Statistics and Computing, 2021 - Springer
This paper reviews and advocates against the use of permute-and-predict (PaP) methods for
interpreting black box functions. Methods such as the variable importance measures …

Randomization as regularization: A degrees of freedom explanation for random forest success

L Mentch, S Zhou - Journal of Machine Learning Research, 2020 - jmlr.org
Random forests remain among the most popular off-the-shelf supervised machine learning
tools with a well-established track record of predictive accuracy in both regression and …

Efficient permutation testing of variable importance measures by the example of random forests

A Hapfelmeier, R Hornung, B Haller - Computational Statistics & Data …, 2023 - Elsevier
Hypothesis testing of variable importance measures (VIMPs) is still the subject of ongoing
research. This particularly applies to random forests (RF), for which VIMPs are a popular …

Tree space prototypes: Another look at making tree ensembles interpretable

S Tan, M Soloviev, G Hooker, MT Wells - … of the 2020 ACM-IMS on …, 2020 - dl.acm.org
Ensembles of decision trees perform well on many problems, but are not interpretable. In
contrast to existing approaches in interpretability that focus on explaining relationships …

Comparing predictions of fisheries bycatch using multiple spatiotemporal species distribution model frameworks

BC Stock, EJ Ward, T Eguchi, JE Jannot… - Canadian Journal of …, 2020 - cdnsciencepub.com
Spatiotemporal predictions of bycatch (i.e., catch of nontargeted species) have shown
promise as dynamic ocean management tools for reducing bycatch. However, which …

Boosting random forests to reduce bias; one-step boosted forest and its variance estimate

I Ghosal, G Hooker - Journal of Computational and Graphical …, 2020 - Taylor & Francis
In this article, we propose using the principle of boosting to reduce the bias of a random
forest prediction in the regression setting. From the original random forest fit, we extract the …

Linking demography with drivers: climate and competition

BJ Teller, PB Adler, CB Edwards… - Methods in Ecology …, 2016 - Wiley Online Library
In observational demographic data, the number of measured factors that could potentially
drive demography (such as daily weather records between two censuses) can easily exceed …

Decomposing global feature effects based on feature interactions

J Herbinger, MN Wright, T Nagler, B Bischl… - arXiv preprint arXiv …, 2023 - arxiv.org
Global feature effect methods, such as partial dependence plots, provide an intelligible
visualization of the expected marginal feature effect. However, such global feature effect …

Predictive inference with random forests: A new perspective on classical analyses

RJ McAlexander, L Mentch - Research & Politics, 2020 - journals.sagepub.com
Despite the number of problems that can occur when core model assumptions are violated,
nearly all quantitative political science research relies on inflexible regression models that …