Showing 14 of 14 results
mayer79
missRanger:Fast Imputation of Missing Values
Alternative implementation of the beautiful 'MissForest' algorithm used to impute mixed-type data sets by chaining random forests, introduced by Stekhoven, D.J. and Buehlmann, P. (2012) <doi:10.1093/bioinformatics/btr597>. Under the hood, it uses the lightning-fast random forest package 'ranger'. Between the iterative model fits, the package offers the option of predictive mean matching. This, first, avoids imputation with values not present in the original data (such as 0.3334 in a 0-1 coded variable) and, second, raises the variance of the resulting conditional distributions to a realistic level, which allows, e.g., multiple imputation by repeating the call to missRanger(). Out-of-sample application is supported as well.
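The predictive mean matching idea can be sketched in a few lines of base R (a conceptual illustration with made-up data and a made-up helper `pmm_impute`, not missRanger's actual implementation):

```r
# Predictive mean matching (conceptual sketch):
# instead of imputing the raw model prediction, donate an observed
# value whose own prediction is close to the missing case's prediction.
pmm_impute <- function(pred_missing, pred_observed, obs_values, k = 3) {
  sapply(pred_missing, function(p) {
    donors <- order(abs(pred_observed - p))[seq_len(k)]  # k closest predictions
    obs_values[sample(donors, 1)]                        # donate an observed value
  })
}

set.seed(1)
obs  <- c(0, 0, 1, 1, 1)             # 0-1 coded variable, observed part
pred <- c(0.1, 0.2, 0.8, 0.9, 0.7)   # model predictions for the observed part
imp  <- pmm_impute(c(0.95, 0.05), pred, obs)
# every imputed value is one of the observed values, never e.g. 0.3334
stopifnot(all(imp %in% obs))
```

Because donors come from the observed data, the imputed values stay on the original scale of the variable.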
Maintained by Michael Mayer. Last updated 4 months ago.
imputation, machine-learning, missing-values, random-forest
69 stars 11.07 score 208 scripts 6 dependents
modeloriented
randomForestExplainer:Explaining and Visualizing Random Forests in Terms of Variable Importance
A set of tools to help explain which variables are most important in a random forest. Various variable importance measures are calculated and visualized in different settings in order to get an idea of how their importance changes depending on our criteria (Hemant Ishwaran, Udaya B. Kogalur, Eiran Z. Gorodeski, Andy J. Minn and Michael S. Lauer (2010) <doi:10.1198/jasa.2009.tm08622>; Leo Breiman (2001) <doi:10.1023/A:1010933404324>).
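Permutation importance, one of the most widely used importance measures, can be sketched in base R (a stand-in `lm` model replaces the random forest; `perm_importance` is a made-up helper, not part of the package):

```r
# Permutation importance (conceptual sketch):
# importance of a variable = increase in error after shuffling that column.
perm_importance <- function(fit, data, response) {
  base_mse <- mean((data[[response]] - predict(fit, data))^2)
  vars <- setdiff(names(data), response)
  sapply(vars, function(v) {
    shuffled <- data
    shuffled[[v]] <- sample(shuffled[[v]])  # break the variable's signal
    mean((data[[response]] - predict(fit, shuffled))^2) - base_mse
  })
}

set.seed(42)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
d$y <- 3 * d$x1 + rnorm(200, sd = 0.1)   # only x1 carries signal
fit <- lm(y ~ x1 + x2, data = d)
imp <- perm_importance(fit, d, "y")
stopifnot(imp["x1"] > imp["x2"])         # x1 should dominate
```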
Maintained by Yue Jiang. Last updated 1 year ago.
233 stars 9.59 score 236 scripts
ropensci
aorsf:Accelerated Oblique Random Forests
Fit, interpret, and compute predictions with oblique random forests. Includes support for partial dependence, variable importance, passing customized functions for variable importance, and identification of linear combinations of features. Methods for the oblique random survival forest are described in Jaeger et al. (2023) <DOI:10.1080/10618600.2023.2231048>.
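What makes a split "oblique" can be shown in a few lines of base R (a conceptual sketch; `oblique_split` is a made-up helper, not part of aorsf):

```r
# An oblique split (conceptual sketch): instead of splitting on a single
# feature, split on a linear combination of features.
oblique_split <- function(X, weights, cutpoint) {
  score <- as.matrix(X) %*% weights   # linear combination of the features
  ifelse(score <= cutpoint, "left", "right")
}

X <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(4, 3, 2, 1))
side <- oblique_split(X, weights = c(1, -1), cutpoint = 0)
stopifnot(identical(as.vector(side), c("left", "left", "right", "right")))
```

The split boundary x1 - x2 = 0 is a diagonal line, which no single-feature (axis-aligned) split could produce.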
Maintained by Byron Jaeger. Last updated 8 days ago.
data-science, oblique, random-forest, survival, openblas, cpp, openmp
58 stars 9.29 score 60 scripts 1 dependent
mlr-org
mlr3mbo:Flexible Bayesian Optimization
A modern and flexible approach to Bayesian Optimization / Model Based Optimization building on the 'bbotk' package. 'mlr3mbo' is a toolbox providing both ready-to-use optimization algorithms as well as their fundamental building blocks, allowing for straightforward implementation of custom algorithms. Both single- and multi-objective optimization are supported, as are mixed continuous, categorical, and conditional search spaces. Moreover, using 'mlr3mbo' for hyperparameter optimization of machine learning models within the 'mlr3' ecosystem is straightforward via 'mlr3tuning'. Examples of ready-to-use optimization algorithms include Efficient Global Optimization by Jones et al. (1998) <doi:10.1023/A:1008306431147>, ParEGO by Knowles (2006) <doi:10.1109/TEVC.2005.851274> and SMS-EGO by Ponweiser et al. (2008) <doi:10.1007/978-3-540-87700-4_78>.
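The Expected Improvement criterion behind Efficient Global Optimization can be written directly from its closed form (a base-R sketch of the formula for minimization, not mlr3mbo's implementation):

```r
# Expected Improvement (EI) for minimization:
# EI(x) = (best - mu) * Phi(z) + sigma * phi(z),  z = (best - mu) / sigma,
# where mu and sigma are the surrogate model's prediction and uncertainty
# at x, and best is the incumbent (best observed) objective value.
expected_improvement <- function(mu, sigma, best) {
  z <- (best - mu) / sigma
  (best - mu) * pnorm(z) + sigma * dnorm(z)
}

# A point predicted to beat the incumbent has higher EI than one that does not.
ei_good <- expected_improvement(mu = 0.5, sigma = 0.2, best = 1.0)
ei_bad  <- expected_improvement(mu = 1.5, sigma = 0.2, best = 1.0)
stopifnot(ei_good > ei_bad, ei_bad >= 0)
```

EI is always non-negative, so even unpromising points retain a small exploration incentive proportional to their uncertainty.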
Maintained by Lennart Schneider. Last updated 26 days ago.
automl, bayesian-optimization, bbotk, black-box-optimization, gaussian-process, hpo, hyperparameter, hyperparameter-optimization, hyperparameter-tuning, machine-learning, mlr3, model-based-optimization, optimization, optimizer, random-forest, tuning
25 stars 8.57 score 120 scripts 3 dependents
nalzok
tree.interpreter:Random Forest Prediction Decomposition and Feature Importance Measure
An R re-implementation of the 'treeinterpreter' package on PyPI <https://pypi.org/project/treeinterpreter/>. Each prediction can be decomposed as 'prediction = bias + feature_1_contribution + ... + feature_n_contribution'. This decomposition is then used to calculate the Mean Decrease Impurity (MDI) and Mean Decrease Impurity using out-of-bag samples (MDI-oob) feature importance measures based on the work of Li et al. (2019) <arXiv:1906.10845>.
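The decomposition can be verified by hand for a single decision stump (a base-R sketch with toy data, not the package's code):

```r
# The decomposition 'prediction = bias + contributions' for a single
# decision stump: each split's contribution is the change in node mean
# along the path from root to leaf.
y <- c(1, 2, 10, 11)           # training responses
x <- c(0, 0, 1, 1)             # one binary feature; the stump splits on x
bias <- mean(y)                # root node mean
leaf_mean <- mean(y[x == 1])   # mean in the leaf reached when x == 1
contribution <- leaf_mean - bias
prediction <- bias + contribution
stopifnot(prediction == leaf_mean)  # decomposition reproduces the prediction
```

In a deeper tree the same bookkeeping applies at every split, yielding one contribution term per feature used on the path.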
Maintained by Qingyao Sun. Last updated 5 years ago.
data-science, datascience, interpretability, machine-learning, random-forest, cpp
12 stars 5.78 score 6 scripts
blasbenito
spatialRF:Easy Spatial Modeling with Random Forest
Automatic generation and selection of spatial predictors for spatial regression with Random Forest. Spatial predictors are surrogates of variables driving the spatial structure of a response variable. The package offers two methods to generate spatial predictors from a distance matrix among training cases: 1) Moran's Eigenvector Maps (MEMs; Dray, Legendre, and Peres-Neto 2006 <DOI:10.1016/j.ecolmodel.2006.02.015>): computed as the eigenvectors of a weighted matrix of distances; 2) RFsp (Hengl et al. <DOI:10.7717/peerj.5518>): columns of the distance matrix used as spatial predictors. Spatial predictors help minimize the spatial autocorrelation of the model residuals and facilitate an honest assessment of the importance scores of the non-spatial predictors. Additionally, functions to reduce multicollinearity, identify relevant variable interactions, tune random forest hyperparameters, assess model transferability via spatial cross-validation, and explore model results via partial dependence curves and interaction surfaces are included in the package. The modelling functions are built around the highly efficient 'ranger' package (Wright and Ziegler 2017 <DOI:10.18637/jss.v077.i01>).
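The MEM construction can be sketched in base R (a conceptual illustration: the inverse-distance weighting here is one arbitrary choice, not spatialRF's exact recipe):

```r
# Moran's Eigenvector Maps (conceptual sketch): spatial predictors are
# the eigenvectors of a double-centered matrix of weights among cases.
D <- as.matrix(dist(cbind(c(0, 0, 1, 1), c(0, 1, 0, 1))))  # pairwise distances
W <- 1 / (1 + D)                        # similarity weights (one simple choice)
n <- nrow(W)
C <- diag(n) - matrix(1 / n, n, n)      # centering matrix
mem <- eigen(C %*% W %*% C, symmetric = TRUE)
# each column of mem$vectors is a candidate spatial predictor;
# double-centering makes the leading eigenvectors sum to (near) zero
stopifnot(nrow(mem$vectors) == n, abs(sum(mem$vectors[, 1])) < 1e-6)
```

Eigenvectors with large positive eigenvalues capture broad spatial patterns; adding them as covariates soaks up spatial autocorrelation that would otherwise end up in the residuals.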
Maintained by Blas M. Benito. Last updated 3 years ago.
random-forest, spatial-analysis, spatial-regression
114 stars 5.45 score 49 scripts
mayer79
outForest:Multivariate Outlier Detection and Replacement
Provides a random forest based implementation of the method described in Chapter 7.1.2 (Regression model based anomaly detection) of Chandola et al. (2009) <doi:10.1145/1541880.1541882>. It works as follows: Each numeric variable is regressed onto all other variables by a random forest. If the scaled absolute difference between the observed value and the out-of-bag prediction of the corresponding random forest is suspiciously large, the value is considered an outlier. The package offers different options to replace such outliers, e.g. by realistic values found via predictive mean matching. Once the method is trained on reference data, it can be applied to new data.
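The outlier rule itself fits in a few lines of base R (the random forest is replaced here by fixed predictions; `flag_outliers` is a made-up helper, not the package's API):

```r
# Flag a value when the absolute difference between observation and
# prediction, scaled by the RMSE of those differences, exceeds a threshold.
flag_outliers <- function(observed, predicted, threshold = 3) {
  resid <- observed - predicted
  abs(resid) / sqrt(mean(resid^2)) > threshold
}

obs  <- c(rep(1, 20), 25)   # last value is corrupted
pred <- rep(1, 21)          # stand-in for out-of-bag predictions
stopifnot(identical(which(flag_outliers(obs, pred)), 21L))
```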
Maintained by Michael Mayer. Last updated 8 months ago.
machine-learning, outlier, outlier-analysis, outlier-detection, random-forest
13 stars 5.39 score 19 scripts
benjilu
forestError:A Unified Framework for Random Forest Prediction Error Estimation
Estimates the conditional error distributions of random forest predictions and common parameters of those distributions, including conditional misclassification rates, conditional mean squared prediction errors, conditional biases, and conditional quantiles, by out-of-bag weighting of out-of-bag prediction errors as proposed by Lu and Hardin (2021). This package is compatible with several existing packages that implement random forests in R.
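The core idea can be sketched in base R with uniform weights (the package weights out-of-bag errors toward training cases similar to the new one; the uniform-weight simplification and the helper name are ours):

```r
# Prediction interval from quantiles of out-of-bag prediction errors
# (conceptual sketch with uniform weights).
oob_interval <- function(prediction, oob_errors, level = 0.9) {
  a <- (1 - level) / 2
  prediction + quantile(oob_errors, probs = c(a, 1 - a), names = FALSE)
}

set.seed(7)
errs <- rnorm(1000)          # stand-in for a forest's OOB errors
ci <- oob_interval(10, errs)
stopifnot(ci[1] < 10, ci[2] > 10, ci[1] < ci[2])
```

Because the interval comes from the empirical error distribution, it needs no normality assumption and adapts to skewed or heavy-tailed errors.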
Maintained by Benjamin Lu. Last updated 4 years ago.
inference, intervals, machine-learning, machinelearning, prediction, random-forest, randomforest, statistics
26 stars 4.62 score 16 scripts
shangzhi-hong
RfEmpImp:Multiple Imputation using Chained Random Forests
An R package for multiple imputation using chained random forests. The implemented methods can handle missing data in mixed types of variables by using prediction-based or node-based conditional distributions constructed with random forests. For prediction-based imputation of continuous variables, two methods are provided: one based on the empirical distribution of out-of-bag prediction errors and one based on a normality assumption for those errors. A method based on predicted probabilities is provided for imputing categorical variables. For node-based imputation, a method based on the conditional distribution formed by the predicting nodes and a method based on random forest proximity measures are provided. More details of the statistical methods can be found in Hong et al. (2020) <arXiv:2004.14823>.
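The empirical-error method for continuous variables can be sketched in base R (a conceptual illustration with a made-up helper, not RfEmpImp's code):

```r
# Prediction-based imputation with empirical errors (conceptual sketch):
# impute a continuous value as the model prediction plus an error drawn
# from the empirical distribution of out-of-bag prediction errors.
impute_emp <- function(pred_missing, oob_errors) {
  pred_missing + sample(oob_errors, length(pred_missing), replace = TRUE)
}

set.seed(3)
oob_errors <- c(-0.2, -0.1, 0, 0.1, 0.2)   # stand-in OOB errors
imp <- impute_emp(c(5, 6, 7), oob_errors)
stopifnot(length(imp) == 3, all(abs(imp - c(5, 6, 7)) <= 0.2))
```

Drawing errors rather than imputing the bare prediction preserves realistic variability, which matters when the imputations feed a multiple-imputation analysis.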
Maintained by Shangzhi Hong. Last updated 2 years ago.
imputation, missing-data, random-forest
5 stars 4.40 score 8 scripts
randel
MixRF:A Random-Forest-Based Approach for Imputing Clustered Incomplete Data
It offers random-forest-based functions to impute clustered incomplete data. The package is tailored for but not limited to imputing multitissue expression data, in which a gene's expression is measured on the collected tissues of an individual but missing on the uncollected tissues.
Maintained by Jiebiao Wang. Last updated 8 years ago.
gene-expression, imputation, mixed-models, random-forest
35 stars 4.39 score 14 scripts
plantedml
randomPlantedForest:Random Planted Forest: A Directly Interpretable Tree Ensemble
An implementation of the Random Planted Forest algorithm for directly interpretable tree ensembles based on a functional ANOVA decomposition.
Maintained by Lukas Burk. Last updated 8 days ago.
intelligibility, interpretable-machine-learning, interpretable-ml, machine-learning, ml, random-forest, cpp
5 stars 4.28 score 38 scripts
forestry-labs
distillML:Model Distillation and Interpretability Methods for Machine Learning Models
Provides several methods for model distillation and interpretability for general black box machine learning models and treatment effect estimation methods. For details on the algorithms implemented, see <https://forestry-labs.github.io/distillML/index.html> (Brian Cho, Theo F. Saarinen, Jasjeet S. Sekhon, and Simon Walter).
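Distillation in its simplest form fits an interpretable student model to a teacher's predictions; a base-R sketch (the "teacher" here is a toy function standing in for a real black box, and the linear student is our choice for illustration):

```r
# Model distillation (conceptual sketch): refit an interpretable model
# to the predictions of a black box, then inspect the student.
set.seed(9)
x <- runif(500, 0, 1)
black_box <- function(x) 2 * x + 0.1 * sin(10 * x)   # stand-in teacher
student <- lm(yhat ~ x, data = data.frame(x = x, yhat = black_box(x)))
# the student recovers the teacher's dominant linear trend
stopifnot(abs(coef(student)["x"] - 2) < 0.2)
```

The student's coefficients then serve as a human-readable summary of the teacher's behavior over the region covered by the probe data.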
Maintained by Theo Saarinen. Last updated 2 years ago.
bart, distillation-model, explainable-machine-learning, explainable-ml, interpretability, interpretable-machine-learning, machine-learning, model, random-forest, xgboost
7 stars 3.92 score 12 scripts
missvalteam
Iscores:Proper Scoring Rules for Missing Value Imputation
Implementation of a KL-based scoring rule to assess the quality of different missing value imputations in the broad sense as introduced in Michel et al. (2021) <arXiv:2106.03742>.
Maintained by Loris Michel. Last updated 2 years ago.
imputation-methods, machine-learning, missing-values, random-forest
7 stars 3.91 score 23 scripts
swfsc
banter:BioAcoustic eveNT classifiER
Create a hierarchical acoustic event species classifier out of multiple call type detectors as described in Rankin et al. (2017) <doi:10.1111/mms.12381>.
Maintained by Eric Archer. Last updated 1 year ago.
acoustics, bioacoustics, cetaceans, classification, dolphins, machine-learning, noaa, random-forest, species-identification, supervised-learning, supervised-machine-learning, whales, jags, cpp
9 stars 3.65 score