Showing 14 of 14 results
mayer79
missRanger:Fast Imputation of Missing Values
Alternative implementation of the beautiful 'MissForest' algorithm used to impute mixed-type data sets by chaining random forests, introduced by Stekhoven, D.J. and Buehlmann, P. (2012) <doi:10.1093/bioinformatics/btr597>. Under the hood, it uses the lightning-fast random forest package 'ranger'. Between the iterative model fits, the package offers the option of predictive mean matching. This, first, avoids imputation with values not present in the original data (such as 0.3334 in a 0-1 coded variable) and, second, raises the variance of the resulting conditional distributions to a realistic level, which allows, e.g., multiple imputation by repeating the call to missRanger(). Out-of-sample application is supported as well.
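The predictive mean matching idea can be sketched in a few lines of base R (a conceptual illustration with made-up data and a made-up helper `pmm_impute`, not missRanger's actual implementation):

```r
# Predictive mean matching (conceptual sketch):
# instead of imputing the raw model prediction, donate an observed
# value whose own prediction is close to the missing case's prediction.
pmm_impute <- function(pred_missing, pred_observed, obs_values, k = 3) {
  sapply(pred_missing, function(p) {
    donors <- order(abs(pred_observed - p))[seq_len(k)]  # k closest predictions
    obs_values[sample(donors, 1)]                        # donate an observed value
  })
}

set.seed(1)
obs  <- c(0, 0, 1, 1, 1)             # 0-1 coded variable, observed part
pred <- c(0.1, 0.2, 0.8, 0.9, 0.7)   # model predictions for the observed part
imp  <- pmm_impute(c(0.95, 0.05), pred, obs)
# every imputed value is one of the observed values, never e.g. 0.3334
stopifnot(all(imp %in% obs))
```

Because donors come from the observed data, the imputed values stay on the original scale of the variable.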
Maintained by Michael Mayer. Last updated 4 months ago.
imputation, machine-learning, missing-values, random-forest
69 stars 11.07 score 208 scripts 6 dependents
modeloriented
randomForestExplainer:Explaining and Visualizing Random Forests in Terms of Variable Importance
A set of tools to help explain which variables are most important in a random forest. Various variable importance measures are calculated and visualized in different settings in order to get an idea of how their importance changes depending on our criteria (Hemant Ishwaran, Udaya B. Kogalur, Eiran Z. Gorodeski, Andy J. Minn and Michael S. Lauer (2010) <doi:10.1198/jasa.2009.tm08622>; Leo Breiman (2001) <doi:10.1023/A:1010933404324>).
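Permutation importance, one of the most widely used importance measures, can be sketched in base R (a stand-in `lm` model replaces the random forest; `perm_importance` is a made-up helper, not part of the package):

```r
# Permutation importance (conceptual sketch):
# importance of a variable = increase in error after shuffling that column.
perm_importance <- function(fit, data, response) {
  base_mse <- mean((data[[response]] - predict(fit, data))^2)
  vars <- setdiff(names(data), response)
  sapply(vars, function(v) {
    shuffled <- data
    shuffled[[v]] <- sample(shuffled[[v]])  # break the variable's signal
    mean((data[[response]] - predict(fit, shuffled))^2) - base_mse
  })
}

set.seed(42)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
d$y <- 3 * d$x1 + rnorm(200, sd = 0.1)   # only x1 carries signal
fit <- lm(y ~ x1 + x2, data = d)
imp <- perm_importance(fit, d, "y")
stopifnot(imp["x1"] > imp["x2"])         # x1 should dominate
```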
Maintained by Yue Jiang. Last updated 1 year ago.
233 stars 9.59 score 236 scripts
ropensci
aorsf:Accelerated Oblique Random Forests
Fit, interpret, and compute predictions with oblique random forests. Includes support for partial dependence, variable importance, passing customized functions for variable importance, and identification of linear combinations of features. Methods for the oblique random survival forest are described in Jaeger et al. (2023) <DOI:10.1080/10618600.2023.2231048>.
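What makes a split "oblique" can be shown in a few lines of base R (a conceptual sketch; `oblique_split` is a made-up helper, not part of aorsf):

```r
# An oblique split (conceptual sketch): instead of splitting on a single
# feature, split on a linear combination of features.
oblique_split <- function(X, weights, cutpoint) {
  score <- as.matrix(X) %*% weights   # linear combination of the features
  ifelse(score <= cutpoint, "left", "right")
}

X <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(4, 3, 2, 1))
side <- oblique_split(X, weights = c(1, -1), cutpoint = 0)
stopifnot(identical(as.vector(side), c("left", "left", "right", "right")))
```

The split boundary x1 - x2 = 0 is a diagonal line, which no single-feature (axis-aligned) split could produce.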
Maintained by Byron Jaeger. Last updated 8 days ago.
data-science, oblique, random-forest, survival, openblas, cpp, openmp
58 stars 9.29 score 60 scripts 1 dependent
mlr-org
mlr3mbo:Flexible Bayesian Optimization
A modern and flexible approach to Bayesian Optimization / Model Based Optimization building on the 'bbotk' package. 'mlr3mbo' is a toolbox providing both ready-to-use optimization algorithms as well as their fundamental building blocks, allowing for straightforward implementation of custom algorithms. Both single- and multi-objective optimization are supported, as are mixed continuous, categorical, and conditional search spaces. Moreover, using 'mlr3mbo' for hyperparameter optimization of machine learning models within the 'mlr3' ecosystem is straightforward via 'mlr3tuning'. Examples of ready-to-use optimization algorithms include Efficient Global Optimization by Jones et al. (1998) <doi:10.1023/A:1008306431147>, ParEGO by Knowles (2006) <doi:10.1109/TEVC.2005.851274> and SMS-EGO by Ponweiser et al. (2008) <doi:10.1007/978-3-540-87700-4_78>.
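The Expected Improvement criterion behind Efficient Global Optimization can be written directly from its closed form (a base-R sketch of the formula for minimization, not mlr3mbo's implementation):

```r
# Expected Improvement (EI) for minimization:
# EI(x) = (best - mu) * Phi(z) + sigma * phi(z),  z = (best - mu) / sigma,
# where mu and sigma are the surrogate model's prediction and uncertainty
# at x, and best is the incumbent (best observed) objective value.
expected_improvement <- function(mu, sigma, best) {
  z <- (best - mu) / sigma
  (best - mu) * pnorm(z) + sigma * dnorm(z)
}

# A point predicted to beat the incumbent has higher EI than one that does not.
ei_good <- expected_improvement(mu = 0.5, sigma = 0.2, best = 1.0)
ei_bad  <- expected_improvement(mu = 1.5, sigma = 0.2, best = 1.0)
stopifnot(ei_good > ei_bad, ei_bad >= 0)
```

EI is always non-negative, so even unpromising points retain a small exploration incentive proportional to their uncertainty.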
Maintained by Lennart Schneider. Last updated 26 days ago.
automl, bayesian-optimization, bbotk, black-box-optimization, gaussian-process, hpo, hyperparameter, hyperparameter-optimization, hyperparameter-tuning, machine-learning, mlr3, model-based-optimization, optimization, optimizer, random-forest, tuning
25 stars 8.57 score 120 scripts 3 dependents
nalzok
tree.interpreter:Random Forest Prediction Decomposition and Feature Importance Measure
An R re-implementation of the 'treeinterpreter' package on PyPI <https://pypi.org/project/treeinterpreter/>. Each prediction can be decomposed as 'prediction = bias + feature_1_contribution + ... + feature_n_contribution'. This decomposition is then used to calculate the Mean Decrease Impurity (MDI) and Mean Decrease Impurity using out-of-bag samples (MDI-oob) feature importance measures based on the work of Li et al. (2019) <arXiv:1906.10845>.
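The decomposition can be verified by hand for a single decision stump (a base-R sketch with toy data, not the package's code):

```r
# The decomposition 'prediction = bias + contributions' for a single
# decision stump: each split's contribution is the change in node mean
# along the path from root to leaf.
y <- c(1, 2, 10, 11)           # training responses
x <- c(0, 0, 1, 1)             # one binary feature; the stump splits on x
bias <- mean(y)                # root node mean
leaf_mean <- mean(y[x == 1])   # mean in the leaf reached when x == 1
contribution <- leaf_mean - bias
prediction <- bias + contribution
stopifnot(prediction == leaf_mean)  # decomposition reproduces the prediction
```

In a deeper tree the same bookkeeping applies at every split, yielding one contribution term per feature used on the path.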
Maintained by Qingyao Sun. Last updated 5 years ago.
data-science, datascience, interpretability, machine-learning, random-forest, cpp
12 stars 5.78 score 6 scripts
blasbenito
spatialRF:Easy Spatial Modeling with Random Forest
Automatic generation and selection of spatial predictors for spatial regression with Random Forest. Spatial predictors are surrogates of variables driving the spatial structure of a response variable. The package offers two methods to generate spatial predictors from a distance matrix among training cases: 1) Moran's Eigenvector Maps (MEMs; Dray, Legendre, and Peres-Neto 2006 <DOI:10.1016/j.ecolmodel.2006.02.015>): computed as the eigenvectors of a weighted matrix of distances; 2) RFsp (Hengl et al. <DOI:10.7717/peerj.5518>): columns of the distance matrix used as spatial predictors. Spatial predictors help minimize the spatial autocorrelation of the model residuals and facilitate an honest assessment of the importance scores of the non-spatial predictors. Additionally, functions to reduce multicollinearity, identify relevant variable interactions, tune random forest hyperparameters, assess model transferability via spatial cross-validation, and explore model results via partial dependence curves and interaction surfaces are included in the package. The modelling functions are built around the highly efficient 'ranger' package (Wright and Ziegler 2017 <DOI:10.18637/jss.v077.i01>).
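The MEM construction can be sketched in base R (a conceptual illustration: the inverse-distance weighting here is one arbitrary choice, not spatialRF's exact recipe):

```r
# Moran's Eigenvector Maps (conceptual sketch): spatial predictors are
# the eigenvectors of a double-centered matrix of weights among cases.
D <- as.matrix(dist(cbind(c(0, 0, 1, 1), c(0, 1, 0, 1))))  # pairwise distances
W <- 1 / (1 + D)                        # similarity weights (one simple choice)
n <- nrow(W)
C <- diag(n) - matrix(1 / n, n, n)      # centering matrix
mem <- eigen(C %*% W %*% C, symmetric = TRUE)
# each column of mem$vectors is a candidate spatial predictor;
# double-centering makes the leading eigenvectors sum to (near) zero
stopifnot(nrow(mem$vectors) == n, abs(sum(mem$vectors[, 1])) < 1e-6)
```

Eigenvectors with large positive eigenvalues capture broad spatial patterns; adding them as covariates soaks up spatial autocorrelation that would otherwise end up in the residuals.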
Maintained by Blas M. Benito. Last updated 3 years ago.
random-forest, spatial-analysis, spatial-regression
114 stars 5.45 score 49 scripts
mayer79
outForest:Multivariate Outlier Detection and Replacement
Provides a random forest based implementation of the method described in Chapter 7.1.2 (Regression model based anomaly detection) of Chandola et al. (2009) <doi:10.1145/1541880.1541882>. It works as follows: Each numeric variable is regressed onto all other variables by a random forest. If the scaled absolute difference between the observed value and the out-of-bag prediction of the corresponding random forest is suspiciously large, the value is considered an outlier. The package offers different options to replace such outliers, e.g. by realistic values found via predictive mean matching. Once the method is trained on reference data, it can be applied to new data.
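The outlier rule itself fits in a few lines of base R (the random forest is replaced here by fixed predictions; `flag_outliers` is a made-up helper, not the package's API):

```r
# Flag a value when the absolute difference between observation and
# prediction, scaled by the RMSE of those differences, exceeds a threshold.
flag_outliers <- function(observed, predicted, threshold = 3) {
  resid <- observed - predicted
  abs(resid) / sqrt(mean(resid^2)) > threshold
}

obs  <- c(rep(1, 20), 25)   # last value is corrupted
pred <- rep(1, 21)          # stand-in for out-of-bag predictions
stopifnot(identical(which(flag_outliers(obs, pred)), 21L))
```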
Maintained by Michael Mayer. Last updated 8 months ago.
machine-learning, outlier, outlier-analysis, outlier-detection, random-forest
13 stars 5.39 score 19 scripts
benjilu
forestError:A Unified Framework for Random Forest Prediction Error Estimation
Estimates the conditional error distributions of random forest predictions and common parameters of those distributions, including conditional misclassification rates, conditional mean squared prediction errors, conditional biases, and conditional quantiles, by out-of-bag weighting of out-of-bag prediction errors as proposed by Lu and Hardin (2021). This package is compatible with several existing packages that implement random forests in R.
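The core idea can be sketched in base R with uniform weights (the package weights out-of-bag errors toward training cases similar to the new one; the uniform-weight simplification and the helper name are ours):

```r
# Prediction interval from quantiles of out-of-bag prediction errors
# (conceptual sketch with uniform weights).
oob_interval <- function(prediction, oob_errors, level = 0.9) {
  a <- (1 - level) / 2
  prediction + quantile(oob_errors, probs = c(a, 1 - a), names = FALSE)
}

set.seed(7)
errs <- rnorm(1000)          # stand-in for a forest's OOB errors
ci <- oob_interval(10, errs)
stopifnot(ci[1] < 10, ci[2] > 10, ci[1] < ci[2])
```

Because the interval comes from the empirical error distribution, it needs no normality assumption and adapts to skewed or heavy-tailed errors.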
Maintained by Benjamin Lu. Last updated 4 years ago.
inference, intervals, machine-learning, machinelearning, prediction, random-forest, randomforest, statistics
26 stars 4.62 score 16 scripts
shangzhi-hong
RfEmpImp:Multiple Imputation using Chained Random Forests
An R package for multiple imputation using chained random forests. The implemented methods can handle missing data in mixed types of variables by using prediction-based or node-based conditional distributions constructed with random forests. For prediction-based imputation of continuous variables, two methods are provided: one based on the empirical distribution of out-of-bag prediction errors and one based on a normality assumption for those errors. A method based on predicted probabilities is provided for imputing categorical variables. For node-based imputation, a method based on the conditional distribution formed by the predicting nodes and a method based on random forest proximity measures are provided. More details of the statistical methods can be found in Hong et al. (2020) <arXiv:2004.14823>.
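The empirical-error method for continuous variables can be sketched in base R (a conceptual illustration with a made-up helper, not RfEmpImp's code):

```r
# Prediction-based imputation with empirical errors (conceptual sketch):
# impute a continuous value as the model prediction plus an error drawn
# from the empirical distribution of out-of-bag prediction errors.
impute_emp <- function(pred_missing, oob_errors) {
  pred_missing + sample(oob_errors, length(pred_missing), replace = TRUE)
}

set.seed(3)
oob_errors <- c(-0.2, -0.1, 0, 0.1, 0.2)   # stand-in OOB errors
imp <- impute_emp(c(5, 6, 7), oob_errors)
stopifnot(length(imp) == 3, all(abs(imp - c(5, 6, 7)) <= 0.2))
```

Drawing errors rather than imputing the bare prediction preserves realistic variability, which matters when the imputations feed a multiple-imputation analysis.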
Maintained by Shangzhi Hong. Last updated 2 years ago.
imputation, missing-data, random-forest
5 stars 4.40 score 8 scripts
randel
MixRF:A Random-Forest-Based Approach for Imputing Clustered Incomplete Data
It offers random-forest-based functions to impute clustered incomplete data. The package is tailored for but not limited to imputing multitissue expression data, in which a gene's expression is measured on the collected tissues of an individual but missing on the uncollected tissues.
Maintained by Jiebiao Wang. Last updated 8 years ago.
gene-expression, imputation, mixed-models, random-forest
35 stars 4.39 score 14 scripts
plantedml
randomPlantedForest:Random Planted Forest: A Directly Interpretable Tree Ensemble
An implementation of the Random Planted Forest algorithm for directly interpretable tree ensembles based on a functional ANOVA decomposition.
Maintained by Lukas Burk. Last updated 8 days ago.
intelligibility, interpretable-machine-learning, interpretable-ml, machine-learning, ml, random-forest, cpp
5 stars 4.28 score 38 scripts
forestry-labs
distillML:Model Distillation and Interpretability Methods for Machine Learning Models
Provides several methods for model distillation and interpretability for general black box machine learning models and treatment effect estimation methods. For details on the algorithms implemented, see <https://forestry-labs.github.io/distillML/index.html> (Brian Cho, Theo F. Saarinen, Jasjeet S. Sekhon, and Simon Walter).
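Distillation in its simplest form fits an interpretable student model to a teacher's predictions; a base-R sketch (the "teacher" here is a toy function standing in for a real black box, and the linear student is our choice for illustration):

```r
# Model distillation (conceptual sketch): refit an interpretable model
# to the predictions of a black box, then inspect the student.
set.seed(9)
x <- runif(500, 0, 1)
black_box <- function(x) 2 * x + 0.1 * sin(10 * x)   # stand-in teacher
student <- lm(yhat ~ x, data = data.frame(x = x, yhat = black_box(x)))
# the student recovers the teacher's dominant linear trend
stopifnot(abs(coef(student)["x"] - 2) < 0.2)
```

The student's coefficients then serve as a human-readable summary of the teacher's behavior over the region covered by the probe data.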
Maintained by Theo Saarinen. Last updated 2 years ago.
bart, distillation-model, explainable-machine-learning, explainable-ml, interpretability, interpretable-machine-learning, machine-learning, model, random-forest, xgboost
7 stars 3.92 score 12 scripts
missvalteam
Iscores:Proper Scoring Rules for Missing Value Imputation
Implementation of a KL-based scoring rule to assess the quality of different missing value imputations in the broad sense as introduced in Michel et al. (2021) <arXiv:2106.03742>.
Maintained by Loris Michel. Last updated 2 years ago.
imputation-methods, machine-learning, missing-values, random-forest
7 stars 3.91 score 23 scripts
swfsc
banter:BioAcoustic eveNT classifiER
Create a hierarchical acoustic event species classifier out of multiple call type detectors as described in Rankin et al. (2017) <doi:10.1111/mms.12381>.
Maintained by Eric Archer. Last updated 1 year ago.
acoustics, bioacoustics, cetaceans, classification, dolphins, machine-learning, noaa, random-forest, species-identification, supervised-learning, supervised-machine-learning, whales, jags, cpp
9 stars 3.65 score