R-universe search: tidymodels

tidymodels

tidymodels:Easily Install and Load the 'Tidymodels' Packages

The tidy modeling "verse" is a collection of packages for modeling and statistical analysis that share the underlying design philosophy, grammar, and data structures of the tidyverse.

Maintained by Max Kuhn. Last updated 3 months ago.

80.8 match 774 stars 16.11 score 65k scripts 14 dependents

tidymodels

broom:Convert Statistical Objects into Tidy Tibbles

Summarizes key information about statistical objects in tidy tibbles. This makes it easy to report results, create plots and consistently work with large numbers of models at once. Broom provides three verbs that each provide different types of information about a model. tidy() summarizes information about model components such as coefficients of a regression. glance() reports information about an entire model, such as goodness of fit measures like AIC and BIC. augment() adds information about individual observations to a dataset, such as fitted values or influence measures.

Maintained by Simon Couch. Last updated 3 months ago.

modeling tidy-data

20.0 match 1.5k stars 21.40 score 37k scripts 1.4k dependents
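
As a rough illustration of the three broom verbs described above, the sketch below fits an ordinary linear model to the built-in `mtcars` data and applies each verb in turn. The model formula and the variable names (`fit`, `coef_tbl`, `fit_tbl`, `obs_tbl`) are chosen here purely for illustration and do not come from the listing.

```r
library(broom)

# Illustrative model: fuel economy as a function of weight and horsepower.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# tidy(): one row per model component (here: intercept, wt, hp).
coef_tbl <- tidy(fit)

# glance(): a single row of whole-model summaries such as AIC and BIC.
fit_tbl <- glance(fit)

# augment(): one row per observation, with fitted values and
# influence measures added as dot-prefixed columns (e.g. .fitted).
obs_tbl <- augment(fit)

print(coef_tbl)
print(fit_tbl)
print(head(obs_tbl))
```

Each call returns a tibble, so the results can be passed straight into the usual data-frame tooling.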

tidymodels

recipes:Preprocessing and Feature Engineering Steps for Modeling

A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.

Maintained by Max Kuhn. Last updated 11 hours ago.

20.0 match 578 stars 18.37 score 6.5k scripts 369 dependents
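
The estimate-then-apply idea in the recipes description can be sketched as follows. This is a minimal example under assumptions made here, not taken from the listing: the split of `mtcars` into `train` and `new_data`, and the choice of a single normalization step, are illustrative only.

```r
library(recipes)

# Illustrative split: the first 20 rows act as the initial data set,
# the remaining 12 as "other" data.
train <- mtcars[1:20, ]
new_data <- mtcars[21:32, ]

# Define the preprocessing: center and scale all numeric predictors.
rec <- recipe(mpg ~ ., data = train) |>
  step_normalize(all_numeric_predictors())

# prep() estimates the means and standard deviations from `train` only.
prepped <- prep(rec, training = train)

# bake() applies those stored estimates; new_data = NULL returns the
# processed training set, otherwise the supplied data is processed.
baked_train <- bake(prepped, new_data = NULL)
baked_new <- bake(prepped, new_data = new_data)

print(head(baked_new))
```

Because the statistics come from `train` alone, the new rows are scaled with the training means and standard deviations rather than their own.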

tidymodels

parsnip:A Common API to Modeling and Analysis Functions

A common interface is provided to allow users to specify a model without having to remember the different argument names across different functions or computational engines (e.g. 'R', 'Spark', 'Stan', 'H2O', etc.).

Maintained by Max Kuhn. Last updated 2 days ago.

21.9 match 606 stars 16.03 score 3.4k scripts 68 dependents

tidymodels

rsample:General Resampling Infrastructure

Classes and functions to create and summarize different types of resampling objects (e.g. bootstrap, cross-validation).

Maintained by Hannah Frick. Last updated 4 months ago.

20.0 match 341 stars 16.85 score 4.8k scripts 77 dependents

tidymodels

infer:Tidy Statistical Inference

The objective of this package is to perform inference using an expressive statistical grammar that coheres with the tidy design framework.

Maintained by Simon Couch. Last updated 4 months ago.

20.0 match 731 stars 15.65 score 3.4k scripts 16 dependents

tidymodels

yardstick:Tidy Characterizations of Model Performance

Tidy tools for quantifying how well a model fits a data set, such as confusion matrices, class probability curve summaries, and regression metrics (e.g., RMSE).

Maintained by Emil Hvitfeldt. Last updated 7 hours ago.

20.0 match 382 stars 15.03 score 2.2k scripts 59 dependents

tidymodels

tune:Tidy Tuning Tools

The ability to tune models is important. 'tune' contains functions and classes to be used in conjunction with other 'tidymodels' packages for finding reasonable values of hyper-parameters in models, pre-processing methods, and post-processing steps.

Maintained by Max Kuhn. Last updated 2 months ago.

20.5 match 288 stars 14.22 score 768 scripts 38 dependents

tidymodels

hardhat:Construct Modeling Packages

Building modeling packages is hard. A large amount of effort generally goes into providing an implementation for a new method that is efficient, fast, and correct, but often less emphasis is put on the user interface. A good interface requires specialized knowledge about S3 methods and formulas, which the average package developer might not have. The goal of 'hardhat' is to reduce the burden around building new modeling packages by providing functionality for preprocessing, predicting, and validating input.

Maintained by Hannah Frick. Last updated 1 month ago.

20.0 match 103 stars 14.56 score 175 scripts 421 dependents

tidymodels

dials:Tools for Creating Tuning Parameter Values

Many models contain tuning parameters (i.e. parameters that cannot be directly estimated from the data). These tools can be used to define objects for creating, simulating, or validating values for such parameters.

Maintained by Hannah Frick. Last updated 1 month ago.

20.0 match 114 stars 13.98 score 414 scripts 51 dependents

tidymodels

workflowsets:Create a Collection of 'tidymodels' Workflows

A workflow is a combination of a model and preprocessors (e.g., a formula, recipe, etc.) (Kuhn and Silge (2021) <https://www.tmwr.org/>). In order to try different combinations of these, an object can be created that contains many workflows. There are functions to create workflows en masse as well as training them and visualizing the results.

Maintained by Simon Couch. Last updated 3 months ago.

23.1 match 92 stars 12.07 score 300 scripts 18 dependents

tidymodels

workflows:Modeling Workflows

Managing both a 'parsnip' model and a preprocessor, such as a model formula or recipe from 'recipes', can often be challenging. The goal of 'workflows' is to streamline this process by bundling the model alongside the preprocessor, all within the same object.

Maintained by Simon Couch. Last updated 5 days ago.

20.0 match 207 stars 13.62 score 852 scripts 42 dependents

tidymodels

corrr:Correlations in R

A tool for exploring correlations. It makes it possible to easily perform routine tasks when exploring correlation matrices such as ignoring the diagonal, focusing on the correlations of certain variables against others, or rearranging and visualizing the matrix in terms of the strength of the correlations.

Maintained by Max Kuhn. Last updated 1 year ago.

20.0 match 591 stars 13.48 score 2.8k scripts 7 dependents

business-science

modeltime:The Tidymodels Extension for Time Series Modeling

The time series forecasting framework for use with the 'tidymodels' ecosystem. Models include ARIMA, Exponential Smoothing, and additional time series models from the 'forecast' and 'prophet' packages. Refer to "Forecasting Principles & Practice, Second edition" (<https://otexts.com/fpp2/>). Refer to "Prophet: forecasting at scale" (<https://research.facebook.com/blog/2017/02/prophet-forecasting-at-scale/>).

Maintained by Matt Dancho. Last updated 3 months ago.

arima data-science deep-learning ets forecasting machine-learning machine-learning-algorithms modeltime prophet tbats tidymodeling tidymodels time time-series time-series-analysis timeseries timeseries-forecasting

25.3 match 547 stars 10.44 score 1.1k scripts 7 dependents

tidymodels

censored:'parsnip' Engines for Survival Models

Engines for survival models from the 'parsnip' package. These include parametric models (e.g., Jackson (2016) <doi:10.18637/jss.v070.i08>), semi-parametric (e.g., Simon et al (2011) <doi:10.18637/jss.v039.i05>), and tree-based models (e.g., Buehlmann and Hothorn (2007) <doi:10.1214/07-STS242>).

Maintained by Hannah Frick. Last updated 6 months ago.

parsnip tidymodels

30.0 match 123 stars 8.72 score 246 scripts 1 dependent

tidymodels

probably:Tools for Post-Processing Predicted Values

Models can be improved by post-processing class probabilities through recalibration, conversion to hard probabilities, assessment of equivocal zones, and other activities. 'probably' contains tools for conducting these operations as well as calibration tools and conformal inference techniques for regression models.

Maintained by Max Kuhn. Last updated 3 months ago.

20.0 match 115 stars 12.01 score 21k scripts 1 dependent

tidymodels

stacks:Tidy Model Stacking

Model stacking is an ensemble technique that involves training a model to combine the outputs of many diverse statistical models, and has been shown to improve predictive performance in a variety of settings. 'stacks' implements a grammar for 'tidymodels'-aligned model stacking.

Maintained by Simon Couch. Last updated 3 months ago.

20.5 match 295 stars 11.44 score 860 scripts

tidymodels

tidypredict:Run Predictions Inside the Database

It parses a fitted 'R' model object, and returns a formula in 'Tidy Eval' code that calculates the predictions. It works with several database back-ends because it leverages 'dplyr' and 'dbplyr' for the final 'SQL' translation of the algorithm. It currently supports lm(), glm(), randomForest(), ranger(), earth(), xgb.Booster.complete(), cubist(), and ctree() models.

Maintained by Emil Hvitfeldt. Last updated 1 month ago.

dbplyr dplyr purrr rlang

20.0 match 259 stars 11.19 score 251 scripts 2 dependents

tidymodels

butcher:Model Butcher

Provides a set of S3 generics to axe components of fitted model objects and help reduce the size of model objects saved to disk.

Maintained by Julia Silge. Last updated 8 days ago.

20.0 match 132 stars 11.16 score 146 scripts 13 dependents

tidymodels

modeldata:Data Sets Useful for Modeling Examples

Data sets used for demonstrating or testing model-related packages are contained in this package.

Maintained by Max Kuhn. Last updated 3 months ago.

20.0 match 22 stars 10.88 score 2.1k scripts 14 dependents

tidymodels

textrecipes:Extra 'Recipes' for Text Processing

Converting text to numerical features requires specifically created procedures, which are implemented as steps according to the 'recipes' package. These steps allow for tokenization, filtering, counting (tf and tfidf) and feature hashing.

Maintained by Emil Hvitfeldt. Last updated 2 months ago.

cpp

20.0 match 160 stars 10.70 score 992 scripts 1 dependent

tidymodels

themis:Extra Recipes Steps for Dealing with Unbalanced Data

A dataset with an uneven number of cases in each class is said to be unbalanced. Many models produce subpar performance on unbalanced datasets. A dataset can be balanced by increasing the number of minority cases using SMOTE 2011 <doi:10.48550/arXiv.1106.1813>, BorderlineSMOTE 2005 <doi:10.1007/11538059_91> and ADASYN 2008 <https://ieeexplore.ieee.org/document/4633969>, or by decreasing the number of majority cases using NearMiss 2003 <https://www.site.uottawa.ca/~nat/Workshop2003/jzhang.pdf> or Tomek link removal 1976 <https://ieeexplore.ieee.org/document/4309452>.

Maintained by Emil Hvitfeldt. Last updated 7 hours ago.

20.0 match 143 stars 9.73 score 1.1k scripts 1 dependent

tidymodels

rules:Model Wrappers for Rule-Based Models

Bindings for additional models for use with the 'parsnip' package. Models include prediction rule ensembles (Friedman and Popescu, 2008) <doi:10.1214/07-AOAS148>, C5.0 rules (Quinlan, 1992 ISBN: 1558602380), and Cubist (Kuhn and Johnson, 2013) <doi:10.1007/978-1-4614-6849-3>.

Maintained by Emil Hvitfeldt. Last updated 3 months ago.

20.0 match 40 stars 9.47 score 20k scripts 1 dependent

tidymodels

bonsai:Model Wrappers for Tree-Based Models

Bindings for additional tree-based model engines for use with the 'parsnip' package. Models include gradient boosted decision trees with 'LightGBM' (Ke et al., 2017), conditional inference trees and conditional random forests with 'partykit' (Hothorn and Zeileis, 2015, and Hothorn et al., 2006 <doi:10.1198/106186006X133933>), and accelerated oblique random forests with 'aorsf' (Jaeger et al., 2022 <doi:10.5281/zenodo.7116854>).

Maintained by Simon Couch. Last updated 3 months ago.

20.0 match 52 stars 9.28 score 612 scripts 1 dependent

tidymodels

embed:Extra Recipes for Encoding Predictors

Predictors can be converted to one or more numeric representations using a variety of methods. Effect encodings using simple generalized linear models <doi:10.48550/arXiv.1611.09477> or nonlinear models <doi:10.48550/arXiv.1604.06737> can be used. There are also functions for dimension reduction and other approaches.

Maintained by Emil Hvitfeldt. Last updated 6 hours ago.

20.0 match 142 stars 9.17 score 1.1k scripts

tidymodels

finetune:Additional Functions for Model Tuning

The ability to tune models is important. 'finetune' enhances the 'tune' package by providing more specialized methods for finding reasonable values of model tuning parameters. Two racing methods described by Kuhn (2014) <arXiv:1405.6974> are included. An iterative search method using generalized simulated annealing (Bohachevsky, Johnson and Stein, 1986) <doi:10.1080/00401706.1986.10488128> is also included.

Maintained by Max Kuhn. Last updated 5 months ago.

20.0 match 62 stars 8.64 score 708 scripts 1 dependent

tidymodels

multilevelmod:Model Wrappers for Multi-Level Models

Bindings for hierarchical regression models for use with the 'parsnip' package. Models include longitudinal generalized linear models (Liang and Zeger, 1986) <doi:10.1093/biomet/73.1.13>, and mixed-effect models (Pinheiro and Bates) <doi:10.1007/978-1-4419-0318-1_1>.

Maintained by Hannah Frick. Last updated 3 months ago.

21.3 match 74 stars 8.07 score 211 scripts

tidymodels

tidyposterior:Bayesian Analysis to Compare Models using Resampling Statistics

Bayesian analysis is used here to answer the question: "when looking at resampling results, are the differences between models 'real'?" To answer this, a model can be created where the performance statistic is the resampling statistic (e.g. accuracy or RMSE). These values are explained by the model types. In doing this, we can get parameter estimates for each model's effect on performance and make statistical (and practical) comparisons between models. The methods included here are similar to Benavoli et al (2017) <https://jmlr.org/papers/v18/16-305.html>.

Maintained by Max Kuhn. Last updated 3 months ago.

20.0 match 102 stars 8.42 score 257 scripts

tidymodels

modelenv:Provide Tools to Register Models for Use in 'tidymodels'

A developer-focused, low-dependency package in 'tidymodels' that provides functions to register how models are to be used. Functions to register models are complemented with accessor functions to retrieve registered model information to aid in model fitting and error handling.

Maintained by Emil Hvitfeldt. Last updated 3 months ago.

23.4 match 4 stars 7.01 score 1 script 43 dependents

tidymodels

spatialsample:Spatial Resampling Infrastructure

Functions and classes for spatial resampling to use with the 'rsample' package, such as spatial cross-validation (Brenning, 2012) <doi:10.1109/IGARSS.2012.6352393>. The scope of 'rsample' and 'spatialsample' is to provide the basic building blocks for creating and analyzing resamples of a spatial data set, but neither package includes functions for modeling or computing statistics. The resampled spatial data sets created by 'spatialsample' do not contain much overhead in memory.

Maintained by Michael Mahoney. Last updated 4 months ago.

cpp

20.0 match 72 stars 8.18 score 118 scripts 2 dependents

tidymodels

agua:'tidymodels' Integration with 'h2o'

Create and evaluate models using 'tidymodels' and 'h2o' <https://h2o.ai/>. The package enables users to specify 'h2o' as an engine for several modeling methods.

Maintained by Qiushi Yan. Last updated 7 months ago.

23.9 match 22 stars 6.85 score 80 scripts

tidymodels

usemodels:Boilerplate Code for 'Tidymodels' Analyses

Code snippets to fit models using the tidymodels framework can be easily created for a given data set.

Maintained by Max Kuhn. Last updated 3 months ago.

23.7 match 84 stars 6.90 score 134 scripts

tidymodels

baguette:Efficient Model Functions for Bagging

Tree- and rule-based models can be bagged (<doi:10.1007/BF00058655>) using this package, and their prediction equations are stored in an efficient format to reduce the model object's size and improve speed.

Maintained by Max Kuhn. Last updated 3 months ago.

20.0 match 25 stars 8.05 score 624 scripts 1 dependent

tidymodels

discrim:Model Wrappers for Discriminant Analysis

Bindings for additional classification models for use with the 'parsnip' package. Models include flavors of discriminant analysis, such as linear (Fisher (1936) <doi:10.1111/j.1469-1809.1936.tb02137.x>), regularized (Friedman (1989) <doi:10.1080/01621459.1989.10478752>), and flexible (Hastie, Tibshirani, and Buja (1994) <doi:10.1080/01621459.1994.10476866>), as well as naive Bayes classifiers (Hand and Yu (2001) <doi:10.1111/j.1751-5823.2001.tb00465.x>).

Maintained by Emil Hvitfeldt. Last updated 3 months ago.

20.0 match 28 stars 8.02 score 992 scripts 1 dependent

tidymodels

orbital:Predict with 'tidymodels' Workflows in Databases

Turn 'tidymodels' workflows into objects containing the sequential equations sufficient to perform predictions. These smaller objects allow for low-dependency prediction locally or directly in databases.

Maintained by Emil Hvitfeldt. Last updated 1 month ago.

25.5 match 25 stars 6.22 score 11 scripts

tidymodels

modeldb:Fits Models Inside the Database

Uses 'dplyr' and 'tidyeval' to fit statistical models inside the database. It currently supports KMeans and linear regression models.

Maintained by Max Kuhn. Last updated 1 year ago.

database dbplyr dplyr ggplot2 modeling rlang sql tidyeval visualization

20.0 match 79 stars 7.59 score 62 scripts

tidymodels

brulee:High-Level Modeling Functions with 'torch'

Provides high-level modeling functions to define and train models using the 'torch' R package. Models include linear, logistic, and multinomial regression as well as multilayer perceptrons.

Maintained by Max Kuhn. Last updated 3 months ago.

20.0 match 67 stars 7.48 score 214 scripts

tidymodels

applicable:A Compilation of Applicability Domain Methods

A modeling package compiling applicability domain methods in R. It combines different methods to measure the amount of extrapolation new samples can have from the training set. See Netzeva et al (2005) <doi:10.1177/026119290503300209> for an overview of applicability domains.

Maintained by Marly Gotti. Last updated 2 years ago.

20.0 match 47 stars 7.44 score 49 scripts 1 dependent

tidymodels

poissonreg:Model Wrappers for Poisson Regression

Bindings for Poisson regression models for use with the 'parsnip' package. Models include simple generalized linear models, Bayesian models, and zero-inflated Poisson models (Zeileis, Kleiber, and Jackman (2008) <doi:10.18637/jss.v027.i08>).

Maintained by Hannah Frick. Last updated 2 months ago.

20.0 match 22 stars 7.26 score 342 scripts 1 dependent

tidymodels

tidyclust:A Common API to Clustering

A common interface for specifying clustering models, in the same style as 'parsnip'. Creates a unified interface across different functions and computational engines.

Maintained by Emil Hvitfeldt. Last updated 7 months ago.

20.0 match 110 stars 7.10 score 139 scripts

tidymodels

shinymodels:Interactive Assessments of Models

Launch a 'shiny' application for 'tidymodels' results. For classification or regression models, the app can be used to determine if there is lack of fit or poorly predicted points.

Maintained by Simon Couch. Last updated 3 months ago.

shiny

20.5 match 47 stars 6.39 score 50 scripts

tidymodels

plsmod:Model Wrappers for Projection Methods

Bindings for additional regression models for use with the 'parsnip' package, including ordinary and sparse partial least squares models for regression and classification (Rohart et al (2017) <doi:10.1371/journal.pcbi.1005752>).

Maintained by Max Kuhn. Last updated 3 months ago.

mixomics

20.0 match 14 stars 6.47 score 58 scripts 1 dependent

hsbadr

bayesian:Bindings for Bayesian TidyModels

Fit Bayesian models using 'brms'/'Stan' with 'parsnip'/'tidymodels' via 'bayesian' <doi:10.5281/zenodo.4426836>. 'tidymodels' is a collection of packages for machine learning; see Kuhn and Wickham (2020) <https://www.tidymodels.org>. The technical details of 'brms' and 'Stan' are described in Bürkner (2017) <doi:10.18637/jss.v080.i01>, Bürkner (2018) <doi:10.32614/RJ-2018-017>, and Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>.

Maintained by Hamada S. Badr. Last updated 17 days ago.

bayesian brms stan tidymodels

16.1 match 44 stars 7.50 score 16 scripts

hsbadr

additive:Bindings for Additive TidyModels

Fit Generalized Additive Models (GAM) using 'mgcv' with 'parsnip'/'tidymodels' via 'additive' <doi:10.5281/zenodo.4784245>. 'tidymodels' is a collection of packages for machine learning; see Kuhn and Wickham (2020) <https://www.tidymodels.org>. The technical details of 'mgcv' are described in Wood (2017) <doi:10.1201/9781315370279>.

Maintained by Hamada S. Badr. Last updated 17 days ago.

additive bam gam generalized-additive-models mgcv tidymodels

16.2 match 7 stars 6.11 score 14 scripts

spsanderson

tidyAML:Automatic Machine Learning with 'tidymodels'

The goal of this package will be to provide a simple interface for automatic machine learning that fits the 'tidymodels' framework. The intention is to work for regression and classification problems with a simple verb framework.

Maintained by Steven Sanderson. Last updated 9 months ago.

automatic-machine-learning automl classification machine-learning parsnip r-language r-programming regression tidy tidymodels tidyverse

13.7 match 67 stars 7.16 score 36 scripts 1 dependent

tidymodels

modeldatatoo:More Data Sets Useful for Modeling Examples

More data sets used for demonstrating or testing model-related packages are contained in this package. The data sets are downloaded and cached, allowing for more and bigger data sets.

Maintained by Emil Hvitfeldt. Last updated 10 months ago.

20.0 match 7 stars 4.85 score 34 scripts

business-science

modeltime.ensemble:Ensemble Algorithms for Time Series Forecasting with Modeltime

A 'modeltime' extension that implements time series ensemble forecasting methods including model averaging, weighted averaging, and stacking. These techniques are popular methods to improve forecast accuracy and stability.

Maintained by Matt Dancho. Last updated 6 months ago.

ensemble ensemble-learning forecast forecasting modeltime stacking stacking-ensemble tidymodels time time-series timeseries

11.2 match 75 stars 8.29 score 143 scripts

tidymodels

desirability2:Desirability Functions for Multiparameter Optimization

In-line functions for multivariate optimization via desirability functions (Derringer and Suich, 1980, <doi:10.1080/00224065.1980.11980968>) with easy use within `dplyr` pipelines.

Maintained by Max Kuhn. Last updated 3 months ago.

20.0 match 10 stars 4.53 score 17 scripts

evolecolgroup

tidysdm:Species Distribution Models with Tidymodels

Fit species distribution models (SDMs) using the 'tidymodels' framework, which provides a standardised interface to define models and process their outputs. 'tidysdm' expands 'tidymodels' by providing methods for spatial objects, models and metrics specific to SDMs, as well as a number of specialised functions to process occurrences for contemporary and palaeo datasets. The full functionalities of the package are described in Leonardi et al. (2023) <doi:10.1101/2023.07.24.550358>.

Maintained by Andrea Manica. Last updated 1 month ago.

9.8 match 30 stars 8.74 score 46 scripts

corybrunson

ordr:A Tidyverse Extension for Ordinations and Biplots

Ordination comprises several multivariate exploratory and explanatory techniques with theoretical foundations in geometric data analysis; see Podani (2000, ISBN:90-5782-067-6) for techniques and applications and Le Roux & Rouanet (2005) <doi:10.1007/1-4020-2236-0> for foundations. Greenacre (2010, ISBN:978-84-923846) shows how the most established of these, including principal components analysis, correspondence analysis, multidimensional scaling, factor analysis, and discriminant analysis, rely on eigen-decompositions or singular value decompositions of pre-processed numeric matrix data. These decompositions give rise to a set of shared coordinates along which the row and column elements can be measured. The overlay of their scatterplots on these axes, introduced by Gabriel (1971) <doi:10.1093/biomet/58.3.453>, is called a biplot. 'ordr' provides inspection, extraction, manipulation, and visualization tools for several popular ordination classes supported by a set of recovery methods. It is inspired by and designed to integrate into 'tidyverse' workflows provided by Wickham et al (2019) <doi:10.21105/joss.01686>.

Maintained by Jason Cory Brunson. Last updated 17 days ago.

biplot data-visualization dimension-reduction geometric-data-analysis grammar-of-graphics log-ratio-analysis multivariate-analysis multivariate-statistics ordination tidymodels tidyverse

10.0 match 22 stars 6.93 score 26 scripts

ropensci

waywiser:Ergonomic Methods for Assessing Spatial Models

Assessing predictive models of spatial data can be challenging, both because these models are typically built for extrapolating outside the original region represented by training data and due to potential spatially structured errors, with "hot spots" of higher than expected error clustered geographically due to spatial structure in the underlying data. Methods are provided for assessing models fit to spatial data, including approaches for measuring the spatial structure of model errors, assessing model predictions at multiple spatial scales, and evaluating where predictions can be made safely. Methods are particularly useful for models fit using the 'tidymodels' framework. Methods include Moran's I (Moran (1950) <doi:10.2307/2332142>), Geary's C (Geary (1954) <doi:10.2307/2986645>), Getis-Ord's G (Ord and Getis (1995) <doi:10.1111/j.1538-4632.1995.tb00912.x>), agreement coefficients from Ji and Gallo (2006) <doi:10.14358/PERS.72.7.823>, agreement metrics from Willmott (1981) <doi:10.1080/02723646.1981.10642213> and Willmott et al. (2012) <doi:10.1002/joc.2419>, an implementation of the area of applicability methodology from Meyer and Pebesma (2021) <doi:10.1111/2041-210X.13650>, and an implementation of multi-scale assessment as described in Riemann et al. (2010) <doi:10.1016/j.rse.2010.05.010>.

Maintained by Michael Mahoney. Last updated 7 months ago.

spatial spatial-analysis tidymodels tidyverse

10.5 match 38 stars 6.51 score 19 scripts

jameshwade

measure:A Recipes-style Interface to Tidymodels for Analytical Measurements

Analytical measurements...

Maintained by James Wade. Last updated 6 months ago.

recipes tidymodels

12.9 match 5 stars 5.22 score 55 scripts

business-science

modeltime.resample:Resampling Tools for Time Series Forecasting

A 'modeltime' extension that implements forecast resampling tools that assess time-based model performance and stability for a single time series, panel data, and cross-sectional time series analysis.

Maintained by Matt Dancho. Last updated 1 year ago.

accuracy-metrics backtesting bootstrap bootstrapping cross-validation forecasting modeltime modeltime-resample resampling statistics tidymodels time-series

10.0 match 19 stars 6.64 score 38 scripts 1 dependent

modeloriented

shapviz:SHAP Visualizations

Visualizations for SHAP (SHapley Additive exPlanations), such as waterfall plots, force plots, various types of importance plots, dependence plots, and interaction plots. These plots act on a 'shapviz' object created from a matrix of SHAP values and a corresponding feature dataset. Wrappers for the R packages 'xgboost', 'lightgbm', 'fastshap', 'shapr', 'h2o', 'treeshap', 'DALEX', and 'kernelshap' are added for convenience. By separating visualization and computation, it is possible to display factor variables in graphs, even if the SHAP values are calculated by a model that requires numerical features. The plots are inspired by those provided by the 'shap' package in Python, but there is no dependency on it.

Maintained by Michael Mayer. Last updated 3 days ago.

explainable-ai machine-learning shap shapley-value visualization xai

5.0 match 85 stars 9.82 score 248 scripts

mlverse

tabnet:Fit 'TabNet' Models for Classification and Regression

Implements the 'TabNet' model by Sercan O. Arik et al. (2019) <doi:10.48550/arXiv.1908.07442> with 'Coherent Hierarchical Multi-label Classification Networks' by Giunchiglia et al. <doi:10.48550/arXiv.2010.10151> and provides a consistent interface for fitting and creating predictions. It's also fully compatible with the 'tidymodels' ecosystem.

Maintained by Christophe Regouby. Last updated 4 months ago.

tabnet

3.8 match 108 stars 9.08 score 62 scripts

rstudio

bundle:Serialize Model Objects with a Consistent Interface

Typically, models in 'R' exist in memory and can be saved via regular 'R' serialization. However, some models store information in locations that cannot be saved using 'R' serialization alone. The goal of 'bundle' is to provide a common interface to capture this information, situate it within a portable object, and restore it for use in new settings.

Maintained by Julia Silge. Last updated 2 months ago.

3.8 match 29 stars 9.09 score 171 scripts 3 dependents

paithiov909

baritsu:Wrappers for 'mlpack'

A collection of wrappers for the 'mlpack' package that allow passing a formula as an argument.

Maintained by Akiru Kato. Last updated 17 days ago.

tidymodels

10.0 match 3 stars 3.13 score 1 script

openpharma

mmrm:Mixed Models for Repeated Measures

Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E> for a tutorial and Mallinckrodt, Lane, Schnell, Peng and Mancuso (2008) <doi:10.1177/009286150804200402> for a review. This package implements MMRM based on the marginal linear model without random effects using Template Model Builder ('TMB') which enables fast and robust model fitting. Users can specify a variety of covariance matrices, weight observations, fit models with restricted or standard maximum likelihood inference, perform hypothesis testing with Satterthwaite or Kenward-Roger adjustment, and extract least square means estimates by using 'emmeans'.

Maintained by Daniel Sabanes Bove. Last updated 10 days ago.

cpp

2.0 match 137 stars 12.08 score 112 scripts 4 dependents

mattheaphy

offsetreg:An Extension of 'Tidymodels' Supporting Offset Terms

Extend the 'tidymodels' ecosystem <https://www.tidymodels.org/> to enable the creation of predictive models with offset terms. Models with offsets are most useful when working with count data or when fitting an adjustment model on top of an existing model with a prior expectation. The former situation is common in insurance where data is often weighted by exposures. The latter is common in life insurance where industry mortality tables are often used as a starting point for setting assumptions.

Maintained by Matt Heaphy. Last updated 10 months ago.

3.8 match 4.18 score 4 scripts

norskregnesentral

shapr:Prediction Explanation with Dependence-Aware Shapley Values

Complex machine learning models are often hard to interpret. However, in many situations it is crucial to understand and explain why a model made a specific prediction. Shapley values are the only prediction explanation framework with a solid theoretical foundation. Previously known methods for estimating the Shapley values do, however, assume feature independence. This package implements methods which account for any feature dependence, and thereby produce more accurate estimates of the true Shapley values. An accompanying 'Python' wrapper ('shaprpy') is available through the GitHub repository.

Maintained by Martin Jullum. Last updated 21 hours ago.

explainable-ai explainable-ml rcpp rcpparmadillo shapley openblas cpp openmp

1.5 match 152 stars 10.42 score 172 scripts 1 dependent

statsgary

OddsPlotty:Odds Plot to Visualise a Logistic Regression Model

Uses the outputs of a logistic regression model, from caret <https://CRAN.R-project.org/package=caret>, to build an odds plot. This allows for the rapid visualisation of odds plot ratios and works best with the outputs of CARET's GLM model class, by returning the final trained model.

Maintained by Gary Hutson. Last updated 7 months ago.

2.5 match 17 stars 6.09 score 48 scripts 1 dependents
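
The quantities an odds plot displays can be sketched in a few lines: each logistic regression coefficient is exponentiated to give an odds ratio, with a Wald-style interval around it. The coefficients and standard errors below are hypothetical, and this plain-Python sketch does not use 'OddsPlotty' or 'caret'.

```python
import math

# Hypothetical fitted coefficients: name -> (estimate, standard error).
coefficients = {"age": (0.05, 0.01), "smoker": (0.70, 0.20)}


def odds_ratio_table(coefs, z=1.96):
    """Return name -> (odds ratio, lower bound, upper bound).

    Bounds are exp(beta -/+ z * se), an approximate 95% interval for z = 1.96.
    """
    table = {}
    for name, (beta, se) in coefs.items():
        table[name] = (
            math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se),
        )
    return table


table = odds_ratio_table(coefficients)
for name, (ratio, lower, upper) in table.items():
    print(f"{name}: OR={ratio:.3f} [{lower:.3f}, {upper:.3f}]")
```

An interval that excludes 1 suggests the predictor is associated with a change in the odds of the outcome.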

modeloriented

DALEXtra:Extension for 'DALEX' Package

Provides wrappers for various machine learning models. In applied machine learning, there is a strong belief that we need to strike a balance between interpretability and accuracy. However, in the field of interpretable machine learning, there are more and more new ideas for explaining black-box models that are implemented in 'R'. 'DALEXtra' creates 'DALEX' (Biecek (2018) <arXiv:1806.08915>) explainers for many types of models, including those created using the 'python' 'scikit-learn' and 'keras' libraries and the 'java' 'h2o' library. An important part of the package is Champion-Challenger analysis and an innovative approach to model performance across subsets of test data, presented in a Funnel Plot.

Maintained by Szymon Maksymiuk. Last updated 2 years ago.

extension-for-dalex-package

1.9 match 66 stars 7.68 score 406 scripts 1 dependents

microsoft

finnts:Microsoft Finance Time Series Forecasting Framework

Automated time series forecasting developed by Microsoft Finance. The Microsoft Finance Time Series Forecasting Framework, aka Finn, can be used to forecast any component of the income statement, balance sheet, or any other area of interest to finance. Finn can forecast any numerical quantity over time. While it can be applied outside of the finance domain, Finn was built to meet the needs of financial analysts to better forecast their businesses within a company, and has many built-in features specific to the needs of financial forecasters. Happy forecasting!

Maintained by Mike Tokic. Last updated 3 months ago.

business data-science feature-selection finance finnts forecasting machine-learning microsoft time-series

1.3 match 191 stars 9.43 score 39 scripts

modeloriented

modelStudio:Interactive Studio for Explanatory Model Analysis

Automate the explanatory analysis of machine learning predictive models. Generate advanced interactive model explanations in the form of a serverless HTML site with only one line of code. This tool is model-agnostic, therefore compatible with most of the black-box predictive models and frameworks. The main function computes various (instance and model-level) explanations and produces a customisable dashboard, which consists of multiple panels for plots with their short descriptions. It is possible to easily save the dashboard and share it with others. 'modelStudio' facilitates the process of Interactive Explanatory Model Analysis introduced in Baniecki et al. (2023) <doi:10.1007/s10618-023-00924-w>.

Maintained by Hubert Baniecki. Last updated 1 year ago.

ai explainable explainable-ai explainable-machine-learning explanatory-model-analysis human iml interactive interactivity interpretability interpretable interpretable-machine-learning learning machine model model-visualization visualization xai

1.5 match 328 stars 7.92 score 56 scripts

gmcmacran

tidydann:Add the 'dann' Model and the 'sub_dann' Model to the Tidymodels Ecosystem

Provides model specifications and tuning parameters for models in the 'dann' package. The models are based on Hastie (1996) <https://web.stanford.edu/~hastie/Papers/dann_IEEE.pdf>.

Maintained by Greg McMahan. Last updated 7 months ago.

2.9 match 3.04 score 11 scripts

mattheaphy

actxps:Create Actuarial Experience Studies: Prepare Data, Summarize Results, and Create Reports

Experience studies are used by actuaries to explore historical experience across blocks of business and to inform assumption setting activities. This package provides functions for preparing data, creating studies, visualizing results, and beginning assumption development. Experience study methods, including exposure calculations, are described in: Atkinson & McGarry (2016) "Experience Study Calculations" <https://www.soa.org/49378a/globalassets/assets/files/research/experience-study-calculations.pdf>. The limited fluctuation credibility method used by the 'exp_stats()' function is described in: Herzog (1999, ISBN:1-56698-374-6) "Introduction to Credibility Theory".

Maintained by Matt Heaphy. Last updated 19 days ago.

1.3 match 14 stars 6.38 score 23 scripts

statsgary

MLDataR:Collection of Machine Learning Datasets for Supervised Machine Learning

Contains a collection of datasets for working with machine learning tasks. It will contain datasets for supervised machine learning, Jiang (2020) <doi:10.1016/j.beth.2020.05.002>, and will include datasets for classification and regression. The aim of this package is to use data generated around health and other domains.

Maintained by Gary Hutson. Last updated 1 year ago.

1.2 match 52 stars 5.67 score 18 scripts

tommyjones

tidylda:Latent Dirichlet Allocation Using 'tidyverse' Conventions

Implements an algorithm for Latent Dirichlet Allocation (LDA), Blei et al. (2003) <https://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf>, using style conventions from the 'tidyverse', Wickham et al. (2019) <doi:10.21105/joss.01686>, and 'tidymodels', Kuhn et al. <https://tidymodels.github.io/model-implementation-principles/>. Fitting is done via collapsed Gibbs sampling. Also implements several novel features for LDA such as guided models and transfer learning.

Maintained by Tommy Jones. Last updated 7 days ago.

cpp openmp

0.8 match 41 stars 7.36 score 53 scripts

modeloriented

kernelshap:Kernel SHAP

Efficient implementation of Kernel SHAP, see Lundberg and Lee (2017), and Covert and Lee (2021) <http://proceedings.mlr.press/v130/covert21a>. Furthermore, for up to 14 features, exact permutation SHAP values can be calculated. The package plays well together with meta-learning packages like 'tidymodels', 'caret' or 'mlr3'. Visualizations can be done using the R package 'shapviz'.

Maintained by Michael Mayer. Last updated 5 months ago.

explainable-ai interpretability interpretable-machine-learning machine-learning shap xai

0.5 match 42 stars 8.14 score 117 scripts 17 dependents

grantmcdermott

parttree:Visualize Simple 2-D Decision Tree Partitions

Visualize the partitions of simple decision trees, involving one or two predictors, on the scale of the original data. Provides an intuitive alternative to traditional tree diagrams, by visualizing how a decision tree divides the predictor space in a simple 2D plot alongside the original data. The 'parttree' package supports both classification and regression trees from 'rpart' and 'partykit', as well as trees produced by popular frontend systems like 'tidymodels' and 'mlr3'. Visualization methods are provided for both base R graphics and 'ggplot2'.

Maintained by Grant McDermott. Last updated 7 days ago.

0.5 match 95 stars 6.72 score 92 scripts

ashbythorpe

nestedmodels:Tidy Modelling for Nested Data

A modelling framework for nested data using the 'tidymodels' ecosystem. Specify how to nest data using the 'recipes' package, create testing and training splits using 'rsample', and fit models to this data using the 'parsnip' and 'workflows' packages. Allows any model to be fit to nested data.

Maintained by Ashby Thorpe. Last updated 1 year ago.

0.5 match 12 stars 5.53 score 14 scripts

modeloriented

hstats:Interaction Statistics

Fast, model-agnostic implementation of different H-statistics introduced by Jerome H. Friedman and Bogdan E. Popescu (2008) <doi:10.1214/07-AOAS148>. These statistics quantify interaction strength per feature, feature pair, and feature triple. The package supports multi-output predictions and can account for case weights. In addition, several variants of the original statistics are provided. The shape of the interactions can be explored through partial dependence plots or individual conditional expectation plots. 'DALEX' explainers, meta learners ('mlr3', 'tidymodels', 'caret') and most other models work out-of-the-box.

Maintained by Michael Mayer. Last updated 5 months ago.

interaction interpretability machine-learning rstat statistics xai

0.5 match 28 stars 5.52 score 34 scripts

mayer79

effectplots:Effect Plots

High-performance implementation of various effect plots useful for regression and probabilistic classification tasks. The package includes partial dependence plots (Friedman, 2001, <doi:10.1214/aos/1013203451>), accumulated local effect plots and M-plots (both from Apley and Zhu, 2016, <doi:10.1111/rssb.12377>), as well as plots that describe the statistical associations between model response and features. It supports visualizations with either 'ggplot2' or 'plotly', and is compatible with most models, including 'Tidymodels', models wrapped in 'DALEX' explainers, or models with case weights.

Maintained by Michael Mayer. Last updated 11 days ago.

machine-learning regression xai cpp

0.5 match 18 stars 5.03 score 7 scripts

juanv66x

viralmodels:Viral Load and CD4 Lymphocytes Regression Models

Provides a comprehensive framework for building, evaluating, and visualizing regression models for analyzing viral load and CD4 (Cluster of Differentiation 4) lymphocytes data. It leverages the principles of the tidymodels ecosystem of Max Kuhn and Hadley Wickham (2020) <https://www.tidymodels.org> to offer a user-friendly experience in model development. This package includes functions for data preprocessing, feature engineering, model training, tuning, and evaluation, along with visualization tools to enhance the interpretation of model results. It is specifically designed for researchers in biostatistics, computational biology, and HIV research who aim to perform reproducible and rigorous analyses to gain insights into disease dynamics. The main focus is on improving the understanding of the relationships between viral load, CD4 lymphocytes, and other relevant covariates to contribute to HIV research and the visibility of vulnerable seropositive populations.

Maintained by Juan Pablo Acuña González. Last updated 27 days ago.

0.8 match 3.30 score 7 scripts

juanv66x

viralx:Explainers for Regression Models in HIV Research

A dedicated viral-explainer model tool designed to empower researchers in the field of HIV research, particularly in viral load and CD4 (Cluster of Differentiation 4) lymphocytes regression modeling. Drawing inspiration from the 'tidymodels' framework for rigorous model building of Max Kuhn and Hadley Wickham (2020) <https://www.tidymodels.org> and the 'DALEXtra' tool for explainability by Przemyslaw Biecek (2020) <doi:10.48550/arXiv.2009.13248>, it aims to facilitate interpretable and reproducible research in biostatistics and computational biology for the benefit of understanding HIV dynamics.

Maintained by Juan Pablo Acuña González. Last updated 3 months ago.

0.8 match 3.00 score 1 scripts

albertoalmuinha

shinyrecipes:Gadget to Use the Data Preprocessing 'recipes' Package Interactively

This gadget allows you to use the 'recipes' package belonging to 'tidymodels' to carry out the data preprocessing tasks in an interactive way. Build your 'recipe' by dragging the variables, visually analyze your data to decide which steps to use, add those steps and pre-process your data.

Maintained by Alberto Almuiña. Last updated 5 years ago.

0.5 match 19 stars 3.98 score 5 scripts