Showing 18 of 18 results
tidyverse
dplyr: A Grammar of Data Manipulation
A fast, consistent tool for working with data frame like objects, both in memory and out of memory.
Maintained by Hadley Wickham. Last updated 26 days ago.
4.8k stars 24.68 score 659k scripts 7.8k dependents
r-lib
generics: Common S3 Generics not Provided by Base R Methods Related to Model Fitting
In order to reduce potential package dependencies and conflicts, generics provides a number of commonly used S3 generics.
Maintained by Hadley Wickham. Last updated 1 year ago.
61 stars 14.00 score 131 scripts 9.8k dependents
modeloriented
DALEX: moDel Agnostic Language for Exploration and eXplanation
Any unverified black box model is the path to failure. Opaqueness leads to distrust. Distrust leads to ignoration. Ignoration leads to rejection. The DALEX package x-rays any model and helps to explore and explain its behaviour. Machine Learning (ML) models are widely used and have various applications in classification or regression. Models created with boosting, bagging, stacking or similar techniques are often used due to their high performance. But such black-box models usually lack direct interpretability. The DALEX package contains various methods that help to understand the link between input variables and model output. Implemented methods help to explore the model at the level of a single instance as well as at the level of the whole dataset. All model explainers are model agnostic and can be compared across different models. The DALEX package is the cornerstone of the 'DrWhy.AI' universe of packages for visual model exploration. Find more details in (Biecek 2018) <https://jmlr.org/papers/v19/18-416.html>.
Maintained by Przemyslaw Biecek. Last updated 2 months ago.
black-box, dalex, data-science, explainable-ai, explainable-artificial-intelligence, explainable-ml, explanations, explanatory-model-analysis, fairness, iml, interpretability, interpretable-machine-learning, machine-learning, model-visualization, predictive-modeling, responsible-ai, responsible-ml, xai
1.4k stars 13.40 score 876 scripts 21 dependents
thomasp85
lime: Local Interpretable Model-Agnostic Explanations
When building complex models, it is often difficult to explain why the model should be trusted. While global measures such as accuracy are useful, they cannot be used for explaining why a model made a specific prediction. 'lime' (a port of the 'lime' 'Python' package) is a method for explaining the outcome of black box models by fitting a local model around the point in question and perturbations of this point. The approach is described in more detail in the article by Ribeiro et al. (2016) <arXiv:1602.04938>.
Maintained by Emil Hvitfeldt. Last updated 3 years ago.
caret, model-checking, model-evaluation, modeling, cpp
485 stars 11.07 score 732 scripts 1 dependents
norskregnesentral
shapr: Prediction Explanation with Dependence-Aware Shapley Values
Complex machine learning models are often hard to interpret. However, in many situations it is crucial to understand and explain why a model made a specific prediction. Shapley values are the only such prediction explanation framework with a solid theoretical foundation. Previously known methods for estimating the Shapley values do, however, assume feature independence. This package implements methods that account for any feature dependence, and thereby produce more accurate estimates of the true Shapley values. An accompanying 'Python' wrapper ('shaprpy') is available through the GitHub repository.
Maintained by Martin Jullum. Last updated 2 days ago.
explainable-ai, explainable-ml, rcpp, rcpparmadillo, shapley, openblas, cpp, openmp
154 stars 10.59 score 175 scripts 1 dependents
trinker
qdapRegex: Regular Expression Removal, Extraction, and Replacement Tools
A collection of regular expression tools associated with the 'qdap' package that may be useful outside of the context of discourse analysis. Tools include removal/extraction/replacement of abbreviations, dates, dollar amounts, email addresses, hash tags, numbers, percentages, citations, person tags, phone numbers, times, and zip codes.
Maintained by Tyler Rinker. Last updated 7 days ago.
50 stars 9.47 score 502 scripts 41 dependents
ropensci
elastic: General Purpose Interface to 'Elasticsearch'
Connect to 'Elasticsearch', a 'NoSQL' database built on the 'Java' Virtual Machine. Interacts with the 'Elasticsearch' 'HTTP' API (<https://www.elastic.co/elasticsearch/>), including functions for setting connection details to 'Elasticsearch' instances, loading bulk data, and searching for documents with both 'HTTP' query variables and 'JSON' based body requests. In addition, 'elastic' provides functions for interacting with APIs for 'indices', documents, nodes, clusters, an interface to the cat API, and more.
Maintained by Scott Chamberlain. Last updated 2 years ago.
database, elasticsearch, http, api, search, nosql, java, json, documents, data-science, database-wrapper, etl
247 stars 8.98 score 151 scripts 1 dependents
bgreenwell
fastshap: Fast Approximate Shapley Values
Computes fast (relative to other implementations) approximate Shapley values for any supervised learning model. Shapley values help to explain the predictions from any black box model using ideas from game theory; see Štrumbelj and Kononenko (2014) <doi:10.1007/s10115-013-0679-x> for details.
Maintained by Brandon Greenwell. Last updated 1 year ago.
explainable-ai, explainable-ml, interpretable-machine-learning, shapley, shapley-values, variable-importance, xai, cpp
119 stars 8.65 score 155 scripts 2 dependents
marjoleinf
pre: Prediction Rule Ensembles
Derives prediction rule ensembles (PREs). Largely follows the procedure for deriving PREs as described in Friedman & Popescu (2008; <DOI:10.1214/07-AOAS148>), with adjustments and improvements. The main function pre() derives prediction rule ensembles consisting of rules and/or linear terms for continuous, binary, count, multinomial, and multivariate continuous responses. Function gpe() derives generalized prediction ensembles, consisting of rules, hinge and linear functions of the predictor variables.
Maintained by Marjolein Fokkema. Last updated 9 months ago.
58 stars 8.55 score 98 scripts 1 dependents
modeloriented
survex: Explainable Machine Learning in Survival Analysis
Survival analysis models are commonly used in medicine and other areas. Many of them are too complex to be interpreted by humans. Exploration and explanation are needed, but standard methods do not give a broad enough picture. 'survex' provides easy-to-apply methods for explaining survival models, both complex black-boxes and simpler statistical models. They include methods specific to survival analysis such as SurvSHAP(t) introduced in Krzyzinski et al. (2023) <doi:10.1016/j.knosys.2022.110234>, SurvLIME described in Kovalev et al. (2020) <doi:10.1016/j.knosys.2020.106164>, as well as extensions of existing ones described in Biecek et al. (2021) <doi:10.1201/9780429027192>.
Maintained by Mikołaj Spytek. Last updated 10 months ago.
biostatistics, brier-scores, censored-data, cox-model, cox-regression, explainable-ai, explainable-machine-learning, explainable-ml, explanatory-model-analysis, interpretable-machine-learning, interpretable-ml, machine-learning, probabilistic-machine-learning, shap, survival-analysis, time-to-event, variable-importance, xai
110 stars 8.40 score 114 scripts
jpfitzinger
tidyfit: Regularized Linear Modeling with Tidy Data
An extension to the 'R' tidy data environment for automated machine learning. The package allows fitting and cross validation of linear regression and classification algorithms on grouped data.
Maintained by Johann Pfitzinger. Last updated 2 months ago.
auto-ml, classification, machine-learning, regression, tidyverse
16 stars 7.12 score 26 scripts
egenn
rtemis: Machine Learning and Visualization
Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.
Maintained by E.D. Gennatas. Last updated 2 months ago.
data-science, data-visualization, machine-learning, machine-learning-library, visualization
145 stars 7.09 score 50 scripts 2 dependents
hadley
ggvis: Interactive Grammar of Graphics
An implementation of an interactive grammar of graphics, taking the best parts of 'ggplot2', combining them with the reactive framework of 'shiny' and drawing web graphics using 'vega'.
Maintained by Hadley Wickham. Last updated 1 year ago.
1 star 7.02 score 2.3k scripts 11 dependents
dcousin3
ANOPA: Analyses of Proportions using Anscombe Transform
Analyses of Proportions can be performed on Anscombe (arcsine-related) transformed data. The 'ANOPA' package can analyze proportions obtained from up to four factors. The factors can be within-subject, between-subject, or a mix of both. The main, omnibus analysis can be followed by additive decompositions into interaction effects, main effects, simple effects, contrast effects, etc., mimicking precisely the logic of ANOVA. For that reason, we call this set of tools 'ANOPA' (Analysis of Proportion using Anscombe transform) to highlight its similarities with ANOVA. The 'ANOPA' framework also makes plots of proportions, along with confidence intervals, easy to obtain. Finally, effect sizes and statistical power planning are easily computed under this framework. One particularity: 'ANOPA' computes F statistics that have infinite degrees of freedom in the denominator. See Laurencelle and Cousineau (2023) <doi:10.3389/fpsyg.2022.1045436>.
Maintained by Denis Cousineau. Last updated 2 months ago.
error-bars, proportions, statistical-testing, statistics, summary-statistics
1 star 3.56 score 18 scripts
dcousin3
CohensdpLibrary: Cohen's d_p Computation with Confidence Intervals
Computes Cohen's d_p in any experimental design (between-subject, within-subject, and single-group design). Cousineau (2022) <https://github.com/dcousin3/CohensdpLibrary/>; Cohen (1969, ISBN: 0-8058-0283-5).
Maintained by Denis Cousineau. Last updated 17 days ago.
1 star 3.00 score 3 scripts
computationalstylistics
litRiddle: Dataset and Tools to Research the Riddle of Literary Quality
Dataset and functions to explore quality of literary novels. The package is a part of the Riddle of Literary Quality project, and it contains the data of a reader survey about fiction in Dutch, a description of the novels the readers rated, and the results of stylistic measurements of the novels. The package also contains functions to combine, analyze, and visualize these data. For more details, see: Eder M, van Zundert J, Lensink S, van Dalen-Oskam K (2022). Replicating The Riddle of Literary Quality: The litRiddle package for R. In _Digital Humanities 2022: Conference Abstracts_, 636-637.
Maintained by Maciej Eder. Last updated 2 years ago.
2.70 score 2 scripts