R-universe search: exports:recall

Showing 18 of 18 results

topepo

caret:Classification and Regression Training

Misc functions for training and plotting classification and regression models.

Maintained by Max Kuhn. Last updated 4 months ago.

1.6k stars 19.24 score 61k scripts 303 dependents

tidymodels

yardstick:Tidy Characterizations of Model Performance

Tidy tools for quantifying how well a model fits a data set, such as confusion matrices, class probability curve summaries, and regression metrics (e.g., RMSE).

Maintained by Emil Hvitfeldt. Last updated 18 days ago.

387 stars 15.47 score 2.2k scripts 60 dependents

mfrasco

Metrics:Evaluation Metrics for Machine Learning

An implementation of evaluation metrics in R that are commonly used in supervised machine learning. It implements metrics for regression, time series, binary classification, classification, and information retrieval problems. It has zero dependencies and a consistent, simple interface for all functions.

Maintained by Michael Frasco. Last updated 6 years ago.

99 stars 13.02 score 6.1k scripts 51 dependents

jackstat

ModelMetrics:Rapid Calculation of Model Metrics

Collection of metrics for evaluating models written in C++ using 'Rcpp'. Popular metrics include area under the curve, log loss, root mean square error, etc.

Maintained by Tyler Hunt. Last updated 4 years ago.

auc logloss machine-learning metrics model-evaluation model-metrics cpp

29 stars 11.83 score 1.3k scripts 306 dependents

thie1e

cutpointr:Determine and Evaluate Optimal Cutpoints in Binary Classification Tasks

Estimate cutpoints that optimize a specified metric in binary classification tasks and validate performance using bootstrapping. Some methods for more robust cutpoint estimation are supported, e.g. a parametric method assuming normal distributions, bootstrapped cutpoints, and smoothing of the metric values per cutpoint using Generalized Additive Models. Various plotting functions are included. For an overview of the package see Thiele and Hirschfeld (2021) <doi:10.18637/jss.v098.i11>.

Maintained by Christian Thiele. Last updated 4 months ago.

bootstrapping cutpoint-optimization roc-curve cpp

88 stars 10.44 score 322 scripts 1 dependents

brian-j-smith

MachineShop:Machine Learning Models and Tools

Meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Approaches for model fitting and prediction of numerical, categorical, or censored time-to-event outcomes include traditional regression models, regularization methods, tree-based methods, support vector machines, neural networks, ensembles, data preprocessing, filtering, and model tuning and selection. Performance metrics are provided for model assessment and can be estimated with independent test sets, split sampling, cross-validation, or bootstrap resampling. Resample estimation can be executed in parallel for faster processing and nested in cases of model tuning and selection. Modeling results can be summarized with descriptive statistics; calibration curves; variable importance; partial dependence plots; confusion matrices; and ROC, lift, and other performance curves.

Maintained by Brian J Smith. Last updated 7 months ago.

classification-models machine-learning predictive-modeling regression-models survival-models

62 stars 7.95 score 121 scripts

adriancorrendo

metrica:Prediction Performance Metrics

A compilation of more than 80 functions designed to quantitatively and visually evaluate prediction performance of regression (continuous variables) and classification (categorical variables) of point-forecast models (e.g. APSIM, DSSAT, DNDC, supervised Machine Learning). For regression, it includes functions to generate plots (scatter, tiles, density, & Bland-Altman plot), and to estimate error metrics (e.g. MBE, MAE, RMSE), error decomposition (e.g. lack of accuracy-precision), model efficiency (e.g. NSE, E1, KGE), indices of agreement (e.g. d, RAC), goodness of fit (e.g. r, R2), adjusted correlation coefficients (e.g. CCC, dcorr), symmetric regression coefficients (intercept, slope), and mean absolute scaled error (MASE) for time series predictions. For classification (binomial and multinomial), it offers functions to generate and plot confusion matrices, and to estimate performance metrics such as accuracy, precision, recall, specificity, F-score, Cohen's Kappa, G-mean, and many more. For more details visit the vignettes <https://adriancorrendo.github.io/metrica/>.

Maintained by Adrian A. Correndo. Last updated 9 months ago.

77 stars 7.88 score 49 scripts

bioc

MLInterfaces:Uniform interfaces to R machine learning procedures for data in Bioconductor containers

This package provides uniform interfaces to machine learning code for data in R and Bioconductor containers.

Maintained by Vincent Carey. Last updated 5 months ago.

classification clustering

7.63 score 79 scripts 6 dependents

fcharte

mldr:Exploratory Data Analysis and Manipulation of Multi-Label Data Sets

Exploratory data analysis and manipulation functions for multi-label data sets along with an interactive Shiny application to ease their use.

Maintained by David Charte. Last updated 5 years ago.

23 stars 7.07 score 168 scripts 2 dependents

mayer79

MetricsWeighted:Weighted Metrics and Performance Measures for Machine Learning

Provides weighted versions of several metrics and performance measures used in machine learning, including average unit deviances of the Bernoulli, Tweedie, Poisson, and Gamma distributions, see Jorgensen B. (1997, ISBN: 978-0412997112). The package also contains a weighted version of generalized R-squared, see e.g. Cohen, J. et al. (2002, ISBN: 978-0805822236). Furthermore, 'dplyr' chains are supported.

Maintained by Michael Mayer. Last updated 8 months ago.

machine-learning metrics performance statistics

11 stars 6.79 score 75 scripts 5 dependents

docma-tu

tosca:Tools for Statistical Content Analysis

A framework for statistical analysis in content analysis. In addition to a pipeline for preprocessing text corpora and linking to the latent Dirichlet allocation from the 'lda' package, plots are offered for the descriptive analysis of text corpora and topic models. In addition, an implementation of Chang's intruder words and intruder topics is provided. Sample data for the vignette is included in the toscaData package, which is available on GitHub: <https://github.com/Docma-TU/toscaData>.

Maintained by Lars Koppers. Last updated 3 years ago.

16 stars 6.64 score 61 scripts 1 dependents

serkor1

SLmetrics:Machine Learning Performance Evaluation on Steroids

Performance evaluation metrics for supervised and unsupervised machine learning, statistical learning and artificial intelligence applications. Core computations are implemented in 'C++' for scalability and efficiency.

Maintained by Serkan Korkmaz. Last updated 2 days ago.

cpp data-analysis data-science eigen3 machine-learning performance-metrics rcpp rcppeigen statistics supervised-learning openmp

22 stars 6.56 score

dppalomar

spectralGraphTopology:Learning Graphs from Data via Spectral Constraints

In the era of big data and hyperconnectivity, learning high-dimensional structures such as graphs from data has become a prominent task in machine learning and has found applications in many fields such as finance, health care, and networks. 'spectralGraphTopology' is an open source, documented, and well-tested R package for learning graphs from data. It provides implementations of state of the art algorithms such as Combinatorial Graph Laplacian Learning (CGL), Spectral Graph Learning (SGL), Graph Estimation based on Majorization-Minimization (GLE-MM), and Graph Estimation based on Alternating Direction Method of Multipliers (GLE-ADMM). In addition, graph learning has been widely employed for clustering, where specific algorithms are available in the literature. To this end, we provide an implementation of the Constrained Laplacian Rank (CLR) algorithm.

Maintained by Ze Vinicius. Last updated 3 years ago.

openblas cpp

2 stars 5.91 score 135 scripts 1 dependents

annennenne

causalDisco:Tools for Causal Discovery on Observational Data

Various tools for inferring causal models from observational data. The package includes an implementation of the temporal Peter-Clark (TPC) algorithm; see Petersen, Osler and Ekstrøm (2021) <doi:10.1093/aje/kwab087>. It also includes general tools for evaluating differences in adjacency matrices, which can be used for evaluating performance of causal discovery procedures.

Maintained by Anne Helby Petersen. Last updated 27 days ago.

19 stars 4.76 score 10 scripts

ianmtaylor1

bstrl:Bayesian Streaming Record Linkage

Perform record linkage on streaming files using recursive Bayesian updating.

Maintained by Ian Taylor. Last updated 2 years ago.

2 stars 4.00 score 2 scripts

certe-medical-epidemiology

certetoolbox:A Certe R Package for Miscellaneous Functions

A Certe R Package for miscellaneous functions that do not fit a dedicated package. This package also mitigates the 'vctrs' package by allowing numeric-character coercions. This package is part of the 'certedata' universe.

Maintained by Erwin E. A. Hassing. Last updated 10 days ago.

3.45 score 1 scripts 1 dependents

kwb-r

kwb.context:Get the Function Call Context and Work with it

This package contains functions to get the full tree of function calls that is evaluated when calling a function. The idea is to reuse some of these calls with modified arguments, e.g. to replot a specific plot that was created by an inner plot function that was called from an outer function.

Maintained by Hauke Sonnenberg. Last updated 4 years ago.

2.70 score 3 scripts

codymarquart

rhoR:Rho for Inter Rater Reliability

Rho is used to test the generalization of inter rater reliability (IRR) statistics. Calculating rho starts by generating a large number of simulated, fully-coded data sets: a sizable collection of hypothetical populations, all of which have a kappa value below a given threshold -- which indicates unacceptable agreement. Then kappa is calculated on a sample from each of those sets in the collection to see if it is equal to or higher than the kappa in the real sample. If less than five percent of the distribution of samples from the simulated data sets is greater than the actual observed kappa, the null hypothesis is rejected and one can conclude that if the two raters had coded the rest of the data, we would have acceptable agreement (kappa above the threshold).

Maintained by Cody L Marquart. Last updated 5 years ago.

cpp

2.18 score 8 scripts 1 dependents