R-universe search: rmse

hzambran

hydroGOF:Goodness-of-Fit Functions for Comparison of Simulated and Observed Hydrological Time Series

S3 functions implementing both statistical and graphical goodness-of-fit measures between observed and simulated values, mainly oriented to be used during the calibration, validation, and application of hydrological models. Missing values in observed and/or simulated values can be removed before computations. Comments / questions / collaboration of any kind are very welcomed.

Maintained by Mauricio Zambrano-Bigiarini. Last updated 10 months ago.

13.9 match 40 stars 10.29 score 796 scripts 8 dependents

svkucheryavski

mdatools:Multivariate Data Analysis for Chemometrics

Projection based methods for preprocessing, exploring and analysis of multivariate data used in chemometrics. S. Kucheryavskiy (2020) <doi:10.1016/j.chemolab.2020.103937>.

Maintained by Sergey Kucheryavskiy. Last updated 8 months ago.

11.3 match 35 stars 7.37 score 220 scripts 1 dependents

tirgit

missCompare:Intuitive Missing Data Imputation Framework

Offers a convenient pipeline to test and compare various missing data imputation algorithms on simulated and real data. These include simpler methods, such as mean and median imputation and random replacement, but also include more sophisticated algorithms already implemented in popular R packages, such as 'mi', described by Su et al. (2011) <doi:10.18637/jss.v045.i02>; 'mice', described by van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>; 'missForest', described by Stekhoven and Buhlmann (2012) <doi:10.1093/bioinformatics/btr597>; 'missMDA', described by Josse and Husson (2016) <doi:10.18637/jss.v070.i01>; and 'pcaMethods', described by Stacklies et al. (2007) <doi:10.1093/bioinformatics/btm069>. The central assumption behind 'missCompare' is that structurally different datasets (e.g. larger datasets with a large number of correlated variables vs. smaller datasets with non correlated variables) will benefit differently from different missing data imputation algorithms. 'missCompare' takes measurements of your dataset and sets up a sandbox to try a curated list of standard and sophisticated missing data imputation algorithms and compares them assuming custom missingness patterns. 'missCompare' will also impute your real-life dataset for you after the selection of the best performing algorithm in the simulations. The package also provides various post-imputation diagnostics and visualizations to help you assess imputation performance.

Maintained by Tibor V. Varga. Last updated 4 years ago.

comparison comparison-benchmarks imputation imputation-algorithm imputation-methods imputations kolmogorov-smirnov missing missing-data missing-data-imputation missing-status-check missing-values missingness post-imputation-diagnostics rmse

12.5 match 39 stars 5.89 score 40 scripts

topepo

caret:Classification and Regression Training

Misc functions for training and plotting classification and regression models.

Maintained by Max Kuhn. Last updated 3 months ago.

3.3 match 1.6k stars 19.24 score 61k scripts 303 dependents

eagerai

fastai:Interface to 'fastai'

The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research in to deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.

Maintained by Turgut Abdullayev. Last updated 11 months ago.

audio collaborative-filtering darknet darknet-image-classification fastai medical object-detection tabular text vision

6.6 match 118 stars 9.40 score 76 scripts

adriancorrendo

metrica:Prediction Performance Metrics

A compilation of more than 80 functions designed to quantitatively and visually evaluate prediction performance of regression (continuous variables) and classification (categorical variables) of point-forecast models (e.g. APSIM, DSSAT, DNDC, supervised Machine Learning). For regression, it includes functions to generate plots (scatter, tiles, density, & Bland-Altman plot), and to estimate error metrics (e.g. MBE, MAE, RMSE), error decomposition (e.g. lack of accuracy-precision), model efficiency (e.g. NSE, E1, KGE), indices of agreement (e.g. d, RAC), goodness of fit (e.g. r, R2), adjusted correlation coefficients (e.g. CCC, dcorr), symmetric regression coefficients (intercept, slope), and mean absolute scaled error (MASE) for time series predictions. For classification (binomial and multinomial), it offers functions to generate and plot confusion matrices, and to estimate performance metrics such as accuracy, precision, recall, specificity, F-score, Cohen's Kappa, G-mean, and many more. For more details visit the vignettes <https://adriancorrendo.github.io/metrica/>.

Maintained by Adrian A. Correndo. Last updated 9 months ago.

7.4 match 77 stars 8.18 score 49 scripts

tidymodels

yardstick:Tidy Characterizations of Model Performance

Tidy tools for quantifying how well model fits to a data set such as confusion matrices, class probability curve summaries, and regression metrics (e.g., RMSE).

Maintained by Emil Hvitfeldt. Last updated 4 days ago.

3.8 match 387 stars 15.47 score 2.2k scripts 60 dependents

andrisignorell

DescTools:Tools for Descriptive Statistics

A collection of miscellaneous basic statistic functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox, which facilitates the (notoriously time consuming) first descriptive tasks in data analysis, consisting of calculating descriptive statistics, drawing graphical summaries and reporting the results. The package contains furthermore functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here, was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages which are not needed at all), and to provide a common and consistent interface as far as function and arguments naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in absence of convincing alternatives). The 'BigCamelCase' style was consequently applied to functions borrowed from contributed R packages as well.

Maintained by Andri Signorell. Last updated 9 days ago.

fortran cpp

3.3 match 87 stars 16.68 score 7.7k scripts 99 dependents

gacarrillor

vec2dtransf:2D Cartesian Coordinate Transformation

Applies affine and similarity transformations on vector spatial data (sp objects). Transformations can be defined from control points or directly from parameters. If redundant control points are provided Least Squares is applied allowing to obtain residuals and RMSE.

Maintained by German Carrillo. Last updated 3 months ago.

2d affine affine-transformation coordinates least-squares rmse similarity-transformations sp-objects transformations

13.8 match 5 stars 3.97 score 37 scripts

tidyverse

modelr:Modelling Functions that Work with the Pipe

Functions for modelling that help you seamlessly integrate modelling into a pipeline of data manipulation and visualisation.

Maintained by Hadley Wickham. Last updated 1 years ago.

modelling

3.3 match 401 stars 16.44 score 6.9k scripts 1.0k dependents

easystats

performance:Assessment of Regression Models Performance

Utilities for computing measures to assess model quality, which are not directly provided by R's 'base' or 'stats' packages. These include e.g. measures like r-squared, intraclass correlation coefficient (Nakagawa, Johnson & Schielzeth (2017) <doi:10.1098/rsif.2017.0213>), root mean squared error or functions to check models for overdispersion, singularity or zero-inflation and more. Functions apply to a large variety of regression models, including generalized linear models, mixed effects models and Bayesian models. References: Lüdecke et al. (2021) <doi:10.21105/joss.03139>.

Maintained by Daniel Lüdecke. Last updated 18 days ago.

aic easystats hacktoberfest loo machine-learning mixed-models models performance r2 statistics

3.3 match 1.1k stars 16.17 score 4.3k scripts 47 dependents

bioc

benchdamic:Benchmark of differential abundance methods on microbiome data

Starting from a microbiome dataset (16S or WMS with absolute count values) it is possible to perform several analysis to assess the performances of many differential abundance detection methods. A basic and standardized version of the main differential abundance analysis methods is supplied but the user can also add his method to the benchmark. The analyses focus on 4 main aspects: i) the goodness of fit of each method's distributional assumptions on the observed count data, ii) the ability to control the false discovery rate, iii) the within and between method concordances, iv) the truthfulness of the findings if any apriori knowledge is given. Several graphical functions are available for result visualization.

Maintained by Matteo Calgaro. Last updated 4 months ago.

metagenomics microbiome differentialexpression multiplecomparison normalization preprocessing software benchmark differential-abundance-methods

8.8 match 6 stars 5.73 score 8 scripts

petolau

TSrepr:Time Series Representations

Methods for representations (i.e. dimensionality reduction, preprocessing, feature extraction) of time series to help more accurate and effective time series data mining. Non-data adaptive, data adaptive, model-based and data dictated (clipped) representation methods are implemented. Also various normalisation methods (min-max, z-score, Box-Cox, Yeo-Johnson), and forecasting accuracy measures are implemented.

Maintained by Peter Laurinec. Last updated 5 years ago.

data-analysis data-mining data-mining-algorithms data-science representation time-series time-series-analysis time-series-classification time-series-clustering time-series-data-mining time-series-representations cpp

6.6 match 97 stars 7.23 score 117 scripts

nelson-n

lmForc:Linear Model Forecasting

Introduces in-sample, out-of-sample, pseudo out-of-sample, and benchmark model forecast tests and a new class for working with forecast data, Forecast.

Maintained by Nelson Rayl. Last updated 7 months ago.

forecasting linear-models

8.9 match 6 stars 5.26 score 20 scripts

philchalmers

SimDesign:Structure for Organizing Monte Carlo Simulation Designs

Provides tools to safely and efficiently organize and execute Monte Carlo simulation experiments in R. The package controls the structure and back-end of Monte Carlo simulation experiments by utilizing a generate-analyse-summarise workflow. The workflow safeguards against common simulation coding issues, such as automatically re-simulating non-convergent results, prevents inadvertently overwriting simulation files, catches error and warning messages during execution, implicitly supports parallel processing with high-quality random number generation, and provides tools for managing high-performance computing (HPC) array jobs submitted to schedulers such as SLURM. For a pedagogical introduction to the package see Sigal and Chalmers (2016) <doi:10.1080/10691898.2016.1246953>. For a more in-depth overview of the package and its design philosophy see Chalmers and Adkins (2020) <doi:10.20982/tqmp.16.4.p248>.

Maintained by Phil Chalmers. Last updated 4 hours ago.

monte-carlo-simulation simulation simulation-framework

3.3 match 62 stars 13.36 score 253 scripts 46 dependents

mfrasco

Metrics:Evaluation Metrics for Machine Learning

An implementation of evaluation metrics in R that are commonly used in supervised machine learning. It implements metrics for regression, time series, binary classification, classification, and information retrieval problems. It has zero dependencies and a consistent, simple interface for all functions.

Maintained by Michael Frasco. Last updated 6 years ago.

3.3 match 99 stars 13.02 score 6.1k scripts 51 dependents

jknowles

merTools:Tools for Analyzing Mixed Effect Regression Models

Provides methods for extracting results from mixed-effect model objects fit with the 'lme4' package. Allows construction of prediction intervals efficiently from large scale linear and generalized linear mixed-effects models. This method draws from the simulation framework used in the Gelman and Hill (2007) textbook: Data Analysis Using Regression and Multilevel/Hierarchical Models.

Maintained by Jared E. Knowles. Last updated 1 years ago.

4.0 match 105 stars 10.49 score 768 scripts

geco-bern

rsofun:The P-Model and BiomeE Modelling Framework

Implements the Simulating Optimal FUNctioning framework for site-scale simulations of ecosystem processes, including model calibration. It contains 'Fortran 90' modules for the P-model (Stocker et al. (2020) <doi:10.5194/gmd-13-1545-2020>), SPLASH (Davis et al. (2017) <doi:10.5194/gmd-10-689-2017>) and BiomeE (Weng et al. (2015) <doi:10.5194/bg-12-2655-2015>).

Maintained by Benjamin Stocker. Last updated 13 days ago.

dgvm growth modeling p-model simulation vegetation-dynamics fortran

4.7 match 26 stars 8.77 score 119 scripts

tidyverts

fabletools:Core Tools for Packages in the 'fable' Framework

Provides tools, helpers and data structures for developing models and time series functions for 'fable' and extension packages. These tools support a consistent and tidy interface for time series modelling and analysis.

Maintained by Mitchell OHara-Wild. Last updated 1 months ago.

3.3 match 91 stars 12.18 score 396 scripts 18 dependents

jackstat

ModelMetrics:Rapid Calculation of Model Metrics

Collection of metrics for evaluating models written in C++ using 'Rcpp'. Popular metrics include area under the curve, log loss, root mean square error, etc.

Maintained by Tyler Hunt. Last updated 4 years ago.

auc logloss machine-learning metrics model-evaluation model-metrics cpp

3.3 match 29 stars 11.83 score 1.3k scripts 306 dependents

yanyachen

MLmetrics:Machine Learning Evaluation Metrics

A collection of evaluation metrics, including loss, score and utility functions, that measure regression, classification and ranking performance.

Maintained by Yachen Yan. Last updated 11 months ago.

3.3 match 69 stars 11.09 score 2.2k scripts 20 dependents

cran

CSTools:Assessing Skill of Climate Forecasts on Seasonal-to-Decadal Timescales

Exploits dynamical seasonal forecasts in order to provide information relevant to stakeholders at the seasonal timescale. The package contains process-based methods for forecast calibration, bias correction, statistical and stochastic downscaling, optimal forecast combination and multivariate verification, as well as basic and advanced tools to obtain tailored products. This package was developed in the context of the 'ERA4CS' project 'MEDSCOPE' and the 'H2020 S2S4E' project and includes contributions from 'ArticXchange' project founded by 'EU-PolarNet 2'. 'Pérez-Zanón et al. (2022) <doi:10.5194/gmd-15-6115-2022>'. 'Doblas-Reyes et al. (2005) <doi:10.1111/j.1600-0870.2005.00104.x>'. 'Mishra et al. (2018) <doi:10.1007/s00382-018-4404-z>'. 'Sanchez-Garcia et al. (2019) <doi:10.5194/asr-16-165-2019>'. 'Straus et al. (2007) <doi:10.1175/JCLI4070.1>'. 'Terzago et al. (2018) <doi:10.5194/nhess-18-2825-2018>'. 'Torralba et al. (2017) <doi:10.1175/JAMC-D-16-0204.1>'. 'D'Onofrio et al. (2014) <doi:10.1175/JHM-D-13-096.1>'. 'Verfaillie et al. (2017) <doi:10.5194/gmd-10-4257-2017>'. 'Van Schaeybroeck et al. (2019) <doi:10.1016/B978-0-12-812372-0.00010-8>'. 'Yiou et al. (2013) <doi:10.1007/s00382-012-1626-3>'.

Maintained by Victoria Agudetse. Last updated 1 years ago.

fortran

6.7 match 2 stars 5.32 score 62 scripts 1 dependents

bioc

lute:Framework for cell size scale factor normalized bulk transcriptomics deconvolution experiments

Provides a framework for adjustment on cell type size when performing bulk transcripomics deconvolution. The main framework function provides a means of reference normalization using cell size scale factors. It allows for marker selection and deconvolution using non-negative least squares (NNLS) by default. The framework is extensible for other marker selection and deconvolution algorithms, and users may reuse the generics, methods, and classes for these when developing new algorithms.

Maintained by Sean K Maden. Last updated 5 months ago.

rnaseq sequencing singlecell coverage transcriptomics normalization

6.6 match 2 stars 5.26 score 3 scripts

mhahsler

recommenderlab:Lab for Developing and Testing Recommender Algorithms

Provides a research infrastructure to develop and evaluate collaborative filtering recommender algorithms. This includes a sparse representation for user-item matrices, many popular algorithms, top-N recommendations, and cross-validation. Hahsler (2022) <doi:10.48550/arXiv.2205.12371>.

Maintained by Michael Hahsler. Last updated 7 months ago.

collaborative-filtering recommender-system

3.3 match 214 stars 10.07 score 840 scripts 2 dependents

mharinga

insurancerating:Analytic Insurance Rating Techniques

Functions to build, evaluate, and visualize insurance rating models. It simplifies the process of modeling premiums, and allows to analyze insurance risk factors effectively. The package employs a data-driven strategy for constructing insurance tariff classes, drawing on the work of Antonio and Valdez (2012) <doi:10.1007/s10182-011-0152-7>.

Maintained by Martin Haringa. Last updated 5 months ago.

actuarial actuarial-science insurance pricing

5.6 match 70 stars 5.89 score 28 scripts

laresbernardo

lares:Analytics & Machine Learning Sidekick

Auxiliary package for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of family functions, like Machine Learning, Data Wrangling, Marketing Mix Modeling (Robyn), Exploratory, API, and Scrapper, it helps the analyst or data scientist to get quick and robust results, without the need of repetitive coding or advanced R programming skills.

Maintained by Bernardo Lares. Last updated 23 days ago.

analytics api automation automl data-science descriptive-statistics h2o machine-learning marketing mmm predictive-modeling puzzle rlanguage robyn visualization

3.3 match 233 stars 9.84 score 185 scripts 1 dependents

ben519

mltools:Machine Learning Tools

A collection of machine learning helper functions, particularly assisting in the Exploratory Data Analysis phase. Makes heavy use of the 'data.table' package for optimal speed and memory efficiency. Highlights include a versatile bin_data() function, sparsify() for converting a data.table to sparse matrix format with one-hot encoding, fast evaluation metrics, and empirical_cdf() for calculating empirical Multivariate Cumulative Distribution Functions.

Maintained by Ben Gorman. Last updated 3 years ago.

exploratory-data-analysis machine-learning

3.3 match 72 stars 9.58 score 1.2k scripts 13 dependents

inbo

dhcurve:Automated Modelling of Diameter Height Curves for Trees

Model diameter height curves for individual tree species and forests.

Maintained by Els Lommelen. Last updated 3 days ago.

analysis forestry

10.4 match 3.00 score 8 scripts

brry

berryFunctions:Function Collection Related to Plotting and Hydrology

Draw horizontal histograms, color scattered points by 3rd dimension, enhance date- and log-axis plots, zoom in X11 graphics, trace errors and warnings, use the unit hydrograph in a linear storage cascade, convert lists to data.frames and arrays, fit multiple functions.

Maintained by Berry Boessenkool. Last updated 1 months ago.

3.3 match 13 stars 9.43 score 350 scripts 16 dependents

msainsburydale

NeuralEstimators:Likelihood-Free Parameter Estimation using Neural Networks

An 'R' interface to the 'Julia' package 'NeuralEstimators.jl'. The package facilitates the user-friendly development of neural Bayes estimators, which are neural networks that map data to a point summary of the posterior distribution (Sainsbury-Dale et al., 2024, <doi:10.1080/00031305.2023.2249522>). These estimators are likelihood-free and amortised, in the sense that, once the neural networks are trained on simulated data, inference from observed data can be made in a fraction of the time required by conventional approaches. The package also supports amortised Bayesian or frequentist inference using neural networks that approximate the posterior or likelihood-to-evidence ratio (Zammit-Mangion et al., 2025, Sec. 3.2, 5.2, <doi:10.48550/arXiv.2404.12484>). The package accommodates any model for which simulation is feasible by allowing users to define models implicitly through simulated data.

Maintained by Matthew Sainsbury-Dale. Last updated 14 days ago.

5.0 match 9 stars 5.95 score 3 scripts

beerda

lfl:Linguistic Fuzzy Logic

Various algorithms related to linguistic fuzzy logic: mining for linguistic fuzzy association rules, composition of fuzzy relations, performing perception-based logical deduction (PbLD), and forecasting time-series using fuzzy rule-based ensemble (FRBE). The package also contains basic fuzzy-related algebraic functions capable of handling missing values in different styles (Bochvar, Sobocinski, Kleene etc.), computation of Sugeno integrals and fuzzy transform.

Maintained by Michal Burda. Last updated 4 months ago.

association-rules forecast-model fuzzy-logic inference-rules cpp openmp

5.1 match 8 stars 5.35 score 28 scripts

brian-j-smith

MachineShop:Machine Learning Models and Tools

Meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Approaches for model fitting and prediction of numerical, categorical, or censored time-to-event outcomes include traditional regression models, regularization methods, tree-based methods, support vector machines, neural networks, ensembles, data preprocessing, filtering, and model tuning and selection. Performance metrics are provided for model assessment and can be estimated with independent test sets, split sampling, cross-validation, or bootstrap resampling. Resample estimation can be executed in parallel for faster processing and nested in cases of model tuning and selection. Modeling results can be summarized with descriptive statistics; calibration curves; variable importance; partial dependence plots; confusion matrices; and ROC, lift, and other performance curves.

Maintained by Brian J Smith. Last updated 7 months ago.

classification-models machine-learning predictive-modeling regression-models survival-models

3.3 match 61 stars 7.95 score 121 scripts

jszlek

fscaret:Automated Feature Selection from 'caret'

Automated feature selection using variety of models provided by 'caret' package. This work was funded by Poland-Singapore bilateral cooperation project no 2/3/POL-SIN/2012.

Maintained by Jakub Szlek. Last updated 7 years ago.

6.6 match 3.97 score 31 scripts

steffenmoritz

imputeR:A General Multivariate Imputation Framework

Multivariate Expectation-Maximization (EM) based imputation framework that offers several different algorithms. These include regularisation methods like Lasso and Ridge regression, tree-based models and dimensionality reduction methods like PCA and PLS.

Maintained by Steffen Moritz. Last updated 4 years ago.

missing-data

5.3 match 16 stars 4.94 score 54 scripts

bioc

DeconRNASeq:Deconvolution of Heterogeneous Tissue Samples for mRNA-Seq data

DeconSeq is an R package for deconvolution of heterogeneous tissues based on mRNA-Seq data. It modeled expression levels from heterogeneous cell populations in mRNA-Seq as the weighted average of expression from different constituting cell types and predicted cell type proportions of single expression profiles.

Maintained by Ting Gong. Last updated 5 months ago.

differentialexpression

4.9 match 5.16 score 72 scripts

gmonette

cv:Cross-Validating Regression Models

Cross-validation methods of regression models that exploit features of various modeling functions to improve speed. Some of the methods implemented in the package are novel, as described in the package vignettes; for general introductions to cross-validation, see, for example, Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani (2021, ISBN 978-1-0716-1417-4, Secs. 5.1, 5.3), "An Introduction to Statistical Learning with Applications in R, Second Edition", and Trevor Hastie, Robert Tibshirani, and Jerome Friedman (2009, ISBN 978-0-387-84857-0, Sec. 7.10), "The Elements of Statistical Learning, Second Edition".

Maintained by Georges Monette. Last updated 4 days ago.

3.3 match 4 stars 7.67 score 86 scripts

mboeck11

BGVAR:Bayesian Global Vector Autoregressions

Estimation of Bayesian Global Vector Autoregressions (BGVAR) with different prior setups and the possibility to introduce stochastic volatility. Built-in priors include the Minnesota, the stochastic search variable selection and Normal-Gamma (NG) prior. For a reference see also Crespo Cuaresma, J., Feldkircher, M. and F. Huber (2016) "Forecasting with Global Vector Autoregressive Models: a Bayesian Approach", Journal of Applied Econometrics, Vol. 31(7), pp. 1371-1391 <doi:10.1002/jae.2504>. Post-processing functions allow for doing predictions, structurally identify the model with short-run or sign-restrictions and compute impulse response functions, historical decompositions and forecast error variance decompositions. Plotting functions are also available. The package has a companion paper: Boeck, M., Feldkircher, M. and F. Huber (2022) "BGVAR: Bayesian Global Vector Autoregressions with Shrinkage Priors in R", Journal of Statistical Software, Vol. 104(9), pp. 1-28 <doi:10.18637/jss.v104.i09>.

Maintained by Maximilian Boeck. Last updated 3 months ago.

openblas cpp

3.3 match 27 stars 7.58 score 156 scripts

rspatial

predicts:Spatial Prediction Tools

Methods for spatial predictive modeling, especially for spatial distribution models. This includes algorithms for model fitting and prediction, as well as methods for model evaluation.

Maintained by Robert J. Hijmans. Last updated 2 months ago.

3.3 match 10 stars 7.46 score 108 scripts 8 dependents

nanxstats

msaenet:Multi-Step Adaptive Estimation Methods for Sparse Regressions

Multi-step adaptive elastic-net (MSAENet) algorithm for feature selection in high-dimensional regressions proposed in Xiao and Xu (2015) <DOI:10.1080/00949655.2015.1016944>, with support for multi-step adaptive MCP-net (MSAMNet) and multi-step adaptive SCAD-net (MSASNet) methods.

Maintained by Nan Xiao. Last updated 8 months ago.

false-positive-control high-dimensional-data linear-regression machine-learning variable-selection

4.0 match 13 stars 6.01 score 52 scripts

nk027

BVAR:Hierarchical Bayesian Vector Autoregression

Estimation of hierarchical Bayesian vector autoregressive models following Kuschnig & Vashold (2021) <doi:10.18637/jss.v100.i14>. Implements hierarchical prior selection for conjugate priors in the fashion of Giannone, Lenza & Primiceri (2015) <doi:10.1162/REST_a_00483>. Functions to compute and identify impulse responses, calculate forecasts, forecast error variance decompositions and scenarios are available. Several methods to print, plot and summarise results facilitate analysis.

Maintained by Nikolas Kuschnig. Last updated 4 months ago.

bayesian bvar forecasts impulse-responses vector-autoregressions

3.3 match 51 stars 7.30 score 68 scripts 1 dependents

tychelab

CoSMoS:Complete Stochastic Modelling Solution

Makes univariate, multivariate, or random fields simulations precise and simple. Just select the desired time series or random fields’ properties and it will do the rest. CoSMoS is based on the framework described in Papalexiou (2018, <doi:10.1016/j.advwatres.2018.02.013>), extended for random fields in Papalexiou and Serinaldi (2020, <doi:10.1029/2019WR026331>), and further advanced in Papalexiou et al. (2021, <doi:10.1029/2020WR029466>) to allow fine-scale space-time simulation of storms (or even cyclone-mimicking fields).

Maintained by Kevin Shook. Last updated 4 years ago.

3.3 match 11 stars 7.10 score 77 scripts

egenn

rtemis:Machine Learning and Visualization

Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.

Maintained by E.D. Gennatas. Last updated 1 months ago.

data-science data-visualization machine-learning machine-learning-library visualization

3.3 match 145 stars 7.09 score 50 scripts 2 dependents

vivienroussez

autoTS:Automatic Model Selection and Prediction for Univariate Time Series

Offers a set of functions to easily make predictions for univariate time series. 'autoTS' is a wrapper of existing functions of the 'forecast' and 'prophet' packages, harmonising their outputs in tidy dataframes and using default values for each. The core function getBestModel() allows the user to effortlessly benchmark seven algorithms along with a bagged estimator to identify which one performs the best for a given time series.

Maintained by Vivien Roussez. Last updated 5 years ago.

4.9 match 10 stars 4.78 score 12 scripts

alexchristensen

NetworkToolbox:Methods and Measures for Brain, Cognitive, and Psychometric Network Analysis

Implements network analysis and graph theory measures used in neuroscience, cognitive science, and psychology. Methods include various filtering methods and approaches such as threshold, dependency (Kenett, Tumminello, Madi, Gur-Gershgoren, Mantegna, & Ben-Jacob, 2010 <doi:10.1371/journal.pone.0015032>), Information Filtering Networks (Barfuss, Massara, Di Matteo, & Aste, 2016 <doi:10.1103/PhysRevE.94.062306>), and Efficiency-Cost Optimization (Fallani, Latora, & Chavez, 2017 <doi:10.1371/journal.pcbi.1005305>). Brain methods include the recently developed Connectome Predictive Modeling (see references in package). Also implements several network measures including local network characteristics (e.g., centrality), community-level network characteristics (e.g., community centrality), global network characteristics (e.g., clustering coefficient), and various other measures associated with the reliability and reproducibility of network analysis.

Maintained by Alexander Christensen. Last updated 2 years ago.

network-analysis

3.3 match 23 stars 6.99 score 101 scripts 4 dependents

bioc

SingleMoleculeFootprinting:Analysis tools for Single Molecule Footprinting (SMF) data

SingleMoleculeFootprinting provides functions to analyze Single Molecule Footprinting (SMF) data. Following the workflow exemplified in its vignette, the user will be able to perform basic data analysis of SMF data with minimal coding effort. Starting from an aligned bam file, we show how to perform quality controls over sequencing libraries, extract methylation information at the single molecule level accounting for the two possible kind of SMF experiments (single enzyme or double enzyme), classify single molecules based on their patterns of molecular occupancy, plot SMF information at a given genomic location.

Maintained by Guido Barzaghi. Last updated 27 days ago.

dnamethylation coverage nucleosomepositioning datarepresentation epigenetics methylseq qualitycontrol sequencing

3.5 match 2 stars 6.43 score 27 scripts

r-forge

modEvA:Model Evaluation and Analysis

Analyses species distribution models and evaluates their performance. It includes functions for variation partitioning, extracting variable importance, computing several metrics of model discrimination and calibration performance, optimizing prediction thresholds based on a number of criteria, performing multivariate environmental similarity surface (MESS) analysis, and displaying various analytical plots. Initially described in Barbosa et al. (2013) <doi:10.1111/ddi.12100>.

Maintained by A. Marcia Barbosa. Last updated 10 days ago.

3.3 match 6.82 score 269 scripts 3 dependents

nanxstats

enpls:Ensemble Partial Least Squares Regression

An algorithmic framework for measuring feature importance, outlier detection, model applicability domain evaluation, and ensemble predictive modeling with (sparse) partial least squares regressions.

Maintained by Nan Xiao. Last updated 3 years ago.

chemometrics dimensionality-reduction ensemble-learning machine-learning outlier-detection partial-least-squares-regression

4.0 match 18 stars 5.56 score 40 scripts

mayer79

MetricsWeighted:Weighted Metrics and Performance Measures for Machine Learning

Provides weighted versions of several metrics and performance measures used in machine learning, including average unit deviances of the Bernoulli, Tweedie, Poisson, and Gamma distributions, see Jorgensen B. (1997, ISBN: 978-0412997112). The package also contains a weighted version of generalized R-squared, see e.g. Cohen, J. et al. (2002, ISBN: 978-0805822236). Furthermore, 'dplyr' chains are supported.

Maintained by Michael Mayer. Last updated 8 months ago.

machine-learning metrics performance statistics

3.3 match 11 stars 6.79 score 75 scripts 5 dependents

jfwambaugh

invivoPKfit:Fits Toxicokinetic Models to In Vivo PK Data Sets

Takes in vivo toxicokinetic concentration-time data and fits parameters of 1-compartment and 2-compartment models for each chemical. These methods are described in detail in "Informatics for Toxicokinetics" (submitted).

Maintained by John Wambaugh. Last updated 2 months ago.

8.6 match 2.60 score 4 scripts

moviedo5

fda.usc:Functional Data Analysis and Utilities for Statistical Computing

Routines for exploratory and descriptive analysis of functional data such as depth measurements, atypical curves detection, regression models, supervised classification, unsupervised classification and functional analysis of variance.

Maintained by Manuel Oviedo de la Fuente. Last updated 4 months ago.

functional-data-analysis fortran

2.3 match 12 stars 9.72 score 560 scripts 22 dependents

nsj3

rioja:Analysis of Quaternary Science Data

Constrained clustering, transfer functions, and other methods for analysing Quaternary science data.

Maintained by Steve Juggins. Last updated 6 months ago.

cpp

3.0 match 10 stars 7.21 score 191 scripts 3 dependents

tianxia-jia

mcgf:Markov Chain Gaussian Fields Simulation and Parameter Estimation

Simulating and estimating (regime-switching) Markov chain Gaussian fields with covariance functions of the Gneiting class (Gneiting 2002) <doi:10.1198/016214502760047113>. It supports parameter estimation by weighted least squares and maximum likelihood methods, and produces Kriging forecasts and intervals for existing and new locations.

Maintained by Tianxia Jia. Last updated 9 months ago.

4.4 match 1 stars 4.82 score 11 scripts

egonzato

windows.pls:Segmentation Approaches in Chemometrics

Evaluation of prediction performance of smaller regions of spectra for Chemometrics. Segmentation of spectra, evolving dimensions regions and sliding windows as selection methods. Election of the best model among those computed based on error metrics. Chen et al.(2017) <doi:10.1007/s00216-017-0218-9>.

Maintained by Elia Gonzato. Last updated 2 years ago.

7.8 match 2.70 score 4 scripts

zsteinmetz

envalysis:Miscellaneous Functions for Environmental Analyses

Small toolbox for data analyses in environmental chemistry and ecotoxicology. Provides, for example, calibration() to calculate calibration curves and corresponding limits of detection (LODs) and limits of quantification (LOQs) according to German DIN 32645 (2008). texture() makes it easy to estimate soil particle size distributions from hydrometer measurements (ASTM D422-63, 2007).

Maintained by Zacharias Steinmetz. Last updated 5 months ago.

analytics chemistry ecotoxicology environment soil

3.3 match 8 stars 6.30 score 83 scripts

blasbenito

spatialRF:Easy Spatial Modeling with Random Forest

Automatic generation and selection of spatial predictors for spatial regression with Random Forest. Spatial predictors are surrogates of variables driving the spatial structure of a response variable. The package offers two methods to generate spatial predictors from a distance matrix among training cases: 1) Moran's Eigenvector Maps (MEMs; Dray, Legendre, and Peres-Neto 2006 <DOI:10.1016/j.ecolmodel.2006.02.015>): computed as the eigenvectors of a weighted matrix of distances; 2) RFsp (Hengl et al. <DOI:10.7717/peerj.5518>): columns of the distance matrix used as spatial predictors. Spatial predictors help minimize the spatial autocorrelation of the model residuals and facilitate an honest assessment of the importance scores of the non-spatial predictors. Additionally, functions to reduce multicollinearity, identify relevant variable interactions, tune random forest hyperparameters, assess model transferability via spatial cross-validation, and explore model results via partial dependence curves and interaction surfaces are included in the package. The modelling functions are built around the highly efficient 'ranger' package (Wright and Ziegler 2017 <DOI:10.18637/jss.v077.i01>).

Maintained by Blas M. Benito. Last updated 3 years ago.

random-forest spatial-analysis spatial-regression

3.7 match 114 stars 5.45 score 49 scripts

radiant-rstats

radiant.model:Model Menu for Radiant: Business Analytics using R and Shiny

The Radiant Model menu includes interfaces for linear and logistic regression, naive Bayes, neural networks, classification and regression trees, model evaluation, collaborative filtering, decision analysis, and simulation. The application extends the functionality in 'radiant.data'.

Maintained by Vincent Nijs. Last updated 5 months ago.

3.3 match 19 stars 6.18 score 80 scripts 2 dependents

coatless-rpkg

jjb:Balamuta Miscellaneous

Set of common functions used for manipulating colors, detecting and interacting with 'RStudio', modeling, formatting, determining users' operating system, feature scaling, and more!

Maintained by James Balamuta. Last updated 1 years ago.

5.1 match 2 stars 3.97 score 31 scripts 1 dependents

ymutua

mapsRinteractive:Local Adaptation and Evaluation of Raster Maps

Local adaptation and evaluation of maps of continuous attributes in raster format by use of point location data.

Maintained by Kristin Persson. Last updated 2 years ago.

6.6 match 2 stars 3.00 score 7 scripts

bsnatr

tswge:Time Series for Data Science

Accompanies the texts Time Series for Data Science with R by Woodward, Sadler and Robertson & Applied Time Series Analysis with R, 2nd edition by Woodward, Gray, and Elliott. It is helpful for data analysis and for time series instruction.

Maintained by Bivin Sadler. Last updated 2 years ago.

7.3 match 2.70 score 496 scripts

gdkrmr

DRR:Dimensionality Reduction via Regression

An Implementation of Dimensionality Reduction via Regression using Kernel Ridge Regression.

Maintained by Guido Kraemer. Last updated 2 years ago.

dimensionality-reduction kernel-methods non-linear regression-models

3.7 match 9 stars 5.24 score 8 scripts 1 dependents

cran

nsRFA:Non-Supervised Regional Frequency Analysis

A collection of statistical tools for objective (non-supervised) applications of the Regional Frequency Analysis methods in hydrology. The package refers to the index-value method and, more precisely, helps the hydrologist to: (1) regionalize the index-value; (2) form homogeneous regions with similar growth curves; (3) fit distribution functions to the empirical regional growth curves. Most of the methods are those described in the Flood Estimation Handbook (Centre for Ecology & Hydrology, 1999, ISBN:9781906698003). Homogeneity tests from Hosking and Wallis (1993) <doi:10.1029/92WR01980> and Viglione et al. (2007) <doi:10.1029/2006WR005095> are available.

Maintained by Alberto Viglione. Last updated 10 months ago.

5.6 match 2 stars 3.49 score

anniesbooth

deepgp:Bayesian Deep Gaussian Processes using MCMC

Performs Bayesian posterior inference for deep Gaussian processes following Sauer, Gramacy, and Higdon (2023, <doi:10.48550/arXiv.2012.08015>). See Sauer (2023, <http://hdl.handle.net/10919/114845>) for comprehensive methodological details and <https://bitbucket.org/gramacylab/deepgp-ex/> for a variety of coding examples. Models are trained through MCMC including elliptical slice sampling of latent Gaussian layers and Metropolis-Hastings sampling of kernel hyperparameters. Vecchia-approximation for faster computation is implemented following Sauer, Cooper, and Gramacy (2023, <doi:10.48550/arXiv.2204.02904>). Optional monotonic warpings are implemented following Barnett et al. (2024, <doi:10.48550/arXiv.2408.01540>). Downstream tasks include sequential design through active learning Cohn/integrated mean squared error (ALC/IMSE; Sauer, Gramacy, and Higdon, 2023), optimization through expected improvement (EI; Gramacy, Sauer, and Wycoff, 2022 <doi:10.48550/arXiv.2112.07457>), and contour location through entropy (Booth, Renganathan, and Gramacy, 2024 <doi:10.48550/arXiv.2308.04420>). Models extend up to three layers deep; a one layer model is equivalent to typical Gaussian process regression. Incorporates OpenMP and SNOW parallelization and utilizes C/C++ under the hood.

Maintained by Annie S. Booth. Last updated 7 months ago.

openblas cpp openmp

5.6 match 2 stars 3.45 score 14 scripts

gi0na

ghypernet:Fit and Simulate Generalised Hypergeometric Ensembles of Graphs

Provides functions for model fitting and selection of generalised hypergeometric ensembles of random graphs (gHypEG). To learn how to use it, check the vignettes for a quick tutorial. Please reference its use as Casiraghi, G., Nanumyan, V. (2019) <doi:10.5281/zenodo.2555300> together with those relevant references from the one listed below. The package is based on the research developed at the Chair of Systems Design, ETH Zurich. Casiraghi, G., Nanumyan, V., Scholtes, I., Schweitzer, F. (2016) <arXiv:1607.02441>. Casiraghi, G., Nanumyan, V., Scholtes, I., Schweitzer, F. (2017) <doi:10.1007/978-3-319-67256-4_11>. Casiraghi, G., (2017) <arXiv:1702.02048> Brandenberger, L., Casiraghi, G., Nanumyan, V., Schweitzer, F. (2019) <doi:10.1145/3341161.3342926> Casiraghi, G. (2019) <doi:10.1007/s41109-019-0241-1>. Casiraghi, G., Nanumyan, V. (2021) <doi:10.1038/s41598-021-92519-y>. Casiraghi, G. (2021) <doi:10.1088/2632-072X/ac0493>.

Maintained by Giona Casiraghi. Last updated 11 months ago.

data-mining data-science graphs network network-analysis random-graph-generation random-graphs

3.3 match 8 stars 5.68 score 20 scripts

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 years ago.

2.3 match 3 stars 8.20 score 7.8k scripts 11 dependents

pkr-pkr

NNbenchmark:Datasets and Functions to Benchmark Neural Network Packages

Datasets and functions to benchmark (convergence, speed, ease of use) R packages dedicated to regression with neural networks (no classification in this version). The templates for the tested packages are available in the R, R Markdown and HTML formats at <https://github.com/pkR-pkR/NNbenchmarkTemplates> and <https://theairbend3r.github.io/NNbenchmarkWeb/index.html>. The submitted article to the R-Journal can be read at <https://www.inmodelia.com/gsoc2020.html>.

Maintained by Patrice Kiener. Last updated 1 years ago.

3.4 match 5 stars 5.33 score 283 scripts

iembry

ie2misc:Irucka Embry's Miscellaneous USGS Functions

A collection of Irucka Embry's miscellaneous USGS functions (processing .exp and .psf files, statistical error functions, "+" dyadic operator for use with NA, creating ADAPS and QW spreadsheet files, calculating saturated enthalpy). Irucka created these functions while a Cherokee Nation Technology Solutions (CNTS) United States Geological Survey (USGS) Contractor and/or USGS employee.

Maintained by Irucka Embry. Last updated 1 years ago.

5.1 match 3.43 score 54 scripts

smart-cities-accelerator

onlineforecast:Forecast Modelling for Online Applications

A framework for fitting adaptive forecasting models. Provides a way to use forecasts as input to models, e.g. weather forecasts for energy related forecasting. The models can be fitted recursively and can easily be setup for updating parameters when new data arrives. See the included vignettes, the website <https://onlineforecasting.org> and the paper "onlineforecast: An R package for adaptive and recursive forecasting" <https://journal.r-project.org/articles/RJ-2023-031/>.

Maintained by Peder Bacher. Last updated 1 years ago.

openblas cpp

5.3 match 3 stars 3.28 score 16 scripts

ullid

SoilHyP:Soil Hydraulic Properties

Provides functions for (1) soil water retention (SWC) and unsaturated hydraulic conductivity (Ku) (van Genuchten-Mualem (vGM or vG) [1, 2], Peters-Durner-Iden (PDI) [3, 4, 5], Brooks and Corey (bc) [8]), (2) fitting of parameter for SWC and/or Ku using Shuffled Complex Evolution (SCE) optimisation and (3) calculation of soil hydraulic properties (Ku and soil water contents) based on the simplified evaporation method (SEM) [6, 7]. Main references: [1] van Genuchten (1980) <doi:10.2136/sssaj1980.03615995004400050002x>, [2] Mualem (1976) <doi:10.1029/WR012i003p00513>, [3] Peters (2013) <doi:10.1002/wrcr.20548>, [4] Iden and Durner (2013) <doi:10.1002/2014WR015937>, [5] Peters (2014) <doi:10.1002/2014WR015937>, [6] Wind G. P. (1966), [7] Peters and Durner (2008) <doi:10.1016/j.jhydrol.2008.04.016> and [8] Brooks and Corey (1964).

Maintained by Ullrich Dettmann. Last updated 2 years ago.

5.1 match 3.32 score 35 scripts 2 dependents

rudeboybert

forestecology:Fitting and Assessing Neighborhood Models of the Effect of Interspecific Competition on the Growth of Trees

Code for fitting and assessing models for the growth of trees. In particular for the Bayesian neighborhood competition linear regression model of Allen (2020): methods for model fitting and generating fitted/predicted values, evaluating the effect of competitor species identity using permutation tests, and evaluating model performance using spatial cross-validation.

Maintained by Albert Y. Kim. Last updated 3 years ago.

forestecology tidyverse

3.3 match 12 stars 5.12 score 11 scripts

wsqlab

GaSP:Train and Apply a Gaussian Stochastic Process Model

Train a Gaussian stochastic process model of an unknown function, possibly observed with error, via maximum likelihood or maximum a posteriori (MAP) estimation, run model diagnostics, and make predictions, following Sacks, J., Welch, W.J., Mitchell, T.J., and Wynn, H.P. (1989) "Design and Analysis of Computer Experiments", Statistical Science, <doi:10.1214/ss/1177012413>. Perform sensitivity analysis and visualize low-order effects, following Schonlau, M. and Welch, W.J. (2006), "Screening the Input Variables to a Computer Model Via Analysis of Variance and Visualization", <doi:10.1007/0-387-28014-6_14>.

Maintained by William J. Welch. Last updated 9 months ago.

6.2 match 2.70 score 8 scripts

r-forge

qualV:Qualitative Validation Methods

Qualitative methods for the validation of dynamic models. It contains (i) an orthogonal set of deviance measures for absolute, relative and ordinal scale and (ii) approaches accounting for time shifts. The first approach transforms time to take time delays and speed differences into account. The second divides the time series into interval units according to their main features and finds the longest common subsequence (LCS) using a dynamic programming algorithm.

Maintained by Thomas Petzoldt. Last updated 2 years ago.

3.3 match 1 stars 4.99 score 49 scripts 26 dependents

neerajdhanraj

imputeTestbench:Test Bench for the Comparison of Imputation Methods

Provides a test bench for the comparison of missing data imputation methods in uni-variate time series. Imputation methods are compared using different error metrics. Proposed imputation methods and alternative error metrics can be used.

Maintained by Marcus W. Beck. Last updated 8 years ago.

3.3 match 5 stars 4.94 score 20 scripts 2 dependents

cmsaf

cmsafops:Tools for CM SAF NetCDF Data

The Satellite Application Facility on Climate Monitoring (CM SAF) is a ground segment of the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) and one of EUMETSATs Satellite Application Facilities. The CM SAF contributes to the sustainable monitoring of the climate system by providing essential climate variables related to the energy and water cycle of the atmosphere (<https://www.cmsaf.eu>). It is a joint cooperation of eight National Meteorological and Hydrological Services. The 'cmsafops' R-package provides a collection of R-operators for the analysis and manipulation of CM SAF NetCDF formatted data. Other CF conform NetCDF data with time, longitude and latitude dimension should be applicable, but there is no guarantee for an error-free application. CM SAF climate data records are provided for free via (<https://wui.cmsaf.eu/safira>). Detailed information and test data are provided on the CM SAF webpage (<http://www.cmsaf.eu/R_toolbox>).

Maintained by Steffen Kothe. Last updated 6 months ago.

3.2 match 2 stars 5.03 score 4 scripts 2 dependents

sollano

forestmangr:Forest Mensuration and Management

Processing forest inventory data with methods such as simple random sampling, stratified random sampling and systematic sampling. There are also functions for yield and growth predictions and model fitting, linear and nonlinear grouped data fitting, and statistical tests. References: Kershaw Jr., Ducey, Beers and Husch (2016). <doi:10.1002/9781118902028>.

Maintained by Sollano Rabelo Braga. Last updated 3 months ago.

2.0 match 17 stars 7.97 score 378 scripts

tripartio

staccuracy:Standardized Accuracy and Other Model Performance Metrics

Standardized accuracy (staccuracy) is a framework for expressing accuracy scores such that 50% represents a reference level of performance and 100% is a perfect prediction. The 'staccuracy' package provides tools for creating staccuracy functions as well as some recommended staccuracy measures. It also provides functions for some classic performance metrics such as mean absolute error (MAE), root mean squared error (RMSE), and area under the receiver operating characteristic curve (AUCROC), as well as their winsorized versions when applicable.

Maintained by Chitu Okoli. Last updated 21 days ago.

3.8 match 1 stars 4.18 score 4 scripts 2 dependents

philipppro

measures:Performance Measures for Statistical Learning

Provides the biggest amount of statistical measures in the whole R world. Includes measures of regression, (multiclass) classification and multilabel classification. The measures come mainly from the 'mlr' package and were programed by several 'mlr' developers.

Maintained by Philipp Probst. Last updated 4 years ago.

3.3 match 1 stars 4.47 score 88 scripts 2 dependents

cran

RSDA:R to Symbolic Data Analysis

Symbolic Data Analysis (SDA) was proposed by professor Edwin Diday in 1987, the main purpose of SDA is to substitute the set of rows (cases) in the data table for a concept (second order statistical unit). This package implements, to the symbolic case, certain techniques of automatic classification, as well as some linear models.

Maintained by Oldemar Rodriguez. Last updated 1 years ago.

4.5 match 1 stars 3.26 score 3 dependents

mateusmaiads

randomMachines:An Ensemble Modeling using Random Machines

A novel ensemble method employing Support Vector Machines (SVMs) as base learners. This powerful ensemble model is designed for both classification (Ara A., et. al, 2021) <doi:10.6339/21-JDS1014>, and regression (Ara A., et. al, 2021) <doi:10.1016/j.eswa.2022.117107> problems, offering versatility and robust performance across different datasets and compared with other consolidated methods as Random Forests (Maia M, et. al, 2021) <doi:10.6339/21-JDS1025>.

Maintained by Mateus Maia. Last updated 12 months ago.

5.1 match 1 stars 2.74 score 11 scripts

joshuawlambert

rFSA:Feasible Solution Algorithm for Finding Best Subsets and Interactions

Assists in statistical model building to find optimal and semi-optimal higher order interactions and best subsets. Uses the lm(), glm(), and other R functions to fit models generated from a feasible solution algorithm. Discussed in Subset Selection in Regression, A Miller (2002). Applied and explained for least median of squares in Hawkins (1993) <doi:10.1016/0167-9473(93)90246-P>. The feasible solution algorithm comes up with model forms of a specific type that can have fixed variables, higher order interactions and their lower order terms.

Maintained by Joshua Lambert. Last updated 4 years ago.

algorithm fsa interaction models parallel statistical statistics subset

3.3 match 7 stars 4.15 score 20 scripts

usepa

httk:High-Throughput Toxicokinetics

Pre-made models that can be rapidly tailored to various chemicals and species using chemical-specific in vitro data and physiological information. These tools allow incorporation of chemical toxicokinetics ("TK") and in vitro-in vivo extrapolation ("IVIVE") into bioinformatics, as described by Pearce et al. (2017) (<doi:10.18637/jss.v079.i04>). Chemical-specific in vitro data characterizing toxicokinetics have been obtained from relatively high-throughput experiments. The chemical-independent ("generic") physiologically-based ("PBTK") and empirical (for example, one compartment) "TK" models included here can be parameterized with in vitro data or in silico predictions which are provided for thousands of chemicals, multiple exposure routes, and various species. High throughput toxicokinetics ("HTTK") is the combination of in vitro data and generic models. We establish the expected accuracy of HTTK for chemicals without in vivo data through statistical evaluation of HTTK predictions for chemicals where in vivo data do exist. The models are systems of ordinary differential equations that are developed in MCSim and solved using compiled (C-based) code for speed. A Monte Carlo sampler is included for simulating human biological variability (Ring et al., 2017 <doi:10.1016/j.envint.2017.06.004>) and propagating parameter uncertainty (Wambaugh et al., 2019 <doi:10.1093/toxsci/kfz205>). Empirically calibrated methods are included for predicting tissue:plasma partition coefficients and volume of distribution (Pearce et al., 2017 <doi:10.1007/s10928-017-9548-7>). These functions and data provide a set of tools for using IVIVE to convert concentrations from high-throughput screening experiments (for example, Tox21, ToxCast) to real-world exposures via reverse dosimetry (also known as "RTK") (Wetmore et al., 2015 <doi:10.1093/toxsci/kfv171>).

Maintained by John Wambaugh. Last updated 1 months ago.

comptox ord

1.3 match 27 stars 10.22 score 307 scripts 1 dependents

xluo11

xxIRT:Item Response Theory and Computer-Based Testing

A suite of psychometric analysis tools for research and operation, including: (1) computation of probability, information, and likelihood for the 3PL, GPCM, and GRM; (2) parameter estimation using joint or marginal likelihood estimation method; (3) simulation of computerized adaptive testing using built-in or customized algorithms; (4) assembly and simulation of multistage testing. The full documentation and tutorials are at <https://github.com/xluo11/xxIRT>.

Maintained by Xiao Luo. Last updated 6 years ago.

3.3 match 25 stars 4.10 score 10 scripts

tspsyched

autoFC:Automatic Construction of Forced-Choice Tests

Forced-choice (FC) response has gained increasing popularity and interest for its resistance to faking when well-designed (Cao & Drasgow, 2019 <doi:10.1037/apl0000414>). To established well-designed FC scales, typically each item within a block should measure different trait and have similar level of social desirability (Zhang et al., 2020 <doi:10.1177/1094428119836486>). Recent study also suggests the importance of high inter-item agreement of social desirability between items within a block (Pavlov et al., 2021 <doi:10.31234/osf.io/hmnrc>). In addition to this, FC developers may also need to maximize factor loading differences (Brown & Maydeu-Olivares, 2011 <doi:10.1177/0013164410375112>) or minimize item location differences (Cao & Drasgow, 2019 <doi:10.1037/apl0000414>) depending on scoring models. Decision of which items should be assigned to the same block, termed item pairing, is thus critical to the quality of an FC test. This pairing process is essentially an optimization process which is currently carried out manually. However, given that we often need to simultaneously meet multiple objectives, manual pairing becomes impractical or even not feasible once the number of latent traits and/or number of items per trait are relatively large. To address these problems, autoFC is developed as a practical tool for facilitating the automatic construction of FC tests (Li et al., 2022 <doi:10.1177/01466216211051726>), essentially exempting users from the burden of manual item pairing and reducing the computational costs and biases induced by simple ranking methods. Given characteristics of each item (and item responses), FC tests can be automatically constructed based on user-defined pairing criteria and weights as well as customized optimization behavior. Users can also construct parallel forms of the same test following the same pairing rules.

Maintained by Mengtong Li. Last updated 4 days ago.

2.7 match 4 stars 4.90 score 3 scripts

kriper0217

valmetrics:Metrics and Plots for Model Evaluation

Functions for metrics and plots for model evaluation. Based on vectors of observed and predicted values. Method: Kristin Piikki, Johanna Wetterlind, Mats Soderstrom and Bo Stenberg (2021). <doi:10.1111/SUM.12694>.

Maintained by Kristin Piikki. Last updated 4 years ago.

6.6 match 2.00 score 2 scripts

xluo11

Rirt:Data Analysis and Parameter Estimation Using Item Response Theory

Parameter estimation, computation of probability, information, and (log-)likelihood, and visualization of item/test characteristic curves and item/test information functions for three uni-dimensional item response theory models: the 3-parameter-logistic model, generalized partial credit model, and graded response model. The full documentation and tutorials are at <https://github.com/xluo11/Rirt>.

Maintained by Xiao Luo. Last updated 5 years ago.

cpp

3.3 match 3 stars 3.95 score 6 scripts 2 dependents

jangraffelman

Correlplot:A Collection of Functions for Graphing Correlation Matrices

Routines for the graphical representation of correlation matrices by means of correlograms, MDS maps and biplots obtained by PCA, PFA or WALS (weighted alternating least squares); See Graffelman & De Leeuw (2023) <doi: 10.1080/00031305.2023.2186952>.

Maintained by Jan Graffelman. Last updated 1 years ago.

4.9 match 2.59 score 13 scripts 1 dependents

cran

airGR:Suite of GR Hydrological Models for Precipitation-Runoff Modelling

Hydrological modelling tools developed at INRAE-Antony (HYCAR Research Unit, France). The package includes several conceptual rainfall-runoff models (GR4H, GR5H, GR4J, GR5J, GR6J, GR2M, GR1A) that can be applied either on a lumped or semi-distributed way. A snow accumulation and melt model (CemaNeige) and the associated functions for the calibration and evaluation of models are also included. Use help(airGR) for package description and references.

Maintained by Olivier Delaigue. Last updated 1 years ago.

fortran

1.9 match 4 stars 6.60 score 164 scripts 4 dependents

rbgramacy

monomvn:Estimation for MVN and Student-t Data with Monotone Missingness

Estimation of multivariate normal (MVN) and student-t data of arbitrary dimension where the pattern of missing data is monotone. See Pantaleo and Gramacy (2010) <doi:10.48550/arXiv.0907.2135>. Through the use of parsimonious/shrinkage regressions (plsr, pcr, lasso, ridge, etc.), where standard regressions fail, the package can handle a nearly arbitrary amount of missing data. The current version supports maximum likelihood inference and a full Bayesian approach employing scale-mixtures for Gibbs sampling. Monotone data augmentation extends this Bayesian approach to arbitrary missingness patterns. A fully functional standalone interface to the Bayesian lasso (from Park & Casella), Normal-Gamma (from Griffin & Brown), Horseshoe (from Carvalho, Polson, & Scott), and ridge regression with model selection via Reversible Jump, and student-t errors (from Geweke) is also provided.

Maintained by Robert B. Gramacy. Last updated 6 months ago.

fortran openblas cpp

3.9 match 4 stars 3.14 score 127 scripts

cran

Fgmutils:Forest Growth Model Utilities

Growth models and forest production require existing data manipulation and the creation of new data, structured from basic forest inventory data. The purpose of this package is provide functions to support these activities.

Maintained by Clayton Vieira Fraga Filho. Last updated 6 years ago.

7.6 match 1.48 score 1 dependents

sciviews

modelit:Statistical Models for 'SciViews::R'

Create and use statistical models (linear, general, nonlinear...) with extensions to support rich-formatted tables, equations and plots for the 'SciViews::R' dialect.

Maintained by Philippe Grosjean. Last updated 4 months ago.

sciviews statsmodels

3.3 match 1 stars 3.30 score 8 scripts

kapelner

bartMachine:Bayesian Additive Regression Trees

An advanced implementation of Bayesian Additive Regression Trees with expanded features for data analysis and visualization.

Maintained by Adam Kapelner. Last updated 2 years ago.

openjdk

1.8 match 6.01 score 309 scripts 6 dependents

cran

s2dv:A Set of Common Tools for Seasonal to Decadal Verification

The advanced version of package 's2dverification'. It is intended for 'seasonal to decadal' (s2d) climate forecast verification, but it can also be used in other kinds of forecasts or general climate analysis. This package is specially designed for the comparison between the experimental and observational datasets. The functionality of the included functions covers from data retrieval, data post-processing, skill scores against observation, to visualization. Compared to 's2dverification', 's2dv' is more compatible with the package 'startR', able to use multiple cores for computation and handle multi-dimensional arrays with a higher flexibility. The CDO version used in development is 1.9.8.

Maintained by Ariadna Batalla. Last updated 5 months ago.

5.4 match 1.95 score 3 dependents

cran

qpcR:Modelling and Analysis of Real-Time PCR Data

Model fitting, optimal model selection and calculation of various features that are essential in the analysis of quantitative real-time polymerase chain reaction (qPCR).

Maintained by Andrej-Nikolai Spiess. Last updated 7 years ago.

3.3 match 2 stars 3.06 score 1 dependents

certe-medical-epidemiology

certestats:A Certe R Package for Statistical Modelling

A Certe R Package for early-warning, applying statistical modelling (such as creating machine learning models), QC rules and distribution analysis. This package is part of the 'certedata' universe.

Maintained by Matthijs S. Berends. Last updated 4 months ago.

statistics

3.3 match 3.02 score 1 scripts 1 dependents

bioc

methyLImp2:Missing value estimation of DNA methylation data

This package allows to estimate missing values in DNA methylation data. methyLImp method is based on linear regression since methylation levels show a high degree of inter-sample correlation. Implementation is parallelised over chromosomes since probes on different chromosomes are usually independent. Mini-batch approach to reduce the runtime in case of large number of samples is available.

Maintained by Anna Plaksienko. Last updated 1 months ago.

dnamethylation microarray software methylationarray regression imputation methylation missing-value-imputation

1.6 match 6 stars 5.62 score 3 scripts

wlenhard

cNORM:Continuous Norming

A comprehensive toolkit for generating continuous test norms in psychometrics and biometrics, and analyzing model fit. The package offers both distribution-free modeling using Taylor polynomials and parametric modeling using the beta-binomial distribution. Originally developed for achievement tests, it is applicable to a wide range of mental, physical, or other test scores dependent on continuous or discrete explanatory variables. The package provides several advantages: It minimizes deviations from representativeness in subsamples, interpolates between discrete levels of explanatory variables, and significantly reduces the required sample size compared to conventional norming per age group. cNORM enables graphical and analytical evaluation of model fit, accommodates a wide range of scales including those with negative and descending values, and even supports conventional norming. It generates norm tables including confidence intervals. It also includes methods for addressing representativeness issues through Iterative Proportional Fitting.

Maintained by Wolfgang Lenhard. Last updated 4 months ago.

beta-binomial biometrics continuous-norming growth-curve norm-scores norm-tables normalization-techniques percentile psychometrics regression-based-norming taylor-series

1.6 match 2 stars 5.49 score 75 scripts

mingsnu

stfit:Spatio-Temporal Functional Imputation Tool

A general spatiotemporal satellite image imputation method based on sparse functional data analytic techniques. The imputation method applies and extends the Functional Principal Analysis by Conditional Estimation (PACE). The underlying idea for the proposed procedure is to impute a missing pixel by borrowing information from temporally and spatially contiguous pixels based on the best linear unbiased prediction.

Maintained by Weicheng Zhu. Last updated 2 years ago.

cpp

3.3 match 2.61 score 41 scripts

david-hervas

repmod:Create Report Table from Different Objects

Tools for generating descriptives and report tables for different models, data.frames and tables and exporting them to different formats.

Maintained by David Hervas Marin. Last updated 2 months ago.

3.3 match 2.60 score 6 scripts

hjboonstra

hbsae:Hierarchical Bayesian Small Area Estimation

Functions to compute small area estimates based on a basic area or unit-level model. The model is fit using restricted maximum likelihood, or in a hierarchical Bayesian way. In the latter case numerical integration is used to average over the posterior density for the between-area variance. The output includes the model fit, small area estimates and corresponding mean squared errors, as well as some model selection measures. Additional functions provide means to compute aggregate estimates and mean squared errors, to minimally adjust the small area estimates to benchmarks at a higher aggregation level, and to graphically compare different sets of small area estimates.

Maintained by Harm Jan Boonstra. Last updated 3 years ago.

3.3 match 2 stars 2.53 score 28 scripts 2 dependents

american-institutes-for-research

wCorr:Weighted Correlations

Calculates Pearson, Spearman, polychoric, and polyserial correlation coefficients, in weighted or unweighted form. The package implements tetrachoric correlation as a special case of the polychoric and biserial correlation as a specific case of the polyserial.

Maintained by Paul Bailey. Last updated 2 years ago.

openblas cpp openmp

1.3 match 6.54 score 118 scripts 8 dependents

zejiang-unsw

WASP:Wavelet System Prediction

The wavelet-based variance transformation method is used for system modelling and prediction. It refines predictor spectral representation using Wavelet Theory, which leads to improved model specifications and prediction accuracy. Details of methodologies used in the package can be found in Jiang, Z., Sharma, A., & Johnson, F. (2020) <doi:10.1029/2019WR026962>, Jiang, Z., Rashid, M. M., Johnson, F., & Sharma, A. (2020) <doi:10.1016/j.envsoft.2020.104907>, and Jiang, Z., Sharma, A., & Johnson, F. (2021) <doi:10.1016/J.JHYDROL.2021.126816>.

Maintained by Ze Jiang. Last updated 7 months ago.

prediction transformation wavelet

1.3 match 9 stars 6.41 score 19 scripts

sujit-sahu

bmstdr:Bayesian Modeling of Spatio-Temporal Data with R

Fits, validates and compares a number of Bayesian models for spatial and space time point referenced and areal unit data. Model fitting is done using several packages: 'rstan', 'INLA', 'spBayes', 'spTimer', 'spTDyn', 'CARBayes' and 'CARBayesST'. Model comparison is performed using the DIC and WAIC, and K-fold cross-validation where the user is free to select their own subset of data rows for validation. Sahu (2022) <doi:10.1201/9780429318443> describes the methods in detail.

Maintained by Sujit K. Sahu. Last updated 1 years ago.

bayesian modelling spatio-temporal-data cpp

1.6 match 15 stars 4.95 score 12 scripts

frareb

devRate:Quantify the Relationship Between Development Rate and Temperature in Ectotherms

A set of functions to quantify the relationship between development rate and temperature and to build phenological models. The package comprises a set of models and estimated parameters borrowed from a literature review in ectotherms. The methods and literature review are described in Rebaudo et al. (2018) <doi:10.1111/2041-210X.12935>, Rebaudo and Rabhi (2018) <doi:10.1111/eea.12693>, and Regnier et al. (2021) <doi:10.1093/ee/nvab115>. An example can be found in Rebaudo et al. (2017) <doi:10.1007/s13355-017-0480-5>.

Maintained by Francois Rebaudo. Last updated 2 years ago.

1.5 match 3 stars 5.31 score 15 scripts

mbinois

hetGP:Heteroskedastic Gaussian Process Modeling and Design under Replication

Performs Gaussian process regression with heteroskedastic noise following the model by Binois, M., Gramacy, R., Ludkovski, M. (2016) <doi:10.48550/arXiv.1611.05902>, with implementation details in Binois, M. & Gramacy, R. B. (2021) <doi:10.18637/jss.v098.i13>. The input dependent noise is modeled as another Gaussian process. Replicated observations are encouraged as they yield computational savings. Sequential design procedures based on the integrated mean square prediction error and lookahead heuristics are provided, and notably fast update functions when adding new observations.

Maintained by Mickael Binois. Last updated 6 months ago.

cpp

1.6 match 5 stars 4.89 score 260 scripts 2 dependents

laurabruckman

netSEM:Network Structural Equation Modeling

The network structural equation modeling conducts a network statistical analysis on a data frame of coincident observations of multiple continuous variables [1]. It builds a pathway model by exploring a pool of domain knowledge guided candidate statistical relationships between each of the variable pairs, selecting the 'best fit' on the basis of a specific criteria such as adjusted r-squared value. This material is based upon work supported by the U.S. National Science Foundation Award EEC-2052776 and EEC-2052662 for the MDS-Rely IUCRC Center, under the NSF Solicitation: NSF 20-570 Industry-University Cooperative Research Centers Program [1] Bruckman, Laura S., Nicholas R. Wheeler, Junheng Ma, Ethan Wang, Carl K. Wang, Ivan Chou, Jiayang Sun, and Roger H. French. (2013) <doi:10.1109/ACCESS.2013.2267611>.

Maintained by Laura S. Bruckman. Last updated 2 years ago.

2.0 match 3.72 score 13 scripts

haddonm

MQMF:Modelling and Quantitative Methods in Fisheries

Complements the book "Using R for Modelling and Quantitative Methods in Fisheries" ISBN 9780367469894, published in 2021 by Chapman & Hall in their "Using R series". There are numerous functions and data-sets that are used in the book's many practical examples.

Maintained by Malcolm Haddon. Last updated 2 years ago.

ecology fisheries haddon quantitative-methods uncertainty

1.8 match 11 stars 4.14 score 25 scripts

zaynesember

speccurvieR:Easy, Fast, and Pretty Specification Curve Analysis

Making specification curve analysis easy, fast, and pretty. It improves upon existing offerings with additional features and 'tidyverse' integration. Users can easily visualize and evaluate how their models behave under different specifications with a high degree of customization. For a description and applications of specification curve analysis see Simonsohn, Simmons, and Nelson (2020) <doi:10.1038/s41562-020-0912-z>.

Maintained by Zayne Sember. Last updated 6 months ago.

regression-diagnostics specification-curve-analysis specification-curve-plot

1.8 match 4 stars 4.00 score 2 scripts

choi-phd

maat:Multiple Administrations Adaptive Testing

Provides an extension of the shadow-test approach to computerized adaptive testing (CAT) implemented in the 'TestDesign' package for the assessment framework involving multiple tests administered periodically throughout the year. This framework is referred to as the Multiple Administrations Adaptive Testing (MAAT) and supports multiple item pools vertically scaled and multiple phases (stages) of CAT within each test. Between phases and tests, transitioning from one item pool (and associated constraints) to another is allowed as deemed necessary to enhance the quality of measurement.

Maintained by Seung W. Choi. Last updated 9 months ago.

1.8 match 4.00 score 5 scripts

florafauna

gapfill:Fill Missing Values in Satellite Data

Tools to fill missing values in satellite data and to develop new gap-fill algorithms. The methods are tailored to data (images) observed at equally-spaced points in time. The package is illustrated with MODIS NDVI data.

Maintained by Florian Gerber. Last updated 4 years ago.

cpp

2.3 match 2 stars 3.18 score 15 scripts

cran

halk:Methods to Create Hierarchical Age Length Keys for Age Assignment

Provides methods for implementing hierarchical age length keys to estimate fish ages from lengths using data borrowing. Users can create hierarchical age length keys and use them to assign ages given length.

Maintained by Paul Frater. Last updated 1 years ago.

3.4 match 2.00 score 4 scripts

promidat

forecasteR:Time Series Forecast System

A web application for displaying, analysing and forecasting univariate time series. Includes basic methods such as mean, naïve, seasonal naïve and drift, as well as more complex methods such as Holt-Winters Box,G and Jenkins, G (1976) <doi:10.1111/jtsa.12194> and ARIMA Brockwell, P.J. and R.A.Davis (1991) <doi:10.1007/978-1-4419-0320-4>.

Maintained by Oldemar Rodriguez. Last updated 2 years ago.

3.3 match 2.00 score 2 scripts

cran

SoftBart:Implements the SoftBart Algorithm

Implements the SoftBart model of described by Linero and Yang (2018) <doi:10.1111/rssb.12293>, with the optional use of a sparsity-inducing prior to allow for variable selection. For usability, the package maintains the same style as the 'BayesTree' package.

Maintained by Antonio R. Linero. Last updated 2 years ago.

openblas cpp

3.3 match 2.00 score

cran

PerMat:Performance Metrics in Predictive Modeling

Performance metric provides different performance measures like mean squared error, root mean square error, mean absolute deviation, mean absolute percentage error etc. of a fitted model. These can provide a way for forecasters to quantitatively compare the performance of competing models. For method details see (i) Pankaj Das (2020) <http://krishi.icar.gov.in/jspui/handle/123456789/44138>.

Maintained by Pankaj Das. Last updated 9 months ago.

3.3 match 2.00 score

mohmedsoudy

MERO:Performing Monte Carlo Expectation Maximization Random Forest Imputation for Biological Data

Perform missing value imputation for biological data using the random forest algorithm, the imputation aim to keep the original mean and standard deviation consistent after imputation.

Maintained by Mohamed Soudy. Last updated 2 years ago.

5.0 match 1.23 score 17 scripts

cran

DTWBI:Imputation of Time Series Based on Dynamic Time Warping

Functions to impute large gaps within time series based on Dynamic Time Warping methods. It contains all required functions to create large missing consecutive values within time series and to fill them, according to the paper Phan et al. (2017), <DOI:10.1016/j.patrec.2017.08.019>. Performance criteria are added to compare similarity between two signals (query and reference).

Maintained by Emilie Poisson-Caillault. Last updated 7 years ago.

4.0 match 1.48 score 1 dependents

chelbert

DiceEval:Construction and Evaluation of Metamodels

Estimation, validation and prediction of models of different types : linear models, additive models, MARS,PolyMARS and Kriging.

Maintained by C. Helbert. Last updated 1 years ago.

3.3 match 1.81 score 64 scripts

bdj34

cloneRate:Estimate Growth Rates from Phylogenetic Trees

Quickly estimate the net growth rate of a population or clone whose growth can be approximated by a birth-death branching process. Input should be phylogenetic tree(s) of clone(s) with edge lengths corresponding to either time or mutations. Based on coalescent results in Johnson et al. (2023) <doi:10.1093/bioinformatics/btad561>. Simulation techniques as well as growth rate methods build on prior work from Lambert A. (2018) <doi:10.1016/j.tpb.2018.04.005> and Stadler T. (2009) <doi:10.1016/j.jtbi.2009.07.018>.

Maintained by Brian Johnson. Last updated 11 months ago.

cpp

1.2 match 4 stars 4.90 score 8 scripts

cran

conf:Visualization and Analysis of Statistical Measures of Confidence

Enables: (1) plotting two-dimensional confidence regions, (2) coverage analysis of confidence region simulations, (3) calculating confidence intervals and the associated actual coverage for binomial proportions, (4) calculating the support values and the probability mass function of the Kaplan-Meier product-limit estimator, and (5) plotting the actual coverage function associated with a confidence interval for the survivor function from a randomly right-censored data set. Each is given in greater detail next. (1) Plots the two-dimensional confidence region for probability distribution parameters (supported distribution suffixes: cauchy, gamma, invgauss, logis, llogis, lnorm, norm, unif, weibull) corresponding to a user-given complete or right-censored dataset and level of significance. The crplot() algorithm plots more points in areas of greater curvature to ensure a smooth appearance throughout the confidence region boundary. An alternative heuristic plots a specified number of points at roughly uniform intervals along its boundary. Both heuristics build upon the radial profile log-likelihood ratio technique for plotting confidence regions given by Jaeger (2016) <doi:10.1080/00031305.2016.1182946>, and are detailed in a publication by Weld et al. (2019) <doi:10.1080/00031305.2018.1564696>. (2) Performs confidence region coverage simulations for a random sample drawn from a user- specified parametric population distribution, or for a user-specified dataset and point of interest with coversim(). (3) Calculates confidence interval bounds for a binomial proportion with binomTest(), calculates the actual coverage with binomTestCoverage(), and plots the actual coverage with binomTestCoveragePlot(). Calculates confidence interval bounds for the binomial proportion using an ensemble of constituent confidence intervals with binomTestEnsemble(). Calculates confidence interval bounds for the binomial proportion using a complete enumeration of all possible transitions from one actual coverage acceptance curve to another which minimizes the root mean square error for n <= 15 and follows the transitions for well-known confidence intervals for n > 15 using binomTestMSE(). (4) The km.support() function calculates the support values of the Kaplan-Meier product-limit estimator for a given sample size n using an induction algorithm described in Qin et al. (2023) <doi:10.1080/00031305.2022.2070279>. The km.outcomes() function generates a matrix containing all possible outcomes (all possible sequences of failure times and right-censoring times) of the value of the Kaplan-Meier product-limit estimator for a particular sample size n. The km.pmf() function generates the probability mass function for the support values of the Kaplan-Meier product-limit estimator for a particular sample size n, probability of observing a failure h at the time of interest expressed as the cumulative probability percentile associated with X = min(T, C), where T is the failure time and C is the censoring time under a random-censoring scheme. The km.surv() function generates multiple probability mass functions of the Kaplan-Meier product-limit estimator for the same arguments as those given for km.pmf(). (5) The km.coverage() function plots the actual coverage function associated with a confidence interval for the survivor function from a randomly right-censored data set for one or more of the following confidence intervals: Greenwood, log-minus-log, Peto, arcsine, and exponential Greenwood. The actual coverage function is plotted for a small number of items on test, stated coverage, failure rate, and censoring rate. The km.coverage() function can print an optional table containing all possible failure/censoring orderings, along with their contribution to the actual coverage function.

Maintained by Christopher Weld. Last updated 10 months ago.

1.8 match 3.15 score

skranz

synthdid:Synthetic Difference-in-Difference Estimation

Estimate average treatment effects in panel data. Currently provides methods only for the case that all treated units adopt treatment at the same time.

Maintained by David A. Hirshberg. Last updated 4 years ago.

1.6 match 3.40 score 84 scripts 1 dependents

cran

regressoR:Regression Data Analysis System

Perform a supervised data analysis on a database through a 'shiny' graphical interface. It includes methods such as linear regression, penalized regression, k-nearest neighbors, decision trees, ada boosting, extreme gradient boosting, random forest, neural networks, deep learning and support vector machines.

Maintained by Oldemar Rodriguez. Last updated 4 months ago.

4.0 match 2 stars 1.30 score

samhaycock

stressor:Algorithms for Testing Models under Stress

Traditional model evaluation metrics fail to capture model performance under less than ideal conditions. This package employs techniques to evaluate models "under-stress". This includes testing models' extrapolation ability, or testing accuracy on specific sub-samples of the overall model space. Details describing stress-testing methods in this package are provided in Haycock (2023) <doi:10.26076/2am5-9f67>. The other primary contribution of this package is provided to R users access to the 'Python' library 'PyCaret' <https://pycaret.org/> for quick and easy access to auto-tuned machine learning models.

Maintained by Sam Haycock. Last updated 11 months ago.

1.8 match 2.70 score 6 scripts

yeyuan98

clockSim:Simulation of the Circadian Clock Gene Network

A preconfigured simulation workflow for the circadian clock gene network.

Maintained by Ye Yuan. Last updated 6 days ago.

1.7 match 2.78 score 3 scripts

kwb-r

kwb.hantush:Calculation of Groundwater Mounding Beneath an Infiltration Basin

Calculation groundwater mounding beneath an infiltration basin based on the Hantush (1967) equation (<doi:10.1029/WR003i001p00227>). The correct implementation is shown with a verification example based on a USGS report (page 25, <https://pubs.usgs.gov/sir/2010/5102/support/sir2010-5102.pdf#page=35>).

Maintained by Michael Rustler. Last updated 5 years ago.

groundwater-modelling groundwater-mounding infiltration-basin modelling project-demeau shiny-app

1.6 match 3.00 score 10 scripts

asmahani

EnsembleBase:Extensible Package for Parallel, Batch Training of Base Learners for Ensemble Modeling

Extensible S4 classes and methods for batch training of regression and classification algorithms such as Random Forest, Gradient Boosting Machine, Neural Network, Support Vector Machines, K-Nearest Neighbors, Penalized Regression (L1/L2), and Bayesian Additive Regression Trees. These algorithms constitute a set of 'base learners', which can subsequently be combined together to form ensemble predictions. This package provides cross-validation wrappers to allow for downstream application of ensemble integration techniques, including best-error selection. All base learner estimation objects are retained, allowing for repeated prediction calls without the need for re-training. For large problems, an option is provided to save estimation objects to disk, along with prediction methods that utilize these objects. This allows users to train and predict with large ensembles of base learners without being constrained by system RAM.

Maintained by Alireza S. Mahani. Last updated 2 months ago.

openjdk

2.3 match 1.95 score 5 scripts 3 dependents

tidymodels

tidyposterior:Bayesian Analysis to Compare Models using Resampling Statistics

Bayesian analysis used here to answer the question: "when looking at resampling results, are the differences between models 'real'?" To answer this, a model can be created were the performance statistic is the resampling statistics (e.g. accuracy or RMSE). These values are explained by the model types. In doing this, we can get parameter estimates for each model's affect on performance and make statistical (and practical) comparisons between models. The methods included here are similar to Benavoli et al (2017) <https://jmlr.org/papers/v18/16-305.html>.

Maintained by Max Kuhn. Last updated 5 months ago.

0.5 match 102 stars 8.44 score 273 scripts

ludovikcoba

rrecsys:Environment for Evaluating Recommender Systems

Processes standard recommendation datasets (e.g., a user-item rating matrix) as input and generates rating predictions and lists of recommended items. Standard algorithm implementations which are included in this package are the following: Global/Item/User-Average baselines, Weighted Slope One, Item-Based KNN, User-Based KNN, FunkSVD, BPR and weighted ALS. They can be assessed according to the standard offline evaluation methodology (Shani, et al. (2011) <doi:10.1007/978-0-387-85820-3_8>) for recommender systems using measures such as MAE, RMSE, Precision, Recall, F1, AUC, NDCG, RankScore and coverage measures. The package (Coba, et al.(2017) <doi: 10.1007/978-3-319-60042-0_36>) is intended for rapid prototyping of recommendation algorithms and education purposes.

Maintained by Ludovik Çoba. Last updated 3 years ago.

cpp

0.5 match 23 stars 6.84 score 25 scripts

gloewing

sMTL:Sparse Multi-Task Learning

Implements L0-constrained Multi-Task Learning and domain generalization algorithms. The algorithms are coded in Julia allowing for fast implementations of the coordinate descent and local combinatorial search algorithms. For more details, see a preprint of the paper: Loewinger et al., (2022) <arXiv:2212.08697>.

Maintained by Gabriel Loewinger. Last updated 2 years ago.

3.3 match 1.00 score 8 scripts

encoreus

MCCM:Mixed Correlation Coefficient Matrix

The IRLS (Iteratively Reweighted Least Squares) and GMM (Generalized Method of Moments) methods are applied to estimate mixed correlation coefficient matrix (Pearson, Polyseries, Polychoric), which can be estimated in pairs or simultaneously. For more information see Peng Zhang and Ben Liu (2024) <doi:10.1080/10618600.2023.2257251>; Ben Liu and Peng Zhang (2024) <doi:10.48550/arXiv.2404.06781>.

Maintained by Ben Liu. Last updated 11 months ago.

3.3 match 1.00 score

guangbaog

DTSR:Distributed Trimmed Scores Regression for Handling Missing Data

Provides functions for handling missing data using Distributed Trimmed Scores Regression and other imputation methods. It includes facilities for data imputation, evaluation metrics, and clustering analysis. It is designed to work in distributed computing environments to handle large datasets efficiently. The philosophy of the package is described in Guo G. (2024) <doi:10.1080/03610918.2022.2091779>.

Maintained by Guangbao Guo. Last updated 4 months ago.

3.1 match 1.00 score

paolomaranzano

SCDA:Spatially-Clustered Data Analysis

Contains functions for statistical data analysis based on spatially-clustered techniques. The package allows estimating the spatially-clustered spatial regression models presented in Cerqueti, Maranzano \& Mattera (2024), "Spatially-clustered spatial autoregressive models with application to agricultural market concentration in Europe", arXiv preprint 2407.15874 <doi:10.48550/arXiv.2407.15874>. Specifically, the current release allows the estimation of the spatially-clustered linear regression model (SCLM), the spatially-clustered spatial autoregressive model (SCSAR), the spatially-clustered spatial Durbin model (SCSEM), and the spatially-clustered linear regression model with spatially-lagged exogenous covariates (SCSLX). From release 0.0.2, the library contains functions to estimate spatial clustering based on Adiajacent Matrix K-Means (AMKM) as described in Zhou, Liu \& Zhu (2019), "Weighted adjacent matrix for K-means clustering", Multimedia Tools and Applications, 78 (23) <doi:10.1007/s11042-019-08009-x>.

Maintained by Paolo Maranzano. Last updated 5 months ago.

1.6 match 1.79 score 31 scripts

cran

serieslcb:Lower Confidence Bounds for Binomial Series System

Calculate and compare lower confidence bounds for binomial series system reliability. The R 'shiny' application, launched by the function launch_app(), weaves together a workflow of customized simulations and delta coverage calculations to output recommended lower confidence bound methods.

Maintained by Edward Schuberg. Last updated 6 years ago.

2.3 match 1.20 score 16 scripts

wasquith

copBasic:General Bivariate Copula Theory and Many Utility Functions

Extensive functions for bivariate copula (bicopula) computations and related operations for bicopula theory. The lower, upper, product, and select other bicopula are implemented along with operations including the diagonal, survival copula, dual of a copula, co-copula, and numerical bicopula density. Level sets, horizontal and vertical sections are supported. Numerical derivatives and inverses of a bicopula are provided through which simulation is implemented. Bicopula composition, convex combination, asymmetry extension, and products also are provided. Support extends to the Kendall Function as well as the Lmoments thereof. Kendall Tau, Spearman Rho and Footrule, Gini Gamma, Blomqvist Beta, Hoeffding Phi, Schweizer- Wolff Sigma, tail dependency, tail order, skewness, and bivariate Lmoments are implemented, and positive/negative quadrant dependency, left (right) increasing (decreasing) are available. Other features include Kullback-Leibler Divergence, Vuong Procedure, spectral measure, and Lcomoments for inference, maximum likelihood, and AIC, BIC, and RMSE for goodness-of-fit.

Maintained by William Asquith. Last updated 1 months ago.

0.5 match 5.34 score 139 scripts 5 dependents

oscarperpinan

tdr:Target Diagram

Implementation of target diagrams using 'lattice' and 'ggplot2' graphics. Target diagrams provide a graphical overview of the respective contributions of the unbiased RMSE and MBE to the total RMSE (Jolliff, J. et al., 2009. "Summary Diagrams for Coupled Hydrodynamic-Ecosystem Model Skill Assessment." Journal of Marine Systems 76: 64–82.)

Maintained by Oscar Perpinan Lamigueiro. Last updated 9 years ago.

0.8 match 2 stars 2.96 score 46 scripts

cran

deFit:Fitting Differential Equations to Time Series Data

Use numerical optimization to fit ordinary differential equations (ODEs) to time series data to examine the dynamic relationships between variables or the characteristics of a dynamical system. It can now be used to estimate the parameters of ODEs up to second order, and can also apply to multilevel systems. See <https://github.com/yueqinhu/defit> for details.

Maintained by Yueqin Hu. Last updated 5 months ago.

1.8 match 1.00 score 2 scripts

joshlmiller1978

TSEind:Total Survey Error (Independent Samples)

Calculates total survey error (TSE) for one or more surveys, using both scale-dependent and scale-independent metrics. Package works directly from the data set, with no hand calculations required: just upload a properly structured data set (see TESTIND and its documentation), properly input column names (see functions documentation), and run your functions. For more on TSE, see: Weisberg, Herbert (2005, ISBN:0-226-89128-3); Biemer, Paul (2010) <doi:10.1093/poq/nfq058>; Biemer, Paul et.al. (2017, ISBN:9781119041672); etc.

Maintained by Joshua Miller. Last updated 6 years ago.

1.7 match 1.00 score

cran

EMAR:Empirical Model Assessment

A tool that allows users to generate various indices for evaluating statistical models. The fitstat() function computes indices based on the fitting data. The valstat() function computes indices based on the validation data set. Both fitstat() and valstat() will return 16 indices SSR: residual sum of squares, TRE: total relative error, Bias: mean bias, MRB: mean relative bias, MAB: mean absolute bias, MAPE: mean absolute percentage error, MSE: mean squared error, RMSE: root mean square error, Percent.RMSE: percentage root mean squared error, R2: coefficient of determination, R2adj: adjusted coefficient of determination, APC: Amemiya's prediction criterion, logL: Log-likelihood, AIC: Akaike information criterion, AICc: corrected Akaike information criterion, BIC: Bayesian information criterion, HQC: Hannan-Quin information criterion. The lower the better for the SSR, TRE, Bias, MRB, MAB, MAPE, MSE, RMSE, Percent.RMSE, APC, AIC, AICc, BIC and HQC indices. The higher the better for R2 and R2adj indices. Petre Stoica, P., Selén, Y. (2004) <doi:10.1109/MSP.2004.1311138>\n Zhou et al. (2023) <doi:10.3389/fpls.2023.1186250>\n Ogana, F.N., Ercanli, I. (2021) <doi:10.1007/s11676-021-01373-1>\n Musabbikhah et al. (2019) <doi:10.1088/1742-6596/1175/1/012270>.

Maintained by Friday Nwabueze Ogana. Last updated 1 years ago.

1.0 match 1.00 score

sandipgarai

AllMetrics:Calculating Multiple Performance Metrics of a Prediction Model

Provides a function to calculate multiple performance metrics for actual and predicted values. In total eight metrics will be calculated for particular actual and predicted series. Helps to describe a Statistical model's performance in predicting a data. Also helps to compare various models' performance. The metrics are Root Mean Squared Error (RMSE), Relative Root Mean Squared Error (RRMSE), Mean absolute Error (MAE), Mean absolute percentage error (MAPE), Mean Absolute Scaled Error (MASE), Nash-Sutcliffe Efficiency (NSE), Willmott’s Index (WI), and Legates and McCabe Index (LME). Among them, first five are expected to be lesser whereas, the last three are greater the better. More details can be found from Garai and Paul (2023) <doi:10.1016/j.iswa.2023.200202> and Garai et al. (2024) <doi:10.1007/s11063-024-11552-w>.

Maintained by Dr. Sandip Garai. Last updated 1 years ago.

0.5 match 1.78 score 2 dependents

agulb

ehaGoF:Calculates Goodness of Fit Statistics

Calculates 15 different goodness of fit criteria. These are; standard deviation ratio (SDR), coefficient of variation (CV), relative root mean square error (RRMSE), Pearson's correlation coefficients (PC), root mean square error (RMSE), performance index (PI), mean error (ME), global relative approximation error (RAE), mean relative approximation error (MRAE), mean absolute percentage error (MAPE), mean absolute deviation (MAD), coefficient of determination (R-squared), adjusted coefficient of determination (adjusted R-squared), Akaike's information criterion (AIC), corrected Akaike's information criterion (CAIC), Mean Square Error (MSE), Bayesian Information Criterion (BIC) and Normalized Mean Square Error (NMSE).

Maintained by Alper Gulbe. Last updated 5 years ago.

0.5 match 1.70 score 50 scripts

ranjitstat

WaveletSVR:Wavelet-SVR Hybrid Model for Time Series Forecasting

The main aim of this package is to combine the advantage of wavelet and support vector machine models for time series forecasting. This package also gives the accuracy measurements in terms of RMSE and MAPE. This package fits the hybrid Wavelet SVR model for time series forecasting The main aim of this package is to combine the advantage of wavelet and Support Vector Regression (SVR) models for time series forecasting. This package also gives the accuracy measurements in terms of Root Mean Square Error (RMSE) and Mean Absolute Prediction Error (MAPE). This package is based on the algorithm of Raimundo and Okamoto (2018) <DOI: 10.1109/INFOCT.2018.8356851>.

Maintained by Ranjit Kumar Paul. Last updated 3 years ago.

0.8 match 1.00 score

ranjitstat

WaveletRF:Wavelet-RF Hybrid Model for Time Series Forecasting

The Wavelet Decomposition followed by Random Forest Regression (RF) models have been applied for time series forecasting. The maximum overlap discrete wavelet transform (MODWT) algorithm was chosen as it works for any length of the series. The series is first divided into training and testing sets. In each of the wavelet decomposed series, the supervised machine learning approach namely random forest was employed to train the model. This package also provides accuracy metrics in the form of Root Mean Square Error (RMSE) and Mean Absolute Prediction Error (MAPE). This package is based on the algorithm of Ding et al. (2021) <DOI: 10.1007/s11356-020-12298-3>.

Maintained by Ranjit Kumar Paul. Last updated 3 years ago.

0.5 match 1.00 score

kapiliasri

decompML:Decomposition Based Machine Learning Model

The hybrid model is a highly effective forecasting approach that integrates decomposition techniques with machine learning to enhance time series prediction accuracy. Each decomposition technique breaks down a time series into multiple intrinsic mode functions (IMFs), which are then individually modeled and forecasted using machine learning algorithms. The final forecast is obtained by aggregating the predictions of all IMFs, producing an ensemble output for the time series. The performance of the developed models is evaluated using international monthly maize price data, assessed through metrics such as root mean squared error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE). For method details see Choudhary, K. et al. (2023). <https://ssca.org.in/media/14_SA44052022_R3_SA_21032023_Girish_Jha_FINAL_Finally.pdf>.

Maintained by Kapil Choudhary. Last updated 26 days ago.

0.5 match 1.00 score

imiqbal

Imneuron:AI Powered Neural Network Solutions for Regression Tasks

It offers a sophisticated and versatile tool for creating and evaluating artificial intelligence based neural network models tailored for regression analysis on datasets with continuous target variables. Leveraging the power of neural networks, it allows users to experiment with various hidden neuron configurations across two layers, optimizing model performance through "5 fold"" or "10 fold"" cross validation. The package normalizes input data to ensure efficient training and assesses model accuracy using key metrics such as R squared (R2), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Percentage Error (PER). By storing and visualizing the best performing models, it provides a comprehensive solution for precise and efficient regression modeling making it an invaluable tool for data scientists and researchers aiming to harness AI for predictive analytics.

Maintained by M Iqbal Jeelani. Last updated 9 months ago.

0.5 match 1.00 score