Showing 141 of 141 results
hzambran
hydroGOF:Goodness-of-Fit Functions for Comparison of Simulated and Observed Hydrological Time Series
S3 functions implementing both statistical and graphical goodness-of-fit measures between observed and simulated values, mainly oriented to be used during the calibration, validation, and application of hydrological models. Missing values in observed and/or simulated values can be removed before computations. Comments, questions, and collaboration of any kind are very welcome.
Maintained by Mauricio Zambrano-Bigiarini. Last updated 10 months ago.
13.9 match 40 stars 10.29 score 796 scripts 8 dependents
svkucheryavski
mdatools:Multivariate Data Analysis for Chemometrics
Projection based methods for preprocessing, exploring and analysis of multivariate data used in chemometrics. S. Kucheryavskiy (2020) <doi:10.1016/j.chemolab.2020.103937>.
Maintained by Sergey Kucheryavskiy. Last updated 8 months ago.
11.3 match 35 stars 7.37 score 220 scripts 1 dependent
tirgit
missCompare:Intuitive Missing Data Imputation Framework
Offers a convenient pipeline to test and compare various missing data imputation algorithms on simulated and real data. These include simpler methods, such as mean and median imputation and random replacement, but also include more sophisticated algorithms already implemented in popular R packages, such as 'mi', described by Su et al. (2011) <doi:10.18637/jss.v045.i02>; 'mice', described by van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>; 'missForest', described by Stekhoven and Buhlmann (2012) <doi:10.1093/bioinformatics/btr597>; 'missMDA', described by Josse and Husson (2016) <doi:10.18637/jss.v070.i01>; and 'pcaMethods', described by Stacklies et al. (2007) <doi:10.1093/bioinformatics/btm069>. The central assumption behind 'missCompare' is that structurally different datasets (e.g. larger datasets with a large number of correlated variables vs. smaller datasets with non correlated variables) will benefit differently from different missing data imputation algorithms. 'missCompare' takes measurements of your dataset and sets up a sandbox to try a curated list of standard and sophisticated missing data imputation algorithms and compares them assuming custom missingness patterns. 'missCompare' will also impute your real-life dataset for you after the selection of the best performing algorithm in the simulations. The package also provides various post-imputation diagnostics and visualizations to help you assess imputation performance.
Maintained by Tibor V. Varga. Last updated 4 years ago.
comparison, comparison-benchmarks, imputation, imputation-algorithm, imputation-methods, imputations, kolmogorov-smirnov, missing, missing-data, missing-data-imputation, missing-status-check, missing-values, missingness, post-imputation-diagnostics, rmse
12.5 match 39 stars 5.89 score 40 scripts
topepo
caret:Classification and Regression Training
Misc functions for training and plotting classification and regression models.
Maintained by Max Kuhn. Last updated 3 months ago.
3.3 match 1.6k stars 19.24 score 61k scripts 303 dependents
eagerai
fastai:Interface to 'fastai'
The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research into deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.
Maintained by Turgut Abdullayev. Last updated 11 months ago.
audio, collaborative-filtering, darknet, darknet-image-classification, fastai, medical, object-detection, tabular, text, vision
6.6 match 118 stars 9.40 score 76 scripts
tidymodels
yardstick:Tidy Characterizations of Model Performance
Tidy tools for quantifying how well a model fits a data set, such as confusion matrices, class probability curve summaries, and regression metrics (e.g., RMSE).
Maintained by Emil Hvitfeldt. Last updated 4 days ago.
3.8 match 387 stars 15.47 score 2.2k scripts 60 dependents
gacarrillor
vec2dtransf:2D Cartesian Coordinate Transformation
Applies affine and similarity transformations on vector spatial data (sp objects). Transformations can be defined from control points or directly from parameters. If redundant control points are provided, least squares is applied, which yields residuals and RMSE.
Maintained by German Carrillo. Last updated 3 months ago.
2d, affine, affine-transformation, coordinates, least-squares, rmse, similarity-transformations, sp-objects, transformations
13.8 match 5 stars 3.97 score 37 scripts
tidyverse
modelr:Modelling Functions that Work with the Pipe
Functions for modelling that help you seamlessly integrate modelling into a pipeline of data manipulation and visualisation.
Maintained by Hadley Wickham. Last updated 1 year ago.
3.3 match 401 stars 16.44 score 6.9k scripts 1.0k dependents
easystats
performance:Assessment of Regression Models Performance
Utilities for computing measures to assess model quality, which are not directly provided by R's 'base' or 'stats' packages. These include e.g. measures like r-squared, intraclass correlation coefficient (Nakagawa, Johnson & Schielzeth (2017) <doi:10.1098/rsif.2017.0213>), root mean squared error or functions to check models for overdispersion, singularity or zero-inflation and more. Functions apply to a large variety of regression models, including generalized linear models, mixed effects models and Bayesian models. References: Lüdecke et al. (2021) <doi:10.21105/joss.03139>.
Maintained by Daniel Lüdecke. Last updated 18 days ago.
aic, easystats, hacktoberfest, loo, machine-learning, mixed-models, models, performance, r2, statistics
3.3 match 1.1k stars 16.17 score 4.3k scripts 47 dependents
bioc
benchdamic:Benchmark of differential abundance methods on microbiome data
Starting from a microbiome dataset (16S or WMS with absolute count values) it is possible to perform several analyses to assess the performance of many differential abundance detection methods. A basic and standardized version of the main differential abundance analysis methods is supplied, but users can also add their own methods to the benchmark. The analyses focus on 4 main aspects: i) the goodness of fit of each method's distributional assumptions on the observed count data, ii) the ability to control the false discovery rate, iii) the within and between method concordances, iv) the truthfulness of the findings if any a priori knowledge is given. Several graphical functions are available for result visualization.
Maintained by Matteo Calgaro. Last updated 4 months ago.
metagenomics, microbiome, differentialexpression, multiplecomparison, normalization, preprocessing, software, benchmark, differential-abundance-methods
8.8 match 6 stars 5.73 score 8 scripts
petolau
TSrepr:Time Series Representations
Methods for representations (i.e. dimensionality reduction, preprocessing, feature extraction) of time series to support more accurate and effective time series data mining. Non-data adaptive, data adaptive, model-based and data dictated (clipped) representation methods are implemented. Various normalisation methods (min-max, z-score, Box-Cox, Yeo-Johnson) and forecasting accuracy measures are also implemented.
Maintained by Peter Laurinec. Last updated 5 years ago.
data-analysis, data-mining, data-mining-algorithms, data-science, representation, time-series, time-series-analysis, time-series-classification, time-series-clustering, time-series-data-mining, time-series-representations, cpp
6.6 match 97 stars 7.23 score 117 scripts
nelson-n
lmForc:Linear Model Forecasting
Introduces in-sample, out-of-sample, pseudo out-of-sample, and benchmark model forecast tests and a new class for working with forecast data, Forecast.
Maintained by Nelson Rayl. Last updated 7 months ago.
8.9 match 6 stars 5.26 score 20 scripts
philchalmers
SimDesign:Structure for Organizing Monte Carlo Simulation Designs
Provides tools to safely and efficiently organize and execute Monte Carlo simulation experiments in R. The package controls the structure and back-end of Monte Carlo simulation experiments by utilizing a generate-analyse-summarise workflow. The workflow safeguards against common simulation coding issues, such as automatically re-simulating non-convergent results, prevents inadvertently overwriting simulation files, catches error and warning messages during execution, implicitly supports parallel processing with high-quality random number generation, and provides tools for managing high-performance computing (HPC) array jobs submitted to schedulers such as SLURM. For a pedagogical introduction to the package see Sigal and Chalmers (2016) <doi:10.1080/10691898.2016.1246953>. For a more in-depth overview of the package and its design philosophy see Chalmers and Adkins (2020) <doi:10.20982/tqmp.16.4.p248>.
Maintained by Phil Chalmers. Last updated 4 hours ago.
monte-carlo-simulation, simulation, simulation-framework
3.3 match 62 stars 13.36 score 253 scripts 46 dependents
mfrasco
Metrics:Evaluation Metrics for Machine Learning
An implementation of evaluation metrics in R that are commonly used in supervised machine learning. It implements metrics for regression, time series, binary classification, classification, and information retrieval problems. It has zero dependencies and a consistent, simple interface for all functions.
Maintained by Michael Frasco. Last updated 6 years ago.
3.3 match 99 stars 13.02 score 6.1k scripts 51 dependents
jknowles
merTools:Tools for Analyzing Mixed Effect Regression Models
Provides methods for extracting results from mixed-effect model objects fit with the 'lme4' package. Allows construction of prediction intervals efficiently from large scale linear and generalized linear mixed-effects models. This method draws from the simulation framework used in the Gelman and Hill (2007) textbook: Data Analysis Using Regression and Multilevel/Hierarchical Models.
Maintained by Jared E. Knowles. Last updated 1 year ago.
4.0 match 105 stars 10.49 score 768 scripts
geco-bern
rsofun:The P-Model and BiomeE Modelling Framework
Implements the Simulating Optimal FUNctioning framework for site-scale simulations of ecosystem processes, including model calibration. It contains 'Fortran 90' modules for the P-model (Stocker et al. (2020) <doi:10.5194/gmd-13-1545-2020>), SPLASH (Davis et al. (2017) <doi:10.5194/gmd-10-689-2017>) and BiomeE (Weng et al. (2015) <doi:10.5194/bg-12-2655-2015>).
Maintained by Benjamin Stocker. Last updated 13 days ago.
dgvm, growth, modeling, p-model, simulation, vegetation-dynamics, fortran
4.7 match 26 stars 8.77 score 119 scripts
tidyverts
fabletools:Core Tools for Packages in the 'fable' Framework
Provides tools, helpers and data structures for developing models and time series functions for 'fable' and extension packages. These tools support a consistent and tidy interface for time series modelling and analysis.
Maintained by Mitchell OHara-Wild. Last updated 1 month ago.
3.3 match 91 stars 12.18 score 396 scripts 18 dependents
jackstat
ModelMetrics:Rapid Calculation of Model Metrics
Collection of metrics for evaluating models written in C++ using 'Rcpp'. Popular metrics include area under the curve, log loss, root mean square error, etc.
Maintained by Tyler Hunt. Last updated 4 years ago.
auc, logloss, machine-learning, metrics, model-evaluation, model-metrics, cpp
3.3 match 29 stars 11.83 score 1.3k scripts 306 dependents
yanyachen
MLmetrics:Machine Learning Evaluation Metrics
A collection of evaluation metrics, including loss, score and utility functions, that measure regression, classification and ranking performance.
Maintained by Yachen Yan. Last updated 11 months ago.
3.3 match 69 stars 11.09 score 2.2k scripts 20 dependents
bioc
lute:Framework for cell size scale factor normalized bulk transcriptomics deconvolution experiments
Provides a framework for adjusting for cell type size when performing bulk transcriptomics deconvolution. The main framework function provides a means of reference normalization using cell size scale factors. It allows for marker selection and deconvolution using non-negative least squares (NNLS) by default. The framework is extensible for other marker selection and deconvolution algorithms, and users may reuse the generics, methods, and classes for these when developing new algorithms.
Maintained by Sean K Maden. Last updated 5 months ago.
rnaseq, sequencing, singlecell, coverage, transcriptomics, normalization
6.6 match 2 stars 5.26 score 3 scripts
mhahsler
recommenderlab:Lab for Developing and Testing Recommender Algorithms
Provides a research infrastructure to develop and evaluate collaborative filtering recommender algorithms. This includes a sparse representation for user-item matrices, many popular algorithms, top-N recommendations, and cross-validation. Hahsler (2022) <doi:10.48550/arXiv.2205.12371>.
Maintained by Michael Hahsler. Last updated 7 months ago.
collaborative-filtering, recommender-system
3.3 match 214 stars 10.07 score 840 scripts 2 dependents
mharinga
insurancerating:Analytic Insurance Rating Techniques
Functions to build, evaluate, and visualize insurance rating models. It simplifies the process of modeling premiums and allows insurance risk factors to be analyzed effectively. The package employs a data-driven strategy for constructing insurance tariff classes, drawing on the work of Antonio and Valdez (2012) <doi:10.1007/s10182-011-0152-7>.
Maintained by Martin Haringa. Last updated 5 months ago.
actuarial, actuarial-science, insurance, pricing
5.6 match 70 stars 5.89 score 28 scripts
laresbernardo
lares:Analytics & Machine Learning Sidekick
Auxiliary package for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of family functions, like Machine Learning, Data Wrangling, Marketing Mix Modeling (Robyn), Exploratory, API, and Scrapper, it helps the analyst or data scientist to get quick and robust results, without the need of repetitive coding or advanced R programming skills.
Maintained by Bernardo Lares. Last updated 23 days ago.
analytics, api, automation, automl, data-science, descriptive-statistics, h2o, machine-learning, marketing, mmm, predictive-modeling, puzzle, rlanguage, robyn, visualization
3.3 match 233 stars 9.84 score 185 scripts 1 dependent
ben519
mltools:Machine Learning Tools
A collection of machine learning helper functions, particularly assisting in the Exploratory Data Analysis phase. Makes heavy use of the 'data.table' package for optimal speed and memory efficiency. Highlights include a versatile bin_data() function, sparsify() for converting a data.table to sparse matrix format with one-hot encoding, fast evaluation metrics, and empirical_cdf() for calculating empirical Multivariate Cumulative Distribution Functions.
Maintained by Ben Gorman. Last updated 3 years ago.
exploratory-data-analysis, machine-learning
3.3 match 72 stars 9.58 score 1.2k scripts 13 dependents
inbo
dhcurve:Automated Modelling of Diameter Height Curves for Trees
Model diameter height curves for individual tree species and forests.
Maintained by Els Lommelen. Last updated 3 days ago.
10.4 match 3.00 score 8 scripts
brry
berryFunctions:Function Collection Related to Plotting and Hydrology
Draw horizontal histograms, color scattered points by 3rd dimension, enhance date- and log-axis plots, zoom in X11 graphics, trace errors and warnings, use the unit hydrograph in a linear storage cascade, convert lists to data.frames and arrays, fit multiple functions.
Maintained by Berry Boessenkool. Last updated 1 month ago.
3.3 match 13 stars 9.43 score 350 scripts 16 dependents
msainsburydale
NeuralEstimators:Likelihood-Free Parameter Estimation using Neural Networks
An 'R' interface to the 'Julia' package 'NeuralEstimators.jl'. The package facilitates the user-friendly development of neural Bayes estimators, which are neural networks that map data to a point summary of the posterior distribution (Sainsbury-Dale et al., 2024, <doi:10.1080/00031305.2023.2249522>). These estimators are likelihood-free and amortised, in the sense that, once the neural networks are trained on simulated data, inference from observed data can be made in a fraction of the time required by conventional approaches. The package also supports amortised Bayesian or frequentist inference using neural networks that approximate the posterior or likelihood-to-evidence ratio (Zammit-Mangion et al., 2025, Sec. 3.2, 5.2, <doi:10.48550/arXiv.2404.12484>). The package accommodates any model for which simulation is feasible by allowing users to define models implicitly through simulated data.
Maintained by Matthew Sainsbury-Dale. Last updated 14 days ago.
5.0 match 9 stars 5.95 score 3 scripts
beerda
lfl:Linguistic Fuzzy Logic
Various algorithms related to linguistic fuzzy logic: mining for linguistic fuzzy association rules, composition of fuzzy relations, performing perception-based logical deduction (PbLD), and forecasting time-series using fuzzy rule-based ensemble (FRBE). The package also contains basic fuzzy-related algebraic functions capable of handling missing values in different styles (Bochvar, Sobocinski, Kleene etc.), computation of Sugeno integrals and fuzzy transform.
Maintained by Michal Burda. Last updated 4 months ago.
association-rules, forecast-model, fuzzy-logic, inference-rules, cpp, openmp
5.1 match 8 stars 5.35 score 28 scripts
brian-j-smith
MachineShop:Machine Learning Models and Tools
Meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Approaches for model fitting and prediction of numerical, categorical, or censored time-to-event outcomes include traditional regression models, regularization methods, tree-based methods, support vector machines, neural networks, ensembles, data preprocessing, filtering, and model tuning and selection. Performance metrics are provided for model assessment and can be estimated with independent test sets, split sampling, cross-validation, or bootstrap resampling. Resample estimation can be executed in parallel for faster processing and nested in cases of model tuning and selection. Modeling results can be summarized with descriptive statistics; calibration curves; variable importance; partial dependence plots; confusion matrices; and ROC, lift, and other performance curves.
Maintained by Brian J Smith. Last updated 7 months ago.
classification-models, machine-learning, predictive-modeling, regression-models, survival-models
3.3 match 61 stars 7.95 score 121 scripts
jszlek
fscaret:Automated Feature Selection from 'caret'
Automated feature selection using a variety of models provided by the 'caret' package. This work was funded by Poland-Singapore bilateral cooperation project no 2/3/POL-SIN/2012.
Maintained by Jakub Szlek. Last updated 7 years ago.
6.6 match 3.97 score 31 scripts
steffenmoritz
imputeR:A General Multivariate Imputation Framework
Multivariate Expectation-Maximization (EM) based imputation framework that offers several different algorithms. These include regularisation methods like Lasso and Ridge regression, tree-based models and dimensionality reduction methods like PCA and PLS.
Maintained by Steffen Moritz. Last updated 4 years ago.
5.3 match 16 stars 4.94 score 54 scripts
bioc
DeconRNASeq:Deconvolution of Heterogeneous Tissue Samples for mRNA-Seq data
DeconSeq is an R package for deconvolution of heterogeneous tissues based on mRNA-Seq data. It models expression levels from heterogeneous cell populations in mRNA-Seq as the weighted average of expression from the different constituent cell types and predicts cell type proportions of single expression profiles.
Maintained by Ting Gong. Last updated 5 months ago.
4.9 match 5.16 score 72 scripts
gmonette
cv:Cross-Validating Regression Models
Cross-validation methods of regression models that exploit features of various modeling functions to improve speed. Some of the methods implemented in the package are novel, as described in the package vignettes; for general introductions to cross-validation, see, for example, Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani (2021, ISBN 978-1-0716-1417-4, Secs. 5.1, 5.3), "An Introduction to Statistical Learning with Applications in R, Second Edition", and Trevor Hastie, Robert Tibshirani, and Jerome Friedman (2009, ISBN 978-0-387-84857-0, Sec. 7.10), "The Elements of Statistical Learning, Second Edition".
Maintained by Georges Monette. Last updated 4 days ago.
3.3 match 4 stars 7.67 score 86 scripts
rspatial
predicts:Spatial Prediction Tools
Methods for spatial predictive modeling, especially for spatial distribution models. This includes algorithms for model fitting and prediction, as well as methods for model evaluation.
Maintained by Robert J. Hijmans. Last updated 2 months ago.
3.3 match 10 stars 7.46 score 108 scripts 8 dependents
nanxstats
msaenet:Multi-Step Adaptive Estimation Methods for Sparse Regressions
Multi-step adaptive elastic-net (MSAENet) algorithm for feature selection in high-dimensional regressions proposed in Xiao and Xu (2015) <DOI:10.1080/00949655.2015.1016944>, with support for multi-step adaptive MCP-net (MSAMNet) and multi-step adaptive SCAD-net (MSASNet) methods.
Maintained by Nan Xiao. Last updated 8 months ago.
false-positive-control, high-dimensional-data, linear-regression, machine-learning, variable-selection
4.0 match 13 stars 6.01 score 52 scripts
nk027
BVAR:Hierarchical Bayesian Vector Autoregression
Estimation of hierarchical Bayesian vector autoregressive models following Kuschnig & Vashold (2021) <doi:10.18637/jss.v100.i14>. Implements hierarchical prior selection for conjugate priors in the fashion of Giannone, Lenza & Primiceri (2015) <doi:10.1162/REST_a_00483>. Functions to compute and identify impulse responses, calculate forecasts, forecast error variance decompositions and scenarios are available. Several methods to print, plot and summarise results facilitate analysis.
Maintained by Nikolas Kuschnig. Last updated 4 months ago.
bayesian, bvar, forecasts, impulse-responses, vector-autoregressions
3.3 match 51 stars 7.30 score 68 scripts 1 dependent
tychelab
CoSMoS:Complete Stochastic Modelling Solution
Makes univariate, multivariate, or random fields simulations precise and simple. Just select the desired time series or random fields’ properties and it will do the rest. CoSMoS is based on the framework described in Papalexiou (2018, <doi:10.1016/j.advwatres.2018.02.013>), extended for random fields in Papalexiou and Serinaldi (2020, <doi:10.1029/2019WR026331>), and further advanced in Papalexiou et al. (2021, <doi:10.1029/2020WR029466>) to allow fine-scale space-time simulation of storms (or even cyclone-mimicking fields).
Maintained by Kevin Shook. Last updated 4 years ago.
3.3 match 11 stars 7.10 score 77 scripts
egenn
rtemis:Machine Learning and Visualization
Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.
Maintained by E.D. Gennatas. Last updated 1 month ago.
data-science, data-visualization, machine-learning, machine-learning-library, visualization
3.3 match 145 stars 7.09 score 50 scripts 2 dependents
vivienroussez
autoTS:Automatic Model Selection and Prediction for Univariate Time Series
Offers a set of functions to easily make predictions for univariate time series. 'autoTS' is a wrapper of existing functions of the 'forecast' and 'prophet' packages, harmonising their outputs in tidy dataframes and using default values for each. The core function getBestModel() allows the user to effortlessly benchmark seven algorithms along with a bagged estimator to identify which one performs the best for a given time series.
Maintained by Vivien Roussez. Last updated 5 years ago.
4.9 match 10 stars 4.78 score 12 scripts
alexchristensen
NetworkToolbox:Methods and Measures for Brain, Cognitive, and Psychometric Network Analysis
Implements network analysis and graph theory measures used in neuroscience, cognitive science, and psychology. Methods include various filtering methods and approaches such as threshold, dependency (Kenett, Tumminello, Madi, Gur-Gershgoren, Mantegna, & Ben-Jacob, 2010 <doi:10.1371/journal.pone.0015032>), Information Filtering Networks (Barfuss, Massara, Di Matteo, & Aste, 2016 <doi:10.1103/PhysRevE.94.062306>), and Efficiency-Cost Optimization (Fallani, Latora, & Chavez, 2017 <doi:10.1371/journal.pcbi.1005305>). Brain methods include the recently developed Connectome Predictive Modeling (see references in package). Also implements several network measures including local network characteristics (e.g., centrality), community-level network characteristics (e.g., community centrality), global network characteristics (e.g., clustering coefficient), and various other measures associated with the reliability and reproducibility of network analysis.
Maintained by Alexander Christensen. Last updated 2 years ago.
3.3 match 23 stars 6.99 score 101 scripts 4 dependents
bioc
SingleMoleculeFootprinting:Analysis tools for Single Molecule Footprinting (SMF) data
SingleMoleculeFootprinting provides functions to analyze Single Molecule Footprinting (SMF) data. Following the workflow exemplified in its vignette, the user will be able to perform basic data analysis of SMF data with minimal coding effort. Starting from an aligned bam file, we show how to perform quality controls over sequencing libraries, extract methylation information at the single molecule level accounting for the two possible kinds of SMF experiments (single enzyme or double enzyme), classify single molecules based on their patterns of molecular occupancy, and plot SMF information at a given genomic location.
Maintained by Guido Barzaghi. Last updated 27 days ago.
dnamethylation, coverage, nucleosomepositioning, datarepresentation, epigenetics, methylseq, qualitycontrol, sequencing
3.5 match 2 stars 6.43 score 27 scripts
r-forge
modEvA:Model Evaluation and Analysis
Analyses species distribution models and evaluates their performance. It includes functions for variation partitioning, extracting variable importance, computing several metrics of model discrimination and calibration performance, optimizing prediction thresholds based on a number of criteria, performing multivariate environmental similarity surface (MESS) analysis, and displaying various analytical plots. Initially described in Barbosa et al. (2013) <doi:10.1111/ddi.12100>.
Maintained by A. Marcia Barbosa. Last updated 10 days ago.
3.3 match 6.82 score 269 scripts 3 dependents
nanxstats
enpls:Ensemble Partial Least Squares Regression
An algorithmic framework for measuring feature importance, outlier detection, model applicability domain evaluation, and ensemble predictive modeling with (sparse) partial least squares regressions.
Maintained by Nan Xiao. Last updated 3 years ago.
chemometrics, dimensionality-reduction, ensemble-learning, machine-learning, outlier-detection, partial-least-squares-regression
4.0 match 18 stars 5.56 score 40 scripts
mayer79
MetricsWeighted:Weighted Metrics and Performance Measures for Machine Learning
Provides weighted versions of several metrics and performance measures used in machine learning, including average unit deviances of the Bernoulli, Tweedie, Poisson, and Gamma distributions, see Jorgensen B. (1997, ISBN: 978-0412997112). The package also contains a weighted version of generalized R-squared, see e.g. Cohen, J. et al. (2002, ISBN: 978-0805822236). Furthermore, 'dplyr' chains are supported.
Maintained by Michael Mayer. Last updated 8 months ago.
machine-learning, metrics, performance, statistics
3.3 match 11 stars 6.79 score 75 scripts 5 dependents
jfwambaugh
invivoPKfit:Fits Toxicokinetic Models to In Vivo PK Data Sets
Takes in vivo toxicokinetic concentration-time data and fits parameters of 1-compartment and 2-compartment models for each chemical. These methods are described in detail in "Informatics for Toxicokinetics" (submitted).
Maintained by John Wambaugh. Last updated 2 months ago.
8.6 match 2.60 score 4 scripts
moviedo5
fda.usc:Functional Data Analysis and Utilities for Statistical Computing
Routines for exploratory and descriptive analysis of functional data such as depth measurements, atypical curves detection, regression models, supervised classification, unsupervised classification and functional analysis of variance.
Maintained by Manuel Oviedo de la Fuente. Last updated 4 months ago.
functional-data-analysis, fortran
2.3 match 12 stars 9.72 score 560 scripts 22 dependents
nsj3
rioja:Analysis of Quaternary Science Data
Constrained clustering, transfer functions, and other methods for analysing Quaternary science data.
Maintained by Steve Juggins. Last updated 6 months ago.
3.0 match 10 stars 7.21 score 191 scripts 3 dependents
tianxia-jia
mcgf:Markov Chain Gaussian Fields Simulation and Parameter Estimation
Simulating and estimating (regime-switching) Markov chain Gaussian fields with covariance functions of the Gneiting class (Gneiting 2002) <doi:10.1198/016214502760047113>. It supports parameter estimation by weighted least squares and maximum likelihood methods, and produces Kriging forecasts and intervals for existing and new locations.
Maintained by Tianxia Jia. Last updated 9 months ago.
4.4 match 1 star 4.82 score 11 scripts
egonzato
windows.pls:Segmentation Approaches in Chemometrics
Evaluation of the prediction performance of smaller regions of spectra for chemometrics. Segmentation of spectra, evolving dimensions regions and sliding windows as selection methods. Selection of the best model among those computed based on error metrics. Chen et al. (2017) <doi:10.1007/s00216-017-0218-9>.
Maintained by Elia Gonzato. Last updated 2 years ago.
7.8 match 2.70 score 4 scripts
zsteinmetz
envalysis:Miscellaneous Functions for Environmental Analyses
Small toolbox for data analyses in environmental chemistry and ecotoxicology. Provides, for example, calibration() to calculate calibration curves and corresponding limits of detection (LODs) and limits of quantification (LOQs) according to German DIN 32645 (2008). texture() makes it easy to estimate soil particle size distributions from hydrometer measurements (ASTM D422-63, 2007).
Maintained by Zacharias Steinmetz. Last updated 5 months ago.
analytics, chemistry, ecotoxicology, environment, soil
3.3 match 8 stars 6.30 score 83 scripts
blasbenito
spatialRF:Easy Spatial Modeling with Random Forest
Automatic generation and selection of spatial predictors for spatial regression with Random Forest. Spatial predictors are surrogates of variables driving the spatial structure of a response variable. The package offers two methods to generate spatial predictors from a distance matrix among training cases: 1) Moran's Eigenvector Maps (MEMs; Dray, Legendre, and Peres-Neto 2006 <DOI:10.1016/j.ecolmodel.2006.02.015>): computed as the eigenvectors of a weighted matrix of distances; 2) RFsp (Hengl et al. <DOI:10.7717/peerj.5518>): columns of the distance matrix used as spatial predictors. Spatial predictors help minimize the spatial autocorrelation of the model residuals and facilitate an honest assessment of the importance scores of the non-spatial predictors. Additionally, functions to reduce multicollinearity, identify relevant variable interactions, tune random forest hyperparameters, assess model transferability via spatial cross-validation, and explore model results via partial dependence curves and interaction surfaces are included in the package. The modelling functions are built around the highly efficient 'ranger' package (Wright and Ziegler 2017 <DOI:10.18637/jss.v077.i01>).
Maintained by Blas M. Benito. Last updated 3 years ago.
random-forest, spatial-analysis, spatial-regression
3.7 match 114 stars 5.45 score 49 scripts
radiant-rstats
radiant.model:Model Menu for Radiant: Business Analytics using R and Shiny
The Radiant Model menu includes interfaces for linear and logistic regression, naive Bayes, neural networks, classification and regression trees, model evaluation, collaborative filtering, decision analysis, and simulation. The application extends the functionality in 'radiant.data'.
Maintained by Vincent Nijs. Last updated 5 months ago.
3.3 match 19 stars 6.18 score 80 scripts 2 dependents
coatless-rpkg
jjb:Balamuta Miscellaneous
Set of common functions used for manipulating colors, detecting and interacting with 'RStudio', modeling, formatting, determining users' operating system, feature scaling, and more!
Maintained by James Balamuta. Last updated 1 year ago.
5.1 match 2 stars 3.97 score 31 scripts 1 dependent
ymutua
mapsRinteractive:Local Adaptation and Evaluation of Raster Maps
Local adaptation and evaluation of maps of continuous attributes in raster format by use of point location data.
Maintained by Kristin Persson. Last updated 2 years ago.
6.6 match 2 stars 3.00 score 7 scripts
bsnatr
tswge:Time Series for Data Science
Accompanies the texts Time Series for Data Science with R by Woodward, Sadler and Robertson & Applied Time Series Analysis with R, 2nd edition by Woodward, Gray, and Elliott. It is helpful for data analysis and for time series instruction.
Maintained by Bivin Sadler. Last updated 2 years ago.
7.3 match 2.70 score 496 scripts
gdkrmr
DRR:Dimensionality Reduction via Regression
An Implementation of Dimensionality Reduction via Regression using Kernel Ridge Regression.
Maintained by Guido Kraemer. Last updated 2 years ago.
dimensionality-reduction kernel-methods non-linear regression-models
3.7 match 9 stars 5.24 score 8 scripts 1 dependent
cran
nsRFA:Non-Supervised Regional Frequency Analysis
A collection of statistical tools for objective (non-supervised) applications of the Regional Frequency Analysis methods in hydrology. The package refers to the index-value method and, more precisely, helps the hydrologist to: (1) regionalize the index-value; (2) form homogeneous regions with similar growth curves; (3) fit distribution functions to the empirical regional growth curves. Most of the methods are those described in the Flood Estimation Handbook (Centre for Ecology & Hydrology, 1999, ISBN:9781906698003). Homogeneity tests from Hosking and Wallis (1993) <doi:10.1029/92WR01980> and Viglione et al. (2007) <doi:10.1029/2006WR005095> are available.
Maintained by Alberto Viglione. Last updated 10 months ago.
5.6 match 2 stars 3.49 score
gi0na
ghypernet:Fit and Simulate Generalised Hypergeometric Ensembles of Graphs
Provides functions for model fitting and selection of generalised hypergeometric ensembles of random graphs (gHypEG). To learn how to use it, check the vignettes for a quick tutorial. Please reference its use as Casiraghi, G., Nanumyan, V. (2019) <doi:10.5281/zenodo.2555300> together with the relevant references from those listed below. The package is based on the research developed at the Chair of Systems Design, ETH Zurich. Casiraghi, G., Nanumyan, V., Scholtes, I., Schweitzer, F. (2016) <arXiv:1607.02441>. Casiraghi, G., Nanumyan, V., Scholtes, I., Schweitzer, F. (2017) <doi:10.1007/978-3-319-67256-4_11>. Casiraghi, G. (2017) <arXiv:1702.02048>. Brandenberger, L., Casiraghi, G., Nanumyan, V., Schweitzer, F. (2019) <doi:10.1145/3341161.3342926>. Casiraghi, G. (2019) <doi:10.1007/s41109-019-0241-1>. Casiraghi, G., Nanumyan, V. (2021) <doi:10.1038/s41598-021-92519-y>. Casiraghi, G. (2021) <doi:10.1088/2632-072X/ac0493>.
Maintained by Giona Casiraghi. Last updated 11 months ago.
data-mining data-science graphs network network-analysis random-graph-generation random-graphs
3.3 match 8 stars 5.68 score 20 scripts
tomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 year ago.
2.3 match 3 stars 8.20 score 7.8k scripts 11 dependents
pkr-pkr
NNbenchmark:Datasets and Functions to Benchmark Neural Network Packages
Datasets and functions to benchmark (convergence, speed, ease of use) R packages dedicated to regression with neural networks (no classification in this version). The templates for the tested packages are available in the R, R Markdown and HTML formats at <https://github.com/pkR-pkR/NNbenchmarkTemplates> and <https://theairbend3r.github.io/NNbenchmarkWeb/index.html>. The article submitted to the R Journal can be read at <https://www.inmodelia.com/gsoc2020.html>.
Maintained by Patrice Kiener. Last updated 1 year ago.
3.4 match 5 stars 5.33 score 283 scripts
iembry
ie2misc:Irucka Embry's Miscellaneous USGS Functions
A collection of Irucka Embry's miscellaneous USGS functions (processing .exp and .psf files, statistical error functions, "+" dyadic operator for use with NA, creating ADAPS and QW spreadsheet files, calculating saturated enthalpy). Irucka created these functions while a Cherokee Nation Technology Solutions (CNTS) United States Geological Survey (USGS) Contractor and/or USGS employee.
Maintained by Irucka Embry. Last updated 1 year ago.
5.1 match 3.43 score 54 scripts
smart-cities-accelerator
onlineforecast:Forecast Modelling for Online Applications
A framework for fitting adaptive forecasting models. Provides a way to use forecasts as input to models, e.g. weather forecasts for energy related forecasting. The models can be fitted recursively and can easily be set up for updating parameters when new data arrives. See the included vignettes, the website <https://onlineforecasting.org> and the paper "onlineforecast: An R package for adaptive and recursive forecasting" <https://journal.r-project.org/articles/RJ-2023-031/>.
Maintained by Peder Bacher. Last updated 1 year ago.
5.3 match 3 stars 3.28 score 16 scripts
ullid
SoilHyP:Soil Hydraulic Properties
Provides functions for (1) soil water retention (SWC) and unsaturated hydraulic conductivity (Ku) (van Genuchten-Mualem (vGM or vG) [1, 2], Peters-Durner-Iden (PDI) [3, 4, 5], Brooks and Corey (bc) [8]), (2) fitting of parameters for SWC and/or Ku using Shuffled Complex Evolution (SCE) optimisation and (3) calculation of soil hydraulic properties (Ku and soil water contents) based on the simplified evaporation method (SEM) [6, 7]. Main references: [1] van Genuchten (1980) <doi:10.2136/sssaj1980.03615995004400050002x>, [2] Mualem (1976) <doi:10.1029/WR012i003p00513>, [3] Peters (2013) <doi:10.1002/wrcr.20548>, [4] Iden and Durner (2013) <doi:10.1002/2014WR015937>, [5] Peters (2014) <doi:10.1002/2014WR015937>, [6] Wind G. P. (1966), [7] Peters and Durner (2008) <doi:10.1016/j.jhydrol.2008.04.016> and [8] Brooks and Corey (1964).
Maintained by Ullrich Dettmann. Last updated 2 years ago.
5.1 match 3.32 score 35 scripts 2 dependents
rudeboybert
forestecology:Fitting and Assessing Neighborhood Models of the Effect of Interspecific Competition on the Growth of Trees
Code for fitting and assessing models for the growth of trees. In particular for the Bayesian neighborhood competition linear regression model of Allen (2020): methods for model fitting and generating fitted/predicted values, evaluating the effect of competitor species identity using permutation tests, and evaluating model performance using spatial cross-validation.
Maintained by Albert Y. Kim. Last updated 3 years ago.
3.3 match 12 stars 5.12 score 11 scripts
wsqlab
GaSP:Train and Apply a Gaussian Stochastic Process Model
Train a Gaussian stochastic process model of an unknown function, possibly observed with error, via maximum likelihood or maximum a posteriori (MAP) estimation, run model diagnostics, and make predictions, following Sacks, J., Welch, W.J., Mitchell, T.J., and Wynn, H.P. (1989) "Design and Analysis of Computer Experiments", Statistical Science, <doi:10.1214/ss/1177012413>. Perform sensitivity analysis and visualize low-order effects, following Schonlau, M. and Welch, W.J. (2006), "Screening the Input Variables to a Computer Model Via Analysis of Variance and Visualization", <doi:10.1007/0-387-28014-6_14>.
Maintained by William J. Welch. Last updated 9 months ago.
6.2 match 2.70 score 8 scripts
r-forge
qualV:Qualitative Validation Methods
Qualitative methods for the validation of dynamic models. It contains (i) an orthogonal set of deviance measures for absolute, relative and ordinal scale and (ii) approaches accounting for time shifts. The first approach transforms time to take time delays and speed differences into account. The second divides the time series into interval units according to their main features and finds the longest common subsequence (LCS) using a dynamic programming algorithm.
Maintained by Thomas Petzoldt. Last updated 2 years ago.
3.3 match 1 star 4.99 score 49 scripts 26 dependents
neerajdhanraj
imputeTestbench:Test Bench for the Comparison of Imputation Methods
Provides a test bench for the comparison of missing data imputation methods in uni-variate time series. Imputation methods are compared using different error metrics. Proposed imputation methods and alternative error metrics can be used.
Maintained by Marcus W. Beck. Last updated 8 years ago.
3.3 match 5 stars 4.94 score 20 scripts 2 dependents
sollano
forestmangr:Forest Mensuration and Management
Processing forest inventory data with methods such as simple random sampling, stratified random sampling and systematic sampling. There are also functions for yield and growth predictions and model fitting, linear and nonlinear grouped data fitting, and statistical tests. References: Kershaw Jr., Ducey, Beers and Husch (2016). <doi:10.1002/9781118902028>.
Maintained by Sollano Rabelo Braga. Last updated 3 months ago.
2.0 match 17 stars 7.97 score 378 scripts
tripartio
staccuracy:Standardized Accuracy and Other Model Performance Metrics
Standardized accuracy (staccuracy) is a framework for expressing accuracy scores such that 50% represents a reference level of performance and 100% is a perfect prediction. The 'staccuracy' package provides tools for creating staccuracy functions as well as some recommended staccuracy measures. It also provides functions for some classic performance metrics such as mean absolute error (MAE), root mean squared error (RMSE), and area under the receiver operating characteristic curve (AUCROC), as well as their winsorized versions when applicable.
Maintained by Chitu Okoli. Last updated 21 days ago.
3.8 match 1 star 4.18 score 4 scripts 2 dependents
philipppro
measures:Performance Measures for Statistical Learning
Provides the largest number of statistical measures in the whole R world. Includes measures for regression, (multiclass) classification and multilabel classification. The measures come mainly from the 'mlr' package and were programmed by several 'mlr' developers.
Maintained by Philipp Probst. Last updated 4 years ago.
3.3 match 1 star 4.47 score 88 scripts 2 dependents
cran
RSDA:R to Symbolic Data Analysis
Symbolic Data Analysis (SDA) was proposed by Professor Edwin Diday in 1987. The main purpose of SDA is to replace the set of rows (cases) in the data table with a concept (second-order statistical unit). This package implements, for the symbolic case, certain techniques of automatic classification, as well as some linear models.
Maintained by Oldemar Rodriguez. Last updated 1 year ago.
4.5 match 1 star 3.26 score 3 dependents
mateusmaiads
randomMachines:An Ensemble Modeling using Random Machines
A novel ensemble method employing Support Vector Machines (SVMs) as base learners. This powerful ensemble model is designed for both classification (Ara A., et al., 2021) <doi:10.6339/21-JDS1014> and regression (Ara A., et al., 2021) <doi:10.1016/j.eswa.2022.117107> problems, offering versatility and robust performance across different datasets and in comparison with other established methods such as Random Forests (Maia M., et al., 2021) <doi:10.6339/21-JDS1025>.
Maintained by Mateus Maia. Last updated 12 months ago.
5.1 match 1 star 2.74 score 11 scripts
joshuawlambert
rFSA:Feasible Solution Algorithm for Finding Best Subsets and Interactions
Assists in statistical model building to find optimal and semi-optimal higher order interactions and best subsets. Uses the lm(), glm(), and other R functions to fit models generated from a feasible solution algorithm. Discussed in Subset Selection in Regression, A Miller (2002). Applied and explained for least median of squares in Hawkins (1993) <doi:10.1016/0167-9473(93)90246-P>. The feasible solution algorithm comes up with model forms of a specific type that can have fixed variables, higher order interactions and their lower order terms.
Maintained by Joshua Lambert. Last updated 4 years ago.
algorithm fsa interaction models parallel statistical statistics subset
3.3 match 7 stars 4.15 score 20 scripts
xluo11
xxIRT:Item Response Theory and Computer-Based Testing
A suite of psychometric analysis tools for research and operation, including: (1) computation of probability, information, and likelihood for the 3PL, GPCM, and GRM; (2) parameter estimation using joint or marginal likelihood estimation method; (3) simulation of computerized adaptive testing using built-in or customized algorithms; (4) assembly and simulation of multistage testing. The full documentation and tutorials are at <https://github.com/xluo11/xxIRT>.
Maintained by Xiao Luo. Last updated 6 years ago.
3.3 match 25 stars 4.10 score 10 scripts
kriper0217
valmetrics:Metrics and Plots for Model Evaluation
Functions for metrics and plots for model evaluation. Based on vectors of observed and predicted values. Method: Kristin Piikki, Johanna Wetterlind, Mats Soderstrom and Bo Stenberg (2021). <doi:10.1111/SUM.12694>.
Maintained by Kristin Piikki. Last updated 4 years ago.
6.6 match 2.00 score 2 scripts
xluo11
Rirt:Data Analysis and Parameter Estimation Using Item Response Theory
Parameter estimation, computation of probability, information, and (log-)likelihood, and visualization of item/test characteristic curves and item/test information functions for three uni-dimensional item response theory models: the 3-parameter-logistic model, generalized partial credit model, and graded response model. The full documentation and tutorials are at <https://github.com/xluo11/Rirt>.
Maintained by Xiao Luo. Last updated 5 years ago.
3.3 match 3 stars 3.95 score 6 scripts 2 dependents
jangraffelman
Correlplot:A Collection of Functions for Graphing Correlation Matrices
Routines for the graphical representation of correlation matrices by means of correlograms, MDS maps and biplots obtained by PCA, PFA or WALS (weighted alternating least squares); see Graffelman & De Leeuw (2023) <doi:10.1080/00031305.2023.2186952>.
Maintained by Jan Graffelman. Last updated 1 year ago.
4.9 match 2.59 score 13 scripts 1 dependent
cran
airGR:Suite of GR Hydrological Models for Precipitation-Runoff Modelling
Hydrological modelling tools developed at INRAE-Antony (HYCAR Research Unit, France). The package includes several conceptual rainfall-runoff models (GR4H, GR5H, GR4J, GR5J, GR6J, GR2M, GR1A) that can be applied in either a lumped or a semi-distributed way. A snow accumulation and melt model (CemaNeige) and the associated functions for the calibration and evaluation of models are also included. Use help(airGR) for package description and references.
Maintained by Olivier Delaigue. Last updated 1 year ago.
1.9 match 4 stars 6.60 score 164 scripts 4 dependents
rbgramacy
monomvn:Estimation for MVN and Student-t Data with Monotone Missingness
Estimation of multivariate normal (MVN) and student-t data of arbitrary dimension where the pattern of missing data is monotone. See Pantaleo and Gramacy (2010) <doi:10.48550/arXiv.0907.2135>. Through the use of parsimonious/shrinkage regressions (plsr, pcr, lasso, ridge, etc.), where standard regressions fail, the package can handle a nearly arbitrary amount of missing data. The current version supports maximum likelihood inference and a full Bayesian approach employing scale-mixtures for Gibbs sampling. Monotone data augmentation extends this Bayesian approach to arbitrary missingness patterns. A fully functional standalone interface to the Bayesian lasso (from Park & Casella), Normal-Gamma (from Griffin & Brown), Horseshoe (from Carvalho, Polson, & Scott), and ridge regression with model selection via Reversible Jump, and student-t errors (from Geweke) is also provided.
Maintained by Robert B. Gramacy. Last updated 6 months ago.
3.9 match 4 stars 3.14 score 127 scripts
cran
Fgmutils:Forest Growth Model Utilities
Growth models and forest production require the manipulation of existing data and the creation of new data, structured from basic forest inventory data. The purpose of this package is to provide functions to support these activities.
Maintained by Clayton Vieira Fraga Filho. Last updated 6 years ago.
7.6 match 1.48 score 1 dependent
sciviews
modelit:Statistical Models for 'SciViews::R'
Create and use statistical models (linear, general, nonlinear...) with extensions to support rich-formatted tables, equations and plots for the 'SciViews::R' dialect.
Maintained by Philippe Grosjean. Last updated 4 months ago.
3.3 match 1 star 3.30 score 8 scripts
kapelner
bartMachine:Bayesian Additive Regression Trees
An advanced implementation of Bayesian Additive Regression Trees with expanded features for data analysis and visualization.
Maintained by Adam Kapelner. Last updated 2 years ago.
1.8 match 6.01 score 309 scripts 6 dependents
cran
s2dv:A Set of Common Tools for Seasonal to Decadal Verification
The advanced version of package 's2dverification'. It is intended for 'seasonal to decadal' (s2d) climate forecast verification, but it can also be used in other kinds of forecasts or general climate analysis. This package is specially designed for the comparison between experimental and observational datasets. The functionality of the included functions ranges from data retrieval, data post-processing, and skill scores against observation to visualization. Compared to 's2dverification', 's2dv' is more compatible with the package 'startR', and is able to use multiple cores for computation and to handle multi-dimensional arrays with greater flexibility. The CDO version used in development is 1.9.8.
Maintained by Ariadna Batalla. Last updated 5 months ago.
5.4 match 1.95 score 3 dependents
cran
qpcR:Modelling and Analysis of Real-Time PCR Data
Model fitting, optimal model selection and calculation of various features that are essential in the analysis of quantitative real-time polymerase chain reaction (qPCR).
Maintained by Andrej-Nikolai Spiess. Last updated 7 years ago.
3.3 match 2 stars 3.06 score 1 dependent
certe-medical-epidemiology
certestats:A Certe R Package for Statistical Modelling
A Certe R Package for early-warning, applying statistical modelling (such as creating machine learning models), QC rules and distribution analysis. This package is part of the 'certedata' universe.
Maintained by Matthijs S. Berends. Last updated 4 months ago.
3.3 match 3.02 score 1 script 1 dependent
bioc
methyLImp2:Missing value estimation of DNA methylation data
This package estimates missing values in DNA methylation data. The methyLImp method is based on linear regression, since methylation levels show a high degree of inter-sample correlation. The implementation is parallelised over chromosomes, since probes on different chromosomes are usually independent. A mini-batch approach is available to reduce the runtime when the number of samples is large.
Maintained by Anna Plaksienko. Last updated 1 month ago.
dnamethylation microarray software methylationarray regression imputation methylation missing-value-imputation
1.6 match 6 stars 5.62 score 3 scripts
wlenhard
cNORM:Continuous Norming
A comprehensive toolkit for generating continuous test norms in psychometrics and biometrics, and analyzing model fit. The package offers both distribution-free modeling using Taylor polynomials and parametric modeling using the beta-binomial distribution. Originally developed for achievement tests, it is applicable to a wide range of mental, physical, or other test scores dependent on continuous or discrete explanatory variables. The package provides several advantages: It minimizes deviations from representativeness in subsamples, interpolates between discrete levels of explanatory variables, and significantly reduces the required sample size compared to conventional norming per age group. cNORM enables graphical and analytical evaluation of model fit, accommodates a wide range of scales including those with negative and descending values, and even supports conventional norming. It generates norm tables including confidence intervals. It also includes methods for addressing representativeness issues through Iterative Proportional Fitting.
Maintained by Wolfgang Lenhard. Last updated 4 months ago.
beta-binomial biometrics continuous-norming growth-curve norm-scores norm-tables normalization-techniques percentile psychometrics regression-based-norming taylor-series
1.6 match 2 stars 5.49 score 75 scripts
mingsnu
stfit:Spatio-Temporal Functional Imputation Tool
A general spatiotemporal satellite image imputation method based on sparse functional data analytic techniques. The imputation method applies and extends the Functional Principal Analysis by Conditional Estimation (PACE). The underlying idea for the proposed procedure is to impute a missing pixel by borrowing information from temporally and spatially contiguous pixels based on the best linear unbiased prediction.
Maintained by Weicheng Zhu. Last updated 2 years ago.
3.3 match 2.61 score 41 scripts
david-hervas
repmod:Create Report Table from Different Objects
Tools for generating descriptives and report tables for different models, data.frames and tables and exporting them to different formats.
Maintained by David Hervas Marin. Last updated 2 months ago.
3.3 match 2.60 score 6 scripts
hjboonstra
hbsae:Hierarchical Bayesian Small Area Estimation
Functions to compute small area estimates based on a basic area or unit-level model. The model is fit using restricted maximum likelihood, or in a hierarchical Bayesian way. In the latter case numerical integration is used to average over the posterior density for the between-area variance. The output includes the model fit, small area estimates and corresponding mean squared errors, as well as some model selection measures. Additional functions provide means to compute aggregate estimates and mean squared errors, to minimally adjust the small area estimates to benchmarks at a higher aggregation level, and to graphically compare different sets of small area estimates.
Maintained by Harm Jan Boonstra. Last updated 3 years ago.
3.3 match 2 stars 2.53 score 28 scripts 2 dependents
american-institutes-for-research
wCorr:Weighted Correlations
Calculates Pearson, Spearman, polychoric, and polyserial correlation coefficients, in weighted or unweighted form. The package implements tetrachoric correlation as a special case of the polychoric and biserial correlation as a specific case of the polyserial.
Maintained by Paul Bailey. Last updated 2 years ago.
1.3 match 6.54 score 118 scripts 8 dependents
zejiang-unsw
WASP:Wavelet System Prediction
The wavelet-based variance transformation method is used for system modelling and prediction. It refines predictor spectral representation using Wavelet Theory, which leads to improved model specifications and prediction accuracy. Details of methodologies used in the package can be found in Jiang, Z., Sharma, A., & Johnson, F. (2020) <doi:10.1029/2019WR026962>, Jiang, Z., Rashid, M. M., Johnson, F., & Sharma, A. (2020) <doi:10.1016/j.envsoft.2020.104907>, and Jiang, Z., Sharma, A., & Johnson, F. (2021) <doi:10.1016/J.JHYDROL.2021.126816>.
Maintained by Ze Jiang. Last updated 7 months ago.
prediction transformation wavelet
1.3 match 9 stars 6.41 score 19 scripts
sujit-sahu
bmstdr:Bayesian Modeling of Spatio-Temporal Data with R
Fits, validates and compares a number of Bayesian models for spatial and space time point referenced and areal unit data. Model fitting is done using several packages: 'rstan', 'INLA', 'spBayes', 'spTimer', 'spTDyn', 'CARBayes' and 'CARBayesST'. Model comparison is performed using the DIC and WAIC, and K-fold cross-validation where the user is free to select their own subset of data rows for validation. Sahu (2022) <doi:10.1201/9780429318443> describes the methods in detail.
Maintained by Sujit K. Sahu. Last updated 1 year ago.
bayesian modelling spatio-temporal-data cpp
1.6 match 15 stars 4.95 score 12 scripts
frareb
devRate:Quantify the Relationship Between Development Rate and Temperature in Ectotherms
A set of functions to quantify the relationship between development rate and temperature and to build phenological models. The package comprises a set of models and estimated parameters borrowed from a literature review in ectotherms. The methods and literature review are described in Rebaudo et al. (2018) <doi:10.1111/2041-210X.12935>, Rebaudo and Rabhi (2018) <doi:10.1111/eea.12693>, and Regnier et al. (2021) <doi:10.1093/ee/nvab115>. An example can be found in Rebaudo et al. (2017) <doi:10.1007/s13355-017-0480-5>.
Maintained by Francois Rebaudo. Last updated 2 years ago.
1.5 match 3 stars 5.31 score 15 scripts
mbinois
hetGP:Heteroskedastic Gaussian Process Modeling and Design under Replication
Performs Gaussian process regression with heteroskedastic noise following the model by Binois, M., Gramacy, R., Ludkovski, M. (2016) <doi:10.48550/arXiv.1611.05902>, with implementation details in Binois, M. & Gramacy, R. B. (2021) <doi:10.18637/jss.v098.i13>. The input dependent noise is modeled as another Gaussian process. Replicated observations are encouraged as they yield computational savings. Sequential design procedures based on the integrated mean square prediction error and lookahead heuristics are provided, and notably fast update functions when adding new observations.
Maintained by Mickael Binois. Last updated 6 months ago.
1.6 match 5 stars 4.89 score 260 scripts 2 dependents
laurabruckman
netSEM:Network Structural Equation Modeling
Network structural equation modeling conducts a network statistical analysis on a data frame of coincident observations of multiple continuous variables [1]. It builds a pathway model by exploring a pool of candidate statistical relationships, guided by domain knowledge, between each of the variable pairs, selecting the 'best fit' on the basis of a specific criterion such as the adjusted R-squared value. This material is based upon work supported by the U.S. National Science Foundation Award EEC-2052776 and EEC-2052662 for the MDS-Rely IUCRC Center, under the NSF Solicitation: NSF 20-570 Industry-University Cooperative Research Centers Program. [1] Bruckman, Laura S., Nicholas R. Wheeler, Junheng Ma, Ethan Wang, Carl K. Wang, Ivan Chou, Jiayang Sun, and Roger H. French. (2013) <doi:10.1109/ACCESS.2013.2267611>.
Maintained by Laura S. Bruckman. Last updated 2 years ago.
2.0 match 3.72 score 13 scripts
haddonm
MQMF:Modelling and Quantitative Methods in Fisheries
Complements the book "Using R for Modelling and Quantitative Methods in Fisheries" ISBN 9780367469894, published in 2021 by Chapman & Hall in their "Using R series". There are numerous functions and data-sets that are used in the book's many practical examples.
Maintained by Malcolm Haddon. Last updated 2 years ago.
ecology fisheries haddon quantitative-methods uncertainty
1.8 match 11 stars 4.14 score 25 scripts
zaynesember
speccurvieR:Easy, Fast, and Pretty Specification Curve Analysis
Making specification curve analysis easy, fast, and pretty. It improves upon existing offerings with additional features and 'tidyverse' integration. Users can easily visualize and evaluate how their models behave under different specifications with a high degree of customization. For a description and applications of specification curve analysis see Simonsohn, Simmons, and Nelson (2020) <doi:10.1038/s41562-020-0912-z>.
Maintained by Zayne Sember. Last updated 6 months ago.
regression-diagnostics specification-curve-analysis specification-curve-plot
1.8 match 4 stars 4.00 score 2 scripts
choi-phd
maat:Multiple Administrations Adaptive Testing
Provides an extension of the shadow-test approach to computerized adaptive testing (CAT) implemented in the 'TestDesign' package for the assessment framework involving multiple tests administered periodically throughout the year. This framework is referred to as the Multiple Administrations Adaptive Testing (MAAT) and supports multiple item pools vertically scaled and multiple phases (stages) of CAT within each test. Between phases and tests, transitioning from one item pool (and associated constraints) to another is allowed as deemed necessary to enhance the quality of measurement.
Maintained by Seung W. Choi. Last updated 9 months ago.
1.8 match 4.00 score 5 scripts
florafauna
gapfill:Fill Missing Values in Satellite Data
Tools to fill missing values in satellite data and to develop new gap-fill algorithms. The methods are tailored to data (images) observed at equally-spaced points in time. The package is illustrated with MODIS NDVI data.
Maintained by Florian Gerber. Last updated 4 years ago.
2.3 match 2 stars 3.18 score 15 scripts
cran
halk:Methods to Create Hierarchical Age Length Keys for Age Assignment
Provides methods for implementing hierarchical age length keys to estimate fish ages from lengths using data borrowing. Users can create hierarchical age length keys and use them to assign ages given length.
Maintained by Paul Frater. Last updated 1 year ago.
3.4 match 2.00 score 4 scripts
promidat
forecasteR:Time Series Forecast System
A web application for displaying, analysing and forecasting univariate time series. Includes basic methods such as mean, naïve, seasonal naïve and drift, as well as more complex methods such as Holt-Winters (Box, G. and Jenkins, G. (1976) <doi:10.1111/jtsa.12194>) and ARIMA (Brockwell, P.J. and Davis, R.A. (1991) <doi:10.1007/978-1-4419-0320-4>).
Maintained by Oldemar Rodriguez. Last updated 2 years ago.
3.3 match 2.00 score 2 scripts
cran
SoftBart:Implements the SoftBart Algorithm
Implements the SoftBart model described by Linero and Yang (2018) <doi:10.1111/rssb.12293>, with the optional use of a sparsity-inducing prior to allow for variable selection. For usability, the package maintains the same style as the 'BayesTree' package.
Maintained by Antonio R. Linero. Last updated 2 years ago.
3.3 match 2.00 score
cran
PerMat:Performance Metrics in Predictive Modeling
Provides performance measures for a fitted model, such as mean squared error, root mean square error, mean absolute deviation, and mean absolute percentage error. These give forecasters a way to quantitatively compare the performance of competing models. For method details see Pankaj Das (2020) <http://krishi.icar.gov.in/jspui/handle/123456789/44138>.
Maintained by Pankaj Das. Last updated 9 months ago.
3.3 match 2.00 score
mohmedsoudy
MERO:Performing Monte Carlo Expectation Maximization Random Forest Imputation for Biological Data
Performs missing value imputation for biological data using the random forest algorithm. The imputation aims to keep the original mean and standard deviation consistent after imputation.
Maintained by Mohamed Soudy. Last updated 2 years ago.
5.0 match 1.23 score 17 scripts
cran
DTWBI:Imputation of Time Series Based on Dynamic Time Warping
Functions to impute large gaps within time series based on Dynamic Time Warping methods. It contains all required functions to create large missing consecutive values within time series and to fill them, according to the paper Phan et al. (2017), <DOI:10.1016/j.patrec.2017.08.019>. Performance criteria are added to compare similarity between two signals (query and reference).
Maintained by Emilie Poisson-Caillault. Last updated 7 years ago.
4.0 match 1.48 score 1 dependents
chelbert
DiceEval:Construction and Evaluation of Metamodels
Estimation, validation and prediction of models of different types: linear models, additive models, MARS, PolyMARS and Kriging.
Maintained by C. Helbert. Last updated 1 years ago.
3.3 match 1.81 score 64 scripts
bdj34
cloneRate:Estimate Growth Rates from Phylogenetic Trees
Quickly estimate the net growth rate of a population or clone whose growth can be approximated by a birth-death branching process. Input should be phylogenetic tree(s) of clone(s) with edge lengths corresponding to either time or mutations. Based on coalescent results in Johnson et al. (2023) <doi:10.1093/bioinformatics/btad561>. Simulation techniques as well as growth rate methods build on prior work from Lambert A. (2018) <doi:10.1016/j.tpb.2018.04.005> and Stadler T. (2009) <doi:10.1016/j.jtbi.2009.07.018>.
Maintained by Brian Johnson. Last updated 11 months ago.
1.2 match 4 stars 4.90 score 8 scripts
skranz
synthdid:Synthetic Difference-in-Difference Estimation
Estimate average treatment effects in panel data. Currently provides methods only for the case that all treated units adopt treatment at the same time.
Maintained by David A. Hirshberg. Last updated 4 years ago.
1.6 match 3.40 score 84 scripts 1 dependents
cran
regressoR:Regression Data Analysis System
Perform a supervised data analysis on a database through a 'shiny' graphical interface. It includes methods such as linear regression, penalized regression, k-nearest neighbors, decision trees, ada boosting, extreme gradient boosting, random forest, neural networks, deep learning and support vector machines.
Maintained by Oldemar Rodriguez. Last updated 4 months ago.
4.0 match 2 stars 1.30 score
samhaycock
stressor:Algorithms for Testing Models under Stress
Traditional model evaluation metrics fail to capture model performance under less than ideal conditions. This package employs techniques to evaluate models "under stress". This includes testing models' extrapolation ability, or testing accuracy on specific sub-samples of the overall model space. Details describing stress-testing methods in this package are provided in Haycock (2023) <doi:10.26076/2am5-9f67>. The other primary contribution of this package is to give R users access to the 'Python' library 'PyCaret' <https://pycaret.org/> for quick and easy use of auto-tuned machine learning models.
Maintained by Sam Haycock. Last updated 11 months ago.
1.8 match 2.70 score 6 scripts
yeyuan98
clockSim:Simulation of the Circadian Clock Gene Network
A preconfigured simulation workflow for the circadian clock gene network.
Maintained by Ye Yuan. Last updated 6 days ago.
1.7 match 2.78 score 3 scripts
kwb-r
kwb.hantush:Calculation of Groundwater Mounding Beneath an Infiltration Basin
Calculation of groundwater mounding beneath an infiltration basin based on the Hantush (1967) equation (<doi:10.1029/WR003i001p00227>). The correct implementation is shown with a verification example based on a USGS report (page 25, <https://pubs.usgs.gov/sir/2010/5102/support/sir2010-5102.pdf#page=35>).
Maintained by Michael Rustler. Last updated 5 years ago.
groundwater-modelling, groundwater-mounding, infiltration-basin, modelling, project-demeau, shiny-app
1.6 match 3.00 score 10 scripts
tidymodels
tidyposterior:Bayesian Analysis to Compare Models using Resampling Statistics
Bayesian analysis is used here to answer the question: "when looking at resampling results, are the differences between models 'real'?" To answer this, a model can be created where the performance statistic is the resampling statistic (e.g. accuracy or RMSE). These values are explained by the model types. In doing this, we can get parameter estimates for each model's effect on performance and make statistical (and practical) comparisons between models. The methods included here are similar to Benavoli et al. (2017) <https://jmlr.org/papers/v18/16-305.html>.
Maintained by Max Kuhn. Last updated 5 months ago.
0.5 match 102 stars 8.44 score 273 scripts
ludovikcoba
rrecsys:Environment for Evaluating Recommender Systems
Processes standard recommendation datasets (e.g., a user-item rating matrix) as input and generates rating predictions and lists of recommended items. The standard algorithm implementations included in this package are the following: Global/Item/User-Average baselines, Weighted Slope One, Item-Based KNN, User-Based KNN, FunkSVD, BPR and weighted ALS. They can be assessed according to the standard offline evaluation methodology (Shani, et al. (2011) <doi:10.1007/978-0-387-85820-3_8>) for recommender systems using measures such as MAE, RMSE, Precision, Recall, F1, AUC, NDCG, RankScore and coverage measures. The package (Coba, et al. (2017) <doi:10.1007/978-3-319-60042-0_36>) is intended for rapid prototyping of recommendation algorithms and for educational purposes.
Maintained by Ludovik Çoba. Last updated 3 years ago.
0.5 match 23 stars 6.84 score 25 scripts
gloewing
sMTL:Sparse Multi-Task Learning
Implements L0-constrained Multi-Task Learning and domain generalization algorithms. The algorithms are coded in Julia allowing for fast implementations of the coordinate descent and local combinatorial search algorithms. For more details, see a preprint of the paper: Loewinger et al., (2022) <arXiv:2212.08697>.
Maintained by Gabriel Loewinger. Last updated 2 years ago.
3.3 match 1.00 score 8 scripts
encoreus
MCCM:Mixed Correlation Coefficient Matrix
The IRLS (Iteratively Reweighted Least Squares) and GMM (Generalized Method of Moments) methods are applied to estimate the mixed correlation coefficient matrix (Pearson, Polyserial, Polychoric), which can be estimated in pairs or simultaneously. For more information see Peng Zhang and Ben Liu (2024) <doi:10.1080/10618600.2023.2257251>; Ben Liu and Peng Zhang (2024) <doi:10.48550/arXiv.2404.06781>.
Maintained by Ben Liu. Last updated 11 months ago.
3.3 match 1.00 score
guangbaog
DTSR:Distributed Trimmed Scores Regression for Handling Missing Data
Provides functions for handling missing data using Distributed Trimmed Scores Regression and other imputation methods. It includes facilities for data imputation, evaluation metrics, and clustering analysis. It is designed to work in distributed computing environments to handle large datasets efficiently. The philosophy of the package is described in Guo G. (2024) <doi:10.1080/03610918.2022.2091779>.
Maintained by Guangbao Guo. Last updated 4 months ago.
3.1 match 1.00 score
cran
serieslcb:Lower Confidence Bounds for Binomial Series System
Calculate and compare lower confidence bounds for binomial series system reliability. The R 'shiny' application, launched by the function launch_app(), weaves together a workflow of customized simulations and delta coverage calculations to output recommended lower confidence bound methods.
Maintained by Edward Schuberg. Last updated 6 years ago.
2.3 match 1.20 score 16 scripts
oscarperpinan
tdr:Target Diagram
Implementation of target diagrams using 'lattice' and 'ggplot2' graphics. Target diagrams provide a graphical overview of the respective contributions of the unbiased RMSE and MBE to the total RMSE (Jolliff, J. et al., 2009. "Summary Diagrams for Coupled Hydrodynamic-Ecosystem Model Skill Assessment." Journal of Marine Systems 76: 64–82.)
Maintained by Oscar Perpinan Lamigueiro. Last updated 9 years ago.
0.8 match 2 stars 2.96 score 46 scripts
cran
deFit:Fitting Differential Equations to Time Series Data
Use numerical optimization to fit ordinary differential equations (ODEs) to time series data to examine the dynamic relationships between variables or the characteristics of a dynamical system. It can now be used to estimate the parameters of ODEs up to second order, and can also apply to multilevel systems. See <https://github.com/yueqinhu/defit> for details.
Maintained by Yueqin Hu. Last updated 5 months ago.
1.8 match 1.00 score 2 scripts
joshlmiller1978
TSEind:Total Survey Error (Independent Samples)
Calculates total survey error (TSE) for one or more surveys, using both scale-dependent and scale-independent metrics. The package works directly from the data set, with no hand calculations required: just upload a properly structured data set (see TESTIND and its documentation), properly input column names (see the functions' documentation), and run your functions. For more on TSE, see: Weisberg, Herbert (2005, ISBN:0-226-89128-3); Biemer, Paul (2010) <doi:10.1093/poq/nfq058>; Biemer, Paul et al. (2017, ISBN:9781119041672); etc.
Maintained by Joshua Miller. Last updated 6 years ago.
1.7 match 1.00 score
sandipgarai
AllMetrics:Calculating Multiple Performance Metrics of a Prediction Model
Provides a function to calculate multiple performance metrics for actual and predicted values. In total, eight metrics are calculated for a given pair of actual and predicted series. This helps to describe a statistical model's performance in predicting data, and to compare the performance of various models. The metrics are Root Mean Squared Error (RMSE), Relative Root Mean Squared Error (RRMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Mean Absolute Scaled Error (MASE), Nash-Sutcliffe Efficiency (NSE), Willmott’s Index (WI), and Legates and McCabe Index (LME). For the first five, lower values are better, whereas for the last three, higher values are better. More details can be found in Garai and Paul (2023) <doi:10.1016/j.iswa.2023.200202> and Garai et al. (2024) <doi:10.1007/s11063-024-11552-w>.
Maintained by Dr. Sandip Garai. Last updated 1 years ago.
0.5 match 1.78 score 2 dependents
agulb
ehaGoF:Calculates Goodness of Fit Statistics
Calculates 15 different goodness of fit criteria. These are: standard deviation ratio (SDR), coefficient of variation (CV), relative root mean square error (RRMSE), Pearson's correlation coefficients (PC), root mean square error (RMSE), performance index (PI), mean error (ME), global relative approximation error (RAE), mean relative approximation error (MRAE), mean absolute percentage error (MAPE), mean absolute deviation (MAD), coefficient of determination (R-squared), adjusted coefficient of determination (adjusted R-squared), Akaike's information criterion (AIC), corrected Akaike's information criterion (CAIC), Mean Square Error (MSE), Bayesian Information Criterion (BIC) and Normalized Mean Square Error (NMSE).
Maintained by Alper Gulbe. Last updated 5 years ago.
0.5 match 1.70 score 50 scripts
ranjitstat
WaveletSVR:Wavelet-SVR Hybrid Model for Time Series Forecasting
This package fits the hybrid Wavelet SVR model for time series forecasting. Its main aim is to combine the advantages of wavelet and Support Vector Regression (SVR) models for time series forecasting. It also gives accuracy measurements in terms of Root Mean Square Error (RMSE) and Mean Absolute Prediction Error (MAPE). This package is based on the algorithm of Raimundo and Okamoto (2018) <DOI:10.1109/INFOCT.2018.8356851>.
Maintained by Ranjit Kumar Paul. Last updated 3 years ago.
0.8 match 1.00 score
ranjitstat
WaveletRF:Wavelet-RF Hybrid Model for Time Series Forecasting
Wavelet decomposition followed by Random Forest (RF) regression models is applied for time series forecasting. The maximum overlap discrete wavelet transform (MODWT) algorithm was chosen as it works for any length of the series. The series is first divided into training and testing sets. For each of the wavelet decomposed series, a supervised machine learning approach, namely random forest, is employed to train the model. This package also provides accuracy metrics in the form of Root Mean Square Error (RMSE) and Mean Absolute Prediction Error (MAPE). This package is based on the algorithm of Ding et al. (2021) <DOI:10.1007/s11356-020-12298-3>.
Maintained by Ranjit Kumar Paul. Last updated 3 years ago.
0.5 match 1.00 score