Showing 200 of 1870 total results
cran
NISTunits:Fundamental Physical Constants and Unit Conversions from NIST
Fundamental physical constants (Quantity, Value, Uncertainty, Unit) for SI (International System of Units) and non-SI units, plus unit conversions Based on the data from NIST (National Institute of Standards and Technology, USA)
Maintained by Jose Gama. Last updated 9 years ago.
376.3 match 2.85 score 10 dependents
robinhankin
magic:Create and Investigate Magic Squares
A collection of functions for the manipulation and analysis of arbitrarily dimensioned arrays. The original motivation for the package was the development of efficient, vectorized algorithms for the creation and investigation of magic squares and high-dimensional magic hypercubes.
Maintained by Robin K. S. Hankin. Last updated 2 months ago.
59.7 match 3 stars 11.12 score 436 scripts 230 dependents
rstudio
keras3:R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.
Maintained by Tomasz Kalinowski. Last updated 2 days ago.
24.6 match 845 stars 13.60 score 264 scripts 2 dependents
mlverse
torch:Tensors and Neural Networks with 'GPU' Acceleration
Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.
Maintained by Daniel Falbel. Last updated 8 days ago.
18.8 match 520 stars 16.52 score 1.4k scripts 38 dependents
kwstat
agridat:Agricultural Datasets
Datasets from books, papers, and websites related to agriculture. Example graphics and analyses are included. Data come from small-plot trials, multi-environment trials, uniformity trials, yield monitors, and more.
Maintained by Kevin Wright. Last updated 30 days ago.
28.8 match 126 stars 10.78 score 1.7k scripts 1 dependent
mlr-org
mlr3:Machine Learning in R - Next Generation
Efficient, object-oriented programming on the building blocks of machine learning. Provides 'R6' objects for tasks, learners, resamplings, and measures. The package is geared towards scalability and larger datasets by supporting parallelization and out-of-memory data-backends like databases. While 'mlr3' focuses on the core computational operations, add-on packages provide additional functionality.
Maintained by Marc Becker. Last updated 6 days ago.
classification data-science machine-learning mlr3 regression
17.7 match 972 stars 14.86 score 2.3k scripts 35 dependents
drostlab
philentropy:Similarity and Distance Quantification Between Probability Functions
Computes 46 optimized distance and similarity measures for comparing probability functions (Drost (2018) <doi:10.21105/joss.00765>). These comparisons between probability functions have their foundations in a broad range of scientific disciplines from mathematics to ecology. The aim of this package is to provide a core framework for clustering, classification, statistical inference, goodness-of-fit, non-parametric statistics, information theory, and machine learning tasks that are based on comparing univariate or multivariate probability functions.
Maintained by Hajk-Georg Drost. Last updated 3 months ago.
distance-measures distance-quantification information-theory jensen-shannon-divergence parametric-distributions similarity-measures statistics cpp
20.4 match 137 stars 12.44 score 484 scripts 24 dependents
mfrasco
Metrics:Evaluation Metrics for Machine Learning
An implementation of evaluation metrics in R that are commonly used in supervised machine learning. It implements metrics for regression, time series, binary classification, classification, and information retrieval problems. It has zero dependencies and a consistent, simple interface for all functions.
Maintained by Michael Frasco. Last updated 6 years ago.
17.7 match 99 stars 13.02 score 6.1k scripts 51 dependents
nanxstats
enpls:Ensemble Partial Least Squares Regression
An algorithmic framework for measuring feature importance, outlier detection, model applicability domain evaluation, and ensemble predictive modeling with (sparse) partial least squares regressions.
Maintained by Nan Xiao. Last updated 3 years ago.
chemometrics dimensionality-reduction ensemble-learning machine-learning outlier-detection partial-least-squares-regression
37.3 match 18 stars 5.56 score 40 scripts
cran
nlme:Linear and Nonlinear Mixed Effects Models
Fit and compare Gaussian linear and nonlinear mixed-effects models.
Maintained by R Core Team. Last updated 2 months ago.
15.7 match 6 stars 13.00 score 13k scripts 8.7k dependents
fbertran
plsRglm:Partial Least Squares Regression for Generalized Linear Models
Provides (weighted) partial least squares regression for generalized linear models and repeated k-fold cross-validation of such models using various criteria <arXiv:1810.01005>. It allows for missing data in the explanatory variables. Bootstrap confidence interval constructions are also available.
Maintained by Frederic Bertrand. Last updated 2 years ago.
24.6 match 16 stars 7.75 score 103 scripts 5 dependents
doomlab
MOTE:Effect Size and Confidence Interval Calculator
Measure of the Effect ('MOTE') is an effect size calculator, including a wide variety of effect sizes in the mean differences family (all versions of d) and the variance overlap family (eta, omega, epsilon, r). 'MOTE' provides non-central confidence intervals for each effect size, relevant test statistics, and output for reporting in APA Style (American Psychological Association, 2010, <ISBN:1433805618>) with 'LaTeX'. In research, an over-reliance on p-values may conceal the fact that a study is under-powered (Halsey, Curran-Everett, Vowler, & Drummond, 2015 <doi:10.1038/nmeth.3288>). A test may be statistically significant, yet practically inconsequential (Fritz, Scherndl, & Kühberger, 2012 <doi:10.1177/0959354312436870>). Although the American Psychological Association has long advocated for the inclusion of effect sizes (Wilkinson & American Psychological Association Task Force on Statistical Inference, 1999 <doi:10.1037/0003-066X.54.8.594>), the vast majority of peer-reviewed, published academic studies stop short of reporting effect sizes and confidence intervals (Cumming, 2013, <doi:10.1177/0956797613504966>). 'MOTE' simplifies the use and interpretation of effect sizes and confidence intervals. For more information, visit <https://www.aggieerin.com/shiny-server>.
Maintained by Erin M. Buchanan. Last updated 3 years ago.
confidence effect interval size statistics
26.3 match 17 stars 6.69 score 320 scripts 1 dependent
rsquaredacademy
olsrr:Tools for Building OLS Regression Models
Tools designed to make it easier for users, particularly beginner and intermediate R users, to build ordinary least squares regression models. Includes comprehensive regression output, heteroskedasticity tests, collinearity diagnostics, residual diagnostics, measures of influence, model fit assessment and variable selection procedures.
Maintained by Aravind Hebbali. Last updated 4 months ago.
collinearity-diagnostics linear-models regression stepwise-regression
13.2 match 103 stars 12.19 score 1.4k scripts 4 dependents
gjmvanboxtel
gsignal:Signal Processing
R implementation of the 'Octave' package 'signal', containing a variety of signal processing tools, such as signal generation and measurement, correlation and convolution, filtering, filter design, filter analysis and conversion, power spectrum analysis, system identification, decimation and sample rate change, and windowing.
Maintained by Geert van Boxtel. Last updated 2 months ago.
14.8 match 24 stars 10.03 score 133 scripts 34 dependents
tomkellygenetics
matrixcalc:Collection of Functions for Matrix Calculations
A collection of functions to support matrix calculations for probability, econometric and numerical analysis. There are additional functions that are comparable to APL functions which are useful for actuarial models such as pension mathematics. This package is used for teaching and research purposes at the Department of Finance and Risk Engineering, New York University, Polytechnic Institute, Brooklyn, NY 11201. Horn, R.A. (1990) Matrix Analysis. ISBN 978-0521386326. Lancaster, P. (1969) Theory of Matrices. ISBN 978-0124355507. Lay, D.C. (1995) Linear Algebra: And Its Applications. ISBN 978-0201845563.
Maintained by S. Thomas Kelly. Last updated 4 years ago.
17.6 match 8.32 score 1.7k scripts 149 dependents
r-forge
Matrix:Sparse and Dense Matrix Classes and Methods
A rich hierarchy of sparse and dense matrix classes, including general, symmetric, triangular, and diagonal matrices with numeric, logical, or pattern entries. Efficient methods for operating on such matrices, often wrapping the 'BLAS', 'LAPACK', and 'SuiteSparse' libraries.
Maintained by Martin Maechler. Last updated 9 days ago.
8.5 match 1 star 17.23 score 33k scripts 12k dependents
uchidamizuki
jpgrid:Functions for the Grid Square Codes in Japan
Provides functions for grid square codes in Japan (<https://www.stat.go.jp/english/data/mesh/index.html>). Generates the grid square codes from longitude/latitude, geometries, and the grid square codes of different scales, and vice versa.
Maintained by Mizuki Uchida. Last updated 6 months ago.
30.9 match 8 stars 4.41 score 16 scripts
rvlenth
emmeans:Estimated Marginal Means, aka Least-Squares Means
Obtain estimated marginal means (EMMs) for many linear, generalized linear, and mixed models. Compute contrasts or linear functions of EMMs, trends, and comparisons of slopes. Plots and other displays. Least-squares means are discussed, and the term "estimated marginal means" is suggested, in Searle, Speed, and Milliken (1980) Population marginal means in the linear model: An alternative to least squares means, The American Statistician 34(4), 216-221 <doi:10.1080/00031305.1980.10483031>.
Maintained by Russell V. Lenth. Last updated 5 days ago.
7.1 match 377 stars 19.19 score 13k scripts 187 dependents
hrbrmstr
waffle:Create Waffle Chart Visualizations
Square pie charts (a.k.a. waffle charts) can be used to communicate parts of a whole for categorical quantities. To emulate the percentage view of a pie chart, a 10x10 grid should be used with each square representing 1% of the total. Modern uses of waffle charts do not necessarily adhere to this rule and can be created with a grid of any rectangular shape. Best practices suggest keeping the number of categories small, just as should be done when creating pie charts. Tools are provided to create waffle charts as well as stitch them together, and to use glyphs for making isotype pictograms.
Maintained by Bob Rudis. Last updated 1 year ago.
data-visualisation data-visualization datavisualization ggplot2 square-pie-charts waffle-charts
12.7 match 778 stars 10.66 score 1.3k scripts 5 dependents
cvxgrp
CVXR:Disciplined Convex Optimization
An object-oriented modeling language for disciplined convex programming (DCP) as described in Fu, Narasimhan, and Boyd (2020, <doi:10.18637/jss.v094.i14>). It allows the user to formulate convex optimization problems in a natural way following mathematical convention and DCP rules. The system analyzes the problem, verifies its convexity, converts it into a canonical form, and hands it off to an appropriate solver to obtain the solution. Interfaces to solvers on CRAN and elsewhere are provided, both commercial and open source.
Maintained by Anqi Fu. Last updated 4 months ago.
10.5 match 207 stars 12.89 score 768 scripts 51 dependents
bioc
mixOmics:Omics Data Integration Project
Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing properties of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated, and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics etc) but also beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non-exhaustive list of methods includes variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.
Maintained by Eva Hamrud. Last updated 6 days ago.
immunooncology microarray sequencing metabolomics metagenomics proteomics geneprediction multiplecomparison classification regression bioconductor genomics genomics-data genomics-visualization multivariate-analysis multivariate-statistics omics r-pkg r-project
9.4 match 182 stars 13.71 score 1.3k scripts 22 dependents
projectmosaic
mosaic:Project MOSAIC Statistics and Mathematics Teaching Utilities
Data sets and utilities from Project MOSAIC (<http://www.mosaic-web.org>) used to teach mathematics, statistics, computation and modeling. Funded by the NSF, Project MOSAIC is a community of educators working to tie together aspects of quantitative work that students in science, technology, engineering and mathematics will need in their professional lives, but which are usually taught in isolation, if at all.
Maintained by Randall Pruim. Last updated 1 year ago.
9.7 match 93 stars 13.32 score 7.2k scripts 7 dependents
bioc
limma:Linear Models for Microarray and Omics Data
Data analysis, linear models and differential expression for omics data.
Maintained by Gordon Smyth. Last updated 7 days ago.
exonarray geneexpression transcription alternativesplicing differentialexpression differentialsplicing genesetenrichment dataimport bayesian clustering regression timecourse microarray micrornaarray mrnamicroarray onechannel proprietaryplatforms twochannel sequencing rnaseq batcheffect multiplecomparison normalization preprocessing qualitycontrol biomedicalinformatics cellbiology cheminformatics epigenetics functionalgenomics genetics immunooncology metabolomics proteomics systemsbiology transcriptomics
9.0 match 13.81 score 16k scripts 585 dependents
christiangoueguel
HotellingEllipse:Hotelling’s T-Squared Statistic and Ellipse
Functions to calculate the Hotelling’s T-squared statistic and corresponding confidence ellipses. Provides the semi-axes of the Hotelling’s T-squared ellipses at 95% and 99% confidence levels. Enables users to obtain the coordinates in two or three dimensions at user-defined confidence levels, allowing for the construction of 2D or 3D ellipses with customized confidence levels. Bro and Smilde (2014) <DOI:10.1039/c3ay41907j>. Brereton (2016) <DOI:10.1002/cem.2763>.
Maintained by Christian L. Goueguel. Last updated 2 months ago.
confidence-ellipse hotelling-ellipse hotelling-s-t-square hotelling-t2 hotellings-t2-distribution multivariate-distribution outliers partial-least-squares-regression pca pls principal-component-analysis
23.2 match 7 stars 5.29 score 14 scripts
salvatoremangiafico
rcompanion:Functions to Support Extension Education Program Evaluation
Functions and datasets to support Summary and Analysis of Extension Program Evaluation in R, and An R Companion for the Handbook of Biological Statistics. Vignettes are available at <https://rcompanion.org>.
Maintained by Salvatore Mangiafico. Last updated 1 month ago.
15.3 match 4 stars 8.01 score 2.4k scripts 5 dependents
tidymodels
infer:Tidy Statistical Inference
The objective of this package is to perform inference using an expressive statistical grammar that coheres with the tidy design framework.
Maintained by Simon Couch. Last updated 6 months ago.
7.8 match 736 stars 15.75 score 3.5k scripts 18 dependents
t-kalinowski
keras:R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.
Maintained by Tomasz Kalinowski. Last updated 11 months ago.
11.0 match 10.93 score 10k scripts 55 dependents
svkucheryavski
mdatools:Multivariate Data Analysis for Chemometrics
Projection based methods for preprocessing, exploring and analysis of multivariate data used in chemometrics. S. Kucheryavskiy (2020) <doi:10.1016/j.chemolab.2020.103937>.
Maintained by Sergey Kucheryavskiy. Last updated 8 months ago.
16.1 match 36 stars 7.41 score 220 scripts 1 dependent
yanyachen
MLmetrics:Machine Learning Evaluation Metrics
A collection of evaluation metrics, including loss, score and utility functions, that measure regression, classification and ranking performance.
Maintained by Yachen Yan. Last updated 11 months ago.
10.7 match 69 stars 11.09 score 2.2k scripts 20 dependents
hwborchers
pracma:Practical Numerical Math Functions
Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses 'MATLAB' function names where appropriate to simplify porting.
Maintained by Hans W. Borchers. Last updated 1 year ago.
9.6 match 29 stars 12.34 score 6.6k scripts 931 dependents
kenaho1
asbio:A Collection of Statistical Tools for Biologists
Contains functions from: Aho, K. (2014) Foundational and Applied Statistics for Biologists using R. CRC/Taylor and Francis, Boca Raton, FL, ISBN: 978-1-4398-7338-0.
Maintained by Ken Aho. Last updated 2 months ago.
16.1 match 5 stars 7.32 score 310 scripts 3 dependents
cran
mgcv:Mixed GAM Computation Vehicle with Automatic Smoothness Estimation
Generalized additive (mixed) models, some of their extensions and other generalized ridge regression with multiple smoothing parameter estimation by (Restricted) Marginal Likelihood, Generalized Cross Validation and similar, or using iterated nested Laplace approximation for fully Bayesian inference. See Wood (2017) <doi:10.1201/9781315370279> for an overview. Includes a gam() function, a wide variety of smoothers, 'JAGS' support and distributions beyond the exponential family.
Maintained by Simon Wood. Last updated 1 year ago.
8.8 match 32 stars 12.71 score 17k scripts 7.8k dependents
friendly
matlib:Matrix Functions for Teaching and Learning Linear Algebra and Multivariate Statistics
A collection of matrix functions for teaching and learning matrix linear algebra as used in multivariate statistical methods. Many of these functions are designed for tutorial purposes in learning matrix algebra ideas using R. In some cases, functions are provided for concepts available elsewhere in R, but where the function call or name is not obvious. In other cases, functions are provided to show or demonstrate an algorithm. In addition, a collection of functions are provided for drawing vector diagrams in 2D and 3D and for rendering matrix expressions and equations in LaTeX.
Maintained by Michael Friendly. Last updated 4 days ago.
diagrams linear-equations matrix matrix-functions matrix-visualizer vector vignette
8.6 match 65 stars 12.89 score 900 scripts 11 dependents
adeverse
ade4:Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences
Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) <doi:10.18637/jss.v022.i04>.
Maintained by Aurélie Siberchicot. Last updated 14 days ago.
7.3 match 39 stars 14.96 score 2.2k scripts 256 dependents
tomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 year ago.
12.7 match 3 stars 8.20 score 7.8k scripts 11 dependents
easystats
performance:Assessment of Regression Models Performance
Utilities for computing measures to assess model quality, which are not directly provided by R's 'base' or 'stats' packages. These include e.g. measures like r-squared, intraclass correlation coefficient (Nakagawa, Johnson & Schielzeth (2017) <doi:10.1098/rsif.2017.0213>), root mean squared error or functions to check models for overdispersion, singularity or zero-inflation and more. Functions apply to a large variety of regression models, including generalized linear models, mixed effects models and Bayesian models. References: Lüdecke et al. (2021) <doi:10.21105/joss.03139>.
Maintained by Daniel Lüdecke. Last updated 1 day ago.
aic easystats hacktoberfest loo machine-learning mixed-models models performance r2 statistics
6.3 match 1.1k stars 16.18 score 4.3k scripts 47 dependents
nashjc
nlsr:Functions for Nonlinear Least Squares Solutions - Updated 2022
Provides tools for working with nonlinear least squares problems. For the estimation of models, it offers more reliable and robust tools than nls(), where the Gauss-Newton method frequently stops with 'singular gradient' messages. This is accomplished by using, where possible, analytic derivatives to compute the matrix of derivatives and a stabilization of the solution of the estimation equations. Tools for approximate or externally supplied derivative matrices are included. Bounds and masks on parameters are handled properly.
Maintained by John C Nash. Last updated 29 days ago.
14.4 match 7.02 score 94 scripts 5 dependents
alexpghayes
distributions3:Probability Distributions as S3 Objects
Tools to create and manipulate probability distributions using S3. Generics pdf(), cdf(), quantile(), and random() provide replacements for base R's d/p/q/r style functions. Functions and arguments have been named carefully to minimize confusion for students in intro stats courses. The documentation for each distribution contains detailed mathematical notes.
Maintained by Alex Hayes. Last updated 6 months ago.
8.9 match 102 stars 11.35 score 118 scripts 7 dependents
hiroyukiyamamoto
loadings:Loadings for Principal Component Analysis and Partial Least Squares
Computing statistical hypothesis testing for loading in principal component analysis (PCA) (Yamamoto, H. et al. (2014) <doi:10.1186/1471-2105-15-51>), orthogonal smoothed PCA (OS-PCA) (Yamamoto, H. et al. (2021) <doi:10.3390/metabo11030149>), one-sided kernel PCA (Yamamoto, H. (2023) <doi:10.51094/jxiv.262>), partial least squares (PLS) and PLS discriminant analysis (PLS-DA) (Yamamoto, H. et al. (2009) <doi:10.1016/j.chemolab.2009.05.006>), PLS with rank order of groups (PLS-ROG) (Yamamoto, H. (2017) <doi:10.1002/cem.2883>), regularized canonical correlation analysis discriminant analysis (RCCA-DA) (Yamamoto, H. et al. (2008) <doi:10.1016/j.bej.2007.12.009>), multiset PLS and PLS-ROG (Yamamoto, H. (2022) <doi:10.1101/2022.08.30.505949>).
Maintained by Hiroyuki Yamamoto. Last updated 11 months ago.
24.5 match 3 stars 4.08 score 27 scripts 1 dependent
hzambran
hydroGOF:Goodness-of-Fit Functions for Comparison of Simulated and Observed Hydrological Time Series
S3 functions implementing both statistical and graphical goodness-of-fit measures between observed and simulated values, mainly intended for use during the calibration, validation, and application of hydrological models. Missing values in observed and/or simulated values can be removed before computations. Comments, questions, and collaboration of any kind are very welcome.
Maintained by Mauricio Zambrano-Bigiarini. Last updated 10 months ago.
9.5 match 40 stars 10.29 score 796 scripts 8 dependents
db969
rsq:R-Squared and Related Measures
Calculate generalized R-squared, partial R-squared, and partial correlation coefficients for generalized linear (mixed) models (including quasi models with well defined variance functions).
Maintained by Dabao Zhang. Last updated 6 months ago.
20.0 match 4.86 score 492 scripts 2 dependents
jorischau
gslnls:GSL Multi-Start Nonlinear Least-Squares Fitting
An R interface to weighted nonlinear least-squares optimization with the GNU Scientific Library (GSL), see M. Galassi et al. (2009, ISBN:0954612078). The available trust region methods include the Levenberg-Marquardt algorithm with and without geodesic acceleration, the Steihaug-Toint conjugate gradient algorithm for large systems and several variants of Powell's dogleg algorithm. Multi-start optimization based on quasi-random samples is implemented using a modified version of the algorithm in Hickernell and Yuan (1997, OR Transactions). Robust nonlinear regression can be performed using various robust loss functions, in which case the optimization problem is solved by iterative reweighted least squares (IRLS). Bindings are provided to tune a number of parameters affecting the low-level aspects of the trust region algorithms. The interface mimics R's nls() function and returns model objects inheriting from the same class.
Maintained by Joris Chau. Last updated 2 months ago.
gnu-scientific-library gsl levenberg-marquardt multi-start nonlinear-least-squares nonlinear-regression robust-regresssion fortran glibc
15.6 match 16 stars 6.23 score 35 scripts 2 dependents
jkrijthe
RSSL:Implementations of Semi-Supervised Learning Approaches for Classification
A collection of implementations of semi-supervised classifiers and methods to evaluate their performance. The package includes implementations of, among others, Implicitly Constrained Learning, Moment Constrained Learning, the Transductive SVM, Manifold regularization, Maximum Contrastive Pessimistic Likelihood estimation, S4VM and WellSVM.
Maintained by Jesse Krijthe. Last updated 1 year ago.
16.0 match 58 stars 6.05 score 128 scripts 1 dependent
pepijn-devries
csquares:Concise Spatial Query and Representation System (c-Squares)
Encode and decode c-squares, from and to simple feature (sf) or spatiotemporal arrays (stars) objects. Use c-squares codes to quickly join or query spatial data.
Maintained by Pepijn de Vries. Last updated 7 months ago.
16.6 match 2 stars 5.81 score 20 scripts
zeileis
ivreg:Instrumental-Variables Regression by '2SLS', '2SM', or '2SMM', with Diagnostics
Instrumental variable estimation for linear models by two-stage least-squares (2SLS) regression or by robust-regression via M-estimation (2SM) or MM-estimation (2SMM). The main ivreg() model-fitting function is designed to provide a workflow as similar as possible to standard lm() regression. A wide range of methods is provided for fitted ivreg model objects, including extensive functionality for computing and graphing regression diagnostics in addition to other standard model tools.
Maintained by Achim Zeileis. Last updated 2 months ago.
instrumental-variables regression-diagnostics two-stage-least-squares-regression
9.4 match 20 stars 10.24 score 360 scripts 4 dependents
jackstat
ModelMetrics:Rapid Calculation of Model Metrics
Collection of metrics for evaluating models written in C++ using 'Rcpp'. Popular metrics include area under the curve, log loss, root mean square error, etc.
Maintained by Tyler Hunt. Last updated 4 years ago.
auc logloss machine-learning metrics model-evaluation model-metrics cpp
8.1 match 29 stars 11.83 score 1.3k scripts 306 dependents
pachadotdev
cpp11armadillo:An 'Armadillo' Interface
Provides function declarations and inline function definitions that facilitate communication between R and the 'Armadillo' 'C++' library for linear algebra and scientific computing. This implementation is detailed in Vargas Sepulveda and Schneider Malamud (2024) <doi:10.48550/arXiv.2408.11074>.
Maintained by Mauricio Vargas Sepulveda. Last updated 28 days ago.
armadillo cpp cpp11 hacktoberfest linear-algebra
10.4 match 9 stars 9.14 score 1 script 16 dependents
tidymodels
yardstick:Tidy Characterizations of Model Performance
Tidy tools for quantifying how well a model fits a data set, such as confusion matrices, class probability curve summaries, and regression metrics (e.g., RMSE).
Maintained by Emil Hvitfeldt. Last updated 6 days ago.
6.1 match 387 stars 15.47 score 2.2k scripts 60 dependents
vanzanden
ggsolvencyii:A 'ggplot2'-Plot of Composition of Solvency II SCR: SF and IM
An implementation of 'ggplot2'-methods to present the composition of Solvency II Solvency Capital Requirement (SCR) as a series of concentric circle-parts. Solvency II (Solvency 2) is European insurance legislation, coming into force by the delegated acts of October 10, 2014. <https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=OJ%3AL%3A2015%3A012%3ATOC>. Additional files, defining the structure of the Standard Formula (SF) method of the SCR-calculation, are provided. The structure files can be adapted for localization or for insurance companies who use Internal Models (IM). Options are available for combining smaller components, horizontal and vertical scaling, rotation, and plotting only some circle-parts. With outlines and connectors several SCR-compositions can be compared, for example in ORSA-scenarios (Own Risk and Solvency Assessment).
Maintained by Marco van Zanden. Last updated 6 years ago.
16.9 match 3 stars 5.58 score 63 scripts
spatstat
spatstat.geom:Geometrical Functionality of the 'spatstat' Family
Defines spatial data types and supports geometrical operations on them. Data types include point patterns, windows (domains), pixel images, line segment patterns, tessellations and hyperframes. Capabilities include creation and manipulation of data (using command line or graphical interaction), plotting, geometrical operations (rotation, shift, rescale, affine transformation), convex hull, discretisation and pixellation, Dirichlet tessellation, Delaunay triangulation, pairwise distances, nearest-neighbour distances, distance transform, morphological operations (erosion, dilation, closing, opening), quadrat counting, geometrical measurement, geometrical covariance, colour maps, calculus on spatial domains, Gaussian blur, level sets of images, transects of images, intersections between objects, minimum distance matching. (Excludes spatial data on a network, which are supported by the package 'spatstat.linnet'.)
Maintained by Adrian Baddeley. Last updated 20 hours ago.
classes-and-objects distance-calculation geometry geometry-processing images mensuration plotting point-patterns spatial-data spatial-data-analysis
7.5 match 7 stars 12.10 score 241 scripts 227 dependents
openpharma
mmrm:Mixed Models for Repeated Measures
Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E> for a tutorial and Mallinckrodt, Lane, Schnell, Peng and Mancuso (2008) <doi:10.1177/009286150804200402> for a review. This package implements MMRM based on the marginal linear model without random effects using Template Model Builder ('TMB') which enables fast and robust model fitting. Users can specify a variety of covariance matrices, weight observations, fit models with restricted or standard maximum likelihood inference, perform hypothesis testing with Satterthwaite or Kenward-Roger adjustment, and extract least square means estimates by using 'emmeans'.
Maintained by Daniel Sabanes Bove. Last updated 12 days ago.
7.3 match 138 stars 12.15 score 113 scripts 4 dependentseasystats
effectsize:Indices of Effect Size
Provide utilities to work with indices of effect size for a wide variety of models and hypothesis tests (see list of supported models using the function 'insight::supported_models()'), allowing computation of and conversion between indices such as Cohen's d, r, odds, etc. References: Ben-Shachar et al. (2020) <doi:10.21105/joss.02815>.
Maintained by Mattan S. Ben-Shachar. Last updated 2 months ago.
anovacohens-dcomputeconversioncorrelationeffect-sizeeffectsizehacktoberfesthedges-ginterpretationstandardizationstandardizedstatistics
5.4 match 344 stars 16.38 score 1.8k scripts 29 dependentsbraverock
PerformanceAnalytics:Econometric Tools for Performance and Risk Analysis
Collection of econometric functions for performance and risk analysis. In addition to standard risk and performance metrics, this package aims to aid practitioners and researchers in utilizing the latest research in analysis of non-normal return streams. In general, it is most tested on return (rather than price) data on a regular scale, but most functions will work with irregular return data as well, and increasing numbers of functions will work with P&L or price data where possible.
Maintained by Brian G. Peterson. Last updated 3 months ago.
5.5 match 222 stars 15.93 score 4.8k scripts 20 dependentsrfastofficial
Rfast2:A Collection of Efficient and Extremely Fast R Functions II
A collection of fast statistical and utility functions for data analysis. Functions for regression, maximum likelihood, column-wise statistics and many more have been included. C++ has been utilized to speed up the functions. References: Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>.
Maintained by Manos Papadakis. Last updated 1 years ago.
10.6 match 38 stars 8.09 score 75 scripts 26 dependentsfbertran
plsRbeta:Partial Least Squares Regression for Beta Regression Models
Provides Partial least squares Regression for (weighted) beta regression models (Bertrand 2013, <http://journal-sfds.fr/article/view/215>) and k-fold cross-validation of such models using various criteria. It allows for missing data in the explanatory variables. Bootstrap confidence intervals constructions are also available.
Maintained by Frederic Bertrand. Last updated 2 years ago.
19.6 match 2 stars 4.34 score 22 scriptsmrcieu
TwoSampleMR:Two Sample MR Functions and Interface to MRC Integrative Epidemiology Unit OpenGWAS Database
A package for performing Mendelian randomization using GWAS summary data. It uses the IEU OpenGWAS database <https://gwas.mrcieu.ac.uk/> to automatically obtain data, and a wide range of methods to run the analysis.
Maintained by Gibran Hemani. Last updated 4 hours ago.
7.5 match 474 stars 11.27 score 1.7k scripts 1 dependentsrikenbit
guidedPLS:Supervised Dimensional Reduction by Guided Partial Least Squares
Guided partial least squares (guided-PLS) is the combination of partial least squares by singular value decomposition (PLS-SVD) and guided principal component analysis (guided-PCA). For the details of the methods, see the reference section of GitHub README.md <https://github.com/rikenbit/guidedPLS>.
Maintained by Koki Tsuyuzaki. Last updated 2 years ago.
21.0 match 4.00 scorebioc
structToolbox:Data processing & analysis tools for Metabolomics and other omics
An extensive set of data (pre-)processing and analysis methods and tools for metabolomics and other omics, with a strong emphasis on statistics and machine learning. This toolbox allows the user to build extensive and standardised workflows for data analysis. The methods and tools have been implemented using class-based templates provided by the struct (Statistics in R Using Class-based Templates) package. The toolbox includes pre-processing methods (e.g. signal drift and batch correction, normalisation, missing value imputation and scaling), univariate (e.g. ttest, various forms of ANOVA, Kruskal–Wallis test and more) and multivariate statistical methods (e.g. PCA and PLS, including cross-validation and permutation testing) as well as machine learning methods (e.g. Support Vector Machines). The STATistics Ontology (STATO) has been integrated and implemented to provide standardised definitions for the different methods, inputs and outputs.
Maintained by Gavin Rhys Lloyd. Last updated 27 days ago.
workflowstepmetabolomicsbioconductor-packagedimslc-msmachine-learningmultivariate-analysisstatisticsunivariate
13.2 match 10 stars 6.26 score 12 scriptschorscroft
zalpha:Run a Suite of Selection Statistics
A suite of statistics for identifying areas of the genome under selective pressure. See Jacobs, Sluckin and Kivisild (2016) <doi:10.1534/genetics.115.185900>.
Maintained by Clare Horscroft. Last updated 3 years ago.
20.5 match 2 stars 4.00 score 4 scriptspsychmeta
psychmeta:Psychometric Meta-Analysis Toolkit
Tools for computing bare-bones and psychometric meta-analyses and for generating psychometric data for use in meta-analysis simulations. Supports bare-bones, individual-correction, and artifact-distribution methods for meta-analyzing correlations and d values. Includes tools for converting effect sizes, computing sporadic artifact corrections, reshaping meta-analytic databases, computing multivariate corrections for range variation, and more. Bugs can be reported to <https://github.com/psychmeta/psychmeta/issues> or <issues@psychmeta.com>.
Maintained by Jeffrey A. Dahlke. Last updated 9 months ago.
hacktoberfestmeta-analysispsychologypsychometricpsychometrics
9.9 match 57 stars 8.25 score 151 scriptsmhenderson
wallis:Room squares in R
Room squares in R.
Maintained by Matthew Henderson. Last updated 7 months ago.
combinatorial-designscombinatoricsroom-squares
31.9 match 2.54 score 1 scriptsegenn
rtemis:Machine Learning and Visualization
Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.
Maintained by E.D. Gennatas. Last updated 1 months ago.
data-sciencedata-visualizationmachine-learningmachine-learning-libraryvisualization
11.4 match 145 stars 7.09 score 50 scripts 2 dependentsr-forge
DPQ:Density, Probability, Quantile ('DPQ') Computations
Computations for approximations and alternatives for the 'DPQ' (Density (pdf), Probability (cdf) and Quantile) functions for probability distributions in R. Primary focus is on (central and non-central) beta, gamma and related distributions such as the chi-squared, F, and t. -- For several distribution functions, provide functions implementing formulas from Johnson, Kotz, and Kemp (1992) <doi:10.1002/bimj.4710360207> and Johnson, Kotz, and Balakrishnan (1995) for discrete or continuous distributions respectively. This is for the use of researchers in these numerical approximation implementations, notably for my own use in order to improve standard R pbeta(), qgamma(), ..., etc: {'"dpq"'-functions}.
Maintained by Martin Maechler. Last updated 2 months ago.
13.8 match 5.75 score 43 scripts 1 dependentsrmheiberger
HH:Statistical Analysis and Data Display: Heiberger and Holland
Support software for Statistical Analysis and Data Display (Second Edition, Springer, ISBN 978-1-4939-2121-8, 2015) and (First Edition, Springer, ISBN 0-387-40270-5, 2004) by Richard M. Heiberger and Burt Holland. This contemporary presentation of statistical methods features extensive use of graphical displays for exploring data and for displaying the analysis. The second edition includes redesigned graphics and additional chapters. The authors emphasize how to construct and interpret graphs, discuss principles of graphical design, and show how accompanying traditional tabular results are used to confirm the visual impressions derived directly from the graphs. Many of the graphical formats are novel and appear here for the first time in print. All chapters have exercises. All functions introduced in the book are in the package. R code for all examples, both graphs and tables, in the book is included in the scripts directory of the package.
Maintained by Richard M. Heiberger. Last updated 1 months ago.
12.1 match 3 stars 6.42 score 752 scripts 5 dependentsdsy109
mixtools:Tools for Analyzing Finite Mixture Models
Analyzes finite mixture models for various parametric and semiparametric settings. This includes mixtures of parametric distributions (normal, multivariate normal, multinomial, gamma), various Reliability Mixture Models (RMMs), mixtures-of-regressions settings (linear regression, logistic regression, Poisson regression, linear regression with changepoints, predictor-dependent mixing proportions, random effects regressions, hierarchical mixtures-of-experts), and tools for selecting the number of components (bootstrapping the likelihood ratio test statistic, mixturegrams, and model selection criteria). Bayesian estimation of mixtures-of-linear-regressions models is available as well as a novel data depth method for obtaining credible bands. This package is based upon work supported by the National Science Foundation under Grant No. SES-0518772 and the Chan Zuckerberg Initiative: Essential Open Source Software for Science (Grant No. 2020-255193).
Maintained by Derek Young. Last updated 9 months ago.
mixture-modelsmixture-of-expertssemiparametric-regression
6.8 match 20 stars 11.34 score 1.4k scripts 56 dependentsr-lib
scales:Scale Functions for Visualization
Graphical scales map data to aesthetics, and provide methods for automatically determining breaks and labels for axes and legends.
Maintained by Thomas Lin Pedersen. Last updated 5 months ago.
3.8 match 419 stars 19.88 score 88k scripts 7.9k dependentskhliland
pls:Partial Least Squares and Principal Component Regression
Multivariate regression methods Partial Least Squares Regression (PLSR), Principal Component Regression (PCR) and Canonical Powered Partial Least Squares (CPPLS).
Maintained by Kristian Hovde Liland. Last updated 2 months ago.
5.5 match 36 stars 13.50 score 3.2k scripts 85 dependentsben519
mltools:Machine Learning Tools
A collection of machine learning helper functions, particularly assisting in the Exploratory Data Analysis phase. Makes heavy use of the 'data.table' package for optimal speed and memory efficiency. Highlights include a versatile bin_data() function, sparsify() for converting a data.table to sparse matrix format with one-hot encoding, fast evaluation metrics, and empirical_cdf() for calculating empirical Multivariate Cumulative Distribution Functions.
Maintained by Ben Gorman. Last updated 3 years ago.
exploratory-data-analysismachine-learning
7.5 match 72 stars 9.58 score 1.2k scripts 13 dependentsbarbarabodinier
sharp:Stability-enHanced Approaches using Resampling Procedures
In stability selection (N Meinshausen, P Bühlmann (2010) <doi:10.1111/j.1467-9868.2010.00740.x>) and consensus clustering (S Monti et al (2003) <doi:10.1023/A:1023949509487>), resampling techniques are used to enhance the reliability of the results. In this package, hyper-parameters are calibrated by maximising model stability, which is measured under the null hypothesis that all selection (or co-membership) probabilities are identical (B Bodinier et al (2023a) <doi:10.1093/jrsssc/qlad058> and B Bodinier et al (2023b) <doi:10.1093/bioinformatics/btad635>). Functions are readily implemented for the use of LASSO regression, sparse PCA, sparse (group) PLS or graphical LASSO in stability selection, and hierarchical clustering, partitioning around medoids, K means or Gaussian mixture models in consensus clustering.
Maintained by Barbara Bodinier. Last updated 1 years ago.
12.2 match 13 stars 5.91 score 124 scriptsbilldenney
PKNCA:Perform Pharmacokinetic Non-Compartmental Analysis
Compute standard Non-Compartmental Analysis (NCA) parameters for typical pharmacokinetic analyses and summarize them.
Maintained by Bill Denney. Last updated 18 days ago.
ncanoncompartmental-analysispharmacokinetics
5.7 match 73 stars 12.61 score 214 scripts 4 dependentstidymodels
recipes:Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.
Maintained by Max Kuhn. Last updated 8 hours ago.
3.8 match 584 stars 18.73 score 7.2k scripts 382 dependentscran
agricolae:Statistical Procedures for Agricultural Research
Original idea was presented in the thesis "A statistical analysis tool for agricultural research" to obtain the degree of Master on science, National Engineering University (UNI), Lima-Peru. Some experimental data for the examples come from the CIP and others research. Agricolae offers extensive functionality on experimental design especially for agricultural and plant breeding experiments, which can also be useful for other purposes. It supports planning of lattice, Alpha, Cyclic, Complete Block, Latin Square, Graeco-Latin Squares, augmented block, factorial, split and strip plot designs. There are also various analysis facilities for experimental data, e.g. treatment comparison procedures and several non-parametric tests comparison, biodiversity indexes and consensus cluster.
Maintained by Felipe de Mendiburu. Last updated 1 years ago.
10.0 match 7 stars 7.01 score 15 dependentsemitanaka
edibble:Encapsulating Elements of Experimental Design
A system to facilitate designing comparative (and non-comparative) experiments using the grammar of experimental designs <https://emitanaka.org/edibble-book/>. An experimental design is treated as an intermediate, mutable object that is built progressively by fundamental experimental components like units, treatments, and their relation. The system aids in experimental planning, management and workflow.
Maintained by Emi Tanaka. Last updated 4 months ago.
9.3 match 217 stars 7.43 score 62 scriptsopengeos
whitebox:'WhiteboxTools' R Frontend
An R frontend for the 'WhiteboxTools' library, which is an advanced geospatial data analysis platform developed by Prof. John Lindsay at the University of Guelph's Geomorphometry and Hydrogeomatics Research Group. 'WhiteboxTools' can be used to perform common geographical information systems (GIS) analysis operations, such as cost-distance analysis, distance buffering, and raster reclassification. Remote sensing and image processing tasks include image enhancement (e.g. panchromatic sharpening, contrast adjustments), image mosaicing, numerous filtering operations, simple classification (k-means), and common image transformations. 'WhiteboxTools' also contains advanced tooling for spatial hydrological analysis (e.g. flow-accumulation, watershed delineation, stream network analysis, sink removal), terrain analysis (e.g. common terrain indices such as slope, curvatures, wetness index, hillshading; hypsometric analysis; multi-scale topographic position analysis), and LiDAR data processing. Suggested citation: Lindsay (2016) <doi:10.1016/j.cageo.2016.07.003>.
Maintained by Andrew Brown. Last updated 5 months ago.
geomorphometrygeoprocessinggeospatialgishydrologyremote-sensingrstudio
7.1 match 173 stars 9.65 score 203 scripts 2 dependentsverasls
lvmisc:Veras Miscellaneous
Contains a collection of useful functions for basic data computation and manipulation, wrapper functions for generating 'ggplot2' graphics, including statistical model diagnostic plots, methods for computing statistical models quality measures (such as AIC, BIC, r squared, root mean squared error) and general utilities.
Maintained by Lucas Veras. Last updated 1 years ago.
12.7 match 6 stars 5.40 score 14 scripts 1 dependentslrberge
fixest:Fast Fixed-Effects Estimations
Fast and user-friendly estimation of econometric models with multiple fixed-effects. Includes ordinary least squares (OLS), generalized linear models (GLM) and the negative binomial. The core of the package is based on optimized parallel C++ code, scaling especially well for large data sets. The method to obtain the fixed-effects coefficients is based on Berge (2018) <https://github.com/lrberge/fixest/blob/master/_DOCS/FENmlm_paper.pdf>. Further provides tools to export and view the results of several estimations with intuitive design to cluster the standard-errors.
Maintained by Laurent Berge. Last updated 7 months ago.
4.6 match 387 stars 14.69 score 3.8k scripts 25 dependentsheliosdrm
pwr:Basic Functions for Power Analysis
Power analysis functions along the lines of Cohen (1988).
Maintained by Helios De Rosario. Last updated 1 years ago.
5.2 match 105 stars 12.97 score 2.6k scripts 28 dependentstrevorld
gridpattern:'grid' Pattern Grobs
Provides 'grid' grobs that fill in a user-defined area with various patterns. Includes enhanced versions of the geometric and image-based patterns originally contained in the 'ggpattern' package as well as original 'pch', 'polygon_tiling', 'regular_polygon', 'rose', 'text', 'wave', and 'weave' patterns plus support for custom user-defined patterns.
Maintained by Trevor L. Davis. Last updated 1 months ago.
7.9 match 33 stars 8.42 score 4 scripts 4 dependentsamices
mice:Multivariate Imputation by Chained Equations
Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm as described in Van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>. Each variable has its own imputation model. Built-in imputation models are provided for continuous data (predictive mean matching, normal), binary data (logistic regression), unordered categorical data (polytomous logistic regression) and ordered categorical data (proportional odds). MICE can also impute continuous two-level data (normal model, pan, second-level variables). Passive imputation can be used to maintain consistency between variables. Various diagnostic plots are available to inspect the quality of the imputations.
Maintained by Stef van Buuren. Last updated 8 days ago.
chained-equationsfcsimputationmicemissing-datamissing-valuesmultiple-imputationmultivariate-datacpp
3.9 match 462 stars 16.50 score 10k scripts 154 dependentsfbertran
plsdof:Degrees of Freedom and Statistical Inference for Partial Least Squares Regression
The plsdof package provides Degrees of Freedom estimates for Partial Least Squares (PLS) Regression. Model selection for PLS is based on various information criteria (aic, bic, gmdl) or on cross-validation. Estimates for the mean and covariance of the PLS regression coefficients are available. They allow the construction of approximate confidence intervals and the application of test procedures (Kramer and Sugiyama 2012 <doi:10.1198/jasa.2011.tm10107>). Further, cross-validation procedures for Ridge Regression and Principal Components Regression are available.
Maintained by Frederic Bertrand. Last updated 2 years ago.
17.3 match 3 stars 3.65 score 30 scriptscran
sae:Small Area Estimation
Functions for small area estimation.
Maintained by Yolanda Marhuenda. Last updated 5 years ago.
11.5 match 6 stars 5.49 score 83 scripts 8 dependentsocbe-uio
contingencytables:Statistical Analysis of Contingency Tables
Provides functions to perform statistical inference of data organized in contingency tables. This package is a companion to the "Statistical Analysis of Contingency Tables" book by Fagerland et al. <ISBN 9781466588172>.
Maintained by Waldir Leoncio. Last updated 7 months ago.
15.3 match 3 stars 4.13 score 8 scripts 1 dependentsxiaoruizhu
SurrogateRsq:Goodness-of-Fit Analysis for Categorical Data using the Surrogate R-Squared
To assess and compare the models' goodness of fit, R-squared is one of the most popular measures. For categorical data analysis, however, no universally adopted R-squared measure can resemble the ordinary least squares (OLS) R-squared for linear models with continuous data. This package implements the surrogate R-squared measure for categorical data analysis, which is proposed in the study of Dungang Liu, Xiaorui Zhu, Brandon Greenwell, and Zewei Lin (2022) <doi:10.1111/bmsp.12289>. It can generate a point or interval measure of the surrogate R-squared. It can also provide a ranking measure of the percentage contribution of each variable to the overall surrogate R-squared. This ranking assessment allows one to check the importance of each variable in terms of their explained variance. This package can be jointly used with other existing R packages for variable selection and model diagnostics in the model-building process.
Maintained by Xiaorui (Jeremy) Zhu. Last updated 12 months ago.
categorical-data-analysisgoodness-of-fitr-squared-statisticstatistics
13.9 match 5 stars 4.48 score 12 scriptsbriencj
dae:Functions Useful in the Design and ANOVA of Experiments
The content falls into the following groupings: (i) Data, (ii) Factor manipulation functions, (iii) Design functions, (iv) ANOVA functions, (v) Matrix functions, (vi) Projector and canonical efficiency functions, and (vii) Miscellaneous functions. There is a vignette describing how to use the design functions for randomizing and assessing designs available as a vignette called 'DesignNotes'. The ANOVA functions facilitate the extraction of information when the 'Error' function has been used in the call to 'aov'. The package 'dae' can also be installed from <http://chris.brien.name/rpackages/>.
Maintained by Chris Brien. Last updated 4 months ago.
7.2 match 1 stars 8.62 score 356 scripts 7 dependentskassambara
rstatix:Pipe-Friendly Framework for Basic Statistical Tests
Provides a simple and intuitive pipe-friendly framework, coherent with the 'tidyverse' design philosophy, for performing basic statistical tests, including t-test, Wilcoxon test, ANOVA, Kruskal-Wallis and correlation analyses. The output of each test is automatically transformed into a tidy data frame to facilitate visualization. Additional functions are available for reshaping, reordering, manipulating and visualizing correlation matrix. Functions are also included to facilitate the analysis of factorial experiments, including purely 'within-Ss' designs (repeated measures), purely 'between-Ss' designs, and mixed 'within-and-between-Ss' designs. It's also possible to compute several effect size metrics, including "eta squared" for ANOVA, "Cohen's d" for t-test and 'Cramer V' for the association between categorical variables. The package contains helper functions for identifying univariate and multivariate outliers, assessing normality and homogeneity of variances.
Maintained by Alboukadel Kassambara. Last updated 2 years ago.
4.1 match 456 stars 15.16 score 11k scripts 420 dependentssingmann
afex:Analysis of Factorial Experiments
Convenience functions for analyzing factorial experiments using ANOVA or mixed models. aov_ez(), aov_car(), and aov_4() allow specification of between, within (i.e., repeated-measures), or mixed (i.e., split-plot) ANOVAs for data in long format (i.e., one observation per row), automatically aggregating multiple observations per individual and cell of the design. mixed() fits mixed models using lme4::lmer() and computes p-values for all fixed effects using either Kenward-Roger or Satterthwaite approximation for degrees of freedom (LMM only), parametric bootstrap (LMMs and GLMMs), or likelihood ratio tests (LMMs and GLMMs). afex_plot() provides a high-level interface for interaction or one-way plots using ggplot2, combining raw data and model estimates. afex uses type 3 sums of squares as default (imitating commercial statistical software).
Maintained by Henrik Singmann. Last updated 7 months ago.
4.2 match 123 stars 14.50 score 1.4k scripts 15 dependentsphilipppro
measures:Performance Measures for Statistical Learning
Provides the biggest amount of statistical measures in the whole R world. Includes measures of regression, (multiclass) classification and multilabel classification. The measures come mainly from the 'mlr' package and were programed by several 'mlr' developers.
Maintained by Philipp Probst. Last updated 4 years ago.
13.4 match 1 stars 4.47 score 88 scripts 2 dependentsk-m-m
nnls:The Lawson-Hanson Algorithm for Non-Negative Least Squares (NNLS)
An R interface to the Lawson-Hanson implementation of an algorithm for non-negative least squares (NNLS). Also allows the combination of non-negative and non-positive constraints.
Maintained by Katharine Mullen. Last updated 5 months ago.
8.4 match 7.13 score 251 scripts 167 dependentsdgbonett
statpsych:Statistical Methods for Psychologists
Implements confidence interval and sample size methods that are especially useful in psychological research. The methods can be applied in 1-group, 2-group, paired-samples, and multiple-group designs and to a variety of parameters including means, medians, proportions, slopes, standardized mean differences, standardized linear contrasts of means, plus several measures of correlation and association. Confidence interval and sample size functions are given for single parameters as well as differences, ratios, and linear contrasts of parameters. The sample size functions can be used to approximate the sample size needed to estimate a parameter or function of parameters with desired confidence interval precision or to perform a variety of hypothesis tests (directional two-sided, equivalence, superiority, noninferiority) with desired power. For details see: Statistical Methods for Psychologists, Volumes 1–4, <https://dgbonett.sites.ucsc.edu/>.
Maintained by Douglas G. Bonett. Last updated 3 months ago.
12.4 match 6 stars 4.83 score 15 scripts 1 dependentsdeclaredesign
estimatr:Fast Estimators for Design-Based Inference
Fast procedures for a small set of commonly-used, design-appropriate estimators with robust standard errors and confidence intervals. Includes estimators for linear regression, instrumental variables regression, difference-in-means, Horvitz-Thompson estimation, and regression improving precision of experimental estimates by interacting treatment with centered pre-treatment covariates introduced by Lin (2013) <doi:10.1214/12-AOAS583>.
Maintained by Graeme Blair. Last updated 1 months ago.
5.2 match 133 stars 11.58 score 1.7k scripts 11 dependentsnlmixr2
rxode2:Facilities for Simulating from ODE-Based Models
Facilities for running simulations from ordinary differential equation ('ODE') models, such as pharmacometrics and other compartmental models. A compilation manager translates the ODE model into C, compiles it, and dynamically loads the object code into R for improved computational efficiency. An event table object facilitates the specification of complex dosing regimens (optional) and sampling schedules. NB: The use of this package requires both C and Fortran compilers, for details on their use with R please see Section 6.3, Appendix A, and Appendix D in the "R Administration and Installation" manual. Also the code is mostly released under GPL. The 'VODE' and 'LSODA' are in the public domain. The information is available in the inst/COPYRIGHTS.
Maintained by Matthew L. Fidler. Last updated 1 months ago.
5.3 match 40 stars 11.24 score 220 scripts 13 dependentschikuang
SLSEdesign:Optimal Regression Design under the Second-Order Least Squares Estimator
With given inputs that include number of points, discrete design space, a measure of skewness, models and parameter value, this package calculates the objective value and optimal designs, and plots the equivalence theory under A- and D-optimal criteria under the second-order least squares estimator. This package is based on the paper "Properties of optimal regression designs under the second-order least squares estimator" by Chi-Kuang Yeh and Julie Zhou (2021) <doi:10.1007/s00362-018-01076-6>.
Maintained by Chi-Kuang Yeh. Last updated 5 months ago.
convex-optimizationcvxdesign-of-experimentsleast-squaresoptimal-designs
12.8 match 4.54 score 2 scriptskhliland
multiblock:Multiblock Data Fusion in Statistics and Machine Learning
Functions and datasets to support Smilde, Næs and Liland (2021, ISBN: 978-1-119-60096-1) "Multiblock Data Fusion in Statistics and Machine Learning - Applications in the Natural and Life Sciences". This implements and imports a large collection of methods for multiblock data analysis with common interfaces, result- and plotting functions, several real data sets and six vignettes covering a range of different applications.
Maintained by Kristian Hovde Liland. Last updated 2 months ago.
8.6 match 14 stars 6.68 score 19 scriptspaul-buerkner
brms:Bayesian Regression Models using 'Stan'
Fit Bayesian generalized (non-)linear multivariate multilevel models using 'Stan' for full Bayesian inference. A wide range of distributions and link functions are supported, allowing users to fit -- among others -- linear, robust linear, count data, survival, response times, ordinal, zero-inflated, hurdle, and even self-defined mixture models all in a multilevel context. Further modeling options include both theory-driven and data-driven non-linear terms, auto-correlation structures, censoring and truncation, meta-analytic standard errors, and quite a few more. In addition, all parameters of the response distribution can be predicted in order to perform distributional regression. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their prior knowledge. Models can easily be evaluated and compared using several methods assessing posterior or prior predictions. References: Bürkner (2017) <doi:10.18637/jss.v080.i01>; Bürkner (2018) <doi:10.32614/RJ-2018-017>; Bürkner (2021) <doi:10.18637/jss.v100.i05>; Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>.
Maintained by Paul-Christian Bürkner. Last updated 5 days ago.
bayesian-inferencebrmsmultilevel-modelsstanstatistical-models
3.4 match 1.3k stars 16.61 score 13k scripts 34 dependentsr-spatial
spdep:Spatial Dependence: Weighting Schemes, Statistics
A collection of functions to create spatial weights matrix objects from polygon 'contiguities', from point patterns by distance and tessellations, for summarizing these objects, and for permitting their use in spatial data analysis, including regional aggregation by minimum spanning tree; a collection of tests for spatial 'autocorrelation', including global 'Morans I' and 'Gearys C' proposed by 'Cliff' and 'Ord' (1973, ISBN: 0850860369) and (1981, ISBN: 0850860814), 'Hubert/Mantel' general cross product statistic, Empirical Bayes estimates and 'Assunção/Reis' (1999) <doi:10.1002/(SICI)1097-0258(19990830)18:16%3C2147::AID-SIM179%3E3.0.CO;2-I> Index, 'Getis/Ord' G ('Getis' and 'Ord' 1992) <doi:10.1111/j.1538-4632.1992.tb00261.x> and multicoloured join count statistics, 'APLE' ('Li 'et al.' ) <doi:10.1111/j.1538-4632.2007.00708.x>, local 'Moran's I', 'Gearys C' ('Anselin' 1995) <doi:10.1111/j.1538-4632.1995.tb00338.x> and 'Getis/Ord' G ('Ord' and 'Getis' 1995) <doi:10.1111/j.1538-4632.1995.tb00912.x>, 'saddlepoint' approximations ('Tiefelsdorf' 2002) <doi:10.1111/j.1538-4632.2002.tb01084.x> and exact tests for global and local 'Moran's I' ('Bivand et al.' 2009) <doi:10.1016/j.csda.2008.07.021> and 'LOSH' local indicators of spatial heteroscedasticity ('Ord' and 'Getis') <doi:10.1007/s00168-011-0492-y>. The implementation of most of these measures is described in 'Bivand' and 'Wong' (2018) <doi:10.1007/s11749-018-0599-x>, with further extensions in 'Bivand' (2022) <doi:10.1111/gean.12319>. 'Lagrange' multiplier tests for spatial dependence in linear models are provided ('Anselin et al'. 1996) <doi:10.1016/0166-0462(95)02111-6>, as are 'Rao' score tests for hypothesised spatial 'Durbin' models based on linear models ('Koley' and 'Bera' 2023) <doi:10.1080/17421772.2023.2256810>. A local indicators for categorical data (LICD) implementation based on 'Carrer et al.' (2021) <doi:10.1016/j.jas.2020.105306> and 'Bivand et al.' (2017) <doi:10.1016/j.spasta.2017.03.003> was added in 1.3-7. From 'spdep' and 'spatialreg' versions >= 1.2-1, the model fitting functions previously present in this package are defunct in 'spdep' and may be found in 'spatialreg'.
Maintained by Roger Bivand. Last updated 20 days ago.
spatial-autocorrelationspatial-dependencespatial-weights
3.4 match 131 stars 16.62 score 6.0k scripts 107 dependentsgdurif
plsgenomics:PLS Analyses for Genomics
Routines for PLS-based genomic analyses, implementing PLS methods for classification with microarray data and prediction of transcription factor activities from combined ChIP-chip analysis. The >=1.2-1 versions include two new classification methods for microarray data: GSIM and Ridge PLS. The >=1.3 versions include a new classification method combining variable selection and compression in a logistic regression context: logit-SPLS; and an adaptive version of the sparse PLS.
Maintained by Ghislain Durif. Last updated 12 months ago.
10.1 match 5.55 score 140 scripts 2 dependentskhliland
plsVarSel:Variable Selection in Partial Least Squares
Interfaces and methods for variable selection in Partial Least Squares. The methods include filter methods, wrapper methods and embedded methods. Both regression and classification are supported.
Maintained by Kristian Hovde Liland. Last updated 5 days ago.
8.8 match 3 stars 6.33 score 40 scripts 4 dependentsspatstat
spatstat.utils:Utility Functions for 'spatstat'
Contains utility functions for the 'spatstat' family of packages which may also be useful for other purposes.
Maintained by Adrian Baddeley. Last updated 4 days ago.
spatial-analysisspatial-dataspatstat
4.8 match 5 stars 11.66 score 134 scripts 248 dependentsasgr
imager:Image Processing Library Based on 'CImg'
Fast image processing for images in up to 4 dimensions (two spatial dimensions, one time/depth dimension, one colour dimension). Provides most traditional image processing tools (filtering, morphology, transformations, etc.) as well as various functions for easily analysing image data using R. The package wraps 'CImg', <http://cimg.eu>, a simple, modern C++ library for image processing.
Maintained by Aaron Robotham. Last updated 29 days ago.
4.0 match 17 stars 13.62 score 2.4k scripts 45 dependentslaplacesdemonr
LaplacesDemon:Complete Environment for Bayesian Inference
Provides a complete environment for Bayesian inference using a variety of different samplers (see ?LaplacesDemon for an overview).
Maintained by Henrik Singmann. Last updated 12 months ago.
4.0 match 93 stars 13.45 score 1.8k scripts 60 dependentsr-lum
Luminescence:Comprehensive Luminescence Dating Data Analysis
A collection of various R functions for the purpose of Luminescence dating data analysis. This includes, amongst others, data import, export, application of age models, curve deconvolution, sequence analysis and plotting of equivalent dose distributions.
Maintained by Sebastian Kreutzer. Last updated 3 hours ago.
bayesian-statisticsdata-sciencegeochronologyluminescenceluminescence-datingopen-scienceoslplottingradiofluorescencetlxsygcpp
5.0 match 15 stars 10.74 score 178 scripts 8 dependentslindbrook
cholera:Amend, Augment and Aid Analysis of John Snow's Cholera Map
Amends errors, augments data and aids analysis of John Snow's map of the 1854 London cholera outbreak.
Maintained by lindbrook. Last updated 4 hours ago.
choleradata-visualizationdatasetsepidemiologyjohn-snowpublic-healthtriangulation-delaunayvoronoivoronoi-polygons
5.8 match 136 stars 9.34 score 95 scriptsmodeloriented
auditor:Model Audit - Verification, Validation, and Error Analysis
Provides an easy-to-use unified interface for creating validation plots for any model. The 'auditor' helps to avoid repetitive work consisting of writing code needed to create residual plots. These visualizations allow users to assess and compare the goodness of fit, performance, and similarity of models.
Maintained by Alicja Gosiewska. Last updated 1 year ago.
classificationerror-analysisexplainable-artificial-intelligencemachine-learningmodel-validationregression-modelsresidualsxai
6.1 match 58 stars 8.76 score 94 scripts 2 dependentssparklyr
sparklyr:R Interface to Apache Spark
R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.
Maintained by Edgar Ruiz. Last updated 8 hours ago.
apache-sparkdistributeddplyridelivymachine-learningremote-clusterssparksparklyr
3.5 match 959 stars 15.20 score 4.0k scripts 21 dependentsjslefche
piecewiseSEM:Piecewise Structural Equation Modeling
Implements piecewise structural equation modeling from a single list of structural equations, with new methods for non-linear, latent, and composite variables, standardized coefficients, query-based prediction and indirect effects. See <http://jslefche.github.io/piecewiseSEM/> for more.
Maintained by Jon Lefcheck. Last updated 9 months ago.
5.4 match 163 stars 9.85 score 452 scriptsbioc
CMA:Synthesis of microarray-based classification
This package provides a comprehensive collection of various microarray-based classification algorithms both from Machine Learning and Statistics. Variable Selection, Hyperparameter tuning, Evaluation and Comparison can be performed combined or stepwise in a user-friendly environment.
Maintained by Roman Hornung. Last updated 5 months ago.
10.4 match 5.09 score 61 scriptstopepo
caret:Classification and Regression Training
Misc functions for training and plotting classification and regression models.
Maintained by Max Kuhn. Last updated 3 months ago.
2.8 match 1.6k stars 19.24 score 61k scripts 303 dependentsepiforecasts
scoringutils:Utilities for Scoring and Assessing Predictions
Facilitate the evaluation of forecasts in a convenient framework based on data.table. It allows users to check their forecasts and diagnose issues, to visualise forecasts and missing data, to transform data before scoring, to handle missing forecasts, to aggregate scores, and to visualise the results of the evaluation. The package mostly focuses on the evaluation of probabilistic forecasts and allows evaluating several different forecast types and input formats. Find more information about the package in the Vignettes as well as in the accompanying paper, <doi:10.48550/arXiv.2205.07090>.
Maintained by Nikos Bosse. Last updated 15 days ago.
forecast-evaluationforecasting
4.6 match 52 stars 11.37 score 326 scripts 7 dependentsjuliasilge
widyr:Widen, Process, then Re-Tidy Data
Encapsulates the pattern of untidying data into a wide matrix, performing some processing, then turning it back into a tidy form. This is useful for several operations such as co-occurrence counts, correlations, or clustering that are mathematically convenient on wide matrices.
Maintained by Julia Silge. Last updated 2 years ago.
4.7 match 328 stars 11.11 score 1.7k scripts 2 dependentskurthornik
clue:Cluster Ensembles
CLUster Ensembles.
Maintained by Kurt Hornik. Last updated 4 months ago.
5.3 match 2 stars 9.85 score 496 scripts 401 dependentsaalfons
laeken:Estimation of Indicators on Social Exclusion and Poverty
Estimation of indicators on social exclusion and poverty, as well as Pareto tail modeling for empirical income distributions.
Maintained by Andreas Alfons. Last updated 1 year ago.
5.4 match 3 stars 9.57 score 300 scripts 30 dependentsjclavel
mvMORPH:Multivariate Comparative Tools for Fitting Evolutionary Models to Morphometric Data
Fits multivariate (Brownian Motion, Early Burst, ACDC, Ornstein-Uhlenbeck and Shifts) models of continuous traits evolution on trees and time series. 'mvMORPH' also proposes high-dimensional multivariate comparative tools (linear models using Generalized Least Squares and multivariate tests) based on penalized likelihood. See Clavel et al. (2015) <DOI:10.1111/2041-210X.12420>, Clavel et al. (2019) <DOI:10.1093/sysbio/syy045>, and Clavel & Morlon (2020) <DOI:10.1093/sysbio/syaa010>.
Maintained by Julien Clavel. Last updated 1 month ago.
5.4 match 17 stars 9.46 score 189 scripts 3 dependentsalexanderrobitzsch
miceadds:Some Additional Multiple Imputation Functions, Especially for 'mice'
Contains functions for multiple imputation that complement existing functionality in R. In particular, several imputation methods for the mice package (van Buuren & Groothuis-Oudshoorn, 2011, <doi:10.18637/jss.v045.i03>) are implemented. Main features of the miceadds package include plausible value imputation (Mislevy, 1991, <doi:10.1007/BF02294457>), multilevel imputation for variables at any level or with any number of hierarchical and non-hierarchical levels (Grund, Luedtke & Robitzsch, 2018, <doi:10.1177/1094428117703686>; van Buuren, 2018, Ch.7, <doi:10.1201/9780429492259>), imputation using partial least squares (PLS) for high dimensional predictors (Robitzsch, Pham & Yanagida, 2016), nested multiple imputation (Rubin, 2003, <doi:10.1111/1467-9574.00217>), substantive model compatible imputation (Bartlett et al., 2015, <doi:10.1177/0962280214521348>), and features for the generation of synthetic datasets (Reiter, 2005, <doi:10.1111/j.1467-985X.2004.00343.x>; Nowok, Raab, & Dibben, 2016, <doi:10.18637/jss.v074.i11>).
Maintained by Alexander Robitzsch. Last updated 17 days ago.
missing-datamultiple-imputationopenblascpp
5.6 match 16 stars 9.16 score 542 scripts 9 dependentsdjnavarro
lsr:Companion to "Learning Statistics with R"
A collection of tools intended to make introductory statistics easier to teach, including wrappers for common hypothesis tests and basic data manipulation. It accompanies Navarro, D. J. (2015). Learning Statistics with R: A Tutorial for Psychology Students and Other Beginners, Version 0.6.
Maintained by Danielle Navarro. Last updated 3 years ago.
5.3 match 12 stars 9.55 score 1.7k scripts 11 dependentsthomasp85
ggraph:An Implementation of Grammar of Graphics for Graphs and Networks
The grammar of graphics as implemented in ggplot2 is a poor fit for graph and network visualizations due to its reliance on tabular data input. ggraph is an extension of the ggplot2 API tailored to graph visualizations and provides the same flexible approach to building up plots layer by layer.
Maintained by Thomas Lin Pedersen. Last updated 1 year ago.
ggplot-extensionggplot2graph-visualizationnetwork-visualizationvisualizationcpp
3.0 match 1.1k stars 16.96 score 9.2k scripts 111 dependentsstatdivlab
corncob:Count Regression for Correlated Observations with the Beta-Binomial
Statistical modeling for correlated count data using the beta-binomial distribution, described in Martin et al. (2020) <doi:10.1214/19-AOAS1283>. It allows for both mean and overdispersion covariates.
Maintained by Amy D Willis. Last updated 1 day ago.
5.2 match 106 stars 9.82 score 248 scripts 1 dependentsmoviedo5
fda.usc:Functional Data Analysis and Utilities for Statistical Computing
Routines for exploratory and descriptive analysis of functional data such as depth measurements, atypical curves detection, regression models, supervised classification, unsupervised classification and functional analysis of variance.
Maintained by Manuel Oviedo de la Fuente. Last updated 4 months ago.
functional-data-analysisfortran
5.2 match 12 stars 9.72 score 560 scripts 22 dependentsyihui
animation:A Gallery of Animations in Statistics and Utilities to Create Animations
Provides functions for animations in statistics, covering topics in probability theory, mathematical statistics, multivariate statistics, non-parametric statistics, sampling survey, linear models, time series, computational statistics, data mining and machine learning. These functions may be helpful in teaching statistics and data analysis. Also provided in this package are a series of functions to save animations to various formats, e.g. Flash, 'GIF', HTML pages, 'PDF' and videos. 'PDF' animations can be inserted into 'Sweave' / 'knitr' easily.
Maintained by Yihui Xie. Last updated 2 years ago.
animationstatistical-computingstatistical-graphicsstatistics
4.1 match 208 stars 12.08 score 2.5k scripts 29 dependentsrvlenth
lsmeans:Least-Squares Means
Obtain least-squares means for linear, generalized linear, and mixed models. Compute contrasts or linear functions of least-squares means, and comparisons of slopes. Plots and compact letter displays. Least-squares means were proposed in Harvey, W (1960) "Least-squares analysis of data with unequal subclass numbers", Tech Report ARS-20-8, USDA National Agricultural Library, and discussed further in Searle, Speed, and Milliken (1980) "Population marginal means in the linear model: An alternative to least squares means", The American Statistician 34(4), 216-221 <doi:10.1080/00031305.1980.10483031>. NOTE: lsmeans now relies primarily on code in the 'emmeans' package. 'lsmeans' will be archived in the near future.
Maintained by Russell Lenth. Last updated 6 years ago.
6.4 match 12 stars 7.82 score 1.8k scriptsjensharbers
agricolaeplotr:Visualization of Design of Experiments from the 'agricolae' Package
Visualization of Design of Experiments from the 'agricolae' package with the 'ggplot2' framework. The user provides an experiment design from the 'agricolae' package, calls the corresponding function and receives a visualization with 'ggplot2' based functions that are specific to each design. As there are many different designs, each design is tested on its type. The output can be modified with standard 'ggplot2' commands or with other packages with 'ggplot2' function extensions.
Maintained by Jens Harbers. Last updated 2 months ago.
7.8 match 8 stars 6.27 score 78 scriptsr-forge
survey:Analysis of Complex Survey Samples
Summary statistics, two-sample tests, rank tests, generalised linear models, cumulative link models, Cox models, loglinear models, and general maximum pseudolikelihood estimation for multistage stratified, cluster-sampled, unequally weighted survey samples. Variances by Taylor series linearisation or replicate weights. Post-stratification, calibration, and raking. Two-phase and multiphase subsampling designs. Graphics. PPS sampling without replacement. Small-area estimation. Dual-frame designs.
Maintained by Thomas Lumley. Last updated 6 months ago.
3.5 match 1 stars 13.93 score 13k scripts 235 dependentsbioboot
bio3d:Biological Structure Analysis
Utilities to process, organize and explore protein structure, sequence and dynamics data. Features include the ability to read and write structure, sequence and dynamic trajectory data, perform sequence and structure database searches, data summaries, atom selection, alignment, superposition, rigid core identification, clustering, torsion analysis, distance matrix analysis, structure and sequence conservation analysis, normal mode analysis, principal component analysis of heterogeneous structure data, and correlation network analysis from normal mode and molecular dynamics data. In addition, various utility functions are provided to enable the statistical and graphical power of the R environment to work with biological sequence and structural data. Please refer to the URLs below for more information.
Maintained by Barry Grant. Last updated 5 months ago.
5.7 match 5 stars 8.49 score 1.4k scripts 10 dependentsfloschuberth
cSEM:Composite-Based Structural Equation Modeling
Estimate, assess, test, and study linear, nonlinear, hierarchical and multigroup structural equation models using composite-based approaches and procedures, including estimation techniques such as partial least squares path modeling (PLS-PM) and its derivatives (PLSc, ordPLSc, robustPLSc), generalized structured component analysis (GSCA), generalized structured component analysis with uniqueness terms (GSCAm), generalized canonical correlation analysis (GCCA), principal component analysis (PCA), factor score regression (FSR) using sum score, regression or Bartlett scores (including bias correction using Croon's approach), as well as several tests and typical postestimation procedures (e.g., verify admissibility of the estimates, assess the model fit, test the model fit etc.).
Maintained by Florian Schuberth. Last updated 10 hours ago.
5.1 match 28 stars 9.22 score 56 scripts 2 dependentsrstudio
tfprobability:Interface to 'TensorFlow Probability'
Interface to 'TensorFlow Probability', a 'Python' library built on 'TensorFlow' that makes it easy to combine probabilistic models and deep learning on modern hardware ('TPU', 'GPU'). 'TensorFlow Probability' includes a wide selection of probability distributions and bijectors, probabilistic layers, variational inference, Markov chain Monte Carlo, and optimizers such as Nelder-Mead, BFGS, and SGLD.
Maintained by Tomasz Kalinowski. Last updated 3 years ago.
5.5 match 54 stars 8.63 score 221 scripts 3 dependentsjeksterslab
betaSandwich:Robust Confidence Intervals for Standardized Regression Coefficients
Generates robust confidence intervals for standardized regression coefficients using heteroskedasticity-consistent standard errors for models fitted by lm() as described in Dudgeon (2017) <doi:10.1007/s11336-017-9563-z>. The package can also be used to generate confidence intervals for R-squared, adjusted R-squared, and differences of standardized regression coefficients. A description of the package and code examples are presented in Pesigan, Sun, and Cheung (2023) <doi:10.1080/00273171.2023.2201277>.
Maintained by Ivan Jacob Agaloos Pesigan. Last updated 2 months ago.
confidence-intervalsheteroskedasticity-consistent-standard-errorsstandardized-regression-coefficients
11.5 match 4.11 score 16 scriptscran
SPSL:Site Percolation on Square Lattices (SPSL)
Provides basic functionality for labeling iso- & anisotropic percolation clusters on 2D & 3D square lattices with various lattice sizes, occupation probabilities, von Neumann & Moore (1,d)-neighborhoods, and random variables weighting the percolation lattice sites.
Maintained by Pavel V. Moskalev. Last updated 6 years ago.
32.0 match 1.48 score 1 dependentscran
MuMIn:Multi-Model Inference
Tools for model selection and model averaging with support for a wide range of statistical models. Automated model selection through subsets of the maximum model, with optional constraints for model inclusion. Averaging of model parameters and predictions based on model weights derived from information criteria (AICc and alike) or custom model weighting schemes.
Maintained by Kamil Bartoń. Last updated 9 months ago.
5.3 match 8 stars 8.84 score 5.6k scripts 27 dependentsgastonstat
plspm:Tools for Partial Least Squares Path Modeling (PLS-PM)
Partial Least Squares Path Modeling (PLS-PM) analysis for both metric and non-metric data, as well as REBUS analysis for latent class detection.
Maintained by Gaston Sanchez. Last updated 3 years ago.
6.8 match 67 stars 6.97 score 115 scriptscollinerickson
GauPro:Gaussian Process Fitting
Fits a Gaussian process model to data. Gaussian processes are commonly used in computer experiments to fit an interpolating model. The model is stored as an 'R6' object and can be easily updated with new data. There are options to run in parallel, and 'Rcpp' has been used to speed up calculations. For more info about Gaussian process software, see Erickson et al. (2018) <doi:10.1016/j.ejor.2017.10.002>.
Maintained by Collin Erickson. Last updated 2 days ago.
5.6 match 16 stars 8.44 score 104 scripts 1 dependentsphilchalmers
SimDesign:Structure for Organizing Monte Carlo Simulation Designs
Provides tools to safely and efficiently organize and execute Monte Carlo simulation experiments in R. The package controls the structure and back-end of Monte Carlo simulation experiments by utilizing a generate-analyse-summarise workflow. The workflow safeguards against common simulation coding issues, such as automatically re-simulating non-convergent results, prevents inadvertently overwriting simulation files, catches error and warning messages during execution, implicitly supports parallel processing with high-quality random number generation, and provides tools for managing high-performance computing (HPC) array jobs submitted to schedulers such as SLURM. For a pedagogical introduction to the package see Sigal and Chalmers (2016) <doi:10.1080/10691898.2016.1246953>. For a more in-depth overview of the package and its design philosophy see Chalmers and Adkins (2020) <doi:10.20982/tqmp.16.4.p248>.
Maintained by Phil Chalmers. Last updated 1 day ago.
monte-carlo-simulationsimulationsimulation-framework
3.5 match 62 stars 13.38 score 253 scripts 46 dependentsbioc
TDbasedUFE:Tensor Decomposition Based Unsupervised Feature Extraction
This is a comprehensive package to perform Tensor decomposition based unsupervised feature extraction. It can perform unsupervised feature extraction. It uses tensor decomposition. It is applicable to gene expression, DNA methylation, and histone modification etc. It can perform multiomics analysis. It is also potentially applicable to single cell omics data sets.
Maintained by Y-h. Taguchi. Last updated 5 months ago.
geneexpressionfeatureextractionmethylationarraysinglecellbioinformaticsdna-methylationgene-expression-profileshistone-modificationsmultiomicstensor-decomposition
8.5 match 5 stars 5.48 score 9 scripts 1 dependentsfriendly
heplots:Visualizing Hypothesis Tests in Multivariate Linear Models
Provides HE plot and other functions for visualizing hypothesis tests in multivariate linear models. HE plots represent sums-of-squares-and-products matrices for linear hypotheses and for error using ellipses (in two dimensions) and ellipsoids (in three dimensions). The related 'candisc' package provides visualizations in a reduced-rank canonical discriminant space when there are more than a few response variables.
Maintained by Michael Friendly. Last updated 11 days ago.
linear-hypothesesmatricesmultivariate-linear-modelsplotrepeated-measure-designsvisualizing-hypothesis-tests
4.0 match 9 stars 11.49 score 1.1k scripts 7 dependentsbiomodhub
biomod2:Ensemble Platform for Species Distribution Modeling
Functions for species distribution modeling, calibration and evaluation, ensemble of models, ensemble forecasting and visualization. The package can consistently run up to 10 single models on a presence/absence (or presence/pseudo-absence) dataset and combine them into ensemble models and ensemble projections. A number of other evaluation and visualisation tools are also available within the package.
Maintained by Maya Guéguen. Last updated 1 day ago.
3.3 match 95 stars 13.90 score 536 scripts 7 dependentscran
irrICC:Intraclass Correlations for Quantifying Inter-Rater Reliability
Calculates various intraclass correlation coefficients used to quantify inter-rater and intra-rater reliability. The assumption here is that the raters produced quantitative ratings. Most of the statistical procedures implemented in this package are described in detail in Gwet, K.L. (2014, ISBN:978-0970806284): "Handbook of Inter-Rater Reliability," 4th edition, Advanced Analytics, LLC.
Maintained by Kilem L. Gwet. Last updated 5 years ago.
15.2 match 3.00 scorejmcurran
Hotelling:Hotelling's T^2 Test and Variants
A set of R functions which implements Hotelling's T^2 test and some variants of it. Functions are also included for Aitchison's additive log ratio and centred log ratio transformations.
Maintained by James Curran. Last updated 4 years ago.
6.7 match 2 stars 6.78 score 139 scripts 3 dependentsr-forge
robustbase:Basic Robust Statistics
"Essential" Robust Statistics. Tools allowing to analyze data with robust methods. This includes regression methodology including model selections and multivariate statistics where we strive to cover the book "Robust Statistics, Theory and Methods" by 'Maronna, Martin and Yohai'; Wiley 2006.
Maintained by Martin Maechler. Last updated 4 months ago.
3.4 match 13.33 score 1.7k scripts 480 dependentsr-forge
expm:Matrix Exponential, Log, 'etc'
Computation of the matrix exponential, logarithm, sqrt, and related quantities, using traditional and modern methods.
Maintained by Martin Maechler. Last updated 5 months ago.
3.8 match 11.91 score 1.3k scripts 432 dependentsfbertran
plsRcox:Partial Least Squares Regression for Cox Models and Related Techniques
Provides Partial least squares Regression and various regular, sparse or kernel, techniques for fitting Cox models in high dimensional settings <doi:10.1093/bioinformatics/btu660>, Bastien, P., Bertrand, F., Meyer N., Maumy-Bertrand, M. (2015), Deviance residuals-based sparse PLS and sparse kernel PLS regression for censored data, Bioinformatics, 31(3):397-404. Cross validation criteria were studied in <arXiv:1810.02962>, Bertrand, F., Bastien, Ph. and Maumy-Bertrand, M. (2018), Cross validating extensions of kernel, sparse or regular partial least squares regression models to censored data.
Maintained by Frederic Bertrand. Last updated 2 years ago.
8.7 match 4 stars 5.13 score 56 scripts 2 dependentsfaosorios
fastmatrix:Fast Computation of some Matrices Useful in Statistics
A small set of functions for fast computation of some matrices and operations useful in statistics and econometrics. Currently, there are functions for efficient computation of duplication, commutation and symmetrizer matrices with minimal storage requirements. Some commonly used matrix decompositions (LU and LDL), basic matrix operations (for instance, Hadamard, Kronecker products and the Sherman-Morrison formula) and iterative solvers for linear systems are also available. In addition, the package includes a number of common statistical procedures such as the sweep operator, weighted mean and covariance matrix using an online algorithm, linear regression (using Cholesky, QR, SVD, sweep operator and conjugate gradients methods), ridge regression (with optimal selection of the ridge parameter considering several procedures), omnibus tests for univariate normality, functions to compute the multivariate skewness, kurtosis, the Mahalanobis distance (checking the positive definiteness), and the Wilson-Hilferty transformation of gamma variables. Furthermore, the package provides interfaces to C code callable by another C code from other R packages.
Maintained by Felipe Osorio. Last updated 1 year ago.
commutation-matrixjarque-bera-testldl-factorizationlu-factorizationmatrix-api-for-r-packagesmatrix-normsmodified-choleskyols-regressionpower-methodridge-regressionsherman-morrisonstatisticssweep-operatorsymmetrizer-matrixfortranopenblas
7.1 match 19 stars 6.27 score 37 scripts 10 dependentsspatstat
spatstat.data:Datasets for 'spatstat' Family
Contains all the datasets for the 'spatstat' family of packages.
Maintained by Adrian Baddeley. Last updated 2 days ago.
kernel-densitypoint-processspatial-analysisspatial-dataspatial-data-analysisspatstatstatistical-analysisstatistical-methodsstatistical-testsstatistics
4.0 match 6 stars 11.07 score 186 scripts 228 dependentskevhuy
WALS:Weighted-Average Least Squares Model Averaging
Implements Weighted-Average Least Squares model averaging for negative binomial regression models of Huynh (2024) <doi:10.48550/arXiv.2404.11324>, generalized linear models of De Luca, Magnus, Peracchi (2018) <doi:10.1016/j.jeconom.2017.12.007> and linear regression models of Magnus, Powell, Pruefer (2010) <doi:10.1016/j.jeconom.2009.07.004>, see also Magnus, De Luca (2016) <doi:10.1111/joes.12094>. Weighted-Average Least Squares for the linear regression model is based on the original 'MATLAB' code by Magnus and De Luca <https://www.janmagnus.nl/items/WALS.pdf>, see also Kumar, Magnus (2013) <doi:10.1007/s13571-013-0060-9> and De Luca, Magnus (2011) <doi:10.1177/1536867X1201100402>.
Maintained by Kevin Huynh. Last updated 9 months ago.
13.9 match 1 stars 3.18 score 1 scriptsrrwen
draw:Wrapper Functions for Producing Graphics
A set of user-friendly wrapper functions for creating consistent graphics and diagrams with lines, common shapes, text, and page settings. Compatible with and based on the R 'grid' package.
Maintained by Richard Wen. Last updated 7 years ago.
boxcirclecurvediagramdrawgraphicsgridlinepagerectanglereproducibleshapesquaretexttriangle
10.0 match 2 stars 4.39 score 35 scriptsrudjer
SparseM:Sparse Linear Algebra
Some basic linear algebra functionality for sparse matrices is provided: including Cholesky decomposition and backsolving as well as standard R subsetting and Kronecker products.
Maintained by Roger Koenker. Last updated 8 months ago.
3.8 match 3 stars 11.47 score 306 scripts 1.5k dependentsk3jph
cmna:Computational Methods for Numerical Analysis
Provides the source and examples for James P. Howard, II, "Computational Methods for Numerical Analysis with R," <https://jameshoward.us/cmna/>, a book on numerical methods in R.
Maintained by James Howard. Last updated 4 years ago.
bisectiondifferential-equationsheat-equationinterpolationleast-squaresmatrix-factorizationmonte-carlonewtonnumerical-analysisoptimizationpartial-differential-equationsquadratureroot-findingsecantsplinestestthattraveling-salespersonwave-equation
7.5 match 16 stars 5.65 score 62 scripts 3 dependentstrevorhastie
glmnet:Lasso and Elastic-Net Regularized Generalized Linear Models
Extremely efficient procedures for fitting the entire lasso or elastic-net regularization path for linear regression, logistic and multinomial regression models, Poisson regression, Cox model, multiple-response Gaussian, and the grouped multinomial regression; see <doi:10.18637/jss.v033.i01> and <doi:10.18637/jss.v039.i05>. There are two new and important additions. The family argument can be a GLM family object, which opens the door to any programmed family (<doi:10.18637/jss.v106.i01>). This comes with a modest computational cost, so when the built-in families suffice, they should be used instead. The other novelty is the relax option, which refits each of the active sets in the path unpenalized. The algorithm uses cyclical coordinate descent in a path-wise fashion, as described in the papers cited.
Maintained by Trevor Hastie. Last updated 2 years ago.
2.8 match 82 stars 15.15 score 22k scripts 736 dependentscmollica
PLMIX:Bayesian Analysis of Finite Mixture of Plackett-Luce Models
Fit finite mixtures of Plackett-Luce models for partial top rankings/orderings within the Bayesian framework. It provides MAP point estimates via EM algorithm and posterior MCMC simulations via Gibbs Sampling. It also fits MLE as a special case of the noninformative Bayesian analysis with vague priors. In addition to inferential techniques, the package assists other fundamental phases of a model-based analysis for partial rankings/orderings, by including functions for data manipulation, simulation, descriptive summary, model selection and goodness-of-fit evaluation. Main references on the methods are Mollica and Tardella (2017) <doi:10.1007/s11336-016-9530-0> and Mollica and Tardella (2014) <doi:10.1002/sim.6224>.
Maintained by Cristina Mollica. Last updated 4 years ago.
13.4 match 3.15 score 28 scriptsludovikcoba
rrecsys:Environment for Evaluating Recommender Systems
Processes standard recommendation datasets (e.g., a user-item rating matrix) as input and generates rating predictions and lists of recommended items. Standard algorithm implementations which are included in this package are the following: Global/Item/User-Average baselines, Weighted Slope One, Item-Based KNN, User-Based KNN, FunkSVD, BPR and weighted ALS. They can be assessed according to the standard offline evaluation methodology (Shani et al. (2011) <doi:10.1007/978-0-387-85820-3_8>) for recommender systems using measures such as MAE, RMSE, Precision, Recall, F1, AUC, NDCG, RankScore and coverage measures. The package (Coba et al. (2017) <doi:10.1007/978-3-319-60042-0_36>) is intended for rapid prototyping of recommendation algorithms and educational purposes.
Maintained by Ludovik Çoba. Last updated 3 years ago.
6.1 match 23 stars 6.84 score 25 scripts
bioxgeo
geodiv:Methods for Calculating Gradient Surface Metrics
Methods for calculating gradient surface metrics for continuous analysis of landscape features.
Maintained by Annie C. Smith. Last updated 1 year ago.
7.1 match 11 stars 5.88 score 23 scripts 1 dependents
stan-dev
rstanarm:Bayesian Applied Regression Modeling via Stan
Estimates previously compiled regression models using the 'rstan' package, which provides the R interface to the Stan C++ library for Bayesian estimation. Users specify models via the customary R syntax with a formula and data.frame plus some additional arguments for priors.
Maintained by Ben Goodrich. Last updated 9 months ago.
bayesian bayesian-data-analysis bayesian-inference bayesian-methods bayesian-statistics multilevel-models rstan rstanarm stan statistical-modeling cpp
2.7 match 393 stars 15.68 score 5.0k scripts 13 dependents
radiant-rstats
radiant.data:Data Menu for Radiant: Business Analytics using R and Shiny
The Radiant Data menu includes interfaces for loading, saving, viewing, visualizing, summarizing, transforming, and combining data. It also contains functionality to generate reproducible reports of the analyses conducted in the application.
Maintained by Vincent Nijs. Last updated 5 months ago.
5.0 match 54 stars 8.30 score 146 scripts 6 dependents
rvaradhan
SQUAREM:Squared Extrapolation Methods for Accelerating EM-Like Monotone Algorithms
Algorithms for accelerating the convergence of slow, monotone sequences from smooth, contraction mapping such as the EM algorithm. It can be used to accelerate any smooth, linearly convergent acceleration scheme. A tutorial style introduction to this package is available in a vignette on the CRAN download page or, when the package is loaded in an R session, with vignette("SQUAREM"). Refer to the J Stat Software article: <doi:10.18637/jss.v092.i07>.
Maintained by Ravi Varadhan. Last updated 4 years ago.
4.5 match 2 stars 9.26 score 84 scripts 502 dependents
pecanproject
PEcAn.benchmark:PEcAn Functions Used for Benchmarking
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. The PEcAn.benchmark package provides utilities for comparing models and data, including a suite of statistical metrics and plots.
Maintained by Mike Dietze. Last updated 4 hours ago.
bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants
3.9 match 216 stars 10.70 score 416 scripts 11 dependents
spatstat
spatstat.explore:Exploratory Data Analysis for the 'spatstat' Family
Functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.
Maintained by Adrian Baddeley. Last updated 2 days ago.
cluster-detection confidence-intervals hypothesis-testing k-function roc-curves scan-statistics significance-testing simulation-envelopes spatial-analysis spatial-data-analysis spatial-sharpening spatial-smoothing spatial-statistics
4.1 match 1 stars 10.18 score 67 scripts 149 dependents
geomorphr
geomorph:Geometric Morphometric Analyses of 2D and 3D Landmark Data
Read, manipulate, and digitize landmark data, generate shape variables via Procrustes analysis for points, curves and surfaces, perform shape analyses, and provide graphical depictions of shapes and patterns of shape variation.
Maintained by Dean Adams. Last updated 1 month ago.
3.4 match 76 stars 12.05 score 700 scripts 6 dependents
cran
wavethresh:Wavelets Statistics and Transforms
Performs 1, 2 and 3D real and complex-valued wavelet transforms, nondecimated transforms, wavelet packet transforms, nondecimated wavelet packet transforms, multiple wavelet transforms, complex-valued wavelet transforms, wavelet shrinkage for various kinds of data, locally stationary wavelet time series, nonstationary multiscale transfer function modeling, density estimation.
Maintained by Guy Nason. Last updated 7 months ago.
7.0 match 5.89 score 41 dependents
homerhanumat
tigerstats:R Functions for Elementary Statistics
A collection of data sets and functions that are useful in the teaching of statistics at an elementary level to students who may have little or no previous experience with the command line. The functions for elementary inferential procedures follow a uniform interface for user input. Some of the functions are instructional applets that can only be run in the RStudio integrated development environment with package 'manipulate' installed. Other instructional applets are Shiny apps that may be run locally. In teaching, the package is used alongside the packages 'mosaic', 'mosaicData' and 'abd', which are therefore listed as dependencies.
Maintained by Homer White. Last updated 4 years ago.
7.1 match 16 stars 5.77 score 327 scripts
bioc
ropls:PCA, PLS(-DA) and OPLS(-DA) for multivariate analysis and feature selection of omics data
Latent variable modeling with Principal Component Analysis (PCA) and Partial Least Squares (PLS) is a powerful approach to visualization, regression, classification, and feature selection of omics data where the number of variables exceeds the number of samples and the variables are multicollinear. Orthogonal Partial Least Squares (OPLS) makes it possible to model separately the variation correlated with the factor of interest (predictive) and the uncorrelated (orthogonal) variation. While performing similarly to PLS, OPLS facilitates interpretation. Successful applications of these chemometrics techniques include spectroscopic data such as Raman spectroscopy, nuclear magnetic resonance (NMR), mass spectrometry (MS) in metabolomics and proteomics, but also transcriptomics data. In addition to scores, loadings and weights plots, the package provides metrics and graphics to determine the optimal number of components (e.g. with the R2 and Q2 coefficients), check the validity of the model by permutation testing, detect outliers, and perform feature selection (e.g. with Variable Importance in Projection or regression coefficients). The package can be accessed via a user interface on the Workflow4Metabolomics.org online resource for computational metabolomics (built upon the Galaxy environment).
Maintained by Etienne A. Thevenot. Last updated 5 months ago.
regression classification principalcomponent transcriptomics proteomics metabolomics lipidomics massspectrometry immunooncology
5.4 match 7.55 score 210 scripts 8 dependents
mobiodiv
mobsim:Spatial Simulation and Scale-Dependent Analysis of Biodiversity Changes
Simulation, analysis and sampling of spatial biodiversity data (May, Gerstner, McGlinn, Xiao & Chase 2017) <doi:10.1111/2041-210x.12986>. In the simulation tools, users define the numbers of species and individuals, the species abundance distribution and species aggregation. Functions for analysis include species rarefaction and accumulation curves, species-area relationships and the distance decay of similarity.
Maintained by Felix May. Last updated 3 months ago.
biodiversity macroecology point-pattern-analysis rarefaction simulation species species-abundance-distributions cpp
5.2 match 20 stars 7.84 score 76 scripts
rsquaredacademy
inferr:Inferential Statistics
A select set of parametric and non-parametric statistical tests. 'inferr' builds upon the solid set of statistical tests provided in the 'stats' package by including additional data types as inputs and by expanding and restructuring the test results. The tests included are t tests, variance tests, proportion tests, chi square tests, Levene's test, McNemar Test, Cochran's Q test and Runs test.
Maintained by Aravind Hebbali. Last updated 4 months ago.
inference inferential-statistics non-parametric parametric statistical-tests cpp
6.6 match 37 stars 6.10 score 34 scripts
sigbertklinke
exams.forge:Support for Compiling Examination Tasks using the 'exams' Package
The main aim is to further facilitate the creation of exercises based on the package 'exams' by Grün, B., and Zeileis, A. (2009) <doi:10.18637/jss.v029.i10>. Creating effective student exercises involves challenges such as creating appropriate data sets and ensuring access to intermediate values for accurate explanation of solutions. The functionality includes the generation of univariate and bivariate data including simple time series, functions for theoretical distributions and their approximation, statistical and mathematical calculations for tasks in basic statistics courses as well as general tasks such as string manipulation, LaTeX/HTML formatting and the editing of XML task files for 'Moodle'.
Maintained by Sigbert Klinke. Last updated 8 months ago.
15.0 match 2.70 score 1 scripts
mwheymans
psfmi:Prediction Model Pooling, Selection and Performance Evaluation Across Multiply Imputed Datasets
Pooling, backward and forward selection of linear, logistic and Cox regression models in multiply imputed datasets. Backward and forward selection can be done from the pooled model using Rubin's Rules (RR), the D1, D2, D3, D4 and the median p-values method. This is also possible for mixed models. The models can contain continuous, dichotomous, categorical and restricted cubic spline predictors and interaction terms between all these types of predictors. The stability of the models can be evaluated using (cluster) bootstrapping. The package further contains functions to pool model performance measures such as ROC/AUC, Reclassification, R-squared, scaled Brier score, H&L test and calibration plots for logistic regression models. Internal validation can be done across multiply imputed datasets with cross-validation or bootstrapping. The adjusted intercept after shrinkage of pooled regression coefficients can be obtained. Backward and forward selection as part of internal validation is possible. A function to externally validate logistic prediction models in multiply imputed datasets is available, as is a function to compare models. For Cox models a strata variable can be included. Eekhout (2017) <doi:10.1186/s12874-017-0404-7>. Wiel (2009) <doi:10.1093/biostatistics/kxp011>. Marshall (2009) <doi:10.1186/1471-2288-9-57>.
Maintained by Martijn Heymans. Last updated 2 years ago.
cox-regression imputation imputed-datasets logistic multiple-imputation pool predictor regression selection spline spline-predictors
5.6 match 10 stars 7.17 score 70 scripts
gi0na
ghypernet:Fit and Simulate Generalised Hypergeometric Ensembles of Graphs
Provides functions for model fitting and selection of generalised hypergeometric ensembles of random graphs (gHypEG). To learn how to use it, check the vignettes for a quick tutorial. Please reference its use as Casiraghi, G., Nanumyan, V. (2019) <doi:10.5281/zenodo.2555300> together with the relevant references from those listed below. The package is based on the research developed at the Chair of Systems Design, ETH Zurich. Casiraghi, G., Nanumyan, V., Scholtes, I., Schweitzer, F. (2016) <arXiv:1607.02441>. Casiraghi, G., Nanumyan, V., Scholtes, I., Schweitzer, F. (2017) <doi:10.1007/978-3-319-67256-4_11>. Casiraghi, G. (2017) <arXiv:1702.02048>. Brandenberger, L., Casiraghi, G., Nanumyan, V., Schweitzer, F. (2019) <doi:10.1145/3341161.3342926>. Casiraghi, G. (2019) <doi:10.1007/s41109-019-0241-1>. Casiraghi, G., Nanumyan, V. (2021) <doi:10.1038/s41598-021-92519-y>. Casiraghi, G. (2021) <doi:10.1088/2632-072X/ac0493>.
Maintained by Giona Casiraghi. Last updated 11 months ago.
data-mining data-science graphs network network-analysis random-graph-generation random-graphs
7.0 match 8 stars 5.68 score 20 scripts
chrisaberson
pwr2ppl:Power Analyses for Common Designs (Power to the People)
Statistical power analysis for designs including t-tests, correlations, multiple regression, ANOVA, mediation, and logistic regression. Functions accompany Aberson (2019) <doi:10.4324/9781315171500>.
Maintained by Chris Aberson. Last updated 3 years ago.
9.5 match 17 stars 4.16 score 17 scripts
twolodzko
extraDistr:Additional Univariate and Multivariate Distributions
Density, distribution function, quantile function and random generation for a number of univariate and multivariate distributions. This package implements the following distributions: Bernoulli, beta-binomial, beta-negative binomial, beta prime, Bhattacharjee, Birnbaum-Saunders, bivariate normal, bivariate Poisson, categorical, Dirichlet, Dirichlet-multinomial, discrete gamma, discrete Laplace, discrete normal, discrete uniform, discrete Weibull, Frechet, gamma-Poisson, generalized extreme value, Gompertz, generalized Pareto, Gumbel, half-Cauchy, half-normal, half-t, Huber density, inverse chi-squared, inverse-gamma, Kumaraswamy, Laplace, location-scale t, logarithmic, Lomax, multivariate hypergeometric, multinomial, negative hypergeometric, non-standard beta, normal mixture, Poisson mixture, Pareto, power, reparametrized beta, Rayleigh, shifted Gompertz, Skellam, slash, triangular, truncated binomial, truncated normal, truncated Poisson, Tukey lambda, Wald, zero-inflated binomial, zero-inflated negative binomial, zero-inflated Poisson.
Maintained by Tymoteusz Wolodzko. Last updated 13 days ago.
c-plus-plus c-plus-plus-11 distribution multivariate-distributions probability random-generation rcpp statistics cpp
3.4 match 53 stars 11.60 score 1.5k scripts 107 dependents
winvector
vtreat:A Statistically Sound 'data.frame' Processor/Conditioner
A 'data.frame' processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. 'vtreat' prepares variables so that data has fewer exceptional cases, making it easier to safely use models in production. Common problems 'vtreat' defends against: 'Inf', 'NA', too many categorical levels, rare categorical levels, and new categorical levels (levels seen during application, but not during training). Reference: "'vtreat': a data.frame Processor for Predictive Modeling", Zumel, Mount, 2016, <DOI:10.5281/zenodo.1173313>.
Maintained by John Mount. Last updated 2 months ago.
categorical-variables machine-learning-algorithms nested-models prepare-data
3.5 match 285 stars 11.19 score 328 scripts 1 dependents
r-forge
isotone:Active Set and Generalized PAVA for Isotone Optimization
Contains two main functions: one for solving general isotone regression problems using the pool-adjacent-violators algorithm (PAVA); another one provides a framework for active set methods for isotone optimization problems with arbitrary order restrictions. Various types of loss functions are prespecified.
Maintained by Patrick Mair. Last updated 3 months ago.
5.7 match 6.88 score 80 scripts 13 dependents
robinhankin
ResistorArray:Electrical Properties of Resistor Networks
Electrical properties of resistor networks using matrix methods.
Maintained by Robin K. S. Hankin. Last updated 1 year ago.
9.0 match 4.32 score 14 scripts 1 dependents
vegandevs
vegan:Community Ecology Package
Ordination methods, diversity analysis and other functions for community and vegetation ecologists.
Maintained by Jari Oksanen. Last updated 18 days ago.
ecological-modelling ecology ordination fortran openblas
2.0 match 472 stars 19.41 score 15k scripts 440 dependents
mdplot
MDplot:Visualising Molecular Dynamics Analyses
Provides automatisation for plot generation succeeding common molecular dynamics analyses. This includes straightforward plots, such as RMSD (Root-Mean-Square-Deviation) and RMSF (Root-Mean-Square-Fluctuation), but also more sophisticated ones such as dihedral angle maps, hydrogen bonds, cluster bar plots and DSSP (Definition of Secondary Structure of Proteins) analysis. Currently able to load GROMOS, GROMACS and AMBER formats.
Maintained by Christian Margreitter. Last updated 3 years ago.
6.0 match 27 stars 6.46 score 36 scripts
jmsigner
amt:Animal Movement Tools
Manage and analyze animal movement data. The functionality of 'amt' includes methods to calculate home ranges, track statistics (e.g. step lengths, speed, or turning angles), prepare data for fitting habitat selection analyses, and simulation of space-use from fitted step-selection functions.
Maintained by Johannes Signer. Last updated 4 months ago.
3.7 match 41 stars 10.54 score 418 scripts
zdebruine
RcppML:Rcpp Machine Learning Library
Fast machine learning algorithms including matrix factorization and divisive clustering for large sparse and dense matrices.
Maintained by Zach DeBruine. Last updated 2 years ago.
clustering matrix-factorization nmf rcpp rcppeigen sparse-matrix cpp openmp
3.7 match 104 stars 10.53 score 125 scripts 46 dependents
symbolixau
h3r:Hexagonal Hierarchical Geospatial Indexing System
Provides access to Uber's 'H3' geospatial indexing system via 'h3lib' <https://CRAN.R-project.org/package=h3lib>. 'h3r' is designed to mimic the 'H3' Application Programming Interface (API) <https://h3geo.org/docs/api/indexing/>, so that any function in the API is also available in 'h3r'.
Maintained by David Cooley. Last updated 3 months ago.
8.6 match 5 stars 4.52 score 33 scripts
optad
adoptr:Adaptive Optimal Two-Stage Designs
Optimize one or two-arm, two-stage designs for clinical trials with respect to several implemented objective criteria or custom objectives. Optimization under uncertainty and conditional (given stage-one outcome) constraints are supported. See Pilz et al. (2019) <doi:10.1002/sim.8291> and Kunzmann et al. (2021) <doi:10.18637/jss.v098.i09> for details.
Maintained by Maximilian Pilz. Last updated 6 months ago.
5.4 match 1 stars 7.09 score 39 scripts 1 dependents
ebbertd
chisq.posthoc.test:A Post Hoc Analysis for Pearson's Chi-Squared Test for Count Data
Perform post hoc analysis based on residuals of Pearson's Chi-squared Test for Count Data, following T. Mark Beasley & Randall E. Schumacker (1995) <doi:10.1080/00220973.1995.9943797>.
Maintained by Daniel Ebbert. Last updated 5 years ago.
chisq-test chisquare chisquare-test
7.7 match 2 stars 4.99 score 98 scripts
pachadotdev
gravity:Estimation Methods for Gravity Models
A wrapper of different standard estimation methods for gravity models. This package provides estimation methods for log-log models and multiplicative models.
Maintained by Mauricio Vargas. Last updated 4 months ago.
bvu bvw ddm econometrics glm gpml gravity international-trade lm maximum-likelihood nbpml nls ols ppml sils tobit trade
5.5 match 35 stars 6.98 score 55 scripts
afrimapr
afrilearndata:Small Africa Map Datasets for Learning
Small African datasets to help with learning and teaching of spatial techniques and mapping. Part of the afrimapr project, which aims to provide analysts based in Africa with more easily relatable example datasets. Includes R objects for points, lines, polygons and raster, and source files including .gpkg, .shp, .kml, .tif, .grd, .csv.
Maintained by Andy South. Last updated 3 years ago.
map spatial teaching visualization
10.4 match 15 stars 3.68 score 64 scripts
mlverse
luz:Higher Level 'API' for 'torch'
A high level interface for 'torch' providing utilities to reduce the amount of code needed for common tasks, abstract away torch details and make the same code work on both the 'CPU' and 'GPU'. It's flexible enough to support expressing a large range of models. It's heavily inspired by 'fastai' by Howard et al. (2020) <arXiv:2002.04688>, 'Keras' by Chollet et al. (2015) and 'PyTorch Lightning' by Falcon et al. (2019) <doi:10.5281/zenodo.3828935>.
Maintained by Daniel Falbel. Last updated 6 months ago.
3.9 match 89 stars 9.86 score 318 scripts 4 dependents
bdwilliamson
vimp:Perform Inference on Algorithm-Agnostic Variable Importance
Calculate point estimates of and valid confidence intervals for nonparametric, algorithm-agnostic variable importance measures in high and low dimensions, using flexible estimators of the underlying regression functions. For more information about the methods, please see Williamson et al. (Biometrics, 2020), Williamson et al. (JASA, 2021), and Williamson and Feng (ICML, 2020).
Maintained by Brian D. Williamson. Last updated 1 month ago.
machine-learning nonparametric-statistics statistical-inference variable-importance
5.6 match 23 stars 6.79 score 67 scripts
bioc
snpStats:SnpMatrix and XSnpMatrix classes and methods
Classes and statistical methods for large SNP association studies. This extends the earlier snpMatrix package, allowing for uncertainty in genotypes.
Maintained by David Clayton. Last updated 5 months ago.
microarray snp geneticvariability zlib
4.0 match 9.48 score 674 scripts 20 dependents
jeffreyevans
yaImpute:Nearest Neighbor Observation Imputation and Evaluation Tools
Performs nearest neighbor-based imputation using one or more alternative approaches to processing multivariate data. These include methods based on canonical correlation analysis, canonical correspondence analysis, and a multivariate adaptation of the random forest classification and regression techniques of Leo Breiman and Adele Cutler. Additional methods are also offered. The package includes functions for comparing the results from running alternative techniques, detecting imputation targets that are notably distant from reference observations, detecting and correcting for bias, bootstrapping and building ensemble imputations, and mapping results.
Maintained by Jeffrey S. Evans. Last updated 6 months ago.
5.1 match 3 stars 7.40 score 94 scripts 12 dependents
merck
r2rtf:Easily Create Production-Ready Rich Text Format (RTF) Tables and Figures
Create production-ready Rich Text Format (RTF) tables and figures with flexible format.
Maintained by Benjamin Wang. Last updated 8 days ago.
3.5 match 78 stars 10.82 score 171 scripts 10 dependents