R-universe search: square

cran

NISTunits:Fundamental Physical Constants and Unit Conversions from NIST

Fundamental physical constants (Quantity, Value, Uncertainty, Unit) for SI (International System of Units) and non-SI units, plus unit conversions. Based on the data from NIST (National Institute of Standards and Technology, USA).

Maintained by Jose Gama. Last updated 9 years ago.

376.3 match 2.85 score 10 dependents

robinhankin

magic:Create and Investigate Magic Squares

A collection of functions for the manipulation and analysis of arbitrarily dimensioned arrays. The original motivation for the package was the development of efficient, vectorized algorithms for the creation and investigation of magic squares and high-dimensional magic hypercubes.

Maintained by Robin K. S. Hankin. Last updated 2 months ago.

59.7 match 3 stars 11.12 score 436 scripts 230 dependents
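
For readers unfamiliar with the term, a magic square is an n x n arrangement of the integers 1 to n^2 in which every row, every column, and both diagonals share the same sum. The sketch below is a plain loop-based illustration in Python using the classical Siamese construction for odd n. It is not the vectorized algorithm the 'magic' package implements, and the function names `siamese_magic_square` and `is_magic` are hypothetical.

```python
def siamese_magic_square(n):
    """Build an n x n magic square for odd n with the Siamese method."""
    if n < 1 or n % 2 == 0:
        raise ValueError("this sketch only handles odd n")
    square = [[0] * n for _ in range(n)]
    row, col = 0, n // 2  # start in the middle of the top row
    for value in range(1, n * n + 1):
        square[row][col] = value
        # Move up and to the right, wrapping around the edges.
        next_row, next_col = (row - 1) % n, (col + 1) % n
        if square[next_row][next_col]:
            # Target cell already filled: drop down one row instead.
            next_row, next_col = (row + 1) % n, col
        row, col = next_row, next_col
    return square


def is_magic(square):
    """Check that all rows, columns, and both diagonals share one sum."""
    n = len(square)
    target = n * (n * n + 1) // 2  # magic constant for the entries 1..n^2
    rows = all(sum(r) == target for r in square)
    cols = all(sum(square[i][j] for i in range(n)) == target for j in range(n))
    diag = sum(square[i][i] for i in range(n)) == target
    anti = sum(square[i][n - 1 - i] for i in range(n)) == target
    return rows and cols and diag and anti
```

Calling `is_magic(siamese_magic_square(5))` checks the construction for a 5 x 5 square.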

rstudio

keras3:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.

Maintained by Tomasz Kalinowski. Last updated 2 days ago.

24.6 match 845 stars 13.60 score 264 scripts 2 dependents

mlverse

torch:Tensors and Neural Networks with 'GPU' Acceleration

Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.

Maintained by Daniel Falbel. Last updated 8 days ago.

autograd deep-learning torch cpp

18.8 match 520 stars 16.52 score 1.4k scripts 38 dependents

kwstat

agridat:Agricultural Datasets

Datasets from books, papers, and websites related to agriculture. Example graphics and analyses are included. Data come from small-plot trials, multi-environment trials, uniformity trials, yield monitors, and more.

Maintained by Kevin Wright. Last updated 30 days ago.

data

28.8 match 126 stars 10.78 score 1.7k scripts 1 dependent

mlr-org

mlr3:Machine Learning in R - Next Generation

Efficient, object-oriented programming on the building blocks of machine learning. Provides 'R6' objects for tasks, learners, resamplings, and measures. The package is geared towards scalability and larger datasets by supporting parallelization and out-of-memory data-backends like databases. While 'mlr3' focuses on the core computational operations, add-on packages provide additional functionality.

Maintained by Marc Becker. Last updated 6 days ago.

classification data-science machine-learning mlr3 regression

17.7 match 972 stars 14.86 score 2.3k scripts 35 dependents

drostlab

philentropy:Similarity and Distance Quantification Between Probability Functions

Computes 46 optimized distance and similarity measures for comparing probability functions (Drost (2018) <doi:10.21105/joss.00765>). These comparisons between probability functions have their foundations in a broad range of scientific disciplines from mathematics to ecology. The aim of this package is to provide a core framework for clustering, classification, statistical inference, goodness-of-fit, non-parametric statistics, information theory, and machine learning tasks that are based on comparing univariate or multivariate probability functions.

Maintained by Hajk-Georg Drost. Last updated 3 months ago.

distance-measures distance-quantification information-theory jensen-shannon-divergence parametric-distributions similarity-measures statistics cpp

20.4 match 137 stars 12.44 score 484 scripts 24 dependents
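
As an illustration of what comparing two probability functions involves, the sketch below computes the Jensen-Shannon divergence, one of the measures named in the entry's topic tags. It is a plain Python sketch under stated assumptions: inputs are discrete distributions of equal length that each sum to 1, logarithms are base 2, and the function names `kl_divergence` and `jensen_shannon` are hypothetical rather than the package's interface.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence: the average KL divergence to the midpoint m."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

With base-2 logarithms the result lies between 0 (identical distributions) and 1 (distributions with disjoint support).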

mfrasco

Metrics:Evaluation Metrics for Machine Learning

An implementation of evaluation metrics in R that are commonly used in supervised machine learning. It implements metrics for regression, time series, binary classification, classification, and information retrieval problems. It has zero dependencies and a consistent, simple interface for all functions.

Maintained by Michael Frasco. Last updated 6 years ago.

17.7 match 99 stars 13.02 score 6.1k scripts 51 dependents
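
To make the regression metrics mentioned above concrete, here is a minimal Python sketch of two common ones, mean absolute error and root mean squared error. The function names `mae` and `rmse` and the argument order (actual, then predicted) are assumptions chosen for illustration, not a statement of the package's interface.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: the average size of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: penalizes large errors more heavily than MAE."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))
```

Both functions take two equal-length sequences of numbers and return a single non-negative value in the units of the response.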

nanxstats

enpls:Ensemble Partial Least Squares Regression

An algorithmic framework for measuring feature importance, outlier detection, model applicability domain evaluation, and ensemble predictive modeling with (sparse) partial least squares regressions.

Maintained by Nan Xiao. Last updated 3 years ago.

chemometrics dimensionality-reduction ensemble-learning machine-learning outlier-detection partial-least-squares-regression

37.3 match 18 stars 5.56 score 40 scripts

cran

nlme:Linear and Nonlinear Mixed Effects Models

Fit and compare Gaussian linear and nonlinear mixed-effects models.

Maintained by R Core Team. Last updated 2 months ago.

fortran

15.7 match 6 stars 13.00 score 13k scripts 8.7k dependents

fbertran

plsRglm:Partial Least Squares Regression for Generalized Linear Models

Provides (weighted) Partial least squares Regression for generalized linear models and repeated k-fold cross-validation of such models using various criteria <arXiv:1810.01005>. It allows for missing data in the explanatory variables. Bootstrap confidence interval constructions are also available.

Maintained by Frederic Bertrand. Last updated 2 years ago.

24.6 match 16 stars 7.75 score 103 scripts 5 dependents

revelle

psych:Procedures for Psychological, Psychometric, and Personality Research

A general purpose toolbox developed originally for personality, psychometric theory and experimental psychology. Functions are primarily for multivariate analysis and scale construction using factor analysis, principal component analysis, cluster analysis and reliability analysis, although others provide basic descriptive statistics. Item Response Theory is done using factor analysis of tetrachoric and polychoric correlations. Functions for analyzing data at multiple levels include within and between group statistics, including correlations and factor analysis. Validation and cross validation of scales developed using basic machine learning algorithms are provided, as are functions for simulating and testing particular item and test structures. Several functions serve as a useful front end for structural equation modeling. Graphical displays of path diagrams, including mediation models, factor analysis and structural equation models are created using basic graphics. Some of the functions are written to support a book on psychometric theory as well as publications in personality research. For more information, see the <https://personality-project.org/r/> web page.

Maintained by William Revelle. Last updated 3 months ago.

13.6 match 52 stars 13.94 score 29k scripts 317 dependents

doomlab

MOTE:Effect Size and Confidence Interval Calculator

Measure of the Effect ('MOTE') is an effect size calculator, including a wide variety of effect sizes in the mean differences family (all versions of d) and the variance overlap family (eta, omega, epsilon, r). 'MOTE' provides non-central confidence intervals for each effect size, relevant test statistics, and output for reporting in APA Style (American Psychological Association, 2010, <ISBN:1433805618>) with 'LaTeX'. In research, an over-reliance on p-values may conceal the fact that a study is under-powered (Halsey, Curran-Everett, Vowler, & Drummond, 2015 <doi:10.1038/nmeth.3288>). A test may be statistically significant, yet practically inconsequential (Fritz, Scherndl, & Kühberger, 2012 <doi:10.1177/0959354312436870>). Although the American Psychological Association has long advocated for the inclusion of effect sizes (Wilkinson & American Psychological Association Task Force on Statistical Inference, 1999 <doi:10.1037/0003-066X.54.8.594>), the vast majority of peer-reviewed, published academic studies stop short of reporting effect sizes and confidence intervals (Cumming, 2013, <doi:10.1177/0956797613504966>). 'MOTE' simplifies the use and interpretation of effect sizes and confidence intervals. For more information, visit <https://www.aggieerin.com/shiny-server>.

Maintained by Erin M. Buchanan. Last updated 3 years ago.

confidence effect interval size statistics

26.3 match 17 stars 6.69 score 320 scripts 1 dependent

rsquaredacademy

olsrr:Tools for Building OLS Regression Models

Tools designed to make it easier for users, particularly beginner/intermediate R users, to build ordinary least squares regression models. Includes comprehensive regression output, heteroskedasticity tests, collinearity diagnostics, residual diagnostics, measures of influence, model fit assessment and variable selection procedures.

Maintained by Aravind Hebbali. Last updated 4 months ago.

collinearity-diagnostics linear-models regression stepwise-regression

13.2 match 103 stars 12.19 score 1.4k scripts 4 dependents

rfastofficial

Rfast:A Collection of Efficient and Extremely Fast R Functions

A collection of fast (utility) functions for data analysis. Column and row-wise means, medians, variances, minimums, maximums, many t, F and G-square tests, many regressions (normal, logistic, Poisson), are some of the many fast functions. References: a) Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>. b) Tsagris M. and Papadakis M. (2018). Forward regression in R: from the extreme slow to the extreme fast. Journal of Data Science, 16(4): 771-780. <doi:10.6339/JDS.201810_16(4).00006>. c) Chatzipantsiou C., Dimitriadis M., Papadakis M. and Tsagris M. (2020). Extremely Efficient Permutation and Bootstrap Hypothesis Tests Using R. Journal of Modern Applied Statistical Methods, 18(2), eP2898. <doi:10.48550/arXiv.1806.10947>. d) Tsagris M., Papadakis M., Alenazi A. and Alzeley O. (2024). Computationally Efficient Outlier Detection for High-Dimensional Data Using the MDP Algorithm. Computation, 12(9): 185. <doi:10.3390/computation12090185>. e) Tsagris M. and Papadakis M. (2025). Fast and light-weight energy statistics using the R package Rfast. <doi:10.48550/arXiv.2501.02849>.

Maintained by Manos Papadakis. Last updated 19 days ago.

openblas cpp openmp

12.5 match 147 stars 12.54 score 1.2k scripts 166 dependents

gjmvanboxtel

gsignal:Signal Processing

R implementation of the 'Octave' package 'signal', containing a variety of signal processing tools, such as signal generation and measurement, correlation and convolution, filtering, filter design, filter analysis and conversion, power spectrum analysis, system identification, decimation and sample rate change, and windowing.

Maintained by Geert van Boxtel. Last updated 2 months ago.

signal-processing signals cpp

14.8 match 24 stars 10.03 score 133 scripts 34 dependents

tomkellygenetics

matrixcalc:Collection of Functions for Matrix Calculations

A collection of functions to support matrix calculations for probability, econometric and numerical analysis. There are additional functions that are comparable to APL functions which are useful for actuarial models such as pension mathematics. This package is used for teaching and research purposes at the Department of Finance and Risk Engineering, New York University, Polytechnic Institute, Brooklyn, NY 11201. Horn, R.A. (1990) Matrix Analysis. ISBN 978-0521386326. Lancaster, P. (1969) Theory of Matrices. ISBN 978-0124355507. Lay, D.C. (1995) Linear Algebra: And Its Applications. ISBN 978-0201845563.

Maintained by S. Thomas Kelly. Last updated 4 years ago.

17.6 match 8.32 score 1.7k scripts 149 dependents

yelleknek

MBESS:The MBESS R Package

Implements methods that are useful in designing research studies and analyzing data, with particular emphasis on methods that are developed for or used within the behavioral, educational, and social sciences (broadly defined). That being said, many of the methods implemented within MBESS are applicable to a wide variety of disciplines. MBESS has a suite of functions for a variety of related topics, such as effect sizes, confidence intervals for effect sizes (including standardized effect sizes and noncentral effect sizes), sample size planning (from the accuracy in parameter estimation [AIPE], power analytic, equivalence, and minimum-risk point estimation perspectives), mediation analysis, various properties of distributions, and a variety of utility functions. MBESS (pronounced 'em-bes') was originally an acronym for 'Methods for the Behavioral, Educational, and Social Sciences,' but MBESS became more general and now contains methods applicable and used in a wide variety of fields and is an orphan acronym, in the sense that what was an acronym is now literally its name. MBESS has greatly benefited from others, see <https://www3.nd.edu/~kkelley/site/MBESS.html> for a detailed list of those that have contributed and other details.

Maintained by Ken Kelley. Last updated 1 year ago.

17.8 match 2 stars 8.21 score 274 scripts 23 dependents

r-forge

Matrix:Sparse and Dense Matrix Classes and Methods

A rich hierarchy of sparse and dense matrix classes, including general, symmetric, triangular, and diagonal matrices with numeric, logical, or pattern entries. Efficient methods for operating on such matrices, often wrapping the 'BLAS', 'LAPACK', and 'SuiteSparse' libraries.

Maintained by Martin Maechler. Last updated 9 days ago.

openblas

8.5 match 1 star 17.23 score 33k scripts 12k dependents

uchidamizuki

jpgrid:Functions for the Grid Square Codes in Japan

Provides functions for grid square codes in Japan (<https://www.stat.go.jp/english/data/mesh/index.html>). Generates the grid square codes from longitude/latitude, geometries, and the grid square codes of different scales, and vice versa.

Maintained by Mizuki Uchida. Last updated 6 months ago.

30.9 match 8 stars 4.41 score 16 scripts

rvlenth

emmeans:Estimated Marginal Means, aka Least-Squares Means

Obtain estimated marginal means (EMMs) for many linear, generalized linear, and mixed models. Compute contrasts or linear functions of EMMs, trends, and comparisons of slopes. Plots and other displays. Least-squares means are discussed, and the term "estimated marginal means" is suggested, in Searle, Speed, and Milliken (1980) Population marginal means in the linear model: An alternative to least squares means, The American Statistician 34(4), 216-221 <doi:10.1080/00031305.1980.10483031>.

Maintained by Russell V. Lenth. Last updated 5 days ago.

7.1 match 377 stars 19.19 score 13k scripts 187 dependents

adriancorrendo

metrica:Prediction Performance Metrics

A compilation of more than 80 functions designed to quantitatively and visually evaluate prediction performance of regression (continuous variables) and classification (categorical variables) of point-forecast models (e.g. APSIM, DSSAT, DNDC, supervised Machine Learning). For regression, it includes functions to generate plots (scatter, tiles, density, & Bland-Altman plot), and to estimate error metrics (e.g. MBE, MAE, RMSE), error decomposition (e.g. lack of accuracy-precision), model efficiency (e.g. NSE, E1, KGE), indices of agreement (e.g. d, RAC), goodness of fit (e.g. r, R2), adjusted correlation coefficients (e.g. CCC, dcorr), symmetric regression coefficients (intercept, slope), and mean absolute scaled error (MASE) for time series predictions. For classification (binomial and multinomial), it offers functions to generate and plot confusion matrices, and to estimate performance metrics such as accuracy, precision, recall, specificity, F-score, Cohen's Kappa, G-mean, and many more. For more details visit the vignettes <https://adriancorrendo.github.io/metrica/>.

Maintained by Adrian A. Correndo. Last updated 9 months ago.

16.6 match 77 stars 8.18 score 49 scripts

hrbrmstr

waffle:Create Waffle Chart Visualizations

Square pie charts (a.k.a. waffle charts) can be used to communicate parts of a whole for categorical quantities. To emulate the percentage view of a pie chart, a 10x10 grid should be used with each square representing 1% of the total. Modern uses of waffle charts do not necessarily adhere to this rule and can be created with a grid of any rectangular shape. Best practices suggest keeping the number of categories small, just as should be done when creating pie charts. Tools are provided to create waffle charts as well as stitch them together, and to use glyphs for making isotype pictograms.

Maintained by Bob Rudis. Last updated 1 year ago.

data-visualisation data-visualization datavisualization ggplot2 square-pie-charts waffle-charts

12.7 match 778 stars 10.66 score 1.3k scripts 5 dependents
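
The 10x10 rule described above (100 squares, each standing for 1% of the total) can be sketched as follows. This is a plain Python illustration, not the package's code: the function names `waffle_counts` and `waffle_grid` are hypothetical, and largest-remainder rounding is an assumption chosen so that the squares always add up to exactly 100.

```python
def waffle_counts(values, squares=100):
    """Split `squares` cells among categories in proportion to `values`.

    Uses largest-remainder rounding so the counts always sum to `squares`.
    """
    total = sum(values.values())
    if total <= 0:
        raise ValueError("values must have a positive total")
    exact = {k: v * squares / total for k, v in values.items()}
    counts = {k: int(share) for k, share in exact.items()}
    leftover = squares - sum(counts.values())
    # Hand the remaining cells to the categories with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - counts[k], reverse=True)
    for k in by_remainder[:leftover]:
        counts[k] += 1
    return counts


def waffle_grid(counts, columns=10):
    """Lay the cells out row by row as a list of rows of category labels."""
    cells = [label for label, n in counts.items() for _ in range(n)]
    return [cells[i:i + columns] for i in range(0, len(cells), columns)]
```

For example, `waffle_grid(waffle_counts({"x": 50, "y": 30, "z": 20}))` yields ten rows of ten labels: five rows of "x", three of "y", and two of "z".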

cvxgrp

CVXR:Disciplined Convex Optimization

An object-oriented modeling language for disciplined convex programming (DCP) as described in Fu, Narasimhan, and Boyd (2020, <doi:10.18637/jss.v094.i14>). It allows the user to formulate convex optimization problems in a natural way following mathematical convention and DCP rules. The system analyzes the problem, verifies its convexity, converts it into a canonical form, and hands it off to an appropriate solver to obtain the solution. Interfaces to solvers on CRAN and elsewhere are provided, both commercial and open source.

Maintained by Anqi Fu. Last updated 4 months ago.

cpp

10.5 match 207 stars 12.89 score 768 scripts 51 dependents

bioc

mixOmics:Omics Data Integration Project

Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing properties of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated, and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics etc) but also beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non-exhaustive list of methods includes variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.

Maintained by Eva Hamrud. Last updated 6 days ago.

immunooncology microarray sequencing metabolomics metagenomics proteomics geneprediction multiplecomparison classification regression bioconductor genomics genomics-data genomics-visualization multivariate-analysis multivariate-statistics omics r-pkg r-project

9.4 match 182 stars 13.71 score 1.3k scripts 22 dependents

projectmosaic

mosaic:Project MOSAIC Statistics and Mathematics Teaching Utilities

Data sets and utilities from Project MOSAIC (<http://www.mosaic-web.org>) used to teach mathematics, statistics, computation and modeling. Funded by the NSF, Project MOSAIC is a community of educators working to tie together aspects of quantitative work that students in science, technology, engineering and mathematics will need in their professional lives, but which are usually taught in isolation, if at all.

Maintained by Randall Pruim. Last updated 1 year ago.

9.7 match 93 stars 13.32 score 7.2k scripts 7 dependents

bioc

limma:Linear Models for Microarray and Omics Data

Data analysis, linear models and differential expression for omics data.

Maintained by Gordon Smyth. Last updated 7 days ago.

exonarray geneexpression transcription alternativesplicing differentialexpression differentialsplicing genesetenrichment dataimport bayesian clustering regression timecourse microarray micrornaarray mrnamicroarray onechannel proprietaryplatforms twochannel sequencing rnaseq batcheffect multiplecomparison normalization preprocessing qualitycontrol biomedicalinformatics cellbiology cheminformatics epigenetics functionalgenomics genetics immunooncology metabolomics proteomics systemsbiology transcriptomics

9.0 match 13.81 score 16k scripts 585 dependents

christiangoueguel

HotellingEllipse:Hotelling’s T-Squared Statistic and Ellipse

Functions to calculate the Hotelling’s T-squared statistic and corresponding confidence ellipses. Provides the semi-axes of the Hotelling’s T-squared ellipses at 95% and 99% confidence levels. Enables users to obtain the coordinates in two or three dimensions at user-defined confidence levels, allowing for the construction of 2D or 3D ellipses with customized confidence levels. Bro and Smilde (2014) <DOI:10.1039/c3ay41907j>. Brereton (2016) <DOI:10.1002/cem.2763>.

Maintained by Christian L. Goueguel. Last updated 2 months ago.

confidence-ellipse hotelling-ellipse hotelling-s-t-square hotelling-t2 hotellings-t2-distribution multivariate-distribution outliers partial-least-squares-regression pca pls principal-component-analysis

23.2 match 7 stars 5.29 score 14 scripts

salvatoremangiafico

rcompanion:Functions to Support Extension Education Program Evaluation

Functions and datasets to support Summary and Analysis of Extension Program Evaluation in R, and An R Companion for the Handbook of Biological Statistics. Vignettes are available at <https://rcompanion.org>.

Maintained by Salvatore Mangiafico. Last updated 1 month ago.

15.3 match 4 stars 8.01 score 2.4k scripts 5 dependents

tidymodels

infer:Tidy Statistical Inference

The objective of this package is to perform inference using an expressive statistical grammar that coheres with the tidy design framework.

Maintained by Simon Couch. Last updated 6 months ago.

7.8 match 736 stars 15.75 score 3.5k scripts 18 dependents

t-kalinowski

keras:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.

Maintained by Tomasz Kalinowski. Last updated 11 months ago.

11.0 match 10.93 score 10k scripts 55 dependents

svkucheryavski

mdatools:Multivariate Data Analysis for Chemometrics

Projection based methods for preprocessing, exploring and analysis of multivariate data used in chemometrics. S. Kucheryavskiy (2020) <doi:10.1016/j.chemolab.2020.103937>.

Maintained by Sergey Kucheryavskiy. Last updated 8 months ago.

16.1 match 36 stars 7.41 score 220 scripts 1 dependent

yanyachen

MLmetrics:Machine Learning Evaluation Metrics

A collection of evaluation metrics, including loss, score and utility functions, that measure regression, classification and ranking performance.

Maintained by Yachen Yan. Last updated 11 months ago.

10.7 match 69 stars 11.09 score 2.2k scripts 20 dependents

hwborchers

pracma:Practical Numerical Math Functions

Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses 'MATLAB' function names where appropriate to simplify porting.

Maintained by Hans W. Borchers. Last updated 1 year ago.

9.6 match 29 stars 12.34 score 6.6k scripts 931 dependents

kenaho1

asbio:A Collection of Statistical Tools for Biologists

Contains functions from: Aho, K. (2014) Foundational and Applied Statistics for Biologists using R. CRC/Taylor and Francis, Boca Raton, FL, ISBN: 978-1-4398-7338-0.

Maintained by Ken Aho. Last updated 2 months ago.

16.1 match 5 stars 7.32 score 310 scripts 3 dependents

tyee001

VGAM:Vector Generalized Linear and Additive Models

An implementation of about 6 major classes of statistical regression models. The central algorithm is Fisher scoring and iterative reweighted least squares. At the heart of this package are the vector generalized linear and additive model (VGLM/VGAM) classes. VGLMs can be loosely thought of as multivariate GLMs. VGAMs are data-driven VGLMs that use smoothing. The book "Vector Generalized Linear and Additive Models: With an Implementation in R" (Yee, 2015) <DOI:10.1007/978-1-4939-2818-7> gives details of the statistical framework and the package. Currently only fixed-effects models are implemented. Many (100+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE. The other classes are RR-VGLMs (reduced-rank VGLMs), quadratic RR-VGLMs, doubly constrained RR-VGLMs, reduced-rank VGAMs, RCIMs (row-column interaction models)---these classes perform constrained and unconstrained quadratic ordination (CQO/UQO) models in ecology, as well as constrained additive ordination (CAO). Hauck-Donner effect detection is implemented. Note that these functions are subject to change; see the NEWS and ChangeLog files for latest changes.

Maintained by Thomas Yee. Last updated 1 month ago.

fortran

10.7 match 10 stars 10.67 score 3.6k scripts 169 dependents

cran

mgcv:Mixed GAM Computation Vehicle with Automatic Smoothness Estimation

Generalized additive (mixed) models, some of their extensions and other generalized ridge regression with multiple smoothing parameter estimation by (Restricted) Marginal Likelihood, Generalized Cross Validation and similar, or using iterated nested Laplace approximation for fully Bayesian inference. See Wood (2017) <doi:10.1201/9781315370279> for an overview. Includes a gam() function, a wide variety of smoothers, 'JAGS' support and distributions beyond the exponential family.

Maintained by Simon Wood. Last updated 1 year ago.

openblas openmp

8.8 match 32 stars 12.71 score 17k scripts 7.8k dependents

friendly

matlib:Matrix Functions for Teaching and Learning Linear Algebra and Multivariate Statistics

A collection of matrix functions for teaching and learning matrix linear algebra as used in multivariate statistical methods. Many of these functions are designed for tutorial purposes in learning matrix algebra ideas using R. In some cases, functions are provided for concepts available elsewhere in R, but where the function call or name is not obvious. In other cases, functions are provided to show or demonstrate an algorithm. In addition, a collection of functions are provided for drawing vector diagrams in 2D and 3D and for rendering matrix expressions and equations in LaTeX.

Maintained by Michael Friendly. Last updated 4 days ago.

diagrams linear-equations matrix matrix-functions matrix-visualizer vector vignette

8.6 match 65 stars 12.89 score 900 scripts 11 dependents

adeverse

ade4:Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences

Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) <doi:10.18637/jss.v022.i04>.

Maintained by Aurélie Siberchicot. Last updated 14 days ago.

openblas cpp

7.3 match 39 stars 14.96 score 2.2k scripts 256 dependents

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 year ago.

12.7 match 3 stars 8.20 score 7.8k scripts 11 dependents

easystats

performance:Assessment of Regression Models Performance

Utilities for computing measures to assess model quality, which are not directly provided by R's 'base' or 'stats' packages. These include e.g. measures like r-squared, intraclass correlation coefficient (Nakagawa, Johnson & Schielzeth (2017) <doi:10.1098/rsif.2017.0213>), root mean squared error or functions to check models for overdispersion, singularity or zero-inflation and more. Functions apply to a large variety of regression models, including generalized linear models, mixed effects models and Bayesian models. References: Lüdecke et al. (2021) <doi:10.21105/joss.03139>.

Maintained by Daniel Lüdecke. Last updated 1 day ago.

aic easystats hacktoberfest loo machine-learning mixed-models models performance r2 statistics

6.3 match 1.1k stars 16.18 score 4.3k scripts 47 dependents

nashjc

nlsr:Functions for Nonlinear Least Squares Solutions - Updated 2022

Provides tools for working with nonlinear least squares problems. For the estimation of models, it offers more reliable and robust tools than nls(), where the Gauss-Newton method frequently stops with 'singular gradient' messages. This is accomplished by using, where possible, analytic derivatives to compute the matrix of derivatives and a stabilization of the solution of the estimation equations. Tools for approximate or externally supplied derivative matrices are included. Bounds and masks on parameters are handled properly.

Maintained by John C Nash. Last updated 29 days ago.

14.4 match 7.02 score 94 scripts 5 dependents

alexpghayes

distributions3:Probability Distributions as S3 Objects

Tools to create and manipulate probability distributions using S3. Generics pdf(), cdf(), quantile(), and random() provide replacements for base R's d/p/q/r style functions. Functions and arguments have been named carefully to minimize confusion for students in intro stats courses. The documentation for each distribution contains detailed mathematical notes.

Maintained by Alex Hayes. Last updated 6 months ago.

8.9 match 102 stars 11.35 score 118 scripts 7 dependents

hiroyukiyamamoto

loadings:Loadings for Principal Component Analysis and Partial Least Squares

Computing statistical hypothesis testing for loading in principal component analysis (PCA) (Yamamoto, H. et al. (2014) <doi:10.1186/1471-2105-15-51>), orthogonal smoothed PCA (OS-PCA) (Yamamoto, H. et al. (2021) <doi:10.3390/metabo11030149>), one-sided kernel PCA (Yamamoto, H. (2023) <doi:10.51094/jxiv.262>), partial least squares (PLS) and PLS discriminant analysis (PLS-DA) (Yamamoto, H. et al. (2009) <doi:10.1016/j.chemolab.2009.05.006>), PLS with rank order of groups (PLS-ROG) (Yamamoto, H. (2017) <doi:10.1002/cem.2883>), regularized canonical correlation analysis discriminant analysis (RCCA-DA) (Yamamoto, H. et al. (2008) <doi:10.1016/j.bej.2007.12.009>), multiset PLS and PLS-ROG (Yamamoto, H. (2022) <doi:10.1101/2022.08.30.505949>).

Maintained by Hiroyuki Yamamoto. Last updated 11 months ago.

24.5 match 3 stars 4.08 score 27 scripts 1 dependent

hzambran

hydroGOF:Goodness-of-Fit Functions for Comparison of Simulated and Observed Hydrological Time Series

S3 functions implementing both statistical and graphical goodness-of-fit measures between observed and simulated values, mainly oriented to be used during the calibration, validation, and application of hydrological models. Missing values in observed and/or simulated values can be removed before computations. Comments, questions, and collaboration of any kind are very welcome.

Maintained by Mauricio Zambrano-Bigiarini. Last updated 10 months ago.

9.5 match 40 stars 10.29 score 796 scripts 8 dependents

db969

rsq:R-Squared and Related Measures

Calculate generalized R-squared, partial R-squared, and partial correlation coefficients for generalized linear (mixed) models (including quasi models with well defined variance functions).

Maintained by Dabao Zhang. Last updated 6 months ago.

20.0 match 4.86 score 492 scripts 2 dependents
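
For the ordinary linear model, the quantities rsq generalizes reduce to familiar sums of squares. The base-R sketch below computes R-squared and a partial R-squared by hand on the built-in `mtcars` data. It does not call the rsq package, and the choice of variables is an assumption made purely for illustration.

```r
# Full model and a reduced model that drops the predictor of interest (hp)
fit_full    <- lm(mpg ~ wt + hp, data = mtcars)
fit_reduced <- lm(mpg ~ wt, data = mtcars)

rss_full    <- sum(residuals(fit_full)^2)      # residual sum of squares
rss_reduced <- sum(residuals(fit_reduced)^2)
tss <- sum((mtcars$mpg - mean(mtcars$mpg))^2)  # total sum of squares

# R-squared: share of total variation explained by the full model
r2 <- 1 - rss_full / tss

# Partial R-squared for hp: share of the variation left unexplained by
# the reduced model that adding hp accounts for
partial_r2_hp <- (rss_reduced - rss_full) / rss_reduced

c(r2 = r2, partial_r2_hp = partial_r2_hp)
```

For generalized linear (mixed) models the sums of squares are replaced by other measures of fit, which is the case the package description refers to.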

jorischau

gslnls:GSL Multi-Start Nonlinear Least-Squares Fitting

An R interface to weighted nonlinear least-squares optimization with the GNU Scientific Library (GSL), see M. Galassi et al. (2009, ISBN:0954612078). The available trust region methods include the Levenberg-Marquardt algorithm with and without geodesic acceleration, the Steihaug-Toint conjugate gradient algorithm for large systems and several variants of Powell's dogleg algorithm. Multi-start optimization based on quasi-random samples is implemented using a modified version of the algorithm in Hickernell and Yuan (1997, OR Transactions). Robust nonlinear regression can be performed using various robust loss functions, in which case the optimization problem is solved by iterative reweighted least squares (IRLS). Bindings are provided to tune a number of parameters affecting the low-level aspects of the trust region algorithms. The interface mimics R's nls() function and returns model objects inheriting from the same class.

Maintained by Joris Chau. Last updated 2 months ago.

gnu-scientific-library gsl levenberg-marquardt multi-start nonlinear-least-squares nonlinear-regression robust-regresssion fortran glibc

15.6 match 16 stars 6.23 score 35 scripts 2 dependents

jkrijthe

RSSL:Implementations of Semi-Supervised Learning Approaches for Classification

A collection of implementations of semi-supervised classifiers and methods to evaluate their performance. The package includes implementations of, among others, Implicitly Constrained Learning, Moment Constrained Learning, the Transductive SVM, Manifold regularization, Maximum Contrastive Pessimistic Likelihood estimation, S4VM and WellSVM.

Maintained by Jesse Krijthe. Last updated 1 year ago.

openblas cpp

16.0 match 58 stars 6.05 score 128 scripts 1 dependent

pepijn-devries

csquares:Concise Spatial Query and Representation System (c-Squares)

Encode and decode c-squares, from and to simple feature (sf) or spatiotemporal arrays (stars) objects. Use c-squares codes to quickly join or query spatial data.

Maintained by Pepijn de Vries. Last updated 7 months ago.

16.6 match 2 stars 5.81 score 20 scripts

zeileis

ivreg:Instrumental-Variables Regression by '2SLS', '2SM', or '2SMM', with Diagnostics

Instrumental variable estimation for linear models by two-stage least-squares (2SLS) regression or by robust-regression via M-estimation (2SM) or MM-estimation (2SMM). The main ivreg() model-fitting function is designed to provide a workflow as similar as possible to standard lm() regression. A wide range of methods is provided for fitted ivreg model objects, including extensive functionality for computing and graphing regression diagnostics in addition to other standard model tools.

Maintained by Achim Zeileis. Last updated 2 months ago.

instrumental-variables regression-diagnostics two-stage-least-squares-regression

9.4 match 20 stars 10.24 score 360 scripts 4 dependents
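
Two-stage least squares, the estimator named in the ivreg description, can be illustrated with two calls to `lm()` on simulated data. This is a base-R sketch under stated assumptions: the data-generating process, the seed, and all variable names are invented here, and the package's own `ivreg()` function is not used.

```r
set.seed(1)
n <- 5000
z <- rnorm(n)                   # instrument: affects y only through x
u <- rnorm(n)                   # unobserved confounder
x <- z + u + rnorm(n)           # endogenous regressor (correlated with u)
y <- 1 + 2 * x + u + rnorm(n)   # true slope on x is 2

# Naive OLS is biased because x is correlated with the error term
b_ols <- coef(lm(y ~ x))[["x"]]

# Stage 1: project x on the instrument
x_hat <- fitted(lm(x ~ z))

# Stage 2: regress y on the fitted values from stage 1
b_2sls <- coef(lm(y ~ x_hat))[["x_hat"]]

c(ols = b_ols, two_stage = b_2sls)
```

The point estimate from the manual second stage is the 2SLS estimate, but its reported standard errors are not valid because they ignore the first-stage estimation. Correct inference is one reason to use a dedicated fitting function.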

jackstat

ModelMetrics:Rapid Calculation of Model Metrics

Collection of metrics for evaluating models written in C++ using 'Rcpp'. Popular metrics include area under the curve, log loss, root mean square error, etc.

Maintained by Tyler Hunt. Last updated 4 years ago.

auc logloss machine-learning metrics model-evaluation model-metrics cpp

8.1 match 29 stars 11.83 score 1.3k scripts 306 dependents
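
Two of the metrics named in the ModelMetrics description, root mean square error and log loss, are short enough to write out in base R. The helper names `rmse_manual` and `log_loss_manual` are hypothetical and chosen here to avoid implying they are the package's functions. The clipping constant `eps` is likewise an assumption.

```r
# Root mean square error for numeric predictions
rmse_manual <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# Binary log loss; probabilities are clipped away from 0 and 1
# so that log() stays finite
log_loss_manual <- function(actual, prob, eps = 1e-15) {
  prob <- pmin(pmax(prob, eps), 1 - eps)
  -mean(actual * log(prob) + (1 - actual) * log(1 - prob))
}

rmse_manual(c(3, -0.5, 2, 7), c(2.5, 0, 2, 8))   # sqrt(0.375)
log_loss_manual(c(1, 0), c(0.5, 0.5))            # log(2)
```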

pachadotdev

cpp11armadillo:An 'Armadillo' Interface

Provides function declarations and inline function definitions that facilitate communication between R and the 'Armadillo' 'C++' library for linear algebra and scientific computing. This implementation is detailed in Vargas Sepulveda and Schneider Malamud (2024) <doi:10.48550/arXiv.2408.11074>.

Maintained by Mauricio Vargas Sepulveda. Last updated 28 days ago.

armadillo cpp cpp11 hacktoberfest linear-algebra

10.4 match 9 stars 9.14 score 1 script 16 dependents

tidymodels

yardstick:Tidy Characterizations of Model Performance

Tidy tools for quantifying how well a model fits a data set, such as confusion matrices, class probability curve summaries, and regression metrics (e.g., RMSE).

Maintained by Emil Hvitfeldt. Last updated 6 days ago.

6.1 match 387 stars 15.47 score 2.2k scripts 60 dependents

vanzanden

ggsolvencyii:A 'ggplot2'-Plot of Composition of Solvency II SCR: SF and IM

An implementation of 'ggplot2'-methods to present the composition of Solvency II Solvency Capital Requirement (SCR) as a series of concentric circle-parts. Solvency II (Solvency 2) is European insurance legislation, coming into force by the delegated acts of October 10, 2014 <https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=OJ%3AL%3A2015%3A012%3ATOC>. Additional files defining the structure of the Standard Formula (SF) method of the SCR-calculation are provided. The structure files can be adapted for localization or for insurance companies who use Internal Models (IM). Options are available for combining smaller components, horizontal and vertical scaling, rotation, and plotting only some circle-parts. With outlines and connectors several SCR-compositions can be compared, for example in ORSA-scenarios (Own Risk and Solvency Assessment).

Maintained by Marco van Zanden. Last updated 6 years ago.

16.9 match 3 stars 5.58 score 63 scripts

spatstat

spatstat.geom:Geometrical Functionality of the 'spatstat' Family

Defines spatial data types and supports geometrical operations on them. Data types include point patterns, windows (domains), pixel images, line segment patterns, tessellations and hyperframes. Capabilities include creation and manipulation of data (using command line or graphical interaction), plotting, geometrical operations (rotation, shift, rescale, affine transformation), convex hull, discretisation and pixellation, Dirichlet tessellation, Delaunay triangulation, pairwise distances, nearest-neighbour distances, distance transform, morphological operations (erosion, dilation, closing, opening), quadrat counting, geometrical measurement, geometrical covariance, colour maps, calculus on spatial domains, Gaussian blur, level sets of images, transects of images, intersections between objects, minimum distance matching. (Excludes spatial data on a network, which are supported by the package 'spatstat.linnet'.)

Maintained by Adrian Baddeley. Last updated 20 hours ago.

classes-and-objects distance-calculation geometry geometry-processing images mensuration plotting point-patterns spatial-data spatial-data-analysis

7.5 match 7 stars 12.10 score 241 scripts 227 dependents

andrisignorell

DescTools:Tools for Descriptive Statistics

A collection of miscellaneous basic statistical functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox that facilitates the (notoriously time-consuming) first descriptive tasks in data analysis, consisting of calculating descriptive statistics, drawing graphical summaries and reporting the results. The package furthermore contains functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages which are not needed at all), and to provide a common and consistent interface as far as function and argument naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in the absence of convincing alternatives). The 'BigCamelCase' style was consequently applied to functions borrowed from contributed R packages as well.

Maintained by Andri Signorell. Last updated 2 days ago.

fortran cpp

5.3 match 87 stars 16.70 score 7.7k scripts 99 dependents

openpharma

mmrm:Mixed Models for Repeated Measures

Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E> for a tutorial and Mallinckrodt, Lane, Schnell, Peng and Mancuso (2008) <doi:10.1177/009286150804200402> for a review. This package implements MMRM based on the marginal linear model without random effects using Template Model Builder ('TMB') which enables fast and robust model fitting. Users can specify a variety of covariance matrices, weight observations, fit models with restricted or standard maximum likelihood inference, perform hypothesis testing with Satterthwaite or Kenward-Roger adjustment, and extract least square means estimates by using 'emmeans'.

Maintained by Daniel Sabanes Bove. Last updated 12 days ago.

cpp

7.3 match 138 stars 12.15 score 113 scripts 4 dependents

easystats

effectsize:Indices of Effect Size

Provide utilities to work with indices of effect size for a wide variety of models and hypothesis tests (see list of supported models using the function 'insight::supported_models()'), allowing computation of and conversion between indices such as Cohen's d, r, odds, etc. References: Ben-Shachar et al. (2020) <doi:10.21105/joss.02815>.

Maintained by Mattan S. Ben-Shachar. Last updated 2 months ago.

anova cohens-d compute conversion correlation effect-size effectsize hacktoberfest hedges-g interpretation standardization standardized statistics

5.4 match 344 stars 16.38 score 1.8k scripts 29 dependents

braverock

PerformanceAnalytics:Econometric Tools for Performance and Risk Analysis

Collection of econometric functions for performance and risk analysis. In addition to standard risk and performance metrics, this package aims to aid practitioners and researchers in utilizing the latest research in analysis of non-normal return streams. In general, it is most tested on return (rather than price) data on a regular scale, but most functions will work with irregular return data as well, and increasing numbers of functions will work with P&L or price data where possible.

Maintained by Brian G. Peterson. Last updated 3 months ago.

5.5 match 222 stars 15.93 score 4.8k scripts 20 dependents

rfastofficial

Rfast2:A Collection of Efficient and Extremely Fast R Functions II

A collection of fast statistical and utility functions for data analysis. Functions for regression, maximum likelihood, column-wise statistics and many more have been included. C++ has been utilized to speed up the functions. References: Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>.

Maintained by Manos Papadakis. Last updated 1 year ago.

openblas cpp openmp

10.6 match 38 stars 8.09 score 75 scripts 26 dependents

fbertran

plsRbeta:Partial Least Squares Regression for Beta Regression Models

Provides partial least squares regression for (weighted) beta regression models (Bertrand 2013, <http://journal-sfds.fr/article/view/215>) and k-fold cross-validation of such models using various criteria. It allows for missing data in the explanatory variables. Bootstrap confidence interval constructions are also available.

Maintained by Frederic Bertrand. Last updated 2 years ago.

19.6 match 2 stars 4.34 score 22 scripts

mrcieu

TwoSampleMR:Two Sample MR Functions and Interface to MRC Integrative Epidemiology Unit OpenGWAS Database

A package for performing Mendelian randomization using GWAS summary data. It uses the IEU OpenGWAS database <https://gwas.mrcieu.ac.uk/> to automatically obtain data, and a wide range of methods to run the analysis.

Maintained by Gibran Hemani. Last updated 4 hours ago.

7.5 match 474 stars 11.27 score 1.7k scripts 1 dependent

rikenbit

guidedPLS:Supervised Dimensional Reduction by Guided Partial Least Squares

Guided partial least squares (guided-PLS) is the combination of partial least squares by singular value decomposition (PLS-SVD) and guided principal component analysis (guided-PCA). For the details of the methods, see the reference section of GitHub README.md <https://github.com/rikenbit/guidedPLS>.

Maintained by Koki Tsuyuzaki. Last updated 2 years ago.

21.0 match 4.00 score

bioc

structToolbox:Data processing & analysis tools for Metabolomics and other omics

An extensive set of data (pre-)processing and analysis methods and tools for metabolomics and other omics, with a strong emphasis on statistics and machine learning. This toolbox allows the user to build extensive and standardised workflows for data analysis. The methods and tools have been implemented using class-based templates provided by the struct (Statistics in R Using Class-based Templates) package. The toolbox includes pre-processing methods (e.g. signal drift and batch correction, normalisation, missing value imputation and scaling), univariate (e.g. ttest, various forms of ANOVA, Kruskal–Wallis test and more) and multivariate statistical methods (e.g. PCA and PLS, including cross-validation and permutation testing) as well as machine learning methods (e.g. Support Vector Machines). The STATistics Ontology (STATO) has been integrated and implemented to provide standardised definitions for the different methods, inputs and outputs.

Maintained by Gavin Rhys Lloyd. Last updated 27 days ago.

workflowstep metabolomics bioconductor-package dims lc-ms machine-learning multivariate-analysis statistics univariate

13.2 match 10 stars 6.26 score 12 scripts

ycroissant

plm:Linear Models for Panel Data

A set of estimators for models and (robust) covariance matrices, and tests for panel data econometrics, including within/fixed effects, random effects, between, first-difference, nested random effects as well as instrumental-variable (IV) and Hausman-Taylor-style models, panel generalized method of moments (GMM) and general FGLS models, mean groups (MG), demeaned MG, and common correlated effects (CCEMG) and pooled (CCEP) estimators with common factors, variable coefficients and limited dependent variables models. Test functions include model specification, serial correlation, cross-sectional dependence, panel unit root and panel Granger (non-)causality. Typical references are general econometrics text books such as Baltagi (2021), Econometric Analysis of Panel Data (<doi:10.1007/978-3-030-53953-5>), Hsiao (2014), Analysis of Panel Data (<doi:10.1017/CBO9781139839327>), and Croissant and Millo (2018), Panel Data Econometrics with R (<doi:10.1002/9781119504641>).

Maintained by Kevin Tappe. Last updated 3 days ago.

6.9 match 59 stars 12.06 score 39 dependents

chorscroft

zalpha:Run a Suite of Selection Statistics

A suite of statistics for identifying areas of the genome under selective pressure. See Jacobs, Sluckin and Kivisild (2016) <doi:10.1534/genetics.115.185900>.

Maintained by Clare Horscroft. Last updated 3 years ago.

evolution genetics

20.5 match 2 stars 4.00 score 4 scripts

psychmeta

psychmeta:Psychometric Meta-Analysis Toolkit

Tools for computing bare-bones and psychometric meta-analyses and for generating psychometric data for use in meta-analysis simulations. Supports bare-bones, individual-correction, and artifact-distribution methods for meta-analyzing correlations and d values. Includes tools for converting effect sizes, computing sporadic artifact corrections, reshaping meta-analytic databases, computing multivariate corrections for range variation, and more. Bugs can be reported to <https://github.com/psychmeta/psychmeta/issues> or <issues@psychmeta.com>.

Maintained by Jeffrey A. Dahlke. Last updated 9 months ago.

hacktoberfest meta-analysis psychology psychometric psychometrics

9.9 match 57 stars 8.25 score 151 scripts

mkshaw

r2mlm:R-Squared Measures for Multilevel Models

Generates both total- and level-specific R-squared measures from Rights and Sterba’s (2019) <doi:10.1037/met0000184> framework of R-squared measures for multilevel models with random intercepts and/or slopes, which is based on a complete decomposition of variance. Additionally generates graphical representations of these R-squared measures to allow visualizing and interpreting all measures in the framework together as an integrated set. This framework subsumes 10 previously-developed R-squared measures for multilevel models as special cases of 5 measures from the framework, and it also includes several newly-developed measures. Measures in the framework can be used to compute R-squared differences when comparing multilevel models (following procedures in Rights & Sterba (2020) <doi:10.1080/00273171.2019.1660605>). Bootstrapped confidence intervals can also be calculated. To use the confidence interval functionality, download bootmlm from <https://github.com/marklhc/bootmlm>.

Maintained by Mairead Shaw. Last updated 1 year ago.

15.6 match 27 stars 5.24 score 130 scripts

cran

FuzzySTs:Fuzzy Statistical Tools

The main goal of this package is to present various fuzzy statistical tools. It intends to provide an implementation of the theoretical and empirical approaches presented in the book entitled "The signed distance measure in fuzzy statistical analysis. Some theoretical, empirical and programming advances" <doi:10.1007/978-3-030-76916-1>. For the theoretical approaches, see Berkachy R. and Donze L. (2019) <doi:10.1007/978-3-030-03368-2_1>. For the empirical approaches, see Berkachy R. and Donze L. (2016) <ISBN:978-989-758-201-1>. Important (non-exhaustive) implementation highlights of this package are as follows: (1) a numerical procedure to estimate the fuzzy difference and the fuzzy square. (2) two numerical methods of fuzzification. (3) a function performing different possibilities of distances, including the signed distance and the generalized signed distance for instance with all its properties. (4) numerical estimations of fuzzy statistical measures such as the variance, the moment, etc. (5) two methods of estimation of the bootstrap distribution of the likelihood ratio in the fuzzy context. (6) an estimation of a fuzzy confidence interval by the likelihood ratio method. (7) testing fuzzy hypotheses and/or fuzzy data by fuzzy confidence intervals in the Kwakernaak - Kruse and Meyer sense. (8) a general method to estimate the fuzzy p-value with fuzzy hypotheses and/or fuzzy data. (9) a method of estimation of global and individual evaluations of linguistic questionnaires. (10) numerical estimations of multi-ways analysis of variance models in the fuzzy context. Unbalance in the considered designs is also accounted for.

Maintained by Redina Berkachy. Last updated 8 months ago.

23.9 match 3.40 score

mhenderson

wallis:Room squares in R

Room squares in R.

Maintained by Matthew Henderson. Last updated 7 months ago.

combinatorial-designs combinatorics room-squares

31.9 match 2.54 score 1 script

egenn

rtemis:Machine Learning and Visualization

Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.

Maintained by E.D. Gennatas. Last updated 1 month ago.

data-science data-visualization machine-learning machine-learning-library visualization

11.4 match 145 stars 7.09 score 50 scripts 2 dependents

r-forge

DPQ:Density, Probability, Quantile ('DPQ') Computations

Computations for approximations and alternatives for the 'DPQ' (Density (pdf), Probability (cdf) and Quantile) functions for probability distributions in R. Primary focus is on (central and non-central) beta, gamma and related distributions such as the chi-squared, F, and t. -- For several distribution functions, provide functions implementing formulas from Johnson, Kotz, and Kemp (1992) <doi:10.1002/bimj.4710360207> and Johnson, Kotz, and Balakrishnan (1995) for discrete or continuous distributions respectively. This is for the use of researchers in these numerical approximation implementations, notably for my own use in order to improve standard R pbeta(), qgamma(), ..., etc: {'"dpq"'-functions}.

Maintained by Martin Maechler. Last updated 2 months ago.

fortran

13.8 match 5.75 score 43 scripts 1 dependent

rmheiberger

HH:Statistical Analysis and Data Display: Heiberger and Holland

Support software for Statistical Analysis and Data Display (Second Edition, Springer, ISBN 978-1-4939-2121-8, 2015) and (First Edition, Springer, ISBN 0-387-40270-5, 2004) by Richard M. Heiberger and Burt Holland. This contemporary presentation of statistical methods features extensive use of graphical displays for exploring data and for displaying the analysis. The second edition includes redesigned graphics and additional chapters. The authors emphasize how to construct and interpret graphs, discuss principles of graphical design, and show how accompanying traditional tabular results are used to confirm the visual impressions derived directly from the graphs. Many of the graphical formats are novel and appear here for the first time in print. All chapters have exercises. All functions introduced in the book are in the package. R code for all examples, both graphs and tables, in the book is included in the scripts directory of the package.

Maintained by Richard M. Heiberger. Last updated 1 month ago.

12.1 match 3 stars 6.42 score 752 scripts 5 dependents

dsy109

mixtools:Tools for Analyzing Finite Mixture Models

Analyzes finite mixture models for various parametric and semiparametric settings. This includes mixtures of parametric distributions (normal, multivariate normal, multinomial, gamma), various Reliability Mixture Models (RMMs), mixtures-of-regressions settings (linear regression, logistic regression, Poisson regression, linear regression with changepoints, predictor-dependent mixing proportions, random effects regressions, hierarchical mixtures-of-experts), and tools for selecting the number of components (bootstrapping the likelihood ratio test statistic, mixturegrams, and model selection criteria). Bayesian estimation of mixtures-of-linear-regressions models is available as well as a novel data depth method for obtaining credible bands. This package is based upon work supported by the National Science Foundation under Grant No. SES-0518772 and the Chan Zuckerberg Initiative: Essential Open Source Software for Science (Grant No. 2020-255193).

Maintained by Derek Young. Last updated 9 months ago.

mixture-models mixture-of-experts semiparametric-regression

6.8 match 20 stars 11.34 score 1.4k scripts 56 dependents

r-lib

scales:Scale Functions for Visualization

Graphical scales map data to aesthetics, and provide methods for automatically determining breaks and labels for axes and legends.

Maintained by Thomas Lin Pedersen. Last updated 5 months ago.

ggplot2

3.8 match 419 stars 19.88 score 88k scripts 7.9k dependents

khliland

pls:Partial Least Squares and Principal Component Regression

Multivariate regression methods Partial Least Squares Regression (PLSR), Principal Component Regression (PCR) and Canonical Powered Partial Least Squares (CPPLS).

Maintained by Kristian Hovde Liland. Last updated 2 months ago.

5.5 match 36 stars 13.50 score 3.2k scripts 85 dependents

liamrevell

phytools:Phylogenetic Tools for Comparative Biology (and Other Things)

A wide range of methods for phylogenetic analysis - concentrated in phylogenetic comparative biology, but also including numerous techniques for visualizing, analyzing, manipulating, reading or writing, and even inferring phylogenetic trees. Included among the functions in phylogenetic comparative biology are various for ancestral state reconstruction, model-fitting, and simulation of phylogenies and trait data. A broad range of plotting methods for phylogenies and comparative data include (but are not restricted to) methods for mapping trait evolution on trees, for projecting trees into phenotype space or onto a geographic map, and for visualizing correlated speciation between trees. Lastly, numerous functions are designed for reading, writing, analyzing, inferring, simulating, and manipulating phylogenetic trees and comparative data. For instance, there are functions for computing consensus phylogenies from a set, for simulating phylogenetic trees and data under a range of models, for randomly or non-randomly attaching species or clades to a tree, as well as for a wide range of other manipulations and analyses that phylogenetic biologists might find useful in their research.

Maintained by Liam J. Revell. Last updated 29 days ago.

5.3 match 218 stars 13.85 score 4.8k scripts 76 dependents

ben519

mltools:Machine Learning Tools

A collection of machine learning helper functions, particularly assisting in the Exploratory Data Analysis phase. Makes heavy use of the 'data.table' package for optimal speed and memory efficiency. Highlights include a versatile bin_data() function, sparsify() for converting a data.table to sparse matrix format with one-hot encoding, fast evaluation metrics, and empirical_cdf() for calculating empirical Multivariate Cumulative Distribution Functions.

Maintained by Ben Gorman. Last updated 3 years ago.

exploratory-data-analysis machine-learning

7.5 match 72 stars 9.58 score 1.2k scripts 13 dependents

barbarabodinier

sharp:Stability-enHanced Approaches using Resampling Procedures

In stability selection (N Meinshausen, P Bühlmann (2010) <doi:10.1111/j.1467-9868.2010.00740.x>) and consensus clustering (S Monti et al (2003) <doi:10.1023/A:1023949509487>), resampling techniques are used to enhance the reliability of the results. In this package, hyper-parameters are calibrated by maximising model stability, which is measured under the null hypothesis that all selection (or co-membership) probabilities are identical (B Bodinier et al (2023a) <doi:10.1093/jrsssc/qlad058> and B Bodinier et al (2023b) <doi:10.1093/bioinformatics/btad635>). Functions are readily implemented for the use of LASSO regression, sparse PCA, sparse (group) PLS or graphical LASSO in stability selection, and hierarchical clustering, partitioning around medoids, K means or Gaussian mixture models in consensus clustering.

Maintained by Barbara Bodinier. Last updated 1 year ago.

12.2 match 13 stars 5.91 score 124 scripts

billdenney

PKNCA:Perform Pharmacokinetic Non-Compartmental Analysis

Compute standard Non-Compartmental Analysis (NCA) parameters for typical pharmacokinetic analyses and summarize them.

Maintained by Bill Denney. Last updated 18 days ago.

nca noncompartmental-analysis pharmacokinetics

5.7 match 73 stars 12.61 score 214 scripts 4 dependents

joemsong

FunChisq:Model-Free Functional Chi-Squared and Exact Tests

Statistical hypothesis testing methods for inferring model-free functional dependency using asymptotic chi-squared or exact distributions. Functional test statistics are asymmetric and functionally optimal, unique from other related statistics. Tests in this package reveal evidence for causality based on the causality-by-functionality principle. They include asymptotic functional chi-squared tests (Zhang & Song 2013) <doi:10.48550/arXiv.1311.2707>, an adapted functional chi-squared test (Kumar & Song 2022) <doi:10.1093/bioinformatics/btac206>, and an exact functional test (Zhong & Song 2019) <doi:10.1109/TCBB.2018.2809743> (Nguyen et al. 2020) <doi:10.24963/ijcai.2020/372>. The normalized functional chi-squared test was used by Best Performer 'NMSUSongLab' in HPN-DREAM (DREAM8) Breast Cancer Network Inference Challenges (Hill et al. 2016) <doi:10.1038/nmeth.3773>. A function index (Zhong & Song 2019) <doi:10.1186/s12920-019-0565-9> (Kumar et al. 2018) <doi:10.1109/BIBM.2018.8621502> derived from the functional test statistic offers a new effect size measure for the strength of functional dependency, a better alternative to conditional entropy in many aspects. For continuous data, these tests offer an advantage over regression analysis when a parametric functional form cannot be assumed; for categorical data, they provide a novel means to assess directional dependency not possible with symmetrical Pearson's chi-squared or Fisher's exact tests.

Maintained by Joe Song. Last updated 10 months ago.

cpp

16.3 match 4.37 score 29 scripts

tidymodels

recipes:Preprocessing and Feature Engineering Steps for Modeling

A recipe prepares your data for modeling. An extensible framework for pipeable sequences of feature engineering steps provides preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.

Maintained by Max Kuhn. Last updated 8 hours ago.

3.8 match 584 stars 18.73 score 7.2k scripts 382 dependents

cran

agricolae:Statistical Procedures for Agricultural Research

The original idea was presented in the thesis "A statistical analysis tool for agricultural research", submitted for the degree of Master of Science, National Engineering University (UNI), Lima-Peru. Some experimental data for the examples come from the CIP and other research. Agricolae offers extensive functionality on experimental design, especially for agricultural and plant breeding experiments, which can also be useful for other purposes. It supports planning of lattice, Alpha, Cyclic, Complete Block, Latin Square, Graeco-Latin Squares, augmented block, factorial, split and strip plot designs. There are also various analysis facilities for experimental data, e.g. treatment comparison procedures and several non-parametric comparison tests, biodiversity indexes and consensus cluster.

Maintained by Felipe de Mendiburu. Last updated 1 year ago.

10.0 match 7 stars 7.01 score 15 dependents

emitanaka

edibble:Encapsulating Elements of Experimental Design

A system to facilitate designing comparative (and non-comparative) experiments using the grammar of experimental designs <https://emitanaka.org/edibble-book/>. An experimental design is treated as an intermediate, mutable object that is built progressively by fundamental experimental components like units, treatments, and their relation. The system aids in experimental planning, management and workflow.

Maintained by Emi Tanaka. Last updated 4 months ago.

experimental-designs

9.3 match 217 stars 7.43 score 62 scripts

opengeos

whitebox:'WhiteboxTools' R Frontend

An R frontend for the 'WhiteboxTools' library, which is an advanced geospatial data analysis platform developed by Prof. John Lindsay at the University of Guelph's Geomorphometry and Hydrogeomatics Research Group. 'WhiteboxTools' can be used to perform common geographical information systems (GIS) analysis operations, such as cost-distance analysis, distance buffering, and raster reclassification. Remote sensing and image processing tasks include image enhancement (e.g. panchromatic sharpening, contrast adjustments), image mosaicing, numerous filtering operations, simple classification (k-means), and common image transformations. 'WhiteboxTools' also contains advanced tooling for spatial hydrological analysis (e.g. flow-accumulation, watershed delineation, stream network analysis, sink removal), terrain analysis (e.g. common terrain indices such as slope, curvatures, wetness index, hillshading; hypsometric analysis; multi-scale topographic position analysis), and LiDAR data processing. Suggested citation: Lindsay (2016) <doi:10.1016/j.cageo.2016.07.003>.

Maintained by Andrew Brown. Last updated 5 months ago.

geomorphometry geoprocessing geospatial gis hydrology remote-sensing rstudio

7.1 match 173 stars 9.65 score 203 scripts 2 dependents

verasls

lvmisc:Veras Miscellaneous

Contains a collection of useful functions for basic data computation and manipulation, wrapper functions for generating 'ggplot2' graphics, including statistical model diagnostic plots, methods for computing quality measures of statistical models (such as AIC, BIC, r squared, root mean squared error) and general utilities.

Maintained by Lucas Veras. Last updated 1 year ago.

12.7 match 6 stars 5.40 score 14 scripts 1 dependent

lrberge

fixest:Fast Fixed-Effects Estimations

Fast and user-friendly estimation of econometric models with multiple fixed-effects. Includes ordinary least squares (OLS), generalized linear models (GLM) and the negative binomial. The core of the package is based on optimized parallel C++ code, scaling especially well for large data sets. The method to obtain the fixed-effects coefficients is based on Berge (2018) <https://github.com/lrberge/fixest/blob/master/_DOCS/FENmlm_paper.pdf>. Further provides tools to export and view the results of several estimations with intuitive design to cluster the standard-errors.

Maintained by Laurent Berge. Last updated 7 months ago.

cpp openmp

4.6 match 387 stars 14.69 score 3.8k scripts 25 dependents

heliosdrm

pwr:Basic Functions for Power Analysis

Power analysis functions along the lines of Cohen (1988).

Maintained by Helios De Rosario. Last updated 1 year ago.

5.2 match 105 stars 12.97 score 2.6k scripts 28 dependents

elvanceyhan

nnspat:Nearest Neighbor Methods for Spatial Patterns

Contains the functions for testing the spatial patterns (of segregation, spatial symmetry, association, disease clustering, species correspondence and reflexivity) based on nearest neighbor relations, especially using contingency tables such as nearest neighbor contingency tables (Ceyhan (2010) <doi:10.1007/s10651-008-0104-x> and Ceyhan (2017) <doi:10.1016/j.jkss.2016.10.002> and references therein), nearest neighbor symmetry contingency tables (Ceyhan (2014) <doi:10.1155/2014/698296>), species correspondence contingency tables and reflexivity contingency tables (Ceyhan (2018) <doi:10.2436/20.8080.02.72>) for two (or higher) dimensional data. Also contains functions for generating patterns of segregation, association, uniformity in a multi-class setting (Ceyhan (2014) <doi:10.1007/s00477-013-0824-9>), and various non-random labeling patterns for disease clustering in two dimensional cases (Ceyhan (2014) <doi:10.1002/sim.6053>), and for visualization of all these patterns for the two dimensional data. The tests are usually (asymptotic) normal z-tests and chi-square tests.

Maintained by Elvan Ceyhan. Last updated 3 years ago.

23.0 match 2.90 score 16 scripts

trevorld

gridpattern:'grid' Pattern Grobs

Provides 'grid' grobs that fill in a user-defined area with various patterns. Includes enhanced versions of the geometric and image-based patterns originally contained in the 'ggpattern' package as well as original 'pch', 'polygon_tiling', 'regular_polygon', 'rose', 'text', 'wave', and 'weave' patterns plus support for custom user-defined patterns.

Maintained by Trevor L. Davis. Last updated 1 month ago.

7.9 match 33 stars 8.42 score 4 scripts 4 dependents

amices

mice:Multivariate Imputation by Chained Equations

Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm as described in Van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>. Each variable has its own imputation model. Built-in imputation models are provided for continuous data (predictive mean matching, normal), binary data (logistic regression), unordered categorical data (polytomous logistic regression) and ordered categorical data (proportional odds). MICE can also impute continuous two-level data (normal model, pan, second-level variables). Passive imputation can be used to maintain consistency between variables. Various diagnostic plots are available to inspect the quality of the imputations.

Maintained by Stef van Buuren. Last updated 8 days ago.

chained-equations fcs imputation mice missing-data missing-values multiple-imputation multivariate-data cpp

3.9 match 462 stars 16.50 score 10k scripts 154 dependents

fbertran

plsdof:Degrees of Freedom and Statistical Inference for Partial Least Squares Regression

The plsdof package provides Degrees of Freedom estimates for Partial Least Squares (PLS) Regression. Model selection for PLS is based on various information criteria (aic, bic, gmdl) or on cross-validation. Estimates for the mean and covariance of the PLS regression coefficients are available. They allow the construction of approximate confidence intervals and the application of test procedures (Kramer and Sugiyama 2012 <doi:10.1198/jasa.2011.tm10107>). Further, cross-validation procedures for Ridge Regression and Principal Components Regression are available.

Maintained by Frederic Bertrand. Last updated 2 years ago.

17.3 match 3 stars 3.65 score 30 scripts

cran

sae:Small Area Estimation

Functions for small area estimation.

Maintained by Yolanda Marhuenda. Last updated 5 years ago.

11.5 match 6 stars 5.49 score 83 scripts 8 dependents

ocbe-uio

contingencytables:Statistical Analysis of Contingency Tables

Provides functions to perform statistical inference of data organized in contingency tables. This package is a companion to the "Statistical Analysis of Contingency Tables" book by Fagerland et al. <ISBN 9781466588172>.

Maintained by Waldir Leoncio. Last updated 7 months ago.

contingency-table

15.3 match 3 stars 4.13 score 8 scripts 1 dependent

xiaoruizhu

SurrogateRsq:Goodness-of-Fit Analysis for Categorical Data using the Surrogate R-Squared

To assess and compare the models' goodness of fit, R-squared is one of the most popular measures. For categorical data analysis, however, no universally adopted R-squared measure can resemble the ordinary least squares (OLS) R-squared for linear models with continuous data. This package implements the surrogate R-squared measure for categorical data analysis, which is proposed in the study of Dungang Liu, Xiaorui Zhu, Brandon Greenwell, and Zewei Lin (2022) <doi:10.1111/bmsp.12289>. It can generate a point or interval measure of the surrogate R-squared. It can also provide a ranking measure of the percentage contribution of each variable to the overall surrogate R-squared. This ranking assessment allows one to check the importance of each variable in terms of its explained variance. This package can be jointly used with other existing R packages for variable selection and model diagnostics in the model-building process.

Maintained by Xiaorui (Jeremy) Zhu. Last updated 12 months ago.

categorical-data-analysis goodness-of-fit r-squared-statistic statistics

13.9 match 5 stars 4.48 score 12 scripts

briencj

dae:Functions Useful in the Design and ANOVA of Experiments

The content falls into the following groupings: (i) Data, (ii) Factor manipulation functions, (iii) Design functions, (iv) ANOVA functions, (v) Matrix functions, (vi) Projector and canonical efficiency functions, and (vii) Miscellaneous functions. A vignette called 'DesignNotes' describes how to use the design functions for randomizing and assessing designs. The ANOVA functions facilitate the extraction of information when the 'Error' function has been used in the call to 'aov'. The package 'dae' can also be installed from <http://chris.brien.name/rpackages/>.

Maintained by Chris Brien. Last updated 4 months ago.

7.2 match 1 star 8.62 score 356 scripts 7 dependents

kassambara

rstatix:Pipe-Friendly Framework for Basic Statistical Tests

Provides a simple and intuitive pipe-friendly framework, coherent with the 'tidyverse' design philosophy, for performing basic statistical tests, including t-test, Wilcoxon test, ANOVA, Kruskal-Wallis and correlation analyses. The output of each test is automatically transformed into a tidy data frame to facilitate visualization. Additional functions are available for reshaping, reordering, manipulating and visualizing correlation matrices. Functions are also included to facilitate the analysis of factorial experiments, including purely 'within-Ss' designs (repeated measures), purely 'between-Ss' designs, and mixed 'within-and-between-Ss' designs. It's also possible to compute several effect size metrics, including "eta squared" for ANOVA, "Cohen's d" for t-test and 'Cramer V' for the association between categorical variables. The package contains helper functions for identifying univariate and multivariate outliers, and for assessing normality and homogeneity of variances.

Maintained by Alboukadel Kassambara. Last updated 2 years ago.

4.1 match 456 stars 15.16 score 11k scripts 420 dependents

singmann

afex:Analysis of Factorial Experiments

Convenience functions for analyzing factorial experiments using ANOVA or mixed models. aov_ez(), aov_car(), and aov_4() allow specification of between, within (i.e., repeated-measures), or mixed (i.e., split-plot) ANOVAs for data in long format (i.e., one observation per row), automatically aggregating multiple observations per individual and cell of the design. mixed() fits mixed models using lme4::lmer() and computes p-values for all fixed effects using either Kenward-Roger or Satterthwaite approximation for degrees of freedom (LMM only), parametric bootstrap (LMMs and GLMMs), or likelihood ratio tests (LMMs and GLMMs). afex_plot() provides a high-level interface for interaction or one-way plots using ggplot2, combining raw data and model estimates. afex uses type 3 sums of squares as default (imitating commercial statistical software).

Maintained by Henrik Singmann. Last updated 7 months ago.

4.2 match 123 stars 14.50 score 1.4k scripts 15 dependents

philipppro

measures:Performance Measures for Statistical Learning

Provides the largest number of statistical measures in the whole R world. Includes measures of regression, (multiclass) classification and multilabel classification. The measures come mainly from the 'mlr' package and were programmed by several 'mlr' developers.

Maintained by Philipp Probst. Last updated 4 years ago.

13.4 match 1 star 4.47 score 88 scripts 2 dependents

k-m-m

nnls:The Lawson-Hanson Algorithm for Non-Negative Least Squares (NNLS)

An R interface to the Lawson-Hanson implementation of an algorithm for non-negative least squares (NNLS). Also allows the combination of non-negative and non-positive constraints.

Maintained by Katharine Mullen. Last updated 5 months ago.

fortran

8.4 match 7.13 score 251 scripts 167 dependents

dgbonett

statpsych:Statistical Methods for Psychologists

Implements confidence interval and sample size methods that are especially useful in psychological research. The methods can be applied in 1-group, 2-group, paired-samples, and multiple-group designs and to a variety of parameters including means, medians, proportions, slopes, standardized mean differences, standardized linear contrasts of means, plus several measures of correlation and association. Confidence interval and sample size functions are given for single parameters as well as differences, ratios, and linear contrasts of parameters. The sample size functions can be used to approximate the sample size needed to estimate a parameter or function of parameters with desired confidence interval precision or to perform a variety of hypothesis tests (directional two-sided, equivalence, superiority, noninferiority) with desired power. For details see: Statistical Methods for Psychologists, Volumes 1 – 4, <https://dgbonett.sites.ucsc.edu/>.

Maintained by Douglas G. Bonett. Last updated 3 months ago.

12.4 match 6 stars 4.83 score 15 scripts 1 dependent

declaredesign

estimatr:Fast Estimators for Design-Based Inference

Fast procedures for a small set of commonly-used, design-appropriate estimators with robust standard errors and confidence intervals. Includes estimators for linear regression, instrumental variables regression, difference-in-means, Horvitz-Thompson estimation, and regression improving precision of experimental estimates by interacting treatment with centered pre-treatment covariates introduced by Lin (2013) <doi:10.1214/12-AOAS583>.

Maintained by Graeme Blair. Last updated 1 month ago.

cpp

5.2 match 133 stars 11.58 score 1.7k scripts 11 dependents

nlmixr2

rxode2:Facilities for Simulating from ODE-Based Models

Facilities for running simulations from ordinary differential equation ('ODE') models, such as pharmacometrics and other compartmental models. A compilation manager translates the ODE model into C, compiles it, and dynamically loads the object code into R for improved computational efficiency. An event table object facilitates the specification of complex dosing regimens (optional) and sampling schedules. NB: The use of this package requires both C and Fortran compilers, for details on their use with R please see Section 6.3, Appendix A, and Appendix D in the "R Administration and Installation" manual. Also the code is mostly released under GPL. The 'VODE' and 'LSODA' are in the public domain. The information is available in the inst/COPYRIGHTS.

Maintained by Matthew L. Fidler. Last updated 1 month ago.

fortran openblas cpp openmp

5.3 match 40 stars 11.24 score 220 scripts 13 dependents

chikuang

SLSEdesign:Optimal Regression Design under the Second-Order Least Squares Estimator

Given inputs that include the number of points, a discrete design space, a measure of skewness, models and parameter values, this package calculates the objective value and optimal designs and plots the equivalence theory under A- and D-optimal criteria under the second-order least squares estimator. This package is based on the paper "Properties of optimal regression designs under the second-order least squares estimator" by Chi-Kuang Yeh and Julie Zhou (2021) <doi:10.1007/s00362-018-01076-6>.

Maintained by Chi-Kuang Yeh. Last updated 5 months ago.

convex-optimization cvx design-of-experiments least-squares optimal-designs

12.8 match 4.54 score 2 scripts

khliland

multiblock:Multiblock Data Fusion in Statistics and Machine Learning

Functions and datasets to support Smilde, Næs and Liland (2021, ISBN: 978-1-119-60096-1) "Multiblock Data Fusion in Statistics and Machine Learning - Applications in the Natural and Life Sciences". This implements and imports a large collection of methods for multiblock data analysis with common interfaces, result- and plotting functions, several real data sets and six vignettes covering a range of different applications.

Maintained by Kristian Hovde Liland. Last updated 2 months ago.

cpp

8.6 match 14 stars 6.68 score 19 scripts

paul-buerkner

brms:Bayesian Regression Models using 'Stan'

Fit Bayesian generalized (non-)linear multivariate multilevel models using 'Stan' for full Bayesian inference. A wide range of distributions and link functions are supported, allowing users to fit -- among others -- linear, robust linear, count data, survival, response times, ordinal, zero-inflated, hurdle, and even self-defined mixture models all in a multilevel context. Further modeling options include both theory-driven and data-driven non-linear terms, auto-correlation structures, censoring and truncation, meta-analytic standard errors, and quite a few more. In addition, all parameters of the response distribution can be predicted in order to perform distributional regression. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their prior knowledge. Models can easily be evaluated and compared using several methods assessing posterior or prior predictions. References: Bürkner (2017) <doi:10.18637/jss.v080.i01>; Bürkner (2018) <doi:10.32614/RJ-2018-017>; Bürkner (2021) <doi:10.18637/jss.v100.i05>; Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>.

Maintained by Paul-Christian Bürkner. Last updated 5 days ago.

bayesian-inference brms multilevel-models stan statistical-models

3.4 match 1.3k stars 16.61 score 13k scripts 34 dependents

r-spatial

spdep:Spatial Dependence: Weighting Schemes, Statistics

A collection of functions to create spatial weights matrix objects from polygon 'contiguities', from point patterns by distance and tessellations, for summarizing these objects, and for permitting their use in spatial data analysis, including regional aggregation by minimum spanning tree; a collection of tests for spatial 'autocorrelation', including global 'Morans I' and 'Gearys C' proposed by 'Cliff' and 'Ord' (1973, ISBN: 0850860369) and (1981, ISBN: 0850860814), 'Hubert/Mantel' general cross product statistic, Empirical Bayes estimates and 'Assunção/Reis' (1999) <doi:10.1002/(SICI)1097-0258(19990830)18:16%3C2147::AID-SIM179%3E3.0.CO;2-I> Index, 'Getis/Ord' G ('Getis' and 'Ord' 1992) <doi:10.1111/j.1538-4632.1992.tb00261.x> and multicoloured join count statistics, 'APLE' ('Li et al.') <doi:10.1111/j.1538-4632.2007.00708.x>, local 'Moran's I', 'Gearys C' ('Anselin' 1995) <doi:10.1111/j.1538-4632.1995.tb00338.x> and 'Getis/Ord' G ('Ord' and 'Getis' 1995) <doi:10.1111/j.1538-4632.1995.tb00912.x>, 'saddlepoint' approximations ('Tiefelsdorf' 2002) <doi:10.1111/j.1538-4632.2002.tb01084.x> and exact tests for global and local 'Moran's I' ('Bivand et al.' 2009) <doi:10.1016/j.csda.2008.07.021> and 'LOSH' local indicators of spatial heteroscedasticity ('Ord' and 'Getis') <doi:10.1007/s00168-011-0492-y>. The implementation of most of these measures is described in 'Bivand' and 'Wong' (2018) <doi:10.1007/s11749-018-0599-x>, with further extensions in 'Bivand' (2022) <doi:10.1111/gean.12319>. 'Lagrange' multiplier tests for spatial dependence in linear models are provided ('Anselin et al.' 1996) <doi:10.1016/0166-0462(95)02111-6>, as are 'Rao' score tests for hypothesised spatial 'Durbin' models based on linear models ('Koley' and 'Bera' 2023) <doi:10.1080/17421772.2023.2256810>. A local indicators for categorical data (LICD) implementation based on 'Carrer et al.' (2021) <doi:10.1016/j.jas.2020.105306> and 'Bivand et al.' (2017) <doi:10.1016/j.spasta.2017.03.003> was added in 1.3-7. From 'spdep' and 'spatialreg' versions >= 1.2-1, the model fitting functions previously present in this package are defunct in 'spdep' and may be found in 'spatialreg'.

Maintained by Roger Bivand. Last updated 20 days ago.

spatial-autocorrelation spatial-dependence spatial-weights

3.4 match 131 stars 16.62 score 6.0k scripts 107 dependents

bcjaeger

r2glmm:Computes R Squared for Mixed (Multilevel) Models

The model R squared and semi-partial R squared for the linear and generalized linear mixed model (LMM and GLMM) are computed with confidence limits. The R squared measure from Edwards et al. (2008) <DOI:10.1002/sim.3429> is extended to the GLMM using penalized quasi-likelihood (PQL) estimation (see Jaeger et al. 2016 <DOI:10.1080/02664763.2016.1193725>). Three methods of computation are provided and described as follows. First, the Kenward-Roger approach. Due to some inconsistency between the 'pbkrtest' package and the 'glmmPQL' function, the Kenward-Roger approach in the 'r2glmm' package is limited to the LMM. Second, the method introduced by Nakagawa and Schielzeth (2013) <DOI:10.1111/j.2041-210x.2012.00261.x> and later extended by Johnson (2014) <DOI:10.1111/2041-210X.12225>. The 'r2glmm' package only computes marginal R squared for the LMM and does not generalize the statistic to the GLMM; however, confidence limits and semi-partial R squared for fixed effects are useful additions. Lastly, an approach using standardized generalized variance (SGV) can be used for covariance model selection. Package installation instructions can be found in the readme file.

Maintained by Byron Jaeger. Last updated 10 months ago.

9.0 match 16 stars 6.29 score 243 scripts

gdurif

plsgenomics:PLS Analyses for Genomics

Routines for PLS-based genomic analyses, implementing PLS methods for classification with microarray data and prediction of transcription factor activities from combined ChIP-chip analysis. The >=1.2-1 versions include two new classification methods for microarray data: GSIM and Ridge PLS. The >=1.3 versions include a new classification method combining variable selection and compression in a logistic regression context: logit-SPLS; and an adaptive version of the sparse PLS.

Maintained by Ghislain Durif. Last updated 12 months ago.

10.1 match 5.55 score 140 scripts 2 dependents

khliland

plsVarSel:Variable Selection in Partial Least Squares

Interfaces and methods for variable selection in Partial Least Squares. The methods include filter methods, wrapper methods and embedded methods. Both regression and classification are supported.

Maintained by Kristian Hovde Liland. Last updated 5 days ago.

8.8 match 3 stars 6.33 score 40 scripts 4 dependents

spatstat

spatstat.utils:Utility Functions for 'spatstat'

Contains utility functions for the 'spatstat' family of packages which may also be useful for other purposes.

Maintained by Adrian Baddeley. Last updated 4 days ago.

spatial-analysis spatial-data spatstat

4.8 match 5 stars 11.66 score 134 scripts 248 dependents

asgr

imager:Image Processing Library Based on 'CImg'

Fast image processing for images in up to 4 dimensions (two spatial dimensions, one time/depth dimension, one colour dimension). Provides most traditional image processing tools (filtering, morphology, transformations, etc.) as well as various functions for easily analysing image data using R. The package wraps 'CImg', <http://cimg.eu>, a simple, modern C++ library for image processing.

Maintained by Aaron Robotham. Last updated 29 days ago.

libx11 fftw3 tiff cpp openmp

4.0 match 17 stars 13.62 score 2.4k scripts 45 dependents

laplacesdemonr

LaplacesDemon:Complete Environment for Bayesian Inference

Provides a complete environment for Bayesian inference using a variety of different samplers (see ?LaplacesDemon for an overview).

Maintained by Henrik Singmann. Last updated 12 months ago.

4.0 match 93 stars 13.45 score 1.8k scripts 60 dependents

r-lum

Luminescence:Comprehensive Luminescence Dating Data Analysis

A collection of various R functions for the purpose of Luminescence dating data analysis. This includes, amongst others, data import, export, application of age models, curve deconvolution, sequence analysis and plotting of equivalent dose distributions.

Maintained by Sebastian Kreutzer. Last updated 3 hours ago.

bayesian-statistics data-science geochronology luminescence luminescence-dating open-science osl plotting radiofluorescence tl xsyg cpp

5.0 match 15 stars 10.74 score 178 scripts 8 dependents

lindbrook

cholera:Amend, Augment and Aid Analysis of John Snow's Cholera Map

Amends errors, augments data and aids analysis of John Snow's map of the 1854 London cholera outbreak.

Maintained by lindbrook. Last updated 4 hours ago.

cholera data-visualization datasets epidemiology john-snow public-health triangulation-delaunay voronoi voronoi-polygons

5.8 match 136 stars 9.34 score 95 scripts

briencj

asremlPlus:Augments 'ASReml-R' in Fitting Mixed Models and Packages Generally in Exploring Prediction Differences

Assists in automating the selection of terms to include in mixed models when 'asreml' is used to fit the models. Procedures are available for choosing models that conform to the hierarchy or marginality principle, for fitting and choosing between two-dimensional spatial models using correlation, natural cubic smoothing spline and P-spline models. A history of the fitting of a sequence of models is kept in a data frame. Also used to compute functions and contrasts of, to investigate differences between and to plot predictions obtained using any model fitting function. The content falls into the following natural groupings: (i) Data, (ii) Model modification functions, (iii) Model selection and description functions, (iv) Model diagnostics and simulation functions, (v) Prediction production and presentation functions, (vi) Response transformation functions, (vii) Object manipulation functions, and (viii) Miscellaneous functions (for further details see 'asremlPlus-package' in help). The 'asreml' package provides a computationally efficient algorithm for fitting a wide range of linear mixed models using Residual Maximum Likelihood. It is a commercial package and a license for it can be purchased from 'VSNi' <https://vsni.co.uk/> as 'asreml-R', who will supply a zip file for local installation/updating (see <https://asreml.kb.vsni.co.uk/>). It is not needed for functions that are methods for 'alldiffs' and 'data.frame' objects. The package 'asremlPlus' can also be installed from <http://chris.brien.name/rpackages/>.

Maintained by Chris Brien. Last updated 30 days ago.

asreml mixed-models

5.8 match 19 stars 9.34 score 200 scripts

modeloriented

auditor:Model Audit - Verification, Validation, and Error Analysis

Provides an easy-to-use unified interface for creating validation plots for any model. The 'auditor' helps to avoid repetitive work consisting of writing code needed to create residual plots. These visualizations allow one to assess and compare the goodness of fit, performance, and similarity of models.

Maintained by Alicja Gosiewska. Last updated 1 year ago.

classification error-analysis explainable-artificial-intelligence machine-learning model-validation regression-models residuals xai

6.1 match 58 stars 8.76 score 94 scripts 2 dependents

sparklyr

sparklyr:R Interface to Apache Spark

R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.

Maintained by Edgar Ruiz. Last updated 8 hours ago.

apache-spark distributed dplyr ide livy machine-learning remote-clusters spark sparklyr

3.5 match 959 stars 15.20 score 4.0k scripts 21 dependents

jslefche

piecewiseSEM:Piecewise Structural Equation Modeling

Implements piecewise structural equation modeling from a single list of structural equations, with new methods for non-linear, latent, and composite variables, standardized coefficients, query-based prediction and indirect effects. See <http://jslefche.github.io/piecewiseSEM/> for more.

Maintained by Jon Lefcheck. Last updated 9 months ago.

sem

5.4 match 163 stars 9.85 score 452 scripts

bioc

CMA:Synthesis of microarray-based classification

This package provides a comprehensive collection of various microarray-based classification algorithms both from Machine Learning and Statistics. Variable Selection, Hyperparameter tuning, Evaluation and Comparison can be performed combined or stepwise in a user-friendly environment.

Maintained by Roman Hornung. Last updated 5 months ago.

classification decisiontree

10.4 match 5.09 score 61 scripts

topepo

caret:Classification and Regression Training

Miscellaneous functions for training and plotting classification and regression models.

Maintained by Max Kuhn. Last updated 3 months ago.

2.8 match 1.6k stars 19.24 score 61k scripts 303 dependents

epiforecasts

scoringutils:Utilities for Scoring and Assessing Predictions

Facilitates the evaluation of forecasts in a convenient framework based on data.table. It allows users to check their forecasts and diagnose issues, to visualise forecasts and missing data, to transform data before scoring, to handle missing forecasts, to aggregate scores, and to visualise the results of the evaluation. The package mostly focuses on the evaluation of probabilistic forecasts and allows evaluating several different forecast types and input formats. Find more information about the package in the Vignettes as well as in the accompanying paper, <doi:10.48550/arXiv.2205.07090>.

Maintained by Nikos Bosse. Last updated 15 days ago.

forecast-evaluation forecasting

4.6 match 52 stars 11.37 score 326 scripts 7 dependents

juliasilge

widyr:Widen, Process, then Re-Tidy Data

Encapsulates the pattern of untidying data into a wide matrix, performing some processing, then turning it back into a tidy form. This is useful for several operations such as co-occurrence counts, correlations, or clustering that are mathematically convenient on wide matrices.

Maintained by Julia Silge. Last updated 2 years ago.

4.7 match 328 stars 11.11 score 1.7k scripts 2 dependents

kurthornik

clue:Cluster Ensembles

CLUster Ensembles.

Maintained by Kurt Hornik. Last updated 4 months ago.

5.3 match 2 stars 9.85 score 496 scripts 401 dependents

aalfons

laeken:Estimation of Indicators on Social Exclusion and Poverty

Estimation of indicators on social exclusion and poverty, as well as Pareto tail modeling for empirical income distributions.

Maintained by Andreas Alfons. Last updated 1 year ago.

5.4 match 3 stars 9.57 score 300 scripts 30 dependents

jclavel

mvMORPH:Multivariate Comparative Tools for Fitting Evolutionary Models to Morphometric Data

Fits multivariate (Brownian Motion, Early Burst, ACDC, Ornstein-Uhlenbeck and Shifts) models of continuous trait evolution on trees and time series. 'mvMORPH' also proposes high-dimensional multivariate comparative tools (linear models using Generalized Least Squares and multivariate tests) based on penalized likelihood. See Clavel et al. (2015) <DOI:10.1111/2041-210X.12420>, Clavel et al. (2019) <DOI:10.1093/sysbio/syy045>, and Clavel & Morlon (2020) <DOI:10.1093/sysbio/syaa010>.

Maintained by Julien Clavel. Last updated 1 month ago.

openblas

5.4 match 17 stars 9.46 score 189 scripts 3 dependents

alexanderrobitzsch

miceadds:Some Additional Multiple Imputation Functions, Especially for 'mice'

Contains functions for multiple imputation which complement existing functionality in R. In particular, several imputation methods for the mice package (van Buuren & Groothuis-Oudshoorn, 2011, <doi:10.18637/jss.v045.i03>) are implemented. Main features of the miceadds package include plausible value imputation (Mislevy, 1991, <doi:10.1007/BF02294457>), multilevel imputation for variables at any level or with any number of hierarchical and non-hierarchical levels (Grund, Luedtke & Robitzsch, 2018, <doi:10.1177/1094428117703686>; van Buuren, 2018, Ch.7, <doi:10.1201/9780429492259>), imputation using partial least squares (PLS) for high dimensional predictors (Robitzsch, Pham & Yanagida, 2016), nested multiple imputation (Rubin, 2003, <doi:10.1111/1467-9574.00217>), substantive model compatible imputation (Bartlett et al., 2015, <doi:10.1177/0962280214521348>), and features for the generation of synthetic datasets (Reiter, 2005, <doi:10.1111/j.1467-985X.2004.00343.x>; Nowok, Raab, & Dibben, 2016, <doi:10.18637/jss.v074.i11>).

Maintained by Alexander Robitzsch. Last updated 17 days ago.

missing-data multiple-imputation openblas cpp

5.6 match 16 stars 9.16 score 542 scripts 9 dependents

djnavarro

lsr:Companion to "Learning Statistics with R"

A collection of tools intended to make introductory statistics easier to teach, including wrappers for common hypothesis tests and basic data manipulation. It accompanies Navarro, D. J. (2015). Learning Statistics with R: A Tutorial for Psychology Students and Other Beginners, Version 0.6.

Maintained by Danielle Navarro. Last updated 3 years ago.

5.3 match 12 stars 9.55 score 1.7k scripts 11 dependents

thomasp85

ggraph:An Implementation of Grammar of Graphics for Graphs and Networks

The grammar of graphics as implemented in ggplot2 is a poor fit for graph and network visualizations due to its reliance on tabular data input. ggraph is an extension of the ggplot2 API tailored to graph visualizations and provides the same flexible approach to building up plots layer by layer.

Maintained by Thomas Lin Pedersen. Last updated 1 year ago.

ggplot-extension ggplot2 graph-visualization network-visualization visualization cpp

3.0 match 1.1k stars 16.96 score 9.2k scripts 111 dependents

statdivlab

corncob:Count Regression for Correlated Observations with the Beta-Binomial

Statistical modeling for correlated count data using the beta-binomial distribution, described in Martin et al. (2020) <doi:10.1214/19-AOAS1283>. It allows for both mean and overdispersion covariates.

Maintained by Amy D Willis. Last updated 1 day ago.

5.2 match 106 stars 9.82 score 248 scripts 1 dependent

moviedo5

fda.usc:Functional Data Analysis and Utilities for Statistical Computing

Routines for exploratory and descriptive analysis of functional data such as depth measurements, atypical curves detection, regression models, supervised classification, unsupervised classification and functional analysis of variance.

Maintained by Manuel Oviedo de la Fuente. Last updated 4 months ago.

functional-data-analysis fortran

5.2 match 12 stars 9.72 score 560 scripts 22 dependents

yihui

animation:A Gallery of Animations in Statistics and Utilities to Create Animations

Provides functions for animations in statistics, covering topics in probability theory, mathematical statistics, multivariate statistics, non-parametric statistics, sampling survey, linear models, time series, computational statistics, data mining and machine learning. These functions may be helpful in teaching statistics and data analysis. Also provided in this package are a series of functions to save animations to various formats, e.g. Flash, 'GIF', HTML pages, 'PDF' and videos. 'PDF' animations can be inserted into 'Sweave' / 'knitr' easily.

Maintained by Yihui Xie. Last updated 2 years ago.

animation statistical-computing statistical-graphics statistics

4.1 match 208 stars 12.08 score 2.5k scripts 29 dependents

rvlenth

lsmeans:Least-Squares Means

Obtain least-squares means for linear, generalized linear, and mixed models. Compute contrasts or linear functions of least-squares means, and comparisons of slopes. Plots and compact letter displays. Least-squares means were proposed in Harvey, W (1960) "Least-squares analysis of data with unequal subclass numbers", Tech Report ARS-20-8, USDA National Agricultural Library, and discussed further in Searle, Speed, and Milliken (1980) "Population marginal means in the linear model: An alternative to least squares means", The American Statistician 34(4), 216-221 <doi:10.1080/00031305.1980.10483031>. NOTE: lsmeans now relies primarily on code in the 'emmeans' package. 'lsmeans' will be archived in the near future.

Maintained by Russell Lenth. Last updated 6 years ago.

6.4 match 12 stars 7.82 score 1.8k scripts

mathscell

nlsic:Non Linear Least Squares with Inequality Constraints

We solve nonlinear least squares problems with optional equality and/or inequality constraints. Nonlinear iterations are globalized with a backtracking method. Linear problems are solved by dense QR decomposition from 'LAPACK', which can limit the size of treated problems. On the other hand, we avoid the condition number degradation that occurs in the classical quadratic programming approach. Treatment of inequality constraints at each nonlinear iteration is based on the 'NNLS' method (by Lawson and Hanson). We provide an original function 'lsi_ln' for solving linear least squares problems with inequality constraints in the least-norm sense. Thus, if the Jacobian of the problem is rank deficient, a solution can still be provided. However, truncation errors are probable in this case. Equality constraints are treated by using a basis of the null space. The user-defined function calculating residuals must return a list containing the residual vector (not its squared sum) and the Jacobian. If the Jacobian is not in the returned list, the package 'numDeriv' is used to calculate a finite-difference version of the Jacobian. The 'NLSIC' method was first published in Sokol et al. (2012) <doi:10.1093/bioinformatics/btr716>.

Maintained by Serguei Sokol. Last updated 9 months ago.

17.7 match 1 star 2.78 score 4 scripts 1 dependent

jensharbers

agricolaeplotr:Visualization of Design of Experiments from the 'agricolae' Package

Visualization of Design of Experiments from the 'agricolae' package with the 'ggplot2' framework. The user provides an experiment design from the 'agricolae' package, calls the corresponding function, and receives a visualization built with 'ggplot2'-based functions that are specific to each design. As there are many different designs, each design is tested on its type. The output can be modified with standard 'ggplot2' commands or with other packages that provide 'ggplot2' function extensions.

Maintained by Jens Harbers. Last updated 2 months ago.

7.8 match 8 stars 6.27 score 78 scripts

r-forge

survey:Analysis of Complex Survey Samples

Summary statistics, two-sample tests, rank tests, generalised linear models, cumulative link models, Cox models, loglinear models, and general maximum pseudolikelihood estimation for multistage stratified, cluster-sampled, unequally weighted survey samples. Variances by Taylor series linearisation or replicate weights. Post-stratification, calibration, and raking. Two-phase and multiphase subsampling designs. Graphics. PPS sampling without replacement. Small-area estimation. Dual-frame designs.

Maintained by Thomas Lumley. Last updated 6 months ago.

cpp

3.5 match 1 star 13.93 score 13k scripts 235 dependents

bioboot

bio3d:Biological Structure Analysis

Utilities to process, organize and explore protein structure, sequence and dynamics data. Features include the ability to read and write structure, sequence and dynamic trajectory data, perform sequence and structure database searches, data summaries, atom selection, alignment, superposition, rigid core identification, clustering, torsion analysis, distance matrix analysis, structure and sequence conservation analysis, normal mode analysis, principal component analysis of heterogeneous structure data, and correlation network analysis from normal mode and molecular dynamics data. In addition, various utility functions are provided to enable the statistical and graphical power of the R environment to work with biological sequence and structural data. Please refer to the URLs below for more information.

Maintained by Barry Grant. Last updated 5 months ago.

zlib cpp

5.7 match 5 stars 8.49 score 1.4k scripts 10 dependents

floschuberth

cSEM:Composite-Based Structural Equation Modeling

Estimate, assess, test, and study linear, nonlinear, hierarchical and multigroup structural equation models using composite-based approaches and procedures, including estimation techniques such as partial least squares path modeling (PLS-PM) and its derivatives (PLSc, ordPLSc, robustPLSc), generalized structured component analysis (GSCA), generalized structured component analysis with uniqueness terms (GSCAm), generalized canonical correlation analysis (GCCA), principal component analysis (PCA), factor score regression (FSR) using sum score, regression or Bartlett scores (including bias correction using Croon’s approach), as well as several tests and typical postestimation procedures (e.g., verify admissibility of the estimates, assess the model fit, test the model fit etc.).

Maintained by Florian Schuberth. Last updated 10 hours ago.

5.1 match 28 stars 9.22 score 56 scripts 2 dependents

rstudio

tfprobability:Interface to 'TensorFlow Probability'

Interface to 'TensorFlow Probability', a 'Python' library built on 'TensorFlow' that makes it easy to combine probabilistic models and deep learning on modern hardware ('TPU', 'GPU'). 'TensorFlow Probability' includes a wide selection of probability distributions and bijectors, probabilistic layers, variational inference, Markov chain Monte Carlo, and optimizers such as Nelder-Mead, BFGS, and SGLD.

Maintained by Tomasz Kalinowski. Last updated 3 years ago.

5.5 match 54 stars 8.63 score 221 scripts 3 dependents

jeksterslab

betaSandwich:Robust Confidence Intervals for Standardized Regression Coefficients

Generates robust confidence intervals for standardized regression coefficients using heteroskedasticity-consistent standard errors for models fitted by lm() as described in Dudgeon (2017) <doi:10.1007/s11336-017-9563-z>. The package can also be used to generate confidence intervals for R-squared, adjusted R-squared, and differences of standardized regression coefficients. A description of the package and code examples are presented in Pesigan, Sun, and Cheung (2023) <doi:10.1080/00273171.2023.2201277>.

Maintained by Ivan Jacob Agaloos Pesigan. Last updated 2 months ago.

confidence-intervals heteroskedasticity-consistent-standard-errors standardized-regression-coefficients

11.5 match 4.11 score 16 scripts

cran

SPSL:Site Percolation on Square Lattices (SPSL)

Provides basic functionality for labeling iso- & anisotropic percolation clusters on 2D & 3D square lattices with various lattice sizes, occupation probabilities, von Neumann & Moore (1,d)-neighborhoods, and random variables weighting the percolation lattice sites.

Maintained by Pavel V. Moskalev. Last updated 6 years ago.

32.0 match 1.48 score 1 dependent
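
As a rough illustration of what labeling percolation clusters on a 2D square lattice with a von Neumann neighborhood involves, here is a minimal Python sketch. It is not the SPSL interface: the function name `label_clusters` and the breadth-first flood fill are assumptions made for this example only.

```python
from collections import deque


def label_clusters(occupied):
    """Label 4-connected (von Neumann) clusters of occupied sites.

    `occupied` is a list of rows of booleans. The result is a grid of ints:
    0 marks an empty site and 1, 2, ... mark distinct clusters.
    (Hypothetical helper for illustration, not part of SPSL.)
    """
    rows, cols = len(occupied), len(occupied[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if not occupied[r][c] or labels[r][c]:
                continue  # empty site, or already assigned to a cluster
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                i, j = queue.popleft()
                # von Neumann neighborhood: up, down, left, right
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    inside = 0 <= ni < rows and 0 <= nj < cols
                    if inside and occupied[ni][nj] and not labels[ni][nj]:
                        labels[ni][nj] = next_label
                        queue.append((ni, nj))
    return labels
```

Adding the four diagonal offsets to the neighbor list would give a Moore neighborhood instead; sites that touch only at a corner are separate clusters under the von Neumann rule used here.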

cran

MuMIn:Multi-Model Inference

Tools for model selection and model averaging with support for a wide range of statistical models. Automated model selection through subsets of the maximum model, with optional constraints for model inclusion. Averaging of model parameters and predictions based on model weights derived from information criteria (AICc and the like) or custom model weighting schemes.

Maintained by Kamil Bartoń. Last updated 9 months ago.

5.3 match 8 stars 8.84 score 5.6k scripts 27 dependents

gastonstat

plspm:Tools for Partial Least Squares Path Modeling (PLS-PM)

Partial Least Squares Path Modeling (PLS-PM) analysis for both metric and non-metric data, as well as REBUS analysis for latent class detection.

Maintained by Gaston Sanchez. Last updated 3 years ago.

path-modeling pls-sem plspm

6.8 match 67 stars 6.97 score 115 scripts

collinerickson

GauPro:Gaussian Process Fitting

Fits a Gaussian process model to data. Gaussian processes are commonly used in computer experiments to fit an interpolating model. The model is stored as an 'R6' object and can be easily updated with new data. There are options to run in parallel, and 'Rcpp' has been used to speed up calculations. For more info about Gaussian process software, see Erickson et al. (2018) <doi:10.1016/j.ejor.2017.10.002>.

Maintained by Collin Erickson. Last updated 2 days ago.

openblas cpp openmp

5.6 match 16 stars 8.44 score 104 scripts 1 dependent

philchalmers

SimDesign:Structure for Organizing Monte Carlo Simulation Designs

Provides tools to safely and efficiently organize and execute Monte Carlo simulation experiments in R. The package controls the structure and back-end of Monte Carlo simulation experiments by utilizing a generate-analyse-summarise workflow. The workflow safeguards against common simulation coding issues: it automatically re-simulates non-convergent results, prevents inadvertent overwriting of simulation files, catches error and warning messages during execution, implicitly supports parallel processing with high-quality random number generation, and provides tools for managing high-performance computing (HPC) array jobs submitted to schedulers such as SLURM. For a pedagogical introduction to the package see Sigal and Chalmers (2016) <doi:10.1080/10691898.2016.1246953>. For a more in-depth overview of the package and its design philosophy see Chalmers and Adkins (2020) <doi:10.20982/tqmp.16.4.p248>.

Maintained by Phil Chalmers. Last updated 1 day ago.

monte-carlo-simulation simulation simulation-framework

3.5 match 62 stars 13.38 score 253 scripts 46 dependents

bioc

TDbasedUFE:Tensor Decomposition Based Unsupervised Feature Extraction

A comprehensive package for tensor-decomposition-based unsupervised feature extraction. It is applicable to gene expression, DNA methylation, histone modification, and similar data, and it can perform multiomics analysis. It is also potentially applicable to single-cell omics data sets.

Maintained by Y-h. Taguchi. Last updated 5 months ago.

geneexpression featureextraction methylationarray singlecell bioinformatics dna-methylation gene-expression-profiles histone-modifications multiomics tensor-decomposition

8.5 match 5 stars 5.48 score 9 scripts 1 dependent

friendly

heplots:Visualizing Hypothesis Tests in Multivariate Linear Models

Provides HE plot and other functions for visualizing hypothesis tests in multivariate linear models. HE plots represent sums-of-squares-and-products matrices for linear hypotheses and for error using ellipses (in two dimensions) and ellipsoids (in three dimensions). The related 'candisc' package provides visualizations in a reduced-rank canonical discriminant space when there are more than a few response variables.

Maintained by Michael Friendly. Last updated 11 days ago.

linear-hypotheses matrices multivariate-linear-models plot repeated-measure-designs visualizing-hypothesis-tests

4.0 match 9 stars 11.49 score 1.1k scripts 7 dependents

biomodhub

biomod2:Ensemble Platform for Species Distribution Modeling

Functions for species distribution modeling, calibration and evaluation, ensemble of models, ensemble forecasting and visualization. The package can consistently run up to 10 single models on a presence/absence (or presence/pseudo-absence) dataset and combine them into ensemble models and ensemble projections. A range of other evaluation and visualisation tools is also available within the package.

Maintained by Maya Guéguen. Last updated 1 day ago.

3.3 match 95 stars 13.90 score 536 scripts 7 dependents

jangraffelman

HardyWeinberg:Statistical Tests and Graphics for Hardy-Weinberg Equilibrium

Contains tools for exploring Hardy-Weinberg equilibrium (Hardy, 1908; Weinberg, 1908) for bi- and multi-allelic genetic marker data. All classical tests (chi-square, exact, likelihood-ratio and permutation tests) with bi-allelic variants are included in the package, as well as functions for power computation and for the simulation of marker data under equilibrium and disequilibrium. Routines for dealing with markers on the X-chromosome are included (Graffelman & Weir, 2016) <doi:10.1038/hdy.2016.20>, including Bayesian procedures. Some exact and permutation procedures also work with multi-allelic variants. Special test procedures that jointly address Hardy-Weinberg equilibrium and equality of allele frequencies in both sexes are supplied, for the bi- and multi-allelic case. Functions for testing equilibrium in the presence of missing data by using multiple imputation are also provided. Implements several graphics for exploring the equilibrium status of a large set of bi-allelic markers: ternary plots with acceptance regions, log-ratio plots and Q-Q plots. The functionality of the package is explained in detail in a related JSS paper <doi:10.18637/jss.v064.i03>.

Maintained by Jan Graffelman. Last updated 12 months ago.

cpp

7.3 match 6.30 score 167 scripts 4 dependents
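
For readers unfamiliar with the classical chi-square test mentioned in this entry, the following Python sketch computes the statistic for a bi-allelic marker from genotype counts. It is a textbook calculation without continuity correction, not the package's own code; the function name `hwe_chisq` is an assumption for illustration.

```python
def hwe_chisq(n_aa, n_ab, n_bb):
    """Chi-square statistic for Hardy-Weinberg equilibrium (bi-allelic marker).

    Takes observed genotype counts AA, AB, BB. No continuity correction is
    applied, and a monomorphic sample (an expected count of zero) is not
    handled. Hypothetical helper, not the HardyWeinberg package API.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1 - p
    # Expected genotype counts under equilibrium: n*p^2, 2*n*p*q, n*q^2
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

The statistic is compared against a chi-square distribution with one degree of freedom. For counts (30, 40, 30) the allele frequency is 0.5, the expected counts are (25, 50, 25), and the statistic is 1 + 2 + 1 = 4.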

cran

irrICC:Intraclass Correlations for Quantifying Inter-Rater Reliability

Calculates various intraclass correlation coefficients used to quantify inter-rater and intra-rater reliability. The assumption here is that the raters produced quantitative ratings. Most of the statistical procedures implemented in this package are described in detail in Gwet, K.L. (2014, ISBN:978-0970806284): "Handbook of Inter-Rater Reliability," 4th edition, Advanced Analytics, LLC.

Maintained by Kilem L. Gwet. Last updated 5 years ago.

15.2 match 3.00 score

jmcurran

Hotelling:Hotelling's T^2 Test and Variants

A set of R functions which implements Hotelling's T^2 test and some variants of it. Functions are also included for Aitchison's additive log ratio and centred log ratio transformations.

Maintained by James Curran. Last updated 4 years ago.

6.7 match 2 stars 6.78 score 139 scripts 3 dependents
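
The two Aitchison log-ratio transformations named in this entry can be sketched in a few lines of Python. This is a minimal illustration, assuming strictly positive parts and taking the last part as the reference for the additive transform; the function names `clr` and `alr` are chosen for this example and are not the package's API.

```python
import math


def clr(x):
    """Centred log ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def alr(x):
    """Additive log ratio: log of each part relative to the last part.

    Using the last part as the reference is an assumption of this sketch.
    """
    return [math.log(v / x[-1]) for v in x[:-1]]
```

The centred transform keeps all D coordinates, which sum to zero, while the additive transform returns D - 1 coordinates.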

r-forge

robustbase:Basic Robust Statistics

"Essential" Robust Statistics. Tools for analyzing data with robust methods. This includes regression methodology, including model selection, and multivariate statistics, where we strive to cover the book "Robust Statistics, Theory and Methods" by 'Maronna, Martin and Yohai'; Wiley 2006.

Maintained by Martin Maechler. Last updated 4 months ago.

fortran openblas

3.4 match 13.33 score 1.7k scripts 480 dependents

r-forge

expm:Matrix Exponential, Log, 'etc'

Computation of the matrix exponential, logarithm, sqrt, and related quantities, using traditional and modern methods.

Maintained by Martin Maechler. Last updated 5 months ago.

fortran openblas

3.8 match 11.91 score 1.3k scripts 432 dependents

fbertran

plsRcox:Partial Least Squares Regression for Cox Models and Related Techniques

Provides partial least squares regression and various regular, sparse, or kernel techniques for fitting Cox models in high-dimensional settings <doi:10.1093/bioinformatics/btu660>, Bastien, P., Bertrand, F., Meyer N., Maumy-Bertrand, M. (2015), Deviance residuals-based sparse PLS and sparse kernel PLS regression for censored data, Bioinformatics, 31(3):397-404. Cross validation criteria were studied in <arXiv:1810.02962>, Bertrand, F., Bastien, Ph. and Maumy-Bertrand, M. (2018), Cross validating extensions of kernel, sparse or regular partial least squares regression models to censored data.

Maintained by Frederic Bertrand. Last updated 2 years ago.

8.7 match 4 stars 5.13 score 56 scripts 2 dependents

faosorios

fastmatrix:Fast Computation of some Matrices Useful in Statistics

A small set of functions for fast computation of some matrices and operations useful in statistics and econometrics. Currently, there are functions for efficient computation of duplication, commutation and symmetrizer matrices with minimal storage requirements. Some commonly used matrix decompositions (LU and LDL), basic matrix operations (for instance, Hadamard, Kronecker products and the Sherman-Morrison formula) and iterative solvers for linear systems are also available. In addition, the package includes a number of common statistical procedures such as the sweep operator, weighted mean and covariance matrix using an online algorithm, linear regression (using Cholesky, QR, SVD, sweep operator and conjugate gradients methods), ridge regression (with optimal selection of the ridge parameter considering several procedures), omnibus tests for univariate normality, functions to compute the multivariate skewness, kurtosis, the Mahalanobis distance (checking the positive definiteness), and the Wilson-Hilferty transformation of gamma variables. Furthermore, the package provides interfaces to C code callable by another C code from other R packages.

Maintained by Felipe Osorio. Last updated 1 year ago.

commutation-matrix jarque-bera-test ldl-factorization lu-factorization matrix-api-for-r-packages matrix-norms modified-cholesky ols-regression power-method ridge-regression sherman-morrison statistics sweep-operator symmetrizer-matrix fortran openblas

7.1 match 19 stars 6.27 score 37 scripts 10 dependents

spatstat

spatstat.data:Datasets for 'spatstat' Family

Contains all the datasets for the 'spatstat' family of packages.

Maintained by Adrian Baddeley. Last updated 2 days ago.

kernel-density point-process spatial-analysis spatial-data spatial-data-analysis spatstat statistical-analysis statistical-methods statistical-tests statistics

4.0 match 6 stars 11.07 score 186 scripts 228 dependents

kevhuy

WALS:Weighted-Average Least Squares Model Averaging

Implements Weighted-Average Least Squares model averaging for negative binomial regression models of Huynh (2024) <doi:10.48550/arXiv.2404.11324>, generalized linear models of De Luca, Magnus, Peracchi (2018) <doi:10.1016/j.jeconom.2017.12.007> and linear regression models of Magnus, Powell, Pruefer (2010) <doi:10.1016/j.jeconom.2009.07.004>, see also Magnus, De Luca (2016) <doi:10.1111/joes.12094>. Weighted-Average Least Squares for the linear regression model is based on the original 'MATLAB' code by Magnus and De Luca <https://www.janmagnus.nl/items/WALS.pdf>, see also Kumar, Magnus (2013) <doi:10.1007/s13571-013-0060-9> and De Luca, Magnus (2011) <doi:10.1177/1536867X1201100402>.

Maintained by Kevin Huynh. Last updated 9 months ago.

13.9 match 1 star 3.18 score 1 script

rrwen

draw:Wrapper Functions for Producing Graphics

A set of user-friendly wrapper functions for creating consistent graphics and diagrams with lines, common shapes, text, and page settings. Compatible with and based on the R 'grid' package.

Maintained by Richard Wen. Last updated 7 years ago.

box circle curve diagram draw graphics grid line page rectangle reproducible shape square text triangle

10.0 match 2 stars 4.39 score 35 scripts

rudjer

SparseM:Sparse Linear Algebra

Some basic linear algebra functionality for sparse matrices is provided: including Cholesky decomposition and backsolving as well as standard R subsetting and Kronecker products.

Maintained by Roger Koenker. Last updated 8 months ago.

fortran

3.8 match 3 stars 11.47 score 306 scripts 1.5k dependents

k3jph

cmna:Computational Methods for Numerical Analysis

Provides the source and examples for James P. Howard, II, "Computational Methods for Numerical Analysis with R," <https://jameshoward.us/cmna/>, a book on numerical methods in R.

Maintained by James Howard. Last updated 4 years ago.

bisection differential-equations heat-equation interpolation least-squares matrix-factorization monte-carlo newton numerical-analysis optimization partial-differential-equations quadrature root-finding secant splines testthat traveling-salesperson wave-equation

7.5 match 16 stars 5.65 score 62 scripts 3 dependents

trevorhastie

glmnet:Lasso and Elastic-Net Regularized Generalized Linear Models

Extremely efficient procedures for fitting the entire lasso or elastic-net regularization path for linear regression, logistic and multinomial regression models, Poisson regression, Cox model, multiple-response Gaussian, and the grouped multinomial regression; see <doi:10.18637/jss.v033.i01> and <doi:10.18637/jss.v039.i05>. There are two new and important additions. The family argument can be a GLM family object, which opens the door to any programmed family (<doi:10.18637/jss.v106.i01>). This comes with a modest computational cost, so when the built-in families suffice, they should be used instead. The other novelty is the relax option, which refits each of the active sets in the path unpenalized. The algorithm uses cyclical coordinate descent in a path-wise fashion, as described in the papers cited.

Maintained by Trevor Hastie. Last updated 2 years ago.

fortran cpp

2.8 match 82 stars 15.15 score 22k scripts 736 dependents

cmollica

PLMIX:Bayesian Analysis of Finite Mixture of Plackett-Luce Models

Fit finite mixtures of Plackett-Luce models for partial top rankings/orderings within the Bayesian framework. It provides MAP point estimates via the EM algorithm and posterior MCMC simulations via Gibbs Sampling. It also fits MLE as a special case of the noninformative Bayesian analysis with vague priors. In addition to inferential techniques, the package assists other fundamental phases of a model-based analysis for partial rankings/orderings, by including functions for data manipulation, simulation, descriptive summary, model selection and goodness-of-fit evaluation. Main references on the methods are Mollica and Tardella (2017) <doi:10.1007/s11336-016-9530-0> and Mollica and Tardella (2014) <doi:10.1002/sim.6224>.

Maintained by Cristina Mollica. Last updated 4 years ago.

cpp

13.4 match 3.15 score 28 scripts

ludovikcoba

rrecsys:Environment for Evaluating Recommender Systems

Processes standard recommendation datasets (e.g., a user-item rating matrix) as input and generates rating predictions and lists of recommended items. The standard algorithm implementations included in this package are the following: Global/Item/User-Average baselines, Weighted Slope One, Item-Based KNN, User-Based KNN, FunkSVD, BPR and weighted ALS. They can be assessed according to the standard offline evaluation methodology (Shani, et al. (2011) <doi:10.1007/978-0-387-85820-3_8>) for recommender systems using measures such as MAE, RMSE, Precision, Recall, F1, AUC, NDCG, RankScore and coverage measures. The package (Coba, et al. (2017) <doi:10.1007/978-3-319-60042-0_36>) is intended for rapid prototyping of recommendation algorithms and educational purposes.

Maintained by Ludovik Çoba. Last updated 3 years ago.

cpp

6.1 match 23 stars 6.84 score 25 scripts

bioxgeo

geodiv:Methods for Calculating Gradient Surface Metrics

Methods for calculating gradient surface metrics for continuous analysis of landscape features.

Maintained by Annie C. Smith. Last updated 1 year ago.

cpp

7.1 match 11 stars 5.88 score 23 scripts 1 dependent

stan-dev

rstanarm:Bayesian Applied Regression Modeling via Stan

Estimates previously compiled regression models using the 'rstan' package, which provides the R interface to the Stan C++ library for Bayesian estimation. Users specify models via the customary R syntax with a formula and data.frame plus some additional arguments for priors.

Maintained by Ben Goodrich. Last updated 9 months ago.

bayesian bayesian-data-analysis bayesian-inference bayesian-methods bayesian-statistics multilevel-models rstan rstanarm stan statistical-modeling cpp

2.7 match 393 stars 15.68 score 5.0k scripts 13 dependents

radiant-rstats

radiant.data:Data Menu for Radiant: Business Analytics using R and Shiny

The Radiant Data menu includes interfaces for loading, saving, viewing, visualizing, summarizing, transforming, and combining data. It also contains functionality to generate reproducible reports of the analyses conducted in the application.

Maintained by Vincent Nijs. Last updated 5 months ago.

5.0 match 54 stars 8.30 score 146 scripts 6 dependents

rvaradhan

SQUAREM:Squared Extrapolation Methods for Accelerating EM-Like Monotone Algorithms

Algorithms for accelerating the convergence of slow, monotone sequences from a smooth contraction mapping such as the EM algorithm. It can be used to accelerate any smooth, linearly convergent acceleration scheme. A tutorial style introduction to this package is available in a vignette on the CRAN download page or, when the package is loaded in an R session, with vignette("SQUAREM"). Refer to the J Stat Software article: <doi:10.18637/jss.v092.i07>.

Maintained by Ravi Varadhan. Last updated 4 years ago.

4.5 match 2 stars 9.26 score 84 scripts 502 dependents
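
To give a feel for squared extrapolation applied to a fixed-point map, here is a simplified Python sketch. It assumes one commonly cited step-length choice, alpha = -||r|| / ||v||, and omits the step-length bounds and objective-function safeguards a production implementation would use. The function name `squarem_sketch` and its parameters are hypothetical and do not mirror the package's R interface.

```python
import math


def squarem_sketch(fixptfn, x0, tol=1e-10, max_iter=100):
    """Simplified squared-extrapolation iteration for x = fixptfn(x).

    `fixptfn` maps a list of floats to a list of floats.
    Illustrative only: no step-length bounds or monotonicity checks.
    """
    x = list(x0)
    for _ in range(max_iter):
        x1 = fixptfn(x)
        r = [b - a for a, b in zip(x, x1)]            # first difference
        norm_r = math.sqrt(sum(d * d for d in r))
        if norm_r < tol:
            return x1
        x2 = fixptfn(x1)
        v = [c - 2 * b + a for a, b, c in zip(x, x1, x2)]  # second difference
        norm_v = math.sqrt(sum(d * d for d in v))
        if norm_v == 0.0:
            return x2
        alpha = -norm_r / norm_v                      # assumed step-length rule
        x_new = [a - 2 * alpha * ri + alpha ** 2 * vi
                 for a, ri, vi in zip(x, r, v)]
        x = fixptfn(x_new)                            # stabilizing map step
    return x
```

A call such as `squarem_sketch(lambda x: [math.cos(x[0])], [1.0])` iterates toward the fixed point of the cosine map; each pass costs three evaluations of the map, which is the trade-off against plain fixed-point iteration.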

pecanproject

PEcAn.benchmark:PEcAn Functions Used for Benchmarking

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. The PEcAn.benchmark package provides utilities for comparing models and data, including a suite of statistical metrics and plots.

Maintained by Mike Dietze. Last updated 4 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

3.9 match 216 stars 10.70 score 416 scripts 11 dependents

spatstat

spatstat.explore:Exploratory Data Analysis for the 'spatstat' Family

Functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.

Maintained by Adrian Baddeley. Last updated 2 days ago.

cluster-detection confidence-intervals hypothesis-testing k-function roc-curves scan-statistics significance-testing simulation-envelopes spatial-analysis spatial-data-analysis spatial-sharpening spatial-smoothing spatial-statistics

4.1 match 1 star 10.18 score 67 scripts 149 dependents

geomorphr

geomorph:Geometric Morphometric Analyses of 2D and 3D Landmark Data

Read, manipulate, and digitize landmark data, generate shape variables via Procrustes analysis for points, curves and surfaces, perform shape analyses, and provide graphical depictions of shapes and patterns of shape variation.

Maintained by Dean Adams. Last updated 1 month ago.

3.4 match 76 stars 12.05 score 700 scripts 6 dependents

eikeluedeling

chillR:Statistical Methods for Phenology Analysis in Temperate Fruit Trees

The phenology of plants (i.e. the timing of their annual life phases) depends on climatic cues. For temperate trees and many other plants, spring phases, such as leaf emergence and flowering, have been found to result from the effects of both cool (chilling) conditions and heat. Fruit tree scientists (pomologists) have developed some metrics to quantify chilling and heat (e.g. see Luedeling (2012) <doi:10.1016/j.scienta.2012.07.011>). 'chillR' contains functions for processing temperature records into chilling (Chilling Hours, Utah Chill Units and Chill Portions) and heat units (Growing Degree Hours). Regarding chilling metrics, Chill Portions are often considered the most promising, but they are difficult to calculate. This package makes it easy. 'chillR' also contains procedures for conducting a PLS analysis relating phenological dates (e.g. bloom dates) to either mean temperatures or mean chill and heat accumulation rates, based on long-term weather and phenology records (Luedeling and Gassner (2012) <doi:10.1016/j.agrformet.2011.10.020>). As of version 0.65, it also includes functions for generating weather scenarios with a weather generator, for conducting climate change analyses for temperature-based climatic metrics and for plotting results from such analyses. Since version 0.70, 'chillR' contains a function for interpolating hourly temperature records.

Maintained by Eike Luedeling. Last updated 4 months ago.

cpp

6.7 match 3 stars 6.13 score 346 scripts 1 dependent

cran

wavethresh:Wavelets Statistics and Transforms

Performs 1, 2 and 3D real and complex-valued wavelet transforms, nondecimated transforms, wavelet packet transforms, nondecimated wavelet packet transforms, multiple wavelet transforms, wavelet shrinkage for various kinds of data, locally stationary wavelet time series, nonstationary multiscale transfer function modeling, and density estimation.

Maintained by Guy Nason. Last updated 7 months ago.

7.0 match 5.89 score 41 dependents

homerhanumat

tigerstats:R Functions for Elementary Statistics

A collection of data sets and functions that are useful in the teaching of statistics at an elementary level to students who may have little or no previous experience with the command line. The functions for elementary inferential procedures follow a uniform interface for user input. Some of the functions are instructional applets that can only be run in the RStudio integrated development environment with the package 'manipulate' installed. Other instructional applets are Shiny apps that may be run locally. In teaching, the package is used alongside the packages 'mosaic', 'mosaicData' and 'abd', which are therefore listed as dependencies.

Maintained by Homer White. Last updated 4 years ago.

7.1 match 16 stars 5.77 score 327 scripts

bioc

ropls:PCA, PLS(-DA) and OPLS(-DA) for multivariate analysis and feature selection of omics data

Latent variable modeling with Principal Component Analysis (PCA) and Partial Least Squares (PLS) is a powerful approach for visualization, regression, classification, and feature selection of omics data where the number of variables exceeds the number of samples and where there is multicollinearity among variables. Orthogonal Partial Least Squares (OPLS) makes it possible to model separately the variation correlated with the factor of interest (predictive) and the uncorrelated (orthogonal) variation. While performing similarly to PLS, OPLS facilitates interpretation. Successful applications of these chemometrics techniques include spectroscopic data such as Raman spectroscopy, nuclear magnetic resonance (NMR), mass spectrometry (MS) in metabolomics and proteomics, but also transcriptomics data. In addition to scores, loadings and weights plots, the package provides metrics and graphics to determine the optimal number of components (e.g. with the R2 and Q2 coefficients), check the validity of the model by permutation testing, detect outliers, and perform feature selection (e.g. with Variable Importance in Projection or regression coefficients). The package can be accessed via a user interface on the Workflow4Metabolomics.org online resource for computational metabolomics (built upon the Galaxy environment).

Maintained by Etienne A. Thevenot. Last updated 5 months ago.

regression classification principalcomponent transcriptomics proteomics metabolomics lipidomics massspectrometry immunooncology

5.4 match 7.55 score 210 scripts 8 dependents

mobiodiv

mobsim:Spatial Simulation and Scale-Dependent Analysis of Biodiversity Changes

Simulation, analysis and sampling of spatial biodiversity data (May, Gerstner, McGlinn, Xiao & Chase 2017) <doi:10.1111/2041-210x.12986>. In the simulation tools, users define the numbers of species and individuals, the species abundance distribution and species aggregation. Functions for analysis include species rarefaction and accumulation curves, species-area relationships and the distance decay of similarity.

Maintained by Felix May. Last updated 3 months ago.

biodiversity macroecology point-pattern-analysis rarefaction simulation species species-abundance-distributions cpp

5.2 match 20 stars 7.84 score 76 scripts

rsquaredacademy

inferr:Inferential Statistics

A select set of parametric and non-parametric statistical tests. 'inferr' builds upon the solid set of statistical tests provided in the 'stats' package by including additional data types as inputs, and by expanding and restructuring the test results. The tests included are t tests, variance tests, proportion tests, chi square tests, Levene's test, McNemar Test, Cochran's Q test and Runs test.

Maintained by Aravind Hebbali. Last updated 4 months ago.

inference inferential-statistics non-parametric parametric statistical-tests cpp

6.6 match 37 stars 6.10 score 34 scripts

sigbertklinke

exams.forge:Support for Compiling Examination Tasks using the 'exams' Package

The main aim is to further facilitate the creation of exercises based on the package 'exams' by Grün, B., and Zeileis, A. (2009) <doi:10.18637/jss.v029.i10>. Creating effective student exercises involves challenges such as creating appropriate data sets and ensuring access to intermediate values for accurate explanation of solutions. The functionality includes the generation of univariate and bivariate data including simple time series, functions for theoretical distributions and their approximation, statistical and mathematical calculations for tasks in basic statistics courses as well as general tasks such as string manipulation, LaTeX/HTML formatting and the editing of XML task files for 'Moodle'.

Maintained by Sigbert Klinke. Last updated 8 months ago.

15.0 match 2.70 score 1 script

mwheymans

psfmi:Prediction Model Pooling, Selection and Performance Evaluation Across Multiply Imputed Datasets

Pooling, backward and forward selection of linear, logistic and Cox regression models in multiply imputed datasets. Backward and forward selection can be done from the pooled model using Rubin's Rules (RR), the D1, D2, D3, D4 and the median p-values method. This is also possible for Mixed models. The models can contain continuous, dichotomous, categorical and restricted cubic spline predictors and interaction terms between all these types of predictors. The stability of the models can be evaluated using (cluster) bootstrapping. The package further contains functions to pool model performance measures such as ROC/AUC, Reclassification, R-squared, scaled Brier score, H&L test and calibration plots for logistic regression models. Internal validation can be done across multiply imputed datasets with cross-validation or bootstrapping. The adjusted intercept after shrinkage of pooled regression coefficients can be obtained. Backward and forward selection as part of internal validation is possible. A function to externally validate logistic prediction models in multiply imputed datasets is available, as is a function to compare models. For Cox models a strata variable can be included. Eekhout (2017) <doi:10.1186/s12874-017-0404-7>. Wiel (2009) <doi:10.1093/biostatistics/kxp011>. Marshall (2009) <doi:10.1186/1471-2288-9-57>.

Maintained by Martijn Heymans. Last updated 2 years ago.

cox-regression imputation imputed-datasets logistic multiple-imputation pool predictor regression selection spline spline-predictors

5.6 match 10 stars 7.17 score 70 scripts

cran

bifurcatingr:Bifurcating Autoregressive Models

Estimation of bifurcating autoregressive models of any order, p, BAR(p) as well as several types of bias correction for the least squares estimators of the autoregressive parameters as described in Zhou and Basawa (2005) <doi:10.1016/j.spl.2005.04.024> and Elbayoumi and Mostafa (2020) <doi:10.1002/sta4.342>. Currently, the bias correction methods supported include bootstrap (single, double and fast-double) bias correction and linear-bias-function-based bias correction. Functions for generating and plotting bifurcating autoregressive data from any BAR(p) model are also included. This new version includes calculating several types of bias-corrected and -uncorrected confidence intervals for the least squares estimators of the autoregressive parameters as described in Elbayoumi and Mostafa (2023) <doi:10.6339/23-JDS1092>.

Maintained by Tamer Elbayoumi. Last updated 11 months ago.

14.9 match 2.70 score

tomaspinall

LSMRealOptions:Value American and Real Options Through LSM Simulation

The least-squares Monte Carlo (LSM) simulation method is a popular method for the approximation of the value of early and multiple exercise options. 'LSMRealOptions' provides implementations of the LSM simulation method to value American option products and capital investment projects through real options analysis. 'LSMRealOptions' values capital investment projects with cash flows dependent upon underlying state variables that are stochastically evolving, providing analysis into the timing and critical values at which investment is optimal. 'LSMRealOptions' provides flexibility in the stochastic processes followed by underlying assets, the number of state variables, basis functions and underlying asset characteristics to allow a broad range of assets to be valued through the LSM simulation method. Real options projects are further able to be valued whilst considering construction periods, time-varying initial capital expenditures and path-dependent operational flexibility including the ability to temporarily shut down or permanently abandon projects after initial investment has occurred. The LSM simulation method was first presented in the prolific work of Longstaff and Schwartz (2001) <doi:10.1093/rfs/14.1.113>.

Maintained by Thomas Aspinall. Last updated 4 years ago.

9.3 match 4.26 score 12 scripts 1 dependent

gi0na

ghypernet:Fit and Simulate Generalised Hypergeometric Ensembles of Graphs

Provides functions for model fitting and selection of generalised hypergeometric ensembles of random graphs (gHypEG). To learn how to use it, check the vignettes for a quick tutorial. Please reference its use as Casiraghi, G., Nanumyan, V. (2019) <doi:10.5281/zenodo.2555300> together with the relevant references from those listed below. The package is based on the research developed at the Chair of Systems Design, ETH Zurich. Casiraghi, G., Nanumyan, V., Scholtes, I., Schweitzer, F. (2016) <arXiv:1607.02441>. Casiraghi, G., Nanumyan, V., Scholtes, I., Schweitzer, F. (2017) <doi:10.1007/978-3-319-67256-4_11>. Casiraghi, G. (2017) <arXiv:1702.02048>. Brandenberger, L., Casiraghi, G., Nanumyan, V., Schweitzer, F. (2019) <doi:10.1145/3341161.3342926>. Casiraghi, G. (2019) <doi:10.1007/s41109-019-0241-1>. Casiraghi, G., Nanumyan, V. (2021) <doi:10.1038/s41598-021-92519-y>. Casiraghi, G. (2021) <doi:10.1088/2632-072X/ac0493>.

Maintained by Giona Casiraghi. Last updated 11 months ago.

data-mining data-science graphs network network-analysis random-graph-generation random-graphs

7.0 match 8 stars 5.68 score 20 scripts

chrisaberson

pwr2ppl:Power Analyses for Common Designs (Power to the People)

Statistical power analysis for designs including t-tests, correlations, multiple regression, ANOVA, mediation, and logistic regression. Functions accompany Aberson (2019) <doi:10.4324/9781315171500>.

Maintained by Chris Aberson. Last updated 3 years ago.

9.5 match 17 stars 4.16 score 17 scripts

twolodzko

extraDistr:Additional Univariate and Multivariate Distributions

Density, distribution function, quantile function and random generation for a number of univariate and multivariate distributions. This package implements the following distributions: Bernoulli, beta-binomial, beta-negative binomial, beta prime, Bhattacharjee, Birnbaum-Saunders, bivariate normal, bivariate Poisson, categorical, Dirichlet, Dirichlet-multinomial, discrete gamma, discrete Laplace, discrete normal, discrete uniform, discrete Weibull, Frechet, gamma-Poisson, generalized extreme value, Gompertz, generalized Pareto, Gumbel, half-Cauchy, half-normal, half-t, Huber density, inverse chi-squared, inverse-gamma, Kumaraswamy, Laplace, location-scale t, logarithmic, Lomax, multivariate hypergeometric, multinomial, negative hypergeometric, non-standard beta, normal mixture, Poisson mixture, Pareto, power, reparametrized beta, Rayleigh, shifted Gompertz, Skellam, slash, triangular, truncated binomial, truncated normal, truncated Poisson, Tukey lambda, Wald, zero-inflated binomial, zero-inflated negative binomial, zero-inflated Poisson.

Maintained by Tymoteusz Wolodzko. Last updated 13 days ago.

c-plus-plus c-plus-plus-11 distribution multivariate-distributions probability random-generation rcpp statistics cpp

3.4 match 53 stars 11.60 score 1.5k scripts 107 dependents

winvector

vtreat:A Statistically Sound 'data.frame' Processor/Conditioner

A 'data.frame' processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. 'vtreat' prepares variables so that data has fewer exceptional cases, making it easier to safely use models in production. Common problems 'vtreat' defends against: 'Inf', 'NA', too many categorical levels, rare categorical levels, and new categorical levels (levels seen during application, but not during training). Reference: "'vtreat': a data.frame Processor for Predictive Modeling", Zumel, Mount, 2016, <DOI:10.5281/zenodo.1173313>.

Maintained by John Mount. Last updated 2 months ago.

categorical-variables machine-learning-algorithms nested-models prepare-data

3.5 match 285 stars 11.19 score 328 scripts 1 dependent

r-forge

isotone:Active Set and Generalized PAVA for Isotone Optimization

Contains two main functions: one for solving general isotone regression problems using the pool-adjacent-violators algorithm (PAVA); another one provides a framework for active set methods for isotone optimization problems with arbitrary order restrictions. Various types of loss functions are prespecified.

Maintained by Patrick Mair. Last updated 3 months ago.

5.7 match 6.88 score 80 scripts 13 dependents

robinhankin

ResistorArray:Electrical Properties of Resistor Networks

Electrical properties of resistor networks using matrix methods.

Maintained by Robin K. S. Hankin. Last updated 1 year ago.

9.0 match 4.32 score 14 scripts 1 dependent

vegandevs

vegan:Community Ecology Package

Ordination methods, diversity analysis and other functions for community and vegetation ecologists.

Maintained by Jari Oksanen. Last updated 18 days ago.

ecological-modelling ecology ordination fortran openblas

2.0 match 472 stars 19.41 score 15k scripts 440 dependents

mdplot

MDplot:Visualising Molecular Dynamics Analyses

Automates plot generation following common molecular dynamics analyses. This includes straightforward plots, such as RMSD (Root-Mean-Square-Deviation) and RMSF (Root-Mean-Square-Fluctuation), but also more sophisticated ones such as dihedral angle maps, hydrogen bonds, cluster bar plots and DSSP (Definition of Secondary Structure of Proteins) analysis. Currently able to load GROMOS, GROMACS and AMBER formats.

Maintained by Christian Margreitter. Last updated 3 years ago.

6.0 match 27 stars 6.46 score 36 scripts

jmsigner

amt:Animal Movement Tools

Manage and analyze animal movement data. The functionality of 'amt' includes methods to calculate home ranges, track statistics (e.g. step lengths, speed, or turning angles), prepare data for fitting habitat selection analyses, and simulation of space-use from fitted step-selection functions.

Maintained by Johannes Signer. Last updated 4 months ago.

3.7 match 41 stars 10.54 score 418 scripts

zdebruine

RcppML:Rcpp Machine Learning Library

Fast machine learning algorithms including matrix factorization and divisive clustering for large sparse and dense matrices.

Maintained by Zach DeBruine. Last updated 2 years ago.

clustering matrix-factorization nmf rcpp rcppeigen sparse-matrix cpp openmp

3.7 match 104 stars 10.53 score 125 scripts 46 dependents

symbolixau

h3r:Hexagonal Hierarchical Geospatial Indexing System

Provides access to Uber's 'H3' geospatial indexing system via 'h3lib' <https://CRAN.R-project.org/package=h3lib>. 'h3r' is designed to mimic the 'H3' Application Programming Interface (API) <https://h3geo.org/docs/api/indexing/>, so that any function in the API is also available in 'h3r'.

Maintained by David Cooley. Last updated 3 months ago.

8.6 match 5 stars 4.52 score 33 scripts

optad

adoptr:Adaptive Optimal Two-Stage Designs

Optimize one- or two-arm, two-stage designs for clinical trials with respect to several implemented objective criteria or custom objectives. Optimization under uncertainty and conditional (given stage-one outcome) constraints are supported. See Pilz et al. (2019) <doi:10.1002/sim.8291> and Kunzmann et al. (2021) <doi:10.18637/jss.v098.i09> for details.

Maintained by Maximilian Pilz. Last updated 6 months ago.

5.4 match 1 star 7.09 score 39 scripts 1 dependent

ebbertd

chisq.posthoc.test:A Post Hoc Analysis for Pearson's Chi-Squared Test for Count Data

Perform post hoc analysis based on the residuals of Pearson's Chi-squared Test for Count Data, following T. Mark Beasley & Randall E. Schumacker (1995) <doi:10.1080/00220973.1995.9943797>.

Maintained by Daniel Ebbert. Last updated 5 years ago.

chisq-test chisquare chisquare-test

7.7 match 2 stars 4.99 score 98 scripts

pachadotdev

gravity:Estimation Methods for Gravity Models

A wrapper of different standard estimation methods for gravity models. This package provides estimation methods for log-log models and multiplicative models.

Maintained by Mauricio Vargas. Last updated 4 months ago.

bvu bvw ddm econometrics glm gpml gravity international-trade lm maximum-likelihood nbpml nls ols ppml sils tobit trade

5.5 match 35 stars 6.98 score 55 scripts

afrimapr

afrilearndata:Small Africa Map Datasets for Learning

Small African datasets to help with learning and teaching of spatial techniques and mapping. Part of the afrimapr project. Intended to provide analysts based in Africa with more easily relatable example datasets. Includes R objects for points, lines, polygons and rasters, plus source files in formats including .gpkg, .shp, .kml, .tif, .grd and .csv.

Maintained by Andy South. Last updated 3 years ago.

map spatial teaching visualization

10.4 match 15 stars 3.68 score 64 scripts

mlverse

luz:Higher Level 'API' for 'torch'

A high-level interface for 'torch' providing utilities to reduce the amount of code needed for common tasks, abstract away torch details and make the same code work on both the 'CPU' and 'GPU'. It's flexible enough to support expressing a large range of models. It's heavily inspired by 'fastai' by Howard et al. (2020) <arXiv:2002.04688>, 'Keras' by Chollet et al. (2015) and 'PyTorch Lightning' by Falcon et al. (2019) <doi:10.5281/zenodo.3828935>.

Maintained by Daniel Falbel. Last updated 6 months ago.

3.9 match 89 stars 9.86 score 318 scripts 4 dependents

bdwilliamson

vimp:Perform Inference on Algorithm-Agnostic Variable Importance

Calculate point estimates of and valid confidence intervals for nonparametric, algorithm-agnostic variable importance measures in high and low dimensions, using flexible estimators of the underlying regression functions. For more information about the methods, please see Williamson et al. (Biometrics, 2020), Williamson et al. (JASA, 2021), and Williamson and Feng (ICML, 2020).

Maintained by Brian D. Williamson. Last updated 1 month ago.

machine-learning nonparametric-statistics statistical-inference variable-importance

5.6 match 23 stars 6.79 score 67 scripts

bioc

snpStats:SnpMatrix and XSnpMatrix classes and methods

Classes and statistical methods for large SNP association studies. This extends the earlier snpMatrix package, allowing for uncertainty in genotypes.

Maintained by David Clayton. Last updated 5 months ago.

microarray snp geneticvariability zlib

4.0 match 9.48 score 674 scripts 20 dependents

jeffreyevans

yaImpute:Nearest Neighbor Observation Imputation and Evaluation Tools

Performs nearest neighbor-based imputation using one or more alternative approaches to processing multivariate data. These include methods based on canonical correlation analysis, canonical correspondence analysis, and a multivariate adaptation of the random forest classification and regression techniques of Leo Breiman and Adele Cutler. Additional methods are also offered. The package includes functions for comparing the results from running alternative techniques, detecting imputation targets that are notably distant from reference observations, detecting and correcting for bias, bootstrapping and building ensemble imputations, and mapping results.

Maintained by Jeffrey S. Evans. Last updated 6 months ago.

imputation cpp

5.1 match 3 stars 7.40 score 94 scripts 12 dependents

merck

r2rtf:Easily Create Production-Ready Rich Text Format (RTF) Tables and Figures

Create production-ready Rich Text Format (RTF) tables and figures with flexible format.

Maintained by Benjamin Wang. Last updated 8 days ago.

3.5 match 78 stars 10.82 score 171 scripts 10 dependents