R-universe search: variance

bioc

variancePartition:Quantify and interpret drivers of variation in multilevel gene expression experiments

Quantify and interpret multiple sources of biological and technical variation in gene expression experiments. Uses a linear mixed model to quantify variation in gene expression attributable to individual, tissue, time point, or technical variables. Includes dream differential expression analysis for repeated measures.

Maintained by Gabriel E. Hoffman. Last updated 2 months ago.

rnaseq geneexpression genesetenrichment differentialexpression batcheffect qualitycontrol regression epigenetics functionalgenomics transcriptomics normalization preprocessing microarray immunooncology software

30.3 match 7 stars 11.69 score 1.1k scripts 3 dependents

cran

nlme:Linear and Nonlinear Mixed Effects Models

Fit and compare Gaussian linear and nonlinear mixed-effects models.

Maintained by R Core Team. Last updated 2 months ago.

fortran

27.0 match 6 stars 13.00 score 13k scripts 8.7k dependents

svkucheryavski

mdatools:Multivariate Data Analysis for Chemometrics

Projection based methods for preprocessing, exploring and analysis of multivariate data used in chemometrics. S. Kucheryavskiy (2020) <doi:10.1016/j.chemolab.2020.103937>.

Maintained by Sergey Kucheryavskiy. Last updated 8 months ago.

44.9 match 35 stars 7.37 score 220 scripts 1 dependents

jepusto

clubSandwich:Cluster-Robust (Sandwich) Variance Estimators with Small-Sample Corrections

Provides several cluster-robust variance estimators (i.e., sandwich estimators) for ordinary and weighted least squares linear regression models, including the bias-reduced linearization estimator introduced by Bell and McCaffrey (2002) <https://www150.statcan.gc.ca/n1/pub/12-001-x/2002002/article/9058-eng.pdf> and developed further by Pustejovsky and Tipton (2017) <DOI:10.1080/07350015.2016.1247004>. The package includes functions for estimating the variance- covariance matrix and for testing single- and multiple- contrast hypotheses based on Wald test statistics. Tests of single regression coefficients use Satterthwaite or saddle-point corrections. Tests of multiple- contrast hypotheses use an approximation to Hotelling's T-squared distribution. Methods are provided for a variety of fitted models, including lm() and mlm objects, glm(), geeglm() (from package 'geepack'), ivreg() (from package 'AER'), ivreg() (from package 'ivreg' when estimated by ordinary least squares), plm() (from package 'plm'), gls() and lme() (from 'nlme'), lmer() (from `lme4`), robu() (from 'robumeta'), and rma.uni() and rma.mv() (from 'metafor').

Maintained by James Pustejovsky. Last updated 16 days ago.

28.8 match 48 stars 11.25 score 656 scripts 4 dependents

covaruber

sommer:Solving Mixed Model Equations in R

Structural multivariate-univariate linear mixed model solver for estimation of multiple random effects with unknown variance-covariance structures (e.g., heterogeneous and unstructured) and known covariance among levels of random effects (e.g., pedigree and genomic relationship matrices) (Covarrubias-Pazaran, 2016 <doi:10.1371/journal.pone.0156744>; Maier et al., 2015 <doi:10.1016/j.ajhg.2014.12.006>; Jensen et al., 1997). REML estimates can be obtained using the Direct-Inversion Newton-Raphson and Direct-Inversion Average Information algorithms for the problems r x r (r being the number of records) or using the Henderson-based average information algorithm for the problem c x c (c being the number of coefficients to estimate). Spatial models can also be fitted using the two-dimensional spline functionality available.

Maintained by Giovanny Covarrubias-Pazaran. Last updated 22 days ago.

average-information mixed-models rcpparmadillo openblas cpp openmp

23.6 match 43 stars 12.70 score 300 scripts 9 dependents

psychmeta

psychmeta:Psychometric Meta-Analysis Toolkit

Tools for computing bare-bones and psychometric meta-analyses and for generating psychometric data for use in meta-analysis simulations. Supports bare-bones, individual-correction, and artifact-distribution methods for meta-analyzing correlations and d values. Includes tools for converting effect sizes, computing sporadic artifact corrections, reshaping meta-analytic databases, computing multivariate corrections for range variation, and more. Bugs can be reported to <https://github.com/psychmeta/psychmeta/issues> or <issues@psychmeta.com>.

Maintained by Jeffrey A. Dahlke. Last updated 9 months ago.

hacktoberfest meta-analysis psychology psychometric psychometrics

33.1 match 57 stars 8.25 score 151 scripts

smac-group

avar:Allan Variance

Implements the allan variance and allan variance linear regression estimator for latent time series models. More details about the method can be found, for example, in Guerrier, S., Molinari, R., & Stebler, Y. (2016) <doi:10.1109/LSP.2016.2541867>.

Maintained by Stéphane Guerrier. Last updated 3 years ago.

allan-variance inertial-sensors statistics time-series cpp

55.4 match 5 stars 4.88 score 9 scripts

gaynorr

AlphaSimR:Breeding Program Simulations

The successor to the 'AlphaSim' software for breeding program simulation [Faux et al. (2016) <doi:10.3835/plantgenome2016.02.0013>]. Used for stochastic simulations of breeding programs to the level of DNA sequence for every individual. Contained is a wide range of functions for modeling common tasks in a breeding program, such as selection and crossing. These functions allow for constructing simulations of highly complex plant and animal breeding programs via scripting in the R software environment. Such simulations can be used to evaluate overall breeding program performance and conduct research into breeding program design, such as implementation of genomic selection. Included is the 'Markovian Coalescent Simulator' ('MaCS') for fast simulation of biallelic sequences according to a population demographic history [Chen et al. (2009) <doi:10.1101/gr.083634.108>].

Maintained by Chris Gaynor. Last updated 5 months ago.

breeding genomics simulation openblas cpp openmp

24.0 match 47 stars 10.22 score 534 scripts 2 dependents

andrisignorell

DescTools:Tools for Descriptive Statistics

A collection of miscellaneous basic statistic functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox, which facilitates the (notoriously time consuming) first descriptive tasks in data analysis, consisting of calculating descriptive statistics, drawing graphical summaries and reporting the results. The package contains furthermore functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here, was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages which are not needed at all), and to provide a common and consistent interface as far as function and arguments naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in absence of convincing alternatives). The 'BigCamelCase' style was consequently applied to functions borrowed from contributed R packages as well.

Maintained by Andri Signorell. Last updated 9 hours ago.

fortran cpp

13.6 match 87 stars 16.70 score 7.7k scripts 99 dependents

smac-group

wv:Wavelet Variance

Provides a series of tools to compute and plot quantities related to classical and robust wavelet variance for time series and regular lattices. More details can be found, for example, in Serroukh, A., Walden, A.T., & Percival, D.B. (2000) <doi:10.2307/2669537> and Guerrier, S. & Molinari, R. (2016) <arXiv:1607.05858>.

Maintained by Stéphane Guerrier. Last updated 2 years ago.

signal-processing time-series wavelet-variance openblas cpp

41.9 match 17 stars 5.43 score 15 scripts 1 dependents

wviechtb

metafor:Meta-Analysis Package for R

A comprehensive collection of functions for conducting meta-analyses in R. The package includes functions to calculate various effect sizes or outcome measures, fit equal-, fixed-, random-, and mixed-effects models to such data, carry out moderator and meta-regression analyses, and create various types of meta-analytical plots (e.g., forest, funnel, radial, L'Abbe, Baujat, bubble, and GOSH plots). For meta-analyses of binomial and person-time data, the package also provides functions that implement specialized methods, including the Mantel-Haenszel method, Peto's method, and a variety of suitable generalized linear (mixed-effects) models (i.e., mixed-effects logistic and Poisson regression models). Finally, the package provides functionality for fitting meta-analytic multivariate/multilevel models that account for non-independent sampling errors and/or true effects (e.g., due to the inclusion of multiple treatment studies, multiple endpoints, or other forms of clustering). Network meta-analyses and meta-analyses accounting for known correlation structures (e.g., due to phylogenetic relatedness) can also be conducted. An introduction to the package can be found in Viechtbauer (2010) <doi:10.18637/jss.v036.i03>.

Maintained by Wolfgang Viechtbauer. Last updated 2 days ago.

meta-analysis mixed-effects multilevel-models multivariate

13.6 match 246 stars 16.30 score 4.9k scripts 92 dependents

tushiqi

MAnorm2:Tools for Normalizing and Comparing ChIP-seq Samples

Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is the premier technology for profiling genome-wide localization of chromatin-binding proteins, including transcription factors and histones with various modifications. This package provides a robust method for normalizing ChIP-seq signals across individual samples or groups of samples. It also designs a self-contained system of statistical models for calling differential ChIP-seq signals between two or more biological conditions as well as for calling hypervariable ChIP-seq signals across samples. Refer to Tu et al. (2021) <doi:10.1101/gr.262675.120> and Chen et al. (2022) <doi:10.1186/s13059-022-02627-9> for associated statistical details.

Maintained by Shiqi Tu. Last updated 2 years ago.

chip-seq differential-analysis empirical-bayes winsorize-values

36.4 match 32 stars 5.48 score 19 scripts

emmanuelparadis

ape:Analyses of Phylogenetics and Evolution

Functions for reading, writing, plotting, and manipulating phylogenetic trees, analyses of comparative data in a phylogenetic framework, ancestral character analyses, analyses of diversification and macroevolution, computing distances from DNA sequences, reading and writing nucleotide sequences as well as importing from BioConductor, and several tools such as Mantel's test, generalized skyline plots, graphical exploration of phylogenetic data (alex, trex, kronoviz), estimation of absolute evolutionary rates and clock-like trees using mean path lengths and penalized likelihood, dating trees with non-contemporaneous sequences, translating DNA into AA sequences, and assessing sequence alignments. Phylogeny estimation can be done with the NJ, BIONJ, ME, MVR, SDM, and triangle methods, and several methods handling incomplete distance matrices (NJ*, BIONJ*, MVR*, and the corresponding triangle method). Some functions call external applications (PhyML, Clustal, T-Coffee, Muscle) whose results are returned into R.

Maintained by Emmanuel Paradis. Last updated 14 hours ago.

openblas cpp

11.4 match 64 stars 17.22 score 13k scripts 599 dependents

csblatvia

vardpoor:Variance Estimation for Sample Surveys by the Ultimate Cluster Method

Generation of domain variables, linearization of several non-linear population statistics (the ratio of two totals, weighted income percentile, relative median income ratio, at-risk-of-poverty rate, at-risk-of-poverty threshold, Gini coefficient, gender pay gap, the aggregate replacement ratio, the relative median income ratio, median income below at-risk-of-poverty gap, income quintile share ratio, relative median at-risk-of-poverty gap), computation of regression residuals in case of weight calibration, variance estimation of sample surveys by the ultimate cluster method (Hansen, Hurwitz and Madow, Sample Survey Methods And Theory, vol. I: Methods and Applications; vol. II: Theory. 1953, New York: John Wiley and Sons), variance estimation for longitudinal, cross-sectional measures and measures of change for single and multistage stage cluster sampling designs (Berger, Y. G., 2015, <doi:10.1111/rssa.12116>). Several other precision measures are derived - standard error, the coefficient of variation, the margin of error, confidence interval, design effect.

Maintained by Martins Liberts. Last updated 3 years ago.

sampling statistics variance

33.6 match 10 stars 5.78 score 240 scripts

bschneidr

svrep:Tools for Creating, Updating, and Analyzing Survey Replicate Weights

Provides tools for creating and working with survey replicate weights, extending functionality of the 'survey' package from Lumley (2004) <doi:10.18637/jss.v009.i08>. Implements bootstrap methods for complex surveys, including the generalized survey bootstrap as described by Beaumont and Patak (2012) <doi:10.1111/j.1751-5823.2011.00166.x>. Methods are provided for applying nonresponse adjustments to both full-sample and replicate weights as described by Rust and Rao (1996) <doi:10.1177/096228029600500305>. Implements methods for sample-based calibration described by Opsomer and Erciulescu (2021) <https://www150.statcan.gc.ca/n1/pub/12-001-x/2021002/article/00006-eng.htm>. Diagnostic functions are included to compare weights and weighted estimates from different sets of replicate weights.

Maintained by Ben Schneider. Last updated 7 days ago.

22.8 match 8 stars 8.12 score 54 scripts 3 dependents

xrobin

pROC:Display and Analyze ROC Curves

Tools for visualizing, smoothing and comparing receiver operating characteristic (ROC curves). (Partial) area under the curve (AUC) can be compared with statistical tests based on U-statistics or bootstrap. Confidence intervals can be computed for (p)AUC or ROC curves.

Maintained by Xavier Robin. Last updated 4 months ago.

bootstrapping covariance hypothesis-testing machine-learning plot plotting roc roc-curve variance cpp

12.0 match 125 stars 15.18 score 16k scripts 445 dependents

cran

samplingVarEst:Sampling Variance Estimation

Functions to calculate some point estimators and estimate their variance under unequal probability sampling without replacement. Single and two-stage sampling designs are considered. Some approximations for the second-order inclusion probabilities (joint inclusion probabilities) are available (sample and population based). A variety of Jackknife variance estimators are implemented. Almost every function is written in C (compiled) code for faster results. The functions incorporate some performance improvements for faster results with large datasets.

Maintained by Emilio Lopez Escobar. Last updated 2 years ago.

78.8 match 1 stars 2.27 score 62 scripts 1 dependents

spatstat

spatstat.model:Parametric Statistical Modelling and Inference for the 'spatstat' Family

Functionality for parametric statistical modelling and inference for spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Supports parametric modelling, formal statistical inference, and model validation. Parametric models include Poisson point processes, Cox point processes, Neyman-Scott cluster processes, Gibbs point processes and determinantal point processes. Models can be fitted to data using maximum likelihood, maximum pseudolikelihood, maximum composite likelihood and the method of minimum contrast. Fitted models can be simulated and predicted. Formal inference includes hypothesis tests (quadrat counting tests, Cressie-Read tests, Clark-Evans test, Berman test, Diggle-Cressie-Loosmore-Ford test, scan test, studentised permutation test, segregation test, ANOVA tests of fitted models, adjusted composite likelihood ratio test, envelope tests, Dao-Genton test, balanced independent two-stage test), confidence intervals for parameters, and prediction intervals for point counts. Model validation techniques include leverage, influence, partial residuals, added variable plots, diagnostic plots, pseudoscore residual plots, model compensators and Q-Q plots.

Maintained by Adrian Baddeley. Last updated 8 days ago.

analysis-of-variance cluster-process confidence-intervals cox-process determinantal-point-processes gibbs-process influence leverage model-diagnostics neyman-scott parameter-estimation poisson-process spatial-analysis spatial-modelling spatial-point-processes statistical-inference

19.5 match 5 stars 9.09 score 6 scripts 46 dependents

rpolars

polars:Lightning-Fast 'DataFrame' Library

Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.

Maintained by Soren Welling. Last updated 3 days ago.

arrow polars rust

14.4 match 499 stars 12.01 score 1.0k scripts 2 dependents

cran

queueing:Analysis of Queueing Networks and Models

It provides versatile tools for analysis of birth and death based Markovian Queueing Models and Single and Multiclass Product-Form Queueing Networks. It implements M/M/1, M/M/c, M/M/Infinite, M/M/1/K, M/M/c/K, M/M/c/c, M/M/1/K/K, M/M/c/K/K, M/M/c/K/m, M/M/Infinite/K/K, Multiple Channel Open Jackson Networks, Multiple Channel Closed Jackson Networks, Single Channel Multiple Class Open Networks, Single Channel Multiple Class Closed Networks and Single Channel Multiple Class Mixed Networks. Also it provides a B-Erlang, C-Erlang and Engset calculators. This work is dedicated to the memory of D. Sixto Rios Insua.

Maintained by Pedro Canadilla. Last updated 5 years ago.

66.0 match 2.58 score 376 scripts

doccstat

fastcpd:Fast Change Point Detection via Sequential Gradient Descent

Implements fast change point detection algorithm based on the paper "Sequential Gradient Descent and Quasi-Newton's Method for Change-Point Analysis" by Xianyang Zhang, Trisha Dawn <https://proceedings.mlr.press/v206/zhang23b.html>. The algorithm is based on dynamic programming with pruning and sequential gradient descent. It is able to detect change points a magnitude faster than the vanilla Pruned Exact Linear Time(PELT). The package includes examples of linear regression, logistic regression, Poisson regression, penalized linear regression data, and whole lot more examples with custom cost function in case the user wants to use their own cost function.

Maintained by Xingchi Li. Last updated 1 days ago.

change-point-detection cpp custom-function gradient-descent lasso linear-regression logistic-regression offline pelt penalized-regression poisson-regression quasi-newton statistics time-series warm-start fortran openblas cpp openmp

24.3 match 22 stars 7.00 score 7 scripts

rfastofficial

Rfast:A Collection of Efficient and Extremely Fast R Functions

A collection of fast (utility) functions for data analysis. Column and row wise means, medians, variances, minimums, maximums, many t, F and G-square tests, many regressions (normal, logistic, Poisson), are some of the many fast functions. References: a) Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>. b) Tsagris M. and Papadakis M. (2018). Forward regression in R: from the extreme slow to the extreme fast. Journal of Data Science, 16(4): 771--780. <doi:10.6339/JDS.201810_16(4).00006>. c) Chatzipantsiou C., Dimitriadis M., Papadakis M. and Tsagris M. (2020). Extremely Efficient Permutation and Bootstrap Hypothesis Tests Using Hypothesis Tests Using R. Journal of Modern Applied Statistical Methods, 18(2), eP2898. <doi:10.48550/arXiv.1806.10947>. d) Tsagris M., Papadakis M., Alenazi A. and Alzeley O. (2024). Computationally Efficient Outlier Detection for High-Dimensional Data Using the MDP Algorithm. Computation, 12(9): 185. <doi:10.3390/computation12090185>. e) Tsagris M. and Papadakis M. (2025). Fast and light-weight energy statistics using the R package Rfast. <doi:10.48550/arXiv.2501.02849>.

Maintained by Manos Papadakis. Last updated 17 days ago.

openblas cpp openmp

13.1 match 147 stars 12.54 score 1.2k scripts 166 dependents

bioc

DESeq2:Differential gene expression analysis based on the negative binomial distribution

Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.

Maintained by Michael Love. Last updated 11 days ago.

sequencing rnaseq chipseq geneexpression transcription normalization differentialexpression bayesian regression principalcomponent clustering immunooncology openblas cpp

10.1 match 375 stars 16.11 score 17k scripts 115 dependents

r-forge

survey:Analysis of Complex Survey Samples

Summary statistics, two-sample tests, rank tests, generalised linear models, cumulative link models, Cox models, loglinear models, and general maximum pseudolikelihood estimation for multistage stratified, cluster-sampled, unequally weighted survey samples. Variances by Taylor series linearisation or replicate weights. Post-stratification, calibration, and raking. Two-phase and multiphase subsampling designs. Graphics. PPS sampling without replacement. Small-area estimation. Dual-frame designs.

Maintained by "Thomas Lumley". Last updated 6 months ago.

cpp

11.5 match 1 stars 13.94 score 13k scripts 232 dependents

distancedevelopment

dsm:Density Surface Modelling of Distance Sampling Data

Density surface modelling of line transect data. A Generalized Additive Model-based approach is used to calculate spatially-explicit estimates of animal abundance from distance sampling (also presence/absence and strip transect) data. Several utility functions are provided for model checking, plotting and variance estimation.

Maintained by Laura Marshall. Last updated 2 years ago.

25.8 match 8 stars 6.09 score 146 scripts

yayayaoyaoyao

RARtrials:Response-Adaptive Randomization in Clinical Trials

Some response-adaptive randomization methods commonly found in literature are included in this package. These methods include the randomized play-the-winner rule for binary endpoint (Wei and Durham (1978) <doi:10.2307/2286290>), the doubly adaptive biased coin design with minimal variance strategy for binary endpoint (Atkinson and Biswas (2013) <doi:10.1201/b16101>, Rosenberger and Lachin (2015) <doi:10.1002/9781118742112>) and maximal power strategy targeting Neyman allocation for binary endpoint (Tymofyeyev, Rosenberger, and Hu (2007) <doi:10.1198/016214506000000906>) and RSIHR allocation with each letter representing the first character of the names of the individuals who first proposed this rule (Youngsook and Hu (2010) <doi:10.1198/sbr.2009.0056>, Bello and Sabo (2016) <doi:10.1080/00949655.2015.1114116>), A-optimal Allocation for continuous endpoint (Sverdlov and Rosenberger (2013) <doi:10.1080/15598608.2013.783726>), Aa-optimal Allocation for continuous endpoint (Sverdlov and Rosenberger (2013) <doi:10.1080/15598608.2013.783726>), generalized RSIHR allocation for continuous endpoint (Atkinson and Biswas (2013) <doi:10.1201/b16101>), Bayesian response-adaptive randomization with a control group using the Thall \& Wathen method for binary and continuous endpoints (Thall and Wathen (2007) <doi:10.1016/j.ejca.2007.01.006>) and the forward-looking Gittins index rule for binary and continuous endpoints (Villar, Wason, and Bowden (2015) <doi:10.1111/biom.12337>, Williamson and Villar (2019) <doi:10.1111/biom.13119>).

Maintained by Chuyao Xu. Last updated 2 months ago.

33.3 match 4.65 score

bioc

scran:Methods for Single-Cell RNA-Seq Data Analysis

Implements miscellaneous functions for interpretation of single-cell RNA-seq data. Methods are provided for assignment of cell cycle phase, detection of highly variable and significantly correlated genes, identification of marker genes, and other common tasks in routine single-cell analysis workflows.

Maintained by Aaron Lun. Last updated 5 months ago.

immunooncology normalization sequencing rnaseq software geneexpression transcriptomics singlecell clustering bioconductor-package human-cell-atlas single-cell-rna-seq openblas cpp

10.7 match 41 stars 13.14 score 7.6k scripts 36 dependents

bioc

MOFA2:Multi-Omics Factor Analysis v2

The MOFA2 package contains a collection of tools for training and analysing multi-omic factor analysis (MOFA). MOFA is a probabilistic factor model that aims to identify principal axes of variation from data sets that can comprise multiple omic layers and/or groups of samples. Additional time or space information on the samples can be incorporated using the MEFISTO framework, which is part of MOFA2. Downstream analysis functions to inspect molecular features underlying each factor, vizualisation, imputation etc are available.

Maintained by Ricard Argelaguet. Last updated 5 months ago.

dimensionreduction bayesian visualization factor-analysis mofa multi-omics

13.5 match 319 stars 10.02 score 502 scripts

bioc

MBECS:Evaluation and correction of batch effects in microbiome data-sets

The Microbiome Batch Effect Correction Suite (MBECS) provides a set of functions to evaluate and mitigate unwated noise due to processing in batches. To that end it incorporates a host of batch correcting algorithms (BECA) from various packages. In addition it offers a correction and reporting pipeline that provides a preliminary look at the characteristics of a data-set before and after correcting for batch effects.

Maintained by Michael Olbrich. Last updated 5 months ago.

batcheffect microbiome reportwriting visualization normalization qualitycontrol

29.3 match 4 stars 4.60 score 4 scripts

cran

VCA:Variance Component Analysis

ANOVA and REML estimation of linear mixed models is implemented, once following Searle et al. (1991, ANOVA for unbalanced data), once making use of the 'lme4' package. The primary objective of this package is to perform a variance component analysis (VCA) according to CLSI EP05-A3 guideline "Evaluation of Precision of Quantitative Measurement Procedures" (2014). There are plotting methods for visualization of an experimental design, plotting random effects and residuals. For ANOVA type estimation two methods for computing ANOVA mean squares are implemented (SWEEP and quadratic forms). The covariance matrix of variance components can be derived, which is used in estimating confidence intervals. Linear hypotheses of fixed effects and LS means can be computed. LS means can be computed at specific values of covariables and with custom weighting schemes for factor variables. See ?VCA for a more comprehensive description of the features.

Maintained by Andre Schuetzenmeister. Last updated 1 years ago.

fortran openblas

29.8 match 2 stars 4.51 score 5 dependents

alanarnholt

BSDA:Basic Statistics and Data Analysis

Data sets for book "Basic Statistics and Data Analysis" by Larry J. Kitchens.

Maintained by Alan T. Arnholt. Last updated 2 years ago.

14.7 match 7 stars 9.11 score 1.3k scripts 6 dependents

tidymodels

recipes:Preprocessing and Feature Engineering Steps for Modeling

A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps provides preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.

Maintained by Max Kuhn. Last updated 6 days ago.

6.9 match 584 stars 18.71 score 7.2k scripts 380 dependents

vegandevs

vegan:Community Ecology Package

Ordination methods, diversity analysis and other functions for community and vegetation ecologists.

Maintained by Jari Oksanen. Last updated 16 days ago.

ecological-modelling ecology ordination fortran openblas

6.5 match 472 stars 19.41 score 15k scripts 440 dependents

zejiang-unsw

WASP:Wavelet System Prediction

The wavelet-based variance transformation method is used for system modelling and prediction. It refines predictor spectral representation using Wavelet Theory, which leads to improved model specifications and prediction accuracy. Details of methodologies used in the package can be found in Jiang, Z., Sharma, A., & Johnson, F. (2020) <doi:10.1029/2019WR026962>, Jiang, Z., Rashid, M. M., Johnson, F., & Sharma, A. (2020) <doi:10.1016/j.envsoft.2020.104907>, and Jiang, Z., Sharma, A., & Johnson, F. (2021) <doi:10.1016/J.JHYDROL.2021.126816>.

Maintained by Ze Jiang. Last updated 7 months ago.

prediction transformation wavelet

19.3 match 9 stars 6.41 score 19 scripts

ulrichriegel

Pareto:The Pareto, Piecewise Pareto and Generalized Pareto Distribution

Utilities for the Pareto, piecewise Pareto and generalized Pareto distribution that are useful for reinsurance pricing. In particular, the package provides a non-trivial algorithm that can be used to match the expected losses of a tower of reinsurance layers with a layer-independent collective risk model. The theoretical background of the matching algorithm and most other methods are described in Ulrich Riegel (2018) <doi:10.1007/s13385-018-0177-3>.

Maintained by Ulrich Riegel. Last updated 2 years ago.

17.6 match 10 stars 7.00 score 134 scripts 1 dependents

bioc

limma:Linear Models for Microarray and Omics Data

Data analysis, linear models and differential expression for omics data.

Maintained by Gordon Smyth. Last updated 6 days ago.

exonarray geneexpression transcription alternativesplicing differentialexpression differentialsplicing genesetenrichment dataimport bayesian clustering regression timecourse microarray micrornaarray mrnamicroarray onechannel proprietaryplatforms twochannel sequencing rnaseq batcheffect multiplecomparison normalization preprocessing qualitycontrol biomedicalinformatics cellbiology cheminformatics epigenetics functionalgenomics genetics immunooncology metabolomics proteomics systemsbiology transcriptomics

8.7 match 13.81 score 16k scripts 585 dependents

kisungyou

SHT:Statistical Hypothesis Testing Toolbox

We provide a collection of statistical hypothesis testing procedures ranging from classical to modern methods for non-trivial settings such as high-dimensional scenario. For the general treatment of statistical hypothesis testing, see the book by Lehmann and Romano (2005) <doi:10.1007/0-387-27605-X>.

Maintained by Kisung You. Last updated 19 days ago.

openblas cpp openmp

22.7 match 6 stars 5.13 score 50 scripts 1 dependents

kkholst

lava:Latent Variable Models

A general implementation of Structural Equation Models with latent variables (MLE, 2SLS, and composite likelihood estimators) with both continuous, censored, and ordinal outcomes (Holst and Budtz-Joergensen (2013) <doi:10.1007/s00180-012-0344-y>). Mixture latent variable models and non-linear latent variable models (Holst and Budtz-Joergensen (2020) <doi:10.1093/biostatistics/kxy082>). The package also provides methods for graph exploration (d-separation, back-door criterion), simulation of general non-linear latent variable models, and estimation of influence functions for a broad range of statistical models.

Maintained by Klaus K. Holst. Last updated 2 months ago.

latent-variable-models simulation statistics structural-equation-models

9.0 match 33 stars 12.85 score 610 scripts 476 dependents

mitchelloharawild

distributional:Vectorised Probability Distributions

Vectorised distribution objects with tools for manipulating, visualising, and using probability distributions. Designed to allow model prediction outputs to return distributions rather than their parameters, allowing users to directly interact with predictive distributions in a data-oriented workflow. In addition to providing generic replacements for p/d/q/r functions, other useful statistics can be computed including means, variances, intervals, and highest density regions.

Maintained by Mitchell OHara-Wild. Last updated 2 months ago.

probability-distribution statistics vctrs

8.5 match 101 stars 13.50 score 744 scripts 384 dependents

pettermostad

lestat:A Package for Learning Statistics

Some simple objects and functions to do statistics using linear models and a Bayesian framework.

Maintained by Petter Mostad. Last updated 7 years ago.

50.3 match 2.28 score 64 scripts 1 dependents

adeverse

ade4:Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences

Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) <doi:10.18637/jss.v022.i04>.

Maintained by Aurélie Siberchicot. Last updated 13 days ago.

openblas cpp

7.6 match 39 stars 14.96 score 2.2k scripts 256 dependents

lme4

lme4:Linear Mixed-Effects Models using 'Eigen' and S4

Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the 'Eigen' C++ library for numerical linear algebra and 'RcppEigen' "glue".

Maintained by Ben Bolker. Last updated 3 days ago.

cpp

5.5 match 647 stars 20.69 score 35k scripts 1.5k dependents

tbates

umx:Structural Equation Modeling and Twin Modeling in R

Quickly create, run, and report structural equation models, and twin models. See '?umx' for help, and umx_open_CRAN_page("umx") for NEWS. Timothy C. Bates, Michael C. Neale, Hermine H. Maes, (2019). umx: A library for Structural Equation and Twin Modelling in R. Twin Research and Human Genetics, 22, 27-41. <doi:10.1017/thg.2019.2>.

Maintained by Timothy C. Bates. Last updated 2 days ago.

behavior-genetics genetics openmx psychology sem statistics structural-equation-modeling tutorials twin-models umx

11.9 match 44 stars 9.45 score 472 scripts

cran

FuzzySTs:Fuzzy Statistical Tools

The main goal of this package is to present various fuzzy statistical tools. It intends to provide an implementation of the theoretical and empirical approaches presented in the book entitled "The signed distance measure in fuzzy statistical analysis. Some theoretical, empirical and programming advances" <doi: 10.1007/978-3-030-76916-1>. For the theoretical approaches, see Berkachy R. and Donze L. (2019) <doi:10.1007/978-3-030-03368-2_1>. For the empirical approaches, see Berkachy R. and Donze L. (2016) <ISBN: 978-989-758-201-1>). Important (non-exhaustive) implementation highlights of this package are as follows: (1) a numerical procedure to estimate the fuzzy difference and the fuzzy square. (2) two numerical methods of fuzzification. (3) a function performing different possibilities of distances, including the signed distance and the generalized signed distance for instance with all its properties. (4) numerical estimations of fuzzy statistical measures such as the variance, the moment, etc. (5) two methods of estimation of the bootstrap distribution of the likelihood ratio in the fuzzy context. (6) an estimation of a fuzzy confidence interval by the likelihood ratio method. (7) testing fuzzy hypotheses and/or fuzzy data by fuzzy confidence intervals in the Kwakernaak - Kruse and Meyer sense. (8) a general method to estimate the fuzzy p-value with fuzzy hypotheses and/or fuzzy data. (9) a method of estimation of global and individual evaluations of linguistic questionnaires. (10) numerical estimations of multi-ways analysis of variance models in the fuzzy context. The unbalance in the considered designs are also foreseen.

Maintained by Redina Berkachy. Last updated 8 months ago.

32.4 match 3.40 score

bsvars

bsvars:Bayesian Estimation of Structural Vector Autoregressive Models

Provides fast and efficient procedures for Bayesian analysis of Structural Vector Autoregressions. This package estimates a wide range of models, including homo-, heteroskedastic, and non-normal specifications. Structural models can be identified by adjustable exclusion restrictions, time-varying volatility, or non-normality. They all include a flexible three-level equation-specific local-global hierarchical prior distribution for the estimated level of shrinkage for autoregressive and structural parameters. Additionally, the package facilitates predictive and structural analyses such as impulse responses, forecast error variance and historical decompositions, forecasting, verification of heteroskedasticity, non-normality, and hypotheses on autoregressive parameters, as well as analyses of structural shocks, volatilities, and fitted values. Beautiful plots, informative summary functions, and extensive documentation including the vignette by Woźniak (2024) <doi:10.48550/arXiv.2410.15090> complement all this. The implemented techniques align closely with those presented in Lütkepohl, Shang, Uzeda, & Woźniak (2024) <doi:10.48550/arXiv.2404.11057>, Lütkepohl & Woźniak (2020) <doi:10.1016/j.jedc.2020.103862>, and Song & Woźniak (2021) <doi:10.1093/acrefore/9780190625979.013.174>. The 'bsvars' package is aligned regarding objects, workflows, and code structure with the R package 'bsvarSIGNs' by Wang & Woźniak (2024) <doi:10.32614/CRAN.package.bsvarSIGNs>, and they constitute an integrated toolset.

Maintained by Tomasz Woźniak. Last updated 1 months ago.

bayesian-inference econometrics vector-autoregression openblas cpp openmp

14.3 match 46 stars 7.67 score 32 scripts 1 dependents

distancedevelopment

mrds:Mark-Recapture Distance Sampling

Animal abundance estimation via conventional, multiple covariate and mark-recapture distance sampling (CDS/MCDS/MRDS). Detection function fitting is performed via maximum likelihood. Also included are diagnostics and plotting for fitted detection functions. Abundance estimation is via a Horvitz-Thompson-like estimator.

Maintained by Laura Marshall. Last updated 2 months ago.

13.5 match 4 stars 8.05 score 78 scripts 7 dependents

alexkowa

EnvStats:Package for Environmental Statistics, Including US EPA Guidance

Graphical and statistical analyses of environmental data, with focus on analyzing chemical concentrations and physical parameters, usually in the context of mandated environmental monitoring. Major environmental statistical methods found in the literature and regulatory guidance documents, with extensive help that explains what these methods do, how to use them, and where to find them in the literature. Numerous built-in data sets from regulatory guidance documents and environmental statistics literature. Includes scripts reproducing analyses presented in the book "EnvStats: An R Package for Environmental Statistics" (Millard, 2013, Springer, ISBN 978-1-4614-8455-4, <doi:10.1007/978-1-4614-8456-1>).

Maintained by Alexander Kowarik. Last updated 17 days ago.

8.4 match 26 stars 12.80 score 2.4k scripts 46 dependents

bjw34032

waveslim:Basic Wavelet Routines for One-, Two-, and Three-Dimensional Signal Processing

Basic wavelet routines for time series (1D), image (2D) and array (3D) analysis. The code provided here is based on wavelet methodology developed in Percival and Walden (2000); Gencay, Selcuk and Whitcher (2001); the dual-tree complex wavelet transform (DTCWT) from Kingsbury (1999, 2001) as implemented by Selesnick; and Hilbert wavelet pairs (Selesnick 2001, 2002). All figures in chapters 4-7 of GSW (2001) are reproducible using this package and R code available at the book website(s) below.

Maintained by Brandon Whitcher. Last updated 10 months ago.

wavelets

13.6 match 3 stars 7.88 score 108 scripts 23 dependents

briencj

dae:Functions Useful in the Design and ANOVA of Experiments

The content falls into the following groupings: (i) Data, (ii) Factor manipulation functions, (iii) Design functions, (iv) ANOVA functions, (v) Matrix functions, (vi) Projector and canonical efficiency functions, and (vii) Miscellaneous functions. There is a vignette describing how to use the design functions for randomizing and assessing designs available as a vignette called 'DesignNotes'. The ANOVA functions facilitate the extraction of information when the 'Error' function has been used in the call to 'aov'. The package 'dae' can also be installed from <http://chris.brien.name/rpackages/>.

Maintained by Chris Brien. Last updated 4 months ago.

12.4 match 1 stars 8.62 score 356 scripts 7 dependents

jinghuazhao

gap:Genetic Analysis Package

As first reported [Zhao, J. H. 2007. "gap: Genetic Analysis Package". J Stat Soft 23(8):1-18. <doi:10.18637/jss.v023.i08>], it is designed as an integrated package for genetic data analysis of both population and family data. Currently, it contains functions for sample size calculations of both population-based and family-based designs, probability of familial disease aggregation, kinship calculation, statistics in linkage analysis, and association analysis involving genetic markers including haplotype analysis with or without environmental covariates. Over years, the package has been developed in-between many projects hence also in line with the name (gap).

Maintained by Jing Hua Zhao. Last updated 16 days ago.

genetics imputation lmm fortran

8.9 match 12 stars 11.88 score 448 scripts 16 dependents

kristyrobledo

VarReg:Semi-Parametric Variance Regression

Methods for fitting semi-parametric mean and variance models, with normal or censored data. Extended to allow a regression in the location, scale and shape parameters, and further for multiple regression in each.

Maintained by Kristy Robledo. Last updated 2 years ago.

23.2 match 1 stars 4.46 score 29 scripts

bioc

dreamlet:Scalable differential expression analysis of single cell transcriptomics datasets with complex study designs

Recent advances in single cell/nucleus transcriptomic technology has enabled collection of cohort-scale datasets to study cell type specific gene expression differences associated disease state, stimulus, and genetic regulation. The scale of these data, complex study designs, and low read count per cell mean that characterizing cell type specific molecular mechanisms requires a user-frieldly, purpose-build analytical framework. We have developed the dreamlet package that applies a pseudobulk approach and fits a regression model for each gene and cell cluster to test differential expression across individuals associated with a trait of interest. Use of precision-weighted linear mixed models enables accounting for repeated measures study designs, high dimensional batch effects, and varying sequencing depth or observed cells per biosample.

Maintained by Gabriel Hoffman. Last updated 5 months ago.

rnaseq geneexpression differentialexpression batcheffect qualitycontrol regression genesetenrichment generegulation epigenetics functionalgenomics transcriptomics normalization singlecell preprocessing sequencing immunooncology software cpp

12.8 match 12 stars 8.09 score 128 scripts

rstudio

keras3:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.

Maintained by Tomasz Kalinowski. Last updated 4 days ago.

7.4 match 845 stars 13.57 score 264 scripts 2 dependents

mrcieu

TwoSampleMR:Two Sample MR Functions and Interface to MRC Integrative Epidemiology Unit OpenGWAS Database

A package for performing Mendelian randomization using GWAS summary data. It uses the IEU OpenGWAS database <https://gwas.mrcieu.ac.uk/> to automatically obtain data, and a wide range of methods to run the analysis.

Maintained by Gibran Hemani. Last updated 11 days ago.

8.7 match 467 stars 11.23 score 1.7k scripts 1 dependents

r-forge

sandwich:Robust Covariance Matrix Estimators

Object-oriented software for model-robust covariance matrix estimators. Starting out from the basic robust Eicker-Huber-White sandwich covariance methods include: heteroscedasticity-consistent (HC) covariances for cross-section data; heteroscedasticity- and autocorrelation-consistent (HAC) covariances for time series data (such as Andrews' kernel HAC, Newey-West, and WEAVE estimators); clustered covariances (one-way and multi-way); panel and panel-corrected covariances; outer-product-of-gradients covariances; and (clustered) bootstrap covariances. All methods are applicable to (generalized) linear model objects fitted by lm() and glm() but can also be adapted to other classes through S3 methods. Details can be found in Zeileis et al. (2020) <doi:10.18637/jss.v095.i01>, Zeileis (2004) <doi:10.18637/jss.v011.i10> and Zeileis (2006) <doi:10.18637/jss.v016.i09>.

Maintained by Achim Zeileis. Last updated 2 months ago.

6.5 match 14.92 score 11k scripts 887 dependents

kisungyou

Rdimtools:Dimension Reduction and Estimation Methods

We provide linear and nonlinear dimension reduction techniques. Intrinsic dimension estimation methods for exploratory analysis are also provided. For more details on the package, see the paper by You and Shung (2022) <doi:10.1016/j.simpa.2022.100414>.

Maintained by Kisung You. Last updated 2 years ago.

dimension-estimation dimension-reduction manifold-learning subspace-learning openblas cpp openmp

11.3 match 52 stars 8.37 score 186 scripts 8 dependents

franzmohr

bvartools:Bayesian Inference of Vector Autoregressive and Error Correction Models

Assists in the set-up of algorithms for Bayesian inference of vector autoregressive (VAR) and error correction (VEC) models. Functions for posterior simulation, forecasting, impulse response analysis and forecast error variance decomposition are largely based on the introductory texts of Chan, Koop, Poirier and Tobias (2019, ISBN: 9781108437493), Koop and Korobilis (2010) <doi:10.1561/0800000013> and Luetkepohl (2006, ISBN: 9783540262398).

Maintained by Franz X. Mohr. Last updated 1 years ago.

bayesian bayesian-inference bayesian-var bvar bvecm gibbs-sampling mcmc vector-autoregression vector-error-correction-model openblas cpp

14.0 match 31 stars 6.80 score 34 scripts 1 dependents

wasquith

lmomco:L-Moments, Censored L-Moments, Trimmed L-Moments, L-Comoments, and Many Distributions

Extensive functions for Lmoments (LMs) and probability-weighted moments (PWMs), distribution parameter estimation, LMs for distributions, LM ratio diagrams, multivariate Lcomoments, and asymmetric (asy) trimmed LMs (TLMs). Maximum likelihood and maximum product spacings estimation are available. Right-tail and left-tail LM censoring by threshold or indicator variable are available. LMs of residual (resid) and reversed (rev) residual life are implemented along with 13 quantile operators for reliability analyses. Exact analytical bootstrap estimates of order statistics, LMs, and LM var-covars are available. Harri-Coble Tau34-squared Normality Test is available. Distributions with L, TL, and added (+) support for right-tail censoring (RC) encompass: Asy Exponential (Exp) Power [L], Asy Triangular [L], Cauchy [TL], Eta-Mu [L], Exp. [L], Gamma [L], Generalized (Gen) Exp Poisson [L], Gen Extreme Value [L], Gen Lambda [L, TL], Gen Logistic [L], Gen Normal [L], Gen Pareto [L+RC, TL], Govindarajulu [L], Gumbel [L], Kappa [L], Kappa-Mu [L], Kumaraswamy [L], Laplace [L], Linear Mean Residual Quantile Function [L], Normal [L], 3p log-Normal [L], Pearson Type III [L], Polynomial Density-Quantile 3 and 4 [L], Rayleigh [L], Rev-Gumbel [L+RC], Rice [L], Singh Maddala [L], Slash [TL], 3p Student t [L], Truncated Exponential [L], Wakeby [L], and Weibull [L].

Maintained by William Asquith. Last updated 1 months ago.

flood-frequency-analysis l-moments mle-estimation mps-estimation probability-distribution rainfall-frequency-analysis reliability-analysis risk-analysis survival-analysis

11.8 match 2 stars 8.06 score 458 scripts 38 dependents

ledell

cvAUC:Cross-Validated Area Under the ROC Curve Confidence Intervals

Tools for working with and evaluating cross-validated area under the ROC curve (AUC) estimators. The primary functions of the package are ci.cvAUC and ci.pooled.cvAUC, which report cross-validated AUC and compute confidence intervals for cross-validated AUC estimates based on influence curves for i.i.d. and pooled repeated measures data, respectively. One benefit to using influence curve based confidence intervals is that they require much less computation time than bootstrapping methods. The utility functions, AUC and cvAUC, are simple wrappers for functions from the ROCR package.

Maintained by Erin LeDell. Last updated 3 years ago.

auc confidence-intervals cross-validation machine-learning statistics variance

10.0 match 23 stars 9.17 score 317 scripts 40 dependents

stochastictree

stochtree:Stochastic Tree Ensembles (XBART and BART) for Supervised Learning and Causal Inference

Flexible stochastic tree ensemble software. Robust implementations of Bayesian Additive Regression Trees (BART) Chipman, George, McCulloch (2010) <doi:10.1214/09-AOAS285> for supervised learning and Bayesian Causal Forests (BCF) Hahn, Murray, Carvalho (2020) <doi:10.1214/19-BA1195> for causal inference. Enables model serialization and parallel sampling and provides a low-level interface for custom stochastic forest samplers.

Maintained by Drew Herren. Last updated 18 days ago.

bart bayesian-machine-learning bayesian-methods decision-trees gradient-boosted-trees machine-learning probabilistic-models tree-ensembles cpp

10.8 match 20 stars 8.52 score 40 scripts

openanalytics

BIGL:Biochemically Intuitive Generalized Loewe Model

Response surface methods for drug synergy analysis. Available methods include generalized and classical Loewe formulations as well as Highest Single Agent methodology. Response surfaces can be plotted in an interactive 3-D plot and formal statistical tests for presence of synergistic effects are available. Implemented methods and tests are described in the article "BIGL: Biochemically Intuitive Generalized Loewe null model for prediction of the expected combined effect compatible with partial agonism and antagonism" by Koen Van der Borght, Annelies Tourny, Rytis Bagdziunas, Olivier Thas, Maxim Nazarov, Heather Turner, Bie Verbist & Hugo Ceulemans (2017) <doi:10.1038/s41598-017-18068-5>.

Maintained by Maxim Nazarov. Last updated 2 years ago.

14.8 match 7 stars 6.02 score 37 scripts

mariushofert

nvmix:Multivariate Normal Variance Mixtures

Functions for working with (grouped) multivariate normal variance mixture distributions (evaluation of distribution functions and densities, random number generation and parameter estimation), including Student's t distribution for non-integer degrees-of-freedom as well as the grouped t distribution and copula with multiple degrees-of-freedom parameters. See <doi:10.18637/jss.v102.i02> for a high-level description of select functionality.

Maintained by Marius Hofert. Last updated 1 years ago.

30.6 match 1 stars 2.90 score 40 scripts

nepem-ufsc

metan:Multi Environment Trials Analysis

Performs stability analysis of multi-environment trial data using parametric and non-parametric methods. Parametric methods includes Additive Main Effects and Multiplicative Interaction (AMMI) analysis by Gauch (2013) <doi:10.2135/cropsci2013.04.0241>, Ecovalence by Wricke (1965), Genotype plus Genotype-Environment (GGE) biplot analysis by Yan & Kang (2003) <doi:10.1201/9781420040371>, geometric adaptability index by Mohammadi & Amri (2008) <doi:10.1007/s10681-007-9600-6>, joint regression analysis by Eberhart & Russel (1966) <doi:10.2135/cropsci1966.0011183X000600010011x>, genotypic confidence index by Annicchiarico (1992), Murakami & Cruz's (2004) method, power law residuals (POLAR) statistics by Doring et al. (2015) <doi:10.1016/j.fcr.2015.08.005>, scale-adjusted coefficient of variation by Doring & Reckling (2018) <doi:10.1016/j.eja.2018.06.007>, stability variance by Shukla (1972) <doi:10.1038/hdy.1972.87>, weighted average of absolute scores by Olivoto et al. (2019a) <doi:10.2134/agronj2019.03.0220>, and multi-trait stability index by Olivoto et al. (2019b) <doi:10.2134/agronj2019.03.0221>. Non-parametric methods includes superiority index by Lin & Binns (1988) <doi:10.4141/cjps88-018>, nonparametric measures of phenotypic stability by Huehn (1990) <doi:10.1007/BF00024241>, TOP third statistic by Fox et al. (1990) <doi:10.1007/BF00040364>. Functions for computing biometrical analysis such as path analysis, canonical correlation, partial correlation, clustering analysis, and tools for inspecting, manipulating, summarizing and plotting typical multi-environment trial data are also provided.

Maintained by Tiago Olivoto. Last updated 9 days ago.

9.3 match 2 stars 9.48 score 1.3k scripts 2 dependents

r-forge

agrmt:Calculate Concentration and Dispersion in Ordered Rating Scales

Calculates concentration and dispersion in ordered rating scales. It implements various measures of concentration and dispersion to describe what researchers variably call agreement, concentration, consensus, dispersion, or polarization among respondents in ordered data. It also implements other related measures to classify distributions. In addition to a generic city-block based concentration measure and a generic dispersion measure, the package implements various measures, including van der Eijk's (2001) <DOI: 10.1023/A:1010374114305> measure of agreement A, measures of concentration by Leik, Tatsle and Wierman, Blair and Lacy, Kvalseth, Berry and Mielke, Reardon, and Garcia-Montalvo and Reynal-Querol. Furthermore, the package provides an implementation of Galtungs AJUS-system to classify distributions, as well as a function to identify the position of multiple modes.

Maintained by Didier Ruedin. Last updated 29 days ago.

24.6 match 3.58 score 38 scripts

aalfons

laeken:Estimation of Indicators on Social Exclusion and Poverty

Estimation of indicators on social exclusion and poverty, as well as Pareto tail modeling for empirical income distributions.

Maintained by Andreas Alfons. Last updated 1 years ago.

9.2 match 3 stars 9.57 score 300 scripts 30 dependents

pecanproject

PEcAn.uncertainty:PEcAn Functions Used for Propagating and Partitioning Uncertainties in Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PECAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by David LeBauer. Last updated 2 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

9.8 match 216 stars 8.91 score 15 scripts 5 dependents

alexanderrobitzsch

sirt:Supplementary Item Response Theory Models

Supplementary functions for item response models aiming to complement existing R packages. The functionality includes among others multidimensional compensatory and noncompensatory IRT models (Reckase, 2009, <doi:10.1007/978-0-387-89976-3>), MCMC for hierarchical IRT models and testlet models (Fox, 2010, <doi:10.1007/978-1-4419-0742-4>), NOHARM (McDonald, 1982, <doi:10.1177/014662168200600402>), Rasch copula model (Braeken, 2011, <doi:10.1007/s11336-010-9190-4>; Schroeders, Robitzsch & Schipolowski, 2014, <doi:10.1111/jedm.12054>), faceted and hierarchical rater models (DeCarlo, Kim & Johnson, 2011, <doi:10.1111/j.1745-3984.2011.00143.x>), ordinal IRT model (ISOP; Scheiblechner, 1995, <doi:10.1007/BF02301417>), DETECT statistic (Stout, Habing, Douglas & Kim, 1996, <doi:10.1177/014662169602000403>), local structural equation modeling (LSEM; Hildebrandt, Luedtke, Robitzsch, Sommer & Wilhelm, 2016, <doi:10.1080/00273171.2016.1142856>).

Maintained by Alexander Robitzsch. Last updated 3 months ago.

item-response-theory openblas cpp

8.5 match 23 stars 10.01 score 280 scripts 22 dependents

cran

circular:Circular Statistics

Circular Statistics, from "Topics in circular Statistics" (2001) S. Rao Jammalamadaka and A. SenGupta, World Scientific.

Maintained by Eduardo García-Portugués. Last updated 7 months ago.

fortran

10.9 match 7 stars 7.76 score 1.1k scripts 40 dependents

knoths

spc:Statistical Process Control -- Calculation of ARL and Other Control Chart Performance Measures

Evaluation of control charts by means of the zero-state, steady-state ARL (Average Run Length) and RL quantiles. Setting up control charts for given in-control ARL. The control charts under consideration are one- and two-sided EWMA, CUSUM, and Shiryaev-Roberts schemes for monitoring the mean or variance of normally distributed independent data. ARL calculation of the same set of schemes under drift (in the mean) are added. Eventually, all ARL measures for the multivariate EWMA (MEWMA) are provided.

Maintained by Sven Knoth. Last updated 7 months ago.

openblas

25.5 match 5 stars 3.30 score 66 scripts 1 dependents

distancedevelopment

Distance:Distance Sampling Detection Function and Abundance Estimation

A simple way of fitting detection functions to distance sampling data for both line and point transects. Adjustment term selection, left and right truncation as well as monotonicity constraints and binning are supported. Abundance and density estimates can also be calculated (via a Horvitz-Thompson-like estimator) if survey area information is provided. See Miller et al. (2019) <doi:10.18637/jss.v089.i01> for more information on methods and <https://examples.distancesampling.org/> for example analyses.

Maintained by Laura Marshall. Last updated 11 days ago.

9.4 match 11 stars 8.89 score 358 scripts 3 dependents

jgx65

hierfstat:Estimation and Tests of Hierarchical F-Statistics

Estimates hierarchical F-statistics from haploid or diploid genetic data with any numbers of levels in the hierarchy, following the algorithm of Yang (Evolution(1998), 52:950). Tests via randomisations the significance of each F and variance components, using the likelihood-ratio statistics G (Goudet et al. (1996) <https://academic.oup.com/genetics/article/144/4/1933/6017091>). Estimates genetic diversity statistics for haploid and diploid genetic datasets in various formats, including inbreeding and coancestry coefficients, and population specific F-statistics following Weir and Goudet (2017) <https://academic.oup.com/genetics/article/206/4/2085/6072590>.

Maintained by Jerome Goudet. Last updated 4 months ago.

devtools fstatistics gwas hierfstat kinship population-genetics population-genomics quantitative-genetics simulations

7.6 match 25 stars 10.94 score 560 scripts 4 dependents

vlyubchich

lawstat:Tools for Biostatistics, Public Policy, and Law

Statistical tests widely utilized in biostatistics, public policy, and law. Along with the well-known tests for equality of means and variances, randomness, and measures of relative variability, the package contains new robust tests of symmetry, omnibus and directional tests of normality, and their graphical counterparts such as robust QQ plot, robust trend tests for variances, etc. All implemented tests and methods are illustrated by simulations and real-life examples from legal statistics, economics, and biostatistics.

Maintained by Yulia R. Gel. Last updated 2 years ago.

11.5 match 7.17 score 484 scripts 6 dependents

bioc

dearseq:Differential Expression Analysis for RNA-seq data through a robust variance component test

Differential Expression Analysis RNA-seq data with variance component score test accounting for data heteroscedasticity through precision weights. Perform both gene-wise and gene set analyses, and can deal with repeated or longitudinal data. Methods are detailed in: i) Agniel D & Hejblum BP (2017) Variance component score test for time-course gene set analysis of longitudinal RNA-seq data, Biostatistics, 18(4):589-604 ; and ii) Gauthier M, Agniel D, Thiébaut R & Hejblum BP (2020) dearseq: a variance component score test for RNA-Seq differential analysis that effectively controls the false discovery rate, NAR Genomics and Bioinformatics, 2(4):lqaa093.

Maintained by Boris P. Hejblum. Last updated 5 months ago.

biomedicalinformatics cellbiology differentialexpression dnaseq geneexpression genetics genesetenrichment immunooncology kegg regression rnaseq sequencing systemsbiology timecourse transcription transcriptomics

13.2 match 8 stars 6.20 score 11 scripts 1 dependents

inseefr

gustave:A User-Oriented Statistical Toolkit for Analytical Variance Estimation

Provides a toolkit for analytical variance estimation in survey sampling. Apart from the implementation of standard variance estimators, its main feature is to help the sampling expert produce easy-to-use variance estimation "wrappers", where systematic operations (linearization, domain estimation) are handled in a consistent and transparent way.

Maintained by Khaled Larbi. Last updated 1 years ago.

official-statistics sampling survey-sampling variance-estimation

20.3 match 9 stars 3.91 score 18 scripts

gsucarrat

gets:General-to-Specific (GETS) Modelling and Indicator Saturation Methods

Automated General-to-Specific (GETS) modelling of the mean and variance of a regression, and indicator saturation methods for detecting and testing for structural breaks in the mean, see Pretis, Reade and Sucarrat (2018) <doi:10.18637/jss.v086.i03> for an overview of the package. In advanced use, the estimator and diagnostics tests can be fully user-specified, see Sucarrat (2021) <doi:10.32614/RJ-2021-024>.

Maintained by Genaro Sucarrat. Last updated 8 months ago.

11.5 match 8 stars 6.89 score 73 scripts 3 dependents

bioc

DEqMS:a tool to perform statistical analysis of differential protein expression for quantitative proteomics data.

DEqMS is developped on top of Limma. However, Limma assumes same prior variance for all genes. In proteomics, the accuracy of protein abundance estimates varies by the number of peptides/PSMs quantified in both label-free and labelled data. Proteins quantification by multiple peptides or PSMs are more accurate. DEqMS package is able to estimate different prior variances for proteins quantified by different number of PSMs/peptides, therefore acchieving better accuracy. The package can be applied to analyze both label-free and labelled proteomics data.

Maintained by Yafeng Zhu. Last updated 5 months ago.

immunooncology proteomics massspectrometry preprocessing differentialexpression multiplecomparison normalization bayesian experimenthubsoftware limma quantitative-proteomic-analysis

9.4 match 23 stars 8.18 score 58 scripts 1 dependents

devillemereuil

Reacnorm:Perform a Partition of Variance of Reaction Norms

Partitions the phenotypic variance of a plastic trait, studied through its reaction norm. The variance partition distinguishes between the variance arising from the average shape of the reaction norms (V_Plas) and the (additive) genetic variance . The latter is itself separated into an environment-blind component (V_G/V_A) and the component arising from plasticity (V_GxE/V_AxE). The package also provides a way to further partition V_Plas into aspects (slope/curvature) of the shape of the average reaction norm (pi-decomposition) and partition V_Add (gamma-decomposition) and V_AxE (iota-decomposition) into the impact of genetic variation in the reaction norm parameters. Reference: de Villemereuil & Chevin (2025) <doi:10.32942/X2NC8B>.

Maintained by Pierre de Villemereuil. Last updated 18 days ago.

14.2 match 4 stars 5.34 score

ingebogh

makemyprior:Intuitive Construction of Joint Priors for Variance Parameters

Tool for easy prior construction and visualization. It helps to formulates joint prior distributions for variance parameters in latent Gaussian models. The resulting prior is robust and can be created in an intuitive way. A graphical user interface (GUI) can be used to choose the joint prior, where the user can click through the model and select priors. An extensive guide is available in the GUI. The package allows for direct inference with the specified model and prior. Using a hierarchical variance decomposition, we formulate a joint variance prior that takes the whole model structure into account. In this way, existing knowledge can intuitively be incorporated at the level it applies to. Alternatively, one can use independent variance priors for each model components in the latent Gaussian model. Details can be found in the accompanying scientific paper: Hem, Fuglstad, Riebler (2024, Journal of Statistical Software, <doi:10.18637/jss.v110.i03>).

Maintained by Ingeborg Hem. Last updated 7 months ago.

14.9 match 1 stars 5.06 score 19 scripts

cran

fullfact:Full Factorial Breeding Analysis

We facilitate the analysis of full factorial mating designs with mixed-effects models. The package contains six vignettes containing detailed examples.

Maintained by Aimee Lee Houde. Last updated 1 years ago.

26.8 match 2.78 score

powerandsamplesize

powertools:Power and Sample Size Tools

Power and sample size calculations for a variety of study designs and outcomes. Methods include t tests, ANOVA (including tests for interactions, simple effects and contrasts), proportions, categorical data (chi-square tests and proportional odds), linear, logistic and Poisson regression, alternative and coprimary endpoints, power for confidence intervals, correlation coefficient tests, cluster randomized trials, individually randomized group treatment trials, multisite trials, treatment-by-covariate interaction effects and nonparametric tests of location. Utilities are provided for computing various effect sizes. Companion package to the book "Power and Sample Size in R", Crespi (2025, ISBN:9781138591622).

Maintained by Catherine M. Crespi. Last updated 5 days ago.

20.1 match 3.65 score 9 scripts

bioc

M3Drop:Michaelis-Menten Modelling of Dropouts in single-cell RNASeq

This package fits a model to the pattern of dropouts in single-cell RNASeq data. This model is used as a null to identify significantly variable (i.e. differentially expressed) genes for use in downstream analysis, such as clustering cells. Also includes an method for calculating exact Pearson residuals in UMI-tagged data using a library-size aware negative binomial model.

Maintained by Tallulah Andrews. Last updated 5 months ago.

rnaseq sequencing transcriptomics geneexpression software differentialexpression dimensionreduction featureextraction human-cell-atlas rna-seq single-cell single-cell-rna-seq

8.4 match 29 stars 8.71 score 119 scripts 2 dependents

eitsupi

neopolars:R Bindings for the 'polars' Rust Library

Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.

Maintained by Tatsuya Shima. Last updated 1 days ago.

rust cargo

14.9 match 40 stars 4.86 score 1 scripts

bioc

mixOmics:Omics Data Integration Project

Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing properties of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated, and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics etc) but also beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non exhaustive list of methods include variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.

Maintained by Eva Hamrud. Last updated 4 days ago.

immunooncology microarray sequencing metabolomics metagenomics proteomics geneprediction multiplecomparison classification regression bioconductor genomics genomics-data genomics-visualization multivariate-analysis multivariate-statistics omics r-pkg r-project

5.3 match 182 stars 13.71 score 1.3k scripts 22 dependents

floschuberth

cSEM:Composite-Based Structural Equation Modeling

Estimate, assess, test, and study linear, nonlinear, hierarchical and multigroup structural equation models using composite-based approaches and procedures, including estimation techniques such as partial least squares path modeling (PLS-PM) and its derivatives (PLSc, ordPLSc, robustPLSc), generalized structured component analysis (GSCA), generalized structured component analysis with uniqueness terms (GSCAm), generalized canonical correlation analysis (GCCA), principal component analysis (PCA), factor score regression (FSR) using sum score, regression or Bartlett scores (including bias correction using Croon’s approach), as well as several tests and typical postestimation procedures (e.g., verify admissibility of the estimates, assess the model fit, test the model fit etc.).

Maintained by Florian Schuberth. Last updated 17 days ago.

7.8 match 28 stars 9.11 score 56 scripts 2 dependents

skembel

picante:Integrating Phylogenies and Ecology

Functions for phylocom integration, community analyses, null-models, traits and evolution. Implements numerous ecophylogenetic approaches including measures of community phylogenetic and trait diversity, phylogenetic signal, estimation of trait values for unobserved taxa, null models for community and phylogeny randomizations, and utility functions for data input/output and phylogeny plotting. A full description of package functionality and methods are provided by Kembel et al. (2010) <doi:10.1093/bioinformatics/btq166>.

Maintained by Steven W. Kembel. Last updated 2 years ago.

6.2 match 34 stars 11.42 score 1.1k scripts 16 dependents

appliedstat

rQCC:Robust Quality Control Chart

Constructs various robust quality control charts based on the median or Hodges-Lehmann estimator (location) and the median absolute deviation (MAD) or Shamos estimator (scale). The estimators used for the robust control charts are all unbiased with a sample of finite size. For more details, see Park, Kim and Wang (2022) <doi:10.1080/03610918.2019.1699114>. In addition, using this R package, the conventional quality control charts such as X-bar, S, R, p, np, u, c, g, h, and t charts are also easily constructed. This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2022R1A2C1091319).

Maintained by Chanseok Park. Last updated 1 years ago.

control-chart goodness-of-fit r-language weibull

15.0 match 2 stars 4.70 score 3 scripts

tfletcher05

psychometric:Applied Psychometric Theory

Contains functions useful for correlation theory, meta-analysis (validity-generalization), reliability, item analysis, inter-rater reliability, and classical utility.

Maintained by Thomas D. Fletcher. Last updated 1 years ago.

16.6 match 4.24 score 181 scripts 1 dependents

rempsyc

rempsyc:Convenience Functions for Psychology

Make your workflow faster and easier. Easily customizable plots (via 'ggplot2'), nice APA tables (following the style of the *American Psychological Association*) exportable to Word (via 'flextable'), easily run statistical tests or check assumptions, and automatize various other tasks.

Maintained by Rémi Thériault. Last updated 1 months ago.

convenience-functions ggplot2 psychology statistics visualization

6.5 match 43 stars 10.68 score 214 scripts 2 dependents

reumandc

tsvr:Timescale-Specific Variance Ratio for Use in Community Ecology

Tools for timescale decomposition of the classic variance ratio of community ecology. Tools are as described in Zhao et al (in prep), extending commonly used methods introduced by Peterson et al (1975) <doi: 10.2307/1936306>.

Maintained by Daniel C. Reuman. Last updated 4 years ago.

15.5 match 2 stars 4.46 score 29 scripts

tagteam

riskRegression:Risk Regression Models and Prediction Scores for Survival Analysis with Competing Risks

Implementation of the following methods for event history analysis. Risk regression models for survival endpoints also in the presence of competing risks are fitted using binomial regression based on a time sequence of binary event status variables. A formula interface for the Fine-Gray regression model and an interface for the combination of cause-specific Cox regression models. A toolbox for assessing and comparing performance of risk predictions (risk markers and risk prediction models). Prediction performance is measured by the Brier score and the area under the ROC curve for binary possibly time-dependent outcome. Inverse probability of censoring weighting and pseudo values are used to deal with right censored data. Lists of risk markers and lists of risk models are assessed simultaneously. Cross-validation repeatedly splits the data, trains the risk prediction models on one part of each split and then summarizes and compares the performance across splits.

Maintained by Thomas Alexander Gerds. Last updated 18 days ago.

openblas cpp

5.3 match 46 stars 13.00 score 736 scripts 35 dependents

bioc

transformGamPoi:Variance Stabilizing Transformation for Gamma-Poisson Models

Variance-stabilizing transformations help with the analysis of heteroskedastic data (i.e., data where the variance is not constant, like count data). This package provide two types of variance stabilizing transformations: (1) methods based on the delta method (e.g., 'acosh', 'log(x+1)'), (2) model residual based (Pearson and randomized quantile residuals).

Maintained by Constantin Ahlmann-Eltze. Last updated 5 months ago.

singlecell normalization preprocessing regression cpp

11.5 match 21 stars 5.95 score 21 scripts

pln-team

PLNmodels:Poisson Lognormal Models

The Poisson-lognormal model and variants (Chiquet, Mariadassou and Robin, 2021 <doi:10.3389/fevo.2021.588292>) can be used for a variety of multivariate problems when count data are at play, including principal component analysis for count data, discriminant analysis, model-based clustering and network inference. Implements variational algorithms to fit such models accompanied with a set of functions for visualization and diagnostic.

Maintained by Julien Chiquet. Last updated 4 days ago.

count-data multivariate-analysis network-inference pca poisson-lognormal-model openblas cpp

7.1 match 56 stars 9.50 score 226 scripts

jclavel

mvMORPH:Multivariate Comparative Tools for Fitting Evolutionary Models to Morphometric Data

Fits multivariate (Brownian Motion, Early Burst, ACDC, Ornstein-Uhlenbeck and Shifts) models of continuous traits evolution on trees and time series. 'mvMORPH' also proposes high-dimensional multivariate comparative tools (linear models using Generalized Least Squares and multivariate tests) based on penalized likelihood. See Clavel et al. (2015) <DOI:10.1111/2041-210X.12420>, Clavel et al. (2019) <DOI:10.1093/sysbio/syy045>, and Clavel & Morlon (2020) <DOI:10.1093/sysbio/syaa010>.

Maintained by Julien Clavel. Last updated 1 months ago.

openblas

7.1 match 17 stars 9.46 score 189 scripts 3 dependents

dgrun

RaceID:Identification of Cell Types, Inference of Lineage Trees, and Prediction of Noise Dynamics from Single-Cell RNA-Seq Data

Application of 'RaceID' allows inference of cell types and prediction of lineage trees by the 'StemID2' algorithm (Herman, J.S., Sagar, Grun D. (2018) <DOI:10.1038/nmeth.4662>). 'VarID2' is part of this package and allows quantification of biological gene expression noise at single-cell resolution (Rosales-Alvarez, R.E., Rettkowski, J., Herman, J.S., Dumbovic, G., Cabezas-Wallscheid, N., Grun, D. (2023) <DOI:10.1186/s13059-023-02974-1>).

Maintained by Dominic Grün. Last updated 4 months ago.

cpp

14.1 match 4.74 score 110 scripts

cran

popbio:Construction and Analysis of Matrix Population Models

Construct and analyze projection matrix models from a demography study of marked individuals classified by age or stage. The package covers methods described in Matrix Population Models by Caswell (2001) and Quantitative Conservation Biology by Morris and Doak (2002).

Maintained by Chris Stubben. Last updated 12 months ago.

10.7 match 6.24 score 1.0k scripts 5 dependents

rolkra

explore:Simplifies Exploratory Data Analysis

Interactive data exploration with one line of code, automated reporting or use an easy to remember set of tidy functions for low code exploratory data analysis.

Maintained by Roland Krasser. Last updated 3 months ago.

data-exploration data-visualisation decision-trees eda rmarkdown shiny tidy

5.8 match 228 stars 11.43 score 221 scripts 1 dependents

famuvie

breedR:Statistical Methods for Forest Genetic Resources Analysts

Statistical tools to build predictive models for the breeders community. It aims to assess the genetic value of individuals under a number of situations, including spatial autocorrelation, genetic/environment interaction and competition. It is under active development as part of the Trees4Future project, particularly developed having forest genetic trials in mind. But can be used for animals or other situations as well.

Maintained by Facundo Muñoz. Last updated 8 months ago.

12.0 match 33 stars 5.44 score 24 scripts

henrikbengtsson

matrixStats:Functions that Apply to Rows and Columns of Matrices (and to Vectors)

High-performing functions operating on rows and columns of matrices, e.g. col / rowMedians(), col / rowRanks(), and col / rowSds(). Functions optimized per data type and for subsetted calculations such that both memory usage and processing time is minimized. There are also optimized vector-based methods, e.g. binMeans(), madDiff() and weightedMedian().

Maintained by Henrik Bengtsson. Last updated 2 months ago.

matrix performance vector

3.6 match 208 stars 18.09 score 20k scripts 2.3k dependents

paulantondeen

POV:Partition of Variation Variance Component Analysis Method

An implementation of the Partition Of variation (POV) method as developed by Dr. Thomas A Little <https://thomasalittleconsulting.com> in 1993 for the analysis of semiconductor data for hard drive manufacturing. POV is based on sequential sum of squares and is an exact method that explains all observed variation. It quantitates both the between and within factor variation effects and can quantitate the influence of both continuous and categorical factors.

Maintained by Paul Deen. Last updated 4 years ago.

14.1 match 4.54 score 4 scripts

mikewlcheung

metaSEM:Meta-Analysis using Structural Equation Modeling

A collection of functions for conducting meta-analysis using a structural equation modeling (SEM) approach via the 'OpenMx' and 'lavaan' packages. It also implements various procedures to perform meta-analytic structural equation modeling on the correlation and covariance matrices, see Cheung (2015) <doi:10.3389/fpsyg.2014.01521>.

Maintained by Mike Cheung. Last updated 10 days ago.

meta-analysis meta-analytic-sem missing-data multilevel-models multivariate-analysis structural-equation-modeling structural-equation-models

6.8 match 30 stars 9.43 score 208 scripts 1 dependents

bart1

move:Visualizing and Analyzing Animal Track Data

Contains functions to access movement data stored in 'movebank.org' as well as tools to visualize and statistically analyze animal movement data, among others functions to calculate dynamic Brownian Bridge Movement Models. Move helps addressing movement ecology questions.

Maintained by Bart Kranstauber. Last updated 4 months ago.

cpp

7.3 match 8.74 score 690 scripts 3 dependents

dpc10ster

RJafroc:Artificial Intelligence Systems and Observer Performance

Analyzing the performance of artificial intelligence (AI) systems/algorithms characterized by a 'search-and-report' strategy. Historically observer performance has dealt with measuring radiologists' performances in search tasks, e.g., searching for lesions in medical images and reporting them, but the implicit location information has been ignored. The implemented methods apply to analyzing the absolute and relative performances of AI systems, comparing AI performance to a group of human readers or optimizing the reporting threshold of an AI system. In addition to performing historical receiver operating receiver operating characteristic (ROC) analysis (localization information ignored), the software also performs free-response receiver operating characteristic (FROC) analysis, where lesion localization information is used. A book using the software has been published: Chakraborty DP: Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, Taylor-Francis LLC; 2017: <https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840>. Online updates to this book, which use the software, are at <https://dpc10ster.github.io/RJafrocQuickStart/>, <https://dpc10ster.github.io/RJafrocRocBook/> and at <https://dpc10ster.github.io/RJafrocFrocBook/>. Supported data collection paradigms are the ROC, FROC and the location ROC (LROC). ROC data consists of single ratings per images, where a rating is the perceived confidence level that the image is that of a diseased patient. An ROC curve is a plot of true positive fraction vs. false positive fraction. FROC data consists of a variable number (zero or more) of mark-rating pairs per image, where a mark is the location of a reported suspicious region and the rating is the confidence level that it is a real lesion. LROC data consists of a rating and a location of the most suspicious region, for every image. Four models of observer performance, and curve-fitting software, are implemented: the binormal model (BM), the contaminated binormal model (CBM), the correlated contaminated binormal model (CORCBM), and the radiological search model (RSM). Unlike the binormal model, CBM, CORCBM and RSM predict 'proper' ROC curves that do not inappropriately cross the chance diagonal. Additionally, RSM parameters are related to search performance (not measured in conventional ROC analysis) and classification performance. Search performance refers to finding lesions, i.e., true positives, while simultaneously not finding false positive locations. Classification performance measures the ability to distinguish between true and false positive locations. Knowing these separate performances allows principled optimization of reader or AI system performance. This package supersedes Windows JAFROC (jackknife alternative FROC) software V4.2.1, <https://github.com/dpc10ster/WindowsJafroc>. Package functions are organized as follows. Data file related function names are preceded by 'Df', curve fitting functions by 'Fit', included data sets by 'dataset', plotting functions by 'Plot', significance testing functions by 'St', sample size related functions by 'Ss', data simulation functions by 'Simulate' and utility functions by 'Util'. Implemented are figures of merit (FOMs) for quantifying performance and functions for visualizing empirical or fitted operating characteristics: e.g., ROC, FROC, alternative FROC (AFROC) and weighted AFROC (wAFROC) curves. For fully crossed study designs significance testing of reader-averaged FOM differences between modalities is implemented via either Dorfman-Berbaum-Metz or the Obuchowski-Rockette methods. Also implemented is single modality analysis, which allows comparison of performance of a group of radiologists to a specified value, or comparison of AI to a group of radiologists interpreting the same cases. Crossed-modality analysis is implemented wherein there are two crossed modality factors and the aim is to determined performance in each modality factor averaged over all levels of the second factor. Sample size estimation tools are provided for ROC and FROC studies; these use estimates of the relevant variances from a pilot study to predict required numbers of readers and cases in a pivotal study to achieve the desired power. Utility and data file manipulation functions allow data to be read in any of the currently used input formats, including Excel, and the results of the analysis can be viewed in text or Excel output files. The methods are illustrated with several included datasets from the author's collaborations. This update includes improvements to the code, some as a result of user-reported bugs and new feature requests, and others discovered during ongoing testing and code simplification.

Maintained by Dev Chakraborty. Last updated 5 months ago.

ai-optimization artificial-intelligence-algorithms computer-aided-diagnosis froc-analysis roc-analysis target-classification target-localization cpp

11.2 match 19 stars 5.69 score 65 scripts

data-edu

tidyLPA:Easily Carry Out Latent Profile Analysis (LPA) Using Open-Source or Commercial Software

Easily carry out latent profile analysis ("LPA"), determine the correct number of classes based on best practices, and tabulate and plot the results. Provides functionality to estimate commonly-specified models with free means, variances, and covariances for each profile. Follows a tidy approach, in that output is in the form of a data frame that can subsequently be computed on. Models can be estimated using the free open source 'R' packages 'Mclust' and 'OpenMx', or using the commercial program 'MPlus', via the 'MplusAutomation' package.

Maintained by Joshua M Rosenberg. Last updated 1 years ago.

7.3 match 57 stars 8.71 score 121 scripts

cran

coxme:Mixed Effects Cox Models

Fit Cox proportional hazards models containing both fixed and random effects. The random effects can have a general form, of which familial interactions (a "kinship" matrix) is a particular special case. Note that the simplest case of a mixed effects Cox model, i.e. a single random per-group intercept, is also called a "frailty" model. The approach is based on Ripatti and Palmgren, Biometrics 2002.

Maintained by Terry M. Therneau. Last updated 7 months ago.

7.1 match 2 stars 8.78 score 562 scripts 15 dependents

rmheiberger

HH:Statistical Analysis and Data Display: Heiberger and Holland

Support software for Statistical Analysis and Data Display (Second Edition, Springer, ISBN 978-1-4939-2121-8, 2015) and (First Edition, Springer, ISBN 0-387-40270-5, 2004) by Richard M. Heiberger and Burt Holland. This contemporary presentation of statistical methods features extensive use of graphical displays for exploring data and for displaying the analysis. The second edition includes redesigned graphics and additional chapters. The authors emphasize how to construct and interpret graphs, discuss principles of graphical design, and show how accompanying traditional tabular results are used to confirm the visual impressions derived directly from the graphs. Many of the graphical formats are novel and appear here for the first time in print. All chapters have exercises. All functions introduced in the book are in the package. R code for all examples, both graphs and tables, in the book is included in the scripts directory of the package.

Maintained by Richard M. Heiberger. Last updated 1 months ago.

9.7 match 3 stars 6.42 score 752 scripts 5 dependents

alinamateikondylis

sampling:Survey Sampling

Functions to draw random samples using different sampling schemes are available. Functions are also provided to obtain (generalized) calibration weights, different estimators, as well some variance estimators.

Maintained by Alina Matei. Last updated 1 years ago.

7.7 match 2 stars 7.98 score 772 scripts 29 dependents

easystats

insight:Easy Access to Model Information for Various Model Objects

A tool to provide an easy, intuitive and consistent access to information contained in various R models, like model formulas, model terms, information about random effects, data that was used to fit the model or data from response variables. 'insight' mainly revolves around two types of functions: Functions that find (the names of) information, starting with 'find_', and functions that get the underlying data, starting with 'get_'. The package has a consistent syntax and works with many different model objects, where otherwise functions to access these information are missing.

Maintained by Daniel Lüdecke. Last updated 5 days ago.

easystats hacktoberfest insight models names predictors random

3.5 match 412 stars 17.24 score 568 scripts 210 dependents

satijalab

Seurat:Tools for Single Cell Genomics

A toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. 'Seurat' aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031>, and Hao, Hao, et al (2020) <doi:10.1101/2020.10.12.335331> for more details.

Maintained by Paul Hoffman. Last updated 1 years ago.

human-cell-atlas single-cell-genomics single-cell-rna-seq cpp

3.6 match 2.4k stars 16.86 score 50k scripts 73 dependents

sinnweja

vcpen:Penalized Variance Components Analysis

Method to perform penalized variance component analysis.

Maintained by Jason Sinnwell. Last updated 3 years ago.

openblas cpp

22.6 match 2.70 score 1 scripts

meireles

spectrolab:Class and Methods for Spectral Data

Input/Output, processing and visualization of spectra taken with different spectrometers, including SVC (Spectra Vista), ASD and PSR (Spectral Evolution). Implements an S3 class spectra that other packages can build on. Provides methods to access, plot, manipulate, splice sensor overlap, vector normalize and smooth spectra.

Maintained by Jose Eduardo Meireles. Last updated 2 months ago.

science

8.3 match 16 stars 7.39 score 256 scripts

bioc

sparseMatrixStats:Summary Statistics for Rows and Columns of Sparse Matrices

High performance functions for row and column operations on sparse matrices. For example: col / rowMeans2, col / rowMedians, col / rowVars etc. Currently, the optimizations are limited to data in the column sparse format. This package is inspired by the matrixStats package by Henrik Bengtsson.

Maintained by Constantin Ahlmann-Eltze. Last updated 5 months ago.

infrastructure software datarepresentation cpp

5.1 match 54 stars 11.98 score 174 scripts 126 dependents

cecileproust-lima

lcmm:Extended Mixed Models Using Latent Classes and Latent Processes

Estimation of various extensions of the mixed models including latent class mixed models, joint latent class mixed models, mixed models for curvilinear outcomes, mixed models for multivariate longitudinal outcomes using a maximum likelihood estimation method (Proust-Lima, Philipps, Liquet (2017) <doi:10.18637/jss.v078.i02>).

Maintained by Cecile Proust-Lima. Last updated 1 months ago.

fortran

5.3 match 62 stars 11.41 score 249 scripts 7 dependents

devillemereuil

QGglmm:Estimate Quantitative Genetics Parameters from Generalised Linear Mixed Models

Compute various quantitative genetics parameters from a Generalised Linear Mixed Model (GLMM) estimates. Especially, it yields the observed phenotypic mean, phenotypic variance and additive genetic variance.

Maintained by Pierre de Villemereuil. Last updated 2 months ago.

9.5 match 15 stars 6.38 score 32 scripts

bioc

genefilter:genefilter: methods for filtering genes from high-throughput experiments

Some basic functions for filtering genes.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

microarray fortran cpp

5.4 match 11.10 score 2.4k scripts 142 dependents

usepa

spsurvey:Spatial Sampling Design and Analysis

A design-based approach to statistical inference, with a focus on spatial data. Spatially balanced samples are selected using the Generalized Random Tessellation Stratified (GRTS) algorithm. The GRTS algorithm can be applied to finite resources (point geometries) and infinite resources (linear / linestring and areal / polygon geometries) and flexibly accommodates a diverse set of sampling design features, including stratification, unequal inclusion probabilities, proportional (to size) inclusion probabilities, legacy (historical) sites, a minimum distance between sites, and two options for replacement sites (reverse hierarchical order and nearest neighbor). Data are analyzed using a wide range of analysis functions that perform categorical variable analysis, continuous variable analysis, attributable risk analysis, risk difference analysis, relative risk analysis, change analysis, and trend analysis. spsurvey can also be used to summarize objects, visualize objects, select samples that are not spatially balanced, select panel samples, measure the amount of spatial balance in a sample, adjust design weights, and more. For additional details, see Dumelle et al. (2023) <doi:10.18637/jss.v105.i03>.

Maintained by Michael Dumelle. Last updated 7 months ago.

ord

6.6 match 15 stars 8.98 score 241 scripts 1 dependents

bioc

MatrixGenerics:S4 Generic Summary Statistic Functions that Operate on Matrix-Like Objects

S4 generic functions modeled after the 'matrixStats' API for alternative matrix implementations. Packages with alternative matrix implementation can depend on this package and implement the generic functions that are defined here for a useful set of row and column summary statistics. Other package developers can import this package and handle a different matrix implementations without worrying about incompatibilities.

Maintained by Peter Hickey. Last updated 2 months ago.

infrastructure software bioconductor-package core-package

5.1 match 12 stars 11.64 score 129 scripts 1.3k dependents

glmmtmb

glmmTMB:Generalized Linear Mixed Models using Template Model Builder

Fit linear and generalized linear mixed models with various extensions, including zero-inflation. The models are fitted using maximum likelihood estimation via 'TMB' (Template Model Builder). Random effects are assumed to be Gaussian on the scale of the linear predictor and are integrated out using the Laplace approximation. Gradients are calculated using automatic differentiation.

Maintained by Mollie Brooks. Last updated 12 days ago.

cpp openmp

3.5 match 312 stars 16.77 score 3.7k scripts 24 dependents

bioc

scater:Single-Cell Analysis Toolkit for Gene Expression Data in R

A collection of tools for doing various analyses of single-cell RNA-seq gene expression data, with a focus on quality control and visualization.

Maintained by Alan OCallaghan. Last updated 9 days ago.

immunooncology singlecell rnaseq qualitycontrol preprocessing normalization visualization dimensionreduction transcriptomics geneexpression sequencing software dataimport datarepresentation infrastructure coverage

5.3 match 11.07 score 12k scripts 43 dependents

easystats

performance:Assessment of Regression Models Performance

Utilities for computing measures to assess model quality, which are not directly provided by R's 'base' or 'stats' packages. These include e.g. measures like r-squared, intraclass correlation coefficient (Nakagawa, Johnson & Schielzeth (2017) <doi:10.1098/rsif.2017.0213>), root mean squared error or functions to check models for overdispersion, singularity or zero-inflation and more. Functions apply to a large variety of regression models, including generalized linear models, mixed effects models and Bayesian models. References: Lüdecke et al. (2021) <doi:10.21105/joss.03139>.

Maintained by Daniel Lüdecke. Last updated 19 days ago.

aic easystats hacktoberfest loo machine-learning mixed-models models performance r2 statistics

3.6 match 1.1k stars 16.17 score 4.3k scripts 47 dependents

heliosdrm

pwr:Basic Functions for Power Analysis

Power analysis functions along the lines of Cohen (1988).

Maintained by Helios De Rosario. Last updated 1 years ago.

4.5 match 105 stars 12.97 score 2.6k scripts 28 dependents

sb452

MendelianRandomization:Mendelian Randomization Package

Encodes several methods for performing Mendelian randomization analyses with summarized data. Summarized data on genetic associations with the exposure and with the outcome can be obtained from large consortia. These data can be used for obtaining causal estimates using instrumental variable methods.

Maintained by Stephen Burgess. Last updated 2 years ago.

openblas cpp

8.5 match 1 stars 6.83 score 940 scripts 1 dependents

cran

geesmv:Modified Variance Estimators for Generalized Estimating Equations

Generalized estimating equations with the original sandwich variance estimator proposed by Liang and Zeger (1986), and eight types of more recent modified variance estimators for improving the finite small-sample performance.

Maintained by Zheng Li. Last updated 9 years ago.

32.7 match 1.78 score

paulgovan

PRA:Project Risk Analysis

Data analysis for Project Risk Management via the Second Moment Method, Monte Carlo Simulation, Contingency Analysis, Sensitivity Analysis, Earned Value Management, Learning Curves, Design Structure Matrices, and more.

Maintained by Paul Govan. Last updated 3 months ago.

causal-networks contingency-analysis monte-carlo-simulation risk-analysis sensitivity-analysis

9.9 match 3 stars 5.82 score 8 scripts

bozenne

LMMstar:Repeated Measurement Models for Discrete Times

Companion R package for the course "Statistical analysis of correlated and repeated measurements for health science researchers" taught by the section of Biostatistics of the University of Copenhagen. It implements linear mixed models where the model for the variance-covariance of the residuals is specified via patterns (compound symmetry, toeplitz, unstructured, ...). Statistical inference for mean, variance, and correlation parameters is performed based on the observed information and a Satterthwaite approximation of the degrees of freedom. Normalized residuals are provided to assess model misspecification. Statistical inference can be performed for arbitrary linear or non-linear combination(s) of model coefficients. Predictions can be computed conditional to covariates only or also to outcome values.

Maintained by Brice Ozenne. Last updated 5 months ago.

9.2 match 4 stars 6.28 score 141 scripts

stats-uoa

s20x:Functions for University of Auckland Course STATS 201/208 Data Analysis

A set of functions used in teaching STATS 201/208 Data Analysis at the University of Auckland. The functions are designed to make parts of R more accessible to a large undergraduate population who are mostly not statistics majors.

Maintained by James Curran. Last updated 2 years ago.

9.0 match 3 stars 6.40 score 211 scripts 3 dependents

r4ss

r4ss:R Code for Stock Synthesis

A collection of R functions for use with Stock Synthesis, a fisheries stock assessment modeling platform written in ADMB by Dr. Richard D. Methot at the NOAA Northwest Fisheries Science Center. The functions include tools for summarizing and plotting results, manipulating files, visualizing model parameterizations, and various other common stock assessment tasks. This version of '{r4ss}' is compatible with Stock Synthesis versions 3.24 through 3.30 (specifically version 3.30.23.1, from December 2024). Support for 3.24 models is only through the core functions for reading output and plotting.

Maintained by Ian G. Taylor. Last updated 5 days ago.

fisheries fisheries-stock-assessment stock-synthesis

5.1 match 43 stars 11.38 score 1.0k scripts 2 dependents

friendly

heplots:Visualizing Hypothesis Tests in Multivariate Linear Models

Provides HE plot and other functions for visualizing hypothesis tests in multivariate linear models. HE plots represent sums-of-squares-and-products matrices for linear hypotheses and for error using ellipses (in two dimensions) and ellipsoids (in three dimensions). The related 'candisc' package provides visualizations in a reduced-rank canonical discriminant space when there are more than a few response variables.

Maintained by Michael Friendly. Last updated 9 days ago.

linear-hypotheses matrices multivariate-linear-models plot repeated-measure-designs visualizing-hypothesis-tests

5.0 match 9 stars 11.49 score 1.1k scripts 7 dependents

nceas

codyn:Community Dynamics Metrics

Univariate and multivariate temporal and spatial diversity indices, rank abundance curves, and community stability measures. The functions implement measures that are either explicitly temporal and include the option to calculate them over multiple replicates, or spatial and include the option to calculate them over multiple time points. Functions fall into five categories: static diversity indices, temporal diversity indices, spatial diversity indices, rank abundance curves, and community stability measures. The diversity indices are temporal and spatial analogs to traditional diversity indices. Specifically, the package includes functions to calculate community richness, evenness and diversity at a given point in space and time. In addition, it contains functions to calculate species turnover, mean rank shifts, and lags in community similarity between two time points. Details of the methods are available in Hallett et al. (2016) <doi:10.1111/2041-210X.12569> and Avolio et al. (2019) <doi:10.1002/ecs2.2881>.

Maintained by Matthew B. Jones. Last updated 4 years ago.

6.3 match 34 stars 9.07 score 230 scripts

r-forge

tramME:Transformation Models with Mixed Effects

Likelihood-based estimation of mixed-effects transformation models using the Template Model Builder ('TMB', Kristensen et al., 2016) <doi:10.18637/jss.v070.i05>. The technical details of transformation models are given in Hothorn et al. (2018) <doi:10.1111/sjos.12291>. Likelihood contributions of exact, randomly censored (left, right, interval) and truncated observations are supported. The random effects are assumed to be normally distributed on the scale of the transformation function, the marginal likelihood is evaluated using the Laplace approximation, and the gradients are calculated with automatic differentiation (Tamasi & Hothorn, 2021) <doi:10.32614/RJ-2021-075>. Penalized smooth shift terms can be defined using 'mgcv'.

Maintained by Balint Tamasi. Last updated 4 days ago.

cpp openmp

13.9 match 4.12 score 1 scripts

r-forge

car:Companion to Applied Regression

Functions to Accompany J. Fox and S. Weisberg, An R Companion to Applied Regression, Third Edition, Sage, 2019.

Maintained by John Fox. Last updated 5 months ago.

3.8 match 15.29 score 43k scripts 901 dependents

tyee001

VGAM:Vector Generalized Linear and Additive Models

An implementation of about 6 major classes of statistical regression models. The central algorithm is Fisher scoring and iterative reweighted least squares. At the heart of this package are the vector generalized linear and additive model (VGLM/VGAM) classes. VGLMs can be loosely thought of as multivariate GLMs. VGAMs are data-driven VGLMs that use smoothing. The book "Vector Generalized Linear and Additive Models: With an Implementation in R" (Yee, 2015) <DOI:10.1007/978-1-4939-2818-7> gives details of the statistical framework and the package. Currently only fixed-effects models are implemented. Many (100+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE. The other classes are RR-VGLMs (reduced-rank VGLMs), quadratic RR-VGLMs, doubly constrained RR-VGLMs, quadratic RR-VGLMs, reduced-rank VGAMs, RCIMs (row-column interaction models)---these classes perform constrained and unconstrained quadratic ordination (CQO/UQO) models in ecology, as well as constrained additive ordination (CAO). Hauck-Donner effect detection is implemented. Note that these functions are subject to change; see the NEWS and ChangeLog files for latest changes.

Maintained by Thomas Yee. Last updated 1 months ago.

fortran

5.4 match 10 stars 10.67 score 3.6k scripts 169 dependents

mi2datalab

tidycharts:Generate Tidy Charts Inspired by 'IBCS'

There is a wide range of R packages created for data visualization, but still, there was no simple and easily accessible way to create clean and transparent charts - up to now. The 'tidycharts' package enables the user to generate charts compliant with International Business Communication Standards ('IBCS'). It means unified bar widths, colors, chart sizes, etc. Creating homogeneous reports has never been that easy! Additionally, users can apply semantic notation to indicate different data scenarios (plan, budget, forecast). What's more, it is possible to customize the charts by creating a personal color pallet with the possibility of switching to default options after the experiments. We wanted the package to be helpful in writing reports, so we also made joining charts in a one, clear image possible. All charts are generated in SVG format and can be shown in the 'RStudio' viewer pane or exported to HTML output of 'knitr'/'markdown'.

Maintained by Bartosz Sawicki. Last updated 3 years ago.

charts clean ibcs visualization

10.8 match 5 stars 5.23 score 17 scripts

elvanceyhan

nnspat:Nearest Neighbor Methods for Spatial Patterns

Contains the functions for testing the spatial patterns (of segregation, spatial symmetry, association, disease clustering, species correspondence and reflexivity) based on nearest neighbor relations, especially using contingency tables such as nearest neighbor contingency tables (Ceyhan (2010) <doi:10.1007/s10651-008-0104-x> and Ceyhan (2017) <doi:10.1016/j.jkss.2016.10.002> and references therein), nearest neighbor symmetry contingency tables (Ceyhan (2014) <doi:10.1155/2014/698296>), species correspondence contingency tables and reflexivity contingency tables (Ceyhan (2018) <doi:10.2436/20.8080.02.72>) for two (or higher) dimensional data. Also contains functions for generating patterns of segregation, association, uniformity in a multi-class setting (Ceyhan (2014) <doi:10.1007/s00477-013-0824-9>), and various non-random labeling patterns for disease clustering in two dimensional cases (Ceyhan (2014) <doi:10.1002/sim.6053>), and for visualization of all these patterns for the two dimensional data. The tests are usually (asymptotic) normal z-tests and chi-square tests.

Maintained by Elvan Ceyhan. Last updated 3 years ago.

19.4 match 2.90 score 16 scripts

jepusto

lmeInfo:Information Matrices for 'lmeStruct' and 'glsStruct' Objects

Provides analytic derivatives and information matrices for fitted linear mixed effects (lme) models and generalized least squares (gls) models estimated using lme() (from package 'nlme') and gls() (from package 'nlme'), respectively. The package includes functions for estimating the sampling variance-covariance of variance component parameters using the inverse Fisher information. The variance components include the parameters of the random effects structure (for lme models), the variance structure, and the correlation structure. The expected and average forms of the Fisher information matrix are used in the calculations, and models estimated by full maximum likelihood or restricted maximum likelihood are supported. The package also includes a function for estimating standardized mean difference effect sizes (Pustejovsky, Hedges, and Shadish (2014) <DOI:10.3102/1076998614547577>) based on fitted lme or gls models.

Maintained by Man Chen. Last updated 2 years ago.

9.3 match 4 stars 6.03 score 34 scripts 4 dependents

mrcieu

mrbayes:Bayesian Summary Data Models for Mendelian Randomization Studies

Bayesian estimation of inverse variance weighted (IVW), Burgess et al. (2013) <doi:10.1002/gepi.21758>, and MR-Egger, Bowden et al. (2015) <doi:10.1093/ije/dyv080>, summary data models for Mendelian randomization analyses.

Maintained by Tom Palmer. Last updated 11 days ago.

cpp

10.3 match 4 stars 5.45 score 2 scripts

e-sensing

sits:Satellite Image Time Series Analysis for Earth Observation Data Cubes

An end-to-end toolkit for land use and land cover classification using big Earth observation data, based on machine learning methods applied to satellite image data cubes, as described in Simoes et al (2021) <doi:10.3390/rs13132428>. Builds regular data cubes from collections in AWS, Microsoft Planetary Computer, Brazil Data Cube, Copernicus Data Space Environment (CDSE), Digital Earth Africa, Digital Earth Australia, NASA HLS using the Spatio-temporal Asset Catalog (STAC) protocol (<https://stacspec.org/>) and the 'gdalcubes' R package developed by Appel and Pebesma (2019) <doi:10.3390/data4030092>. Supports visualization methods for images and time series and smoothing filters for dealing with noisy time series. Includes functions for quality assessment of training samples using self-organized maps as presented by Santos et al (2021) <doi:10.1016/j.isprsjprs.2021.04.014>. Includes methods to reduce training samples imbalance proposed by Chawla et al (2002) <doi:10.1613/jair.953>. Provides machine learning methods including support vector machines, random forests, extreme gradient boosting, multi-layer perceptrons, temporal convolutional neural networks proposed by Pelletier et al (2019) <doi:10.3390/rs11050523>, and temporal attention encoders by Garnot and Landrieu (2020) <doi:10.48550/arXiv.2007.00586>. Supports GPU processing of deep learning models using torch <https://torch.mlverse.org/>. Performs efficient classification of big Earth observation data cubes and includes functions for post-classification smoothing based on Bayesian inference as described by Camara et al (2024) <doi:10.3390/rs16234572>, and methods for active learning and uncertainty assessment. Supports region-based time series analysis using package supercells <https://jakubnowosad.com/supercells/>. Enables best practices for estimating area and assessing accuracy of land change as recommended by Olofsson et al (2014) <doi:10.1016/j.rse.2014.02.015>. Minimum recommended requirements: 16 GB RAM and 4 CPU dual-core.

Maintained by Gilberto Camara. Last updated 1 months ago.

big-earth-data cbers earth-observation eo-datacubes geospatial image-time-series land-cover-classification landsat planetary-computer r-spatial remote-sensing rspatial satellite-image-time-series satellite-imagery sentinel-2 stac-api stac-catalog cpp

5.9 match 494 stars 9.50 score 384 scripts

braverock

PerformanceAnalytics:Econometric Tools for Performance and Risk Analysis

Collection of econometric functions for performance and risk analysis. In addition to standard risk and performance metrics, this package aims to aid practitioners and researchers in utilizing the latest research in analysis of non-normal return streams. In general, it is most tested on return (rather than price) data on a regular scale, but most functions will work with irregular return data as well, and increasing numbers of functions will work with P&L or price data where possible.

Maintained by Brian G. Peterson. Last updated 3 months ago.

3.5 match 222 stars 15.93 score 4.8k scripts 20 dependents

r-forge

coin:Conditional Inference Procedures in a Permutation Test Framework

Conditional inference procedures for the general independence problem including two-sample, K-sample (non-parametric ANOVA), correlation, censored, ordered and multivariate problems described in <doi:10.18637/jss.v028.i08>.

Maintained by Torsten Hothorn. Last updated 9 months ago.

4.8 match 11.68 score 1.6k scripts 74 dependents

jeffreyevans

spatialEco:Spatial Analysis and Modelling Utilities

Utilities to support spatial data manipulation, query, sampling and modelling in ecological applications. Functions include models for species population density, spatial smoothing, multivariate separability, point process model for creating pseudo- absences and sub-sampling, Quadrant-based sampling and analysis, auto-logistic modeling, sampling models, cluster optimization, statistical exploratory tools and raster-based metrics.

Maintained by Jeffrey S. Evans. Last updated 13 days ago.

biodiversity conservation ecology r-spatial raster spatial vector

5.8 match 110 stars 9.55 score 736 scripts 2 dependents

jonathancornelissen

highfrequency:Tools for Highfrequency Data Analysis

Provide functionality to manage, clean and match highfrequency trades and quotes data, calculate various liquidity measures, estimate and forecast volatility, detect price jumps and investigate microstructure noise and intraday periodicity. A detailed vignette can be found in the paper "Analyzing Intraday Financial Data in R: The highfrequency Package" by Boudt, Kleen, and Sjoerup (2022, <doi:10.18637/jss.v104.i08>). The DOI in the CITATION is for a new Journal of Statistical Software publication that will be registered after publication on CRAN. A working paper version can be found on SSRN: <doi:10.2139/ssrn.3917548>.

Maintained by Kris Boudt. Last updated 2 years ago.

openblas cpp openmp

7.5 match 152 stars 7.37 score 286 scripts

lcrawlab

mvMAPIT:Multivariate Genome Wide Marginal Epistasis Test

Epistasis, commonly defined as the interaction between genetic loci, is known to play an important role in the phenotypic variation of complex traits. As a result, many statistical methods have been developed to identify genetic variants that are involved in epistasis, and nearly all of these approaches carry out this task by focusing on analyzing one trait at a time. Previous studies have shown that jointly modeling multiple phenotypes can often dramatically increase statistical power for association mapping. In this package, we present the 'multivariate MArginal ePIstasis Test' ('mvMAPIT') – a multi-outcome generalization of a recently proposed epistatic detection method which seeks to detect marginal epistasis or the combined pairwise interaction effects between a given variant and all other variants. By searching for marginal epistatic effects, one can identify genetic variants that are involved in epistasis without the need to identify the exact partners with which the variants interact – thus, potentially alleviating much of the statistical and computational burden associated with conventional explicit search based methods. Our proposed 'mvMAPIT' builds upon this strategy by taking advantage of correlation structure between traits to improve the identification of variants involved in epistasis. We formulate 'mvMAPIT' as a multivariate linear mixed model and develop a multi-trait variance component estimation algorithm for efficient parameter inference and P-value computation. Together with reasonable model approximations, our proposed approach is scalable to moderately sized genome-wide association studies. Crawford et al. (2017) <doi:10.1371/journal.pgen.1006869>. Stamp et al. (2023) <doi:10.1093/g3journal/jkad118>.

Maintained by Julian Stamp. Last updated 5 months ago.

cpp epistasis epistasis-analysis gwas gwas-tools linear-mixed-models mapit mvmapit variance-components openblas cpp openmp

8.0 match 11 stars 6.90 score 17 scripts 1 dependents

elvanceyhan

pcds:Proximity Catch Digraphs and Their Applications

Contains the functions for construction and visualization of various families of the proximity catch digraphs (PCDs) (see (Ceyhan (2005) ISBN:978-3-639-19063-2), for computing the graph invariants for testing the patterns of segregation and association against complete spatial randomness (CSR) or uniformity in one, two and three dimensional cases. The package also has tools for generating points from these spatial patterns. The graph invariants used in testing spatial point data are the domination number (Ceyhan (2011) <doi:10.1080/03610921003597211>) and arc density (Ceyhan et al. (2006) <doi:10.1016/j.csda.2005.03.002>; Ceyhan et al. (2007) <doi:10.1002/cjs.5550350106>). The PCD families considered are Arc-Slice PCDs, Proportional-Edge PCDs, and Central Similarity PCDs.

Maintained by Elvan Ceyhan. Last updated 2 years ago.

9.5 match 5.80 score 21 scripts 2 dependents

briencj

asremlPlus:Augments 'ASReml-R' in Fitting Mixed Models and Packages Generally in Exploring Prediction Differences

Assists in automating the selection of terms to include in mixed models when 'asreml' is used to fit the models. Procedures are available for choosing models that conform to the hierarchy or marginality principle, for fitting and choosing between two-dimensional spatial models using correlation, natural cubic smoothing spline and P-spline models. A history of the fitting of a sequence of models is kept in a data frame. Also used to compute functions and contrasts of, to investigate differences between and to plot predictions obtained using any model fitting function. The content falls into the following natural groupings: (i) Data, (ii) Model modification functions, (iii) Model selection and description functions, (iv) Model diagnostics and simulation functions, (v) Prediction production and presentation functions, (vi) Response transformation functions, (vii) Object manipulation functions, and (viii) Miscellaneous functions (for further details see 'asremlPlus-package' in help). The 'asreml' package provides a computationally efficient algorithm for fitting a wide range of linear mixed models using Residual Maximum Likelihood. It is a commercial package and a license for it can be purchased from 'VSNi' <https://vsni.co.uk/> as 'asreml-R', who will supply a zip file for local installation/updating (see <https://asreml.kb.vsni.co.uk/>). It is not needed for functions that are methods for 'alldiffs' and 'data.frame' objects. The package 'asremPlus' can also be installed from <http://chris.brien.name/rpackages/>.

Maintained by Chris Brien. Last updated 28 days ago.

asreml mixed-models

5.9 match 19 stars 9.34 score 200 scripts

zackfisher

robumeta:Robust Variance Meta-Regression

Functions for conducting robust variance estimation (RVE) meta-regression using both large and small sample RVE estimators under various weighting schemes. These methods are distribution free and provide valid point estimates, standard errors and hypothesis tests even when the degree and structure of dependence between effect sizes is unknown. Also included are functions for conducting sensitivity analyses under correlated effects weighting and producing RVE-based forest plots.

Maintained by Zachary Fisher. Last updated 4 years ago.

7.1 match 8 stars 7.75 score 178 scripts 4 dependents

bsaul

geex:An API for M-Estimation

Provides a general, flexible framework for estimating parameters and empirical sandwich variance estimator from a set of unbiased estimating equations (i.e., M-estimation in the vein of Stefanski & Boos (2002) <doi:10.1198/000313002753631330>). All examples from Stefanski & Boos (2002) are published in the corresponding Journal of Statistical Software paper "The Calculus of M-Estimation in R with geex" by Saul & Hudgens (2020) <doi:10.18637/jss.v092.i02>. Also provides an API to compute finite-sample variance corrections.

Maintained by Bradley Saul. Last updated 10 months ago.

asymptotics covariance-estimates covariance-estimation estimate-parameters estimating-equations estimation inference m-estimation robust sandwich

7.1 match 8 stars 7.70 score 131 scripts 2 dependents

inlabru-org

inlabru:Bayesian Latent Gaussian Modelling using INLA and Extensions

Facilitates spatial and general latent Gaussian modeling using integrated nested Laplace approximation via the INLA package (<https://www.r-inla.org>). Additionally, extends the GAM-like model class to more general nonlinear predictor expressions, and implements a log Gaussian Cox process likelihood for modeling univariate and spatial point processes based on ecological survey data. Model components are specified with general inputs and mapping methods to the latent variables, and the predictors are specified via general R expressions, with separate expressions for each observation likelihood model in multi-likelihood models. A prediction method based on fast Monte Carlo sampling allows posterior prediction of general expressions of the latent variables. Ecology-focused introduction in Bachl, Lindgren, Borchers, and Illian (2019) <doi:10.1111/2041-210X.13168>.

Maintained by Finn Lindgren. Last updated 4 days ago.

4.3 match 96 stars 12.62 score 832 scripts 6 dependents

ngreifer

cobalt:Covariate Balance Tables and Plots

Generate balance tables and plots for covariates of groups preprocessed through matching, weighting or subclassification, for example, using propensity scores. Includes integration with 'MatchIt', 'WeightIt', 'MatchThem', 'twang', 'Matching', 'optmatch', 'CBPS', 'ebal', 'cem', 'sbw', and 'designmatch' for assessing balance on the output of their preprocessing functions. Users can also specify data for balance assessment not generated through the above packages. Also included are methods for assessing balance in clustered or multiply imputed data sets or data sets with multi-category, continuous, or longitudinal treatments.

Maintained by Noah Greifer. Last updated 11 months ago.

causal-inference propensity-scores

4.2 match 75 stars 12.98 score 1.0k scripts 8 dependents

jinkim3

kim:A Toolkit for Behavioral Scientists

A collection of functions for analyzing data typically collected or used by behavioral scientists. Examples of the functions include a function that compares groups in a factorial experimental design, a function that conducts two-way analysis of variance (ANOVA), and a function that cleans a data set generated by Qualtrics surveys. Some of the functions will require installing additional package(s). Such packages and other references are cited within the section describing the relevant functions. Many functions in this package rely heavily on these two popular R packages: Dowle et al. (2021) <https://CRAN.R-project.org/package=data.table>. Wickham et al. (2021) <https://CRAN.R-project.org/package=ggplot2>.

Maintained by Jin Kim. Last updated 19 days ago.

11.6 match 7 stars 4.66 score 3 scripts

mboeck11

BGVAR:Bayesian Global Vector Autoregressions

Estimation of Bayesian Global Vector Autoregressions (BGVAR) with different prior setups and the possibility to introduce stochastic volatility. Built-in priors include the Minnesota, the stochastic search variable selection and Normal-Gamma (NG) prior. For a reference see also Crespo Cuaresma, J., Feldkircher, M. and F. Huber (2016) "Forecasting with Global Vector Autoregressive Models: a Bayesian Approach", Journal of Applied Econometrics, Vol. 31(7), pp. 1371-1391 <doi:10.1002/jae.2504>. Post-processing functions allow for doing predictions, structurally identify the model with short-run or sign-restrictions and compute impulse response functions, historical decompositions and forecast error variance decompositions. Plotting functions are also available. The package has a companion paper: Boeck, M., Feldkircher, M. and F. Huber (2022) "BGVAR: Bayesian Global Vector Autoregressions with Shrinkage Priors in R", Journal of Statistical Software, Vol. 104(9), pp. 1-28 <doi:10.18637/jss.v104.i09>.

Maintained by Maximilian Boeck. Last updated 3 months ago.

openblas cpp

7.1 match 27 stars 7.58 score 156 scripts

icasas

tvReg:Time-Varying Coefficient for Single and Multi-Equation Regressions

Fitting time-varying coefficient models for single and multi-equation regressions, using kernel smoothing techniques.

Maintained by Isabel Casas. Last updated 2 years ago.

autoregressive nonparametric regression sure vectorautoregressive

8.6 match 19 stars 6.25 score 62 scripts

friendly

genridge:Generalized Ridge Trace Plots for Ridge Regression

The genridge package introduces generalizations of the standard univariate ridge trace plot used in ridge regression and related methods. These graphical methods show both bias (actually, shrinkage) and precision, by plotting the covariance ellipsoids of the estimated coefficients, rather than just the estimates themselves. 2D and 3D plotting methods are provided, both in the space of the predictor variables and in the transformed space of the PCA/SVD of the predictors.

Maintained by Michael Friendly. Last updated 3 months ago.

bias-variance graphics principal-component-analysis regression-models ridge-regression singular-value-decomposition

11.1 match 4 stars 4.84 score 69 scripts

vandomed

dvmisc:Convenience Functions, Moving Window Statistics, and Graphics

Collection of functions for running and summarizing statistical simulation studies, creating visualizations (e.g. CART Shiny app, histograms with fitted probability mass/density functions), calculating moving-window statistics efficiently, and performing common computations.

Maintained by Dane R. Van Domelen. Last updated 4 years ago.

aic bmi histograms miscellaneous cpp

8.6 match 1 stars 6.18 score 125 scripts 8 dependents

mandymejia

templateICAr:Estimate Brain Networks and Connectivity with ICA and Empirical Priors

Implements the template ICA (independent components analysis) model proposed in Mejia et al. (2020) <doi:10.1080/01621459.2019.1679638> and the spatial template ICA model proposed in proposed in Mejia et al. (2022) <doi:10.1080/10618600.2022.2104289>. Both models estimate subject-level brain as deviations from known population-level networks, which are estimated using standard ICA algorithms. Both models employ an expectation-maximization algorithm for estimation of the latent brain networks and unknown model parameters. Includes direct support for 'CIFTI', 'GIFTI', and 'NIFTI' neuroimaging file formats.

Maintained by Amanda Mejia. Last updated 3 days ago.

8.4 match 10 stars 6.35 score 25 scripts

vitomuggeo

segmented:Regression Models with Break-Points / Change-Points Estimation (with Possibly Random Effects)

Fitting regression models where, in addition to possible linear terms, one or more covariates have segmented (i.e., broken-line or piece-wise linear) or stepmented (i.e. piece-wise constant) effects. Multiple breakpoints for the same variable are allowed. The estimation method is discussed in Muggeo (2003, <doi:10.1002/sim.1545>) and illustrated in Muggeo (2008, <https://www.r-project.org/doc/Rnews/Rnews_2008-1.pdf>). An approach for hypothesis testing is presented in Muggeo (2016, <doi:10.1080/00949655.2016.1149855>), and interval estimation for the breakpoint is discussed in Muggeo (2017, <doi:10.1111/anzs.12200>). Segmented mixed models, i.e. random effects in the change point, are discussed in Muggeo (2014, <doi:10.1177/1471082X13504721>). Estimation of piecewise-constant relationships and changepoints (mean-shift models) is discussed in Fasola et al. (2018, <doi:10.1007/s00180-017-0740-4>).

Maintained by Vito M. R. Muggeo. Last updated 16 days ago.

5.2 match 9 stars 10.03 score 1.2k scripts 203 dependents

venelin

PCMBase:Simulation and Likelihood Calculation of Phylogenetic Comparative Models

Phylogenetic comparative methods represent models of continuous trait data associated with the tips of a phylogenetic tree. Examples of such models are Gaussian continuous time branching stochastic processes such as Brownian motion (BM) and Ornstein-Uhlenbeck (OU) processes, which regard the data at the tips of the tree as an observed (final) state of a Markov process starting from an initial state at the root and evolving along the branches of the tree. The PCMBase R package provides a general framework for manipulating such models. This framework consists of an application programming interface for specifying data and model parameters, and efficient algorithms for simulating trait evolution under a model and calculating the likelihood of model parameters for an assumed model and trait data. The package implements a growing collection of models, which currently includes BM, OU, BM/OU with jumps, two-speed OU as well as mixed Gaussian models, in which different types of the above models can be associated with different branches of the tree. The PCMBase package is limited to trait-simulation and likelihood calculation of (mixed) Gaussian phylogenetic models. The PCMFit package provides functionality for inference of these models to tree and trait data. The package web-site <https://venelin.github.io/PCMBase/> provides access to the documentation and other resources.

Maintained by Venelin Mitov. Last updated 10 months ago.

6.9 match 6 stars 7.56 score 85 scripts 3 dependents

rsquaredacademy

inferr:Inferential Statistics

Select set of parametric and non-parametric statistical tests. 'inferr' builds upon the solid set of statistical tests provided in 'stats' package by including additional data types as inputs, expanding and restructuring the test results. The tests included are t tests, variance tests, proportion tests, chi square tests, Levene's test, McNemar Test, Cochran's Q test and Runs test.

Maintained by Aravind Hebbali. Last updated 4 months ago.

inference inferential-statistics non-parametric parametric statistical-tests cpp

8.5 match 37 stars 6.10 score 34 scripts

decodegenetics

gnonadd:Various Non-Additive Models for Genetic Associations

The goal of 'gnonadd' is to simplify workflows in the analysis of non-additive effects of sequence variants. This includes variance effects (Ivarsdottir et. al (2017) <doi:10.1038/ng.3928>), correlation effects, interaction effects and dominance effects. The package also includes convenience functions for visualization.

Maintained by Audunn S. Snaebjarnarson. Last updated 3 months ago.

19.0 match 1 stars 2.70 score

diystat

NBPSeq:Negative Binomial Models for RNA-Sequencing Data

Negative Binomial (NB) models for two-group comparisons and regression inferences from RNA-Sequencing Data.

Maintained by Yanming Di. Last updated 11 years ago.

10.4 match 1 stars 4.88 score 17 scripts 3 dependents

nicholasjclark

mvgam:Multivariate (Dynamic) Generalized Additive Models

Fit Bayesian Dynamic Generalized Additive Models to multivariate observations. Users can build nonlinear State-Space models that can incorporate semiparametric effects in observation and process components, using a wide range of observation families. Estimation is performed using Markov Chain Monte Carlo with Hamiltonian Monte Carlo in the software 'Stan'. References: Clark & Wells (2023) <doi:10.1111/2041-210X.13974>.

Maintained by Nicholas J Clark. Last updated 21 hours ago.

bayesian-statistics dynamic-factor-models ecological-modelling forecasting gaussian-process generalised-additive-models generalized-additive-models joint-species-distribution-modelling multilevel-models multivariate-timeseries stan time-series-analysis timeseries vector-autoregression vectorautoregression cpp

5.2 match 139 stars 9.85 score 117 scripts

hojsgaard

pbkrtest:Parametric Bootstrap, Kenward-Roger and Satterthwaite Based Methods for Test in Mixed Models

Computes p-values based on (a) Satterthwaite or Kenward-Rogers degree of freedom methods and (b) parametric bootstrap for mixed effects models as implemented in the 'lme4' package. Implements parametric bootstrap test for generalized linear mixed models as implemented in 'lme4' and generalized linear models. The package is documented in the paper by Halekoh and Højsgaard, (2012, <doi:10.18637/jss.v059.i09>). Please see 'citation("pbkrtest")' for citation details.

Maintained by Søren Højsgaard. Last updated 10 days ago.

3.5 match 5 stars 14.36 score 648 scripts 915 dependents

poissonconsulting

extras:Helper Functions for Bayesian Analyses

Functions to 'numericise' 'R' objects (coerce to numeric objects), summarise 'MCMC' (Monte Carlo Markov Chain) samples and calculate deviance residuals as well as 'R' translations of some 'BUGS' (Bayesian Using Gibbs Sampling), 'JAGS' (Just Another Gibbs Sampler), 'STAN' and 'TMB' (Template Model Builder) functions.

Maintained by Nicole Hill. Last updated 2 months ago.

6.0 match 9 stars 8.49 score 15 scripts 16 dependents

gasparrini

mixmeta:An Extended Mixed-Effects Framework for Meta-Analysis

A collection of functions to perform various meta-analytical models through a unified mixed-effects framework, including standard univariate fixed and random-effects meta-analysis and meta-regression, and non-standard extensions such as multivariate, multilevel, longitudinal, and dose-response models.

Maintained by Antonio Gasparrini. Last updated 3 years ago.

7.2 match 13 stars 6.96 score 63 scripts 13 dependents

smartdata-analysis-and-statistics

metamisc:Meta-Analysis of Diagnosis and Prognosis Research Studies

Facilitate frequentist and Bayesian meta-analysis of diagnosis and prognosis research studies. It includes functions to summarize multiple estimates of prediction model discrimination and calibration performance (Debray et al., 2019) <doi:10.1177/0962280218785504>. It also includes functions to evaluate funnel plot asymmetry (Debray et al., 2018) <doi:10.1002/jrsm.1266>. Finally, the package provides functions for developing multivariable prediction models from datasets with clustering (de Jong et al., 2021) <doi:10.1002/sim.8981>.

Maintained by Thomas Debray. Last updated 1 months ago.

meta-analysis prognosis prognostic-models

6.7 match 7 stars 7.48 score 102 scripts

easystats

effectsize:Indices of Effect Size

Provide utilities to work with indices of effect size for a wide variety of models and hypothesis tests (see list of supported models using the function 'insight::supported_models()'), allowing computation of and conversion between indices such as Cohen's d, r, odds, etc. References: Ben-Shachar et al. (2020) <doi:10.21105/joss.02815>.

Maintained by Mattan S. Ben-Shachar. Last updated 1 months ago.

anova cohens-d compute conversion correlation effect-size effectsize hacktoberfest hedges-g interpretation standardization standardized statistics

3.0 match 344 stars 16.38 score 1.8k scripts 29 dependents

mestherlv

mme:Multinomial Mixed Effects Models

Fit Gaussian Multinomial mixed-effects models for small area estimation: Model 1, with one random effect in each category of the response variable (Lopez-Vizcaino,E. et al., 2013) <doi:10.1177/1471082X13478873>; Model 2, introducing independent time effect; Model 3, introducing correlated time effect. mme calculates direct and parametric bootstrap MSE estimators (Lopez-Vizcaino,E et al., 2014) <doi:10.1111/rssa.12085>.

Maintained by E. Lopez-Vizcaino. Last updated 6 years ago.

19.2 match 1 stars 2.59 score 39 scripts

cran

agricolae:Statistical Procedures for Agricultural Research

Original idea was presented in the thesis "A statistical analysis tool for agricultural research" to obtain the degree of Master on science, National Engineering University (UNI), Lima-Peru. Some experimental data for the examples come from the CIP and others research. Agricolae offers extensive functionality on experimental design especially for agricultural and plant breeding experiments, which can also be useful for other purposes. It supports planning of lattice, Alpha, Cyclic, Complete Block, Latin Square, Graeco-Latin Squares, augmented block, factorial, split and strip plot designs. There are also various analysis facilities for experimental data, e.g. treatment comparison procedures and several non-parametric tests comparison, biodiversity indexes and consensus cluster.

Maintained by Felipe de Mendiburu. Last updated 1 years ago.

7.1 match 7 stars 7.01 score 15 dependents

tmatta

lsasim:Functions to Facilitate the Simulation of Large Scale Assessment Data

Provides functions to simulate data from large-scale educational assessments, including background questionnaire data and cognitive item responses that adhere to a multiple-matrix sampled design. The theoretical foundation can be found on Matta, T.H., Rutkowski, L., Rutkowski, D. et al. (2018) <doi:10.1186/s40536-018-0068-8>.

Maintained by Waldir Leoncio. Last updated 2 months ago.

7.8 match 6 stars 6.41 score 18 scripts

jfieberg

SightabilityModel:Wildlife Sightability Modeling

Uses logistic regression to model the probability of detection as a function of covariates. This model is then used with observational survey data to estimate population size, while accounting for uncertain detection. See Steinhorst and Samuel (1989).

Maintained by Schwarz Carl James. Last updated 2 years ago.

10.0 match 1 stars 4.96 score 23 scripts

cmsaf

cmsafops:Tools for CM SAF NetCDF Data

The Satellite Application Facility on Climate Monitoring (CM SAF) is a ground segment of the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) and one of EUMETSATs Satellite Application Facilities. The CM SAF contributes to the sustainable monitoring of the climate system by providing essential climate variables related to the energy and water cycle of the atmosphere (<https://www.cmsaf.eu>). It is a joint cooperation of eight National Meteorological and Hydrological Services. The 'cmsafops' R-package provides a collection of R-operators for the analysis and manipulation of CM SAF NetCDF formatted data. Other CF conform NetCDF data with time, longitude and latitude dimension should be applicable, but there is no guarantee for an error-free application. CM SAF climate data records are provided for free via (<https://wui.cmsaf.eu/safira>). Detailed information and test data are provided on the CM SAF webpage (<http://www.cmsaf.eu/R_toolbox>).

Maintained by Steffen Kothe. Last updated 6 months ago.

9.8 match 2 stars 5.03 score 4 scripts 2 dependents

hannahcomiskey

mcmsupply:Estimating Public and Private Sector Contraceptive Market Supply Shares

Family Planning programs and initiatives typically use nationally representative surveys to estimate key indicators of a country’s family planning progress. However, in recent years, routinely collected family planning services data (Service Statistics) have been used as a supplementary data source to bridge gaps in the surveys. The use of service statistics comes with the caveat that adjustments need to be made for missing private sector contributions to the contraceptive method supply chain. Evaluating the supply source of modern contraceptives often relies on Demographic Health Surveys (DHS), where many countries do not have recent data beyond 2015/16. Fortunately, in the absence of recent surveys we can rely on statistical model-based estimates and projections to fill the knowledge gap. We present a Bayesian, hierarchical, penalized-spline model with multivariate-normal spline coefficients, to account for across method correlations, to produce country-specific,annual estimates for the proportion of modern contraceptive methods coming from the public and private sectors. This package provides a quick and convenient way for users to access the DHS modern contraceptive supply share data at national and subnational administration levels, estimate, evaluate and plot annual estimates with uncertainty for a sample of low- and middle-income countries. Methods for the estimation of method supply shares at the national level are described in Comiskey, Alkema, Cahill (2022) <arXiv:2212.03844>.

Maintained by Hannah Comiskey. Last updated 12 months ago.

jags cpp

9.5 match 2 stars 5.15 score 20 scripts

bioc

lumi:BeadArray Specific Methods for Illumina Methylation and Expression Microarrays

The lumi package provides an integrated solution for the Illumina microarray data analysis. It includes functions of Illumina BeadStudio (GenomeStudio) data input, quality control, BeadArray-specific variance stabilization, normalization and gene annotation at the probe level. It also includes the functions of processing Illumina methylation microarrays, especially Illumina Infinium methylation microarrays.

Maintained by Lei Huang. Last updated 5 months ago.

microarray onechannel preprocessing dnamethylation qualitycontrol twochannel

7.8 match 6.27 score 294 scripts 5 dependents

gergness

srvyr:'dplyr'-Like Syntax for Summary Statistics of Survey Data

Use piping, verbs like 'group_by' and 'summarize', and other 'dplyr' inspired syntactic style when calculating summary statistics on survey data using functions from the 'survey' package.

Maintained by Greg Freedman Ellis. Last updated 1 months ago.

survey

3.5 match 215 stars 13.88 score 1.8k scripts 15 dependents

cran

PracTools:Designing and Weighting Survey Samples

Functions and datasets to support Valliant, Dever, and Kreuter (2018), <doi:10.1007/978-3-319-93632-1>, "Practical Tools for Designing and Weighting Survey Samples". Contains functions for sample size calculation for survey samples using stratified or clustered one-, two-, and three-stage sample designs, and single-stage audit sample designs. Functions are included that will group geographic units accounting for distances apart and measures of size. Other functions compute variance components for multistage designs and sample sizes in two-phase designs. A number of example data sets are included.

Maintained by Richard Valliant. Last updated 9 months ago.

10.9 match 1 stars 4.48 score 101 scripts 1 dependents

simsem

semTools:Useful Tools for Structural Equation Modeling

Provides miscellaneous tools for structural equation modeling, many of which extend the 'lavaan' package. For example, latent interactions can be estimated using product indicators (Lin et al., 2010, <doi:10.1080/10705511.2010.488999>) and simple effects probed; analytical power analyses can be conducted (Jak et al., 2021, <doi:10.3758/s13428-020-01479-0>); and scale reliability can be estimated based on estimated factor-model parameters.

Maintained by Terrence D. Jorgensen. Last updated 3 days ago.

3.5 match 79 stars 13.74 score 1.1k scripts 31 dependents

ntguardian

CPAT:Change Point Analysis Tests

Implements several statistical tests for structural change, specifically the tests featured in Horváth, Rice and Miller (in press): CUSUM (with weighted/trimmed variants), Darling-Erdös, Hidalgo-Seo, Andrews, and the new Rényi-type test.

Maintained by Curtis Miller. Last updated 6 years ago.

change point statistics tests cpp

9.0 match 11 stars 5.37 score 43 scripts

xiaolei-lab

rMVP:Memory-Efficient, Visualize-Enhanced, Parallel-Accelerated GWAS Tool

A memory-efficient, visualize-enhanced, parallel-accelerated Genome-Wide Association Study (GWAS) tool. It can (1) effectively process large data, (2) rapidly evaluate population structure, (3) efficiently estimate variance components several algorithms, (4) implement parallel-accelerated association tests of markers three methods, (5) globally efficient design on GWAS process computing, (6) enhance visualization of related information. 'rMVP' contains three models GLM (Alkes Price (2006) <DOI:10.1038/ng1847>), MLM (Jianming Yu (2006) <DOI:10.1038/ng1702>) and FarmCPU (Xiaolei Liu (2016) <doi:10.1371/journal.pgen.1005767>); variance components estimation methods EMMAX (Hyunmin Kang (2008) <DOI:10.1534/genetics.107.080101>;), FaSTLMM (method: Christoph Lippert (2011) <DOI:10.1038/nmeth.1681>, R implementation from 'GAPIT2': You Tang and Xiaolei Liu (2016) <DOI:10.1371/journal.pone.0107684> and 'SUPER': Qishan Wang and Feng Tian (2014) <DOI:10.1371/journal.pone.0107684>), and HE regression (Xiang Zhou (2017) <DOI:10.1214/17-AOAS1052>).

Maintained by Xiaolei Liu. Last updated 2 months ago.

openblas cpp openmp

6.0 match 287 stars 8.06 score 38 scripts

stan-dev

posterior:Tools for Working with Posterior Distributions

Provides useful tools for both users and developers of packages for fitting Bayesian models or working with output from Bayesian models. The primary goals of the package are to: (a) Efficiently convert between many different useful formats of draws (samples) from posterior or prior distributions. (b) Provide consistent methods for operations commonly performed on draws, for example, subsetting, binding, or mutating draws. (c) Provide various summaries of draws in convenient formats. (d) Provide lightweight implementations of state of the art posterior inference diagnostics. References: Vehtari et al. (2021) <doi:10.1214/20-BA1221>.

Maintained by Paul-Christian Bürkner. Last updated 10 days ago.

bayes bayesian mcmc

3.0 match 168 stars 16.13 score 3.3k scripts 342 dependents

thijsjanzen

treestats:Phylogenetic Tree Statistics

Collection of phylogenetic tree statistics, collected throughout the literature. All functions have been written to maximize computation speed. The package includes umbrella functions to calculate all statistics, all balance associated statistics, or all branching time related statistics. Furthermore, the 'treestats' package supports summary statistic calculations on Ltables, provides speed-improved coding of branching times, Ltable conversion and includes algorithms to create intermediately balanced trees. Full description can be found in Janzen (2024) <doi:10.1016/j.ympev.2024.108168>.

Maintained by Thijs Janzen. Last updated 6 months ago.

cpp

8.9 match 16 stars 5.43 score 16 scripts 1 dependents

davidfirth

qvcalc:Quasi Variances for Factor Effects in Statistical Models

Functions to compute quasi variances and associated measures of approximation error.

Maintained by David Firth. Last updated 2 months ago.

7.2 match 6.63 score 40 scripts 25 dependents

bioc

sRACIPE:Systems biology tool to simulate gene regulatory circuits

sRACIPE implements a randomization-based method for gene circuit modeling. It allows us to study the effect of both the gene expression noise and the parametric variation on any gene regulatory circuit (GRC) using only its topology, and simulates an ensemble of models with random kinetic parameters at multiple noise levels. Statistical analysis of the generated gene expressions reveals the basin of attraction and stability of various phenotypic states and their changes associated with intrinsic and extrinsic noises. sRACIPE provides a holistic picture to evaluate the effects of both the stochastic nature of cellular processes and the parametric variation.

Maintained by Mingyang Lu. Last updated 20 days ago.

researchfield systemsbiology mathematicalbiology geneexpression generegulation genetarget cpp

7.5 match 4 stars 6.40 score 209 scripts

jedazard

MVR:Mean-Variance Regularization

Implements a non-parametric method for joint adaptive mean-variance regularization and variance stabilization of high-dimensional data. It is suited for handling difficult problems posed by high-dimensional multivariate datasets (p >> n paradigm). Among those are that the variance is often a function of the mean, variable-specific estimators of variances are not reliable, and tests statistics have low powers due to a lack of degrees of freedom. Key features include: (i) Normalization and/or variance stabilization of the data, (ii) Computation of mean-variance-regularized t-statistics (F-statistics to follow), (iii) Generation of diverse diagnostic plots, (iv) Computationally efficient implementation using C/C++ interfacing and an option for parallel computing to enjoy a faster and easier experience in the R environment.

Maintained by Jean-Eudes Dazard. Last updated 3 years ago.

cpp

12.6 match 1 stars 3.78 score 12 scripts

khliland

pls:Partial Least Squares and Principal Component Regression

Multivariate regression methods Partial Least Squares Regression (PLSR), Principal Component Regression (PCR) and Canonical Powered Partial Least Squares (CPPLS).

Maintained by Kristian Hovde Liland. Last updated 2 months ago.

3.5 match 36 stars 13.50 score 3.2k scripts 85 dependents

nandy006

smallarea:Fits a Fay Herriot Model

Inference techniques for Fay Herriot Model.

Maintained by Abhishek Nandy. Last updated 8 years ago.

11.3 match 1 stars 4.18 score 9 scripts 1 dependents

cran

compositions:Compositional Data Analysis

Provides functions for the consistent analysis of compositional data (e.g. portions of substances) and positive numbers (e.g. concentrations) in the way proposed by J. Aitchison and V. Pawlowsky-Glahn.

Maintained by K. Gerald van den Boogaart. Last updated 1 years ago.

openblas

7.4 match 1 stars 6.35 score 36 dependents

tidymodels

infer:Tidy Statistical Inference

The objective of this package is to perform inference using an expressive statistical grammar that coheres with the tidy design framework.

Maintained by Simon Couch. Last updated 6 months ago.

3.0 match 734 stars 15.69 score 3.5k scripts 17 dependents

r-computing-lab

BGmisc:An R Package for Extended Behavior Genetics Analysis

Provides functions for behavior genetics analysis, including variance component model identification [Hunter et al. (2021) <doi:10.1007/s10519-021-10055-x>], calculation of relatedness coefficients using path-tracing methods [Wright (1922) <doi:10.1086/279872>; McArdle & McDonald (1984) <doi:10.1111/j.2044-8317.1984.tb00802.x>], inference of relatedness, pedigree conversion, and simulation of multi-generational family data [Lyu et al. (2024) <doi:10.1101/2024.12.19.629449>]. For a full overview, see Garrison et al. (2024) <doi:10.21105/joss.06203>.

Maintained by S. Mason Garrison. Last updated 24 days ago.

behavior-genetics

6.9 match 1 stars 6.83 score 35 scripts

apariciojohan

agriutilities:Utilities for Data Analysis in Agriculture

Utilities designed to make the analysis of field trials easier and more accessible for everyone working in plant breeding. It provides a simple and intuitive interface for conducting single and multi-environmental trial analysis, with minimal coding required. Whether you're a beginner or an experienced user, 'agriutilities' will help you quickly and easily carry out complex analyses with confidence. With built-in functions for fitting Linear Mixed Models, 'agriutilities' is the ideal choice for anyone who wants to save time and focus on interpreting their results. Some of the functions require the R package 'asreml' for the 'ASReml' software, this can be obtained upon purchase from 'VSN' international <https://vsni.co.uk/software/asreml-r/>.

Maintained by Johan Aparicio. Last updated 2 months ago.

lmm met-analysis plant-breeding

6.3 match 18 stars 7.46 score 88 scripts 1 dependents

easystats

see:Model Visualisation Toolbox for 'easystats' and 'ggplot2'

Provides plotting utilities supporting packages in the 'easystats' ecosystem (<https://github.com/easystats/easystats>) and some extra themes, geoms, and scales for 'ggplot2'. Color scales are based on <https://materialui.co/>. References: Lüdecke et al. (2021) <doi:10.21105/joss.03393>.

Maintained by Indrajeet Patil. Last updated 5 days ago.

data-visualization easystats ggplot2 hacktoberfest plotting see statistics visualisation visualization

3.5 match 902 stars 13.22 score 2.0k scripts 3 dependents

johnjsl7

daewr:Design and Analysis of Experiments with R

Contains Data frames and functions used in the book "Design and Analysis of Experiments with R", Lawson(2015) ISBN-13:978-1-4398-6813-3.

Maintained by John Lawson. Last updated 2 years ago.

12.1 match 3 stars 3.83 score 217 scripts 3 dependents

cran

GUniFrac:Generalized UniFrac Distances, Distance-Based Multivariate Methods and Feature-Based Univariate Methods for Microbiome Data Analysis

A suite of methods for powerful and robust microbiome data analysis including data normalization, data simulation, community-level association testing and differential abundance analysis. It implements generalized UniFrac distances, Geometric Mean of Pairwise Ratios (GMPR) normalization, semiparametric data simulator, distance-based statistical methods, and feature-based statistical methods. The distance-based statistical methods include three extensions of PERMANOVA: (1) PERMANOVA using the Freedman-Lane permutation scheme, (2) PERMANOVA omnibus test using multiple matrices, and (3) analytical approach to approximating PERMANOVA p-value. Feature-based statistical methods include linear model-based methods for differential abundance analysis of zero-inflated high-dimensional compositional data.

Maintained by Jun Chen. Last updated 2 years ago.

cpp

7.8 match 5.96 score 277 scripts 7 dependents

koalaverse

vip:Variable Importance Plots

A general framework for constructing variable importance plots from various types of machine learning models in R. Aside from some standard model- specific variable importance measures, this package also provides model- agnostic approaches that can be applied to any supervised learning algorithm. These include 1) an efficient permutation-based variable importance measure, 2) variable importance based on Shapley values (Strumbelj and Kononenko, 2014) <doi:10.1007/s10115-013-0679-x>, and 3) the variance-based approach described in Greenwell et al. (2018) <arXiv:1805.04755>. A variance-based method for quantifying the relative strength of interaction effects is also included (see the previous reference for details).

Maintained by Brandon M. Greenwell. Last updated 2 years ago.

interaction-effect machine-learning partial-dependence-plot supervised-learning-algorithms variable-importance variable-importance-plots

4.0 match 187 stars 11.61 score 3.5k scripts 6 dependents

jelsema

CLME:Constrained Inference for Linear Mixed Effects Models

Estimation and inference for linear models where some or all of the fixed-effects coefficients are subject to order restrictions. This package uses the robust residual bootstrap methodology for inference, and can handle some structure in the residual variance matrix.

Maintained by Casey M. Jelsema. Last updated 5 years ago.

10.8 match 2 stars 4.28 score 38 scripts

ycphs

openxlsx:Read, Write and Edit xlsx Files

Simplifies the creation of Excel .xlsx files by providing a high level interface to writing, styling and editing worksheets. Through the use of 'Rcpp', read/write times are comparable to the 'xlsx' and 'XLConnect' packages with the added benefit of removing the dependency on Java.

Maintained by Jan Marvin Garbuszus. Last updated 2 months ago.

xlsx cpp

2.4 match 232 stars 18.98 score 20k scripts 270 dependents

greenwoodlab

pcev:Principal Component of Explained Variance

Principal component of explained variance (PCEV) is a statistical tool for the analysis of a multivariate response vector. It is a dimension- reduction technique, similar to Principal component analysis (PCA), that seeks to maximize the proportion of variance (in the response vector) being explained by a set of covariates.

Maintained by Maxime Turgeon. Last updated 6 years ago.

10.6 match 4 stars 4.30 score 7 scripts

bioc

edgeR:Empirical Analysis of Digital Gene Expression Data in R

Differential expression analysis of sequence count data. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models, quasi-likelihood, and gene set enrichment. Can perform differential analyses of any type of omics data that produces read counts, including RNA-seq, ChIP-seq, ATAC-seq, Bisulfite-seq, SAGE, CAGE, metabolomics, or proteomics spectral counts. RNA-seq analyses can be conducted at the gene or isoform level, and tests can be conducted for differential exon or transcript usage.

Maintained by Yunshun Chen. Last updated 6 days ago.

alternativesplicing batcheffect bayesian biomedicalinformatics cellbiology chipseq clustering coverage differentialexpression differentialmethylation differentialsplicing dnamethylation epigenetics functionalgenomics geneexpression genesetenrichment genetics immunooncology multiplecomparison normalization pathways proteomics qualitycontrol regression rnaseq sage sequencing singlecell systemsbiology timecourse transcription transcriptomics openblas

3.4 match 13.40 score 17k scripts 255 dependents

tguillerme

dispRity:Measuring Disparity

A modular package for measuring disparity (multidimensional space occupancy). Disparity can be calculated from any matrix defining a multidimensional space. The package provides a set of implemented metrics to measure properties of the space and allows users to provide and test their own metrics. The package also provides functions for looking at disparity in a serial way (e.g. disparity through time) or per groups as well as visualising the results. Finally, this package provides several statistical tests for disparity analysis.

Maintained by Thomas Guillerme. Last updated 2 days ago.

disparity ecology multidimensionality palaeobiology

5.3 match 26 stars 8.69 score 220 scripts 1 dependents

robinhankin

calibrator:Bayesian Calibration of Complex Computer Codes

Performs Bayesian calibration of computer models as per Kennedy and O'Hagan 2001. The package includes routines to find the hyperparameters and parameters; see the help page for stage1() for a worked example using the toy dataset. A tutorial is provided in the calex.Rnw vignette; and a suite of especially simple one dimensional examples appears in inst/doc/one.dim/.

Maintained by Robin K. S. Hankin. Last updated 4 years ago.

9.9 match 1 stars 4.60 score 44 scripts 3 dependents

kbroman

qtl:Tools for Analyzing QTL Experiments

Analysis of experimental crosses to identify genes (called quantitative trait loci, QTLs) contributing to variation in quantitative traits. Broman et al. (2003) <doi:10.1093/bioinformatics/btg112>.

Maintained by Karl W Broman. Last updated 7 months ago.

openblas

3.5 match 80 stars 12.79 score 2.4k scripts 29 dependents

bioc

corral:Correspondence Analysis for Single Cell Data

Correspondence analysis (CA) is a matrix factorization method, and is similar to principal components analysis (PCA). Whereas PCA is designed for application to continuous, approximately normally distributed data, CA is appropriate for non-negative, count-based data that are in the same additive scale. The corral package implements CA for dimensionality reduction of a single matrix of single-cell data, as well as a multi-table adaptation of CA that leverages data-optimized scaling to align data generated from different sequencing platforms by projecting into a shared latent space. corral utilizes sparse matrices and a fast implementation of SVD, and can be called directly on Bioconductor objects (e.g., SingleCellExperiment) for easy pipeline integration. The package also includes additional options, including variations of CA to address overdispersion in count data (e.g., Freeman-Tukey chi-squared residual), as well as the option to apply CA-style processing to continuous data (e.g., proteomic TOF intensities) with the Hellinger distance adaptation of CA.

Maintained by Lauren Hsu. Last updated 5 months ago.

batcheffect dimensionreduction geneexpression preprocessing principalcomponent sequencing singlecell software visualization

9.8 match 4.64 score 22 scripts

cran

mgcv:Mixed GAM Computation Vehicle with Automatic Smoothness Estimation

Generalized additive (mixed) models, some of their extensions and other generalized ridge regression with multiple smoothing parameter estimation by (Restricted) Marginal Likelihood, Generalized Cross Validation and similar, or using iterated nested Laplace approximation for fully Bayesian inference. See Wood (2017) <doi:10.1201/9781315370279> for an overview. Includes a gam() function, a wide variety of smoothers, 'JAGS' support and distributions beyond the exponential family.

Maintained by Simon Wood. Last updated 1 years ago.

openblas openmp

3.5 match 32 stars 12.71 score 17k scripts 7.8k dependents