R-universe search: univariate

alexiosg

rugarch:Univariate GARCH Models

ARFIMA, in-mean, external regressors and various GARCH flavors, with methods for fit, forecast, simulation, inference and plotting.

Maintained by Alexios Galanos. Last updated 3 months ago.

cpp

49.8 match 26 stars 12.13 score 1.3k scripts 15 dependents

astamm

roahd:Robust Analysis of High Dimensional Data

A collection of methods for the robust analysis of univariate and multivariate functional data, possibly in high-dimensional cases, and hence with attention to computational efficiency and simplicity of use. See the R Journal publication of Ieva et al. (2019) <doi:10.32614/RJ-2019-032> for an in-depth presentation of the 'roahd' package. See Aleman-Gomez et al. (2021) <arXiv:2103.08874> for details about the concept of depthgram.

Maintained by Aymeric Stamm. Last updated 3 years ago.

46.2 match 2 stars 6.29 score 164 scripts 2 dependents

joemsong

Ckmeans.1d.dp:Optimal, Fast, and Reproducible Univariate Clustering

Fast, optimal, and reproducible weighted univariate clustering by dynamic programming. Four problems are solved, including univariate k-means (Wang & Song 2011) <doi:10.32614/RJ-2011-015> (Song & Zhong 2020) <doi:10.1093/bioinformatics/btaa613>, k-median, k-segments, and multi-channel weighted k-means. Dynamic programming is used to minimize the sum of (weighted) within-cluster distances using respective metrics. Its advantage over heuristic clustering in efficiency and accuracy is pronounced when there are many clusters. Multi-channel weighted k-means groups multiple univariate signals into k clusters. An auxiliary function generates histograms adaptive to patterns in data. This package provides a powerful set of tools for univariate data analysis with guaranteed optimality, efficiency, and reproducibility, useful for peak calling on temporal, spatial, and spectral data.

Maintained by Joe Song. Last updated 2 years ago.

cpp

25.1 match 19 stars 8.62 score 339 scripts 19 dependents

lbbe-software

fitdistrplus:Help to Fit of a Parametric Distribution to Non-Censored or Censored Data

Extends the fitdistr() function (of the MASS package) with several functions to help the fit of a parametric distribution to non-censored or censored data. Censored data may contain left censored, right censored and interval censored values, with several lower and upper bounds. In addition to maximum likelihood estimation (MLE), the package provides moment matching (MME), quantile matching (QME), maximum goodness-of-fit estimation (MGE) and maximum spacing estimation (MSE) methods (available only for non-censored data). Weighted versions of MLE, MME, QME and MSE are available. See e.g. Casella & Berger (2002), Statistical inference, Pacific Grove, for a general introduction to parametric estimation.

Maintained by Aurélie Siberchicot. Last updated 12 days ago.

12.5 match 54 stars 16.15 score 4.5k scripts 153 dependents

laplacesdemonr

LaplacesDemon:Complete Environment for Bayesian Inference

Provides a complete environment for Bayesian inference using a variety of different samplers (see ?LaplacesDemon for an overview).

Maintained by Henrik Singmann. Last updated 12 months ago.

14.7 match 93 stars 13.45 score 1.8k scripts 60 dependents

rfastofficial

Rfast:A Collection of Efficient and Extremely Fast R Functions

A collection of fast (utility) functions for data analysis. Column and row wise means, medians, variances, minimums, maximums, many t, F and G-square tests, many regressions (normal, logistic, Poisson), are some of the many fast functions. References: a) Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>. b) Tsagris M. and Papadakis M. (2018). Forward regression in R: from the extreme slow to the extreme fast. Journal of Data Science, 16(4): 771--780. <doi:10.6339/JDS.201810_16(4).00006>. c) Chatzipantsiou C., Dimitriadis M., Papadakis M. and Tsagris M. (2020). Extremely Efficient Permutation and Bootstrap Hypothesis Tests Using Hypothesis Tests Using R. Journal of Modern Applied Statistical Methods, 18(2), eP2898. <doi:10.48550/arXiv.1806.10947>. d) Tsagris M., Papadakis M., Alenazi A. and Alzeley O. (2024). Computationally Efficient Outlier Detection for High-Dimensional Data Using the MDP Algorithm. Computation, 12(9): 185. <doi:10.3390/computation12090185>. e) Tsagris M. and Papadakis M. (2025). Fast and light-weight energy statistics using the R package Rfast. <doi:10.48550/arXiv.2501.02849>.

Maintained by Manos Papadakis. Last updated 17 days ago.

openblas cpp openmp

12.4 match 147 stars 12.54 score 1.2k scripts 166 dependents

jeffreyracine

np:Nonparametric Kernel Smoothing Methods for Mixed Data Types

Nonparametric (and semiparametric) kernel methods that seamlessly handle a mix of continuous, unordered, and ordered factor data types. We would like to gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada (NSERC, <https://www.nserc-crsng.gc.ca/>), the Social Sciences and Humanities Research Council of Canada (SSHRC, <https://www.sshrc-crsh.gc.ca/>), and the Shared Hierarchical Academic Research Computing Network (SHARCNET, <https://sharcnet.ca/>). We would also like to acknowledge the contributions of the GNU GSL authors. In particular, we adapt the GNU GSL B-spline routine gsl_bspline.c adding automated support for quantile knots (in addition to uniform knots), providing missing functionality for derivatives, and for extending the splines beyond their endpoints.

Maintained by Jeffrey S. Racine. Last updated 1 months ago.

12.0 match 49 stars 12.64 score 672 scripts 44 dependents

haotianxu

changepoints:A Collection of Change-Point Detection Methods

Performs a series of offline and/or online change-point detection algorithms for 1) univariate mean: <doi:10.1214/20-EJS1710>, <arXiv:2006.03283>; 2) univariate polynomials: <doi:10.1214/21-EJS1963>; 3) univariate and multivariate nonparametric settings: <doi:10.1214/21-EJS1809>, <doi:10.1109/TIT.2021.3130330>; 4) high-dimensional covariances: <doi:10.3150/20-BEJ1249>; 5) high-dimensional networks with and without missing values: <doi:10.1214/20-AOS1953>, <arXiv:2101.05477>, <arXiv:2110.06450>; 6) high-dimensional linear regression models: <arXiv:2010.10410>, <arXiv:2207.12453>; 7) high-dimensional vector autoregressive models: <arXiv:1909.06359>; 8) high-dimensional self exciting point processes: <arXiv:2006.03572>; 9) dependent dynamic nonparametric random dot product graphs: <arXiv:1911.07494>; 10) univariate mean against adversarial attacks: <arXiv:2105.10417>.

Maintained by Haotian Xu. Last updated 1 years ago.

openblas cpp

26.1 match 12 stars 5.78 score 25 scripts

ewenharrison

finalfit:Quickly Create Elegant Regression Results Tables and Plots when Modelling

Generate regression results tables and plots in final format for publication. Explore models and export directly to PDF and 'Word' using 'RMarkdown'.

Maintained by Ewen Harrison. Last updated 7 months ago.

12.1 match 270 stars 11.43 score 1.0k scripts

r-spatial

classInt:Choose Univariate Class Intervals

Selected commonly used methods for choosing univariate class intervals for mapping or other graphics purposes.

Maintained by Roger Bivand. Last updated 3 months ago.

fortran

7.4 match 34 stars 16.02 score 3.2k scripts 1.2k dependents

dsy109

mixtools:Tools for Analyzing Finite Mixture Models

Analyzes finite mixture models for various parametric and semiparametric settings. This includes mixtures of parametric distributions (normal, multivariate normal, multinomial, gamma), various Reliability Mixture Models (RMMs), mixtures-of-regressions settings (linear regression, logistic regression, Poisson regression, linear regression with changepoints, predictor-dependent mixing proportions, random effects regressions, hierarchical mixtures-of-experts), and tools for selecting the number of components (bootstrapping the likelihood ratio test statistic, mixturegrams, and model selection criteria). Bayesian estimation of mixtures-of-linear-regressions models is available as well as a novel data depth method for obtaining credible bands. This package is based upon work supported by the National Science Foundation under Grant No. SES-0518772 and the Chan Zuckerberg Initiative: Essential Open Source Software for Science (Grant No. 2020-255193).

Maintained by Derek Young. Last updated 9 months ago.

mixture-models mixture-of-experts semiparametric-regression

10.2 match 20 stars 11.34 score 1.4k scripts 56 dependents

tyee001

VGAM:Vector Generalized Linear and Additive Models

An implementation of about 6 major classes of statistical regression models. The central algorithm is Fisher scoring and iterative reweighted least squares. At the heart of this package are the vector generalized linear and additive model (VGLM/VGAM) classes. VGLMs can be loosely thought of as multivariate GLMs. VGAMs are data-driven VGLMs that use smoothing. The book "Vector Generalized Linear and Additive Models: With an Implementation in R" (Yee, 2015) <DOI:10.1007/978-1-4939-2818-7> gives details of the statistical framework and the package. Currently only fixed-effects models are implemented. Many (100+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE. The other classes are RR-VGLMs (reduced-rank VGLMs), quadratic RR-VGLMs, doubly constrained RR-VGLMs, quadratic RR-VGLMs, reduced-rank VGAMs, RCIMs (row-column interaction models)---these classes perform constrained and unconstrained quadratic ordination (CQO/UQO) models in ecology, as well as constrained additive ordination (CAO). Hauck-Donner effect detection is implemented. Note that these functions are subject to change; see the NEWS and ChangeLog files for latest changes.

Maintained by Thomas Yee. Last updated 1 months ago.

fortran

10.8 match 10 stars 10.67 score 3.6k scripts 169 dependents

milanwiedemann

lcsm:Univariate and Bivariate Latent Change Score Modelling

Helper functions to implement univariate and bivariate latent change score models in R using the 'lavaan' package. For details about Latent Change Score Modeling (LCSM) see McArdle (2009) <doi:10.1146/annurev.psych.60.110707.163612> and Grimm, An, McArdle, Zonderman and Resnick (2012) <doi:10.1080/10705511.2012.659627>. The package automatically generates 'lavaan' syntax for different model specifications and varying timepoints. The 'lavaan' syntax generated by this package can be returned and further specifications can be added manually. Longitudinal plots as well as simplified path diagrams can be created to visualise data and model specifications. Estimated model parameters and fit statistics can be extracted as data frames. Data for different univariate and bivariate LCSM can be simulated by specifying estimates for model parameters to explore their effects. This package combines the strengths of other R packages like 'lavaan', 'broom', and 'semPlot' by generating 'lavaan' syntax that helps these packages work together.

Maintained by Milan Wiedemann. Last updated 2 years ago.

17.8 match 17 stars 6.34 score 43 scripts

florianstijven

Surrogate:Evaluation of Surrogate Endpoints in Clinical Trials

In a clinical trial, it frequently occurs that the most credible outcome to evaluate the effectiveness of a new therapy (the true endpoint) is difficult to measure. In such a situation, it can be an effective strategy to replace the true endpoint by a (bio)marker that is easier to measure and that allows for a prediction of the treatment effect on the true endpoint (a surrogate endpoint). The package 'Surrogate' allows for an evaluation of the appropriateness of a candidate surrogate endpoint based on the meta-analytic, information-theoretic, and causal-inference frameworks. Part of this software has been developed using funding provided from the European Union's Seventh Framework Programme for research, technological development and demonstration (Grant Agreement no 602552), the Special Research Fund (BOF) of Hasselt University (BOF-number: BOF2OCPO3), GlaxoSmithKline Biologicals, Baekeland Mandaat (HBC.2022.0145), and Johnson & Johnson Innovative Medicine.

Maintained by Wim Van Der Elst. Last updated 17 days ago.

17.5 match 1 stars 6.42 score 133 scripts

mikejareds

hermiter:Efficient Sequential and Batch Estimation of Univariate and Bivariate Probability Density Functions and Cumulative Distribution Functions along with Quantiles (Univariate) and Nonparametric Correlation (Bivariate)

Facilitates estimation of full univariate and bivariate probability density functions and cumulative distribution functions along with full quantile functions (univariate) and nonparametric correlation (bivariate) using Hermite series based estimators. These estimators are particularly useful in the sequential setting (both stationary and non-stationary) and one-pass batch estimation setting for large data sets. Based on: Stephanou, Michael, Varughese, Melvin and Macdonald, Iain. "Sequential quantiles via Hermite series density estimation." Electronic Journal of Statistics 11.1 (2017): 570-607 <doi:10.1214/17-EJS1245>, Stephanou, Michael and Varughese, Melvin. "On the properties of Hermite series based distribution function estimators." Metrika (2020) <doi:10.1007/s00184-020-00785-z> and Stephanou, Michael and Varughese, Melvin. "Sequential estimation of Spearman rank correlation using Hermite series estimators." Journal of Multivariate Analysis (2021) <doi:10.1016/j.jmva.2021.104783>.

Maintained by Michael Stephanou. Last updated 7 months ago.

cumulative-distribution-function kendall-correlation-coefficient online-algorithms probability-density-function quantile spearman-correlation-coefficient statistics streaming-algorithms streaming-data cpp

19.2 match 15 stars 5.58 score 17 scripts

tagteam

Publish:Format Output of Various Routines in a Suitable Way for Reports and Publication

A bunch of convenience functions that transform the results of some basic statistical analyses into table format nearly ready for publication. This includes descriptive tables, tables of logistic regression and Cox regression results as well as forest plots.

Maintained by Thomas A. Gerds. Last updated 10 days ago.

10.0 match 15 stars 10.11 score 274 scripts 36 dependents

mikewlcheung

metaSEM:Meta-Analysis using Structural Equation Modeling

A collection of functions for conducting meta-analysis using a structural equation modeling (SEM) approach via the 'OpenMx' and 'lavaan' packages. It also implements various procedures to perform meta-analytic structural equation modeling on the correlation and covariance matrices, see Cheung (2015) <doi:10.3389/fpsyg.2014.01521>.

Maintained by Mike Cheung. Last updated 9 days ago.

meta-analysis meta-analytic-sem missing-data multilevel-models multivariate-analysis structural-equation-modeling structural-equation-models

10.2 match 30 stars 9.43 score 208 scripts 1 dependents

biodiverse

spAbundance:Univariate and Multivariate Spatial Modeling of Species Abundance

Fits single-species (univariate) and multi-species (multivariate) non-spatial and spatial abundance models in a Bayesian framework using Markov Chain Monte Carlo (MCMC). Spatial models are fit using Nearest Neighbor Gaussian Processes (NNGPs). Details on NNGP models are given in Datta, Banerjee, Finley, and Gelfand (2016) <doi:10.1080/01621459.2015.1044091> and Finley, Datta, and Banerjee (2022) <doi:10.18637/jss.v103.i05>. Fits single-species and multi-species spatial and non-spatial versions of generalized linear mixed models (Gaussian, Poisson, Negative Binomial), N-mixture models (Royle 2004 <doi:10.1111/j.0006-341X.2004.00142.x>) and hierarchical distance sampling models (Royle, Dawson, Bates (2004) <doi:10.1890/03-3127>). Multi-species spatial models are fit using a spatial factor modeling approach with NNGPs for computational efficiency.

Maintained by Jeffrey Doser. Last updated 17 days ago.

openblas cpp openmp

15.6 match 17 stars 6.15 score 43 scripts 1 dependents

bioc

structToolbox:Data processing & analysis tools for Metabolomics and other omics

An extensive set of data (pre-)processing and analysis methods and tools for metabolomics and other omics, with a strong emphasis on statistics and machine learning. This toolbox allows the user to build extensive and standardised workflows for data analysis. The methods and tools have been implemented using class-based templates provided by the struct (Statistics in R Using Class-based Templates) package. The toolbox includes pre-processing methods (e.g. signal drift and batch correction, normalisation, missing value imputation and scaling), univariate (e.g. ttest, various forms of ANOVA, Kruskal–Wallis test and more) and multivariate statistical methods (e.g. PCA and PLS, including cross-validation and permutation testing) as well as machine learning methods (e.g. Support Vector Machines). The STATistics Ontology (STATO) has been integrated and implemented to provide standardised definitions for the different methods, inputs and outputs.

Maintained by Gavin Rhys Lloyd. Last updated 25 days ago.

workflowstep metabolomics bioconductor-package dims lc-ms machine-learning multivariate-analysis statistics univariate

14.9 match 10 stars 6.26 score 12 scripts

luca-scr

mclust:Gaussian Mixture Modelling for Model-Based Clustering, Classification, and Density Estimation

Gaussian finite mixture models fitted via EM algorithm for model-based clustering, classification, and density estimation, including Bayesian regularization, dimension reduction for visualisation, and resampling-based inference.

Maintained by Luca Scrucca. Last updated 11 months ago.

fortran openblas

7.5 match 21 stars 12.23 score 6.6k scripts 587 dependents

egenn

rtemis:Machine Learning and Visualization

Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.

Maintained by E.D. Gennatas. Last updated 1 months ago.

data-science data-visualization machine-learning machine-learning-library visualization

11.3 match 145 stars 7.09 score 50 scripts 2 dependents

choonghyunryu

dlookr:Tools for Data Diagnosis, Exploration, Transformation

A collection of tools that support data diagnosis, exploration, and transformation. Data diagnostics provides information and visualization of missing values, outliers, and unique and negative values to help you understand the distribution and quality of your data. Data exploration provides information and visualization of the descriptive statistics of univariate variables, normality tests and outliers, correlation of two variables, and the relationship between the target variable and predictor. Data transformation supports binning for categorizing continuous variables, imputes missing values and outliers, and resolves skewness. And it creates automated reports that support these three tasks.

Maintained by Choonghyun Ryu. Last updated 9 months ago.

7.1 match 212 stars 11.05 score 748 scripts 2 dependents

billvenables

polynom:A Collection of Functions to Implement a Class for Univariate Polynomial Manipulations

A collection of functions to implement a class for univariate polynomial manipulations.

Maintained by Bill Venables. Last updated 3 years ago.

8.0 match 1 stars 9.50 score 438 scripts 614 dependents

cardiomoon

autoReg:Automatic Linear and Logistic Regression and Survival Analysis

Make summary tables for descriptive statistics and select explanatory variables automatically in various regression models. Support linear models, generalized linear models and cox-proportional hazard models. Generate publication-ready tables summarizing result of regression analysis and plots. The tables and plots can be exported in "HTML", "pdf('LaTex')", "docx('MS Word')" and "pptx('MS Powerpoint')" documents.

Maintained by Keon-Woong Moon. Last updated 1 years ago.

10.7 match 47 stars 7.00 score 69 scripts

robjhyndman

forecast:Forecasting Functions for Time Series and Linear Models

Methods and tools for displaying and analysing univariate time series forecasts including exponential smoothing via state space models and automatic ARIMA modelling.

Maintained by Rob Hyndman. Last updated 7 months ago.

forecast forecasting openblas cpp

4.0 match 1.1k stars 18.63 score 16k scripts 239 dependents

cran

mixAK:Multivariate Normal Mixture Models and Mixtures of Generalized Linear Mixed Models Including Model Based Clustering

Contains a mixture of statistical methods including the MCMC methods to analyze normal mixtures. Additionally, model based clustering methods are implemented to perform classification based on (multivariate) longitudinal (or otherwise correlated) data. The basis for such clustering is a mixture of multivariate generalized linear mixed models. The package is primarily related to the publications Komárek (2009, Comp. Stat. and Data Anal.) <doi:10.1016/j.csda.2009.05.006> and Komárek and Komárková (2014, J. of Stat. Soft.) <doi:10.18637/jss.v059.i12>. It also implements methods published in Komárek and Komárková (2013, Ann. of Appl. Stat.) <doi:10.1214/12-AOAS580>, Hughes, Komárek, Bonnett, Czanner, García-Fiñana (2017, Stat. in Med.) <doi:10.1002/sim.7397>, Jaspers, Komárek, Aerts (2018, Biom. J.) <doi:10.1002/bimj.201600253> and Hughes, Komárek, Czanner, García-Fiñana (2018, Stat. Meth. in Med. Res) <doi:10.1177/0962280216674496>.

Maintained by Arnošt Komárek. Last updated 6 months ago.

openblas

16.0 match 4 stars 4.54 score 108 scripts 3 dependents

jamesotto852

ggdensity:Interpretable Bivariate Density Visualization with 'ggplot2'

The 'ggplot2' package provides simple functions for visualizing contours of 2-d kernel density estimates. 'ggdensity' implements several additional density estimators as well as more interpretable visualizations based on highest density regions instead of the traditional height of the estimated density surface.

Maintained by James Otto. Last updated 1 years ago.

9.0 match 231 stars 8.11 score 185 scripts 2 dependents

kisungyou

SHT:Statistical Hypothesis Testing Toolbox

We provide a collection of statistical hypothesis testing procedures ranging from classical to modern methods for non-trivial settings such as high-dimensional scenario. For the general treatment of statistical hypothesis testing, see the book by Lehmann and Romano (2005) <doi:10.1007/0-387-27605-X>.

Maintained by Kisung You. Last updated 18 days ago.

openblas cpp openmp

13.7 match 6 stars 5.13 score 50 scripts 1 dependents

rfastofficial

Rfast2:A Collection of Efficient and Extremely Fast R Functions II

A collection of fast statistical and utility functions for data analysis. Functions for regression, maximum likelihood, column-wise statistics and many more have been included. C++ has been utilized to speed up the functions. References: Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>.

Maintained by Manos Papadakis. Last updated 1 years ago.

openblas cpp openmp

8.7 match 38 stars 8.09 score 75 scripts 26 dependents

bioc

decoupleR:decoupleR: Ensemble of computational methods to infer biological activities from omics data

Many methods allow us to extract biological activities from omics data using information from prior knowledge resources, reducing the dimensionality for increased statistical power and better interpretability. Here, we present decoupleR, a Bioconductor package containing different statistical methods to extract these signatures within a unified framework. decoupleR allows the user to flexibly test any method with any resource. It incorporates methods that take into account the sign and weight of network interactions. decoupleR can be used with any omic, as long as its features can be linked to a biological process based on prior knowledge. For example, in transcriptomics gene sets regulated by a transcription factor, or in phospho-proteomics phosphosites that are targeted by a kinase.

Maintained by Pau Badia-i-Mompel. Last updated 5 months ago.

differentialexpression functionalgenomics geneexpression generegulation network software statisticalmethod transcription

6.1 match 230 stars 11.27 score 316 scripts 3 dependents

shotaochi

scorepeak:Peak Functions for Peak Detection in Univariate Time Series

Provides peak functions, which enable us to detect peaks in time series. The methods implemented in this package are based on Girish Keshav Palshikar (2009) <https://www.researchgate.net/publication/228853276_Simple_Algorithms_for_Peak_Detection_in_Time-Series>.

Maintained by Shota Ochi. Last updated 4 years ago.

peak-detection univariate cpp

18.0 match 1 stars 3.70 score 6 scripts

cran

bayesm:Bayesian Inference for Marketing/Micro-Econometrics

Covers many important models used in marketing and micro-econometrics applications. The package includes: Bayes Regression (univariate or multivariate dep var), Bayes Seemingly Unrelated Regression (SUR), Binary and Ordinal Probit, Multinomial Logit (MNL) and Multinomial Probit (MNP), Multivariate Probit, Negative Binomial (Poisson) Regression, Multivariate Mixtures of Normals (including clustering), Dirichlet Process Prior Density Estimation with normal base, Hierarchical Linear Models with normal prior and covariates, Hierarchical Linear Models with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a mixture of normals prior and covariates, Hierarchical Multinomial Logits with a Dirichlet Process prior and covariates, Hierarchical Negative Binomial Regression Models, Bayesian analysis of choice-based conjoint data, Bayesian treatment of linear instrumental variables models, Analysis of Multivariate Ordinal survey data with scale usage heterogeneity (as in Rossi et al, JASA (01)), Bayesian Analysis of Aggregate Random Coefficient Logit Models as in BLP (see Jiang, Manchanda, Rossi 2009) For further reference, consult our book, Bayesian Statistics and Marketing by Rossi, Allenby and McCulloch (Wiley first edition 2005 and second forthcoming) and Bayesian Non- and Semi-Parametric Methods and Applications (Princeton U Press 2014).

Maintained by Peter Rossi. Last updated 1 years ago.

openblas cpp

7.8 match 20 stars 8.20 score 322 scripts 43 dependents

c7rishi

BAMBI:Bivariate Angular Mixture Models

Fit (using Bayesian methods) and simulate mixtures of univariate and bivariate angular distributions. Chakraborty and Wong (2021) <doi:10.18637/jss.v099.i11>.

Maintained by Saptarshi Chakraborty. Last updated 5 months ago.

cpp

13.2 match 3 stars 4.83 score 65 scripts 1 dependents

mariarizzo

energy:E-Statistics: Multivariate Inference via the Energy of Data

E-statistics (energy) tests and statistics for multivariate and univariate inference, including distance correlation, one-sample, two-sample, and multi-sample tests for comparing multivariate distributions, are implemented. Measuring and testing multivariate independence based on distance correlation, partial distance correlation, multivariate goodness-of-fit tests, k-groups and hierarchical clustering based on energy distance, testing for multivariate normality, distance components (disco) for non-parametric analysis of structured data, and other energy statistics/methods are implemented.

Maintained by Maria Rizzo. Last updated 7 months ago.

distance-correlation energy multivariate-analysis statistics cpp

6.0 match 45 stars 10.60 score 634 scripts 45 dependents

gloewing

fastFMM:Fast Functional Mixed Models using Fast Univariate Inference

Implementation of the fast univariate inference approach (Cui et al. (2022) <doi:10.1080/10618600.2021.1950006>, Loewinger et al. (2024) <doi:10.7554/eLife.95802.2>) for fitting functional mixed models. User guides and Python package information can be found at <https://github.com/gloewing/photometry_FLMM>.

Maintained by Erjia Cui. Last updated 3 days ago.

9.7 match 8 stars 6.42 score 22 scripts

waternumbers

anomalous:Anomaly Detection using the CAPA and PELT Algorithms

Implimentations of the univariate CAPA <doi:10.1002/sam.11586> and PELT <doi:10.1080/01621459.2012.737745> algotithms along with various cost functions for different distributions and models. The modular design, using R6 classes, favour ease of extension (for example user written cost functions) over the performance of other implimentations (e.g. <doi:10.32614/CRAN.package.changepoint>, <doi:10.32614/CRAN.package.anomaly>).

Maintained by Paul Smith. Last updated 3 months ago.

cpp

13.5 match 4.61 score 18 scripts

insightsengineering

tern:Create Common TLGs Used in Clinical Trials

Table, Listings, and Graphs (TLG) library for common outputs used in clinical trials.

Maintained by Joe Zhu. Last updated 2 months ago.

clinical-trials graphs listings nest outputs tables

4.9 match 79 stars 12.62 score 186 scripts 9 dependents

bioc

metaCCA:Summary Statistics-Based Multivariate Meta-Analysis of Genome-Wide Association Studies Using Canonical Correlation Analysis

metaCCA performs multivariate analysis of a single or multiple GWAS based on univariate regression coefficients. It allows multivariate representation of both phenotype and genotype. metaCCA extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness.

Maintained by Anna Cichonska. Last updated 5 months ago.

genomewideassociation snp genetics regression statisticalmethod software

13.7 match 4.26 score 5 scripts

tpetzoldt

FAmle:Maximum Likelihood and Bayesian Estimation of Univariate Probability Distributions

Estimate parameters of univariate probability distributions with maximum likelihood and Bayesian methods.

Maintained by Thomas Petzoldt. Last updated 3 years ago.

15.2 match 3.81 score 13 scripts

tlverse

sl3:Pipelines for Machine Learning and Super Learning

A modern implementation of the Super Learner prediction algorithm, coupled with a general purpose framework for composing arbitrary pipelines for machine learning tasks.

Maintained by Jeremy Coyle. Last updated 4 months ago.

data-science ensemble-learning ensemble-model machine-learning model-selection regression stacking statistics

5.7 match 100 stars 9.94 score 748 scripts 7 dependents

finleya

spBayes:Univariate and Multivariate Spatial-Temporal Modeling

Fits univariate and multivariate spatio-temporal random effects models for point-referenced data using Markov chain Monte Carlo (MCMC). Details are given in Finley, Banerjee, and Gelfand (2015) <doi:10.18637/jss.v063.i13> and Finley and Banerjee <doi:10.1016/j.envsoft.2019.104608>.

Maintained by Andrew Finley. Last updated 6 months ago.

openblas cpp openmp

12.0 match 1 stars 4.69 score 231 scripts 7 dependents

josetamezpena

FRESA.CAD:Feature Selection Algorithms for Computer Aided Diagnosis

Contains a set of utilities for building and testing statistical models (linear, logistic,ordinal or COX) for Computer Aided Diagnosis/Prognosis applications. Utilities include data adjustment, univariate analysis, model building, model-validation, longitudinal analysis, reporting and visualization.

Maintained by Jose Gerardo Tamez-Pena. Last updated 1 months ago.

openblas cpp openmp

10.0 match 7 stars 5.59 score 31 scripts

r-forge

POT:Generalized Pareto Distribution and Peaks Over Threshold

Some functions useful to perform a Peak Over Threshold analysis in univariate and bivariate cases, see Beirlant et al. (2004) <doi:10.1002/0470012382>. A user guide is available in the vignette.

Maintained by Christophe Dutang. Last updated 5 months ago.

8.9 match 6.20 score 105 scripts 2 dependents

r-forge

surveillance:Temporal and Spatio-Temporal Modeling and Monitoring of Epidemic Phenomena

Statistical methods for the modeling and monitoring of time series of counts, proportions and categorical data, as well as for the modeling of continuous-time point processes of epidemic phenomena. The monitoring methods focus on aberration detection in count data time series from public health surveillance of communicable diseases, but applications could just as well originate from environmetrics, reliability engineering, econometrics, or social sciences. The package implements many typical outbreak detection procedures such as the (improved) Farrington algorithm, or the negative binomial GLR-CUSUM method of Hoehle and Paul (2008) <doi:10.1016/j.csda.2008.02.015>. A novel CUSUM approach combining logistic and multinomial logistic modeling is also included. The package contains several real-world data sets, the ability to simulate outbreak data, and to visualize the results of the monitoring in a temporal, spatial or spatio-temporal fashion. A recent overview of the available monitoring procedures is given by Salmon et al. (2016) <doi:10.18637/jss.v070.i10>. For the retrospective analysis of epidemic spread, the package provides three endemic-epidemic modeling frameworks with tools for visualization, likelihood inference, and simulation. hhh4() estimates models for (multivariate) count time series following Paul and Held (2011) <doi:10.1002/sim.4177> and Meyer and Held (2014) <doi:10.1214/14-AOAS743>. twinSIR() models the susceptible-infectious-recovered (SIR) event history of a fixed population, e.g, epidemics across farms or networks, as a multivariate point process as proposed by Hoehle (2009) <doi:10.1002/bimj.200900050>. twinstim() estimates self-exciting point process models for a spatio-temporal point pattern of infective events, e.g., time-stamped geo-referenced surveillance data, as proposed by Meyer et al. (2012) <doi:10.1111/j.1541-0420.2011.01684.x>. A recent overview of the implemented space-time modeling frameworks for epidemic phenomena is given by Meyer et al. (2017) <doi:10.18637/jss.v077.i11>.

Maintained by Sebastian Meyer. Last updated 2 days ago.

cpp

5.1 match 2 stars 10.68 score 446 scripts 3 dependents

psychmeta

psychmeta:Psychometric Meta-Analysis Toolkit

Tools for computing bare-bones and psychometric meta-analyses and for generating psychometric data for use in meta-analysis simulations. Supports bare-bones, individual-correction, and artifact-distribution methods for meta-analyzing correlations and d values. Includes tools for converting effect sizes, computing sporadic artifact corrections, reshaping meta-analytic databases, computing multivariate corrections for range variation, and more. Bugs can be reported to <https://github.com/psychmeta/psychmeta/issues> or <issues@psychmeta.com>.

Maintained by Jeffrey A. Dahlke. Last updated 9 months ago.

hacktoberfest meta-analysis psychology psychometric psychometrics

6.6 match 57 stars 8.25 score 151 scripts

r-forge

car:Companion to Applied Regression

Functions to Accompany J. Fox and S. Weisberg, An R Companion to Applied Regression, Third Edition, Sage, 2019.

Maintained by John Fox. Last updated 5 months ago.

3.5 match 15.29 score 43k scripts 901 dependents

mkln

meshed:Bayesian Regression with Meshed Gaussian Processes

Fits Bayesian regression models based on latent Meshed Gaussian Processes (MGP) as described in Peruzzi, Banerjee, Finley (2020) <doi:10.1080/01621459.2020.1833889>, Peruzzi, Banerjee, Dunson, and Finley (2021) <arXiv:2101.03579>, Peruzzi and Dunson (2024) <arXiv:2201.10080>. Funded by ERC grant 856506 and NIH grant R01ES028804.

Maintained by Michele Peruzzi. Last updated 7 months ago.

bayesian mcmc multivariate regression spatial spatiotemporal openblas cpp openmp

8.7 match 13 stars 6.11 score 49 scripts

dwarton

mvabund:Statistical Methods for Analysing Multivariate Abundance Data

A set of tools for displaying, modeling and analysing multivariate abundance data in community ecology. See 'mvabund-package.Rd' for details of overall package organization. The package is implemented with the Gnu Scientific Library (<http://www.gnu.org/software/gsl/>) and 'Rcpp' (<http://dirk.eddelbuettel.com/code/rcpp.html>) 'R' / 'C++' classes.

Maintained by David Warton. Last updated 1 years ago.

gsl cpp

5.3 match 10 stars 10.13 score 680 scripts 5 dependents

fridleylab

spatialTIME:Spatial Analysis of Vectra Immunoflourescent Data

Visualization and analysis of Vectra Immunoflourescent data. Options for calculating both the univariate and bivariate Ripley's K are included. Calculations are performed using a permutation-based approach presented by Wilson et al. <doi:10.1101/2021.04.27.21256104>.

Maintained by Fridley Lab. Last updated 7 months ago.

8.7 match 4 stars 6.08 score 30 scripts

rivasiker

PhaseTypeR:General-Purpose Phase-Type Functions

General implementation of core function from phase-type theory. 'PhaseTypeR' can be used to model continuous and discrete phase-type distributions, both univariate and multivariate. The package includes functions for outputting the mean and (co)variance of phase-type distributions; their density, probability and quantile functions; functions for random draws; functions for reward-transformation; and functions for plotting the distributions as networks. For more information on these functions please refer to Bladt and Nielsen (2017, ISBN: 978-1-4939-8377-3) and Campillo Navarro (2019) <https://orbit.dtu.dk/en/publications/order-statistics-and-multivariate-discrete-phase-type-distributio>.

Maintained by Iker Rivas-González. Last updated 2 years ago.

9.8 match 2 stars 5.37 score 39 scripts

duncanobrien

EWSmethods:Forecasting Tipping Points at the Community Level

Rolling and expanding window approaches to assessing abundance based early warning signals, non-equilibrium resilience measures, and machine learning. See Dakos et al. (2012) <doi:10.1371/journal.pone.0041010>, Deb et al. (2022) <doi:10.1098/rsos.211475>, Drake and Griffen (2010) <doi:10.1038/nature09389>, Ushio et al. (2018) <doi:10.1038/nature25504> and Weinans et al. (2021) <doi:10.1038/s41598-021-87839-y> for methodological details. Graphical presentation of the outputs are also provided for clear and publishable figures. Visit the 'EWSmethods' website for more information, and tutorials.

Maintained by Duncan OBrien. Last updated 7 months ago.

9.6 match 8 stars 5.51 score 20 scripts

dcomtois

summarytools:Tools to Quickly and Neatly Summarize Data

Data frame summaries, cross-tabulations, weight-enabled frequency tables and common descriptive (univariate) statistics in concise tables available in a variety of formats (plain ASCII, Markdown and HTML). A good point-of-entry for exploring data, both for experienced and new R users.

Maintained by Dominic Comtois. Last updated 16 hours ago.

descriptive-statistics frequency-table html-report markdown pander pandoc pandoc-markdown rmarkdown rstudio

3.6 match 526 stars 14.52 score 2.9k scripts 6 dependents

covaruber

sommer:Solving Mixed Model Equations in R

Structural multivariate-univariate linear mixed model solver for estimation of multiple random effects with unknown variance-covariance structures (e.g., heterogeneous and unstructured) and known covariance among levels of random effects (e.g., pedigree and genomic relationship matrices) (Covarrubias-Pazaran, 2016 <doi:10.1371/journal.pone.0156744>; Maier et al., 2015 <doi:10.1016/j.ajhg.2014.12.006>; Jensen et al., 1997). REML estimates can be obtained using the Direct-Inversion Newton-Raphson and Direct-Inversion Average Information algorithms for the problems r x r (r being the number of records) or using the Henderson-based average information algorithm for the problem c x c (c being the number of coefficients to estimate). Spatial models can also be fitted using the two-dimensional spline functionality available.

Maintained by Giovanny Covarrubias-Pazaran. Last updated 21 days ago.

average-information mixed-models rcpparmadillo openblas cpp openmp

4.1 match 43 stars 12.70 score 300 scripts 9 dependents

cran

survivalAnalysis:High-Level Interface for Survival Analysis and Associated Plots

A high-level interface to perform survival analysis, including Kaplan-Meier analysis and log-rank tests and Cox regression. Aims at providing a clear and elegant syntax, support for use in a pipeline, structured output and plotting. Builds upon the 'survminer' package for Kaplan-Meier plots and provides a customizable implementation for forest plots. Kaplan & Meier (1958) <doi:10.1080/01621459.1958.10501452> Cox (1972) <JSTOR:2985181> Peto & Peto (1972) <JSTOR:2344317>.

Maintained by Marcel Wiesweg. Last updated 3 years ago.

11.6 match 3 stars 4.43 score 1 dependents

braverock

PerformanceAnalytics:Econometric Tools for Performance and Risk Analysis

Collection of econometric functions for performance and risk analysis. In addition to standard risk and performance metrics, this package aims to aid practitioners and researchers in utilizing the latest research in analysis of non-normal return streams. In general, it is most tested on return (rather than price) data on a regular scale, but most functions will work with irregular return data as well, and increasing numbers of functions will work with P&L or price data where possible.

Maintained by Brian G. Peterson. Last updated 3 months ago.

3.2 match 222 stars 15.93 score 4.8k scripts 20 dependents

bioc

phenomis:Postprocessing and univariate analysis of omics data

The 'phenomis' package provides methods to perform post-processing (i.e. quality control and normalization) as well as univariate statistical analysis of single and multi-omics data sets. These methods include quality control metrics, signal drift and batch effect correction, intensity transformation, univariate hypothesis testing, but also clustering (as well as annotation of metabolomics data). The data are handled in the standard Bioconductor formats (i.e. SummarizedExperiment and MultiAssayExperiment for single and multi-omics datasets, respectively; the alternative ExpressionSet and MultiDataSet formats are also supported for convenience). As a result, all methods can be readily chained as workflows. The pipeline can be further enriched by multivariate analysis and feature selection, by using the 'ropls' and 'biosigner' packages, which support the same formats. Data can be conveniently imported from and exported to text files. Although the methods were initially targeted to metabolomics data, most of the methods can be applied to other types of omics data (e.g., transcriptomics, proteomics).

Maintained by Etienne A. Thevenot. Last updated 5 months ago.

batcheffect clustering coverage kegg massspectrometry metabolomics normalization proteomics qualitycontrol sequencing statisticalmethod transcriptomics

11.7 match 4.40 score 6 scripts

tomroh

fitur:Fit Univariate Distributions

Wrapper for computing parameters for univariate distributions using MLE. It creates an object that stores d, p, q, r functions as well as parameters and statistics for diagnostics. Currently supports automated fitting from base and actuar packages. A manually fitting distribution fitting function is included to support directly specifying parameters for any distribution from ancillary packages.

Maintained by Thomas Roh. Last updated 3 years ago.

9.5 match 4 stars 5.32 score 35 scripts

gasparrini

mvmeta:Multivariate and Univariate Meta-Analysis and Meta-Regression

Collection of functions to perform fixed and random-effects multivariate and univariate meta-analysis and meta-regression.

Maintained by Antonio Gasparrini. Last updated 5 years ago.

6.9 match 6 stars 7.29 score 151 scripts 10 dependents

bioc

TCGAbiolinks:TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data

The aim of TCGAbiolinks is : i) facilitate the GDC open-access data retrieval, ii) prepare the data using the appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses and iv) to easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.

Maintained by Tiago Chedraoui Silva. Last updated 26 days ago.

dnamethylation differentialmethylation generegulation geneexpression methylationarray differentialexpression pathways network sequencing survival software bioc bioconductor gdc integrative-analysis tcga tcga-data tcgabiolinks

3.4 match 305 stars 14.45 score 1.6k scripts 6 dependents

bioc

microbiome:Microbiome Analytics

Utilities for microbiome analysis.

Maintained by Leo Lahti. Last updated 5 months ago.

metagenomics microbiome sequencing systemsbiology hitchip hitchip-atlas human-microbiome microbiology microbiome-analysis phyloseq population-study

3.9 match 290 stars 12.50 score 2.0k scripts 5 dependents

clarahapp

funData:An S4 Class for Functional Data

S4 classes for univariate and multivariate functional data with utility functions. See <doi:10.18637/jss.v093.i05> for a detailed description of the package functionalities and its interplay with the MFPCA package for multivariate functional principal component analysis <https://CRAN.R-project.org/package=MFPCA>.

Maintained by Clara Happ-Kurz. Last updated 1 years ago.

7.9 match 14 stars 6.15 score 111 scripts 6 dependents

tbates

umx:Structural Equation Modeling and Twin Modeling in R

Quickly create, run, and report structural equation models, and twin models. See '?umx' for help, and umx_open_CRAN_page("umx") for NEWS. Timothy C. Bates, Michael C. Neale, Hermine H. Maes, (2019). umx: A library for Structural Equation and Twin Modelling in R. Twin Research and Human Genetics, 22, 27-41. <doi:10.1017/thg.2019.2>.

Maintained by Timothy C. Bates. Last updated 1 days ago.

behavior-genetics genetics openmx psychology sem statistics structural-equation-modeling tutorials twin-models umx

5.0 match 44 stars 9.45 score 472 scripts

myles-lewis

nestedcv:Nested Cross-Validation with 'glmnet' and 'caret'

Implements nested k*l-fold cross-validation for lasso and elastic-net regularised linear models via the 'glmnet' package and other machine learning models via the 'caret' package <doi:10.1093/bioadv/vbad048>. Cross-validation of 'glmnet' alpha mixing parameter and embedded fast filter functions for feature selection are provided. Described as double cross-validation by Stone (1977) <doi:10.1111/j.2517-6161.1977.tb01603.x>. Also implemented is a method using outer CV to measure unbiased model performance metrics when fitting Bayesian linear and logistic regression shrinkage models using the horseshoe prior over parameters to encourage a sparse model as described by Piironen & Vehtari (2017) <doi:10.1214/17-EJS1337SI>.

Maintained by Myles Lewis. Last updated 5 days ago.

6.0 match 12 stars 7.92 score 46 scripts

nanxstats

oneclust:Maximum Homogeneity Clustering for Univariate Data

Maximum homogeneity clustering algorithm for one-dimensional data described in W. D. Fisher (1958) <doi:10.1080/01621459.1958.10501479> via dynamic programming.

Maintained by Nan Xiao. Last updated 1 years ago.

clustering-algorithm feature-engineering homogeneity peak-calling univariate-data cpp

10.5 match 5 stars 4.40 score

cran

reportRmd:Tidy Presentation of Clinical Reporting

Streamlined statistical reporting in 'Rmarkdown' environments. Facilitates the automated reporting of descriptive statistics, multiple univariate models, multivariable models and tables combining these outputs. Plotting functions include customisable survival curves, forest plots from logistic and ordinal regression and bivariate comparison plots.

Maintained by Lisa Avery. Last updated 2 months ago.

13.3 match 3.45 score 19 scripts 1 dependents

neerajdhanraj

PSF:Forecasting of univariate time series using the Pattern Sequence-based Forecasting (PSF) algorithm

Pattern Sequence Based Forecasting (PSF) takes univariate time series data as input and assist to forecast its future values. This algorithm forecasts the behavior of time series based on similarity of pattern sequences. Initially, clustering is done with the labeling of samples from database. The labels associated with samples are then used for forecasting the future behaviour of time series data. The further technical details and references regarding PSF are discussed in Vignette.

Maintained by Neeraj Bokde. Last updated 8 years ago.

6.7 match 17 stars 6.85 score 28 scripts 2 dependents

jasjeetsekhon

Matching:Multivariate and Propensity Score Matching with Balance Optimization

Provides functions for multivariate and propensity score matching and for finding optimal balance based on a genetic search algorithm. A variety of univariate and multivariate metrics to determine if balance has been obtained are also provided. For details, see the paper by Jasjeet Sekhon (2007, <doi:10.18637/jss.v042.i07>).

Maintained by Jasjeet Singh Sekhon. Last updated 5 months ago.

cpp

4.4 match 24 stars 10.36 score 852 scripts 10 dependents

neon-biodiversity

Ostats:O-Stats, or Pairwise Community-Level Niche Overlap Statistics

O-statistics, or overlap statistics, measure the degree of community-level trait overlap. They are estimated by fitting nonparametric kernel density functions to each species’ trait distribution and calculating their areas of overlap. For instance, the median pairwise overlap for a community is calculated by first determining the overlap of each species pair in trait space, and then taking the median overlap of each species pair in a community. This median overlap value is called the O-statistic (O for overlap). The Ostats() function calculates separate univariate overlap statistics for each trait, while the Ostats_multivariate() function calculates a single multivariate overlap statistic for all traits. O-statistics can be evaluated against null models to obtain standardized effect sizes. 'Ostats' is part of the collaborative Macrosystems Biodiversity Project "Local- to continental-scale drivers of biodiversity across the National Ecological Observatory Network (NEON)." For more information on this project, see the Macrosystems Biodiversity Website (<https://neon-biodiversity.github.io/>). Calculation of O-statistics is described in Read et al. (2018) <doi:10.1111/ecog.03641>, and a teaching module for introducing the underlying biological concepts at an undergraduate level is described in Grady et al. (2018) <http://tiee.esa.org/vol/v14/issues/figure_sets/grady/abstract.html>.

Maintained by Quentin D. Read. Last updated 4 months ago.

ecology

6.8 match 7 stars 6.69 score 28 scripts

helske

bssm:Bayesian Inference of Non-Linear and Non-Gaussian State Space Models

Efficient methods for Bayesian inference of state space models via Markov chain Monte Carlo (MCMC) based on parallel importance sampling type weighted estimators (Vihola, Helske, and Franks, 2020, <doi:10.1111/sjos.12492>), particle MCMC, and its delayed acceptance version. Gaussian, Poisson, binomial, negative binomial, and Gamma observation densities and basic stochastic volatility models with linear-Gaussian state dynamics, as well as general non-linear Gaussian models and discretised diffusion models are supported. See Helske and Vihola (2021, <doi:10.32614/RJ-2021-103>) for details.

Maintained by Jouni Helske. Last updated 6 months ago.

bayesian-inference cpp markov-chain-monte-carlo particle-filter state-space time-series openblas cpp openmp

6.9 match 42 stars 6.43 score 11 scripts

clarahapp

MFPCA:Multivariate Functional Principal Component Analysis for Data Observed on Different Dimensional Domains

Calculate a multivariate functional principal component analysis for data observed on different dimensional domains. The estimation algorithm relies on univariate basis expansions for each element of the multivariate functional data (Happ & Greven, 2018) <doi:10.1080/01621459.2016.1273115>. Multivariate and univariate functional data objects are represented by S4 classes for this type of data implemented in the package 'funData'. For more details on the general concepts of both packages and a case study, see Happ-Kurz (2020) <doi:10.18637/jss.v093.i05>.

Maintained by Clara Happ-Kurz. Last updated 3 years ago.

fftw3

6.4 match 32 stars 6.89 score 203 scripts 4 dependents

simulatr

simrel:Simulation of Multivariate Linear Model Data

Researchers have been using simulated data from a multivariate linear model to compare and evaluate different methods, ideas and models. Additionally, teachers and educators have been using a simulation tool to demonstrate and teach various statistical and machine learning concepts. This package helps users to simulate linear model data with a wide range of properties by tuning few parameters such as relevant latent components. In addition, a shiny app as an 'RStudio' gadget gives users a simple interface for using the simulation function. See more on: Sæbø, S., Almøy, T., Helland, I.S. (2015) <doi:10.1016/j.chemolab.2015.05.012> and Rimal, R., Almøy, T., Sæbø, S. (2018) <doi:10.1016/j.chemolab.2018.02.009>.

Maintained by Raju Rimal. Last updated 2 years ago.

bivariate-simulation multivariate-simulation relevant-predictor-components simulated-data simulation univariate-simulation

9.0 match 3 stars 4.78 score 40 scripts

cran

fGarch:Rmetrics - Autoregressive Conditional Heteroskedastic Modelling

Analyze and model heteroskedastic behavior in financial time series.

Maintained by Georgi N. Boshnakov. Last updated 12 months ago.

fortran

5.3 match 6 stars 8.20 score 1.1k scripts 51 dependents

stscl

gdverse:Analysis of Spatial Stratified Heterogeneity

Analyzing spatial factors and exploring spatial associations based on the concept of spatial stratified heterogeneity, while also taking into account local spatial dependencies, spatial interpretability, complex spatial interactions, and robust spatial stratification. Additionally, it supports the spatial stratified heterogeneity family established in academic literature.

Maintained by Wenbo Lv. Last updated 1 days ago.

geographical-detector geoinformatics geospatial-analysis spatial-statistics spatial-stratified-heterogeneity cpp

4.7 match 32 stars 9.07 score 41 scripts 2 dependents

twolodzko

extraDistr:Additional Univariate and Multivariate Distributions

Density, distribution function, quantile function and random generation for a number of univariate and multivariate distributions. This package implements the following distributions: Bernoulli, beta-binomial, beta-negative binomial, beta prime, Bhattacharjee, Birnbaum-Saunders, bivariate normal, bivariate Poisson, categorical, Dirichlet, Dirichlet-multinomial, discrete gamma, discrete Laplace, discrete normal, discrete uniform, discrete Weibull, Frechet, gamma-Poisson, generalized extreme value, Gompertz, generalized Pareto, Gumbel, half-Cauchy, half-normal, half-t, Huber density, inverse chi-squared, inverse-gamma, Kumaraswamy, Laplace, location-scale t, logarithmic, Lomax, multivariate hypergeometric, multinomial, negative hypergeometric, non-standard beta, normal mixture, Poisson mixture, Pareto, power, reparametrized beta, Rayleigh, shifted Gompertz, Skellam, slash, triangular, truncated binomial, truncated normal, truncated Poisson, Tukey lambda, Wald, zero-inflated binomial, zero-inflated negative binomial, zero-inflated Poisson.

Maintained by Tymoteusz Wolodzko. Last updated 10 days ago.

c-plus-plus c-plus-plus-11 distribution multivariate-distributions probability random-generation rcpp statistics cpp

3.6 match 53 stars 11.60 score 1.5k scripts 107 dependents

nliulab

AutoScore:An Interpretable Machine Learning-Based Automatic Clinical Score Generator

A novel interpretable machine learning-based framework to automate the development of a clinical scoring model for predefined outcomes. Our novel framework consists of six modules: variable ranking with machine learning, variable transformation, score derivation, model selection, domain knowledge-based score fine-tuning, and performance evaluation.The details are described in our research paper<doi:10.2196/21798>. Users or clinicians could seamlessly generate parsimonious sparse-score risk models (i.e., risk scores), which can be easily implemented and validated in clinical practice. We hope to see its application in various medical case studies.

Maintained by Feng Xie. Last updated 14 days ago.

5.4 match 32 stars 7.70 score 30 scripts

smartdata-analysis-and-statistics

metamisc:Meta-Analysis of Diagnosis and Prognosis Research Studies

Facilitate frequentist and Bayesian meta-analysis of diagnosis and prognosis research studies. It includes functions to summarize multiple estimates of prediction model discrimination and calibration performance (Debray et al., 2019) <doi:10.1177/0962280218785504>. It also includes functions to evaluate funnel plot asymmetry (Debray et al., 2018) <doi:10.1002/jrsm.1266>. Finally, the package provides functions for developing multivariable prediction models from datasets with clustering (de Jong et al., 2021) <doi:10.1002/sim.8981>.

Maintained by Thomas Debray. Last updated 30 days ago.

meta-analysis prognosis prognostic-models

5.5 match 7 stars 7.48 score 102 scripts

matrix-profile-foundation

tsmp:Time Series with Matrix Profile

A toolkit implementing the Matrix Profile concept that was created by CS-UCR <http://www.cs.ucr.edu/~eamonn/MatrixProfile.html>.

Maintained by Francisco Bischoff. Last updated 3 years ago.

algorithm matrix-profile motif-search time-series cpp

5.6 match 72 stars 7.29 score 179 scripts 1 dependents

bioc

AneuFinder:Analysis of Copy Number Variation in Single-Cell-Sequencing Data

AneuFinder implements functions for copy-number detection, breakpoint detection, and karyotype and heterogeneity analysis in single-cell whole genome sequencing and strand-seq data.

Maintained by Aaron Taudt. Last updated 5 months ago.

immunooncology software sequencing singlecell copynumbervariation genomicvariation hiddenmarkovmodel wholegenome cpp

5.3 match 17 stars 7.70 score 37 scripts

r-forge

multcomp:Simultaneous Inference in General Parametric Models

Simultaneous tests and confidence intervals for general linear hypotheses in parametric models, including linear, generalized linear, linear mixed effects, and survival models. The package includes demos reproducing analyzes presented in the book "Multiple Comparisons Using R" (Bretz, Hothorn, Westfall, 2010, CRC Press).

Maintained by Torsten Hothorn. Last updated 2 months ago.

3.0 match 13.49 score 7.5k scripts 366 dependents

yelleknek

AMCP:A Model Comparison Perspective

Accompanies "Designing experiments and analyzing data: A model comparison perspective" (3rd ed.) by Maxwell, Delaney, & Kelley (2018; Routledge). Contains all of the data sets in the book's chapters and end-of-chapter exercises. Information about the book is available at <http://www.DesigningExperiments.com>.

Maintained by Ken Kelley. Last updated 5 years ago.

data

10.3 match 3.91 score 162 scripts

tgouhier

biwavelet:Conduct Univariate and Bivariate Wavelet Analyses

This is a port of the WTC MATLAB package written by Aslak Grinsted and the wavelet program written by Christopher Torrence and Gibert P. Compo. This package can be used to perform univariate and bivariate (cross-wavelet, wavelet coherence, wavelet clustering) analyses.

Maintained by Tarik Gouhier. Last updated 7 months ago.

cpp

5.3 match 45 stars 7.51 score 81 scripts 1 dependents

optad

adoptr:Adaptive Optimal Two-Stage Designs

Optimize one or two-arm, two-stage designs for clinical trials with respect to several implemented objective criteria or custom objectives. Optimization under uncertainty and conditional (given stage-one outcome) constraints are supported. See Pilz et al. (2019) <doi:10.1002/sim.8291> and Kunzmann et al. (2021) <doi:10.18637/jss.v098.i09> for details.

Maintained by Maximilian Pilz. Last updated 5 months ago.

5.5 match 1 stars 7.09 score 39 scripts 1 dependents

gallegoj

tfarima:Transfer Function and ARIMA Models

Building customized transfer function and ARIMA models with multiple operators and parameter restrictions. Functions for model identification, model estimation (exact or conditional maximum likelihood), model diagnostic checking, automatic outlier detection, calendar effects, forecasting and seasonal adjustment. See Bell and Hillmer (1983) <doi:10.1080/01621459.1983.10478005>, Box, Jenkins, Reinsel and Ljung <ISBN:978-1-118-67502-1>, Box, Pierce and Newbold (1987) <doi:10.1080/01621459.1987.10478430>, Box and Tiao (1975) <doi:10.1080/01621459.1975.10480264>, Chen and Liu (1993) <doi:10.1080/01621459.1993.10594321>.

Maintained by Jose L. Gallego. Last updated 12 months ago.

openblas cpp openmp

9.7 match 2 stars 4.04 score 11 scripts

geobosh

Countr:Flexible Univariate Count Models Based on Renewal Processes

Flexible univariate count models based on renewal processes. The models may include covariates and can be specified with familiar formula syntax as in glm() and package 'flexsurv'. The methodology is described by Kharrat et all (2019) <doi:10.18637/jss.v090.i13> (included as vignette 'Countr_guide' in the package). If the suggested package 'pscl' is not available from CRAN, it can be installed with 'remotes::install_github("cran/pscl")'. It is no longer used by the functions in this package but is needed for some of the extended examples.

Maintained by Georgi N. Boshnakov. Last updated 1 years ago.

count-data renewal-process sports-modelling openblas cpp

6.8 match 4 stars 5.71 score 43 scripts

mharinga

insurancerating:Analytic Insurance Rating Techniques

Functions to build, evaluate, and visualize insurance rating models. It simplifies the process of modeling premiums, and allows to analyze insurance risk factors effectively. The package employs a data-driven strategy for constructing insurance tariff classes, drawing on the work of Antonio and Valdez (2012) <doi:10.1007/s10182-011-0152-7>.

Maintained by Martin Haringa. Last updated 5 months ago.

actuarial actuarial-science insurance pricing

6.5 match 70 stars 5.89 score 28 scripts

julia-wrobel

mxfda:A Functional Data Analysis Package for Spatial Single Cell Data

Methods and tools for deriving spatial summary functions from single-cell imaging data and performing functional data analyses. Functions can be applied to other single-cell technologies such as spatial transcriptomics. Functional regression and functional principal component analysis methods are in the 'refund' package <https://cran.r-project.org/package=refund> while calculation of the spatial summary functions are from the 'spatstat' package <https://spatstat.org/>.

Maintained by Alex Soupir. Last updated 24 days ago.

7.3 match 1 stars 5.22 score 8 scripts

bioc

CPSM:CPSM: Cancer patient survival model

The CPSM package provides a comprehensive computational pipeline for predicting the survival probability of cancer patients. It offers a series of steps including data processing, splitting data into training and test subsets, and normalization of data. The package enables the selection of significant features based on univariate survival analysis and generates a LASSO prognostic index score. It supports the development of predictive models for survival probability using various features and provides visualization tools to draw survival curves based on predicted survival probabilities. Additionally, SPM includes functionalities for generating bar plots that depict the predicted mean and median survival times of patients, making it a versatile tool for survival analysis in cancer research.

Maintained by Harpreet Kaur. Last updated 4 days ago.

geneexpression normalization survival

9.7 match 3.90 score

mflores72000

ILS:Interlaboratory Study

It performs interlaboratory studies (ILS) to detect those laboratories that provide non-consistent results when comparing to others. It permits to work simultaneously with various testing materials, from standard univariate, and functional data analysis (FDA) perspectives. The univariate approach based on ASTM E691-08 consist of estimating the Mandel's h and k statistics to identify those laboratories that provide more significant different results, testing also the presence of outliers by Cochran and Grubbs tests, Analysis of variance (ANOVA) techniques are provided (F and Tuckey tests) to test differences in means corresponding to different laboratories per each material. Taking into account the functional nature of data retrieved in analytical chemistry, applied physics and engineering (spectra, thermograms, etc.). ILS package provides a FDA approach for finding the Mandel's k and h statistics distribution by smoothing bootstrap resampling.

Maintained by Miguel Flores. Last updated 2 years ago.

5.8 match 6.48 score 75 scripts

pharmaverse

tidytlg:Create TLGs using the 'tidyverse'

Generate tables, listings, and graphs (TLG) using 'tidyverse.' Tables can be created functionally, using a standard TLG process, or by specifying table and column metadata to create generic analysis summaries. The 'envsetup' package can also be leveraged to create environments for table creation.

Maintained by Konrad Pagacz. Last updated 9 months ago.

4.7 match 33 stars 8.07 score 22 scripts

lme4

lme4:Linear Mixed-Effects Models using 'Eigen' and S4

Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the 'Eigen' C++ library for numerical linear algebra and 'RcppEigen' "glue".

Maintained by Ben Bolker. Last updated 2 days ago.

cpp

1.8 match 647 stars 20.69 score 35k scripts 1.5k dependents

mrijalussholihin

sae.prop:Small Area Estimation using Fay-Herriot Models with Additive Logistic Transformation

Implements Additive Logistic Transformation (alr) for Small Area Estimation under Fay Herriot Model. Small Area Estimation is used to borrow strength from auxiliary variables to improve the effectiveness of a domain sample size. This package uses Empirical Best Linear Unbiased Prediction (EBLUP) estimator. The Additive Logistic Transformation (alr) are based on transformation by Aitchison J (1986). The covariance matrix for multivariate application is base on covariance matrix used by Esteban M, Lombardía M, López-Vizcaíno E, Morales D, and Pérez A <doi:10.1007/s11749-019-00688-w>. The non-sampled models are modified area-level models based on models proposed by Anisa R, Kurnia A, and Indahwati I <doi:10.9790/5728-10121519>, with univariate model using model-3, and multivariate model using model-1. The MSE are estimated using Parametric Bootstrap approach. For non-sampled cases, MSE are estimated using modified approach proposed by Haris F and Ubaidillah A <doi:10.4108/eai.2-8-2019.2290339>.

Maintained by M. Rijalus Sholihin. Last updated 3 years ago.

13.5 match 2.70 score

bioc

Voyager:From geospatial to spatial omics

SpatialFeatureExperiment (SFE) is a new S4 class for working with spatial single-cell genomics data. The voyager package implements basic exploratory spatial data analysis (ESDA) methods for SFE. Univariate methods include univariate global spatial ESDA methods such as Moran's I, permutation testing for Moran's I, and correlograms. Bivariate methods include Lee's L and cross variogram. Multivariate methods include MULTISPATI PCA and multivariate local Geary's C recently developed by Anselin. The Voyager package also implements plotting functions to plot SFE data and ESDA results.

Maintained by Lambda Moses. Last updated 3 months ago.

geneexpression spatial transcriptomics visualization bioconductor eda esda exploratory-data-analysis omics spatial-statistics spatial-transcriptomics

4.1 match 87 stars 8.71 score 173 scripts

crunch-io

crunch:Crunch.io Data Tools

The Crunch.io service <https://crunch.io/> provides a cloud-based data store and analytic engine, as well as an intuitive web interface. Using this package, analysts can interact with and manipulate Crunch datasets from within R. Importantly, this allows technical researchers to collaborate naturally with team members, managers, and clients who prefer a point-and-click interface.

Maintained by Greg Freedman Ellis. Last updated 10 days ago.

3.4 match 9 stars 10.53 score 200 scripts 2 dependents

shangzhi-hong

RfEmpImp:Multiple Imputation using Chained Random Forests

An R package for multiple imputation using chained random forests. Implemented methods can handle missing data in mixed types of variables by using prediction-based or node-based conditional distributions constructed using random forests. For prediction-based imputation, the method based on the empirical distribution of out-of-bag prediction errors of random forests and the method based on normality assumption for prediction errors of random forests are provided for imputing continuous variables. And the method based on predicted probabilities is provided for imputing categorical variables. For node-based imputation, the method based on the conditional distribution formed by the predicting nodes of random forests, and the method based on proximity measures of random forests are provided. More details of the statistical methods can be found in Hong et al. (2020) <arXiv:2004.14823>.

Maintained by Shangzhi Hong. Last updated 2 years ago.

imputation missing-data random-forest

8.1 match 5 stars 4.40 score 8 scripts

kylebgorman

ldamatch:Selection of Statistically Similar Research Groups

Select statistically similar research groups by backward selection using various robust algorithms, including a heuristic based on linear discriminant analysis, multiple heuristics based on the test statistic, and parallelized exhaustive search.

Maintained by Kyle Gorman. Last updated 11 months ago.

17.7 match 2.00 score 9 scripts

franciscomartinezdelrio

utsf:Univariate Time Series Forecasting

An engine for univariate time series forecasting using different regression models in an autoregressive way. The engine provides an uniform interface for applying the different models. Furthermore, it is extensible so that users can easily apply their own regression models to univariate time series forecasting and benefit from all the features of the engine, such as preprocessings or estimation of forecast accuracy.

Maintained by Francisco Martinez. Last updated 27 days ago.

6.8 match 2 stars 5.23 score 4 scripts

stan-dev

rstanarm:Bayesian Applied Regression Modeling via Stan

Estimates previously compiled regression models using the 'rstan' package, which provides the R interface to the Stan C++ library for Bayesian estimation. Users specify models via the customary R syntax with a formula and data.frame plus some additional arguments for priors.

Maintained by Ben Goodrich. Last updated 9 months ago.

bayesian bayesian-data-analysis bayesian-inference bayesian-methods bayesian-statistics multilevel-models rstan rstanarm stan statistical-modeling cpp

2.3 match 393 stars 15.68 score 5.0k scripts 13 dependents

ohdsi

Cyclops:Cyclic Coordinate Descent for Logistic, Poisson and Survival Analysis

This model fitting tool incorporates cyclic coordinate descent and majorization-minimization approaches to fit a variety of regression models found in large-scale observational healthcare data. Implementations focus on computational optimization and fine-scale parallelization to yield efficient inference in massive datasets. Please see: Suchard, Simpson, Zorych, Ryan and Madigan (2013) <doi:10.1145/2414416.2414791>.

Maintained by Marc A. Suchard. Last updated 3 months ago.

hades cpp

3.9 match 39 stars 9.05 score 73 scripts 4 dependents

tsmodels

tsgarch:Univariate GARCH Models

Multiple flavors of the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model with a large choice of conditional distributions. Methods for specification, estimation, prediction, filtering, simulation, statistical testing and more. Represents a partial re-write and re-think of 'rugarch', making use of automatic differentiation for estimation.

Maintained by Alexios Galanos. Last updated 3 months ago.

garch cpp

5.0 match 13 stars 6.93 score 16 scripts 1 dependents

dicook

nullabor:Tools for Graphical Inference

Tools for visual inference. Generate null data sets and null plots using permutation and simulation. Calculate distance metrics for a lineup, and examine the distributions of metrics.

Maintained by Di Cook. Last updated 1 months ago.

3.3 match 57 stars 10.38 score 370 scripts 2 dependents

kassambara

rstatix:Pipe-Friendly Framework for Basic Statistical Tests

Provides a simple and intuitive pipe-friendly framework, coherent with the 'tidyverse' design philosophy, for performing basic statistical tests, including t-test, Wilcoxon test, ANOVA, Kruskal-Wallis and correlation analyses. The output of each test is automatically transformed into a tidy data frame to facilitate visualization. Additional functions are available for reshaping, reordering, manipulating and visualizing correlation matrix. Functions are also included to facilitate the analysis of factorial experiments, including purely 'within-Ss' designs (repeated measures), purely 'between-Ss' designs, and mixed 'within-and-between-Ss' designs. It's also possible to compute several effect size metrics, including "eta squared" for ANOVA, "Cohen's d" for t-test and 'Cramer V' for the association between categorical variables. The package contains helper functions for identifying univariate and multivariate outliers, assessing normality and homogeneity of variances.

Maintained by Alboukadel Kassambara. Last updated 2 years ago.

2.3 match 456 stars 15.16 score 11k scripts 420 dependents

baddstats

goftest:Classical Goodness-of-Fit Tests for Univariate Distributions

Cramer-Von Mises and Anderson-Darling tests of goodness-of-fit for continuous univariate distributions, using efficient algorithms.

Maintained by Adrian Baddeley. Last updated 5 years ago.

3.5 match 4 stars 9.90 score 260 scripts 207 dependents

tnagler

kde1d:Univariate Kernel Density Estimation

Provides an efficient implementation of univariate local polynomial kernel density estimators that can handle bounded and discrete data. See Geenens (2014) <doi:10.48550/arXiv.1303.4121>, Geenens and Wang (2018) <doi:10.48550/arXiv.1602.04862>, Nagler (2018a) <doi:10.48550/arXiv.1704.07457>, Nagler (2018b) <doi:10.48550/arXiv.1705.05431>.

Maintained by Thomas Nagler. Last updated 2 months ago.

cpp

5.3 match 13 stars 6.39 score 55 scripts 16 dependents

sqyu

genscore:Generalized Score Matching Estimators

Implementation of the Generalized Score Matching estimator in Yu et al. (2019) <http://jmlr.org/papers/v20/18-278.html> for non-negative graphical models (truncated Gaussian, exponential square-root, gamma, a-b models) and univariate truncated Gaussian distributions. Also includes the original estimator for untruncated Gaussian graphical models from Lin et al. (2016) <doi:10.1214/16-EJS1126>, with the addition of a diagonal multiplier.

Maintained by Shiqing Yu. Last updated 5 years ago.

density-estimation graphical-models interaction-models score-matching undirected-graphs

8.1 match 1 stars 4.18 score 3 scripts 1 dependents

r-forge

distr:Object Oriented Implementation of Distributions

S4-classes and methods for distributions.

Maintained by Peter Ruckdeschel. Last updated 2 months ago.

3.8 match 8.84 score 327 scripts 32 dependents

quantsulting

ghyp:Generalized Hyperbolic Distribution and Its Special Cases

Detailed functionality for working with the univariate and multivariate Generalized Hyperbolic distribution and its special cases (Hyperbolic (hyp), Normal Inverse Gaussian (NIG), Variance Gamma (VG), skewed Student-t and Gaussian distribution). Especially, it contains fitting procedures, an AIC-based model selection routine, and functions for the computation of density, quantile, probability, random variates, expected shortfall and some portfolio optimization and plotting routines as well as the likelihood ratio test. In addition, it contains the Generalized Inverse Gaussian distribution. See Chapter 3 of A. J. McNeil, R. Frey, and P. Embrechts. Quantitative risk management: Concepts, techniques and tools. Princeton University Press, Princeton (2005).

Maintained by Marc Weibel. Last updated 7 months ago.

5.9 match 5.58 score 90 scripts 8 dependents

ggobi

GGally:Extension to 'ggplot2'

The R package 'ggplot2' is a plotting system based on the grammar of graphics. 'GGally' extends 'ggplot2' by adding several functions to reduce the complexity of combining geometric objects with transformed data. Some of these functions include a pairwise plot matrix, a two group pairwise plot matrix, a parallel coordinates plot, a survival plot, and several functions to plot networks.

Maintained by Barret Schloerke. Last updated 10 months ago.

2.0 match 597 stars 16.15 score 17k scripts 154 dependents

ddsjoberg

gtsummary:Presentation-Ready Data Summary and Analytic Result Tables

Creates presentation-ready tables summarizing data sets, regression models, and more. The code to create the tables is concise and highly customizable. Data frames can be summarized with any function, e.g. mean(), median(), even user-written functions. Regression models are summarized and include the reference rows for categorical variables. Common regression models, such as logistic regression and Cox proportional hazards regression, are automatically identified and the tables are pre-filled with appropriate column headers.

Maintained by Daniel D. Sjoberg. Last updated 2 days ago.

easy-to-use gt html5 regression-models reproducibility reproducible-research statistics summary-statistics summary-tables table1 tableone

1.9 match 1.1k stars 17.00 score 8.2k scripts 15 dependents

convfunctimeseries

NTS:Nonlinear Time Series Analysis

Simulation, estimation, prediction procedure, and model identification methods for nonlinear time series analysis, including threshold autoregressive models, Markov-switching models, convolutional functional autoregressive models, nonlinearity tests, Kalman filters and various sequential Monte Carlo methods. More examples and details about this package can be found in the book "Nonlinear Time Series Analysis" by Ruey S. Tsay and Rong Chen, John Wiley & Sons, 2018 (ISBN: 978-1-119-26407-1).

Maintained by Xialu Liu. Last updated 1 years ago.

10.8 match 2 stars 2.94 score 48 scripts

kkholst

mets:Analysis of Multivariate Event Times

Implementation of various statistical models for multivariate event history data <doi:10.1007/s10985-013-9244-x>. Including multivariate cumulative incidence models <doi:10.1002/sim.6016>, and bivariate random effects probit models (Liability models) <doi:10.1016/j.csda.2015.01.014>. Modern methods for survival analysis, including regression modelling (Cox, Fine-Gray, Ghosh-Lin, Binomial regression) with fast computation of influence functions.

Maintained by Klaus K. Holst. Last updated 2 days ago.

multivariate-time-to-event survival-analysis time-to-event fortran openblas cpp

2.3 match 14 stars 13.47 score 236 scripts 42 dependents

maximeherve

RVAideMemoire:Testing and Plotting Procedures for Biostatistics

Contains miscellaneous functions useful in biostatistics, mostly univariate and multivariate testing procedures with a special emphasis on permutation tests. Many functions intend to simplify user's life by shortening existing procedures or by implementing plotting functions that can be used with as many methods from different packages as possible.

Maintained by Maxime HERVE. Last updated 1 years ago.

5.9 match 8 stars 5.31 score 632 scripts

pecanproject

PEcAn.uncertainty:PEcAn Functions Used for Propagating and Partitioning Uncertainties in Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PECAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by David LeBauer. Last updated 2 days ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

3.4 match 216 stars 8.91 score 15 scripts 5 dependents

cran

compound.Cox:Univariate Feature Selection and Compound Covariate for Predicting Survival, Including Copula-Based Analyses for Dependent Censoring

Univariate feature selection and compound covariate methods under the Cox model with high-dimensional features (e.g., gene expressions). Available are survival data for non-small-cell lung cancer patients with gene expressions (Chen et al 2007 New Engl J Med) <DOI:10.1056/NEJMoa060096>, statistical methods in Emura et al (2012 PLoS ONE) <DOI:10.1371/journal.pone.0047627>, Emura & Chen (2016 Stat Methods Med Res) <DOI:10.1177/0962280214533378>, and Emura et al (2019)<DOI:10.1016/j.cmpb.2018.10.020>. Algorithms for generating correlated gene expressions are also available. Estimation of survival functions via copula-graphic (CG) estimators is also implemented, which is useful for sensitivity analyses under dependent censoring (Yeh et al 2023 Biomedicines) <DOI:10.3390/biomedicines11030797> and factorial survival analyses (Emura et al 2024 Stat Methods Med Res) <DOI:10.1177/09622802231215805>.

Maintained by Takeshi Emura. Last updated 2 months ago.

13.3 match 2.29 score 3 dependents

hmsc-r

Hmsc:Hierarchical Model of Species Communities

Hierarchical Modelling of Species Communities (HMSC) is a model-based approach for analyzing community ecological data. This package implements it in the Bayesian framework with Gibbs Markov chain Monte Carlo (MCMC) sampling (Tikhonov et al. (2020) <doi:10.1111/2041-210X.13345>).

Maintained by Otso Ovaskainen. Last updated 5 days ago.

2.9 match 105 stars 10.31 score 476 scripts

r-spatial

spdep:Spatial Dependence: Weighting Schemes, Statistics

A collection of functions to create spatial weights matrix objects from polygon 'contiguities', from point patterns by distance and tessellations, for summarizing these objects, and for permitting their use in spatial data analysis, including regional aggregation by minimum spanning tree; a collection of tests for spatial 'autocorrelation', including global 'Morans I' and 'Gearys C' proposed by 'Cliff' and 'Ord' (1973, ISBN: 0850860369) and (1981, ISBN: 0850860814), 'Hubert/Mantel' general cross product statistic, Empirical Bayes estimates and 'Assunção/Reis' (1999) <doi:10.1002/(SICI)1097-0258(19990830)18:16%3C2147::AID-SIM179%3E3.0.CO;2-I> Index, 'Getis/Ord' G ('Getis' and 'Ord' 1992) <doi:10.1111/j.1538-4632.1992.tb00261.x> and multicoloured join count statistics, 'APLE' ('Li 'et al.' ) <doi:10.1111/j.1538-4632.2007.00708.x>, local 'Moran's I', 'Gearys C' ('Anselin' 1995) <doi:10.1111/j.1538-4632.1995.tb00338.x> and 'Getis/Ord' G ('Ord' and 'Getis' 1995) <doi:10.1111/j.1538-4632.1995.tb00912.x>, 'saddlepoint' approximations ('Tiefelsdorf' 2002) <doi:10.1111/j.1538-4632.2002.tb01084.x> and exact tests for global and local 'Moran's I' ('Bivand et al.' 2009) <doi:10.1016/j.csda.2008.07.021> and 'LOSH' local indicators of spatial heteroscedasticity ('Ord' and 'Getis') <doi:10.1007/s00168-011-0492-y>. The implementation of most of these measures is described in 'Bivand' and 'Wong' (2018) <doi:10.1007/s11749-018-0599-x>, with further extensions in 'Bivand' (2022) <doi:10.1111/gean.12319>. 'Lagrange' multiplier tests for spatial dependence in linear models are provided ('Anselin et al'. 1996) <doi:10.1016/0166-0462(95)02111-6>, as are 'Rao' score tests for hypothesised spatial 'Durbin' models based on linear models ('Koley' and 'Bera' 2023) <doi:10.1080/17421772.2023.2256810>. A local indicators for categorical data (LICD) implementation based on 'Carrer et al.' (2021) <doi:10.1016/j.jas.2020.105306> and 'Bivand et al.' (2017) <doi:10.1016/j.spasta.2017.03.003> was added in 1.3-7. From 'spdep' and 'spatialreg' versions >= 1.2-1, the model fitting functions previously present in this package are defunct in 'spdep' and may be found in 'spatialreg'.

Maintained by Roger Bivand. Last updated 18 days ago.

spatial-autocorrelation spatial-dependence spatial-weights

1.8 match 131 stars 16.62 score 6.0k scripts 107 dependents

tidymodels

hardhat:Construct Modeling Packages

Building modeling packages is hard. A large amount of effort generally goes into providing an implementation for a new method that is efficient, fast, and correct, but often less emphasis is put on the user interface. A good interface requires specialized knowledge about S3 methods and formulas, which the average package developer might not have. The goal of 'hardhat' is to reduce the burden around building new modeling packages by providing functionality for preprocessing, predicting, and validating input.

Maintained by Hannah Frick. Last updated 1 months ago.

2.0 match 103 stars 14.88 score 175 scripts 436 dependents

amices

mice:Multivariate Imputation by Chained Equations

Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm as described in Van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>. Each variable has its own imputation model. Built-in imputation models are provided for continuous data (predictive mean matching, normal), binary data (logistic regression), unordered categorical data (polytomous logistic regression) and ordered categorical data (proportional odds). MICE can also impute continuous two-level data (normal model, pan, second-level variables). Passive imputation can be used to maintain consistency between variables. Various diagnostic plots are available to inspect the quality of the imputations.

Maintained by Stef van Buuren. Last updated 6 days ago.

chained-equations fcs imputation mice missing-data missing-values multiple-imputation multivariate-data cpp

1.8 match 462 stars 16.50 score 10k scripts 154 dependents

jokergoo

CePa:Centrality-Based Pathway Enrichment

This package aims to find significant pathways through network topology information. It has several advantages compared with current pathway enrichment tools. First, pathway node instead of single gene is taken as the basic unit when analysing networks to meet the fact that genes must be constructed into complexes to hold normal functions. Second, multiple network centrality measures are applied simultaneously to measure importance of nodes from different aspects to make a full view on the biological system. CePa extends standard pathway enrichment methods, which include both over-representation analysis procedure and gene-set analysis procedure. <https://doi.org/10.1093/bioinformatics/btt008>.

Maintained by Zuguang Gu. Last updated 4 years ago.

4.5 match 3 stars 6.53 score 75 scripts

cran

sn:The Skew-Normal and Related Distributions Such as the Skew-t and the SUN

Build and manipulate probability distributions of the skew-normal family and some related ones, notably the skew-t and the SUN families. For the skew-normal and the skew-t distributions, statistical methods are provided for data fitting and model diagnostics, in the univariate and the multivariate case.

Maintained by Adelchi Azzalini. Last updated 2 years ago.

3.9 match 3 stars 7.44 score 92 dependents

oguzhanogreden

dcurver:Utility Functions for Davidian Curves

A Davidian curve defines a seminonparametric density, whose shape and flexibility can be tuned by easy to estimate parameters. Since a special case of a Davidian curve is the standard normal density, Davidian curves can be used for relaxing normality assumption in statistical applications (Zhang & Davidian, 2001) <doi:10.1111/j.0006-341X.2001.00795.x>. This package provides the density function, the gradient of the loglikelihood and a random generator for Davidian curves.

Maintained by Oğuzhan Öğreden. Last updated 7 years ago.

openblas cpp openmp

5.3 match 5.43 score 4 scripts 42 dependents

openbiox

UCSCXenaShiny:Interactive Analysis of UCSC Xena Data

Provides functions and a Shiny application for downloading, analyzing and visualizing datasets from UCSC Xena (<http://xena.ucsc.edu/>), which is a collection of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE, and others.

Maintained by Shixiang Wang. Last updated 4 months ago.

cancer-dataset shiny-apps ucsc-xena

3.4 match 96 stars 8.54 score 35 scripts

jonasmoss

univariateML:Maximum Likelihood Estimation for Univariate Densities

User-friendly maximum likelihood estimation (Fisher (1921) <doi:10.1098/rsta.1922.0009>) of univariate densities.

Maintained by Jonas Moss. Last updated 13 days ago.

density estimation maximum-likelihood

3.5 match 8 stars 8.10 score 62 scripts 7 dependents

andrisignorell

DescTools:Tools for Descriptive Statistics

A collection of miscellaneous basic statistic functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox, which facilitates the (notoriously time consuming) first descriptive tasks in data analysis, consisting of calculating descriptive statistics, drawing graphical summaries and reporting the results. The package contains furthermore functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here, was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages which are not needed at all), and to provide a common and consistent interface as far as function and arguments naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in absence of convincing alternatives). The 'BigCamelCase' style was consequently applied to functions borrowed from contributed R packages as well.

Maintained by Andri Signorell. Last updated 9 days ago.

fortran cpp

1.7 match 87 stars 16.68 score 7.7k scripts 99 dependents

mjuraska

sievePH:Sieve Analysis Methods for Proportional Hazards Models

Implements a suite of semiparametric and nonparametric kernel-smoothed estimation and testing procedures for continuous mark-specific stratified hazard ratio (treatment/placebo) models in a randomized treatment efficacy trial with a time-to-event endpoint. Semiparametric methods, allowing multivariate marks, are described in Juraska M and Gilbert PB (2013), Mark-specific hazard ratio model with multivariate continuous marks: an application to vaccine efficacy. Biometrics 69(2):328-337 <doi:10.1111/biom.12016>, and in Juraska M and Gilbert PB (2016), Mark-specific hazard ratio model with missing multivariate marks. Lifetime Data Analysis 22(4):606-25 <doi:10.1007/s10985-015-9353-9>. Nonparametric kernel-smoothed methods, allowing univariate marks only, are described in Sun Y and Gilbert PB (2012), Estimation of stratified mark‐specific proportional hazards models with missing marks. Scandinavian Journal of Statistics}, 39(1):34-52 <doi:10.1111/j.1467-9469.2011.00746.x>, and in Gilbert PB and Sun Y (2015), Inferences on relative failure rates in stratified mark-specific proportional hazards models with missing marks, with application to human immunodeficiency virus vaccine efficacy trials. Journal of the Royal Statistical Society Series C: Applied Statistics, 64(1):49-73 <doi:10.1111/rssc.12067>. Both semiparametric and nonparametric approaches consider two scenarios: (1) the mark is fully observed in all subjects who experience the event of interest, and (2) the mark is subject to missingness-at-random in subjects who experience the event of interest. For models with missing marks, estimators are implemented based on (i) inverse probability weighting (IPW) of complete cases (for the semiparametric framework), and (ii) augmentation of the IPW estimating functions by leveraging correlations between the mark and auxiliary data to 'impute' the augmentation term for subjects with missing marks (for both the semiparametric and nonparametric framework). The augmented IPW estimators are doubly robust and recommended for use with incomplete mark data. The semiparametric methods make two key assumptions: (i) the time-to-event is assumed to be conditionally independent of the mark given treatment, and (ii) the weight function in the semiparametric density ratio/biased sampling model is assumed to be exponential. Diagnostic testing procedures for evaluating validity of both assumptions are implemented. Summary and plotting functions are provided for estimation and inferential results.

Maintained by Michal Juraska. Last updated 9 months ago.

openblas cpp openmp

7.0 match 4.04 score 11 scripts

hrbrmstr

ggalt:Extra Coordinate Systems, 'Geoms', Statistical Transformations, Scales and Fonts for 'ggplot2'

A compendium of new geometries, coordinate systems, statistical transformations, scales and fonts for 'ggplot2', including splines, 1d and 2d densities, univariate average shifted histograms, a new map coordinate system based on the 'PROJ.4'-library along with geom_cartogram() that mimics the original functionality of geom_map(), formatters for "bytes", a stat_stepribbon() function, increased 'plotly' compatibility and the 'StateFace' open source font 'ProPublica'. Further new functionality includes lollipop charts, dumbbell charts, the ability to encircle points and coordinate-system-based text annotations.

Maintained by Bob Rudis. Last updated 2 years ago.

geom ggplot-extension ggplot2 ggplot2-geom ggplot2-scales

2.2 match 674 stars 12.59 score 2.3k scripts 7 dependents

valentint

robust:Port of the S+ "Robust Library"

Methods for robust statistics, a state of the art in the early 2000s, notably for robust regression and robust multivariate analysis.

Maintained by Valentin Todorov. Last updated 7 months ago.

fortran openblas

3.7 match 7.52 score 572 scripts 8 dependents

denisrustand

INLAjoint:Multivariate Joint Modeling for Longitudinal and Time-to-Event Outcomes with 'INLA'

Estimation of joint models for multivariate longitudinal markers (with various distributions available) and survival outcomes (possibly accounting for competing risks) with Integrated Nested Laplace Approximations (INLA). The flexible and user friendly function joint() facilitates the use of the fast and reliable inference technique implemented in the 'INLA' package for joint modeling. More details are given in the help page of the joint() function (accessible via ?joint in the R console) and the vignette associated to the joint() function (accessible via vignette("INLAjoint") in the R console).

Maintained by Denis Rustand. Last updated 24 days ago.

3.8 match 19 stars 7.36 score 40 scripts

andrija-djurovic

PDtoolkit:Collection of Tools for PD Rating Model Development and Validation

The goal of this package is to cover the most common steps in probability of default (PD) rating model development and validation. The main procedures available are those that refer to univariate, bivariate, multivariate analysis, calibration and validation. Along with accompanied 'monobin' and 'monobinShiny' packages, 'PDtoolkit' provides functions which are suitable for different data transformation and modeling tasks such as: imputations, monotonic binning of numeric risk factors, binning of categorical risk factors, weights of evidence (WoE) and information value (IV) calculations, WoE coding (replacement of risk factors modalities with WoE values), risk factor clustering, area under curve (AUC) calculation and others. Additionally, package provides set of validation functions for testing homogeneity, heterogeneity, discriminatory and predictive power of the model.

Maintained by Andrija Djurovic. Last updated 1 years ago.

5.8 match 14 stars 4.78 score 86 scripts

termehs

netropy:Statistical Entropy Analysis of Network Data

Statistical entropy analysis of network data as introduced by Frank and Shafie (2016) <doi:10.1177/0759106315615511>, and a in textbook which is in progress.

Maintained by Termeh Shafie. Last updated 5 months ago.

4.4 match 12 stars 6.26 score 9 scripts

atsa-es

atsar:Stan Routines For Univariate And Multivariate Time Series

Bundles univariate and multivariate STAN scripts for FISH 507 class.

Maintained by Eric J. Ward. Last updated 9 months ago.

bayesian stan time-series cpp

4.8 match 48 stars 5.68 score 33 scripts

cran

ash:David Scott's ASH Routines

David Scott's ASH routines ported from S-PLUS to R.

Maintained by Albrecht Gebhardt. Last updated 10 years ago.

fortran

4.5 match 6.04 score 66 scripts 172 dependents

fbertran

plsRglm:Partial Least Squares Regression for Generalized Linear Models

Provides (weighted) Partial least squares Regression for generalized linear models and repeated k-fold cross-validation of such models using various criteria <arXiv:1810.01005>. It allows for missing data in the explanatory variables. Bootstrap confidence intervals constructions are also available.

Maintained by Frederic Bertrand. Last updated 2 years ago.

3.5 match 16 stars 7.75 score 103 scripts 5 dependents

techtonique

ahead:Time Series Forecasting with uncertainty quantification

Univariate and multivariate time series forecasting with uncertainty quantification.

Maintained by T. Moudiki. Last updated 27 days ago.

forecasting machine-learning predictive-modeling statistical-learning time-series time-series-forecasting uncertainty-quantification cpp

5.6 match 21 stars 4.77 score 51 scripts

r-forge

nor1mix:Normal aka Gaussian 1-d Mixture Models

Onedimensional Normal (i.e. Gaussian) Mixture Models (S3) Classes, for, e.g., density estimation or clustering algorithms research and teaching; providing the widely used Marron-Wand densities. Efficient random number generation and graphics. Fitting to data by efficient ML (Maximum Likelihood) or traditional EM estimation.

Maintained by Martin Maechler. Last updated 3 months ago.

3.6 match 7.25 score 86 scripts 44 dependents

cran

evd:Functions for Extreme Value Distributions

Extends simulation, distribution, quantile and density functions to univariate and multivariate parametric extreme value distributions, and provides fitting functions which calculate maximum likelihood estimates for univariate and bivariate maxima models, and for univariate and bivariate threshold models.

Maintained by Alec Stephenson. Last updated 6 months ago.

2.8 match 2 stars 9.46 score 748 scripts 82 dependents

ls-git-17

fdANOVA:Analysis of Variance for Univariate and Multivariate Functional Data

Performs analysis of variance testing procedures for univariate and multivariate functional data (Cuesta-Albertos and Febrero-Bande (2010) <doi:10.1007/s11749-010-0185-3>, Gorecki and Smaga (2015) <doi:10.1007/s00180-015-0555-0>, Gorecki and Smaga (2017) <doi:10.1080/02664763.2016.1247791>, Zhang et al. (2018) <doi:10.1016/j.csda.2018.05.004>).

Maintained by Lukasz Smaga. Last updated 7 years ago.

8.1 match 3 stars 3.23 score 28 scripts

eikeluedeling

decisionSupport:Quantitative Support of Decision Making under Uncertainty

Supporting the quantitative analysis of binary welfare based decision making processes using Monte Carlo simulations. Decision support is given on two levels: (i) The actual decision level is to choose between two alternatives under probabilistic uncertainty. This package calculates the optimal decision based on maximizing expected welfare. (ii) The meta decision level is to allocate resources to reduce the uncertainty in the underlying decision problem, i.e to increase the current information to improve the actual decision making process. This problem is dealt with using the Value of Information Analysis. The Expected Value of Information for arbitrary prospective estimates can be calculated as well as Individual Expected Value of Perfect Information. The probabilistic calculations are done via Monte Carlo simulations. This Monte Carlo functionality can be used on its own.

Maintained by Eike Luedeling. Last updated 11 months ago.

5.0 match 6 stars 5.17 score 123 scripts

bioc

limma:Linear Models for Microarray and Omics Data

Data analysis, linear models and differential expression for omics data.

Maintained by Gordon Smyth. Last updated 5 days ago.

exonarray geneexpression transcription alternativesplicing differentialexpression differentialsplicing genesetenrichment dataimport bayesian clustering regression timecourse microarray micrornaarray mrnamicroarray onechannel proprietaryplatforms twochannel sequencing rnaseq batcheffect multiplecomparison normalization preprocessing qualitycontrol biomedicalinformatics cellbiology cheminformatics epigenetics functionalgenomics genetics immunooncology metabolomics proteomics systemsbiology transcriptomics

1.9 match 13.81 score 16k scripts 585 dependents

r-forge

distrEx:Extensions of Package 'distr'

Extends package 'distr' by functionals, distances, and conditional distributions.

Maintained by Matthias Kohl. Last updated 2 months ago.

3.9 match 6.68 score 107 scripts 17 dependents

ajsims1704

rdecision:Decision Analytic Modelling in Health Economics

Classes and functions for modelling health care interventions using decision trees and semi-Markov models. Mechanisms are provided for associating an uncertainty distribution with each source variable and for ensuring transparency of the mathematical relationships between variables. The package terminology follows Briggs "Decision Modelling for Health Economic Evaluation" (2006, ISBN:978-0-19-852662-9).

Maintained by Andrew Sims. Last updated 1 months ago.

4.0 match 3 stars 6.46 score 22 scripts

allegropiano

GLDEX:Fitting Single and Mixture of Generalised Lambda Distributions

The fitting algorithms considered in this package have two major objectives. One is to provide a smoothing device to fit distributions to data using the weight and unweighted discretised approach based on the bin width of the histogram. The other is to provide a definitive fit to the data set using the maximum likelihood and quantile matching estimation. Other methods such as moment matching, starship method, L moment matching are also provided. Diagnostics on goodness of fit can be done via qqplots, KS-resample tests and comparing mean, variance, skewness and kurtosis of the data with the fitted distribution. References include the following: Karvanen and Nuutinen (2008) "Characterizing the generalized lambda distribution by L-moments" <doi:10.1016/j.csda.2007.06.021>, King and MacGillivray (1999) "A starship method for fitting the generalised lambda distributions" <doi:10.1111/1467-842X.00089>, Su (2005) "A Discretized Approach to Flexibly Fit Generalized Lambda Distributions to Data" <doi:10.22237/jmasm/1130803560>, Su (2007) "Nmerical Maximum Log Likelihood Estimation for Generalized Lambda Distributions" <doi:10.1016/j.csda.2006.06.008>, Su (2007) "Fitting Single and Mixture of Generalized Lambda Distributions to Data via Discretized and Maximum Likelihood Methods: GLDEX in R" <doi:10.18637/jss.v021.i09>, Su (2009) "Confidence Intervals for Quantiles Using Generalized Lambda Distributions" <doi:10.1016/j.csda.2009.02.014>, Su (2010) "Chapter 14: Fitting GLDs and Mixture of GLDs to Data using Quantile Matching Method" <doi:10.1201/b10159>, Su (2010) "Chapter 15: Fitting GLD to data using GLDEX 1.0.4 in R" <doi:10.1201/b10159>, Su (2015) "Flexible Parametric Quantile Regression Model" <doi:10.1007/s11222-014-9457-1>, Su (2021) "Flexible parametric accelerated failure time model"<doi:10.1080/10543406.2021.1934854>.

Maintained by Steve Su. Last updated 2 years ago.

8.5 match 3.05 score 93 scripts 2 dependents

asmahani

MfUSampler:Multivariate-from-Univariate (MfU) MCMC Sampler

Convenience functions for multivariate MCMC using univariate samplers including: slice sampler with stepout and shrinkage (Neal (2003) <DOI:10.1214/aos/1056562461>), adaptive rejection sampler (Gilks and Wild (1992) <DOI:10.2307/2347565>), adaptive rejection Metropolis (Gilks et al (1995) <DOI:10.2307/2986138>), and univariate Metropolis with Gaussian proposal.

Maintained by Alireza S. Mahani. Last updated 2 years ago.

8.3 match 3.08 score 20 scripts 2 dependents

kleest0

SemiCompRisks:Hierarchical Models for Parametric and Semi-Parametric Analyses of Semi-Competing Risks Data

Hierarchical multistate models are considered to perform the analysis of independent/clustered semi-competing risks data. The package allows to choose the specification for model components from a range of options giving users substantial flexibility, including: accelerated failure time or proportional hazards regression models; parametric or non-parametric specifications for baseline survival functions and cluster-specific random effects distribution; a Markov or semi-Markov specification for terminal event following non-terminal event. While estimation is mainly performed within the Bayesian paradigm, the package also provides the maximum likelihood estimation approach for several parametric models. The package also includes functions for univariate survival analysis as complementary analysis tools.

Maintained by Kyu Ha Lee. Last updated 4 years ago.

gsl

10.6 match 2.41 score 26 scripts

jomulder

BFpack:Flexible Bayes Factor Testing of Scientific Expectations

Implementation of default Bayes factors for testing statistical hypotheses under various statistical models. The package is intended for applied quantitative researchers in the social and behavioral sciences, medical research, and related fields. The Bayes factor tests can be executed for statistical models such as univariate and multivariate normal linear models, correlation analysis, generalized linear models, special cases of linear mixed models, survival models, relational event models. Parameters that can be tested are location parameters (e.g., group means, regression coefficients), variances (e.g., group variances), and measures of association (e.g,. polychoric/polyserial/biserial/tetrachoric/product moments correlations), among others. The statistical underpinnings are described in O'Hagan (1995) <DOI:10.1111/j.2517-6161.1995.tb02017.x>, De Santis and Spezzaferri (2001) <DOI:10.1016/S0378-3758(00)00240-8>, Mulder and Xin (2022) <DOI:10.1080/00273171.2021.1904809>, Mulder and Gelissen (2019) <DOI:10.1080/02664763.2021.1992360>, Mulder (2016) <DOI:10.1016/j.jmp.2014.09.004>, Mulder and Fox (2019) <DOI:10.1214/18-BA1115>, Mulder and Fox (2013) <DOI:10.1007/s11222-011-9295-3>, Boeing-Messing, van Assen, Hofman, Hoijtink, and Mulder (2017) <DOI:10.1037/met0000116>, Hoijtink, Mulder, van Lissa, and Gu (2018) <DOI:10.1037/met0000201>, Gu, Mulder, and Hoijtink (2018) <DOI:10.1111/bmsp.12110>, Hoijtink, Gu, and Mulder (2018) <DOI:10.1111/bmsp.12145>, and Hoijtink, Gu, Mulder, and Rosseel (2018) <DOI:10.1037/met0000187>. When using the packages, please refer to the package Mulder et al. (2021) <DOI:10.18637/jss.v100.i18> and the relevant methodological papers.

Maintained by Joris Mulder. Last updated 1 months ago.

fortran openblas

3.1 match 15 stars 8.24 score 55 scripts 3 dependents

mrcieu

TwoSampleMR:Two Sample MR Functions and Interface to MRC Integrative Epidemiology Unit OpenGWAS Database

A package for performing Mendelian randomization using GWAS summary data. It uses the IEU OpenGWAS database <https://gwas.mrcieu.ac.uk/> to automatically obtain data, and a wide range of methods to run the analysis.

Maintained by Gibran Hemani. Last updated 10 days ago.

2.3 match 467 stars 11.23 score 1.7k scripts 1 dependents

brad-cannell

freqtables:Make Quick Descriptive Tables for Categorical Variables

Quickly make tables of descriptive statistics (i.e., counts, percentages, confidence intervals) for categorical variables. This package is designed to work in a Tidyverse pipeline, and consideration has been given to get results from R to Microsoft Word ® with minimal pain.

Maintained by Brad Cannell. Last updated 1 years ago.

categorical-data data-analysis descriptive-statistics epidemiology

4.2 match 12 stars 6.00 score 84 scripts

cran

Compositional:Compositional Data Analysis

Regression, classification, contour plots, hypothesis testing and fitting of distributions for compositional data are some of the functions included. We further include functions for percentages (or proportions). The standard textbook for such data is John Aitchison's (1986) "The statistical analysis of compositional data". Relevant papers include: a) Tsagris M.T., Preston S. and Wood A.T.A. (2011). "A data-based power transformation for compositional data". Fourth International International Workshop on Compositional Data Analysis. <doi:10.48550/arXiv.1106.1451> b) Tsagris M. (2014). "The k-NN algorithm for compositional data: a revised approach with and without zero values present". Journal of Data Science, 12(3): 519--534. <doi:10.6339/JDS.201407_12(3).0008>. c) Tsagris M. (2015). "A novel, divergence based, regression for compositional data". Proceedings of the 28th Panhellenic Statistics Conference, 15-18 April 2015, Athens, Greece, 430--444. <doi:10.48550/arXiv.1511.07600>. d) Tsagris M. (2015). "Regression analysis with compositional data containing zero values". Chilean Journal of Statistics, 6(2): 47--57. <https://soche.cl/chjs/volumes/06/02/Tsagris(2015).pdf>. e) Tsagris M., Preston S. and Wood A.T.A. (2016). "Improved supervised classification for compositional data using the alpha-transformation". Journal of Classification, 33(2): 243--261. <doi:10.1007/s00357-016-9207-5>. f) Tsagris M., Preston S. and Wood A.T.A. (2017). "Nonparametric hypothesis testing for equality of means on the simplex". Journal of Statistical Computation and Simulation, 87(2): 406--422. <doi:10.1080/00949655.2016.1216554>. g) Tsagris M. and Stewart C. (2018). "A Dirichlet regression model for compositional data with zeros". Lobachevskii Journal of Mathematics, 39(3): 398--412. <doi:10.1134/S1995080218030198>. h) Alenazi A. (2019). "Regression for compositional data with compositional data as predictor variables with or without zero values". Journal of Data Science, 17(1): 219--238. <doi:10.6339/JDS.201901_17(1).0010>. i) Tsagris M. and Stewart C. (2020). "A folded model for compositional data analysis". Australian and New Zealand Journal of Statistics, 62(2): 249--277. <doi:10.1111/anzs.12289>. j) Alenazi A.A. (2022). "f-divergence regression models for compositional data". Pakistan Journal of Statistics and Operation Research, 18(4): 867--882. <doi:10.18187/pjsor.v18i4.3969>. k) Tsagris M. and Stewart C. (2022). "A Review of Flexible Transformations for Modeling Compositional Data". In Advances and Innovations in Statistics and Data Science, pp. 225--234. <doi:10.1007/978-3-031-08329-7_10>. l) Alenazi A. (2023). "A review of compositional data analysis and recent advances". Communications in Statistics--Theory and Methods, 52(16): 5535--5567. <doi:10.1080/03610926.2021.2014890>. m) Tsagris M., Alenazi A. and Stewart C. (2023). "Flexible non-parametric regression models for compositional response data with zeros". Statistics and Computing, 33(106). <doi:10.1007/s11222-023-10277-5>. n) Tsagris. M. (2025). "Constrained least squares simplicial-simplicial regression". Statistics and Computing, 35(27). <doi:10.1007/s11222-024-10560-z>. o) Sevinc V. and Tsagris. M. (2024). "Energy Based Equality of Distributions Testing for Compositional Data". <doi:10.48550/arXiv.2412.05199>.

Maintained by Michail Tsagris. Last updated 2 months ago.

6.9 match 3 stars 3.64 score 4 dependents

mayer79

missRanger:Fast Imputation of Missing Values

Alternative implementation of the beautiful 'MissForest' algorithm used to impute mixed-type data sets by chaining random forests, introduced by Stekhoven, D.J. and Buehlmann, P. (2012) <doi:10.1093/bioinformatics/btr597>. Under the hood, it uses the lightning fast random forest package 'ranger'. Between the iterative model fitting, we offer the option of using predictive mean matching. This firstly avoids imputation with values not already present in the original data (like a value 0.3334 in 0-1 coded variable). Secondly, predictive mean matching tries to raise the variance in the resulting conditional distributions to a realistic level. This would allow, e.g., to do multiple imputation when repeating the call to missRanger(). Out-of-sample application is supported as well.

Maintained by Michael Mayer. Last updated 3 months ago.

imputation machine-learning missing-values random-forest

2.3 match 69 stars 11.07 score 208 scripts 6 dependents

joon-e

tidycomm:Data Modification and Analysis for Communication Research

Provides convenience functions for common data modification and analysis tasks in communication research. This includes functions for univariate and bivariate data analysis, index generation and reliability computation, and intercoder reliability tests. All functions follow the style and syntax of the tidyverse, and are construed to perform their computations on multiple variables at once. Functions for univariate and bivariate data analysis comprise summary statistics for continuous and categorical variables, as well as several tests of bivariate association including effect sizes. Functions for data modification comprise index generation and automated reliability analysis of index variables. Functions for intercoder reliability comprise tests of several intercoder reliability estimates, including simple and mean pairwise percent agreement, Krippendorff's Alpha (Krippendorff 2004, ISBN: 9780761915454), and various Kappa coefficients (Brennan & Prediger 1981 <doi: 10.1177/001316448104100307>; Cohen 1960 <doi: 10.1177/001316446002000104>; Fleiss 1971 <doi: 10.1037/h0031619>).

Maintained by Julian Unkel. Last updated 10 months ago.

3.8 match 15 stars 6.59 score 52 scripts

prdm0

AcceptReject:Acceptance-Rejection Method for Generating Pseudo-Random Observations

Provides a function that implements the acceptance-rejection method in an optimized manner to generate pseudo-random observations for discrete or continuous random variables. Proposed by von Neumann J. (1951), <https://mcnp.lanl.gov/pdf_files/>, the function is optimized to work in parallel on Unix-based operating systems and performs well on Windows systems. The acceptance-rejection method implemented optimizes the probability of generating observations from the desired random variable, by simply providing the probability function or probability density function, in the discrete and continuous cases, respectively. Implementation is based on references CASELLA, George at al. (2004) <https://www.jstor.org/stable/4356322>, NEAL, Radford M. (2003) <https://www.jstor.org/stable/3448413> and Bishop, Christopher M. (2006, ISBN: 978-0387310732).

Maintained by Pedro Rafael D. Marinho. Last updated 10 months ago.

monte-carlo monte-carlo-simulation rejection-sampling statistics-library cpp

4.7 match 2 stars 5.30 score 7 scripts

hwborchers

pracma:Practical Numerical Math Functions

Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses 'MATLAB' function names where appropriate to simplify porting.

Maintained by Hans W. Borchers. Last updated 1 years ago.

2.0 match 29 stars 12.34 score 6.6k scripts 931 dependents

saviviro

uGMAR:Estimate Univariate Gaussian and Student's t Mixture Autoregressive Models

Maximum likelihood estimation of univariate Gaussian Mixture Autoregressive (GMAR), Student's t Mixture Autoregressive (StMAR), and Gaussian and Student's t Mixture Autoregressive (G-StMAR) models, quantile residual tests, graphical diagnostics, forecast and simulate from GMAR, StMAR and G-StMAR processes. Leena Kalliovirta, Mika Meitz, Pentti Saikkonen (2015) <doi:10.1111/jtsa.12108>, Mika Meitz, Daniel Preve, Pentti Saikkonen (2023) <doi:10.1080/03610926.2021.1916531>, Savi Virolainen (2022) <doi:10.1515/snde-2020-0060>.

Maintained by Savi Virolainen. Last updated 2 months ago.

5.0 match 1 stars 4.88 score 51 scripts

stochastictree

stochtree:Stochastic Tree Ensembles (XBART and BART) for Supervised Learning and Causal Inference

Flexible stochastic tree ensemble software. Robust implementations of Bayesian Additive Regression Trees (BART) Chipman, George, McCulloch (2010) <doi:10.1214/09-AOAS285> for supervised learning and Bayesian Causal Forests (BCF) Hahn, Murray, Carvalho (2020) <doi:10.1214/19-BA1195> for causal inference. Enables model serialization and parallel sampling and provides a low-level interface for custom stochastic forest samplers.

Maintained by Drew Herren. Last updated 17 days ago.

bart bayesian-machine-learning bayesian-methods decision-trees gradient-boosted-trees machine-learning probabilistic-models tree-ensembles cpp

2.8 match 20 stars 8.52 score 40 scripts

jenfb

bkmr:Bayesian Kernel Machine Regression

Implementation of a statistical approach for estimating the joint health effects of multiple concurrent exposures, as described in Bobb et al (2015) <doi:10.1093/biostatistics/kxu058>.

Maintained by Jennifer F. Bobb. Last updated 4 months ago.

3.4 match 55 stars 7.03 score 182 scripts 1 dependents

cran

MLE:Maximum Likelihood Estimation of Various Univariate and Multivariate Distributions

Several functions for maximum likelihood estimation of various univariate and multivariate distributions. The list includes more than 100 functions for univariate continuous and discrete distributions, distributions that lie on the real line, the positive line, interval restricted, circular distributions. Further, multivariate continuous and discrete distributions, distributions for compositional and directional data, etc. Some references include Johnson N. L., Kotz S. and Balakrishnan N. (1994). "Continuous Univariate Distributions, Volume 1" <ISBN:978-0-471-58495-7>, Johnson, Norman L. Kemp, Adrianne W. Kotz, Samuel (2005). "Univariate Discrete Distributions". <ISBN:978-0-471-71580-1> and Mardia, K. V. and Jupp, P. E. (2000). "Directional Statistics". <ISBN:978-0-471-95333-3>.

Maintained by Michail Tsagris. Last updated 26 days ago.

14.1 match 1.70 score 2 scripts

almutveraart

trawl:Estimation and Simulation of Trawl Processes

Contains R functions for simulating and estimating integer-valued trawl processes as described in the article Veraart (2019),"Modeling, simulation and inference for multivariate time series of counts using trawl processes", Journal of Multivariate Analysis, 169, pages 110-129, <doi:10.1016/j.jmva.2018.08.012> and for simulating random vectors from the bivariate negative binomial and the bi- and trivariate logarithmic series distributions.

Maintained by Almut E. D. Veraart. Last updated 4 years ago.

8.5 match 2.81 score 32 scripts

config-i1

smooth:Forecasting Using State Space Models

Functions implementing Single Source of Error state space models for purposes of time series analysis and forecasting. The package includes ADAM (Svetunkov, 2023, <https://openforecast.org/adam/>), Exponential Smoothing (Hyndman et al., 2008, <doi: 10.1007/978-3-540-71918-2>), SARIMA (Svetunkov & Boylan, 2019 <doi: 10.1080/00207543.2019.1600764>), Complex Exponential Smoothing (Svetunkov & Kourentzes, 2018, <doi: 10.13140/RG.2.2.24986.29123>), Simple Moving Average (Svetunkov & Petropoulos, 2018 <doi: 10.1080/00207543.2017.1380326>) and several simulation functions. It also allows dealing with intermittent demand based on the iETS framework (Svetunkov & Boylan, 2019, <doi: 10.13140/RG.2.2.35897.06242>).

Maintained by Ivan Svetunkov. Last updated 1 days ago.

arima arima-forecasting ces ets exponential-smoothing forecast state-space time-series openblas cpp

2.0 match 90 stars 11.87 score 412 scripts 25 dependents

daya6489

SmartEDA:Summarize and Explore the Data

Exploratory analysis on any input data describing the structure and the relationships present in the data. The package automatically select the variable and does related descriptive statistics. Analyzing information value, weight of evidence, custom tables, summary statistics, graphical techniques will be performed for both numeric and categorical predictors.

Maintained by Dayanand Ubrangala. Last updated 1 years ago.

analysis exploratory-data-analysis

3.3 match 42 stars 7.25 score 214 scripts

unuran

Runuran:R Interface to the 'UNU.RAN' Random Variate Generators

Interface to the 'UNU.RAN' library for Universal Non-Uniform RANdom variate generators. Thus it allows to build non-uniform random number generators from quite arbitrary distributions. In particular, it provides an algorithm for fast numerical inversion for distribution with given density function. In addition, the package contains densities, distribution functions and quantiles from a couple of distributions.

Maintained by Josef Leydold. Last updated 5 months ago.

3.4 match 6.87 score 180 scripts 8 dependents

cran

stratifyR:Optimal Stratification of Univariate Populations

The stratification of univariate populations under stratified sampling designs is implemented according to Khan et al. (2002) <doi:10.1177/0008068320020518> and Khan et al. (2015) <doi:10.1080/02664763.2015.1018674> in this library. It determines the Optimum Strata Boundaries (OSB) and Optimum Sample Sizes (OSS) for the study variable, y, using the best-fit frequency distribution of a survey variable (if data is available) or a hypothetical distribution (if data is not available). The method formulates the problem of determining the OSB as mathematical programming problem which is solved by using a dynamic programming technique. If a dataset of the population is available to the surveyor, the method estimates its best-fit distribution and determines the OSB and OSS under Neyman allocation directly. When the dataset is not available, stratification is made based on the assumption that the values of the study variable, y, are available as hypothetical realizations of proxy values of y from recent surveys. Thus, it requires certain distributional assumptions about the study variable. At present, it handles stratification for the populations where the study variable follows a continuous distribution, namely, Pareto, Triangular, Right-triangular, Weibull, Gamma, Exponential, Uniform, Normal, Log-normal and Cauchy distributions.

Maintained by Karuna G. Reddy. Last updated 21 days ago.

10.0 match 2.34 score 22 scripts

cran

SNSeg:Self-Normalization(SN) Based Change-Point Estimation for Time Series

Implementations self-normalization (SN) based algorithms for change-points estimation in time series data. This comprises nested local-window algorithms for detecting changes in both univariate and multivariate time series developed in Zhao, Jiang and Shao (2022) <doi:10.1111/rssb.12552>.

Maintained by Zifeng Zhao. Last updated 10 months ago.

cpp

10.2 match 1 stars 2.30 score

billvenables

PolynomF:Polynomials in R

Implements univariate polynomial operations in R, including polynomial arithmetic, finding zeros, plotting, and some operations on lists of polynomials.

Maintained by Bill Venables. Last updated 1 years ago.

cpp

5.1 match 4.54 score 50 scripts 14 dependents

melff

memisc:Management of Survey Data and Presentation of Analysis Results

An infrastructure for the management of survey data including value labels, definable missing values, recoding of variables, production of code books, and import of (subsets of) 'SPSS' and 'Stata' files is provided. Further, the package allows to produce tables and data frames of arbitrary descriptive statistics and (almost) publication-ready tables of regression model estimates, which can be exported to 'LaTeX' and HTML.

Maintained by Martin Elff. Last updated 11 days ago.

survey-data

1.9 match 46 stars 12.34 score 1.2k scripts 13 dependents

kkholst

lava:Latent Variable Models

A general implementation of Structural Equation Models with latent variables (MLE, 2SLS, and composite likelihood estimators) with both continuous, censored, and ordinal outcomes (Holst and Budtz-Joergensen (2013) <doi:10.1007/s00180-012-0344-y>). Mixture latent variable models and non-linear latent variable models (Holst and Budtz-Joergensen (2020) <doi:10.1093/biostatistics/kxy082>). The package also provides methods for graph exploration (d-separation, back-door criterion), simulation of general non-linear latent variable models, and estimation of influence functions for a broad range of statistical models.

Maintained by Klaus K. Holst. Last updated 2 months ago.

latent-variable-models simulation statistics structural-equation-models

1.8 match 33 stars 12.85 score 610 scripts 476 dependents

gregorkastner

stochvol:Efficient Bayesian Inference for Stochastic Volatility (SV) Models

Efficient algorithms for fully Bayesian estimation of stochastic volatility (SV) models with and without asymmetry (leverage) via Markov chain Monte Carlo (MCMC) methods. Methodological details are given in Kastner and Frühwirth-Schnatter (2014) <doi:10.1016/j.csda.2013.01.002> and Hosszejni and Kastner (2019) <doi:10.1007/978-3-030-30611-3_8>; the most common use cases are described in Hosszejni and Kastner (2021) <doi:10.18637/jss.v100.i12> and Kastner (2016) <doi:10.18637/jss.v069.i05> and the package examples.

Maintained by Darjus Hosszejni. Last updated 5 months ago.

openblas cpp

2.8 match 15 stars 8.16 score 90 scripts 8 dependents

pokotylo

ddalpha:Depth-Based Classification and Calculation of Data Depth

Contains procedures for depth-based supervised learning, which are entirely non-parametric, in particular the DDalpha-procedure (Lange, Mosler and Mozharovskyi, 2014 <doi:10.1007/s00362-012-0488-4>). The training data sample is transformed by a statistical depth function to a compact low-dimensional space, where the final classification is done. It also offers an extension to functional data and routines for calculating certain notions of statistical depth functions. 50 multivariate and 5 functional classification problems are included. (Pokotylo, Mozharovskyi and Dyckerhoff, 2019 <doi:10.18637/jss.v091.i05>).

Maintained by Oleksii Pokotylo. Last updated 6 months ago.

fortran cpp

5.2 match 2 stars 4.40 score 211 scripts 7 dependents

hotneim

lg:Locally Gaussian Distributions: Estimation and Methods

An implementation of locally Gaussian distributions. It provides methods for implementing locally Gaussian multivariate density estimation, conditional density estimation, various independence tests for iid and time series data, a test for conditional independence and a test for financial contagion.

Maintained by Håkon Otneim. Last updated 5 years ago.

5.5 match 4 stars 4.18 score 25 scripts

revelle

psych:Procedures for Psychological, Psychometric, and Personality Research

A general purpose toolbox developed originally for personality, psychometric theory and experimental psychology. Functions are primarily for multivariate analysis and scale construction using factor analysis, principal component analysis, cluster analysis and reliability analysis, although others provide basic descriptive statistics. Item Response Theory is done using factor analysis of tetrachoric and polychoric correlations. Functions for analyzing data at multiple levels include within and between group statistics, including correlations and factor analysis. Validation and cross validation of scales developed using basic machine learning algorithms are provided, as are functions for simulating and testing particular item and test structures. Several functions serve as a useful front end for structural equation modeling. Graphical displays of path diagrams, including mediation models, factor analysis and structural equation models are created using basic graphics. Some of the functions are written to support a book on psychometric theory as well as publications in personality research. For more information, see the <https://personality-project.org/r/> web page.

Maintained by William Revelle. Last updated 3 months ago.

1.6 match 52 stars 13.94 score 29k scripts 317 dependents

stocnet

migraph:Univariate and Multivariate Tests for Multimodal and Other Networks

A set of tools for testing networks. It includes functions for univariate and multivariate conditional uniform graph and quadratic assignment procedure testing, and network regression. The package is a complement to 'Multimodal Political Networks' (2021, ISBN:9781108985000), and includes various datasets used in the book. Built on the 'manynet' package, all functions operate with matrices, edge lists, and 'igraph', 'network', and 'tidygraph' objects, and on one-mode and two-mode (bipartite) networks.

Maintained by James Hollway. Last updated 4 months ago.

igraph multilevel-networks multimodal-network network-analysis sna

3.5 match 40 stars 6.49 score 33 scripts

bioc

netresponse:Functional Network Analysis

Algorithms for functional network analysis. Includes an implementation of a variational Dirichlet process Gaussian mixture model for nonparametric mixture modeling.

Maintained by Leo Lahti. Last updated 5 months ago.

cellbiology clustering geneexpression genetics network graphandnetwork differentialexpression microarray networkinference transcription

4.0 match 3 stars 5.64 score 21 scripts

luyouepiusf

DySS:Dynamic Screening Systems

In practice, we will encounter problems where the longitudinal performance of processes needs to be monitored over time. Dynamic screening systems (DySS) are methods that aim to identify and give signals to processes with poor performance as early as possible. This package is designed to implement dynamic screening systems and the related methods. References: Qiu, P. and Xiang, D. (2014) <doi:10.1080/00401706.2013.822423>; Qiu, P. and Xiang, D. (2015) <doi:10.1002/sim.6477>; Li, J. and Qiu, P. (2016) <doi:10.1080/0740817X.2016.1146423>; Li, J. and Qiu, P. (2017) <doi:10.1002/qre.2160>; You, L. and Qiu, P. (2019) <doi:10.1080/00949655.2018.1552273>; Qiu, P., Xia, Z., and You, L. (2020) <doi:10.1080/00401706.2019.1604434>; You, L., Qiu, A., Huang, B., and Qiu, P. (2020) <doi:10.1002/bimj.201900127>; You, L. and Qiu, P. (2021) <doi:10.1080/00224065.2020.1767006>.

Maintained by Lu You. Last updated 3 years ago.

fortran openblas cpp

11.2 match 2.00 score 6 scripts

hectorrdb

Ecume:Equality of 2 (or k) Continuous Univariate and Multivariate Distributions

We implement (or re-implements in R) a variety of statistical tools. They are focused on non-parametric two-sample (or k-sample) distribution comparisons in the univariate or multivariate case. See the vignette for more info.

Maintained by Hector Roux de Bezieux. Last updated 10 months ago.

software infrastructure

4.6 match 1 stars 4.86 score 16 scripts 3 dependents

danheck

RRreg:Correlation and Regression Analyses for Randomized Response Data

Univariate and multivariate methods to analyze randomized response (RR) survey designs (e.g., Warner, S. L. (1965). Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association, 60, 63–69, <doi:10.2307/2283137>). Besides univariate estimates of true proportions, RR variables can be used for correlations, as dependent variable in a logistic regression (with or without random effects), or as predictors in a linear regression (Heck, D. W., & Moshagen, M. (2018). RRreg: An R package for correlation and regression analyses of randomized response data. Journal of Statistical Software, 85(2), 1–29, <doi:10.18637/jss.v085.i02>). For simulations and the estimation of statistical power, RR data can be generated according to several models. The implemented methods also allow to test the link between continuous covariates and dishonesty in cheating paradigms such as the coin-toss or dice-roll task (Moshagen, M., & Hilbig, B. E. (2017). The statistical analysis of cheating paradigms. Behavior Research Methods, 49, 724–732, <doi:10.3758/s13428-016-0729-x>).

Maintained by Daniel W. Heck. Last updated 2 years ago.

4.1 match 3 stars 5.46 score 48 scripts

markvanderloo

extremevalues:Univariate Outlier Detection

Detect outliers in one-dimensional data.

Maintained by Mark van der Loo. Last updated 3 months ago.

6.3 match 3.54 score 29 scripts 2 dependents

mlverse

torch:Tensors and Neural Networks with 'GPU' Acceleration

Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.

Maintained by Daniel Falbel. Last updated 5 days ago.

autograd deep-learning torch cpp

1.3 match 520 stars 16.52 score 1.4k scripts 38 dependents

ovvo-financial

NNS:Nonlinear Nonparametric Statistics

Nonlinear nonparametric statistics using partial moments. Partial moments are the elements of variance and asymptotically approximate the area of f(x). These robust statistics provide the basis for nonlinear analysis while retaining linear equivalences. NNS offers: Numerical integration, Numerical differentiation, Clustering, Correlation, Dependence, Causal analysis, ANOVA, Regression, Classification, Seasonality, Autoregressive modeling, Normalization, Stochastic dominance and Advanced Monte Carlo sampling. All routines based on: Viole, F. and Nawrocki, D. (2013), Nonlinear Nonparametric Statistics: Using Partial Moments (ISBN: 1490523995).

Maintained by Fred Viole. Last updated 5 days ago.

clustering econometrics machine-learning nonlinear nonparametric partial-moments statistics time-series cpp

2.0 match 71 stars 10.96 score 66 scripts 3 dependents

loremerdrignac

EpiStats:Tools for Epidemiologists

Provides set of functions aimed at epidemiologists. The package includes commands for measures of association and impact for case control studies and cohort studies. It may be particularly useful for outbreak investigations including univariable analysis and stratified analysis. The functions for cohort studies include the CS(), CSTable() and CSInter() commands. The functions for case control studies include the CC(), CCTable() and CCInter() commands. References - Cornfield, J. 1956. A statistical problem arising from retrospective studies. In Vol. 4 of Proceedings of the Third Berkeley Symposium, ed. J. Neyman, 135-148. Berkeley, CA - University of California Press. Woolf, B. 1955. On estimating the relation between blood group disease. Annals of Human Genetics 19 251-253. Reprinted in Evolution of Epidemiologic Ideas Annotated Readings on Concepts and Methods, ed. S. Greenland, pp. 108-110. Newton Lower Falls, MA Epidemiology Resources. Gilles Desve & Peter Makary, 2007. 'CSTABLE Stata module to calculate summary table for cohort study' Statistical Software Components S456879, Boston College Department of Economics. Gilles Desve & Peter Makary, 2007. 'CCTABLE Stata module to calculate summary table for case-control study' Statistical Software Components S456878, Boston College Department of Economics.

Maintained by Lore Merdrignac. Last updated 1 years ago.

7.5 match 2.91 score 82 scripts

rjdverse

rjd3sts:State Space Framework and Structural Time Series with 'JDemetra+ 3.x'

R Interface to 'JDemetra+ 3.x' (<https://github.com/jdemetra>) time series analysis software. It offers access to several functions on state space models and structural time series.

Maintained by Jean Palate. Last updated 8 months ago.

openjdk

3.3 match 2 stars 6.64 score 25 scripts 4 dependents

jcatwood

VeccTMVN:Multivariate Normal Probabilities using Vecchia Approximation

Under a different representation of the multivariate normal (MVN) probability, we can use the Vecchia approximation to sample the integrand at a linear complexity with respect to n. Additionally, both the SOV algorithm from Genz (92) and the exponential-tilting method from Botev (2017) can be adapted to linear complexity. The reference for the method implemented in this package is Jian Cao and Matthias Katzfuss (2024) "Linear-Cost Vecchia Approximation of Multivariate Normal Probabilities" <doi:10.48550/arXiv.2311.09426>. Two major references for the development of our method are Alan Genz (1992) "Numerical Computation of Multivariate Normal Probabilities" <doi:10.1080/10618600.1992.10477010> and Z. I. Botev (2017) "The Normal Law Under Linear Restrictions: Simulation and Estimation via Minimax Tilting" <doi:10.48550/arXiv.1603.04166>.

Maintained by Jian Cao. Last updated 4 months ago.

normal-distribution sampling-methods statistics fortran openblas cpp openmp

6.1 match 2 stars 3.56 score 36 scripts

gshs-ornl

revengc:Reverse Engineering Summarized Data

Decoupled (e.g. separate averages) and censored (e.g. > 100 species) variables are continually reported by many well-established organizations (e.g. World Health Organization (WHO), Centers for Disease Control and Prevention (CDC), World Bank, and various national censuses). The challenge therefore is to infer what the original data could have been given summarized information. We present an R package that reverse engineers decoupled and/or censored count data with two main functions. The cnbinom.pars() function estimates the average and dispersion parameter of a censored univariate frequency table. The rec() function reverse engineers summarized data into an uncensored bivariate table of probabilities.

Maintained by Samantha Duchscherer. Last updated 6 years ago.

6.3 match 5 stars 3.44 score 11 scripts

tim-tu

weibulltools:Statistical Methods for Life Data Analysis

Provides statistical methods and visualizations that are often used in reliability engineering. Comprises a compact and easily accessible set of methods and visualization tools that make the examination and adjustment as well as the analysis and interpretation of field data (and bench tests) as simple as possible. Non-parametric estimators like Median Ranks, Kaplan-Meier (Abernethy, 2006, <ISBN:978-0-9653062-3-2>), Johnson (Johnson, 1964, <ISBN:978-0444403223>), and Nelson-Aalen for failure probability estimation within samples that contain failures as well as censored data are included. The package supports methods like Maximum Likelihood and Rank Regression, (Genschel and Meeker, 2010, <DOI:10.1080/08982112.2010.503447>) for the estimation of multiple parametric lifetime distributions, as well as the computation of confidence intervals of quantiles and probabilities using the delta method related to Fisher's confidence intervals (Meeker and Escobar, 1998, <ISBN:9780471673279>) and the beta-binomial confidence bounds. If desired, mixture model analysis can be done with segmented regression and the EM algorithm. Besides the well-known Weibull analysis, the package also contains Monte Carlo methods for the correction and completion of imprecisely recorded or unknown lifetime characteristics. (Verband der Automobilindustrie e.V. (VDA), 2016, <ISSN:0943-9412>). Plots are created statically ('ggplot2') or interactively ('plotly') and can be customized with functions of the respective visualization package. The graphical technique of probability plotting as well as the addition of regression lines and confidence bounds to existing plots are supported.

Maintained by Tim-Gunnar Hensel. Last updated 2 years ago.

field-data-analysis interactive-visualizations plotly reliability-analysis weibull-analysis weibulltools openblas cpp

3.5 match 13 stars 6.15 score 54 scripts

jamesramsay5

fda:Functional Data Analysis

These functions were developed to support functional data analysis as described in Ramsay, J. O. and Silverman, B. W. (2005) Functional Data Analysis. New York: Springer and in Ramsay, J. O., Hooker, Giles, and Graves, Spencer (2009). Functional Data Analysis with R and Matlab (Springer). The package includes data sets and script files working many examples including all but one of the 76 figures in this latter book. Matlab versions are available by ftp from <https://www.psych.mcgill.ca/misc/fda/downloads/FDAfuns/>.

Maintained by James Ramsay. Last updated 4 months ago.

1.8 match 3 stars 12.29 score 2.0k scripts 143 dependents

stephens999

etrunct:Computes Moments of Univariate Truncated t Distribution

Computes moments of univariate truncated t distribution. There is only one exported function, e_trunct(), which should be seen for details.

Maintained by Matthew Stephens. Last updated 6 years ago.

5.2 match 4.11 score 2 scripts 16 dependents

rdpeng

simpleboot:Simple Bootstrap Routines

Simple bootstrap routines.

Maintained by Roger D. Peng. Last updated 9 months ago.

3.5 match 12 stars 5.99 score 135 scripts 4 dependents

privefl

bigstatsr:Statistical Tools for Filebacked Big Matrices

Easy-to-use, efficient, flexible and scalable statistical tools. Package bigstatsr provides and uses Filebacked Big Matrices via memory-mapping. It provides for instance matrix operations, Principal Component Analysis, sparse linear supervised models, utility functions and more <doi:10.1093/bioinformatics/bty185>.

Maintained by Florian Privé. Last updated 6 months ago.

big-data large-matrices memory-mapped-file parallel-computing statistical-methods openblas cpp openmp

2.0 match 180 stars 10.59 score 394 scripts 16 dependents

cran

robfilter:Robust Time Series Filters

Implementations for several robust procedures that allow for (online) extraction of the signal of univariate or multivariate time series by applying robust regression techniques to a moving time window are provided. Included are univariate filtering procedures based on repeated-median regression as well as hybrid and trimmed filters derived from it; see Schettlinger et al. (2006) <doi:10.1515/BMT.2006.010>. The adaptive online repeated median by Schettlinger et al. (2010) <doi:10.1002/acs.1105> and the slope comparing adaptive repeated median by Borowski and Fried (2013) <doi:10.1007/s11222-013-9391-7> choose the width of the moving time window adaptively. Multivariate versions are also provided; see Borowski et al. (2009) <doi:10.1080/03610910802514972> for a multivariate online adaptive repeated median and Borowski (2012) <doi:10.17877/DE290R-14393> for a multivariate slope comparing adaptive repeated median. Furthermore, a repeated-median based filter with automatic outlier replacement and shift detection is provided; see Fried (2004) <doi:10.1080/10485250410001656444>.

Maintained by Roland Fried. Last updated 8 months ago.

cpp

11.0 match 2 stars 1.90 score

phargarten2

miWQS:Multiple Imputation Using Weighted Quantile Sum Regression

The miWQS package handles the uncertainty due to below the detection limit in a correlated component mixture problem. Researchers want to determine if a set/mixture of continuous and correlated components/chemicals is associated with an outcome and if so, which components are important in that mixture. These components share a common outcome but are interval-censored between zero and low thresholds, or detection limits, that may be different across the components. This package applies the multiple imputation (MI) procedure to the weighted quantile sum regression (WQS) methodology for continuous, binary, or count outcomes (Hargarten & Wheeler (2020) <doi:10.1016/j.envres.2020.109466>). The imputation models are: bootstrapping imputation (Lubin et al (2004) <doi:10.1289/ehp.7199>), univariate Bayesian imputation (Hargarten & Wheeler (2020) <doi:10.1016/j.envres.2020.109466>), and multivariate Bayesian regression imputation.

Maintained by Paul M. Hargarten. Last updated 1 years ago.

4.4 match 2 stars 4.78 score 20 scripts 1 dependents

mayur1009

cleanTS:Testbench for Univariate Time Series Cleaning

A reliable and efficient tool for cleaning univariate time series data. It implements reliable and efficient procedures for automating the process of cleaning univariate time series data. The package provides integration with already developed and deployed tools for missing value imputation and outlier detection. It also provides a way of visualizing large time-series data in different resolutions.

Maintained by Mayur Shende. Last updated 1 years ago.

5.6 match 11 stars 3.74 score 3 scripts

stan-dev

bayesplot:Plotting for Bayesian Models

Plotting functions for posterior analysis, MCMC diagnostics, prior and posterior predictive checks, and other visualizations to support the applied Bayesian workflow advocated in Gabry, Simpson, Vehtari, Betancourt, and Gelman (2019) <doi:10.1111/rssa.12378>. The package is designed not only to provide convenient functionality for users, but also a common set of functions that can be easily used by developers working on a variety of R packages for Bayesian modeling, particularly (but not exclusively) packages interfacing with 'Stan'.

Maintained by Jonah Gabry. Last updated 1 months ago.

bayesian ggplot2 mcmc pandoc stan statistical-graphics visualization

1.3 match 436 stars 16.69 score 6.5k scripts 98 dependents

nlsy-links

NlsyLinks:Utilities and Kinship Information for Research with the NLSY

Utilities and kinship information for behavior genetics and developmental research using the National Longitudinal Survey of Youth (NLSY; <https://www.nlsinfo.org/>).

Maintained by S. Mason Garrison. Last updated 7 days ago.

behavior-genetics kinship-information national-longitudinal-survey nlsy

2.8 match 7 stars 7.49 score 185 scripts

ropensci

aorsf:Accelerated Oblique Random Forests

Fit, interpret, and compute predictions with oblique random forests. Includes support for partial dependence, variable importance, passing customized functions for variable importance and identification of linear combinations of features. Methods for the oblique random survival forest are described in Jaeger et al., (2023) <DOI:10.1080/10618600.2023.2231048>.

Maintained by Byron Jaeger. Last updated 3 days ago.

data-science oblique random-forest survival openblas cpp openmp

2.3 match 58 stars 9.21 score 60 scripts 1 dependents

kzst

rbcc:Risk-Based Control Charts

Univariate and multivariate versions of risk-based control charts. Univariate versions of control charts, such as the risk-based version of X-bar, Moving Average (MA), Exponentially Weighted Moving Average Control Charts (EWMA), and Cumulative Sum Control Charts (CUSUM) charts. The risk-based version of the multivariate T2 control chart. Plot and summary functions. Kosztyan et. al. (2016) <doi:10.1016/j.eswa.2016.06.019>.

Maintained by Zsolt Tibor Kosztyan. Last updated 25 days ago.

6.0 match 3.48 score 1 scripts

lbelzile

mev:Modelling of Extreme Values

Various tools for the analysis of univariate, multivariate and functional extremes. Exact simulation from max-stable processes [Dombry, Engelke and Oesting (2016) <doi:10.1093/biomet/asw008>, R-Pareto processes for various parametric models, including Brown-Resnick (Wadsworth and Tawn, 2014, <doi:10.1093/biomet/ast042>) and Extremal Student (Thibaud and Opitz, 2015, <doi:10.1093/biomet/asv045>). Threshold selection methods, including Wadsworth (2016) <doi:10.1080/00401706.2014.998345>, and Northrop and Coleman (2014) <doi:10.1007/s10687-014-0183-z>. Multivariate extreme diagnostics. Estimation and likelihoods for univariate extremes, e.g., Coles (2001) <doi:10.1007/978-1-4471-3675-0>.

Maintained by Leo Belzile. Last updated 5 months ago.

extreme-value-statistics likelihood-functions max-stable simulation threshold-selection openblas cpp openmp

2.5 match 13 stars 8.23 score 94 scripts 4 dependents

blasbenito

distantia:Advanced Toolset for Efficient Time Series Dissimilarity Analysis

Fast C++ implementation of Dynamic Time Warping for time series dissimilarity analysis, with applications in environmental monitoring and sensor data analysis, climate science, signal processing and pattern recognition, and financial data analysis. Built upon the ideas presented in Benito and Birks (2020) <doi:10.1111/ecog.04895>, provides tools for analyzing time series of varying lengths and structures, including irregular multivariate time series. Key features include individual variable contribution analysis, restricted permutation tests for statistical significance, and imputation of missing data via GAMs. Additionally, the package provides an ample set of tools to prepare and manage time series data.

Maintained by Blas M. Benito. Last updated 25 days ago.

3.6 match 23 stars 5.76 score 11 scripts

cran

rriskDistributions:Fitting Distributions to Given Data or Known Quantiles

Collection of functions for fitting distributions to given data or by known quantiles. Two main functions fit.perc() and fit.cont() provide users a GUI that allows to choose a most appropriate distribution without any knowledge of the R syntax. Note, this package is a part of the 'rrisk' project.

Maintained by Matthias Greiner. Last updated 8 years ago.

7.0 match 2.94 score 1 dependents

hanjunwei-lab

ssMutPA:Single-Sample Mutation-Based Pathway Analysis

A systematic bioinformatics tool to perform single-sample mutation-based pathway analysis by integrating somatic mutation data with the Protein-Protein Interaction (PPI) network. In this method, we use local and global weighted strategies to evaluate the effects of network genes from mutations according to the network topology and then calculate the mutation-based pathway enrichment score (ssMutPES) to reflect the accumulated effect of mutations of each pathway. Subsequently, the ssMutPES profiles are used for unsupervised spectral clustering to identify cancer subtypes.

Maintained by Junwei Han. Last updated 5 months ago.

5.1 match 4.00 score 9 scripts

gaynorr

AlphaSimR:Breeding Program Simulations

The successor to the 'AlphaSim' software for breeding program simulation [Faux et al. (2016) <doi:10.3835/plantgenome2016.02.0013>]. Used for stochastic simulations of breeding programs to the level of DNA sequence for every individual. Contained is a wide range of functions for modeling common tasks in a breeding program, such as selection and crossing. These functions allow for constructing simulations of highly complex plant and animal breeding programs via scripting in the R software environment. Such simulations can be used to evaluate overall breeding program performance and conduct research into breeding program design, such as implementation of genomic selection. Included is the 'Markovian Coalescent Simulator' ('MaCS') for fast simulation of biallelic sequences according to a population demographic history [Chen et al. (2009) <doi:10.1101/gr.083634.108>].

Maintained by Chris Gaynor. Last updated 5 months ago.

breeding genomics simulation openblas cpp openmp

2.0 match 47 stars 10.22 score 534 scripts 2 dependents

cran

tmvmixnorm:Sampling from Truncated Multivariate Normal and t Distributions

Efficient sampling of truncated multivariate (scale) mixtures of normals under linear inequality constraints is nontrivial due to the analytically intractable normalizing constant. Meanwhile, traditional methods may subject to numerical issues, especially when the dimension is high and dependence is strong. Algorithms proposed by Li and Ghosh (2015) <doi: 10.1080/15598608.2014.996690> are adopted for overcoming difficulties in simulating truncated distributions. Efficient rejection sampling for simulating truncated univariate normal distribution is included in the package, which shows superiority in terms of acceptance rate and numerical stability compared to existing methods and R packages. An efficient function for sampling from truncated multivariate normal distribution subject to convex polytope restriction regions based on Gibbs sampler for conditional truncated univariate distribution is provided. By extending the sampling method, a function for sampling truncated multivariate Student's t distribution is also developed. Moreover, the proposed method and computation remain valid for high dimensional and strong dependence scenarios. Empirical results in Li and Ghosh (2015) <doi: 10.1080/15598608.2014.996690> illustrated the superior performance in terms of various criteria (e.g. mixing and integrated auto-correlation time).

Maintained by Ting Fung (Ralph) Ma. Last updated 4 years ago.

9.4 match 2.18 score 5 dependents

bouchranasri

GaussianHMM1d:Inference, Goodness-of-Fit and Forecast for Univariate Gaussian Hidden Markov Models

Inference, goodness-of-fit test, and prediction densities and intervals for univariate Gaussian Hidden Markov Models (HMM). The goodness-of-fit is based on a Cramer-von Mises statistic and uses parametric bootstrap to estimate the p-value. The description of the methodology is taken from Chapter 10.2 of Remillard (2013) <doi:10.1201/b14285>.

Maintained by Bouchra R. Nasri. Last updated 1 months ago.

18.8 match 1.08 score 12 scripts

inzightvit

iNZightPlots:Graphical Tools for Exploring Data with 'iNZight'

Simple plotting function(s) for exploratory data analysis with flexible options allowing for easy plot customisation. The goal is to make it easy for beginners to start exploring a dataset through simple R function calls, as well as provide a similar interface to summary statistics and inference information. Includes functionality to generate interactive HTML-driven graphs. Used by 'iNZight', a graphical user interface providing easy exploration and visualisation of data for students of statistics, available in both desktop and online versions.

Maintained by Tom Elliott. Last updated 2 months ago.

4.0 match 2 stars 5.06 score 19 scripts 1 dependents