R-universe search: diagnostics

laplacesdemonr

LaplacesDemon:Complete Environment for Bayesian Inference

Provides a complete environment for Bayesian inference using a variety of different samplers (see ?LaplacesDemon for an overview).

Maintained by Henrik Singmann. Last updated 12 months ago.

33.5 match 93 stars 13.45 score 1.8k scripts 60 dependents

stan-dev

posterior:Tools for Working with Posterior Distributions

Provides useful tools for both users and developers of packages for fitting Bayesian models or working with output from Bayesian models. The primary goals of the package are to: (a) Efficiently convert between many different useful formats of draws (samples) from posterior or prior distributions. (b) Provide consistent methods for operations commonly performed on draws, for example, subsetting, binding, or mutating draws. (c) Provide various summaries of draws in convenient formats. (d) Provide lightweight implementations of state of the art posterior inference diagnostics. References: Vehtari et al. (2021) <doi:10.1214/20-BA1221>.

Maintained by Paul-Christian Bürkner. Last updated 9 days ago.

bayes bayesian mcmc

23.6 match 168 stars 16.13 score 3.3k scripts 342 dependents

rsquaredacademy

olsrr:Tools for Building OLS Regression Models

Tools designed to make it easier for users, particularly beginner/intermediate R users, to build ordinary least squares regression models. Includes comprehensive regression output, heteroskedasticity tests, collinearity diagnostics, residual diagnostics, measures of influence, model fit assessment and variable selection procedures.

Maintained by Aravind Hebbali. Last updated 4 months ago.

collinearity-diagnostics linear-models regression stepwise-regression

24.0 match 103 stars 12.19 score 1.4k scripts 4 dependents

zeileis

ivreg:Instrumental-Variables Regression by '2SLS', '2SM', or '2SMM', with Diagnostics

Instrumental variable estimation for linear models by two-stage least-squares (2SLS) regression or by robust-regression via M-estimation (2SM) or MM-estimation (2SMM). The main ivreg() model-fitting function is designed to provide a workflow as similar as possible to standard lm() regression. A wide range of methods is provided for fitted ivreg model objects, including extensive functionality for computing and graphing regression diagnostics in addition to other standard model tools.

Maintained by Achim Zeileis. Last updated 2 months ago.

instrumental-variables regression-diagnostics two-stage-least-squares-regression

25.4 match 20 stars 10.24 score 360 scripts 4 dependents

paternogbc

sensiPhy:Sensitivity Analysis for Comparative Methods

An implementation of sensitivity analysis for phylogenetic comparative methods. The package is an umbrella of statistical and graphical methods that estimate and report different types of uncertainty in PCM: (i) Species Sampling uncertainty (sample size; influential species and clades). (ii) Phylogenetic uncertainty (different topologies and/or branch lengths). (iii) Data uncertainty (intraspecific variation and measurement error).

Maintained by Gustavo Paterno. Last updated 5 years ago.

comparative-methods ecology evolution phylogenetics sensitivity-analysis

40.4 match 13 stars 6.38 score 61 scripts

ellessenne

comorbidity:Computing Comorbidity Scores

Computing comorbidity indices and scores such as the weighted Charlson score (Charlson, 1987 <doi:10.1016/0021-9681(87)90171-8>) and the Elixhauser comorbidity score (Elixhauser, 1998 <doi:10.1097/00005650-199801000-00004>) using ICD-9-CM or ICD-10 codes (Quan, 2005 <doi:10.1097/01.mlr.0000182534.19832.83>). Australian and Swedish modifications of the Charlson Comorbidity Index are available as well (Sundararajan, 2004 <doi:10.1016/j.jclinepi.2004.03.012> and Ludvigsson, 2021 <doi:10.2147/CLEP.S282475>), together with different weighting algorithms for both the Charlson and Elixhauser comorbidity scores.

Maintained by Alessandro Gasparini. Last updated 8 months ago.

comorbidity

22.4 match 83 stars 9.36 score 98 scripts

ohdsi

PhenotypeR:Assess Study Cohorts Using a Common Data Model

Phenotype study cohorts in data mapped to the Observational Medical Outcomes Partnership Common Data Model. Diagnostics are run at the database, code list, cohort, and population level to assess whether study cohorts are ready for research.

Maintained by Edward Burn. Last updated 4 days ago.

28.0 match 2 stars 7.40 score 57 scripts

spatstat

spatstat.model:Parametric Statistical Modelling and Inference for the 'spatstat' Family

Functionality for parametric statistical modelling and inference for spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Supports parametric modelling, formal statistical inference, and model validation. Parametric models include Poisson point processes, Cox point processes, Neyman-Scott cluster processes, Gibbs point processes and determinantal point processes. Models can be fitted to data using maximum likelihood, maximum pseudolikelihood, maximum composite likelihood and the method of minimum contrast. Fitted models can be simulated and predicted. Formal inference includes hypothesis tests (quadrat counting tests, Cressie-Read tests, Clark-Evans test, Berman test, Diggle-Cressie-Loosmore-Ford test, scan test, studentised permutation test, segregation test, ANOVA tests of fitted models, adjusted composite likelihood ratio test, envelope tests, Dao-Genton test, balanced independent two-stage test), confidence intervals for parameters, and prediction intervals for point counts. Model validation techniques include leverage, influence, partial residuals, added variable plots, diagnostic plots, pseudoscore residual plots, model compensators and Q-Q plots.

Maintained by Adrian Baddeley. Last updated 6 days ago.

analysis-of-variance cluster-process confidence-intervals cox-process determinantal-point-processes gibbs-process influence leverage model-diagnostics neyman-scott parameter-estimation poisson-process spatial-analysis spatial-modelling spatial-point-processes statistical-inference

22.3 match 5 stars 9.09 score 6 scripts 46 dependents

aloy

HLMdiag:Diagnostic Tools for Hierarchical (Multilevel) Linear Models

A suite of diagnostic tools for hierarchical (multilevel) linear models. The tools include not only leverage and traditional deletion diagnostics (Cook's distance, covratio, covtrace, and MDFFITS) but also convenience functions and graphics for residual analysis. Models can be fit using either lmer in the 'lme4' package or lme in the 'nlme' package.

Maintained by Adam Loy. Last updated 4 years ago.

openblas cpp

23.2 match 17 stars 8.63 score 170 scripts 7 dependents

chstock

DTComPair:Comparison of Binary Diagnostic Tests in a Paired Study Design

Comparison of the accuracy of two binary diagnostic tests in a "paired" study design, i.e. when each test is applied to each subject in the study.

Maintained by Christian Stock. Last updated 5 months ago.

clinical-epidemiology comparative-analysis diagnosis diagnostic-accuracy-studies diagnostic-likelihood-ratio diagnostic-tests medicine predictive-value sensitivity specificity

39.0 match 1 stars 5.07 score 47 scripts

florianhartig

DHARMa:Residual Diagnostics for Hierarchical (Multi-Level / Mixed) Regression Models

The 'DHARMa' package uses a simulation-based approach to create readily interpretable scaled (quantile) residuals for fitted (generalized) linear mixed models. Currently supported are linear and generalized linear (mixed) models from 'lme4' (classes 'lmerMod', 'glmerMod'), 'glmmTMB', 'GLMMadaptive', and 'spaMM'; phylogenetic linear models from 'phylolm' (classes 'phylolm' and 'phyloglm'); generalized additive models ('gam' from 'mgcv'); 'glm' (including 'negbin' from 'MASS', but excluding quasi-distributions) and 'lm' model classes. Moreover, externally created simulations, e.g. posterior predictive simulations from Bayesian software such as 'JAGS', 'STAN', or 'BUGS' can be processed as well. The resulting residuals are standardized to values between 0 and 1 and can be interpreted as intuitively as residuals from a linear regression. The package also provides a number of plot and test functions for typical model misspecification problems, such as over/underdispersion, zero-inflation, and residual spatial, phylogenetic and temporal autocorrelation.

Maintained by Florian Hartig. Last updated 11 days ago.

glmm regression regression-diagnostics residual

13.1 match 226 stars 14.74 score 2.8k scripts 10 dependents

vpnsctl

mixpoissonreg:Mixed Poisson Regression for Overdispersed Count Data

Fits mixed Poisson regression models (Poisson-Inverse Gaussian or Negative-Binomial) on data sets with response variables being count data. The models can have varying precision parameter, where a linear regression structure (through a link function) is assumed to hold on the precision parameter. The Expectation-Maximization algorithm for both these models (Poisson Inverse Gaussian and Negative Binomial) is an important contribution of this package. Another important feature of this package is the set of functions to perform global and local influence analysis. See Barreto-Souza and Simas (2016) <doi:10.1007/s11222-015-9601-6> for further details.

Maintained by Alexandre B. Simas. Last updated 4 years ago.

count-data diagnostics influence-analysis local-influence negative-binomial-regression poisson-inverse-gaussian-regression

30.6 match 3 stars 5.44 score 23 scripts

stan-dev

bayesplot:Plotting for Bayesian Models

Plotting functions for posterior analysis, MCMC diagnostics, prior and posterior predictive checks, and other visualizations to support the applied Bayesian workflow advocated in Gabry, Simpson, Vehtari, Betancourt, and Gelman (2019) <doi:10.1111/rssa.12378>. The package is designed not only to provide convenient functionality for users, but also a common set of functions that can be easily used by developers working on a variety of R packages for Bayesian modeling, particularly (but not exclusively) packages interfacing with 'Stan'.

Maintained by Jonah Gabry. Last updated 1 month ago.

bayesian ggplot2 mcmc pandoc stan statistical-graphics visualization

9.8 match 436 stars 16.69 score 6.5k scripts 98 dependents

stan-dev

rstan:R Interface to Stan

User-facing R functions are provided to parse, compile, test, estimate, and analyze Stan models by accessing the header-only Stan library provided by the 'StanHeaders' package. The Stan project develops a probabilistic programming language that implements full Bayesian statistical inference via Markov Chain Monte Carlo, rough Bayesian inference via 'variational' approximation, and (optionally penalized) maximum likelihood estimation via optimization. In all three cases, automatic differentiation is used to quickly and accurately evaluate gradients without burdening the user with the need to derive the partial derivatives.

Maintained by Ben Goodrich. Last updated 1 day ago.

bayesian-data-analysis bayesian-inference bayesian-statistics mcmc stan cpp

8.4 match 1.1k stars 18.67 score 14k scripts 279 dependents

uupharmacometrics

xpose:Diagnostics for Pharmacometric Models

Diagnostics for non-linear mixed-effects (population) models from 'NONMEM' <https://www.iconplc.com/solutions/technologies/nonmem/>. 'xpose' facilitates data import and the creation of numerical run summaries, and provides 'ggplot2'-based graphics for data exploration and model diagnostics.

Maintained by Benjamin Guiastrennec. Last updated 2 months ago.

diagnostics ggplot2 nonmem pharmacometrics xpose

14.1 match 62 stars 11.02 score 183 scripts 6 dependents

andy-iskauskas

hmer:History Matching and Emulation Package

A set of objects and functions for Bayes Linear emulation and history matching. Core functionality includes automated training of emulators to data, diagnostic functions to ensure suitability, and a variety of proposal methods for generating 'waves' of points. For details on the mathematical background, there are many papers available on the topic (see references attached to function help files or the below references); for details of the functions in this package, consult the manual or help files. Iskauskas, A, et al. (2024) <doi:10.18637/jss.v109.i10>. Bower, R.G., Goldstein, M., and Vernon, I. (2010) <doi:10.1214/10-BA524>. Craig, P.S., Goldstein, M., Seheult, A.H., and Smith, J.A. (1997) <doi:10.1007/978-1-4612-2290-3_2>.

Maintained by Andrew Iskauskas. Last updated 10 days ago.

21.0 match 16 stars 7.19 score 37 scripts

doebler

mada:Meta-Analysis of Diagnostic Accuracy

Provides functions for diagnostic meta-analysis. In addition to basic analysis and visualization, the bivariate model of Reitsma et al. (2005), which is equivalent to the HSROC of Rutter & Gatsonis (2001), can be fitted. A new approach to diagnostic meta-analysis by Holling et al. (2012) is also available. Standard methods such as summary and plot are provided.

Maintained by Philipp Doebler. Last updated 3 years ago.

29.4 match 2 stars 5.09 score 58 scripts 3 dependents

modeloriented

DALEX:moDel Agnostic Language for Exploration and eXplanation

Any unverified black box model is the path to failure. Opaqueness leads to distrust. Distrust leads to ignoration. Ignoration leads to rejection. DALEX package xrays any model and helps to explore and explain its behaviour. Machine Learning (ML) models are widely used and have various applications in classification or regression. Models created with boosting, bagging, stacking or similar techniques are often used due to their high performance. But such black-box models usually lack direct interpretability. DALEX package contains various methods that help to understand the link between input variables and model output. Implemented methods help to explore the model on the level of a single instance as well as a level of the whole dataset. All model explainers are model agnostic and can be compared across different models. DALEX package is the cornerstone for 'DrWhy.AI' universe of packages for visual model exploration. Find more details in (Biecek 2018) <https://jmlr.org/papers/v19/18-416.html>.

Maintained by Przemyslaw Biecek. Last updated 30 days ago.

black-box dalex data-science explainable-ai explainable-artificial-intelligence explainable-ml explanations explanatory-model-analysis fairness iml interpretability interpretable-machine-learning machine-learning model-visualization predictive-modeling responsible-ai responsible-ml xai

10.9 match 1.4k stars 13.40 score 876 scripts 21 dependents

githubwilly

forsearch:Diagnostic Analysis Using Forward Search Procedure for Various Models

Identifies potential data outliers and their impact on estimates and analyses. Tool for evaluation of study credibility. Uses the forward search approach of Atkinson and Riani, "Robust Diagnostic Regression Analysis", 2000, <ISBN: 0-387-95017-6> to prepare descriptive statistics of a dataset that is to be analyzed by functions lm {stats}, glm {stats}, nls {stats}, lme {nlme}, or coxph {survival}, or their equivalent in another language. Includes graphics functions to display the descriptive statistics.

Maintained by William Fairweather. Last updated 2 months ago.

36.2 match 4.00 score

stan-dev

cmdstanr:R Interface to 'CmdStan'

A lightweight interface to 'Stan' <https://mc-stan.org>. The 'CmdStanR' interface is an alternative to 'RStan' that calls the command line interface for compilation and running algorithms instead of interfacing with C++ via 'Rcpp'. This has many benefits including always being compatible with the latest version of Stan, fewer installation errors, fewer unexpected crashes in RStudio, and a more permissive license.

Maintained by Andrew Johnson. Last updated 9 months ago.

bayes bayesian markov-chain-monte-carlo maximum-likelihood mcmc stan variational-inference

11.4 match 145 stars 12.27 score 5.2k scripts 9 dependents

r-forge

deSolve:Solvers for Initial Value Problems of Differential Equations ('ODE', 'DAE', 'DDE')

Functions that solve initial value problems of a system of first-order ordinary differential equations ('ODE'), of partial differential equations ('PDE'), of differential algebraic equations ('DAE'), and of delay differential equations. The functions provide an interface to the FORTRAN functions 'lsoda', 'lsodar', 'lsode', 'lsodes' of the 'ODEPACK' collection, to the FORTRAN functions 'dvode', 'zvode' and 'daspk' and a C-implementation of solvers of the 'Runge-Kutta' family with fixed or variable time steps. The package contains routines designed for solving 'ODEs' resulting from 1-D, 2-D and 3-D partial differential equations ('PDE') that have been converted to 'ODEs' by numerical differencing.

Maintained by Thomas Petzoldt. Last updated 1 year ago.

fortran openblas

11.1 match 12.33 score 8.0k scripts 427 dependents

wjakethompson

measr:Bayesian Psychometric Measurement Using 'Stan'

Estimate diagnostic classification models (also called cognitive diagnostic models) with 'Stan'. Diagnostic classification models are confirmatory latent class models, as described by Rupp et al. (2010, ISBN: 978-1-60623-527-0). Automatically generate 'Stan' code for the general loglinear cognitive diagnostic model proposed by Henson et al. (2009) <doi:10.1007/s11336-008-9089-5> and other subtypes that introduce additional model constraints. Using the generated 'Stan' code, estimate the model and evaluate the model's performance using model fit indices, information criteria, and reliability metrics.

Maintained by W. Jake Thompson. Last updated 2 months ago.

bayesian cdm cmdstanr cognitive-diagnosis cognitive-diagnostic-models dcm diagnostic-classification-models psychometrics rstan stan cpp

19.7 match 10 stars 6.75 score 31 scripts

darwin-eu

DrugExposureDiagnostics:Diagnostics for OMOP Common Data Model Drug Records

Ingredient specific diagnostics for drug exposure records in the Observational Medical Outcomes Partnership (OMOP) common data model.

Maintained by Ger Inberg. Last updated 2 days ago.

18.2 match 4 stars 7.11 score 41 scripts

martynplummer

coda:Output Analysis and Diagnostics for MCMC

Provides functions for summarizing and plotting the output from Markov Chain Monte Carlo (MCMC) simulations, as well as diagnostic tests of convergence to the equilibrium distribution of the Markov chain.

Maintained by Martyn Plummer. Last updated 1 year ago.

11.4 match 6 stars 11.33 score 8.3k scripts 1.1k dependents

alexanderrobitzsch

CDM:Cognitive Diagnosis Modeling

Functions for cognitive diagnosis modeling and multidimensional item response modeling for dichotomous and polytomous item responses. This package enables the estimation of the DINA and DINO model (Junker & Sijtsma, 2001, <doi:10.1177/01466210122032064>), the multiple group (polytomous) GDINA model (de la Torre, 2011, <doi:10.1007/s11336-011-9207-7>), the multiple choice DINA model (de la Torre, 2009, <doi:10.1177/0146621608320523>), the general diagnostic model (GDM; von Davier, 2008, <doi:10.1348/000711007X193957>), the structured latent class model (SLCA; Formann, 1992, <doi:10.1080/01621459.1992.10475229>) and regularized latent class analysis (Chen, Li, Liu, & Ying, 2017, <doi:10.1007/s11336-016-9545-6>). See George, Robitzsch, Kiefer, Gross, and Uenlue (2017) <doi:10.18637/jss.v074.i02> or Robitzsch and George (2019, <doi:10.1007/978-3-030-05584-4_26>) for further details on estimation and the package structure. For tutorials on how to use the CDM package see George and Robitzsch (2015, <doi:10.20982/tqmp.11.3.p189>) as well as Ravand and Robitzsch (2015).

Maintained by Alexander Robitzsch. Last updated 9 months ago.

cognitive-diagnostic-models item-response-theory cpp

14.3 match 22 stars 8.76 score 138 scripts 28 dependents

statnet

ergm:Fit, Simulate and Diagnose Exponential-Family Models for Networks

An integrated set of tools to analyze and simulate networks based on exponential-family random graph models (ERGMs). 'ergm' is a part of the Statnet suite of packages for network analysis. See Hunter, Handcock, Butts, Goodreau, and Morris (2008) <doi:10.18637/jss.v024.i03> and Krivitsky, Hunter, Morris, and Klumb (2023) <doi:10.18637/jss.v105.i06>.

Maintained by Pavel N. Krivitsky. Last updated 5 days ago.

8.0 match 100 stars 15.36 score 1.4k scripts 36 dependents

jwiley

JWileymisc:Miscellaneous Utilities and Functions

Miscellaneous tools and functions, including: generate descriptive statistics tables, format output, visualize relations among variables or check distributions, and generic functions for residual and model diagnostics.

Maintained by Joshua F. Wiley. Last updated 2 months ago.

16.2 match 6 stars 7.42 score 241 scripts 4 dependents

rsquaredacademy

blorr:Tools for Developing Binary Logistic Regression Models

Tools designed to make it easier for beginner and intermediate users to build and validate binary logistic regression models. Includes bivariate analysis, comprehensive regression output, model fit statistics, variable selection procedures, model validation techniques and a 'shiny' app for interactive model building.

Maintained by Aravind Hebbali. Last updated 4 months ago.

logistic-regression-models regression cpp

16.0 match 17 stars 7.13 score 144 scripts 1 dependent

uupharmacometrics

xpose4:Diagnostics for Nonlinear Mixed-Effect Models

A model building aid for nonlinear mixed-effects (population) model analysis using NONMEM, facilitating data set checkout, exploration and visualization, model diagnostics, candidate covariate identification and model comparison. The methods are described in Keizer et al. (2013) <doi:10.1038/psp.2013.24>, and Jonsson et al. (1999) <doi:10.1016/s0169-2607(98)00067-4>.

Maintained by Andrew C. Hooker. Last updated 1 year ago.

diagnostics nonmem pharmacometrics population-model xpose

15.4 match 35 stars 7.30 score 315 scripts

rchlumsk

RavenR:Raven Hydrological Modelling Framework R Support and Analysis

Utilities for processing input and output files associated with the Raven Hydrological Modelling Framework. Includes various plotting functions, model diagnostics, reading output files into extensible time series format, and support for writing Raven input files. The 'RavenR' package is also archived at Chlumsky et al. (2020) <doi:10.5281/zenodo.4248183>. The Raven Hydrologic Modelling Framework method can be referenced with Craig et al. (2020) <doi:10.1016/j.envsoft.2020.104728>.

Maintained by Robert Chlumsky. Last updated 4 months ago.

diagnostics hydrology modeling modelling visualization water water-resources watershed cpp

15.2 match 36 stars 7.06 score 20 scripts

hneth

riskyr:Rendering Risk Literacy more Transparent

Risk-related information (like the prevalence of conditions, the sensitivity and specificity of diagnostic tests, or the effectiveness of interventions or treatments) can be expressed in terms of frequencies or probabilities. By providing a toolbox of corresponding metrics and representations, 'riskyr' computes, translates, and visualizes risk-related information in a variety of ways. Adopting multiple complementary perspectives provides insights into the interplay between key parameters and renders teaching and training programs on risk literacy more transparent.

Maintained by Hansjoerg Neth. Last updated 10 months ago.

2x2-matrix bayesian-inference contingency-table representation risk risk-literacy visualization

14.3 match 19 stars 7.36 score 80 scripts

goodekat

ggResidpanel:Panels and Interactive Versions of Diagnostic Plots using 'ggplot2'

An R package for creating diagnostic plots for models. The package allows for the creation of panels of plots and interactive plots.

Maintained by Katherine Goode. Last updated 2 months ago.

13.6 match 37 stars 7.68 score 262 scripts

brianstock

MixSIAR:Bayesian Mixing Models in R

Creates and runs Bayesian mixing models to analyze biological tracer data (e.g. stable isotopes, fatty acids), which estimate the proportions of source (prey) contributions to a mixture (consumer). 'MixSIAR' is not one model, but a framework that allows a user to create a mixing model based on their data structure and research questions, via options for fixed/random effects, source data types, priors, and error terms. 'MixSIAR' incorporates several years of advances since 'MixSIR' and 'SIAR'.

Maintained by Brian Stock. Last updated 4 years ago.

jags cpp

10.9 match 96 stars 9.21 score 122 scripts

sfcheung

semfindr:Influential Cases in Structural Equation Modeling

Sensitivity analysis in structural equation modeling using influence measures and diagnostic plots. Supports the leave-one-out casewise sensitivity analysis presented by Pek and MacCallum (2011) <doi:10.1080/00273171.2011.561068> and approximate casewise influence using scores and casewise likelihood.

Maintained by Shu Fai Cheung. Last updated 12 days ago.

diagnostics influential-cases lavaan outlier-detection sensitivity-analysis structural-equation-modeling

16.5 match 1 stars 6.03 score 90 scripts

graemeleehickey

adaptDiag:Bayesian Adaptive Designs for Diagnostic Trials

Simulate clinical trials for diagnostic test devices and evaluate the operating characteristics under an adaptive design with futility assessment determined via the posterior predictive probabilities.

Maintained by Graeme L. Hickey. Last updated 3 months ago.

adaptive bayesian bayesian-statistics clinical-trials diagnostic-tests diagnostics statistics

21.0 match 4 stars 4.60 score 5 scripts

graysonwhite

gglm:Grammar of Graphics for Linear Model Diagnostic Plots

Allows for easy creation of diagnostic plots for a variety of model objects using the Grammar of Graphics. Provides functionality for both individual diagnostic plots and an array of four standard diagnostic plots.

Maintained by Grayson White. Last updated 1 year ago.

diagnostic-plots grammar-of-graphics linear-model-diagnostics

18.1 match 79 stars 5.34 score 45 scripts

matthewblackwell

Amelia:A Program for Missing Data

A tool that "multiply imputes" missing data in a single cross-section (such as a survey), from a time series (like variables collected for each year in a country), or from a time-series-cross-sectional data set (such as collected by years for each of several countries). Amelia II implements our bootstrapping-based algorithm that gives essentially the same answers as the standard IP or EMis approaches, is usually considerably faster than existing approaches and can handle many more variables. Unlike Amelia I and other statistically rigorous imputation software, it virtually never crashes (but please let us know if you find to the contrary!). The program also generalizes existing approaches by allowing for trends in time series across observations within a cross-sectional unit, as well as priors that allow experts to incorporate beliefs they have about the values of missing cells in their data. Amelia II also includes useful diagnostics of the fit of multiple imputation models. The program works from the R command line or via a graphical user interface that does not require users to know R.

Maintained by Matthew Blackwell. Last updated 4 months ago.

openblas cpp

10.4 match 1 stars 9.06 score 1.4k scripts 7 dependents

zheng206

ComBatFamQC:Comprehensive Batch Effect Diagnostics and Harmonization

Provides a comprehensive framework for batch effect diagnostics, harmonization, and post-harmonization downstream analysis. Features include interactive visualization tools, robust statistical tests, and a range of harmonization techniques. Additionally, 'ComBatFamQC' enables the creation of life-span age trend plots with estimated age-adjusted centiles and facilitates the generation of covariate-corrected residuals for analytical purposes. Methods for harmonization are based on approaches described in Johnson et al., (2007) <doi:10.1093/biostatistics/kxj037>, Beer et al., (2020) <doi:10.1016/j.neuroimage.2020.117129>, Pomponio et al., (2020) <doi:10.1016/j.neuroimage.2019.116450>, and Chen et al., (2021) <doi:10.1002/hbm.25688>.

Maintained by Zheng Ren. Last updated 2 months ago.

diagnostic-tool harmonization rshinyapp

17.6 match 2 stars 5.35 score 16 scripts

stan-dev

loo:Efficient Leave-One-Out Cross-Validation and WAIC for Bayesian Models

Efficient approximate leave-one-out cross-validation (LOO) for Bayesian models fit using Markov chain Monte Carlo, as described in Vehtari, Gelman, and Gabry (2017) <doi:10.1007/s11222-016-9696-4>. The approximation uses Pareto smoothed importance sampling (PSIS), a new procedure for regularizing importance weights. As a byproduct of the calculations, we also obtain approximate standard errors for estimated predictive errors and for the comparison of predictive errors between models. The package also provides methods for using stacking and other model weighting techniques to average Bayesian predictive distributions.

Maintained by Jonah Gabry. Last updated 1 day ago.

bayes bayesian bayesian-data-analysis bayesian-inference bayesian-methods bayesian-statistics cross-validation information-criterion model-comparison stan

5.4 match 152 stars 17.30 score 2.6k scripts 297 dependents

luca-scr

mclust:Gaussian Mixture Modelling for Model-Based Clustering, Classification, and Density Estimation

Gaussian finite mixture models fitted via EM algorithm for model-based clustering, classification, and density estimation, including Bayesian regularization, dimension reduction for visualisation, and resampling-based inference.

Maintained by Luca Scrucca. Last updated 11 months ago.

fortran openblas

7.6 match 21 stars 12.23 score 6.6k scripts 587 dependents

rstudio

renv:Project Environments

A dependency management toolkit for R. Using 'renv', you can create and manage project-local R libraries, save the state of these libraries to a 'lockfile', and later restore your library as required. Together, these tools can help make your projects more isolated, portable, and reproducible.

Maintained by Kevin Ushey. Last updated 2 days ago.

5.0 match 1.0k stars 18.55 score 1.5k scripts 113 dependents

bioc

OUTRIDER:OUTRIDER - OUTlier in RNA-Seq fInDER

Identification of aberrant gene expression in RNA-seq data. Read count expectations are modeled by an autoencoder to control for confounders in the data. Given these expectations, the RNA-seq read counts are assumed to follow a negative binomial distribution with a gene-specific dispersion. Outliers are then identified as read counts that significantly deviate from this distribution. Furthermore, OUTRIDER provides useful plotting functions to analyze and visualize the results.

Maintained by Christian Mertes. Last updated 5 months ago.

immunooncology rnaseq transcriptomics alignment sequencing geneexpression genetics count-data diagnostics expression-analysis mendelian-genetics outlier-detection rna-seq openblas cpp

10.0 match 49 stars 9.07 score 110 scripts 1 dependent

bioc

FRASER:Find RAre Splicing Events in RNA-Seq Data

Detection of rare aberrant splicing events in transcriptome profiles. Read count ratio expectations are modeled by an autoencoder to control for confounding factors in the data. Given these expectations, the ratios are assumed to follow a beta-binomial distribution with a junction specific dispersion. Outlier events are then identified as read-count ratios that deviate significantly from this distribution. FRASER is able to detect alternative splicing, but also intron retention. The package aims to support diagnostics in the field of rare diseases where RNA-seq is performed to identify aberrant splicing defects.

Maintained by Christian Mertes. Last updated 5 months ago.

rnaseq alternativesplicing sequencing software genetics coverage aberrant-splicing diagnostics outlier-detection rare-disease rna-seq splicing openblas cpp

10.5 match 41 stars 8.50 score 155 scripts

wviechtb

metafor:Meta-Analysis Package for R

A comprehensive collection of functions for conducting meta-analyses in R. The package includes functions to calculate various effect sizes or outcome measures, fit equal-, fixed-, random-, and mixed-effects models to such data, carry out moderator and meta-regression analyses, and create various types of meta-analytical plots (e.g., forest, funnel, radial, L'Abbe, Baujat, bubble, and GOSH plots). For meta-analyses of binomial and person-time data, the package also provides functions that implement specialized methods, including the Mantel-Haenszel method, Peto's method, and a variety of suitable generalized linear (mixed-effects) models (i.e., mixed-effects logistic and Poisson regression models). Finally, the package provides functionality for fitting meta-analytic multivariate/multilevel models that account for non-independent sampling errors and/or true effects (e.g., due to the inclusion of multiple treatment studies, multiple endpoints, or other forms of clustering). Network meta-analyses and meta-analyses accounting for known correlation structures (e.g., due to phylogenetic relatedness) can also be conducted. An introduction to the package can be found in Viechtbauer (2010) <doi:10.18637/jss.v036.i03>.

Maintained by Wolfgang Viechtbauer. Last updated 22 hours ago.

meta-analysis mixed-effects multilevel-models multivariate

5.4 match 246 stars 16.30 score 4.9k scripts 92 dependents

guido-s

diagmeta:Meta-Analysis of Diagnostic Accuracy Studies with Several Cutpoints

Provides methods by Steinhauser et al. (2016) <DOI:10.1186/s12874-016-0196-1> for meta-analysis of diagnostic accuracy studies with several cutpoints.

Maintained by Guido Schwarzer. Last updated 6 months ago.

diagnostic-accuracy-studies meta-analysis rstudio

16.7 match 4 stars 5.15 score 10 scripts

wjbraun

DAAG:Data Analysis and Graphics Data and Functions

Functions and data sets used in examples and exercises in the text Maindonald, J.H. and Braun, W.J. (2003, 2007, 2010) "Data Analysis and Graphics Using R", and in an upcoming Maindonald, Braun, and Andrews text that builds on this earlier text.

Maintained by W. John Braun. Last updated 11 months ago.

10.4 match 8.25 score 1.2k scripts 1 dependents

ncss-tech

aqp:Algorithms for Quantitative Pedology

The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information, freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offers a convenient platform for bridging the gap between pedometric theory and practice.

Maintained by Dylan Beaudette. Last updated 27 days ago.

digital-soil-mapping ncss-tech nrcs pedology pedometrics soil soil-survey usda

6.9 match 55 stars 11.77 score 1.2k scripts 2 dependents

lme4

lme4:Linear Mixed-Effects Models using 'Eigen' and S4

Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the 'Eigen' C++ library for numerical linear algebra and 'RcppEigen' "glue".

Maintained by Ben Bolker. Last updated 1 day ago.

cpp

3.7 match 647 stars 20.69 score 35k scripts 1.5k dependents

capnrefsmmat

regressinator:Simulate and Diagnose (Generalized) Linear Models

Simulate samples from populations with known covariate distributions, generate response variables according to common linear and generalized linear model families, draw from sampling distributions of regression estimates, and perform visual inference on diagnostics from model fits.

Maintained by Alex Reinhart. Last updated 5 months ago.

statistics

12.4 match 4 stars 6.08 score 25 scripts

rafromb

SynergyLMM:Statistical Framework for in Vivo Drug Combination Studies

A framework for evaluating drug combination effects in preclinical in vivo studies. 'SynergyLMM' provides functions to analyze longitudinal tumor growth experiments using linear mixed-effects models, perform time-dependent analyses of synergy and antagonism, evaluate model diagnostics and performance, and assess both post-hoc and a priori statistical power. The calculation of drug combination synergy follows the statistical framework provided by Demidenko and Miller (2019, <doi:10.1371/journal.pone.0224137>). The implementation and analysis of linear mixed-effect models is based on the methods described by Pinheiro and Bates (2000, <doi:10.1007/b98882>), and Gałecki and Burzykowski (2013, <doi:10.1007/978-1-4614-3900-4>).

Maintained by Rafael Romero-Becerra. Last updated 1 month ago.

14.1 match 2 stars 5.32 score

merliseclyde

BAS:Bayesian Variable Selection and Model Averaging using Bayesian Adaptive Sampling

Package for Bayesian Variable Selection and Model Averaging in linear models and generalized linear models using stochastic or deterministic sampling without replacement from posterior distributions. Prior distributions on coefficients are from Zellner's g-prior or mixtures of g-priors corresponding to the Zellner-Siow Cauchy Priors or the mixture of g-priors from Liang et al. (2008) <DOI:10.1198/016214507000001337> for linear models or mixtures of g-priors from Li and Clyde (2019) <DOI:10.1080/01621459.2018.1469992> in generalized linear models. Other model selection criteria include AIC, BIC and Empirical Bayes estimates of g. Sampling probabilities may be updated based on the sampled models using sampling without replacement or an efficient MCMC algorithm which samples models using a tree structure of the model space as an efficient hash table. See Clyde, Ghosh and Littman (2010) <DOI:10.1198/jcgs.2010.09049> for details on the sampling algorithms. Uniform priors over all models or beta-binomial prior distributions on model size are allowed, and for large p truncated priors on the model space may be used to enforce sampling models that are full rank. The user may force variables to always be included in addition to imposing constraints that higher order interactions are included only if their parents are included in the model. This material is based upon work supported by the National Science Foundation under Division of Mathematical Sciences grant 1106891. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Maintained by Merlise Clyde. Last updated 4 months ago.

bayesian bayesian-inference generalized-linear-models linear-regression logistic-regression mcmc model-selection poisson-regression predictive-modeling regression variable-selection fortran openblas

6.8 match 44 stars 10.81 score 420 scripts 3 dependents

weecology

LDATS:Latent Dirichlet Allocation Coupled with Time Series Analyses

Combines Latent Dirichlet Allocation (LDA) and Bayesian multinomial time series methods in a two-stage analysis to quantify dynamics in high-dimensional temporal data. LDA decomposes multivariate data into lower-dimension latent groupings, whose relative proportions are modeled using generalized Bayesian time series models that include abrupt changepoints and smooth dynamics. The methods are described in Blei et al. (2003) <doi:10.1162/jmlr.2003.3.4-5.993>, Western and Kleykamp (2004) <doi:10.1093/pan/mph023>, Venables and Ripley (2002, ISBN-13:978-0387954578), and Christensen et al. (2018) <doi:10.1002/ecy.2373>.

Maintained by Juniper L. Simonis. Last updated 5 years ago.

changepoint lda parallel-tempering portal softmax

10.4 match 25 stars 6.93 score 45 scripts

cran

epiR:Tools for the Analysis of Epidemiological Data

Tools for the analysis of epidemiological and surveillance data. Contains functions for directly and indirectly adjusting measures of disease frequency, quantifying measures of association on the basis of single or multiple strata of count data presented in a contingency table, computation of confidence intervals around incidence risk and incidence rate estimates and sample size calculations for cross-sectional, case-control and cohort studies. Surveillance tools include functions to calculate an appropriate sample size for 1- and 2-stage representative freedom surveys, functions to estimate surveillance system sensitivity and functions to support scenario tree modelling analyses.

Maintained by Mark Stevenson. Last updated 1 month ago.

8.6 match 10 stars 8.18 score 10 dependents

koalaverse

sure:Surrogate Residuals for Ordinal and General Regression Models

An implementation of the surrogate approach to residuals and diagnostics for ordinal and general regression models; for details, see Liu and Zhang (2017, <doi:https://doi.org/10.1080/01621459.2017.1292915>) and Greenwell et al. (2017, <https://journal.r-project.org/archive/2018/RJ-2018-004/index.html>). These residuals can be used to construct standard residual plots for model diagnostics (e.g., residual-vs-fitted value plots, residual-vs-covariate plots, Q-Q plots, etc.). The package also provides an 'autoplot' function for producing standard diagnostic plots using 'ggplot2' graphics. The package currently supports cumulative link models from packages 'MASS', 'ordinal', 'rms', and 'VGAM'. Support for binary regression models using the standard 'glm' function is also available.

Maintained by Brandon Greenwell. Last updated 12 days ago.

categorical-data diagnostics ordinal-regression residuals

12.5 match 9 stars 5.58 score 47 scripts 1 dependents

bioc

peakPantheR:Peak Picking and Annotation of High Resolution Experiments

An automated pipeline for the detection, integration and reporting of predefined features across a large number of mass spectrometry data files. It enables the real time annotation of multiple compounds in a single file, or the parallel annotation of multiple compounds in multiple files. A graphical user interface as well as command line functions will assist in assessing the quality of annotation and update fitting parameters until a satisfactory result is obtained.

Maintained by Arnaud Wolfer. Last updated 5 months ago.

massspectrometry metabolomics peakdetection feature-detection mass-spectrometry

10.2 match 12 stars 6.82 score 23 scripts

easystats

bayestestR:Understand and Describe Bayesian Models and Posterior Distributions

Provides utilities to describe posterior distributions and Bayesian models. It includes point-estimates such as Maximum A Posteriori (MAP), measures of dispersion (Highest Density Interval - HDI; Kruschke, 2015 <doi:10.1016/C2012-0-00477-2>) and indices used for null-hypothesis testing (such as ROPE percentage, pd and Bayes factors). References: Makowski et al. (2021) <doi:10.21105/joss.01541>.

Maintained by Dominique Makowski. Last updated 11 days ago.

bayes-factors bayesfactor bayesian bayesian-framework credible-interval easystats hacktoberfest hdi map posterior-distributions rope

4.0 match 579 stars 16.82 score 2.2k scripts 82 dependents

vegandevs

vegan:Community Ecology Package

Ordination methods, diversity analysis and other functions for community and vegetation ecologists.

Maintained by Jari Oksanen. Last updated 15 days ago.

ecological-modelling ecology ordination fortran openblas

3.5 match 472 stars 19.41 score 15k scripts 440 dependents

bstewart

stm:Estimation of the Structural Topic Model

The Structural Topic Model (STM) allows researchers to estimate topic models with document-level covariates. The package also includes tools for model selection, visualization, and estimation of topic-covariate regressions. Methods developed in Roberts et al. (2014) <doi:10.1111/ajps.12103> and Roberts et al. (2016) <doi:10.1080/01621459.2016.1141684>. Vignette is Roberts et al. (2019) <doi:10.18637/jss.v091.i02>.

Maintained by Brandon Stewart. Last updated 1 year ago.

openblas cpp

5.3 match 404 stars 12.63 score 1.6k scripts 6 dependents

gavinsimpson

analogue:Analogue and Weighted Averaging Methods for Palaeoecology

Fits Modern Analogue Technique and Weighted Averaging transfer function models for prediction of environmental data from species data, and related methods used in palaeoecology.

Maintained by Gavin L. Simpson. Last updated 6 months ago.

7.3 match 14 stars 8.96 score 185 scripts 4 dependents

colintredoux

r4lineups:Statistical Inference on Lineup Fairness

Since the early 1970s eyewitness testimony researchers have recognised the importance of estimating properties such as lineup bias (is the lineup biased against the suspect, leading to a rate of choosing higher than one would expect by chance?), and lineup size (how many reasonable choices are in fact available to the witness? A lineup is supposed to consist of a suspect and a number of additional members, or foils, whom a poor-quality witness might mistake for the perpetrator). Lineup measures are descriptive, in the first instance, but since the earliest articles in the literature researchers have recognised the importance of reasoning inferentially about them. This package contains functions to compute various properties of laboratory or police lineups, and is intended for use by researchers in forensic psychology and/or eyewitness testimony research. Among others, the r4lineups package includes functions for calculating lineup proportion, functional size, various estimates of effective size, diagnosticity ratio, homogeneity of the diagnosticity ratio, ROC curves for confidence x accuracy data and the degree of similarity of faces in a lineup.

Maintained by Colin Tredoux. Last updated 7 years ago.

24.3 match 2.58 score 38 scripts

cran

CopulaREMADA:Copula Mixed Models for Multivariate Meta-Analysis of Diagnostic Test Accuracy Studies

The bivariate copula mixed model for meta-analysis of diagnostic test accuracy studies in Nikoloulopoulos (2015) <doi:10.1002/sim.6595> and Nikoloulopoulos (2018) <doi:10.1007/s10182-017-0299-y>. The vine copula mixed model for meta-analysis of diagnostic test accuracy studies accounting for disease prevalence in Nikoloulopoulos (2017) <doi:10.1177/0962280215596769> and also accounting for non-evaluable subjects in Nikoloulopoulos (2020) <doi:10.1515/ijb-2019-0107>. The hybrid vine copula mixed model for meta-analysis of diagnostic test accuracy case-control and cohort studies in Nikoloulopoulos (2018) <doi:10.1177/0962280216682376>. The D-vine copula mixed model for meta-analysis and comparison of two diagnostic tests in Nikoloulopoulos (2019) <doi:10.1177/0962280218796685>. The multinomial quadrivariate D-vine copula mixed model for meta-analysis of diagnostic tests with non-evaluable subjects in Nikoloulopoulos (2020) <doi:10.1177/0962280220913898>. The one-factor copula mixed model for joint meta-analysis of multiple diagnostic tests in Nikoloulopoulos (2022) <doi:10.1111/rssa.12838>. The multinomial six-variate 1-truncated D-vine copula mixed model for meta-analysis of two diagnostic tests accounting for within and between studies dependence in Nikoloulopoulos (2024) <doi:10.1177/09622802241269645>. The 1-truncated D-vine copula mixed models for meta-analysis of diagnostic accuracy studies without a gold standard (Nikoloulopoulos, 2024).

Maintained by Aristidis K. Nikoloulopoulos. Last updated 5 months ago.

39.0 match 2 stars 1.60 score 10 scripts

jedazard

MVR:Mean-Variance Regularization

Implements a non-parametric method for joint adaptive mean-variance regularization and variance stabilization of high-dimensional data. It is suited for handling difficult problems posed by high-dimensional multivariate datasets (p >> n paradigm). Among those are that the variance is often a function of the mean, variable-specific estimators of variances are not reliable, and test statistics have low power due to a lack of degrees of freedom. Key features include: (i) Normalization and/or variance stabilization of the data, (ii) Computation of mean-variance-regularized t-statistics (F-statistics to follow), (iii) Generation of diverse diagnostic plots, (iv) Computationally efficient implementation using C/C++ interfacing and an option for parallel computing to enjoy a faster and easier experience in the R environment.

Maintained by Jean-Eudes Dazard. Last updated 3 years ago.

cpp

16.4 match 1 stars 3.78 score 12 scripts

brian-j-smith

boa:Bayesian Output Analysis Program (BOA) for MCMC

A menu-driven program and library of functions for carrying out convergence diagnostics and statistical and graphical analysis of Markov chain Monte Carlo sampling output.

Maintained by Brian J. Smith. Last updated 9 years ago.

17.3 match 1 stars 3.58 score 38 scripts

tirgit

missCompare:Intuitive Missing Data Imputation Framework

Offers a convenient pipeline to test and compare various missing data imputation algorithms on simulated and real data. These include simpler methods, such as mean and median imputation and random replacement, but also include more sophisticated algorithms already implemented in popular R packages, such as 'mi', described by Su et al. (2011) <doi:10.18637/jss.v045.i02>; 'mice', described by van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>; 'missForest', described by Stekhoven and Buhlmann (2012) <doi:10.1093/bioinformatics/btr597>; 'missMDA', described by Josse and Husson (2016) <doi:10.18637/jss.v070.i01>; and 'pcaMethods', described by Stacklies et al. (2007) <doi:10.1093/bioinformatics/btm069>. The central assumption behind 'missCompare' is that structurally different datasets (e.g. larger datasets with a large number of correlated variables vs. smaller datasets with non correlated variables) will benefit differently from different missing data imputation algorithms. 'missCompare' takes measurements of your dataset and sets up a sandbox to try a curated list of standard and sophisticated missing data imputation algorithms and compares them assuming custom missingness patterns. 'missCompare' will also impute your real-life dataset for you after the selection of the best performing algorithm in the simulations. The package also provides various post-imputation diagnostics and visualizations to help you assess imputation performance.

Maintained by Tibor V. Varga. Last updated 4 years ago.

comparison comparison-benchmarks imputation imputation-algorithm imputation-methods imputations kolmogorov-smirnov missing missing-data missing-data-imputation missing-status-check missing-values missingness post-imputation-diagnostics rmse

10.5 match 39 stars 5.89 score 40 scripts

lmarusich

rmcorr:Repeated Measures Correlation

Compute the repeated measures correlation, a statistical technique for determining the overall within-individual relationship among paired measures assessed on two or more occasions, first introduced by Bland and Altman (1995). Includes functions for diagnostics, p-value, effect size with confidence interval including optional bootstrapping, as well as graphing. Also includes several example datasets. For more details, see the web documentation <https://lmarusich.github.io/rmcorr/index.html> and the original paper: Bakdash and Marusich (2017) <doi:10.3389/fpsyg.2017.00456>.

Maintained by Laura R. Marusich. Last updated 7 months ago.

6.7 match 7 stars 9.18 score 304 scripts

xfim

ggmcmc:Tools for Analyzing MCMC Simulations from Bayesian Inference

Tools for assessing and diagnosing convergence of Markov Chain Monte Carlo simulations, as well as for graphically displaying results from full MCMC analysis. The package also facilitates the graphical interpretation of models by providing flexible functions to plot the results against observed variables, and functions to work with hierarchical/multilevel batches of parameters (Fernández-i-Marín, 2016 <doi:10.18637/jss.v070.i09>).

Maintained by Xavier Fernández i Marín. Last updated 2 years ago.

bayesian-data-analysis ggplot2 graphical jags mcmc stan

5.0 match 112 stars 12.02 score 1.6k scripts 8 dependents

davidorme

caper:Comparative Analyses of Phylogenetics and Evolution in R

Functions for performing phylogenetic comparative analyses.

Maintained by David Orme. Last updated 1 year ago.

7.9 match 1 stars 7.41 score 928 scripts 5 dependents

bioc

GRaNIE:GRaNIE: Reconstruction cell type specific gene regulatory networks including enhancers using single-cell or bulk chromatin accessibility and RNA-seq data

Genetic variants associated with diseases often affect non-coding regions, thus likely having a regulatory role. To understand the effects of genetic variants in these regulatory regions, identifying genes that are modulated by specific regulatory elements (REs) is crucial. The effect of gene regulatory elements, such as enhancers, is often cell-type specific, likely because the combinations of transcription factors (TFs) that are regulating a given enhancer have cell-type specific activity. This TF activity can be quantified with existing tools such as diffTF and captures differences in binding of a TF in open chromatin regions. Collectively, this forms a gene regulatory network (GRN) with cell-type and data-specific TF-RE and RE-gene links. Here, we reconstruct such a GRN using single-cell or bulk RNAseq and open chromatin (e.g., using ATACseq or ChIPseq for open chromatin marks) and optionally (Capture) Hi-C data. Our network contains different types of links, connecting TFs to regulatory elements, the latter of which is connected to genes in the vicinity or within the same chromatin domain (TAD). We use a statistical framework to assign empirical FDRs and weights to all links using a permutation-based approach.

Maintained by Christian Arnold. Last updated 5 months ago.

software geneexpression generegulation networkinference genesetenrichment biomedicalinformatics genetics transcriptomics atacseq rnaseq graphandnetwork regression transcription chipseq

10.8 match 5.40 score 24 scripts

tmsalab

simcdm:Simulate Cognitive Diagnostic Model ('CDM') Data

Provides efficient R and 'C++' routines to simulate cognitive diagnostic model data for Deterministic Input, Noisy "And" Gate ('DINA') and reduced Reparameterized Unified Model ('rRUM') from Culpepper and Hudson (2017) <doi:10.1177/0146621617707511>, Culpepper (2015) <doi:10.3102/1076998615595403>, and de la Torre (2009) <doi:10.3102/1076998607309474>.

Maintained by James Joseph Balamuta. Last updated 1 year ago.

cognitive-diagnostic-models psychometrics rcpp rcpparmadillo simulation openblas cpp

11.8 match 4.95 score 15 scripts 2 dependents

bioc

metaseqR2:An R package for the analysis and result reporting of RNA-Seq data by combining multiple statistical algorithms

Provides an interface to several normalization and statistical testing packages for RNA-Seq gene expression data. Additionally, it creates several diagnostic plots, performs meta-analysis by combining the results of several statistical tests and reports the results in an interactive way.

Maintained by Panagiotis Moulos. Last updated 3 days ago.

software geneexpression differentialexpression workflowstep preprocessing qualitycontrol normalization reportwriting rnaseq transcription sequencing transcriptomics bayesian clustering cellbiology biomedicalinformatics functionalgenomics systemsbiology immunooncology alternativesplicing differentialsplicing multiplecomparison timecourse dataimport atacseq epigenetics regression proprietaryplatforms genesetenrichment batcheffect chipseq

9.7 match 7 stars 6.05 score 3 scripts

leef-uzh

LEEF.analysis:Access Functions, Tests and Basic Analysis of the RRD Data from the LEEF Project

Provides simple access functions to read data out of the sqlite RRD database. SQL queries can be configured in a yaml config file and used.

Maintained by Rainer M. Krug. Last updated 1 month ago.

23.6 match 2.44 score 23 scripts

gokmenzararsiz

dtComb:Statistical Combination of Diagnostic Tests

A system for combining two diagnostic tests using various approaches that include statistical and machine-learning-based methodologies. These approaches are divided into four groups: linear combination methods, non-linear combination methods, mathematical operators, and machine learning algorithms. See the <https://biotools.erciyes.edu.tr/dtComb/> website for more information, documentation, and examples.

Maintained by Gokmen Zararsiz. Last updated 5 months ago.

12.1 match 4.70 score 7 scripts

n-kall

priorsense:Prior Diagnostics and Sensitivity Analysis

Provides functions for prior and likelihood sensitivity analysis in Bayesian models. Currently it implements methods to determine the sensitivity of the posterior to power-scaling perturbations of the prior and likelihood.

Maintained by Noa Kallioinen. Last updated 11 days ago.

bayes bayesian bayesian-data-analysis bayesian-methods prior-distribution sensitivity-analysis stan

6.7 match 59 stars 8.27 score 70 scripts

harrysouthworth

texmex:Statistical Modelling of Extreme Values

Statistical extreme value modelling of threshold excesses, maxima and multivariate extremes. Univariate models for threshold excesses and maxima are the Generalised Pareto, and Generalised Extreme Value model respectively. These models may be fitted by using maximum (optionally penalised-)likelihood, or Bayesian estimation, and both classes of models may be fitted with covariates in any/all model parameters. Model diagnostics support the fitting process. Graphical output for visualising fitted models and return level estimates is provided. For serially dependent sequences, the intervals declustering algorithm of Ferro and Segers (2003) <doi:10.1111/1467-9868.00401> is provided, with diagnostic support to aid selection of threshold and declustering horizon. Multivariate modelling is performed via the conditional approach of Heffernan and Tawn (2004) <doi:10.1111/j.1467-9868.2004.02050.x>, with graphical tools for threshold selection and to diagnose estimation convergence.

Maintained by Harry Southworth. Last updated 1 year ago.

cpp

8.0 match 7 stars 6.92 score 66 scripts 1 dependents

alexpkeil1

bkmrhat:Parallel Chain Tools for Bayesian Kernel Machine Regression

Bayesian kernel machine regression (from the 'bkmr' package) is a Bayesian semi-parametric generalized linear model approach under identity and probit links. There are a number of functions in this package that extend Bayesian kernel machine regression fits to allow multiple-chain inference and diagnostics, which leverage functions from the 'future', 'rstan', and 'coda' packages. Reference: Bobb, J. F., Henn, B. C., Valeri, L., & Coull, B. A. (2018). Statistical software for analyzing the health effects of multiple concurrent exposures via Bayesian kernel machine regression <doi:10.1186/s12940-018-0413-y>.

Maintained by Alexander Keil. Last updated 3 years ago.

12.1 match 7 stars 4.54 score 10 scripts

sthomas522

hmclearn:Fit Statistical Models Using Hamiltonian Monte Carlo

Provides users with a framework to learn the intricacies of the Hamiltonian Monte Carlo algorithm with hands-on experience by tuning and fitting their own models. All of the code is written in R. Theoretical references are listed below: Neal, Radford (2011) "Handbook of Markov Chain Monte Carlo" ISBN: 978-1420079418, Betancourt, Michael (2017) "A Conceptual Introduction to Hamiltonian Monte Carlo" <arXiv:1701.02434>, Thomas, S., Tu, W. (2020) "Learning Hamiltonian Monte Carlo in R" <arXiv:2006.16194>, Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013) "Bayesian Data Analysis" ISBN: 978-1439840955, Agresti, Alan (2015) "Foundations of Linear and Generalized Linear Models" ISBN: 978-1118730034, Pinheiro, J., Bates, D. (2006) "Mixed-effects Models in S and S-Plus" ISBN: 978-1441903174.

Maintained by Samuel Thomas. Last updated 4 years ago.

9.7 match 11 stars 5.64 score 16 scripts

cran

biclust:BiCluster Algorithms

The main function biclust() provides several algorithms to find biclusters in two-dimensional data: Cheng and Church (2000, ISBN:1-57735-115-0), spectral (2003) <doi:10.1101/gr.648603>, plaid model (2005) <doi:10.1016/j.csda.2004.02.003>, xmotifs (2003) <doi:10.1142/9789812776303_0008> and bimax (2006) <doi:10.1093/bioinformatics/btl060>. In addition, the package provides methods for data preprocessing (normalization and discretisation), visualisation, and validation of bicluster solutions.

Maintained by Sebastian Kaiser. Last updated 2 years ago.

9.4 match 3 stars 5.79 score 160 scripts 16 dependents

ropengov

hetu:Structural Handling of Finnish Personal Identity Codes

Structural handling of Finnish identity codes (natural persons and organizations); extract information, check ID validity and diagnostics.

Maintained by Pyry Kantanen. Last updated 3 months ago.

ropengov

11.1 match 2 stars 4.86 score 18 scripts

pecanproject

PEcAn.assim.batch:PEcAn Functions Used for Ecological Forecasts and Reanalysis

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.

Maintained by Istem Fer. Last updated 14 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp

5.4 match 216 stars 9.94 score 20 scripts 2 dependents

paulnorthrop

exdex:Estimation of the Extremal Index

Performs frequentist inference for the extremal index of a stationary time series. Two types of methodology are used. One type is based on a model that relates the distribution of block maxima to the marginal distribution of series and leads to the semiparametric maxima estimators described in Northrop (2015) <doi:10.1007/s10687-015-0221-5> and Berghaus and Bucher (2018) <doi:10.1214/17-AOS1621>. Sliding block maxima are used to increase precision of estimation. A graphical block size diagnostic is provided. The other type of methodology uses a model for the distribution of threshold inter-exceedance times (Ferro and Segers (2003) <doi:10.1111/1467-9868.00401>). Three versions of this type of approach are provided: the iterated weight least squares approach of Suveges (2007) <doi:10.1007/s10687-007-0034-2>, the K-gaps model of Suveges and Davison (2010) <doi:10.1214/09-AOAS292> and a similar approach of Holesovsky and Fusek (2020) <doi:10.1007/s10687-020-00374-3> that we refer to as D-gaps. For the K-gaps and D-gaps models this package allows missing values in the data, can accommodate independent subsets of data, such as monthly or seasonal time series from different years, and can incorporate information from right-censored inter-exceedance times. Graphical diagnostics for the threshold level and the respective tuning parameters K and D are provided.

Maintained by Paul J. Northrop. Last updated 11 months ago.

block-maxima extremal-index extreme extreme-value-statistics extremes inference maxima semiparametric semiparametric-estimation semiparametric-maxima-estimators theta threshold value cpp

10.9 match 4.92 score 11 scripts 5 dependents

hoehna

TESS:Diversification Rate Estimation and Fast Simulation of Reconstructed Phylogenetic Trees under Tree-Wide Time-Heterogeneous Birth-Death Processes Including Mass-Extinction Events

Simulation of reconstructed phylogenetic trees under tree-wide time-heterogeneous birth-death processes and estimation of diversification parameters under the same model. Speciation and extinction rates can be any function of time and mass-extinction events at specific times can be provided. Trees can be simulated either conditioned on the number of species, the time of the process, or both. Additionally, the likelihood equations are implemented for convenience and can be used for Maximum Likelihood (ML) estimation and Bayesian inference.

Maintained by Sebastian Hoehna. Last updated 3 years ago.

cpp

8.9 match 2 stars 5.93 score 95 scripts 1 dependents

bioc

methylumi:Handle Illumina methylation data

This package provides classes for holding and manipulating Illumina methylation data. Based on eSet, it can contain MIAME information, sample information, feature information, and multiple matrices of data. An "intelligent" import function, methylumiR can read the Illumina text files and create a MethyLumiSet. methylumIDAT can directly read raw IDAT files from HumanMethylation27 and HumanMethylation450 microarrays. Normalization, background correction, and quality control features for GoldenGate, Infinium, and Infinium HD arrays are also included.

Maintained by Sean Davis. Last updated 5 months ago.

dnamethylation twochannel preprocessing qualitycontrol cpgisland

5.3 match 9 stars 9.90 score 89 scripts 9 dependents

tmsalab

ohoegdm:Ordinal Higher-Order Exploratory General Diagnostic Model for Polytomous Data

Perform a Bayesian estimation of the ordinal exploratory Higher-order General Diagnostic Model (OHOEGDM) for Polytomous Data described by Culpepper, S. A. and Balamuta, J. J. (In Press) <doi:10.1080/00273171.2021.1985949>.

Maintained by James Joseph Balamuta. Last updated 3 years ago.

diagnostic-model exploratory-diagnostic-models psychometrics openblas cpp openmp

19.1 match 2.70 score

insightsengineering

rbmi:Reference Based Multiple Imputation

Implements standard and reference based multiple imputation methods for continuous longitudinal endpoints (Gower-Page et al. (2022) <doi:10.21105/joss.04251>). In particular, this package supports deterministic conditional mean imputation and jackknifing as described in Wolbers et al. (2022) <doi:10.1002/pst.2234>, Bayesian multiple imputation as described in Carpenter et al. (2013) <doi:10.1080/10543406.2013.834911>, and bootstrapped maximum likelihood imputation as described in von Hippel and Bartlett (2021) <doi:10.1214/20-STS793>.

Maintained by Isaac Gravestock. Last updated 22 days ago.

5.9 match 18 stars 8.78 score 33 scripts 1 dependents

akreutzmann

trafo:Estimation, Comparison and Selection of Transformations

Estimation, selection and comparison of several families of transformations. The families of transformations included in the package are the following: Bickel-Doksum (Bickel and Doksum 1981 <doi:10.2307/2287831>), Box-Cox, Dual (Yang 2006 <doi:10.1016/j.econlet.2006.01.011>), Glog (Durbin et al. 2002 <doi:10.1093/bioinformatics/18.suppl_1.S105>), gpower (Kelmansky et al. 2013 <doi:10.1515/sagmb-2012-0030>), Log, Log-shift opt (Feng et al. 2016 <doi:10.1002/sta4.104>), Manly, modulus (John and Draper 1980 <doi:10.2307/2986305>), Neglog (Whittaker et al. 2005 <doi:10.1111/j.1467-9876.2005.00520.x>), Reciprocal and Yeo-Johnson. The package makes it simple to compare linear models with untransformed and transformed dependent variables, as well as linear models in which the dependent variable is transformed with different transformations. Furthermore, the package employs maximum likelihood approaches, moments optimization and divergence minimization to estimate the optimal transformation parameter.

Maintained by Lily Medina. Last updated 5 years ago.

12.3 match 4.16 score 72 scripts

florianhartig

BayesianTools:General-Purpose MCMC and SMC Samplers and Tools for Bayesian Statistics

General-purpose MCMC and SMC samplers, as well as plots and diagnostic functions for Bayesian statistics, with a particular focus on calibrating complex system models. Implemented samplers include various Metropolis MCMC variants (including adaptive and/or delayed rejection MH), the T-walk, two differential evolution MCMCs, two DREAM MCMCs, and a sequential Monte Carlo (SMC) particle filter.

Maintained by Florian Hartig. Last updated 1 year ago.

bayes ecological-models mcmc optimization smc systems-biology cpp

5.0 match 122 stars 10.17 score 580 scripts 5 dependents

boopsboops

spider:Species Identity and Evolution in R

Analysis of species limits and DNA barcoding data. Included are functions for generating important summary statistics from DNA barcode data, assessing specimen identification efficacy, testing and optimizing divergence threshold limits, assessment of diagnostic nucleotides, and calculation of the probability of reciprocal monophyly. Additionally, a sliding window function offers opportunities to analyse information across a gene, often used for marker design in degraded DNA studies. Further information on the package has been published in Brown et al. (2012) <doi:10.1111/j.1755-0998.2011.03108.x>.

Maintained by Rupert A. Collins. Last updated 6 years ago.

dna-barcode edna evolution species-delimitation species-identity

9.8 match 2 stars 5.20 score 66 scripts 1 dependents

cran

evd:Functions for Extreme Value Distributions

Extends simulation, distribution, quantile and density functions to univariate and multivariate parametric extreme value distributions, and provides fitting functions which calculate maximum likelihood estimates for univariate and bivariate maxima models, and for univariate and bivariate threshold models.

Maintained by Alec Stephenson. Last updated 6 months ago.

5.3 match 2 stars 9.46 score 748 scripts 82 dependents

cdriveraus

ctsem:Continuous Time Structural Equation Modelling

Hierarchical continuous (and discrete) time state space modelling, for linear and nonlinear systems measured by continuous variables, with limited support for binary data. The subject specific dynamic system is modelled as a stochastic differential equation (SDE) or difference equation; measurement models are typically multivariate normal factor models. Linear mixed effects SDEs estimated via maximum likelihood and optimization are the default. Nonlinearities (state dependent parameters) and random effects on all parameters are possible, using either max likelihood / max a posteriori optimization (with optional importance sampling) or Stan's Hamiltonian Monte Carlo sampling. See <https://github.com/cdriveraus/ctsem/raw/master/vignettes/hierarchicalmanual.pdf> for details. Priors may be used. For the conceptual overview of the hierarchical Bayesian linear SDE approach, see <https://www.researchgate.net/publication/324093594_Hierarchical_Bayesian_Continuous_Time_Dynamic_Modeling>. Exogenous inputs may also be included; for an overview of such possibilities see <https://www.researchgate.net/publication/328221807_Understanding_the_Time_Course_of_Interventions_with_Continuous_Time_Dynamic_Models>. Stan based functions are not available on 32 bit Windows systems at present. <https://cdriver.netlify.app/> contains some tutorial blog posts.

Maintained by Charles Driver. Last updated 10 days ago.

stochastic-differential-equations time-series cpp

5.3 match 42 stars 9.58 score 366 scripts 1 dependents

ddimmery

tidyhte:Tidy Estimation of Heterogeneous Treatment Effects

Estimates heterogeneous treatment effects using tidy semantics on experimental or observational data. Methods are based on the doubly-robust learner of Kennedy (n.d.) <arXiv:2004.14497>. You provide a simple recipe for what machine learning algorithms to use in estimating the nuisance functions and 'tidyhte' will take care of cross-validation, estimation, model selection, diagnostics and construction of relevant quantities of interest about the variability of treatment effects.

Maintained by Drew Dimmery. Last updated 2 years ago.

9.4 match 14 stars 5.36 score 11 scripts

itfeature

mctest:Multicollinearity Diagnostic Measures

Computes popular and widely used multicollinearity diagnostic measures; see <doi:10.17576/jsm-2019-4809-26> and <doi:10.32614/RJ-2016-062>. The package also indicates which regressors may be the cause of collinearity among regressors.

Maintained by Imdad Ullah Muhammad. Last updated 5 years ago.

11.4 match 4.39 score 1.0k scripts 1 dependents

kaigu1990

mcradds:Processing and Analyzing of Diagnostics Trials

Provides methods and functions to analyze the quantitative or qualitative performance of diagnostic assays; outlier detection, reader precision and reference ranges are also covered. Most of the methods and algorithms refer to CLSI (Clinical & Laboratory Standards Institute) recommendations and NMPA (National Medical Products Administration) guidelines. In addition, relevant plots are constructed with 'ggplot2'.

Maintained by Kai Gu. Last updated 6 months ago.

in-vitro-diagnostic ivd mcr

12.5 match 1 stars 4.00 score 7 scripts

ericgilleland

ismev:An Introduction to Statistical Modeling of Extreme Values

Functions to support the computations carried out in 'An Introduction to Statistical Modeling of Extreme Values' by Stuart Coles. The functions may be divided into the following groups: maxima/minima, order statistics, peaks over thresholds and point processes.

Maintained by Eric Gilleland. Last updated 7 years ago.

9.2 match 1 stars 5.19 score 326 scripts 13 dependents

alarm-redist

redist:Simulation Methods for Legislative Redistricting

Enables researchers to sample redistricting plans from a pre-specified target distribution using Sequential Monte Carlo and Markov Chain Monte Carlo algorithms. The package allows for the implementation of various constraints in the redistricting process such as geographic compactness and population parity requirements. Tools for analysis such as computation of various summary statistics and plotting functionality are also included. The package implements the SMC algorithm of McCartan and Imai (2023) <doi:10.1214/23-AOAS1763>, the enumeration algorithm of Fifield, Imai, Kawahara, and Kenny (2020) <doi:10.1080/2330443X.2020.1791773>, the Flip MCMC algorithm of Fifield, Higgins, Imai and Tarr (2020) <doi:10.1080/10618600.2020.1739532>, the Merge-split/Recombination algorithms of Carter et al. (2019) <arXiv:1911.01503> and DeFord et al. (2021) <doi:10.1162/99608f92.eb30390f>, and the Short-burst optimization algorithm of Cannon et al. (2020) <arXiv:2011.02288>.

Maintained by Christopher T. Kenny. Last updated 2 months ago.

geospatial gerrymandering redistricting sampling openblas cpp openmp

5.2 match 68 stars 9.17 score 259 scripts

ggpmxdevelopment

ggPMX:'ggplot2' Based Tool to Facilitate Diagnostic Plots for NLME Models

At Novartis, we aimed at standardizing the set of diagnostic plots used for modeling activities in order to reduce the overall effort required for generating such plots. For this, we developed a guidance that proposes an adequate set of diagnostics and a toolbox, called 'ggPMX' to execute them. 'ggPMX' is a toolbox that can generate all diagnostic plots at a quality sufficient for publication and submissions using few lines of code. This package focuses on plots recommended by ISoP <doi:10.1002/psp4.12161>. While not required, you can get/install the 'R' 'lixoftConnectors' package in the 'Monolix' installation, as described at the following url <https://monolix.lixoft.com/monolix-api/lixoftconnectors_installation/>. When 'lixoftConnectors' is available, 'R' can use 'Monolix' directly to create the required Chart Data instead of exporting it from the 'Monolix' gui.

Maintained by Matthew Fidler. Last updated 1 year ago.

pharmacometrics pmx reporting

6.6 match 39 stars 7.23 score 80 scripts

yuimaproject

yuima:The YUIMA Project Package for SDEs

Simulation and Inference for SDEs and Other Stochastic Processes.

Maintained by Stefano M. Iacus. Last updated 1 day ago.

openblas cpp

6.5 match 9 stars 7.26 score 92 scripts 2 dependents

appsilon

rhino:A Framework for Enterprise Shiny Applications

A framework that supports creating and extending enterprise Shiny applications using best practices.

Maintained by Kamil Żyła. Last updated 2 days ago.

rhinoverse shiny

5.3 match 304 stars 8.99 score 145 scripts

ncss-tech

SoilTaxonomy:A System of Soil Classification for Making and Interpreting Soil Surveys

Taxonomic dictionaries, formative element lists, and functions related to the maintenance, development and application of U.S. Soil Taxonomy. Data and functionality are based on official U.S. Department of Agriculture sources including the latest edition of the Keys to Soil Taxonomy. Descriptions and metadata are obtained from the National Soil Information System or Soil Survey Geographic databases. Other sources are referenced in the data documentation. Provides tools for understanding and interacting with concepts in the U.S. Soil Taxonomic System. Most of the current utilities are for working with taxonomic concepts at the "higher" taxonomic levels: Order, Suborder, Great Group, and Subgroup.

Maintained by Andrew Brown. Last updated 5 months ago.

great-group ncss-tech soil soil-survey soil-taxonomy subgroup suborder usda

8.3 match 15 stars 5.65 score

aalfons

robmed:(Robust) Mediation Analysis

Perform mediation analysis via the fast-and-robust bootstrap test ROBMED (Alfons, Ates & Groenen, 2022a; <doi:10.1177/1094428121999096>), as well as various other methods. Details on the implementation and code examples can be found in Alfons, Ates, and Groenen (2022b) <doi:10.18637/jss.v103.i13>. Further discussion on robust mediation analysis can be found in Alfons & Schley (2024) <doi:10.31234/osf.io/2hqdy>.

Maintained by Andreas Alfons. Last updated 14 days ago.

7.2 match 6 stars 6.35 score 31 scripts 1 dependents

michaelchirico

potools:Tools for Internationalization and Portability in R Packages

Translating messages in R packages is managed using the po top-level directory and the 'gettext' program. This package provides some helper functions for building this support in R packages, e.g. common validation & I/O tasks.

Maintained by Michael Chirico. Last updated 9 months ago.

i18n translation

6.3 match 59 stars 7.20 score 15 scripts

bioc

arrayQuality:Assessing array quality on spotted arrays

Functions for performing print-run and array level quality assessment.

Maintained by Agnes Paquet. Last updated 5 months ago.

microarray twochannel qualitycontrol visualization

13.8 match 3.30 score 10 scripts

seungjae2525

MRMCbinary:Multi-Reader Multi-Case Analysis of Binary Diagnostic Tests

The goal of 'MRMCbinary' is to compare the performance of diagnostic tests (i.e., sensitivity and specificity) for binary outcomes in multi-reader multi-case (MRMC) studies. It is based on conditional logistic regression and Cochran’s Q test (or McNemar’s test when the number of modalities is equal to 2).

Maintained by Seungjae Lee. Last updated 23 days ago.

binary-data diagnostic-accuracy mrmc

14.2 match 1 stars 3.18 score

tilburgnetworkgroup

remstimate:Optimization Frameworks for Tie-Oriented and Actor-Oriented Relational Event Models

A comprehensive set of tools designed for optimizing likelihood within a tie-oriented (Butts, C., 2008, <doi:10.1111/j.1467-9531.2008.00203.x>) or an actor-oriented modelling framework (Stadtfeld, C., & Block, P., 2017, <doi:10.15195/v4.a14>) in relational event networks. The package accommodates both frequentist and Bayesian approaches. The frequentist approaches that the package incorporates are the Maximum Likelihood Optimization (MLE) and the Gradient-based Optimization (GDADAMAX). The Bayesian methodologies included in the package are the Bayesian Sampling Importance Resampling (BSIR) and the Hamiltonian Monte Carlo (HMC). The flexibility of choosing between frequentist and Bayesian optimization approaches allows researchers to select the estimation approach which aligns the most with their analytical preferences.

Maintained by Giuseppe Arena. Last updated 2 months ago.

openblas cpp openmp

8.8 match 5 stars 5.15 score 14 scripts

friendly

VisCollin:Visualizing Collinearity Diagnostics

Provides methods to calculate diagnostics for multicollinearity among predictors in a linear or generalized linear model. It also provides methods to visualize those diagnostics following Friendly & Kwan (2009), "Where’s Waldo: Visualizing Collinearity Diagnostics", <doi:10.1198/tast.2009.0012>. These include better tabular presentation of collinearity diagnostics that highlight the important numbers, a semi-graphic tableplot of the diagnostics to make warning and danger levels more salient, and a "collinearity biplot" of the smallest dimensions of predictor space, where collinearity is most apparent.

Maintained by Michael Friendly. Last updated 1 year ago.

biplots collinearity-diagnostics graphics regression-models

16.1 match 1 stars 2.78 score 12 scripts

siacus

cem:Coarsened Exact Matching

Implementation of the Coarsened Exact Matching algorithm discussed along with its properties in Iacus, King, Porro (2011) <DOI:10.1198/jasa.2011.tm09599>; Iacus, King, Porro (2012) <DOI:10.1093/pan/mpr013> and Iacus, King, Porro (2019) <DOI:10.1017/pan.2018.29>.

Maintained by Stefano M. Iacus. Last updated 3 years ago.

7.8 match 2 stars 5.76 score 239 scripts 1 dependents

gavinsimpson

gratia:Graceful 'ggplot'-Based Graphics and Other Functions for GAMs Fitted Using 'mgcv'

Graceful 'ggplot'-based graphics and utility functions for working with generalized additive models (GAMs) fitted using the 'mgcv' package. Provides a reimplementation of the plot() method for GAMs that 'mgcv' provides, as well as 'tidyverse' compatible representations of estimated smooths.

Maintained by Gavin L. Simpson. Last updated 4 days ago.

distributional-regression gam gamm generalized-additive-mixed-models generalized-additive-models ggplot2 glm lm mgcv penalized-spline random-effects smoothing splines

3.5 match 216 stars 12.68 score 1.6k scripts 1 dependents

r-forge

POT:Generalized Pareto Distribution and Peaks Over Threshold

Some functions useful to perform a Peak Over Threshold analysis in univariate and bivariate cases, see Beirlant et al. (2004) <doi:10.1002/0470012382>. A user guide is available in the vignette.

Maintained by Christophe Dutang. Last updated 5 months ago.

7.1 match 6.19 score 105 scripts 2 dependents

smac-group

simts:Time Series Analysis Tools

A system containing easy-to-use tools to support time series analysis courses. In particular, it incorporates a technique called Generalized Method of Wavelet Moments (GMWM) as well as its robust implementation for fast and robust parameter estimation of time series models, which is described, for example, in Guerrier et al. (2013) <doi:10.1080/01621459.2013.799920>. More details can also be found in the paper linked to via the URL below.

Maintained by Stéphane Guerrier. Last updated 2 years ago.

rcpp rcpparmadillo simulation time-series timeseries timeseries-data openblas cpp

5.7 match 15 stars 7.68 score 59 scripts 4 dependents

ropensci

dynamite:Bayesian Modeling and Causal Inference for Multivariate Longitudinal Data

Easy-to-use and efficient interface for Bayesian inference of complex panel (time series) data using dynamic multivariate panel models by Helske and Tikka (2024) <doi:10.1016/j.alcr.2024.100617>. The package supports joint modeling of multiple measurements per individual, time-varying and time-invariant effects, and a wide range of discrete and continuous distributions. Estimation of these dynamic multivariate panel models is carried out via 'Stan'. For an in-depth tutorial of the package, see (Tikka and Helske, 2024) <doi:10.48550/arXiv.2302.01607>.

Maintained by Santtu Tikka. Last updated 18 days ago.

bayesian-inference panel-data stan statistical-models

5.5 match 29 stars 7.92 score 20 scripts

stan-dev

shinystan:Interactive Visual and Numerical Diagnostics and Posterior Analysis for Bayesian Models

A graphical user interface for interactive Markov chain Monte Carlo (MCMC) diagnostics and plots and tables helpful for analyzing a posterior sample. The interface is powered by the 'Shiny' web application framework from 'RStudio' and works with the output of MCMC programs written in any programming language (and has extended functionality for 'Stan' models fit using the 'rstan' and 'rstanarm' packages).

Maintained by Jonah Gabry. Last updated 3 years ago.

bayesian bayesian-data-analysis bayesian-inference bayesian-methods bayesian-statistics mcmc shiny-apps stan statistical-graphics

3.3 match 198 stars 13.05 score 1.6k scripts 13 dependents

statnet

latentnet:Latent Position and Cluster Models for Statistical Networks

Fit and simulate latent position and cluster models for statistical networks. See Krivitsky and Handcock (2008) <doi:10.18637/jss.v024.i05> and Krivitsky, Handcock, Raftery, and Hoff (2009) <doi:10.1016/j.socnet.2009.04.001>.

Maintained by Pavel N. Krivitsky. Last updated 4 days ago.

openblas

5.2 match 19 stars 8.36 score 191 scripts 4 dependents

flr

FLSAM:An Implementation of the State-Space Assessment Model for FLR

This package provides an FLR wrapper to the SAM state-space assessment model.

Maintained by N.T. Hintzen. Last updated 3 months ago.

9.5 match 4 stars 4.51 score 406 scripts

connordonegan

geostan:Bayesian Spatial Analysis

For spatial data analysis; provides exploratory spatial analysis tools, spatial regression, spatial econometric, and disease mapping models, model diagnostics, and special methods for inference with small area survey data (e.g., the American Community Survey (ACS)) and censored population health monitoring data. Models are pre-specified using the Stan programming language, a platform for Bayesian inference using Markov chain Monte Carlo (MCMC). References: Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>; Donegan (2021) <doi:10.31219/osf.io/3ey65>; Donegan (2022) <doi:10.21105/joss.04716>; Donegan, Chun and Hughes (2020) <doi:10.1016/j.spasta.2020.100450>; Donegan, Chun and Griffith (2021) <doi:10.3390/ijerph18136856>; Morris et al. (2019) <doi:10.1016/j.sste.2019.100301>.

Maintained by Connor Donegan. Last updated 3 months ago.

bayesian bayesian-inference bayesian-statistics epidemiology modeling public-health rspatial spatial stan cpp

4.8 match 80 stars 8.82 score 46 scripts

giorgilancs

PrevMap:Geostatistical Modelling of Spatially Referenced Prevalence Data

Provides functions for both likelihood-based and Bayesian analysis of spatially referenced prevalence data. For a tutorial on the use of the R package, see Giorgi and Diggle (2017) <doi:10.18637/jss.v078.i08>.

Maintained by Emanuele Giorgi. Last updated 2 years ago.

9.8 match 4.36 score 46 scripts

epimodel

EpiModel:Mathematical Modeling of Infectious Disease Dynamics

Tools for simulating mathematical models of infectious disease dynamics. Epidemic model classes include deterministic compartmental models, stochastic individual-contact models, and stochastic network models. Network models use the robust statistical methods of exponential-family random graph models (ERGMs) from the Statnet suite of software packages in R. Standard templates for epidemic modeling include SI, SIR, and SIS disease types. EpiModel features an API for extending these templates to address novel scientific research aims. Full methods for EpiModel are detailed in Jenness et al. (2018, <doi:10.18637/jss.v084.i08>).

Maintained by Samuel Jenness. Last updated 2 months ago.

agent-based-modeling epidemics epidemiology infectious-diseases network-graph cpp

3.7 match 250 stars 11.57 score 315 scripts

murrayefford

secr:Spatially Explicit Capture-Recapture

Functions to estimate the density and size of a spatially distributed animal population sampled with an array of passive detectors, such as traps, or by searching polygons or transects. Models incorporating distance-dependent detection are fitted by maximizing the likelihood. Tools are included for data manipulation and model selection.

Maintained by Murray Efford. Last updated 1 day ago.

cpp

4.1 match 3 stars 10.18 score 410 scripts 5 dependents

helske

diagis:Diagnostic Plot and Multivariate Summary Statistics of Weighted Samples from Importance Sampling

Fast functions for effective sample size, weighted multivariate mean, variance, and quantile computation, and weight diagnostic plot for generic importance sampling type or other probability weighted samples.

Maintained by Jouni Helske. Last updated 2 years ago.

importance-sampling weighted-samples openblas cpp

9.6 match 1 stars 4.32 score 14 scripts 1 dependents

xiangdonggu

icensmis:Study Design and Data Analysis in the Presence of Error-Prone Diagnostic Tests and Self-Reported Outcomes

We consider studies in which information from error-prone diagnostic tests or self-reports is gathered sequentially to determine the occurrence of a silent event. Using a likelihood-based approach incorporating the proportional hazards assumption, we provide functions to estimate the survival distribution and covariate effects. We also provide functions for power and sample size calculations for this setting. Please refer to Xiangdong Gu, Yunsheng Ma, and Raji Balasubramanian (2015) <doi:10.1214/15-AOAS810>, Xiangdong Gu and Raji Balasubramanian (2016) <doi:10.1002/sim.6962>, Xiangdong Gu, Mahlet G Tadesse, Andrea S Foulkes, Yunsheng Ma, and Raji Balasubramanian (2020) <doi:10.1186/s12911-020-01223-w>.

Maintained by Xiangdong Gu. Last updated 4 years ago.

cpp

13.0 match 1 stars 3.18 score 8 scripts 1 dependents

tmsalab

hmcdm:Hidden Markov Cognitive Diagnosis Models for Learning

Fitting hidden Markov models of learning under the cognitive diagnosis framework. Supports estimation of the hidden Markov diagnostic classification model, the first order hidden Markov model, the reduced-reparameterized unified learning model, and the joint learning model for responses and response times.

Maintained by Sunbeom Kwon. Last updated 2 years ago.

cognitive-diagnostic-models psychometrics rcpp rcpparmadillo openblas cpp openmp

7.2 match 7 stars 5.70 score 12 scripts

jwiley

multilevelTools:Multilevel and Mixed Effects Model Diagnostics and Effect Sizes

Effect sizes, diagnostics and performance metrics for multilevel and mixed effects models. Includes marginal and conditional 'R2' estimates for linear mixed effects models based on Johnson (2014) <doi:10.1111/2041-210X.12225>.

Maintained by Joshua F. Wiley. Last updated 12 months ago.

7.1 match 4 stars 5.74 score 136 scripts

cmusso86

recalibratiNN:Quantile Recalibration for Regression Models

Enables the diagnosis and enhancement of regression model calibration. It offers both global and local visualization tools for calibration diagnostics and provides one recalibration method: Torres R, Nott DJ, Sisson SA, Rodrigues T, Reis JG, Rodrigues GS (2024) <doi:10.48550/arXiv.2403.05756>. The method leverages Probability Integral Transform (PIT) values to both evaluate and perform the calibration of statistical models. For a more detailed description of the package, please refer to the bachelor's thesis available below.

Maintained by Carolina Musso. Last updated 2 months ago.

calibration gaussian-models neural-network probability recalibration regression-models

7.5 match 7 stars 5.39 score 8 scripts

business-science

timetk:A Tool Kit for Working with Time Series

Easy visualization, wrangling, and feature engineering of time series data for forecasting and machine learning prediction. Consolidates and extends time series functionality from packages including 'dplyr', 'stats', 'xts', 'forecast', 'slider', 'padr', 'recipes', and 'rsample'.

Maintained by Matt Dancho. Last updated 1 year ago.

coercion coercion-functions data-mining dplyr forecast forecasting forecasting-models machine-learning series-decomposition series-signature tibble tidy tidyquant tidyverse time time-series timeseries

2.8 match 625 stars 14.15 score 4.0k scripts 16 dependents

jm-umn

distfreereg:Distribution-Free Goodness-of-Fit Testing for Regression

Implements distribution-free goodness-of-fit regression testing for the mean structure of parametric models introduced in Khmaladze (2021) <doi:10.1007/s10463-021-00786-3>.

Maintained by Jesse Miller. Last updated 4 months ago.

9.4 match 4.25 score 178 scripts

functionaldata

fdapace:Functional Data Analysis and Empirical Dynamics

A versatile package that provides implementation of various methods of Functional Data Analysis (FDA) and Empirical Dynamics. The core of this package is Functional Principal Component Analysis (FPCA), a key technique for functional data analysis, for sparsely or densely sampled random trajectories and time courses, via the Principal Analysis by Conditional Estimation (PACE) algorithm. This core algorithm yields covariance and mean functions, eigenfunctions and principal component (scores), for both functional data and derivatives, for both dense (functional) and sparse (longitudinal) sampling designs. For sparse designs, it provides fitted continuous trajectories with confidence bands, even for subjects with very few longitudinal observations. PACE is a viable and flexible alternative to random effects modeling of longitudinal data. There is also a Matlab version (PACE) that contains some methods not available on fdapace and vice versa. Updates to fdapace were supported by grants from NIH Echo and NSF DMS-1712864 and DMS-2014626. Please cite our package if you use it (You may run the command citation("fdapace") to get the citation format and bibtex entry). References: Wang, J.L., Chiou, J., Müller, H.G. (2016) <doi:10.1146/annurev-statistics-041715-033624>; Chen, K., Zhang, X., Petersen, A., Müller, H.G. (2017) <doi:10.1007/s12561-015-9137-5>.

Maintained by Yidong Zhou. Last updated 9 months ago.

cpp

3.5 match 31 stars 11.46 score 474 scripts 25 dependents

friendly

mvinfluence:Influence Measures and Diagnostic Plots for Multivariate Linear Models

Computes regression deletion diagnostics for multivariate linear models and provides some associated diagnostic plots. The diagnostic measures include hat-values (leverages), generalized Cook's distance, and generalized squared 'studentized' residuals. Several types of plots to detect influential observations are provided.

Maintained by Michael Friendly. Last updated 2 years ago.

multivariate-analysis multivariate-linear-regression statistics visualization

9.0 match 2 stars 4.41 score 26 scripts

gsucarrat

gets:General-to-Specific (GETS) Modelling and Indicator Saturation Methods

Automated General-to-Specific (GETS) modelling of the mean and variance of a regression, and indicator saturation methods for detecting and testing for structural breaks in the mean; see Pretis, Reade and Sucarrat (2018) <doi:10.18637/jss.v086.i03> for an overview of the package. In advanced use, the estimator and diagnostic tests can be fully user-specified; see Sucarrat (2021) <doi:10.32614/RJ-2021-024>.

Maintained by Genaro Sucarrat. Last updated 8 months ago.

5.8 match 8 stars 6.89 score 73 scripts 3 dependents

bioc

ClustAll:Data driven strategy to robustly identify stratification of patients within complex diseases

Data-driven strategy to find hidden groups of patients with complex diseases using clinical data. ClustAll facilitates the unsupervised identification of multiple robust stratifications. ClustAll is able to overcome the most common limitations found when dealing with clinical data (missing values, correlated data, mixed data types).

Maintained by Asier Ortega-Legarreta. Last updated 5 months ago.

software statisticalmethod clustering dimensionreduction principalcomponent

10.3 match 3.78 score 1 scripts

dwarton

ecostats:Code and Data Accompanying the Eco-Stats Text (Warton 2022)

Functions and data supporting the Eco-Stats text (Warton, 2022, Springer), and solutions to exercises. Functions include tools for using simulation envelopes in diagnostic plots, and a function for diagnostic plots of multivariate linear models. Datasets mentioned in the package are included here (where not available elsewhere) and there is a vignette for each chapter of the text with solutions to exercises.

Maintained by David Warton. Last updated 1 year ago.

5.8 match 8 stars 6.58 score 53 scripts

jolars

tactile:New and Extended Plots, Methods, and Panel Functions for 'lattice'

Extensions to 'lattice', providing new high-level functions, methods for existing functions, panel functions, and a theme.

Maintained by Johan Larsson. Last updated 2 years ago.

lattice linear-models plotting time-series

6.1 match 7 stars 6.33 score 154 scripts

nomahi

dmetatools:Computational tools for meta-analysis of diagnostic accuracy test

Computational tools for meta-analysis of diagnostic accuracy tests. This package enables computation of confidence intervals for the AUC of the summary ROC curve and some related AUC-based inference methods.

Maintained by Hisashi Noma. Last updated 3 years ago.

auc bootstrap diagnostic-tests meta-analysis summary-roc-curve

14.2 match 2.70 score 2 scripts

tyee001

VGAM:Vector Generalized Linear and Additive Models

An implementation of about 6 major classes of statistical regression models. The central algorithm is Fisher scoring and iteratively reweighted least squares. At the heart of this package are the vector generalized linear and additive model (VGLM/VGAM) classes. VGLMs can be loosely thought of as multivariate GLMs. VGAMs are data-driven VGLMs that use smoothing. The book "Vector Generalized Linear and Additive Models: With an Implementation in R" (Yee, 2015) <DOI:10.1007/978-1-4939-2818-7> gives details of the statistical framework and the package. Currently only fixed-effects models are implemented. Many (100+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE. The other classes are RR-VGLMs (reduced-rank VGLMs), quadratic RR-VGLMs, doubly constrained RR-VGLMs, reduced-rank VGAMs, RCIMs (row-column interaction models)---these classes perform constrained and unconstrained quadratic ordination (CQO/UQO) models in ecology, as well as constrained additive ordination (CAO). Hauck-Donner effect detection is implemented. Note that these functions are subject to change; see the NEWS and ChangeLog files for latest changes.

Maintained by Thomas Yee. Last updated 1 months ago.

fortran

3.6 match 10 stars 10.67 score 3.6k scripts 169 dependents

bioc

BindingSiteFinder:Binding site definition based on iCLIP data

Precise knowledge of the binding sites of an RNA-binding protein (RBP) is key to understanding (post-) transcriptional regulatory processes. Here we present a workflow that describes how exact binding sites can be defined from iCLIP data. The package provides functions for binding site definition and result visualization. For details please see the vignette.

Maintained by Mirko Brüggemann. Last updated 4 months ago.

sequencing geneexpression generegulation functionalgenomics coverage dataimport binding-site-classification binding-sites bioconductor-package iclip rna-binding-proteins

6.7 match 6 stars 5.65 score 3 scripts

amices

mice:Multivariate Imputation by Chained Equations

Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm as described in Van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>. Each variable has its own imputation model. Built-in imputation models are provided for continuous data (predictive mean matching, normal), binary data (logistic regression), unordered categorical data (polytomous logistic regression) and ordered categorical data (proportional odds). MICE can also impute continuous two-level data (normal model, pan, second-level variables). Passive imputation can be used to maintain consistency between variables. Various diagnostic plots are available to inspect the quality of the imputations.

Maintained by Stef van Buuren. Last updated 5 days ago.

chained-equations fcs imputation mice missing-data missing-values multiple-imputation multivariate-data cpp

2.3 match 462 stars 16.50 score 10k scripts 154 dependents

cran

MASS:Support Functions and Datasets for Venables and Ripley's MASS

Functions and datasets to support Venables and Ripley, "Modern Applied Statistics with S" (4th edition, 2002).

Maintained by Brian Ripley. Last updated 15 days ago.

3.6 match 19 stars 10.53 score 11k dependents

ohdsi

PatientLevelPrediction:Develop Clinical Prediction Models Using the Common Data Model

A user-friendly way to create patient level prediction models using the Observational Medical Outcomes Partnership Common Data Model. Given a cohort of interest and an outcome of interest, the package can use data in the Common Data Model to build a large set of features. These features can then be used to fit a predictive model with a number of machine learning algorithms. This is further described in Reps (2017) <doi:10.1093/jamia/ocy032>.

Maintained by Egill Fridgeirsson. Last updated 7 days ago.

hades openjdk

3.5 match 190 stars 10.85 score 297 scripts

omniacsdao

Rnumerai:Interface to the Numerai Machine Learning Tournament API

Routines to interact with the Numerai Machine Learning Tournament API <https://numer.ai>. The functionality includes the ability to automatically download the current tournament data, submit predictions, and to get information for your user.

Maintained by Eric Hare. Last updated 2 years ago.

6.8 match 35 stars 5.53 score 39 scripts

farrellday

miceRanger:Multiple Imputation by Chained Equations with Random Forests

Multiple Imputation has been shown to be a flexible method to impute missing values by Van Buuren (2007) <doi:10.1177/0962280206074463>. Expanding on this, random forests have been shown to be an accurate model by Stekhoven and Buhlmann <arXiv:1105.0828> to impute missing values in datasets. They have the added benefits of returning out of bag error and variable importance estimates, as well as being simple to run in parallel.

Maintained by Sam Wilson. Last updated 3 years ago.

imputation-methods machine-learning mice missing-data missing-values random-forests

5.3 match 67 stars 7.09 score 41 scripts 1 dependents

flavjack

inti:Tools and Statistical Procedures in Plant Science

The 'inti' package is part of the 'inkaverse' project for developing different procedures and tools used in plant science and experimental designs. The main aim of the package is to support researchers during the planning of experiments and data collection (tarpuy()), data analysis and graphics (yupana()), and technical writing. Learn more about the 'inkaverse' project at <https://inkaverse.com/>.

Maintained by Flavio Lozano-Isla. Last updated 17 days ago.

agriculture apps inkaverse lmm plant-breeding plant-science shiny

4.5 match 5 stars 8.24 score 193 scripts

bioc

BASiCS:Bayesian Analysis of Single-Cell Sequencing data

Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model to perform statistical analyses of single-cell RNA sequencing datasets in the context of supervised experiments (where the groups of cells of interest are known a priori, e.g. experimental conditions or cell types). BASiCS performs built-in data normalisation (global scaling) and technical noise quantification (based on spike-in genes). BASiCS provides an intuitive detection criterion for highly (or lowly) variable genes within a single group of cells. Additionally, BASiCS can compare gene expression patterns between two or more pre-specified groups of cells. Unlike traditional differential expression tools, BASiCS quantifies changes in expression that lie beyond comparisons of means, also allowing the study of changes in cell-to-cell heterogeneity. The latter can be quantified via a biological over-dispersion parameter that measures the excess of variability that is observed with respect to Poisson sampling noise, after normalisation and technical noise removal. Due to the strong mean/over-dispersion confounding that is typically observed for scRNA-seq datasets, BASiCS also tests for changes in residual over-dispersion, defined by residual values with respect to a global mean/over-dispersion trend.

Maintained by Catalina Vallejos. Last updated 5 months ago.

immunooncology normalization sequencing rnaseq software geneexpression transcriptomics singlecell differentialexpression bayesian cellbiology bioconductor-package gene-expression rcpp rcpparmadillo scrna-seq single-cell openblas cpp openmp

3.6 match 83 stars 10.26 score 368 scripts 1 dependents

bioc

gDNAx:Diagnostics for assessing genomic DNA contamination in RNA-seq data

Provides diagnostics for assessing genomic DNA contamination in RNA-seq data, as well as plots representing these diagnostics. Moreover, the package can be used to get an insight into the strand library protocol used and, in case of strand-specific libraries, the strandedness of the data. Furthermore, it provides functionality to filter out reads of potential gDNA origin.

Maintained by Robert Castelo. Last updated 1 months ago.

transcription transcriptomics rnaseq sequencing preprocessing software geneexpression coverage differentialexpression functionalgenomics splicedalignment alignment

7.3 match 1 stars 5.08 score 3 scripts

coatless-rpkg

surreal:Create Datasets with Hidden Images in Residual Plots

Implements the "Residual (Sur)Realism" algorithm described by Stefanski (2007) <doi:10.1198/000313007X190079> to generate datasets that reveal hidden images or messages in their residual plots. It offers both predefined datasets and tools to embed custom text or images into residual structures, allowing users to create intriguing visual demonstrations for teaching model diagnostics.

Maintained by James Joseph Balamuta. Last updated 6 months ago.

diagnostic-plots residual-analysis residuals

8.0 match 15 stars 4.57 score 4 scripts

doug-friedman

topicdoc:Topic-Specific Diagnostics for LDA and CTM Topic Models

Calculates topic-specific diagnostics (e.g. mean token length, exclusivity) for Latent Dirichlet Allocation and Correlated Topic Models fit using the 'topicmodels' package. For more details, see Chapter 12 in Airoldi et al. (2014, ISBN:9781466504080), pp 262-272 in Mimno et al. (2011, ISBN:9781937284114), and Bischof et al. (2014) <arXiv:1206.4631v1>.

Maintained by Doug Friedman. Last updated 3 years ago.

natural-language-processing text-mining topic-modeling topic-modelling topic-models

6.7 match 25 stars 5.48 score 24 scripts

dwarton

mvabund:Statistical Methods for Analysing Multivariate Abundance Data

A set of tools for displaying, modeling and analysing multivariate abundance data in community ecology. See 'mvabund-package.Rd' for details of overall package organization. The package is implemented with the GNU Scientific Library (<http://www.gnu.org/software/gsl/>) and 'Rcpp' (<http://dirk.eddelbuettel.com/code/rcpp.html>) 'R' / 'C++' classes.

Maintained by David Warton. Last updated 1 year ago.

gsl cpp

3.6 match 10 stars 10.13 score 680 scripts 5 dependents

statnet

ergm.multi:Fit, Simulate and Diagnose Exponential-Family Models for Multiple or Multilayer Networks

A set of extensions for the 'ergm' package to fit multilayer/multiplex/multirelational networks and samples of multiple networks. 'ergm.multi' is a part of the Statnet suite of packages for network analysis. See Krivitsky, Koehly, and Marcum (2020) <doi:10.1007/s11336-020-09720-7> and Krivitsky, Coletti, and Hens (2023) <doi:10.1080/01621459.2023.2242627>.

Maintained by Pavel N. Krivitsky. Last updated 3 months ago.

3.8 match 14 stars 9.67 score 11 scripts 5 dependents

donaldrwilliams

BGGM:Bayesian Gaussian Graphical Models

Fit Bayesian Gaussian graphical models. The methods are separated into two Bayesian approaches for inference: hypothesis testing and estimation. There are extensions for confirmatory hypothesis testing, comparing Gaussian graphical models, and node wise predictability. These methods were recently introduced in the Gaussian graphical model literature, including Williams (2019) <doi:10.31234/osf.io/x8dpr>, Williams and Mulder (2019) <doi:10.31234/osf.io/ypxd8>, Williams, Rast, Pericchi, and Mulder (2019) <doi:10.31234/osf.io/yt386>.

Maintained by Philippe Rast. Last updated 3 months ago.

bayes-factors bayesian-hypothesis-testing gaussian-graphical-models openblas cpp openmp

3.8 match 55 stars 9.64 score 102 scripts 1 dependents

bstaton1

postpack:Utilities for Processing Posterior Samples Stored in 'mcmc.lists'

The aim of 'postpack' is to provide the infrastructure for a standardized workflow for 'mcmc.list' objects. These objects can be used to store output from models fitted with Bayesian inference using 'JAGS', 'WinBUGS', 'OpenBUGS', 'NIMBLE', 'Stan', or even custom MCMC algorithms. Although the 'coda' R package provides some methods for these objects, it is somewhat limited in easily performing post-processing tasks for specific nodes. Models are ever increasing in their complexity and the number of tracked nodes, and oftentimes a user may wish to summarize/diagnose sampling behavior for only a small subset of nodes at a time for a particular question or figure. Thus, many 'postpack' functions support performing tasks on a subset of nodes, where the subset is specified with regular expressions. The functions in 'postpack' streamline the extraction, summarization, and diagnostics of specific monitored nodes after model fitting. Further, because there is rarely only ever one model under consideration, 'postpack' scales efficiently to perform the same tasks on output from multiple models simultaneously, facilitating rapid assessment of model sensitivity to changes in assumptions.

Maintained by Ben Staton. Last updated 2 years ago.

5.3 match 2 stars 6.75 score 233 scripts 1 dependents

nicholasjclark

mvgam:Multivariate (Dynamic) Generalized Additive Models

Fit Bayesian Dynamic Generalized Additive Models to multivariate observations. Users can build nonlinear State-Space models that can incorporate semiparametric effects in observation and process components, using a wide range of observation families. Estimation is performed using Markov Chain Monte Carlo with Hamiltonian Monte Carlo in the software 'Stan'. References: Clark & Wells (2023) <doi:10.1111/2041-210X.13974>.

Maintained by Nicholas J Clark. Last updated 7 hours ago.

bayesian-statistics dynamic-factor-models ecological-modelling forecasting gaussian-process generalised-additive-models generalized-additive-models joint-species-distribution-modelling multilevel-models multivariate-timeseries stan time-series-analysis timeseries vector-autoregression vectorautoregression cpp

3.6 match 139 stars 9.85 score 117 scripts

ncn-foreigners

singleRcapture:Single-Source Capture-Recapture Models

Implementation of single-source capture-recapture methods for population size estimation using zero-truncated, zero-one truncated and zero-truncated one-inflated Poisson, Geometric and Negative Binomial regression as well as Zelterman's and Chao's regression. Package includes point and interval estimators for the population size with variances estimated using analytical or bootstrap method. Details can be found in: van der Heijden et al. (2003) <doi:10.1191/1471082X03st057oa>, Böhning and van der Heijden (2019) <doi:10.1214/18-AOAS1232>, Böhning et al. (2020) Capture-Recapture Methods for the Social and Medical Sciences or Böhning and Friedl (2021) <doi:10.1007/s10260-021-00556-8>.

Maintained by Maciej Beręsewicz. Last updated 30 days ago.

5.8 match 11 stars 6.16 score 29 scripts

stamats

MKpower:Power Analysis and Sample Size Calculation

Power analysis and sample size calculation for Welch and Hsu (Hedderich and Sachs (2018), ISBN:978-3-662-56657-2) t-tests including Monte-Carlo simulations of empirical power and type-I-error. Power and sample size calculation for Wilcoxon rank sum and signed rank tests via Monte-Carlo simulations. Power and sample size required for the evaluation of a diagnostic test(-system) (Flahault et al. (2005), <doi:10.1016/j.jclinepi.2004.12.009>; Dobbin and Simon (2007), <doi:10.1093/biostatistics/kxj036>) as well as for a single proportion (Fleiss et al. (2003), ISBN:978-0-471-52629-2; Piegorsch (2004), <doi:10.1016/j.csda.2003.10.002>; Thulin (2014), <doi:10.1214/14-ejs909>), comparing two negative binomial rates (Zhu and Lakkis (2014), <doi:10.1002/sim.5947>), ANCOVA (Shieh (2020), <doi:10.1007/s11336-019-09692-3>), reference ranges (Jennen-Steinmetz and Wellek (2005), <doi:10.1002/sim.2177>), multiple primary endpoints (Sozu et al. (2015), ISBN:978-3-319-22005-5), and AUC (Hanley and McNeil (1982), <doi:10.1148/radiology.143.1.7063747>).

Maintained by Matthias Kohl. Last updated 6 months ago.

5.9 match 7 stars 5.95 score 32 scripts

matthieu-bruneaux

isotracer:Isotopic Tracer Analysis Using MCMC

Implements Bayesian models to analyze data from tracer addition experiments. The implemented method was originally described in the article "A New Method to Reconstruct Quantitative Food Webs and Nutrient Flows from Isotope Tracer Addition Experiments" by López-Sepulcre et al. (2020) <doi:10.1086/708546>.

Maintained by Matthieu Bruneaux. Last updated 4 months ago.

cpp

5.9 match 5.92 score 60 scripts

r-lib

cli:Helpers for Developing Command Line Interfaces

A suite of tools to build attractive command line interfaces ('CLIs'), from semantic elements: headings, lists, alerts, paragraphs, etc. Supports custom themes via a 'CSS'-like language. It also contains a number of lower level 'CLI' elements: rules, boxes, trees, and 'Unicode' symbols with 'ASCII' alternatives. It supports ANSI colors and text styles as well.

Maintained by Gábor Csárdi. Last updated 4 days ago.

cli

1.8 match 664 stars 19.30 score 1.4k scripts 14k dependents

yeukyul

lindia:Automated Linear Regression Diagnostic

Provides a set of streamlined functions that allow easy generation of linear regression diagnostic plots necessary for checking linear model assumptions. This package is meant for easy scheming of linear regression diagnostics, while preserving the merits of "The Grammar of Graphics" as implemented in 'ggplot2'. See the 'ggplot2' website for more information regarding the specific capability of graphics.

Maintained by Yeuk Yu Lee. Last updated 2 years ago.

5.6 match 104 stars 6.16 score 139 scripts

s-mckay-curtis

mcmcplots:Create Plots from MCMC Output

Functions for convenient plotting and viewing of MCMC output.

Maintained by S. McKay Curtis. Last updated 7 years ago.

5.3 match 4 stars 6.53 score 880 scripts 4 dependents

hannahcomiskey

mcmsupply:Estimating Public and Private Sector Contraceptive Market Supply Shares

Family Planning programs and initiatives typically use nationally representative surveys to estimate key indicators of a country’s family planning progress. However, in recent years, routinely collected family planning services data (Service Statistics) have been used as a supplementary data source to bridge gaps in the surveys. The use of service statistics comes with the caveat that adjustments need to be made for missing private sector contributions to the contraceptive method supply chain. Evaluating the supply source of modern contraceptives often relies on Demographic Health Surveys (DHS), where many countries do not have recent data beyond 2015/16. Fortunately, in the absence of recent surveys we can rely on statistical model-based estimates and projections to fill the knowledge gap. We present a Bayesian, hierarchical, penalized-spline model with multivariate-normal spline coefficients, to account for across-method correlations, to produce country-specific, annual estimates for the proportion of modern contraceptive methods coming from the public and private sectors. This package provides a quick and convenient way for users to access the DHS modern contraceptive supply share data at national and subnational administration levels, and to estimate, evaluate and plot annual estimates with uncertainty for a sample of low- and middle-income countries. Methods for the estimation of method supply shares at the national level are described in Comiskey, Alkema, Cahill (2022) <arXiv:2212.03844>.

Maintained by Hannah Comiskey. Last updated 11 months ago.

jags cpp

6.7 match 2 stars 5.15 score 20 scripts

tmsalab

cIRT:Choice Item Response Theory

Jointly model the accuracy of cognitive responses and item choices within a Bayesian hierarchical framework as described by Culpepper and Balamuta (2015) <doi:10.1007/s11336-015-9484-7>. In addition, the package contains the datasets used within the analysis of the paper.

Maintained by James Joseph Balamuta. Last updated 3 years ago.

armadillo bayesian choice cognitive-diagnostic-models gibbs-sampling item-response-theory rcpparmadillo openblas cpp openmp

6.7 match 4 stars 5.14 score 23 scripts

kvasilopoulos

exuber:Econometric Analysis of Explosive Time Series

Testing for and dating periods of explosive dynamics (exuberance) in time series using the univariate and panel recursive unit root tests proposed by Phillips et al. (2015) <doi:10.1111/iere.12132> and Pavlidis et al. (2016) <doi:10.1007/s11146-015-9531-2>. The recursive least-squares algorithm utilizes the matrix inversion lemma to avoid matrix inversion, which results in significant speed improvements. Also provides simulation of a variety of periodically-collapsing bubble processes. Details can be found in Vasilopoulos et al. (2022) <doi:10.18637/jss.v103.i10>.

Maintained by Kostas Vasilopoulos. Last updated 12 months ago.

dickey-fuller explosive-dynamics simulation time-series openblas cpp

5.0 match 29 stars 6.83 score 77 scripts

blue-matter

SAMtool:Stock Assessment Methods Toolkit

Simulation tools for closed-loop simulation are provided for the 'MSEtool' operating model to inform data-rich fisheries. 'SAMtool' provides a conditioning model, assessment models of varying complexity with standardized reporting, model-based management procedures, and diagnostic tools for evaluating assessments inside closed-loop simulation.

Maintained by Quang Huynh. Last updated 18 days ago.

cpp

5.2 match 3 stars 6.49 score 36 scripts 1 dependents

bayesstats

jarbes:Just a Rather Bayesian Evidence Synthesis

Provides a new class of Bayesian meta-analysis models that incorporates a model for internal and external validity bias. In this way, it is possible to combine studies of diverse quality and different types. For example, we can combine the results of randomized controlled trials (RCTs) with the results of observational studies (OS).

Maintained by Pablo Emilio Verde. Last updated 3 months ago.

jags cpp

17.6 match 1 stars 1.91 score 27 scripts

rich-payne

dreamer:Dose Response Models for Bayesian Model Averaging

Fits dose-response models utilizing a Bayesian model averaging approach as outlined in Gould (2019) <doi:10.1002/bimj.201700211> for both continuous and binary responses. Longitudinal dose-response modeling is also supported in a Bayesian model averaging framework as outlined in Payne, Ray, and Thomann (2024) <doi:10.1080/10543406.2023.2292214>. Functions for plotting and calculating various posterior quantities (e.g. posterior mean, quantiles, probability of minimum efficacious dose, etc.) are also implemented. Copyright Eli Lilly and Company (2019).

Maintained by Richard Daniel Payne. Last updated 3 months ago.

bayesian dose-response-modeling jags cpp

6.4 match 9 stars 5.26 score 5 scripts

paulnorthrop

chandwich:Chandler-Bate Sandwich Loglikelihood Adjustment

Performs adjustments of a user-supplied independence loglikelihood function using a robust sandwich estimator of the parameter covariance matrix, based on the methodology in Chandler and Bate (2007) <doi:10.1093/biomet/asm015>. This can be used for cluster correlated data when interest lies in the parameters of the marginal distributions or for performing inferences that are robust to certain types of model misspecification. Functions for profiling the adjusted loglikelihoods are also provided, as are functions for calculating and plotting confidence intervals, for single model parameters, and confidence regions, for pairs of model parameters. Nested models can be compared using an adjusted likelihood ratio test.

Maintained by Paul J. Northrop. Last updated 2 years ago.

clustered-data clusters composite-likelihood independence-loglikelihood mle robust sandwich statistical-inference

5.6 match 4 stars 5.88 score 18 scripts 7 dependents

cotterell

TDCM:The Transition Diagnostic Classification Model Framework

Estimate the transition diagnostic classification model (TDCM) described in Madison & Bradshaw (2018) <doi:10.1007/s11336-018-9638-5>, a longitudinal extension of the log-linear cognitive diagnosis model (LCDM) in Henson, Templin & Willse (2009) <doi:10.1007/s11336-008-9089-5>. As the LCDM subsumes many other diagnostic classification models (DCMs), many other DCMs can be estimated longitudinally via the TDCM. The 'TDCM' package includes functions to estimate the single-group and multigroup TDCM, summarize results of interest including item parameters, growth proportions, transition probabilities, transitional reliability, attribute correlations, model fit, and growth plots.

Maintained by Michael E. Cotterell. Last updated 4 months ago.

statistics

7.2 match 4.54 score 5 scripts

modeloriented

auditor:Model Audit - Verification, Validation, and Error Analysis

Provides an easy-to-use unified interface for creating validation plots for any model. The 'auditor' helps to avoid repetitive work consisting of writing code needed to create residual plots. These visualizations make it possible to assess and compare the goodness of fit, performance, and similarity of models.

Maintained by Alicja Gosiewska. Last updated 1 year ago.

classification error-analysis explainable-artificial-intelligence machine-learning model-validation regression-models residuals xai

3.8 match 58 stars 8.76 score 94 scripts 2 dependents

statnet

lolog:Latent Order Logistic Graph Models

Estimation of Latent Order Logistic (LOLOG) Models for Networks. LOLOGs are a flexible and fully general class of statistical graph models. This package provides functions for performing MOM, GMM and variational inference. Visual diagnostics and goodness of fit metrics are provided. See Fellows (2018) <arXiv:1804.04583> for a detailed description of the methods.

Maintained by Ian E. Fellows. Last updated 1 year ago.

cpp

5.9 match 5 stars 5.56 score 72 scripts

bioc

maaslin3:"Refining and extending generalized multivariate linear models for meta-omic association discovery"

MaAsLin 3 refines and extends generalized multivariate linear models for meta-omic association discovery. It finds abundance and prevalence associations between microbiome meta-omics features and complex metadata in population-scale epidemiological studies. The software includes multiple analysis methods (including support for multiple covariates, repeated measures, and ordered predictors), filtering, normalization, and transform options to customize analysis for your specific study.

Maintained by William Nickols. Last updated 8 hours ago.

metagenomics software microbiome normalization multiplecomparison

4.0 match 32 stars 8.12 score 34 scripts

paulnorthrop

threshr:Threshold Selection and Uncertainty for Extreme Value Analysis

Provides functions for the selection of thresholds for use in extreme value models, based mainly on the methodology in Northrop, Attalides and Jonathan (2017) <doi:10.1111/rssc.12159>. It also performs predictive inferences about future extreme values, based either on a single threshold or on a weighted average of inferences from multiple thresholds, using the 'revdbayes' package <https://cran.r-project.org/package=revdbayes>. At the moment only the case where the data can be treated as independent identically distributed observations is considered.

Maintained by Paul J. Northrop. Last updated 1 months ago.

extreme-value-statistics extremes generalized inference pareto plot prediction threshold threshold-selection uncertainty

5.6 match 6 stars 5.72 score 29 scripts 1 dependents

choonghyunryu

dlookr:Tools for Data Diagnosis, Exploration, Transformation

A collection of tools that support data diagnosis, exploration, and transformation. Data diagnostics provides information and visualization of missing values, outliers, and unique and negative values to help you understand the distribution and quality of your data. Data exploration provides information and visualization of the descriptive statistics of univariate variables, normality tests and outliers, correlation of two variables, and the relationship between the target variable and predictor. Data transformation supports binning for categorizing continuous variables, imputes missing values and outliers, and resolves skewness. And it creates automated reports that support these three tasks.

Maintained by Choonghyun Ryu. Last updated 9 months ago.

2.9 match 212 stars 11.05 score 748 scripts 2 dependents

wenchao-ma

GDINA:The Generalized DINA Model Framework

A set of psychometric tools for cognitive diagnosis modeling based on the generalized deterministic inputs, noisy and gate (G-DINA) model by de la Torre (2011) <DOI:10.1007/s11336-011-9207-7> and its extensions, including the sequential G-DINA model by Ma and de la Torre (2016) <DOI:10.1111/bmsp.12070> for polytomous responses, and the polytomous G-DINA model by Chen and de la Torre <DOI:10.1177/0146621613479818> for polytomous attributes. Joint attribute distribution can be independent, saturated, higher-order, loglinear smoothed or structured. Q-matrix validation, item and model fit statistics, model comparison at test and item level and differential item functioning can also be conducted. A graphical user interface is also provided. For tutorials, please check Ma and de la Torre (2020) <DOI:10.18637/jss.v093.i14>, Ma and de la Torre (2019) <DOI:10.1111/emip.12262>, Ma (2019) <DOI:10.1007/978-3-030-05584-4_29> and de la Torre and Akbay (2019).

Maintained by Wenchao Ma. Last updated 30 days ago.

cdm cognitive-diagnosis dcm dina-model dino estimation-models gdina item-response-theory psychometrics openblas cpp

3.5 match 30 stars 8.92 score 94 scripts 6 dependents

usepa

spmodel:Spatial Statistical Modeling and Prediction

Fit, summarize, and predict for a variety of spatial statistical models applied to point-referenced and areal (lattice) data. Parameters are estimated using various methods. Additional modeling features include anisotropy, non-spatial random effects, partition factors, big data approaches, and more. Model-fit statistics are used to summarize, visualize, and compare models. Predictions at unobserved locations are readily obtainable. For additional details, see Dumelle et al. (2023) <doi:10.1371/journal.pone.0282524>.

Maintained by Michael Dumelle. Last updated 2 days ago.

4.1 match 15 stars 7.66 score 112 scripts 3 dependents

guillaumepressiat

pmeasyr:PMSI Data with R

Import of PMSI data. Archive management. Formats since 2011. Connection and interface with a database. requetr. Valuation of rsa and rapss.

Maintained by Guillaume Pressiat. Last updated 12 days ago.

had mco pmsi psy ssr

4.7 match 20 stars 6.76 score 53 scripts

cran

smdi:Perform Structural Missing Data Investigations

An easy to use implementation of routine structural missing data diagnostics with functions to visualize the proportions of missing observations, investigate missing data patterns and conduct various empirical missing data diagnostic tests. Reference: Weberpals J, Raman SR, Shaw PA, Lee H, Hammill BG, Toh S, Connolly JG, Dandreo KJ, Tian F, Liu W, Li J, Hernández-Muñoz JJ, Glynn RJ, Desai RJ. smdi: an R package to perform structural missing data investigations on partially observed confounders in real-world evidence studies. JAMIA Open. 2024 Jan 31;7(1):ooae008. <doi:10.1093/jamiaopen/ooae008>.

Maintained by Janick Weberpals. Last updated 5 months ago.

10.5 match 3.00 score

bioc

bluster:Clustering Algorithms for Bioconductor

Wraps common clustering algorithms in an easily extended S4 framework. Backends are implemented for hierarchical, k-means and graph-based clustering. Several utilities are also provided to compare and evaluate clustering results.

Maintained by Aaron Lun. Last updated 5 months ago.

immunooncology software geneexpression transcriptomics singlecell clustering cpp

3.3 match 9.43 score 636 scripts 51 dependents

berchuck

spBFA:Spatial Bayesian Factor Analysis

Implements a spatial Bayesian non-parametric factor analysis model with inference in a Bayesian setting using Markov chain Monte Carlo (MCMC). Spatial correlation is introduced in the columns of the factor loadings matrix using a Bayesian non-parametric prior, the probit stick-breaking process. Areal spatial data is modeled using a conditional autoregressive (CAR) prior and point-referenced spatial data is treated using a Gaussian process. The response variable can be modeled as Gaussian, probit, Tobit, or Binomial (using Polya-Gamma augmentation). Temporal correlation is introduced for the latent factors through a hierarchical structure and can be specified as exponential or first-order autoregressive. Full details of the package can be found in the accompanying vignette. Furthermore, the details of the package can be found in "Bayesian Non-Parametric Factor Analysis for Longitudinal Spatial Surfaces", by Berchuck et al (2019), <arXiv:1911.04337>. The paper is in press at the journal Bayesian Analysis.

Maintained by Samuel I. Berchuck. Last updated 3 years ago.

openblas cpp

7.5 match 3 stars 4.18 score 3 scripts

sprfmo

jjmR:Graphics and diagnostics tools for SPRFMO's Joint Jack Mackerel model

Graphics and diagnostics tools for SPRFMO's Joint Jack Mackerel model.

Maintained by Ricardo Oliveros-Ramos. Last updated 5 months ago.

8.1 match 2 stars 3.81 score 12 scripts 1 dependents

lukeduttweiler

skipTrack:A Bayesian Hierarchical Model that Controls for Non-Adherence in Mobile Menstrual Cycle Tracking

Implements a Bayesian hierarchical model designed to identify skips in menstrual cycle self-tracking on mobile apps. Future developments will allow for the inclusion of covariates affecting cycle mean and regularity, as well as extra information regarding tracking non-adherence. Main methods to be outlined in a forthcoming paper, with alternative models from Li et al. (2022) <doi:10.1093/jamia/ocab182>.

Maintained by Luke Duttweiler. Last updated 2 months ago.

6.3 match 4.95 score 4 scripts

nmartinkova

diemr:Diagnostic Index Expectation Maximisation in R

Likelihood-based genome polarisation finds which alleles of genomic markers belong to which side of the barrier. Co-estimates which individuals belong to either side of the barrier and barrier strength. Uses expectation maximisation in a likelihood framework. The method is described in Baird et al. (2023) <doi:10.1111/2041-210X.14010>.

Maintained by Natalia Martinkova. Last updated 2 months ago.

9.5 match 3.26 score 1 scripts

modeloriented

survex:Explainable Machine Learning in Survival Analysis

Survival analysis models are commonly used in medicine and other areas. Many of them are too complex to be interpreted by humans. Exploration and explanation are needed, but standard methods do not give a broad enough picture. 'survex' provides easy-to-apply methods for explaining survival models, both complex black-boxes and simpler statistical models. They include methods specific to survival analysis such as SurvSHAP(t) introduced in Krzyzinski et al., (2023) <doi:10.1016/j.knosys.2022.110234>, SurvLIME described in Kovalev et al., (2020) <doi:10.1016/j.knosys.2020.106164> as well as extensions of existing ones described in Biecek et al., (2021) <doi:10.1201/9780429027192>.

Maintained by Mikołaj Spytek. Last updated 9 months ago.

biostatistics brier-scores censored-data cox-model cox-regression explainable-ai explainable-machine-learning explainable-ml explanatory-model-analysis interpretable-machine-learning interpretable-ml machine-learning probabilistic-machine-learning shap survival-analysis time-to-event variable-importance xai

3.7 match 110 stars 8.40 score 114 scripts

ncss-tech

sharpshootR:A Soil Survey Toolkit

A collection of data processing, visualization, and export functions to support soil survey operations. Many of the functions build on the `SoilProfileCollection` S4 class provided by the aqp package, extending baseline visualization to more elaborate depictions in the context of spatial and taxonomic data. While this package is primarily developed by and for the USDA-NRCS, in support of the National Cooperative Soil Survey, the authors strive for generalization sufficient to support any soil survey operation. Many of the included functions are used by the SoilWeb suite of websites and mobile applications. These functions are provided here, with additional documentation, to enable others to replicate high quality versions of these figures for their own purposes.

Maintained by Dylan Beaudette. Last updated 11 days ago.

3.7 match 18 stars 8.37 score 327 scripts

berchuck

womblR:Spatiotemporal Boundary Detection Model for Areal Unit Data

Implements a spatiotemporal boundary detection model with a dissimilarity metric for areal data with inference in a Bayesian setting using Markov chain Monte Carlo (MCMC). The response variable can be modeled as Gaussian (no nugget), probit or Tobit link and spatial correlation is introduced at each time point through a conditional autoregressive (CAR) prior. Temporal correlation is introduced through a hierarchical structure and can be specified as exponential or first-order autoregressive. Full details of the package can be found in the accompanying vignette. Furthermore, the details of the package can be found in "Diagnosing Glaucoma Progression with Visual Field Data Using a Spatiotemporal Boundary Detection Method", by Berchuck et al (2018), <arXiv:1805.11636>. The paper is in press at the Journal of the American Statistical Association.

Maintained by Samuel I. Berchuck. Last updated 3 years ago.

openblas cpp

7.5 match 1 stars 4.10 score 25 scripts

easystats

effectsize:Indices of Effect Size

Provide utilities to work with indices of effect size for a wide variety of models and hypothesis tests (see list of supported models using the function 'insight::supported_models()'), allowing computation of and conversion between indices such as Cohen's d, r, odds, etc. References: Ben-Shachar et al. (2020) <doi:10.21105/joss.02815>.

Maintained by Mattan S. Ben-Shachar. Last updated 1 month ago.

anova cohens-d compute conversion correlation effect-size effectsize hacktoberfest hedges-g interpretation standardization standardized statistics

1.9 match 344 stars 16.38 score 1.8k scripts 29 dependents

bioc

scDiagnostics:Cell type annotation diagnostics

The scDiagnostics package provides diagnostic plots to assess the quality of cell type assignments from single cell gene expression profiles. The implemented functionality makes it possible to assess the reliability of cell type annotations, investigate gene expression patterns, and explore relationships between different cell types in query and reference datasets, so that users can detect potential misalignments between reference and query datasets. The package also provides visualization capabilities for diagnostic purposes.

Maintained by Anthony Christidis. Last updated 5 months ago.

annotation classification clustering geneexpression rnaseq singlecell software transcriptomics

3.9 match 8 stars 7.77 score 46 scripts

cran

boot:Bootstrap Functions (Originally by Angelo Canty for S)

Functions and datasets for bootstrapping from the book "Bootstrap Methods and Their Application" by A. C. Davison and D. V. Hinkley (1997, CUP), originally written by Angelo Canty for S.

Maintained by Alessandra R. Brazzale. Last updated 7 months ago.

3.7 match 2 stars 8.21 score 2.3k dependents

valentint

robust:Port of the S+ "Robust Library"

Methods for robust statistics, state of the art in the early 2000s, notably for robust regression and robust multivariate analysis.

Maintained by Valentin Todorov. Last updated 7 months ago.

fortran openblas

4.0 match 7.52 score 572 scripts 8 dependents

rmi-pacta

pacta.multi.loanbook:Run 'PACTA' on Multiple Loan Books Easily

Run Paris Agreement Capital Transition Assessment ('PACTA') analyses on multiple loan books in a structured way. Provides access to standard 'PACTA' metrics and additional 'PACTA'-related metrics for multiple loan books. Results take the form of 'csv' files and plots and are exported to user-specified project paths.

Maintained by Jacob Kastl. Last updated 18 hours ago.

climate-change pacta pactaverse sustainable-finance

4.6 match 6.48 score 4 scripts

zaynesember

speccurvieR:Easy, Fast, and Pretty Specification Curve Analysis

Making specification curve analysis easy, fast, and pretty. It improves upon existing offerings with additional features and 'tidyverse' integration. Users can easily visualize and evaluate how their models behave under different specifications with a high degree of customization. For a description and applications of specification curve analysis see Simonsohn, Simmons, and Nelson (2020) <doi:10.1038/s41562-020-0912-z>.

Maintained by Zayne Sember. Last updated 6 months ago.

regression-diagnostics specification-curve-analysis specification-curve-plot

7.5 match 4 stars 4.00 score 2 scripts

niaid

HDStIM:High Dimensional Stimulation Immune Mapping ('HDStIM')

A method for identifying responses to experimental stimulation in mass or flow cytometry that uses high dimensional analysis of measured parameters and can be performed with an end-to-end unsupervised approach. In the context of in vitro stimulation assays where high-parameter cytometry was used to monitor intracellular response markers, using cell populations annotated either through automated clustering or manual gating for a combined set of stimulated and unstimulated samples, 'HDStIM' labels cells as responding or non-responding. The package also provides auxiliary functions to rank intracellular markers based on their contribution to identifying responses and generating diagnostic plots.

Maintained by Rohit Farmer. Last updated 1 year ago.

complexheatmap assay cytof cytometry cytometry-analysis-pipeline flowcytometry stimulation

6.8 match 3 stars 4.41 score 17 scripts

smartdata-analysis-and-statistics

metamisc:Meta-Analysis of Diagnosis and Prognosis Research Studies

Facilitate frequentist and Bayesian meta-analysis of diagnosis and prognosis research studies. It includes functions to summarize multiple estimates of prediction model discrimination and calibration performance (Debray et al., 2019) <doi:10.1177/0962280218785504>. It also includes functions to evaluate funnel plot asymmetry (Debray et al., 2018) <doi:10.1002/jrsm.1266>. Finally, the package provides functions for developing multivariable prediction models from datasets with clustering (de Jong et al., 2021) <doi:10.1002/sim.8981>.

Maintained by Thomas Debray. Last updated 29 days ago.

meta-analysis prognosis prognostic-models

4.0 match 7 stars 7.48 score 102 scripts

paul-buerkner

brms:Bayesian Regression Models using 'Stan'

Fit Bayesian generalized (non-)linear multivariate multilevel models using 'Stan' for full Bayesian inference. A wide range of distributions and link functions are supported, allowing users to fit -- among others -- linear, robust linear, count data, survival, response times, ordinal, zero-inflated, hurdle, and even self-defined mixture models all in a multilevel context. Further modeling options include both theory-driven and data-driven non-linear terms, auto-correlation structures, censoring and truncation, meta-analytic standard errors, and quite a few more. In addition, all parameters of the response distribution can be predicted in order to perform distributional regression. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their prior knowledge. Models can easily be evaluated and compared using several methods assessing posterior or prior predictions. References: Bürkner (2017) <doi:10.18637/jss.v080.i01>; Bürkner (2018) <doi:10.32614/RJ-2018-017>; Bürkner (2021) <doi:10.18637/jss.v100.i05>; Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>.

Maintained by Paul-Christian Bürkner. Last updated 1 day ago.

bayesian-inference brms multilevel-models stan statistical-models

1.8 match 1.3k stars 16.61 score 13k scripts 34 dependents

reditorsupport

languageserver:Language Server Protocol

An implementation of the Language Server Protocol for R. The Language Server protocol is used by an editor client to integrate features like auto completion. See <https://microsoft.github.io/language-server-protocol/> for details.

Maintained by Randy Lai. Last updated 1 year ago.

language-server-protocol

3.0 match 607 stars 9.93 score 207 scripts 1 dependents

mlr-org

mlr3:Machine Learning in R - Next Generation

Efficient, object-oriented programming on the building blocks of machine learning. Provides 'R6' objects for tasks, learners, resamplings, and measures. The package is geared towards scalability and larger datasets by supporting parallelization and out-of-memory data-backends like databases. While 'mlr3' focuses on the core computational operations, add-on packages provide additional functionality.

Maintained by Marc Becker. Last updated 3 days ago.

classification data-science machine-learning mlr3 regression

2.0 match 972 stars 14.86 score 2.3k scripts 35 dependents

jaredsmurray

bcf:Causal Inference for a Binary Treatment and Continuous Outcome using Bayesian Causal Forests

Causal inference for a binary treatment and continuous outcome using Bayesian Causal Forests. See Hahn, Murray and Carvalho (2020) <https://projecteuclid.org/journals/bayesian-analysis/volume-15/issue-3/Bayesian-Regression-Tree-Models-for-Causal-Inference--Regularization-Confounding/10.1214/19-BA1195.full> for additional information. This implementation relies on code originally accompanying Pratola et al. (2013) <arXiv:1309.1906>.

Maintained by Jared S. Murray. Last updated 1 year ago.

openblas cpp

3.7 match 41 stars 8.12 score 46 scripts

novartis

RBesT:R Bayesian Evidence Synthesis Tools

Tool-set to support Bayesian evidence synthesis. This includes meta-analysis, (robust) prior derivation from historical data, operating characteristics and analysis (1 and 2 sample cases). Please refer to Weber et al. (2021) <doi:10.18637/jss.v100.i19> for details on applying this package while Neuenschwander et al. (2010) <doi:10.1177/1740774509356002> and Schmidli et al. (2014) <doi:10.1111/biom.12242> explain details on the methodology.

Maintained by Sebastian Weber. Last updated 2 months ago.

bayesian clinical historical-data meta-analysis cpp

3.8 match 22 stars 7.87 score 115 scripts 4 dependents

cran

UPG:Efficient Bayesian Algorithms for Binary and Categorical Data Regression Models

Efficient Bayesian implementations of probit, logit, multinomial logit and binomial logit models. Functions for plotting and tabulating the estimation output are available as well. Estimation is based on Gibbs sampling where the Markov chain Monte Carlo algorithms are based on the latent variable representations and marginal data augmentation algorithms described in "Gregor Zens, Sylvia Frühwirth-Schnatter & Helga Wagner (2023). Ultimate Pólya Gamma Samplers – Efficient MCMC for possibly imbalanced binary and categorical data, Journal of the American Statistical Association <doi:10.1080/01621459.2023.2259030>".

Maintained by Gregor Zens. Last updated 4 months ago.

8.8 match 3.31 score 41 scripts

ggobi

GGally:Extension to 'ggplot2'

The R package 'ggplot2' is a plotting system based on the grammar of graphics. 'GGally' extends 'ggplot2' by adding several functions to reduce the complexity of combining geometric objects with transformed data. Some of these functions include a pairwise plot matrix, a two group pairwise plot matrix, a parallel coordinates plot, a survival plot, and several functions to plot networks.

Maintained by Barret Schloerke. Last updated 10 months ago.

1.8 match 597 stars 16.15 score 17k scripts 154 dependents

rjdverse

rjd3toolkit:Utility Functions around 'JDemetra+ 3.0'

R Interface to 'JDemetra+ 3.x' (<https://github.com/jdemetra>) time series analysis software. It provides functions for modeling time series (create outlier regressors, user-defined calendar regressors, UCARIMA models...), for testing the presence of trading days or seasonal effects, and for setting specifications in pre-adjustment and benchmarking when using rjd3x13 or rjd3tramoseats.

Maintained by Tanguy Barthelemy. Last updated 5 months ago.

jdemetra seasonal-adjustment timeseries openjdk

5.0 match 5 stars 5.81 score 48 scripts 15 dependents

insightsengineering

tern.mmrm:Tables and Graphs for Mixed Models for Repeated Measures (MMRM)

Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see for example Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E>. This package provides an interface for fitting MMRM within the 'tern' <https://cran.r-project.org/package=tern> framework by Zhu et al. (2023) and tabulating results easily using 'rtables' <https://cran.r-project.org/package=rtables> by Becker et al. (2023). It builds on 'mmrm' <https://cran.r-project.org/package=mmrm> by Sabanés Bové et al. (2023) for the actual MMRM computations.

Maintained by Joe Zhu. Last updated 6 months ago.

graphs listings statistical-engineering tables

4.0 match 6 stars 7.26 score 8 scripts 1 dependents

bioc

rScudo:Signature-based Clustering for Diagnostic Purposes

SCUDO (Signature-based Clustering for Diagnostic Purposes) is a rank-based method for the analysis of gene expression profiles for diagnostic and classification purposes. It is based on the identification of sample-specific gene signatures composed of the most up- and down-regulated genes for that sample. Starting from gene expression data, functions in this package identify sample-specific gene signatures and use them to build a graph of samples. In this graph samples are joined by edges if they have a similar expression profile, according to a pre-computed similarity matrix. The similarity between the expression profiles of two samples is computed using a method similar to GSEA. The graph of samples can then be used to perform community clustering or to perform supervised classification of samples in a testing set.

Maintained by Matteo Ciciani. Last updated 5 months ago.

geneexpression differentialexpression biomedicalinformatics classification clustering graphandnetwork network proteomics transcriptomics systemsbiology featureextraction

5.6 match 4 stars 5.19 score 13 scripts

hafen

stlplus:Enhanced Seasonal Decomposition of Time Series by Loess

Decompose a time series into seasonal, trend, and remainder components using an implementation of Seasonal Decomposition of Time Series by Loess (STL) that provides several enhancements over the STL method in the stats package. These enhancements include handling missing values, providing higher order (quadratic) loess smoothing with automated parameter choices, frequency component smoothing beyond the seasonal and trend components, and some basic plot methods for diagnostics.

Maintained by Ryan Hafen. Last updated 8 years ago.

cpp

4.1 match 66 stars 7.02 score 63 scripts 5 dependents

blue-matter

MSEtool:Management Strategy Evaluation Toolkit

Development, simulation testing, and implementation of management procedures for fisheries (see Carruthers & Hordyk (2018) <doi:10.1111/2041-210X.13081>).

Maintained by Adrian Hordyk. Last updated 24 days ago.

cpp

3.8 match 8 stars 7.69 score 163 scripts 3 dependents

stamats

MKmisc:Miscellaneous Functions from M. Kohl

Contains several functions for statistical data analysis; e.g. for sample size and power calculations, computation of confidence intervals and tests, and generation of similarity matrices.

Maintained by Matthias Kohl. Last updated 2 years ago.

3.9 match 11 stars 7.40 score 129 scripts 1 dependents

paulnorthrop

revdbayes:Ratio-of-Uniforms Sampling for Bayesian Extreme Value Analysis

Provides functions for the Bayesian analysis of extreme value models. The 'rust' package <https://cran.r-project.org/package=rust> is used to simulate a random sample from the required posterior distribution. The functionality of 'revdbayes' is similar to the 'evdbayes' package <https://cran.r-project.org/package=evdbayes>, which uses Markov Chain Monte Carlo ('MCMC') methods for posterior simulation. In addition, there are functions for making inferences about the extremal index, using the models for threshold inter-exceedance times of Suveges and Davison (2010) <doi:10.1214/09-AOAS292> and Holesovsky and Fusek (2020) <doi:10.1007/s10687-020-00374-3>. Also provided are d,p,q,r functions for the Generalised Extreme Value ('GEV') and Generalised Pareto ('GP') distributions that deal appropriately with cases where the shape parameter is very close to zero.

Maintained by Paul J. Northrop. Last updated 7 months ago.

analysis bayesian extreme extreme-value-statistics extremes generalized-pareto-distribution gev inference nhpp point-process posterior predictive rcpp value openblas cpp

3.8 match 4 stars 7.62 score 58 scripts 4 dependents