R-universe search: conditioning

renkun-ken

rlist:A Toolbox for Non-Tabular Data Manipulation

Provides a set of functions for data manipulation with list objects, including mapping, filtering, grouping, sorting, updating, searching, and other useful functions. Most functions are designed to be pipeline friendly so that data processing with lists can be chained.

Maintained by Kun Ren. Last updated 2 years ago.

24.6 match 206 stars 13.73 score 2.2k scripts 123 dependents

albgarre

biogrowth:Modelling of Population Growth

Modelling of population growth under static and dynamic environmental conditions. Includes functions for model fitting and making predictions under isothermal and dynamic conditions. The methods (algorithms & models) are based on predictive microbiology (see Perez-Rodriguez and Valero (2012, ISBN:978-1-4614-5519-6)).

Maintained by Alberto Garre. Last updated 17 hours ago.

48.7 match 5 stars 6.82 score 44 scripts

alexisderumigny

CondCopulas:Estimation and Inference for Conditional Copula Models

Provides functions for the estimation of conditional copula models, various estimators of conditional Kendall's tau (proposed in Derumigny and Fermanian (2019a, 2019b, 2020) <doi:10.1515/demo-2019-0016>, <doi:10.1016/j.csda.2019.01.013>, <doi:10.1016/j.jmva.2020.104610>), and test procedures for the simplifying assumption (proposed in Derumigny and Fermanian (2017) <doi:10.1515/demo-2017-0011> and Derumigny, Fermanian and Min (2022) <doi:10.1002/cjs.11742>).

Maintained by Alexis Derumigny. Last updated 6 months ago.

conditional-copulas conditional-kendalls-tau copulas r-pkg simplifying-assumption

69.8 match 2 stars 4.70 score 7 scripts

uupharmacometrics

xpose4:Diagnostics for Nonlinear Mixed-Effect Models

A model building aid for nonlinear mixed-effects (population) model analysis using NONMEM, facilitating data set checkout, exploration and visualization, model diagnostics, candidate covariate identification and model comparison. The methods are described in Keizer et al. (2013) <doi:10.1038/psp.2013.24>, and Jonsson et al. (1999) <doi:10.1016/s0169-2607(98)00067-4>.

Maintained by Andrew C. Hooker. Last updated 1 year ago.

diagnostics nonmem pharmacometrics population-model xpose

38.0 match 35 stars 7.30 score 315 scripts

keaven

gsDesign:Group Sequential Design

Derives group sequential clinical trial designs and describes their properties. Particular focus on time-to-event, binary, and continuous outcomes. Largely based on methods described in Jennison, Christopher and Turnbull, Bruce W., 2000, "Group Sequential Methods with Applications to Clinical Trials" ISBN: 0-8493-0316-8.

Maintained by Keaven Anderson. Last updated 14 days ago.

biostatistics boundaries clinical-trials design spending-functions

18.3 match 51 stars 13.05 score 338 scripts 5 dependents

bioc

EBSeq:An R package for gene and isoform differential expression analysis of RNA-seq data

Differential Expression analysis at both gene and isoform level using RNA-seq data

Maintained by Xiuyu Ma. Last updated 2 months ago.

immunooncology statisticalmethod differentialexpression multiplecomparison rnaseq sequencing cpp

27.4 match 7.77 score 162 scripts 6 dependents

melff

mclogit:Multinomial Logit Models, with or without Random Effects or Overdispersion

Provides estimators for multinomial logit models in their conditional logit and baseline logit variants, with or without random effects, with or without overdispersion. Random effects models are estimated using the PQL technique (based on a Laplace approximation) or the MQL technique (based on a Solomon-Cox approximation). Estimates should be treated with caution if the group sizes are small.

Maintained by Martin Elff. Last updated 3 months ago.

18.7 match 23 stars 11.03 score 262 scripts 4 dependents
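
The conditional logit variant mentioned in the mclogit description can be illustrated with a small sketch. It assumes the textbook form of the model, in which the probability of choosing an alternative is the exponential of its linear predictor divided by the sum over all alternatives in the same choice set. The function name, the attribute values, and the coefficients are hypothetical, and this is not the mclogit API.

```python
import math

def conditional_logit_probs(alternatives, beta):
    """Choice probabilities within one choice set (textbook conditional logit).

    alternatives: list of attribute vectors, one per alternative.
    beta: coefficient vector shared by all alternatives.
    """
    # Linear predictor (utility) of each alternative.
    utilities = [sum(b * x for b, x in zip(beta, attrs)) for attrs in alternatives]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(utilities)
    weights = [math.exp(u - m) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical choice set with three alternatives and two attributes.
choice_set = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
beta = [math.log(2.0), math.log(3.0)]
probs = conditional_logit_probs(choice_set, beta)
```

With these made-up inputs the exponentiated utilities are 2, 3 and 1, so the probabilities are proportional to those values and sum to one within the choice set.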

sfcheung

manymome:Mediation, Moderation and Moderated-Mediation After Model Fitting

Computes indirect effects, conditional effects, and conditional indirect effects in a structural equation model or path model after model fitting, with no need to define any user parameters or label any paths in the model syntax, using the approach presented in Cheung and Cheung (2024) <doi:10.3758/s13428-023-02224-z>. Can also form bootstrap confidence intervals by doing bootstrapping only once and reusing the bootstrap estimates in all subsequent computations. Supports bootstrap confidence intervals for standardized (partially or completely) indirect effects, conditional effects, and conditional indirect effects as described in Cheung (2009) <doi:10.3758/BRM.41.2.425> and Cheung, Cheung, Lau, Hui, and Vong (2022) <doi:10.1037/hea0001188>. Model fitting can be done by structural equation modeling using lavaan() or regression using lm().

Maintained by Shu Fai Cheung. Last updated 24 days ago.

bootstrapping confidence-interval lavaan manymome mediation moderated-mediation moderation regression sem standardized-effect-size structural-equation-modeling

24.3 match 1 star 8.06 score 172 scripts 4 dependents

nhejazi

haldensify:Highly Adaptive Lasso Conditional Density Estimation

An algorithm for flexible conditional density estimation based on application of pooled hazard regression to an artificial repeated measures dataset constructed by discretizing the support of the outcome variable. To facilitate flexible estimation of the conditional density, the highly adaptive lasso, a non-parametric regression function shown to estimate cadlag (RCLL) functions at a suitably fast convergence rate, is used. The use of pooled hazards regression for conditional density estimation as implemented here was first described by Díaz and van der Laan (2011) <doi:10.2202/1557-4679.1356>. Building on the conditional density estimation utilities, non-parametric inverse probability weighted (IPW) estimators of the causal effects of additive modified treatment policies are implemented, using conditional density estimation to estimate the generalized propensity score. Non-parametric IPW estimators based on this can be coupled with sieve estimation (undersmoothing) of the generalized propensity score to attain the semi-parametric efficiency bound (per Hejazi, Benkeser, Díaz, and van der Laan <doi:10.48550/arXiv.2205.05777>).

Maintained by Nima Hejazi. Last updated 6 months ago.

causal-inference conditional-density-estimates density-estimation highly-adaptive-lasso inverse-probability-weights machine-learning nonparametric-regression propensity-score

26.3 match 17 stars 7.34 score 72 scripts 3 dependents
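
The haldensify description rests on discretizing the outcome's support and modelling a hazard for each bin. A minimal sketch of the final step, turning per-bin hazards into a piecewise-constant density, might look like the following. It assumes the hazard of bin k is the probability of falling in bin k given that the outcome is not in any earlier bin. The function name and the hazard values are hypothetical: in the package the hazards would come from a fitted regression, and this is not the haldensify API.

```python
def hazards_to_density(hazards, bin_edges):
    """Convert discrete per-bin hazards into a piecewise-constant density.

    hazards[k]: P(outcome in bin k | outcome not in bins 0..k-1).
    bin_edges: bin boundaries, one more entry than hazards.
    """
    density = []
    survival = 1.0  # probability of not having fallen in any earlier bin
    for k, h in enumerate(hazards):
        width = bin_edges[k + 1] - bin_edges[k]
        # Bin probability is hazard times survival; divide by width for a density.
        density.append(h * survival / width)
        survival *= 1.0 - h
    return density

# Hypothetical hazards for four equal-width bins on [0, 2].
edges = [0.0, 0.5, 1.0, 1.5, 2.0]
hazards = [0.1, 0.3, 0.5, 1.0]
density = hazards_to_density(hazards, edges)
total_mass = sum(d * (edges[k + 1] - edges[k]) for k, d in enumerate(density))
```

Setting the last hazard to 1 forces all remaining probability into the final bin, so the density integrates to one over the discretized support.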

r-lib

rlang:Functions for Base Types and Core R and 'Tidyverse' Features

A toolbox for working with base types, core R features like the condition system, and core 'Tidyverse' features like tidy evaluation.

Maintained by Lionel Henry. Last updated 21 days ago.

8.5 match 517 stars 20.53 score 9.8k scripts 15k dependents

ralmond

CPTtools:Tools for Creating Conditional Probability Tables

Provides support for parameterized tables for Bayesian networks, particularly the IRT-like DiBello tables. Also provides some tools for visualizing the networks.

Maintained by Russell Almond. Last updated 3 months ago.

bayesian-network statistics

34.5 match 1 star 5.05 score 21 scripts 4 dependents

r-forge

partykit:A Toolkit for Recursive Partytioning

A toolkit with infrastructure for representing, summarizing, and visualizing tree-structured regression and classification models. This unified infrastructure can be used for reading/coercing tree models from different sources ('rpart', 'RWeka', 'PMML') yielding objects that share functionality for print()/plot()/predict() methods. Furthermore, new and improved reimplementations of conditional inference trees (ctree()) and model-based recursive partitioning (mob()) from the 'party' package are provided based on the new infrastructure. A description of this package was published by Hothorn and Zeileis (2015) <https://jmlr.org/papers/v16/hothorn15a.html>.

Maintained by Torsten Hothorn. Last updated 6 days ago.

13.3 match 12.71 score 2.3k scripts 97 dependents

openair-project

openair:Tools for the Analysis of Air Pollution Data

Tools to analyse, interpret and understand air pollution data. Data are typically regular time series and air quality measurement, meteorological data and dispersion model output can be analysed. The package is described in Carslaw and Ropkins (2012, <doi:10.1016/j.envsoft.2011.09.008>) and subsequent papers.

Maintained by David Carslaw. Last updated 27 days ago.

air-quality air-quality-data meteorology openair cpp

12.8 match 311 stars 12.91 score 1.2k scripts 12 dependents

optad

adoptr:Adaptive Optimal Two-Stage Designs

Optimize one or two-arm, two-stage designs for clinical trials with respect to several implemented objective criteria or custom objectives. Optimization under uncertainty and conditional (given stage-one outcome) constraints are supported. See Pilz et al. (2019) <doi:10.1002/sim.8291> and Kunzmann et al. (2021) <doi:10.18637/jss.v098.i09> for details.

Maintained by Maximilian Pilz. Last updated 6 months ago.

20.6 match 1 star 7.09 score 39 scripts 1 dependent

idsia

bayesRecon:Probabilistic Reconciliation via Conditioning

Provides methods for probabilistic reconciliation of hierarchical forecasts of time series. The available methods include analytical Gaussian reconciliation (Corani et al., 2021) <doi:10.1007/978-3-030-67664-3_13>, MCMC reconciliation of count time series (Corani et al., 2024) <doi:10.1016/j.ijforecast.2023.04.003>, Bottom-Up Importance Sampling (Zambon et al., 2024) <doi:10.1007/s11222-023-10343-y>, methods for the reconciliation of mixed hierarchies (Mix-Cond and TD-cond) (Zambon et al., 2024. The 40th Conference on Uncertainty in Artificial Intelligence, accepted).

Maintained by Dario Azzimonti. Last updated 2 months ago.

reconciliation timeseries

20.2 match 7 stars 7.13 score 40 scripts

braverock

PerformanceAnalytics:Econometric Tools for Performance and Risk Analysis

Collection of econometric functions for performance and risk analysis. In addition to standard risk and performance metrics, this package aims to aid practitioners and researchers in utilizing the latest research in analysis of non-normal return streams. In general, it is most tested on return (rather than price) data on a regular scale, but most functions will work with irregular return data as well, and increasing numbers of functions will work with P&L or price data where possible.

Maintained by Brian G. Peterson. Last updated 3 months ago.

9.0 match 222 stars 15.93 score 4.8k scripts 20 dependents

stephenslab

mashr:Multivariate Adaptive Shrinkage

Implements the multivariate adaptive shrinkage (mash) method of Urbut et al (2019) <DOI:10.1038/s41588-018-0268-8> for estimating and testing large numbers of effects in many conditions (or many outcomes). Mash takes an empirical Bayes approach to testing and effect estimation; it estimates patterns of similarity among conditions, then exploits these patterns to improve accuracy of the effect estimates. The core linear algebra is implemented in C++ for fast model fitting and posterior computation.

Maintained by Peter Carbonetto. Last updated 4 months ago.

openblas gsl cpp openmp

12.9 match 91 stars 11.04 score 624 scripts 3 dependents

jsanchezalv

WARDEN:Workflows for Health Technology Assessments in R using Discrete EveNts

Toolkit to support and perform discrete event simulations without resource constraints in the context of health technology assessments (HTA). The package focuses on cost-effectiveness modelling and aims to be submission-ready to relevant HTA bodies in alignment with 'NICE TSD 15' <https://www.sheffield.ac.uk/nice-dsu/tsds/patient-level-simulation>. More details and examples can be found on the package website <https://jsanchezalv.github.io/WARDEN/>.

Maintained by Javier Sanchez Alvarez. Last updated 3 months ago.

21.3 match 6 stars 6.62 score 9 scripts

zhenkewu

baker:"Nested Partially Latent Class Models"

Provides functions to specify, fit and visualize nested partially-latent class models (Wu, Deloria-Knoll, Hammitt, and Zeger (2016) <doi:10.1111/rssc.12101>; Wu, Deloria-Knoll, and Zeger (2017) <doi:10.1093/biostatistics/kxw037>; Wu and Chen (2021) <doi:10.1002/sim.8804>) for inference of population disease etiology and individual diagnosis. In the motivating Pneumonia Etiology Research for Child Health (PERCH) study, because both quantities of interest sum to one hundred percent, the PERCH scientists frequently refer to them as population etiology pie and individual etiology pie, hence the name of the package.

Maintained by Zhenke Wu. Last updated 11 months ago.

bayesian case-control latent-class-analysis jags cpp

23.2 match 8 stars 6.00 score 21 scripts

jwb133

smcfcs:Multiple Imputation of Covariates by Substantive Model Compatible Fully Conditional Specification

Implements multiple imputation of missing covariates by Substantive Model Compatible Fully Conditional Specification. This is a modification of the popular FCS/chained equations multiple imputation approach, and allows imputation of missing covariate values from models which are compatible with the user specified substantive model.

Maintained by Jonathan Bartlett. Last updated 4 days ago.

14.2 match 11 stars 9.00 score 59 scripts 1 dependent

bsvars

bsvars:Bayesian Estimation of Structural Vector Autoregressive Models

Provides fast and efficient procedures for Bayesian analysis of Structural Vector Autoregressions. This package estimates a wide range of models, including homo-, heteroskedastic, and non-normal specifications. Structural models can be identified by adjustable exclusion restrictions, time-varying volatility, or non-normality. They all include a flexible three-level equation-specific local-global hierarchical prior distribution for the estimated level of shrinkage for autoregressive and structural parameters. Additionally, the package facilitates predictive and structural analyses such as impulse responses, forecast error variance and historical decompositions, forecasting, verification of heteroskedasticity, non-normality, and hypotheses on autoregressive parameters, as well as analyses of structural shocks, volatilities, and fitted values. Beautiful plots, informative summary functions, and extensive documentation including the vignette by Woźniak (2024) <doi:10.48550/arXiv.2410.15090> complement all this. The implemented techniques align closely with those presented in Lütkepohl, Shang, Uzeda, & Woźniak (2024) <doi:10.48550/arXiv.2404.11057>, Lütkepohl & Woźniak (2020) <doi:10.1016/j.jedc.2020.103862>, and Song & Woźniak (2021) <doi:10.1093/acrefore/9780190625979.013.174>. The 'bsvars' package is aligned regarding objects, workflows, and code structure with the R package 'bsvarSIGNs' by Wang & Woźniak (2024) <doi:10.32614/CRAN.package.bsvarSIGNs>, and they constitute an integrated toolset.

Maintained by Tomasz Woźniak. Last updated 1 month ago.

bayesian-inference econometrics vector-autoregression openblas cpp openmp

16.6 match 46 stars 7.67 score 32 scripts 1 dependent

netfacs

NetFACS:Network Applications to Facial Communication Data

Functions to analyze and visualize communication data, based on network theory and resampling methods. Farine, D. R. (2017) <doi:10.1111/2041-210X.12772>; Carsey, T., & Harden, J. (2014) <doi:10.4135/9781483319605>. Primarily targeted at datasets of facial expressions coded with the Facial Action Coding System. Ekman, P., Friesen, W. V., & Hager, J. C. (2002). "Facial action coding system - investigator's guide" <https://www.paulekman.com/facial-action-coding-system/>.

Maintained by Alan V. Rincon. Last updated 11 months ago.

24.9 match 8 stars 5.08 score 5 scripts

nicolas-robette

moreparty:A Toolbox for Conditional Inference Trees and Random Forests

Additions to 'party' and 'partykit' packages: tools for the interpretation of forests (surrogate trees, prototypes, etc.), feature selection (see Gregorutti et al (2017) <arXiv:1310.5726>, Hapfelmeier and Ulm (2013) <doi:10.1016/j.csda.2012.09.020>, Altmann et al (2010) <doi:10.1093/bioinformatics/btq134>) and parallelized versions of conditional forest and variable importance functions. Also modules and a shiny app for conditional inference trees.

Maintained by Nicolas Robette. Last updated 12 months ago.

30.3 match 3 stars 4.18 score 8 scripts

norskregnesentral

shapr:Prediction Explanation with Dependence-Aware Shapley Values

Complex machine learning models are often hard to interpret. However, in many situations it is crucial to understand and explain why a model made a specific prediction. Shapley values are the only method for such prediction explanation with a solid theoretical foundation. Previously known methods for estimating Shapley values do, however, assume feature independence. This package implements methods that account for any feature dependence and thereby produce more accurate estimates of the true Shapley values. An accompanying 'Python' wrapper ('shaprpy') is available through the GitHub repository.

Maintained by Martin Jullum. Last updated 1 month ago.

explainable-ai explainable-ml rcpp rcpparmadillo shapley openblas cpp openmp

11.9 match 153 stars 10.62 score 175 scripts 1 dependent

paul-buerkner

brms:Bayesian Regression Models using 'Stan'

Fit Bayesian generalized (non-)linear multivariate multilevel models using 'Stan' for full Bayesian inference. A wide range of distributions and link functions are supported, allowing users to fit -- among others -- linear, robust linear, count data, survival, response times, ordinal, zero-inflated, hurdle, and even self-defined mixture models all in a multilevel context. Further modeling options include both theory-driven and data-driven non-linear terms, auto-correlation structures, censoring and truncation, meta-analytic standard errors, and quite a few more. In addition, all parameters of the response distribution can be predicted in order to perform distributional regression. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their prior knowledge. Models can easily be evaluated and compared using several methods assessing posterior or prior predictions. References: Bürkner (2017) <doi:10.18637/jss.v080.i01>; Bürkner (2018) <doi:10.32614/RJ-2018-017>; Bürkner (2021) <doi:10.18637/jss.v100.i05>; Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>.

Maintained by Paul-Christian Bürkner. Last updated 4 days ago.

bayesian-inference brms multilevel-models stan statistical-models

7.3 match 1.3k stars 16.61 score 13k scripts 34 dependents

jmbarbone

cnd:Create and Register Conditions

An interface for creating new condition generator objects. Generators are special functions that can be saved in registries and linked to other functions. Utilities for documenting your generators and new conditions are provided for package development.

Maintained by Jordan Mark Barbone. Last updated 20 days ago.

conditions r-development

32.7 match 1 star 3.60 score 7 scripts

bnosac

crfsuite:Conditional Random Fields for Labelling Sequential Data in Natural Language Processing

Wraps the 'CRFsuite' library <https://github.com/chokkan/crfsuite> allowing users to fit a Conditional Random Field model and to apply it on existing data. The focus of the implementation is in the area of Natural Language Processing where this R package allows you to easily build and apply models for named entity recognition, text chunking, part of speech tagging, intent recognition or classification of any category you have in mind. Next to training, a small web application is included in the package to allow you to easily construct training data.

Maintained by Jan Wijffels. Last updated 2 years ago.

chunking conditional-random-fields crf crfsuite data-science intent-classification natural-language-processing ner nlp cpp

18.5 match 63 stars 6.34 score 35 scripts

mucollective

multiverse:Create 'multiverse analysis' in R

Implement 'multiverse' style analyses (Steegen S., Tuerlinckx F, Gelman A., Vanpaemal, W., 2016) <doi:10.1177/1745691616658637> to show the robustness of statistical inference. 'Multiverse analysis' is a philosophy of statistical reporting where paper authors report the outcomes of many different statistical analyses in order to show how fragile or robust their findings are. The 'multiverse' package (Sarma A., Kale A., Moon M., Taback N., Chevalier F., Hullman J., Kay M., 2021) <doi:10.31219/osf.io/yfbwm> allows users to concisely and flexibly implement 'multiverse-style' analyses, which involve declaring alternate ways of performing an analysis step, in R and R Notebooks.

Maintained by Abhraneel Sarma. Last updated 4 months ago.

13.5 match 62 stars 8.37 score 42 scripts

r-forge

distrEx:Extensions of Package 'distr'

Extends package 'distr' by functionals, distances, and conditional distributions.

Maintained by Matthias Kohl. Last updated 2 months ago.

17.0 match 6.68 score 107 scripts 17 dependents

robjhyndman

hdrcde:Highest Density Regions and Conditional Density Estimation

Computation of highest density regions in one and two dimensions, kernel estimation of univariate density functions conditional on one covariate, and multimodal regression.

Maintained by Rob Hyndman. Last updated 2 years ago.

fortran

11.0 match 23 stars 10.30 score 128 scripts 161 dependents

r-forge

coin:Conditional Inference Procedures in a Permutation Test Framework

Conditional inference procedures for the general independence problem including two-sample, K-sample (non-parametric ANOVA), correlation, censored, ordered and multivariate problems described in <doi:10.18637/jss.v028.i08>.

Maintained by Torsten Hothorn. Last updated 9 months ago.

9.6 match 11.68 score 1.6k scripts 74 dependents

asheshrambachan

HonestDiD:Robust Inference in Difference-in-Differences and Event Study Designs

Provides functions to conduct robust inference in difference-in-differences and event study designs by implementing the methods developed in Rambachan & Roth (2023) <doi:10.1093/restud/rdad018>, "A More Credible Approach to Parallel Trends" [Previously titled "An Honest Approach..."]. Inference is conducted under a weaker version of the parallel trends assumption. Uniformly valid confidence sets are constructed based upon conditional confidence sets, fixed-length confidence sets and hybridized confidence sets.

Maintained by Ashesh Rambachan. Last updated 19 days ago.

difference-in-differences event-studies robust-inference

15.5 match 195 stars 7.11 score 63 scripts

bioc

MultiRNAflow:An R package for integrated analysis of temporal RNA-seq data with multiple biological conditions

Our R package MultiRNAflow provides an easy-to-use unified framework for automatically performing both unsupervised and supervised (DE) analyses of datasets with an arbitrary number of biological conditions and time points. In particular, our code performs a deep downstream analysis of DE information, e.g. identifying temporal patterns across biological conditions and DE genes which are specific to a biological condition at each time.

Maintained by Rodolphe Loubaton. Last updated 5 months ago.

sequencing rnaseq geneexpression transcription timecourse preprocessing visualization normalization principalcomponent clustering differentialexpression genesetenrichment pathways

20.8 match 6 stars 5.26 score 4 scripts

bioc

DAPAR:Tools for the Differential Analysis of Proteins Abundance with R

The package DAPAR is a Bioconductor-distributed R package which provides all the necessary functions to analyze quantitative data from label-free proteomics experiments. Contrary to most other similar R packages, it is endowed with rich and user-friendly graphical interfaces, so that no programming skill is required (see `Prostar` package).

Maintained by Samuel Wieczorek. Last updated 5 months ago.

proteomics normalization preprocessing massspectrometry qualitycontrol go dataimport prostar1

20.1 match 2 stars 5.42 score 22 scripts 1 dependent

hojsgaard

gRain:Bayesian Networks

Probability propagation in Bayesian networks, also known as graphical independence networks. Documentation of the package is provided in vignettes included in the package and in the paper by Højsgaard (2012, <doi:10.18637/jss.v046.i10>). See 'citation("gRain")' for details.

Maintained by Søren Højsgaard. Last updated 5 months ago.

cpp

11.8 match 2 stars 9.13 score 408 scripts 8 dependents

willnickols

chyper:Functions for Conditional Hypergeometric Distributions

An implementation of the probability mass function, cumulative distribution function, quantile function, random number generator, maximum likelihood estimator, and p-value generator from a conditional hypergeometric distribution: the distribution of how many items are in the overlap of all samples when samples of arbitrary size are each taken without replacement from populations of arbitrary size.

Maintained by William Nickols. Last updated 7 months ago.

26.5 match 4.00 score 8 scripts

ghuiber

BTYD:Implementing BTYD Models with the Log Sum Exp Patch

Functions for data preparation, parameter estimation, scoring, and plotting for the BG/BB (Fader, Hardie, and Shang 2010 <doi:10.1287/mksc.1100.0580>), BG/NBD (Fader, Hardie, and Lee 2005 <doi:10.1287/mksc.1040.0098>) and Pareto/NBD and Gamma/Gamma (Fader, Hardie, and Lee 2005 <doi:10.1509/jmkr.2005.42.4.415>) models.

Maintained by Gabi Huiber. Last updated 3 years ago.

17.4 match 7 stars 6.03 score 103 scripts 1 dependent
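
The "Log Sum Exp" in the BTYD title refers to a standard numerical device for adding quantities that are only available on the log scale, as likelihood terms often are. The sketch below shows the generic trick only. The function name and inputs are hypothetical, and how BTYD applies the idea internally is not shown in this listing.

```python
import math

def log_sum_exp(log_terms):
    """Return log(sum(exp(t) for t in log_terms)) without underflow."""
    m = max(log_terms)
    if m == float("-inf"):
        # Every term is log(0), so the sum is 0 and its log is -inf.
        return m
    # Factor out the largest term so the remaining exponents are <= 0.
    return m + math.log(sum(math.exp(t - m) for t in log_terms))

# Hypothetical log-scale terms that are too small to exponentiate directly.
log_terms = [-1000.0, -1001.0]
naive_sum = sum(math.exp(t) for t in log_terms)  # each exp() underflows
stable = log_sum_exp(log_terms)
```

Here the naive sum underflows to zero, so taking its logarithm would fail, whereas the shifted version returns a finite value close to -999.69.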

kwb-r

kwb.utils:General Utility Functions Developed at KWB

This package contains some small helper functions that aim at improving the quality of code developed at Kompetenzzentrum Wasser gGmbH (KWB).

Maintained by Hauke Sonnenberg. Last updated 12 months ago.

14.3 match 8 stars 7.33 score 12 scripts 78 dependents

bucky2177

dRiftDM:Estimating (Time-Dependent) Drift Diffusion Models

Fit and explore Drift Diffusion Models (DDMs), a common tool in psychology for describing decision processes in simple tasks. It can handle both time-independent and time-dependent DDMs. You either choose prebuilt models or create your own, and the package takes care of model predictions and parameter estimation. Model predictions are derived via the numerical solutions provided by Richter, Ulrich, and Janczyk (2023, <doi:10.1016/j.jmp.2023.102756>).

Maintained by Valentin Koob. Last updated 14 days ago.

cpp

15.7 match 6 stars 6.58 score 5 scripts

bioc

swfdr:Estimation of the science-wise false discovery rate and the false discovery rate conditional on covariates

This package allows users to estimate the science-wise false discovery rate from Jager and Leek, "Empirical estimates suggest most published medical research is true," 2013, Biostatistics, using an EM approach due to the presence of rounding and censoring. It also allows users to estimate the false discovery rate conditional on covariates, using a regression framework, as per Boca and Leek, "A direct approach to estimating false discovery rates conditional on covariates," 2018, PeerJ.

Maintained by Simina M. Boca. Last updated 5 months ago.

multiplecomparison statisticalmethod software

16.4 match 3 stars 6.25 score 37 scripts

ycphs

openxlsx:Read, Write and Edit xlsx Files

Simplifies the creation of Excel .xlsx files by providing a high level interface to writing, styling and editing worksheets. Through the use of 'Rcpp', read/write times are comparable to the 'xlsx' and 'XLConnect' packages with the added benefit of removing the dependency on Java.

Maintained by Jan Marvin Garbuszus. Last updated 2 months ago.

xlsx cpp

5.3 match 232 stars 18.98 score 20k scripts 270 dependents

mlr-org

mlr3extralearners:Extra Learners For mlr3

Extra learners for use in mlr3.

Maintained by Sebastian Fischer. Last updated 4 months ago.

machine-learning mlr3

10.8 match 94 stars 9.16 score 474 scripts

rstudio

gt:Easily Create Presentation-Ready Display Tables

Build display tables from tabular data with an easy-to-use set of functions. With its progressive approach, we can construct display tables with a cohesive set of table parts. Table values can be formatted using any of the included formatting functions. Footnotes and cell styles can be precisely added through a location targeting system. The way in which 'gt' handles things for you means that you don't often have to worry about the fine details.

Maintained by Richard Iannone. Last updated 13 days ago.

docx easy-to-use html latex rtf summary-tables

5.3 match 2.1k stars 18.36 score 20k scripts 112 dependents

beerda

nuggets:Extensible Data Pattern Searching Framework

Extensible framework for subgroup discovery (Atzmueller (2015) <doi:10.1002/widm.1144>), contrast patterns (Chen (2022) <doi:10.48550/arXiv.2209.13556>), emerging patterns (Dong (1999) <doi:10.1145/312129.312191>), association rules (Agrawal (1994) <https://www.vldb.org/conf/1994/P487.PDF>) and conditional correlations (Hájek (1978) <doi:10.1007/978-3-642-66943-9>). Both crisp (Boolean, binary) and fuzzy data are supported. It generates conditions in the form of elementary conjunctions, evaluates them on a dataset and checks the induced sub-data for interesting statistical properties. A user-defined function may be supplied to evaluate each generated condition in a search for custom patterns.

Maintained by Michal Burda. Last updated 5 days ago.

association-rule-mining contrast-pattern-mining data-mining fuzzy knowledge-discovery pattern-recognition cpp openmp

18.2 match 2 stars 5.38 score 10 scripts

biomodhub

biomod2:Ensemble Platform for Species Distribution Modeling

Functions for species distribution modeling, calibration and evaluation, ensemble of models, ensemble forecasting and visualization. The package can consistently run up to 10 single models on a presence/absence (or presence/pseudo-absence) dataset and combine them in ensemble models and ensemble projections. A range of other evaluation and visualisation tools is also available within the package.

Maintained by Maya Guéguen. Last updated 21 hours ago.

7.0 match 95 stars 13.90 score 536 scripts 7 dependents

siacus

sde:Simulation and Inference for Stochastic Differential Equations

Companion package to the book Simulation and Inference for Stochastic Differential Equations With R Examples, ISBN 978-0-387-75838-1, Springer, NY.

Maintained by Stefano Maria Iacus. Last updated 2 years ago.

13.8 match 7.02 score 178 scripts 15 dependents

wasquith

lmomco:L-Moments, Censored L-Moments, Trimmed L-Moments, L-Comoments, and Many Distributions

Extensive functions for Lmoments (LMs) and probability-weighted moments (PWMs), distribution parameter estimation, LMs for distributions, LM ratio diagrams, multivariate Lcomoments, and asymmetric (asy) trimmed LMs (TLMs). Maximum likelihood and maximum product spacings estimation are available. Right-tail and left-tail LM censoring by threshold or indicator variable are available. LMs of residual (resid) and reversed (rev) residual life are implemented along with 13 quantile operators for reliability analyses. Exact analytical bootstrap estimates of order statistics, LMs, and LM var-covars are available. Harri-Coble Tau34-squared Normality Test is available. Distributions with L, TL, and added (+) support for right-tail censoring (RC) encompass: Asy Exponential (Exp) Power [L], Asy Triangular [L], Cauchy [TL], Eta-Mu [L], Exp. [L], Gamma [L], Generalized (Gen) Exp Poisson [L], Gen Extreme Value [L], Gen Lambda [L, TL], Gen Logistic [L], Gen Normal [L], Gen Pareto [L+RC, TL], Govindarajulu [L], Gumbel [L], Kappa [L], Kappa-Mu [L], Kumaraswamy [L], Laplace [L], Linear Mean Residual Quantile Function [L], Normal [L], 3p log-Normal [L], Pearson Type III [L], Polynomial Density-Quantile 3 and 4 [L], Rayleigh [L], Rev-Gumbel [L+RC], Rice [L], Singh Maddala [L], Slash [TL], 3p Student t [L], Truncated Exponential [L], Wakeby [L], and Weibull [L].

Maintained by William Asquith. Last updated 1 month ago.

flood-frequency-analysis l-moments mle-estimation mps-estimation probability-distribution rainfall-frequency-analysis reliability-analysis risk-analysis survival-analysis

11.8 match 2 stars 8.06 score 458 scripts 38 dependents

cwolock

survML:Tools for Flexible Survival Analysis Using Machine Learning

Statistical tools for analyzing time-to-event data using machine learning. Implements survival stacking for conditional survival estimation, standardized survival function estimation for current status data, and methods for algorithm-agnostic variable importance. See Wolock CJ, Gilbert PB, Simon N, and Carone M (2024) <doi:10.1080/10618600.2024.2304070>.

Maintained by Charles Wolock. Last updated 9 hours ago.

11.7 match 16 stars 8.06 score 73 scripts 1 dependent

verbal-autopsy-software

InSilicoVA:Probabilistic Verbal Autopsy Coding with 'InSilicoVA' Algorithm

Computes individual causes of death and population cause-specific mortality fractions using the 'InSilicoVA' algorithm from McCormick et al. (2016) <DOI:10.1080/01621459.2016.1152191>. It uses data derived from verbal autopsy (VA) interviews, in a format similar to the input of the widely used 'InterVA' method. This package provides general model fitting and customization for the 'InSilicoVA' algorithm and basic graphical visualization of the output.

Maintained by Zehang Richard Li. Last updated 1 month ago.

va-algorithm openjdk

16.6 match 3 stars 5.67 score 35 scripts 1 dependent

bioc

BatchQC:Batch Effects Quality Control Software

Sequencing and microarray samples often are collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. BatchQC is a software tool that streamlines batch preprocessing and evaluation by providing interactive diagnostics, visualizations, and statistical analyses to explore the extent to which batch variation impacts the data. BatchQC diagnostics help determine whether batch adjustment needs to be done, and how correction should be applied before proceeding with a downstream analysis. Moreover, BatchQC interactively applies multiple common batch effect approaches to the data and the user can quickly see the benefits of each method. BatchQC is developed as a Shiny App. The output is organized into multiple tabs and each tab features an important part of the batch effect analysis and visualization of the data. The BatchQC interface has the following analysis groups: Summary, Differential Expression, Median Correlations, Heatmaps, Circular Dendrogram, PCA Analysis, Shape, ComBat and SVA.

Maintained by Jessica McClintock. Last updated 5 months ago.

batcheffect graphandnetwork microarray normalization principalcomponent sequencing software visualization qualitycontrol rnaseq preprocessing differentialexpression immunooncology

10.4 match 7 stars 8.96 score 54 scripts

bioc

edgeR:Empirical Analysis of Digital Gene Expression Data in R

Differential expression analysis of sequence count data. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models, quasi-likelihood, and gene set enrichment. Can perform differential analyses of any type of omics data that produces read counts, including RNA-seq, ChIP-seq, ATAC-seq, Bisulfite-seq, SAGE, CAGE, metabolomics, or proteomics spectral counts. RNA-seq analyses can be conducted at the gene or isoform level, and tests can be conducted for differential exon or transcript usage.

Maintained by Yunshun Chen. Last updated 7 days ago.

alternativesplicing batcheffect bayesian biomedicalinformatics cellbiology chipseq clustering coverage differentialexpression differentialmethylation differentialsplicing dnamethylation epigenetics functionalgenomics geneexpression genesetenrichment genetics immunooncology multiplecomparison normalization pathways proteomics qualitycontrol regression rnaseq sage sequencing singlecell systemsbiology timecourse transcription transcriptomics openblas

6.9 match 13.40 score 17k scripts 255 dependents

ocbe-uio

contingencytables:Statistical Analysis of Contingency Tables

Provides functions to perform statistical inference of data organized in contingency tables. This package is a companion to the "Statistical Analysis of Contingency Tables" book by Fagerland et al. <ISBN 9781466588172>.

Maintained by Waldir Leoncio. Last updated 7 months ago.

contingency-table

22.3 match 3 stars 4.13 score 8 scripts 1 dependent

pierreroudier

clhs:Conditioned Latin Hypercube Sampling

Conditioned Latin hypercube sampling, as published by Minasny and McBratney (2006) <DOI:10.1016/j.cageo.2005.12.009>. This method proposes to stratify sampling in the presence of ancillary data. An extension of this method, which associates a cost with each individual and takes it into account during the optimisation process, is also provided (Roudier et al., 2012, <DOI:10.1201/b12728>).

Maintained by Pierre Roudier. Last updated 3 years ago.

cpp

12.0 match 12 stars 7.54 score 115 scripts 2 dependents

kimberlywebb

COMBO:Correcting Misclassified Binary Outcomes in Association Studies

Use frequentist and Bayesian methods to estimate parameters from a binary outcome misclassification model. These methods correct for the problem of "label switching" by assuming that the sum of outcome sensitivity and specificity is at least 1. A description of the analysis methods is available in Hochstedler and Wells (2023) <doi:10.48550/arXiv.2303.10215>.

Maintained by Kimberly Hochstedler Webb. Last updated 21 days ago.

jags cpp

17.8 match 1 stars 5.08 score 4 scripts

colinfay

attempt:Tools for Defensive Programming

Tools for defensive programming, inspired by 'purrr' mappers and based on 'rlang'. 'attempt' extends and facilitates defensive programming by providing a consistent grammar, and provides a set of easy to use functions for common tests and conditions. 'attempt' only depends on 'rlang', and focuses on speed, so it can be easily integrated in other functions and used in data analysis.

Maintained by Colin Fay. Last updated 7 months ago.

7.8 match 126 stars 11.57 score 101 scripts 86 dependents

markusfritsch

pdynmc:Moment Condition Based Estimation of Linear Dynamic Panel Data Models

Linear dynamic panel data modeling based on linear and nonlinear moment conditions as proposed by Holtz-Eakin, Newey, and Rosen (1988) <doi:10.2307/1913103>, Ahn and Schmidt (1995) <doi:10.1016/0304-4076(94)01641-C>, and Arellano and Bover (1995) <doi:10.1016/0304-4076(94)01642-D>. Estimation of the model parameters relies on the Generalized Method of Moments (GMM) and instrumental variables (IV) estimation, numerical optimization (when nonlinear moment conditions are employed) and the computation of closed form solutions (when estimation is based on linear moment conditions). One-step, two-step and iterated estimation is available. For inference and specification testing, Windmeijer (2005) <doi:10.1016/j.jeconom.2004.02.005> and doubly corrected standard errors (Hwang, Kang, Lee, 2021 <doi:10.1016/j.jeconom.2020.09.010>) are available. Additionally, serial correlation tests, tests for overidentification, and Wald tests are provided. Functions for visualizing panel data structures and modeling results obtained from GMM estimation are also available. The plot methods include functions to plot unbalanced panel structure, coefficient ranges and coefficient paths across GMM iterations (the latter is implemented according to the plot shown in Hansen and Lee, 2021 <doi:10.3982/ECTA16274>). For a more detailed description of the GMM-based functionality, please see Fritsch, Pua, Schnurbus (2021) <doi:10.32614/RJ-2021-035>. For more details on the IV-based estimation routines, see Fritsch, Pua, and Schnurbus (WP, 2024) and Han and Phillips (2010) <doi:10.1017/S026646660909063X>.

Maintained by Markus Fritsch. Last updated 15 days ago.

13.4 match 4 stars 6.65 score 106 scripts

r-lib

lintr:A 'Linter' for R Code

Checks adherence to a given style, syntax errors and possible semantic issues. Supports on the fly checking of R code edited with 'RStudio IDE', 'Emacs', 'Vim', 'Sublime Text', 'Atom' and 'Visual Studio Code'.

Maintained by Michael Chirico. Last updated 14 hours ago.

linter

5.2 match 1.2k stars 16.99 score 916 scripts 33 dependents

rstudio

shiny:Web Application Framework for R

Makes it incredibly easy to build interactive web applications with R. Automatic "reactive" binding between inputs and outputs and extensive prebuilt widgets make it possible to build beautiful, responsive, and powerful applications with minimal effort.

Maintained by Winston Chang. Last updated 15 days ago.

reactive rstudio shiny web-app web-development

4.1 match 5.4k stars 21.28 score 108k scripts 1.8k dependents

janmarvin

openxlsx2:Read, Write and Edit 'xlsx' Files

Simplifies the creation of 'xlsx' files by providing a high level interface to writing, styling and editing worksheets.

Maintained by Jan Marvin Garbuszus. Last updated 14 hours ago.

xlsx cpp

6.4 match 138 stars 13.66 score 194 scripts 11 dependents

immunogenomics

harmony:Fast, Sensitive, and Accurate Integration of Single Cell Data

Implementation of the Harmony algorithm for single cell integration, described in Korsunsky et al <doi:10.1038/s41592-019-0619-0>. Package includes a standalone Harmony function and interfaces to external frameworks.

Maintained by Ilya Korsunsky. Last updated 4 months ago.

algorithm data-integration scrna-seq openblas cpp

6.4 match 554 stars 13.74 score 5.5k scripts 8 dependents

jeffreyracine

np:Nonparametric Kernel Smoothing Methods for Mixed Data Types

Nonparametric (and semiparametric) kernel methods that seamlessly handle a mix of continuous, unordered, and ordered factor data types. We would like to gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada (NSERC, <https://www.nserc-crsng.gc.ca/>), the Social Sciences and Humanities Research Council of Canada (SSHRC, <https://www.sshrc-crsh.gc.ca/>), and the Shared Hierarchical Academic Research Computing Network (SHARCNET, <https://sharcnet.ca/>). We would also like to acknowledge the contributions of the GNU GSL authors. In particular, we adapt the GNU GSL B-spline routine gsl_bspline.c adding automated support for quantile knots (in addition to uniform knots), providing missing functionality for derivatives, and for extending the splines beyond their endpoints.

Maintained by Jeffrey S. Racine. Last updated 1 month ago.

6.8 match 49 stars 12.64 score 672 scripts 44 dependents

usdaforestservice

FIESTA:Forest Inventory Estimation and Analysis

A research estimation tool for analysts that work with sample-based inventory data from the U.S. Department of Agriculture, Forest Service, Forest Inventory and Analysis (FIA) Program.

Maintained by Grayson White. Last updated 17 hours ago.

11.8 match 30 stars 7.25 score 62 scripts

bioc

BiocGenerics:S4 generic functions used in Bioconductor

The package defines many S4 generic functions used in Bioconductor.

Maintained by Hervé Pagès. Last updated 1 month ago.

infrastructure bioconductor-package core-package

6.0 match 12 stars 14.22 score 612 scripts 2.2k dependents

the-hull

datacleanr:Interactive and Reproducible Data Cleaning

Flexible and efficient cleaning of data with interactivity. 'datacleanr' facilitates best practices in data analyses and reproducibility with built-in features and by translating interactive/manual operations to code. The package is designed for interoperability, and so seamlessly fits into reproducible analyses pipelines in 'R'.

Maintained by Alexander Hurley. Last updated 3 years ago.

annotation-tool data-cleaning outlier-detection outlier-removal reproducibility

19.2 match 20 stars 4.38 score 24 scripts

ropensci

tarchetypes:Archetypes for Targets

Function-oriented Make-like declarative pipelines for statistics and data science are supported in the 'targets' R package. As an extension to 'targets', the 'tarchetypes' package provides convenient user-side functions to make 'targets' easier to use. By establishing reusable archetypes for common kinds of targets and pipelines, these functions help express complicated reproducible pipelines concisely and compactly. The methods in this package were influenced by the 'targets' R package by Will Landau (2018) <doi:10.21105/joss.00550>.

Maintained by William Michael Landau. Last updated 22 days ago.

data-science high-performance-computing peer-reviewed pipeline r-targetopia reproducibility targets workflow

7.2 match 141 stars 11.43 score 1.7k scripts 10 dependents

vincentarelbundock

marginaleffects:Predictions, Comparisons, Slopes, Marginal Means, and Hypothesis Tests

Compute and plot predictions, slopes, marginal means, and comparisons (contrasts, risk ratios, odds, etc.) for over 100 classes of statistical and machine learning models in R. Conduct linear and non-linear hypothesis tests, or equivalence tests. Calculate uncertainty estimates using the delta method, bootstrapping, or simulation-based inference. Details can be found in Arel-Bundock, Greifer, and Heiss (2024) <doi:10.18637/jss.v111.i09>.

Maintained by Vincent Arel-Bundock. Last updated 2 days ago.

cpp

5.6 match 505 stars 14.51 score 1.8k scripts 9 dependents

bioc

metaseqR2:An R package for the analysis and result reporting of RNA-Seq data by combining multiple statistical algorithms

Provides an interface to several normalization and statistical testing packages for RNA-Seq gene expression data. Additionally, it creates several diagnostic plots, performs meta-analysis by combining the results of several statistical tests and reports the results in an interactive way.

Maintained by Panagiotis Moulos. Last updated 6 days ago.

software geneexpression differentialexpression workflowstep preprocessing qualitycontrol normalization reportwriting rnaseq transcription sequencing transcriptomics bayesian clustering cellbiology biomedicalinformatics functionalgenomics systemsbiology immunooncology alternativesplicing differentialsplicing multiplecomparison timecourse dataimport atacseq epigenetics regression proprietaryplatforms genesetenrichment batcheffect chipseq

13.4 match 7 stars 6.05 score 3 scripts

kjhealy

gssrdoc:Document General Social Survey Variable

The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains a tibble with information on the survey variables, together with every variable documented as an R help page. For more information on the GSS see <http://gss.norc.org>.

Maintained by Kieran Healy. Last updated 11 months ago.

35.6 match 2.28 score 38 scripts

hneth

riskyr:Rendering Risk Literacy more Transparent

Risk-related information (like the prevalence of conditions, the sensitivity and specificity of diagnostic tests, or the effectiveness of interventions or treatments) can be expressed in terms of frequencies or probabilities. By providing a toolbox of corresponding metrics and representations, 'riskyr' computes, translates, and visualizes risk-related information in a variety of ways. Adopting multiple complementary perspectives provides insights into the interplay between key parameters and renders teaching and training programs on risk literacy more transparent.

Maintained by Hansjoerg Neth. Last updated 10 months ago.

2x2-matrix bayesian-inference contingency-table representation risk risk-literacy visualization

11.2 match 19 stars 7.18 score 80 scripts

njtierney

naniar:Data Structures, Summaries, and Visualisations for Missing Data

Missing values are ubiquitous in data and need to be explored and handled in the initial stages of analysis. 'naniar' provides data structures and functions that facilitate the plotting of missing values and examination of imputations. This allows missing data dependencies to be explored with minimal deviation from the common work patterns of 'ggplot2' and tidy data. The work is fully discussed at Tierney & Cook (2023) <doi:10.18637/jss.v105.i07>.

Maintained by Nicholas Tierney. Last updated 5 days ago.

data-visualisation ggplot2 missing-data missingness tidy-data

5.2 match 657 stars 15.63 score 5.1k scripts 9 dependents

jakobbossek

cmaesr:Covariance Matrix Adaptation Evolution Strategy

Pure R implementation of the Covariance Matrix Adaptation - Evolution Strategy (CMA-ES) with optional restarts (IPOP-CMA-ES).

Maintained by Jakob Bossek. Last updated 8 years ago.

21.5 match 9 stars 3.73 score 12 scripts

philchalmers

SimDesign:Structure for Organizing Monte Carlo Simulation Designs

Provides tools to safely and efficiently organize and execute Monte Carlo simulation experiments in R. The package controls the structure and back-end of Monte Carlo simulation experiments by utilizing a generate-analyse-summarise workflow. The workflow safeguards against common simulation coding issues, such as automatically re-simulating non-convergent results, prevents inadvertently overwriting simulation files, catches error and warning messages during execution, implicitly supports parallel processing with high-quality random number generation, and provides tools for managing high-performance computing (HPC) array jobs submitted to schedulers such as SLURM. For a pedagogical introduction to the package see Sigal and Chalmers (2016) <doi:10.1080/10691898.2016.1246953>. For a more in-depth overview of the package and its design philosophy see Chalmers and Adkins (2020) <doi:10.20982/tqmp.16.4.p248>.

Maintained by Phil Chalmers. Last updated 22 hours ago.

monte-carlo-simulation simulation simulation-framework

6.0 match 62 stars 13.38 score 253 scripts 46 dependents

marshalllab

MGDrivE2:Mosquito Gene Drive Explorer 2

A simulation modeling framework which significantly extends capabilities from the 'MGDrivE' simulation package via a new mathematical and computational framework based on stochastic Petri nets. For more information about 'MGDrivE', see our publication: <https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.13318>. Some of the notable capabilities of 'MGDrivE2' include: incorporation of human populations, epidemiological dynamics, time-varying parameters, and a continuous-time simulation framework with various sampling algorithms for both deterministic and stochastic interpretations. 'MGDrivE2' relies on the genetic inheritance structures provided in package 'MGDrivE', so we suggest installing that package initially.

Maintained by Sean L. Wu. Last updated 4 years ago.

12.5 match 6 stars 6.33 score 30 scripts

bioc

projectR:Functions for the projection of weights from PCA, CoGAPS, NMF, correlation, and clustering

Functions for the projection of data into the spaces defined by PCA, CoGAPS, NMF, correlation, and clustering.

Maintained by Genevieve Stein-OBrien. Last updated 5 months ago.

functionalprediction generegulation biologicalquestion software

9.6 match 62 stars 8.11 score 70 scripts

thijsjanzen

GUILDS:Implementation of Sampling Formulas for the Unified Neutral Model of Biodiversity and Biogeography, with or without Guild Structure

A collection of sampling formulas for the unified neutral model of biogeography and biodiversity. Alongside the sampling formulas, it includes methods to perform maximum likelihood optimization of the sampling formulas, methods to generate data given the neutral model, and methods to estimate the expected species abundance distribution. Sampling formulas included in the GUILDS package are the Etienne Sampling Formula (Etienne 2005), the guild sampling formula, where guilds are assumed to differ in dispersal ability (Janzen et al. 2015), and the guilds sampling formula conditioned on guild size (Janzen et al. 2015).

Maintained by Thijs Janzen. Last updated 5 days ago.

cpp

14.4 match 2 stars 5.43 score 18 scripts 5 dependents

alishinski

lavaanPlot:Path Diagrams for 'Lavaan' Models via 'DiagrammeR'

Plots path diagrams from models in 'lavaan' using the plotting functionality from the 'DiagrammeR' package. 'DiagrammeR' provides nice path diagrams via 'Graphviz', and these functions make it easy to generate these diagrams from a 'lavaan' path model without having to write the DOT language graph specification.

Maintained by Alex Lishinski. Last updated 1 year ago.

9.4 match 40 stars 8.33 score 294 scripts

majianthu

copent:Estimating Copula Entropy and Transfer Entropy

The nonparametric methods for estimating copula entropy, transfer entropy, and the statistics for multivariate normality test and two-sample test are implemented. The methods for estimating transfer entropy and the statistics for multivariate normality test and two-sample test are based on the method for estimating copula entropy. The method for change point detection with copula entropy based two-sample test is also implemented. Please refer to Ma and Sun (2011) <doi:10.1016/S1007-0214(11)70008-6>, Ma (2019) <doi:10.48550/arXiv.1910.04375>, Ma (2022) <doi:10.48550/arXiv.2206.05956>, Ma (2023) <doi:10.48550/arXiv.2307.07247>, and Ma (2024) <doi:10.48550/arXiv.2403.07892> for more information.

Maintained by MA Jian. Last updated 9 months ago.

causal-discovery causality change-point-detection conditional-independence-test conditional-mutual-information copula copula-entropy correlation entropy granger-causality information-theory mutual-information mutualinf normality-test transfer-entropy two-sample-test variable-selection

15.1 match 41 stars 5.15 score 23 scripts 1 dependent

mbq

praznik:Tools for Information-Based Feature Selection and Scoring

A toolbox of fast, native and parallel implementations of various information-based importance criteria estimators and feature selection filters based on them, inspired by the overview by Brown, Pocock, Zhao and Lujan (2012) <https://www.jmlr.org/papers/v13/brown12a.html>. Contains, among others, the minimum redundancy maximal relevancy ('mRMR') method by Peng, Long and Ding (2005) <doi:10.1109/TPAMI.2005.159>; the joint mutual information ('JMI') method by Yang and Moody (1999) <https://papers.nips.cc/paper/1779-data-visualization-and-feature-selection-new-algorithms-for-nongaussian-data>; the double input symmetrical relevance ('DISR') method by Meyer and Bontempi (2006) <doi:10.1007/11732242_9>; and the joint mutual information maximisation ('JMIM') method by Bennasar, Hicks and Setchi (2015) <doi:10.1016/j.eswa.2015.07.007>.

Maintained by Miron B. Kursa. Last updated 2 years ago.

openmp

15.4 match 5.05 score 34 scripts 6 dependents

r-forge

party:A Laboratory for Recursive Partytioning

A computational toolbox for recursive partitioning. The core of the package is ctree(), an implementation of conditional inference trees which embed tree-structured regression models into a well defined theory of conditional inference procedures. This non-parametric class of regression trees is applicable to all kinds of regression problems, including nominal, ordinal, numeric, censored as well as multivariate response variables and arbitrary measurement scales of the covariates. Based on conditional inference trees, cforest() provides an implementation of Breiman's random forests. The function mob() implements an algorithm for recursive partitioning based on parametric models (e.g. linear models, GLMs or survival regression) employing parameter instability tests for split selection. Extensible functionality for visualizing tree-structured regression models is available. The methods are described in Hothorn et al. (2006) <doi:10.1198/106186006X133933>, Zeileis et al. (2008) <doi:10.1198/106186008X319331> and Strobl et al. (2007) <doi:10.1186/1471-2105-8-25>.

Maintained by Torsten Hothorn. Last updated 2 months ago.

openblas

6.6 match 11.52 score 3.2k scripts 29 dependents

pharmaverse

admiral:ADaM in R Asset Library

A toolbox for programming Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets in R. ADaM datasets are a mandatory part of any New Drug or Biologics License Application submitted to the United States Food and Drug Administration (FDA). Analysis derivations are implemented in accordance with the "Analysis Data Model Implementation Guide" (CDISC Analysis Data Model Team, 2021, <https://www.cdisc.org/standards/foundational/adam>).

Maintained by Ben Straub. Last updated 18 hours ago.

cdisc clinical-trials open-source

5.5 match 238 stars 13.92 score 486 scripts 4 dependents

tidyverse

dplyr:A Grammar of Data Manipulation

A fast, consistent tool for working with data frame like objects, both in memory and out of memory.

Maintained by Hadley Wickham. Last updated 15 days ago.

data-manipulation grammar cpp

3.1 match 4.8k stars 24.68 score 659k scripts 7.8k dependents

epiverse-trace

epidemics:Composable Epidemic Scenario Modelling

A library of compartmental epidemic models taken from the published literature, and classes to represent affected populations, public health response measures including non-pharmaceutical interventions on social contacts, non-pharmaceutical and pharmaceutical interventions that affect disease transmissibility, vaccination regimes, and disease seasonality, which can be combined to compose epidemic scenario models.

Maintained by Rosalind Eggo. Last updated 9 months ago.

decision-support epidemic-modelling epidemic-simulations epidemiology epiverse infectious-disease-dynamics model-library non-pharmaceutical-interventions rcpp rcppeigen scenario-analysis vaccination cpp

10.1 match 9 stars 7.48 score 59 scripts

rfastofficial

Rfast:A Collection of Efficient and Extremely Fast R Functions

A collection of fast (utility) functions for data analysis. Column and row wise means, medians, variances, minimums, maximums, many t, F and G-square tests, many regressions (normal, logistic, Poisson), are some of the many fast functions. References: a) Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>. b) Tsagris M. and Papadakis M. (2018). Forward regression in R: from the extreme slow to the extreme fast. Journal of Data Science, 16(4): 771--780. <doi:10.6339/JDS.201810_16(4).00006>. c) Chatzipantsiou C., Dimitriadis M., Papadakis M. and Tsagris M. (2020). Extremely Efficient Permutation and Bootstrap Hypothesis Tests Using R. Journal of Modern Applied Statistical Methods, 18(2), eP2898. <doi:10.48550/arXiv.1806.10947>. d) Tsagris M., Papadakis M., Alenazi A. and Alzeley O. (2024). Computationally Efficient Outlier Detection for High-Dimensional Data Using the MDP Algorithm. Computation, 12(9): 185. <doi:10.3390/computation12090185>. e) Tsagris M. and Papadakis M. (2025). Fast and light-weight energy statistics using the R package Rfast. <doi:10.48550/arXiv.2501.02849>.

Maintained by Manos Papadakis. Last updated 19 days ago.

openblas cpp openmp

5.9 match 147 stars 12.54 score 1.2k scripts 166 dependents

gavinsimpson

gratia:Graceful 'ggplot'-Based Graphics and Other Functions for GAMs Fitted Using 'mgcv'

Graceful 'ggplot'-based graphics and utility functions for working with generalized additive models (GAMs) fitted using the 'mgcv' package. Provides a reimplementation of the plot() method for GAMs that 'mgcv' provides, as well as 'tidyverse' compatible representations of estimated smooths.

Maintained by Gavin L. Simpson. Last updated 2 days ago.

distributional-regression gam gamm generalized-additive-mixed-models generalized-additive-models ggplot2 glm lm mgcv penalized-spline random-effects smoothing splines

5.7 match 217 stars 12.99 score 1.6k scripts 2 dependents

munterfi

eRTG3D:Empirically Informed Random Trajectory Generation in 3-D

Creates realistic random trajectories in a 3-D space between two given fix points, so-called conditional empirical random walks (CERWs). The trajectory generation is based on empirical distribution functions extracted from observed trajectories (training data) and thus reflects the geometrical movement characteristics of the mover. A digital elevation model (DEM), representing the Earth's surface, and a background layer of probabilities (e.g. food sources, uplift potential, waterbodies, etc.) can be used to influence the trajectories. Unterfinger M (2018). "3-D Trajectory Simulation in Movement Ecology: Conditional Empirical Random Walk". Master's thesis, University of Zurich. <https://www.geo.uzh.ch/dam/jcr:6194e41e-055c-4635-9807-53c5a54a3be7/MasterThesis_Unterfinger_2018.pdf>. Technitis G, Weibel R, Kranstauber B, Safi K (2016). "An algorithm for empirically informed random trajectory generation between two endpoints". GIScience 2016: Ninth International Conference on Geographic Information Science, 9, online. <doi:10.5167/uzh-130652>.

Maintained by Merlin Unterfinger. Last updated 3 years ago.

3d birds conditional-empirical-random-walk gliding-and-soaring machine-learning movement-ecology random-trajectory-generator random-walk simulation trajectory-generation

13.0 match 6 stars 5.71 score 19 scripts

bcallaway11

did:Treatment Effects with Multiple Periods and Groups

The standard Difference-in-Differences (DID) setup involves two periods and two groups -- a treated group and untreated group. Many applications of DID methods involve more than two periods and have individuals that are treated at different points in time. This package contains tools for computing average treatment effect parameters in Difference in Differences setups with more than two periods and with variation in treatment timing using the methods developed in Callaway and Sant'Anna (2021) <doi:10.1016/j.jeconom.2020.12.001>. The main parameters are group-time average treatment effects which are the average treatment effect for a particular group at a particular time. These can be aggregated into a smaller number of treatment effect parameters, and the package deals with the cases where there is selective treatment timing, dynamic treatment effects, calendar time effects, or combinations of these. There are also functions for testing the Difference in Differences assumption, and plotting group-time average treatment effects.

Maintained by Brantly Callaway. Last updated 4 months ago.

6.1 match 327 stars 12.01 score 696 scripts 3 dependents

bioc

puma:Propagating Uncertainty in Microarray Analysis (including Affymetrix traditional 3' arrays and exon arrays and Human Transcriptome Array 2.0)

Most analyses of Affymetrix GeneChip data (including traditional 3' arrays and exon arrays and Human Transcriptome Array 2.0) are based on point estimates of expression levels and ignore the uncertainty of such estimates. By propagating uncertainty to downstream analyses we can improve results from microarray analyses. For the first time, the puma package makes a suite of uncertainty propagation methods available to a general audience. In addition to calculating gene expression from Affymetrix 3' arrays, puma also provides methods to process exon arrays and produce gene and isoform expression for alternative splicing studies. puma also offers improvements in terms of scope and speed of execution over previously available uncertainty propagation methods. Included are summarisation, differential expression detection, clustering and PCA methods, together with useful plotting functions.

Maintained by Xuejun Liu. Last updated 5 months ago.

microarray onechannel preprocessing differentialexpression clustering exonarray geneexpression mrnamicroarray chiponchip alternativesplicing differentialsplicing bayesian twochannel dataimport hta2.0

16.2 match 4.53 score 17 scripts

r-lib

testthat:Unit Testing for R

Software testing is important, but, in part because it is frustrating and boring, many of us avoid it. 'testthat' is a testing framework for R that is easy to learn and use, and integrates with your existing 'workflow'.

Maintained by Hadley Wickham. Last updated 18 days ago.

unit-testing cpp

3.5 match 900 stars 20.97 score 74k scripts 465 dependents

rstudio

keras3:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.

Maintained by Tomasz Kalinowski. Last updated 1 day ago.

5.3 match 845 stars 13.60 score 264 scripts 2 dependents

ropensci

targets:Dynamic Function-Oriented 'Make'-Like Declarative Pipelines

Pipeline tools coordinate the pieces of computationally demanding analysis projects. The 'targets' package is a 'Make'-like pipeline tool for statistics and data science in R. The package skips costly runtime for tasks that are already up to date, orchestrates the necessary computation with implicit parallel computing, and abstracts files as R objects. If all the current output matches the current upstream code and data, then the whole pipeline is up to date, and the results are more trustworthy than otherwise. The methodology in this package borrows from GNU 'Make' (2015, ISBN:978-9881443519) and 'drake' (2018, <doi:10.21105/joss.00550>).

Maintained by William Michael Landau. Last updated 16 hours ago.

data-science high-performance-computing make peer-reviewed pipeline r-targetopia reproducibility reproducible-research targets workflow

4.8 match 975 stars 15.18 score 4.6k scripts 22 dependents

markajoc

condvis:Conditional Visualization for Statistical Models

Exploring fitted models by interactively taking 2-D and 3-D sections in data space.

Maintained by Mark OConnell. Last updated 7 years ago.

models statistics visualization

16.1 match 20 stars 4.38 score 24 scripts

humaniverse

asylum:Data on Asylum and Resettlement for the UK

Data on Asylum and Resettlement for the UK, provided by the Home Office <https://www.gov.uk/government/statistical-data-sets/immigration-system-statistics-data-tables>.

Maintained by Matthew Gwynfryn Thomas. Last updated 19 days ago.

13.9 match 3 stars 4.99 score 36 scripts

stan-dev

rstanarm:Bayesian Applied Regression Modeling via Stan

Estimates previously compiled regression models using the 'rstan' package, which provides the R interface to the Stan C++ library for Bayesian estimation. Users specify models via the customary R syntax with a formula and data.frame plus some additional arguments for priors.

Maintained by Ben Goodrich. Last updated 9 months ago.

bayesian bayesian-data-analysis bayesian-inference bayesian-methods bayesian-statistics multilevel-models rstan rstanarm stan statistical-modeling cpp

4.4 match 393 stars 15.68 score 5.0k scripts 13 dependents

ralmond

RNetica:R interface to Netica(R) Bayesian Network Engine

This provides an R interface to the Netica (http://norsys.com/) Bayesian network library API.

Maintained by Russell Almond. Last updated 3 months ago.

bayesian-network

13.9 match 2 stars 4.92 score 14 scripts 2 dependents

melff

memisc:Management of Survey Data and Presentation of Analysis Results

An infrastructure for the management of survey data including value labels, definable missing values, recoding of variables, production of code books, and import of (subsets of) 'SPSS' and 'Stata' files is provided. Further, the package can produce tables and data frames of arbitrary descriptive statistics and (almost) publication-ready tables of regression model estimates, which can be exported to 'LaTeX' and HTML.

Maintained by Martin Elff. Last updated 13 days ago.

survey-data

5.5 match 46 stars 12.34 score 1.2k scripts 13 dependents

hojsgaard

gRim:Graphical Interaction Models

Provides the following types of models: models for contingency tables (i.e. log-linear models); graphical Gaussian models for multivariate normal data (i.e. covariance selection models); and mixed interaction models. Documentation about 'gRim' is provided by vignettes included in this package and the book by Højsgaard, Edwards and Lauritzen (2012, <doi:10.1007/978-1-4614-2299-0>); see 'citation("gRim")' for details.

Maintained by Søren Højsgaard. Last updated 5 months ago.

openblas cpp

11.8 match 2 stars 5.77 score 74 scripts

alexkowa

EnvStats:Package for Environmental Statistics, Including US EPA Guidance

Graphical and statistical analyses of environmental data, with focus on analyzing chemical concentrations and physical parameters, usually in the context of mandated environmental monitoring. Major environmental statistical methods found in the literature and regulatory guidance documents, with extensive help that explains what these methods do, how to use them, and where to find them in the literature. Numerous built-in data sets from regulatory guidance documents and environmental statistics literature. Includes scripts reproducing analyses presented in the book "EnvStats: An R Package for Environmental Statistics" (Millard, 2013, Springer, ISBN 978-1-4614-8455-4, <doi:10.1007/978-1-4614-8456-1>).

Maintained by Alexander Kowarik. Last updated 18 days ago.

5.3 match 26 stars 12.80 score 2.4k scripts 46 dependents

bioc

InterCellar:InterCellar: an R-Shiny app for interactive analysis and exploration of cell-cell communication in single-cell transcriptomics

InterCellar is implemented as an R/Bioconductor Package containing a Shiny app that allows users to interactively analyze cell-cell communication from scRNA-seq data. Starting from precomputed ligand-receptor interactions, InterCellar provides filtering options, annotations and multiple visualizations to explore clusters, genes and functions. Finally, based on functional annotation from Gene Ontology and pathway databases, InterCellar implements data-driven analyses to investigate cell-cell communication in one or multiple conditions.

Maintained by Marta Interlandi. Last updated 5 months ago.

software singlecell visualization go transcriptomics

13.5 match 9 stars 4.95 score 7 scripts

spatstat

spatstat.geom:Geometrical Functionality of the 'spatstat' Family

Defines spatial data types and supports geometrical operations on them. Data types include point patterns, windows (domains), pixel images, line segment patterns, tessellations and hyperframes. Capabilities include creation and manipulation of data (using command line or graphical interaction), plotting, geometrical operations (rotation, shift, rescale, affine transformation), convex hull, discretisation and pixellation, Dirichlet tessellation, Delaunay triangulation, pairwise distances, nearest-neighbour distances, distance transform, morphological operations (erosion, dilation, closing, opening), quadrat counting, geometrical measurement, geometrical covariance, colour maps, calculus on spatial domains, Gaussian blur, level sets of images, transects of images, intersections between objects, minimum distance matching. (Excludes spatial data on a network, which are supported by the package 'spatstat.linnet'.)

Maintained by Adrian Baddeley. Last updated 9 hours ago.

classes-and-objects distance-calculation geometry geometry-processing images mensuration plotting point-patterns spatial-data spatial-data-analysis

5.5 match 7 stars 12.10 score 241 scripts 227 dependents

bioc

mirTarRnaSeq:mirTarRnaSeq

The mirTarRnaSeq R package can be used for interactive mRNA miRNA sequencing statistical analysis. This package utilizes expression or differential expression mRNA and miRNA sequencing results and performs interactive correlation and various GLM (Regular GLM, Multivariate GLM, and Interaction GLM) analyses between mRNA and miRNA experiments. These experiments can be time point experiments and/or condition experiments.

Maintained by Mercedeh Movassagh. Last updated 5 months ago.

mirna regression software sequencing smallrna timecourse differentialexpression

16.4 match 4.00 score 9 scripts

janoleko

LaMa:Fast Numerical Maximum Likelihood Estimation for Latent Markov Models

A variety of latent Markov models, including hidden Markov models, hidden semi-Markov models, state-space models and continuous-time variants can be formulated and estimated within the same framework via directly maximising the likelihood function using the so-called forward algorithm. Applied researchers often need custom models that standard software does not easily support. Writing tailored 'R' code offers flexibility but suffers from slow estimation speeds. We address these issues by providing easy-to-use functions (written in 'C++' for speed) for common tasks like the forward algorithm. These functions can be combined into custom models in a Lego-type approach, offering up to 10-20 times faster estimation via standard numerical optimisers. To aid in building fully custom likelihood functions, several vignettes are included that show how to simulate data from and estimate all the above model classes.

Maintained by Jan-Ole Koslik. Last updated 3 days ago.

openblas cpp openmp

8.4 match 9 stars 7.84 score 42 scripts

rje42

rje:Miscellaneous Useful Functions for Statistics

A series of functions in some way considered useful to the author. These include methods for subsetting tables and generating indices for arrays, conditioning and intervening in probability distributions, generating combinations, fast transformations, and more...

Maintained by Robin Evans. Last updated 12 months ago.

10.0 match 6.50 score 173 scripts 10 dependents

kkholst

targeted:Targeted Inference

Various methods for targeted and semiparametric inference including augmented inverse probability weighted (AIPW) estimators for missing data and causal inference (Bang and Robins (2005) <doi:10.1111/j.1541-0420.2005.00377.x>), variable importance and conditional average treatment effects (CATE) (van der Laan (2006) <doi:10.2202/1557-4679.1008>), estimators for risk differences and relative risks (Richardson et al. (2017) <doi:10.1080/01621459.2016.1192546>), assumption lean inference for generalized linear model parameters (Vansteelandt et al. (2022) <doi:10.1111/rssb.12504>).

Maintained by Klaus K. Holst. Last updated 1 month ago.

causal-inference double-robust estimation semiparametric-estimation statistics openblas cpp openmp

9.1 match 11 stars 7.20 score 30 scripts 1 dependent

hadley

assertthat:Easy Pre and Post Assertions

An extension to stopifnot() that makes it easy to declare the pre and post conditions that your code should satisfy, while also producing friendly error messages so that your users know what's gone wrong.

Maintained by Hadley Wickham. Last updated 6 years ago.

4.3 match 207 stars 15.21 score 2.5k scripts 984 dependents

projectmosaic

mosaic:Project MOSAIC Statistics and Mathematics Teaching Utilities

Data sets and utilities from Project MOSAIC (<http://www.mosaic-web.org>) used to teach mathematics, statistics, computation and modeling. Funded by the NSF, Project MOSAIC is a community of educators working to tie together aspects of quantitative work that students in science, technology, engineering and mathematics will need in their professional lives, but which are usually taught in isolation, if at all.

Maintained by Randall Pruim. Last updated 1 year ago.

4.9 match 93 stars 13.32 score 7.2k scripts 7 dependents

harrysouthworth

texmex:Statistical Modelling of Extreme Values

Statistical extreme value modelling of threshold excesses, maxima and multivariate extremes. Univariate models for threshold excesses and maxima are the Generalised Pareto, and Generalised Extreme Value model respectively. These models may be fitted by using maximum (optionally penalised-)likelihood, or Bayesian estimation, and both classes of models may be fitted with covariates in any/all model parameters. Model diagnostics support the fitting process. Graphical output for visualising fitted models and return level estimates is provided. For serially dependent sequences, the intervals declustering algorithm of Ferro and Segers (2003) <doi:10.1111/1467-9868.00401> is provided, with diagnostic support to aid selection of threshold and declustering horizon. Multivariate modelling is performed via the conditional approach of Heffernan and Tawn (2004) <doi:10.1111/j.1467-9868.2004.02050.x>, with graphical tools for threshold selection and to diagnose estimation convergence.

Maintained by Harry Southworth. Last updated 1 year ago.

cpp

9.3 match 7 stars 6.92 score 66 scripts 1 dependent

pettermostad

lestat:A Package for Learning Statistics

Some simple objects and functions to do statistics using linear models and a Bayesian framework.

Maintained by Petter Mostad. Last updated 7 years ago.

27.8 match 2.28 score 64 scripts 1 dependent

nelson-n

lmForc:Linear Model Forecasting

Introduces in-sample, out-of-sample, pseudo out-of-sample, and benchmark model forecast tests and a new class for working with forecast data, Forecast.

Maintained by Nelson Rayl. Last updated 7 months ago.

forecasting linear-models

12.0 match 6 stars 5.26 score 20 scripts

moviedo5

fda.usc:Functional Data Analysis and Utilities for Statistical Computing

Routines for exploratory and descriptive analysis of functional data such as depth measurements, atypical curves detection, regression models, supervised classification, unsupervised classification and functional analysis of variance.

Maintained by Manuel Oviedo de la Fuente. Last updated 4 months ago.

functional-data-analysis fortran

6.5 match 12 stars 9.72 score 560 scripts 22 dependents

tiledb-inc

tiledb:Modern Database Engine for Complex Data Based on Multi-Dimensional Arrays

The modern database 'TileDB' introduces a powerful on-disk format for storing and accessing any complex data based on multi-dimensional arrays. It supports dense and sparse arrays, dataframes and key-value stores, cloud storage ('S3', 'GCS', 'Azure'), chunked arrays, multiple compression, encryption and checksum filters, uses a fully multi-threaded implementation, supports parallel I/O, data versioning ('time travel'), metadata and groups. It is implemented as an embeddable cross-platform C++ library with APIs from several languages, and integrations. This package provides the R support.

Maintained by Isaiah Norton. Last updated 6 days ago.

array hdfs s3 storage-manager tiledb cpp

5.3 match 107 stars 11.96 score 306 scripts 4 dependents

jcrodriguez1989

rco:The R Code Optimizer

Automatically apply different strategies to optimize R code. 'rco' functions take R code as input and return R code as output.

Maintained by Juan Cruz Rodriguez. Last updated 4 months ago.

compiler fast gcc hpc optimization optimizer

9.3 match 82 stars 6.73 score

uscbiostats

hJAM:Hierarchical Joint Analysis of Marginal Summary Statistics

Provides functions to implement a hierarchical approach which is designed to perform joint analysis of summary statistics using the framework of Mendelian Randomization or transcriptome analysis. Reference: Lai Jiang, Shujing Xu, Nicholas Mancuso, Paul J. Newcombe, David V. Conti (2020). "A Hierarchical Approach Using Marginal Summary Statistics for Multiple Intermediates in a Mendelian Randomization or Transcriptome Analysis." <bioRxiv><doi:10.1101/2020.02.03.924241>.

Maintained by Lai Jiang. Last updated 1 year ago.

12.2 match 9 stars 5.13 score 5 scripts

flr

FLBEIA:Bio-Economic Impact Assessment of Management Strategies using FLR

A simulation toolbox that describes a fishery system under a Management Strategy Evaluation approach. The objective of the model is to facilitate the Bio-Economic evaluation of Management strategies. It is multistock, multifleet and seasonal. The simulation is divided into two main blocks, the Operating Model (OM) and the Management Procedure (MP). In turn, each of these two blocks is divided into three components: the biological, the fleets and the covariables on the one hand, and the observation, the assessment and the advice on the other.

Maintained by FLBEIA Team. Last updated 7 days ago.

cpp

10.5 match 11 stars 5.97 score 156 scripts

cran

mixAK:Multivariate Normal Mixture Models and Mixtures of Generalized Linear Mixed Models Including Model Based Clustering

Contains a mixture of statistical methods including the MCMC methods to analyze normal mixtures. Additionally, model based clustering methods are implemented to perform classification based on (multivariate) longitudinal (or otherwise correlated) data. The basis for such clustering is a mixture of multivariate generalized linear mixed models. The package is primarily related to the publications Komárek (2009, Comp. Stat. and Data Anal.) <doi:10.1016/j.csda.2009.05.006> and Komárek and Komárková (2014, J. of Stat. Soft.) <doi:10.18637/jss.v059.i12>. It also implements methods published in Komárek and Komárková (2013, Ann. of Appl. Stat.) <doi:10.1214/12-AOAS580>, Hughes, Komárek, Bonnett, Czanner, García-Fiñana (2017, Stat. in Med.) <doi:10.1002/sim.7397>, Jaspers, Komárek, Aerts (2018, Biom. J.) <doi:10.1002/bimj.201600253> and Hughes, Komárek, Czanner, García-Fiñana (2018, Stat. Meth. in Med. Res) <doi:10.1177/0962280216674496>.

Maintained by Arnošt Komárek. Last updated 6 months ago.

openblas

17.5 match 4 stars 3.55 score 3 dependents

epimodel

EpiModel:Mathematical Modeling of Infectious Disease Dynamics

Tools for simulating mathematical models of infectious disease dynamics. Epidemic model classes include deterministic compartmental models, stochastic individual-contact models, and stochastic network models. Network models use the robust statistical methods of exponential-family random graph models (ERGMs) from the Statnet suite of software packages in R. Standard templates for epidemic modeling include SI, SIR, and SIS disease types. EpiModel features an API for extending these templates to address novel scientific research aims. Full methods for EpiModel are detailed in Jenness et al. (2018, <doi:10.18637/jss.v084.i08>).

Maintained by Samuel Jenness. Last updated 2 months ago.

agent-based-modeling epidemics epidemiology infectious-diseases network-graph cpp

5.3 match 250 stars 11.57 score 315 scripts

zeehio

condformat:Conditional Formatting in Data Frames

Apply and visualize conditional formatting to data frames in R. It renders a data frame with cells formatted according to criteria defined by rules, using a tidy evaluation syntax. The table is printed either by opening a web browser or within the 'RStudio' viewer if available. The conditional formatting rules allow highlighting cells that match a condition or adding a gradient background to a given column. This package supports both 'HTML' and 'LaTeX' outputs in 'knitr' reports, and exporting to an 'xlsx' file.

Maintained by Sergio Oller Moreno. Last updated 1 year ago.

formatting html latex table visualisation

9.4 match 25 stars 6.53 score 91 scripts 1 dependent

r-forge

Matrix:Sparse and Dense Matrix Classes and Methods

A rich hierarchy of sparse and dense matrix classes, including general, symmetric, triangular, and diagonal matrices with numeric, logical, or pattern entries. Efficient methods for operating on such matrices, often wrapping the 'BLAS', 'LAPACK', and 'SuiteSparse' libraries.

Maintained by Martin Maechler. Last updated 8 days ago.

openblas

3.6 match 1 star 17.23 score 33k scripts 12k dependents

mlr-org

paradox:Define and Work with Parameter Spaces for Complex Algorithms

Define parameter spaces, constraints and dependencies for arbitrary algorithms, to program on such spaces. Also includes statistical designs and random samplers. Objects are implemented as 'R6' classes.

Maintained by Martin Binder. Last updated 8 months ago.

experimental-design hyperparameters mlr3 transformations

5.3 match 29 stars 11.56 score 316 scripts 38 dependents

talgalili

dendextend:Extending 'dendrogram' Functionality in R

Offers a set of functions for extending 'dendrogram' objects in R, letting you visualize and compare trees of 'hierarchical clusterings'. You can (1) Adjust a tree's graphical parameters - the color, size, type, etc of its branches, nodes and labels. (2) Visually and statistically compare different 'dendrograms' to one another.

Maintained by Tal Galili. Last updated 2 months ago.

3.6 match 154 stars 17.02 score 6.0k scripts 164 dependents

deepayan

lattice:Trellis Graphics for R

A powerful and elegant high-level data visualization system inspired by Trellis graphics, with an emphasis on multivariate data. Lattice is sufficient for typical graphics needs, and is also flexible enough to handle most nonstandard requirements. See ?Lattice for an introduction.

Maintained by Deepayan Sarkar. Last updated 11 months ago.

3.5 match 68 stars 17.33 score 27k scripts 13k dependents

bioc

cqn:Conditional quantile normalization

A normalization tool for RNA-Seq data, implementing the conditional quantile normalization method.

Maintained by Kasper Daniel Hansen. Last updated 5 months ago.

immunooncology rnaseq preprocessing differentialexpression

8.7 match 6.93 score 238 scripts 4 dependents

sammo3182

interplot:Plot the Effects of Variables in Interaction Terms

Plots the conditional coefficients ("marginal effects") of variables included in multiplicative interaction terms.

Maintained by Yue Hu. Last updated 1 year ago.

10.7 match 15 stars 5.64 score 107 scripts

cran

gss:General Smoothing Splines

A comprehensive package for structural multivariate function estimation using smoothing splines.

Maintained by Chong Gu. Last updated 5 months ago.

fortran openblas

9.4 match 3 stars 6.40 score 137 dependents

cran

sna:Tools for Social Network Analysis

A range of tools for social network analysis, including node and graph-level indices, structural distance and covariance methods, structural equivalence detection, network regression, random graph generation, and 2D/3D network visualization.

Maintained by Carter T. Butts. Last updated 6 months ago.

8.8 match 8 stars 6.78 score 94 dependents

tnagler

VineCopula:Statistical Inference of Vine Copulas

Provides tools for the statistical analysis of regular vine copula models, see Aas et al. (2009) <doi:10.1016/j.insmatheco.2007.02.001> and Dissman et al. (2013) <doi:10.1016/j.csda.2012.08.010>. The package includes tools for parameter estimation, model selection, simulation, goodness-of-fit tests, and visualization. Tools for estimation, selection and exploratory data analysis of bivariate copula models are also provided.

Maintained by Thomas Nagler. Last updated 26 days ago.

copula estimation statistics vine

5.4 match 91 stars 10.99 score 362 scripts 23 dependents

lukeduttweiler

skipTrack:A Bayesian Hierarchical Model that Controls for Non-Adherence in Mobile Menstrual Cycle Tracking

Implements a Bayesian hierarchical model designed to identify skips in mobile menstrual cycle self-tracking on mobile apps. Future developments will allow for the inclusion of covariates affecting cycle mean and regularity, as well as extra information regarding tracking non-adherence. Main methods to be outlined in a forthcoming paper, with alternative models from Li et al. (2022) <doi:10.1093/jamia/ocab182>.

Maintained by Luke Duttweiler. Last updated 2 months ago.

12.0 match 4.95 score 4 scripts

bioc

topdownr:Investigation of Fragmentation Conditions in Top-Down Proteomics

The topdownr package allows automatic and systematic investigation of fragmentation conditions. It creates Thermo Orbitrap Fusion Lumos method files to test hundreds of fragmentation conditions. Additionally it provides functions to analyse and process the generated MS data and determine the best conditions to maximise overall fragment coverage.

Maintained by Sebastian Gibb. Last updated 5 months ago.

immunooncology infrastructure proteomics massspectrometry coverage mass-spectrometry topdown

11.7 match 1 star 5.08 score

rpact-com

rpact:Confirmatory Adaptive Clinical Trial Design and Analysis

Design and analysis of confirmatory adaptive clinical trials with continuous, binary, and survival endpoints according to the methods described in the monograph by Wassmer and Brannath (2016) <doi:10.1007/978-3-319-32562-0>. This includes classical group sequential as well as multi-stage adaptive hypothesis tests that are based on the combination testing principle.

Maintained by Friedrich Pahlke. Last updated 12 days ago.

adaptive-design analysis clinical-trials count-data group-sequential-designs power-calculation sample-size-calculation simulation validated fortran cpp

7.4 match 25 stars 7.98 score 110 scripts 1 dependent

eeethb

edgedata:Datasets that Support the EDGE Server DIY Logic

Datasets from the most recent Center for Consumer Information and Insurance Oversight (CCIIO) DIY entry in a tidy format. These support the Centers for Medicare and Medicaid Services' (CMS) risk adjustment Do-It-Yourself (DIY) process, which allows health insurance issuers to calculate member risk profiles under the Health and Human Services-Hierarchical Condition Categories (HHS-HCC) regression model. This regression model is used to calculate risk adjustment transfers. Risk adjustment is a selection mitigation program implemented under the Patient Protection and Affordable Care Act (ACA or Obamacare) in the USA. Under the ACA, health insurance issuers submit claims data to CMS in order for CMS to calculate a risk score under the HHS-HCC regression model. However, CMS does not inform issuers of their average risk score until after the data submission deadline. These data sets can be used by issuers to calculate their average risk score mid-year. More information about risk adjustment and the HHS-HCC model can be found here: <https://www.cms.gov/mmrr/Articles/A2014/MMRR2014_004_03_a03.html>.

Maintained by Ethan Brockmann. Last updated 3 years ago.

21.7 match 1 star 2.70 score 1 script

declaredesign

estimatr:Fast Estimators for Design-Based Inference

Fast procedures for a small set of commonly-used, design-appropriate estimators with robust standard errors and confidence intervals. Includes estimators for linear regression, instrumental variables regression, difference-in-means, Horvitz-Thompson estimation, and regression improving precision of experimental estimates by interacting treatment with centered pre-treatment covariates introduced by Lin (2013) <doi:10.1214/12-AOAS583>.

Maintained by Graeme Blair. Last updated 1 month ago.

cpp

5.0 match 133 stars 11.58 score 1.7k scripts 11 dependents

haozhu233

kableExtra:Construct Complex Table with 'kable' and Pipe Syntax

Build complex HTML or 'LaTeX' tables using 'kable()' from 'knitr' and the piping syntax from 'magrittr'. Function 'kable()' is a lightweight table generator coming from 'knitr'. This package simplifies the way to manipulate the HTML or 'LaTeX' code generated by 'kable()' and allows users to construct complex tables and customize styles using a readable syntax.

Maintained by Hao Zhu. Last updated 12 days ago.

html kable kableextra knitr latex rmarkdown

3.0 match 702 stars 19.35 score 55k scripts 163 dependents

iscience-kn

dropR:Dropout Analysis by Condition

Analysis and visualization of dropout between conditions in surveys and (online) experiments. Features include computation of dropout statistics, comparing dropout between conditions (e.g. Chi square), analyzing survival (e.g. Kaplan-Meier estimation), comparing conditions with the most different rates of dropout (Kolmogorov-Smirnov) and visualizing the result of each in designated plotting functions. Sources: Andrea Frick, Marie-Terese Baechtiger & Ulf-Dietrich Reips (2001) <https://www.researchgate.net/publication/223956222_Financial_incentives_personal_information_and_drop-out_in_online_studies>; Ulf-Dietrich Reips (2002) "Standards for Internet-Based Experimenting" <doi:10.1027//1618-3169.49.4.243>.

Maintained by Annika Tave Overlander. Last updated 4 months ago.

dropout experiments psychology social-science

9.6 match 6 stars 6.06 score 16 scripts

chr1swallace

coloc:Colocalisation Tests of Two Genetic Traits

Performs the colocalisation tests described in Giambartolomei et al (2013) <doi:10.1371/journal.pgen.1004383>, Wallace (2020) <doi:10.1371/journal.pgen.1008720>, Wallace (2021) <doi:10.1371/journal.pgen.1009440>.

Maintained by Chris Wallace. Last updated 4 months ago.

4.7 match 162 stars 12.23 score 916 scripts 3 dependents

smartdata-analysis-and-statistics

precmed:Precision Medicine

A doubly robust precision medicine approach to fit, cross-validate and visualize prediction models for the conditional average treatment effect (CATE). It implements the doubly robust estimation and semiparametric modeling approach to treatment-covariate interactions proposed by Yadlowsky et al. (2020) <doi:10.1080/01621459.2020.1772080>.

Maintained by Thomas Debray. Last updated 5 months ago.

precision-medicine

13.7 match 4 stars 4.20 score 4 scripts

angelospsy

multifear:Multiverse Analyses for Conditioning Data

A suite of functions for performing analyses, based on a multiverse approach, for conditioning data. Specifically, given the appropriate data, the functions are able to perform t-tests, analyses of variance, and mixed models for the provided data and return summary statistics and plots. The functions can also return p-values, confidence intervals, and Bayes factors for all those tests. The methods are described in Lonsdorf, Gerlicher, Klingelhofer-Jens, & Krypotos (2022) <doi:10.1016/j.brat.2022.104072>.

Maintained by Angelos-Miltiadis Krypotos. Last updated 1 year ago.

conditioning multiverse

13.6 match 3 stars 4.18 score 7 scripts

andreasstatsr

CP:Conditional Power Calculations

Functions for calculating the conditional power for different models in survival time analysis within randomized clinical trials with two different treatments to be compared and survival as an endpoint.

Maintained by Andreas Kuehnapfel. Last updated 5 years ago.

17.1 match 3.32 score 21 scripts

ddebeer

permimp:Conditional Permutation Importance

An add-on to the 'party' package, with a faster implementation of the partial-conditional permutation importance for random forests. The standard permutation importance is implemented exactly the same as in the 'party' package. The conditional permutation importance can be computed faster, with an option to be backward compatible to the 'party' implementation. The package is compatible with random forests fit using the 'party' and the 'randomForest' package. The methods are described in Strobl et al. (2007) <doi:10.1186/1471-2105-8-25> and Debeer and Strobl (2020) <doi:10.1186/s12859-020-03622-2>.

Maintained by Dries Debeer. Last updated 2 years ago.

9.7 match 4 stars 5.85 score 39 scripts 1 dependent

friendly

vcdExtra:'vcd' Extensions and Additions

Provides additional data sets, methods and documentation to complement the 'vcd' package for Visualizing Categorical Data and the 'gnm' package for Generalized Nonlinear Models. In particular, 'vcdExtra' extends mosaic, assoc and sieve plots from 'vcd' to handle 'glm()' and 'gnm()' models and adds a 3D version in 'mosaic3d'. Additionally, methods are provided for comparing and visualizing lists of 'glm' and 'loglm' objects. This package is now a support package for the book, "Discrete Data Analysis with R" by Michael Friendly and David Meyer.

Maintained by Michael Friendly. Last updated 5 months ago.

categorical-data-visualization generalized-linear-models mosaic-plots

5.5 match 24 stars 10.34 score 472 scripts 3 dependents

ghtaranto

scapesClassification:User-Defined Classification of Raster Surfaces

Series of algorithms to translate users' mental models of seascapes, landscapes and, more generally, of geographic features into computer representations (classifications). Spaces and geographic objects are classified with user-defined rules taking into account spatial data as well as spatial relationships among different classes and objects.

Maintained by Gerald H. Taranto. Last updated 3 years ago.

classification-algorithm object-detection raster spatial

13.3 match 1 stars 4.22 score 33 scripts

bioc

SpeCond:Condition specific detection from expression data

This package performs a gene expression data analysis to detect condition-specific genes. Such genes are significantly up- or down-regulated in a small number of conditions. It does so by fitting a mixture of normal distributions to the expression values. Conditions can be environmental conditions, different tissues, organs or any other sources that you wish to compare in terms of gene expression.

Maintained by Florence Cavalli. Last updated 5 months ago.

microarray differentialexpression multiplecomparison clustering reportwriting

14.4 match 3.89 score 13 scripts

christinaheinze

CondIndTests:Nonlinear Conditional Independence Tests

Code for a variety of nonlinear conditional independence tests: Kernel conditional independence test (Zhang et al., UAI 2011, <arXiv:1202.3775>), Residual Prediction test (based on Shah and Buehlmann, <arXiv:1511.03334>), Invariant environment prediction, Invariant target prediction, Invariant residual distribution test, Invariant conditional quantile prediction (all from Heinze-Deml et al., <arXiv:1706.08576>).

Maintained by Christina Heinze-Deml. Last updated 5 years ago.

11.2 match 17 stars 4.91 score 32 scripts 1 dependents

pachadotdev

cpp11armadillo:An 'Armadillo' Interface

Provides function declarations and inline function definitions that facilitate communication between R and the 'Armadillo' 'C++' library for linear algebra and scientific computing. This implementation is detailed in Vargas Sepulveda and Schneider Malamud (2024) <doi:10.48550/arXiv.2408.11074>.

Maintained by Mauricio Vargas Sepulveda. Last updated 27 days ago.

armadillo cpp cpp11 hacktoberfest linear-algebra

6.0 match 9 stars 9.14 score 1 scripts 16 dependents

amayer2010

EffectLiteR:Average and Conditional Effects

Use structural equation modeling to estimate average and conditional effects of a treatment variable on an outcome variable, taking into account multiple continuous and categorical covariates.

Maintained by Axel Mayer. Last updated 7 months ago.

11.5 match 8 stars 4.81 score 18 scripts 1 dependents

tguillerme

treats:Trees and Traits Simulations

A modular package for simulating phylogenetic trees and species traits jointly. Trees can be simulated using modular birth-death parameters (e.g. changing starting parameters or algorithm rules). Traits can be simulated in any way designed by the user. The growth of the tree and the traits can influence each other through modifier objects that provide rules for how they affect each other. Finally, events can be created to modify both the tree and the traits under specific conditions (Guillerme, 2024 <DOI:10.1111/2041-210X.14306>).

Maintained by Thomas Guillerme. Last updated 16 hours ago.

11.8 match 3 stars 4.66 score 19 scripts

bioc

BiocFHIR:Illustration of FHIR ingestion and transformation using R

FHIR R4 bundles in JSON format are derived from https://synthea.mitre.org/downloads. Transformation inspired by a kaggle notebook published by Dr Alexander Scarlat, https://www.kaggle.com/code/drscarlat/fhir-starter-parse-healthcare-bundles-into-tables. This is a very limited illustration of some basic parsing and reorganization processes. Additional tooling will be required to move beyond the Synthea data illustrations.

Maintained by Vincent Carey. Last updated 5 months ago.

infrastructure dataimport datarepresentation fhir

9.4 match 4 stars 5.78 score 15 scripts

bioc

DEP:Differential Enrichment analysis of Proteomics data

This package provides an integrated analysis workflow for robust and reproducible analysis of mass spectrometry proteomics data for differential protein expression or differential enrichment. It requires tabular input (e.g. txt files) as generated by quantitative analysis software for raw mass spectrometry data, such as MaxQuant or IsobarQuant. Functions are provided for data preparation, filtering, variance normalization and imputation of missing values, as well as statistical testing of differentially enriched / expressed proteins. It also includes tools to check intermediate steps in the workflow, such as normalization and missing value imputation. Finally, visualization tools are provided to explore the results, including heatmap, volcano plot and barplot representations. For scientists with limited experience in R, the package also contains wrapper functions that entail the complete analysis workflow and generate a report. Even easier to use are the interactive Shiny apps that are provided by the package.

Maintained by Arne Smits. Last updated 5 months ago.

immunooncology proteomics massspectrometry differentialexpression datarepresentation

7.7 match 7.10 score 628 scripts

treynkens

ReIns:Functions from "Reinsurance: Actuarial and Statistical Aspects"

Functions from the book "Reinsurance: Actuarial and Statistical Aspects" (2017) by Hansjoerg Albrecher, Jan Beirlant and Jef Teugels <https://www.wiley.com/en-us/Reinsurance%3A+Actuarial+and+Statistical+Aspects-p-9780470772683>.

Maintained by Tom Reynkens. Last updated 4 months ago.

extremes reinsurance risk-analysis cpp

8.6 match 22 stars 6.31 score 186 scripts

bioc

rifiComparative:'rifiComparative' compares the output of rifi from two different conditions.

'rifiComparative' is a continuation of the 'rifi' package. It compares the rifi output from two conditions using half-life and mRNA at time 0 segments. As input for the segmentation, the difference between the half-lives of both conditions and the log2FC of the mRNA at time 0 are used. The package provides segmentation, statistics, a summary table, fragment visualization and some additional useful plots for further analysis.

Maintained by Loubna Youssar. Last updated 5 months ago.

rnaseq differentialexpression generegulation transcriptomics microarray software

13.4 match 4.00 score

npm27

lrd:A Package for Processing Lexical Response Data

Lexical response data is a package that can be used for processing cued-recall, free-recall, and sentence responses from memory experiments.

Maintained by Nicholas Maxwell. Last updated 3 years ago.

10.1 match 3 stars 5.30 score 33 scripts

anestistouloumis

SimCorMultRes:Simulates Correlated Multinomial Responses

Simulates correlated multinomial responses conditional on a marginal model specification.

Maintained by Anestis Touloumis. Last updated 1 year ago.

binary longitudinal-studies multinomial simulation

8.8 match 7 stars 6.04 score 26 scripts 2 dependents

mmollina

mappoly:Genetic Linkage Maps in Autopolyploids

Construction of genetic maps in autopolyploid full-sib populations. Uses pairwise recombination fraction estimation as the first source of information to sequentially position allelic variants in specific homologous chromosomes. For situations where pairwise analysis has limited power, the algorithm relies on the multilocus likelihood obtained through a hidden Markov model (HMM). For more detail, please see Mollinari and Garcia (2019) <doi:10.1534/g3.119.400378> and Mollinari et al. (2020) <doi:10.1534/g3.119.400620>.

Maintained by Marcelo Mollinari. Last updated 12 days ago.

polyploid polyploid-genetic-mapping polyploidy cpp

7.0 match 27 stars 7.56 score 111 scripts 1 dependents

sfcheung

stdmod:Standardized Moderation Effect and Its Confidence Interval

Functions for computing a standardized moderation effect in moderated regression and forming its confidence interval by nonparametric bootstrapping as proposed in Cheung, Cheung, Lau, Hui, and Vong (2022) <doi:10.1037/hea0001188>. Also includes simple-to-use functions for computing conditional effects (unstandardized or standardized) and plotting moderation effects.

Maintained by Shu Fai Cheung. Last updated 6 months ago.

bootstrapping confidence-interval effect-sizes moderation regression standardization standardized-moderation

9.4 match 1 stars 5.62 score 46 scripts

r-lum

Luminescence:Comprehensive Luminescence Dating Data Analysis

A collection of various R functions for the purpose of Luminescence dating data analysis. This includes, amongst others, data import, export, application of age models, curve deconvolution, sequence analysis and plotting of equivalent dose distributions.

Maintained by Sebastian Kreutzer. Last updated 18 hours ago.

bayesian-statistics data-science geochronology luminescence luminescence-dating open-science osl plotting radiofluorescence tl xsyg cpp

4.9 match 15 stars 10.76 score 178 scripts 8 dependents

ropensci

karel:Learning programming with Karel the robot

This is the R implementation of Karel the robot, a programming language created by Dr. R. E. Pattis at Stanford University in 1981. Karel is a useful tool to teach introductory concepts about general programming, such as algorithmic decomposition, conditional statements, loops, etc., in an interactive and fun way, by writing programs to make Karel the robot achieve certain tasks in the world she lives in. Originally based on Pascal, Karel was implemented in many languages through these decades, including 'Java', 'C++', 'Ruby' and 'Python'. This is the first package implementing Karel in R.

Maintained by Marcos Prunello. Last updated 8 months ago.

learning programming r-language

7.6 match 10 stars 6.87 score 31 scripts

spsanderson

healthyR:Hospital Data Analysis Workflow Tools

Hospital data analysis workflow tools, modeling, and automations. This library provides many useful tools to review common administrative hospital data. Some of these include average length of stay, readmission rates, average net pay amounts by service lines just to name a few. The aim is to provide a simple and consistent verb framework that takes the guesswork out of everything.

Maintained by Steven Sanderson. Last updated 9 months ago.

analysis analytics healthcare healthyr

7.2 match 30 stars 7.27 score 103 scripts 1 dependents

tnagler

vinereg:D-Vine Quantile Regression

Implements D-vine quantile regression models with parametric or nonparametric pair-copulas. See Kraus and Czado (2017) <doi:10.1016/j.csda.2016.12.009> and Schallhorn et al. (2017) <doi:10.48550/arXiv.1705.08310>.

Maintained by Thomas Nagler. Last updated 2 months ago.

copula estimation statistics vine cpp

9.1 match 11 stars 5.76 score 26 scripts

rich-iannone

DiagrammeR:Graph/Network Visualization

Build graph/network structures using functions for stepwise addition and deletion of nodes and edges. Work with data available in tables for bulk addition of nodes, edges, and associated metadata. Use graph selections and traversals to apply changes to specific nodes or edges. A wide selection of graph algorithms allows for the analysis of graphs. Visualize the graphs and take advantage of any aesthetic properties assigned to nodes and edges.

Maintained by Richard Iannone. Last updated 2 months ago.

graph graph-functions network-graph property-graph visualization

3.4 match 1.7k stars 15.18 score 3.8k scripts 87 dependents

cran

nlme:Linear and Nonlinear Mixed Effects Models

Fit and compare Gaussian linear and nonlinear mixed-effects models.

Maintained by R Core Team. Last updated 2 months ago.

fortran

4.0 match 6 stars 13.00 score 13k scripts 8.7k dependents

gjmvanboxtel

gsignal:Signal Processing

R implementation of the 'Octave' package 'signal', containing a variety of signal processing tools, such as signal generation and measurement, correlation and convolution, filtering, filter design, filter analysis and conversion, power spectrum analysis, system identification, decimation and sample rate change, and windowing.

Maintained by Geert van Boxtel. Last updated 2 months ago.

signal-processing signals cpp

5.1 match 24 stars 10.03 score 133 scripts 34 dependents

cud2v

pccc:Pediatric Complex Chronic Conditions

An implementation of the pediatric complex chronic conditions (CCC) classification system using R and C++.

Maintained by Seth Russell. Last updated 5 months ago.

cpp

8.7 match 5 stars 5.93 score 38 scripts

mplatzer

BTYDplus:Probabilistic Models for Assessing and Predicting your Customer Base

Provides advanced statistical methods to describe and predict customers' purchase behavior in a non-contractual setting. It uses historic transaction records to fit a probabilistic model, which then allows computing quantities of managerial interest at the cohort level as well as the customer level (Customer Lifetime Value, Customer Equity, P(alive), etc.). This package complements the BTYD package by providing several additional buy-till-you-die models that have been published in the marketing literature but whose implementations are complex and non-trivial. These models are: NBD [Ehrenberg (1959) <doi:10.2307/2985810>], MBG/NBD [Batislam et al (2007) <doi:10.1016/j.ijresmar.2006.12.005>], (M)BG/CNBD-k [Reutterer et al (2020) <doi:10.1016/j.ijresmar.2020.09.002>], Pareto/NBD (HB) [Abe (2009) <doi:10.1287/mksc.1090.0502>] and Pareto/GGG [Platzer and Reutterer (2016) <doi:10.1287/mksc.2015.0963>].

Maintained by Michael Platzer. Last updated 11 months ago.

crm customer-behavior marketing-science predictive-analytics cpp

6.9 match 188 stars 7.48 score 64 scripts

cran

bnlearn:Bayesian Network Structure Learning, Parameter Learning and Inference

Bayesian network structure learning, parameter learning and inference. This package implements constraint-based (PC, GS, IAMB, Inter-IAMB, Fast-IAMB, MMPC, Hiton-PC, HPC), pairwise (ARACNE and Chow-Liu), score-based (Hill-Climbing and Tabu Search) and hybrid (MMHC, RSMAX2, H2PC) structure learning algorithms for discrete, Gaussian and conditional Gaussian networks, along with many score functions and conditional independence tests. The Naive Bayes and the Tree-Augmented Naive Bayes (TAN) classifiers are also implemented. Some utility functions (model comparison and manipulation, random data generation, arc orientation testing, simple and advanced plots) are included, as well as support for parameter estimation (maximum likelihood and Bayesian) and inference, conditional probability queries, cross-validation, bootstrap and model averaging. Development snapshots with the latest bugfixes are available from <https://www.bnlearn.com/>.

Maintained by Marco Scutari. Last updated 2 months ago.

openblas

6.7 match 57 stars 7.72 score 32 dependents

bioc

FRASER:Find RAre Splicing Events in RNA-Seq Data

Detection of rare aberrant splicing events in transcriptome profiles. Read count ratio expectations are modeled by an autoencoder to control for confounding factors in the data. Given these expectations, the ratios are assumed to follow a beta-binomial distribution with a junction specific dispersion. Outlier events are then identified as read-count ratios that deviate significantly from this distribution. FRASER is able to detect alternative splicing, but also intron retention. The package aims to support diagnostics in the field of rare diseases where RNA-seq is performed to identify aberrant splicing defects.

Maintained by Christian Mertes. Last updated 5 months ago.

rnaseq alternativesplicing sequencing software genetics coverage aberrant-splicing diagnostics outlier-detection rare-disease rna-seq splicing openblas cpp

6.0 match 41 stars 8.50 score 155 scripts

bioc

tradeSeq:trajectory-based differential expression analysis for sequencing data

tradeSeq provides a flexible method for fitting regression models that can be used to find genes that are differentially expressed along one or multiple lineages in a trajectory. Based on the fitted models, it uses a variety of tests suited to answer different questions of interest, e.g. the discovery of genes for which expression is associated with pseudotime, or which are differentially expressed (in a specific region) along the trajectory. It fits a negative binomial generalized additive model (GAM) for each gene, and performs inference on the parameters of the GAM.

Maintained by Hector Roux de Bezieux. Last updated 5 months ago.

clustering regression timecourse differentialexpression geneexpression rnaseq sequencing software singlecell transcriptomics multiplecomparison visualization

5.0 match 247 stars 10.06 score 440 scripts

r-forge

pcalg:Methods for Graphical Models and Causal Inference

Functions for causal structure learning and causal inference using graphical models. The main algorithms for causal structure learning are PC (for observational data without hidden variables), FCI and RFCI (for observational data with hidden variables), and GIES (for a mix of data from observational studies (i.e. observational data) and data from experiments involving interventions (i.e. interventional data) without hidden variables). For causal inference the IDA algorithm, the Generalized Backdoor Criterion (GBC), the Generalized Adjustment Criterion (GAC) and some related functions are implemented. Functions for incorporating background knowledge are provided.

Maintained by Markus Kalisch. Last updated 6 months ago.

openblas cpp

6.9 match 7.32 score 700 scripts 19 dependents

tidyverse

ggplot2:Create Elegant Data Visualisations Using the Grammar of Graphics

A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.

Maintained by Thomas Lin Pedersen. Last updated 11 days ago.

data-visualisation visualisation

2.0 match 6.6k stars 25.10 score 645k scripts 7.5k dependents

bips-hb

micd:Multiple Imputation in Causal Graph Discovery

Modified functions of the package 'pcalg' and some additional functions to run the PC and the FCI (Fast Causal Inference) algorithm for constraint-based causal discovery in incomplete and multiply imputed datasets. Foraita R, Friemel J, Günther K, Behrens T, Bullerdiek J, Nimzyk R, Ahrens W, Didelez V (2020) <doi:10.1111/rssa.12565>; Andrews RM, Foraita R, Didelez V, Witte J (2021) <arXiv:2108.13395>; Witte J, Foraita R, Didelez V (2022) <doi:10.1002/sim.9535>.

Maintained by Ronja Foraita. Last updated 2 years ago.

causal-discovery graphical-models multiple-imputation

13.6 match 5 stars 3.70 score 20 scripts

jarathomas

InterVA5:Replicate and Analyse 'InterVA5'

Provides an R version of the 'InterVA5' software (<http://www.byass.uk/interva/>) for coding cause of death from verbal autopsies. It also provides simple graphical representation of individual and population level statistics.

Maintained by Jason Thomas. Last updated 4 years ago.

21.1 match 2.38 score 20 scripts 2 dependents

bioc

slingshot:Tools for ordering single-cell sequencing

Provides functions for inferring continuous, branching lineage structures in low-dimensional data. Slingshot was designed to model developmental trajectories in single-cell RNA sequencing data and serve as a component in an analysis pipeline after dimensionality reduction and clustering. It is flexible enough to handle arbitrarily many branching events and allows for the incorporation of prior knowledge through supervised graph construction.

Maintained by Kelly Street. Last updated 5 months ago.

clustering differentialexpression geneexpression rnaseq sequencing software singlecell transcriptomics visualization

4.2 match 283 stars 12.01 score 1.0k scripts 4 dependents

hheiling

glmmPen:High Dimensional Penalized Generalized Linear Mixed Models (pGLMM)

Fits high dimensional penalized generalized linear mixed models using the Monte Carlo Expectation Conditional Minimization (MCECM) algorithm. The purpose of the package is to perform variable selection on both the fixed and random effects simultaneously for generalized linear mixed models. The package supports fitting of Binomial, Gaussian, and Poisson data with canonical links, and supports penalization using the MCP, SCAD, or LASSO penalties. The MCECM algorithm is described in Rashid et al. (2020) <doi:10.1080/01621459.2019.1671197>. The techniques used in the minimization portion of the procedure (the M-step) are derived from the procedures of the 'ncvreg' package (Breheny and Huang (2011) <doi:10.1214/10-AOAS388>) and 'grpreg' package (Breheny and Huang (2015) <doi:10.1007/s11222-013-9424-2>), with appropriate modifications to account for the estimation and penalization of the random effects. The 'ncvreg' and 'grpreg' packages also describe the MCP, SCAD, and LASSO penalties.

Maintained by Hillary Heiling. Last updated 7 months ago.

cpp

13.3 match 6 stars 3.73 score 18 scripts

lrberge

stringmagic:Character String Operations and Interpolation, Magic Edition

Performs complex string operations compactly and efficiently. Supports string interpolation jointly with over 50 string operations. Also enhances regular string functions (like grep() and co). See an introduction at <https://lrberge.github.io/stringmagic/>.

Maintained by Laurent R Berge. Last updated 7 months ago.

interpolation string cpp

4.7 match 15 stars 10.56 score 37 scripts 33 dependents

nelson-gon

mde:Missing Data Explorer

Correct identification and handling of missing data is one of the most important steps in any analysis. To aid this process, 'mde' provides a very easy to use yet robust framework to quickly get an idea of where the missing data lies and therefore find the most appropriate action to take. Graham WJ (2009) <doi:10.1146/annurev.psych.58.110405.085530>.

Maintained by Nelson Gonzabato. Last updated 3 years ago.

data-analysis data-cleaning data-exploration data-science datacleaner datacleaning exploratory-data-analysis missing missing-data missing-value-treatment missing-values missingness omit recode replace statistics

8.8 match 4 stars 5.61 score 34 scripts

bioc

DCATS:Differential Composition Analysis Transformed by a Similarity matrix

Methods to detect differential composition abundances between conditions in single-cell RNA-seq experiments, with or without replicates. It aims to correct bias introduced by misclassification and to enable controlling for confounding covariates. To avoid the influence of proportion changes from big cell types, DCATS can use either the total cell number or a specific reference group as the normalization term.

Maintained by Xinyi Lin. Last updated 5 months ago.

singlecell normalization

10.9 match 4.53 score 34 scripts

cran

vcd:Visualizing Categorical Data

Visualization techniques, data sets, summary and inference procedures aimed particularly at categorical data. Special emphasis is given to highly extensible grid graphics. The package was originally inspired by the book "Visualizing Categorical Data" by Michael Friendly and is now the main support package for a new book, "Discrete Data Analysis with R" by Michael Friendly and David Meyer (2015).

Maintained by David Meyer. Last updated 6 months ago.

6.0 match 5 stars 8.19 score 87 dependents

openpharma

mmrm:Mixed Models for Repeated Measures

Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E> for a tutorial and Mallinckrodt, Lane, Schnell, Peng and Mancuso (2008) <doi:10.1177/009286150804200402> for a review. This package implements MMRM based on the marginal linear model without random effects using Template Model Builder ('TMB') which enables fast and robust model fitting. Users can specify a variety of covariance matrices, weight observations, fit models with restricted or standard maximum likelihood inference, perform hypothesis testing with Satterthwaite or Kenward-Roger adjustment, and extract least square means estimates by using 'emmeans'.

Maintained by Daniel Sabanes Bove. Last updated 11 days ago.

cpp

4.0 match 138 stars 12.15 score 113 scripts 4 dependents

r-forge

copula:Multivariate Dependence with Copulas

Classes (S4) of commonly used elliptical, Archimedean, extreme-value and other copula families, as well as their rotations, mixtures and asymmetrizations. Nested Archimedean copulas, related tools and special functions. Methods for density, distribution, random number generation, bivariate dependence measures, Rosenblatt transform, Kendall distribution function, perspective and contour plots. Fitting of copula models with potentially partly fixed parameters, including standard errors. Serial independence tests, copula specification tests (independence, exchangeability, radial symmetry, extreme-value dependence, goodness-of-fit) and model selection based on cross-validation. Empirical copula, smoothed versions, and non-parametric estimators of the Pickands dependence function.

Maintained by Martin Maechler. Last updated 13 days ago.

4.1 match 11.83 score 1.2k scripts 86 dependents

hotneim

lg:Locally Gaussian Distributions: Estimation and Methods

An implementation of locally Gaussian distributions. It provides methods for implementing locally Gaussian multivariate density estimation, conditional density estimation, various independence tests for iid and time series data, a test for conditional independence and a test for financial contagion.

Maintained by Håkon Otneim. Last updated 5 years ago.

11.7 match 4 stars 4.18 score 25 scripts

michaelhallquist

MplusAutomation:An R Package for Facilitating Large-Scale Latent Variable Analyses in Mplus

Leverages the R language to automate latent variable model estimation and interpretation using 'Mplus', a powerful latent variable modeling program developed by Muthen and Muthen (<https://www.statmodel.com>). Specifically, this package provides routines for creating related groups of models, running batches of models, and extracting and tabulating model parameters and fit statistics.

Maintained by Michael Hallquist. Last updated 2 months ago.

3.8 match 86 stars 12.96 score 664 scripts 13 dependents

rethomics

damr:Interface to Drosophila Activity Monitor System Result Files

Loads behavioural data from the widely used Drosophila Activity Monitor System (DAMS, TriKinetics <https://trikinetics.com/>) into the rethomics framework.

Maintained by Quentin Geissmann. Last updated 11 months ago.

biological-data-analysis parsing

9.8 match 6 stars 4.92 score 31 scripts 1 dependents

saviviro

uGMAR:Estimate Univariate Gaussian and Student's t Mixture Autoregressive Models

Maximum likelihood estimation of univariate Gaussian Mixture Autoregressive (GMAR), Student's t Mixture Autoregressive (StMAR), and Gaussian and Student's t Mixture Autoregressive (G-StMAR) models, quantile residual tests, graphical diagnostics, forecast and simulate from GMAR, StMAR and G-StMAR processes. Leena Kalliovirta, Mika Meitz, Pentti Saikkonen (2015) <doi:10.1111/jtsa.12108>, Mika Meitz, Daniel Preve, Pentti Saikkonen (2023) <doi:10.1080/03610926.2021.1916531>, Savi Virolainen (2022) <doi:10.1515/snde-2020-0060>.

Maintained by Savi Virolainen. Last updated 2 months ago.

9.9 match 1 stars 4.88 score 51 scripts

larmarange

labelled:Manipulating Labelled Data

Work with labelled data imported from 'SPSS' or 'Stata' with 'haven' or 'foreign'. This package provides useful functions to deal with the "haven_labelled" and "haven_labelled_spss" classes introduced by the 'haven' package.

Maintained by Joseph Larmarange. Last updated 28 days ago.

haven labels metadata sas spss stata

3.2 match 76 stars 15.02 score 2.4k scripts 96 dependents

dnychka

fields:Tools for Spatial Data

For curve, surface and function fitting with an emphasis on splines, spatial data, geostatistics, and spatial statistics. The major methods include cubic, and thin plate splines, Kriging, and compactly supported covariance functions for large data sets. The splines and Kriging methods are supported by functions that can determine the smoothing parameter (nugget and sill variance) and other covariance function parameters by cross validation and also by restricted maximum likelihood. For Kriging there is an easy to use function that also estimates the correlation scale (range parameter). A major feature is that any covariance function implemented in R and following a simple format can be used for spatial prediction. There are also many useful functions for plotting and working with spatial data as images. This package also contains an implementation of sparse matrix methods for large spatial data sets and currently requires the sparse matrix (spam) package. Use help(fields) to get started and for an overview. The fields source code is deliberately commented and provides useful explanations of numerical details as a companion to the manual pages. The commented source code can be viewed by expanding the source code version and looking in the R subdirectory. The reference for fields can be generated by the citation function in R and has DOI <doi:10.5065/D6W957CT>. Development of this package was supported in part by the National Science Foundation Grant 1417857, the National Center for Atmospheric Research, and Colorado School of Mines. See the Fields URL for a vignette on using this package and some background on spatial statistics.

Maintained by Douglas Nychka. Last updated 9 months ago.

fortran

3.8 match 15 stars 12.60 score 7.7k scripts 295 dependents

arpapiemonte

OpeNoise:Environmental Noise Pollution Data Analysis

Provides tools to analyse, interpret and understand noise pollution data. Data are typically regular time series measured with a sound meter. The package is partially described in Fogola, Grasso, Masera and Scordino (2023, <DOI:10.61782/fa.2023.0063>).

Maintained by Pasquale Scordino. Last updated 4 months ago.

13.7 match 2 stars 3.48 score 5 scripts

gkremling

gofreg:Bootstrap-Based Goodness-of-Fit Tests for Parametric Regression

Provides statistical methods to check if a parametric family of conditional density functions fits a given dataset of covariates and response variables. Different test statistics can be used to determine the goodness-of-fit of the assumed model, see Andrews (1997) <doi:10.2307/2171880>, Bierens & Wang (2012) <doi:10.1017/S0266466611000168>, Dikta & Scheer (2021) <doi:10.1007/978-3-030-73480-0> and Kremling & Dikta (2024) <doi:10.48550/arXiv.2409.20262>. As proposed in these papers, the corresponding p-values are approximated using a parametric bootstrap method.

Maintained by Gitte Kremling. Last updated 6 months ago.

9.0 match 5.30 score 9 scripts

bioc

Category:Category Analysis

A collection of tools for performing category (gene set enrichment) analysis.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

annotation go pathways genesetenrichment

6.0 match 7.93 score 183 scripts 16 dependents

cmollica

PLMIX:Bayesian Analysis of Finite Mixture of Plackett-Luce Models

Fit finite mixtures of Plackett-Luce models for partial top rankings/orderings within the Bayesian framework. It provides MAP point estimates via EM algorithm and posterior MCMC simulations via Gibbs Sampling. It also fits MLE as a special case of the noninformative Bayesian analysis with vague priors. In addition to inferential techniques, the package assists other fundamental phases of a model-based analysis for partial rankings/orderings, by including functions for data manipulation, simulation, descriptive summary, model selection and goodness-of-fit evaluation. Main references on the methods are Mollica and Tardella (2017) <doi:10.1007/s11336-016-9530-0> and Mollica and Tardella (2014) <doi:10.1002/sim.6224>.

Maintained by Cristina Mollica. Last updated 4 years ago.

cpp

15.0 match 3.15 score 28 scripts

neural-structured-additive-learning

deeptrafo:Fitting Deep Conditional Transformation Models

Allows for the specification of deep conditional transformation models (DCTMs) and ordinal neural network transformation models, as described in Baumann et al (2021) <doi:10.1007/978-3-030-86523-8_1> and Kook et al (2022) <doi:10.1016/j.patcog.2021.108263>. Extensions such as autoregressive DCTMs (Ruegamer et al, 2023, <doi:10.1007/s11222-023-10212-8>) and transformation ensembles (Kook et al, 2022, <doi:10.48550/arXiv.2205.12729>) are implemented. The software package is described in Kook et al (2024, <doi:10.18637/jss.v111.i10>).

Maintained by Lucas Kook. Last updated 2 months ago.

10.6 match 5 stars 4.44 score 11 scripts

indrajeetpatil

ggstatsplot:'ggplot2' Based Plots with Statistical Details

Extension of 'ggplot2', 'ggstatsplot' creates graphics with details from statistical tests included in the plots themselves. It provides an easier syntax to generate information-rich plots for statistical analysis of continuous (violin plots, scatterplots, histograms, dot plots, dot-and-whisker plots) or categorical (pie and bar charts) data. Currently, it supports the most common types of statistical approaches and tests: parametric, nonparametric, robust, and Bayesian versions of t-test/ANOVA, correlation analyses, contingency table analysis, meta-analysis, and regression analyses. References: Patil (2021) <doi:10.21105/joss.03236>.

Maintained by Indrajeet Patil. Last updated 21 days ago.

bayes-factors datascience dataviz effect-size ggplot-extension hypothesis-testing non-parametric-statistics regression-models statistical-analysis

3.2 match 2.1k stars 14.49 score 3.0k scripts 1 dependents

pdhoff

amen:Additive and Multiplicative Effects Models for Networks and Relational Data

Analysis of dyadic network and relational data using additive and multiplicative effects (AME) models. The basic model includes regression terms, the covariance structure of the social relations model (Warner, Kenny and Stoto (1979) <DOI:10.1037/0022-3514.37.10.1742>, Wong (1982) <DOI:10.2307/2287296>), and multiplicative factor models (Hoff (2009) <DOI:10.1007/s10588-008-9040-4>). Several different link functions accommodate different relational data structures, including binary/network data, normal relational data, zero-inflated positive outcomes using a tobit model, ordinal relational data and data from fixed-rank nomination schemes. Several of these link functions are discussed in Hoff, Fosdick, Volfovsky and Stovel (2013) <DOI:10.1017/nws.2013.17>. Development of this software was supported in part by NIH grant R01HD067509.

Maintained by Peter Hoff. Last updated 4 years ago.

6.9 match 28 stars 6.81 score 153 scripts

modeloriented

randomForestExplainer:Explaining and Visualizing Random Forests in Terms of Variable Importance

A set of tools to help explain which variables are most important in a random forest. Various variable importance measures are calculated and visualized in different settings in order to get an idea of how their importance changes depending on our criteria (Hemant Ishwaran and Udaya B. Kogalur and Eiran Z. Gorodeski and Andy J. Minn and Michael S. Lauer (2010) <doi:10.1198/jasa.2009.tm08622>, Leo Breiman (2001) <doi:10.1023/A:1010933404324>).

Maintained by Yue Jiang. Last updated 12 months ago.

random-forest

4.9 match 233 stars 9.59 score 236 scripts

jarrodhadfield

MCMCglmm:MCMC Generalised Linear Mixed Models

Fits Multivariate Generalised Linear Mixed Models (and related models) using Markov chain Monte Carlo techniques (Hadfield 2010 J. Stat. Soft.).

Maintained by Jarrod Hadfield. Last updated 3 months ago.

cpp

5.3 match 2 stars 8.83 score 1.2k scripts 13 dependents

numbersman77

bpp:Computations Around Bayesian Predictive Power

Implements functions to update Bayesian Predictive Power Computations after not stopping a clinical trial at an interim analysis. Such an interim analysis can either be blinded or unblinded. Code is provided for Normally distributed endpoints with known variance, with a prominent example being the hazard ratio.

Maintained by Kaspar Rufibach. Last updated 24 days ago.

14.9 match 3.12 score 19 scripts

hojsgaard

gRbase:A Package for Graphical Modelling in R

The 'gRbase' package provides graphical modelling features used by e.g. the packages 'gRain', 'gRim' and 'gRc'. 'gRbase' implements graph algorithms including (i) maximum cardinality search (for marked and unmarked graphs), (ii) moralization, (iii) triangulation, (iv) creation of junction tree. 'gRbase' facilitates array operations and implements functions for testing for conditional independence. 'gRbase' illustrates how hierarchical log-linear models may be implemented and describes the concept of graphical meta data. The facilities of the package are documented in the book by Højsgaard, Edwards and Lauritzen (2012, <doi:10.1007/978-1-4614-2299-0>) and in the paper by Dethlefsen and Højsgaard (2005, <doi:10.18637/jss.v014.i17>). Please see 'citation("gRbase")' for citation details.

Maintained by Søren Højsgaard. Last updated 4 months ago.

openblas cpp

5.0 match 3 stars 9.24 score 241 scripts 20 dependents

insightsengineering

rbmi:Reference Based Multiple Imputation

Implements standard and reference based multiple imputation methods for continuous longitudinal endpoints (Gower-Page et al. (2022) <doi:10.21105/joss.04251>). In particular, this package supports deterministic conditional mean imputation and jackknifing as described in Wolbers et al. (2022) <doi:10.1002/pst.2234>, Bayesian multiple imputation as described in Carpenter et al. (2013) <doi:10.1080/10543406.2013.834911>, and bootstrapped maximum likelihood imputation as described in von Hippel and Bartlett (2021) <doi:10.1214/20-STS793>.

Maintained by Isaac Gravestock. Last updated 25 days ago.

5.3 match 18 stars 8.78 score 33 scripts 1 dependents

marlonecobos

nichevol:Tools for Ecological Niche Evolution Assessment Considering Uncertainty

A collection of tools that allow users to perform critical steps in the process of assessing ecological niche evolution over phylogenies, with uncertainty incorporated explicitly in reconstructions. The method proposed here for ancestral reconstruction of ecological niches characterizes species' niches using a bin-based approach that incorporates uncertainty in estimations. Compared to other existing methods, the approaches presented here reduce risk of overestimation of amounts and rates of ecological niche evolution. The main analyses include: initial exploration of environmental data in occurrence records and accessible areas, preparation of data for phylogenetic analyses, executing comparative phylogenetic analyses of ecological niches, and plotting for interpretations. Details on the theoretical background and methods used can be found in: Owens et al. (2020) <doi:10.1002/ece3.6359>, Peterson et al. (1999) <doi:10.1126/science.285.5431.1265>, Soberón and Peterson (2005) <doi:10.17161/bi.v2i0.4>, Peterson (2011) <doi:10.1111/j.1365-2699.2010.02456.x>, Barve et al. (2011) <doi:10.1111/ecog.02671>, Machado-Stredel et al. (2021) <doi:10.21425/F5FBG48814>, Owens et al. (2013) <doi:10.1016/j.ecolmodel.2013.04.011>, Saupe et al. (2018) <doi:10.1093/sysbio/syx084>, and Cobos et al. (2021) <doi:10.1111/jav.02868>.

Maintained by Marlon E. Cobos. Last updated 2 years ago.

12.1 match 14 stars 3.85 score 2 scripts

sooahnshin

aihuman:Experimental Evaluation of Algorithm-Assisted Human Decision-Making

Provides statistical methods for analyzing experimental evaluation of the causal impacts of algorithmic recommendations on human decisions developed by Imai, Jiang, Greiner, Halen, and Shin (2023) <doi:10.1093/jrsssa/qnad010> and Ben-Michael, Greiner, Huang, Imai, Jiang, and Shin (2024) <doi:10.48550/arXiv.2403.12108>. The data used for this paper, and made available here, are interim, based on only half of the observations in the study and (for those observations) only half of the study follow-up period. We use them only to illustrate methods, not to draw substantive conclusions.

Maintained by Sooahn Shin. Last updated 3 months ago.

openblas cpp openmp

10.1 match 2 stars 4.60 score 8 scripts

winvector

WVPlots:Common Plots for Analysis

Select data analysis plots, under a standardized calling interface implemented on top of 'ggplot2' and 'plotly'. Plots of interest include: 'ROC', gain curve, scatter plot with marginal distributions, conditioned scatter plot with marginal densities, box and stem with matching theoretical distribution, and density with matching theoretical distribution.

Maintained by John Mount. Last updated 11 months ago.

5.8 match 85 stars 8.00 score 280 scripts

cran

boot:Bootstrap Functions (Originally by Angelo Canty for S)

Functions and datasets for bootstrapping from the book "Bootstrap Methods and Their Application" by A. C. Davison and D. V. Hinkley (1997, CUP), originally written by Angelo Canty for S.

Maintained by Alessandra R. Brazzale. Last updated 7 months ago.

5.6 match 2 stars 8.21 score 2.3k dependents

stan-dev

posterior:Tools for Working with Posterior Distributions

Provides useful tools for both users and developers of packages for fitting Bayesian models or working with output from Bayesian models. The primary goals of the package are to: (a) Efficiently convert between many different useful formats of draws (samples) from posterior or prior distributions. (b) Provide consistent methods for operations commonly performed on draws, for example, subsetting, binding, or mutating draws. (c) Provide various summaries of draws in convenient formats. (d) Provide lightweight implementations of state of the art posterior inference diagnostics. References: Vehtari et al. (2021) <doi:10.1214/20-BA1221>.

Maintained by Paul-Christian Bürkner. Last updated 12 days ago.

bayes bayesian mcmc

2.8 match 168 stars 16.13 score 3.3k scripts 342 dependents

cran

lpcde:Boundary Adaptive Local Polynomial Conditional Density Estimator

Tools for estimation and inference of conditional densities, derivatives and functions. This is the companion software for Cattaneo, Chandak, Jansson and Ma (2024) <doi:10.3150/23-BEJ1711>.

Maintained by Rajita Chandak. Last updated 21 days ago.

cpp

18.8 match 2.40 score 3 scripts