Showing 200 of 363 results

mwheymans

psfmi:Prediction Model Pooling, Selection and Performance Evaluation Across Multiply Imputed Datasets

Pooling, backward and forward selection of linear, logistic and Cox regression models in multiply imputed datasets. Backward and forward selection can be done from the pooled model using Rubin's Rules (RR), the D1, D2, D3, D4 and the median p-values method. This is also possible for mixed models. The models can contain continuous, dichotomous, categorical and restricted cubic spline predictors and interaction terms between all these types of predictors. The stability of the models can be evaluated using (cluster) bootstrapping. The package further contains functions to pool model performance measures such as ROC/AUC, reclassification, R-squared, scaled Brier score, the H&L test and calibration plots for logistic regression models. Internal validation can be done across multiply imputed datasets with cross-validation or bootstrapping. The adjusted intercept after shrinkage of pooled regression coefficients can be obtained. Backward and forward selection as part of internal validation is possible. A function to externally validate logistic prediction models in multiply imputed datasets is available, as is a function to compare models. For Cox models a strata variable can be included. Eekhout (2017) <doi:10.1186/s12874-017-0404-7>. Wiel (2009) <doi:10.1093/biostatistics/kxp011>. Marshall (2009) <doi:10.1186/1471-2288-9-57>.

Maintained by Martijn Heymans. Last updated 2 years ago.

cox-regression imputation imputed-datasets logistic multiple-imputation pool predictor regression selection spline spline-predictors

70.6 match 10 stars 7.17 score 70 scripts
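A minimal sketch of the pooled-selection workflow, with function and argument names taken from the psfmi vignettes (verify against ?psfmi_lr for your installed version):

library(psfmi)
# backward selection of a pooled logistic model across 10 imputed datasets;
# lbpmilr is the example dataset shipped with psfmi (assumed here)
pool_lr <- psfmi_lr(
  data = lbpmilr, nimp = 10, impvar = "Impnr",
  formula = Chronic ~ Gender + Smoking + Function + JobControl,
  method = "D1",        # pooling/selection rule: RR, D1, D2, D3, D4 or MPR
  p.crit = 0.05, direction = "BW"
)
pool_lr$RR_model        # pooled coefficients at the final selection step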

agqhammond

UKFE:UK Flood Estimation

Functions to implement the methods of the Flood Estimation Handbook (FEH), associated updates and the revitalised flood hydrograph model (ReFH). Currently the package uses NRFA peak flow dataset version 13. Aside from FEH functionality, further hydrological functions are available. Most of the methods implemented in this package are described in one or more of the following: "Flood Estimation Handbook", Centre for Ecology & Hydrology (1999, ISBN: 0 948540 94 X). "Flood Estimation Handbook Supplementary Report No. 1", Kjeldsen (2007, ISBN: 0 903741 15 7). "Regional Frequency Analysis - an approach based on L-moments", Hosking & Wallis (1997, ISBN: 978 0 521 01940 8). "Proposal of the extreme rank plot for extreme value analysis: with an emphasis on flood frequency studies", Hammond (2019, <doi:10.2166/nh.2019.157>). "Making better use of local data in flood frequency estimation", Environment Agency (2017, ISBN: 978 1 84911 387 8). "Sampling uncertainty of UK design flood estimation", Hammond (2021, <doi:10.2166/nh.2021.059>). "Improving the FEH statistical procedures for flood frequency estimation", Environment Agency (2008, ISBN: 978 1 84432 920 5). "Low flow estimation in the United Kingdom", Institute of Hydrology (1992, ISBN: 0 948540 45 1). Wallingford HydroSolutions (2016, <http://software.hydrosolutions.co.uk/winfap4/Urban-Adjustment-Procedure-Technical-Note.pdf>). Data from the UK National River Flow Archive (<https://nrfa.ceh.ac.uk/>, terms and conditions: <https://nrfa.ceh.ac.uk/costs-terms-and-conditions>).

Maintained by Anthony Hammond. Last updated 1 month ago.

45.5 match 1 star 1.78 score
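Not UKFE's own API - a base-R illustration of the sample L-moments (Hosking & Wallis 1997) that underpin the FEH pooled growth-curve method:

# first two sample L-moments (L-location and L-scale) of an annual-maximum series
lmom12 <- function(x) {
  x <- sort(x); n <- length(x)
  b0 <- mean(x)
  b1 <- sum((seq_len(n) - 1) / (n - 1) * x) / n
  c(l1 = b0, l2 = 2 * b1 - b0)
}
lmom12(rlnorm(50, meanlog = 4, sdlog = 0.3))  # toy flow data (hypothetical)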

bioc

BASiCS:Bayesian Analysis of Single-Cell Sequencing data

Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model to perform statistical analyses of single-cell RNA sequencing datasets in the context of supervised experiments (where the groups of cells of interest are known a priori, e.g. experimental conditions or cell types). BASiCS performs built-in data normalisation (global scaling) and technical noise quantification (based on spike-in genes). BASiCS provides an intuitive detection criterion for highly (or lowly) variable genes within a single group of cells. Additionally, BASiCS can compare gene expression patterns between two or more pre-specified groups of cells. Unlike traditional differential expression tools, BASiCS quantifies changes in expression that lie beyond comparisons of means, also allowing the study of changes in cell-to-cell heterogeneity. The latter can be quantified via a biological over-dispersion parameter that measures the excess of variability that is observed with respect to Poisson sampling noise, after normalisation and technical noise removal. Due to the strong mean/over-dispersion confounding that is typically observed for scRNA-seq datasets, BASiCS also tests for changes in residual over-dispersion, defined by residual values with respect to a global mean/over-dispersion trend.

Maintained by Catalina Vallejos. Last updated 5 months ago.

immunooncology normalization sequencing rnaseq software geneexpression transcriptomics singlecell differentialexpression bayesian cellbiology bioconductor-package gene-expression rcpp rcpparmadillo scrna-seq single-cell openblas cpp openmp

6.5 match 83 stars 10.26 score 368 scripts 1 dependents
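A hedged sketch of the typical BASiCS workflow (function and argument names per the Bioconductor documentation; check ?BASiCS_MCMC for your installed version):

library(BASiCS)
# 'sce' is a user-supplied SingleCellExperiment with spike-in counts (assumed)
Chain <- BASiCS_MCMC(Data = sce, N = 20000, Thin = 20, Burn = 10000,
                     WithSpikes = TRUE)
HVG <- BASiCS_DetectHVG(Chain, VarThreshold = 0.6)  # highly variable genes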

ccicb

CRUX:Easily explore patterns of somatic variation in cancer using 'CRUX'

Shiny app for exploring somatic variation in cancer. Powered by maftools.

Maintained by Sam El-Kamand. Last updated 1 year ago.

23.2 match 2 stars 2.00 score 5 scripts

cran

ADLP:Accident and Development Period Adjusted Linear Pools for Actuarial Stochastic Reserving

Loss reserving generally focuses on identifying a single model that can generate superior predictive performance. However, different loss reserving models specialise in capturing different aspects of loss data. This is recognised in practice in the sense that results from different models are often considered, and sometimes combined. For instance, actuaries may take a weighted average of the prediction outcomes from various loss reserving models, often based on subjective assessments. This package allows for the use of a systematic framework to objectively combine (i.e. ensemble) multiple stochastic loss reserving models such that the strengths offered by different models can be utilised effectively. Our framework is developed in Avanzi et al. (2023). Firstly, our criteria for model combination consider the full distributional properties of the ensemble and not just the central estimate - which is of particular importance in the reserving context. Secondly, our framework is tailored to the features inherent in reserving data. These include, for instance, accident, development, calendar, and claim maturity effects. Crucially, the relative importance and scarcity of data across accident periods renders the problem distinct from the traditional ensemble techniques in statistical learning. Our framework is illustrated with a complex synthetic dataset. In the results, the optimised ensemble outperforms both (i) traditional model selection strategies, and (ii) an equally weighted ensemble. In particular, the improvement occurs not only with central estimates but also with relevant quantiles, such as the 75th percentile of reserves (typically of interest to both insurers and regulators). Reference: Avanzi B, Li Y, Wong B, Xian A (2023) "Ensemble distributional forecasting for insurance loss reserving" <doi:10.48550/arXiv.2206.08541>.

Maintained by Yanfeng Li. Last updated 11 months ago.

15.8 match 2.70 score
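Not the ADLP API - a minimal base-R illustration of the linear-pool idea the package systematizes: choose ensemble weights that maximise the out-of-sample log score of the combined predictive density:

set.seed(1)
y <- rgamma(200, shape = 2, rate = 0.5)   # toy validation "claims" (hypothetical)
# candidate models' predictive densities evaluated at the observations
dens <- cbind(m1 = dgamma(y, 2, 0.5), m2 = dlnorm(y, log(3), 0.8))
neg_logscore <- function(w) -sum(log(dens %*% c(w, 1 - w)))
w1 <- optimize(neg_logscore, interval = c(0, 1))$minimum
c(model1 = w1, model2 = 1 - w1)           # optimised ensemble weights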

robertemprechtinger

metaHelper:Transforms Statistical Measures Commonly Used for Meta-Analysis

Helps calculate statistical values commonly used in meta-analysis. It provides several methods to compute different forms of standardized mean differences, as well as other values such as standard errors and standard deviations. The methods used in this package are described in the following references: Altman D G, Bland J M. (2011) <doi:10.1136/bmj.d2090> Borenstein, M., Hedges, L.V., Higgins, J.P.T. and Rothstein, H.R. (2009) <doi:10.1002/9780470743386.ch4> Chinn S. (2000) <doi:10.1002/1097-0258(20001130)19:22%3C3127::aid-sim784%3E3.0.co;2-m> Cochrane Handbook (2011) <https://handbook-5-1.cochrane.org/front_page.htm> Cooper, H., Hedges, L. V., & Valentine, J. C. (2009) <https://psycnet.apa.org/record/2009-05060-000> Cohen, J. (1977) <https://psycnet.apa.org/record/1987-98267-000> Ellis, P.D. (2009) <https://www.psychometrica.de/effect_size.html> Goulet-Pelletier, J.-C., & Cousineau, D. (2018) <doi:10.20982/tqmp.14.4.p242> Hedges, L. V. (1981) <doi:10.2307/1164588> Hedges L. V., Olkin I. (1985) <doi:10.1016/C2009-0-03396-0> Murad M H, Wang Z, Zhu Y, Saadi S, Chu H, Lin L et al. (2023) <doi:10.1136/bmj-2022-073141> Mayer M (2023) <https://search.r-project.org/CRAN/refmans/confintr/html/ci_proportion.html> Stackoverflow (2014) <https://stats.stackexchange.com/questions/82720/confidence-interval-around-binomial-estimate-of-0-or-1> Stackoverflow (2018) <https://stats.stackexchange.com/q/338043>.

Maintained by Robert Emprechtinger. Last updated 8 months ago.

9.0 match 4 stars 3.90 score
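Illustration only (not metaHelper's own function names): the Hedges' g computation that packages like this automate, from group summary statistics:

hedges_g <- function(m1, m2, sd1, sd2, n1, n2) {
  sp <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))  # pooled SD
  d  <- (m1 - m2) / sp                # Cohen's d
  d * (1 - 3 / (4 * (n1 + n2) - 9))   # small-sample correction (Hedges 1981)
}
hedges_g(m1 = 10.2, m2 = 8.9, sd1 = 2.1, sd2 = 2.4, n1 = 30, n2 = 28)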

alanarnholt

BSDA:Basic Statistics and Data Analysis

Data sets for the book "Basic Statistics and Data Analysis" by Larry J. Kitchens.

Maintained by Alan T. Arnholt. Last updated 2 years ago.

3.3 match 7 stars 9.11 score 1.3k scripts 6 dependents
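Alongside the data sets, BSDA exports summary-statistics tests; a quick example, assuming the tsum.test() helper documented in the package (see ?tsum.test):

library(BSDA)
# two-sample t-test from summary statistics alone (toy numbers)
tsum.test(mean.x = 72.1, s.x = 8.2, n.x = 35,
          mean.y = 68.4, s.y = 7.9, n.y = 40)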

cran

nlme:Linear and Nonlinear Mixed Effects Models

Fit and compare Gaussian linear and nonlinear mixed-effects models.

Maintained by R Core Team. Last updated 2 months ago.

fortran

1.9 match 6 stars 13.00 score 13k scripts 8.7k dependents
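A minimal fit, using the Orthodont data shipped with nlme:

library(nlme)
# random intercept per subject, fixed effect of age
fm <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont)
summary(fm)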

nsj3

rioja:Analysis of Quaternary Science Data

Constrained clustering, transfer functions, and other methods for analysing Quaternary science data.

Maintained by Steve Juggins. Last updated 6 months ago.

cpp

3.4 match 10 stars 7.21 score 191 scripts 3 dependents
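A hedged sketch of stratigraphically constrained clustering with rioja's chclust(); the RLGH example data and CONISS method follow the package examples (assumed):

library(rioja)
data(RLGH)                             # Round Loch of Glenhead diatom data
diss <- dist(sqrt(RLGH$spec / 100))    # distance on sqrt-transformed proportions
clust <- chclust(diss, method = "coniss")
plot(clust)                            # stratigraphically ordered dendrogram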

truecluster

ff:Memory-Efficient Storage of Large Data on Disk and Fast Access Functions

The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) in main memory - the effective virtual memory consumption per ff object. ff supports R's standard atomic data types 'double', 'logical', 'raw' and 'integer' and non-standard atomic types boolean (1 bit), quad (2 bit unsigned), nibble (4 bit unsigned), byte (1 byte signed with NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs), ushort (2 byte unsigned), single (4 byte float with NAs). For example 'quad' allows efficient storage of genomic data as an 'A','T','G','C' factor. The unsigned types support 'circular' arithmetic. There is also support for close-to-atomic types 'factor', 'ordered', 'POSIXct', 'Date' and custom close-to-atomic types. ff has native C-support for vectors, matrices and arrays with flexible dimorder (major column-order, major row-order and generalizations for arrays), and also provides an ffdf class not unlike data.frames, with import/export filters for csv files. ff objects store raw data in binary flat files in native encoding, and complement this with metadata stored in R as physical and virtual attributes. ff objects have well-defined hybrid copying semantics, which give rise to certain performance improvements through virtualization. ff objects can be stored and reopened across R sessions. ff files can be shared by multiple ff R objects (using different data en/de-coding schemes) in the same process or from multiple R processes to exploit parallelism. A wide choice of finalizer options allows working with 'permanent' files as well as creating/removing 'temporary' ff files completely transparently to the user. On certain OS/filesystem combinations, creating ff files works without notable delay thanks to sparse file allocation. Several access optimization techniques such as Hybrid Index Preprocessing and Virtualization are implemented to achieve good performance even with large datasets, for example virtual matrix transpose without touching a single byte on disk. Further, to reduce disk I/O, 'logicals' and non-standard data types are stored natively and compactly in binary flat files, i.e. logicals take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond basic access functions, the ff package also provides compatibility functions that facilitate writing code for ff and ram objects, and support for batch processing on ff objects (e.g. as.ram, as.ff, ffapply). ff interfaces closely with functionality from package 'bit': chunked looping, fast bit operations and coercions between different objects that can store subscript information ('bit', 'bitwhich', ff 'boolean', ri range index, hi hybrid index). This makes it possible to work interactively with selections of large datasets and quickly modify selection criteria. Further high-performance enhancements can be made available upon request.

Maintained by Jens Oehlschlägel. Last updated 2 months ago.

cpp

2.0 match 27 stars 12.01 score 764 scripts 71 dependents
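A minimal sketch of the disk-backed vector workflow (see ?ff):

library(ff)
x <- ff(vmode = "double", length = 1e8)  # ~800 MB flat file on disk, little RAM used
x[1:5] <- rnorm(5)                       # chunked, vector-like read/write
x[1:5]
delete(x)                                # remove the backing file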

braverock

PortfolioAnalytics:Portfolio Analysis, Including Numerical Methods for Optimization of Portfolios

Portfolio optimization and analysis routines and graphics.

Maintained by Brian G. Peterson. Last updated 3 months ago.

1.7 match 81 stars 11.49 score 626 scripts 2 dependents
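A hedged outline of the usual specification/optimization workflow (builder functions per the package vignettes; the edhec returns come from the PerformanceAnalytics dependency, and the "ROI" method assumes the ROI solver packages are installed):

library(PortfolioAnalytics)
data(edhec)                                    # monthly hedge-fund returns
port <- portfolio.spec(assets = colnames(edhec))
port <- add.constraint(port, type = "full_investment")
port <- add.constraint(port, type = "long_only")
port <- add.objective(port, type = "risk", name = "StdDev")
opt <- optimize.portfolio(edhec, port, optimize_method = "ROI")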