R-universe search: permutation

gavinsimpson

permute:Functions for Generating Restricted Permutations of Data

A set of restricted permutation designs for freely exchangeable, line transects (time series), and spatial grid designs plus permutation of blocks (groups of samples) is provided. 'permute' also allows split-plot designs, in which the whole-plots or split-plots or both can be freely-exchangeable or one of the restricted designs. The 'permute' package is modelled after the permutation schemes of 'Canoco 3.1' (and later) by Cajo ter Braak.

Maintained by Gavin L. Simpson. Last updated 7 months ago.

permutation restricted-permutations

111.0 match 23 stars 13.28 score 538 scripts 488 dependents
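
The idea of a restricted permutation with blocks, as described for 'permute' above, can be illustrated with a minimal sketch. It is written in Python for neutrality and does not use or reproduce the 'permute' API; the function name shuffle_within_blocks and its arguments are hypothetical. It assumes the simplest restriction only: samples are shuffled freely, but never moved between blocks.

```python
import random


def shuffle_within_blocks(blocks, seed=None):
    """Return a permutation of sample indices that never crosses block boundaries.

    `blocks` gives one block label per sample. Hypothetical helper for
    illustration; this is not the 'permute' package's implementation.
    """
    rng = random.Random(seed)

    # Group sample indices by their block label.
    indices_by_block = {}
    for index, label in enumerate(blocks):
        indices_by_block.setdefault(label, []).append(index)

    # Shuffle the indices of each block independently.
    permutation = list(range(len(blocks)))
    for indices in indices_by_block.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        for target, source in zip(indices, shuffled):
            permutation[target] = source
    return permutation


if __name__ == "__main__":
    labels = ["a", "a", "a", "b", "b"]
    print(shuffle_within_blocks(labels, seed=1))
```

Each position in the result holds the index of a sample from the same block, so any statistic recomputed on the permuted data respects the blocking.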

robinhankin

permutations:The Symmetric Group: Permutations of a Finite Set

Manipulates invertible functions from a finite set to itself. Can transform from word form to cycle form and back. To cite the package in publications please use Hankin (2020) "Introducing the permutations R package", SoftwareX, volume 11 <doi:10.1016/j.softx.2020.100453>.

Maintained by Robin K. S. Hankin. Last updated 1 month ago.

146.9 match 6 stars 8.23 score 49 scripts 2 dependents
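
The conversion between word form and cycle form mentioned for 'permutations' above can be sketched as follows. This is a Python illustration of the underlying idea, not the package's R interface; the function names word_to_cycles and cycles_to_word are hypothetical, and the sketch assumes 1-based word form in which entry i is the image of i.

```python
def word_to_cycles(word):
    """Convert a permutation in word form to a list of disjoint cycles.

    word[i - 1] is the image of i, so [2, 3, 1, 5, 4] maps
    1->2, 2->3, 3->1, 4->5, 5->4. Fixed points are omitted.
    """
    n = len(word)
    seen = [False] * (n + 1)
    cycles = []
    for start in range(1, n + 1):
        if seen[start]:
            continue
        # Follow images from `start` until the cycle closes.
        cycle = []
        current = start
        while not seen[current]:
            seen[current] = True
            cycle.append(current)
            current = word[current - 1]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles


def cycles_to_word(cycles, n):
    """Convert disjoint cycles on {1, ..., n} back to word form."""
    word = list(range(1, n + 1))  # start from the identity
    for cycle in cycles:
        for position, element in enumerate(cycle):
            # Each element maps to the next one in its cycle, wrapping around.
            word[element - 1] = cycle[(position + 1) % len(cycle)]
    return word


if __name__ == "__main__":
    print(word_to_cycles([2, 3, 1, 5, 4]))
    print(cycles_to_word([(1, 2, 3), (4, 5)], 5))
```

The two functions are inverses of each other on valid input, which makes a round trip a convenient sanity check.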

kbroman

qtl:Tools for Analyzing QTL Experiments

Analysis of experimental crosses to identify genes (called quantitative trait loci, QTLs) contributing to variation in quantitative traits. Broman et al. (2003) <doi:10.1093/bioinformatics/btg112>.

Maintained by Karl W Broman. Last updated 7 months ago.

openblas

30.9 match 80 stars 12.79 score 2.4k scripts 29 dependents

vegandevs

vegan:Community Ecology Package

Ordination methods, diversity analysis and other functions for community and vegetation ecologists.

Maintained by Jari Oksanen. Last updated 15 days ago.

ecological-modelling ecology ordination fortran openblas

18.3 match 472 stars 19.41 score 15k scripts 440 dependents

jwood000

RcppAlgos:High Performance Tools for Combinatorics and Computational Mathematics

Provides optimized functions and flexible iterators implemented in C++ for solving problems in combinatorics and computational mathematics. Handles various combinatorial objects including combinations, permutations, integer partitions and compositions, Cartesian products, unordered Cartesian products, and partition of groups. Utilizes the RMatrix class from 'RcppParallel' for thread safety. The combination and permutation functions contain constraint parameters that allow for generation of all results of a vector meeting specific criteria (e.g. finding all combinations such that the sum is between two bounds). Capable of ranking/unranking combinatorial objects efficiently (e.g. retrieve only the nth lexicographical result) which sets up nicely for parallelization as well as random sampling. Gmp support permits exploration where the total number of results is large (e.g. comboSample(10000, 500, n = 4)). Additionally, there are several high performance number theoretic functions that are useful for problems common in computational mathematics. Some of these functions make use of the fast integer division library 'libdivide'. The primeSieve function is based on the segmented sieve of Eratosthenes implementation by Kim Walisch. It is also efficient for large numbers by using the cache friendly improvements originally developed by Tomás Oliveira. Finally, there is a prime counting function that implements Legendre's formula based on the work of Kim Walisch.

Maintained by Joseph Wood. Last updated 1 month ago.

combinations combinatorics factorization number-theory parallel permutation prime-factorizations primesieve gmp cpp

34.3 match 45 stars 10.04 score 153 scripts 12 dependents
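
Unranking, which the 'RcppAlgos' description above refers to as retrieving only the nth lexicographical result, can be sketched in a few lines. The code below is a Python illustration using the factorial number system. It is not the package's C++ implementation, the name nth_permutation is hypothetical, and it assumes distinct items and a 0-based rank.

```python
from math import factorial


def nth_permutation(items, index):
    """Return the permutation of `items` at 0-based lexicographic rank `index`.

    Assumes the items are distinct. Hypothetical helper for illustration.
    """
    pool = sorted(items)
    total = factorial(len(pool))
    if not 0 <= index < total:
        raise IndexError("rank out of range")

    result = []
    for remaining in range(len(pool), 0, -1):
        # Each choice of leading item accounts for (remaining - 1)! permutations.
        block = factorial(remaining - 1)
        position, index = divmod(index, block)
        result.append(pool.pop(position))
    return result


if __name__ == "__main__":
    print(nth_permutation([1, 2, 3], 3))
```

Because any rank can be computed independently of the others, a range of ranks can be split across workers or sampled at random without generating every permutation first. Python integers have arbitrary precision, so very large ranks need no extra library here.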

cvoeten

permutes:Permutation Tests for Time Series Data

Helps you determine the analysis window to use when analyzing densely-sampled time-series data, such as EEG data, using permutation testing (Maris & Oostenveld, 2007) <doi:10.1016/j.jneumeth.2007.03.024>. These permutation tests can help identify the timepoints where significance of an effect begins and ends, and the results can be plotted in various types of heatmap for reporting. Mixed-effects models are supported using an implementation of the approach by Lee & Braun (2012) <doi:10.1111/j.1541-0420.2011.01675.x>.

Maintained by Cesko C. Voeten. Last updated 2 years ago.

73.3 match 4.23 score 16 scripts

igraph

igraph:Network Analysis and Visualization

Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.

Maintained by Kirill Müller. Last updated 1 day ago.

complex-networks graph-algorithms graph-theory mathematics network-analysis network-graph fortran libxml2 glpk openblas cpp

11.5 match 581 stars 21.10 score 31k scripts 1.9k dependents

mhahsler

seriation:Infrastructure for Ordering Objects Using Seriation

Infrastructure for ordering objects with an implementation of several seriation/sequencing/ordination techniques to reorder matrices, dissimilarity matrices, and dendrograms. Also provides (optimally) reordered heatmaps, color images and clustering visualizations like dissimilarity plots, and visual assessment of cluster tendency plots (VAT and iVAT). Hahsler et al (2008) <doi:10.18637/jss.v025.i03>.

Maintained by Michael Hahsler. Last updated 3 months ago.

combinatorial-optimization ordination seriation fortran

17.0 match 77 stars 14.07 score 640 scripts 79 dependents

adeverse

ade4:Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences

Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) <doi:10.18637/jss.v022.i04>.

Maintained by Aurélie Siberchicot. Last updated 11 days ago.

openblas cpp

11.3 match 39 stars 14.96 score 2.2k scripts 256 dependents

bioc

regioneR:Association analysis of genomic regions based on permutation tests

regioneR offers a statistical framework based on customizable permutation tests to assess the association between genomic region sets and other genomic features.

Maintained by Bernat Gel. Last updated 5 months ago.

genetics chipseq dnaseq methylseq copynumbervariation

17.8 match 9.00 score 2.7k scripts 21 dependents

rfastofficial

Rfast:A Collection of Efficient and Extremely Fast R Functions

A collection of fast (utility) functions for data analysis. Column and row wise means, medians, variances, minimums, maximums, many t, F and G-square tests, many regressions (normal, logistic, Poisson), are some of the many fast functions. References: a) Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>. b) Tsagris M. and Papadakis M. (2018). Forward regression in R: from the extreme slow to the extreme fast. Journal of Data Science, 16(4): 771--780. <doi:10.6339/JDS.201810_16(4).00006>. c) Chatzipantsiou C., Dimitriadis M., Papadakis M. and Tsagris M. (2020). Extremely Efficient Permutation and Bootstrap Hypothesis Tests Using R. Journal of Modern Applied Statistical Methods, 18(2), eP2898. <doi:10.48550/arXiv.1806.10947>. d) Tsagris M., Papadakis M., Alenazi A. and Alzeley O. (2024). Computationally Efficient Outlier Detection for High-Dimensional Data Using the MDP Algorithm. Computation, 12(9): 185. <doi:10.3390/computation12090185>. e) Tsagris M. and Papadakis M. (2025). Fast and light-weight energy statistics using the R package Rfast. <doi:10.48550/arXiv.2501.02849>.

Maintained by Manos Papadakis. Last updated 16 days ago.

openblas cpp openmp

12.7 match 147 stars 12.54 score 1.2k scripts 166 dependents

bioc

methylInheritance:Permutation-Based Analysis associating Conserved Differentially Methylated Elements Across Multiple Generations to a Treatment Effect

Permutation analysis, based on Monte Carlo sampling, for testing the hypothesis that the number of conserved differentially methylated elements, between several generations, is associated to an effect inherited from a treatment and that stochastic effect can be dismissed.

Maintained by Astrid Deschênes. Last updated 5 months ago.

biologicalquestion epigenetics dnamethylation differentialmethylation methylseq software immunooncology statisticalmethod wholegenome sequencing analysis bioconductor bioinformatics cpg differentially-methylated-elements inheritance monte-carlo-sampling permutation

34.2 match 4.60 score 1 scripts

martinzaefferer

CEGO:Combinatorial Efficient Global Optimization

Model building, surrogate model based optimization and Efficient Global Optimization in combinatorial or mixed search spaces.

Maintained by Martin Zaefferer. Last updated 2 months ago.

50.8 match 1 stars 3.04 score 73 scripts

r-gregmisc

gtools:Various R Programming Tools

Functions to assist in R programming, including: - assist in developing, updating, and maintaining R and R packages ('ask', 'checkRVersion', 'getDependencies', 'keywords', 'scat'), - calculate the logit and inverse logit transformations ('logit', 'inv.logit'), - test if a value is missing, empty or contains only NA and NULL values ('invalid'), - manipulate R's .Last function ('addLast'), - define macros ('defmacro'), - detect odd and even integers ('odd', 'even'), - convert strings containing non-ASCII characters (like single quotes) to plain ASCII ('ASCIIfy'), - perform a binary search ('binsearch'), - sort strings containing both numeric and character components ('mixedsort'), - create a factor variable from the quantiles of a continuous variable ('quantcut'), - enumerate permutations and combinations ('combinations', 'permutations'), - calculate and convert between fold-change and log-ratio ('foldchange', 'logratio2foldchange', 'foldchange2logratio'), - calculate probabilities and generate random numbers from Dirichlet distributions ('rdirichlet', 'ddirichlet'), - apply a function over adjacent subsets of a vector ('running'), - modify the TCP_NODELAY ('de-Nagle') flag for socket objects, - efficient 'rbind' of data frames, even if the column names don't match ('smartbind'), - generate significance stars from p-values ('stars.pval'), - convert characters to/from ASCII codes ('asc', 'chr'), - convert character vector to ASCII representation ('ASCIIfy'), - apply title capitalization rules to a character vector ('capwords').

Maintained by Ben Bolker. Last updated 9 months ago.

10.4 match 25 stars 14.47 score 11k scripts 1.1k dependents

r-spatial

spdep:Spatial Dependence: Weighting Schemes, Statistics

A collection of functions to create spatial weights matrix objects from polygon 'contiguities', from point patterns by distance and tessellations, for summarizing these objects, and for permitting their use in spatial data analysis, including regional aggregation by minimum spanning tree; a collection of tests for spatial 'autocorrelation', including global 'Morans I' and 'Gearys C' proposed by 'Cliff' and 'Ord' (1973, ISBN: 0850860369) and (1981, ISBN: 0850860814), 'Hubert/Mantel' general cross product statistic, Empirical Bayes estimates and 'Assunção/Reis' (1999) <doi:10.1002/(SICI)1097-0258(19990830)18:16%3C2147::AID-SIM179%3E3.0.CO;2-I> Index, 'Getis/Ord' G ('Getis' and 'Ord' 1992) <doi:10.1111/j.1538-4632.1992.tb00261.x> and multicoloured join count statistics, 'APLE' ('Li et al.') <doi:10.1111/j.1538-4632.2007.00708.x>, local 'Moran's I', 'Gearys C' ('Anselin' 1995) <doi:10.1111/j.1538-4632.1995.tb00338.x> and 'Getis/Ord' G ('Ord' and 'Getis' 1995) <doi:10.1111/j.1538-4632.1995.tb00912.x>, 'saddlepoint' approximations ('Tiefelsdorf' 2002) <doi:10.1111/j.1538-4632.2002.tb01084.x> and exact tests for global and local 'Moran's I' ('Bivand et al.' 2009) <doi:10.1016/j.csda.2008.07.021> and 'LOSH' local indicators of spatial heteroscedasticity ('Ord' and 'Getis') <doi:10.1007/s00168-011-0492-y>. The implementation of most of these measures is described in 'Bivand' and 'Wong' (2018) <doi:10.1007/s11749-018-0599-x>, with further extensions in 'Bivand' (2022) <doi:10.1111/gean.12319>. 'Lagrange' multiplier tests for spatial dependence in linear models are provided ('Anselin et al'. 1996) <doi:10.1016/0166-0462(95)02111-6>, as are 'Rao' score tests for hypothesised spatial 'Durbin' models based on linear models ('Koley' and 'Bera' 2023) <doi:10.1080/17421772.2023.2256810>. A local indicators for categorical data (LICD) implementation based on 'Carrer et al.' (2021) <doi:10.1016/j.jas.2020.105306> and 'Bivand et al.' (2017) <doi:10.1016/j.spasta.2017.03.003> was added in 1.3-7. From 'spdep' and 'spatialreg' versions >= 1.2-1, the model fitting functions previously present in this package are defunct in 'spdep' and may be found in 'spatialreg'.

Maintained by Roger Bivand. Last updated 17 days ago.

spatial-autocorrelation spatial-dependence spatial-weights

9.0 match 131 stars 16.62 score 6.0k scripts 107 dependents

randy3k

arrangements:Fast Generators and Iterators for Permutations, Combinations, Integer Partitions and Compositions

Fast generators and iterators for permutations, combinations, integer partitions and compositions. The arrangements are in lexicographical order and generated iteratively in a memory efficient manner. It has been demonstrated that 'arrangements' outperforms most existing packages of similar kind. Benchmarks can be found at <https://randy3k.github.io/arrangements/articles/benchmark.html>.

Maintained by Randy Lai. Last updated 2 years ago.

gmp

17.7 match 52 stars 7.89 score 118 scripts 23 dependents

permaverse

flipr:Flexible Inference via Permutations in R

A flexible permutation framework for making inference such as point estimation, confidence intervals or hypothesis testing, on any kind of data, be it univariate, multivariate, or more complex such as network-valued data, topological data, functional data or density-valued data.

Maintained by Aymeric Stamm. Last updated 12 days ago.

cpp

18.9 match 6 stars 6.89 score 24 scripts 1 dependents

graemetlloyd

Claddis:Measuring Morphological Diversity and Evolutionary Tempo

Measures morphological diversity from discrete character data and estimates evolutionary tempo on phylogenetic trees. Imports morphological data from #NEXUS (Maddison et al. (1997) <doi:10.1093/sysbio/46.4.590>) format with read_nexus_matrix(), and writes to both #NEXUS and TNT format (Goloboff et al. (2008) <doi:10.1111/j.1096-0031.2008.00217.x>). Main functions are test_rates(), which implements AIC and likelihood ratio tests for discrete character rates introduced across Lloyd et al. (2012) <doi:10.1111/j.1558-5646.2011.01460.x>, Brusatte et al. (2014) <doi:10.1016/j.cub.2014.08.034>, Close et al. (2015) <doi:10.1016/j.cub.2015.06.047>, and Lloyd (2016) <doi:10.1111/bij.12746>, and calculate_morphological_distances(), which implements multiple discrete character distance metrics from Gower (1971) <doi:10.2307/2528823>, Wills (1998) <doi:10.1006/bijl.1998.0255>, Lloyd (2016) <doi:10.1111/bij.12746>, and Hopkins and St John (2018) <doi:10.1098/rspb.2018.1784>. This also includes the GED correction from Lehmann et al. (2019) <doi:10.1111/pala.12430>. Multiple functions implement morphospace plots: plot_chronophylomorphospace() implements Sakamoto and Ruta (2012) <doi:10.1371/journal.pone.0039752>, plot_morphospace() implements Wills et al. (1994) <doi:10.1017/S009483730001263X>, plot_changes_on_tree() implements Wang and Lloyd (2016) <doi:10.1098/rspb.2016.0214>, and plot_morphospace_stack() implements Foote (1993) <doi:10.1017/S0094837300015864>. Other functions include safe_taxonomic_reduction(), which implements Wilkinson (1995) <doi:10.1093/sysbio/44.4.501>, map_dollo_changes(), which implements the Dollo stochastic character mapping of Tarver et al. (2018) <doi:10.1093/gbe/evy096>, and estimate_ancestral_states(), which implements the ancestral state options of Lloyd (2018) <doi:10.1111/pala.12380>. calculate_tree_length() and reconstruct_ancestral_states() implement the generalised algorithms from Swofford and Maddison (1992; no doi).

Maintained by Graeme T. Lloyd. Last updated 6 months ago.

15.1 match 13 stars 7.81 score 77 scripts 2 dependents

angeella

pARI:Permutation-Based All-Resolutions Inference

Computes the All-Resolution Inference method in the permutation framework, i.e., simultaneous lower confidence bounds for the number of true discoveries. <doi:10.1002/sim.9725>.

Maintained by Angela Andreella. Last updated 6 months ago.

ari cluster-map copes discoveries fmri fsl permutation selective-inference simultaneous-confidence-bounds spm openblas cpp

24.3 match 4 stars 4.78 score 9 scripts 1 dependents

alexkowa

EnvStats:Package for Environmental Statistics, Including US EPA Guidance

Graphical and statistical analyses of environmental data, with focus on analyzing chemical concentrations and physical parameters, usually in the context of mandated environmental monitoring. Major environmental statistical methods found in the literature and regulatory guidance documents, with extensive help that explains what these methods do, how to use them, and where to find them in the literature. Numerous built-in data sets from regulatory guidance documents and environmental statistics literature. Includes scripts reproducing analyses presented in the book "EnvStats: An R Package for Environmental Statistics" (Millard, 2013, Springer, ISBN 978-1-4614-8455-4, <doi:10.1007/978-1-4614-8456-1>).

Maintained by Alexander Kowarik. Last updated 15 days ago.

8.8 match 26 stars 12.80 score 2.4k scripts 46 dependents

maximeherve

RVAideMemoire:Testing and Plotting Procedures for Biostatistics

Contains miscellaneous functions useful in biostatistics, mostly univariate and multivariate testing procedures with a special emphasis on permutation tests. Many functions intend to simplify user's life by shortening existing procedures or by implementing plotting functions that can be used with as many methods from different packages as possible.

Maintained by Maxime HERVE. Last updated 1 year ago.

20.7 match 8 stars 5.31 score 632 scripts

tidyverse

modelr:Modelling Functions that Work with the Pipe

Functions for modelling that help you seamlessly integrate modelling into a pipeline of data manipulation and visualisation.

Maintained by Hadley Wickham. Last updated 1 year ago.

modelling

6.7 match 401 stars 16.44 score 6.9k scripts 1.0k dependents

mtorchiano

lmPerm:Permutation Tests for Linear Models

Linear model functions using permutation tests.

Maintained by Marco Torchiano. Last updated 5 years ago.

12.5 match 13 stars 8.40 score 306 scripts 4 dependents

randy3k

iterpc:Efficient Iterator for Permutations and Combinations

Iterator for generating permutations and combinations. They can be either drawn with or without replacement, or with distinct/ non-distinct items (multiset). The generated sequences are in lexicographical order (dictionary order). The algorithms to generate permutations and combinations are memory efficient. These iterative algorithms enable users to process all sequences without putting all results in the memory at the same time. The algorithms are written in C/C++ for faster performance. Note: 'iterpc' is no longer being maintained. Users are recommended to switch to 'arrangements'.

Maintained by Randy Lai. Last updated 5 years ago.

14.5 match 9 stars 7.17 score 47 scripts 5 dependents

yjunechoe

jlmerclusterperm:Cluster-Based Permutation Analysis for Densely Sampled Time Data

An implementation of fast cluster-based permutation analysis (CPA) for densely-sampled time data developed in Maris & Oostenveld, 2007 <doi:10.1016/j.jneumeth.2007.03.024>. Supports (generalized, mixed-effects) regression models for the calculation of timewise statistics. Provides both a wholesale and a piecemeal interface to the CPA procedure with an emphasis on interpretability and diagnostics. Integrates 'Julia' libraries 'MixedModels.jl' and 'GLM.jl' for performance improvements, with additional functionalities for interfacing with 'Julia' from 'R' powered by the 'JuliaConnectoR' package.

Maintained by June Choe. Last updated 5 days ago.

cluster-based-permutation-test eeg eyetracking mixed-effects-models timeseries

16.6 match 13 stars 5.86 score 14 scripts

wjbraun

DAAG:Data Analysis and Graphics Data and Functions

Functions and data sets used in examples and exercises in the text Maindonald, J.H. and Braun, W.J. (2003, 2007, 2010) "Data Analysis and Graphics Using R", and in an upcoming Maindonald, Braun, and Andrews text that builds on this earlier text.

Maintained by W. John Braun. Last updated 11 months ago.

11.8 match 8.25 score 1.2k scripts 1 dependents

bioc

CaDrA:Candidate Driver Analysis

Performs both stepwise and backward heuristic search for candidate (epi)genetic drivers based on a binary multi-omics dataset. CaDrA's main objective is to identify features which, together, are significantly skewed or enriched pertaining to a given vector of continuous scores (e.g. sample-specific scores representing a phenotypic readout of interest, such as protein expression, pathway activity, etc.), based on the union occurrence (i.e. logical OR) of the events.

Maintained by Reina Chau. Last updated 5 months ago.

microarray rnaseq geneexpression software featureextraction

13.1 match 24 stars 7.19 score 12 scripts

r-forge

coin:Conditional Inference Procedures in a Permutation Test Framework

Conditional inference procedures for the general independence problem including two-sample, K-sample (non-parametric ANOVA), correlation, censored, ordered and multivariate problems described in <doi:10.18637/jss.v028.i08>.

Maintained by Torsten Hothorn. Last updated 9 months ago.

7.8 match 11.68 score 1.6k scripts 74 dependents
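
The two-sample problem mentioned for 'coin' above is the simplest setting for a permutation test, and the general idea can be sketched as follows. This is a Monte Carlo illustration in Python, not the conditional inference framework that 'coin' implements; the function name permutation_test, the choice of test statistic (absolute difference in means), and the default of 9999 permutations are all assumptions made for the example.

```python
import random
from statistics import mean


def permutation_test(x, y, n_permutations=9999, seed=0):
    """Approximate two-sided p-value for a difference in means.

    Group labels are shuffled at random and the statistic recomputed;
    the p-value is the share of shuffles at least as extreme as observed.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    n_x = len(x)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        statistic = abs(mean(pooled[:n_x]) - mean(pooled[n_x:]))
        if statistic >= observed:
            extreme += 1

    # Count the observed arrangement itself so the p-value is never zero.
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    treated = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
    control = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    print(permutation_test(treated, control))
```

Adding one to both numerator and denominator is a common convention that treats the observed data as one of the permutations.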

reinhardfurrer

spam:SPArse Matrix

Set of functions for sparse matrix algebra. Differences with other sparse matrix packages are: (1) we only support (essentially) one sparse matrix format, (2) based on transparent and simple structure(s), (3) tailored for MCMC calculations within G(M)RF, and (4) it is fast and scalable (with the extension package spam64). Documentation about 'spam' is provided by vignettes included in this package, see also Furrer and Sain (2010) <doi:10.18637/jss.v036.i10>; see 'citation("spam")' for details.

Maintained by Reinhard Furrer. Last updated 2 months ago.

fortran openblas cpp

9.8 match 1 stars 9.26 score 420 scripts 433 dependents

bioc

RgnTX:Colocalization analysis of transcriptome elements in the presence of isoform heterogeneity and ambiguity

RgnTX allows the integration of transcriptome annotations so as to model the complex alternative splicing patterns. It supports the testing of transcriptome elements without clear isoform association, which is often the real scenario due to technical limitations. It provides functions that perform permutation tests to evaluate the association between features and transcriptome regions.

Maintained by Yue Wang. Last updated 5 months ago.

alternativesplicing sequencing rnaseq methylseq transcription splicedalignment

22.5 match 4.00 score 6 scripts

tidymodels

rsample:General Resampling Infrastructure

Classes and functions to create and summarize different types of resampling objects (e.g. bootstrap, cross-validation).

Maintained by Hannah Frick. Last updated 4 days ago.

5.3 match 341 stars 16.72 score 5.2k scripts 79 dependents

przechoj

gips:Gaussian Model Invariant by Permutation Symmetry

Find the permutation symmetry group such that the covariance matrix of the given data is approximately invariant under it. Discovering such a permutation decreases the number of observations needed to fit a Gaussian model, which is of great use when it is smaller than the number of variables. Even if that is not the case, the covariance matrix found with 'gips' approximates the actual covariance with less statistical error. The methods implemented in this package are described in Graczyk et al. (2022) <doi:10.1214/22-AOS2174>.

Maintained by Adam Przemysław Chojecki. Last updated 8 months ago.

covariance-estimation machine-learning normal-distribution

12.8 match 6 stars 6.40 score 31 scripts

fbertran

plsRglm:Partial Least Squares Regression for Generalized Linear Models

Provides (weighted) Partial least squares Regression for generalized linear models and repeated k-fold cross-validation of such models using various criteria <arXiv:1810.01005>. It allows for missing data in the explanatory variables. Bootstrap confidence interval constructions are also available.

Maintained by Frederic Bertrand. Last updated 2 years ago.

10.6 match 16 stars 7.75 score 103 scripts 5 dependents

r-forge

surveillance:Temporal and Spatio-Temporal Modeling and Monitoring of Epidemic Phenomena

Statistical methods for the modeling and monitoring of time series of counts, proportions and categorical data, as well as for the modeling of continuous-time point processes of epidemic phenomena. The monitoring methods focus on aberration detection in count data time series from public health surveillance of communicable diseases, but applications could just as well originate from environmetrics, reliability engineering, econometrics, or social sciences. The package implements many typical outbreak detection procedures such as the (improved) Farrington algorithm, or the negative binomial GLR-CUSUM method of Hoehle and Paul (2008) <doi:10.1016/j.csda.2008.02.015>. A novel CUSUM approach combining logistic and multinomial logistic modeling is also included. The package contains several real-world data sets, the ability to simulate outbreak data, and to visualize the results of the monitoring in a temporal, spatial or spatio-temporal fashion. A recent overview of the available monitoring procedures is given by Salmon et al. (2016) <doi:10.18637/jss.v070.i10>. For the retrospective analysis of epidemic spread, the package provides three endemic-epidemic modeling frameworks with tools for visualization, likelihood inference, and simulation. hhh4() estimates models for (multivariate) count time series following Paul and Held (2011) <doi:10.1002/sim.4177> and Meyer and Held (2014) <doi:10.1214/14-AOAS743>. twinSIR() models the susceptible-infectious-recovered (SIR) event history of a fixed population, e.g, epidemics across farms or networks, as a multivariate point process as proposed by Hoehle (2009) <doi:10.1002/bimj.200900050>. twinstim() estimates self-exciting point process models for a spatio-temporal point pattern of infective events, e.g., time-stamped geo-referenced surveillance data, as proposed by Meyer et al. (2012) <doi:10.1111/j.1541-0420.2011.01684.x>. 
A recent overview of the implemented space-time modeling frameworks for epidemic phenomena is given by Meyer et al. (2017) <doi:10.18637/jss.v077.i11>.

Maintained by Sebastian Meyer. Last updated 17 hours ago.

cpp

7.3 match 2 stars 10.68 score 446 scripts 3 dependents

jaromilfrossard

permuco:Permutation Tests for Regression, (Repeated Measures) ANOVA/ANCOVA and Comparison of Signals

Functions to compute p-values based on permutation tests. Regression, ANOVA and ANCOVA, omnibus F-tests, marginal unilateral and bilateral t-tests are available. Several methods to handle nuisance variables are implemented (Kherad-Pajouh, S., & Renaud, O. (2010) <doi:10.1016/j.csda.2010.02.015> ; Kherad-Pajouh, S., & Renaud, O. (2014) <doi:10.1007/s00362-014-0617-3> ; Winkler, A. M., Ridgway, G. R., Webster, M. A., Smith, S. M., & Nichols, T. E. (2014) <doi:10.1016/j.neuroimage.2014.01.060>). An extension for the comparison of signals issued from experimental conditions (e.g. EEG/ERP signals) is provided. Several corrections for multiple testing are possible, including the cluster-mass statistic (Maris, E., & Oostenveld, R. (2007) <doi:10.1016/j.jneumeth.2007.03.024>) and the threshold-free cluster enhancement (Smith, S. M., & Nichols, T. E. (2009) <doi:10.1016/j.neuroimage.2008.03.061>).

Maintained by Jaromil Frossard. Last updated 7 months ago.

cpp

11.6 match 13 stars 6.57 score 81 scripts

jmcurran

multicool:Permutations of Multisets in Cool-Lex Order

A set of tools to permute multisets without loops or hash tables and to generate integer partitions. The permutation functions are based on C code from Aaron Williams. Cool-lex order is similar to colexicographical order. The algorithm is described in Williams, A. Loopless Generation of Multiset Permutations by Prefix Shifts. SODA 2009, Symposium on Discrete Algorithms, New York, United States. The permutation code is distributed without restrictions. The code for stable and efficient computation of multinomial coefficients comes from Dave Barber. The code can be downloaded from <http://tamivox.org/dave/multinomial/index.html> and is distributed without conditions. The package also generates the integer partitions of a positive, non-zero integer n. The C++ code for this is based on Python code from Jerome Kelleher which can be found here <https://jeromekelleher.net/category/combinatorics.html>. The C++ code and Python code are distributed without conditions.

Maintained by James Curran. Last updated 1 year ago.

cpp

9.7 match 2 stars 7.74 score 11 scripts 273 dependents

ddebeer

permimp:Conditional Permutation Importance

An add-on to the 'party' package, with a faster implementation of the partial-conditional permutation importance for random forests. The standard permutation importance is implemented exactly the same as in the 'party' package. The conditional permutation importance can be computed faster, with an option to be backward compatible to the 'party' implementation. The package is compatible with random forests fit using the 'party' and the 'randomForest' package. The methods are described in Strobl et al. (2007) <doi:10.1186/1471-2105-8-25> and Debeer and Strobl (2020) <doi:10.1186/s12859-020-03622-2>.

Maintained by Dries Debeer. Last updated 2 years ago.

12.8 match 4 stars 5.85 score 39 scripts 1 dependents

briencj

asremlPlus:Augments 'ASReml-R' in Fitting Mixed Models and Packages Generally in Exploring Prediction Differences

Assists in automating the selection of terms to include in mixed models when 'asreml' is used to fit the models. Procedures are available for choosing models that conform to the hierarchy or marginality principle, for fitting and choosing between two-dimensional spatial models using correlation, natural cubic smoothing spline and P-spline models. A history of the fitting of a sequence of models is kept in a data frame. Also used to compute functions and contrasts of, to investigate differences between and to plot predictions obtained using any model fitting function. The content falls into the following natural groupings: (i) Data, (ii) Model modification functions, (iii) Model selection and description functions, (iv) Model diagnostics and simulation functions, (v) Prediction production and presentation functions, (vi) Response transformation functions, (vii) Object manipulation functions, and (viii) Miscellaneous functions (for further details see 'asremlPlus-package' in help). The 'asreml' package provides a computationally efficient algorithm for fitting a wide range of linear mixed models using Residual Maximum Likelihood. It is a commercial package and a license for it can be purchased from 'VSNi' <https://vsni.co.uk/> as 'asreml-R', who will supply a zip file for local installation/updating (see <https://asreml.kb.vsni.co.uk/>). It is not needed for functions that are methods for 'alldiffs' and 'data.frame' objects. The package 'asremlPlus' can also be installed from <http://chris.brien.name/rpackages/>.

Maintained by Chris Brien. Last updated 26 days ago.

asreml mixed-models

7.8 match 19 stars 9.34 score 200 scripts

cran

vipor:Plot Categorical Data Using Quasirandom Noise and Density Estimates

Generate a violin point plot, a combination of a violin/histogram plot and a scatter plot by offsetting points within a category based on their density using quasirandom noise.

Maintained by Scott Sherrill-Mix. Last updated 1 years ago.

10.4 match 6.86 score 95 dependents

cran

e1071:Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien

Functions for latent class analysis, short time Fourier transform, fuzzy clustering, support vector machines, shortest path computation, bagged clustering, naive Bayes classifier, generalized k-nearest neighbour ...

Maintained by David Meyer. Last updated 6 months ago.

cpp

4.9 match 28 stars 14.46 score 19k scripts 2.0k dependents

bioc

ClusterSignificance:The ClusterSignificance package provides tools to assess if class clusters in dimensionality reduced data representations have a separation different from permuted data

The ClusterSignificance package provides tools to assess if class clusters in dimensionality reduced data representations have a separation different from permuted data. The term class clusters here refers to clusters of points representing known classes in the data. This is particularly useful to determine if a subset of the variables, e.g. genes in a specific pathway, alone can separate samples into these established classes. ClusterSignificance accomplishes this by projecting all points onto a one-dimensional line. Cluster separations are then scored and the probability of the seen separation being due to chance is evaluated using a permutation method.

Maintained by Jason T Serviss. Last updated 5 months ago.

clustering classification principalcomponent statisticalmethod

14.4 match 4.78 score 4 scripts

cran

PerMallows:Permutations and Mallows Distributions

Includes functions to work with the Mallows and Generalized Mallows Models. The considered distances are Kendall's-tau, Cayley, Hamming and Ulam and it includes functions for making inference, sampling and learning such distributions, some of which are novel in the literature. As a by-product, PerMallows also includes operations for permutations, paying special attention to those related to the Kendall's-tau, Cayley, Ulam and Hamming distances. It is also possible to generate random permutations at a given distance, or with a given number of inversions, or cycles, or fixed points or even with a given length of the LIS (longest increasing subsequence).

Maintained by Ekhine Irurozki. Last updated 30 days ago.

cpp

66.9 match 1 stars 1.00 score

r-forge

Matrix:Sparse and Dense Matrix Classes and Methods

A rich hierarchy of sparse and dense matrix classes, including general, symmetric, triangular, and diagonal matrices with numeric, logical, or pattern entries. Efficient methods for operating on such matrices, often wrapping the 'BLAS', 'LAPACK', and 'SuiteSparse' libraries.

Maintained by Martin Maechler. Last updated 5 days ago.

openblas

3.9 match 1 stars 17.23 score 33k scripts 12k dependents

blasbenito

distantia:Advanced Toolset for Efficient Time Series Dissimilarity Analysis

Fast C++ implementation of Dynamic Time Warping for time series dissimilarity analysis, with applications in environmental monitoring and sensor data analysis, climate science, signal processing and pattern recognition, and financial data analysis. Built upon the ideas presented in Benito and Birks (2020) <doi:10.1111/ecog.04895>, provides tools for analyzing time series of varying lengths and structures, including irregular multivariate time series. Key features include individual variable contribution analysis, restricted permutation tests for statistical significance, and imputation of missing data via GAMs. Additionally, the package provides an ample set of tools to prepare and manage time series data.

Maintained by Blas M. Benito. Last updated 24 days ago.

11.1 match 23 stars 5.76 score 11 scripts

andrisignorell

DescTools:Tools for Descriptive Statistics

A collection of miscellaneous basic statistic functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox that facilitates the (notoriously time consuming) first descriptive tasks in data analysis, consisting of calculating descriptive statistics, drawing graphical summaries and reporting the results. The package furthermore contains functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages which are not needed at all), and to provide a common and consistent interface as far as function and arguments naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in absence of convincing alternatives). The 'BigCamelCase' style was consequently applied to functions borrowed from contributed R packages as well.

Maintained by Andri Signorell. Last updated 9 days ago.

fortran cpp

3.8 match 87 stars 16.68 score 7.7k scripts 99 dependents

ahb108

rcarbon:Calibration and Analysis of Radiocarbon Dates

Enables the calibration and analysis of radiocarbon dates, often but not exclusively for the purposes of archaeological research. It includes functions not only for basic calibration, uncalibration, and plotting of one or more dates, but also a statistical framework for building demographic and related longitudinal inferences from aggregate radiocarbon date lists, including: Monte-Carlo simulation test (Timpson et al 2014 <doi:10.1016/j.jas.2014.08.011>), random mark permutation test (Crema et al 2016 <doi:10.1371/journal.pone.0154809>) and spatial permutation tests (Crema, Bevan, and Shennan 2017 <doi:10.1016/j.jas.2017.09.007>).

Maintained by Enrico Crema. Last updated 6 months ago.

7.7 match 34 stars 8.14 score 274 scripts 2 dependents

bioc

metagenomeSeq:Statistical analysis for sparse high-throughput sequencing

metagenomeSeq is designed to determine features (be it Operational Taxonomic Unit (OTU), species, etc.) that are differentially abundant between two or more groups of multiple samples. metagenomeSeq is designed to address the effects of both normalization and under-sampling of microbial communities on disease association detection and the testing of feature correlations.

Maintained by Joseph N. Paulson. Last updated 3 months ago.

immunooncology classification clustering geneticvariability differentialexpression microbiome metagenomics normalization visualization multiplecomparison sequencing software

5.2 match 69 stars 12.02 score 494 scripts 7 dependents

usepa

pTITAN2:Permutations of Treatment Labels and TITAN2 Analysis

Permute treatment labels for taxa and environmental gradients to generate an empirical distribution of change points. This is an extension for the 'TITAN2' package <https://cran.r-project.org/package=TITAN2>.

Maintained by Peter DeWitt. Last updated 3 years ago.

epa-unknown

16.8 match 1 stars 3.70 score 7 scripts

thothorn

libcoin:Linear Test Statistics for Permutation Inference

Basic infrastructure for linear test statistics and permutation inference in the framework of Strasser and Weber (1999) <https://epub.wu.ac.at/102/>. This package must not be used by end-users. CRAN package 'coin' implements all user interfaces and is ready to be used by anyone.

Maintained by Torsten Hothorn. Last updated 1 years ago.

openblas

9.1 match 1 stars 6.81 score 25 scripts 171 dependents

andreyshabalin

shiftR:Fast Enrichment Analysis via Circular Permutations

Fast enrichment analysis for locally correlated statistics via circular permutations. The analysis can be performed at multiple significance thresholds for both primary and auxiliary data sets with efficient correction for multiple testing.

Maintained by Andrey A Shabalin. Last updated 6 years ago.

15.1 match 1 stars 4.04 score 11 scripts

bioc

COCOA:Coordinate Covariation Analysis

COCOA is a method for understanding epigenetic variation among samples. COCOA can be used with epigenetic data that includes genomic coordinates and an epigenetic signal, such as DNA methylation and chromatin accessibility data. To describe the method on a high level, COCOA quantifies inter-sample variation with either a supervised or unsupervised technique then uses a database of "region sets" to annotate the variation among samples. A region set is a set of genomic regions that share a biological annotation, for instance transcription factor (TF) binding regions, histone modification regions, or open chromatin regions. COCOA can identify region sets that are associated with epigenetic variation between samples and increase understanding of variation in your data.

Maintained by John Lawson. Last updated 5 months ago.

epigenetics dnamethylation atacseq dnaseseq methylseq methylationarray principalcomponent genomicvariation generegulation genomeannotation systemsbiology functionalgenomics chipseq sequencing immunooncology dna-methylation pca

8.6 match 10 stars 7.02 score 21 scripts

cecileproust-lima

lcmm:Extended Mixed Models Using Latent Classes and Latent Processes

Estimation of various extensions of the mixed models including latent class mixed models, joint latent class mixed models, mixed models for curvilinear outcomes, mixed models for multivariate longitudinal outcomes using a maximum likelihood estimation method (Proust-Lima, Philipps, Liquet (2017) <doi:10.18637/jss.v078.i02>).

Maintained by Cecile Proust-Lima. Last updated 1 months ago.

fortran

5.3 match 62 stars 11.41 score 249 scripts 7 dependents

dicook

nullabor:Tools for Graphical Inference

Tools for visual inference. Generate null data sets and null plots using permutation and simulation. Calculate distance metrics for a lineup, and examine the distributions of metrics.

Maintained by Di Cook. Last updated 1 months ago.

5.8 match 57 stars 10.38 score 370 scripts 2 dependents

uscbiostats

partition:Agglomerative Partitioning Framework for Dimension Reduction

A fast and flexible framework for agglomerative partitioning. 'partition' uses an approach called Direct-Measure-Reduce to create new variables that maintain the user-specified minimum level of information. Each reduced variable is also interpretable: the original variables map to one and only one variable in the reduced data set. 'partition' is flexible, as well: how variables are selected to reduce, how information loss is measured, and the way data is reduced can all be customized. 'partition' is based on the Partition framework discussed in Millstein et al. (2020) <doi:10.1093/bioinformatics/btz661>.

Maintained by Malcolm Barrett. Last updated 4 months ago.

data-reduction dimensionality-reduction partitional-clustering openblas cpp

7.8 match 36 stars 7.72 score 27 scripts 1 dependents

bioc

AlpsNMR:Automated spectraL Processing System for NMR

Reads Bruker NMR data directories both zipped and unzipped. It provides automated and efficient signal processing for untargeted NMR metabolomics. It is able to interpolate the samples, detect outliers, exclude regions, normalize, detect peaks, align the spectra, integrate peaks, manage metadata and visualize the spectra. After spectra processing, it can apply multivariate analysis on extracted data. Efficient plotting with 1-D data is also available. Basic reading of 1D ACD/Labs exported JDX samples is also available.

Maintained by Sergio Oller Moreno. Last updated 5 months ago.

software preprocessing visualization classification cheminformatics metabolomics dataimport

7.9 match 15 stars 7.59 score 12 scripts 1 dependents

ignaciomsarmiento

RATest:Randomization Tests

A collection of randomization tests, data sets and examples. The current version focuses on three testing problems and their implementation in empirical work. First, it helps the empirical researcher test particular hypotheses, such as comparisons of means, medians, and variances from k populations, using robust permutation tests whose asymptotic validity holds under very weak assumptions, while retaining the exact rejection probability in finite samples when the underlying distributions are identical. Second, it provides the description and implementation of a permutation test for testing the continuity assumption of the baseline covariates in the sharp regression discontinuity design (RDD) as in Canay and Kamat (2017) <https://goo.gl/UZFqt7>. More specifically, it allows the user to select a set of covariates and test the aforementioned hypothesis using a permutation test based on the Cramer-von Mises test statistic. Graphical inspection of the empirical CDF and histograms for the variables of interest is also supported in the package. Third, it provides the practitioner with an effortless implementation of a permutation test based on the martingale decomposition of the empirical process for the goodness-of-fit testing problem with an estimated nuisance parameter. One application of this testing problem is testing for heterogeneous treatment effects in a randomized control trial.

Maintained by Mauricio Olivares-Gonzalez. Last updated 6 years ago.

13.0 match 6 stars 4.52 score 11 scripts

eriqande

gscramble:Simulating Admixed Genotypes Without Replacement

A genomic simulation approach for creating biologically informed individual genotypes from empirical data that 1) samples alleles from populations without replacement, 2) segregates alleles based on species-specific recombination rates. 'gscramble' is a flexible simulation approach that allows users to create pedigrees of varying complexity in order to simulate admixed genotypes. Furthermore, it allows users to track haplotype blocks from the source populations through the pedigrees.

Maintained by Eric C. Anderson. Last updated 1 years ago.

noaa-omics-software

11.8 match 4.83 score 15 scripts

brandmaier

pdc:Permutation Distribution Clustering

Permutation Distribution Clustering is a clustering method for time series. Dissimilarity of time series is formalized as the divergence between their permutation distributions. The permutation distribution was proposed as measure of the complexity of a time series.

Maintained by Andreas M. Brandmaier. Last updated 2 years ago.

10.1 match 6 stars 5.61 score 25 scripts 9 dependents

bioc

GSALightning:Fast Permutation-based Gene Set Analysis

GSALightning provides a fast implementation of permutation-based gene set analysis for the two-sample problem. This package is particularly useful when testing a large number of gene sets simultaneously, or when a large number of permutations is necessary for more accurate p-value estimation.

Maintained by Billy Heung Wing Chang. Last updated 5 months ago.

software biologicalquestion genesetenrichment differentialexpression geneexpression transcription

14.2 match 5 stars 4.00 score 4 scripts

hwborchers

pracma:Practical Numerical Math Functions

Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses 'MATLAB' function names where appropriate to simplify porting.

Maintained by Hans W. Borchers. Last updated 1 years ago.

4.5 match 29 stars 12.34 score 6.6k scripts 931 dependents

bioc

SeqGSEA:Gene Set Enrichment Analysis (GSEA) of RNA-Seq Data: integrating differential expression and splicing

The package generally provides methods for gene set enrichment analysis of high-throughput RNA-Seq data by integrating differential expression and splicing. It uses negative binomial distribution to model read count data, which accounts for sequencing biases and biological variation. Based on permutation tests, statistical significance can also be achieved regarding each gene's differential expression and splicing, respectively.

Maintained by Xi Wang. Last updated 5 months ago.

sequencing rnaseq genesetenrichment geneexpression differentialexpression differentialsplicing immunooncology

12.8 match 4.34 score 11 scripts

qddyy

LearnNonparam:'R6'-Based Flexible Framework for Permutation Tests

Implements non-parametric tests from Higgins (2004, ISBN:0534387756), including tests for one sample, two samples, k samples, paired comparisons, blocked designs, trends and association. Built with 'Rcpp' for efficiency and 'R6' for flexible, object-oriented design, the package provides a unified framework for performing or creating custom permutation tests.

Maintained by Yan Du. Last updated 1 months ago.

hypothesis-test nonparametric-statistics permutation-test cpp

10.9 match 6 stars 5.01 score 2 scripts

cran

GUniFrac:Generalized UniFrac Distances, Distance-Based Multivariate Methods and Feature-Based Univariate Methods for Microbiome Data Analysis

A suite of methods for powerful and robust microbiome data analysis including data normalization, data simulation, community-level association testing and differential abundance analysis. It implements generalized UniFrac distances, Geometric Mean of Pairwise Ratios (GMPR) normalization, semiparametric data simulator, distance-based statistical methods, and feature-based statistical methods. The distance-based statistical methods include three extensions of PERMANOVA: (1) PERMANOVA using the Freedman-Lane permutation scheme, (2) PERMANOVA omnibus test using multiple matrices, and (3) analytical approach to approximating PERMANOVA p-value. Feature-based statistical methods include linear model-based methods for differential abundance analysis of zero-inflated high-dimensional compositional data.

Maintained by Jun Chen. Last updated 2 years ago.

cpp

9.2 match 5.96 score 277 scripts 7 dependents

sritchie73

NetRep:Permutation Testing Network Module Preservation Across Datasets

Functions for assessing the replication/preservation of a network module's topology across datasets through permutation testing; Ritchie et al. (2015) <doi:10.1016/j.cels.2016.06.012>.

Maintained by Scott Ritchie. Last updated 4 years ago.

openblas cpp

8.0 match 12 stars 6.84 score 16 scripts 3 dependents

swampthingpaul

NADA2:Data Analysis for Censored Environmental Data

Contains methods described by Dennis Helsel in his book "Statistics for Censored Environmental Data using Minitab and R" (2011) and courses and videos at <https://practicalstats.com>. This package adds new functions to the `NADA` Package.

Maintained by Paul Julian. Last updated 6 months ago.

8.8 match 15 stars 6.16 score 16 scripts

declaredesign

randomizr:Easy-to-Use Tools for Common Forms of Random Assignment and Sampling

Generates random assignments for common experimental designs and random samples for common sampling designs.

Maintained by Alexander Coppock. Last updated 1 months ago.

5.5 match 37 stars 9.90 score 396 scripts 13 dependents

mlcollyer

RRPP:Linear Model Evaluation with Randomized Residuals in a Permutation Procedure

Linear model calculations are made for many random versions of data. Using residual randomization in a permutation procedure, sums of squares are calculated over many permutations to generate empirical probability distributions for evaluating model effects. Additionally, coefficients, statistics, fitted values, and residuals generated over many permutations can be used for various procedures including pairwise tests, prediction, classification, and model comparison. This package should provide most tools one could need for the analysis of high-dimensional data, especially in ecology and evolutionary biology, but certainly other fields, as well.

Maintained by Michael Collyer. Last updated 25 days ago.

5.5 match 4 stars 9.84 score 173 scripts 7 dependents

phamdn

peramo:Permutation Tests for Randomization Model

Perform permutation-based hypothesis testing for randomized experiments as suggested in Ludbrook & Dudley (1998) <doi:10.2307/2685470> and Ernst (2004) <doi:10.1214/088342304000000396>, introduced in Pham et al. (2022) <doi:10.1016/j.chemosphere.2022.136736>.

Maintained by Duy Nghia Pham. Last updated 7 months ago.

17.9 match 3.00 score

mbrueckner

permGS:Permutational Group Sequential Test for Time-to-Event Data

Permutational group-sequential tests for time-to-event data based on the log-rank test statistic. Supports exact permutation test when the censoring distributions are equal in the treatment and the control group and approximate imputation-permutation methods when the censoring distributions are different.

Maintained by Matthias Brueckner. Last updated 6 years ago.

permutation-test statistics survival-analysis

19.5 match 2.70 score 8 scripts

cran

sparcl:Perform Sparse Hierarchical Clustering and Sparse K-Means Clustering

Implements the sparse clustering methods of Witten and Tibshirani (2010): "A framework for feature selection in clustering"; published in Journal of the American Statistical Association 105(490): 713-726.

Maintained by Daniela Witten. Last updated 6 years ago.

fortran

12.5 match 1 stars 4.20 score 133 scripts 4 dependents

pecanproject

PEcAn.data.atmosphere:PEcAn Functions Used for Managing Climate Driver Data

The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The PEcAn.data.atmosphere package converts climate driver data into a standard format for models integrated into PEcAn. As a standalone package, it provides an interface to access diverse climate data sets.

Maintained by David LeBauer. Last updated 14 hours ago.

bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants

4.5 match 216 stars 11.59 score 64 scripts 14 dependents

dongwenluo

predictmeans:Predicted Means for Linear and Semiparametric Models

Providing functions to diagnose and make inferences from various linear models, such as those obtained from 'aov', 'lm', 'glm', 'gls', 'lme', 'lmer', 'glmmTMB' and 'semireg'. Inferences include predicted means and standard errors, contrasts, multiple comparisons, permutation tests, adjusted R-square and graphs.

Maintained by Dongwen Luo. Last updated 11 months ago.

8.2 match 2 stars 6.26 score 152 scripts 2 dependents

bioc

jazzPanda:Finding spatially relevant marker genes in image based spatial transcriptomics data

This package contains the function to find marker genes for image-based spatial transcriptomics data. There are functions to create spatial vectors from the cell and transcript coordinates, which are passed as inputs to find marker genes. Marker genes are detected for every cluster by two approaches. The first approach is permutation testing, which is implemented in parallel for finding marker genes in a one-sample study. The other approach is to build a linear model for every gene. This approach can account for multiple samples and background noise.

Maintained by Melody Jin. Last updated 13 days ago.

spatial geneexpression differentialexpression statisticalmethod transcriptomics correlation linear-models marker-genes spatial-transcriptomics

10.3 match 2 stars 5.00 score

sonsoleslp

tna:Transition Network Analysis (TNA)

Provides tools for performing Transition Network Analysis (TNA) to study relational dynamics, including functions for building and plotting TNA models, calculating centrality measures, and identifying dominant events and patterns. TNA statistical techniques (e.g., bootstrapping and permutation tests) ensure the reliability of observed insights and confirm that identified dynamics are meaningful. See (Saqr et al., 2025) <doi:10.1145/3706468.3706513> for more details on TNA.

Maintained by Sonsoles López-Pernas. Last updated 23 hours ago.

educational-data-mining learning-analytics markov-model temporal-analysis

7.9 match 4 stars 6.48 score 5 scripts

toshi-ara

brunnermunzel:(Permuted) Brunner-Munzel Test

Provides functions for the Brunner-Munzel test and the permuted Brunner-Munzel test, which accept a formula, matrix, or table as an argument. These functions are based on Brunner and Munzel (2000) <doi:10.1002/(SICI)1521-4036(200001)42:1%3C17::AID-BIMJ17%3E3.0.CO;2-U> and Neubert and Brunner (2007) <doi:10.1016/j.csda.2006.05.024>, and are written in Fortran.

Maintained by Toshiaki Ara. Last updated 3 years ago.

fortran

8.7 match 5 stars 5.83 score 30 scripts 1 dependents

rqtl

qtl2:Quantitative Trait Locus Mapping in Experimental Crosses

Provides a set of tools to perform quantitative trait locus (QTL) analysis in experimental crosses. It is a reimplementation of the 'R/qtl' package to better handle high-dimensional data and complex cross designs. Broman et al. (2019) <doi:10.1534/genetics.118.301595>.

Maintained by Karl W Broman. Last updated 7 days ago.

cpp

5.3 match 34 stars 9.48 score 1.1k scripts 5 dependents

bioc

distinct:distinct: a method for differential analyses via hierarchical permutation tests

distinct is a statistical method to perform differential testing between two or more groups of distributions; differential testing is performed via hierarchical non-parametric permutation tests on the cumulative distribution functions (cdfs) of each sample. While most methods for differential expression target differences in the mean abundance between conditions, distinct, by comparing full cdfs, identifies both differential patterns involving changes in the mean and more subtle variations that do not involve the mean (e.g., unimodal vs. bi-modal distributions with the same mean). distinct is a general and flexible tool: due to its fully non-parametric nature, which makes no assumptions on how the data was generated, it can be applied to a variety of datasets. It is particularly suitable to perform differential state analyses on single cell data (i.e., differential analyses within sub-populations of cells), such as single cell RNA sequencing (scRNA-seq) and high-dimensional flow or mass cytometry (HDCyto) data. To use distinct one needs data from two or more groups of samples (i.e., experimental conditions), with at least 2 samples (i.e., biological replicates) per group.

Maintained by Simone Tiberi. Last updated 5 months ago.

genetics rnaseq sequencing differentialexpression geneexpression multiplecomparison software transcription statisticalmethod visualization singlecell flowcytometry genetarget openblas cpp

7.8 match 11 stars 6.35 score 34 scripts 1 dependents

smn74

MANOVA.RM:Resampling-Based Analysis of Multivariate Data and Repeated Measures Designs

Implemented are various tests for semi-parametric repeated measures and general MANOVA designs that assume neither multivariate normality nor covariance homogeneity, i.e., the procedures are applicable for a wide range of general multivariate factorial designs. In addition to asymptotic inference methods, novel bootstrap and permutation approaches are implemented as well. These provide more accurate results in case of small to moderate sample sizes. Furthermore, post-hoc comparisons are provided for the multivariate analyses. Friedrich, S., Konietschke, F. and Pauly, M. (2019) <doi:10.32614/RJ-2019-051>.

Maintained by Sarah Friedrich. Last updated 1 months ago.

multivariate-data permutation repeated-measures resampling

10.5 match 11 stars 4.63 score 39 scripts

bstewart

stm:Estimation of the Structural Topic Model

The Structural Topic Model (STM) allows researchers to estimate topic models with document-level covariates. The package also includes tools for model selection, visualization, and estimation of topic-covariate regressions. Methods developed in Roberts et al. (2014) <doi:10.1111/ajps.12103> and Roberts et al. (2016) <doi:10.1080/01621459.2016.1141684>. Vignette is Roberts et al. (2019) <doi:10.18637/jss.v091.i02>.

Maintained by Brandon Stewart. Last updated 1 years ago.

openblas cpp

3.8 match 404 stars 12.63 score 1.6k scripts 6 dependents

cran

sna:Tools for Social Network Analysis

A range of tools for social network analysis, including node and graph-level indices, structural distance and covariance methods, structural equivalence detection, network regression, random graph generation, and 2D/3D network visualization.

Maintained by Carter T. Butts. Last updated 6 months ago.

6.9 match 8 stars 6.78 score 94 dependents

thothorn

exactRankTests:Exact Distributions for Rank and Permutation Tests

Computes exact conditional p-values and quantiles using an implementation of the Shift-Algorithm by Streitberg & Roehmel.

Maintained by Torsten Hothorn. Last updated 3 years ago.

6.5 match 1 stars 7.13 score 276 scripts 65 dependents

simsem

semTools:Useful Tools for Structural Equation Modeling

Provides miscellaneous tools for structural equation modeling, many of which extend the 'lavaan' package. For example, latent interactions can be estimated using product indicators (Lin et al., 2010, <doi:10.1080/10705511.2010.488999>) and simple effects probed; analytical power analyses can be conducted (Jak et al., 2021, <doi:10.3758/s13428-020-01479-0>); and scale reliability can be estimated based on estimated factor-model parameters.

Maintained by Terrence D. Jorgensen. Last updated 2 days ago.

3.4 match 79 stars 13.74 score 1.1k scripts 31 dependents

colbystatsvyrsch

CIPerm:Computationally-Efficient Confidence Intervals for Mean Shift from Permutation Methods

Implements computationally-efficient construction of confidence intervals from permutation or randomization tests for simple differences in means, based on Nguyen (2009) <doi:10.15760/etd.7798>.

Maintained by Jerzy Wieczorek. Last updated 3 years ago.

11.5 match 1 stars 4.00 score 7 scripts

pbreheny

ncvreg:Regularization Paths for SCAD and MCP Penalized Regression Models

Fits regularization paths for linear regression, GLM, and Cox regression models using lasso or nonconvex penalties, in particular the minimax concave penalty (MCP) and smoothly clipped absolute deviation (SCAD) penalty, with options for additional L2 penalties (the "elastic net" idea). Utilities for carrying out cross-validation as well as post-fitting visualization, summarization, inference, and prediction are also provided. For more information, see Breheny and Huang (2011) <doi:10.1214/10-AOAS388> or visit the ncvreg homepage <https://pbreheny.github.io/ncvreg/>.

Maintained by Patrick Breheny. Last updated 2 days ago.

3.8 match 43 stars 12.04 score 458 scripts 38 dependents

vanderleidebastiani

SYNCSA:Analysis of Functional and Phylogenetic Patterns in Metacommunities

Analysis of metacommunities based on functional traits and phylogeny of the community components. The functions that are offered here implement for the R environment methods that have been available in the SYNCSA application written in C++ (by Valerio Pillar, available at <http://ecoqua.ecologia.ufrgs.br/SYNCSA.html>).

Maintained by Vanderlei Julio Debastiani. Last updated 5 years ago.

8.5 match 3 stars 5.36 score 28 scripts 1 dependents

koalaverse

vip:Variable Importance Plots

A general framework for constructing variable importance plots from various types of machine learning models in R. Aside from some standard model-specific variable importance measures, this package also provides model-agnostic approaches that can be applied to any supervised learning algorithm. These include 1) an efficient permutation-based variable importance measure, 2) variable importance based on Shapley values (Strumbelj and Kononenko, 2014) <doi:10.1007/s10115-013-0679-x>, and 3) the variance-based approach described in Greenwell et al. (2018) <arXiv:1805.04755>. A variance-based method for quantifying the relative strength of interaction effects is also included (see the previous reference for details).

Maintained by Brandon M. Greenwell. Last updated 2 years ago.

interaction-effect machine-learning partial-dependence-plot supervised-learning-algorithms variable-importance variable-importance-plots

3.9 match 187 stars 11.61 score 3.5k scripts 6 dependents

zachmayer

caretEnsemble:Ensembles of Caret Models

Functions for creating ensembles of caret models: caretList() and caretStack(). caretList() is a convenience function for fitting multiple caret::train() models to the same dataset. caretStack() will make linear or non-linear combinations of these models, using a caret::train() model as a meta-model.

Maintained by Zachary A. Deane-Mayer. Last updated 3 months ago.

3.8 match 226 stars 11.92 score 780 scripts 1 dependents

bioc

EMDomics:Earth Mover's Distance for Differential Analysis of Genomics Data

The EMDomics algorithm is used to perform a supervised multi-class analysis to measure the magnitude and statistical significance of observed continuous genomics data between groups. Usually the data will be gene expression values from array-based or sequence-based experiments, but data from other types of experiments can also be analyzed (e.g. copy number variation). Traditional methods like Significance Analysis of Microarrays (SAM) and Linear Models for Microarray Data (LIMMA) use significance tests based on summary statistics (mean and standard deviation) of the distributions. This approach lacks power to identify expression differences between groups that show high levels of intra-group heterogeneity. The Earth Mover's Distance (EMD) algorithm instead computes the "work" needed to transform one distribution into another, thus providing a metric of the overall difference in shape between two distributions. Permutation of sample labels is used to generate q-values for the observed EMD scores. This package also incorporates the Kolmogorov-Smirnov (K-S) test and the Cramer von Mises test (CVM), which are both common distribution comparison tests.

Maintained by Sadhika Malladi. Last updated 5 months ago.

software differentialexpression geneexpression microarray

10.5 match 4.23 score 17 scripts

thomasp85

lime:Local Interpretable Model-Agnostic Explanations

When building complex models, it is often difficult to explain why the model should be trusted. While global measures such as accuracy are useful, they cannot be used for explaining why a model made a specific prediction. 'lime' (a port of the 'lime' 'Python' package) is a method for explaining the outcome of black box models by fitting a local model around the point in question and perturbations of this point. The approach is described in more detail in the article by Ribeiro et al. (2016) <arXiv:1602.04938>.

Maintained by Emil Hvitfeldt. Last updated 3 years ago.

caret model-checking model-evaluation modeling cpp

4.0 match 485 stars 11.07 score 732 scripts 1 dependents

veseshan

clinfun:Clinical Trial Design and Data Analysis Functions

Utilities to make your clinical collaborations easier if not fun. It contains functions for designing studies such as Simon 2-stage and group sequential designs and for data analysis such as Jonckheere-Terpstra test and estimating survival quantiles.

Maintained by Venkatraman E. Seshan. Last updated 1 years ago.

fortran

5.5 match 5 stars 7.86 score 124 scripts 8 dependents

cran

perm:Exact or Asymptotic Permutation Tests

Perform Exact or Asymptotic permutation tests [see Fay and Shaw <doi:10.18637/jss.v036.i02>].

Maintained by Michael P. Fay. Last updated 2 years ago.

9.0 match 4.83 score 118 scripts 9 dependents

soroushmdg

gwid:Genome-Wide Identity-by-Descent

Methods and tools for the analysis of Genome Wide Identity-by-Descent ('gwid') mapping data, focusing on testing whether there is a higher occurrence of Identity-By-Descent (IBD) segments around potential causal variants in cases compared to controls, which is crucial for identifying rare variants. To enhance its analytical power, 'gwid' incorporates a Sliding Window Approach, allowing for the detection and analysis of signals from multiple Single Nucleotide Polymorphisms (SNPs).

Maintained by Soroush Mahmoudiandehkordi. Last updated 6 months ago.

12.1 match 1 stars 3.60 score 4 scripts

bioc

twilight:Estimation of local false discovery rate

In a typical microarray setting with gene expression data observed under two conditions, the local false discovery rate describes the probability that a gene is not differentially expressed between the two conditions given its corresponding observed score or p-value level. The resulting curve of p-values versus local false discovery rate offers an insight into the twilight zone between clear differential and clear non-differential gene expression. Package 'twilight' contains two main functions: Function twilight.pval performs a two-condition test on differences in means for a given input matrix or expression set and computes permutation based p-values. Function twilight performs a stochastic downhill search to estimate local false discovery rates and effect size distributions. The package further provides means to filter for permutations that describe the null distribution correctly. Using filtered permutations, the influence of hidden confounders could be diminished.

Maintained by Stefanie Senger. Last updated 27 days ago.

microarray differentialexpression multiplecomparison

12.8 match 3.40 score 14 scripts 1 dependents

cwatson

brainGraph:Graph Theory Analysis of Brain MRI Data

A set of tools for performing graph theory analysis of brain MRI data. It works with data from a Freesurfer analysis (cortical thickness, volumes, local gyrification index, surface area), diffusion tensor tractography data (e.g., from FSL) and resting-state fMRI data (e.g., from DPABI). It contains a graphical user interface for graph visualization and data exploration, along with several functions for generating useful figures.

Maintained by Christopher G. Watson. Last updated 1 years ago.

brain-connectivity brain-imaging complex-networks connectome connectomics fmri graph-theory mri network-analysis neuroimaging neuroscience statistics tractography

5.4 match 188 stars 7.86 score 107 scripts 3 dependents

davidvandijcke

DiSCos:Distributional Synthetic Controls Estimation

The method of synthetic controls is a widely-adopted tool for evaluating causal effects of policy changes in settings with observational data. In many settings where it is applicable, researchers want to identify causal effects of policy changes on a treated unit at an aggregate level while having access to data at a finer granularity. This package implements a simple extension of the synthetic controls estimator, developed in Gunsilius (2023) <doi:10.3982/ECTA18260>, that takes advantage of this additional structure and provides nonparametric estimates of the heterogeneity within the aggregate unit. The idea is to replicate the quantile function associated with the treated unit by a weighted average of quantile functions of the control units. The package contains tools for aggregating and plotting the resulting distributional estimates, as well as for carrying out inference on them.

Maintained by David Van Dijcke. Last updated 2 days ago.

8.9 match 1 stars 4.81 score 8 scripts

moderndive

moderndive:Tidyverse-Friendly Introductory Linear Regression

Datasets and wrapper functions for tidyverse-friendly introductory linear regression, used in "Statistical Inference via Data Science: A ModernDive into R and the Tidyverse" available at <https://moderndive.com/>.

Maintained by Albert Y. Kim. Last updated 3 months ago.

3.8 match 88 stars 11.35 score 1.8k scripts

bioc

waddR:Statistical tests for detecting differential distributions based on the 2-Wasserstein distance

The package offers statistical tests based on the 2-Wasserstein distance for detecting and characterizing differences between two distributions given in the form of samples. Functions for calculating the 2-Wasserstein distance and testing for differential distributions are provided, as well as a specifically tailored test for differential expression in single-cell RNA sequencing data.

Maintained by Julian Flesch. Last updated 5 months ago.

software statisticalmethod singlecell differentialexpression cpp

6.3 match 25 stars 6.70 score 6 scripts

jamesramsay5

fda:Functional Data Analysis

These functions were developed to support functional data analysis as described in Ramsay, J. O. and Silverman, B. W. (2005) Functional Data Analysis. New York: Springer and in Ramsay, J. O., Hooker, Giles, and Graves, Spencer (2009). Functional Data Analysis with R and Matlab (Springer). The package includes data sets and script files that work through many examples, including all but one of the 76 figures in this latter book. Matlab versions are available by ftp from <https://www.psych.mcgill.ca/misc/fda/downloads/FDAfuns/>.

Maintained by James Ramsay. Last updated 4 months ago.

3.4 match 3 stars 12.29 score 2.0k scripts 143 dependents

neurodata

mgc:Multiscale Graph Correlation

Multiscale Graph Correlation (MGC) is a framework developed by Vogelstein et al. (2019) <DOI:10.7554/eLife.41690> that extends global correlation procedures to be multiscale; consequently, MGC tests typically require far fewer samples than existing methods for a wide variety of dependence structures and dimensionalities, while maintaining computational efficiency. Moreover, MGC provides a simple and elegant multiscale characterization of the potentially complex latent geometry underlying the relationship.

Maintained by Eric Bridgeford. Last updated 4 years ago.

5.6 match 9 stars 7.50 score 59 scripts 2 dependents

bioc

ChIPpeakAnno:Batch annotation of the peaks identified from either ChIP-seq, ChIP-chip experiments, or any experiments that result in large number of genomic interval data

The package encompasses a range of functions for identifying the closest gene, exon, miRNA, or custom features—such as highly conserved elements and user-supplied transcription factor binding sites. Additionally, users can retrieve sequences around the peaks and obtain enriched Gene Ontology (GO) or Pathway terms. In version 2.0.5 and beyond, new functionalities have been introduced. These include features for identifying peaks associated with bi-directional promoters along with summary statistics (peaksNearBDP), summarizing motif occurrences in peaks (summarizePatternInPeaks), and associating additional identifiers with annotated peaks or enrichedGO (addGeneIDs). The package integrates with various other packages such as biomaRt, IRanges, Biostrings, BSgenome, GO.db, multtest, and stat to enhance its analytical capabilities.

Maintained by Jianhong Ou. Last updated 2 months ago.

annotation chipseq chipchip

4.8 match 8.75 score 584 scripts 6 dependents

gagolews

stringi:Fast and Portable Character String Processing Facilities

A collection of character string/text/natural language processing tools for pattern searching (e.g., with 'Java'-like regular expressions or the 'Unicode' collation algorithm), random string generation, case mapping, string transliteration, concatenation, sorting, padding, wrapping, Unicode normalisation, date-time formatting and parsing, and many more. They are fast, consistent, convenient, and - thanks to 'ICU' (International Components for Unicode) - portable across all locales and platforms. Documentation about 'stringi' is provided via its website at <https://stringi.gagolewski.com/> and the paper by Gagolewski (2022, <doi:10.18637/jss.v103.i02>).

Maintained by Marek Gagolewski. Last updated 1 months ago.

icu icu4c natural-language-processing nlp regex regexp string-manipulation stringi stringr text text-processing tidy-data unicode cpp

2.3 match 309 stars 18.31 score 10k scripts 8.6k dependents

t-kalinowski

keras:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.

Maintained by Tomasz Kalinowski. Last updated 11 months ago.

3.8 match 10.82 score 10k scripts 54 dependents

josiahparry

sfdep:Spatial Dependence for Simple Features

An interface to 'spdep' to integrate with 'sf' objects and the 'tidyverse'.

Maintained by Dexter Locke. Last updated 6 months ago.

r-spatial spatial

5.8 match 130 stars 7.01 score 130 scripts

cran

asnipe:Animal Social Network Inference and Permutations for Ecologists

Implements several tools that are used in animal social network analysis, as described in Whitehead (2007) Analyzing Animal Societies <University of Chicago Press> and Farine & Whitehead (2015) <doi:10.1111/1365-2656.12418>. In particular, this package provides the tools to infer groups and generate networks from observation data, perform permutation tests on the data, calculate lagged association rates, and perform multiple regression analysis on social network data.

Maintained by Damien R. Farine. Last updated 1 years ago.

9.2 match 2 stars 4.36 score 173 scripts 2 dependents

schlosslab

mikropml:User-Friendly R Package for Supervised Machine Learning Pipelines

An interface to build machine learning models for classification and regression problems. 'mikropml' implements the ML pipeline described by Topçuoğlu et al. (2020) <doi:10.1128/mBio.00434-20> with reasonable default options for data preprocessing, hyperparameter tuning, cross-validation, testing, model evaluation, and interpretation steps. See the website <https://www.schlosslab.org/mikropml/> for more information, documentation, and examples.

Maintained by Kelly Sovacool. Last updated 2 years ago.

machine-learning

5.1 match 56 stars 7.83 score 86 scripts

zarquon42b

Morpho:Calculations and Visualisations Related to Geometric Morphometrics

A toolset for Geometric Morphometrics and mesh processing. This includes (among other stuff) mesh deformations based on reference points, permutation tests, detection of outliers, processing of sliding semi-landmarks and semi-automated surface landmark placement.

Maintained by Stefan Schlager. Last updated 5 months ago.

openblas cpp openmp

4.0 match 51 stars 10.00 score 218 scripts 13 dependents

adeverse

adespatial:Multivariate Multiscale Spatial Analysis

Tools for the multiscale spatial analysis of multivariate data. Several methods are based on the use of a spatial weighting matrix and its eigenvector decomposition (Moran's Eigenvectors Maps, MEM). Several approaches are described in the review Dray et al (2012) <doi:10.1890/11-1183.1>.

Maintained by Aurélie Siberchicot. Last updated 11 days ago.

openblas

3.6 match 36 stars 11.06 score 398 scripts 2 dependents

spatstat

spatstat.explore:Exploratory Data Analysis for the 'spatstat' Family

Functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.

Maintained by Adrian Baddeley. Last updated 1 months ago.

cluster-detection confidence-intervals hypothesis-testing k-function roc-curves scan-statistics significance-testing simulation-envelopes spatial-analysis spatial-data-analysis spatial-sharpening spatial-smoothing spatial-statistics

3.9 match 1 stars 10.17 score 67 scripts 148 dependents

fkeck

phylosignal:Exploring the Phylogenetic Signal in Continuous Traits

A collection of tools to explore the phylogenetic signal in univariate and multivariate data. The package provides functions to plot traits data against a phylogenetic tree, different measures and tests for the phylogenetic signal, methods to describe where the signal is located and a phylogenetic clustering method.

Maintained by Francois Keck. Last updated 1 years ago.

openblas cpp

5.4 match 16 stars 7.22 score 104 scripts

bioc

structToolbox:Data processing & analysis tools for Metabolomics and other omics

An extensive set of data (pre-)processing and analysis methods and tools for metabolomics and other omics, with a strong emphasis on statistics and machine learning. This toolbox allows the user to build extensive and standardised workflows for data analysis. The methods and tools have been implemented using class-based templates provided by the struct (Statistics in R Using Class-based Templates) package. The toolbox includes pre-processing methods (e.g. signal drift and batch correction, normalisation, missing value imputation and scaling), univariate (e.g. ttest, various forms of ANOVA, Kruskal–Wallis test and more) and multivariate statistical methods (e.g. PCA and PLS, including cross-validation and permutation testing) as well as machine learning methods (e.g. Support Vector Machines). The STATistics Ontology (STATO) has been integrated and implemented to provide standardised definitions for the different methods, inputs and outputs.

Maintained by Gavin Rhys Lloyd. Last updated 24 days ago.

workflowstep metabolomics bioconductor-package dims lc-ms machine-learning multivariate-analysis statistics univariate

6.3 match 10 stars 6.26 score 12 scripts

aloy

CarletonStats:Functions for Statistics Classes at Carleton College

Includes commands for bootstrapping and permutation tests, a command for creating grouped bar plots, and a demo of the quantile-normal plot for data drawn from different distributions.

Maintained by Adam Loy. Last updated 7 months ago.

10.3 match 3.81 score 65 scripts

prodriguezsosa

conText:'a la Carte' on Text (ConText) Embedding Regression

A fast, flexible and transparent framework to estimate context-specific word and short document embeddings using the 'a la carte' embeddings approach developed by Khodak et al. (2018) <arXiv:1805.05388> and evaluate hypotheses about covariate effects on embeddings using the regression framework developed by Rodriguez et al. (2021) <https://github.com/prodriguezsosa/EmbeddingRegression>.

Maintained by Pedro L. Rodriguez. Last updated 11 months ago.

4.1 match 104 stars 9.40 score 1.7k scripts

kbroman

broman:Karl Broman's R Code

Miscellaneous R functions, including functions related to graphics (mostly for base graphics), permutation tests, running mean/median, and general utilities.

Maintained by Karl W Broman. Last updated 10 months ago.

4.4 match 183 stars 8.80 score 648 scripts 1 dependents

dusadrian

admisc:Adrian Dusa's Miscellaneous

Contains functions used across packages 'DDIwR', 'QCA' and 'venn'. Interprets and translates, factorizes and negates SOP - Sum of Products expressions, for both binary and multi-value crisp sets, and extracts information (set names, set values) from those expressions. Other functions perform various other checks if possibly numeric (even if all numbers reside in a character vector) and coerce to numeric, or check if the numbers are whole. It also offers, among many others, a highly versatile recoding routine and some more flexible alternatives to the base functions 'with()' and 'within()'. SOP simplification functions in this package use related minimization from package 'QCA', which is recommended to be installed despite not being listed in the Imports field, due to circular dependency issues.

Maintained by Adrian Dusa. Last updated 2 days ago.

5.0 match 2 stars 7.61 score 20 scripts 92 dependents

ecospat

ecospat:Spatial Ecology Miscellaneous Methods

Collection of R functions and data sets for the support of spatial ecology analyses with a focus on pre, core and post modelling analyses of species distribution, niche quantification and community assembly. Written by current and former members and collaborators of the ecospat group of Antoine Guisan, Department of Ecology and Evolution (DEE) and Institute of Earth Surface Dynamics (IDYST), University of Lausanne, Switzerland. Read Di Cola et al. (2016) <doi:10.1111/ecog.02671> for details.

Maintained by Olivier Broennimann. Last updated 1 months ago.

4.0 match 32 stars 9.35 score 418 scripts 1 dependents

bioc

epistasisGA:An R package to identify multi-snp effects in nuclear family studies using the GADGETS method

This package runs the GADGETS method to identify epistatic effects in nuclear family studies. It also provides functions for permutation-based inference and graphical visualization of the results.

Maintained by Michael Nodzenski. Last updated 5 months ago.

genetics snp geneticvariability openblas cpp

8.3 match 1 stars 4.48 score 5 scripts

cdowd

twosamples:Fast Permutation Based Two Sample Tests

Fast randomization based two sample tests. Testing the hypothesis that two samples come from the same distribution using randomization to create p-values. Included tests are: Kolmogorov-Smirnov, Kuiper, Cramer-von Mises, Anderson-Darling, Wasserstein, and DTS. The default test (two_sample) is based on the DTS test statistic, as it is the most powerful, and thus most useful to most users. The DTS test statistic builds on the Wasserstein distance by using a weighting scheme like that of Anderson-Darling. See the companion paper at <arXiv:2007.01360> or <https://codowd.com/public/DTS.pdf> for details of that test statistic, and non-standard uses of the package (parallel for big N, weighted observations, one sample tests, etc). We also include the permutation scheme to make test building simple for others.

Maintained by Connor Dowd. Last updated 2 years ago.

distance-metric ecdf cpp

5.4 match 17 stars 6.88 score 62 scripts 8 dependents

egenn

rtemis:Machine Learning and Visualization

Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.

Maintained by E.D. Gennatas. Last updated 1 months ago.

data-science data-visualization machine-learning machine-learning-library visualization

5.3 match 145 stars 7.09 score 50 scripts 2 dependents

bioc

BioNAR:Biological Network Analysis in R

The R package BioNAR was developed for step-by-step analysis of PPI networks. The aim is to quantify and rank each protein’s simultaneous impact on multiple complexes based on network topology and clustering. The package also enables estimation of the co-occurrence of diseases across the network and specific clusters, pointing towards shared/common mechanisms.

Maintained by Anatoly Sorokin. Last updated 17 days ago.

software graphandnetwork network

6.3 match 3 stars 5.90 score 35 scripts

jangraffelman

HardyWeinberg:Statistical Tests and Graphics for Hardy-Weinberg Equilibrium

Contains tools for exploring Hardy-Weinberg equilibrium (Hardy, 1908; Weinberg, 1908) for bi and multi-allelic genetic marker data. All classical tests (chi-square, exact, likelihood-ratio and permutation tests) with bi-allelic variants are included in the package, as well as functions for power computation and for the simulation of marker data under equilibrium and disequilibrium. Routines for dealing with markers on the X-chromosome are included (Graffelman & Weir, 2016) <doi:10.1038/hdy.2016.20>, including Bayesian procedures. Some exact and permutation procedures also work with multi-allelic variants. Special test procedures that jointly address Hardy-Weinberg equilibrium and equality of allele frequencies in both sexes are supplied, for the bi and multi-allelic case. Functions for testing equilibrium in the presence of missing data by using multiple imputation are also provided. Implements several graphics for exploring the equilibrium status of a large set of bi-allelic markers: ternary plots with acceptance regions, log-ratio plots and Q-Q plots. The functionality of the package is explained in detail in a related JSS paper <doi:10.18637/jss.v064.i03>.

Maintained by Jan Graffelman. Last updated 11 months ago.

cpp

5.8 match 6.30 score 167 scripts 4 dependents

r-forge

zoo:S3 Infrastructure for Regular and Irregular Time Series (Z's Ordered Observations)

An S3 class with methods for totally ordered indexed observations. It is particularly aimed at irregular time series of numeric vectors/matrices and factors. zoo's key design goals are independence of a particular index/date/time class and consistency with ts and base R by providing methods to extend standard generics.

Maintained by Achim Zeileis. Last updated 12 days ago.

2.3 match 16.23 score 33k scripts 2.2k dependents

gjmvanboxtel

gsignal:Signal Processing

R implementation of the 'Octave' package 'signal', containing a variety of signal processing tools, such as signal generation and measurement, correlation and convolution, filtering, filter design, filter analysis and conversion, power spectrum analysis, system identification, decimation and sample rate change, and windowing.

Maintained by Geert van Boxtel. Last updated 2 months ago.

signal-processing signals cpp

3.6 match 24 stars 10.03 score 133 scripts 34 dependents

drewdstat

wqspt:Permutation Test for Weighted Quantile Sum Regression

Implements a permutation test method for the weighted quantile sum (WQS) regression, building off the 'gWQS' package (Renzetti et al. <https://CRAN.R-project.org/package=gWQS>). Weighted quantile sum regression is a statistical technique to evaluate the effect of complex exposure mixtures on an outcome (Carrico et al. 2015 <doi:10.1007/s13253-014-0180-3>). The model features a statistical power and Type I error (i.e., false positive) rate trade-off, as there is a machine learning step to determine the weights that optimize the linear model fit. This package provides an alternative method based on a permutation test that should reliably allow for both high power and low false positive rate when utilizing WQS regression (Day et al. 2022 <doi:10.1289/EHP10570>).

Maintained by Drew Day. Last updated 9 days ago.

9.0 match 4.00 score 2 scripts

cogdisreslab

KRSA:Kinome Random Sampling Analyzer

The goal of this package is to analyze the PamChip data and identify the changes in the active kinome. The package can preprocess the PamChip data output from BioNavigator and use Random Sampling and Permutation Analysis to identify upstream kinases. Additionally, this package provides a set of useful visualizations for the PamChip data.

Maintained by Ali Sajid Imami. Last updated 9 days ago.

kinase phosphatases pamchip kinome random sampling permutation analysis

8.0 match 4 stars 4.42 score 49 scripts

tidyverse

forcats:Tools for Working with Categorical Variables (Factors)

Helpers for reordering factor levels (including moving specified levels to front, ordering by first appearance, reversing, and randomly shuffling), and tools for modifying factor levels (including collapsing rare levels into other, 'anonymising', and manually 'recoding').

Maintained by Hadley Wickham. Last updated 1 years ago.

factor tidyverse

1.9 match 555 stars 18.77 score 21k scripts 1.2k dependents

radicalcommecol

cxr:A Toolbox for Modelling Species Coexistence in R

Recent developments in modern coexistence theory have advanced our understanding of how species are able to persist and co-occur with other species at varying abundances. However, applying this mathematical framework to empirical data is still challenging, precluding wider adoption by empiricists of the theoretical tools developed. This package provides a complete toolbox for modelling interaction effects between species and calculating fitness and niche differences. The functions are flexible, may accept covariates, and different fitting algorithms can be used. A full description of the underlying methods is available in García-Callejas, D., Godoy, O., and Bartomeus, I. (2020) <doi:10.1111/2041-210X.13443>. Furthermore, the package provides a series of functions to calculate dynamics for stage-structured populations across sites.

Maintained by David Garcia-Callejas. Last updated 1 months ago.

5.3 match 10 stars 6.51 score 27 scripts

nceas

codyn:Community Dynamics Metrics

Univariate and multivariate temporal and spatial diversity indices, rank abundance curves, and community stability measures. The functions implement measures that are either explicitly temporal and include the option to calculate them over multiple replicates, or spatial and include the option to calculate them over multiple time points. Functions fall into five categories: static diversity indices, temporal diversity indices, spatial diversity indices, rank abundance curves, and community stability measures. The diversity indices are temporal and spatial analogs to traditional diversity indices. Specifically, the package includes functions to calculate community richness, evenness and diversity at a given point in space and time. In addition, it contains functions to calculate species turnover, mean rank shifts, and lags in community similarity between two time points. Details of the methods are available in Hallett et al. (2016) <doi:10.1111/2041-210X.12569> and Avolio et al. (2019) <doi:10.1002/ecs2.2881>.

Maintained by Matthew B. Jones. Last updated 4 years ago.

3.8 match 34 stars 9.07 score 230 scripts

bioc

deltaCaptureC:This Package Discovers Meso-scale Chromatin Remodeling from 3C Data

This package discovers meso-scale chromatin remodelling from 3C data. 3C data is local in nature. It gives interaction counts between restriction enzyme digestion fragments and a preferred 'viewpoint' region. By binning this data and using permutation testing, this package can test whether there are statistically significant changes in the interaction counts between the data from two cell types or two treatments.

Maintained by Michael Shapiro. Last updated 5 months ago.

biologicalquestion statisticalmethod

9.9 match 3.48 score 1 scripts

bblonder

hypervolume:High Dimensional Geometry, Set Operations, Projection, and Inference Using Kernel Density Estimation, Support Vector Machines, and Convex Hulls

Estimates the shape and volume of high-dimensional datasets and performs set operations: intersection / overlap, union, unique components, inclusion test, and hole detection. Uses a stochastic geometry approach to high-dimensional kernel density estimation, support vector machine delineation, and convex hull generation. Applications include modeling trait and niche hypervolumes and species distribution modeling.

Maintained by Benjamin Blonder. Last updated 2 months ago.

openblas cpp

3.5 match 23 stars 9.75 score 211 scripts 7 dependents

ericarcher

rfPermute:Estimate Permutation p-Values for Random Forest Importance Metrics

Estimate significance of importance metrics for a Random Forest model by permuting the response variable. Produces a null distribution of importance metrics for each predictor variable and a p-value for the observed value. Provides summary and visualization functions for 'randomForest' results.

Maintained by Eric Archer. Last updated 2 years ago.

jags cpp

5.0 match 27 stars 6.77 score 96 scripts 1 dependents

dkahle

TITAN2:Threshold Indicator Taxa Analysis

Uses indicator species scores across binary partitions of a sample set to detect congruence in taxon-specific changes of abundance and occurrence frequency along an environmental gradient as evidence of an ecological community threshold. Relevant references include Baker and King (2010) <doi:10.1111/j.2041-210X.2009.00007.x>, King and Baker (2010) <doi:10.1899/09-144.1>, and Baker and King (2013) <doi:10.1899/12-142.1>.

Maintained by David Kahle. Last updated 1 year ago.

5.2 match 13 stars 6.59 score 30 scripts

prabhleenkaur19

aniSNA:Statistical Network Analysis of Animal Social Networks

Obtain network structures from animal GPS telemetry observations and statistically analyse them to assess their adequacy for social network analysis. Methods include pre-network data permutations, bootstrapping techniques to obtain confidence intervals for global and node-level network metrics, and correlation and regression analysis of the local network metrics.

Maintained by Prabhleen Kaur. Last updated 2 months ago.

cpp

10.7 match 3.18 score

r-forge

randtoolbox:Toolbox for Pseudo and Quasi Random Number Generation and Random Generator Tests

Provides (1) pseudo random generators - general linear congruential generators, multiple recursive generators and generalized feedback shift register (SF-Mersenne Twister algorithm (<doi:10.1007/978-3-540-74496-2_36>) and WELL (<doi:10.1145/1132973.1132974>) generators); (2) quasi random generators - the Torus algorithm, the Sobol sequence, the Halton sequence (including the Van der Corput sequence) and (3) some generator tests - the gap test, the serial test, the poker test, see, e.g., Gentle (2003) <doi:10.1007/b97336>. Take a look at the Distribution task view of types and tests of random number generators. The package can be provided without the 'rngWELL' dependency on demand. Package in Memoriam of Diethelm and Barbara Wuertz.

Maintained by Christophe Dutang. Last updated 3 months ago.

3.3 match 1 stars 10.23 score 578 scripts 80 dependents

biometris

douconca:Double Constrained Correspondence Analysis for Trait-Environment Analysis in Ecology

Double constrained correspondence analysis (dc-CA) analyzes (multi-)trait (multi-)environment ecological data by using the 'vegan' package and native R code. The two-step algorithm of ter Braak et al. (2018) is used throughout. This algorithm combines and extends community- (sample-) and species-level analyses, i.e. the usual community weighted means (CWM)-based regression analysis and the species-level analysis of species-niche centroids (SNC)-based regression analysis. The two steps use canonical correspondence analysis to regress the abundance data on to the traits and (weighted) redundancy analysis to regress the CWM of the orthonormalized traits on to the environmental predictors. The function dc_CA() has an option to divide the abundance data of a site by the site total, giving equal site weights. This division has the advantage that the multivariate analysis corresponds with an unweighted (multi-trait) community-level analysis, instead of being weighted. The first step of the algorithm uses vegan::cca(). The second step uses wrda() but vegan::rda() if the site weights are equal. This version has a predict() function. For details see ter Braak et al. 2018 <doi:10.1007/s10651-017-0395-x>.

Maintained by Bart-Jan van Rossum. Last updated 3 months ago.

correspondence-analysis ecology ecology-modeling multi-environment multi-trait

6.7 match 5.02 score 6 scripts

adrientaudiere

MiscMetabar:Miscellaneous Functions for Metabarcoding Analysis

Facilitate the description, transformation, exploration, and reproducibility of metabarcoding analyses. 'MiscMetabar' is mainly built on top of the 'phyloseq', 'dada2' and 'targets' R packages. It helps to build reproducible and robust bioinformatics pipelines in R. 'MiscMetabar' makes ecological analysis of alpha and beta-diversity easier, more reproducible and more powerful by integrating a large number of tools. Important features are described in Taudière A. (2023) <doi:10.21105/joss.06038>.

Maintained by Adrien Taudière. Last updated 24 days ago.

sequencing microbiome metagenomics clustering classification visualization amplicon amplicon-sequencing biodiversity-informatics ecology illumina metabarcoding ngs-analysis

5.2 match 17 stars 6.44 score 23 scripts

cran

statcomp:Statistical Complexity and Information Measures for Time Series Analysis

An implementation of local and global statistical complexity measures (aka Information Theory Quantifiers, ITQ) for time series analysis based on ordinal statistics (Bandt and Pompe (2002) <DOI:10.1103/PhysRevLett.88.174102>). Several distance measures that operate on ordinal pattern distributions, auxiliary functions for ordinal pattern analysis, and generating functions for stochastic and deterministic-chaotic processes for ITQ testing are provided.

Maintained by Sebastian Sippel. Last updated 5 years ago.

9.8 match 4 stars 3.41 score 72 scripts 1 dependents
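The ordinal statistics of Bandt and Pompe mentioned in the statcomp description can be illustrated with a short sketch of permutation entropy. This is Python rather than R and does not reproduce the package's functions. The names `ordinal_pattern_distribution` and `permutation_entropy` are hypothetical, and resolving ties by position is an assumption of this sketch.

```python
import math
from collections import Counter


def ordinal_pattern_distribution(series, order=3):
    """Relative frequency of each ordinal pattern over sliding windows of length `order`."""
    if order < 2 or len(series) < order:
        raise ValueError("need order >= 2 and at least `order` observations")
    counts = Counter()
    for i in range(len(series) - order + 1):
        window = series[i:i + order]
        # The pattern is the index order that sorts the window; the stable
        # sort resolves ties by position (an assumption of this sketch).
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        counts[pattern] += 1
    total = sum(counts.values())
    return {pattern: c / total for pattern, c in counts.items()}


def permutation_entropy(series, order=3, normalize=True):
    """Shannon entropy of the ordinal pattern distribution, optionally scaled to [0, 1]."""
    dist = ordinal_pattern_distribution(series, order)
    h = -sum(p * math.log(p) for p in dist.values())
    if normalize:
        h /= math.log(math.factorial(order))  # maximum entropy over order! patterns
    return h


# A strictly increasing series has a single ordinal pattern, so its entropy is zero.
print(permutation_entropy(list(range(10)), order=3))
```

With `normalize=True` and `order=2`, the result equals the entropy in bits, because the normalizing constant is then log 2.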

bioc

mina:Microbial community dIversity and Network Analysis

An increasing number of microbiome datasets have been generated and analyzed with the help of rapidly developing sequencing technologies. At present, analysis of taxonomic profiling data is mainly conducted using composition-based methods, which ignore interactions between community members. In addition, the lack of efficient ways to compare microbial interaction networks has limited the study of community dynamics. To better understand how community diversity is affected by complex interactions between its members, we developed Microbial community dIversity and Network Analysis (mina), a comprehensive framework for microbial community diversity analysis and network comparison. By defining and integrating network-derived community features, we greatly reduce the noise-to-signal ratio for diversity analyses. A bootstrap and permutation-based method was implemented to assess community network dissimilarities and extract discriminative features in a statistically principled way.

Maintained by Rui Guan. Last updated 5 months ago.

software workflowstep cpp

6.8 match 5 stars 4.85 score 14 scripts

bioc

singscore:Rank-based single-sample gene set scoring method

A simple single-sample gene signature scoring method that uses rank-based statistics to analyze the sample's gene expression profile. It scores the expression activities of gene sets at a single-sample level.

Maintained by Malvika Kharbanda. Last updated 5 months ago.

software geneexpression genesetenrichment bioinformatics

3.3 match 41 stars 10.03 score 124 scripts 4 dependents

biooss

sensitivity:Global Sensitivity Analysis of Model Outputs and Importance Measures

A collection of functions for sensitivity analysis of model outputs (factor screening, global sensitivity analysis and robustness analysis), for variable importance measures of data, as well as for interpretability of machine learning models. Most of the functions have to be applied on scalar output, but several functions support multi-dimensional outputs.

Maintained by Bertrand Iooss. Last updated 7 months ago.

cpp

4.9 match 17 stars 6.74 score 472 scripts 8 dependents

bioc

LimROTS:A Hybrid Method Integrating Empirical Bayes and Reproducibility-Optimized Statistics for Robust Analysis of Proteomics and Metabolomics Data

Differential expression analysis is a prevalent method utilised in the examination of diverse biological data. The reproducibility-optimized test statistic (ROTS) modifies a t-statistic based on the data's intrinsic characteristics and ranks features according to their statistical significance for differential expression between two or more groups (f-statistic). Focussing on proteomics and metabolomics, the current ROTS implementation cannot account for technical or biological covariates such as MS batches or gender differences among the samples. Consequently, we developed LimROTS, which employs a reproducibility-optimized test statistic utilising the limma methodology to simulate complex experimental designs. LimROTS is a hybrid method integrating empirical Bayes and reproducibility-optimized statistics for robust analysis of proteomics and metabolomics data.

Maintained by Ali Mostafa Anwar. Last updated 3 months ago.

software geneexpression differentialexpression microarray rnaseq proteomics immunooncology metabolomics mrnamicroarray

7.0 match 1 stars 4.70 score 1 scripts

iandryden

shapes:Statistical Shape Analysis

Routines for the statistical analysis of landmark shapes, including Procrustes analysis, graphical displays, principal components analysis, permutation and bootstrap tests, thin-plate spline transformation grids and comparing covariance matrices. See Dryden, I.L. and Mardia, K.V. (2016). Statistical shape analysis, with Applications in R (2nd Edition), John Wiley and Sons.

Maintained by Ian Dryden. Last updated 4 months ago.

3.8 match 7 stars 8.50 score 225 scripts 24 dependents

bnaras

PMA:Penalized Multivariate Analysis

Performs Penalized Multivariate Analysis: a penalized matrix decomposition, sparse principal components analysis, and sparse canonical correlation analysis, described in Witten, Tibshirani and Hastie (2009) <doi:10.1093/biostatistics/kxp008> and Witten and Tibshirani (2009) Extensions of sparse canonical correlation analysis, with applications to genomic data <doi:10.2202/1544-6115.1470>.

Maintained by Balasubramanian Narasimhan. Last updated 1 year ago.

cpp

4.5 match 4 stars 7.24 score 254 scripts 11 dependents

mllg

checkmate:Fast and Versatile Argument Checks

Tests and assertions to perform frequent argument checks. A substantial part of the package was written in C to minimize any worries about execution time overhead.

Maintained by Michel Lang. Last updated 8 months ago.

assertions testthat

2.0 match 276 stars 16.28 score 1.5k scripts 1.9k dependents

rstudio

tfprobability:Interface to 'TensorFlow Probability'

Interface to 'TensorFlow Probability', a 'Python' library built on 'TensorFlow' that makes it easy to combine probabilistic models and deep learning on modern hardware ('TPU', 'GPU'). 'TensorFlow Probability' includes a wide selection of probability distributions and bijectors, probabilistic layers, variational inference, Markov chain Monte Carlo, and optimizers such as Nelder-Mead, BFGS, and SGLD.

Maintained by Tomasz Kalinowski. Last updated 3 years ago.

3.8 match 54 stars 8.63 score 221 scripts 3 dependents

jfukuyama

phyloseqGraphTest:Graph-Based Permutation Tests for Microbiome Data

Provides functions for graph-based multiple-sample testing and visualization of microbiome data, in particular data stored in 'phyloseq' objects. The tests are based on those described in Friedman and Rafsky (1979) <http://www.jstor.org/stable/2958919>, and the tests are described in more detail in Callahan et al. (2016) <doi:10.12688/f1000research.8986.1>.

Maintained by Julia Fukuyama. Last updated 1 year ago.

6.7 match 4 stars 4.81 score 16 scripts

bioc

MicrobiotaProcess:A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework

MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces the MPSE class, which makes it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under a unified and common framework (tidy-like framework).

Maintained by Shuangbin Xu. Last updated 5 months ago.

visualization microbiome software multiplecomparison featureextraction microbiome-analysis microbiome-data

3.3 match 183 stars 9.70 score 126 scripts 1 dependents

bioc

BiocGenerics:S4 generic functions used in Bioconductor

The package defines many S4 generic functions used in Bioconductor.

Maintained by Hervé Pagès. Last updated 1 month ago.

infrastructure bioconductor-package core-package

2.3 match 12 stars 14.22 score 612 scripts 2.2k dependents

bioc

progeny:Pathway RespOnsive GENes for activity inference from gene expression

PROGENy is a resource that leverages a large compendium of publicly available signaling perturbation experiments to yield a common core of pathway responsive genes for human and mouse. These, coupled with any statistical method, can be used to infer pathway activities from bulk or single-cell transcriptomics.

Maintained by Aurélien Dugourd. Last updated 5 months ago.

systemsbiology geneexpression functionalprediction generegulation

3.6 match 99 stars 8.90 score 221 scripts 1 dependents

hfgolino

EGAnet:Exploratory Graph Analysis – a Framework for Estimating the Number of Dimensions in Multivariate Data using Network Psychometrics

Implements the Exploratory Graph Analysis (EGA) framework for dimensionality and psychometric assessment. EGA estimates the number of dimensions in psychological data using network estimation methods and community detection algorithms. A bootstrap method is provided to assess the stability of dimensions and items. Fit is evaluated using the Entropy Fit family of indices. Unique Variable Analysis evaluates the extent to which items are locally dependent (or redundant). Network loadings provide similar information to factor loadings and can be used to compute network scores. A bootstrap and permutation approach are available to assess configural and metric invariance. Hierarchical structures can be detected using Hierarchical EGA. Time series and intensive longitudinal data can be analyzed using Dynamic EGA, supporting individual, group, and population level assessments.

Maintained by Hudson Golino. Last updated 8 days ago.

4.1 match 47 stars 7.80 score 61 scripts 1 dependents

bioc

MPAC:Multi-omic Pathway Analysis of Cells

Multi-omic Pathway Analysis of Cells (MPAC) integrates multi-omic data for understanding cellular mechanisms. It predicts novel patient groups with distinct pathway profiles and identifies key pathway proteins with potential clinical associations. From CNA and RNA-seq data, it determines genes’ DNA and RNA states (i.e., repressed, normal, or activated), which serve as the input for PARADIGM to calculate Inferred Pathway Levels (IPLs). It also permutes DNA and RNA states to create a background distribution to filter IPLs as a way to remove events observed by chance. It provides multiple methods for downstream analysis and visualization.

Maintained by Peng Liu. Last updated 15 hours ago.

software technology sequencing rnaseq survival clustering immunooncology

7.5 match 4.20 score 1 scripts

jlessler

IDSpatialStats:Estimate Global Clustering in Infectious Disease

Implements various novel and standard clustering statistics and other analyses useful for understanding the spread of infectious disease.

Maintained by Justin Lessler. Last updated 8 months ago.

11.6 match 1 stars 2.69 score 33 scripts

dkahle

mpoly:Symbolic Computation and More with Multivariate Polynomials

Symbolic computing with multivariate polynomials in R.

Maintained by David Kahle. Last updated 4 months ago.

5.0 match 12 stars 6.25 score 70 scripts 7 dependents

khliland

HDANOVA:High-Dimensional Analysis of Variance

Functions and datasets to support Smilde, Marini, Westerhuis and Liland (2025, ISBN: 978-1-394-21121-0) "Analysis of Variance for High-Dimensional Data - Applications in Life, Food and Chemical Sciences". This implements and imports a collection of methods for HD-ANOVA data analysis with common interfaces, result- and plotting functions, multiple real data sets and four vignettes covering a range of different applications.

Maintained by Kristian Hovde Liland. Last updated 2 days ago.

7.1 match 4.35 score 8 scripts 1 dependents

cran

network:Classes for Relational Data

Tools to create and modify network objects. The network class can represent a range of relational data types, and supports arbitrary vertex/edge/graph attributes.

Maintained by Carter T. Butts. Last updated 3 months ago.

4.0 match 3 stars 7.65 score 146 dependents

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 year ago.

3.8 match 3 stars 8.20 score 7.8k scripts 11 dependents

bioc

INDEED:Interactive Visualization of Integrated Differential Expression and Differential Network Analysis for Biomarker Candidate Selection Package

An R package for integrated differential expression and differential network analysis based on omic data for cancer biomarker discovery. Both correlation and partial correlation can be used to generate a differential network that aids traditional differential expression analysis in identifying changes between biomolecules at both the expression and pairwise association levels. A detailed description of the methodology has been published in the journal Methods (PMID: 27592383). An interactive visualization feature allows for the exploration and selection of candidate biomarkers.

Maintained by Ressom group. Last updated 5 months ago.

immunooncology software researchfield biologicalquestion statisticalmethod differentialexpression massspectrometry metabolomics

5.1 match 4 stars 5.92 score 10 scripts

annavesely

sumSome:True Discovery Guarantee by Sum-Based Tests

Quickly performs closed testing by sum-based global tests and constructs lower confidence bounds for the TDP, simultaneously over all subsets of hypotheses. As main features, it produces permutation-based simultaneous lower confidence bounds for the proportion of active voxels in clusters for fMRI data, differentially expressed genes in pathways for gene expression data, and significant effects for multiverse analysis. Details may be found in Vesely et al. (2023) <doi:10.1093/jrsssb/qkad019> and Tian et al. (2022) <doi:10.1111/sjos.12614>.

Maintained by Anna Vesely. Last updated 2 months ago.

cpp

11.2 match 1 stars 2.70 score 3 scripts

salvatoremangiafico

rcompanion:Functions to Support Extension Education Program Evaluation

Functions and datasets to support Summary and Analysis of Extension Program Evaluation in R, and An R Companion for the Handbook of Biological Statistics. Vignettes are available at <https://rcompanion.org>.

Maintained by Salvatore Mangiafico. Last updated 30 days ago.

3.8 match 4 stars 8.01 score 2.4k scripts 5 dependents

bioc

DelayedTensor:R package for sparse and out-of-core arithmetic and decomposition of Tensor

DelayedTensor performs tensor arithmetic directly on DelayedArray objects. DelayedTensor provides generic functions related to tensor arithmetic and decomposition and dispatches them on the DelayedArray class. DelayedTensor also supports tensor contraction via the einsum function, which is inspired by numpy einsum.

Maintained by Koki Tsuyuzaki. Last updated 5 months ago.

software infrastructure datarepresentation dimensionreduction

6.3 match 4 stars 4.68 score 3 scripts

hanjunwei-lab

ICDS:Identification of Cancer Dysfunctional Subpathway with Omics Data

Identifies cancer dysfunctional sub-pathways by integrating gene expression, DNA methylation and copy number variation, and pathway topological information. 1) We first calculate the gene risk scores by integrating three kinds of data: DNA methylation, copy number variation, and gene expression. 2) Second, we perform a greedy search algorithm to identify the key dysfunctional sub-pathways within the pathways for which the discriminative scores were locally maximal. 3) Finally, a permutation test is used to calculate the statistical significance level for these key dysfunctional sub-pathways.

Maintained by Junwei Han. Last updated 8 months ago.

6.5 match 4.54 score 3 scripts

pboutros

bedr:Genomic Region Processing using Tools Such as 'BEDTools', 'BEDOPS' and 'Tabix'

Genomic region processing using open-source command line tools such as 'BEDTools', 'BEDOPS' and 'Tabix'. These tools offer scalable and efficient utilities to perform genome arithmetic, e.g., indexing, formatting and merging. The bedr API enhances access to these tools and offers additional utilities for genomic region processing.

Maintained by Paul C. Boutros. Last updated 6 years ago.

5.9 match 4.98 score 264 scripts 2 dependents

tidymodels

infer:Tidy Statistical Inference

The objective of this package is to perform inference using an expressive statistical grammar that coheres with the tidy design framework.

Maintained by Simon Couch. Last updated 6 months ago.

1.9 match 734 stars 15.69 score 3.5k scripts 17 dependents

wviechtb

metafor:Meta-Analysis Package for R

A comprehensive collection of functions for conducting meta-analyses in R. The package includes functions to calculate various effect sizes or outcome measures, fit equal-, fixed-, random-, and mixed-effects models to such data, carry out moderator and meta-regression analyses, and create various types of meta-analytical plots (e.g., forest, funnel, radial, L'Abbe, Baujat, bubble, and GOSH plots). For meta-analyses of binomial and person-time data, the package also provides functions that implement specialized methods, including the Mantel-Haenszel method, Peto's method, and a variety of suitable generalized linear (mixed-effects) models (i.e., mixed-effects logistic and Poisson regression models). Finally, the package provides functionality for fitting meta-analytic multivariate/multilevel models that account for non-independent sampling errors and/or true effects (e.g., due to the inclusion of multiple treatment studies, multiple endpoints, or other forms of clustering). Network meta-analyses and meta-analyses accounting for known correlation structures (e.g., due to phylogenetic relatedness) can also be conducted. An introduction to the package can be found in Viechtbauer (2010) <doi:10.18637/jss.v036.i03>.

Maintained by Wolfgang Viechtbauer. Last updated 22 hours ago.

meta-analysis mixed-effects multilevel-models multivariate

1.8 match 246 stars 16.30 score 4.9k scripts 92 dependents

vathymut

dsos:Dataset Shift with Outlier Scores

Test for no adverse shift in two-sample comparison when we have a training set, the reference distribution, and a test set. The approach is flexible and relies on a robust and powerful test statistic, the weighted AUC. Technical details are in Kamulete, V. M. (2021) <arXiv:1908.04000>. Modern notions of outlyingness such as trust scores and prediction uncertainty can be used as the underlying scores for example.

Maintained by Vathy M. Kamulete. Last updated 2 years ago.

data-drift data-validation dataset-shifts drift-detection machine-learning mlops model-monitoring model-validation performance-monitoring statistical-process-control statistical-tests

5.8 match 2 stars 5.08 score 40 scripts

cran

permutest:Run Permutation Tests and Construct Associated Confidence Intervals

Implements permutation tests for any test statistic and randomization scheme and constructs associated confidence intervals as described in Glazer and Stark (2024) <doi:10.48550/arXiv.2405.05238>.

Maintained by Amanda Glazer. Last updated 6 months ago.

17.1 match 1.70 score 2 scripts

ocbe-uio

permChacko:Chacko Test for Order-Restriction with Permutation

Implements an extension of the Chacko chi-square test for ordered vectors (Chacko, 1966, <https://www.jstor.org/stable/25051572>). Our extension brings the Chacko test to the computer age by implementing a permutation test to offer a numeric estimate of the p-value, which is particularly useful when the analytic solution is not available.

Maintained by Waldir Leoncio. Last updated 6 months ago.

6.8 match 4.30 score 3 scripts

merck

gMCPLite:Lightweight Graph Based Multiple Comparison Procedures

A lightweight fork of 'gMCP' with functions for graphically described multiple test procedures introduced in Bretz et al. (2009) <doi:10.1002/sim.3495> and Bretz et al. (2011) <doi:10.1002/bimj.201000239>. Implements a flexible function using 'ggplot2' to create multiplicity graph visualizations. Contains instructions for multiplicity graphs and graphical testing for group sequential design, described in Maurer and Bretz (2013) <doi:10.1080/19466315.2013.807748>, with necessary unit testing using 'testthat'.

Maintained by Nan Xiao. Last updated 1 year ago.

5.0 match 11 stars 5.79 score 14 scripts

alexchristensen

NetworkToolbox:Methods and Measures for Brain, Cognitive, and Psychometric Network Analysis

Implements network analysis and graph theory measures used in neuroscience, cognitive science, and psychology. Methods include various filtering methods and approaches such as threshold, dependency (Kenett, Tumminello, Madi, Gur-Gershgoren, Mantegna, & Ben-Jacob, 2010 <doi:10.1371/journal.pone.0015032>), Information Filtering Networks (Barfuss, Massara, Di Matteo, & Aste, 2016 <doi:10.1103/PhysRevE.94.062306>), and Efficiency-Cost Optimization (Fallani, Latora, & Chavez, 2017 <doi:10.1371/journal.pcbi.1005305>). Brain methods include the recently developed Connectome Predictive Modeling (see references in package). Also implements several network measures including local network characteristics (e.g., centrality), community-level network characteristics (e.g., community centrality), global network characteristics (e.g., clustering coefficient), and various other measures associated with the reliability and reproducibility of network analysis.

Maintained by Alexander Christensen. Last updated 2 years ago.

network-analysis

4.1 match 23 stars 6.99 score 101 scripts 4 dependents

dwarton

ecostats:Code and Data Accompanying the Eco-Stats Text (Warton 2022)

Functions and data supporting the Eco-Stats text (Warton, 2022, Springer), and solutions to exercises. Functions include tools for using simulation envelopes in diagnostic plots, and a function for diagnostic plots of multivariate linear models. Datasets mentioned in the package are included here (where not available elsewhere) and there is a vignette for each chapter of the text with solutions to exercises.

Maintained by David Warton. Last updated 1 year ago.

4.4 match 8 stars 6.58 score 53 scripts

bioc

Category:Category Analysis

A collection of tools for performing category (gene set enrichment) analysis.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

annotation go pathways genesetenrichment

3.6 match 7.93 score 183 scripts 16 dependents

talgalili

dendextend:Extending 'dendrogram' Functionality in R

Offers a set of functions for extending 'dendrogram' objects in R, letting you visualize and compare trees of 'hierarchical clusterings'. You can (1) Adjust a tree's graphical parameters - the color, size, type, etc of its branches, nodes and labels. (2) Visually and statistically compare different 'dendrograms' to one another.

Maintained by Tal Galili. Last updated 2 months ago.

1.7 match 154 stars 17.02 score 6.0k scripts 164 dependents

bioc

PolySTest:PolySTest: Detection of differentially regulated features. Combined statistical testing for data with few replicates and missing values

The complexity of high-throughput quantitative omics experiments often leads to low replicate numbers and many missing values. We implemented a new test to simultaneously consider missing values and quantitative changes, which we combined with well-performing statistical tests for high confidence detection of differentially regulated features. The package contains functions to run the test and to visualize the results.

Maintained by Veit Schwämmle. Last updated 4 months ago.

massspectrometry proteomics software differentialexpression

5.8 match 4.95 score 12 scripts

loelschlaeger

oeli:Utilities for Developing Data Science Software

Some general helper functions that I (and maybe others) find useful when developing data science software.

Maintained by Lennart Oelschläger. Last updated 4 months ago.

openblas cpp

5.3 match 2 stars 5.42 score 1 scripts 4 dependents

lukketotte

MultSurvTests:Permutation Tests for Multivariate Survival Analysis

Multivariate version of the two-sample Gehan and logrank tests, as described in L.J Wei & J.M Lachin (1984) and Persson et al. (2019).

Maintained by Lukas Arnroth. Last updated 4 years ago.

openblas cpp

10.4 match 2.70 score 2 scripts

mhahsler

arules:Mining Association Rules and Frequent Itemsets

Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.

Maintained by Michael Hahsler. Last updated 1 month ago.

arules association-rules frequent-itemsets

2.0 match 194 stars 13.99 score 3.3k scripts 28 dependents

bioc

GraphAlignment:GraphAlignment

Graph alignment is an extension package for the R programming environment which provides functions for finding an alignment between two networks based on link and node similarity scores. (J. Berg and M. Laessig, "Cross-species analysis of biological networks by Bayesian alignment", PNAS 103 (29), 10967-10972 (2006))

Maintained by Joern P. Meier. Last updated 5 months ago.

graphandnetwork network

7.1 match 3.90 score 9 scripts

gbm-developers

gbm:Generalized Boosted Regression Models

An implementation of extensions to Freund and Schapire's AdaBoost algorithm and Friedman's gradient boosting machine. Includes regression methods for least squares, absolute loss, t-distribution loss, quantile regression, logistic, multinomial logistic, Poisson, Cox proportional hazards partial likelihood, AdaBoost exponential loss, Huberized hinge loss, and Learning to Rank measures (LambdaMart). Originally developed by Greg Ridgeway. Newer version available at github.com/gbm-developers/gbm3.

Maintained by Greg Ridgeway. Last updated 9 months ago.

cpp

2.0 match 52 stars 13.85 score 6.8k scripts 91 dependents

michbur

biogram:N-Gram Analysis of Biological Sequences

Tools for extraction and analysis of various n-grams (k-mers) derived from biological sequences (proteins or nucleic acids). Contains QuiPT (quick permutation test) for fast feature-filtering of the n-gram data.

Maintained by Michal Burdukiewicz. Last updated 7 months ago.

biological-sequences ngram-analysis

3.6 match 10 stars 7.50 score 87 scripts 3 dependents

asgr

imager:Image Processing Library Based on 'CImg'

Fast image processing for images in up to 4 dimensions (two spatial dimensions, one time/depth dimension, one colour dimension). Provides most traditional image processing tools (filtering, morphology, transformations, etc.) as well as various functions for easily analysing image data using R. The package wraps 'CImg', <http://cimg.eu>, a simple, modern C++ library for image processing.

Maintained by Aaron Robotham. Last updated 25 days ago.

libx11 fftw3 tiff cpp openmp

2.0 match 17 stars 13.62 score 2.4k scripts 45 dependents

cran

jmuOutlier:Permutation Tests for Nonparametric Statistics

Performs a permutation test on the difference between two location parameters, a permutation correlation test, a permutation F-test, the Siegel-Tukey test, and a ratio mean deviance test. Also performs some graphing techniques, such as for confidence intervals, vector addition, and Fourier analysis; and includes functions related to the Laplace (double exponential) and triangular distributions. Performs power calculations for the binomial test.

Maintained by Steven T. Garren. Last updated 6 years ago.

12.1 match 2.26 score 1 dependents
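A permutation test on the difference between two location parameters, the first item in the jmuOutlier description, can be sketched as below. This is a generic illustration in Python, not the package's own function: the name `perm_test_mean_diff`, the choice of the mean as the location parameter, and the Monte Carlo sampling of permutations are all assumptions of this sketch.

```python
import random
from statistics import mean


def perm_test_mean_diff(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[:n_x]) - mean(pooled[n_x:]))
        if diff >= observed:
            hits += 1
    # Add-one correction: the observed labelling counts as one permutation.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: the second sample is the first shifted upwards by 5.
x = [10.1, 9.8, 10.4, 10.0, 9.9, 10.3, 10.2, 9.7]
y = [v + 5.0 for v in x]
print(perm_test_mean_diff(x, y))
```

Sampling random relabellings keeps the cost fixed at `n_perm` iterations; enumerating all splits would give an exact p-value but grows combinatorially with sample size.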

bioc

HEM:Heterogeneous error model for identification of differentially expressed genes under multiple conditions

This package fits heterogeneous error models for analysis of microarray data

Maintained by HyungJun Cho. Last updated 5 months ago.

microarray differentialexpression

6.3 match 4.30 score 6 scripts

metabocomp

MUVR2:Multivariate Methods with Unbiased Variable Selection

Predictive multivariate modelling for metabolomics. Types: Classification and regression. Methods: Partial Least Squares, Random Forest and Elastic Net. Data structures: Paired and unpaired. Validation: repeated double cross-validation (Westerhuis et al. (2008) <doi:10.1007/s11306-007-0099-6>, Filzmoser et al. (2009) <doi:10.1002/cem.1225>). Variable selection: Performed internally, through tuning in the inner cross-validation loop.

Maintained by Yingxiao Yan. Last updated 6 months ago.

7.1 match 1 stars 3.81 score 1 scripts

cran

AUtests:Approximate Unconditional and Permutation Tests

Performs approximate unconditional and permutation testing for 2x2 contingency tables. Motivated by testing for disease association with rare genetic variants in case-control studies. When variants are extremely rare, these tests give better control of Type I error than standard tests.

Maintained by Arjun Sondhi. Last updated 5 years ago.

13.4 match 2.00 score

aaamini

nett:Network Analysis and Community Detection

Features tools for network data analysis and community detection. Provides multiple methods for fitting, model selection and goodness-of-fit testing in degree-corrected stochastic block models. Most of the computations are fast and scalable for sparse networks, esp. for Poisson versions of the models. Implements the following: Amini, Chen, Bickel and Levina (2013) <doi:10.1214/13-AOS1138>; Bickel and Sarkar (2015) <doi:10.1111/rssb.12117>; Lei (2016) <doi:10.1214/15-AOS1370>; Wang and Bickel (2017) <doi:10.1214/16-AOS1457>; Zhang and Amini (2020) <arXiv:2012.15047>; Le and Levina (2022) <doi:10.1214/21-EJS1971>.

Maintained by Arash A. Amini. Last updated 2 years ago.

cpp

4.8 match 8 stars 5.48 score 19 scripts

cran

wPerm:Permutation Tests

Supplies permutation-test alternatives to traditional hypothesis-test procedures such as two-sample tests for means, medians, and standard deviations; correlation tests; tests for homogeneity and independence; and more. Suitable for general audiences, including individual and group users, introductory statistics courses, and more advanced statistics courses that desire an introduction to permutation tests.

Maintained by Neil A. Weiss. Last updated 9 years ago.

20.0 match 1.30 score

r-gregmisc

gdata:Various R Programming Tools for Data Manipulation

Various R programming tools for data manipulation, including medical unit conversions, combining objects, character vector operations, factor manipulation, obtaining information about R objects, generating fixed-width format files, extracting components of date & time objects, operations on columns of data frames, matrix operations, operations on vectors, operations on data frames, value of last evaluated expression, and a resample() wrapper for sample() that ensures consistent behavior for both scalar and vector arguments.

Maintained by Arni Magnusson. Last updated 2 months ago.

1.9 match 9 stars 13.62 score 4.5k scripts 124 dependents

cran

Compositional:Compositional Data Analysis

Regression, classification, contour plots, hypothesis testing and fitting of distributions for compositional data are some of the functions included. We further include functions for percentages (or proportions). The standard textbook for such data is John Aitchison's (1986) "The statistical analysis of compositional data". Relevant papers include: a) Tsagris M.T., Preston S. and Wood A.T.A. (2011). "A data-based power transformation for compositional data". Fourth International Workshop on Compositional Data Analysis. <doi:10.48550/arXiv.1106.1451>. b) Tsagris M. (2014). "The k-NN algorithm for compositional data: a revised approach with and without zero values present". Journal of Data Science, 12(3): 519--534. <doi:10.6339/JDS.201407_12(3).0008>. c) Tsagris M. (2015). "A novel, divergence based, regression for compositional data". Proceedings of the 28th Panhellenic Statistics Conference, 15-18 April 2015, Athens, Greece, 430--444. <doi:10.48550/arXiv.1511.07600>. d) Tsagris M. (2015). "Regression analysis with compositional data containing zero values". Chilean Journal of Statistics, 6(2): 47--57. <https://soche.cl/chjs/volumes/06/02/Tsagris(2015).pdf>. e) Tsagris M., Preston S. and Wood A.T.A. (2016). "Improved supervised classification for compositional data using the alpha-transformation". Journal of Classification, 33(2): 243--261. <doi:10.1007/s00357-016-9207-5>. f) Tsagris M., Preston S. and Wood A.T.A. (2017). "Nonparametric hypothesis testing for equality of means on the simplex". Journal of Statistical Computation and Simulation, 87(2): 406--422. <doi:10.1080/00949655.2016.1216554>. g) Tsagris M. and Stewart C. (2018). "A Dirichlet regression model for compositional data with zeros". Lobachevskii Journal of Mathematics, 39(3): 398--412. <doi:10.1134/S1995080218030198>. h) Alenazi A. (2019). "Regression for compositional data with compositional data as predictor variables with or without zero values". Journal of Data Science, 17(1): 219--238. <doi:10.6339/JDS.201901_17(1).0010>. i) Tsagris M. and Stewart C. (2020). "A folded model for compositional data analysis". Australian and New Zealand Journal of Statistics, 62(2): 249--277. <doi:10.1111/anzs.12289>. j) Alenazi A.A. (2022). "f-divergence regression models for compositional data". Pakistan Journal of Statistics and Operation Research, 18(4): 867--882. <doi:10.18187/pjsor.v18i4.3969>. k) Tsagris M. and Stewart C. (2022). "A Review of Flexible Transformations for Modeling Compositional Data". In Advances and Innovations in Statistics and Data Science, pp. 225--234. <doi:10.1007/978-3-031-08329-7_10>. l) Alenazi A. (2023). "A review of compositional data analysis and recent advances". Communications in Statistics--Theory and Methods, 52(16): 5535--5567. <doi:10.1080/03610926.2021.2014890>. m) Tsagris M., Alenazi A. and Stewart C. (2023). "Flexible non-parametric regression models for compositional response data with zeros". Statistics and Computing, 33(106). <doi:10.1007/s11222-023-10277-5>. n) Tsagris M. (2025). "Constrained least squares simplicial-simplicial regression". Statistics and Computing, 35(27). <doi:10.1007/s11222-024-10560-z>. o) Sevinc V. and Tsagris M. (2024). "Energy Based Equality of Distributions Testing for Compositional Data". <doi:10.48550/arXiv.2412.05199>.

Maintained by Michail Tsagris. Last updated 2 months ago.

7.0 match 3 stars 3.64 score 4 dependents

winvector

sigr:Succinct and Correct Statistical Summaries for Reports

Succinctly and correctly format statistical summaries of various models and tests (F-test, Chi-Sq-test, Fisher-test, T-test, and rank-significance). This package also includes empirical tests, such as Monte Carlo and bootstrap distribution estimates.

Maintained by John Mount. Last updated 2 years ago.

3.5 match 28 stars 7.18 score 97 scripts 1 dependents

stamats

MKinfer:Inferential Statistics

Computation of various confidence intervals (Altman et al. (2000), ISBN:978-0-727-91375-3; Hedderich and Sachs (2018), ISBN:978-3-662-56657-2) including bootstrapped versions (Davison and Hinkley (1997), ISBN:978-0-511-80284-3) as well as Hsu (Hedderich and Sachs (2018), ISBN:978-3-662-56657-2), permutation (Janssen (1997), <doi:10.1016/S0167-7152(97)00043-6>), bootstrap (Davison and Hinkley (1997), ISBN:978-0-511-80284-3), intersection-union (Sozu et al. (2015), ISBN:978-3-319-22005-5) and multiple imputation (Barnard and Rubin (1999), <doi:10.1093/biomet/86.4.948>) t-test; furthermore, computation of intersection-union z-test as well as multiple imputation Wilcoxon tests. Graphical visualization by volcano and Bland-Altman plots (Bland and Altman (1986), <doi:10.1016/S0140-6736(86)90837-8>; Shieh (2018), <doi:10.1186/s12874-018-0505-y>).

Maintained by Matthias Kohl. Last updated 11 months ago.

3.8 match 6 stars 6.56 score 71 scripts 4 dependents

winvector

wrapr:Wrap R Tools for Debugging and Parametric Programming

Tools for writing and debugging R code. Provides: '%.>%' dot-pipe (an 'S3' configurable pipe), unpack/to (R style multiple assignment/return), 'build_frame()'/'draw_frame()' ('data.frame' example tools), 'qc()' (quoting concatenate), ':=' (named map builder), 'let()' (converts non-standard evaluation interfaces to parametric standard evaluation interfaces, inspired by 'gtools::strmacro()' and 'base::bquote()'), and more.

Maintained by John Mount. Last updated 2 years ago.

2.3 match 137 stars 11.11 score 390 scripts 12 dependents

bioc

sRACIPE:Systems biology tool to simulate gene regulatory circuits

sRACIPE implements a randomization-based method for gene circuit modeling. It allows us to study the effect of both the gene expression noise and the parametric variation on any gene regulatory circuit (GRC) using only its topology, and simulates an ensemble of models with random kinetic parameters at multiple noise levels. Statistical analysis of the generated gene expressions reveals the basin of attraction and stability of various phenotypic states and their changes associated with intrinsic and extrinsic noises. sRACIPE provides a holistic picture to evaluate the effects of both the stochastic nature of cellular processes and the parametric variation.

Maintained by Mingyang Lu. Last updated 18 days ago.

researchfield systemsbiology mathematicalbiology geneexpression generegulation genetarget cpp

3.9 match 4 stars 6.40 score 209 scripts

daqana

dqrng:Fast Pseudo Random Number Generators

Several fast random number generators are provided as C++ header only libraries: The PCG family by O'Neill (2014 <https://www.cs.hmc.edu/tr/hmc-cs-2014-0905.pdf>) as well as the Xoroshiro / Xoshiro family by Blackman and Vigna (2021 <doi:10.1145/3460772>). In addition fast functions for generating random numbers according to a uniform, normal and exponential distribution are included. The latter two use the Ziggurat algorithm originally proposed by Marsaglia and Tsang (2000, <doi:10.18637/jss.v005.i08>). The fast sampling methods support unweighted sampling both with and without replacement. These functions are exported to R and as a C++ interface and are enabled for use with the default 64 bit generator from the PCG family, Xoroshiro128+/++/** and Xoshiro256+/++/** as well as the 64 bit version of the 20 rounds Threefry engine (Salmon et al., 2011, <doi:10.1145/2063384.2063405>) as provided by the package 'sitmo'.

Maintained by Ralf Stubner. Last updated 6 months ago.

random random-distributions random-generation random-sampling rng cpp

1.9 match 42 stars 13.12 score 188 scripts 183 dependents

pgiraudoux

pgirmess:Spatial Analysis and Data Mining for Field Ecologists

Set of tools for reading, writing and transforming spatial and seasonal data, model selection and specific statistical tests for ecologists. It includes functions to interpolate regular positions of points between landmarks, to discretize polylines into regular point positions, to link distant observations to points, and to convert a bounding box into a spatial object. It also provides miscellaneous functions for field ecologists, such as spatial statistics, inference on diversity indexes, and writing a data.frame with Chinese characters.

Maintained by Patrick Giraudoux. Last updated 1 year ago.

3.4 match 5 stars 7.32 score 422 scripts 2 dependents

tesselle

kairos:Analysis of Chronological Patterns from Archaeological Count Data

A toolkit for absolute and relative dating and analysis of chronological patterns. This package includes functions for chronological modeling and dating of archaeological assemblages from count data. It provides methods for matrix seriation. It also computes time point estimates and density estimates of the occupation and duration of an archaeological site.

Maintained by Nicolas Frerebeau. Last updated 11 days ago.

chronology matrix-seriation archaeology archaeological-science

5.3 match 4.66 score 11 scripts 1 dependents

mike-lawrence

ez:Easy Analysis and Visualization of Factorial Experiments

Facilitates easy analysis of factorial experiments, including purely within-Ss designs (a.k.a. "repeated measures"), purely between-Ss designs, and mixed within-and-between-Ss designs. The functions in this package aim to provide simple, intuitive and consistent specification of data analysis and visualization. Visualization functions also include design visualization for pre-analysis data auditing, and correlation matrix visualization. Finally, this package includes functions for non-parametric analysis, including permutation tests and bootstrap resampling. The bootstrap function obtains predictions either by cell means or by more advanced/powerful mixed effects models, yielding predictions and confidence intervals that may be easily visualized at any level of the experiment's design.

Maintained by Michael A. Lawrence. Last updated 8 years ago.

2.4 match 53 stars 10.28 score 2.7k scripts 12 dependents

gabrielgesteira

qtlpoly:Random-Effect Multiple QTL Mapping in Autopolyploids

Performs random-effect multiple interval mapping (REMIM) in full-sib families of autopolyploid species based on restricted maximum likelihood (REML) estimation and score statistics, as described in Pereira et al. (2020) <doi:10.1534/genetics.120.303080>.

Maintained by Gabriel de Siqueira Gesteira. Last updated 4 months ago.

polyploid qtl-mapping openblas cpp openmp

4.7 match 6 stars 5.17 score 61 scripts

choi-phd

lordif:Logistic Ordinal Regression Differential Item Functioning using IRT

Performs analysis of Differential Item Functioning (DIF) for dichotomous and polytomous items using an iterative hybrid of ordinal logistic regression and item response theory (IRT) according to Choi, Gibbons, and Crane (2011) <doi:10.18637/jss.v039.i08>.

Maintained by Seung W. Choi. Last updated 2 months ago.

4.8 match 1 stars 5.12 score 35 scripts 1 dependents

bioc

viper:Virtual Inference of Protein-activity by Enriched Regulon analysis

Inference of protein activity from gene expression data, including the VIPER and msVIPER algorithms.

Maintained by Mariano J Alvarez. Last updated 5 months ago.

systemsbiology networkenrichment geneexpression functionalprediction generegulation

3.5 match 7.00 score 342 scripts 5 dependents

henrikbengtsson

R.utils:Various Programming Utilities

Utility functions useful when programming and developing R packages.

Maintained by Henrik Bengtsson. Last updated 1 year ago.

1.8 match 63 stars 13.74 score 5.7k scripts 814 dependents

bioc

npGSEA:Permutation approximation methods for gene set enrichment analysis (non-permutation GSEA)

Current gene set enrichment methods rely upon permutations for inference. These approaches are computationally expensive and have minimum achievable p-values based on the number of permutations, not on the actual observed statistics. We have derived three parametric approximations to the permutation distributions of two gene set enrichment test statistics. We are able to reduce the computational burden and granularity issues of permutation testing with our method, which is implemented in this package. npGSEA calculates gene set enrichment statistics and p-values without the computational cost of permutations. It is applicable in settings where one or many gene sets are of interest. There are also built-in plotting functions to help users visualize results.

Maintained by Jessica Larson. Last updated 5 months ago.

genesetenrichment microarray statisticalmethod pathways

7.3 match 3.30 score 4 scripts