Showing 200 of 754 total results
gavinsimpson
permute:Functions for Generating Restricted Permutations of Data
A set of restricted permutation designs for freely exchangeable, line transects (time series), and spatial grid designs plus permutation of blocks (groups of samples) is provided. 'permute' also allows split-plot designs, in which the whole-plots or split-plots or both can be freely-exchangeable or one of the restricted designs. The 'permute' package is modelled after the permutation schemes of 'Canoco 3.1' (and later) by Cajo ter Braak.
Maintained by Gavin L. Simpson. Last updated 7 months ago.
permutation, restricted-permutations
111.0 match 23 stars 13.28 score 538 scripts 488 dependents
robinhankin
permutations:The Symmetric Group: Permutations of a Finite Set
Manipulates invertible functions from a finite set to itself. Can transform from word form to cycle form and back. To cite the package in publications please use Hankin (2020) "Introducing the permutations R package", SoftwareX, volume 11 <doi:10.1016/j.softx.2020.100453>.
Maintained by Robin K. S. Hankin. Last updated 1 months ago.
146.9 match 6 stars 8.23 score 49 scripts 2 dependents
kbroman
qtl:Tools for Analyzing QTL Experiments
Analysis of experimental crosses to identify genes (called quantitative trait loci, QTLs) contributing to variation in quantitative traits. Broman et al. (2003) <doi:10.1093/bioinformatics/btg112>.
Maintained by Karl W Broman. Last updated 7 months ago.
30.9 match 80 stars 12.79 score 2.4k scripts 29 dependents
vegandevs
vegan:Community Ecology Package
Ordination methods, diversity analysis and other functions for community and vegetation ecologists.
Maintained by Jari Oksanen. Last updated 15 days ago.
ecological-modelling, ecology, ordination, fortran, openblas
18.3 match 472 stars 19.41 score 15k scripts 440 dependents
jwood000
RcppAlgos:High Performance Tools for Combinatorics and Computational Mathematics
Provides optimized functions and flexible iterators implemented in C++ for solving problems in combinatorics and computational mathematics. Handles various combinatorial objects including combinations, permutations, integer partitions and compositions, Cartesian products, unordered Cartesian products, and partition of groups. Utilizes the RMatrix class from 'RcppParallel' for thread safety. The combination and permutation functions contain constraint parameters that allow for generation of all results of a vector meeting specific criteria (e.g. finding all combinations such that the sum is between two bounds). Capable of ranking/unranking combinatorial objects efficiently (e.g. retrieve only the nth lexicographical result) which sets up nicely for parallelization as well as random sampling. Gmp support permits exploration where the total number of results is large (e.g. comboSample(10000, 500, n = 4)). Additionally, there are several high performance number theoretic functions that are useful for problems common in computational mathematics. Some of these functions make use of the fast integer division library 'libdivide'. The primeSieve function is based on the segmented sieve of Eratosthenes implementation by Kim Walisch. It is also efficient for large numbers by using the cache friendly improvements originally developed by Tomás Oliveira. Finally, there is a prime counting function that implements Legendre's formula based on the work of Kim Walisch.
Maintained by Joseph Wood. Last updated 1 months ago.
combinations, combinatorics, factorization, number-theory, parallel, permutation, prime-factorizations, primesieve, gmp, cpp
34.3 match 45 stars 10.04 score 153 scripts 12 dependents
cvoeten
permutes:Permutation Tests for Time Series Data
Helps you determine the analysis window to use when analyzing densely-sampled time-series data, such as EEG data, using permutation testing (Maris & Oostenveld, 2007) <doi:10.1016/j.jneumeth.2007.03.024>. These permutation tests can help identify the timepoints where significance of an effect begins and ends, and the results can be plotted in various types of heatmap for reporting. Mixed-effects models are supported using an implementation of the approach by Lee & Braun (2012) <doi:10.1111/j.1541-0420.2011.01675.x>.
Maintained by Cesko C. Voeten. Last updated 2 years ago.
73.3 match 4.23 score 16 scripts
igraph
igraph:Network Analysis and Visualization
Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.
Maintained by Kirill Müller. Last updated 1 days ago.
complex-networks, graph-algorithms, graph-theory, mathematics, network-analysis, network-graph, fortran, libxml2, glpk, openblas, cpp
11.5 match 581 stars 21.10 score 31k scripts 1.9k dependents
mhahsler
seriation:Infrastructure for Ordering Objects Using Seriation
Infrastructure for ordering objects with an implementation of several seriation/sequencing/ordination techniques to reorder matrices, dissimilarity matrices, and dendrograms. Also provides (optimally) reordered heatmaps, color images and clustering visualizations like dissimilarity plots, and visual assessment of cluster tendency plots (VAT and iVAT). Hahsler et al (2008) <doi:10.18637/jss.v025.i03>.
Maintained by Michael Hahsler. Last updated 3 months ago.
combinatorial-optimization, ordination, seriation, fortran
17.0 match 77 stars 14.07 score 640 scripts 79 dependents
adeverse
ade4:Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences
Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) <doi:10.18637/jss.v022.i04>.
Maintained by Aurélie Siberchicot. Last updated 11 days ago.
11.3 match 39 stars 14.96 score 2.2k scripts 256 dependents
bioc
regioneR:Association analysis of genomic regions based on permutation tests
regioneR offers a statistical framework based on customizable permutation tests to assess the association between genomic region sets and other genomic features.
Maintained by Bernat Gel. Last updated 5 months ago.
genetics, chipseq, dnaseq, methylseq, copynumbervariation
17.8 match 9.00 score 2.7k scripts 21 dependents
bioc
methylInheritance:Permutation-Based Analysis associating Conserved Differentially Methylated Elements Across Multiple Generations to a Treatment Effect
Permutation analysis, based on Monte Carlo sampling, for testing the hypothesis that the number of conserved differentially methylated elements, between several generations, is associated to an effect inherited from a treatment and that stochastic effect can be dismissed.
Maintained by Astrid Deschênes. Last updated 5 months ago.
biologicalquestionepigeneticsdnamethylationdifferentialmethylationmethylseqsoftwareimmunooncologystatisticalmethodwholegenomesequencinganalysisbioconductorbioinformaticscpgdifferentially-methylated-elementsinheritancemonte-carlo-samplingpermutation
34.2 match 4.60 score 1 scripts
martinzaefferer
CEGO:Combinatorial Efficient Global Optimization
Model building, surrogate model based optimization and Efficient Global Optimization in combinatorial or mixed search spaces.
Maintained by Martin Zaefferer. Last updated 2 months ago.
50.8 match 1 stars 3.04 score 73 scripts
r-spatial
spdep:Spatial Dependence: Weighting Schemes, Statistics
A collection of functions to create spatial weights matrix objects from polygon 'contiguities', from point patterns by distance and tessellations, for summarizing these objects, and for permitting their use in spatial data analysis, including regional aggregation by minimum spanning tree; a collection of tests for spatial 'autocorrelation', including global 'Morans I' and 'Gearys C' proposed by 'Cliff' and 'Ord' (1973, ISBN: 0850860369) and (1981, ISBN: 0850860814), 'Hubert/Mantel' general cross product statistic, Empirical Bayes estimates and 'Assunção/Reis' (1999) <doi:10.1002/(SICI)1097-0258(19990830)18:16%3C2147::AID-SIM179%3E3.0.CO;2-I> Index, 'Getis/Ord' G ('Getis' and 'Ord' 1992) <doi:10.1111/j.1538-4632.1992.tb00261.x> and multicoloured join count statistics, 'APLE' ('Li et al.') <doi:10.1111/j.1538-4632.2007.00708.x>, local 'Moran's I', 'Gearys C' ('Anselin' 1995) <doi:10.1111/j.1538-4632.1995.tb00338.x> and 'Getis/Ord' G ('Ord' and 'Getis' 1995) <doi:10.1111/j.1538-4632.1995.tb00912.x>, 'saddlepoint' approximations ('Tiefelsdorf' 2002) <doi:10.1111/j.1538-4632.2002.tb01084.x> and exact tests for global and local 'Moran's I' ('Bivand et al.' 2009) <doi:10.1016/j.csda.2008.07.021> and 'LOSH' local indicators of spatial heteroscedasticity ('Ord' and 'Getis') <doi:10.1007/s00168-011-0492-y>. The implementation of most of these measures is described in 'Bivand' and 'Wong' (2018) <doi:10.1007/s11749-018-0599-x>, with further extensions in 'Bivand' (2022) <doi:10.1111/gean.12319>. 'Lagrange' multiplier tests for spatial dependence in linear models are provided ('Anselin et al.' 1996) <doi:10.1016/0166-0462(95)02111-6>, as are 'Rao' score tests for hypothesised spatial 'Durbin' models based on linear models ('Koley' and 'Bera' 2023) <doi:10.1080/17421772.2023.2256810>. A local indicators for categorical data (LICD) implementation based on 'Carrer et al.' (2021) <doi:10.1016/j.jas.2020.105306> and 'Bivand et al.' (2017) <doi:10.1016/j.spasta.2017.03.003> was added in 1.3-7. From 'spdep' and 'spatialreg' versions >= 1.2-1, the model fitting functions previously present in this package are defunct in 'spdep' and may be found in 'spatialreg'.
Maintained by Roger Bivand. Last updated 17 days ago.
spatial-autocorrelation, spatial-dependence, spatial-weights
9.0 match 131 stars 16.62 score 6.0k scripts 107 dependents
randy3k
arrangements:Fast Generators and Iterators for Permutations, Combinations, Integer Partitions and Compositions
Fast generators and iterators for permutations, combinations, integer partitions and compositions. The arrangements are in lexicographical order and generated iteratively in a memory efficient manner. It has been demonstrated that 'arrangements' outperforms most existing packages of similar kind. Benchmarks can be found at <https://randy3k.github.io/arrangements/articles/benchmark.html>.
Maintained by Randy Lai. Last updated 2 years ago.
17.7 match 52 stars 7.89 score 118 scripts 23 dependents
permaverse
flipr:Flexible Inference via Permutations in R
A flexible permutation framework for making inference such as point estimation, confidence intervals or hypothesis testing, on any kind of data, be it univariate, multivariate, or more complex such as network-valued data, topological data, functional data or density-valued data.
Maintained by Aymeric Stamm. Last updated 12 days ago.
18.9 match 6 stars 6.89 score 24 scripts 1 dependents
angeella
pARI:Permutation-Based All-Resolutions Inference
Computes the All-Resolution Inference method in the permutation framework, i.e., simultaneous lower confidence bounds for the number of true discoveries. <doi:10.1002/sim.9725>.
Maintained by Angela Andreella. Last updated 6 months ago.
ari, cluster-map, copes, discoveries, fmri, fsl, permutation, selective-inference, simultaneous-confidence-bounds, spm, openblas, cpp
24.3 match 4 stars 4.78 score 9 scripts 1 dependents
alexkowa
EnvStats:Package for Environmental Statistics, Including US EPA Guidance
Graphical and statistical analyses of environmental data, with focus on analyzing chemical concentrations and physical parameters, usually in the context of mandated environmental monitoring. Major environmental statistical methods found in the literature and regulatory guidance documents, with extensive help that explains what these methods do, how to use them, and where to find them in the literature. Numerous built-in data sets from regulatory guidance documents and environmental statistics literature. Includes scripts reproducing analyses presented in the book "EnvStats: An R Package for Environmental Statistics" (Millard, 2013, Springer, ISBN 978-1-4614-8455-4, <doi:10.1007/978-1-4614-8456-1>).
Maintained by Alexander Kowarik. Last updated 15 days ago.
8.8 match 26 stars 12.80 score 2.4k scripts 46 dependents
maximeherve
RVAideMemoire:Testing and Plotting Procedures for Biostatistics
Contains miscellaneous functions useful in biostatistics, mostly univariate and multivariate testing procedures with a special emphasis on permutation tests. Many functions intend to simplify user's life by shortening existing procedures or by implementing plotting functions that can be used with as many methods from different packages as possible.
Maintained by Maxime HERVE. Last updated 1 years ago.
20.7 match 8 stars 5.31 score 632 scripts
tidyverse
modelr:Modelling Functions that Work with the Pipe
Functions for modelling that help you seamlessly integrate modelling into a pipeline of data manipulation and visualisation.
Maintained by Hadley Wickham. Last updated 1 years ago.
6.7 match 401 stars 16.44 score 6.9k scripts 1.0k dependents
mtorchiano
lmPerm:Permutation Tests for Linear Models
Linear model functions using permutation tests.
Maintained by Marco Torchiano. Last updated 5 years ago.
12.5 match 13 stars 8.40 score 306 scripts 4 dependents
randy3k
iterpc:Efficient Iterator for Permutations and Combinations
Iterator for generating permutations and combinations. They can be drawn either with or without replacement, or with distinct/non-distinct items (multiset). The generated sequences are in lexicographical order (dictionary order). The algorithms to generate permutations and combinations are memory efficient. These iterative algorithms enable users to process all sequences without putting all results in the memory at the same time. The algorithms are written in C/C++ for faster performance. Note: 'iterpc' is no longer being maintained. Users are recommended to switch to 'arrangements'.
Maintained by Randy Lai. Last updated 5 years ago.
14.5 match 9 stars 7.17 score 47 scripts 5 dependents
yjunechoe
jlmerclusterperm:Cluster-Based Permutation Analysis for Densely Sampled Time Data
An implementation of fast cluster-based permutation analysis (CPA) for densely-sampled time data developed in Maris & Oostenveld, 2007 <doi:10.1016/j.jneumeth.2007.03.024>. Supports (generalized, mixed-effects) regression models for the calculation of timewise statistics. Provides both a wholesale and a piecemeal interface to the CPA procedure with an emphasis on interpretability and diagnostics. Integrates 'Julia' libraries 'MixedModels.jl' and 'GLM.jl' for performance improvements, with additional functionalities for interfacing with 'Julia' from 'R' powered by the 'JuliaConnectoR' package.
Maintained by June Choe. Last updated 5 days ago.
cluster-based-permutation-test, eeg, eyetracking, mixed-effects-models, timeseries
16.6 match 13 stars 5.86 score 14 scripts
wjbraun
DAAG:Data Analysis and Graphics Data and Functions
Functions and data sets used in examples and exercises in the text Maindonald, J.H. and Braun, W.J. (2003, 2007, 2010) "Data Analysis and Graphics Using R", and in an upcoming Maindonald, Braun, and Andrews text that builds on this earlier text.
Maintained by W. John Braun. Last updated 11 months ago.
11.8 match 8.25 score 1.2k scripts 1 dependents
bioc
CaDrA:Candidate Driver Analysis
Performs both stepwise and backward heuristic search for candidate (epi)genetic drivers based on a binary multi-omics dataset. CaDrA's main objective is to identify features which, together, are significantly skewed or enriched pertaining to a given vector of continuous scores (e.g. sample-specific scores representing a phenotypic readout of interest, such as protein expression, pathway activity, etc.), based on the union occurrence (i.e. logical OR) of the events.
Maintained by Reina Chau. Last updated 5 months ago.
microarray, rnaseq, geneexpression, software, featureextraction
13.1 match 24 stars 7.19 score 12 scripts
r-forge
coin:Conditional Inference Procedures in a Permutation Test Framework
Conditional inference procedures for the general independence problem including two-sample, K-sample (non-parametric ANOVA), correlation, censored, ordered and multivariate problems described in <doi:10.18637/jss.v028.i08>.
Maintained by Torsten Hothorn. Last updated 9 months ago.
7.8 match 11.68 score 1.6k scripts 74 dependents
reinhardfurrer
spam:SPArse Matrix
Set of functions for sparse matrix algebra. Differences with other sparse matrix packages are: (1) we only support (essentially) one sparse matrix format, (2) based on transparent and simple structure(s), (3) tailored for MCMC calculations within G(M)RF, and (4) it is fast and scalable (with the extension package spam64). Documentation about 'spam' is provided by vignettes included in this package, see also Furrer and Sain (2010) <doi:10.18637/jss.v036.i10>; see 'citation("spam")' for details.
Maintained by Reinhard Furrer. Last updated 2 months ago.
9.8 match 1 stars 9.26 score 420 scripts 433 dependents
bioc
RgnTX:Colocalization analysis of transcriptome elements in the presence of isoform heterogeneity and ambiguity
RgnTX allows the integration of transcriptome annotations so as to model the complex alternative splicing patterns. It supports the testing of transcriptome elements without clear isoform association, which is often the real scenario due to technical limitations. It includes functions that perform permutation tests to evaluate the association between features and transcriptome regions.
Maintained by Yue Wang. Last updated 5 months ago.
alternativesplicing, sequencing, rnaseq, methylseq, transcription, splicedalignment
22.5 match 4.00 score 6 scripts
tidymodels
rsample:General Resampling Infrastructure
Classes and functions to create and summarize different types of resampling objects (e.g. bootstrap, cross-validation).
Maintained by Hannah Frick. Last updated 4 days ago.
5.3 match 341 stars 16.72 score 5.2k scripts 79 dependents
przechoj
gips:Gaussian Model Invariant by Permutation Symmetry
Find the permutation symmetry group such that the covariance matrix of the given data is approximately invariant under it. Discovering such a permutation decreases the number of observations needed to fit a Gaussian model, which is of great use when it is smaller than the number of variables. Even if that is not the case, the covariance matrix found with 'gips' approximates the actual covariance with less statistical error. The methods implemented in this package are described in Graczyk et al. (2022) <doi:10.1214/22-AOS2174>.
Maintained by Adam Przemysław Chojecki. Last updated 8 months ago.
covariance-estimation, machine-learning, normal-distribution
12.8 match 6 stars 6.40 score 31 scripts
fbertran
plsRglm:Partial Least Squares Regression for Generalized Linear Models
Provides (weighted) Partial least squares Regression for generalized linear models and repeated k-fold cross-validation of such models using various criteria <arXiv:1810.01005>. It allows for missing data in the explanatory variables. Bootstrap confidence interval constructions are also available.
Maintained by Frederic Bertrand. Last updated 2 years ago.
10.6 match 16 stars 7.75 score 103 scripts 5 dependents
jaromilfrossard
permuco:Permutation Tests for Regression, (Repeated Measures) ANOVA/ANCOVA and Comparison of Signals
Functions to compute p-values based on permutation tests. Regression, ANOVA and ANCOVA, omnibus F-tests, marginal unilateral and bilateral t-tests are available. Several methods to handle nuisance variables are implemented (Kherad-Pajouh, S., & Renaud, O. (2010) <doi:10.1016/j.csda.2010.02.015> ; Kherad-Pajouh, S., & Renaud, O. (2014) <doi:10.1007/s00362-014-0617-3> ; Winkler, A. M., Ridgway, G. R., Webster, M. A., Smith, S. M., & Nichols, T. E. (2014) <doi:10.1016/j.neuroimage.2014.01.060>). An extension for the comparison of signals issued from experimental conditions (e.g. EEG/ERP signals) is provided. Several corrections for multiple testing are possible, including the cluster-mass statistic (Maris, E., & Oostenveld, R. (2007) <doi:10.1016/j.jneumeth.2007.03.024>) and the threshold-free cluster enhancement (Smith, S. M., & Nichols, T. E. (2009) <doi:10.1016/j.neuroimage.2008.03.061>).
Maintained by Jaromil Frossard. Last updated 7 months ago.
11.6 match 13 stars 6.57 score 81 scripts
jmcurran
multicool:Permutations of Multisets in Cool-Lex Order
A set of tools to permute multisets without loops or hash tables and to generate integer partitions. The permutation functions are based on C code from Aaron Williams. Cool-lex order is similar to colexicographical order. The algorithm is described in Williams, A. Loopless Generation of Multiset Permutations by Prefix Shifts. SODA 2009, Symposium on Discrete Algorithms, New York, United States. The permutation code is distributed without restrictions. The code for stable and efficient computation of multinomial coefficients comes from Dave Barber. The code can be downloaded from <http://tamivox.org/dave/multinomial/index.html> and is distributed without conditions. The package also generates the integer partitions of a positive, non-zero integer n. The C++ code for this is based on Python code from Jerome Kelleher which can be found here <https://jeromekelleher.net/category/combinatorics.html>. The C++ code and Python code are distributed without conditions.
Maintained by James Curran. Last updated 1 years ago.
9.7 match 2 stars 7.74 score 11 scripts 273 dependents
ddebeer
permimp:Conditional Permutation Importance
An add-on to the 'party' package, with a faster implementation of the partial-conditional permutation importance for random forests. The standard permutation importance is implemented exactly the same as in the 'party' package. The conditional permutation importance can be computed faster, with an option to be backward compatible to the 'party' implementation. The package is compatible with random forests fit using the 'party' and the 'randomForest' package. The methods are described in Strobl et al. (2007) <doi:10.1186/1471-2105-8-25> and Debeer and Strobl (2020) <doi:10.1186/s12859-020-03622-2>.
Maintained by Dries Debeer. Last updated 2 years ago.
12.8 match 4 stars 5.85 score 39 scripts 1 dependents
cran
vipor:Plot Categorical Data Using Quasirandom Noise and Density Estimates
Generate a violin point plot, a combination of a violin/histogram plot and a scatter plot by offsetting points within a category based on their density using quasirandom noise.
Maintained by Scott Sherrill-Mix. Last updated 1 years ago.
10.4 match 6.86 score 95 dependents
cran
e1071:Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien
Functions for latent class analysis, short time Fourier transform, fuzzy clustering, support vector machines, shortest path computation, bagged clustering, naive Bayes classifier, generalized k-nearest neighbour ...
Maintained by David Meyer. Last updated 6 months ago.
4.9 match 28 stars 14.46 score 19k scripts 2.0k dependents
bioc
ClusterSignificance:The ClusterSignificance package provides tools to assess if class clusters in dimensionality reduced data representations have a separation different from permuted data
The ClusterSignificance package provides tools to assess if class clusters in dimensionality reduced data representations have a separation different from permuted data. The term class clusters here refers to clusters of points representing known classes in the data. This is particularly useful to determine if a subset of the variables, e.g. genes in a specific pathway, alone can separate samples into these established classes. ClusterSignificance accomplishes this by projecting all points onto a one dimensional line. Cluster separations are then scored and the probability of the seen separation being due to chance is evaluated using a permutation method.
Maintained by Jason T Serviss. Last updated 5 months ago.
clustering, classification, principalcomponent, statisticalmethod
14.4 match 4.78 score 4 scripts
cran
PerMallows:Permutations and Mallows Distributions
Includes functions to work with the Mallows and Generalized Mallows Models. The considered distances are Kendall's-tau, Cayley, Hamming and Ulam and it includes functions for making inference, sampling and learning such distributions, some of which are novel in the literature. As a by-product, PerMallows also includes operations for permutations, paying special attention to those related with the Kendall's-tau, Cayley, Ulam and Hamming distances. It is also possible to generate random permutations at a given distance, or with a given number of inversions, or cycles, or fixed points or even with a given length on LIS (longest increasing subsequence).
Maintained by Ekhine Irurozki. Last updated 30 days ago.
66.9 match 1 stars 1.00 score
r-forge
Matrix:Sparse and Dense Matrix Classes and Methods
A rich hierarchy of sparse and dense matrix classes, including general, symmetric, triangular, and diagonal matrices with numeric, logical, or pattern entries. Efficient methods for operating on such matrices, often wrapping the 'BLAS', 'LAPACK', and 'SuiteSparse' libraries.
Maintained by Martin Maechler. Last updated 5 days ago.
3.9 match 1 stars 17.23 score 33k scripts 12k dependents
blasbenito
distantia:Advanced Toolset for Efficient Time Series Dissimilarity Analysis
Fast C++ implementation of Dynamic Time Warping for time series dissimilarity analysis, with applications in environmental monitoring and sensor data analysis, climate science, signal processing and pattern recognition, and financial data analysis. Built upon the ideas presented in Benito and Birks (2020) <doi:10.1111/ecog.04895>, provides tools for analyzing time series of varying lengths and structures, including irregular multivariate time series. Key features include individual variable contribution analysis, restricted permutation tests for statistical significance, and imputation of missing data via GAMs. Additionally, the package provides an ample set of tools to prepare and manage time series data.
Maintained by Blas M. Benito. Last updated 24 days ago.
dissimilarity, dynamic-time-warping, lock-step, time-series, cpp
11.1 match 23 stars 5.76 score 11 scripts
ahb108
rcarbon:Calibration and Analysis of Radiocarbon Dates
Enables the calibration and analysis of radiocarbon dates, often but not exclusively for the purposes of archaeological research. It includes functions not only for basic calibration, uncalibration, and plotting of one or more dates, but also a statistical framework for building demographic and related longitudinal inferences from aggregate radiocarbon date lists, including: Monte-Carlo simulation test (Timpson et al 2014 <doi:10.1016/j.jas.2014.08.011>), random mark permutation test (Crema et al 2016 <doi:10.1371/journal.pone.0154809>) and spatial permutation tests (Crema, Bevan, and Shennan 2017 <doi:10.1016/j.jas.2017.09.007>).
Maintained by Enrico Crema. Last updated 6 months ago.
7.7 match 34 stars 8.14 score 274 scripts 2 dependents
bioc
metagenomeSeq:Statistical analysis for sparse high-throughput sequencing
metagenomeSeq is designed to determine features (be it Operational Taxonomic Unit (OTU), species, etc.) that are differentially abundant between two or more groups of multiple samples. metagenomeSeq is designed to address the effects of both normalization and under-sampling of microbial communities on disease association detection and the testing of feature correlations.
Maintained by Joseph N. Paulson. Last updated 3 months ago.
immunooncology, classification, clustering, geneticvariability, differentialexpression, microbiome, metagenomics, normalization, visualization, multiplecomparison, sequencing, software
5.2 match 69 stars 12.02 score 494 scripts 7 dependents
usepa
pTITAN2:Permutations of Treatment Labels and TITAN2 Analysis
Permute treatment labels for taxa and environmental gradients to generate an empirical distribution of change points. This is an extension for the 'TITAN2' package <https://cran.r-project.org/package=TITAN2>.
Maintained by Peter DeWitt. Last updated 3 years ago.
16.8 match 1 stars 3.70 score 7 scripts
thothorn
libcoin:Linear Test Statistics for Permutation Inference
Basic infrastructure for linear test statistics and permutation inference in the framework of Strasser and Weber (1999) <https://epub.wu.ac.at/102/>. This package must not be used by end-users. CRAN package 'coin' implements all user interfaces and is ready to be used by anyone.
Maintained by Torsten Hothorn. Last updated 1 years ago.
9.1 match 1 stars 6.81 score 25 scripts 171 dependents
andreyshabalin
shiftR:Fast Enrichment Analysis via Circular Permutations
Fast enrichment analysis for locally correlated statistics via circular permutations. The analysis can be performed at multiple significance thresholds for both primary and auxiliary data sets with efficient correction for multiple testing.
Maintained by Andrey A Shabalin. Last updated 6 years ago.
15.1 match 1 stars 4.04 score 11 scripts
bioc
COCOA:Coordinate Covariation Analysis
COCOA is a method for understanding epigenetic variation among samples. COCOA can be used with epigenetic data that includes genomic coordinates and an epigenetic signal, such as DNA methylation and chromatin accessibility data. To describe the method on a high level, COCOA quantifies inter-sample variation with either a supervised or unsupervised technique then uses a database of "region sets" to annotate the variation among samples. A region set is a set of genomic regions that share a biological annotation, for instance transcription factor (TF) binding regions, histone modification regions, or open chromatin regions. COCOA can identify region sets that are associated with epigenetic variation between samples and increase understanding of variation in your data.
Maintained by John Lawson. Last updated 5 months ago.
epigenetics, dnamethylation, atacseq, dnaseseq, methylseq, methylationarray, principalcomponent, genomicvariation, generegulation, genomeannotation, systemsbiology, functionalgenomics, chipseq, sequencing, immunooncology, dna-methylation, pca
8.6 match 10 stars 7.02 score 21 scripts
cecileproust-lima
lcmm:Extended Mixed Models Using Latent Classes and Latent Processes
Estimation of various extensions of the mixed models including latent class mixed models, joint latent class mixed models, mixed models for curvilinear outcomes, mixed models for multivariate longitudinal outcomes using a maximum likelihood estimation method (Proust-Lima, Philipps, Liquet (2017) <doi:10.18637/jss.v078.i02>).
Maintained by Cecile Proust-Lima. Last updated 1 months ago.
5.3 match 62 stars 11.41 score 249 scripts 7 dependents
dicook
nullabor:Tools for Graphical Inference
Tools for visual inference. Generate null data sets and null plots using permutation and simulation. Calculate distance metrics for a lineup, and examine the distributions of metrics.
Maintained by Di Cook. Last updated 1 months ago.
5.8 match 57 stars 10.38 score 370 scripts 2 dependents
uscbiostats
partition:Agglomerative Partitioning Framework for Dimension Reduction
A fast and flexible framework for agglomerative partitioning. 'partition' uses an approach called Direct-Measure-Reduce to create new variables that maintain the user-specified minimum level of information. Each reduced variable is also interpretable: the original variables map to one and only one variable in the reduced data set. 'partition' is flexible, as well: how variables are selected to reduce, how information loss is measured, and the way data is reduced can all be customized. 'partition' is based on the Partition framework discussed in Millstein et al. (2020) <doi:10.1093/bioinformatics/btz661>.
Maintained by Malcolm Barrett. Last updated 4 months ago.
data-reduction, dimensionality-reduction, partitional-clustering, openblas, cpp
7.8 match 36 stars 7.72 score 27 scripts 1 dependents
bioc
AlpsNMR:Automated spectraL Processing System for NMR
Reads Bruker NMR data directories both zipped and unzipped. It provides automated and efficient signal processing for untargeted NMR metabolomics. It is able to interpolate the samples, detect outliers, exclude regions, normalize, detect peaks, align the spectra, integrate peaks, manage metadata and visualize the spectra. After spectra processing, it can apply multivariate analysis on extracted data. Efficient plotting with 1-D data is also available. Basic reading of 1D ACD/Labs exported JDX samples is also available.
Maintained by Sergio Oller Moreno. Last updated 5 months ago.
softwarepreprocessingvisualizationclassificationcheminformaticsmetabolomicsdataimport
7.9 match 15 stars 7.59 score 12 scripts 1 dependentseriqande
gscramble:Simulating Admixed Genotypes Without Replacement
A genomic simulation approach for creating biologically informed individual genotypes from empirical data that 1) samples alleles from populations without replacement, 2) segregates alleles based on species-specific recombination rates. 'gscramble' is a flexible simulation approach that allows users to create pedigrees of varying complexity in order to simulate admixed genotypes. Furthermore, it allows users to track haplotype blocks from the source populations through the pedigrees.
Maintained by Eric C. Anderson. Last updated 1 years ago.
11.8 match 4.83 score 15 scriptsbrandmaier
pdc:Permutation Distribution Clustering
Permutation Distribution Clustering is a clustering method for time series. Dissimilarity of time series is formalized as the divergence between their permutation distributions. The permutation distribution was proposed as a measure of the complexity of a time series.
Maintained by Andreas M. Brandmaier. Last updated 2 years ago.
10.1 match 6 stars 5.61 score 25 scripts 9 dependentsbioc
GSALightning:Fast Permutation-based Gene Set Analysis
GSALightning provides a fast implementation of permutation-based gene set analysis for the two-sample problem. This package is particularly useful when testing a large number of gene sets simultaneously, or when a large number of permutations is necessary for more accurate p-value estimation.
Maintained by Billy Heung Wing Chang. Last updated 5 months ago.
softwarebiologicalquestiongenesetenrichmentdifferentialexpressiongeneexpressiontranscription
14.2 match 5 stars 4.00 score 4 scriptshwborchers
pracma:Practical Numerical Math Functions
Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses 'MATLAB' function names where appropriate to simplify porting.
Maintained by Hans W. Borchers. Last updated 1 years ago.
4.5 match 29 stars 12.34 score 6.6k scripts 931 dependentsbioc
SeqGSEA:Gene Set Enrichment Analysis (GSEA) of RNA-Seq Data: integrating differential expression and splicing
The package provides methods for gene set enrichment analysis of high-throughput RNA-Seq data by integrating differential expression and splicing. It uses the negative binomial distribution to model read count data, which accounts for sequencing biases and biological variation. Based on permutation tests, statistical significance can also be assessed for each gene's differential expression and splicing, respectively.
Maintained by Xi Wang. Last updated 5 months ago.
sequencingrnaseqgenesetenrichmentgeneexpressiondifferentialexpressiondifferentialsplicingimmunooncology
12.8 match 4.34 score 11 scriptsqddyy
LearnNonparam:'R6'-Based Flexible Framework for Permutation Tests
Implements non-parametric tests from Higgins (2004, ISBN:0534387756), including tests for one sample, two samples, k samples, paired comparisons, blocked designs, trends and association. Built with 'Rcpp' for efficiency and 'R6' for flexible, object-oriented design, the package provides a unified framework for performing or creating custom permutation tests.
Maintained by Yan Du. Last updated 1 months ago.
hypothesis-testnonparametric-statisticspermutation-testcpp
10.9 match 6 stars 5.01 score 2 scriptssritchie73
NetRep:Permutation Testing Network Module Preservation Across Datasets
Functions for assessing the replication/preservation of a network module's topology across datasets through permutation testing; Ritchie et al. (2015) <doi: 10.1016/j.cels.2016.06.012>.
Maintained by Scott Ritchie. Last updated 4 years ago.
8.0 match 12 stars 6.84 score 16 scripts 3 dependentsswampthingpaul
NADA2:Data Analysis for Censored Environmental Data
Contains methods described by Dennis Helsel in his book "Statistics for Censored Environmental Data using Minitab and R" (2011) and courses and videos at <https://practicalstats.com>. This package adds new functions to the `NADA` Package.
Maintained by Paul Julian. Last updated 6 months ago.
8.8 match 15 stars 6.16 score 16 scriptsdeclaredesign
randomizr:Easy-to-Use Tools for Common Forms of Random Assignment and Sampling
Generates random assignments for common experimental designs and random samples for common sampling designs.
Maintained by Alexander Coppock. Last updated 1 months ago.
5.5 match 37 stars 9.90 score 396 scripts 13 dependentsmlcollyer
RRPP:Linear Model Evaluation with Randomized Residuals in a Permutation Procedure
Linear model calculations are made for many random versions of data. Using residual randomization in a permutation procedure, sums of squares are calculated over many permutations to generate empirical probability distributions for evaluating model effects. Additionally, coefficients, statistics, fitted values, and residuals generated over many permutations can be used for various procedures including pairwise tests, prediction, classification, and model comparison. This package should provide most tools one could need for the analysis of high-dimensional data, especially in ecology and evolutionary biology, but certainly other fields, as well.
Maintained by Michael Collyer. Last updated 25 days ago.
5.5 match 4 stars 9.84 score 173 scripts 7 dependentsphamdn
peramo:Permutation Tests for Randomization Model
Perform permutation-based hypothesis testing for randomized experiments as suggested in Ludbrook & Dudley (1998) <doi:10.2307/2685470> and Ernst (2004) <doi:10.1214/088342304000000396>, introduced in Pham et al. (2022) <doi:10.1016/j.chemosphere.2022.136736>.
Maintained by Duy Nghia Pham. Last updated 7 months ago.
17.9 match 3.00 scorembrueckner
permGS:Permutational Group Sequential Test for Time-to-Event Data
Permutational group-sequential tests for time-to-event data based on the log-rank test statistic. Supports exact permutation test when the censoring distributions are equal in the treatment and the control group and approximate imputation-permutation methods when the censoring distributions are different.
Maintained by Matthias Brueckner. Last updated 6 years ago.
permutation-teststatisticssurvival-analysis
19.5 match 2.70 score 8 scriptscran
sparcl:Perform Sparse Hierarchical Clustering and Sparse K-Means Clustering
Implements the sparse clustering methods of Witten and Tibshirani (2010): "A framework for feature selection in clustering"; published in Journal of the American Statistical Association 105(490): 713-726.
Maintained by Daniela Witten. Last updated 6 years ago.
12.5 match 1 stars 4.20 score 133 scripts 4 dependentspecanproject
PEcAn.data.atmosphere:PEcAn Functions Used for Managing Climate Driver Data
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The PEcAn.data.atmosphere package converts climate driver data into a standard format for models integrated into PEcAn. As a standalone package, it provides an interface to access diverse climate data sets.
Maintained by David LeBauer. Last updated 14 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
4.5 match 216 stars 11.59 score 64 scripts 14 dependentsdongwenluo
predictmeans:Predicted Means for Linear and Semiparametric Models
Provides functions to diagnose and make inferences from various linear models, such as those obtained from 'aov', 'lm', 'glm', 'gls', 'lme', 'lmer', 'glmmTMB' and 'semireg'. Inferences include predicted means and standard errors, contrasts, multiple comparisons, permutation tests, adjusted R-square and graphs.
Maintained by Dongwen Luo. Last updated 11 months ago.
8.2 match 2 stars 6.26 score 152 scripts 2 dependentsbioc
jazzPanda:Finding spatially relevant marker genes in image based spatial transcriptomics data
This package contains functions to find marker genes for image-based spatial transcriptomics data. There are functions to create spatial vectors from the cell and transcript coordinates, which are passed as inputs to find marker genes. Marker genes are detected for every cluster by two approaches. The first approach is permutation testing, which is implemented in parallel for finding marker genes in a one-sample study. The other approach is to build a linear model for every gene. This approach can account for multiple samples and background noise.
Maintained by Melody Jin. Last updated 13 days ago.
spatialgeneexpressiondifferentialexpressionstatisticalmethodtranscriptomicscorrelationlinear-modelsmarker-genesspatial-transcriptomics
10.3 match 2 stars 5.00 scoresonsoleslp
tna:Transition Network Analysis (TNA)
Provides tools for performing Transition Network Analysis (TNA) to study relational dynamics, including functions for building and plotting TNA models, calculating centrality measures, and identifying dominant events and patterns. TNA statistical techniques (e.g., bootstrapping and permutation tests) ensure the reliability of observed insights and confirm that identified dynamics are meaningful. See (Saqr et al., 2025) <doi:10.1145/3706468.3706513> for more details on TNA.
Maintained by Sonsoles López-Pernas. Last updated 23 hours ago.
educational-data-mininglearning-analyticsmarkov-modeltemporal-analysis
7.9 match 4 stars 6.48 score 5 scriptstoshi-ara
brunnermunzel:(Permuted) Brunner-Munzel Test
Provides functions for the Brunner-Munzel test and the permuted Brunner-Munzel test, which accept a formula, matrix, or table as argument. These functions are based on Brunner and Munzel (2000) <doi:10.1002/(SICI)1521-4036(200001)42:1%3C17::AID-BIMJ17%3E3.0.CO;2-U> and Neubert and Brunner (2007) <doi:10.1016/j.csda.2006.05.024>, and are written in FORTRAN.
Maintained by Toshiaki Ara. Last updated 3 years ago.
8.7 match 5 stars 5.83 score 30 scripts 1 dependentsrqtl
qtl2:Quantitative Trait Locus Mapping in Experimental Crosses
Provides a set of tools to perform quantitative trait locus (QTL) analysis in experimental crosses. It is a reimplementation of the 'R/qtl' package to better handle high-dimensional data and complex cross designs. Broman et al. (2019) <doi:10.1534/genetics.118.301595>.
Maintained by Karl W Broman. Last updated 7 days ago.
5.3 match 34 stars 9.48 score 1.1k scripts 5 dependentsbioc
distinct:distinct: a method for differential analyses via hierarchical permutation tests
distinct is a statistical method to perform differential testing between two or more groups of distributions; differential testing is performed via hierarchical non-parametric permutation tests on the cumulative distribution functions (cdfs) of each sample. While most methods for differential expression target differences in the mean abundance between conditions, distinct, by comparing full cdfs, identifies both differential patterns involving changes in the mean and more subtle variations that do not involve the mean (e.g., unimodal vs. bi-modal distributions with the same mean). distinct is a general and flexible tool: due to its fully non-parametric nature, which makes no assumptions on how the data was generated, it can be applied to a variety of datasets. It is particularly suitable to perform differential state analyses on single cell data (i.e., differential analyses within sub-populations of cells), such as single cell RNA sequencing (scRNA-seq) and high-dimensional flow or mass cytometry (HDCyto) data. To use distinct one needs data from two or more groups of samples (i.e., experimental conditions), with at least 2 samples (i.e., biological replicates) per group.
Maintained by Simone Tiberi. Last updated 5 months ago.
geneticsrnaseqsequencingdifferentialexpressiongeneexpressionmultiplecomparisonsoftwaretranscriptionstatisticalmethodvisualizationsinglecellflowcytometrygenetargetopenblascpp
7.8 match 11 stars 6.35 score 34 scripts 1 dependentssmn74
MANOVA.RM:Resampling-Based Analysis of Multivariate Data and Repeated Measures Designs
Implements various tests for semi-parametric repeated measures and general MANOVA designs that assume neither multivariate normality nor covariance homogeneity, i.e., the procedures are applicable for a wide range of general multivariate factorial designs. In addition to asymptotic inference methods, novel bootstrap and permutation approaches are implemented as well. These provide more accurate results in case of small to moderate sample sizes. Furthermore, post-hoc comparisons are provided for the multivariate analyses. Friedrich, S., Konietschke, F. and Pauly, M. (2019) <doi:10.32614/RJ-2019-051>.
Maintained by Sarah Friedrich. Last updated 1 months ago.
multivariate-datapermutationrepeated-measuresresampling
10.5 match 11 stars 4.63 score 39 scriptsbstewart
stm:Estimation of the Structural Topic Model
The Structural Topic Model (STM) allows researchers to estimate topic models with document-level covariates. The package also includes tools for model selection, visualization, and estimation of topic-covariate regressions. Methods developed in Roberts et al. (2014) <doi:10.1111/ajps.12103> and Roberts et al. (2016) <doi:10.1080/01621459.2016.1141684>. Vignette is Roberts et al. (2019) <doi:10.18637/jss.v091.i02>.
Maintained by Brandon Stewart. Last updated 1 years ago.
3.8 match 404 stars 12.63 score 1.6k scripts 6 dependentscran
sna:Tools for Social Network Analysis
A range of tools for social network analysis, including node and graph-level indices, structural distance and covariance methods, structural equivalence detection, network regression, random graph generation, and 2D/3D network visualization.
Maintained by Carter T. Butts. Last updated 6 months ago.
6.9 match 8 stars 6.78 score 94 dependentsthothorn
exactRankTests:Exact Distributions for Rank and Permutation Tests
Computes exact conditional p-values and quantiles using an implementation of the Shift-Algorithm by Streitberg & Roehmel.
Maintained by Torsten Hothorn. Last updated 3 years ago.
6.5 match 1 stars 7.13 score 276 scripts 65 dependentssimsem
semTools:Useful Tools for Structural Equation Modeling
Provides miscellaneous tools for structural equation modeling, many of which extend the 'lavaan' package. For example, latent interactions can be estimated using product indicators (Lin et al., 2010, <doi:10.1080/10705511.2010.488999>) and simple effects probed; analytical power analyses can be conducted (Jak et al., 2021, <doi:10.3758/s13428-020-01479-0>); and scale reliability can be estimated based on estimated factor-model parameters.
Maintained by Terrence D. Jorgensen. Last updated 2 days ago.
3.4 match 79 stars 13.74 score 1.1k scripts 31 dependentscolbystatsvyrsch
CIPerm:Computationally-Efficient Confidence Intervals for Mean Shift from Permutation Methods
Implements computationally-efficient construction of confidence intervals from permutation or randomization tests for simple differences in means, based on Nguyen (2009) <doi:10.15760/etd.7798>.
Maintained by Jerzy Wieczorek. Last updated 3 years ago.
11.5 match 1 stars 4.00 score 7 scriptspbreheny
ncvreg:Regularization Paths for SCAD and MCP Penalized Regression Models
Fits regularization paths for linear regression, GLM, and Cox regression models using lasso or nonconvex penalties, in particular the minimax concave penalty (MCP) and smoothly clipped absolute deviation (SCAD) penalty, with options for additional L2 penalties (the "elastic net" idea). Utilities for carrying out cross-validation as well as post-fitting visualization, summarization, inference, and prediction are also provided. For more information, see Breheny and Huang (2011) <doi:10.1214/10-AOAS388> or visit the ncvreg homepage <https://pbreheny.github.io/ncvreg/>.
Maintained by Patrick Breheny. Last updated 2 days ago.
3.8 match 43 stars 12.04 score 458 scripts 38 dependentsvanderleidebastiani
SYNCSA:Analysis of Functional and Phylogenetic Patterns in Metacommunities
Analysis of metacommunities based on functional traits and phylogeny of the community components. The functions that are offered here implement for the R environment methods that have been available in the SYNCSA application written in C++ (by Valerio Pillar, available at <http://ecoqua.ecologia.ufrgs.br/SYNCSA.html>).
Maintained by Vanderlei Julio Debastiani. Last updated 5 years ago.
8.5 match 3 stars 5.36 score 28 scripts 1 dependentskoalaverse
vip:Variable Importance Plots
A general framework for constructing variable importance plots from various types of machine learning models in R. Aside from some standard model-specific variable importance measures, this package also provides model-agnostic approaches that can be applied to any supervised learning algorithm. These include 1) an efficient permutation-based variable importance measure, 2) variable importance based on Shapley values (Strumbelj and Kononenko, 2014) <doi:10.1007/s10115-013-0679-x>, and 3) the variance-based approach described in Greenwell et al. (2018) <arXiv:1805.04755>. A variance-based method for quantifying the relative strength of interaction effects is also included (see the previous reference for details).
Maintained by Brandon M. Greenwell. Last updated 2 years ago.
interaction-effectmachine-learningpartial-dependence-plotsupervised-learning-algorithmsvariable-importancevariable-importance-plots
3.9 match 187 stars 11.61 score 3.5k scripts 6 dependentszachmayer
caretEnsemble:Ensembles of Caret Models
Functions for creating ensembles of caret models: caretList() and caretStack(). caretList() is a convenience function for fitting multiple caret::train() models to the same dataset. caretStack() will make linear or non-linear combinations of these models, using a caret::train() model as a meta-model.
Maintained by Zachary A. Deane-Mayer. Last updated 3 months ago.
3.8 match 226 stars 11.92 score 780 scripts 1 dependentsbioc
EMDomics:Earth Mover's Distance for Differential Analysis of Genomics Data
The EMDomics algorithm is used to perform a supervised multi-class analysis to measure the magnitude and statistical significance of observed continuous genomics data between groups. Usually the data will be gene expression values from array-based or sequence-based experiments, but data from other types of experiments can also be analyzed (e.g. copy number variation). Traditional methods like Significance Analysis of Microarrays (SAM) and Linear Models for Microarray Data (LIMMA) use significance tests based on summary statistics (mean and standard deviation) of the distributions. This approach lacks power to identify expression differences between groups that show high levels of intra-group heterogeneity. The Earth Mover's Distance (EMD) algorithm instead computes the "work" needed to transform one distribution into another, thus providing a metric of the overall difference in shape between two distributions. Permutation of sample labels is used to generate q-values for the observed EMD scores. This package also incorporates the Kolmogorov-Smirnov (K-S) test and the Cramer von Mises test (CVM), which are both common distribution comparison tests.
Maintained by Sadhika Malladi. Last updated 5 months ago.
softwaredifferentialexpressiongeneexpressionmicroarray
10.5 match 4.23 score 17 scriptsthomasp85
lime:Local Interpretable Model-Agnostic Explanations
When building complex models, it is often difficult to explain why the model should be trusted. While global measures such as accuracy are useful, they cannot be used for explaining why a model made a specific prediction. 'lime' (a port of the 'lime' 'Python' package) is a method for explaining the outcome of black box models by fitting a local model around the point in question and perturbations of this point. The approach is described in more detail in the article by Ribeiro et al. (2016) <arXiv:1602.04938>.
Maintained by Emil Hvitfeldt. Last updated 3 years ago.
caretmodel-checkingmodel-evaluationmodelingcpp
4.0 match 485 stars 11.07 score 732 scripts 1 dependentsveseshan
clinfun:Clinical Trial Design and Data Analysis Functions
Utilities to make your clinical collaborations easier if not fun. It contains functions for designing studies such as Simon 2-stage and group sequential designs and for data analysis such as Jonckheere-Terpstra test and estimating survival quantiles.
Maintained by Venkatraman E. Seshan. Last updated 1 years ago.
5.5 match 5 stars 7.86 score 124 scripts 8 dependentscran
perm:Exact or Asymptotic Permutation Tests
Perform Exact or Asymptotic permutation tests [see Fay and Shaw <doi:10.18637/jss.v036.i02>].
Maintained by Michael P. Fay. Last updated 2 years ago.
9.0 match 4.83 score 118 scripts 9 dependentssoroushmdg
gwid:Genome-Wide Identity-by-Descent
Methods and tools for the analysis of Genome Wide Identity-by-Descent ('gwid') mapping data, focusing on testing whether there is a higher occurrence of Identity-By-Descent (IBD) segments around potential causal variants in cases compared to controls, which is crucial for identifying rare variants. To enhance its analytical power, 'gwid' incorporates a Sliding Window Approach, allowing for the detection and analysis of signals from multiple Single Nucleotide Polymorphisms (SNPs).
Maintained by Soroush Mahmoudiandehkordi. Last updated 6 months ago.
12.1 match 1 stars 3.60 score 4 scriptsbioc
twilight:Estimation of local false discovery rate
In a typical microarray setting with gene expression data observed under two conditions, the local false discovery rate describes the probability that a gene is not differentially expressed between the two conditions given its corresponding observed score or p-value level. The resulting curve of p-values versus local false discovery rate offers an insight into the twilight zone between clear differential and clear non-differential gene expression. Package 'twilight' contains two main functions: Function twilight.pval performs a two-condition test on differences in means for a given input matrix or expression set and computes permutation based p-values. Function twilight performs a stochastic downhill search to estimate local false discovery rates and effect size distributions. The package further provides means to filter for permutations that describe the null distribution correctly. Using filtered permutations, the influence of hidden confounders could be diminished.
Maintained by Stefanie Senger. Last updated 27 days ago.
microarraydifferentialexpressionmultiplecomparison
12.8 match 3.40 score 14 scripts 1 dependentscwatson
brainGraph:Graph Theory Analysis of Brain MRI Data
A set of tools for performing graph theory analysis of brain MRI data. It works with data from a Freesurfer analysis (cortical thickness, volumes, local gyrification index, surface area), diffusion tensor tractography data (e.g., from FSL) and resting-state fMRI data (e.g., from DPABI). It contains a graphical user interface for graph visualization and data exploration, along with several functions for generating useful figures.
Maintained by Christopher G. Watson. Last updated 1 years ago.
brain-connectivitybrain-imagingcomplex-networksconnectomeconnectomicsfmrigraph-theorymrinetwork-analysisneuroimagingneurosciencestatisticstractography
5.4 match 188 stars 7.86 score 107 scripts 3 dependentsdavidvandijcke
DiSCos:Distributional Synthetic Controls Estimation
The method of synthetic controls is a widely-adopted tool for evaluating causal effects of policy changes in settings with observational data. In many settings where it is applicable, researchers want to identify causal effects of policy changes on a treated unit at an aggregate level while having access to data at a finer granularity. This package implements a simple extension of the synthetic controls estimator, developed in Gunsilius (2023) <doi:10.3982/ECTA18260>, that takes advantage of this additional structure and provides nonparametric estimates of the heterogeneity within the aggregate unit. The idea is to replicate the quantile function associated with the treated unit by a weighted average of quantile functions of the control units. The package contains tools for aggregating and plotting the resulting distributional estimates, as well as for carrying out inference on them.
Maintained by David Van Dijcke. Last updated 2 days ago.
8.9 match 1 stars 4.81 score 8 scriptsmoderndive
moderndive:Tidyverse-Friendly Introductory Linear Regression
Datasets and wrapper functions for tidyverse-friendly introductory linear regression, used in "Statistical Inference via Data Science: A ModernDive into R and the Tidyverse" available at <https://moderndive.com/>.
Maintained by Albert Y. Kim. Last updated 3 months ago.
3.8 match 88 stars 11.35 score 1.8k scriptsbioc
waddR:Statistical tests for detecting differential distributions based on the 2-Wasserstein distance
The package offers statistical tests based on the 2-Wasserstein distance for detecting and characterizing differences between two distributions given in the form of samples. Functions for calculating the 2-Wasserstein distance and testing for differential distributions are provided, as well as a specifically tailored test for differential expression in single-cell RNA sequencing data.
Maintained by Julian Flesch. Last updated 5 months ago.
softwarestatisticalmethodsinglecelldifferentialexpressioncpp
6.3 match 25 stars 6.70 score 6 scriptsjamesramsay5
fda:Functional Data Analysis
These functions were developed to support functional data analysis as described in Ramsay, J. O. and Silverman, B. W. (2005) Functional Data Analysis. New York: Springer and in Ramsay, J. O., Hooker, Giles, and Graves, Spencer (2009). Functional Data Analysis with R and Matlab (Springer). The package includes data sets and script files working many examples including all but one of the 76 figures in this latter book. Matlab versions are available by ftp from <https://www.psych.mcgill.ca/misc/fda/downloads/FDAfuns/>.
Maintained by James Ramsay. Last updated 4 months ago.
3.4 match 3 stars 12.29 score 2.0k scripts 143 dependentsneurodata
mgc:Multiscale Graph Correlation
Multiscale Graph Correlation (MGC) is a framework developed by Vogelstein et al. (2019) <DOI:10.7554/eLife.41690> that extends global correlation procedures to be multiscale; consequently, MGC tests typically require far fewer samples than existing methods for a wide variety of dependence structures and dimensionalities, while maintaining computational efficiency. Moreover, MGC provides a simple and elegant multiscale characterization of the potentially complex latent geometry underlying the relationship.
Maintained by Eric Bridgeford. Last updated 4 years ago.
5.6 match 9 stars 7.50 score 59 scripts 2 dependentsgagolews
stringi:Fast and Portable Character String Processing Facilities
A collection of character string/text/natural language processing tools for pattern searching (e.g., with 'Java'-like regular expressions or the 'Unicode' collation algorithm), random string generation, case mapping, string transliteration, concatenation, sorting, padding, wrapping, Unicode normalisation, date-time formatting and parsing, and many more. They are fast, consistent, convenient, and - thanks to 'ICU' (International Components for Unicode) - portable across all locales and platforms. Documentation about 'stringi' is provided via its website at <https://stringi.gagolewski.com/> and the paper by Gagolewski (2022, <doi:10.18637/jss.v103.i02>).
Maintained by Marek Gagolewski. Last updated 1 months ago.
icuicu4cnatural-language-processingnlpregexregexpstring-manipulationstringistringrtexttext-processingtidy-dataunicodecpp
2.3 match 309 stars 18.31 score 10k scripts 8.6k dependentst-kalinowski
keras:R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.
Maintained by Tomasz Kalinowski. Last updated 11 months ago.
3.8 match 10.82 score 10k scripts 54 dependentsjosiahparry
sfdep:Spatial Dependence for Simple Features
An interface to 'spdep' to integrate with 'sf' objects and the 'tidyverse'.
Maintained by Dexter Locke. Last updated 6 months ago.
5.8 match 130 stars 7.01 score 130 scriptscran
asnipe:Animal Social Network Inference and Permutations for Ecologists
Implements several tools that are used in animal social network analysis, as described in Whitehead (2007) Analyzing Animal Societies <University of Chicago Press> and Farine & Whitehead (2015) <doi: 10.1111/1365-2656.12418>. In particular, this package provides the tools to infer groups and generate networks from observation data, perform permutation tests on the data, calculate lagged association rates, and perform multiple regression analysis on social network data.
Maintained by Damien R. Farine. Last updated 1 years ago.
9.2 match 2 stars 4.36 score 173 scripts 2 dependentsschlosslab
mikropml:User-Friendly R Package for Supervised Machine Learning Pipelines
An interface to build machine learning models for classification and regression problems. 'mikropml' implements the ML pipeline described by Topçuoğlu et al. (2020) <doi:10.1128/mBio.00434-20> with reasonable default options for data preprocessing, hyperparameter tuning, cross-validation, testing, model evaluation, and interpretation steps. See the website <https://www.schlosslab.org/mikropml/> for more information, documentation, and examples.
Maintained by Kelly Sovacool. Last updated 2 years ago.
5.1 match 56 stars 7.83 score 86 scriptszarquon42b
Morpho:Calculations and Visualisations Related to Geometric Morphometrics
A toolset for Geometric Morphometrics and mesh processing. This includes (among other stuff) mesh deformations based on reference points, permutation tests, detection of outliers, processing of sliding semi-landmarks and semi-automated surface landmark placement.
Maintained by Stefan Schlager. Last updated 5 months ago.
4.0 match 51 stars 10.00 score 218 scripts 13 dependentsadeverse
adespatial:Multivariate Multiscale Spatial Analysis
Tools for the multiscale spatial analysis of multivariate data. Several methods are based on the use of a spatial weighting matrix and its eigenvector decomposition (Moran's Eigenvectors Maps, MEM). Several approaches are described in the review Dray et al (2012) <doi:10.1890/11-1183.1>.
Maintained by Aurélie Siberchicot. Last updated 11 days ago.
3.6 match 36 stars 11.06 score 398 scripts 2 dependentsspatstat
spatstat.explore:Exploratory Data Analysis for the 'spatstat' Family
Functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.
Maintained by Adrian Baddeley. Last updated 1 months ago.
cluster-detectionconfidence-intervalshypothesis-testingk-functionroc-curvesscan-statisticssignificance-testingsimulation-envelopesspatial-analysisspatial-data-analysisspatial-sharpeningspatial-smoothingspatial-statistics
3.9 match 1 stars 10.17 score 67 scripts 148 dependentsfkeck
phylosignal:Exploring the Phylogenetic Signal in Continuous Traits
A collection of tools to explore the phylogenetic signal in univariate and multivariate data. The package provides functions to plot traits data against a phylogenetic tree, different measures and tests for the phylogenetic signal, methods to describe where the signal is located and a phylogenetic clustering method.
Maintained by Francois Keck. Last updated 1 years ago.
5.4 match 16 stars 7.22 score 104 scriptsbioc
structToolbox:Data processing & analysis tools for Metabolomics and other omics
An extensive set of data (pre-)processing and analysis methods and tools for metabolomics and other omics, with a strong emphasis on statistics and machine learning. This toolbox allows the user to build extensive and standardised workflows for data analysis. The methods and tools have been implemented using class-based templates provided by the struct (Statistics in R Using Class-based Templates) package. The toolbox includes pre-processing methods (e.g. signal drift and batch correction, normalisation, missing value imputation and scaling), univariate (e.g. ttest, various forms of ANOVA, Kruskal–Wallis test and more) and multivariate statistical methods (e.g. PCA and PLS, including cross-validation and permutation testing) as well as machine learning methods (e.g. Support Vector Machines). The STATistics Ontology (STATO) has been integrated and implemented to provide standardised definitions for the different methods, inputs and outputs.
Maintained by Gavin Rhys Lloyd. Last updated 24 days ago.
workflowstepmetabolomicsbioconductor-packagedimslc-msmachine-learningmultivariate-analysisstatisticsunivariate
6.3 match 10 stars 6.26 score 12 scriptsaloy
CarletonStats:Functions for Statistics Classes at Carleton College
Includes commands for bootstrapping and permutation tests, a command for creating grouped bar plots, and a demo of the quantile-normal plot for data drawn from different distributions.
Maintained by Adam Loy. Last updated 7 months ago.
10.3 match 3.81 score 65 scriptsprodriguezsosa
conText:'a la Carte' on Text (ConText) Embedding Regression
A fast, flexible and transparent framework to estimate context-specific word and short document embeddings using the 'a la carte' embeddings approach developed by Khodak et al. (2018) <arXiv:1805.05388> and evaluate hypotheses about covariate effects on embeddings using the regression framework developed by Rodriguez et al. (2021) <https://github.com/prodriguezsosa/EmbeddingRegression>.
Maintained by Pedro L. Rodriguez. Last updated 11 months ago.
4.1 match 104 stars 9.40 score 1.7k scriptskbroman
broman:Karl Broman's R Code
Miscellaneous R functions, including functions related to graphics (mostly for base graphics), permutation tests, running mean/median, and general utilities.
Maintained by Karl W Broman. Last updated 10 months ago.
4.4 match 183 stars 8.80 score 648 scripts 1 dependentsdusadrian
admisc:Adrian Dusa's Miscellaneous
Contains functions used across packages 'DDIwR', 'QCA' and 'venn'. Interprets and translates, factorizes and negates SOP - Sum of Products expressions, for both binary and multi-value crisp sets, and extracts information (set names, set values) from those expressions. Other functions perform various other checks if possibly numeric (even if all numbers reside in a character vector) and coerce to numeric, or check if the numbers are whole. It also offers, among many others, a highly versatile recoding routine and some more flexible alternatives to the base functions 'with()' and 'within()'. SOP simplification functions in this package use related minimization from package 'QCA', which is recommended to be installed despite not being listed in the Imports field, due to circular dependency issues.
Maintained by Adrian Dusa. Last updated 2 days ago.
5.0 match 2 stars 7.61 score 20 scripts 92 dependentsecospat
ecospat:Spatial Ecology Miscellaneous Methods
Collection of R functions and data sets for the support of spatial ecology analyses with a focus on pre, core and post modelling analyses of species distribution, niche quantification and community assembly. Written by current and former members and collaborators of the ecospat group of Antoine Guisan, Department of Ecology and Evolution (DEE) and Institute of Earth Surface Dynamics (IDYST), University of Lausanne, Switzerland. Read Di Cola et al. (2016) <doi:10.1111/ecog.02671> for details.
Maintained by Olivier Broennimann. Last updated 1 months ago.
4.0 match 32 stars 9.35 score 418 scripts 1 dependentsbioc
epistasisGA:An R package to identify multi-snp effects in nuclear family studies using the GADGETS method
This package runs the GADGETS method to identify epistatic effects in nuclear family studies. It also provides functions for permutation-based inference and graphical visualization of the results.
Maintained by Michael Nodzenski. Last updated 5 months ago.
geneticssnpgeneticvariabilityopenblascpp
8.3 match 1 stars 4.48 score 5 scriptscdowd
twosamples:Fast Permutation Based Two Sample Tests
Fast randomization based two sample tests. Testing the hypothesis that two samples come from the same distribution using randomization to create p-values. Included tests are: Kolmogorov-Smirnov, Kuiper, Cramer-von Mises, Anderson-Darling, Wasserstein, and DTS. The default test (two_sample) is based on the DTS test statistic, as it is the most powerful, and thus most useful to most users. The DTS test statistic builds on the Wasserstein distance by using a weighting scheme like that of Anderson-Darling. See the companion paper at <arXiv:2007.01360> or <https://codowd.com/public/DTS.pdf> for details of that test statistic, and non-standard uses of the package (parallel for big N, weighted observations, one sample tests, etc). We also include the permutation scheme to make test building simple for others.
Maintained by Connor Dowd. Last updated 2 years ago.
5.4 match 17 stars 6.88 score 62 scripts 8 dependentsegenn
rtemis:Machine Learning and Visualization
Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.
Maintained by E.D. Gennatas. Last updated 1 months ago.
data-sciencedata-visualizationmachine-learningmachine-learning-libraryvisualization
5.3 match 145 stars 7.09 score 50 scripts 2 dependentsbioc
BioNAR:Biological Network Analysis in R
The R package BioNAR was developed for step-by-step analysis of PPI networks. The aim is to quantify and rank each protein’s simultaneous impact on multiple complexes based on network topology and clustering. The package also enables estimation of the co-occurrence of diseases across the network and within specific clusters, pointing towards shared/common mechanisms.
Maintained by Anatoly Sorokin. Last updated 17 days ago.
softwaregraphandnetworknetwork
6.3 match 3 stars 5.90 score 35 scriptsr-forge
zoo:S3 Infrastructure for Regular and Irregular Time Series (Z's Ordered Observations)
An S3 class with methods for totally ordered indexed observations. It is particularly aimed at irregular time series of numeric vectors/matrices and factors. zoo's key design goals are independence of a particular index/date/time class and consistency with ts and base R by providing methods to extend standard generics.
Maintained by Achim Zeileis. Last updated 12 days ago.
2.3 match 16.23 score 33k scripts 2.2k dependentsgjmvanboxtel
gsignal:Signal Processing
R implementation of the 'Octave' package 'signal', containing a variety of signal processing tools, such as signal generation and measurement, correlation and convolution, filtering, filter design, filter analysis and conversion, power spectrum analysis, system identification, decimation and sample rate change, and windowing.
Maintained by Geert van Boxtel. Last updated 2 months ago.
3.6 match 24 stars 10.03 score 133 scripts 34 dependentscogdisreslab
KRSA:KRSA: Kinome Random Sampling Analyzer
The goal of this package is to analyze the PamChip data and identify the changes in the active kinome. The package can preprocess the PamChip data output from BioNavigator and use Random Sampling and Permutation Analysis to identify upstream kinases. Additionally, this package provides a set of useful visualizations for the PamChip data.
Maintained by Ali Sajid Imami. Last updated 9 days ago.
kinasephosphatasespamchipkinomerandom samplingpermutation analysis
8.0 match 4 stars 4.42 score 49 scriptstidyverse
forcats:Tools for Working with Categorical Variables (Factors)
Helpers for reordering factor levels (including moving specified levels to front, ordering by first appearance, reversing, and randomly shuffling), and tools for modifying factor levels (including collapsing rare levels into other, 'anonymising', and manually 'recoding').
Maintained by Hadley Wickham. Last updated 1 years ago.
1.9 match 555 stars 18.77 score 21k scripts 1.2k dependentsradicalcommecol
cxr:A Toolbox for Modelling Species Coexistence in R
Recent developments in modern coexistence theory have advanced our understanding of how species are able to persist and co-occur with other species at varying abundances. However, applying this mathematical framework to empirical data is still challenging, precluding a larger adoption of the theoretical tools developed by empiricists. This package provides a complete toolbox for modelling interaction effects between species and calculating fitness and niche differences. The functions are flexible, may accept covariates, and different fitting algorithms can be used. A full description of the underlying methods is available in García-Callejas, D., Godoy, O., and Bartomeus, I. (2020) <doi:10.1111/2041-210X.13443>. Furthermore, the package provides a series of functions to calculate dynamics for stage-structured populations across sites.
Maintained by David Garcia-Callejas. Last updated 1 months ago.
5.3 match 10 stars 6.51 score 27 scriptsbioc
deltaCaptureC:This Package Discovers Meso-scale Chromatin Remodeling from 3C Data
This package discovers meso-scale chromatin remodelling from 3C data. 3C data is local in nature. It gives interaction counts between restriction enzyme digestion fragments and a preferred 'viewpoint' region. By binning this data and using permutation testing, this package can test whether there are statistically significant changes in the interaction counts between the data from two cell types or two treatments.
Maintained by Michael Shapiro. Last updated 5 months ago.
biologicalquestionstatisticalmethod
9.9 match 3.48 score 1 scriptsbblonder
hypervolume:High Dimensional Geometry, Set Operations, Projection, and Inference Using Kernel Density Estimation, Support Vector Machines, and Convex Hulls
Estimates the shape and volume of high-dimensional datasets and performs set operations: intersection / overlap, union, unique components, inclusion test, and hole detection. Uses stochastic geometry approach to high-dimensional kernel density estimation, support vector machine delineation, and convex hull generation. Applications include modeling trait and niche hypervolumes and species distribution modeling.
Maintained by Benjamin Blonder. Last updated 2 months ago.
3.5 match 23 stars 9.75 score 211 scripts 7 dependentsericarcher
rfPermute:Estimate Permutation p-Values for Random Forest Importance Metrics
Estimate significance of importance metrics for a Random Forest model by permuting the response variable. Produces null distribution of importance metrics for each predictor variable and p-value of observed. Provides summary and visualization functions for 'randomForest' results.
Maintained by Eric Archer. Last updated 2 years ago.
5.0 match 27 stars 6.77 score 96 scripts 1 dependentsdkahle
TITAN2:Threshold Indicator Taxa Analysis
Uses indicator species scores across binary partitions of a sample set to detect congruence in taxon-specific changes of abundance and occurrence frequency along an environmental gradient as evidence of an ecological community threshold. Relevant references include Baker and King (2010) <doi:10.1111/j.2041-210X.2009.00007.x>, King and Baker (2010) <doi:10.1899/09-144.1>, and Baker and King (2013) <doi:10.1899/12-142.1>.
Maintained by David Kahle. Last updated 1 years ago.
5.2 match 13 stars 6.59 score 30 scriptsprabhleenkaur19
aniSNA:Statistical Network Analysis of Animal Social Networks
Obtain network structures from animal GPS telemetry observations and statistically analyse them to assess their adequacy for social network analysis. Methods include pre-network data permutations, bootstrapping techniques to obtain confidence intervals for global and node-level network metrics, and correlation and regression analysis of the local network metrics.
Maintained by Prabhleen Kaur. Last updated 2 months ago.
10.7 match 3.18 scorer-forge
randtoolbox:Toolbox for Pseudo and Quasi Random Number Generation and Random Generator Tests
Provides (1) pseudo random generators - general linear congruential generators, multiple recursive generators and generalized feedback shift register (SF-Mersenne Twister algorithm (<doi:10.1007/978-3-540-74496-2_36>) and WELL (<doi:10.1145/1132973.1132974>) generators); (2) quasi random generators - the Torus algorithm, the Sobol sequence, the Halton sequence (including the Van der Corput sequence) and (3) some generator tests - the gap test, the serial test, the poker test, see, e.g., Gentle (2003) <doi:10.1007/b97336>. Take a look at the Distribution task view of types and tests of random number generators. The package can be provided without the 'rngWELL' dependency on demand. Package in Memoriam of Diethelm and Barbara Wuertz.
Maintained by Christophe Dutang. Last updated 3 months ago.
3.3 match 1 stars 10.23 score 578 scripts 80 dependentsbiometris
douconca:Double Constrained Correspondence Analysis for Trait-Environment Analysis in Ecology
Double constrained correspondence analysis (dc-CA) analyzes (multi-)trait (multi-)environment ecological data by using the 'vegan' package and native R code. The two-step algorithm of ter Braak et al. (2018) is used throughout. This algorithm combines and extends community- (sample-) and species-level analyses, i.e. the usual community weighted means (CWM)-based regression analysis and the species-level analysis of species-niche centroids (SNC)-based regression analysis. The two steps use canonical correspondence analysis to regress the abundance data on to the traits and (weighted) redundancy analysis to regress the CWM of the orthonormalized traits on to the environmental predictors. The function dc_CA() has an option to divide the abundance data of a site by the site total, giving equal site weights. This division has the advantage that the multivariate analysis corresponds with an unweighted (multi-trait) community-level analysis, instead of being weighted. The first step of the algorithm uses vegan::cca(). The second step uses wrda() but vegan::rda() if the site weights are equal. This version has a predict() function. For details see ter Braak et al. 2018 <doi:10.1007/s10651-017-0395-x>.
Maintained by Bart-Jan van Rossum. Last updated 3 months ago.
correspondence-analysisecologyecology-modelingmulti-environmentmulti-trait
6.7 match 5.02 score 6 scriptsadrientaudiere
MiscMetabar:Miscellaneous Functions for Metabarcoding Analysis
Facilitate the description, transformation, exploration, and reproducibility of metabarcoding analyses. 'MiscMetabar' is mainly built on top of the 'phyloseq', 'dada2' and 'targets' R packages. It helps to build reproducible and robust bioinformatics pipelines in R. 'MiscMetabar' makes ecological analysis of alpha and beta-diversity easier, more reproducible and more powerful by integrating a large number of tools. Important features are described in Taudière A. (2023) <doi:10.21105/joss.06038>.
Maintained by Adrien Taudière. Last updated 24 days ago.
sequencingmicrobiomemetagenomicsclusteringclassificationvisualizationampliconamplicon-sequencingbiodiversity-informaticsecologyilluminametabarcodingngs-analysis
5.2 match 17 stars 6.44 score 23 scriptscran
statcomp:Statistical Complexity and Information Measures for Time Series Analysis
An implementation of local and global statistical complexity measures (aka Information Theory Quantifiers, ITQ) for time series analysis based on ordinal statistics (Bandt and Pompe (2002) <DOI:10.1103/PhysRevLett.88.174102>). Several distance measures that operate on ordinal pattern distributions, auxiliary functions for ordinal pattern analysis, and generating functions for stochastic and deterministic-chaotic processes for ITQ testing are provided.
Maintained by Sebastian Sippel. Last updated 5 years ago.
9.8 match 4 stars 3.41 score 72 scripts 1 dependentsbioc
singscore:Rank-based single-sample gene set scoring method
A simple single-sample gene signature scoring method that uses rank-based statistics to analyze the sample's gene expression profile. It scores the expression activities of gene sets at a single-sample level.
Maintained by Malvika Kharbanda. Last updated 5 months ago.
softwaregeneexpressiongenesetenrichmentbioinformatics
3.3 match 41 stars 10.03 score 124 scripts 4 dependentsbiooss
sensitivity:Global Sensitivity Analysis of Model Outputs and Importance Measures
A collection of functions for sensitivity analysis of model outputs (factor screening, global sensitivity analysis and robustness analysis), for variable importance measures of data, as well as for interpretability of machine learning models. Most of the functions have to be applied on scalar output, but several functions support multi-dimensional outputs.
Maintained by Bertrand Iooss. Last updated 7 months ago.
4.9 match 17 stars 6.74 score 472 scripts 8 dependentsbioc
LimROTS:A Hybrid Method Integrating Empirical Bayes and Reproducibility-Optimized Statistics for Robust Analysis of Proteomics and Metabolomics Data
Differential expression analysis is a prevalent method utilised in the examination of diverse biological data. The reproducibility-optimized test statistic (ROTS) modifies a t-statistic based on the data's intrinsic characteristics and ranks features according to their statistical significance for differential expression between two or more groups (f-statistic). Focussing on proteomics and metabolomics, the current ROTS implementation cannot account for technical or biological covariates such as MS batches or gender differences among the samples. Consequently, we developed LimROTS, which employs a reproducibility-optimized test statistic utilising the limma methodology to simulate complex experimental designs. LimROTS is a hybrid method integrating empirical Bayes and reproducibility-optimized statistics for robust analysis of proteomics and metabolomics data.
Maintained by Ali Mostafa Anwar. Last updated 3 months ago.
softwaregeneexpressiondifferentialexpressionmicroarrayrnaseqproteomicsimmunooncologymetabolomicsmrnamicroarray
7.0 match 1 stars 4.70 score 1 scriptsiandryden
shapes:Statistical Shape Analysis
Routines for the statistical analysis of landmark shapes, including Procrustes analysis, graphical displays, principal components analysis, permutation and bootstrap tests, thin-plate spline transformation grids and comparing covariance matrices. See Dryden, I.L. and Mardia, K.V. (2016). Statistical shape analysis, with Applications in R (2nd Edition), John Wiley and Sons.
Maintained by Ian Dryden. Last updated 4 months ago.
3.8 match 7 stars 8.50 score 225 scripts 24 dependentsbnaras
PMA:Penalized Multivariate Analysis
Performs Penalized Multivariate Analysis: a penalized matrix decomposition, sparse principal components analysis, and sparse canonical correlation analysis, described in Witten, Tibshirani and Hastie (2009) <doi:10.1093/biostatistics/kxp008> and Witten and Tibshirani (2009) Extensions of sparse canonical correlation analysis, with applications to genomic data <doi:10.2202/1544-6115.1470>.
Maintained by Balasubramanian Narasimhan. Last updated 1 years ago.
4.5 match 4 stars 7.24 score 254 scripts 11 dependentsmllg
checkmate:Fast and Versatile Argument Checks
Tests and assertions to perform frequent argument checks. A substantial part of the package was written in C to minimize any worries about execution time overhead.
Maintained by Michel Lang. Last updated 8 months ago.
2.0 match 276 stars 16.28 score 1.5k scripts 1.9k dependentsrstudio
tfprobability:Interface to 'TensorFlow Probability'
Interface to 'TensorFlow Probability', a 'Python' library built on 'TensorFlow' that makes it easy to combine probabilistic models and deep learning on modern hardware ('TPU', 'GPU'). 'TensorFlow Probability' includes a wide selection of probability distributions and bijectors, probabilistic layers, variational inference, Markov chain Monte Carlo, and optimizers such as Nelder-Mead, BFGS, and SGLD.
Maintained by Tomasz Kalinowski. Last updated 3 years ago.
3.8 match 54 stars 8.63 score 221 scripts 3 dependentsjfukuyama
phyloseqGraphTest:Graph-Based Permutation Tests for Microbiome Data
Provides functions for graph-based multiple-sample testing and visualization of microbiome data, in particular data stored in 'phyloseq' objects. The tests are based on those described in Friedman and Rafsky (1979) <http://www.jstor.org/stable/2958919>, and the tests are described in more detail in Callahan et al. (2016) <doi:10.12688/f1000research.8986.1>.
Maintained by Julia Fukuyama. Last updated 1 years ago.
6.7 match 4 stars 4.81 score 16 scriptsbioc
MicrobiotaProcess:A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework
MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces the MPSE class, which makes it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under a unified and common framework (tidy-like framework).
Maintained by Shuangbin Xu. Last updated 5 months ago.
visualizationmicrobiomesoftwaremultiplecomparisonfeatureextractionmicrobiome-analysismicrobiome-data
3.3 match 183 stars 9.70 score 126 scripts 1 dependentsbioc
BiocGenerics:S4 generic functions used in Bioconductor
The package defines many S4 generic functions used in Bioconductor.
Maintained by Hervé Pagès. Last updated 1 months ago.
infrastructurebioconductor-packagecore-package
2.3 match 12 stars 14.22 score 612 scripts 2.2k dependentsbioc
progeny:Pathway RespOnsive GENes for activity inference from gene expression
PROGENy is a resource that leverages a large compendium of publicly available signaling perturbation experiments to yield a common core of pathway responsive genes for human and mouse. These, coupled with any statistical method, can be used to infer pathway activities from bulk or single-cell transcriptomics.
Maintained by Aurélien Dugourd. Last updated 5 months ago.
systemsbiologygeneexpressionfunctionalpredictiongeneregulation
3.6 match 99 stars 8.90 score 221 scripts 1 dependentshfgolino
EGAnet:Exploratory Graph Analysis – a Framework for Estimating the Number of Dimensions in Multivariate Data using Network Psychometrics
Implements the Exploratory Graph Analysis (EGA) framework for dimensionality and psychometric assessment. EGA estimates the number of dimensions in psychological data using network estimation methods and community detection algorithms. A bootstrap method is provided to assess the stability of dimensions and items. Fit is evaluated using the Entropy Fit family of indices. Unique Variable Analysis evaluates the extent to which items are locally dependent (or redundant). Network loadings provide similar information to factor loadings and can be used to compute network scores. A bootstrap and permutation approach are available to assess configural and metric invariance. Hierarchical structures can be detected using Hierarchical EGA. Time series and intensive longitudinal data can be analyzed using Dynamic EGA, supporting individual, group, and population level assessments.
Maintained by Hudson Golino. Last updated 8 days ago.
4.1 match 47 stars 7.80 score 61 scripts 1 dependentsbioc
MPAC:Multi-omic Pathway Analysis of Cells
Multi-omic Pathway Analysis of Cells (MPAC) integrates multi-omic data for understanding cellular mechanisms. It predicts novel patient groups with distinct pathway profiles and identifies key pathway proteins with potential clinical associations. From CNA and RNA-seq data, it determines genes’ DNA and RNA states (i.e., repressed, normal, or activated), which serve as the input for PARADIGM to calculate Inferred Pathway Levels (IPLs). It also permutes DNA and RNA states to create a background distribution to filter IPLs as a way to remove events observed by chance. It provides multiple methods for downstream analysis and visualization.
Maintained by Peng Liu. Last updated 15 hours ago.
softwaretechnologysequencingrnaseqsurvivalclusteringimmunooncology
7.5 match 4.20 score 1 scriptsjlessler
IDSpatialStats:Estimate Global Clustering in Infectious Disease
Implements various novel and standard clustering statistics and other analyses useful for understanding the spread of infectious disease.
Maintained by Justin Lessler. Last updated 8 months ago.
11.6 match 1 stars 2.69 score 33 scriptsdkahle
mpoly:Symbolic Computation and More with Multivariate Polynomials
Symbolic computing with multivariate polynomials in R.
Maintained by David Kahle. Last updated 4 months ago.
5.0 match 12 stars 6.25 score 70 scripts 7 dependentskhliland
HDANOVA:High-Dimensional Analysis of Variance
Functions and datasets to support Smilde, Marini, Westerhuis and Liland (2025, ISBN: 978-1-394-21121-0) "Analysis of Variance for High-Dimensional Data - Applications in Life, Food and Chemical Sciences". This implements and imports a collection of methods for HD-ANOVA data analysis with common interfaces, result- and plotting functions, multiple real data sets and four vignettes covering a range of different applications.
Maintained by Kristian Hovde Liland. Last updated 2 days ago.
7.1 match 4.35 score 8 scripts 1 dependentscran
network:Classes for Relational Data
Tools to create and modify network objects. The network class can represent a range of relational data types, and supports arbitrary vertex/edge/graph attributes.
Maintained by Carter T. Butts. Last updated 3 months ago.
4.0 match 3 stars 7.65 score 146 dependentstomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 years ago.
3.8 match 3 stars 8.20 score 7.8k scripts 11 dependentsbioc
INDEED:Interactive Visualization of Integrated Differential Expression and Differential Network Analysis for Biomarker Candidate Selection Package
An R package for integrated differential expression and differential network analysis based on omic data for cancer biomarker discovery. Both correlation and partial correlation can be used to generate differential network to aid the traditional differential expression analysis to identify changes between biomolecules on both their expression and pairwise association levels. A detailed description of the methodology has been published in Methods journal (PMID: 27592383). An interactive visualization feature allows for the exploration and selection of candidate biomarkers.
Maintained by Ressom group. Last updated 5 months ago.
immunooncologysoftwareresearchfieldbiologicalquestionstatisticalmethoddifferentialexpressionmassspectrometrymetabolomics
5.1 match 4 stars 5.92 score 10 scriptsannavesely
sumSome:True Discovery Guarantee by Sum-Based Tests
Quickly performs closed testing by sum-based global tests and constructs lower confidence bounds for the TDP, simultaneously over all subsets of hypotheses. As main features, it produces permutation-based simultaneous lower confidence bounds for the proportion of active voxels in clusters for fMRI data, differentially expressed genes in pathways for gene expression data, and significant effects for multiverse analysis. Details may be found in Vesely et al. (2023) <doi:10.1093/jrsssb/qkad019> and Tian et al. (2022) <doi:10.1111/sjos.12614>.
Maintained by Anna Vesely. Last updated 2 months ago.
11.2 match 1 stars 2.70 score 3 scriptssalvatoremangiafico
rcompanion:Functions to Support Extension Education Program Evaluation
Functions and datasets to support Summary and Analysis of Extension Program Evaluation in R, and An R Companion for the Handbook of Biological Statistics. Vignettes are available at <https://rcompanion.org>.
Maintained by Salvatore Mangiafico. Last updated 30 days ago.
3.8 match 4 stars 8.01 score 2.4k scripts 5 dependentsbioc
DelayedTensor:R package for sparse and out-of-core arithmetic and decomposition of Tensor
DelayedTensor performs Tensor arithmetic directly on DelayedArray objects. DelayedTensor provides some generic functions related to Tensor arithmetic/decomposition and dispatches them on the DelayedArray class. DelayedTensor also supports Tensor contraction by the einsum function, which is inspired by numpy einsum.
Maintained by Koki Tsuyuzaki. Last updated 5 months ago.
softwareinfrastructuredatarepresentationdimensionreduction
6.3 match 4 stars 4.68 score 3 scriptshanjunwei-lab
ICDS:Identification of Cancer Dysfunctional Subpathway with Omics Data
Identifies cancer dysfunctional sub-pathways by integrating gene expression, DNA methylation, copy number variation, and pathway topological information. 1) We first calculate the gene risk scores by integrating three kinds of data: DNA methylation, copy number variation, and gene expression. 2) We then run a greedy search algorithm to identify the key dysfunctional sub-pathways within the pathways for which the discriminative scores are locally maximal. 3) Finally, a permutation test is used to calculate the statistical significance level of these key dysfunctional sub-pathways.
Maintained by Junwei Han. Last updated 8 months ago.
6.5 match 4.54 score 3 scriptspboutros
bedr:Genomic Region Processing using Tools Such as 'BEDTools', 'BEDOPS' and 'Tabix'
Genomic regions processing using open-source command line tools such as 'BEDTools', 'BEDOPS' and 'Tabix'. These tools offer scalable and efficient utilities to perform genome arithmetic, e.g. indexing, formatting and merging. The bedr API enhances access to these tools and offers additional utilities for genomic regions processing.
Maintained by Paul C. Boutros. Last updated 6 years ago.
5.9 match 4.98 score 264 scripts 2 dependentstidymodels
infer:Tidy Statistical Inference
The objective of this package is to perform inference using an expressive statistical grammar that coheres with the tidy design framework.
Maintained by Simon Couch. Last updated 6 months ago.
1.9 match 734 stars 15.69 score 3.5k scripts 17 dependentswviechtb
metafor:Meta-Analysis Package for R
A comprehensive collection of functions for conducting meta-analyses in R. The package includes functions to calculate various effect sizes or outcome measures, fit equal-, fixed-, random-, and mixed-effects models to such data, carry out moderator and meta-regression analyses, and create various types of meta-analytical plots (e.g., forest, funnel, radial, L'Abbe, Baujat, bubble, and GOSH plots). For meta-analyses of binomial and person-time data, the package also provides functions that implement specialized methods, including the Mantel-Haenszel method, Peto's method, and a variety of suitable generalized linear (mixed-effects) models (i.e., mixed-effects logistic and Poisson regression models). Finally, the package provides functionality for fitting meta-analytic multivariate/multilevel models that account for non-independent sampling errors and/or true effects (e.g., due to the inclusion of multiple treatment studies, multiple endpoints, or other forms of clustering). Network meta-analyses and meta-analyses accounting for known correlation structures (e.g., due to phylogenetic relatedness) can also be conducted. An introduction to the package can be found in Viechtbauer (2010) <doi:10.18637/jss.v036.i03>.
Maintained by Wolfgang Viechtbauer. Last updated 22 hours ago.
meta-analysis mixed-effects multilevel-models multivariate
1.8 match 246 stars 16.30 score 4.9k scripts 92 dependents
vathymut
dsos:Dataset Shift with Outlier Scores
Test for no adverse shift in a two-sample comparison when we have a training set (the reference distribution) and a test set. The approach is flexible and relies on a robust and powerful test statistic, the weighted AUC. Technical details are in Kamulete, V. M. (2021) <arXiv:1908.04000>. Modern notions of outlyingness, such as trust scores and prediction uncertainty, can be used as the underlying scores.
Maintained by Vathy M. Kamulete. Last updated 2 years ago.
data-drift data-validation dataset-shifts drift-detection machine-learning mlops model-monitoring model-validation performance-monitoring statistical-process-control statistical-tests
5.8 match 2 stars 5.08 score 40 scripts
cran
permutest:Run Permutation Tests and Construct Associated Confidence Intervals
Implements permutation tests for any test statistic and randomization scheme and constructs associated confidence intervals as described in Glazer and Stark (2024) <doi:10.48550/arXiv.2405.05238>.
Maintained by Amanda Glazer. Last updated 6 months ago.
17.1 match 1.70 score 2 scripts
ocbe-uio
permChacko:Chacko Test for Order-Restriction with Permutation
Implements an extension of the Chacko chi-square test for ordered vectors (Chacko, 1966, <https://www.jstor.org/stable/25051572>). Our extension brings the Chacko test to the computer age by implementing a permutation test to offer a numeric estimate of the p-value, which is particularly useful when the analytic solution is not available.
Maintained by Waldir Leoncio. Last updated 6 months ago.
6.8 match 4.30 score 3 scripts
merck
gMCPLite:Lightweight Graph Based Multiple Comparison Procedures
A lightweight fork of 'gMCP' with functions for graphically described multiple test procedures introduced in Bretz et al. (2009) <doi:10.1002/sim.3495> and Bretz et al. (2011) <doi:10.1002/bimj.201000239>. Implements a flexible function using 'ggplot2' to create multiplicity graph visualizations. Contains instructions for multiplicity graphs and graphical testing for group sequential design, described in Maurer and Bretz (2013) <doi:10.1080/19466315.2013.807748>, with necessary unit testing using 'testthat'.
Maintained by Nan Xiao. Last updated 1 year ago.
5.0 match 11 stars 5.79 score 14 scripts
alexchristensen
NetworkToolbox:Methods and Measures for Brain, Cognitive, and Psychometric Network Analysis
Implements network analysis and graph theory measures used in neuroscience, cognitive science, and psychology. Methods include various filtering methods and approaches such as threshold, dependency (Kenett, Tumminello, Madi, Gur-Gershgoren, Mantegna, & Ben-Jacob, 2010 <doi:10.1371/journal.pone.0015032>), Information Filtering Networks (Barfuss, Massara, Di Matteo, & Aste, 2016 <doi:10.1103/PhysRevE.94.062306>), and Efficiency-Cost Optimization (Fallani, Latora, & Chavez, 2017 <doi:10.1371/journal.pcbi.1005305>). Brain methods include the recently developed Connectome Predictive Modeling (see references in package). Also implements several network measures including local network characteristics (e.g., centrality), community-level network characteristics (e.g., community centrality), global network characteristics (e.g., clustering coefficient), and various other measures associated with the reliability and reproducibility of network analysis.
Maintained by Alexander Christensen. Last updated 2 years ago.
4.1 match 23 stars 6.99 score 101 scripts 4 dependents
dwarton
ecostats:Code and Data Accompanying the Eco-Stats Text (Warton 2022)
Functions and data supporting the Eco-Stats text (Warton, 2022, Springer), and solutions to exercises. Functions include tools for using simulation envelopes in diagnostic plots, and a function for diagnostic plots of multivariate linear models. Datasets mentioned in the package are included here (where not available elsewhere) and there is a vignette for each chapter of the text with solutions to exercises.
Maintained by David Warton. Last updated 1 year ago.
4.4 match 8 stars 6.58 score 53 scripts
bioc
Category:Category Analysis
A collection of tools for performing category (gene set enrichment) analysis.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
annotation go pathways genesetenrichment
3.6 match 7.93 score 183 scripts 16 dependents
talgalili
dendextend:Extending 'dendrogram' Functionality in R
Offers a set of functions for extending 'dendrogram' objects in R, letting you visualize and compare trees of 'hierarchical clusterings'. You can (1) Adjust a tree's graphical parameters - the color, size, type, etc of its branches, nodes and labels. (2) Visually and statistically compare different 'dendrograms' to one another.
Maintained by Tal Galili. Last updated 2 months ago.
1.7 match 154 stars 17.02 score 6.0k scripts 164 dependents
bioc
PolySTest:PolySTest: Detection of differentially regulated features. Combined statistical testing for data with few replicates and missing values
The complexity of high-throughput quantitative omics experiments often leads to low replicate numbers and many missing values. We implemented a new test to simultaneously consider missing values and quantitative changes, which we combined with well-performing statistical tests for high-confidence detection of differentially regulated features. The package contains functions to run the test and to visualize the results.
Maintained by Veit Schwämmle. Last updated 4 months ago.
massspectrometry proteomics software differentialexpression
5.8 match 4.95 score 12 scripts
loelschlaeger
oeli:Utilities for Developing Data Science Software
Some general helper functions that I (and maybe others) find useful when developing data science software.
Maintained by Lennart Oelschläger. Last updated 4 months ago.
5.3 match 2 stars 5.42 score 1 script 4 dependents
lukketotte
MultSurvTests:Permutation Tests for Multivariate Survival Analysis
Multivariate version of the two-sample Gehan and logrank tests, as described in L.J Wei & J.M Lachin (1984) and Persson et al. (2019).
Maintained by Lukas Arnroth. Last updated 4 years ago.
10.4 match 2.70 score 2 scripts
mhahsler
arules:Mining Association Rules and Frequent Itemsets
Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.
Maintained by Michael Hahsler. Last updated 1 months ago.
arules association-rules frequent-itemsets
2.0 match 194 stars 13.99 score 3.3k scripts 28 dependents
bioc
GraphAlignment:GraphAlignment
Graph alignment is an extension package for the R programming environment which provides functions for finding an alignment between two networks based on link and node similarity scores. (J. Berg and M. Laessig, "Cross-species analysis of biological networks by Bayesian alignment", PNAS 103 (29), 10967-10972 (2006))
Maintained by Joern P. Meier. Last updated 5 months ago.
7.1 match 3.90 score 9 scripts
gbm-developers
gbm:Generalized Boosted Regression Models
An implementation of extensions to Freund and Schapire's AdaBoost algorithm and Friedman's gradient boosting machine. Includes regression methods for least squares, absolute loss, t-distribution loss, quantile regression, logistic, multinomial logistic, Poisson, Cox proportional hazards partial likelihood, AdaBoost exponential loss, Huberized hinge loss, and Learning to Rank measures (LambdaMart). Originally developed by Greg Ridgeway. Newer version available at github.com/gbm-developers/gbm3.
Maintained by Greg Ridgeway. Last updated 9 months ago.
2.0 match 52 stars 13.85 score 6.8k scripts 91 dependents
michbur
biogram:N-Gram Analysis of Biological Sequences
Tools for extraction and analysis of various n-grams (k-mers) derived from biological sequences (proteins or nucleic acids). Contains QuiPT (quick permutation test) for fast feature-filtering of the n-gram data.
Maintained by Michal Burdukiewicz. Last updated 7 months ago.
biological-sequences ngram-analysis
3.6 match 10 stars 7.50 score 87 scripts 3 dependents
asgr
imager:Image Processing Library Based on 'CImg'
Fast image processing for images in up to 4 dimensions (two spatial dimensions, one time/depth dimension, one colour dimension). Provides most traditional image processing tools (filtering, morphology, transformations, etc.) as well as various functions for easily analysing image data using R. The package wraps 'CImg', <http://cimg.eu>, a simple, modern C++ library for image processing.
Maintained by Aaron Robotham. Last updated 25 days ago.
2.0 match 17 stars 13.62 score 2.4k scripts 45 dependents
cran
jmuOutlier:Permutation Tests for Nonparametric Statistics
Performs a permutation test on the difference between two location parameters, a permutation correlation test, a permutation F-test, the Siegel-Tukey test, and a ratio mean deviance test. Also provides some graphing techniques, such as for confidence intervals, vector addition, and Fourier analysis, and includes functions related to the Laplace (double exponential) and triangular distributions. Performs power calculations for the binomial test.
Maintained by Steven T. Garren. Last updated 6 years ago.
12.1 match 2.26 score 1 dependent
bioc
HEM:Heterogeneous error model for identification of differentially expressed genes under multiple conditions
This package fits heterogeneous error models for analysis of microarray data
Maintained by HyungJun Cho. Last updated 5 months ago.
microarray differentialexpression
6.3 match 4.30 score 6 scripts
metabocomp
MUVR2:Multivariate Methods with Unbiased Variable Selection
Predictive multivariate modelling for metabolomics. Types: classification and regression. Methods: Partial Least Squares, Random Forest and Elastic Net. Data structures: paired and unpaired. Validation: repeated double cross-validation (Westerhuis et al. (2008) <doi:10.1007/s11306-007-0099-6>, Filzmoser et al. (2009) <doi:10.1002/cem.1225>). Variable selection: performed internally, through tuning in the inner cross-validation loop.
Maintained by Yingxiao Yan. Last updated 6 months ago.
7.1 match 1 star 3.81 score 1 script
cran
AUtests:Approximate Unconditional and Permutation Tests
Performs approximate unconditional and permutation testing for 2x2 contingency tables. Motivated by testing for disease association with rare genetic variants in case-control studies. When variants are extremely rare, these tests give better control of Type I error than standard tests.
Maintained by Arjun Sondhi. Last updated 5 years ago.
13.4 match 2.00 score
aaamini
nett:Network Analysis and Community Detection
Features tools for network data analysis and community detection. Provides multiple methods for fitting, model selection and goodness-of-fit testing in degree-corrected stochastic block models. Most of the computations are fast and scalable for sparse networks, especially for Poisson versions of the models. Implements the following: Amini, Chen, Bickel and Levina (2013) <doi:10.1214/13-AOS1138>; Bickel and Sarkar (2015) <doi:10.1111/rssb.12117>; Lei (2016) <doi:10.1214/15-AOS1370>; Wang and Bickel (2017) <doi:10.1214/16-AOS1457>; Zhang and Amini (2020) <arXiv:2012.15047>; Le and Levina (2022) <doi:10.1214/21-EJS1971>.
Maintained by Arash A. Amini. Last updated 2 years ago.
4.8 match 8 stars 5.48 score 19 scripts
cran
wPerm:Permutation Tests
Supplies permutation-test alternatives to traditional hypothesis-test procedures such as two-sample tests for means, medians, and standard deviations; correlation tests; tests for homogeneity and independence; and more. Suitable for general audiences, including individual and group users, introductory statistics courses, and more advanced statistics courses that desire an introduction to permutation tests.
Maintained by Neil A. Weiss. Last updated 9 years ago.
20.0 match 1.30 score
r-gregmisc
gdata:Various R Programming Tools for Data Manipulation
Various R programming tools for data manipulation, including medical unit conversions, combining objects, character vector operations, factor manipulation, obtaining information about R objects, generating fixed-width format files, extracting components of date & time objects, operations on columns of data frames, matrix operations, operations on vectors, operations on data frames, value of last evaluated expression, and a resample() wrapper for sample() that ensures consistent behavior for both scalar and vector arguments.
Maintained by Arni Magnusson. Last updated 2 months ago.
1.9 match 9 stars 13.62 score 4.5k scripts 124 dependents
winvector
sigr:Succinct and Correct Statistical Summaries for Reports
Succinctly and correctly format statistical summaries of various models and tests (F-test, Chi-Sq-test, Fisher-test, T-test, and rank-significance). This package also includes empirical tests, such as Monte Carlo and bootstrap distribution estimates.
Maintained by John Mount. Last updated 2 years ago.
3.5 match 28 stars 7.18 score 97 scripts 1 dependent
stamats
MKinfer:Inferential Statistics
Computation of various confidence intervals (Altman et al. (2000), ISBN:978-0-727-91375-3; Hedderich and Sachs (2018), ISBN:978-3-662-56657-2) including bootstrapped versions (Davison and Hinkley (1997), ISBN:978-0-511-80284-3) as well as Hsu (Hedderich and Sachs (2018), ISBN:978-3-662-56657-2), permutation (Janssen (1997), <doi:10.1016/S0167-7152(97)00043-6>), bootstrap (Davison and Hinkley (1997), ISBN:978-0-511-80284-3), intersection-union (Sozu et al. (2015), ISBN:978-3-319-22005-5) and multiple imputation (Barnard and Rubin (1999), <doi:10.1093/biomet/86.4.948>) t-test; furthermore, computation of intersection-union z-test as well as multiple imputation Wilcoxon tests. Graphical visualization by volcano and Bland-Altman plots (Bland and Altman (1986), <doi:10.1016/S0140-6736(86)90837-8>; Shieh (2018), <doi:10.1186/s12874-018-0505-y>).
Maintained by Matthias Kohl. Last updated 11 months ago.
3.8 match 6 stars 6.56 score 71 scripts 4 dependents
winvector
wrapr:Wrap R Tools for Debugging and Parametric Programming
Tools for writing and debugging R code. Provides: '%.>%' dot-pipe (an 'S3' configurable pipe), unpack/to (R style multiple assignment/return), 'build_frame()'/'draw_frame()' ('data.frame' example tools), 'qc()' (quoting concatenate), ':=' (named map builder), 'let()' (converts non-standard evaluation interfaces to parametric standard evaluation interfaces, inspired by 'gtools::strmacro()' and 'base::bquote()'), and more.
Maintained by John Mount. Last updated 2 years ago.
2.3 match 137 stars 11.11 score 390 scripts 12 dependents
bioc
sRACIPE:Systems biology tool to simulate gene regulatory circuits
sRACIPE implements a randomization-based method for gene circuit modeling. It allows us to study the effect of both the gene expression noise and the parametric variation on any gene regulatory circuit (GRC) using only its topology, and simulates an ensemble of models with random kinetic parameters at multiple noise levels. Statistical analysis of the generated gene expressions reveals the basin of attraction and stability of various phenotypic states and their changes associated with intrinsic and extrinsic noises. sRACIPE provides a holistic picture to evaluate the effects of both the stochastic nature of cellular processes and the parametric variation.
Maintained by Mingyang Lu. Last updated 18 days ago.
researchfield systemsbiology mathematicalbiology geneexpression generegulation genetarget cpp
3.9 match 4 stars 6.40 score 209 scripts
daqana
dqrng:Fast Pseudo Random Number Generators
Several fast random number generators are provided as C++ header only libraries: The PCG family by O'Neill (2014 <https://www.cs.hmc.edu/tr/hmc-cs-2014-0905.pdf>) as well as the Xoroshiro / Xoshiro family by Blackman and Vigna (2021 <doi:10.1145/3460772>). In addition fast functions for generating random numbers according to a uniform, normal and exponential distribution are included. The latter two use the Ziggurat algorithm originally proposed by Marsaglia and Tsang (2000, <doi:10.18637/jss.v005.i08>). The fast sampling methods support unweighted sampling both with and without replacement. These functions are exported to R and as a C++ interface and are enabled for use with the default 64 bit generator from the PCG family, Xoroshiro128+/++/** and Xoshiro256+/++/** as well as the 64 bit version of the 20 rounds Threefry engine (Salmon et al., 2011, <doi:10.1145/2063384.2063405>) as provided by the package 'sitmo'.
Maintained by Ralf Stubner. Last updated 6 months ago.
random random-distributions random-generation random-sampling rng cpp
1.9 match 42 stars 13.12 score 188 scripts 183 dependents
pgiraudoux
pgirmess:Spatial Analysis and Data Mining for Field Ecologists
Set of tools for reading, writing and transforming spatial and seasonal data, model selection and specific statistical tests for ecologists. It includes functions to interpolate regular positions of points between landmarks, to discretize polylines into regular point positions, to link distant observations to points and to convert a bounding box into a spatial object. It also provides miscellaneous functions for field ecologists, such as spatial statistics and inference on diversity indexes, and writing a data.frame with Chinese characters.
Maintained by Patrick Giraudoux. Last updated 1 year ago.
3.4 match 5 stars 7.32 score 422 scripts 2 dependents
tesselle
kairos:Analysis of Chronological Patterns from Archaeological Count Data
A toolkit for absolute and relative dating and analysis of chronological patterns. This package includes functions for chronological modeling and dating of archaeological assemblages from count data. It provides methods for matrix seriation. It also allows computing time point estimates and density estimates of the occupation and duration of an archaeological site.
Maintained by Nicolas Frerebeau. Last updated 11 days ago.
chronology matrix-seriation archaeology archaeological-science
5.3 match 4.66 score 11 scripts 1 dependent
mike-lawrence
ez:Easy Analysis and Visualization of Factorial Experiments
Facilitates easy analysis of factorial experiments, including purely within-Ss designs (a.k.a. "repeated measures"), purely between-Ss designs, and mixed within-and-between-Ss designs. The functions in this package aim to provide simple, intuitive and consistent specification of data analysis and visualization. Visualization functions also include design visualization for pre-analysis data auditing, and correlation matrix visualization. Finally, this package includes functions for non-parametric analysis, including permutation tests and bootstrap resampling. The bootstrap function obtains predictions either by cell means or by more advanced/powerful mixed effects models, yielding predictions and confidence intervals that may be easily visualized at any level of the experiment's design.
Maintained by Michael A. Lawrence. Last updated 8 years ago.
2.4 match 53 stars 10.28 score 2.7k scripts 12 dependents
gabrielgesteira
qtlpoly:Random-Effect Multiple QTL Mapping in Autopolyploids
Performs random-effect multiple interval mapping (REMIM) in full-sib families of autopolyploid species based on restricted maximum likelihood (REML) estimation and score statistics, as described in Pereira et al. (2020) <doi:10.1534/genetics.120.303080>.
Maintained by Gabriel de Siqueira Gesteira. Last updated 4 months ago.
polyploid qtl-mapping openblas cpp openmp
4.7 match 6 stars 5.17 score 61 scripts
choi-phd
lordif:Logistic Ordinal Regression Differential Item Functioning using IRT
Performs analysis of Differential Item Functioning (DIF) for dichotomous and polytomous items using an iterative hybrid of ordinal logistic regression and item response theory (IRT) according to Choi, Gibbons, and Crane (2011) <doi:10.18637/jss.v039.i08>.
Maintained by Seung W. Choi. Last updated 2 months ago.
4.8 match 1 star 5.12 score 35 scripts 1 dependent
bioc
viper:Virtual Inference of Protein-activity by Enriched Regulon analysis
Inference of protein activity from gene expression data, including the VIPER and msVIPER algorithms
Maintained by Mariano J Alvarez. Last updated 5 months ago.
systemsbiology networkenrichment geneexpression functionalprediction generegulation
3.5 match 7.00 score 342 scripts 5 dependents
henrikbengtsson
R.utils:Various Programming Utilities
Utility functions useful when programming and developing R packages.
Maintained by Henrik Bengtsson. Last updated 1 year ago.
1.8 match 63 stars 13.74 score 5.7k scripts 814 dependents
bioc
npGSEA:Permutation approximation methods for gene set enrichment analysis (non-permutation GSEA)
Current gene set enrichment methods rely upon permutations for inference. These approaches are computationally expensive and have minimum achievable p-values based on the number of permutations, not on the actual observed statistics. We have derived three parametric approximations to the permutation distributions of two gene set enrichment test statistics. We are able to reduce the computational burden and granularity issues of permutation testing with our method, which is implemented in this package. npGSEA calculates gene set enrichment statistics and p-values without the computational cost of permutations. It is applicable in settings where one or many gene sets are of interest. There are also built-in plotting functions to help users visualize results.
Maintained by Jessica Larson. Last updated 5 months ago.
genesetenrichment microarray statisticalmethod pathways
7.3 match 3.30 score 4 scripts