R-universe search: iterator

revolutionanalytics

iterators:Provides Iterator Construct

Support for iterators, which allow a programmer to traverse through all the elements of a vector, list, or other collection of data.

Maintained by Folashade Daniel. Last updated 3 years ago.

86.8 match 5 stars 13.74 score 1.7k scripts 2.8k dependents

crowding

iterors:Fast, Compact Iterators and Tools

A fresh take on iterators in R. Designed to be cross-compatible with the 'iterators' package, but using the 'nextOr' method will offer better performance as well as more compact code. With batteries included: includes a collection of iterator constructors and combinators ported and refined from the 'iterators', 'itertools', and 'itertools2' packages.

Maintained by Peter Meilstrup. Last updated 2 years ago.

130.8 match 4 stars 6.02 score 21 scripts

ramhiser

itertools2:Iterators for efficient looping

A port of Python's excellent itertools module to R for efficient looping.

Maintained by John A. Ramey. Last updated 9 years ago.

itertools

90.9 match 12 stars 5.10 score 35 scripts 2 dependents

steveweston

itertools:Iterator Tools

Various tools for creating iterators, many patterned after functions in the Python itertools module, and others patterned after functions in the 'snow' package.

Maintained by Steve Weston. Last updated 11 years ago.

52.9 match 8.26 score 8.3k scripts 65 dependents

rstudio

reticulate:Interface to 'Python'

Interface to 'Python' modules, classes, and functions. When calling into 'Python', R data types are automatically converted to their equivalent 'Python' types. When values are returned from 'Python' to R they are converted back to R types. Compatible with all versions of 'Python' >= 2.7.

Maintained by Tomasz Kalinowski. Last updated 11 hours ago.

cpp

14.8 match 1.7k stars 21.07 score 18k scripts 427 dependents

r-lib

coro:'Coroutines' for R

Provides 'coroutines' for R, a family of functions that can be suspended and resumed later on. This includes 'async' functions (which await) and generators (which yield). 'Async' functions are based on the concurrency framework of the 'promises' package. Generators are based on a dependency free iteration protocol defined in 'coro' and are compatible with iterators from the 'reticulate' package.

Maintained by Lionel Henry. Last updated 19 days ago.

async coroutines generator iterator promises reticulate

24.7 match 166 stars 11.88 score 105 scripts 50 dependents

mlverse

torch:Tensors and Neural Networks with 'GPU' Acceleration

Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.

Maintained by Daniel Falbel. Last updated 5 days ago.

autograd deep-learning torch cpp

15.5 match 520 stars 16.52 score 1.4k scripts 38 dependents

jwood000

RcppAlgos:High Performance Tools for Combinatorics and Computational Mathematics

Provides optimized functions and flexible iterators implemented in C++ for solving problems in combinatorics and computational mathematics. Handles various combinatorial objects including combinations, permutations, integer partitions and compositions, Cartesian products, unordered Cartesian products, and partition of groups. Utilizes the RMatrix class from 'RcppParallel' for thread safety. The combination and permutation functions contain constraint parameters that allow for generation of all results of a vector meeting specific criteria (e.g. finding all combinations such that the sum is between two bounds). Capable of ranking/unranking combinatorial objects efficiently (e.g. retrieve only the nth lexicographical result) which sets up nicely for parallelization as well as random sampling. Gmp support permits exploration where the total number of results is large (e.g. comboSample(10000, 500, n = 4)). Additionally, there are several high performance number theoretic functions that are useful for problems common in computational mathematics. Some of these functions make use of the fast integer division library 'libdivide'. The primeSieve function is based on the segmented sieve of Eratosthenes implementation by Kim Walisch. It is also efficient for large numbers by using the cache friendly improvements originally developed by Tomás Oliveira. Finally, there is a prime counting function that implements Legendre's formula based on the work of Kim Walisch.

Maintained by Joseph Wood. Last updated 1 month ago.

combinations combinatorics factorization number-theory parallel permutation prime-factorizations primesieve gmp cpp

25.2 match 45 stars 10.04 score 153 scripts 12 dependents

stevecondylios

iteratoR:Print Loop Iterations at Exponentially Disparate Intervals

Know which loop iteration the code execution is up to by including a single, convenient function call inside the loop.

Maintained by Steve Condylios. Last updated 3 years ago.

58.3 match 1 star 2.70 score 4 scripts

rstudio

tfdatasets:Interface to 'TensorFlow' Datasets

Interface to 'TensorFlow' Datasets, a high-level library for building complex input pipelines from simple, re-usable pieces. See <https://www.tensorflow.org/guide> for additional details.

Maintained by Tomasz Kalinowski. Last updated 3 days ago.

15.6 match 34 stars 9.32 score 656 scripts 3 dependents

kingaa

pomp:Statistical Inference for Partially Observed Markov Processes

Tools for data analysis with partially observed Markov process (POMP) models (also known as stochastic dynamical systems, hidden Markov models, and nonlinear, non-Gaussian, state-space models). The package provides facilities for implementing POMP models, simulating them, and fitting them to time series data by a variety of frequentist and Bayesian methods. It is also a versatile platform for implementation of inference methods for general POMP models.

Maintained by Aaron A. King. Last updated 1 month ago.

abc b-spline differential-equations dynamical-systems iterated-filtering likelihood likelihood-free markov-chain-monte-carlo markov-model mathematical-modelling measurement-error particle-filter sequential-monte-carlo simulation-based-inference sobol-sequence state-space statistical-inference stochastic-processes time-series openblas

12.3 match 115 stars 11.81 score 1.3k scripts 4 dependents

samuel-marsh

scCustomize:Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing

Collection of functions created and/or curated to aid in the visualization and analysis of single-cell data using 'R'. 'scCustomize' aims to 1) provide customized visualizations that are easier to use and more aesthetic and functional, and 2) improve the speed and reproducibility of common tasks and pieces of code in scRNA-seq analysis with a single function or a group of functions. For citation please use: Marsh SE (2021) "Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing" <doi:10.5281/zenodo.5706430> RRID:SCR_024675.

Maintained by Samuel Marsh. Last updated 3 months ago.

customization ggplot2 scrna-seq seurat single-cell single-cell-genomics single-cell-rna-seq visualization

16.3 match 242 stars 8.75 score 1.1k scripts

rpahl

container:Extending Base 'R' Lists

Extends the functionality of base 'R' lists and provides specialized data structures 'deque', 'set', 'dict', and 'dict.table', the latter to extend the 'data.table' package.

Maintained by Roman Pahl. Last updated 2 months ago.

container data-structures deque dict sets

18.7 match 16 stars 7.13 score 140 scripts

bioc

iterativeBMAsurv:The Iterative Bayesian Model Averaging (BMA) Algorithm For Survival Analysis

The iterative Bayesian Model Averaging (BMA) algorithm for survival analysis is a variable selection method for applying survival analysis to microarray data.

Maintained by Ka Yee Yeung. Last updated 5 months ago.

microarray

38.8 match 3.30 score 8 scripts

flr

FLCore:Core Package of FLR, Fisheries Modelling in R

Core classes and methods for FLR, a framework for fisheries modelling and management strategy simulation in R. Developed by a team of fisheries scientists in various countries. More information can be found at <http://flr-project.org/>.

Maintained by Iago Mosqueira. Last updated 8 days ago.

fisheries flr fisheries-modelling

13.6 match 16 stars 8.78 score 956 scripts 23 dependents

bioc

iterativeBMA:The Iterative Bayesian Model Averaging (BMA) algorithm

The iterative Bayesian Model Averaging (BMA) algorithm is a variable selection and classification algorithm with an application of classifying 2-class microarray samples, as described in Yeung, Bumgarner and Raftery (Bioinformatics 2005, 21: 2394-2402).

Maintained by Ka Yee Yeung. Last updated 5 months ago.

microarray classification

31.3 match 3.78 score 1 script

randy3k

iterpc:Efficient Iterator for Permutations and Combinations

Iterator for generating permutations and combinations. They can be drawn with or without replacement, and with distinct or non-distinct items (multiset). The generated sequences are in lexicographical order (dictionary order). The algorithms to generate permutations and combinations are memory efficient. These iterative algorithms enable users to process all sequences without holding all results in memory at the same time. The algorithms are written in C/C++ for faster performance. Note: 'iterpc' is no longer being maintained. Users are advised to switch to 'arrangements'.

Maintained by Randy Lai. Last updated 5 years ago.

16.4 match 9 stars 7.17 score 47 scripts 5 dependents

randy3k

arrangements:Fast Generators and Iterators for Permutations, Combinations, Integer Partitions and Compositions

Fast generators and iterators for permutations, combinations, integer partitions and compositions. The arrangements are in lexicographical order and generated iteratively in a memory efficient manner. It has been demonstrated that 'arrangements' outperforms most existing packages of a similar kind. Benchmarks can be found at <https://randy3k.github.io/arrangements/articles/benchmark.html>.

Maintained by Randy Lai. Last updated 2 years ago.

gmp

14.3 match 52 stars 7.89 score 118 scripts 23 dependents

mlopez-ibanez

irace:Iterated Racing for Automatic Algorithm Configuration

Iterated race is an extension of the Iterated F-race method for the automatic configuration of optimization algorithms, that is, (offline) tuning their parameters by finding the most appropriate settings given a set of instances of an optimization problem. M. López-Ibáñez, J. Dubois-Lacoste, L. Pérez Cáceres, T. Stützle, and M. Birattari (2016) <doi:10.1016/j.orp.2016.09.002>.

Maintained by Manuel López-Ibáñez. Last updated 29 days ago.

algorithm-configuration hyperparameter-tuning irace optimization-algorithms

10.9 match 63 stars 10.28 score 103 scripts 1 dependent

inlabru-org

inlabru:Bayesian Latent Gaussian Modelling using INLA and Extensions

Facilitates spatial and general latent Gaussian modeling using integrated nested Laplace approximation via the INLA package (<https://www.r-inla.org>). Additionally, extends the GAM-like model class to more general nonlinear predictor expressions, and implements a log Gaussian Cox process likelihood for modeling univariate and spatial point processes based on ecological survey data. Model components are specified with general inputs and mapping methods to the latent variables, and the predictors are specified via general R expressions, with separate expressions for each observation likelihood model in multi-likelihood models. A prediction method based on fast Monte Carlo sampling allows posterior prediction of general expressions of the latent variables. Ecology-focused introduction in Bachl, Lindgren, Borchers, and Illian (2019) <doi:10.1111/2041-210X.13168>.

Maintained by Finn Lindgren. Last updated 2 days ago.

8.1 match 96 stars 12.62 score 832 scripts 6 dependents

poissonconsulting

nlist:Lists of Numeric Atomic Objects

Create and manipulate numeric list ('nlist') objects. An 'nlist' is an S3 list of uniquely named numeric objects. A numeric object is an integer or double vector, matrix or array. An 'nlists' object is an S3 class list of 'nlist' objects with the same names, dimensionalities and typeofs. Numeric list objects are of interest because they are the raw data inputs for analytic engines such as 'JAGS', 'STAN' and 'TMB'. Numeric list objects, which are useful for storing multiple realizations of simulated data sets, can be converted to coda::mcmc and coda::mcmc.list objects.

Maintained by Joe Thorley. Last updated 2 months ago.

data-frame natomic nlist nlists

13.5 match 6 stars 7.23 score 13 scripts 12 dependents

statistikat

VIM:Visualization and Imputation of Missing Values

New tools for the visualization of missing and/or imputed values are introduced, which can be used for exploring the data and the structure of the missing and/or imputed values. Depending on this structure of the missing values, the corresponding methods may help to identify the mechanism generating the missing values and allow exploration of the data including missing values. In addition, the quality of imputation can be visually explored using various univariate, bivariate, multiple and multivariate plot methods. A graphical user interface available in the separate package VIMGUI allows easy handling of the implemented plot methods.

Maintained by Matthias Templ. Last updated 7 months ago.

hotdeck imputation-methods model-predictions visualization cpp

6.5 match 85 stars 14.44 score 2.6k scripts 19 dependents

tanaylab

naryn:Native Access Medical Record Retriever for High Yield Analytics

A toolkit for medical records data analysis. The 'naryn' package implements an efficient data structure for storing medical records, and provides a set of functions for data extraction, manipulation and analysis.

Maintained by Aviezer Lifshitz. Last updated 3 days ago.

data-analysis medical-records cpp

17.7 match 3 stars 5.26 score 4 scripts

gateslab

gimme:Group Iterative Multiple Model Estimation

Data-driven approach for arriving at person-specific time series models. The method first identifies which relations replicate across the majority of individuals to detect signal from noise. These group-level relations are then used as a foundation for starting the search for person-specific (or individual-level) relations. See Gates & Molenaar (2012) <doi:10.1016/j.neuroimage.2012.06.026>.

Maintained by Kathleen M Gates. Last updated 6 months ago.

11.3 match 26 stars 7.71 score 53 scripts

tanaylab

misha:Toolkit for Analysis of Genomic Data

A toolkit for analysis of genomic data. The 'misha' package implements an efficient data structure for storing genomic data, and provides a set of functions for data extraction, manipulation and analysis. Some of the 2D genome algorithms were described in Yaffe and Tanay (2011) <doi:10.1038/ng.947>.

Maintained by Aviezer Lifshitz. Last updated 4 days ago.

genomic-data-analysis cpp

14.8 match 4 stars 5.86 score

poissonconsulting

mcmcr:Manipulate MCMC Samples

Functions and classes to store, manipulate and summarise Monte Carlo Markov Chain (MCMC) samples. For more information see Brooks et al. (2011) <isbn:978-1-4200-7941-8>.

Maintained by Joe Thorley. Last updated 2 months ago.

coda mcmc

11.3 match 17 stars 7.66 score 111 scripts 10 dependents

michaelhallquist

MplusAutomation:An R Package for Facilitating Large-Scale Latent Variable Analyses in Mplus

Leverages the R language to automate latent variable model estimation and interpretation using 'Mplus', a powerful latent variable modeling program developed by Muthen and Muthen (<https://www.statmodel.com>). Specifically, this package provides routines for creating related groups of models, running batches of models, and extracting and tabulating model parameters and fit statistics.

Maintained by Michael Hallquist. Last updated 2 months ago.

6.6 match 86 stars 12.96 score 664 scripts 13 dependents

kota7

combiter:Combinatorics Iterators

Provides iterators for combinations, permutations, subsets, and Cartesian product, which allow one to go through all elements without creating a huge set of all possible values.

Maintained by Kota Mori. Last updated 7 years ago.

cpp

22.7 match 4 stars 3.56 score 18 scripts

pierre-andre

ibr:Iterative Bias Reduction

Multivariate smoothing using iterative bias reduction with kernel, thin plate splines, Duchon splines or low rank splines.

Maintained by Pierre-Andre Cornillon. Last updated 2 years ago.

openblas

61.5 match 1.28 score 19 scripts

rfastofficial

Rfast:A Collection of Efficient and Extremely Fast R Functions

A collection of fast (utility) functions for data analysis. Column and row wise means, medians, variances, minimums, maximums, many t, F and G-square tests, many regressions (normal, logistic, Poisson), are some of the many fast functions. References: a) Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>. b) Tsagris M. and Papadakis M. (2018). Forward regression in R: from the extreme slow to the extreme fast. Journal of Data Science, 16(4): 771--780. <doi:10.6339/JDS.201810_16(4).00006>. c) Chatzipantsiou C., Dimitriadis M., Papadakis M. and Tsagris M. (2020). Extremely Efficient Permutation and Bootstrap Hypothesis Tests Using R. Journal of Modern Applied Statistical Methods, 18(2), eP2898. <doi:10.48550/arXiv.1806.10947>. d) Tsagris M., Papadakis M., Alenazi A. and Alzeley O. (2024). Computationally Efficient Outlier Detection for High-Dimensional Data Using the MDP Algorithm. Computation, 12(9): 185. <doi:10.3390/computation12090185>. e) Tsagris M. and Papadakis M. (2025). Fast and light-weight energy statistics using the R package Rfast. <doi:10.48550/arXiv.2501.02849>.

Maintained by Manos Papadakis. Last updated 16 days ago.

openblas cpp openmp

6.0 match 147 stars 12.54 score 1.2k scripts 166 dependents

biostatomics

Coxmos:Cox MultiBlock Survival

This software package provides Cox survival analysis for high-dimensional and multiblock datasets. It encompasses a suite of functions ranging from classical Cox regression to newer analyses, including the Cox proportional hazards model, Stepwise Cox regression, Elastic-Net Cox regression, Sparse Partial Least Squares Cox regression (sPLS-COX) incorporating three distinct strategies, and two Multiblock-PLS Cox regression (MB-sPLS-COX) methods. This tool is designed to handle high-dimensional data, and provides tools for cross-validation, plot generation, and additional resources for interpreting results. While references are available within the corresponding functions, key literature is mentioned below. Terry M Therneau (2024) <https://CRAN.R-project.org/package=survival>, Noah Simon et al. (2011) <doi:10.18637/jss.v039.i05>, Philippe Bastien et al. (2005) <doi:10.1016/j.csda.2004.02.005>, Philippe Bastien (2008) <doi:10.1016/j.chemolab.2007.09.009>, Philippe Bastien et al. (2014) <doi:10.1093/bioinformatics/btu660>, Kassu Mehari Beyene and Anouar El Ghouch (2020) <doi:10.1002/sim.8671>, Florian Rohart et al. (2017) <doi:10.1371/journal.pcbi.1005752>.

Maintained by Pedro Salguero García. Last updated 10 days ago.

14.0 match 1 star 5.30 score 5 scripts

bioc

struct:Statistics in R Using Class-based Templates

Defines and includes a set of class-based templates for developing and implementing data processing and analysis workflows, with a strong emphasis on statistics and machine learning. The templates can be used and where needed extended to 'wrap' tools and methods from other packages into a common standardised structure to allow for effective and fast integration. Model objects can be combined into sequences, and sequences nested in iterators using overloaded operators to simplify and improve readability of the code. Ontology lookup has been integrated and implemented to provide standardised definitions for methods, inputs and outputs wrapped using the class-based templates.

Maintained by Gavin Rhys Lloyd. Last updated 5 months ago.

workflowstep

11.8 match 6.04 score 76 scripts 3 dependents

r-lib

httr2:Perform HTTP Requests and Process the Responses

Tools for creating and modifying HTTP requests, then performing them and processing the results. 'httr2' is a modern re-imagining of 'httr' that uses a pipe-based interface and solves more of the problems that API wrapping packages face.

Maintained by Hadley Wickham. Last updated 7 days ago.

http

3.9 match 246 stars 17.66 score 1.9k scripts 1.1k dependents

plangfelder

WGCNA:Weighted Correlation Network Analysis

Functions necessary to perform Weighted Correlation Network Analysis on high-dimensional data as originally described in Horvath and Zhang (2005) <doi:10.2202/1544-6115.1128> and Langfelder and Horvath (2008) <doi:10.1186/1471-2105-9-559>. Includes functions for rudimentary data cleaning, construction of correlation networks, module identification, summarization, and relating of variables and modules to sample traits. Also includes a number of utility functions for data manipulation and visualization.

Maintained by Peter Langfelder. Last updated 6 months ago.

cpp

7.1 match 54 stars 9.65 score 5.3k scripts 32 dependents

jamiemkass

ENMeval:Automated Tuning and Evaluations of Ecological Niche Models

Runs ecological niche models over all combinations of user-defined settings (i.e., tuning), performs cross validation to evaluate models, and returns data tables to aid in selection of optimal model settings that balance goodness-of-fit and model complexity. Also has functions to partition data spatially (or not) for cross validation, to plot multiple visualizations of results, to run null models to estimate significance and effect sizes of performance metrics, and to calculate range overlap between model predictions, among others. The package was originally built for Maxent models (Phillips et al. 2006, Phillips et al. 2017), but the current version allows possible extensions for any modeling algorithm. The extensive vignette, which guides users through most package functionality but unfortunately has a file size too big for CRAN, can be found here on the package's Github Pages website: <https://jamiemkass.github.io/ENMeval/articles/ENMeval-2.0-vignette.html>.

Maintained by Jamie M. Kass. Last updated 2 months ago.

6.0 match 49 stars 11.25 score 332 scripts 2 dependents

dselivanov

text2vec:Modern Text Mining Framework for R

Fast and memory-friendly tools for text vectorization, topic modeling (LDA, LSA), word embeddings (GloVe), similarities. This package provides a source-agnostic streaming API, which allows researchers to perform analysis of collections of documents which are larger than available RAM. All core functions are parallelized to benefit from multicore machines.

Maintained by Dmitriy Selivanov. Last updated 7 months ago.

glove latent-dirichlet-allocation natural-language-processing text-mining topic-modeling vectorization word-embeddings word2vec cpp

4.9 match 860 stars 13.48 score 1.3k scripts 23 dependents

bentaylor1

lgcp:Log-Gaussian Cox Process

Spatial and spatio-temporal modelling of point patterns using the log-Gaussian Cox process. Bayesian inference for spatial, spatiotemporal, multivariate and aggregated point processes using Markov chain Monte Carlo. See Benjamin M. Taylor, Tilman M. Davies, Barry S. Rowlingson, Peter J. Diggle (2015) <doi:10.18637/jss.v063.i07>.

Maintained by Benjamin M. Taylor. Last updated 1 year ago.

18.3 match 3.59 score 27 scripts

briencj

asremlPlus:Augments 'ASReml-R' in Fitting Mixed Models and Packages Generally in Exploring Prediction Differences

Assists in automating the selection of terms to include in mixed models when 'asreml' is used to fit the models. Procedures are available for choosing models that conform to the hierarchy or marginality principle, for fitting and choosing between two-dimensional spatial models using correlation, natural cubic smoothing spline and P-spline models. A history of the fitting of a sequence of models is kept in a data frame. Also used to compute functions and contrasts of, to investigate differences between and to plot predictions obtained using any model fitting function. The content falls into the following natural groupings: (i) Data, (ii) Model modification functions, (iii) Model selection and description functions, (iv) Model diagnostics and simulation functions, (v) Prediction production and presentation functions, (vi) Response transformation functions, (vii) Object manipulation functions, and (viii) Miscellaneous functions (for further details see 'asremlPlus-package' in help). The 'asreml' package provides a computationally efficient algorithm for fitting a wide range of linear mixed models using Residual Maximum Likelihood. It is a commercial package and a license for it can be purchased from 'VSNi' <https://vsni.co.uk/> as 'asreml-R', who will supply a zip file for local installation/updating (see <https://asreml.kb.vsni.co.uk/>). It is not needed for functions that are methods for 'alldiffs' and 'data.frame' objects. The package 'asremlPlus' can also be installed from <http://chris.brien.name/rpackages/>.

Maintained by Chris Brien. Last updated 26 days ago.

asreml mixed-models

6.9 match 19 stars 9.34 score 200 scripts

bioc

XDE:XDE: a Bayesian hierarchical model for cross-study analysis of differential gene expression

Multi-level model for cross-study detection of differential gene expression.

Maintained by Robert Scharpf. Last updated 5 months ago.

microarray differentialexpression cpp

15.3 match 4.20 score 10 scripts

pachadotdev

cpp11armadillo:An 'Armadillo' Interface

Provides function declarations and inline function definitions that facilitate communication between R and the 'Armadillo' 'C++' library for linear algebra and scientific computing. This implementation is detailed in Vargas Sepulveda and Schneider Malamud (2024) <doi:10.48550/arXiv.2408.11074>.

Maintained by Mauricio Vargas Sepulveda. Last updated 24 days ago.

armadillo cpp cpp11 hacktoberfest linear-algebra

7.0 match 9 stars 9.14 score 1 script 16 dependents

rstudio

keras3:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.

Maintained by Tomasz Kalinowski. Last updated 3 days ago.

4.7 match 845 stars 13.57 score 264 scripts 2 dependents

ncss-tech

aqp:Algorithms for Quantitative Pedology

The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information, freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offers a convenient platform for bridging the gap between pedometric theory and practice.

Maintained by Dylan Beaudette. Last updated 27 days ago.

digital-soil-mapping ncss-tech nrcs pedology pedometrics soil soil-survey usda

5.3 match 55 stars 11.77 score 1.2k scripts 2 dependents

easystats

bayestestR:Understand and Describe Bayesian Models and Posterior Distributions

Provides utilities to describe posterior distributions and Bayesian models. It includes point-estimates such as Maximum A Posteriori (MAP), measures of dispersion (Highest Density Interval - HDI; Kruschke, 2015 <doi:10.1016/C2012-0-00477-2>) and indices used for null-hypothesis testing (such as ROPE percentage, pd and Bayes factors). References: Makowski et al. (2021) <doi:10.21105/joss.01541>.

Maintained by Dominique Makowski. Last updated 11 days ago.

bayes-factors bayesfactor bayesian bayesian-framework credible-interval easystats hacktoberfest hdi map posterior-distributions rope

3.7 match 579 stars 16.82 score 2.2k scripts 82 dependents

r-lib

slider:Sliding Window Functions

Provides type-stable rolling window functions over any R data type. Cumulative and expanding windows are also supported. For more advanced usage, an index can be used as a secondary vector that defines how sliding windows are to be created.

Maintained by Davis Vaughan. Last updated 30 days ago.

4.5 match 302 stars 13.92 score 848 scripts 99 dependents

thk686

strider:Strided Iterator and Range

The strided iterator adapts multidimensional buffers to work with the C++ standard library and range-based for-loops. Given a pointer or iterator into a multidimensional data buffer, one can generate an iterator range using make_strided to construct strided versions of the standard library's begin and end. For constructing range-based for-loops, a strided_range class is provided. These help authors to avoid integer-based indexing, which in some cases can impede algorithm performance and introduce indexing errors. This library exists primarily to expose the header file to other R projects.

Maintained by Tim Keitt. Last updated 5 years ago.

cpp iterators

14.3 match 4 stars 4.34 score 11 scripts

jeff-hughes

paramtest:Run a Function Iteratively While Varying Parameters

Run simulations or other functions while easily varying parameters from one iteration to the next. Some common use cases would be grid search for machine learning algorithms, running sets of simulations (e.g., estimating statistical power for complex models), or bootstrapping under various conditions. See the 'paramtest' documentation for more information and examples.

Maintained by Jeffrey Hughes. Last updated 7 years ago.

12.5 match 1 star 4.85 score 47 scripts

ropensci

redland:RDF Library Bindings in R

Provides methods to parse, query and serialize information stored in the Resource Description Framework (RDF). RDF is described at <https://www.w3.org/TR/rdf-primer/>. This package supports RDF by implementing an R interface to the Redland RDF C library, described at <https://librdf.org/docs/api/index.html>. In brief, RDF provides a structured graph consisting of Statements composed of Subject, Predicate, and Object Nodes.

Maintained by Matthew B. Jones. Last updated 1 year ago.

redland

7.5 match 17 stars 7.85 score 98 scripts 13 dependents

bioc

SeqVarTools:Tools for variant data

An interface to the fast-access storage format for VCF data provided in SeqArray, with tools for common operations and analysis.

Maintained by Stephanie M. Gogarten. Last updated 5 months ago.

snp geneticvariability sequencing genetics

6.8 match 3 stars 8.76 score 384 scripts 2 dependents

smac-group

ib:Bias Correction via Iterative Bootstrap

An implementation of the iterative bootstrap procedure of Kuk (1995) <doi:10.1111/j.2517-6161.1995.tb02035.x> to correct the estimation bias of a fitted model object. This procedure has better bias correction properties than the bootstrap bias correction technique.

Maintained by Samuel Orso. Last updated 1 year ago.

17.1 match 2 stars 3.36 score 23 scripts

statistikat

surveysd:Survey Standard Error Estimation for Cumulated Estimates and their Differences in Complex Panel Designs

Calculate point estimates and their standard errors in complex household surveys using bootstrap replicates. Bootstrapping considers survey design with a rotating panel. A comprehensive description of the methodology can be found under <https://statistikat.github.io/surveysd/articles/methodology.html>.

Maintained by Johannes Gussenbauer. Last updated 3 months ago.

bootstrap error-estimation survey cpp

8.3 match 9 stars 6.86 score 67 scripts

hwborchers

pracma:Practical Numerical Math Functions

Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses 'MATLAB' function names where appropriate to simplify porting.

Maintained by Hans W. Borchers. Last updated 1 year ago.

4.5 match 29 stars 12.34 score 6.6k scripts 931 dependents

greta-dev

greta.dynamics:Modelling Structured Dynamical Systems in 'greta'

A 'greta' extension for analysing transition matrices and ordinary differential equations representing dynamical systems. Provides functions for analysing transition matrices by iteration, and solving ordinary differential equations. This is an extension to the 'greta' software, Golding (2019) <doi:10.21105/joss.01601>.

Maintained by Nicholas Tierney. Last updated 4 months ago.

9.6 match 6 stars 5.72 score 11 scripts

renkun-ken

rlist:A Toolbox for Non-Tabular Data Manipulation

Provides a set of functions for data manipulation with list objects, including mapping, filtering, grouping, sorting, updating, searching, and other useful functions. Most functions are designed to be pipeline friendly so that data processing with lists can be chained.

Maintained by Kun Ren. Last updated 2 years ago.

4.0 match 206 stars 13.73 score 2.2k scripts 123 dependents

asl

Rssa:A Collection of Methods for Singular Spectrum Analysis

Methods and tools for Singular Spectrum Analysis including decomposition, forecasting and gap-filling for univariate and multivariate time series. General description of the methods with many examples can be found in the book Golyandina (2018, <doi:10.1007/978-3-662-57380-8>). See 'citation("Rssa")' for details.

Maintained by Anton Korobeynikov. Last updated 6 months ago.

fftw3

7.6 match 58 stars 7.10 score 182 scripts 4 dependents

flr

FLasher:Projection and Forecasting of Fish Populations, Stocks and Fleets

Projection of future population and fishery dynamics is carried out for a given set of management targets. A system of equations is solved, using Automatic Differentiation (AD), for the levels of effort by fishery (fleet) that will result in the required abundances, catches or fishing mortalities.

Maintained by Iago Mosqueira. Last updated 8 days ago.

forecast fisheries flr cpp

7.8 match 2 stars 6.86 score 254 scripts 6 dependents

spatpomp-org

spatPomp:Inference for Spatiotemporal Partially Observed Markov Processes

Inference on panel data using spatiotemporal partially-observed Markov process (SpatPOMP) models. The 'spatPomp' package extends 'pomp' to include algorithms taking advantage of the spatial structure in order to assist with handling high dimensional processes. See Asfaw et al. (2024) <doi:10.48550/arXiv.2101.01157> for further description of the package.

Maintained by Edward Ionides. Last updated 4 months ago.

7.1 match 2 stars 7.38 score 93 scripts

r-lib

cpp11:A C++11 Interface for R's C Interface

Provides a header-only C++11 interface to R's C interface. Compared to other approaches, 'cpp11' strives to be safe against long jumps from the C API as well as C++ exceptions, to conform to normal R function semantics, and to support interaction with 'ALTREP' vectors.

Maintained by Davis Vaughan. Last updated 11 days ago.

cpp cpp11

2.8 match 212 stars 17.69 score 104 scripts 8.6k dependents

jojo-

mipfp:Multidimensional Iterative Proportional Fitting and Alternative Models

An implementation of the iterative proportional fitting (IPFP), maximum likelihood, minimum chi-square and weighted least squares procedures for updating an N-dimensional array with respect to given target marginal distributions (which, in turn, can be multidimensional). The package also provides an application of the IPFP to simulate multivariate Bernoulli distributions.

Maintained by Johan Barthelemy. Last updated 4 years ago.

7.1 match 24 stars 6.79 score 86 scripts 3 dependents

detlew

PowerTOST:Power and Sample Size for (Bio)Equivalence Studies

Contains functions to calculate power and sample size for various study designs used in bioequivalence studies. Use known.designs() to see the designs supported. Power and sample size can be obtained based on different methods, amongst them prominently the TOST procedure (two one-sided t-tests). See README and NEWS for further information.

Maintained by Detlew Labes. Last updated 12 months ago.

4.9 match 20 stars 9.61 score 112 scripts 4 dependents

guyabel

migest:Methods for the Indirect Estimation of Bilateral Migration

Tools for estimating, measuring and working with migration data.

Maintained by Guy J. Abel. Last updated 1 month ago.

demography migration population

7.9 match 32 stars 5.80 score 86 scripts

rstudio

tensorflow:R Interface to 'TensorFlow'

Interface to 'TensorFlow' <https://www.tensorflow.org/>, an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more 'CPUs' or 'GPUs' in a desktop, server, or mobile device with a single 'API'. 'TensorFlow' was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.

Maintained by Tomasz Kalinowski. Last updated 14 days ago.

3.0 match 1.3k stars 15.35 score 3.2k scripts 74 dependents

bioc

immunoClust:immunoClust - Automated Pipeline for Population Detection in Flow Cytometry

immunoClust is a model-based clustering approach for Flow Cytometry samples. The cell-events of single Flow Cytometry samples are modelled by a mixture of multinominal normal- or t-distributions. The cell-event clusters of several samples are modelled by a mixture of multinominal normal-distributions, aiming for stable co-clusters across these samples.

Maintained by Till Soerensen. Last updated 4 months ago.

clustering flowcytometry singlecell cellbasedassays immunooncology gsl cpp

10.1 match 4.38 score 4 scripts

vinhdizzo

DisImpact:Calculates Disproportionate Impact When Binary Success Data are Disaggregated by Subgroups

Implements methods for calculating disproportionate impact: the percentage point gap, proportionality index, and the 80% index. California Community Colleges Chancellor's Office (2017). Percentage Point Gap Method. <https://www.cccco.edu/-/media/CCCCO-Website/About-Us/Divisions/Digital-Innovation-and-Infrastructure/Research/Files/PercentagePointGapMethod2017.ashx>. California Community Colleges Chancellor's Office (2014). Guidelines for Measuring Disproportionate Impact in Equity Plans. <https://www.cccco.edu/-/media/CCCCO-Website/Files/DII/guidelines-for-measuring-disproportionate-impact-in-equity-plans-tfa-ada.pdf>.

Maintained by Vinh Nguyen. Last updated 2 years ago.

8.2 match 2 stars 5.41 score 17 scripts 1 dependent

mlr-org

mlr3tuning:Hyperparameter Optimization for 'mlr3'

Hyperparameter optimization package of the 'mlr3' ecosystem. It features highly configurable search spaces via the 'paradox' package and finds optimal hyperparameter configurations for any 'mlr3' learner. 'mlr3tuning' works with several optimization algorithms e.g. Random Search, Iterated Racing, Bayesian Optimization (in 'mlr3mbo') and Hyperband (in 'mlr3hyperband'). Moreover, it can automatically optimize learners and estimate the performance of optimized models with nested resampling.

Maintained by Marc Becker. Last updated 3 months ago.

bbotk hyperparameter-optimization hyperparameter-tuning machine-learning mlr3 optimization tune tuning

3.7 match 55 stars 11.59 score 384 scripts 11 dependents

revolutionanalytics

foreach:Provides Foreach Looping Construct

Support for the foreach looping construct. Foreach is an idiom that allows for iterating over elements in a collection, without the use of an explicit loop counter. This package in particular is intended to be used for its return value, rather than for its side effects. In that sense, it is similar to the standard lapply function, but doesn't require the evaluation of a function. Using foreach without side effects also facilitates executing the loop in parallel.

Maintained by Folashade Daniel. Last updated 3 years ago.

foreach parallel-computing

2.5 match 54 stars 17.16 score 43k scripts 2.8k dependents

functionaldata

fdapace:Functional Data Analysis and Empirical Dynamics

A versatile package that provides implementation of various methods of Functional Data Analysis (FDA) and Empirical Dynamics. The core of this package is Functional Principal Component Analysis (FPCA), a key technique for functional data analysis, for sparsely or densely sampled random trajectories and time courses, via the Principal Analysis by Conditional Estimation (PACE) algorithm. This core algorithm yields covariance and mean functions, eigenfunctions and principal component (scores), for both functional data and derivatives, for both dense (functional) and sparse (longitudinal) sampling designs. For sparse designs, it provides fitted continuous trajectories with confidence bands, even for subjects with very few longitudinal observations. PACE is a viable and flexible alternative to random effects modeling of longitudinal data. There is also a Matlab version (PACE) that contains some methods not available on fdapace and vice versa. Updates to fdapace were supported by grants from NIH Echo and NSF DMS-1712864 and DMS-2014626. Please cite our package if you use it (You may run the command citation("fdapace") to get the citation format and bibtex entry). References: Wang, J.L., Chiou, J., Müller, H.G. (2016) <doi:10.1146/annurev-statistics-041715-033624>; Chen, K., Zhang, X., Petersen, A., Müller, H.G. (2017) <doi:10.1007/s12561-015-9137-5>.

Maintained by Yidong Zhou. Last updated 9 months ago.

cpp

3.8 match 31 stars 11.46 score 474 scripts 25 dependents

projectmosaic

mosaicCalc:R-Language Based Calculus Operations for Teaching

Software to support the introductory *MOSAIC Calculus* textbook (<https://www.mosaic-web.org/MOSAIC-Calculus/>), one of many data- and modeling-oriented educational resources developed by Project MOSAIC (<https://www.mosaic-web.org/>). Provides symbolic and numerical differentiation and integration, as well as support for applied linear algebra (for data science), and differential equations/dynamics. Includes grammar-of-graphics-based functions for drawing vector fields, trajectories, etc. The software is suitable for general use, but intended mainly for teaching calculus.

Maintained by Daniel Kaplan. Last updated 19 days ago.

4.9 match 13 stars 8.68 score 546 scripts

jacgoldsm

peruse:A Tidy API for Sequence Iteration and Set Comprehension

A friendly API for sequence iteration and set comprehension.

Maintained by Jacob Goldsmith. Last updated 4 years ago.

15.6 match 1 star 2.70 score 2 scripts

marcellgranat

currr:Apply Mapping Functions in Frequent Saving

Implementations of the family of map() functions with frequent saving of the intermediate results. The contained functions let you resume the evaluation of the iterations where you stopped (reading the already evaluated ones from cache), and work with the currently evaluated iterations while the remaining ones are running in a background job. Parallel computing is also easier with the workers parameter.

Maintained by Marcell Granat. Last updated 7 months ago.

checkpoints parallel-computing purrr

10.4 match 21 stars 4.02 score 7 scripts

ltorgo

performanceEstimation:An Infra-Structure for Performance Estimation of Predictive Models

An infra-structure for estimating the predictive performance of predictive models. In this context, it can also be used to compare and/or select among different alternative ways of solving one or more predictive tasks. The main goal of the package is to provide a generic infra-structure to estimate the values of different metrics of predictive performance using different estimation procedures. These estimation tasks can be applied to any solutions (workflows) to the predictive tasks. The package provides easy-to-use standard workflows that allow the usage of any available R modeling algorithm together with some pre-defined data pre-processing steps and also prediction post-processing methods. It also provides means for addressing issues related to the statistical significance of the observed differences.

Maintained by Luis Torgo. Last updated 8 years ago.

7.0 match 16 stars 5.97 score 195 scripts 1 dependent

mmaechler

sfsmisc:Utilities from 'Seminar fuer Statistik' ETH Zurich

Useful utilities ['goodies'] from Seminar fuer Statistik ETH Zurich, some of which were ported from S-plus in the 1990s. For graphics, have pretty (Log-scale) axes eaxis(), an enhanced Tukey-Anscombe plot, combining histogram and boxplot, 2d-residual plots, a 'tachoPlot()', pretty arrows, etc. For robustness, have a robust F test and robust range(). For system support, notably on Linux, provides 'Sys.*()' functions with more access to system and CPU information. Finally, miscellaneous utilities such as simple efficient prime numbers, integer codes, Duplicated(), toLatex.numeric() and is.whole().

Maintained by Martin Maechler. Last updated 5 months ago.

3.8 match 11 stars 10.87 score 566 scripts 119 dependents

bbuchsbaum

neuroim:Data Structures and Handling for Neuroimaging Data

A collection of data structures that represent volumetric brain imaging data. The focus is on basic data handling for 3D and 4D neuroimaging data. In addition, there are functions to read and write NIFTI files and limited support for reading AFNI files.

Maintained by Bradley Buchsbaum. Last updated 4 years ago.

cpp

7.2 match 6 stars 5.64 score 48 scripts

pchausse

gmm:Generalized Method of Moments and Generalized Empirical Likelihood

It is a complete suite to estimate models based on moment conditions. It includes the two-step generalized method of moments (Hansen 1982; <doi:10.2307/1912775>), the iterated GMM and continuously updated estimator (Hansen, Eaton and Yaron 1996; <doi:10.2307/1392442>) and several methods that belong to the Generalized Empirical Likelihood family of estimators (Smith 1997; <doi:10.1111/j.0013-0133.1997.174.x>, Kitamura 1997; <doi:10.1214/aos/1069362388>, Newey and Smith 2004; <doi:10.1111/j.1468-0262.2004.00482.x>, and Anatolyev 2005 <doi:10.1111/j.1468-0262.2005.00601.x>).

Maintained by Pierre Chausse. Last updated 1 year ago.

fortran openblas

4.4 match 2 stars 9.28 score 304 scripts 66 dependents

maarten14c

rbacon:Age-Depth Modelling using Bayesian Statistics

An approach to age-depth modelling that uses Bayesian statistics to reconstruct accumulation histories for deposits, through combining radiocarbon and other dates with prior information on accumulation rates and their variability. See Blaauw & Christen (2011).

Maintained by Maarten Blaauw. Last updated 24 days ago.

age-depth-model bayesian holocene lakes ocean-sediments peat radiocarbon-calibration cpp

5.9 match 7 stars 6.75 score 57 scripts 1 dependent

animint

animint2:Animated Interactive Grammar of Graphics

Functions are provided for defining animated, interactive data visualizations in R code, and rendering on a web page. The 2018 Journal of Computational and Graphical Statistics paper, <doi:10.1080/10618600.2018.1513367> describes the concepts implemented.

Maintained by Toby Hocking. Last updated 26 days ago.

4.5 match 64 stars 8.87 score 173 scripts

zachcp

rcdk:Interface to the 'CDK' Libraries

Allows the user to access functionality in the 'CDK', a Java framework for chemoinformatics. This allows the user to load molecules, evaluate fingerprints, calculate molecular descriptors and so on. In addition, the 'CDK' API allows the user to view structures in 2D.

Maintained by Zachary Charlop-Powers. Last updated 2 years ago.

openjdk

5.9 match 1 star 6.78 score 287 scripts 11 dependents

red-list-ecosystem

redlistr:Tools for the IUCN Red List of Ecosystems and Species

A toolbox created by members of the International Union for Conservation of Nature (IUCN) Red List of Ecosystems Committee for Scientific Standards. Primarily, it is a set of tools suitable for calculating the metrics required for making assessments of species and ecosystems against the IUCN Red List of Threatened Species and the IUCN Red List of Ecosystems categories and criteria. See the IUCN website for detailed guidelines, the criteria, publications and other information.

Maintained by Calvin Lee. Last updated 1 year ago.

6.3 match 32 stars 6.35 score 35 scripts

bioc

ceRNAnetsim:Regulation Simulator of Interaction between miRNA and Competing RNAs (ceRNA)

This package simulates regulations of ceRNA (Competing Endogenous) expression levels after an expression level change in one or more miRNA/mRNAs. The methodology adopted by the package has the potential to incorporate any ceRNA (circRNA, lincRNA, etc.) into the miRNA:target interaction network. The package basically distributes miRNA expression over available ceRNAs, where each ceRNA attracts miRNAs proportional to its amount. But the package can utilize multiple parameters that modify the miRNA effect on its target (seed type, binding energy, binding location, etc.). The functions handle the given dataset as a graph object and the processes progress via edge and node variables.

Maintained by Selcen Ari Yuka. Last updated 5 months ago.

networkinference systemsbiology network graphandnetwork transcriptomics cerna mirna network-biology network-simulator tcga tidygraph tidyverse

6.9 match 4 stars 5.76 score 12 scripts

ohdsi

PatientLevelPrediction:Develop Clinical Prediction Models Using the Common Data Model

A user friendly way to create patient level prediction models using the Observational Medical Outcomes Partnership Common Data Model. Given a cohort of interest and an outcome of interest, the package can use data in the Common Data Model to build a large set of features. These features can then be used to fit a predictive model with a number of machine learning algorithms. This is further described in Reps (2017) <doi:10.1093/jamia/ocy032>.

Maintained by Egill Fridgeirsson. Last updated 7 days ago.

hades openjdk

3.6 match 190 stars 10.85 score 297 scripts

ryantibs

genlasso:Path Algorithm for Generalized Lasso Problems

Computes the solution path for generalized lasso problems. Important use cases are the fused lasso over an arbitrary graph, and trend fitting of any given polynomial order. Specialized implementations for the latter two subproblems are given to improve stability and speed. See Taylor Arnold and Ryan Tibshirani (2016) <doi:10.1080/10618600.2015.1008638>.

Maintained by Taylor B. Arnold. Last updated 2 years ago.

5.0 match 32 stars 7.66 score 160 scripts 6 dependents

pik-piam

edgeTransport:Prepare EDGE Transport Data for the REMIND model

EDGE-T is a fork of the GCAM transport module https://jgcri.github.io/gcam-doc/energy.html#transportation with a high level of detail in its representation of technological and modal options. It is a partial equilibrium model with a nested multinomial logit structure and relies on the modified logit formulation. Most of the sources are not publicly available. PIK-internal users can find the sources in the distributed file system in the folder `/p/projects/rd3mod/inputdata/sources/EDGE-Transport-Standalone`.

Maintained by Johanna Hoppe. Last updated 21 hours ago.

5.6 match 5 stars 6.84 score 16 scripts 2 dependents

ecor

RMAWGEN:Multi-Site Auto-Regressive Weather GENerator

S3 and S4 functions are implemented for spatial multi-site stochastic generation of daily time series of temperature and precipitation. These tools make use of Vector AutoRegressive models (VARs). The weather generator model is then saved as an object and is calibrated by daily instrumental "Gaussianized" time series through the 'vars' package tools. Once this model is obtained, it can be used for weather generation and be adapted to work with several climatic monthly time series.

Maintained by Emanuele Cordano. Last updated 25 days ago.

6.8 match 3 stars 5.62 score 115 scripts 4 dependents

tidyverse

purrr:Functional Programming Tools

A complete and consistent functional programming toolkit for R.

Maintained by Hadley Wickham. Last updated 1 month ago.

functional-programming

1.7 match 1.3k stars 22.12 score 59k scripts 6.9k dependents

covaruber

sommer:Solving Mixed Model Equations in R

Structural multivariate-univariate linear mixed model solver for estimation of multiple random effects with unknown variance-covariance structures (e.g., heterogeneous and unstructured) and known covariance among levels of random effects (e.g., pedigree and genomic relationship matrices) (Covarrubias-Pazaran, 2016 <doi:10.1371/journal.pone.0156744>; Maier et al., 2015 <doi:10.1016/j.ajhg.2014.12.006>; Jensen et al., 1997). REML estimates can be obtained using the Direct-Inversion Newton-Raphson and Direct-Inversion Average Information algorithms for the problems r x r (r being the number of records) or using the Henderson-based average information algorithm for the problem c x c (c being the number of coefficients to estimate). Spatial models can also be fitted using the two-dimensional spline functionality available.

Maintained by Giovanny Covarrubias-Pazaran. Last updated 20 days ago.

average-information mixed-models rcpparmadillo openblas cpp openmp

2.9 match 43 stars 12.70 score 300 scripts 9 dependents

bentaylor1

spatsurv:Bayesian Spatial Survival Analysis with Parametric Proportional Hazards Models

Bayesian inference for parametric proportional hazards spatial survival models; flexible spatial survival models. See Benjamin M. Taylor, Barry S. Rowlingson (2017) <doi:10.18637/jss.v077.i04>.

Maintained by Benjamin M. Taylor. Last updated 1 year ago.

18.3 match 1 star 2.03 score 10 scripts

bioc

microSTASIS:Microbiota STability ASsessment via Iterative cluStering

The toolkit 'µSTASIS', or microSTASIS, has been developed for the stability analysis of microbiota in a temporal framework by leveraging iterative clustering. Concretely, the core function uses the Hartigan-Wong k-means algorithm as many times as possible for stressing out paired samples from the same individuals to test if they remain together for multiple numbers of clusters over a whole data set of individuals. Moreover, the package includes multiple functions to subset samples from paired times, validate the results or visualize the output.

Maintained by Pedro Sánchez-Sánchez. Last updated 5 months ago.

geneticvariability biomedicalinformatics clustering multiplecomparison microbiome

8.5 match 2 stars 4.30 score 1 script

ropensci

karel:Learning programming with Karel the robot

This is the R implementation of Karel the robot, a programming language created by Dr. R. E. Pattis at Stanford University in 1981. Karel is a useful tool to teach introductory concepts about general programming, such as algorithmic decomposition, conditional statements, loops, etc., in an interactive and fun way, by writing programs to make Karel the robot achieve certain tasks in the world she lives in. Originally based on Pascal, Karel has been implemented in many languages over the decades, including 'Java', 'C++', 'Ruby' and 'Python'. This is the first package implementing Karel in R.

Maintained by Marcos Prunello. Last updated 8 months ago.

learning programming r-language

5.3 match 10 stars 6.87 score 31 scripts

mrc-ide

mcstate:Monte Carlo Methods for State Space Models

Implements Monte Carlo methods for state-space models such as 'SIR' models in epidemiology. Particle MCMC (pmcmc) and SMC2 methods are planned. This package is particularly designed to work with odin/dust models, but we will see how general it becomes.

Maintained by Rich FitzJohn. Last updated 9 months ago.

5.1 match 19 stars 7.08 score 87 scripts

lucaweihs

SEMID:Identifiability of Linear Structural Equation Models

Provides routines to check identifiability or non-identifiability of linear structural equation models as described in Drton, Foygel, and Sullivant (2011) <doi:10.1214/10-AOS859>, Foygel, Draisma, and Drton (2012) <doi:10.1214/12-AOS1012>, and other works. The routines are based on the graphical representation of structural equation models.

Maintained by Nils Sturma. Last updated 2 years ago.

8.9 match 4 stars 4.06 score 29 scripts

mxjki

EfficientMaxEigenpair:Efficient Initials for Computing the Maximal Eigenpair

An implementation for using efficient initials to compute the maximal eigenpair in R. It provides three algorithms to find the efficient initials under two cases: the tridiagonal matrix case and the general matrix case. Besides, it also provides two algorithms for the next to the maximal eigenpair under these two cases.

Maintained by Xiao-Jun Mao. Last updated 7 years ago.

9.3 match 3.85 score 14 scripts

paulkinyanjui01

CondMVT:Conditional Multivariate t Distribution

The package helps sample from the conditional multivariate t distribution.

Maintained by Paul Kimani Kinyanjui. Last updated 3 years ago.

13.3 match 2.70 score

tengmcing

bandicoot:Light-Weight 'python'-Like Object-Oriented System

A light-weight object-oriented system with 'python'-like syntax which supports multiple inheritance and incorporates a 'python'-like method resolution order.

Maintained by Weihao Li. Last updated 1 year ago.

7.3 match 4 stars 4.78 score 10 scripts 1 dependent

norskregnesentral

shapr:Prediction Explanation with Dependence-Aware Shapley Values

Complex machine learning models are often hard to interpret. However, in many situations it is crucial to understand and explain why a model made a specific prediction. Shapley values are the only such prediction explanation framework with a solid theoretical foundation. Previously known methods for estimating the Shapley values do, however, assume feature independence. This package implements methods which account for any feature dependence, and thereby produces more accurate estimates of the true Shapley values. An accompanying 'Python' wrapper ('shaprpy') is available through the GitHub repository.

Maintained by Martin Jullum. Last updated 1 month ago.

explainable-ai explainable-ml rcpp rcpparmadillo shapley openblas cpp openmp

3.3 match 153 stars 10.62 score 175 scripts 1 dependent

auto-optimization

iraceplot:Plots for Visualizing the Data Produced by the 'irace' Package

Graphical visualization tools for analyzing the data produced by 'irace'. The 'iraceplot' package enables users to analyze the performance and the parameter space data sampled by the configuration during the search process. It provides a set of functions that generate different plots to visualize the configurations sampled during the execution of 'irace' and their performance. The functions just require the log file generated by 'irace' and, in some cases, they can be used with user-provided data.

Maintained by Manuel López-Ibáñez. Last updated 1 month ago.

irace parameter-tuning

6.0 match 5 stars 5.70 score 7 scripts

stocnet

RSiena:Siena - Simulation Investigation for Empirical Network Analysis

The main purpose of this package is to perform simulation-based estimation of stochastic actor-oriented models for longitudinal network data collected as panel data. Dependent variables can be single or multivariate networks, which can be directed, non-directed, or two-mode; and associated actor variables. There are also functions for testing parameters and checking goodness of fit. An overview of these models is given in Snijders (2017), <doi:10.1146/annurev-statistics-060116-054035>.

Maintained by Tom A.B. Snijders. Last updated 1 month ago.

longitudinal-data rsiena social-network-analysis statistical-network-analysis statistics cpp

3.4 match 107 stars 9.93 score 346 scripts 1 dependent

flr

FLa4a:A Simple and Robust Statistical Catch at Age Model

A simple and robust statistical Catch at Age model that is specifically designed for stocks with intermediate levels of data quantity and quality.

Maintained by Ernesto Jardim. Last updated 4 days ago.

5.1 match 12 stars 6.66 score 177 scripts 2 dependents

egeulgen

pathfindR:Enrichment Analysis Utilizing Active Subnetworks

Enrichment analysis enables researchers to uncover mechanisms underlying a phenotype. However, conventional methods for enrichment analysis do not take into account protein-protein interaction information, resulting in incomplete conclusions. 'pathfindR' is a tool for enrichment analysis utilizing active subnetworks. The main function identifies active subnetworks in a protein-protein interaction network using a user-provided list of genes and associated p values. It then performs enrichment analyses on the identified subnetworks, identifying enriched terms (i.e. pathways or, more broadly, gene sets) that possibly underlie the phenotype of interest. 'pathfindR' also offers functionalities to cluster the enriched terms and identify representative terms in each cluster, to score the enriched terms per sample and to visualize analysis results. The enrichment, clustering and other methods implemented in 'pathfindR' are described in detail in Ulgen E, Ozisik O, Sezerman OU. 2019. 'pathfindR': An R Package for Comprehensive Identification of Enriched Pathways in Omics Data Through Active Subnetworks. Front. Genet. <doi:10.3389/fgene.2019.00858>.

Maintained by Ege Ulgen. Last updated 26 days ago.

active-subnetworks enrichment pathway pathway-enrichment-analysis subnetwork

3.4 match 186 stars 10.13 score 138 scripts

ysosirius

windfarmGA:Genetic Algorithm for Wind Farm Layout Optimization

The genetic algorithm is designed to optimize wind farms of any shape. It requires a predefined number of turbines, a unified rotor radius and an average wind speed value for each incoming wind direction. A terrain effect model can be included that downloads an 'SRTM' elevation model and loads a Corine Land Cover raster to approximate surface roughness.

Maintained by Sebastian Gatscha. Last updated 2 months ago.

windfarm-layout optimization genetic-algorithm renewable-energy cpp

6.7 match 27 stars 5.06 score 17 scripts

mbedward

packcircles:Circle Packing

Algorithms to find arrangements of non-overlapping circles.

Maintained by Michael Bedward. Last updated 4 months ago.

cpp

3.3 match 57 stars 10.06 score 422 scripts 6 dependents

gchapron

MDPtoolbox:Markov Decision Processes Toolbox

The Markov Decision Processes (MDP) toolbox provides functions for solving discrete-time Markov Decision Processes: finite horizon, value iteration, policy iteration, and linear programming algorithms with some variants. It also provides some functions related to Reinforcement Learning.

Maintained by Guillaume Chapron. Last updated 8 years ago.

13.9 match 3 stars 2.40 score 84 scripts

rvaradhan

SQUAREM:Squared Extrapolation Methods for Accelerating EM-Like Monotone Algorithms

Algorithms for accelerating the convergence of slow, monotone sequences from smooth contraction mappings such as the EM algorithm. It can be used to accelerate any smooth, linearly convergent acceleration scheme. A tutorial-style introduction to this package is available in a vignette on the CRAN download page or, when the package is loaded in an R session, with vignette("SQUAREM"). Refer to the J Stat Software article: <doi:10.18637/jss.v092.i07>.

Maintained by Ravi Varadhan. Last updated 4 years ago.

3.5 match 2 stars 9.26 score 84 scripts 502 dependents

hadley

plyr:Tools for Splitting, Applying and Combining Data

A set of tools that solves a common set of problems: you need to break a big problem down into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to fit a model to each spatial location or time point in your study, summarise data by panels or collapse high-dimensional arrays to simpler summary statistics. The development of 'plyr' has been generously supported by 'Becton Dickinson'.

Maintained by Hadley Wickham. Last updated 4 months ago.

cpp

1.8 match 500 stars 18.16 score 83k scripts 3.3k dependents

ss3sim

ss3sim:Fisheries Stock Assessment Simulation Testing with Stock Synthesis

A framework for fisheries stock assessment simulation testing with Stock Synthesis (SS3) as described in Anderson et al. (2014) <doi:10.1371/journal.pone.0092725>.

Maintained by Kelli F. Johnson. Last updated 5 months ago.

fisheries simulation stock-synthesis

3.7 match 39 stars 8.89 score 149 scripts

robinhankin

elliptic:Weierstrass and Jacobi Elliptic Functions

A suite of elliptic and related functions including Weierstrass and Jacobi forms. Also includes various tools for manipulating and visualizing complex functions.

Maintained by Robin K. S. Hankin. Last updated 10 days ago.

3.4 match 3 stars 9.31 score 54 scripts 79 dependents

hanase

BMA:Bayesian Model Averaging

Package for Bayesian model averaging and variable selection for linear models, generalized linear models and survival models (Cox regression).

Maintained by Hana Sevcikova. Last updated 2 months ago.

fortran

3.4 match 37 stars 9.38 score 152 scripts 14 dependents

matthieu-bruneaux

isotracer:Isotopic Tracer Analysis Using MCMC

Implements Bayesian models to analyze data from tracer addition experiments. The implemented method was originally described in the article "A New Method to Reconstruct Quantitative Food Webs and Nutrient Flows from Isotope Tracer Addition Experiments" by López-Sepulcre et al. (2020) <doi:10.1086/708546>.

Maintained by Matthieu Bruneaux. Last updated 4 months ago.

cpp

5.3 match 5.92 score 60 scripts

schweflo

lpSolveAPI:R Interface to 'lp_solve' Version 5.5.2.0

The lpSolveAPI package provides an R interface to 'lp_solve', a Mixed Integer Linear Programming (MILP) solver with support for pure linear, (mixed) integer/binary, semi-continuous and special ordered sets (SOS) models.

Maintained by Florian Schwendinger. Last updated 8 months ago.

openblas

4.0 match 7.83 score 640 scripts 79 dependents

rspatial

terra:Spatial Data Analysis

Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).

Maintained by Robert J. Hijmans. Last updated 4 hours ago.

geospatial raster spatial vector onetbb proj gdal geos cpp

1.8 match 559 stars 17.65 score 17k scripts 849 dependents

bioc

goSorensen:Statistical inference based on the Sorensen-Dice dissimilarity and the Gene Ontology (GO)

This package implements inferential methods to compare gene lists in terms of their biological meaning as expressed in the GO. The compared gene lists are characterized by cross-tabulation frequency tables of enriched GO items. Dissimilarity between gene lists is evaluated using the Sorensen-Dice index. The fundamental guiding principle is that two gene lists are taken as similar if they share a great proportion of common enriched GO items.

Maintained by Pablo Flores. Last updated 5 months ago.

annotation go genesetenrichment software microarray pathways geneexpression multiplecomparison graphandnetwork reactome clustering kegg

6.8 match 4.56 score 12 scripts

mmahmoudian

sivs:Stable Iterative Variable Selection

An iterative feature selection method (manuscript submitted) that internally utilizes various Machine Learning methods that have embedded feature reduction in order to shrink down the feature space into a small and yet robust set.

Maintained by Mehrad Mahmoudian. Last updated 22 days ago.

6.6 match 4 stars 4.60 score

bioc

pRoloc:A unifying bioinformatics framework for spatial proteomics

The pRoloc package implements machine learning and visualisation methods for the analysis and interrogation of quantitative mass spectrometry data to reliably infer protein sub-cellular localisation.

Maintained by Lisa Breckels. Last updated 25 days ago.

immunooncology proteomics massspectrometry classification clustering qualitycontrol bioconductor proteomics-data spatial-proteomics visualisation openblas cpp

3.5 match 15 stars 8.71 score 101 scripts 2 dependents

bioc

BiocParallel:Bioconductor facilities for parallel evaluation

This package provides modified versions and novel implementation of functions for parallel evaluation, tailored to use with Bioconductor objects.

Maintained by Martin Morgan. Last updated 24 days ago.

infrastructure bioconductor-package core-package u24ca289073 cpp

1.8 match 67 stars 17.40 score 7.3k scripts 1.1k dependents

laplacesdemonr

LaplacesDemon:Complete Environment for Bayesian Inference

Provides a complete environment for Bayesian inference using a variety of different samplers (see ?LaplacesDemon for an overview).

Maintained by Henrik Singmann. Last updated 12 months ago.

2.3 match 93 stars 13.45 score 1.8k scripts 60 dependents

echasnovski

comperank:Ranking Methods for Competition Results

Compute ranking and rating based on competition results. Methods of different nature are implemented: with fixed Head-to-Head structure, with variable Head-to-Head structure and with iterative nature. All algorithms are taken from the book 'Who’s #1?: The science of rating and ranking' by Amy N. Langville and Carl D. Meyer (2012, ISBN:978-0-691-15422-0).

Maintained by Evgeni Chasnovski. Last updated 2 years ago.

cpp

5.3 match 24 stars 5.65 score 37 scripts

epiforecasts

EpiSoon:Forecast Cases Using Reproduction Numbers

Forecasts the time-varying reproduction number and uses it to forecast reported case counts. Includes tools to evaluate a range of models across samples and time series using proper scoring rules.

Maintained by Sam Abbott. Last updated 2 years ago.

case-forecasts forecasts

7.0 match 7 stars 4.26 score 25 scripts 1 dependent

bioc

netZooR:Unified methods for the inference and analysis of gene regulatory networks

netZooR unifies the implementations of several Network Zoo methods (netzoo, netzoo.github.io) into a single package by creating interfaces between network inference and network analysis methods. Currently, the package has 3 methods for network inference including PANDA and its optimized implementation OTTER (network reconstruction using multiple lines of biological evidence), LIONESS (single-sample network inference), and EGRET (genotype-specific networks). Network analysis methods include CONDOR (community detection), ALPACA (differential community detection), CRANE (significance estimation of differential modules), MONSTER (estimation of network transition states). In addition, YARN allows processing of gene expression data for tissue-specific analyses and SAMBAR infers missing mutation data based on pathway information.

Maintained by Tara Eicher. Last updated 7 days ago.

networkinference network generegulation geneexpression transcription microarray graphandnetwork gene-regulatory-network transcription-factors

3.8 match 105 stars 7.98 score

mlfit

mlfit:Iterative Proportional Fitting Algorithms for Nested Structures

The Iterative Proportional Fitting (IPF) algorithm operates on count data. This package offers implementations for several algorithms that extend this to nested structures: 'parent' and 'child' items for both of which constraints can be provided. The fitting algorithms include Iterative Proportional Updating <https://trid.trb.org/view/881554>, Hierarchical IPF <doi:10.3929/ethz-a-006620748>, Entropy Optimization <https://trid.trb.org/view/881144>, and Generalized Raking <doi:10.2307/2290793>. Additionally, a number of replication methods is also provided such as 'Truncate, replicate, sample' <doi:10.1016/j.compenvurbsys.2013.03.004>.

Maintained by Amarin Siripanich. Last updated 3 months ago.

5.4 match 14 stars 5.47 score 15 scripts

tyee001

VGAM:Vector Generalized Linear and Additive Models

An implementation of about 6 major classes of statistical regression models. The central algorithm is Fisher scoring and iterative reweighted least squares. At the heart of this package are the vector generalized linear and additive model (VGLM/VGAM) classes. VGLMs can be loosely thought of as multivariate GLMs. VGAMs are data-driven VGLMs that use smoothing. The book "Vector Generalized Linear and Additive Models: With an Implementation in R" (Yee, 2015) <DOI:10.1007/978-1-4939-2818-7> gives details of the statistical framework and the package. Currently only fixed-effects models are implemented. Many (100+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE. The other classes are RR-VGLMs (reduced-rank VGLMs), quadratic RR-VGLMs, doubly constrained RR-VGLMs, quadratic RR-VGLMs, reduced-rank VGAMs, RCIMs (row-column interaction models)---these classes perform constrained and unconstrained quadratic ordination (CQO/UQO) models in ecology, as well as constrained additive ordination (CAO). Hauck-Donner effect detection is implemented. Note that these functions are subject to change; see the NEWS and ChangeLog files for latest changes.

Maintained by Thomas Yee. Last updated 1 month ago.

fortran

2.8 match 10 stars 10.67 score 3.6k scripts 169 dependents

r-lib

lintr:A 'Linter' for R Code

Checks adherence to a given style, syntax errors and possible semantic issues. Supports on-the-fly checking of R code edited with 'RStudio IDE', 'Emacs', 'Vim', 'Sublime Text', 'Atom' and 'Visual Studio Code'.

Maintained by Michael Chirico. Last updated 7 days ago.

linter

1.7 match 1.2k stars 17.00 score 916 scripts 33 dependents

amices

mice:Multivariate Imputation by Chained Equations

Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm as described in Van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>. Each variable has its own imputation model. Built-in imputation models are provided for continuous data (predictive mean matching, normal), binary data (logistic regression), unordered categorical data (polytomous logistic regression) and ordered categorical data (proportional odds). MICE can also impute continuous two-level data (normal model, pan, second-level variables). Passive imputation can be used to maintain consistency between variables. Various diagnostic plots are available to inspect the quality of the imputations.

Maintained by Stef van Buuren. Last updated 5 days ago.

chained-equations fcs imputation mice missing-data missing-values multiple-imputation multivariate-data cpp

1.8 match 462 stars 16.50 score 10k scripts 154 dependents

ashbythorpe

selenider:Concise, Lazy and Reliable Wrapper for 'chromote' and 'selenium'

A user-friendly wrapper for web automation, using either 'chromote' or 'selenium'. Provides a simple and consistent API to make web scraping and testing scripts easy to write and understand. Elements are lazy, and automatically wait for the website to be valid, resulting in reliable and reproducible code, with no visible impact on the experience of the programmer.

Maintained by Ashby Thorpe. Last updated 2 months ago.

web-scraping

4.0 match 39 stars 7.21 score 23 scripts

vandomed

dvmisc:Convenience Functions, Moving Window Statistics, and Graphics

Collection of functions for running and summarizing statistical simulation studies, creating visualizations (e.g. CART Shiny app, histograms with fitted probability mass/density functions), calculating moving-window statistics efficiently, and performing common computations.

Maintained by Dane R. Van Domelen. Last updated 4 years ago.

aic bmi histograms miscellaneous cpp

4.7 match 1 stars 6.18 score 125 scripts 8 dependents

poissonconsulting

universals:S3 Generics for Bayesian Analyses

Provides S3 generic methods and some default implementations for Bayesian analyses that generate Markov Chain Monte Carlo (MCMC) samples. The purpose of 'universals' is to reduce package dependencies and conflicts. The 'nlist' package implements many of the methods for its 'nlist' class.

Maintained by Joe Thorley. Last updated 2 months ago.

generics model-fitting s3

4.5 match 4 stars 6.37 score 1 script 20 dependents

otoomet

maxLik:Maximum Likelihood Estimation and Related Tools

Functions for Maximum Likelihood (ML) estimation, non-linear optimization, and related tools. It includes a unified way to call different optimizers, and classes and methods to handle the results from the Maximum Likelihood viewpoint. It also includes a number of convenience tools for testing and developing your own models.

Maintained by Ott Toomet. Last updated 12 months ago.

3.1 match 9.08 score 480 scripts 109 dependents

canmod

macpan2:Fast and Flexible Compartmental Modelling

Fast and flexible compartmental modelling with Template Model Builder.

Maintained by Steve Walker. Last updated 15 hours ago.

compartmental-models epidemiology forecasting mixed-effects model-fitting optimization simulation simulation-modeling cpp

3.2 match 4 stars 8.89 score 246 scripts 1 dependent

stochastictree

stochtree:Stochastic Tree Ensembles (XBART and BART) for Supervised Learning and Causal Inference

Flexible stochastic tree ensemble software. Robust implementations of Bayesian Additive Regression Trees (BART) Chipman, George, McCulloch (2010) <doi:10.1214/09-AOAS285> for supervised learning and Bayesian Causal Forests (BCF) Hahn, Murray, Carvalho (2020) <doi:10.1214/19-BA1195> for causal inference. Enables model serialization and parallel sampling and provides a low-level interface for custom stochastic forest samplers.

Maintained by Drew Herren. Last updated 16 days ago.

bart bayesian-machine-learning bayesian-methods decision-trees gradient-boosted-trees machine-learning probabilistic-models tree-ensembles cpp

3.3 match 20 stars 8.52 score 40 scripts

trevorhastie

softImpute:Matrix Completion via Iterative Soft-Thresholded SVD

Iterative methods for matrix completion that use nuclear-norm regularization. There are two main approaches. The first approach uses iterative soft-thresholded SVDs to impute the missing values. The second approach uses alternating least squares. Both have an 'EM' flavor, in that at each iteration the matrix is completed with the current estimate. For large matrices there is a special sparse-matrix class named "Incomplete" that efficiently handles all computations. The package includes procedures for centering and scaling rows, columns or both, and for computing low-rank SVDs on large sparse centered matrices (i.e. principal components).

Maintained by Trevor Hastie. Last updated 4 years ago.

fortran

3.8 match 10 stars 7.47 score 253 scripts 22 dependents

fcharte

mldr.datasets:R Ultimate Multilabel Dataset Repository

Large collection of multilabel datasets along with the functions needed to export them to several formats, to make partitions, and to obtain bibliographic information.

Maintained by David Charte. Last updated 6 years ago.

6.0 match 8 stars 4.68 score 120 scripts

cran

mgcv:Mixed GAM Computation Vehicle with Automatic Smoothness Estimation

Generalized additive (mixed) models, some of their extensions and other generalized ridge regression with multiple smoothing parameter estimation by (Restricted) Marginal Likelihood, Generalized Cross Validation and similar, or using iterated nested Laplace approximation for fully Bayesian inference. See Wood (2017) <doi:10.1201/9781315370279> for an overview. Includes a gam() function, a wide variety of smoothers, 'JAGS' support and distributions beyond the exponential family.

Maintained by Simon Wood. Last updated 1 year ago.

openblas openmp

2.2 match 32 stars 12.71 score 17k scripts 7.8k dependents

furrer-lab

abn:Modelling Multivariate Data with Additive Bayesian Networks

The 'abn' R package facilitates Bayesian network analysis, a probabilistic graphical model that derives from empirical data a directed acyclic graph (DAG). This DAG describes the dependency structure between random variables. The R package 'abn' provides routines to help determine optimal Bayesian network models for a given data set. These models are used to identify statistical dependencies in messy, complex data. Their additive formulation is equivalent to multivariate generalised linear modelling, including mixed models with independent and identically distributed (iid) random effects. The core functionality of the 'abn' package revolves around model selection, also known as structure discovery. It supports both exact and heuristic structure learning algorithms and does not restrict the data distribution of parent-child combinations, providing flexibility in model creation and analysis. The 'abn' package uses Laplace approximations for metric estimation and includes wrappers to the 'INLA' package. It also employs 'JAGS' for data simulation purposes. For more resources and information, visit the 'abn' website.

Maintained by Matteo Delucchi. Last updated 4 days ago.

bayesian-network binomial categorical-data gaussian grouped-datasets mixed-effects multinomial multivariate poisson structure-learning gsl openblas cpp openmp jags

4.0 match 6 stars 6.94 score 90 scripts

distancedevelopment

mrds:Mark-Recapture Distance Sampling

Animal abundance estimation via conventional, multiple covariate and mark-recapture distance sampling (CDS/MCDS/MRDS). Detection function fitting is performed via maximum likelihood. Also included are diagnostics and plotting for fitted detection functions. Abundance estimation is via a Horvitz-Thompson-like estimator.

Maintained by Laura Marshall. Last updated 2 months ago.

3.4 match 4 stars 8.05 score 78 scripts 7 dependents

sidoruvigo

DTDA.ni:Doubly Truncated Data Analysis, Non Iterative

Non-iterative estimator for the cumulative distribution of a doubly truncated variable. de Uña-Álvarez J. (2018) <doi:10.1007/978-3-319-73848-2_37>.

Maintained by José Carlos Soage González. Last updated 6 years ago.

6.4 match 4.30 score 7 scripts

sparklyr

sparklyr:R Interface to Apache Spark

R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.

Maintained by Edgar Ruiz. Last updated 8 days ago.

apache-spark distributed dplyr ide livy machine-learning remote-clusters spark sparklyr

1.8 match 959 stars 15.16 score 4.0k scripts 21 dependents

azure

AzureVision:Interface to Azure Computer Vision Services

An interface to 'Azure Computer Vision' <https://docs.microsoft.com/azure/cognitive-services/Computer-vision/Home> and 'Azure Custom Vision' <https://docs.microsoft.com/azure/cognitive-services/custom-vision-service/home>, building on the low-level functionality provided by the 'AzureCognitive' package. These services allow users to leverage the cloud to carry out visual recognition tasks using advanced image processing models, without needing powerful hardware of their own. Part of the 'AzureR' family of packages.

Maintained by Hong Ooi. Last updated 4 years ago.

azure-cognitive-services azure-sdk-r computer-vision custom-vision

5.3 match 5 stars 5.00 score 8 scripts

ropensci

targets:Dynamic Function-Oriented 'Make'-Like Declarative Pipelines

Pipeline tools coordinate the pieces of computationally demanding analysis projects. The 'targets' package is a 'Make'-like pipeline tool for statistics and data science in R. The package skips costly runtime for tasks that are already up to date, orchestrates the necessary computation with implicit parallel computing, and abstracts files as R objects. If all the current output matches the current upstream code and data, then the whole pipeline is up to date, and the results are more trustworthy than otherwise. The methodology in this package borrows from GNU 'Make' (2015, ISBN:978-9881443519) and 'drake' (2018, <doi:10.21105/joss.00550>).

Maintained by William Michael Landau. Last updated 14 hours ago.

data-science high-performance-computing make peer-reviewed pipeline r-targetopia reproducibility reproducible-research targets workflow

1.8 match 973 stars 15.20 score 4.6k scripts 22 dependents

robinhankin

gsl:Wrapper for the GNU Scientific Library

An R wrapper for some of the functionality of the GNU Scientific Library.

Maintained by Robin K. S. Hankin. Last updated 2 months ago.

gsl

2.3 match 15 stars 11.82 score 472 scripts 204 dependents

davisvaughan

treesitter:Bindings to 'Tree-Sitter'

Provides bindings to 'Tree-sitter', an incremental parsing system for programming tools. 'Tree-sitter' builds concrete syntax trees for source files of any language, and can efficiently update those syntax trees as the source file is edited. It also includes a robust error recovery system that provides useful parse results even in the presence of syntax errors.

Maintained by Davis Vaughan. Last updated 6 months ago.

4.0 match 37 stars 6.62 score 18 scripts 2 dependents

bioc

multiWGCNA:multiWGCNA

An R package for deeply mining gene co-expression networks in multi-trait expression data. Provides functions for analyzing, comparing, and visualizing WGCNA networks across conditions. multiWGCNA was designed to handle the common case where there are multiple biologically meaningful sample traits, such as disease vs wildtype across development or anatomical region.

Maintained by Dario Tommasini. Last updated 5 months ago.

sequencing rnaseq geneexpression differentialexpression regression clustering

6.2 match 4.30 score 6 scripts

zhuwang46

irboost:Iteratively Reweighted Boosting for Robust Analysis

Fit a predictive model using iteratively reweighted boosting (IRBoost) to minimize robust loss functions within the CC-family (concave-convex). This constitutes an application of iteratively reweighted convex optimization (IRCO), where convex optimization is performed using the functional descent boosting algorithm. IRBoost assigns weights to facilitate outlier identification. Applications include robust generalized linear models and robust accelerated failure time models. Wang (2025) <doi:10.6339/24-JDS1138>.

Maintained by Zhu Wang. Last updated 1 months ago.

8.8 match 3.00 score

divdyn

divDyn:Diversity Dynamics using Fossil Sampling Data

Functions to describe sampling and diversity dynamics of fossil occurrence datasets (e.g. from the Paleobiology Database). The package includes methods to calculate range- and occurrence-based metrics of taxonomic richness, extinction and origination rates, along with traditional sampling measures. A powerful subsampling tool is also included that implements frequently used sampling standardization methods in a multiple bin-framework. The plotting of time series and the occurrence data can be simplified by the functions incorporated in the package, as well as other calculations, such as environmental affinities and extinction selectivity testing. Details can be found in: Kocsis, A.T.; Reddin, C.J.; Alroy, J. and Kiessling, W. (2019) <doi:10.1101/423780>.

Maintained by Adam T. Kocsis. Last updated 4 months ago.

diversity extinction fossil-data occurrences origination paleobiology cpp

4.0 match 11 stars 6.48 score 137 scripts

bioc

RPA:RPA: Robust Probabilistic Averaging for probe-level analysis

Probabilistic analysis of probe reliability and differential gene expression on short oligonucleotide arrays.

Maintained by Leo Lahti. Last updated 5 months ago.

geneexpression microarray preprocessing qualitycontrol

4.5 match 5.78 score 20 scripts 1 dependent

mqbssppe

label.switching:Relabelling MCMC Outputs of Mixture Models

The Bayesian estimation of mixture models (and more general hidden Markov models) suffers from the label switching phenomenon, making the MCMC output non-identifiable. This package can be used in order to deal with this problem using various relabelling algorithms.

Maintained by Panagiotis Papastamoulis. Last updated 6 years ago.

7.6 match 1 stars 3.41 score 65 scripts 11 dependents

business-science

modeltime.ensemble:Ensemble Algorithms for Time Series Forecasting with Modeltime

A 'modeltime' extension that implements time series ensemble forecasting methods including model averaging, weighted averaging, and stacking. These techniques are popular methods to improve forecast accuracy and stability.

Maintained by Matt Dancho. Last updated 8 months ago.

ensemble ensemble-learning forecast forecasting modeltime stacking stacking-ensemble tidymodels time time-series timeseries

3.1 match 77 stars 8.30 score 143 scripts

cvxgrp

CVXR:Disciplined Convex Optimization

An object-oriented modeling language for disciplined convex programming (DCP) as described in Fu, Narasimhan, and Boyd (2020, <doi:10.18637/jss.v094.i14>). It allows the user to formulate convex optimization problems in a natural way following mathematical convention and DCP rules. The system analyzes the problem, verifies its convexity, converts it into a canonical form, and hands it off to an appropriate solver to obtain the solution. Interfaces to solvers on CRAN and elsewhere are provided, both commercial and open source.

Maintained by Anqi Fu. Last updated 4 months ago.

cpp

2.0 match 207 stars 12.89 score 768 scripts 51 dependents

awblocker

ipfp:Fast Implementation of the Iterative Proportional Fitting Procedure in C

A fast (C) implementation of the iterative proportional fitting procedure.

Maintained by Alexander W Blocker. Last updated 3 years ago.

openblas

5.1 match 13 stars 4.98 score 49 scripts 1 dependent

dgkf

ggpackets:Package Plot Layers for Easier Portability and Modularization

Create groups of 'ggplot2' layers that can be easily migrated from one plot to another, reducing redundant code and improving the ability to format many plots that draw from the same source 'ggpacket' layers.

Maintained by Doug Kelkhoff. Last updated 12 days ago.

ggplot plotting

3.5 match 69 stars 7.34 score 12 scripts 1 dependent

bioc

iasva:Iteratively Adjusted Surrogate Variable Analysis

Iteratively Adjusted Surrogate Variable Analysis (IA-SVA) is a statistical framework to uncover hidden sources of variation even when these sources are correlated. IA-SVA provides a flexible methodology to i) identify a hidden factor for unwanted heterogeneity while adjusting for all known factors; ii) test the significance of the putative hidden factor for explaining the unmodeled variation in the data; and iii), if significant, use the estimated factor as an additional known factor in the next iteration to uncover further hidden factors.

Maintained by Donghyung Lee. Last updated 5 months ago.

preprocessing qualitycontrol batcheffect rnaseq software statisticalmethod featureextraction immunooncology

5.5 match 4.65 score 45 scripts

adeverse

ade4:Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences

Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) <doi:10.18637/jss.v022.i04>.

Maintained by Aurélie Siberchicot. Last updated 11 days ago.

openblas cpp

1.7 match 39 stars 14.96 score 2.2k scripts 256 dependents

s-u

iotools:I/O Tools for Streaming

Basic I/O tools for streaming and data parsing.

Maintained by Simon Urbanek. Last updated 1 year ago.

3.4 match 48 stars 7.35 score 60 scripts 10 dependents

mjskay

tidybayes:Tidy Data and 'Geoms' for Bayesian Models

Compose data for and extract, manipulate, and visualize posterior draws from Bayesian models ('JAGS', 'Stan', 'rstanarm', 'brms', 'MCMCglmm', 'coda', ...) in a tidy data format. Functions are provided to help extract tidy data frames of draws from Bayesian models and that generate point summaries and intervals in a tidy format. In addition, 'ggplot2' 'geoms' and 'stats' are provided for common visualization primitives like points with multiple uncertainty intervals, eye plots (intervals plus densities), and fit curves with multiple, arbitrary uncertainty bands.

Maintained by Matthew Kay. Last updated 6 months ago.

bayesian-data-analysis brms ggplot2 jags stan tidy-data visualization

1.7 match 732 stars 14.88 score 7.3k scripts 19 dependents

turtletopia

aurrera:Wrap an Iterable in a Progress Bar

Allows simple creation of progress bars by wrapping the iterated object in 'pb()'.

Maintained by Laura Bakala. Last updated 2 years ago.

iterable lapply map progress-bar

12.4 match 2 stars 2.00 score 3 scripts

s-baumann

FixedPoint:Algorithms for Finding Fixed Point Vectors of Functions

For functions that take and return vectors (or scalars), this package provides 8 algorithms for finding fixed point vectors (vectors for which the inputs and outputs to the function are the same vector). These algorithms include Anderson (1965) acceleration <doi:10.1145/321296.321305>, epsilon extrapolation methods (Wynn 1962 <doi:10.2307/2004051>) and minimal polynomial methods (Cabay and Jackson 1976 <doi:10.1137/0713060>).

Maintained by Stuart Baumann. Last updated 2 years ago.

6.7 match 1 star 3.69 score 33 scripts 1 dependent

maarten14c

coffee:Chronological Ordering for Fossils and Environmental Events

While individual calibrated radiocarbon dates can span several centuries, combining multiple dates together with any chronological constraints can make a chronology much more robust and precise. This package uses Bayesian methods to enforce the chronological ordering of radiocarbon and other dates, for example for trees with multiple radiocarbon dates spaced at exactly known intervals (e.g., 10 annual rings). For methods see Christen 2003 <doi:10.11141/ia.13.2>. Another example is sites where the relative chronological position of the dates is taken into account - the ages of dates further down a site must be older than those of dates further up (Buck, Kenworthy, Litton and Smith 1991 <doi:10.1017/S0003598X00080534>; Nicholls and Jones 2001 <doi:10.1111/1467-9876.00250>). The paper accompanying this R package is Blaauw et al. 2024 <doi:10.1017/RDC.2024.56>.

Maintained by Maarten Blaauw. Last updated 3 months ago.

4.1 match 7 stars 6.02 score 6 scripts

bioc

iBBiG:Iterative Binary Biclustering of Genesets

iBBiG is a bi-clustering algorithm which is optimized for binary data analysis. We apply it to meta-gene set analysis of large numbers of gene expression datasets. The iterative algorithm extracts groups of phenotypes from multiple studies that are associated with similar gene sets. iBBiG does not require prior knowledge of the number or scale of clusters and allows discovery of clusters with diverse sizes.

Maintained by Aedin Culhane. Last updated 5 months ago.

clustering annotation genesetenrichment

5.4 match 4.56 score 3 scripts 2 dependents

gasparrini

mixmeta:An Extended Mixed-Effects Framework for Meta-Analysis

A collection of functions to perform various meta-analytical models through a unified mixed-effects framework, including standard univariate fixed and random-effects meta-analysis and meta-regression, and non-standard extensions such as multivariate, multilevel, longitudinal, and dose-response models.

Maintained by Antonio Gasparrini. Last updated 3 years ago.

3.5 match 13 stars 6.96 score 63 scripts 13 dependents

padrinodb

ipmr:Integral Projection Models

Flexibly implements Integral Projection Models using a mathematical(ish) syntax. This package will not help with the vital rate modeling process, but will help convert those regression models into an IPM. 'ipmr' handles density dependence and environmental stochasticity, with a couple of options for implementing the latter. In addition, provides functions to avoid unintentional eviction of individuals from models. Additionally, provides model diagnostic tools, plotting functionality, stochastic/deterministic simulations, and analysis tools. Integral projection models are described in depth by Easterling et al. (2000) <doi:10.1890/0012-9658(2000)081[0694:SSSAAN]2.0.CO;2>, Merow et al. (2013) <doi:10.1111/2041-210X.12146>, Rees et al. (2014) <doi:10.1111/1365-2656.12178>, and Metcalf et al. (2015) <doi:10.1111/2041-210X.12405>. Williams et al. (2012) <doi:10.1890/11-2147.1> discuss the problem of unintentional eviction.

Maintained by Sam Levin. Last updated 4 months ago.

demography integral-projection-models cpp

3.5 match 7 stars 6.92 score 66 scripts 1 dependents

datacloning

dclone:Data Cloning and MCMC Tools for Maximum Likelihood Methods

Low level functions for implementing maximum likelihood estimating procedures for complex models using data cloning and Bayesian Markov chain Monte Carlo methods as described in Solymos 2010 <doi:10.32614/RJ-2010-011>. Sequential and parallel MCMC support for 'JAGS', 'WinBUGS', 'OpenBUGS', and 'Stan'.

Maintained by Peter Solymos. Last updated 6 months ago.

jags cpp

3.5 match 7 stars 6.91 score 215 scripts 4 dependents

statisticsnorway

GaussSuppression:Tabular Data Suppression using Gaussian Elimination

A statistical disclosure control tool to protect tables by suppression using the Gaussian elimination secondary suppression algorithm (Langsrud, 2024) <doi:10.1007/978-3-031-69651-0_6>. A suggestion is to start by working with functions SuppressSmallCounts() and SuppressDominantCells(). These functions use primary suppression functions for the minimum frequency rule and the dominance rule, respectively. Novel functionality for suppression of disclosive cells is also included. General primary suppression functions can be supplied as input to the general workhorse function, GaussSuppressionFromData(). Suppressed frequencies can be replaced by synthetic decimal numbers as described in Langsrud (2019) <doi:10.1007/s11222-018-9848-9>.

Maintained by Øyvind Langsrud. Last updated 21 hours ago.

3.6 match 2 stars 6.61 score 50 scripts

cmann3

eList:List Comprehension and Tools

Create list comprehensions (and other types of comprehension) similar to those in 'python', 'haskell', and other languages. List comprehension in 'R' converts a regular for() loop into a vectorized lapply() function. Support for looping with multiple variables, parallelization, and across non-standard objects included. Package also contains a variety of functions to help with list comprehension.

Maintained by Chris Mann. Last updated 4 years ago.

5.3 match 2 stars 4.48 score 9 scripts 1 dependents

mlr-org

bbotk:Black-Box Optimization Toolkit

Features highly configurable search spaces via the 'paradox' package and optimizes every user-defined objective function. The package includes several optimization algorithms e.g. Random Search, Iterated Racing, Bayesian Optimization (in 'mlr3mbo') and Hyperband (in 'mlr3hyperband'). bbotk is the base package of 'mlr3tuning', 'mlr3fselect' and 'miesmuschel'.

Maintained by Marc Becker. Last updated 3 months ago.

bbotk black-box-optimization data-science hyperparameter-optimization hyperparameter-tuning machine-learning mlr3 optimization

2.4 match 22 stars 9.87 score 166 scripts 14 dependents

dranthropoid

mmodely:Modeling Multivariate Origins Determinants - Evolutionary Lineages in Ecology

Perform multivariate modeling of evolved traits, with special attention to understanding the interplay of the multi-factorial determinants of their origins in complex ecological settings (Stephens, 2007 <doi:10.1016/j.tree.2006.12.003>). This software primarily concentrates on phylogenetic regression analysis, enabling implementation of tree transformation averaging and visualization functionality. Functions additionally support information theoretic approaches (Grueber, 2011 <doi:10.1111/j.1420-9101.2010.02210.x>; Garamszegi, 2011 <doi:10.1007/s00265-010-1028-7>) such as model averaging and selection of phylogenetic models. Accessory functions are also implemented for coefficient standardization (Cade 2015), selection uncertainty, and variable importance (Burnham & Anderson 2000). There are numerous other functions for visualizing confounded variables, plotting phylogenetic trees, as well as reporting and exporting modeling results. Lastly, as challenges to ecology are inherently multifarious, and therefore often multi-dataset, this package features several functions to support the identification, interpolation, merging, and updating of missing data and outdated nomenclature.

Maintained by David M Schruth. Last updated 2 years ago.

10.3 match 2.30 score 4 scripts

mrc-ide

dust:Iterate Multiple Realisations of Stochastic Models

An engine for simulation of stochastic models. Includes support for running stochastic models in parallel, either with shared or varying parameters. Simulations are run efficiently in compiled code and can be run with a fraction of simulated states returned to R, allowing control over memory usage. Support is provided for building a bootstrap particle filter for performing Sequential Monte Carlo (e.g., Gordon et al. 1993 <doi:10.1049/ip-f-2.1993.0015>). The core of the simulation engine is the 'xoshiro256**' algorithm (Blackman and Vigna <arXiv:1805.01407>), and the package is further described in FitzJohn et al 2021 <doi:10.12688/wellcomeopenres.16466.2>.

Maintained by Rich FitzJohn. Last updated 5 months ago.

cpp openmp

3.0 match 18 stars 7.84 score 60 scripts 3 dependents

tidymodels

shinymodels:Interactive Assessments of Models

Launch a 'shiny' application for 'tidymodels' results. For classification or regression models, the app can be used to determine if there is lack of fit or poorly predicted points.

Maintained by Simon Couch. Last updated 5 months ago.

shiny

3.8 match 48 stars 6.21 score 48 scripts

sssydysss

TransProR:Analysis and Visualization of Multi-Omics Data

A tool for comprehensive transcriptomic data analysis, with a focus on transcript-level data preprocessing, expression profiling, differential expression analysis, and functional enrichment. It enables researchers to identify key biological processes, disease biomarkers, and gene regulatory mechanisms. 'TransProR' is aimed at researchers and bioinformaticians working with RNA-Seq data, providing an intuitive framework for in-depth analysis and visualization of transcriptomic datasets. The package includes comprehensive documentation and usage examples to guide users through the entire analysis pipeline. The differential expression analysis methods incorporated in the package include 'limma' (Ritchie et al., 2015, <doi:10.1093/nar/gkv007>; Smyth, 2005, <doi:10.1007/0-387-29362-0_23>), 'edgeR' (Robinson et al., 2010, <doi:10.1093/bioinformatics/btp616>), 'DESeq2' (Love et al., 2014, <doi:10.1186/s13059-014-0550-8>), and Wilcoxon tests (Li et al., 2022, <doi:10.1186/s13059-022-02648-4>), providing flexible and robust approaches to RNA-Seq data analysis. For more information, refer to the package vignettes and related publications.

Maintained by Dongyue Yu. Last updated 18 days ago.

3.1 match 174 stars 7.55 score 34 scripts

flr

mse:Tools for Running Management Strategy Evaluations using FLR

A set of functions and methods to enable the development and running of Management Strategy Evaluation (MSE) analyses, using the FLR packages and classes and the a4a methods and algorithms.

Maintained by Iago Mosqueira. Last updated 20 days ago.

simulation mse fisheries flr a4a

3.3 match 4 stars 7.04 score 137 scripts 3 dependents

bioc

mixOmics:Omics Data Integration Project

Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing properties of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated, and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics etc) but also beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non-exhaustive list of methods includes variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.

Maintained by Eva Hamrud. Last updated 2 days ago.

immunooncology microarray sequencing metabolomics metagenomics proteomics geneprediction multiplecomparison classification regression bioconductor genomics genomics-data genomics-visualization multivariate-analysis multivariate-statistics omics r-pkg r-project

1.7 match 182 stars 13.71 score 1.3k scripts 22 dependents

kerguler

albopictus:Age-Structured Population Dynamics Model

Implements discrete time deterministic and stochastic age-structured population dynamics models described in Erguler and others (2016) <doi:10.1371/journal.pone.0149282> and Erguler and others (2017) <doi:10.1371/journal.pone.0174293>.

Maintained by Kamil Erguler. Last updated 5 years ago.

7.0 match 2 stars 3.30 score 2 scripts

ropensci

openalexR:Getting Bibliographic Records from 'OpenAlex' Database Using 'DSL' API

A set of tools to extract bibliographic content from 'OpenAlex' database using API <https://docs.openalex.org>.

Maintained by Massimo Aria. Last updated 26 days ago.

bibliographic-data bibliographic-database bibliometrics bibliometrix science-mapping

2.3 match 107 stars 10.24 score 194 scripts 5 dependents

skranz

RelationalContracts:Characterize relational contracts in repeated or stochastic games

Characterize relational contracts in repeated or stochastic games. Can also analyse repeated negotiation equilibria.

Maintained by Sebastian Kranz. Last updated 4 years ago.

dynamic-game economics game-theory hold-up nash-equilibrium repeated-game stochastic-game

9.3 match 4 stars 2.48 score 15 scripts

edwindj

whisker:{{mustache}} for R, logicless templating

Implements 'Mustache' logicless templating.

Maintained by Edwin de Jonge. Last updated 2 years ago.

mustache-templates

1.8 match 212 stars 12.74 score 241 scripts 551 dependents

sdanzige

ADAPTS:Automated Deconvolution Augmentation of Profiles for Tissue Specific Cells

Tools to construct (or add to) cell-type signature matrices using flow sorted or single cell samples and deconvolve bulk gene expression data. Useful for assessing the quality of single cell RNAseq experiments, estimating the accuracy of signature matrices, and determining cell-type spillover. Please cite: Danziger SA et al. (2019) ADAPTS: Automated Deconvolution Augmentation of Profiles for Tissue Specific cells <doi:10.1371/journal.pone.0224693>.

Maintained by Samuel A Danziger. Last updated 3 years ago.

3.5 match 2 stars 6.56 score 40 scripts 1 dependents

jabiru

binr:Cut Numeric Values into Evenly Distributed Groups

Implementation of algorithms for cutting numerical values exhibiting a potentially highly skewed distribution into evenly distributed groups (bins). This functionality can be applied for binning discrete values, such as counts, as well as for discretization of continuous values, for example, during generation of features used in machine learning algorithms.

Maintained by Sergei Izrailev. Last updated 7 years ago.

4.0 match 11 stars 5.67 score 71 scripts 4 dependents
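
To make the idea of evenly distributed groups concrete, here is a minimal Python sketch of greedy equal-frequency binning for skewed discrete values. It is an illustration under stated assumptions, not the algorithm implemented by the package: the function name `equal_frequency_bins` is hypothetical, and the sketch simply keeps tied values in the same bin while filling bins toward an equal share of the observations.

```python
from collections import Counter

def equal_frequency_bins(values, n_bins):
    """Assign a bin index to each value so bins hold roughly equal counts.

    Hypothetical helper for illustration; equal values always share a bin.
    """
    counts = Counter(values)
    target = len(values) / n_bins  # ideal number of observations per bin
    bin_of = {}
    current_bin, filled = 0, 0
    for v in sorted(counts):
        # Advance to the next bin once the running count has reached
        # the cumulative quota of the current bin
        if filled >= target * (current_bin + 1) and current_bin < n_bins - 1:
            current_bin += 1
        bin_of[v] = current_bin
        filled += counts[v]
    return [bin_of[v] for v in values]

# Skewed counts: six zeros, three ones, and a short tail
bins = equal_frequency_bins([0] * 6 + [1] * 3 + [2, 3, 10], 3)
```

With heavily tied data the bins cannot be exactly equal in size; the sketch accepts an oversized first bin rather than splitting identical values.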

coolbutuseless

emphatic:Exploratory Analysis of Tabular Data using Colour Highlighting

Tools for exploratory analysis of tabular data using colour highlighting. Highlighting is displayed in any console supporting 'ANSI' colours, and can be converted to 'HTML', 'typst', 'latex' and 'SVG'. 'quarto' and 'rmarkdown' rendering are directly supported. It is also possible to add colour to regular expression matches and highlight differences between two arbitrary R objects.

Maintained by Mike Cheng. Last updated 3 months ago.

3.0 match 141 stars 7.55 score 12 scripts

cran

fullfact:Full Factorial Breeding Analysis

We facilitate the analysis of full factorial mating designs with mixed-effects models. The package contains six vignettes containing detailed examples.

Maintained by Aimee Lee Houde. Last updated 1 year ago.

8.0 match 2.78 score

atahk

pscl:Political Science Computational Laboratory

Bayesian analysis of item-response theory (IRT) models, roll call analysis; computing highest density regions; maximum likelihood estimation of zero-inflated and hurdle models for count data; goodness-of-fit measures for GLMs; data sets used in writing and teaching; seats-votes curves.

Maintained by Simon Jackman. Last updated 1 year ago.

1.7 match 67 stars 13.28 score 2.7k scripts 54 dependents

nerler

JointAI:Joint Analysis and Imputation of Incomplete Data

Joint analysis and imputation of incomplete data in the Bayesian framework, using (generalized) linear (mixed) models and extensions thereof, survival models, or joint models for longitudinal and survival data, as described in Erler, Rizopoulos and Lesaffre (2021) <doi:10.18637/jss.v100.i20>. Incomplete covariates, if present, are automatically imputed. The package performs some preprocessing of the data and creates a 'JAGS' model, which will then automatically be passed to 'JAGS' <https://mcmc-jags.sourceforge.io/> with the help of the package 'rjags'.

Maintained by Nicole S. Erler. Last updated 12 months ago.

bayesian generalized-linear-models glm glmm imputation imputations jags joint-analysis linear-mixed-models linear-regression-models mcmc-sample mcmc-sampling missing-data missing-values survival cpp

3.0 match 28 stars 7.30 score 59 scripts 1 dependents

dkyleward

ipfr:List Balancing for Reweighting and Population Synthesis

Performs iterative proportional updating given a seed table and an arbitrary number of marginal distributions. This is commonly used in population synthesis, survey raking, matrix rebalancing, and other applications. For example, a household survey may be weighted to match the known distribution of households by size from the census. An origin/destination trip matrix might be balanced to match traffic counts. The approach used by this package is based on a paper from Arizona State University (Ye, Xin, et al. (2009) <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.537.723&rep=rep1&type=pdf>). Some enhancements have been made to their work including primary and secondary target balance/importance, general marginal agreement, and weight restriction.

Maintained by Kyle Ward. Last updated 5 years ago.

4.3 match 5 stars 5.06 score 23 scripts
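
The matrix balancing case mentioned above (an origin/destination table matched to known totals) can be sketched with classic two-dimensional iterative proportional fitting. Note that this is the simpler, textbook relative of the iterative proportional updating approach the package is based on, not the package's method; the function name `ipf` and its parameters are hypothetical.

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Scale a 2-D seed table until its margins match the targets.

    Hypothetical helper for illustration; assumes the row and column
    targets sum to the same total.
    """
    table = [row[:] for row in seed]  # work on a copy of the seed
    for _ in range(max_iter):
        # Step 1: scale each row to its target total
        for i, target in enumerate(row_targets):
            s = sum(table[i])
            if s > 0:
                table[i] = [x * target / s for x in table[i]]
        # Step 2: scale each column to its target total
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in table)
            if s > 0:
                for row in table:
                    row[j] *= target / s
        # Columns now match exactly, so only the row margins need checking
        err = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if err < tol:
            return table
    raise RuntimeError("no convergence within max_iter iterations")

# Balance a small seed table to known row and column totals
balanced = ipf([[1.0, 2.0], [3.0, 4.0]], [40.0, 60.0], [35.0, 65.0])
```

The seed supplies the interaction pattern between rows and columns, while the targets fix the totals; zero cells in the seed stay zero throughout.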

bioc

seqArchR:Identify Different Architectures of Sequence Elements

seqArchR enables unsupervised discovery of _de novo_ clusters with characteristic sequence architectures characterized by position-specific motifs or composition of stretches of nucleotides, e.g., CG-richness. seqArchR does _not_ require any specifications w.r.t. the number of clusters, the length of any individual motifs, or the distance between motifs if and when they occur in pairs/groups; it directly detects them from the data. seqArchR uses non-negative matrix factorization (NMF) as its backbone, and employs a chunking-based iterative procedure that enables processing of large sequence collections efficiently. Wrapper functions are provided for visualizing cluster architectures as sequence logos.

Maintained by Sarvesh Nikumbh. Last updated 5 months ago.

motifdiscovery generegulation mathematicalbiology systemsbiology transcriptomics genetics clustering dimensionreduction featureextraction dnaseq nmf nonnegative-matrix-factorization promoter-sequence-architectures scikit-learn sequence-analysis sequence-architectures unsupervised-machine-learning

4.7 match 1 stars 4.48 score 9 scripts 1 dependents

christopherggreen

CerioliOutlierDetection:Outlier Detection Using the Iterated RMCD Method of Cerioli (2010)

Implements the iterated RMCD method of Cerioli (2010) for multivariate outlier detection via robust Mahalanobis distances. Also provides the finite-sample RMCD method discussed in the paper, as well as the methods provided in Hardin and Rocke (2005) <doi:10.1198/106186005X77685> and Green and Martin (2017).

Maintained by Christopher G. Green. Last updated 1 year ago.

6.8 match 10 stars 3.11 score 13 scripts

bioc

qpgraph:Estimation of genetic and molecular regulatory networks from high-throughput genomics data

Estimate gene and eQTL networks from high-throughput expression and genotyping assays.

Maintained by Robert Castelo. Last updated 4 days ago.

microarray geneexpression transcription pathways networkinference graphandnetwork generegulation genetics geneticvariability snp software openblas

3.6 match 5.75 score 20 scripts 3 dependents

ikosmidis

brglm2:Bias Reduction in Generalized Linear Models

Estimation and inference from generalized linear models based on various methods for bias reduction and maximum penalized likelihood with powers of the Jeffreys prior as penalty. The 'brglmFit' fitting method can achieve reduction of estimation bias by solving either the mean bias-reducing adjusted score equations in Firth (1993) <doi:10.1093/biomet/80.1.27> and Kosmidis and Firth (2009) <doi:10.1093/biomet/asp055>, or the median bias-reducing adjusted score equations in Kenne et al. (2017) <doi:10.1093/biomet/asx046>, or through the direct subtraction of an estimate of the bias of the maximum likelihood estimator from the maximum likelihood estimates as in Cordeiro and McCullagh (1991) <https://www.jstor.org/stable/2345592>. See Kosmidis et al (2020) <doi:10.1007/s11222-019-09860-6> for more details. Estimation in all cases takes place via a quasi Fisher scoring algorithm, and S3 methods for the construction of confidence intervals for the reduced-bias estimates are provided. In the special case of generalized linear models for binomial and multinomial responses (both ordinal and nominal), the adjusted score approaches to mean and median bias reduction have been found to return estimates with improved frequentist properties, that are also always finite, even in cases where the maximum likelihood estimates are infinite (e.g. complete and quasi-complete separation; see Kosmidis and Firth, 2020 <doi:10.1093/biomet/asaa052>, for a proof for mean bias reduction in logistic regression).

Maintained by Ioannis Kosmidis. Last updated 6 months ago.

adjusted-score-equations algorithms bias-reducing-adjustments bias-reduction estimation glm logistic-regression nominal-responses ordinal-responses regression regression-algorithms statistics

2.0 match 32 stars 10.41 score 106 scripts 10 dependents

ralmond

mongo:Higher level interface to Mongo database

This is a wrapper for the jsonlite and mongolite packages which offers both an R6 object for managing the connection as well as some mechanisms for saving and restoring S4 objects to a Mongo database.

Maintained by Russell Almond. Last updated 10 months ago.

5.0 match 4.13 score 3 dependents

marce10

warbleR:Streamline Bioacoustic Analysis

Functions aiming to facilitate the analysis of the structure of animal acoustic signals in 'R'. 'warbleR' makes use of the basic sound analysis tools from the packages 'tuneR' and 'seewave', and offers new tools to explore and quantify acoustic signal structure. The package allows users to organize and manipulate multiple sound files, create spectrograms of complete recordings or individual signals in different formats, run several measures of acoustic structure, and characterize different structural levels in acoustic signals.

Maintained by Marcelo Araya-Salas. Last updated 2 months ago.

animal-acoustic-signals audio-processing bioacoustics spectrogram streamline-analysis cpp

1.9 match 54 stars 11.01 score 270 scripts 4 dependents

bedapub

designit:Blocking and Randomization for Experimental Design

Intelligently assign samples to batches in order to reduce batch effects. Batch effects can have a significant impact on data analysis, especially when the assignment of samples to batches coincides with the contrast groups being studied. By defining a batch container and a scoring function that reflects the contrasts, this package allows users to assign samples in a way that minimizes the potential impact of batch effects on the comparison of interest. Among other functionality, we provide an implementation of the OSAT score by Yan et al. (2012, <doi:10.1186/1471-2164-13-689>).

Maintained by Iakov I. Davydov. Last updated 4 months ago.

design-of-experiments randomization

2.8 match 8 stars 7.28 score 24 scripts

bioc

metagenomeSeq:Statistical analysis for sparse high-throughput sequencing

metagenomeSeq is designed to determine features (be it Operational Taxonomic Unit (OTU), species, etc.) that are differentially abundant between two or more groups of multiple samples. metagenomeSeq is designed to address the effects of both normalization and under-sampling of microbial communities on disease association detection and the testing of feature correlations.

Maintained by Joseph N. Paulson. Last updated 3 months ago.

immunooncology classification clustering geneticvariability differentialexpression microbiome metagenomics normalization visualization multiplecomparison sequencing software

1.7 match 69 stars 12.02 score 494 scripts 7 dependents

trivialfis

xgboost:Extreme Gradient Boosting

Extreme Gradient Boosting, which is an efficient implementation of the gradient boosting framework from Chen & Guestrin (2016) <doi:10.1145/2939672.2939785>. This package is its R interface. The package includes an efficient linear model solver and tree learning algorithms. The package can automatically do parallel computation on a single machine which could be more than 10 times faster than existing gradient boosting packages. It supports various objective functions, including regression, classification and ranking. The package is made to be extensible, so that users are also allowed to define their own objectives easily.

Maintained by Jiaming Yuan. Last updated 8 months ago.

cpp openmp

1.8 match 6 stars 11.70 score 13k scripts 112 dependents

bioc

DAPAR:Tools for the Differential Analysis of Proteins Abundance with R

The package DAPAR is a Bioconductor-distributed R package which provides all the necessary functions to analyze quantitative data from label-free proteomics experiments. Contrary to most other similar R packages, it is endowed with rich and user-friendly graphical interfaces, so that no programming skill is required (see `Prostar` package).

Maintained by Samuel Wieczorek. Last updated 5 months ago.

proteomics normalization preprocessing massspectrometry qualitycontrol go dataimport prostar1

3.8 match 2 stars 5.42 score 22 scripts 1 dependents

t-kalinowski

keras:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.

Maintained by Tomasz Kalinowski. Last updated 11 months ago.

1.9 match 10.82 score 10k scripts 54 dependents

kserkcho

SCEM:Splitting-Coalescence-Estimation Method

We introduce improved methods for statistically assessing birth seasonality and intra-annual variation. The first method we propose is a new idea that uses a nonparametric clustering procedure to group individuals with similar time series data and estimate birth seasonality based on the clusters. One can use the function SCEM() to implement this method. The second method estimates input parameters for use with a previously-developed parametric approach (Tornero et al., 2013). The relevant code for this approach is makeFits_OLS(), while makeFits_initial() is the code to implement the same method but with given initial conditions for two parameters. The latter can be used to show the disadvantage of the existing approach. One can use the function makeFits() to generate parametric birth seasonality estimates using either initialization. Detailed description can be found here: Chazin Hannah, Soudeep Deb, Joshua Falk, and Arun Srinivasan. (2019) "New Statistical Approaches to Intra-Individual Isotopic Analysis and Modeling Birth Seasonality in Studies of Herd Animals." <doi:10.1111/arcm.12432>.

Maintained by Kyung Serk Cho. Last updated 4 years ago.

scem

4.7 match 4.30 score 3 scripts

stewid

SimInf:A Framework for Data-Driven Stochastic Disease Spread Simulations

Provides an efficient and very flexible framework to conduct data-driven epidemiological modeling in realistic large scale disease spread simulations. The framework integrates infection dynamics in subpopulations as continuous-time Markov chains using the Gillespie stochastic simulation algorithm and incorporates available data such as births, deaths and movements as scheduled events at predefined time-points. Using C code for the numerical solvers and 'OpenMP' (if available) to divide work over multiple processors ensures high performance when simulating a sample outcome. One of our design goals was to make the package extendable and enable usage of the numerical solvers from other R extension packages in order to facilitate complex epidemiological research. The package contains template models and can be extended with user-defined models. For more details see the paper by Widgren, Bauer, Eriksson and Engblom (2019) <doi:10.18637/jss.v091.i12>. The package also provides functionality to fit models to time series data using the Approximate Bayesian Computation Sequential Monte Carlo ('ABC-SMC') algorithm of Toni and others (2009) <doi:10.1098/rsif.2008.0172>.

Maintained by Stefan Widgren. Last updated 3 days ago.

data-driven epidemiology high-performance-computing markov-chain mathematical-modelling gsl openmp

2.0 match 35 stars 10.09 score 227 scripts

cran

DiceOptim:Kriging-Based Optimization for Computer Experiments

Efficient Global Optimization (EGO) algorithm as described in "Roustant et al. (2012)" <doi:10.18637/jss.v051.i01> and adaptations for problems with noise ("Picheny and Ginsbourger, 2012") <doi:10.1016/j.csda.2013.03.018>, parallel infill, and problems with constraints.

Maintained by Victor Picheny. Last updated 4 years ago.

6.5 match 4 stars 3.11 score 107 scripts 1 dependents

ichcha-m

cophescan:Adaptation of the Coloc Method for PheWAS

A Bayesian method for Phenome-wide association studies (PheWAS) that identifies causal associations between genetic variants and traits, while simultaneously addressing confounding due to linkage disequilibrium. For details see Manipur et al (2023) <doi:10.1101/2023.06.29.546856>.

Maintained by Ichcha Manipur. Last updated 9 months ago.

cpp openmp

3.5 match 6 stars 5.76 score 24 scripts

bnaras

cubature:Adaptive Multivariate Integration over Hypercubes

R wrappers around the cubature C library of Steven G. Johnson for adaptive multivariate integration over hypercubes and the Cuba C library of Thomas Hahn for deterministic and Monte Carlo integration. Scalar and vector interfaces for cubature and Cuba routines are provided; the vector interfaces are highly recommended as demonstrated in the package vignette.

Maintained by Balasubramanian Narasimhan. Last updated 8 months ago.

fortran cpp

1.8 match 12 stars 11.08 score 488 scripts 162 dependents

paws-r

paws.common:Paws Low-Level Amazon Web Services API

Functions for making low-level API requests to Amazon Web Services <https://aws.amazon.com>. The functions handle building, signing, and sending requests, and receiving responses. They are designed to help build higher-level interfaces to individual services, such as Simple Storage Service (S3).

Maintained by Dyfan Jones. Last updated 2 days ago.

aws aws-sdk cpp

1.8 match 332 stars 11.07 score 39 dependents

uupharmacometrics

xpose:Diagnostics for Pharmacometric Models

Diagnostics for non-linear mixed-effects (population) models from 'NONMEM' <https://www.iconplc.com/solutions/technologies/nonmem/>. 'xpose' facilitates data import and the creation of numerical run summaries, and provides 'ggplot2'-based graphics for data exploration and model diagnostics.

Maintained by Benjamin Guiastrennec. Last updated 2 months ago.

diagnostics ggplot2 nonmem pharmacometrics xpose

1.8 match 62 stars 11.02 score 183 scripts 6 dependents

mjlajeunesse

switchboard:An Agile Widget Engine for Real-Time, Dynamic Visualizations

An unsorted collection of visualization widgets rendered in 'Tcl/Tk' <https://www.tcl.tk/> to generate agile dashboards for your iterative simulations. Widgets include progress bars, counters, eavesdroppers, injectors, switches, and sliders for dynamic manipulation and visualization of simulation parameters.

Maintained by Marc J. Lajeunesse. Last updated 3 years ago.

4.0 match 18 stars 4.95 score 2 scripts

laresbernardo

lares:Analytics & Machine Learning Sidekick

Auxiliary package for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of family functions, like Machine Learning, Data Wrangling, Marketing Mix Modeling (Robyn), Exploratory, API, and Scrapper, it helps the analyst or data scientist to get quick and robust results, without the need for repetitive coding or advanced R programming skills.

Maintained by Bernardo Lares. Last updated 23 days ago.

analytics api automation automl data-science descriptive-statistics h2o machine-learning marketing mmm predictive-modeling puzzle rlanguage robyn visualization

2.0 match 233 stars 9.84 score 185 scripts 1 dependents

smouksassi

coveffectsplot:Produce Forest Plots to Visualize Covariate Effects

Produce forest plots to visualize covariate effects using either the command line or an interactive 'Shiny' application.

Maintained by Samer Mouksassi. Last updated 1 month ago.

2.5 match 32 stars 7.86 score 40 scripts

r4ss

r4ss:R Code for Stock Synthesis

A collection of R functions for use with Stock Synthesis, a fisheries stock assessment modeling platform written in ADMB by Dr. Richard D. Methot at the NOAA Northwest Fisheries Science Center. The functions include tools for summarizing and plotting results, manipulating files, visualizing model parameterizations, and various other common stock assessment tasks. This version of '{r4ss}' is compatible with Stock Synthesis versions 3.24 through 3.30 (specifically version 3.30.23.1, from December 2024). Support for 3.24 models is only through the core functions for reading output and plotting.

Maintained by Ian G. Taylor. Last updated 3 days ago.

fisheries fisheries-stock-assessment stock-synthesis

1.7 match 43 stars 11.38 score 1.0k scripts 2 dependents

marcosmolla

complexNet:Complex Network Generation

Providing a set of functions to easily generate and iterate complex networks. The functions can be used to generate realistic networks with a wide range of different clustering, density, and average path length. For more information consult research articles by Amiyaal Ilany and Erol Akcay (2016) <doi:10.1093/icb/icw068> and Ilany and Erol Akcay (2016) <doi:10.1101/026120>, which have inspired many methods in this package.

Maintained by Marco Smolla. Last updated 8 months ago.

complex-networks social-inheritance social-networks

4.5 match 4 stars 4.30 score 4 scripts