Showing 200 of 3580 total results
igraph
igraph:Network Analysis and Visualization
Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.
Maintained by Kirill Müller. Last updated 1 day ago.
complex-networks graph-algorithms graph-theory mathematics network-analysis network-graph fortran libxml2 glpk openblas cpp
76.8 match 581 stars 21.10 score 31k scripts 1.9k dependents
spatstat
spatstat.random:Random Generation Functionality for the 'spatstat' Family
Functionality for random generation of spatial data in the 'spatstat' family of packages. Generates random spatial patterns of points according to many simple rules (complete spatial randomness, Poisson, binomial, random grid, systematic, cell), randomised alteration of patterns (thinning, random shift, jittering), simulated realisations of random point processes including simple sequential inhibition, Matern inhibition models, Neyman-Scott cluster processes (using direct, Brix-Kendall, or hybrid algorithms), log-Gaussian Cox processes, product shot noise cluster processes and Gibbs point processes (using Metropolis-Hastings birth-death-shift algorithm, alternating Gibbs sampler, or coupling-from-the-past perfect simulation). Also generates random spatial patterns of line segments, random tessellations, and random images (random noise, random mosaics). Excludes random generation on a linear network, which is covered by the separate package 'spatstat.linnet'.
Maintained by Adrian Baddeley. Last updated 6 months ago.
point-processes random-generation simulation spatial-sampling spatial-simulation cpp
92.2 match 5 stars 10.77 score 84 scripts 173 dependents
trinker
wakefield:Generate Random Data Sets
Generates random data sets including: data.frames, lists, and vectors.
Maintained by Tyler Rinker. Last updated 5 years ago.
118.8 match 256 stars 7.13 score 209 scripts
eddelbuettel
random:True Random Numbers using RANDOM.ORG
The true random number service provided by the RANDOM.ORG website created by Mads Haahr samples atmospheric noise via radio tuned to an unused broadcasting frequency together with a skew correction algorithm due to John von Neumann. More background is available in the included vignette based on an essay by Mads Haahr. In its current form, the package offers functions to retrieve random integers, randomized sequences and random strings.
Maintained by Dirk Eddelbuettel. Last updated 28 days ago.
82.4 match 9 stars 9.47 score 1.4k scripts 2 dependents
alexpghayes
distributions3:Probability Distributions as S3 Objects
Tools to create and manipulate probability distributions using S3. Generics pdf(), cdf(), quantile(), and random() provide replacements for base R's d/p/q/r style functions. Functions and arguments have been named carefully to minimize confusion for students in intro stats courses. The documentation for each distribution contains detailed mathematical notes.
Maintained by Alex Hayes. Last updated 6 months ago.
66.1 match 101 stars 11.31 score 118 scripts 7 dependents
stan-dev
posterior:Tools for Working with Posterior Distributions
Provides useful tools for both users and developers of packages for fitting Bayesian models or working with output from Bayesian models. The primary goals of the package are to: (a) Efficiently convert between many different useful formats of draws (samples) from posterior or prior distributions. (b) Provide consistent methods for operations commonly performed on draws, for example, subsetting, binding, or mutating draws. (c) Provide various summaries of draws in convenient formats. (d) Provide lightweight implementations of state of the art posterior inference diagnostics. References: Vehtari et al. (2021) <doi:10.1214/20-BA1221>.
Maintained by Paul-Christian Bürkner. Last updated 9 days ago.
45.3 match 168 stars 16.13 score 3.3k scripts 342 dependents
daqana
dqrng:Fast Pseudo Random Number Generators
Several fast random number generators are provided as C++ header only libraries: The PCG family by O'Neill (2014 <https://www.cs.hmc.edu/tr/hmc-cs-2014-0905.pdf>) as well as the Xoroshiro / Xoshiro family by Blackman and Vigna (2021 <doi:10.1145/3460772>). In addition fast functions for generating random numbers according to a uniform, normal and exponential distribution are included. The latter two use the Ziggurat algorithm originally proposed by Marsaglia and Tsang (2000, <doi:10.18637/jss.v005.i08>). The fast sampling methods support unweighted sampling both with and without replacement. These functions are exported to R and as a C++ interface and are enabled for use with the default 64 bit generator from the PCG family, Xoroshiro128+/++/** and Xoshiro256+/++/** as well as the 64 bit version of the 20 rounds Threefry engine (Salmon et al., 2011, <doi:10.1145/2063384.2063405>) as provided by the package 'sitmo'.
Maintained by Ralf Stubner. Last updated 6 months ago.
random random-distributions random-generation random-sampling rng cpp
51.6 match 42 stars 13.12 score 188 scripts 183 dependents
declaredesign
randomizr:Easy-to-Use Tools for Common Forms of Random Assignment and Sampling
Generates random assignments for common experimental designs and random samples for common sampling designs.
Maintained by Alexander Coppock. Last updated 1 month ago.
63.1 match 37 stars 9.90 score 396 scripts 13 dependents
rstudio
keras3:R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.
Maintained by Tomasz Kalinowski. Last updated 3 days ago.
40.8 match 845 stars 13.57 score 264 scripts 2 dependents
sparklyr
sparklyr:R Interface to Apache Spark
R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.
Maintained by Edgar Ruiz. Last updated 8 days ago.
apache-spark distributed dplyr ide livy machine-learning remote-clusters spark sparklyr
34.2 match 959 stars 15.16 score 4.0k scripts 21 dependents
spsanderson
TidyDensity:Functions for Tidy Analysis and Generation of Random Data
Makes it easy to generate random numbers based upon the underlying stats distribution functions. All data is returned in a tidy and structured format, making working with the data simple and straightforward. Given that the data is returned in a tidy 'tibble', it lends itself to working with the rest of the 'tidyverse'.
Maintained by Steven Sanderson. Last updated 5 months ago.
bootstrap density distributions ggplot2 probability r-language simulation statistics tibble tidy
66.2 match 34 stars 7.78 score 66 scripts 1 dependents
kkholst
mets:Analysis of Multivariate Event Times
Implementation of various statistical models for multivariate event history data <doi:10.1007/s10985-013-9244-x>. Including multivariate cumulative incidence models <doi:10.1002/sim.6016>, and bivariate random effects probit models (Liability models) <doi:10.1016/j.csda.2015.01.014>. Modern methods for survival analysis, including regression modelling (Cox, Fine-Gray, Ghosh-Lin, Binomial regression) with fast computation of influence functions.
Maintained by Klaus K. Holst. Last updated 1 day ago.
multivariate-time-to-event survival-analysis time-to-event fortran openblas cpp
36.7 match 14 stars 13.47 score 236 scripts 42 dependents
unuran
Runuran:R Interface to the 'UNU.RAN' Random Variate Generators
Interface to the 'UNU.RAN' library for Universal Non-Uniform RANdom variate generators. It thus allows building non-uniform random number generators from quite arbitrary distributions. In particular, it provides an algorithm for fast numerical inversion for distributions with a given density function. In addition, the package contains densities, distribution functions and quantiles from a couple of distributions.
Maintained by Josef Leydold. Last updated 5 months ago.
60.8 match 6.87 score 180 scripts 8 dependents
datastorm-open
rAmCharts:JavaScript Charts Tool
Provides an R interface for using 'AmCharts' Library. Based on 'htmlwidgets', it provides a global architecture to generate 'JavaScript' source code for charts. Most of classes in the library have their equivalent in R with S4 classes; for those classes, not all properties have been referenced but can easily be added in the constructors. Complex properties (e.g. 'JavaScript' object) can be passed as named list. See examples at <https://datastorm-open.github.io/introduction_ramcharts/> and <https://www.amcharts.com/> for more information about the library. The package includes the free version of 'AmCharts' Library. Its only limitation is a small link to the web site displayed on your charts. If you enjoy this library, do not hesitate to refer to this page <https://www.amcharts.com/online-store/> to purchase a licence, and thus support its creators and get a period of Priority Support. See also <https://www.amcharts.com/about/> for more information about 'AmCharts' company.
Maintained by Benoit Thieurmel. Last updated 2 months ago.
53.9 match 49 stars 7.17 score 153 scripts 4 dependents
easystats
insight:Easy Access to Model Information for Various Model Objects
A tool to provide easy, intuitive and consistent access to information contained in various R models, like model formulas, model terms, information about random effects, data that was used to fit the model or data from response variables. 'insight' mainly revolves around two types of functions: Functions that find (the names of) information, starting with 'find_', and functions that get the underlying data, starting with 'get_'. The package has a consistent syntax and works with many different model objects, where otherwise functions to access this information are missing.
Maintained by Daniel Lüdecke. Last updated 4 days ago.
easystats hacktoberfest insight models names predictors random
22.1 match 412 stars 17.24 score 568 scripts 210 dependents
insightsengineering
random.cdisc.data:Create Random ADaM Datasets
A set of functions to create random Analysis Data Model (ADaM) datasets and cached dataset. ADaM dataset specifications are described by the Clinical Data Interchange Standards Consortium (CDISC) Analysis Data Model Team.
Maintained by Joe Zhu. Last updated 5 months ago.
43.7 match 33 stars 8.60 score 52 scripts
cran
randomizeR:Randomization for Clinical Trials
This tool enables the user to choose a randomization procedure based on sound scientific criteria. It comprises the generation of randomization sequences as well as the assessment of randomization procedures based on carefully selected criteria. Furthermore, 'randomizeR' provides a function for the comparison of randomization procedures.
Maintained by Ralf-Dieter Hilgers. Last updated 1 year ago.
110.2 match 2 stars 3.38 score 1 dependents
openintrostat
openintro:Datasets and Supplemental Functions from 'OpenIntro' Textbooks and Labs
Supplemental functions and data for 'OpenIntro' resources, which includes open-source textbooks and resources for introductory statistics (<https://www.openintro.org/>). The package contains datasets used in our open-source textbooks along with custom plotting functions for reproducing book figures. Note that many functions and examples include color transparency; some plotting elements may not show up properly (or at all) when run in some versions of Windows operating system.
Maintained by Mine Çetinkaya-Rundel. Last updated 2 months ago.
28.1 match 240 stars 11.39 score 6.0k scripts
jknowles
merTools:Tools for Analyzing Mixed Effect Regression Models
Provides methods for extracting results from mixed-effect model objects fit with the 'lme4' package. Allows construction of prediction intervals efficiently from large scale linear and generalized linear mixed-effects models. This method draws from the simulation framework used in the Gelman and Hill (2007) textbook: Data Analysis Using Regression and Multilevel/Hierarchical Models.
Maintained by Jared E. Knowles. Last updated 1 year ago.
29.1 match 105 stars 10.49 score 768 scripts
statnet
ergm:Fit, Simulate and Diagnose Exponential-Family Models for Networks
An integrated set of tools to analyze and simulate networks based on exponential-family random graph models (ERGMs). 'ergm' is a part of the Statnet suite of packages for network analysis. See Hunter, Handcock, Butts, Goodreau, and Morris (2008) <doi:10.18637/jss.v024.i03> and Krivitsky, Hunter, Morris, and Klumb (2023) <doi:10.18637/jss.v105.i06>.
Maintained by Pavel N. Krivitsky. Last updated 5 days ago.
18.8 match 100 stars 15.36 score 1.4k scripts 36 dependents
covaruber
sommer:Solving Mixed Model Equations in R
Structural multivariate-univariate linear mixed model solver for estimation of multiple random effects with unknown variance-covariance structures (e.g., heterogeneous and unstructured) and known covariance among levels of random effects (e.g., pedigree and genomic relationship matrices) (Covarrubias-Pazaran, 2016 <doi:10.1371/journal.pone.0156744>; Maier et al., 2015 <doi:10.1016/j.ajhg.2014.12.006>; Jensen et al., 1997). REML estimates can be obtained using the Direct-Inversion Newton-Raphson and Direct-Inversion Average Information algorithms for the problems r x r (r being the number of records) or using the Henderson-based average information algorithm for the problem c x c (c being the number of coefficients to estimate). Spatial models can also be fitted using the two-dimensional spline functionality available.
Maintained by Giovanny Covarrubias-Pazaran. Last updated 20 days ago.
average-information mixed-models rcpparmadillo openblas cpp openmp
22.7 match 43 stars 12.70 score 300 scripts 9 dependents
mlr-org
mlr3extralearners:Extra Learners For mlr3
Extra learners for use in mlr3.
Maintained by Sebastian Fischer. Last updated 4 months ago.
29.0 match 94 stars 9.16 score 474 scripts
stochastictree
stochtree:Stochastic Tree Ensembles (XBART and BART) for Supervised Learning and Causal Inference
Flexible stochastic tree ensemble software. Robust implementations of Bayesian Additive Regression Trees (BART) Chipman, George, McCulloch (2010) <doi:10.1214/09-AOAS285> for supervised learning and Bayesian Causal Forests (BCF) Hahn, Murray, Carvalho (2020) <doi:10.1214/19-BA1195> for causal inference. Enables model serialization and parallel sampling and provides a low-level interface for custom stochastic forest samplers.
Maintained by Drew Herren. Last updated 16 days ago.
bart bayesian-machine-learning bayesian-methods decision-trees gradient-boosted-trees machine-learning probabilistic-models tree-ensembles cpp
30.6 match 20 stars 8.52 score 40 scripts
t-kalinowski
keras:R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.
Maintained by Tomasz Kalinowski. Last updated 11 months ago.
23.1 match 10.82 score 10k scripts 54 dependents
erichson
rsvd:Randomized Singular Value Decomposition
Low-rank matrix decompositions are fundamental tools and widely used for data analysis, dimension reduction, and data compression. Classically, highly accurate deterministic matrix algorithms are used for this task. However, the emergence of large-scale data has severely challenged our computational ability to analyze big data. The concept of randomness has been demonstrated as an effective strategy to quickly produce approximate answers to familiar problems such as the singular value decomposition (SVD). The rsvd package provides several randomized matrix algorithms such as the randomized singular value decomposition (rsvd), randomized principal component analysis (rpca), randomized robust principal component analysis (rrpca), randomized interpolative decomposition (rid), and the randomized CUR decomposition (rcur). In addition several plot functions are provided.
Maintained by N. Benjamin Erichson. Last updated 4 years ago.
dimension-reduction matrix-approximation pca principal-component-analysis probabilistic-algorithms randomized-algorithm singular-value-decomposition svd
22.7 match 98 stars 10.80 score 408 scripts 119 dependents
modeloriented
randomForestExplainer:Explaining and Visualizing Random Forests in Terms of Variable Importance
A set of tools to help explain which variables are most important in a random forest. Various variable importance measures are calculated and visualized in different settings in order to get an idea on how their importance changes depending on our criteria (Hemant Ishwaran and Udaya B. Kogalur and Eiran Z. Gorodeski and Andy J. Minn and Michael S. Lauer (2010) <doi:10.1198/jasa.2009.tm08622>, Leo Breiman (2001) <doi:10.1023/A:1010933404324>).
Maintained by Yue Jiang. Last updated 12 months ago.
23.1 match 231 stars 9.82 score 236 scripts
bioc
DelayedRandomArray:Delayed Arrays of Random Values
Implements a DelayedArray of random values where the realization of the sampled values is delayed until they are needed. Reproducible sampling within any subarray is achieved by chunking where each chunk is initialized with a different random seed and stream. The usual distributions in the stats package are supported, along with scalar, vector and arrays for the parameters.
Maintained by Aaron Lun. Last updated 2 months ago.
41.7 match 5.26 score 6 scripts 1 dependents
r-forge
copula:Multivariate Dependence with Copulas
Classes (S4) of commonly used elliptical, Archimedean, extreme-value and other copula families, as well as their rotations, mixtures and asymmetrizations. Nested Archimedean copulas, related tools and special functions. Methods for density, distribution, random number generation, bivariate dependence measures, Rosenblatt transform, Kendall distribution function, perspective and contour plots. Fitting of copula models with potentially partly fixed parameters, including standard errors. Serial independence tests, copula specification tests (independence, exchangeability, radial symmetry, extreme-value dependence, goodness-of-fit) and model selection based on cross-validation. Empirical copula, smoothed versions, and non-parametric estimators of the Pickands dependence function.
Maintained by Martin Maechler. Last updated 10 days ago.
18.0 match 11.83 score 1.2k scripts 86 dependents
poissonconsulting
extras:Helper Functions for Bayesian Analyses
Functions to 'numericise' 'R' objects (coerce to numeric objects), summarise 'MCMC' (Markov Chain Monte Carlo) samples and calculate deviance residuals as well as 'R' translations of some 'BUGS' (Bayesian inference Using Gibbs Sampling), 'JAGS' (Just Another Gibbs Sampler), 'STAN' and 'TMB' (Template Model Builder) functions.
Maintained by Nicole Hill. Last updated 2 months ago.
24.8 match 9 stars 8.49 score 15 scripts 16 dependents
rolkra
explore:Simplifies Exploratory Data Analysis
Interactive data exploration with one line of code, automated reporting, or an easy-to-remember set of tidy functions for low-code exploratory data analysis.
Maintained by Roland Krasser. Last updated 3 months ago.
data-exploration data-visualisation decision-trees eda rmarkdown shiny tidy
18.3 match 228 stars 11.43 score 221 scripts 1 dependents
mlverse
torchvision:Models, Datasets and Transformations for Images
Provides access to datasets, models and preprocessing facilities for deep learning with images. Integrates seamlessly with the 'torch' package, and its 'API' borrows heavily from the 'PyTorch' vision package.
Maintained by Daniel Falbel. Last updated 6 months ago.
21.0 match 65 stars 9.75 score 313 scripts 6 dependents
sollano
forestmangr:Forest Mensuration and Management
Processing forest inventory data with methods such as simple random sampling, stratified random sampling and systematic sampling. There are also functions for yield and growth predictions and model fitting, linear and nonlinear grouped data fitting, and statistical tests. References: Kershaw Jr., Ducey, Beers and Husch (2016). <doi:10.1002/9781118902028>.
Maintained by Sollano Rabelo Braga. Last updated 3 months ago.
25.4 match 17 stars 7.97 score 378 scripts
centerforassessment
randomNames:Generate Random Given and Surnames
Function for generating random gender- and ethnicity-correct first and/or last names. Names are chosen proportionally based upon their probability of appearing in a large-scale database of real names.
Maintained by Damian W. Betebenner. Last updated 3 months ago.
random-name-generators random-names
21.7 match 32 stars 9.24 score 297 scripts 5 dependents
munterfi
eRTG3D:Empirically Informed Random Trajectory Generation in 3-D
Creates realistic random trajectories in a 3-D space between two given fix points, so-called conditional empirical random walks (CERWs). The trajectory generation is based on empirical distribution functions extracted from observed trajectories (training data) and thus reflects the geometrical movement characteristics of the mover. A digital elevation model (DEM), representing the Earth's surface, and a background layer of probabilities (e.g. food sources, uplift potential, waterbodies, etc.) can be used to influence the trajectories. Unterfinger M (2018). "3-D Trajectory Simulation in Movement Ecology: Conditional Empirical Random Walk". Master's thesis, University of Zurich. <https://www.geo.uzh.ch/dam/jcr:6194e41e-055c-4635-9807-53c5a54a3be7/MasterThesis_Unterfinger_2018.pdf>. Technitis G, Weibel R, Kranstauber B, Safi K (2016). "An algorithm for empirically informed random trajectory generation between two endpoints". GIScience 2016: Ninth International Conference on Geographic Information Science, 9, online. <doi:10.5167/uzh-130652>.
Maintained by Merlin Unterfinger. Last updated 3 years ago.
3d birds conditional-empirical-random-walk gliding-and-soaring machine-learning movement-ecology random-trajectory-generator random-walk simulation trajectory-generation
34.0 match 6 stars 5.71 score 19 scripts
glmmtmb
glmmTMB:Generalized Linear Mixed Models using Template Model Builder
Fit linear and generalized linear mixed models with various extensions, including zero-inflation. The models are fitted using maximum likelihood estimation via 'TMB' (Template Model Builder). Random effects are assumed to be Gaussian on the scale of the linear predictor and are integrated out using the Laplace approximation. Gradients are calculated using automatic differentiation.
Maintained by Mollie Brooks. Last updated 10 days ago.
11.5 match 312 stars 16.77 score 3.7k scripts 24 dependents
braverock
PortfolioAnalytics:Portfolio Analysis, Including Numerical Methods for Optimization of Portfolios
Portfolio optimization and analysis routines and graphics.
Maintained by Brian G. Peterson. Last updated 3 months ago.
16.7 match 81 stars 11.49 score 626 scripts 2 dependents
coatless-rpkg
sitmo:Parallel Pseudo Random Number Generator (PPRNG) 'sitmo' Header Files
Provided within are two high-quality and fast PPRNGs that may be used in an 'OpenMP' parallel environment. In addition, there is a generator for one-dimensional low-discrepancy sequences. The objective of this library is to consolidate the distribution of the 'sitmo' (C++98 & C++11), 'threefry' and 'vandercorput' (C++11-only) engines on CRAN by enabling others to link to the header files inside of 'sitmo' instead of including a copy of each engine within their individual package. Lastly, the package contains example implementations using the 'sitmo' package and three accompanying vignettes that provide additional information.
Maintained by James Balamuta. Last updated 1 year ago.
parallel random-generation rcpp cpp openmp
19.1 match 7 stars 9.75 score 15 scripts 201 dependents
scumdogsteev
mlsjunkgen:Use the MLS Junk Generator Algorithm to Generate a Stream of Pseudo-Random Numbers
Generates a stream of pseudo-random numbers using the MLS Junk Generator algorithm. Functions exist to generate single pseudo-random numbers as well as a vector, data frame, or matrix of pseudo-random numbers.
Maintained by Steve Myles. Last updated 4 years ago.
mls-junk-generator mlsjunkgen random-generation random-number random-number-generator random-number-generators random-quote-machine rng rpackages
46.9 match 3.95 score 18 scripts
gadget-framework
gadget3:Globally-Applicable Area Disaggregated General Ecosystem Toolbox V3
A framework to assist creation of marine ecosystem models, generating either 'R' or 'C++' code which can then be optimised using the 'TMB' package and standard 'R' tools. Principally designed to reproduce gadget2 models in 'TMB', but can be extended beyond gadget2's capabilities. Kasper Kristensen, Anders Nielsen, Casper W. Berg, Hans Skaug, Bradley M. Bell (2016) <doi:10.18637/jss.v070.i05> "TMB: Automatic Differentiation and Laplace Approximation.". Begley, J., & Howell, D. (2004) <https://core.ac.uk/download/pdf/225936648.pdf> "An overview of Gadget, the globally applicable area-disaggregated general ecosystem toolbox. ICES.".
Maintained by Jamie Lentin. Last updated 29 days ago.
21.3 match 8 stars 8.69 score 170 scripts
lme4
lme4:Linear Mixed-Effects Models using 'Eigen' and S4
Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the 'Eigen' C++ library for numerical linear algebra and 'RcppEigen' "glue".
Maintained by Ben Bolker. Last updated 1 day ago.
8.9 match 647 stars 20.69 score 35k scripts 1.5k dependents
richardli
SUMMER:Small-Area-Estimation Unit/Area Models and Methods for Estimation in R
Provides methods for spatial and spatio-temporal smoothing of demographic and health indicators using survey data, with particular focus on estimating and projecting under-five mortality rates, described in Mercer et al. (2015) <doi:10.1214/15-AOAS872>, Li et al. (2019) <doi:10.1371/journal.pone.0210645>, Wu et al. (DHS Spatial Analysis Reports No. 21, 2021), and Li et al. (2023) <doi:10.48550/arXiv.2007.05117>.
Maintained by Zehang R Li. Last updated 2 months ago.
bayesian-inference small-area-estimation space-time
17.7 match 23 stars 10.28 score 134 scripts 2 dependents
insightsengineering
teal.data:Data Model for 'teal' Applications
Provides a 'teal_data' class as a unified data model for 'teal' applications focusing on reproducibility and relational data.
Maintained by Dawid Kaledkowski. Last updated 2 months ago.
18.3 match 11 stars 9.93 score 44 scripts 8 dependents
hemingnm
SESraster:Raster Randomization for Null Hypothesis Testing
Randomization of presence/absence species distribution raster data with or without including spatial structure for calculating standardized effect sizes and testing null hypothesis. The randomization algorithms are based on classical algorithms for matrices (Gotelli 2000, <doi:10.2307/177478>) implemented for raster data.
Maintained by Neander Marcel Heming. Last updated 5 months ago.
null-models randomization raster spatial spatial-analysis species-distribution-modelling
27.3 match 7 stars 6.61 score 32 scripts 2 dependents
spsanderson
RandomWalker:Generate Random Walks Compatible with the 'tidyverse'
Generates random walks of various types by providing a set of functions that are compatible with the 'tidyverse'. The functions provided in the package make it simple to create random walks with a variety of properties, such as how many simulations to run, how many steps to take, and the distribution of the random walk itself.
Maintained by Steven Sanderson. Last updated 1 month ago.
random-walk random-walks rpackages
29.3 match 5 stars 6.05 score 5 scripts 1 dependents
alexkowa
EnvStats:Package for Environmental Statistics, Including US EPA Guidance
Graphical and statistical analyses of environmental data, with focus on analyzing chemical concentrations and physical parameters, usually in the context of mandated environmental monitoring. Major environmental statistical methods found in the literature and regulatory guidance documents, with extensive help that explains what these methods do, how to use them, and where to find them in the literature. Numerous built-in data sets from regulatory guidance documents and environmental statistics literature. Includes scripts reproducing analyses presented in the book "EnvStats: An R Package for Environmental Statistics" (Millard, 2013, Springer, ISBN 978-1-4614-8455-4, <doi:10.1007/978-1-4614-8456-1>).
Maintained by Alexander Kowarik. Last updated 15 days ago.
13.7 match 26 stars 12.80 score 2.4k scripts 46 dependents
didiermurillof
FielDHub:A Shiny App for Design of Experiments in Life Sciences
A shiny design of experiments (DOE) app that aids in the creation of traditional, un-replicated, augmented and partially-replicated designs applied to agriculture, plant breeding, forestry, animal and biological sciences.
Maintained by Didier Murillo. Last updated 8 months ago.
agricultural breeding design doe experimental plantbreeding shiny
19.1 match 48 stars 9.10 score 70 scripts 1 dependents
cran
RRTCS:Randomized Response Techniques for Complex Surveys
Point and interval estimation of linear parameters with data obtained from complex surveys (including stratified and clustered samples) when randomization techniques are used. The randomized response technique was developed to obtain estimates that are more valid when studying sensitive topics. Estimators and variances for 14 randomized response methods for qualitative variables and 7 randomized response methods for quantitative variables are also implemented. In addition, some data sets from surveys with these randomization methods are included in the package.
Maintained by Beatriz Cobo Rodríguez. Last updated 4 years ago.
87.1 match 2.00 score
tmlange
optRF:Optimising Random Forest Stability by Determining the Optimal Number of Trees
Calculating the stability of random forest with certain numbers of trees. The non-linear relationship between stability and numbers of trees is described using a logistic regression model and used to estimate the optimal number of trees.
Maintained by Thomas Martin Lange. Last updated 1 month ago.
36.3 match 4.78 score
alanarnholt
BSDA:Basic Statistics and Data Analysis
Data sets for book "Basic Statistics and Data Analysis" by Larry J. Kitchens.
Maintained by Alan T. Arnholt. Last updated 2 years ago.
18.8 match 7 stars 9.11 score 1.3k scripts 6 dependents
pachadotdev
cpp11armadillo:An 'Armadillo' Interface
Provides function declarations and inline function definitions that facilitate communication between R and the 'Armadillo' 'C++' library for linear algebra and scientific computing. This implementation is detailed in Vargas Sepulveda and Schneider Malamud (2024) <doi:10.48550/arXiv.2408.11074>.
Maintained by Mauricio Vargas Sepulveda. Last updated 24 days ago.
armadillo cpp cpp11 hacktoberfest linear-algebra
18.5 match 9 stars 9.14 score 1 scripts 16 dependents
cran
mgcv:Mixed GAM Computation Vehicle with Automatic Smoothness Estimation
Generalized additive (mixed) models, some of their extensions and other generalized ridge regression with multiple smoothing parameter estimation by (Restricted) Marginal Likelihood, Generalized Cross Validation and similar, or using iterated nested Laplace approximation for fully Bayesian inference. See Wood (2017) <doi:10.1201/9781315370279> for an overview. Includes a gam() function, a wide variety of smoothers, 'JAGS' support and distributions beyond the exponential family.
Maintained by Simon Wood. Last updated 1 year ago.
13.0 match 32 stars 12.71 score 17k scripts 7.8k dependents
r-forge
mlogit:Multinomial Logit Models
Maximum Likelihood estimation of random utility discrete choice models, as described in Kenneth Train (2009) Discrete Choice Methods with Simulations <doi:10.1017/CBO9780511805271>.
Maintained by Yves Croissant. Last updated 5 years ago.
16.8 match 9.81 score 1.2k scripts 14 dependents
michaellli
evalITR:Evaluating Individualized Treatment Rules
Provides various statistical methods for evaluating Individualized Treatment Rules under randomized data. The provided metrics include Population Average Value (PAV), Population Average Prescription Effect (PAPE), Area Under Prescription Effect Curve (AUPEC). It also provides the tools to analyze Individualized Treatment Rules under budget constraints. Detailed reference in Imai and Li (2019) <arXiv:1905.05389>.
Maintained by Michael Lingzhi Li. Last updated 2 years ago.
24.0 match 14 stars 6.78 score 36 scripts
jinli22
spm:Spatial Predictive Modeling
Introduces some novel, accurate hybrid methods that combine geostatistical and machine learning methods for spatial predictive modelling. It contains two commonly used geostatistical methods, two machine learning methods, four hybrid methods and two averaging methods. For each method, two functions are provided: one assesses the predictive errors and accuracy of the method based on cross-validation, and the other generates spatial predictions using the method. For details please see: Li, J., Potter, A., Huang, Z., Daniell, J. J. and Heap, A. (2010) <https://www.ga.gov.au/metadata-gateway/metadata/record/gcat_71407>; Li, J., Heap, A. D., Potter, A., Huang, Z. and Daniell, J. (2011) <doi:10.1016/j.csr.2011.05.015>; Li, J., Heap, A. D., Potter, A. and Daniell, J. (2011) <doi:10.1016/j.envsoft.2011.07.004>; Li, J., Potter, A., Huang, Z. and Heap, A. (2012) <https://www.ga.gov.au/metadata-gateway/metadata/record/74030>.
Maintained by Jin Li. Last updated 3 years ago.
29.7 match 3 stars 5.46 score 107 scripts 3 dependents
reside-ic
ids:Generate Random Identifiers
Generate random or human readable and pronounceable identifiers.
Maintained by Rich FitzJohn. Last updated 3 years ago.
12.0 match 94 stars 13.27 score 175 scripts 165 dependents
imbs-hl
ranger:A Fast Implementation of Random Forests
A fast implementation of Random Forests, particularly suited for high dimensional data. Ensembles of classification, regression, survival and probability prediction trees are supported. Data from genome-wide association studies can be analyzed efficiently. In addition to data frames, datasets of class 'gwaa.data' (R package 'GenABEL') and 'dgCMatrix' (R package 'Matrix') can be directly analyzed.
Maintained by Marvin N. Wright. Last updated 4 months ago.
9.5 match 783 stars 16.22 score 9.2k scripts 189 dependents
usepa
spmodel:Spatial Statistical Modeling and Prediction
Fit, summarize, and predict for a variety of spatial statistical models applied to point-referenced and areal (lattice) data. Parameters are estimated using various methods. Additional modeling features include anisotropy, non-spatial random effects, partition factors, big data approaches, and more. Model-fit statistics are used to summarize, visualize, and compare models. Predictions at unobserved locations are readily obtainable. For additional details, see Dumelle et al. (2023) <doi:10.1371/journal.pone.0282524>.
Maintained by Michael Dumelle. Last updated 2 days ago.
19.9 match 15 stars 7.66 score 112 scripts 3 dependents
cran
nlme:Linear and Nonlinear Mixed Effects Models
Fit and compare Gaussian linear and nonlinear mixed-effects models.
Maintained by R Core Team. Last updated 2 months ago.
11.7 match 6 stars 13.00 score 13k scripts 8.7k dependents
bedapub
designit:Blocking and Randomization for Experimental Design
Intelligently assign samples to batches in order to reduce batch effects. Batch effects can have a significant impact on data analysis, especially when the assignment of samples to batches coincides with the contrast groups being studied. By defining a batch container and a scoring function that reflects the contrasts, this package allows users to assign samples in a way that minimizes the potential impact of batch effects on the comparison of interest. Among other functionality, we provide an implementation for OSAT score by Yan et al. (2012, <doi:10.1186/1471-2164-13-689>).
Maintained by Iakov I. Davydov. Last updated 4 months ago.
design-of-experimentsrandomization
20.5 match 8 stars 7.28 score 24 scripts
mrc-ide
dust:Iterate Multiple Realisations of Stochastic Models
An engine for simulation of stochastic models. Includes support for running stochastic models in parallel, either with shared or varying parameters. Simulations are run efficiently in compiled code and can be run with a fraction of simulated states returned to R, allowing control over memory usage. Support is provided for building a bootstrap particle filter for performing Sequential Monte Carlo (e.g., Gordon et al. 1993 <doi:10.1049/ip-f-2.1993.0015>). The core of the simulation engine is the 'xoshiro256**' algorithm (Blackman and Vigna <arXiv:1805.01407>), and the package is further described in FitzJohn et al. 2021 <doi:10.12688/wellcomeopenres.16466.2>.
Maintained by Rich FitzJohn. Last updated 5 months ago.
18.8 match 18 stars 7.84 score 60 scripts 3 dependents
simsem
semTools:Useful Tools for Structural Equation Modeling
Provides miscellaneous tools for structural equation modeling, many of which extend the 'lavaan' package. For example, latent interactions can be estimated using product indicators (Lin et al., 2010, <doi:10.1080/10705511.2010.488999>) and simple effects probed; analytical power analyses can be conducted (Jak et al., 2021, <doi:10.3758/s13428-020-01479-0>); and scale reliability can be estimated based on estimated factor-model parameters.
Maintained by Terrence D. Jorgensen. Last updated 2 days ago.
10.3 match 79 stars 13.74 score 1.1k scripts 31 dependents
briencj
dae:Functions Useful in the Design and ANOVA of Experiments
The content falls into the following groupings: (i) Data, (ii) Factor manipulation functions, (iii) Design functions, (iv) ANOVA functions, (v) Matrix functions, (vi) Projector and canonical efficiency functions, and (vii) Miscellaneous functions. A vignette called 'DesignNotes' describes how to use the design functions for randomizing and assessing designs. The ANOVA functions facilitate the extraction of information when the 'Error' function has been used in the call to 'aov'. The package 'dae' can also be installed from <http://chris.brien.name/rpackages/>.
Maintained by Chris Brien. Last updated 3 months ago.
16.4 match 1 star 8.62 score 356 scripts 7 dependents
ropensci
aorsf:Accelerated Oblique Random Forests
Fit, interpret, and compute predictions with oblique random forests. Includes support for partial dependence, variable importance, passing customized functions for variable importance and identification of linear combinations of features. Methods for the oblique random survival forest are described in Jaeger et al., (2023) <DOI:10.1080/10618600.2023.2231048>.
Maintained by Byron Jaeger. Last updated 2 days ago.
data-scienceobliquerandom-forestsurvivalopenblascppopenmp
15.3 match 58 stars 9.21 score 60 scripts 1 dependent
daijiang
phyr:Model Based Phylogenetic Analysis
A collection of functions to do model-based phylogenetic analysis. It includes functions to calculate community phylogenetic diversity, to estimate correlations among functional traits while accounting for phylogenetic relationships, and to fit phylogenetic generalized linear mixed models. The Bayesian phylogenetic generalized linear mixed models are fitted with the 'INLA' package (<https://www.r-inla.org>).
Maintained by Daijiang Li. Last updated 1 year ago.
bayesianglmminlaphylogenyspecies-distribution-modelingopenblascpp
16.1 match 31 stars 8.67 score 107 scripts 2 dependents
gaynorr
AlphaSimR:Breeding Program Simulations
The successor to the 'AlphaSim' software for breeding program simulation [Faux et al. (2016) <doi:10.3835/plantgenome2016.02.0013>]. Used for stochastic simulations of breeding programs to the level of DNA sequence for every individual. Contained is a wide range of functions for modeling common tasks in a breeding program, such as selection and crossing. These functions allow for constructing simulations of highly complex plant and animal breeding programs via scripting in the R software environment. Such simulations can be used to evaluate overall breeding program performance and conduct research into breeding program design, such as implementation of genomic selection. Included is the 'Markovian Coalescent Simulator' ('MaCS') for fast simulation of biallelic sequences according to a population demographic history [Chen et al. (2009) <doi:10.1101/gr.083634.108>].
Maintained by Chris Gaynor. Last updated 4 months ago.
breedinggenomicssimulationopenblascppopenmp
13.4 match 47 stars 10.22 score 534 scripts 2 dependents
blasbenito
spatialRF:Easy Spatial Modeling with Random Forest
Automatic generation and selection of spatial predictors for spatial regression with Random Forest. Spatial predictors are surrogates of variables driving the spatial structure of a response variable. The package offers two methods to generate spatial predictors from a distance matrix among training cases: 1) Moran's Eigenvector Maps (MEMs; Dray, Legendre, and Peres-Neto 2006 <DOI:10.1016/j.ecolmodel.2006.02.015>): computed as the eigenvectors of a weighted matrix of distances; 2) RFsp (Hengl et al. <DOI:10.7717/peerj.5518>): columns of the distance matrix used as spatial predictors. Spatial predictors help minimize the spatial autocorrelation of the model residuals and facilitate an honest assessment of the importance scores of the non-spatial predictors. Additionally, functions to reduce multicollinearity, identify relevant variable interactions, tune random forest hyperparameters, assess model transferability via spatial cross-validation, and explore model results via partial dependence curves and interaction surfaces are included in the package. The modelling functions are built around the highly efficient 'ranger' package (Wright and Ziegler 2017 <DOI:10.18637/jss.v077.i01>).
Maintained by Blas M. Benito. Last updated 3 years ago.
random-forestspatial-analysisspatial-regression
25.2 match 114 stars 5.45 score 49 scripts
rsbivand
splancs:Spatial and Space-Time Point Pattern Analysis
The Splancs package was written as an enhancement to S-Plus for display and analysis of spatial point pattern data; it has been ported to R and is in "maintenance mode".
Maintained by Roger Bivand. Last updated 10 months ago.
15.7 match 1 star 8.72 score 592 scripts 53 dependents
kenaho1
asbio:A Collection of Statistical Tools for Biologists
Contains functions from: Aho, K. (2014) Foundational and Applied Statistics for Biologists using R. CRC/Taylor and Francis, Boca Raton, FL, ISBN: 978-1-4398-7338-0.
Maintained by Ken Aho. Last updated 2 months ago.
19.1 match 5 stars 7.09 score 310 scripts 3 dependents
emmanuelparadis
ape:Analyses of Phylogenetics and Evolution
Functions for reading, writing, plotting, and manipulating phylogenetic trees, analyses of comparative data in a phylogenetic framework, ancestral character analyses, analyses of diversification and macroevolution, computing distances from DNA sequences, reading and writing nucleotide sequences as well as importing from BioConductor, and several tools such as Mantel's test, generalized skyline plots, graphical exploration of phylogenetic data (alex, trex, kronoviz), estimation of absolute evolutionary rates and clock-like trees using mean path lengths and penalized likelihood, dating trees with non-contemporaneous sequences, translating DNA into AA sequences, and assessing sequence alignments. Phylogeny estimation can be done with the NJ, BIONJ, ME, MVR, SDM, and triangle methods, and several methods handling incomplete distance matrices (NJ*, BIONJ*, MVR*, and the corresponding triangle method). Some functions call external applications (PhyML, Clustal, T-Coffee, Muscle) whose results are returned into R.
Maintained by Emmanuel Paradis. Last updated 1 month ago.
7.8 match 64 stars 17.18 score 13k scripts 601 dependents
kogalur
randomForestSRC:Fast Unified Random Forests for Survival, Regression, and Classification (RF-SRC)
Fast OpenMP parallel computing of Breiman's random forests for univariate, multivariate, unsupervised, survival, competing risks, class imbalanced classification and quantile regression. New Mahalanobis splitting for correlated outcomes. Extreme random forests and randomized splitting. Suite of imputation methods for missing data. Fast random forests using subsampling. Confidence regions and standard errors for variable importance. New improved holdout importance. Case-specific importance. Minimal depth variable importance. Visualize trees on your Safari or Google Chrome browser. Anonymous random forests for data privacy.
Maintained by Udaya B. Kogalur. Last updated 2 months ago.
16.6 match 10 stars 7.90 score 1.2k scripts 12 dependents
andyliaw-mrk
randomForest:Breiman and Cutler's Random Forests for Classification and Regression
Classification and regression based on a forest of trees using random inputs, based on Breiman (2001) <DOI:10.1023/A:1010933404324>.
Maintained by Andy Liaw. Last updated 6 months ago.
10.8 match 47 stars 12.11 score 35k scripts 282 dependents
tilltnet
egor:Import and Analyse Ego-Centered Network Data
Tools for importing, analyzing and visualizing ego-centered network data. Supports several data formats, including the export formats of 'EgoNet', 'EgoWeb 2.0' and 'openeddi'. An interactive (shiny) app for the intuitive visualization of ego-centered networks is provided. Also included are procedures for creating and visualizing Clustered Graphs (Lerner 2008 <DOI:10.1109/PACIFICVIS.2008.4475458>).
Maintained by Till Krenz. Last updated 11 days ago.
ego-centeredegonetegornetwork-analysissna
15.1 match 24 stars 8.64 score 76 scripts 2 dependents
jlmelville
rnndescent:Nearest Neighbor Descent Method for Approximate Nearest Neighbors
The Nearest Neighbor Descent method for finding approximate nearest neighbors by Dong and co-workers (2010) <doi:10.1145/1963405.1963487>. Based on the 'Python' package 'PyNNDescent' <https://github.com/lmcinnes/pynndescent>.
Maintained by James Melville. Last updated 8 months ago.
approximate-nearest-neighbor-searchcpp
17.7 match 11 stars 7.31 score 75 scripts
ewenharrison
finalfit:Quickly Create Elegant Regression Results Tables and Plots when Modelling
Generate regression results tables and plots in final format for publication. Explore models and export directly to PDF and 'Word' using 'RMarkdown'.
Maintained by Ewen Harrison. Last updated 6 months ago.
11.3 match 270 stars 11.43 score 1.0k scripts
lebebr01
simglm:Simulate Models Based on the Generalized Linear Model
Simulates regression models, including both simple regression and generalized linear mixed models with up to three levels of nesting. Flexible power simulations that allow the specification of missing data, unbalanced designs, and different random error distributions are built into the package.
Maintained by Brandon LeBeau. Last updated 10 months ago.
16.4 match 43 stars 7.87 score 87 scripts
egenn
rtemis:Machine Learning and Visualization
Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.
Maintained by E.D. Gennatas. Last updated 1 month ago.
data-sciencedata-visualizationmachine-learningmachine-learning-libraryvisualization
18.2 match 145 stars 7.09 score 50 scripts 2 dependents
ropensci
spatsoc:Group Animal Relocation Data by Spatial and Temporal Relationship
Detects spatial and temporal groups in GPS relocations (Robitaille et al. (2019) <doi:10.1111/2041-210X.13215>). It can be used to convert GPS relocations to gambit-of-the-group format to build proximity-based social networks. In addition, the randomizations function provides data-stream randomization methods suitable for GPS data.
Maintained by Alec L. Robitaille. Last updated 1 month ago.
12.9 match 24 stars 9.97 score 145 scripts 3 dependents
cran
randomUniformForest:Random Uniform Forests for Classification, Regression and Unsupervised Learning
Ensemble model for classification, regression and unsupervised learning, based on a forest of unpruned and randomized binary decision trees. Each tree is grown by sampling, with replacement, a set of variables at each node. Each cut-point is generated randomly, according to the continuous Uniform distribution. For each tree, data are either bootstrapped or subsampled. The unsupervised mode introduces clustering, dimension reduction and variable importance, using a three-layer engine. Random Uniform Forests mainly aim to lower the correlation between trees (or tree residuals), to provide a deep analysis of variable importance and to allow native distributed and incremental learning.
Maintained by Saip Ciss. Last updated 3 years ago.
34.0 match 3 stars 3.77 score 99 scripts
r-forge
randtoolbox:Toolbox for Pseudo and Quasi Random Number Generation and Random Generator Tests
Provides (1) pseudo random generators - general linear congruential generators, multiple recursive generators and generalized feedback shift register (SF-Mersenne Twister algorithm (<doi:10.1007/978-3-540-74496-2_36>) and WELL (<doi:10.1145/1132973.1132974>) generators); (2) quasi random generators - the Torus algorithm, the Sobol sequence, the Halton sequence (including the Van der Corput sequence) and (3) some generator tests - the gap test, the serial test, the poker test, see, e.g., Gentle (2003) <doi:10.1007/b97336>. Take a look at the Distribution task view of types and tests of random number generators. The package can be provided without the 'rngWELL' dependency on demand. Package in Memoriam of Diethelm and Barbara Wuertz.
Maintained by Christophe Dutang. Last updated 3 months ago.
12.5 match 1 star 10.23 score 578 scripts 80 dependents
thinkr-open
shinipsum:Lorem-Ipsum-like Helpers for fast Shiny Prototyping
Prototype your shiny apps quickly with these Lorem-Ipsum-like Helpers.
Maintained by Colin Fay. Last updated 1 year ago.
dygraphggplotgolemversehacktoberfestlorem-ipsum
19.5 match 125 stars 6.45 score 50 scripts 1 dependent
mayer79
missRanger:Fast Imputation of Missing Values
Alternative implementation of the beautiful 'MissForest' algorithm used to impute mixed-type data sets by chaining random forests, introduced by Stekhoven, D.J. and Buehlmann, P. (2012) <doi:10.1093/bioinformatics/btr597>. Under the hood, it uses the lightning fast random forest package 'ranger'. Between the iterative model fitting steps, we offer the option of using predictive mean matching. First, this avoids imputation with values not already present in the original data (like a value 0.3334 in a 0-1 coded variable). Second, predictive mean matching tries to raise the variance in the resulting conditional distributions to a realistic level. This allows, e.g., multiple imputation by repeating the call to missRanger(). Out-of-sample application is supported as well.
Maintained by Michael Mayer. Last updated 3 months ago.
imputationmachine-learningmissing-valuesrandom-forest
11.3 match 69 stars 11.07 score 208 scripts 6 dependents
bioc
seqsetvis:Set Based Visualizations for Next-Gen Sequencing Data
seqsetvis enables the visualization and analysis of sets of genomic sites in next gen sequencing data. Although seqsetvis was designed for the comparison of multiple ChIP-seq samples, this package is domain-agnostic and allows the processing of multiple genomic coordinate files (bed-like files) and signal files (bigwig files, pileups from bam files). seqsetvis has multiple functions for fetching data from regions into a tidy format for analysis in data.table or tidyverse and visualization via ggplot2.
Maintained by Joseph R Boyd. Last updated 3 months ago.
softwarechipseqmultiplecomparisonsequencingvisualization
21.4 match 5.82 score 82 scripts
nicholasjclark
MRFcov:Markov Random Fields with Additional Covariates
Approximate node interaction parameters of Markov Random Fields graphical networks. Models can incorporate additional covariates, allowing users to estimate how interactions between nodes in the graph are predicted to change across covariate gradients. The general methods implemented in this package are described in Clark et al. (2018) <doi:10.1002/ecy.2221>.
Maintained by Nicholas J Clark. Last updated 12 months ago.
conditional-random-fieldsgraphical-modelsmachine-learningmarkov-random-fieldmultivariate-analysismultivariate-statisticsnetwork-analysisnetworks
20.5 match 24 stars 6.03 score 30 scripts
bayesiandemography
bage:Bayesian Estimation and Forecasting of Age-Specific Rates
Fast Bayesian estimation and forecasting of age-specific rates, probabilities, and means, based on 'Template Model Builder'.
Maintained by John Bryant. Last updated 2 months ago.
16.9 match 3 stars 7.30 score 39 scripts
gamlss-dev
gamlss:Generalized Additive Models for Location Scale and Shape
Functions for fitting the Generalized Additive Models for Location Scale and Shape introduced by Rigby and Stasinopoulos (2005), <doi:10.1111/j.1467-9876.2005.00510.x>. The models use a distributional regression approach where all the parameters of the conditional distribution of the response variable are modelled using explanatory variables.
Maintained by Mikis Stasinopoulos. Last updated 4 months ago.
11.0 match 16 stars 11.23 score 2.0k scripts 49 dependents
jakubnowicki
fixtuRes:Mock Data Generator
Generate mock data in R using YAML configuration.
Maintained by Jakub Nowicki. Last updated 3 years ago.
fixturesmock-datamock-data-generatortest-data-generatoryaml-configuration
24.7 match 16 stars 4.98 score 12 scripts
chrhennig
fpc:Flexible Procedures for Clustering
Various methods for clustering and cluster validation. Fixed point clustering. Linear regression clustering. Clustering by merging Gaussian mixture components. Symmetric and asymmetric discriminant projections for visualisation of the separation of groupings. Cluster validation statistics for distance based clustering including corrected Rand index. Standardisation of cluster validation statistics by random clusterings and comparison between many clustering methods and numbers of clusters based on this. Cluster-wise cluster stability assessment. Methods for estimation of the number of clusters: Calinski-Harabasz, Tibshirani and Walther's prediction strength, Fang and Wang's bootstrap stability. Gaussian/multinomial mixture fitting for mixed continuous/categorical variables. Variable-wise statistics for cluster interpretation. DBSCAN clustering. Interface functions for many clustering methods implemented in R, including estimating the number of clusters with kmeans, pam and clara. Modality diagnosis for Gaussian mixtures. For an overview see package?fpc.
Maintained by Christian Hennig. Last updated 6 months ago.
13.2 match 11 stars 9.25 score 2.6k scripts 70 dependents
bayesball
LearnBayes:Learning Bayesian Inference
Contains functions for summarizing basic one and two parameter posterior distributions and predictive distributions. It contains MCMC algorithms for summarizing posterior distributions defined by the user. It also contains functions for regression models, hierarchical models, Bayesian tests, and illustrations of Gibbs sampling.
Maintained by Jim Albert. Last updated 7 years ago.
10.7 match 38 stars 11.34 score 690 scripts 31 dependents
shangzhi-hong
RfEmpImp:Multiple Imputation using Chained Random Forests
An R package for multiple imputation using chained random forests. Implemented methods can handle missing data in mixed types of variables by using prediction-based or node-based conditional distributions constructed using random forests. For prediction-based imputation of continuous variables, two methods are provided: one based on the empirical distribution of out-of-bag prediction errors of random forests, and one based on a normality assumption for those prediction errors. For imputing categorical variables, a method based on predicted probabilities is provided. For node-based imputation, the method based on the conditional distribution formed by the predicting nodes of random forests, and the method based on proximity measures of random forests are provided. More details of the statistical methods can be found in Hong et al. (2020) <arXiv:2004.14823>.
Maintained by Shangzhi Hong. Last updated 2 years ago.
imputationmissing-datarandom-forest
27.3 match 5 stars 4.40 score 8 scripts
jeroen
openssl:Toolkit for Encryption, Signatures and Certificates Based on OpenSSL
Bindings to OpenSSL libssl and libcrypto, plus custom SSH key parsers. Supports RSA, DSA and EC curves P-256, P-384, P-521, and curve25519. Cryptographic signatures can either be created and verified manually or via x509 certificates. AES can be used in cbc, ctr or gcm mode for symmetric encryption; RSA for asymmetric (public key) encryption or EC for Diffie Hellman. High-level envelope functions combine RSA and AES for encrypting arbitrary sized data. Other utilities include key generators, hash functions (md5, sha1, sha256, etc), base64 encoder, a secure random number generator, and 'bignum' math methods for manually performing crypto calculations on large multibyte integers.
Maintained by Jeroen Ooms. Last updated 1 month ago.
6.6 match 65 stars 18.00 score 632 scripts 5.0k dependents
jeffreyevans
spatialEco:Spatial Analysis and Modelling Utilities
Utilities to support spatial data manipulation, query, sampling and modelling in ecological applications. Functions include models for species population density, spatial smoothing, multivariate separability, point process model for creating pseudo-absences and sub-sampling, quadrant-based sampling and analysis, auto-logistic modeling, sampling models, cluster optimization, statistical exploratory tools and raster-based metrics.
Maintained by Jeffrey S. Evans. Last updated 12 days ago.
biodiversityconservationecologyr-spatialrasterspatialvector
12.5 match 110 stars 9.55 score 736 scripts 2 dependents
jenniniku
gllvm:Generalized Linear Latent Variable Models
Analysis of multivariate data using generalized linear latent variable models (gllvm). Estimation is performed using either the Laplace method, variational approximations, or extended variational approximations, implemented via TMB (Kristensen et al. (2016), <doi:10.18637/jss.v070.i05>).
Maintained by Jenni Niku. Last updated 18 hours ago.
11.3 match 51 stars 10.52 score 176 scripts 1 dependent
mhahsler
arules:Mining Association Rules and Frequent Itemsets
Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.
Maintained by Michael Hahsler. Last updated 1 month ago.
arulesassociation-rulesfrequent-itemsets
8.5 match 194 stars 13.99 score 3.3k scripts 28 dependents
zdebruine
RcppML:Rcpp Machine Learning Library
Fast machine learning algorithms including matrix factorization and divisive clustering for large sparse and dense matrices.
Maintained by Zach DeBruine. Last updated 2 years ago.
clusteringmatrix-factorizationnmfrcpprcppeigensparse-matrixcppopenmp
11.2 match 104 stars 10.53 score 125 scripts 46 dependents
kosukeimai
experiment:R Package for Designing and Analyzing Randomized Experiments
Provides various statistical methods for designing and analyzing randomized experiments. One functionality of the package is the implementation of randomized-block and matched-pair designs based on possibly multivariate pre-treatment covariates. The package also provides the tools to analyze various randomized experiments including cluster randomized experiments, two-stage randomized experiments, randomized experiments with noncompliance, and randomized experiments with missing data.
Maintained by Kosuke Imai. Last updated 3 years ago.
22.3 match 14 stars 5.29 score 23 scripts
metinbulus
PowerUpR:Power Analysis Tools for Multilevel Randomized Experiments
Includes tools to calculate statistical power, minimum detectable effect size (MDES), MDES difference (MDESD), and minimum required sample size for various multilevel randomized experiments with continuous outcomes. Some of the functions can assist with planning multilevel randomized experiments sensitive enough to detect multilevel moderation (2-1-1, 2-1-2, 2-2-1, and 2-2-2 designs) and multilevel mediation (2-1-1, 2-2-1, 3-1-1, 3-2-1, and 3-3-1 designs). See 'PowerUp!' Excel series at <https://www.causalevaluation.org/>.
Maintained by Metin Bulus. Last updated 4 years ago.
25.0 match 2 stars 4.68 score 24 scripts
bnosac
crfsuite:Conditional Random Fields for Labelling Sequential Data in Natural Language Processing
Wraps the 'CRFsuite' library <https://github.com/chokkan/crfsuite> allowing users to fit a Conditional Random Field model and to apply it on existing data. The focus of the implementation is in the area of Natural Language Processing where this R package allows you to easily build and apply models for named entity recognition, text chunking, part of speech tagging, intent recognition or classification of any category you have in mind. Next to training, a small web application is included in the package to allow you to easily construct training data.
Maintained by Jan Wijffels. Last updated 1 year ago.
chunkingconditional-random-fieldscrfcrfsuitedata-scienceintent-classificationnatural-language-processingnernlpcpp
18.5 match 63 stars 6.34 score 35 scripts
davidbolin
rSPDE:Rational Approximations of Fractional Stochastic Partial Differential Equations
Functions that compute rational approximations of fractional elliptic stochastic partial differential equations. The package also contains functions for common statistical usage of these approximations. The main references for rSPDE are Bolin, Simas and Xiong (2023) <doi:10.1080/10618600.2023.2231051> for the covariance-based method and Bolin and Kirchner (2020) <doi:10.1080/10618600.2019.1665537> for the operator-based rational approximation. These can be generated by the citation function in R.
Maintained by David Bolin. Last updated 8 days ago.
15.3 match 11 stars 7.57 score 188 scripts 3 dependents
twolodzko
extraDistr:Additional Univariate and Multivariate Distributions
Density, distribution function, quantile function and random generation for a number of univariate and multivariate distributions. This package implements the following distributions: Bernoulli, beta-binomial, beta-negative binomial, beta prime, Bhattacharjee, Birnbaum-Saunders, bivariate normal, bivariate Poisson, categorical, Dirichlet, Dirichlet-multinomial, discrete gamma, discrete Laplace, discrete normal, discrete uniform, discrete Weibull, Frechet, gamma-Poisson, generalized extreme value, Gompertz, generalized Pareto, Gumbel, half-Cauchy, half-normal, half-t, Huber density, inverse chi-squared, inverse-gamma, Kumaraswamy, Laplace, location-scale t, logarithmic, Lomax, multivariate hypergeometric, multinomial, negative hypergeometric, non-standard beta, normal mixture, Poisson mixture, Pareto, power, reparametrized beta, Rayleigh, shifted Gompertz, Skellam, slash, triangular, truncated binomial, truncated normal, truncated Poisson, Tukey lambda, Wald, zero-inflated binomial, zero-inflated negative binomial, zero-inflated Poisson.
Maintained by Tymoteusz Wolodzko. Last updated 10 days ago.
c-plus-plusc-plus-plus-11distributionmultivariate-distributionsprobabilityrandom-generationrcppstatisticscpp
9.9 match 53 stars 11.60 score 1.5k scripts 107 dependents
bioc
sccomp:Tests differences in cell-type proportion for single-cell data, robust to outliers
A robust and outlier-aware method for testing differences in cell-type proportion in single-cell data. This model can infer changes in tissue composition and heterogeneity, and can produce realistic data simulations based on any existing dataset. This model can also transfer knowledge from a large set of integrated datasets to increase accuracy further.
Maintained by Stefano Mangiola. Last updated 15 days ago.
bayesianregressiondifferentialexpressionsinglecellbatch-correctioncompositioncytofdifferential-proportionmicrobiomemultilevelproportionsrandom-effectssingle-cellunwanted-variation
13.6 match 99 stars 8.41 score 69 scriptstylermorganwall
spacefillr:Space-Filling Random and Quasi-Random Sequences
Generates random and quasi-random space-filling sequences. Supports the following sequences: 'Halton', 'Sobol', 'Owen'-scrambled 'Sobol', 'Owen'-scrambled 'Sobol' with errors distributed as blue noise, progressive jittered, progressive multi-jittered ('PMJ'), 'PMJ' with blue noise, 'PMJ02', and 'PMJ02' with blue noise. Includes a 'C++' 'API'. Methods derived from "Constructing Sobol sequences with better two-dimensional projections" (2012) <doi:10.1137/070709359> S. Joe and F. Y. Kuo, "Progressive Multi-Jittered Sample Sequences" (2018) <https://graphics.pixar.com/library/ProgressiveMultiJitteredSampling/paper.pdf> Christensen, P., Kensler, A. and Kilpatrick, C., and "A Low-Discrepancy Sampler that Distributes Monte Carlo Errors as a Blue Noise in Screen Space" (2019) E. Heitz, B. Laurent, O. Victor, C. David and I. Jean-Claude, <doi:10.1145/3306307.3328191>.
Maintained by Tyler Morgan-Wall. Last updated 19 days ago.
halton-sequencequasi-random-generatorsobol-sequencecpp
16.0 match 7 stars 7.07 score 3 scripts 45 dependentspbs-assess
sdmTMB:Spatial and Spatiotemporal SPDE-Based GLMMs with 'TMB'
Implements spatial and spatiotemporal GLMMs (Generalized Linear Mixed Effect Models) using 'TMB', 'fmesher', and the SPDE (Stochastic Partial Differential Equation) Gaussian Markov random field approximation to Gaussian random fields. One common application is for spatially explicit species distribution models (SDMs). See Anderson et al. (2024) <doi:10.1101/2022.03.24.485545>.
Maintained by Sean C. Anderson. Last updated 8 hours ago.
ecologyglmmspatial-analysisspecies-distribution-modellingtmbcpp
10.4 match 203 stars 10.71 score 848 scripts 1 dependentsjmsigner
amt:Animal Movement Tools
Manage and analyze animal movement data. The functionality of 'amt' includes methods to calculate home ranges, track statistics (e.g. step lengths, speed, or turning angles), prepare data for fitting habitat selection analyses, and simulation of space-use from fitted step-selection functions.
Maintained by Johannes Signer. Last updated 4 months ago.
10.5 match 41 stars 10.54 score 418 scriptstychelab
CoSMoS:Complete Stochastic Modelling Solution
Makes univariate, multivariate, or random fields simulations precise and simple. Just select the desired time series or random fields’ properties and it will do the rest. CoSMoS is based on the framework described in Papalexiou (2018, <doi:10.1016/j.advwatres.2018.02.013>), extended for random fields in Papalexiou and Serinaldi (2020, <doi:10.1029/2019WR026331>), and further advanced in Papalexiou et al. (2021, <doi:10.1029/2020WR029466>) to allow fine-scale space-time simulation of storms (or even cyclone-mimicking fields).
Maintained by Kevin Shook. Last updated 4 years ago.
15.5 match 11 stars 7.10 score 77 scriptsmiraisolutions
rTRNG:Advanced and Parallel Random Number Generation via 'TRNG'
Embeds sources and headers from Tina's Random Number Generator ('TRNG') C++ library. Exposes some functionality for easier access, testing and benchmarking into R. Provides examples of how to use parallel RNG with 'RcppParallel'. The methods and techniques behind 'TRNG' are illustrated in the package vignettes and examples. Full documentation is available in Bauke (2021) <https://github.com/rabauke/trng4/blob/v4.23.1/doc/trng.pdf>.
Maintained by Riccardo Porreca. Last updated 1 year ago.
19.6 match 19 stars 5.63 score 15 scriptshesim-dev
hesim:Health Economic Simulation Modeling and Decision Analysis
A modular and computationally efficient R package for parameterizing, simulating, and analyzing health economic simulation models. The package supports cohort discrete time state transition models (Briggs et al. 1998) <doi:10.2165/00019053-199813040-00003>, N-state partitioned survival models (Glasziou et al. 1990) <doi:10.1002/sim.4780091106>, and individual-level continuous time state transition models (Siebert et al. 2012) <doi:10.1016/j.jval.2012.06.014>, encompassing both Markov (time-homogeneous and time-inhomogeneous) and semi-Markov processes. Decision uncertainty from a cost-effectiveness analysis is quantified with standard graphical and tabular summaries of a probabilistic sensitivity analysis (Claxton et al. 2005, Barton et al. 2008) <doi:10.1002/hec.985>, <doi:10.1111/j.1524-4733.2008.00358.x>. Use of C++ and data.table make individual-patient simulation, probabilistic sensitivity analysis, and incorporation of patient heterogeneity fast.
Maintained by Devin Incerti. Last updated 6 months ago.
health-economic-evaluationmicrosimulationsimulation-modelingcpp
13.6 match 67 stars 8.12 score 41 scriptsspatstat
spatstat.geom:Geometrical Functionality of the 'spatstat' Family
Defines spatial data types and supports geometrical operations on them. Data types include point patterns, windows (domains), pixel images, line segment patterns, tessellations and hyperframes. Capabilities include creation and manipulation of data (using command line or graphical interaction), plotting, geometrical operations (rotation, shift, rescale, affine transformation), convex hull, discretisation and pixellation, Dirichlet tessellation, Delaunay triangulation, pairwise distances, nearest-neighbour distances, distance transform, morphological operations (erosion, dilation, closing, opening), quadrat counting, geometrical measurement, geometrical covariance, colour maps, calculus on spatial domains, Gaussian blur, level sets of images, transects of images, intersections between objects, minimum distance matching. (Excludes spatial data on a network, which are supported by the package 'spatstat.linnet'.)
Maintained by Adrian Baddeley. Last updated 4 hours ago.
classes-and-objectsdistance-calculationgeometrygeometry-processingimagesmensurationplottingpoint-patternsspatial-dataspatial-data-analysis
9.1 match 7 stars 12.11 score 241 scripts 227 dependentsskgrange
rmweather:Tools to Conduct Meteorological Normalisation and Counterfactual Modelling for Air Quality Data
An integrated set of tools to allow data users to conduct meteorological normalisation and counterfactual modelling for air quality data. The meteorological normalisation technique uses predictive random forest models to remove variation of pollutant concentrations so trends and interventions can be explored in a robust way. For examples, see Grange et al. (2018) <doi:10.5194/acp-18-6223-2018> and Grange and Carslaw (2019) <doi:10.1016/j.scitotenv.2018.10.344>. The random forest models can also be used for counterfactual or business as usual (BAU) modelling by using the models to predict, from the model's perspective, the future. For an example, see Grange et al. (2021) <doi:10.5194/acp-2020-1171>.
Maintained by Stuart K. Grange. Last updated 22 days ago.
17.5 match 49 stars 6.24 score 239 scriptspdhoff
rstiefel:Random Orthonormal Matrix Generation and Optimization on the Stiefel Manifold
Simulation of random orthonormal matrices from linear and quadratic exponential family distributions on the Stiefel manifold. The most general type of distribution covered is the matrix-variate Bingham-von Mises-Fisher distribution. Most of the simulation methods are presented in Hoff (2009) "Simulation of the Matrix Bingham-von Mises-Fisher Distribution, With Applications to Multivariate and Relational Data" <doi:10.1198/jcgs.2009.07177>. The package also includes functions for optimization on the Stiefel manifold based on algorithms described in Wen and Yin (2013) "A feasible method for optimization with orthogonality constraints" <doi:10.1007/s10107-012-0584-1>.
Maintained by Peter Hoff. Last updated 4 years ago.
16.4 match 3 stars 6.53 score 95 scripts 8 dependentsstatnet
latentnet:Latent Position and Cluster Models for Statistical Networks
Fit and simulate latent position and cluster models for statistical networks. See Krivitsky and Handcock (2008) <doi:10.18637/jss.v024.i05> and Krivitsky, Handcock, Raftery, and Hoff (2009) <doi:10.1016/j.socnet.2009.04.001>.
Maintained by Pavel N. Krivitsky. Last updated 4 days ago.
12.8 match 19 stars 8.36 score 191 scripts 4 dependentswilkelab
ggridges:Ridgeline Plots in 'ggplot2'
Ridgeline plots provide a convenient way of visualizing changes in distributions over time or space. This package enables the creation of such plots in 'ggplot2'.
Maintained by Claus O. Wilke. Last updated 3 months ago.
6.4 match 418 stars 16.71 score 14k scripts 285 dependentsbioc
BiocParallel:Bioconductor facilities for parallel evaluation
This package provides modified versions and novel implementation of functions for parallel evaluation, tailored to use with Bioconductor objects.
Maintained by Martin Morgan. Last updated 24 days ago.
infrastructurebioconductor-packagecore-packageu24ca289073cpp
6.1 match 67 stars 17.40 score 7.3k scripts 1.1k dependentsbbolker
broom.mixed:Tidying Methods for Mixed Models
Convert fitted objects from various R mixed-model packages into tidy data frames along the lines of the 'broom' package. The package provides three S3 generics for each model: tidy(), which summarizes a model's statistical findings such as coefficients of a regression; augment(), which adds columns to the original data such as predictions, residuals and cluster assignments; and glance(), which provides a one-row summary of model-level statistics.
Maintained by Ben Bolker. Last updated 3 months ago.
7.0 match 231 stars 15.22 score 4.0k scripts 37 dependentsjinghuazhao
gap:Genetic Analysis Package
As first reported [Zhao, J. H. 2007. "gap: Genetic Analysis Package". J Stat Soft 23(8):1-18. <doi:10.18637/jss.v023.i08>], it is designed as an integrated package for genetic data analysis of both population and family data. Currently, it contains functions for sample size calculations of both population-based and family-based designs, probability of familial disease aggregation, kinship calculation, statistics in linkage analysis, and association analysis involving genetic markers including haplotype analysis with or without environmental covariates. Over the years, the package has been developed in between many projects, which is also in line with the name (gap).
Maintained by Jing Hua Zhao. Last updated 15 days ago.
8.8 match 12 stars 11.88 score 448 scripts 16 dependentsmobiodiv
mobsim:Spatial Simulation and Scale-Dependent Analysis of Biodiversity Changes
Simulation, analysis and sampling of spatial biodiversity data (May, Gerstner, McGlinn, Xiao & Chase 2017) <doi:10.1111/2041-210x.12986>. In the simulation tools, users define the numbers of species and individuals, the species abundance distribution and species aggregation. Functions for analysis include species rarefaction and accumulation curves, species-area relationships and the distance decay of similarity.
Maintained by Felix May. Last updated 3 months ago.
biodiversitymacroecologypoint-pattern-analysisrarefactionsimulationspeciesspecies-abundance-distributionscpp
13.3 match 20 stars 7.84 score 76 scriptsrudeboybert
resampledata:Data Sets for Mathematical Statistics with Resampling in R
Package of data sets from "Mathematical Statistics with Resampling in R" (1st Ed. 2011, 2nd Ed. 2018) by Laura Chihara and Tim Hesterberg.
Maintained by Albert Y. Kim. Last updated 4 months ago.
20.2 match 15 stars 5.15 score 187 scriptsmlr-org
mlr3torch:Deep Learning with 'mlr3'
Deep Learning library that extends the mlr3 framework by building upon the 'torch' package. It allows users to conveniently build, train, and evaluate deep learning models without having to worry about low-level details. Custom architectures can be created using the graph language defined in 'mlr3pipelines'.
Maintained by Sebastian Fischer. Last updated 30 days ago.
data-sciencedeep-learningmachine-learningmlr3torch
13.6 match 42 stars 7.63 score 78 scriptscran
sna:Tools for Social Network Analysis
A range of tools for social network analysis, including node and graph-level indices, structural distance and covariance methods, structural equivalence detection, network regression, random graph generation, and 2D/3D network visualization.
Maintained by Carter T. Butts. Last updated 6 months ago.
15.2 match 8 stars 6.78 score 94 dependentsolink-proteomics
OlinkAnalyze:Facilitate Analysis of Proteomic Data from Olink
A collection of functions to facilitate analysis of proteomic data from Olink, primarily NPX data that has been exported from Olink Software. The functions also work on QUANT data from Olink by log-transforming the QUANT data. The functions are focused on reading data, facilitating data wrangling and quality control analysis, performing statistical analysis and generating figures to visualize the results of the statistical analysis. The goal of this package is to help users extract biological insights from proteomic data run on the Olink platform.
Maintained by Kathleen Nevola. Last updated 19 days ago.
olinkproteomicsproteomics-data-analysis
10.6 match 104 stars 9.72 score 61 scriptsehrlinger
ggRandomForests:Visually Exploring Random Forests
Graphic elements for exploring Random Forests using the 'randomForest' or 'randomForestSRC' package for survival, regression and classification forests and 'ggplot2' package plotting.
Maintained by John Ehrlinger. Last updated 4 days ago.
11.5 match 148 stars 8.94 score 197 scriptsmelff
mclogit:Multinomial Logit Models, with or without Random Effects or Overdispersion
Provides estimators for multinomial logit models in their conditional logit and baseline logit variants, with or without random effects, with or without overdispersion. Random effects models are estimated using the PQL technique (based on a Laplace approximation) or the MQL technique (based on a Solomon-Cox approximation). Estimates should be treated with caution if the group sizes are small.
Maintained by Martin Elff. Last updated 3 months ago.
9.3 match 23 stars 11.03 score 262 scripts 4 dependentshwborchers
pracma:Practical Numerical Math Functions
Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses 'MATLAB' function names where appropriate to simplify porting.
Maintained by Hans W. Borchers. Last updated 1 year ago.
8.3 match 29 stars 12.34 score 6.6k scripts 931 dependentsdgbonett
statpsych:Statistical Methods for Psychologists
Implements confidence interval and sample size methods that are especially useful in psychological research. The methods can be applied in 1-group, 2-group, paired-samples, and multiple-group designs and to a variety of parameters including means, medians, proportions, slopes, standardized mean differences, standardized linear contrasts of means, plus several measures of correlation and association. Confidence interval and sample size functions are given for single parameters as well as differences, ratios, and linear contrasts of parameters. The sample size functions can be used to approximate the sample size needed to estimate a parameter or function of parameters with desired confidence interval precision or to perform a variety of hypothesis tests (directional two-sided, equivalence, superiority, noninferiority) with desired power. For details see: Statistical Methods for Psychologists, Volumes 1 – 4, <https://dgbonett.sites.ucsc.edu/>.
Maintained by Douglas G. Bonett. Last updated 3 months ago.
21.2 match 6 stars 4.83 score 15 scripts 1 dependentsamalan-constat
fitODBOD:Modeling Over Dispersed Binomial Outcome Data Using BMD and ABD
Contains Probability Mass Functions, Cumulative Mass Functions, Negative Log Likelihood value, parameter estimation and modeling data using Binomial Mixture Distributions (BMD) (Manoj et al. (2013) <doi:10.5539/ijsp.v2n2p24>) and Alternate Binomial Distributions (ABD) (Paul (1985) <doi:10.1080/03610928508828990>). See also the journal article on using the package (<doi:10.21105/joss.01505>).
Maintained by Amalan Mahendran. Last updated 4 months ago.
binomial-outcome-dataoverdispersion
22.7 match 1 stars 4.44 score 139 scriptsmiriamesteve
eat:Efficiency Analysis Trees
Functions are provided to determine production frontiers and technical efficiency measures through non-parametric techniques based upon regression trees. The package includes code for estimating radial input, output, directional and additive measures, plotting graphical representations of the scores and the production frontiers by means of trees, and determining rankings of importance of input variables in the analysis. Additionally, an adaptation of Random Forest by a set of individual Efficiency Analysis Trees for estimating technical efficiency is also included. More details in: <doi:10.1016/j.eswa.2020.113783>.
Maintained by Miriam Esteve. Last updated 3 years ago.
21.5 match 5 stars 4.68 score 19 scriptsbiodiverse
unmarked:Models for Data from Unmarked Animals
Fits hierarchical models of animal abundance and occurrence to data collected using survey methods such as point counts, site occupancy sampling, distance sampling, removal sampling, and double observer sampling. Parameters governing the state and observation processes can be modeled as functions of covariates. References: Kellner et al. (2023) <doi:10.1111/2041-210X.14123>, Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.
Maintained by Ken Kellner. Last updated 1 month ago.
7.7 match 4 stars 13.02 score 652 scripts 12 dependentsropensci
canaper:Categorical Analysis of Neo- And Paleo-Endemism
Provides functions to analyze the spatial distribution of biodiversity, in particular categorical analysis of neo- and paleo-endemism (CANAPE) as described in Mishler et al (2014) <doi:10.1038/ncomms5473>. 'canaper' conducts statistical tests to determine the types of endemism that occur in a study area while accounting for the evolutionary relationships of species.
Maintained by Joel H. Nitta. Last updated 2 years ago.
18.5 match 7 stars 5.38 score 23 scriptsstekhoven
missForest:Nonparametric Missing Value Imputation using Random Forest
The function 'missForest' in this package is used to impute missing values particularly in the case of mixed-type data. It uses a random forest trained on the observed values of a data matrix to predict the missing values. It can be used to impute continuous and/or categorical data including complex interactions and non-linear relations. It yields an out-of-bag (OOB) imputation error estimate without the need of a test set or elaborate cross-validation. It can be run in parallel to save computation time.
Maintained by Daniel J. Stekhoven. Last updated 1 year ago.
8.6 match 92 stars 11.53 score 1.1k scripts 32 dependentsklvoje
evoTS:Analyses of Evolutionary Time-Series
Facilitates univariate and multivariate analysis of evolutionary sequences of phenotypic change. The package extends the modeling framework available in the 'paleoTS' package. Please see <https://klvoje.github.io/evoTS/index.html> for information about the package and the implemented models.
Maintained by Kjetil Lysne Voje. Last updated 9 months ago.
23.1 match 1 stars 4.26 score 184 scriptscjerzak
fastrerandomize:Hardware-Accelerated Rerandomization for Improved Balance
Provides hardware-accelerated tools for performing rerandomization and randomization testing in experimental research. Using a 'JAX' backend, the package enables exact rerandomization inference even for large experiments with hundreds of billions of possible randomizations. Key functionalities include generating pools of acceptable rerandomizations based on covariate balance, conducting exact randomization tests, and performing pre-analysis evaluations to determine optimal rerandomization acceptance thresholds. The package supports various hardware acceleration frameworks including 'CPU', 'CUDA', and 'METAL', making it versatile across accelerated computing environments. This allows researchers to efficiently implement stringent rerandomization designs and conduct valid inference even with large sample sizes. The package is partly based on Jerzak and Goldstein (2023) <doi:10.48550/arXiv.2310.00861>.
Maintained by Connor Jerzak. Last updated 1 month ago.
balanceexperimental-designhardware-acceleration
17.4 match 8 stars 5.64 score 1 scriptsepiforecasts
epinowcast:Flexible Hierarchical Nowcasting
Tools to enable flexible and efficient hierarchical nowcasting of right-truncated epidemiological time-series using a semi-mechanistic Bayesian model with support for a range of reporting and generative processes. Nowcasting, in this context, is gaining situational awareness using currently available observations and the reporting patterns of historical observations. This can be useful when tracking the spread of infectious disease in real-time: without nowcasting, changes in trends can be obfuscated by partial reporting or their detection may be delayed due to the use of simpler methods like truncation. While the package has been designed with epidemiological applications in mind, it could be applied to any set of right-truncated time-series count data.
Maintained by Sam Abbott. Last updated 11 months ago.
cmdstanreffective-reproduction-number-estimationepidemiologyinfectious-disease-surveillancenowcastingoutbreak-analysispandemic-preparednessreal-time-infectious-disease-modellingstan
12.4 match 61 stars 7.88 score 65 scriptsmelodyaowen
crt2power:Designing Cluster-Randomized Trials with Two Continuous Co-Primary Outcomes
Provides methods for powering cluster-randomized trials with two continuous co-primary outcomes using five key design techniques. Includes functions for calculating required sample size and statistical power. For more details on methodology, see Owen et al. (2025) <doi:10.1002/sim.70015>, Yang et al. (2022) <doi:10.1111/biom.13692>, Pocock et al. (1987) <doi:10.2307/2531989>, Vickerstaff et al. (2019) <doi:10.1186/s12874-019-0754-4>, and Li et al. (2020) <doi:10.1111/biom.13212>.
Maintained by Melody Owen. Last updated 1 day ago.
27.2 match 3.60 score 2 scriptseikeluedeling
decisionSupport:Quantitative Support of Decision Making under Uncertainty
Supporting the quantitative analysis of binary welfare-based decision making processes using Monte Carlo simulations. Decision support is given on two levels: (i) The actual decision level is to choose between two alternatives under probabilistic uncertainty. This package calculates the optimal decision based on maximizing expected welfare. (ii) The meta decision level is to allocate resources to reduce the uncertainty in the underlying decision problem, i.e., to increase the current information to improve the actual decision making process. This problem is dealt with using the Value of Information Analysis. The Expected Value of Information for arbitrary prospective estimates can be calculated as well as Individual Expected Value of Perfect Information. The probabilistic calculations are done via Monte Carlo simulations. This Monte Carlo functionality can be used on its own.
Maintained by Eike Luedeling. Last updated 11 months ago.
18.9 match 6 stars 5.17 score 123 scriptsf-rousset
spaMM:Mixed-Effect Models, with or without Spatial Random Effects
Inference based on models with or without spatially-correlated random effects, multivariate responses, or non-Gaussian random effects (e.g., Beta). Variation in residual variance (heteroscedasticity) can itself be represented by a mixed-effect model. Both classical geostatistical models (Rousset and Ferdy 2014 <doi:10.1111/ecog.00566>), and Markov random field models on irregular grids (as considered in the 'INLA' package, <https://www.r-inla.org>), can be fitted, with distinct computational procedures exploiting the sparse matrix representations for the latter case and other autoregressive models. Laplace approximations are used for likelihood or restricted likelihood. Penalized quasi-likelihood and other variants discussed in the h-likelihood literature (Lee and Nelder 2001 <doi:10.1093/biomet/88.4.987>) are also implemented.
Maintained by François Rousset. Last updated 9 months ago.
19.7 match 4.94 score 208 scripts 5 dependentsepinowcast
epinowcast:Flexible Hierarchical Nowcasting
Tools to enable flexible and efficient hierarchical nowcasting of right-truncated epidemiological time-series using a semi-mechanistic Bayesian model with support for a range of reporting and generative processes. Nowcasting, in this context, is gaining situational awareness using currently available observations and the reporting patterns of historical observations. This can be useful when tracking the spread of infectious disease in real-time: without nowcasting, changes in trends can be obfuscated by partial reporting or their detection may be delayed due to the use of simpler methods like truncation. While the package has been designed with epidemiological applications in mind, it could be applied to any set of right-truncated time-series count data.
Maintained by Sam Abbott. Last updated 11 months ago.
cmdstanreffective-reproduction-number-estimationepidemiologyinfectious-disease-surveillancenowcastingoutbreak-analysispandemic-preparednessreal-time-infectious-disease-modellingstan
12.4 match 61 stars 7.79 score 71 scriptscrj32
MLeval:Machine Learning Model Evaluation
Straightforward and detailed evaluation of machine learning models. 'MLeval' can produce receiver operating characteristic (ROC) curves, precision-recall (PR) curves, calibration curves, and PR gain curves. 'MLeval' accepts a data frame of class probabilities and ground truth labels, or, it can automatically interpret the Caret train function results from repeated cross validation, then select the best model and analyse the results. 'MLeval' produces a range of evaluation metrics with confidence intervals.
Maintained by Christopher R John. Last updated 5 years ago.
16.9 match 6 stars 5.71 score 144 scriptsr-forge
RandVar:Implementation of Random Variables
Implements random variables by means of S4 classes and methods.
Maintained by Matthias Kohl. Last updated 2 months ago.
16.0 match 6.03 score 43 scripts 7 dependentsflr
FLasher:Projection and Forecasting of Fish Populations, Stocks and Fleets
Projection of future population and fishery dynamics is carried out for a given set of management targets. A system of equations is solved, using Automatic Differentiation (AD), for the levels of effort by fishery (fleet) that will result in the required abundances, catches or fishing mortalities.
Maintained by Iago Mosqueira. Last updated 8 days ago.
14.0 match 2 stars 6.86 score 254 scripts 6 dependentskearutherford
BerkeleyForestsAnalytics:Compute and Summarize Core Forest Metrics from Field Data
A suite of open-source R functions designed to produce standard metrics for forest management and ecology from forest inventory data. The overarching goal is to minimize potential inconsistencies introduced by the algorithms used to compute and summarize core forest metrics. Learn more about the purpose of the package and the specific algorithms used in the package at <https://github.com/kearutherford/BerkeleyForestsAnalytics>.
Maintained by Kea Rutherford. Last updated 2 months ago.
17.4 match 7 stars 5.50 score 4 scriptsclaudioagostinelli
CircStats:Circular Statistics, from "Topics in Circular Statistics" (2001)
Circular Statistics, from "Topics in Circular Statistics" (2001) S. Rao Jammalamadaka and A. SenGupta, World Scientific.
Maintained by Claudio Agostinelli. Last updated 7 years ago.
14.5 match 2 stars 6.60 score 261 scripts 36 dependentspaulhendricks
generator:Generate Data Containing Fake Personally Identifiable Information
Allows users to quickly and easily generate fake data containing Personally Identifiable Information (PII) through convenience functions.
Maintained by Paul Hendricks. Last updated 8 years ago.
16.0 match 24 stars 5.99 score 81 scriptsadeverse
adespatial:Multivariate Multiscale Spatial Analysis
Tools for the multiscale spatial analysis of multivariate data. Several methods are based on the use of a spatial weighting matrix and its eigenvector decomposition (Moran's Eigenvectors Maps, MEM). Several approaches are described in the review Dray et al. (2012) <doi:10.1890/11-1183.1>.
Maintained by Aurélie Siberchicot. Last updated 11 days ago.
8.6 match 36 stars 11.06 score 398 scripts 2 dependentsgavinsimpson
gratia:Graceful 'ggplot'-Based Graphics and Other Functions for GAMs Fitted Using 'mgcv'
Graceful 'ggplot'-based graphics and utility functions for working with generalized additive models (GAMs) fitted using the 'mgcv' package. Provides a reimplementation of the plot() method for GAMs that 'mgcv' provides, as well as 'tidyverse' compatible representations of estimated smooths.
Maintained by Gavin L. Simpson. Last updated 4 days ago.
distributional-regressiongamgammgeneralized-additive-mixed-modelsgeneralized-additive-modelsggplot2glmlmmgcvpenalized-splinerandom-effectssmoothingsplines
7.5 match 216 stars 12.68 score 1.6k scripts 1 dependentscran
PracTools:Designing and Weighting Survey Samples
Functions and datasets to support Valliant, Dever, and Kreuter (2018), <doi:10.1007/978-3-319-93632-1>, "Practical Tools for Designing and Weighting Survey Samples". Contains functions for sample size calculation for survey samples using stratified or clustered one-, two-, and three-stage sample designs, and single-stage audit sample designs. Functions are included that will group geographic units accounting for distances apart and measures of size. Other functions compute variance components for multistage designs and sample sizes in two-phase designs. A number of example data sets are included.
Maintained by Richard Valliant. Last updated 9 months ago.
21.2 match 1 stars 4.48 score 101 scripts 1 dependentsinbo
inlatools:Diagnostic Tools for INLA Models
Several functions which can be useful to choose sensible priors and diagnose the fitted model.
Maintained by Thierry Onkelinx. Last updated 5 months ago.
bayesian-statisticsgplv3inlamixed-modelsmodel-checkingmodel-validation
21.2 match 4 stars 4.41 score 43 scriptsgi0na
ghypernet:Fit and Simulate Generalised Hypergeometric Ensembles of Graphs
Provides functions for model fitting and selection of generalised hypergeometric ensembles of random graphs (gHypEG). To learn how to use it, check the vignettes for a quick tutorial. Please reference its use as Casiraghi, G., Nanumyan, V. (2019) <doi:10.5281/zenodo.2555300> together with the relevant references from those listed below. The package is based on the research developed at the Chair of Systems Design, ETH Zurich. Casiraghi, G., Nanumyan, V., Scholtes, I., Schweitzer, F. (2016) <arXiv:1607.02441>. Casiraghi, G., Nanumyan, V., Scholtes, I., Schweitzer, F. (2017) <doi:10.1007/978-3-319-67256-4_11>. Casiraghi, G. (2017) <arXiv:1702.02048>. Brandenberger, L., Casiraghi, G., Nanumyan, V., Schweitzer, F. (2019) <doi:10.1145/3341161.3342926>. Casiraghi, G. (2019) <doi:10.1007/s41109-019-0241-1>. Casiraghi, G., Nanumyan, V. (2021) <doi:10.1038/s41598-021-92519-y>. Casiraghi, G. (2021) <doi:10.1088/2632-072X/ac0493>.
Maintained by Giona Casiraghi. Last updated 11 months ago.
data-miningdata-sciencegraphsnetworknetwork-analysisrandom-graph-generationrandom-graphs
16.5 match 8 stars 5.68 score 20 scriptsamices
mice:Multivariate Imputation by Chained Equations
Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm as described in Van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>. Each variable has its own imputation model. Built-in imputation models are provided for continuous data (predictive mean matching, normal), binary data (logistic regression), unordered categorical data (polytomous logistic regression) and ordered categorical data (proportional odds). MICE can also impute continuous two-level data (normal model, pan, second-level variables). Passive imputation can be used to maintain consistency between variables. Various diagnostic plots are available to inspect the quality of the imputations.
Maintained by Stef van Buuren. Last updated 5 days ago.
chained-equations, fcs, imputation, mice, missing-data, missing-values, multiple-imputation, multivariate-data, cpp
5.6 match 462 stars 16.50 score 10k scripts 154 dependents
quanteda
quanteda:Quantitative Analysis of Textual Data
A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and n-grams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.
Maintained by Kenneth Benoit. Last updated 2 months ago.
corpus, natural-language-processing, quanteda, text-analytics, onetbb, cpp
5.5 match 851 stars 16.68 score 5.4k scripts 51 dependents
statnet
tergm:Fit, Simulate and Diagnose Models for Network Evolution Based on Exponential-Family Random Graph Models
An integrated set of extensions to the 'ergm' package to analyze and simulate network evolution based on exponential-family random graph models (ERGM). 'tergm' is a part of the 'statnet' suite of packages for network analysis. See Krivitsky and Handcock (2014) <doi:10.1111/rssb.12014> and Carnegie, Krivitsky, Hunter, and Goodreau (2015) <doi:10.1080/10618600.2014.903087>.
Maintained by Pavel N. Krivitsky. Last updated 4 months ago.
10.0 match 27 stars 9.29 score 78 scripts 3 dependents
bioc
randRotation:Random Rotation Methods for High Dimensional Data with Batch Structure
A collection of methods for performing random rotations on high-dimensional, normally distributed data (e.g. microarray or RNA-seq data) with batch structure. The random rotation approach allows exact testing of dependent test statistics with linear models following arbitrary batch effect correction methods.
Maintained by Peter Hettegger. Last updated 5 months ago.
software, sequencing, batcheffect, biomedicalinformatics, rnaseq, preprocessing, microarray, differentialexpression, geneexpression, genetics, micrornaarray, normalization, statisticalmethod
25.2 match 3.60 score 3 scripts
christianroever
bayesmeta:Bayesian Random-Effects Meta-Analysis and Meta-Regression
A collection of functions for deriving the posterior distribution of the model parameters in random-effects meta-analysis or meta-regression, and for evaluating joint and marginal posterior probability distributions, predictive distributions, shrinkage effects, posterior predictive p-values, etc. For more details, see also Roever C (2020) <doi:10.18637/jss.v093.i06>, or Roever C and Friede T (2022) <doi:10.1016/j.cmpb.2022.107303>.
Maintained by Christian Roever. Last updated 1 year ago.
16.8 match 3 stars 5.40 score 73 scripts 1 dependents
farrellday
miceRanger:Multiple Imputation by Chained Equations with Random Forests
Multiple imputation has been shown by Van Buuren (2007) <doi:10.1177/0962280206074463> to be a flexible method for imputing missing values. Expanding on this, random forests have been shown by Stekhoven and Buhlmann <arXiv:1105.0828> to be an accurate model for imputing missing values in datasets. They have the added benefits of returning out-of-bag error and variable importance estimates, as well as being simple to run in parallel.
Maintained by Sam Wilson. Last updated 3 years ago.
imputation-methods, machine-learning, mice, missing-data, missing-values, random-forests
12.7 match 67 stars 7.09 score 41 scripts 1 dependents
psirusteam
TeachingSampling:Selection of Samples and Parameter Estimation in Finite Population
Allows the user to draw probabilistic samples and make inferences from a finite population based on several sampling designs.
Maintained by Hugo Andres Gutierrez Rojas. Last updated 5 years ago.
15.4 match 4 stars 5.80 score 217 scripts 4 dependents
tomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 year ago.
10.8 match 3 stars 8.20 score 7.8k scripts 11 dependents
pennchopmicrobiomeprogram
ZIBR:A Zero-Inflated Beta Random Effect Model
A two-part zero-inflated Beta regression model with random effects (ZIBR) for testing the association between microbial abundance and clinical covariates for longitudinal microbiome data. Eric Z. Chen and Hongzhe Li (2016) <doi:10.1093/bioinformatics/btw308>.
Maintained by Charlie Bushman. Last updated 1 year ago.
15.1 match 30 stars 5.86 score 24 scripts
bioc
graph:graph: A package to handle graph data structures
A package that implements some simple graph handling capabilities.
Maintained by Bioconductor Package Maintainer. Last updated 9 days ago.
7.5 match 11.78 score 764 scripts 342 dependents
csafe-isu
handwriterRF:Handwriting Analysis with Random Forests
Perform forensic handwriting analysis of two scanned handwritten documents. This package implements the statistical method described by Madeline Johnson and Danica Ommen (2021) <doi:10.1002/sam.11566>. Similarity measures and a random forest produce a score-based likelihood ratio that quantifies the strength of the evidence in favor of the documents being written by the same writer or different writers.
Maintained by Stephanie Reinders. Last updated 7 days ago.
14.1 match 2 stars 6.18 score 15 scripts 1 dependents
rcppcore
RcppArmadillo:'Rcpp' Integration for the 'Armadillo' Templated Linear Algebra Library
'Armadillo' is a templated C++ linear algebra library (by Conrad Sanderson) that aims towards a good balance between speed and ease of use. Integer, floating point and complex numbers are supported, as well as a subset of trigonometric and statistics functions. Various matrix decompositions are provided through optional integration with LAPACK and ATLAS libraries. The 'RcppArmadillo' package includes the header files from the templated 'Armadillo' library. Thus users do not need to install 'Armadillo' itself in order to use 'RcppArmadillo'. From release 7.800.0 on, 'Armadillo' is licensed under Apache License 2; previous releases were licensed under MPL 2.0 from version 3.800.0 onwards and LGPL-3 prior to that; 'RcppArmadillo' (the 'Rcpp' bindings/bridge to Armadillo) is licensed under the GNU GPL version 2 or later, as is the rest of 'Rcpp'.
Maintained by Dirk Eddelbuettel. Last updated 4 days ago.
armadillo, c-plus-plus, rcpp, rcpparmadillo, openblas, cpp, openmp
4.6 match 197 stars 18.77 score 1.9k scripts 3.4k dependents
debruine
faux:Simulation for Factorial Designs
Create datasets with factorial structure through simulation by specifying variable parameters. Extended documentation at <https://debruine.github.io/faux/>. Described in DeBruine (2020) <doi:10.5281/zenodo.2669586>.
Maintained by Lisa DeBruine. Last updated 2 months ago.
9.2 match 98 stars 9.35 score 716 scripts 1 dependents
moviedo5
fda.usc:Functional Data Analysis and Utilities for Statistical Computing
Routines for exploratory and descriptive analysis of functional data such as depth measurements, atypical curves detection, regression models, supervised classification, unsupervised classification and functional analysis of variance.
Maintained by Manuel Oviedo de la Fuente. Last updated 4 months ago.
functional-data-analysis, fortran
8.8 match 12 stars 9.72 score 560 scripts 22 dependents
mrcieu
TwoSampleMR:Two Sample MR Functions and Interface to MRC Integrative Epidemiology Unit OpenGWAS Database
A package for performing Mendelian randomization using GWAS summary data. It uses the IEU OpenGWAS database <https://gwas.mrcieu.ac.uk/> to automatically obtain data, and a wide range of methods to run the analysis.
Maintained by Gibran Hemani. Last updated 9 days ago.
7.6 match 467 stars 11.23 score 1.7k scripts 1 dependents
mrc-ide
monty:Monte Carlo Models
Experimental sources for the next generation of mcstate, now called 'monty', which will support much of the old mcstate functionality as well as new things like better parameter interfaces, Hamiltonian Monte Carlo, and other features.
Maintained by Rich FitzJohn. Last updated 1 month ago.
11.2 match 3 stars 7.52 score 29 scripts 3 dependents
techtonique
nnetsauce:Randomized and Quasi-Randomized networks for Statistical/Machine Learning
Randomized and Quasi-Randomized networks for Statistical/Machine Learning
Maintained by T. Moudiki. Last updated 7 months ago.
deep-learning, machine-learning, neural-networks, randomized-algorithms, statistical-learning
32.3 match 2 stars 2.60 score 6 scripts
paulnorthrop
revdbayes:Ratio-of-Uniforms Sampling for Bayesian Extreme Value Analysis
Provides functions for the Bayesian analysis of extreme value models. The 'rust' package <https://cran.r-project.org/package=rust> is used to simulate a random sample from the required posterior distribution. The functionality of 'revdbayes' is similar to the 'evdbayes' package <https://cran.r-project.org/package=evdbayes>, which uses Markov Chain Monte Carlo ('MCMC') methods for posterior simulation. In addition, there are functions for making inferences about the extremal index, using the models for threshold inter-exceedance times of Suveges and Davison (2010) <doi:10.1214/09-AOAS292> and Holesovsky and Fusek (2020) <doi:10.1007/s10687-020-00374-3>. Also provided are d,p,q,r functions for the Generalised Extreme Value ('GEV') and Generalised Pareto ('GP') distributions that deal appropriately with cases where the shape parameter is very close to zero.
Maintained by Paul J. Northrop. Last updated 7 months ago.
analysis, bayesian, extreme, extreme-value-statistics, extremes, generalized-pareto-distribution, gev, inference, nhpp, point-process, posterior, predictive, rcpp, value, openblas, cpp
11.0 match 4 stars 7.62 score 58 scripts 4 dependents
mrcieu
OneSampleMR:One Sample Mendelian Randomization and Instrumental Variable Analyses
Useful functions for one-sample (individual level data) Mendelian randomization and instrumental variable analyses. The package includes implementations of the Sanderson and Windmeijer (2016) <doi:10.1016/j.jeconom.2015.06.004> conditional F-statistic, the multiplicative structural mean model of Hernán and Robins (2006) <doi:10.1097/01.ede.0000222409.00878.37>, and the two-stage predictor substitution and two-stage residual inclusion estimators explained by Terza et al. (2008) <doi:10.1016/j.jhealeco.2007.09.009>.
Maintained by Tom Palmer. Last updated 19 days ago.
instrumental-variable, instrumental-variables, mendelian-randomisation, mendelian-randomization, mendelianrandomisation, mendelianrandomization
12.6 match 19 stars 6.69 score 16 scripts
daijiang
megatrees:Subsets of randomly selected phylogenies from existing mega-phylogenies
An increasing number of mega-phylogenies are available nowadays, many of them sets of thousands of posterior distribution phylogenies. For ecological studies, we may need to randomly select many such posterior phylogenies to conduct analyses. This data package serves this purpose by providing a small number (100) of randomly selected posterior phylogenies (if available) so that we can readily use them for our downstream analyses without repeating the downloading and selecting processes.
Maintained by Daijiang Li. Last updated 2 months ago.
27.2 match 4 stars 3.08 score 2 scripts 1 dependents
cran
kendallRandomWalks:Simulate and Visualize Kendall Random Walks and Related Distributions
Kendall random walks are continuous-space Markov chains generated by the Kendall generalized convolution. This package provides tools for simulating these random walks and studying distributions related to them. For more information about Kendall random walks see Jasiulis-Gołdyn (2014) <arXiv:1412.0220>.
Maintained by Mateusz Staniak. Last updated 7 years ago.
25.7 match 3.26 score 18 scripts
bioc
regioneR:Association analysis of genomic regions based on permutation tests
regioneR offers a statistical framework based on customizable permutation tests to assess the association between genomic region sets and other genomic features.
Maintained by Bernat Gel. Last updated 5 months ago.
genetics, chipseq, dnaseq, methylseq, copynumbervariation
9.3 match 9.00 score 2.7k scripts 21 dependents
dsy109
mixtools:Tools for Analyzing Finite Mixture Models
Analyzes finite mixture models for various parametric and semiparametric settings. This includes mixtures of parametric distributions (normal, multivariate normal, multinomial, gamma), various Reliability Mixture Models (RMMs), mixtures-of-regressions settings (linear regression, logistic regression, Poisson regression, linear regression with changepoints, predictor-dependent mixing proportions, random effects regressions, hierarchical mixtures-of-experts), and tools for selecting the number of components (bootstrapping the likelihood ratio test statistic, mixturegrams, and model selection criteria). Bayesian estimation of mixtures-of-linear-regressions models is available as well as a novel data depth method for obtaining credible bands. This package is based upon work supported by the National Science Foundation under Grant No. SES-0518772 and the Chan Zuckerberg Initiative: Essential Open Source Software for Science (Grant No. 2020-255193).
Maintained by Derek Young. Last updated 9 months ago.
mixture-models, mixture-of-experts, semiparametric-regression
7.3 match 20 stars 11.34 score 1.4k scripts 56 dependents
adamlilith
fasterRaster:Faster Raster and Spatial Vector Processing Using 'GRASS GIS'
Processing of large-in-memory/large-on-disk rasters and spatial vectors using 'GRASS GIS' <https://grass.osgeo.org/>. Most functions in the 'terra' package are recreated. Processing of medium-sized and smaller spatial objects will nearly always be faster using 'terra' or 'sf', but for large-in-memory/large-on-disk objects, 'fasterRaster' may be faster. To use most of the functions, you must have the stand-alone version (not the 'OSGeo4W' installer version) of 'GRASS GIS' 8.0 or higher.
Maintained by Adam B. Smith. Last updated 17 days ago.
aspect, distance, fragmentation, fragmentation-indices, gis, grass, grass-gis, raster, raster-projection, rasterize, slope, topography, vectorization
10.7 match 58 stars 7.69 score 8 scripts
opengeos
whitebox:'WhiteboxTools' R Frontend
An R frontend for the 'WhiteboxTools' library, which is an advanced geospatial data analysis platform developed by Prof. John Lindsay at the University of Guelph's Geomorphometry and Hydrogeomatics Research Group. 'WhiteboxTools' can be used to perform common geographical information systems (GIS) analysis operations, such as cost-distance analysis, distance buffering, and raster reclassification. Remote sensing and image processing tasks include image enhancement (e.g. panchromatic sharpening, contrast adjustments), image mosaicing, numerous filtering operations, simple classification (k-means), and common image transformations. 'WhiteboxTools' also contains advanced tooling for spatial hydrological analysis (e.g. flow-accumulation, watershed delineation, stream network analysis, sink removal), terrain analysis (e.g. common terrain indices such as slope, curvatures, wetness index, hillshading; hypsometric analysis; multi-scale topographic position analysis), and LiDAR data processing. Suggested citation: Lindsay (2016) <doi:10.1016/j.cageo.2016.07.003>.
Maintained by Andrew Brown. Last updated 5 months ago.
geomorphometry, geoprocessing, geospatial, gis, hydrology, remote-sensing, rstudio
8.5 match 173 stars 9.65 score 203 scripts 2 dependents
sb452
MendelianRandomization:Mendelian Randomization Package
Encodes several methods for performing Mendelian randomization analyses with summarized data. Summarized data on genetic associations with the exposure and with the outcome can be obtained from large consortia. These data can be used for obtaining causal estimates using instrumental variable methods.
Maintained by Stephen Burgess. Last updated 2 years ago.
12.0 match 1 stars 6.83 score 940 scripts 1 dependents
sdctools
sdcMicro:Statistical Disclosure Control Methods for Anonymization of Data and Risk Estimation
Data from statistical agencies and other institutions are mostly confidential. This package, introduced in Templ, Kowarik and Meindl (2017) <doi:10.18637/jss.v067.i04>, can be used for the generation of anonymized (micro)data, i.e. for the creation of public- and scientific-use files. The theoretical basis for the methods implemented can be found in Templ (2017) <doi:10.1007/978-3-319-50272-4>. Various risk estimation and anonymization methods are included. Note that the package includes a graphical user interface, published in Meindl and Templ (2019) <doi:10.3390/a12090191>, that provides access to various methods of this package.
Maintained by Matthias Templ. Last updated 25 days ago.
8.3 match 83 stars 9.89 score 258 scripts
mlr-org
mlr3fselect:Feature Selection for 'mlr3'
Feature selection package of the 'mlr3' ecosystem. It selects the optimal feature set for any 'mlr3' learner. The package works with several optimization algorithms e.g. Random Search, Recursive Feature Elimination, and Genetic Search. Moreover, it can automatically optimize learners and estimate the performance of optimized feature sets with nested resampling.
Maintained by Marc Becker. Last updated 2 months ago.
evolutionary-algorithms, exhaustive-search, feature-selection, machine-learning, mlr3, optimization, random-search, recursive-feature-elimination, sequential-feature-selection
9.9 match 23 stars 8.25 score 70 scripts 2 dependents
nalzok
tree.interpreter:Random Forest Prediction Decomposition and Feature Importance Measure
An R re-implementation of the 'treeinterpreter' package on PyPI <https://pypi.org/project/treeinterpreter/>. Each prediction can be decomposed as 'prediction = bias + feature_1_contribution + ... + feature_n_contribution'. This decomposition is then used to calculate the Mean Decrease Impurity (MDI) and Mean Decrease Impurity using out-of-bag samples (MDI-oob) feature importance measures based on the work of Li et al. (2019) <arXiv:1906.10845>.
Maintained by Qingyao Sun. Last updated 5 years ago.
data-science, datascience, interpretability, machine-learning, random-forest, cpp
14.1 match 12 stars 5.79 score 6 scripts
mlr-org
mlr3mbo:Flexible Bayesian Optimization
A modern and flexible approach to Bayesian Optimization / Model Based Optimization building on the 'bbotk' package. 'mlr3mbo' is a toolbox providing both ready-to-use optimization algorithms as well as their fundamental building blocks allowing for straightforward implementation of custom algorithms. Single- and multi-objective optimization is supported as well as mixed continuous, categorical and conditional search spaces. Moreover, using 'mlr3mbo' for hyperparameter optimization of machine learning models within the 'mlr3' ecosystem is straightforward via 'mlr3tuning'. Examples of ready-to-use optimization algorithms include Efficient Global Optimization by Jones et al. (1998) <doi:10.1023/A:1008306431147>, ParEGO by Knowles (2006) <doi:10.1109/TEVC.2005.851274> and SMS-EGO by Ponweiser et al. (2008) <doi:10.1007/978-3-540-87700-4_78>.
Maintained by Lennart Schneider. Last updated 11 days ago.
automl, bayesian-optimization, bbotk, black-box-optimization, gaussian-process, hpo, hyperparameter, hyperparameter-optimization, hyperparameter-tuning, machine-learning, mlr3, model-based-optimization, optimization, optimizer, random-forest, tuning
9.5 match 25 stars 8.57 score 120 scripts 3 dependents
epimodel
EpiModel:Mathematical Modeling of Infectious Disease Dynamics
Tools for simulating mathematical models of infectious disease dynamics. Epidemic model classes include deterministic compartmental models, stochastic individual-contact models, and stochastic network models. Network models use the robust statistical methods of exponential-family random graph models (ERGMs) from the Statnet suite of software packages in R. Standard templates for epidemic modeling include SI, SIR, and SIS disease types. EpiModel features an API for extending these templates to address novel scientific research aims. Full methods for EpiModel are detailed in Jenness et al. (2018, <doi:10.18637/jss.v084.i08>).
Maintained by Samuel Jenness. Last updated 2 months ago.
agent-based-modeling, epidemics, epidemiology, infectious-diseases, network-graph, cpp
7.0 match 250 stars 11.57 score 315 scripts
sebastian-engelke
graphicalExtremes:Statistical Methodology for Graphical Extreme Value Models
Statistical methodology for sparse multivariate extreme value models. Methods are provided for exact simulation and statistical inference for multivariate Pareto distributions on graphical structures as described in the paper 'Graphical Models for Extremes' by Engelke and Hitz (2020) <doi:10.1111/rssb.12355>.
Maintained by Sebastian Engelke. Last updated 2 months ago.
10.9 match 16 stars 7.38 score 28 scripts 1 dependents
gzt
CholWishart:Cholesky Decomposition of the Wishart Distribution
Sampling from the Cholesky factorization of a Wishart random variable, sampling from the inverse Wishart distribution, sampling from the Cholesky factorization of an inverse Wishart random variable, sampling from the pseudo Wishart distribution, sampling from the generalized inverse Wishart distribution, computing densities for the Wishart and inverse Wishart distributions, and computing the multivariate gamma and digamma functions. Provides a header file so the C functions can be called directly from other programs.
Maintained by Geoffrey Thompson. Last updated 6 months ago.
cholesky-decomposition, cholesky-factorization, digamma-functions, gamma, multivariate, pseudo-wishart, wishart, wishart-distributions, openblas
11.4 match 7 stars 7.05 score 41 scripts 13 dependents
r-lidar
lidR:Airborne LiDAR Data Manipulation and Visualization for Forestry Applications
Airborne LiDAR (Light Detection and Ranging) interface for data manipulation and visualization. Read/write 'las' and 'laz' files, computation of metrics in area based approach, point filtering, artificial point reduction, classification from geographic data, normalization, individual tree segmentation and other manipulations.
Maintained by Jean-Romain Roussel. Last updated 1 month ago.
als, forestry, las, laz, lidar, point-cloud, remote-sensing, openblas, cpp, openmp
5.6 match 623 stars 14.47 score 844 scripts 8 dependents
liuyu-star
ODRF:Oblique Decision Random Forest for Classification and Regression
The oblique decision tree (ODT) uses linear combinations of predictors as partitioning variables in a decision tree. Oblique Decision Random Forest (ODRF) is an ensemble of multiple ODTs generated by feature bagging. Oblique Decision Boosting Tree (ODBT) applies feature bagging during the training process of ODT-based boosting trees to ensemble multiple boosting trees. All three methods can be used for classification and regression, and ODT and ODRF serve as supplements to the classical CART of Breiman (1984) <DOI:10.1201/9781315139470> and Random Forest of Breiman (2001) <DOI:10.1023/A:1010933404324> respectively.
Maintained by Yu Liu. Last updated 5 months ago.
15.6 match 7 stars 5.10 score 18 scripts
bayesiandemography
rvec:Vector Representing a Random Variable
Random vectors, called rvecs. An rvec holds multiple draws, but tries to behave like a standard R vector, including working well in data frames. Rvecs are useful for working with output from a simulation or a Bayesian analysis.
Maintained by John Bryant. Last updated 6 months ago.
14.5 match 2 stars 5.46 score 24 scripts 2 dependents