R-universe search: poisson

alexpghayes

distributions3:Probability Distributions as S3 Objects

Tools to create and manipulate probability distributions using S3. Generics pdf(), cdf(), quantile(), and random() provide replacements for base R's d/p/q/r style functions. Functions and arguments have been named carefully to minimize confusion for students in intro stats courses. The documentation for each distribution contains detailed mathematical notes.

Maintained by Alex Hayes. Last updated 6 months ago.

62.6 match 101 stars 11.31 score 118 scripts 7 dependents
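
To illustrate the four roles that the distributions3 entry describes (density, distribution function, quantile, random generation), here is a minimal Python sketch for the Poisson case. It is not the package's code: the function names `poisson_pdf`, `poisson_cdf`, `poisson_quantile` and `poisson_random` are hypothetical, chosen only to mirror base R's dpois(), ppois(), qpois() and rpois(), and the sketch assumes a moderate rate `lam > 0`.

```python
import math
import random


def poisson_pdf(k, lam):
    """P(X = k) for X ~ Poisson(lam); plays the role of R's dpois()."""
    if k < 0:
        return 0.0
    # Work on the log scale to avoid overflow in lam**k / k!
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def poisson_cdf(k, lam):
    """P(X <= k); plays the role of R's ppois()."""
    return sum(poisson_pdf(i, lam) for i in range(0, int(math.floor(k)) + 1))


def poisson_quantile(p, lam):
    """Smallest k with P(X <= k) >= p; plays the role of R's qpois()."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    # Upper bound far in the tail, so rounding error cannot cause an endless loop.
    k_max = int(lam + 40.0 * math.sqrt(lam) + 50)
    k, total = 0, poisson_pdf(0, lam)
    while total < p and k < k_max:
        k += 1
        total += poisson_pdf(k, lam)
    return k


def poisson_random(n, lam, rng=random):
    """n draws by inverting the CDF; plays the role of R's rpois()."""
    return [poisson_quantile(rng.random(), lam) for _ in range(n)]
```

Drawing by inversion keeps the four functions consistent with one another, at the cost of speed for large rates.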

tyee001

VGAM:Vector Generalized Linear and Additive Models

An implementation of about 6 major classes of statistical regression models. The central algorithm is Fisher scoring and iteratively reweighted least squares. At the heart of this package are the vector generalized linear and additive model (VGLM/VGAM) classes. VGLMs can be loosely thought of as multivariate GLMs. VGAMs are data-driven VGLMs that use smoothing. The book "Vector Generalized Linear and Additive Models: With an Implementation in R" (Yee, 2015) <DOI:10.1007/978-1-4939-2818-7> gives details of the statistical framework and the package. Currently only fixed-effects models are implemented. Many (100+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE. The other classes are RR-VGLMs (reduced-rank VGLMs), quadratic RR-VGLMs, doubly constrained RR-VGLMs, reduced-rank VGAMs, and RCIMs (row-column interaction models); these classes perform constrained and unconstrained quadratic ordination (CQO/UQO) models in ecology, as well as constrained additive ordination (CAO). Hauck-Donner effect detection is implemented. Note that these functions are subject to change; see the NEWS and ChangeLog files for the latest changes.

Maintained by Thomas Yee. Last updated 1 month ago.

fortran

52.3 match 10 stars 10.67 score 3.6k scripts 169 dependents
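
The VGAM entry names Fisher scoring and iteratively reweighted least squares (IRLS) as the central algorithm. The sketch below shows that idea in its simplest univariate form, a Poisson regression with a log link and one covariate. It is an illustration only, not VGAM's implementation (which handles vector responses and many model classes); the function name `poisson_irls` and its arguments are hypothetical.

```python
import math


def poisson_irls(x, y, n_iter=25, tol=1e-10):
    """Fit log(mu_i) = b0 + b1 * x_i by IRLS.

    For the canonical log link, each IRLS step is one Fisher scoring step.
    Assumes sum(y) > 0 and that x takes at least two distinct values.
    """
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(n_iter):
        eta = [b0 + b1 * xi for xi in x]
        mu = [math.exp(e) for e in eta]
        # Working response z and weights w for the log link
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        w = mu
        # Weighted least squares of z on (1, x): solve the 2x2 normal equations
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx * swx
        new_b0 = (swxx * swz - swx * swxz) / det
        new_b1 = (sw * swxz - swx * swz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1
```

Each pass refits a weighted linear regression to a linearised response, which is why the same machinery extends to other links and families by changing only `z` and `w`.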

inlabru-org

inlabru:Bayesian Latent Gaussian Modelling using INLA and Extensions

Facilitates spatial and general latent Gaussian modeling using integrated nested Laplace approximation via the INLA package (<https://www.r-inla.org>). Additionally, extends the GAM-like model class to more general nonlinear predictor expressions, and implements a log Gaussian Cox process likelihood for modeling univariate and spatial point processes based on ecological survey data. Model components are specified with general inputs and mapping methods to the latent variables, and the predictors are specified via general R expressions, with separate expressions for each observation likelihood model in multi-likelihood models. A prediction method based on fast Monte Carlo sampling allows posterior prediction of general expressions of the latent variables. Ecology-focused introduction in Bachl, Lindgren, Borchers, and Illian (2019) <doi:10.1111/2041-210X.13168>.

Maintained by Finn Lindgren. Last updated 2 days ago.

35.6 match 96 stars 12.62 score 832 scripts 6 dependents

rfastofficial

Rfast:A Collection of Efficient and Extremely Fast R Functions

A collection of fast (utility) functions for data analysis. Column and row wise means, medians, variances, minimums, maximums, many t, F and G-square tests, many regressions (normal, logistic, Poisson), are some of the many fast functions. References: a) Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>. b) Tsagris M. and Papadakis M. (2018). Forward regression in R: from the extreme slow to the extreme fast. Journal of Data Science, 16(4): 771--780. <doi:10.6339/JDS.201810_16(4).00006>. c) Chatzipantsiou C., Dimitriadis M., Papadakis M. and Tsagris M. (2020). Extremely Efficient Permutation and Bootstrap Hypothesis Tests Using R. Journal of Modern Applied Statistical Methods, 18(2), eP2898. <doi:10.48550/arXiv.1806.10947>. d) Tsagris M., Papadakis M., Alenazi A. and Alzeley O. (2024). Computationally Efficient Outlier Detection for High-Dimensional Data Using the MDP Algorithm. Computation, 12(9): 185. <doi:10.3390/computation12090185>. e) Tsagris M. and Papadakis M. (2025). Fast and light-weight energy statistics using the R package Rfast. <doi:10.48550/arXiv.2501.02849>.

Maintained by Manos Papadakis. Last updated 16 days ago.

openblas cpp openmp

30.7 match 147 stars 12.54 score 1.2k scripts 166 dependents

spatstat

spatstat.model:Parametric Statistical Modelling and Inference for the 'spatstat' Family

Functionality for parametric statistical modelling and inference for spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Supports parametric modelling, formal statistical inference, and model validation. Parametric models include Poisson point processes, Cox point processes, Neyman-Scott cluster processes, Gibbs point processes and determinantal point processes. Models can be fitted to data using maximum likelihood, maximum pseudolikelihood, maximum composite likelihood and the method of minimum contrast. Fitted models can be simulated and predicted. Formal inference includes hypothesis tests (quadrat counting tests, Cressie-Read tests, Clark-Evans test, Berman test, Diggle-Cressie-Loosmore-Ford test, scan test, studentised permutation test, segregation test, ANOVA tests of fitted models, adjusted composite likelihood ratio test, envelope tests, Dao-Genton test, balanced independent two-stage test), confidence intervals for parameters, and prediction intervals for point counts. Model validation techniques include leverage, influence, partial residuals, added variable plots, diagnostic plots, pseudoscore residual plots, model compensators and Q-Q plots.

Maintained by Adrian Baddeley. Last updated 6 days ago.

analysis-of-variance cluster-process confidence-intervals cox-process determinantal-point-processes gibbs-process influence leverage model-diagnostics neyman-scott parameter-estimation poisson-process spatial-analysis spatial-modelling spatial-point-processes statistical-inference

36.7 match 5 stars 9.09 score 6 scripts 46 dependents

spatstat

spatstat.random:Random Generation Functionality for the 'spatstat' Family

Functionality for random generation of spatial data in the 'spatstat' family of packages. Generates random spatial patterns of points according to many simple rules (complete spatial randomness, Poisson, binomial, random grid, systematic, cell), randomised alteration of patterns (thinning, random shift, jittering), simulated realisations of random point processes including simple sequential inhibition, Matern inhibition models, Neyman-Scott cluster processes (using direct, Brix-Kendall, or hybrid algorithms), log-Gaussian Cox processes, product shot noise cluster processes and Gibbs point processes (using Metropolis-Hastings birth-death-shift algorithm, alternating Gibbs sampler, or coupling-from-the-past perfect simulation). Also generates random spatial patterns of line segments, random tessellations, and random images (random noise, random mosaics). Excludes random generation on a linear network, which is covered by the separate package 'spatstat.linnet'.

Maintained by Adrian Baddeley. Last updated 6 months ago.

point-processes random-generation simulation spatial-sampling spatial-simulation cpp

29.6 match 5 stars 10.77 score 84 scripts 173 dependents

poissonconsulting

extras:Helper Functions for Bayesian Analyses

Functions to 'numericise' 'R' objects (coerce to numeric objects), summarise 'MCMC' (Markov Chain Monte Carlo) samples and calculate deviance residuals as well as 'R' translations of some 'BUGS' (Bayesian Using Gibbs Sampling), 'JAGS' (Just Another Gibbs Sampler), 'STAN' and 'TMB' (Template Model Builder) functions.

Maintained by Nicole Hill. Last updated 2 months ago.

37.4 match 9 stars 8.49 score 15 scripts 16 dependents

coolbutuseless

poissoned:Poisson Disk Sampling in 2D and 3D

Poisson disk sampling is a method of generating blue noise sample patterns where all samples are at least a specified distance apart. Poisson samples may be generated in two or three dimensions with this package. The algorithm used is an implementation of Bridson (2007) "Fast Poisson disk sampling in arbitrary dimensions" <doi:10.1145/1278780.1278807>.

Maintained by Mike Cheng. Last updated 4 months ago.

57.5 match 13 stars 4.74 score 28 scripts
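
The poissoned entry cites Bridson's (2007) algorithm. The following is a minimal 2D Python sketch of that algorithm, written from the published description rather than from the package's source; the function name `poisson_disk_2d` and its parameters (`r` for the minimum distance, `k` for the number of candidates tried per active point) are assumptions made for this example.

```python
import math
import random


def poisson_disk_2d(width, height, r, k=30, seed=None):
    """Return points in [0, width) x [0, height) that are pairwise at least r apart."""
    rng = random.Random(seed)
    cell = r / math.sqrt(2)  # cell diagonal equals r, so a cell holds at most one point
    cols = int(math.ceil(width / cell))
    rows = int(math.ceil(height / cell))
    grid = [[None] * cols for _ in range(rows)]

    def cell_of(p):
        # Clamp so a coordinate rounded up to the boundary still maps to a valid cell.
        return min(int(p[1] / cell), rows - 1), min(int(p[0] / cell), cols - 1)

    def far_enough(p):
        # Any point closer than r must lie within two cells in each direction.
        row, col = cell_of(p)
        for i in range(max(row - 2, 0), min(row + 3, rows)):
            for j in range(max(col - 2, 0), min(col + 3, cols)):
                q = grid[i][j]
                if q is not None and math.hypot(p[0] - q[0], p[1] - q[1]) < r:
                    return False
        return True

    first = (rng.random() * width, rng.random() * height)
    points, active = [first], [first]
    row, col = cell_of(first)
    grid[row][col] = first

    while active:
        idx = rng.randrange(len(active))
        base = active[idx]
        for _ in range(k):
            # Candidate drawn uniformly (by area) from the annulus between r and 2r.
            rad = math.sqrt(rng.uniform(r * r, 4.0 * r * r))
            ang = rng.uniform(0.0, 2.0 * math.pi)
            cand = (base[0] + rad * math.cos(ang), base[1] + rad * math.sin(ang))
            if 0 <= cand[0] < width and 0 <= cand[1] < height and far_enough(cand):
                points.append(cand)
                active.append(cand)
                row, col = cell_of(cand)
                grid[row][col] = cand
                break
        else:
            active.pop(idx)  # no candidate fitted: this point is exhausted
    return points
```

The background grid is what keeps each distance check local, so the run time grows roughly linearly with the number of points produced.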

pln-team

PLNmodels:Poisson Lognormal Models

The Poisson-lognormal model and variants (Chiquet, Mariadassou and Robin, 2021 <doi:10.3389/fevo.2021.588292>) can be used for a variety of multivariate problems when count data are at play, including principal component analysis for count data, discriminant analysis, model-based clustering and network inference. Implements variational algorithms to fit such models, accompanied by a set of functions for visualization and diagnostics.

Maintained by Julien Chiquet. Last updated 3 days ago.

count-data multivariate-analysis network-inference pca poisson-lognormal-model openblas cpp

28.7 match 56 stars 9.50 score 226 scripts

bioc

glmGamPoi:Fit a Gamma-Poisson Generalized Linear Model

Fit linear models to overdispersed count data. The package can estimate the overdispersion and fit repeated models for matrix input. It is designed to handle large input datasets as they typically occur in single cell RNA-seq experiments.

Maintained by Constantin Ahlmann-Eltze. Last updated 1 month ago.

regression rnaseq software singlecell gamma-poisson glm negative-binomial-regression on-disk openblas cpp

15.7 match 110 stars 12.11 score 1.0k scripts 4 dependents

afialkowski

SimMultiCorrData:Simulation of Correlated Data with Multiple Variable Types

Generate continuous (normal or non-normal), binary, ordinal, and count (Poisson or Negative Binomial) variables with a specified correlation matrix. It can also produce a single continuous variable. This package can be used to simulate data sets that mimic real-world situations (i.e. clinical or genetic data sets, plasmodes). All variables are generated from standard normal variables with an imposed intermediate correlation matrix. Continuous variables are simulated by specifying mean, variance, skewness, standardized kurtosis, and fifth and sixth standardized cumulants using either Fleishman's third-order (<DOI:10.1007/BF02293811>) or Headrick's fifth-order (<DOI:10.1016/S0167-9473(02)00072-5>) polynomial transformation. Binary and ordinal variables are simulated using a modification of the ordsample() function from 'GenOrd'. Count variables are simulated using the inverse cdf method. There are two simulation pathways which differ primarily according to the calculation of the intermediate correlation matrix. In Correlation Method 1, the intercorrelations involving count variables are determined using a simulation based, logarithmic correlation correction (adapting Yahav and Shmueli's 2012 method, <DOI:10.1002/asmb.901>). In Correlation Method 2, the count variables are treated as ordinal (adapting Barbiero and Ferrari's 2015 modification of GenOrd, <DOI:10.1002/asmb.2072>). There is an optional error loop that corrects the final correlation matrix to be within a user-specified precision value of the target matrix. The package also includes functions to calculate standardized cumulants for theoretical distributions or from real data sets, check if a target correlation matrix is within the possible correlation bounds (given the distributions of the simulated variables), summarize results (numerically or graphically), to verify valid power method pdfs, and to calculate lower standardized kurtosis bounds.

Maintained by Allison Cynthia Fialkowski. Last updated 7 years ago.

20.2 match 12 stars 7.58 score 44 scripts 6 dependents
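
The SimMultiCorrData entry says that count variables are obtained from correlated standard normal variables by the inverse CDF method. A minimal two-variable Python sketch of that transformation is shown here. It is not the package's code, the names `normal_cdf`, `poisson_quantile` and `correlated_poisson_pairs` are hypothetical, and it assumes moderate rates. Note that `rho` is the correlation of the intermediate normal variables, not the correlation of the resulting counts, which is why the package needs a separate step to compute the intermediate correlation matrix.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def poisson_quantile(u, lam):
    """Smallest k with P(X <= k) >= u for X ~ Poisson(lam), lam moderate."""
    k = 0
    term = math.exp(-lam)  # P(X = 0)
    total = term
    while total < u and term > 0.0:
        k += 1
        term *= lam / k  # P(X = k) from P(X = k - 1)
        total += term
    return k


def correlated_poisson_pairs(n, lam1, lam2, rho, seed=None):
    """Draw n pairs of Poisson counts from bivariate normals with correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Normal -> uniform via the normal CDF, then uniform -> count via the Poisson quantile
        pairs.append((poisson_quantile(normal_cdf(z1), lam1),
                      poisson_quantile(normal_cdf(z2), lam2)))
    return pairs
```

Because the quantile step is a nonlinear map, the correlation between the counts is generally smaller in magnitude than `rho`.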

bladder-ca

nhppp:Simulating Nonhomogeneous Poisson Point Processes

Simulates events from one dimensional nonhomogeneous Poisson point processes (NHPPPs) as per Trikalinos and Sereda (2024, <doi:10.48550/arXiv.2402.00358> and 2024, <doi:10.1371/journal.pone.0311311>). Functions are based on three algorithms that provably sample from a target NHPPP: the time-transformation of a homogeneous Poisson process (of intensity one) via the inverse of the integrated intensity function (Cinlar E, "Theory of stochastic processes" (1975, ISBN:0486497996)); the generation of a Poisson number of order statistics from a fixed density function; and the thinning of a majorizing NHPPP via an acceptance-rejection scheme (Lewis PAW, Shedler, GS (1979) <doi:10.1002/nav.3800260304>).

Maintained by Thomas Trikalinos. Last updated 16 days ago.

cpp

25.1 match 3 stars 5.76 score 19 scripts
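
Of the three algorithms listed in the nhppp entry, thinning is the shortest to write down. The Python sketch below uses a constant majorizing rate, the classic Lewis and Shedler setting, which is a special case of the majorizing NHPPP the entry mentions. It is not the package's implementation; `nhpp_thinning`, `intensity` and `lambda_max` are hypothetical names, and the caller is assumed to guarantee `intensity(t) <= lambda_max` on the whole interval.

```python
import math
import random


def nhpp_thinning(intensity, lambda_max, t_end, seed=None):
    """Event times in (0, t_end] of a Poisson process with rate intensity(t).

    Assumes 0 <= intensity(t) <= lambda_max for all t in (0, t_end].
    """
    rng = random.Random(seed)
    t, events = 0.0, []
    while True:
        # Next event of the homogeneous majorizing process with rate lambda_max
        t += rng.expovariate(lambda_max)
        if t > t_end:
            return events
        # Keep the event with probability intensity(t) / lambda_max
        if rng.random() * lambda_max < intensity(t):
            events.append(t)
```

For example, `nhpp_thinning(lambda t: 2.0 + math.sin(t), 3.0, 100.0, seed=0)` returns an increasing list of event times. The closer `lambda_max` is to the true peak rate, the fewer proposals are discarded.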

lotze

COMPoissonReg:Conway-Maxwell Poisson (COM-Poisson) Regression

Fit Conway-Maxwell Poisson (COM-Poisson or CMP) regression models to count data (Sellers & Shmueli, 2010) <doi:10.1214/09-AOAS306>. The package provides functions for model estimation, dispersion testing, and diagnostics. Zero-inflated CMP regression (Sellers & Raim, 2016) <doi:10.1016/j.csda.2016.01.007> is also supported.

Maintained by Andrew Raim. Last updated 1 year ago.

cpp

21.4 match 9 stars 6.63 score 53 scripts 3 dependents

mariarizzo

energy:E-Statistics: Multivariate Inference via the Energy of Data

E-statistics (energy) tests and statistics for multivariate and univariate inference, including distance correlation, one-sample, two-sample, and multi-sample tests for comparing multivariate distributions, are implemented. Measuring and testing multivariate independence based on distance correlation, partial distance correlation, multivariate goodness-of-fit tests, k-groups and hierarchical clustering based on energy distance, testing for multivariate normality, distance components (disco) for non-parametric analysis of structured data, and other energy statistics/methods are implemented.

Maintained by Maria Rizzo. Last updated 7 months ago.

distance-correlation energy multivariate-analysis statistics cpp

13.1 match 45 stars 10.60 score 634 scripts 45 dependents

ericgiunta

Colossus:"Risk Model Regression and Analysis with Complex Non-Linear Models"

Performs survival analysis using general non-linear models. Risk models can be the sum or product of terms. Each term is the product of exponential/linear functions of covariates. Additionally sub-terms can be defined as a sum of exponential, linear threshold, and step functions. Cox Proportional hazards <https://en.wikipedia.org/wiki/Proportional_hazards_model>, Poisson <https://en.wikipedia.org/wiki/Poisson_regression>, and Fine-Gray competing risks <https://www.publichealth.columbia.edu/research/population-health-methods/competing-risk-analysis> regression are supported. This work was sponsored by NASA Grant 80NSSC19M0161 through a subcontract from the National Council on Radiation Protection and Measurements (NCRP). The computing for this project was performed on the Beocat Research Cluster at Kansas State University, which is funded in part by NSF grants CNS-1006860, EPS-1006860, EPS-0919443, ACI-1440548, CHE-1726332, and NIH P20GM113109.

Maintained by Eric Giunta. Last updated 2 days ago.

cpp openmp

19.3 match 1 stars 7.06 score 36 scripts

cran

Sequential:Exact Sequential Analysis for Poisson and Binomial Data

Functions to calculate exact critical values, statistical power, expected time to signal, and required sample sizes for performing exact sequential analysis. All these calculations can be done for either Poisson or binomial data, for continuous or group sequential analyses, and for different types of rejection boundaries. In group sequential analyses, the group sizes do not have to be specified in advance and the alpha spending can be set arbitrarily.

Maintained by Ivair Ramos Silva. Last updated 5 months ago.

41.5 match 2 stars 3.24 score 38 scripts 1 dependents

gamlss-dev

gamlss.dist:Distributions for Generalized Additive Models for Location Scale and Shape

A set of distributions which can be used for modelling the response variables in Generalized Additive Models for Location Scale and Shape, Rigby and Stasinopoulos (2005), <doi:10.1111/j.1467-9876.2005.00510.x>. The distributions can be continuous, discrete or mixed distributions. Extra distributions can be created by transforming any continuous distribution defined on the real line to a distribution defined on the range 0 to infinity or 0 to 1, using a 'log' or a 'logit' transformation respectively.

Maintained by Mikis Stasinopoulos. Last updated 20 days ago.

12.7 match 4 stars 10.50 score 346 scripts 71 dependents

rfastofficial

Rfast2:A Collection of Efficient and Extremely Fast R Functions II

A collection of fast statistical and utility functions for data analysis. Functions for regression, maximum likelihood, column-wise statistics and many more have been included. C++ has been utilized to speed up the functions. References: Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>.

Maintained by Manos Papadakis. Last updated 1 year ago.

openblas cpp openmp

16.3 match 38 stars 8.09 score 75 scripts 26 dependents

spatstat

spatstat:Spatial Point Pattern Analysis, Model-Fitting, Simulation, Tests

Comprehensive open-source toolbox for analysing Spatial Point Patterns. Focused mainly on two-dimensional point patterns, including multitype/marked points, in any spatial region. Also supports three-dimensional point patterns, space-time point patterns in any number of dimensions, point patterns on a linear network, and patterns of other geometrical objects. Supports spatial covariate data such as pixel images. Contains over 3000 functions for plotting spatial data, exploratory data analysis, model-fitting, simulation, spatial sampling, model diagnostics, and formal inference. Data types include point patterns, line segment patterns, spatial windows, pixel images, tessellations, and linear networks. Exploratory methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported. Parametric models can be fitted to point pattern data using the functions ppm(), kppm(), slrm(), dppm() similar to glm(). Types of models include Poisson, Gibbs and Cox point processes, Neyman-Scott cluster processes, and determinantal point processes. Models may involve dependence on covariates, inter-point interaction, cluster formation and dependence on marks. Models are fitted by maximum likelihood, logistic regression, minimum contrast, and composite likelihood methods. A model can be fitted to a list of point patterns (replicated point pattern data) using the function mppm(). 
The model can include random effects and fixed effects depending on the experimental design, in addition to all the features listed above. Fitted point process models can be simulated, automatically. Formal hypothesis tests of a fitted model are supported (likelihood ratio test, analysis of deviance, Monte Carlo tests) along with basic tools for model selection (stepwise(), AIC()) and variable selection (sdr). Tools for validating the fitted model include simulation envelopes, residuals, residual plots and Q-Q plots, leverage and influence diagnostics, partial residuals, and added variable plots.

Maintained by Adrian Baddeley. Last updated 2 months ago.

cluster-process cox-point-process gibbs-process kernel-density network-analysis point-process poisson-process spatial-analysis spatial-data spatial-data-analysis spatial-statistics spatstat statistical-methods statistical-models statistical-tests statistics

8.0 match 200 stars 16.32 score 5.5k scripts 41 dependents

ropensci

QuadratiK:Collection of Methods Constructed using Kernel-Based Quadratic Distances

Includes a test for multivariate normality, a test for uniformity on the d-dimensional sphere, non-parametric two- and k-sample tests, random generation of points from the Poisson kernel-based density, and a clustering algorithm for spherical data. For more information see Saraceno G., Markatou M., Mukhopadhyay R. and Golzy M. (2024) <doi:10.48550/arXiv.2402.02290>, Markatou, M. and Saraceno, G. (2024) <doi:10.48550/arXiv.2407.16374>, Ding, Y., Markatou, M. and Saraceno, G. (2023) <doi:10.5705/ss.202022.0347>, and Golzy, M. and Markatou, M. (2020) <doi:10.1080/10618600.2020.1740713>.

Maintained by Giovanni Saraceno. Last updated 1 month ago.

cpp

20.1 match 1 stars 6.36 score 27 scripts

twolodzko

extraDistr:Additional Univariate and Multivariate Distributions

Density, distribution function, quantile function and random generation for a number of univariate and multivariate distributions. This package implements the following distributions: Bernoulli, beta-binomial, beta-negative binomial, beta prime, Bhattacharjee, Birnbaum-Saunders, bivariate normal, bivariate Poisson, categorical, Dirichlet, Dirichlet-multinomial, discrete gamma, discrete Laplace, discrete normal, discrete uniform, discrete Weibull, Frechet, gamma-Poisson, generalized extreme value, Gompertz, generalized Pareto, Gumbel, half-Cauchy, half-normal, half-t, Huber density, inverse chi-squared, inverse-gamma, Kumaraswamy, Laplace, location-scale t, logarithmic, Lomax, multivariate hypergeometric, multinomial, negative hypergeometric, non-standard beta, normal mixture, Poisson mixture, Pareto, power, reparametrized beta, Rayleigh, shifted Gompertz, Skellam, slash, triangular, truncated binomial, truncated normal, truncated Poisson, Tukey lambda, Wald, zero-inflated binomial, zero-inflated negative binomial, zero-inflated Poisson.

Maintained by Tymoteusz Wolodzko. Last updated 10 days ago.

c-plus-plus c-plus-plus-11 distribution multivariate-distributions probability random-generation rcpp statistics cpp

10.9 match 53 stars 11.60 score 1.5k scripts 107 dependents

merliseclyde

BAS:Bayesian Variable Selection and Model Averaging using Bayesian Adaptive Sampling

Package for Bayesian Variable Selection and Model Averaging in linear models and generalized linear models using stochastic or deterministic sampling without replacement from posterior distributions. Prior distributions on coefficients are from Zellner's g-prior or mixtures of g-priors corresponding to the Zellner-Siow Cauchy Priors or the mixture of g-priors from Liang et al (2008) <DOI:10.1198/016214507000001337> for linear models or mixtures of g-priors from Li and Clyde (2019) <DOI:10.1080/01621459.2018.1469992> in generalized linear models. Other model selection criteria include AIC, BIC and Empirical Bayes estimates of g. Sampling probabilities may be updated based on the sampled models using sampling w/out replacement or an efficient MCMC algorithm which samples models using a tree structure of the model space as an efficient hash table. See Clyde, Ghosh and Littman (2010) <DOI:10.1198/jcgs.2010.09049> for details on the sampling algorithms. Uniform priors over all models or beta-binomial prior distributions on model size are allowed, and for large p truncated priors on the model space may be used to enforce sampling models that are full rank. The user may force variables to always be included in addition to imposing constraints that higher order interactions are included only if their parents are included in the model. This material is based upon work supported by the National Science Foundation under Division of Mathematical Sciences grant 1106891. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Maintained by Merlise Clyde. Last updated 4 months ago.

bayesian bayesian-inference generalized-linear-models linear-regression logistic-regression mcmc model-selection poisson-regression predictive-modeling regression variable-selection fortran openblas

11.6 match 44 stars 10.81 score 420 scripts 3 dependents

stephenslab

fastTopics:Fast Algorithms for Fitting Topic Models and Non-Negative Matrix Factorizations to Count Data

Implements fast, scalable optimization algorithms for fitting topic models ("grade of membership" models) and non-negative matrix factorizations to count data. The methods exploit the special relationship between the multinomial topic model (also, "probabilistic latent semantic indexing") and Poisson non-negative matrix factorization. The package provides tools to compare, annotate and visualize model fits, including functions to efficiently create "structure plots" and identify key features in topics. The 'fastTopics' package is a successor to the 'CountClust' package. For more information, see <doi:10.48550/arXiv.2105.13440> and <doi:10.1186/s13059-023-03067-9>. Please also see the GitHub repository for additional vignettes not included in the package on CRAN.

Maintained by Peter Carbonetto. Last updated 15 days ago.

openblas cpp

14.1 match 79 stars 8.38 score 678 scripts 1 dependents

alexkowa

EnvStats:Package for Environmental Statistics, Including US EPA Guidance

Graphical and statistical analyses of environmental data, with focus on analyzing chemical concentrations and physical parameters, usually in the context of mandated environmental monitoring. Major environmental statistical methods found in the literature and regulatory guidance documents, with extensive help that explains what these methods do, how to use them, and where to find them in the literature. Numerous built-in data sets from regulatory guidance documents and environmental statistics literature. Includes scripts reproducing analyses presented in the book "EnvStats: An R Package for Environmental Statistics" (Millard, 2013, Springer, ISBN 978-1-4614-8455-4, <doi:10.1007/978-1-4614-8456-1>).

Maintained by Alexander Kowarik. Last updated 15 days ago.

9.2 match 26 stars 12.80 score 2.4k scripts 46 dependents

mlverse

torch:Tensors and Neural Networks with 'GPU' Acceleration

Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.

Maintained by Daniel Falbel. Last updated 5 days ago.

autograd deep-learning torch cpp

7.0 match 520 stars 16.52 score 1.4k scripts 38 dependents

fj86

PoissonBinomial:Efficient Computation of Ordinary and Generalized Poisson Binomial Distributions

Efficient implementations of multiple exact and approximate methods as described in Hong (2013) <doi:10.1016/j.csda.2012.10.006>, Biscarri, Zhao & Brunner (2018) <doi:10.1016/j.csda.2018.01.007> and Zhang, Hong & Balakrishnan (2018) <doi:10.1080/00949655.2018.1440294> for computing the probability mass, cumulative distribution and quantile functions, as well as generating random numbers for both the ordinary and generalized Poisson binomial distribution.

Maintained by Florian Junge. Last updated 6 months ago.

fftw3 cpp

23.2 match 3 stars 4.86 score 10 scripts 2 dependents
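
For readers unfamiliar with the ordinary Poisson binomial distribution named in the PoissonBinomial entry (the number of successes among independent Bernoulli trials with unequal probabilities), one standard exact way to compute its probability mass function is direct convolution. The Python sketch below shows that approach only. Whether and how the package uses it is not stated in the listing, and the name `poisson_binomial_pmf` is hypothetical.

```python
def poisson_binomial_pmf(probs):
    """Return pmf where pmf[k] = P(exactly k successes).

    probs holds the success probabilities of independent Bernoulli trials.
    """
    pmf = [1.0]  # with zero trials, zero successes has probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this trial fails: count stays at k
            nxt[k + 1] += mass * p       # this trial succeeds: count moves to k + 1
        pmf = nxt
    return pmf
```

The cost is quadratic in the number of trials, which is one reason faster exact and approximate methods are of interest for long probability vectors.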

actuaryzhang

cplm:Compound Poisson Linear Models

Likelihood-based and Bayesian methods for various compound Poisson linear models based on Zhang, Yanwei (2013) <doi:10.1007/s11222-012-9343-7>.

Maintained by Yanwei (Wayne) Zhang. Last updated 1 year ago.

openblas

12.5 match 16 stars 8.45 score 75 scripts 10 dependents

doccstat

fastcpd:Fast Change Point Detection via Sequential Gradient Descent

Implements a fast change point detection algorithm based on the paper "Sequential Gradient Descent and Quasi-Newton's Method for Change-Point Analysis" by Xianyang Zhang, Trisha Dawn <https://proceedings.mlr.press/v206/zhang23b.html>. The algorithm is based on dynamic programming with pruning and sequential gradient descent. It is able to detect change points an order of magnitude faster than the vanilla Pruned Exact Linear Time (PELT) method. The package includes examples for linear regression, logistic regression, Poisson regression and penalized linear regression data, as well as many more examples with custom cost functions for users who want to supply their own.

Maintained by Xingchi Li. Last updated 2 hours ago.

change-point-detection cpp custom-function gradient-descent lasso linear-regression logistic-regression offline pelt penalized-regression poisson-regression quasi-newton statistics time-series warm-start fortran openblas cpp openmp

15.0 match 22 stars 7.00 score 7 scripts

bayesball

LearnBayes:Learning Bayesian Inference

Contains functions for summarizing basic one and two parameter posterior distributions and predictive distributions. It contains MCMC algorithms for summarizing posterior distributions defined by the user. It also contains functions for regression models, hierarchical models, Bayesian tests, and illustrations of Gibbs sampling.

Maintained by Jim Albert. Last updated 7 years ago.

9.0 match 38 stars 11.34 score 690 scripts 31 dependents

vpnsctl

mixpoissonreg:Mixed Poisson Regression for Overdispersed Count Data

Fits mixed Poisson regression models (Poisson-Inverse Gaussian or Negative-Binomial) on data sets with response variables being count data. The models can have varying precision parameter, where a linear regression structure (through a link function) is assumed to hold on the precision parameter. The Expectation-Maximization algorithm for both these models (Poisson Inverse Gaussian and Negative Binomial) is an important contribution of this package. Another important feature of this package is the set of functions to perform global and local influence analysis. See Barreto-Souza and Simas (2016) <doi:10.1007/s11222-015-9601-6> for further details.

Maintained by Alexandre B. Simas. Last updated 4 years ago.

count-data diagnostics influence-analysis local-influence negative-binomial-regression poisson-inverse-gaussian-regression

18.6 match 3 stars 5.44 score 23 scripts

tjheaton

carbondate:Calibration and Summarisation of Radiocarbon Dates

Performs Bayesian non-parametric calibration of multiple related radiocarbon determinations, and summarises the calendar age information to plot their joint calendar age density (see Heaton (2022) <doi:10.1111/rssc.12599>). Also models the occurrence of radiocarbon samples as a variable-rate (inhomogeneous) Poisson process, plotting the posterior estimate for the occurrence rate of the samples over calendar time, and providing information about potential change points.

Maintained by Timothy J Heaton. Last updated 2 months ago.

cpp

16.8 match 5 stars 5.78 score 20 scripts

afialkowski

SimCorrMix:Simulation of Correlated Data with Multiple Variable Types Including Continuous and Count Mixture Distributions

Generate continuous (normal, non-normal, or mixture distributions), binary, ordinal, and count (regular or zero-inflated, Poisson or Negative Binomial) variables with a specified correlation matrix, or one continuous variable with a mixture distribution. This package can be used to simulate data sets that mimic real-world clinical or genetic data sets (i.e., plasmodes, as in Vaughan et al., 2009 <DOI:10.1016/j.csda.2008.02.032>). The methods extend those found in the 'SimMultiCorrData' R package. Standard normal variables with an imposed intermediate correlation matrix are transformed to generate the desired distributions. Continuous variables are simulated using either Fleishman (1978)'s third order <DOI:10.1007/BF02293811> or Headrick (2002)'s fifth order <DOI:10.1016/S0167-9473(02)00072-5> polynomial transformation method (the power method transformation, PMT). Non-mixture distributions require the user to specify mean, variance, skewness, standardized kurtosis, and standardized fifth and sixth cumulants. Mixture distributions require these inputs for the component distributions plus the mixing probabilities. Simulation occurs at the component level for continuous mixture distributions. The target correlation matrix is specified in terms of correlations with components of continuous mixture variables. These components are transformed into the desired mixture variables using random multinomial variables based on the mixing probabilities. However, the package provides functions to approximate expected correlations with continuous mixture variables given target correlations with the components. Binary and ordinal variables are simulated using a modification of ordsample() in package 'GenOrd'. Count variables are simulated using the inverse CDF method. There are two simulation pathways which calculate intermediate correlations involving count variables differently. Correlation Method 1 adapts Yahav and Shmueli's 2012 method <DOI:10.1002/asmb.901> and performs best with large count variable means and positive correlations or small means and negative correlations. Correlation Method 2 adapts Barbiero and Ferrari's 2015 modification of the 'GenOrd' package <DOI:10.1002/asmb.2072> and performs best under the opposite scenarios. The optional error loop may be used to improve the accuracy of the final correlation matrix. The package also contains functions to calculate the standardized cumulants of continuous mixture distributions, check parameter inputs, calculate feasible correlation boundaries, and summarize and plot simulated variables.

Maintained by Allison Cynthia Fialkowski. Last updated 7 years ago.

18.4 match 5 stars 5.24 score 14 scripts
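
The SimCorrMix description above mentions transforming correlated standard normal variables and simulating counts with the inverse CDF method. A minimal base R sketch of that general idea follows. It does not use SimCorrMix itself; the correlation value, the Poisson means, and all variable names are assumptions chosen only for illustration.

```r
set.seed(1)
n   <- 10000
rho <- 0.6                                   # assumed normal-scale ("intermediate") correlation
Sigma <- matrix(c(1, rho, rho, 1), nrow = 2)

# Correlated standard normals: rows of Z have covariance Sigma.
Z <- matrix(rnorm(2 * n), ncol = 2) %*% chol(Sigma)

# Inverse CDF step: normal -> uniform -> Poisson quantile.
U  <- pnorm(Z)
y1 <- qpois(U[, 1], lambda = 2)              # assumed mean 2
y2 <- qpois(U[, 2], lambda = 10)             # assumed mean 10

# Sample means and the correlation achieved on the count scale.
c(mean_y1 = mean(y1), mean_y2 = mean(y2), cor_y1_y2 = cor(y1, y2))
```

The correlation between the resulting counts generally differs from the value imposed on the normal scale, which is why an intermediate correlation has to be calculated before simulating.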

paulnorthrop

bang:Bayesian Analysis, No Gibbs

Provides functions for the Bayesian analysis of some simple commonly-used models, without using Markov Chain Monte Carlo (MCMC) methods such as Gibbs sampling. The 'rust' package <https://cran.r-project.org/package=rust> is used to simulate a random sample from the required posterior distribution, using the generalized ratio-of-uniforms method. See Wakefield, Gelfand and Smith (1991) <DOI:10.1007/BF01889987> for details. At the moment three conjugate hierarchical models are available: beta-binomial, gamma-Poisson and a 1-way analysis of variance (ANOVA).

Maintained by Paul J. Northrop. Last updated 1 months ago.

anova bayesian beta binomial gamma gibbs hierarchical poisson

15.9 match 3 stars 5.62 score 35 scripts

wasquith

lmomco:L-Moments, Censored L-Moments, Trimmed L-Moments, L-Comoments, and Many Distributions

Extensive functions for Lmoments (LMs) and probability-weighted moments (PWMs), distribution parameter estimation, LMs for distributions, LM ratio diagrams, multivariate Lcomoments, and asymmetric (asy) trimmed LMs (TLMs). Maximum likelihood and maximum product spacings estimation are available. Right-tail and left-tail LM censoring by threshold or indicator variable are available. LMs of residual (resid) and reversed (rev) residual life are implemented along with 13 quantile operators for reliability analyses. Exact analytical bootstrap estimates of order statistics, LMs, and LM var-covars are available. Harri-Coble Tau34-squared Normality Test is available. Distributions with L, TL, and added (+) support for right-tail censoring (RC) encompass: Asy Exponential (Exp) Power [L], Asy Triangular [L], Cauchy [TL], Eta-Mu [L], Exp. [L], Gamma [L], Generalized (Gen) Exp Poisson [L], Gen Extreme Value [L], Gen Lambda [L, TL], Gen Logistic [L], Gen Normal [L], Gen Pareto [L+RC, TL], Govindarajulu [L], Gumbel [L], Kappa [L], Kappa-Mu [L], Kumaraswamy [L], Laplace [L], Linear Mean Residual Quantile Function [L], Normal [L], 3p log-Normal [L], Pearson Type III [L], Polynomial Density-Quantile 3 and 4 [L], Rayleigh [L], Rev-Gumbel [L+RC], Rice [L], Singh Maddala [L], Slash [TL], 3p Student t [L], Truncated Exponential [L], Wakeby [L], and Weibull [L].

Maintained by William Asquith. Last updated 1 months ago.

flood-frequency-analysis l-moments mle-estimation mps-estimation probability-distribution rainfall-frequency-analysis reliability-analysis risk-analysis survival-analysis

10.9 match 2 stars 8.06 score 458 scripts 38 dependents

ghislainv

hSDM:Hierarchical Bayesian Species Distribution Models

User-friendly and fast set of functions for estimating parameters of hierarchical Bayesian species distribution models (Latimer and others 2006 <doi:10.1890/04-0609>). Such models allow interpreting the observations (occurrence and abundance of a species) as a result of several hierarchical processes including ecological processes (habitat suitability, spatial dependence and anthropogenic disturbance) and observation processes (species detectability). Hierarchical species distribution models are essential for accurately characterizing the environmental response of species, predicting their probability of occurrence, and assessing uncertainty in the model results.

Maintained by Ghislain Vieilledent. Last updated 2 years ago.

gsl

14.4 match 9 stars 6.04 score 41 scripts

spsanderson

TidyDensity:Functions for Tidy Analysis and Generation of Random Data

Makes it easy to generate random numbers based upon the underlying stats distribution functions. All data is returned in a tidy and structured format, making working with the data simple and straightforward. Because the data is returned in a tidy 'tibble', it lends itself to working with the rest of the 'tidyverse'.

Maintained by Steven Sanderson. Last updated 5 months ago.

bootstrap density distributions ggplot2 probability r-language simulation statistics tibble tidy

10.6 match 34 stars 7.78 score 66 scripts 1 dependents

spatstat

spatstat.linnet:Linear Networks Functionality of the 'spatstat' Family

Defines types of spatial data on a linear network and provides functionality for geometrical operations, data analysis and modelling of data on a linear network, in the 'spatstat' family of packages. Contains definitions and support for linear networks, including creation of networks, geometrical measurements, topological connectivity, geometrical operations such as inserting and deleting vertices, intersecting a network with another object, and interactive editing of networks. Data types defined on a network include point patterns, pixel images, functions, and tessellations. Exploratory methods include kernel estimation of intensity on a network, K-functions and pair correlation functions on a network, simulation envelopes, nearest neighbour distance and empty space distance, relative risk estimation with cross-validated bandwidth selection. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported. Parametric models can be fitted to point pattern data using the function lppm() similar to glm(). Only Poisson models are implemented so far. Models may involve dependence on covariates and dependence on marks. Models are fitted by maximum likelihood. Fitted point process models can be simulated, automatically. Formal hypothesis tests of a fitted model are supported (likelihood ratio test, analysis of deviance, Monte Carlo tests) along with basic tools for model selection (stepwise(), AIC()) and variable selection (sdr). Tools for validating the fitted model include simulation envelopes, residuals, residual plots and Q-Q plots, leverage and influence diagnostics, partial residuals, and added variable plots. Random point patterns on a network can be generated using a variety of models.

Maintained by Adrian Baddeley. Last updated 2 months ago.

density-estimation heat-equation kernel-density-estimation network-analysis point-processes spatial-data-analysis statistical-analysis statistical-inference statistical-models

8.5 match 6 stars 9.64 score 35 scripts 43 dependents

jongheepark

MCMCpack:Markov Chain Monte Carlo (MCMC) Package

Contains functions to perform Bayesian inference using posterior simulation for a number of statistical models. Most simulation is done in compiled C++ written in the Scythe Statistical Library Version 1.0.3. All models return 'coda' mcmc objects that can then be summarized using the 'coda' package. Some useful utility functions such as density functions, pseudo-random number generators for statistical distributions, a general purpose Metropolis sampling algorithm, and tools for visualization are provided.

Maintained by Jong Hee Park. Last updated 7 months ago.

cpp

8.4 match 13 stars 9.40 score 2.6k scripts 150 dependents

spluque

diveMove:Dive Analysis and Calibration

Utilities to represent, visualize, filter, analyse, and summarize time-depth recorder (TDR) data. Miscellaneous functions for handling location data are also provided.

Maintained by Sebastian P. Luque. Last updated 5 months ago.

animal-behavior behavioural-ecology biology diving science

11.7 match 6 stars 6.75 score 55 scripts

yuimaproject

yuima:The YUIMA Project Package for SDEs

Simulation and Inference for SDEs and Other Stochastic Processes.

Maintained by Stefano M. Iacus. Last updated 1 days ago.

openblas cpp

10.7 match 9 stars 7.26 score 92 scripts 2 dependents

finnishcancerregistry

popEpi:Functions for Epidemiological Analysis using Population Data

Enables computation of epidemiological statistics, including those where counts or mortality rates of the reference population are used. Currently supported: excess hazard models (Dickman, Sloggett, Hills, and Hakulinen (2012) <doi:10.1002/sim.1597>), rates, mean survival times, relative/net survival (in particular the Ederer II (Ederer and Heise (1959)) and Pohar Perme (Pohar Perme, Stare, and Esteve (2012) <doi:10.1111/j.1541-0420.2011.01640.x>) estimators), and standardized incidence and mortality ratios, all of which can be easily adjusted for covariates such as age. Fast splitting and aggregation of 'Lexis' objects (from package 'Epi') and other computations achieved using 'data.table'.

Maintained by Joonas Miettinen. Last updated 1 months ago.

adjust-estimates age-adjusting direct-adjusting epidemiology indirect-adjusting survival

9.6 match 8 stars 8.05 score 117 scripts 1 dependents

furrer-lab

abn:Modelling Multivariate Data with Additive Bayesian Networks

The 'abn' R package facilitates Bayesian network analysis, a probabilistic graphical model that derives from empirical data a directed acyclic graph (DAG). This DAG describes the dependency structure between random variables. The R package 'abn' provides routines to help determine optimal Bayesian network models for a given data set. These models are used to identify statistical dependencies in messy, complex data. Their additive formulation is equivalent to multivariate generalised linear modelling, including mixed models with independent and identically distributed (iid) random effects. The core functionality of the 'abn' package revolves around model selection, also known as structure discovery. It supports both exact and heuristic structure learning algorithms and does not restrict the data distribution of parent-child combinations, providing flexibility in model creation and analysis. The 'abn' package uses Laplace approximations for metric estimation and includes wrappers to the 'INLA' package. It also employs 'JAGS' for data simulation purposes. For more resources and information, visit the 'abn' website.

Maintained by Matteo Delucchi. Last updated 4 days ago.

bayesian-network binomial categorical-data gaussian grouped-datasets mixed-effects multinomial multivariate poisson structure-learning gsl openblas cpp openmp jags

11.0 match 6 stars 6.94 score 90 scripts

drizopoulos

GLMMadaptive:Generalized Linear Mixed Models using Adaptive Gaussian Quadrature

Fits generalized linear mixed models for a single grouping factor under maximum likelihood approximating the integrals over the random effects with an adaptive Gaussian quadrature rule; Jose C. Pinheiro and Douglas M. Bates (1995) <doi:10.1080/10618600.1995.10474663>.

Maintained by Dimitris Rizopoulos. Last updated 5 days ago.

generalized-linear-mixed-models mixed-effects-models mixed-models

7.3 match 61 stars 10.37 score 212 scripts 5 dependents

petelaud

ratesci:Confidence Intervals for Comparisons of Binomial or Poisson Rates

Computes confidence intervals for the rate (or risk) difference ('RD') or rate ratio (or relative risk, 'RR') for binomial proportions or Poisson rates, or for odds ratio ('OR', binomial only). Also confidence intervals for a single binomial or Poisson rate, and intervals for matched pairs. Includes skewness-corrected asymptotic score ('SCAS') methods, which have been developed in Laud (2017) <doi:10.1002/pst.1813> from Miettinen & Nurminen (1985) <doi:10.1002/sim.4780040211> and Gart & Nam (1988) <doi:10.2307/2531848>, and in Laud (2025, under review) for paired proportions. The same score produces hypothesis tests analogous to the test for binomial RD and RR by Farrington & Manning (1990) <doi:10.1002/sim.4780091208>, or the McNemar test for paired data. The package also includes MOVER methods (Method Of Variance Estimates Recovery) for all contrasts, derived from the Newcombe method but with options to use equal-tailed intervals in place of the Wilson score method, and generalised for Bayesian applications incorporating prior information. So-called 'exact' methods for strictly conservative coverage are approximated using continuity corrections, and the amount of correction can be selected to avoid over-conservative coverage. Also includes methods for stratified calculations (e.g. meta-analysis), either assuming fixed effects (matching the CMH test) or incorporating stratum heterogeneity.

Maintained by Pete Laud. Last updated 18 hours ago.

16.9 match 1 stars 4.43 score 27 scripts 3 dependents

r-forge

surveillance:Temporal and Spatio-Temporal Modeling and Monitoring of Epidemic Phenomena

Statistical methods for the modeling and monitoring of time series of counts, proportions and categorical data, as well as for the modeling of continuous-time point processes of epidemic phenomena. The monitoring methods focus on aberration detection in count data time series from public health surveillance of communicable diseases, but applications could just as well originate from environmetrics, reliability engineering, econometrics, or social sciences. The package implements many typical outbreak detection procedures such as the (improved) Farrington algorithm, or the negative binomial GLR-CUSUM method of Hoehle and Paul (2008) <doi:10.1016/j.csda.2008.02.015>. A novel CUSUM approach combining logistic and multinomial logistic modeling is also included. The package contains several real-world data sets, the ability to simulate outbreak data, and to visualize the results of the monitoring in a temporal, spatial or spatio-temporal fashion. A recent overview of the available monitoring procedures is given by Salmon et al. (2016) <doi:10.18637/jss.v070.i10>. For the retrospective analysis of epidemic spread, the package provides three endemic-epidemic modeling frameworks with tools for visualization, likelihood inference, and simulation. hhh4() estimates models for (multivariate) count time series following Paul and Held (2011) <doi:10.1002/sim.4177> and Meyer and Held (2014) <doi:10.1214/14-AOAS743>. twinSIR() models the susceptible-infectious-recovered (SIR) event history of a fixed population, e.g., epidemics across farms or networks, as a multivariate point process as proposed by Hoehle (2009) <doi:10.1002/bimj.200900050>. twinstim() estimates self-exciting point process models for a spatio-temporal point pattern of infective events, e.g., time-stamped geo-referenced surveillance data, as proposed by Meyer et al. (2012) <doi:10.1111/j.1541-0420.2011.01684.x>. A recent overview of the implemented space-time modeling frameworks for epidemic phenomena is given by Meyer et al. (2017) <doi:10.18637/jss.v077.i11>.

Maintained by Sebastian Meyer. Last updated 15 days ago.

cpp

7.0 match 2 stars 10.74 score 446 scripts 3 dependents

bioc

tweeDEseq:RNA-seq data analysis using the Poisson-Tweedie family of distributions

Differential expression analysis of RNA-seq using the Poisson-Tweedie (PT) family of distributions. PT distributions are described by a mean, a dispersion and a shape parameter and include Poisson and NB distributions, among others, as particular cases. An important feature of this family is that, while the Negative Binomial (NB) distribution only allows a quadratic mean-variance relationship, the PT distributions generalize this relationship to any order.

Maintained by Dolors Pelegri-Siso. Last updated 5 months ago.

immunooncology statisticalmethod differentialexpression sequencing rnaseq dnaseq

15.1 match 4.91 score 45 scripts 1 dependents

wobbrock

multpois:Analyze Nominal Response Data with the Multinomial-Poisson Trick

Dichotomous responses having two categories can be analyzed with stats::glm() or lme4::glmer() using the family=binomial option. Unfortunately, polytomous responses with three or more unordered categories cannot be analyzed similarly because there is no analogous family=multinomial option. For between-subjects data, nnet::multinom() can address this need, but it cannot handle random factors and therefore cannot handle repeated measures. To address this gap, we transform nominal response data into counts for each categorical alternative. These counts are then analyzed using (mixed) Poisson regression as per Baker (1994) <doi:10.2307/2348134>. Omnibus analyses of variance can be run along with post hoc pairwise comparisons. For users wishing to analyze nominal responses from surveys or experiments, the functions in this package essentially act as though stats::glm() or lme4::glmer() provide a family=multinomial option.

Maintained by Jacob O. Wobbrock. Last updated 1 months ago.

15.1 match 1 stars 4.78 score 20 scripts
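
The transformation described in the multpois entry above (nominal responses turned into counts, then analyzed with Poisson regression) can be sketched for the simple between-subjects case in base R. This sketch does not call any multpois function. The data frame, its column names, and the cell counts are hypothetical.

```r
# Hypothetical data: one nominal choice (A/B/C) per respondent in two groups.
responses <- data.frame(
  group  = rep(c("control", "treatment"), times = c(60, 60)),
  choice = c(rep(c("A", "B", "C"), times = c(30, 20, 10)),
             rep(c("A", "B", "C"), times = c(15, 20, 25)))
)

# Step 1: collapse the nominal responses into one count per group and category.
counts <- as.data.frame(table(group = responses$group, choice = responses$choice))

# Step 2: fit Poisson log-linear models to the counts.
# The group:choice interaction carries the effect of group on the response.
full    <- glm(Freq ~ group * choice, family = poisson, data = counts)
reduced <- glm(Freq ~ group + choice, family = poisson, data = counts)

# Step 3: likelihood-ratio test of the interaction term.
print(anova(reduced, full, test = "Chisq"))
```

For repeated measures, the same count layout would be passed to a mixed Poisson model such as lme4::glmer() with a random effect for subject, as the description indicates.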

david-cortes

poismf:Factorization of Sparse Counts Matrices Through Poisson Likelihood

Creates a non-negative low-rank approximate factorization of a sparse counts matrix by maximizing Poisson likelihood with L1/L2 regularization (e.g. for implicit-feedback recommender systems or bag-of-words-based topic modeling) (Cortes (2018) <arXiv:1811.01908>), which usually leads to very sparse user and item factors (over 90% zero-valued). Similar to hierarchical Poisson factorization (HPF), but follows an optimization-based approach with regularization instead of a hierarchical prior, and is fit through gradient-based methods instead of variational inference.

Maintained by David Cortes. Last updated 9 months ago.

implicit-feedback poisson-factorization openblas openmp

16.5 match 46 stars 4.36 score 9 scripts

ai4ci

ggoutbreak:Estimate Incidence, Proportions and Exponential Growth Rates

Simple statistical models and visualisations for calculating the incidence, proportion, exponential growth rate, and reproduction number of infectious disease case time series. This toolkit was largely developed during the COVID-19 pandemic.

Maintained by Robert Challen. Last updated 1 months ago.

16.0 match 1 stars 4.30 score

biodiverse

unmarked:Models for Data from Unmarked Animals

Fits hierarchical models of animal abundance and occurrence to data collected using survey methods such as point counts, site occupancy sampling, distance sampling, removal sampling, and double observer sampling. Parameters governing the state and observation processes can be modeled as functions of covariates. References: Kellner et al. (2023) <doi:10.1111/2041-210X.14123>, Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.

Maintained by Ken Kellner. Last updated 1 months ago.

openblas cpp openmp

5.2 match 4 stars 13.02 score 652 scripts 12 dependents

cran

exactci:Exact P-Values and Matching Confidence Intervals for Simple Discrete Parametric Cases

Calculates exact tests and confidence intervals for one-sample binomial and one- or two-sample Poisson cases (see Fay (2010) <doi:10.32614/rj-2010-008>).

Maintained by Michael P. Fay. Last updated 2 years ago.

12.1 match 5.55 score 36 scripts 10 dependents
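
As a point of reference for the one-sample Poisson case mentioned in the exactci entry above, the following sketch computes an exact central confidence interval for a Poisson rate in base R, once from chi-squared quantiles and once with stats::poisson.test(). It does not use exactci's own functions. The event count and exposure are made-up values.

```r
x        <- 7      # hypothetical number of observed events
exposure <- 2.5    # hypothetical time at risk (e.g. person-years)
conf     <- 0.95
alpha    <- 1 - conf

# Exact central interval for the expected count, via chi-squared quantiles.
lower <- if (x == 0) 0 else qchisq(alpha / 2, df = 2 * x) / 2
upper <- qchisq(1 - alpha / 2, df = 2 * (x + 1)) / 2
ci_manual <- c(lower, upper) / exposure      # rescale to a rate

# The same interval from base R's exact Poisson test.
ci_builtin <- as.numeric(poisson.test(x, T = exposure, conf.level = conf)$conf.int)

rbind(manual = ci_manual, builtin = ci_builtin)
```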

vidargrotan

poilog:Poisson Lognormal and Bivariate Poisson Lognormal Distribution

Functions for obtaining the density, random deviates and maximum likelihood estimates of the Poisson lognormal distribution and the bivariate Poisson lognormal distribution.

Maintained by Vidar Grotan. Last updated 2 years ago.

16.2 match 4.11 score 30 scripts 4 dependents

mqbssppe

poisson.glm.mix:Fit High Dimensional Mixtures of Poisson GLMs

Mixtures of Poisson Generalized Linear Models for high dimensional count data clustering. The (multivariate) responses can be partitioned into a set of blocks. Three different parameterizations of the linear predictor are considered. The models are estimated according to the EM algorithm with an efficient initialization scheme <doi:10.1016/j.csda.2014.07.005>.

Maintained by Panagiotis Papastamoulis. Last updated 2 years ago.

43.5 match 1.52 score 11 scripts 1 dependents

ropensci

beautier:'BEAUti' from R

'BEAST2' (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool, that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. 'BEAUti 2' (which is part of 'BEAST2') is a GUI tool that allows users to specify the many possible setups and generates the XML file 'BEAST2' needs to run. This package provides a way to create 'BEAST2' input files without active user input, but using R function calls instead.

Maintained by Richèl J.C. Bilderbeek. Last updated 21 days ago.

bayesian beast beast2 beauti phylogenetic-inference phylogenetics

7.4 match 13 stars 8.76 score 198 scripts 5 dependents

steve-the-bayesian

BoomSpikeSlab:MCMC for Spike and Slab Regression

Spike and slab regression with a variety of residual error distributions corresponding to Gaussian, Student T, probit, logit, SVM, and a few others. Spike and slab regression is Bayesian regression with prior distributions containing a point mass at zero. The posterior updates the amount of mass on this point, leading to a posterior distribution that is actually sparse, in the sense that if you sample from it many coefficients are actually zeros. Sampling from this posterior distribution is an elegant way to handle Bayesian variable selection and model averaging. See <DOI:10.1504/IJMMNO.2014.059942> for an explanation of the Gaussian case.

Maintained by Steven L. Scott. Last updated 1 years ago.

cpp

11.9 match 6 stars 5.46 score 95 scripts 5 dependents

andreamrau

HTSCluster:Clustering High-Throughput Transcriptome Sequencing (HTS) Data

A Poisson mixture model is implemented to cluster genes from high-throughput transcriptome sequencing (RNA-seq) data. Parameter estimation is performed using either the EM or CEM algorithm, and the slope heuristics are used for model selection (i.e., to choose the number of clusters).

Maintained by Andrea Rau. Last updated 2 years ago.

12.7 match 5.02 score 7 scripts 1 dependents

dsy109

tolerance:Statistical Tolerance Intervals and Regions

Statistical tolerance limits provide the limits between which we can expect to find a specified proportion of a sampled population with a given level of confidence. This package provides functions for estimating tolerance limits (intervals) for various univariate distributions (binomial, Cauchy, discrete Pareto, exponential, two-parameter exponential, extreme value, hypergeometric, Laplace, logistic, negative binomial, negative hypergeometric, normal, Pareto, Poisson-Lindley, Poisson, uniform, and Zipf-Mandelbrot), Bayesian normal tolerance limits, multivariate normal tolerance regions, nonparametric tolerance intervals, tolerance bands for regression settings (linear regression, nonlinear regression, nonparametric regression, and multivariate regression), and analysis of variance tolerance intervals. Visualizations are also available for most of these settings.

Maintained by Derek S. Young. Last updated 9 months ago.

tolerance-intervals

9.9 match 4 stars 6.39 score 153 scripts 7 dependents

nlmixr2

rxode2:Facilities for Simulating from ODE-Based Models

Facilities for running simulations from ordinary differential equation ('ODE') models, such as pharmacometrics and other compartmental models. A compilation manager translates the ODE model into C, compiles it, and dynamically loads the object code into R for improved computational efficiency. An event table object facilitates the specification of complex dosing regimens (optional) and sampling schedules. NB: The use of this package requires both C and Fortran compilers, for details on their use with R please see Section 6.3, Appendix A, and Appendix D in the "R Administration and Installation" manual. Also the code is mostly released under GPL. The 'VODE' and 'LSODA' are in the public domain. The information is available in the inst/COPYRIGHTS.

Maintained by Matthew L. Fidler. Last updated 29 days ago.

fortran openblas cpp openmp

5.6 match 39 stars 11.16 score 220 scripts 13 dependents

franciscomartinezdelrio

DGLMExtPois:Double Generalized Linear Models Extending Poisson Regression

Model estimation, dispersion testing and diagnosis of hyper-Poisson (Saez-Castillo, A.J. and Conde-Sanchez, A. (2013) <doi:10.1016/j.csda.2012.12.009>) and Conway-Maxwell-Poisson (Huang, A. (2017)) regression models.

Maintained by Francisco Martinez. Last updated 2 years ago.

22.0 match 2.85 score 14 scripts

mayoverse

arsenal:An Arsenal of 'R' Functions for Large-Scale Statistical Summaries

An Arsenal of 'R' functions for large-scale statistical summaries, which are streamlined to work within the latest reporting tools in 'R' and 'RStudio' and which use formulas and versatile summary statistics for summary tables and models. The primary functions include tableby(), a Table-1-like summary of multiple variable types 'by' the levels of one or more categorical variables; paired(), a Table-1-like summary of multiple variable types paired across two time points; modelsum(), which performs simple model fits on one or more endpoints for many variables (univariate or adjusted for covariates); freqlist(), a powerful frequency table across many categorical variables; comparedf(), a function for comparing data.frames; and write2(), a function to output tables to a document.

Maintained by Ethan Heinzen. Last updated 7 months ago.

baseline-characteristics descriptive-statistics modeling paired-comparisons reporting statistics tableone

4.5 match 225 stars 13.45 score 1.2k scripts 16 dependents

trevorhastie

glmnet:Lasso and Elastic-Net Regularized Generalized Linear Models

Extremely efficient procedures for fitting the entire lasso or elastic-net regularization path for linear regression, logistic and multinomial regression models, Poisson regression, Cox model, multiple-response Gaussian, and the grouped multinomial regression; see <doi:10.18637/jss.v033.i01> and <doi:10.18637/jss.v039.i05>. There are two new and important additions. The family argument can be a GLM family object, which opens the door to any programmed family (<doi:10.18637/jss.v106.i01>). This comes with a modest computational cost, so when the built-in families suffice, they should be used instead. The other novelty is the relax option, which refits each of the active sets in the path unpenalized. The algorithm uses cyclical coordinate descent in a path-wise fashion, as described in the papers cited.

Maintained by Trevor Hastie. Last updated 2 years ago.

fortran cpp

4.0 match 82 stars 15.15 score 22k scripts 736 dependents

weiliang

powerMediation:Power/Sample Size Calculation for Mediation Analysis

Functions to calculate power and sample size for testing (1) mediation effects; (2) the slope in a simple linear regression; (3) odds ratio in a simple logistic regression; (4) mean change for longitudinal study with 2 time points; (5) interaction effect in 2-way ANOVA; and (6) the slope in a simple Poisson regression.

Maintained by Weiliang Qiu. Last updated 4 years ago.

14.9 match 3 stars 3.97 score 65 scripts 2 dependents

poissonconsulting

poispalette:Poisson Palettes

An R package for Poisson Consulting colour palettes.

Maintained by Evan Amies-Galonski. Last updated 2 months ago.

20.8 match 2.78 score 3 scripts

vigou3

actuar:Actuarial Functions and Heavy Tailed Distributions

Functions and data sets for actuarial science: modeling of loss distributions; risk theory and ruin theory; simulation of compound models, discrete mixtures and compound hierarchical models; credibility theory. Support for many additional probability distributions to model insurance loss size and frequency: 23 continuous heavy tailed distributions; the Poisson-inverse Gaussian discrete distribution; zero-truncated and zero-modified extensions of the standard discrete distributions. Support for phase-type distributions commonly used to compute ruin probabilities. Main reference: <doi:10.18637/jss.v025.i07>. Implementation of the Feller-Pareto family of distributions: <doi:10.18637/jss.v103.i06>.

Maintained by Vincent Goulet. Last updated 2 months ago.

openblas

6.1 match 12 stars 9.44 score 732 scripts 35 dependents

marberts

sps:Sequential Poisson Sampling

Sequential Poisson sampling is a variation of Poisson sampling for drawing probability-proportional-to-size samples with a given number of units, and is commonly used for price-index surveys. This package gives functions to draw stratified sequential Poisson samples according to the method by Ohlsson (1998, ISSN:0282-423X), as well as other order sample designs by Rosén (1997, <doi:10.1016/S0378-3758(96)00186-3>), and generate appropriate bootstrap replicate weights according to the generalized bootstrap method by Beaumont and Patak (2012, <doi:10.1111/j.1751-5823.2011.00166.x>).

Maintained by Steve Martin. Last updated 1 months ago.

official-statistics sampling statistics survey-sampling

11.0 match 4 stars 5.20 score 8 scripts
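
The sps entry above describes drawing probability-proportional-to-size samples with a given number of units. A simplified base R sketch of sequential Poisson sampling follows: each unit's uniform random number is divided by its relative size, and the units with the smallest ratios are kept. It does not use the sps package, has no stratification, and assumes no unit is large enough to be selected with certainty. The size values and the sample size are hypothetical.

```r
set.seed(42)
size <- c(12, 3, 45, 7, 20, 9, 31, 5, 16, 2)  # hypothetical size measure per unit
n    <- 3                                      # fixed number of units to draw

p <- size / sum(size)                          # relative sizes
stopifnot(all(n * p < 1))                      # simplification: no take-all units

# Ranking variable: uniform random number divided by relative size.
xi <- runif(length(size)) / p

# Keep the n units with the smallest ranking values.
sample_idx <- sort(order(xi)[seq_len(n)])
sample_idx
```

Larger units tend to get smaller ranking values and so are selected more often, while the sample size stays fixed at n, unlike ordinary Poisson sampling where it is random.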

schaubert

catdata:Categorical Data

This R-package contains examples from the book "Regression for Categorical Data", Tutz 2012, Cambridge University Press. The names of the examples refer to the chapter and the data set that is used.

Maintained by Gunther Schauberger. Last updated 1 years ago.

8.6 match 6.61 score 158 scripts 2 dependents

mitchelloharawild

distributional:Vectorised Probability Distributions

Vectorised distribution objects with tools for manipulating, visualising, and using probability distributions. Designed to allow model prediction outputs to return distributions rather than their parameters, allowing users to directly interact with predictive distributions in a data-oriented workflow. In addition to providing generic replacements for p/d/q/r functions, other useful statistics can be computed including means, variances, intervals, and highest density regions.

Maintained by Mitchell OHara-Wild. Last updated 2 months ago.

probability-distribution statistics vctrs

4.1 match 101 stars 13.50 score 744 scripts 384 dependents

adrian-bowman

sm:Smoothing Methods for Nonparametric Regression and Density Estimation

This is software linked to the book 'Applied Smoothing Techniques for Data Analysis - The Kernel Approach with S-Plus Illustrations' Oxford University Press.

Maintained by Adrian Bowman. Last updated 1 years ago.

fortran

8.0 match 1 stars 6.99 score 732 scripts 36 dependents

nicolettadangelo

stopp:Spatio-Temporal Point Pattern Methods, Model Fitting, Diagnostics, Simulation, Local Tests

Toolbox for different kinds of spatio-temporal analyses to be performed on observed point patterns, following the growing stream of literature on point process theory. This R package implements functions to perform different kinds of analyses on point processes, proposed in the papers (Siino, Adelfio, and Mateu 2018 <doi:10.1007/s00477-018-1579-0>; Siino et al. 2018 <doi:10.1002/env.2463>; Adelfio et al. 2020 <doi:10.1007/s00477-019-01748-1>; D’Angelo, Adelfio, and Mateu 2021 <doi:10.1016/j.spasta.2021.100534>; D’Angelo, Adelfio, and Mateu 2022 <doi:10.1007/s00362-022-01338-4>; D’Angelo, Adelfio, and Mateu 2023 <doi:10.1016/j.csda.2022.107679>). The main topics include modeling, statistical inference, and simulation issues on spatio-temporal point processes on Euclidean space and linear networks.

Maintained by Nicoletta DAngelo. Last updated 9 months ago.

33.5 match 1.60 score

flxzimmer

mlpwr:A Power Analysis Toolbox to Find Cost-Efficient Study Designs

We implement a surrogate modeling algorithm to guide simulation-based sample size planning. The method is described in detail in our paper (Zimmer & Debelak (2023) <doi:10.1037/met0000611>). It supports multiple study design parameters and optimization with respect to a cost function. It can find optimal designs that correspond to a desired statistical power or that fulfill a cost constraint. We also provide a tutorial paper (Zimmer et al. (2023) <doi:10.3758/s13428-023-02269-0>).

Maintained by Felix Zimmer. Last updated 5 months ago.

9.2 match 4 stars 5.83 score 16 scripts

epiverse-trace

superspreading:Understand Individual-Level Variation in Infectious Disease Transmission

Estimate and understand individual-level variation in transmission. Implements density and cumulative compound Poisson discrete distribution functions ('Kremer et al.' (2021) <doi:10.1038/s41598-021-93578-x>), as well as functions to calculate infectious disease outbreak statistics given epidemiological parameters on individual-level transmission; including the probability of an outbreak becoming an epidemic/extinct ('Kucharski et al.' (2020) <doi:10.1016/S1473-3099(20)30144-4>), or the cluster size statistics, e.g. what proportion of cases cause X% of transmission ('Lloyd-Smith et al.' (2005) <doi:10.1038/nature04153>).

Maintained by Joshua W. Lambert. Last updated 2 months ago.

disease-transmission epidemiology epiverse

7.5 match 4 stars 6.98 score 16 scripts

lhvanegasp

glmtoolbox:Set of Tools to Data Analysis using Generalized Linear Models

Set of tools for the statistical analysis of data using: (1) normal linear models; (2) generalized linear models; (3) negative binomial regression models as an alternative to Poisson regression models in the presence of overdispersion; (4) beta-binomial and random-clumped binomial regression models as an alternative to binomial regression models in the presence of overdispersion; (5) zero-inflated and zero-altered regression models to deal with excess zeros in count data; (6) generalized nonlinear models; (7) generalized estimating equations for cluster correlated data.

Maintained by Luis Hernando Vanegas. Last updated 8 months ago.

17.1 match 1 stars 3.00 score 149 scripts

fhernanb

DiscreteDists:Discrete Statistical Distributions

Implementation of new discrete statistical distributions. Each distribution includes the traditional functions as well as an additional function called the family function, which can be used to estimate parameters within the 'gamlss' framework.

Maintained by Freddy Hernandez-Barajas. Last updated 4 days ago.

cpp

13.4 match 3.81 score 1 scripts

vabar

vibass:Valencia International Bayesian Summer School

Materials for the introductory course on Bayesian inference. Practicals, data and interactive apps.

Maintained by Facundo Muñoz. Last updated 8 months ago.

bayesian-inference teaching

9.2 match 7 stars 5.40 score 2 scripts

mattocci27

ztpln:Zero-Truncated Poisson Lognormal Distribution

Functions for obtaining the density, random variates and maximum likelihood estimates of the zero-truncated Poisson lognormal distribution and its mixture distribution.

Maintained by Masatoshi Katabuchi. Last updated 3 years ago.

cpp

13.5 match 3.70 score 4 scripts

pilaboratory

sads:Maximum Likelihood Models for Species Abundance Distributions

Maximum likelihood tools to fit and compare models of species abundance distributions and of species rank-abundance distributions.

Maintained by Paulo I. Prado. Last updated 1 year ago.

5.8 match 23 stars 8.66 score 244 scripts 3 dependents

stephens999

ashr:Methods for Adaptive Shrinkage, using Empirical Bayes

The R package 'ashr' implements an Empirical Bayes approach for large-scale hypothesis testing and false discovery rate (FDR) estimation based on the methods proposed in M. Stephens, 2016, "False discovery rates: a new deal", <DOI:10.1093/biostatistics/kxw041>. These methods can be applied whenever two sets of summary statistics---estimated effects and standard errors---are available, just as 'qvalue' can be applied to previously computed p-values. Two main interfaces are provided: ash(), which is more user-friendly; and ash.workhorse(), which has more options and is geared toward advanced users. Both ash() and ash.workhorse() also provide a flexible modeling interface that can accommodate a variety of likelihoods (e.g., normal, Poisson) and mixture priors (e.g., uniform, normal).

Maintained by Peter Carbonetto. Last updated 10 months ago.

cpp

4.1 match 82 stars 12.10 score 780 scripts 15 dependents

rstudio

keras3:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.

Maintained by Tomasz Kalinowski. Last updated 3 days ago.

3.6 match 845 stars 13.57 score 264 scripts 2 dependents

monty-se

PINstimation:Estimation of the Probability of Informed Trading

A comprehensive bundle of utilities for the estimation of probability of informed trading models: original PIN in Easley and O'Hara (1992) and Easley et al. (1996); Multilayer PIN (MPIN) in Ersan (2016); Adjusted PIN (AdjPIN) in Duarte and Young (2009); and volume-synchronized PIN (VPIN) in Easley et al. (2011, 2012). Implementations of various estimation methods suggested in the literature are included. Additional features comprise posterior probabilities, an implementation of an expectation-maximization (EM) algorithm, and PIN decomposition into layers and into bad/good components. Versatile data simulation tools and trade classification algorithms are among the supplementary utilities. The package provides fast, compact, and precise utilities to tackle the sophisticated, error-prone, and time-consuming estimation procedure of informed trading, using only the raw trade-level data.

Maintained by Montasser Ghachem. Last updated 5 months ago.

clustering-analysis expectation-maximisation-algorithm hierarchical-clustering information-asymmetry market-microstructure maximum-likelihood-estimation mixture-distributions poisson-distribution

7.5 match 36 stars 6.48 score 14 scripts

swihart

rmutil:Utilities for Nonlinear Regression and Repeated Measurements Models

A toolkit of functions for nonlinear regression and repeated measurements not to be used by itself but called by other Lindsey packages such as 'gnlm', 'stable', 'growth', 'repeated', and 'event' (available at <https://www.commanster.eu/rcode.html>).

Maintained by Bruce Swihart. Last updated 2 years ago.

fortran

5.8 match 1 stars 8.35 score 358 scripts 70 dependents

rsbivand

splancs:Spatial and Space-Time Point Pattern Analysis

The Splancs package was written as an enhancement to S-Plus for display and analysis of spatial point pattern data; it has been ported to R and is in "maintenance mode".

Maintained by Roger Bivand. Last updated 10 months ago.

fortran

5.5 match 1 stars 8.72 score 592 scripts 53 dependents

bioc

scran:Methods for Single-Cell RNA-Seq Data Analysis

Implements miscellaneous functions for interpretation of single-cell RNA-seq data. Methods are provided for assignment of cell cycle phase, detection of highly variable and significantly correlated genes, identification of marker genes, and other common tasks in routine single-cell analysis workflows.

Maintained by Aaron Lun. Last updated 5 months ago.

immunooncology normalization sequencing rnaseq software geneexpression transcriptomics singlecell clustering bioconductor-package human-cell-atlas single-cell-rna-seq openblas cpp

3.6 match 41 stars 13.14 score 7.6k scripts 36 dependents

bioc

GeoDiff:Count model based differential expression and normalization on GeoMx RNA data

A series of statistical models using count generating distributions for background modelling, feature and sample QC, normalization and differential expression analysis on GeoMx RNA data. The application of these methods is demonstrated in an example data analysis vignette.

Maintained by Nicole Ortogero. Last updated 5 months ago.

geneexpression differentialexpression normalization openblas cpp openmp

8.5 match 8 stars 5.51 score 9 scripts

cran

PoiClaClu:Classification and Clustering of Sequencing Data Based on a Poisson Model

Implements the methods described in the paper, Witten (2011) Classification and Clustering of Sequencing Data using a Poisson Model, Annals of Applied Statistics 5(4) 2493-2518.

Maintained by Daniela Witten. Last updated 6 years ago.

12.0 match 3.81 score 107 scripts 2 dependents

owenward

ppdiag:Diagnosis and Visualizations Tools for Temporal Point Processes

A suite of diagnostic tools for univariate point processes. This includes tools for simulating and fitting both common and more complex temporal point processes. We also include functions to visualise these point processes and collect existing diagnostic tools of Brown et al. (2002) <doi:10.1162/08997660252741149> and Wu et al. (2021) <doi:10.1002/9781119821588.ch7>, which can be used to assess the fit of a chosen point process model.

Maintained by Owen G. Ward. Last updated 2 years ago.

8.8 match 5 stars 5.18 score 15 scripts

knoths

spc:Statistical Process Control -- Calculation of ARL and Other Control Chart Performance Measures

Evaluation of control charts by means of the zero-state, steady-state ARL (Average Run Length) and RL quantiles. Setting up control charts for given in-control ARL. The control charts under consideration are one- and two-sided EWMA, CUSUM, and Shiryaev-Roberts schemes for monitoring the mean or variance of normally distributed independent data. ARL calculations for the same set of schemes under drift (in the mean) are included. Finally, all ARL measures for the multivariate EWMA (MEWMA) are provided.

Maintained by Sven Knoth. Last updated 7 months ago.

openblas

13.7 match 5 stars 3.30 score 66 scripts 1 dependents

bayes-rules

bayesrules:Datasets and Supplemental Functions from Bayes Rules! Book

Provides datasets and functions used for analysis and visualizations in the Bayes Rules! book (<https://www.bayesrulesbook.com>). The package contains a set of functions that summarize and plot Bayesian models from some conjugate families and another set of functions for evaluation of some Bayesian models.

Maintained by Mine Dogucu. Last updated 3 years ago.

bayesian-statistics data

5.5 match 72 stars 8.06 score 466 scripts

zeemkr

ncpen:Unified Algorithm for Non-convex Penalized Estimation for Generalized Linear Models

An efficient unified nonconvex penalized estimation algorithm for Gaussian (linear), binomial Logit (logistic), Poisson, multinomial Logit, and Cox proportional hazard regression models. The unified algorithm is implemented based on the convex concave procedure and the algorithm can be applied to most of the existing nonconvex penalties. The algorithm also supports a convex penalty: least absolute shrinkage and selection operator (LASSO). Supported nonconvex penalties include smoothly clipped absolute deviation (SCAD), minimax concave penalty (MCP), truncated LASSO penalty (TLP), clipped LASSO (CLASSO), sparse ridge (SRIDGE), modified bridge (MBRIDGE) and modified log (MLOG). For high-dimensional data (data sets with many variables), the algorithm selects relevant variables producing a parsimonious regression model. Kim, D., Lee, S. and Kwon, S. (2018) <arXiv:1811.05061>, Lee, S., Kwon, S. and Kim, Y. (2016) <doi:10.1016/j.csda.2015.08.019>, Kwon, S., Lee, S. and Kim, Y. (2015) <doi:10.1016/j.csda.2015.07.001>. (This research is funded by Julian Virtue Professorship from Center for Applied Research at Pepperdine Graziadio Business School and the National Research Foundation of Korea.)

Maintained by Dongshin Kim. Last updated 6 years ago.

binomial classo cox gaussian high-dimensional-data lasso linear mbridge mcp mlog multinomial nonconvex-penalties poisson scad sridge tlp openblas cpp

11.5 match 8 stars 3.88 score 19 scripts

ousuga

RelDists:Estimation for some Reliability Distributions

Parameter estimation and linear regression models for the families of reliability distributions reviewed by Almalki & Nadarajah (2014) <doi:10.1016/j.ress.2013.11.010> using Generalized Additive Models for Location, Scale and Shape, aka GAMLSS by Rigby & Stasinopoulos (2005) <doi:10.1111/j.1467-9876.2005.00510.x>.

Maintained by Jaime Mosquera. Last updated 7 days ago.

7.8 match 3 stars 5.76 score 19 scripts

cran

flexmix:Flexible Mixture Modeling

A general framework for finite mixtures of regression models using the EM algorithm is implemented. The E-step and all data handling are provided, while the M-step can be supplied by the user to easily define new models. Existing drivers implement mixtures of standard linear models, generalized linear models and model-based clustering.

Maintained by Bettina Gruen. Last updated 15 days ago.

5.4 match 5 stars 8.19 score 113 dependents

f-rousset

spaMM:Mixed-Effect Models, with or without Spatial Random Effects

Inference based on models with or without spatially-correlated random effects, multivariate responses, or non-Gaussian random effects (e.g., Beta). Variation in residual variance (heteroscedasticity) can itself be represented by a mixed-effect model. Both classical geostatistical models (Rousset and Ferdy 2014 <doi:10.1111/ecog.00566>), and Markov random field models on irregular grids (as considered in the 'INLA' package, <https://www.r-inla.org>), can be fitted, with distinct computational procedures exploiting the sparse matrix representations for the latter case and other autoregressive models. Laplace approximations are used for likelihood or restricted likelihood. Penalized quasi-likelihood and other variants discussed in the h-likelihood literature (Lee and Nelder 2001 <doi:10.1093/biomet/88.4.987>) are also implemented.

Maintained by François Rousset. Last updated 9 months ago.

gsl cpp openmp

8.9 match 4.94 score 208 scripts 5 dependents

apwheele

ptools:Tools for Poisson Data

Functions used for analyzing count data, mostly crime counts. Includes checking difference in two Poisson counts (e-test), checking the fit for a Poisson distribution, small sample tests for counts in bins, Weighted Displacement Difference test (Wheeler and Ratcliffe, 2018) <doi:10.1186/s40163-018-0085-5>, to evaluate crime changes over time in treated/control areas. Additionally includes functions for aggregating spatial data and spatial feature engineering.

Maintained by Andrew Wheeler. Last updated 1 year ago.

crime-analysis criminal-justice criminology

9.9 match 5 stars 4.44 score 11 scripts

oobianom

quickcode:Quick and Essential 'R' Tricks for Better Scripts

The NOT functions, 'R' tricks, and a compilation of simple, quick, and frequently used 'R' code to improve your scripts. Improve the quality and reproducibility of 'R' scripts.

Maintained by Obinna Obianom. Last updated 12 days ago.

colors data distributions images

5.6 match 5 stars 7.76 score 7 scripts 6 dependents

tyee001

VGAMdata:Data Supporting the 'VGAM' Package

Mainly data sets to accompany the VGAM package and the book "Vector Generalized Linear and Additive Models: With an Implementation in R" (Yee, 2015) <DOI:10.1007/978-1-4939-2818-7>. These are used to illustrate vector generalized linear and additive models (VGLMs/VGAMs), and associated models (Reduced-Rank VGLMs, Quadratic RR-VGLMs, Row-Column Interaction Models, and constrained and unconstrained ordination models in ecology). This package now contains some old VGAM family functions which have been replaced by newer ones (often because they are now special cases).

Maintained by Thomas Yee. Last updated 1 month ago.

14.7 match 1 stars 2.94 score 95 scripts 1 dependents

cran

mgcv:Mixed GAM Computation Vehicle with Automatic Smoothness Estimation

Generalized additive (mixed) models, some of their extensions and other generalized ridge regression with multiple smoothing parameter estimation by (Restricted) Marginal Likelihood, Generalized Cross Validation and similar, or using iterated nested Laplace approximation for fully Bayesian inference. See Wood (2017) <doi:10.1201/9781315370279> for an overview. Includes a gam() function, a wide variety of smoothers, 'JAGS' support and distributions beyond the exponential family.

Maintained by Simon Wood. Last updated 1 year ago.

openblas openmp

3.4 match 32 stars 12.71 score 17k scripts 7.8k dependents

stan-dev

loo:Efficient Leave-One-Out Cross-Validation and WAIC for Bayesian Models

Efficient approximate leave-one-out cross-validation (LOO) for Bayesian models fit using Markov chain Monte Carlo, as described in Vehtari, Gelman, and Gabry (2017) <doi:10.1007/s11222-016-9696-4>. The approximation uses Pareto smoothed importance sampling (PSIS), a new procedure for regularizing importance weights. As a byproduct of the calculations, we also obtain approximate standard errors for estimated predictive errors and for the comparison of predictive errors between models. The package also provides methods for using stacking and other model weighting techniques to average Bayesian predictive distributions.

Maintained by Jonah Gabry. Last updated 1 day ago.

bayes bayesian bayesian-data-analysis bayesian-inference bayesian-methods bayesian-statistics cross-validation information-criterion model-comparison stan

2.5 match 152 stars 17.30 score 2.6k scripts 297 dependents

baddstats

spatstat.local:Extension to 'spatstat' for Local Composite Likelihood

Extension to the 'spatstat' package, enabling the user to fit point process models to point pattern data by local composite likelihood ('geographically weighted regression').

Maintained by Adrian Baddeley. Last updated 8 months ago.

spatial-analysis spatial-data spatstat

9.2 match 4.66 score 23 scripts

bioc

twoddpcr:Classify 2-d Droplet Digital PCR (ddPCR) data and quantify the number of starting molecules

The twoddpcr package takes Droplet Digital PCR (ddPCR) droplet amplitude data from Bio-Rad's QuantaSoft and can classify the droplets. A summary of the positive/negative droplet counts can be generated, which can then be used to estimate the number of molecules using the Poisson distribution. This is the first open source package that facilitates the automatic classification of general two channel ddPCR data. Previous work includes 'definetherain' (Jones et al., 2014) and 'ddpcRquant' (Trypsteen et al., 2015) which both handle one channel ddPCR experiments only. The 'ddpcr' package available on CRAN (Attali et al., 2016) supports automatic gating of a specific class of two channel ddPCR experiments only.

Maintained by Anthony Chiu. Last updated 5 months ago.

ddpcr software classification

7.3 match 10 stars 5.78 score 4 scripts
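
The Poisson step mentioned in the twoddpcr entry above can be illustrated with a minimal sketch. It assumes the standard ddPCR model in which molecules land in droplets independently, so the number per droplet is Poisson with mean lambda and the fraction of negative droplets estimates exp(-lambda). The sketch is written in Python rather than R, and the function names are hypothetical; this is not the twoddpcr API.

```python
import math


def estimate_copies_per_droplet(n_positive: int, n_negative: int) -> float:
    """Estimate the mean number of molecules per droplet (lambda).

    Under a Poisson model, P(droplet is negative) = exp(-lambda),
    so lambda is estimated by -ln(n_negative / n_total).
    """
    n_total = n_positive + n_negative
    if n_total == 0:
        raise ValueError("no droplets were counted")
    if n_negative == 0:
        # Every droplet is positive: the estimate would be infinite.
        raise ValueError("no negative droplets; lambda cannot be estimated")
    return -math.log(n_negative / n_total)


def estimate_total_molecules(n_positive: int, n_negative: int) -> float:
    """Estimate the number of starting molecules across all counted droplets."""
    lam = estimate_copies_per_droplet(n_positive, n_negative)
    return lam * (n_positive + n_negative)
```

With equal numbers of positive and negative droplets the estimate is ln(2) molecules per droplet, which is larger than the naive positive fraction of 0.5 because some positive droplets hold more than one molecule.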

boost-r

mboost:Model-Based Boosting

Functional gradient descent algorithm (boosting) for optimizing general risk functions utilizing component-wise (penalised) least squares estimates or regression trees as base-learners for fitting generalized linear, additive and interaction models to potentially high-dimensional data. Models and algorithms are described in <doi:10.1214/07-STS242>, a hands-on tutorial is available from <doi:10.1007/s00180-012-0382-5>. The package allows user-specified loss functions and base-learners.

Maintained by Torsten Hothorn. Last updated 4 months ago.

boosting-algorithms gam glm machine-learning mboost modelling r-language tutorials variable-selection openblas

3.3 match 72 stars 12.70 score 540 scripts 27 dependents

janoleko

LaMa:Fast Numerical Maximum Likelihood Estimation for Latent Markov Models

A variety of latent Markov models, including hidden Markov models, hidden semi-Markov models, state-space models and continuous-time variants can be formulated and estimated within the same framework via directly maximising the likelihood function using the so-called forward algorithm. Applied researchers often need custom models that standard software does not easily support. Writing tailored 'R' code offers flexibility but suffers from slow estimation speeds. We address these issues by providing easy-to-use functions (written in 'C++' for speed) for common tasks like the forward algorithm. These functions can be combined into custom models in a Lego-type approach, offering up to 10-20 times faster estimation via standard numerical optimisers. To aid in building fully custom likelihood functions, several vignettes are included that show how to simulate data from and estimate all the above model classes.

Maintained by Jan-Ole Koslik. Last updated 6 hours ago.

openblas cpp openmp

5.3 match 9 stars 7.84 score 42 scripts

greta-dev

greta:Simple and Scalable Statistical Modelling in R

Write statistical models in R and fit them by MCMC and optimisation on CPUs and GPUs, using Google 'TensorFlow'. greta lets you write your own model like in BUGS, JAGS and Stan, except that you write models right in R, it scales well to massive datasets, and it’s easy to extend and build on. See the website for more information, including tutorials, examples, package documentation, and the greta forum.

Maintained by Nicholas Tierney. Last updated 4 days ago.

3.3 match 566 stars 12.53 score 396 scripts 6 dependents

chandlerxiandeyang

CleaningValidation:Cleaning Validation Functions for Pharmaceutical Cleaning Process

Provides essential Cleaning Validation functions for complying with pharmaceutical cleaning process regulatory standards. The package includes non-parametric methods to analyze drug active-ingredient residue (DAR), cleaning agent residue (CAR), and microbial colonies (Mic) for non-Poisson distributions. Additionally, Poisson methods are provided for Mic analysis when Mic data follow a Poisson distribution.

Maintained by Xiande Yang. Last updated 10 months ago.

15.3 match 2.70 score

ohdsi

Cyclops:Cyclic Coordinate Descent for Logistic, Poisson and Survival Analysis

This model fitting tool incorporates cyclic coordinate descent and majorization-minimization approaches to fit a variety of regression models found in large-scale observational healthcare data. Implementations focus on computational optimization and fine-scale parallelization to yield efficient inference in massive datasets. Please see: Suchard, Simpson, Zorych, Ryan and Madigan (2013) <doi:10.1145/2414416.2414791>.

Maintained by Marc A. Suchard. Last updated 3 months ago.

hades cpp

4.5 match 39 stars 9.05 score 73 scripts 4 dependents

therneau

survival:Survival Analysis

Contains the core survival analysis routines, including definition of Surv objects, Kaplan-Meier and Aalen-Johansen (multi-state) curves, Cox models, and parametric accelerated failure time models.

Maintained by Terry M Therneau. Last updated 3 months ago.

2.0 match 400 stars 20.43 score 29k scripts 3.9k dependents

mattheaphy

offsetreg:An Extension of 'Tidymodels' Supporting Offset Terms

Extend the 'tidymodels' ecosystem <https://www.tidymodels.org/> to enable the creation of predictive models with offset terms. Models with offsets are most useful when working with count data or when fitting an adjustment model on top of an existing model with a prior expectation. The former situation is common in insurance where data is often weighted by exposures. The latter is common in life insurance where industry mortality tables are often used as a starting point for setting assumptions.

Maintained by Matt Heaphy. Last updated 20 days ago.

9.1 match 2 stars 4.48 score 4 scripts

yili-hong

poibin:The Poisson Binomial Distribution

Implementation of both the exact and approximation methods for computing the cdf of the Poisson binomial distribution as described in Hong (2013) <doi:10.1016/j.csda.2012.10.006>. It also provides the pmf, quantile function, and random number generation for the Poisson binomial distribution. The C code for fast Fourier transformation (FFT) is written by R Core Team (2019) <https://www.R-project.org/>, which implements the FFT algorithm in Singleton (1969) <doi:10.1109/TAU.1969.1162042>.

Maintained by Yili Hong. Last updated 7 months ago.

8.1 match 3 stars 4.96 score 80 scripts 9 dependents
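
For readers unfamiliar with the distribution named in the poibin entry above: a Poisson binomial variable is the number of successes in independent Bernoulli trials whose success probabilities may differ. The sketch below computes its exact pmf and cdf with a simple trial-by-trial convolution. This is a swapped-in illustration in Python, not the FFT-based computation the entry refers to and not the poibin API; the function names are hypothetical.

```python
def poisson_binomial_pmf(probs):
    """Return [P(S = 0), ..., P(S = n)] for S = sum of independent Bernoulli(p_i)."""
    pmf = [1.0]  # with zero trials, S = 0 with probability 1
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie in [0, 1]")
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this trial fails: count stays at k
            nxt[k + 1] += mass * p      # this trial succeeds: count moves to k + 1
        pmf = nxt
    return pmf


def poisson_binomial_cdf(probs, k):
    """Return P(S <= k)."""
    if k < 0:
        return 0.0
    return sum(poisson_binomial_pmf(probs)[: k + 1])
```

The convolution costs on the order of n squared operations for n trials, which is fine for small n; transform-based methods are the usual choice when n is large.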

yuepan027

scpoisson:Single Cell Poisson Probability Paradigm

Useful to visualize the Poissoneity (an independent Poisson statistical framework, where each RNA measurement for each cell comes from its own independent Poisson distribution) of Unique Molecular Identifier (UMI) based single cell RNA sequencing (scRNA-seq) data, and explore cell clustering based on model departure as a novel data representation.

Maintained by Yue Pan. Last updated 3 years ago.

14.8 match 2.70 score 4 scripts

pbs-assess

sdmTMB:Spatial and Spatiotemporal SPDE-Based GLMMs with 'TMB'

Implements spatial and spatiotemporal GLMMs (Generalized Linear Mixed Effect Models) using 'TMB', 'fmesher', and the SPDE (Stochastic Partial Differential Equation) Gaussian Markov random field approximation to Gaussian random fields. One common application is for spatially explicit species distribution models (SDMs). See Anderson et al. (2024) <doi:10.1101/2022.03.24.485545>.

Maintained by Sean C. Anderson. Last updated 7 hours ago.

ecology glmm spatial-analysis species-distribution-modelling tmb cpp

3.7 match 203 stars 10.71 score 848 scripts 1 dependents

amalan-constat

fitODBOD:Modeling Over Dispersed Binomial Outcome Data Using BMD and ABD

Contains Probability Mass Functions, Cumulative Mass Functions, Negative Log Likelihood value, parameter estimation and modeling data using Binomial Mixture Distributions (BMD) (Manoj et al. (2013) <doi:10.5539/ijsp.v2n2p24>) and Alternate Binomial Distributions (ABD) (Paul (1985) <doi:10.1080/03610928508828990>). A journal article on using the package is also available (<doi:10.21105/joss.01505>).

Maintained by Amalan Mahendran. Last updated 4 months ago.

binomial-outcome-data overdispersion

8.7 match 1 stars 4.44 score 139 scripts

giorgilancs

PrevMap:Geostatistical Modelling of Spatially Referenced Prevalence Data

Provides functions for both likelihood-based and Bayesian analysis of spatially referenced prevalence data. For a tutorial on the use of the R package, see Giorgi and Diggle (2017) <doi:10.18637/jss.v078.i08>.

Maintained by Emanuele Giorgi. Last updated 2 years ago.

8.9 match 4.36 score 46 scripts

lbb220

GWmodel:Geographically-Weighted Models

Techniques from a particular branch of spatial statistics, termed geographically-weighted (GW) models. GW models suit situations when data are not described well by some global model, but where there are spatial regions where a suitably localised calibration provides a better description. 'GWmodel' includes functions to calibrate: GW summary statistics (Brunsdon et al., 2002) <doi:10.1016/s0198-9715(01)00009-6>, GW principal components analysis (Harris et al., 2011) <doi:10.1080/13658816.2011.554838>, GW discriminant analysis (Brunsdon et al., 2007) <doi:10.1111/j.1538-4632.2007.00709.x> and various forms of GW regression (Brunsdon et al., 1996) <doi:10.1111/j.1538-4632.1996.tb00936.x>; some of which are provided in basic and robust (outlier resistant) forms.

Maintained by Binbin Lu. Last updated 6 months ago.

openblas cpp openmp

6.0 match 18 stars 6.38 score 266 scripts 4 dependents

lbbe-software

fitdistrplus:Help to Fit of a Parametric Distribution to Non-Censored or Censored Data

Extends the fitdistr() function (of the MASS package) with several functions to help the fit of a parametric distribution to non-censored or censored data. Censored data may contain left censored, right censored and interval censored values, with several lower and upper bounds. In addition to maximum likelihood estimation (MLE), the package provides moment matching (MME), quantile matching (QME), maximum goodness-of-fit estimation (MGE) and maximum spacing estimation (MSE) methods (available only for non-censored data). Weighted versions of MLE, MME, QME and MSE are available. See e.g. Casella & Berger (2002), Statistical inference, Pacific Grove, for a general introduction to parametric estimation.

Maintained by Aurélie Siberchicot. Last updated 11 days ago.

2.4 match 54 stars 16.15 score 4.5k scripts 153 dependents

m-signo

ptmixed:Poisson-Tweedie Generalized Linear Mixed Model

Fits the Poisson-Tweedie generalized linear mixed model described in Signorelli et al. (2021, <doi:10.1177/1471082X20936017>). The likelihood approximation is based on an adaptive Gauss-Hermite quadrature rule.

Maintained by Mirko Signorelli. Last updated 3 years ago.

17.7 match 2.15 score 14 scripts

silvaneojunior

kDGLM:Bayesian Analysis of Dynamic Generalized Linear Models

Provides routines for filtering and smoothing, forecasting, sampling and Bayesian analysis of Dynamic Generalized Linear Models using the methodology described in Alves et al. (2024) <doi:10.48550/arXiv.2201.05387> and dos Santos Jr. et al. (2024) <doi:10.48550/arXiv.2403.13069>.

Maintained by Silvaneo Vieira dos Santos Junior. Last updated 2 days ago.

6.7 match 2 stars 5.70 score 9 scripts

maxmenssen

predint:Prediction Intervals

An implementation of prediction intervals for overdispersed count data, for overdispersed binomial data and for linear random effects models.

Maintained by Max Menssen. Last updated 4 months ago.

12.5 match 3.00 score 4 scripts

carloshellin

LearningRlab:Statistical Learning Functions

Aids in learning statistical functions by showing the result of the calculation performed by each function and how it is obtained, that is, which equation and variables are used. Detailed explanations and interactive exercises are also included for all these equations and their related variables. Together, these features help the package user learn the basics of statistics.

Maintained by Carlos Javier Hellin Asensio. Last updated 2 years ago.

10.1 match 3.64 score 44 scripts

bayesiandemography

bage:Bayesian Estimation and Forecasting of Age-Specific Rates

Fast Bayesian estimation and forecasting of age-specific rates, probabilities, and means, based on 'Template Model Builder'.

Maintained by John Bryant. Last updated 2 months ago.

cpp

5.0 match 3 stars 7.30 score 39 scripts

jthaman

ciTools:Confidence or Prediction Intervals, Quantiles, and Probabilities for Statistical Models

Functions to append confidence intervals, prediction intervals, and other quantities of interest to data frames. All appended quantities are for the response variable, after conditioning on the model and covariates. This package has a data frame first syntax that allows for easy piping. Currently supported models include (log-) linear, (log-) linear mixed, generalized linear models, generalized linear mixed models, and accelerated failure time models.

Maintained by John Haman. Last updated 1 year ago.

lme4 modeling tidyverse

3.7 match 107 stars 9.85 score 148 scripts 5 dependents

anjapago

ocp:Bayesian Online Changepoint Detection

Implements the Bayesian online changepoint detection method by Adams and MacKay (2007) <arXiv:0710.3742> for univariate or multivariate data. Gaussian and Poisson probability models are implemented. Provides post-processing functions with alternative ways to extract changepoints.

Maintained by Andrea Pagotto. Last updated 6 years ago.

9.0 match 1 stars 4.06 score 23 scripts

rstudio

tfprobability:Interface to 'TensorFlow Probability'

Interface to 'TensorFlow Probability', a 'Python' library built on 'TensorFlow' that makes it easy to combine probabilistic models and deep learning on modern hardware ('TPU', 'GPU'). 'TensorFlow Probability' includes a wide selection of probability distributions and bijectors, probabilistic layers, variational inference, Markov chain Monte Carlo, and optimizers such as Nelder-Mead, BFGS, and SGLD.

Maintained by Tomasz Kalinowski. Last updated 3 years ago.

4.1 match 54 stars 8.63 score 221 scripts 3 dependents

jlaake

RMark:R Code for Mark Analysis

An interface to the software package MARK that constructs input files for MARK and extracts the output. MARK was developed by Gary White and is freely available at <http://www.phidot.org/software/mark/downloads/> but is not open source.

Maintained by Jeff Laake. Last updated 3 years ago.

7.2 match 4.90 score 366 scripts 4 dependents

cran

bivpois:Bivariate Poisson Distribution

Maximum likelihood estimation, random values generation, density computation and other functions for the bivariate Poisson distribution. References include: Kawamura K. (1984). "Direct calculation of maximum likelihood estimator for the bivariate Poisson distribution". Kodai Mathematical Journal, 7(2): 211--221. <doi:10.2996/kmj/1138036908>. Kocherlakota S. and Kocherlakota K. (1992). "Bivariate discrete distributions". CRC Press. <doi:10.1201/9781315138480>. Karlis D. and Ntzoufras I. (2003). "Analysis of sports data by using bivariate Poisson models". Journal of the Royal Statistical Society: Series D (The Statistician), 52(3): 381--393. <doi:10.1111/1467-9884.00366>.

Maintained by Michail Tsagris. Last updated 2 months ago.

16.8 match 4 stars 2.08 score 1 dependents

bioc

mirTarRnaSeq:mirTarRnaSeq

mirTarRnaSeq R package can be used for interactive mRNA miRNA sequencing statistical analysis. This package utilizes expression or differential expression mRNA and miRNA sequencing results and performs interactive correlation and various GLMs (Regular GLM, Multivariate GLM, and Interaction GLMs) analysis between mRNA and miRNA experiments. These experiments can be time point experiments and/or condition experiments.

Maintained by Mercedeh Movassagh. Last updated 5 months ago.

mirna regression software sequencing smallrna timecourse differentialexpression

8.7 match 4.00 score 9 scripts

ikosmidis

brglm2:Bias Reduction in Generalized Linear Models

Estimation and inference from generalized linear models based on various methods for bias reduction and maximum penalized likelihood with powers of the Jeffreys prior as penalty. The 'brglmFit' fitting method can achieve reduction of estimation bias by solving either the mean bias-reducing adjusted score equations in Firth (1993) <doi:10.1093/biomet/80.1.27> and Kosmidis and Firth (2009) <doi:10.1093/biomet/asp055>, or the median bias-reduction adjusted score equations in Kenne et al. (2017) <doi:10.1093/biomet/asx046>, or through the direct subtraction of an estimate of the bias of the maximum likelihood estimator from the maximum likelihood estimates as in Cordeiro and McCullagh (1991) <https://www.jstor.org/stable/2345592>. See Kosmidis et al (2020) <doi:10.1007/s11222-019-09860-6> for more details. Estimation in all cases takes place via a quasi Fisher scoring algorithm, and S3 methods for the construction of confidence intervals for the reduced-bias estimates are provided. In the special case of generalized linear models for binomial and multinomial responses (both ordinal and nominal), the adjusted score approaches to mean and median bias reduction have been found to return estimates with improved frequentist properties, that are also always finite, even in cases where the maximum likelihood estimates are infinite (e.g. complete and quasi-complete separation; see Kosmidis and Firth, 2020 <doi:10.1093/biomet/asaa052>, for a proof for mean bias reduction in logistic regression).

Maintained by Ioannis Kosmidis. Last updated 6 months ago.

adjusted-score-equations algorithms bias-reducing-adjustments bias-reduction estimation glm logistic-regression nominal-responses ordinal-responses regression regression-algorithms statistics

3.3 match 32 stars 10.41 score 106 scripts 10 dependents

statnet

ergm.count:Fit, Simulate and Diagnose Exponential-Family Models for Networks with Count Edges

A set of extensions for the 'ergm' package to fit weighted networks whose edge weights are counts. See Krivitsky (2012) <doi:10.1214/12-EJS696> and Krivitsky, Hunter, Morris, and Klumb (2023) <doi:10.18637/jss.v105.i06>.

Maintained by Pavel N. Krivitsky. Last updated 4 months ago.

3.9 match 10 stars 8.78 score 140 scripts 1 dependents

mauroflorez

MultRegCMP:Bayesian Multivariate Conway-Maxwell-Poisson Regression Model for Correlated Count Data

Fits a Bayesian Regression Model for multivariate count data. This model assumes that the data is distributed according to the Conway-Maxwell-Poisson distribution, and each response variable can be associated with different covariates. The model accounts for correlations between the counts by using latent effects based on the Chib and Winkelmann (2001) <http://www.jstor.org/stable/1392277> proposal.

Maintained by Mauro Florez. Last updated 9 months ago.

10.2 match 3.30 score 4 scripts

cran

PNAR:Poisson Network Autoregressive Models

Quasi likelihood-based methods for estimating linear and log-linear Poisson Network Autoregression models with p lags and covariates. Tools for testing the linearity versus several non-linear alternatives. Tools for simulation of multivariate count distributions, from linear and non-linear PNAR models, by using a specific copula construction. References include: Armillotta, M. and K. Fokianos (2023). "Nonlinear network autoregression". Annals of Statistics, 51(6): 2526--2552. <doi:10.1214/23-AOS2345>. Armillotta, M. and K. Fokianos (2024). "Count network autoregression". Journal of Time Series Analysis, 45(4): 584--612. <doi:10.1111/jtsa.12728>. Armillotta, M., Tsagris, M. and Fokianos, K. (2024). "Inference for Network Count Time Series with the R Package PNAR". The R Journal, 15/4: 255--269. <doi:10.32614/RJ-2023-094>.

Maintained by Michail Tsagris. Last updated 6 months ago.

33.4 match 1.00 score

andrisignorell

DescTools:Tools for Descriptive Statistics

A collection of miscellaneous basic statistic functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox, which facilitates the (notoriously time consuming) first descriptive tasks in data analysis, consisting of calculating descriptive statistics, drawing graphical summaries and reporting the results. The package contains furthermore functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here, was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages which are not needed at all), and to provide a common and consistent interface as far as function and arguments naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in absence of convincing alternatives). The 'BigCamelCase' style was consequently applied to functions borrowed from contributed R packages as well.

Maintained by Andri Signorell. Last updated 9 days ago.

fortran cpp

2.0 match 87 stars 16.68 score 7.7k scripts 99 dependents

tidymodels

parsnip:A Common API to Modeling and Analysis Functions

A common interface is provided to allow users to specify a model without having to remember the different argument names across different functions or computational engines (e.g. 'R', 'Spark', 'Stan', 'H2O', etc).

Maintained by Max Kuhn. Last updated 3 days ago.

2.0 match 612 stars 16.37 score 3.4k scripts 69 dependents

iiasa

ibis.iSDM:Modelling framework for integrated biodiversity distribution scenarios

Integrated framework for modelling the distribution of species and ecosystems in a suitability framing. This package allows the estimation of integrated species distribution models (iSDM) based on several sources of evidence and provided presence-only and presence-absence datasets. It makes heavy use of point-process models for estimating habitat suitability and allows spatial latent effects and priors to be included in the estimation. To do so 'ibis.iSDM' supports a number of engines for Bayesian and more non-parametric machine learning estimation. Further, 'ibis.iSDM' is specifically customized to support spatial-temporal projections of habitat suitability into the future.

Maintained by Martin Jung. Last updated 4 months ago.

bayesian biodiversity integrated-framework poisson-process scenarios sdm spatial-grain spatial-predictions species-distribution-modelling

7.5 match 21 stars 4.36 score 12 scripts 1 dependents

jacob-long

jtools:Analysis and Presentation of Social Scientific Data

This is a collection of tools for more efficiently understanding and sharing the results of (primarily) regression analyses. There are also a number of miscellaneous functions for statistical and programming purposes. Support for models produced by the survey and lme4 packages are points of emphasis.

Maintained by Jacob A. Long. Last updated 6 months ago.

social-sciences

2.2 match 167 stars 14.48 score 4.0k scripts 14 dependents

wenjie2wang

reda:Recurrent Event Data Analysis

Contains implementations of recurrent event data analysis routines including (1) survival and recurrent event data simulation from stochastic process point of view by the thinning method proposed by Lewis and Shedler (1979) <doi:10.1002/nav.3800260304> and the inversion method introduced in Cinlar (1975, ISBN:978-0486497976), (2) the mean cumulative function (MCF) estimation by the Nelson-Aalen estimator of the cumulative hazard rate function, (3) two-sample recurrent event responses comparison with the pseudo-score tests proposed by Lawless and Nadeau (1995) <doi:10.2307/1269617>, (4) gamma frailty model with spline rate function following Fu, et al. (2016) <doi:10.1080/10543406.2014.992524>.

Maintained by Wenjie Wang. Last updated 1 years ago.

mcf mean-cumulative-function recurrent-event survival-analysis cpp

4.2 match 15 stars 7.52 score 55 scripts 3 dependents

animint

animint2:Animated Interactive Grammar of Graphics

Functions are provided for defining animated, interactive data visualizations in R code, and rendering on a web page. The 2018 Journal of Computational and Graphical Statistics paper, <doi:10.1080/10618600.2018.1513367> describes the concepts implemented.

Maintained by Toby Hocking. Last updated 26 days ago.

3.5 match 64 stars 8.87 score 173 scripts

genentech

psborrow2:Bayesian Dynamic Borrowing Analysis and Simulation

Bayesian dynamic borrowing is an approach to incorporating external data to supplement a randomized, controlled trial analysis in which external data are incorporated in a dynamic way (e.g., based on similarity of outcomes); see Viele 2013 <doi:10.1002/pst.1589> for an overview. This package implements the hierarchical commensurate prior approach to dynamic borrowing as described in Hobbs 2011 <doi:10.1111/j.1541-0420.2011.01564.x>. There are three main functionalities. First, 'psborrow2' provides a user-friendly interface for applying dynamic borrowing on the study results and handles the Markov Chain Monte Carlo sampling on behalf of the user. Second, 'psborrow2' provides a simulation framework to compare different borrowing parameters (e.g. full borrowing, no borrowing, dynamic borrowing) and other trial and borrowing characteristics (e.g. sample size, covariates) in a unified way. Third, 'psborrow2' provides a set of functions to generate data for simulation studies, and also allows the user to specify their own data generation process. This package is designed to use the sampling functions from 'cmdstanr' which can be installed from <https://stan-dev.r-universe.dev>.

Maintained by Matt Secrest. Last updated 1 months ago.

bayesian-dynamic-borrowing psborrow2 simulation-study

3.9 match 18 stars 7.87 score 16 scripts

radiant-rstats

radiant.basics:Basics Menu for Radiant: Business Analytics using R and Shiny

The Radiant Basics menu includes interfaces for probability calculation, central limit theorem simulation, comparing means and proportions, goodness-of-fit testing, cross-tabs, and correlation. The application extends the functionality in 'radiant.data'.

Maintained by Vincent Nijs. Last updated 10 months ago.

5.5 match 8 stars 5.56 score 79 scripts 3 dependents

tidymodels

multilevelmod:Model Wrappers for Multi-Level Models

Bindings for hierarchical regression models for use with the 'parsnip' package. Models include longitudinal generalized linear models (Liang and Zeger, 1986) <doi:10.1093/biomet/73.1.13>, and mixed-effect models (Pinheiro and Bates) <doi:10.1007/978-1-4419-0318-1_1>.

Maintained by Hannah Frick. Last updated 5 months ago.

3.8 match 74 stars 8.12 score 239 scripts

frankportman

bayesAB:Fast Bayesian Methods for AB Testing

A suite of functions that allow the user to analyze A/B test data in a Bayesian framework. Intended to be a drop-in replacement for common frequentist hypothesis tests such as the t-test and chi-sq test.

Maintained by Frank Portman. Last updated 4 years ago.

ab-testing bayesian-methods bayesian-tests cpp

4.1 match 308 stars 7.43 score 88 scripts

ohdsi

EmpiricalCalibration:Routines for Performing Empirical Calibration of Observational Study Estimates

Routines for performing empirical calibration of observational study estimates. By using a set of negative control hypotheses we can estimate the empirical null distribution of a particular observational study setup. This empirical null distribution can be used to compute a calibrated p-value, which reflects the probability of observing an estimated effect size when the null hypothesis is true taking both random and systematic error into account. A similar approach can be used to calibrate confidence intervals, using both negative and positive controls. For more details, see Schuemie et al. (2013) <doi:10.1002/sim.5925> and Schuemie et al. (2018) <doi:10.1073/pnas.1708282114>.

Maintained by Martijn Schuemie. Last updated 29 days ago.

hades cpp

3.5 match 10 stars 8.51 score 151 scripts 1 dependents

danielturek

nimbleSCR:Spatial Capture-Recapture (SCR) Methods Using 'nimble'

Provides utility functions, distributions, and fitting methods for Bayesian Spatial Capture-Recapture (SCR) and Open Population Spatial Capture-Recapture (OPSCR) modelling using the nimble package (de Valpine et al. 2017 <doi:10.1080/10618600.2016.1172487>). Development of the package was motivated primarily by the need for flexible and efficient analysis of large-scale SCR data (Bischof et al. 2020 <doi:10.1073/pnas.2011383117>). Computational methods and techniques implemented in nimbleSCR include those discussed in Turek et al. 2021 <doi:10.1002/ecs2.3385>, among others. For a recent application of nimbleSCR, see Milleret et al. (2021) <doi:10.1098/rsbl.2021.0128>.

Maintained by Daniel Turek. Last updated 2 years ago.

7.0 match 4.29 score 388 scripts

jwb133

smcfcs:Multiple Imputation of Covariates by Substantive Model Compatible Fully Conditional Specification

Implements multiple imputation of missing covariates by Substantive Model Compatible Fully Conditional Specification. This is a modification of the popular FCS/chained equations multiple imputation approach, and allows imputation of missing covariate values from models which are compatible with the user specified substantive model.

Maintained by Jonathan Bartlett. Last updated 15 hours ago.

3.3 match 11 stars 9.00 score 59 scripts 1 dependents

mastoffel

rptR:Repeatability Estimation for Gaussian and Non-Gaussian Data

Estimating repeatability (intra-class correlation) from Gaussian, binary, proportion and Poisson data.

Maintained by Martin Stoffel. Last updated 6 months ago.

3.5 match 17 stars 8.53 score 112 scripts 2 dependents

kisungyou

Rdimtools:Dimension Reduction and Estimation Methods

We provide linear and nonlinear dimension reduction techniques. Intrinsic dimension estimation methods for exploratory analysis are also provided. For more details on the package, see the paper by You and Shung (2022) <doi:10.1016/j.simpa.2022.100414>.

Maintained by Kisung You. Last updated 2 years ago.

dimension-estimation dimension-reduction manifold-learning subspace-learning openblas cpp openmp

3.5 match 52 stars 8.37 score 186 scripts 8 dependents

cran

gp:Maximum Likelihood Estimation of the Generalized Poisson Distribution

Functions to estimate the parameters of the generalized Poisson distribution with or without covariates using maximum likelihood. The references include Nikoloulopoulos A.K. & Karlis D. (2008). "On modeling count data: a comparison of some well-known discrete distributions". Journal of Statistical Computation and Simulation, 78(3): 437--457, <doi:10.1080/10629360601010760> and Consul P.C. & Famoye F. (1992). "Generalized Poisson regression model". Communications in Statistics - Theory and Methods, 21(1): 89--109, <doi:10.1080/03610929208830766>.

Maintained by Michail Tsagris. Last updated 1 years ago.

14.5 match 2.01 score 17 scripts 2 dependents

kkholst

lava:Latent Variable Models

A general implementation of Structural Equation Models with latent variables (MLE, 2SLS, and composite likelihood estimators) with both continuous, censored, and ordinal outcomes (Holst and Budtz-Joergensen (2013) <doi:10.1007/s00180-012-0344-y>). Mixture latent variable models and non-linear latent variable models (Holst and Budtz-Joergensen (2020) <doi:10.1093/biostatistics/kxy082>). The package also provides methods for graph exploration (d-separation, back-door criterion), simulation of general non-linear latent variable models, and estimation of influence functions for a broad range of statistical models.

Maintained by Klaus K. Holst. Last updated 2 months ago.

latent-variable-models simulation statistics structural-equation-models

2.3 match 33 stars 12.85 score 610 scripts 476 dependents

promerpr

scanstatistics:Space-Time Anomaly Detection using Scan Statistics

Detection of anomalous space-time clusters using the scan statistics methodology. Focuses on prospective surveillance of data streams, scanning for clusters with ongoing anomalies. Hypothesis testing is made possible by Monte Carlo simulation. Allévius (2018) <doi:10.21105/joss.00515>.

Maintained by Paul Romer Present. Last updated 2 years ago.

cpp

6.0 match 1 stars 4.81 score 43 scripts

tidymodels

poissonreg:Model Wrappers for Poisson Regression

Bindings for Poisson regression models for use with the 'parsnip' package. Models include simple generalized linear models, Bayesian models, and zero-inflated Poisson models (Zeileis, Kleiber, and Jackman (2008) <doi:10.18637/jss.v027.i08>).

Maintained by Hannah Frick. Last updated 4 months ago.

3.9 match 22 stars 7.26 score 342 scripts 1 dependents

gmcmacran

LRTesteR:Likelihood Ratio Tests and Confidence Intervals

A collection of hypothesis tests and confidence intervals based on the likelihood ratio <https://en.wikipedia.org/wiki/Likelihood-ratio_test>.

Maintained by Greg McMahan. Last updated 6 months ago.

4.9 match 5.83 score 168 scripts

jonasmoss

univariateML:Maximum Likelihood Estimation for Univariate Densities

User-friendly maximum likelihood estimation (Fisher (1921) <doi:10.1098/rsta.1922.0009>) of univariate densities.

Maintained by Jonas Moss. Last updated 12 days ago.

density estimation maximum-likelihood

3.5 match 8 stars 8.10 score 62 scripts 7 dependents

lynettecaitlin

oHMMed:HMMs with Ordered Hidden States and Emission Densities

Inference using a class of Hidden Markov models (HMMs) called 'oHMMed' (ordered HMM with emission densities <doi:10.1186/s12859-024-05751-4>): The 'oHMMed' algorithms identify the number of comparably homogeneous regions within observed sequences with autocorrelation patterns. These are modelled as discrete hidden states; the observed data points are then realisations of continuous probability distributions with state-specific means that enable ordering of these distributions. The observed sequence is labelled according to the hidden states, permitting only neighbouring states that are also neighbours within the ordering of their associated distributions. The parameters that characterise these state-specific distributions are then inferred. Relevant for application to genomic sequences, time series, or any other sequence data with serial autocorrelation.

Maintained by Michal Majka. Last updated 1 months ago.

8.6 match 2 stars 3.30 score 4 scripts

dgbonett

statpsych:Statistical Methods for Psychologists

Implements confidence interval and sample size methods that are especially useful in psychological research. The methods can be applied in 1-group, 2-group, paired-samples, and multiple-group designs and to a variety of parameters including means, medians, proportions, slopes, standardized mean differences, standardized linear contrasts of means, plus several measures of correlation and association. Confidence interval and sample size functions are given for single parameters as well as differences, ratios, and linear contrasts of parameters. The sample size functions can be used to approximate the sample size needed to estimate a parameter or function of parameters with desired confidence interval precision or to perform a variety of hypothesis tests (directional two-sided, equivalence, superiority, noninferiority) with desired power. For details see: Statistical Methods for Psychologists, Volumes 1 – 4, <https://dgbonett.sites.ucsc.edu/>.

Maintained by Douglas G. Bonett. Last updated 3 months ago.

5.8 match 6 stars 4.83 score 15 scripts 1 dependents

tidymodels

yardstick:Tidy Characterizations of Model Performance

Tidy tools for quantifying how well a model fits a data set such as confusion matrices, class probability curve summaries, and regression metrics (e.g., RMSE).

Maintained by Emil Hvitfeldt. Last updated 3 days ago.

1.8 match 387 stars 15.47 score 2.2k scripts 60 dependents

bioc

frenchFISH:Poisson Models for Quantifying DNA Copy-number from FISH Images of Tissue Sections

FrenchFISH comprises a nuclear volume correction method coupled with two types of Poisson models: either a Poisson model for improved manual spot counting without the need for control probes; or a homogeneous Poisson Point Process model for automated spot counting.

Maintained by Adam Berman. Last updated 5 months ago.

software biomedicalinformatics cellbiology genetics hiddenmarkovmodel preprocessing

7.0 match 4.00 score 3 scripts

bioc

transformGamPoi:Variance Stabilizing Transformation for Gamma-Poisson Models

Variance-stabilizing transformations help with the analysis of heteroskedastic data (i.e., data where the variance is not constant, like count data). This package provides two types of variance-stabilizing transformations: (1) methods based on the delta method (e.g., 'acosh', 'log(x+1)'), (2) model residual based (Pearson and randomized quantile residuals).

Maintained by Constantin Ahlmann-Eltze. Last updated 5 months ago.

singlecell normalization preprocessing regression cpp

4.7 match 21 stars 5.95 score 21 scripts
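
As a rough illustration of the delta-method idea named in the transformGamPoi entry above, here is a minimal Python sketch. It is not the package's R code. It assumes a gamma-Poisson variance of the form mu + alpha * mu^2, and the names `acosh_transform`, `shifted_log_transform`, `slope`, and the parameter `alpha` are hypothetical choices made for this example.

```python
import math

def acosh_transform(y, alpha):
    """Delta-method transform for counts with Var = mu + alpha * mu**2 (assumed)."""
    if alpha <= 0:
        # Poisson limit (alpha -> 0): the transform reduces to 2 * sqrt(y).
        return 2.0 * math.sqrt(y)
    return math.acosh(2.0 * alpha * y + 1.0) / math.sqrt(alpha)

def shifted_log_transform(y):
    """The simpler log(x + 1) transform named in the description."""
    return math.log1p(y)

def slope(f, x, h=1e-5):
    """Central finite-difference derivative, used only to check the transform."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

if __name__ == "__main__":
    alpha, mu = 0.1, 50.0
    g_prime = slope(lambda v: acosh_transform(v, alpha), mu)
    # A delta-method variance-stabilizing transform has derivative 1 / sqrt(Var(mu)).
    print(g_prime, 1.0 / math.sqrt(mu + alpha * mu ** 2))
```

The two printed numbers should agree closely, which is the defining property of a delta-method transform: its slope at the mean is the reciprocal of the standard deviation at that mean.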

jhmaindonald

qra:Quantal Response Analysis for Dose-Mortality Data

Functions are provided that implement Fieller's formula methodology for calculating a confidence interval for a ratio of (commonly, correlated) means. See Fieller (1954) <doi:10.1111/j.2517-6161.1954.tb00159.x>. Here, the application of primary interest is to studies of insect mortality response to increasing doses of a fumigant, or, e.g., to time in coolstorage. The formula is used to calculate a confidence interval for the dose or time required to achieve a specified mortality proportion, commonly 0.5 or 0.99. Vignettes demonstrate link functions that may be considered, checks on fitted models, and alternative choices of error family. Note in particular the betabinomial error family. See also Maindonald, Waddell, and Petry (2001) <doi:10.1016/S0925-5214(01)00082-5>.

Maintained by John Maindonald. Last updated 1 years ago.

8.0 match 3.48 score 1 scripts

kenkellner

ASMbook:Functions for the Book "Applied Statistical Modeling for Ecologists"

Provides functions to accompany the book "Applied Statistical Modeling for Ecologists" by Marc Kéry and Kenneth F. Kellner (2024, ISBN: 9780443137150). Included are functions for simulating and customizing the datasets used for the example models in each chapter, summarizing output from model fitting engines, and running custom Markov Chain Monte Carlo.

Maintained by Ken Kellner. Last updated 7 months ago.

7.0 match 2 stars 3.90 score 10 scripts

sparklyr

sparklyr:R Interface to Apache Spark

R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.

Maintained by Edgar Ruiz. Last updated 8 days ago.

apache-spark distributed dplyr ide livy machine-learning remote-clusters spark sparklyr

1.8 match 959 stars 15.16 score 4.0k scripts 21 dependents

kaz-yos

regmedint:Regression-Based Causal Mediation Analysis with Interaction and Effect Modification Terms

This is an extension of the regression-based causal mediation analysis first proposed by Valeri and VanderWeele (2013) <doi:10.1037/a0031034> and Valeri and VanderWeele (2015) <doi:10.1097/EDE.0000000000000253>. It supports including effect measure modification by covariates (treatment-covariate and mediator-covariate product terms in mediator and outcome regression models) as proposed by Li et al (2023) <doi:10.1097/EDE.0000000000001643>. It also accommodates the original 'SAS' macro and 'PROC CAUSALMED' procedure in 'SAS' when there is no effect measure modification. Linear and logistic models are supported for the mediator model. Linear, logistic, loglinear, Poisson, negative binomial, Cox, and accelerated failure time (exponential and Weibull) models are supported for the outcome model.

Maintained by Yi Li. Last updated 1 years ago.

causal-inference mediation-analysis

4.0 match 29 stars 6.84 score 40 scripts

rchen18

RNGforGPD:Random Number Generation for Generalized Poisson Distribution

Generation of univariate and multivariate data that follow the generalized Poisson distribution. The details of the univariate part are explained in Demirtas (2017) <doi:10.1080/03610918.2014.968725>, and the multivariate part is an extension of the correlated Poisson data generation routine that was introduced in Yahav and Shmueli (2012) <doi:10.1002/asmb.901>.

Maintained by Ruizhe Chen. Last updated 4 years ago.

9.0 match 1 stars 3.00 score 11 scripts 3 dependents

laplacesdemonr

LaplacesDemon:Complete Environment for Bayesian Inference

Provides a complete environment for Bayesian inference using a variety of different samplers (see ?LaplacesDemon for an overview).

Maintained by Henrik Singmann. Last updated 12 months ago.

2.0 match 93 stars 13.45 score 1.8k scripts 60 dependents

bioc

DESeq2:Differential gene expression analysis based on the negative binomial distribution

Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.

Maintained by Michael Love. Last updated 10 days ago.

sequencing rnaseq chipseq geneexpression transcription normalization differentialexpression bayesian regression principalcomponent clustering immunooncology openblas cpp

1.7 match 375 stars 16.11 score 17k scripts 115 dependents

yenyiho-lab

scDECO:Estimating Dynamic Correlation

Implementations for two different Bayesian models of differential co-expression. scdeco.cop() fits the bivariate Gaussian copula model from Zichen Ma, Shannon W. Davis, Yen-Yi Ho (2023) <doi:10.1111/biom.13701>, while scdeco.pg() fits the bivariate Poisson-Gamma model from Zhen Yang, Yen-Yi Ho (2022) <doi:10.1111/biom.13457>.

Maintained by Anderson Bussing. Last updated 9 months ago.

jags cpp

5.6 match 4.78 score

inbo

inlatools:Diagnostic Tools for INLA Models

Several functions which can be useful to choose sensible priors and diagnose the fitted model.

Maintained by Thierry Onkelinx. Last updated 5 months ago.

bayesian-statistics gplv3 inla mixed-models model-checking model-validation

6.0 match 4 stars 4.41 score 43 scripts

helske

bssm:Bayesian Inference of Non-Linear and Non-Gaussian State Space Models

Efficient methods for Bayesian inference of state space models via Markov chain Monte Carlo (MCMC) based on parallel importance sampling type weighted estimators (Vihola, Helske, and Franks, 2020, <doi:10.1111/sjos.12492>), particle MCMC, and its delayed acceptance version. Gaussian, Poisson, binomial, negative binomial, and Gamma observation densities and basic stochastic volatility models with linear-Gaussian state dynamics, as well as general non-linear Gaussian models and discretised diffusion models are supported. See Helske and Vihola (2021, <doi:10.32614/RJ-2021-103>) for details.

Maintained by Jouni Helske. Last updated 6 months ago.

bayesian-inference cpp markov-chain-monte-carlo particle-filter state-space time-series openblas cpp openmp

4.1 match 42 stars 6.43 score 11 scripts

dsy109

mixtools:Tools for Analyzing Finite Mixture Models

Analyzes finite mixture models for various parametric and semiparametric settings. This includes mixtures of parametric distributions (normal, multivariate normal, multinomial, gamma), various Reliability Mixture Models (RMMs), mixtures-of-regressions settings (linear regression, logistic regression, Poisson regression, linear regression with changepoints, predictor-dependent mixing proportions, random effects regressions, hierarchical mixtures-of-experts), and tools for selecting the number of components (bootstrapping the likelihood ratio test statistic, mixturegrams, and model selection criteria). Bayesian estimation of mixtures-of-linear-regressions models is available as well as a novel data depth method for obtaining credible bands. This package is based upon work supported by the National Science Foundation under Grant No. SES-0518772 and the Chan Zuckerberg Initiative: Essential Open Source Software for Science (Grant No. 2020-255193).

Maintained by Derek Young. Last updated 9 months ago.

mixture-models mixture-of-experts semiparametric-regression

2.3 match 20 stars 11.34 score 1.4k scripts 56 dependents

r-forge

survey:Analysis of Complex Survey Samples

Summary statistics, two-sample tests, rank tests, generalised linear models, cumulative link models, Cox models, loglinear models, and general maximum pseudolikelihood estimation for multistage stratified, cluster-sampled, unequally weighted survey samples. Variances by Taylor series linearisation or replicate weights. Post-stratification, calibration, and raking. Two-phase and multiphase subsampling designs. Graphics. PPS sampling without replacement. Small-area estimation. Dual-frame designs.

Maintained by Thomas Lumley. Last updated 6 months ago.

cpp

1.9 match 1 stars 13.94 score 13k scripts 232 dependents

solivella

poisbinom:A Faster Implementation of the Poisson-Binomial Distribution

Provides the probability, distribution, and quantile functions and random number generator for the Poisson-Binomial distribution. This package relies on FFTW to implement the discrete Fourier transform, so that it is much faster than the existing implementation of the same algorithm in R.

Maintained by Santiago Olivella. Last updated 6 years ago.

fftw3 cpp

5.5 match 2 stars 4.72 score 16 scripts 9 dependents
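
To make the discrete-Fourier-transform approach mentioned in the poisbinom entry above more concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not the package's implementation: it uses a plain O(n^2) transform from the standard library rather than FFTW, and the function name `poisson_binomial_pmf` is hypothetical.

```python
import cmath
import math

def poisson_binomial_pmf(probs):
    """PMF of the number of successes among independent Bernoulli(p_j) trials.

    Evaluates the characteristic function at the (n + 1)-th roots of unity
    and inverts it with a plain O(n^2) discrete Fourier transform.
    """
    n = len(probs)
    m = n + 1
    # Characteristic function values: prod_j (1 - p_j + p_j * w), w = exp(2*pi*i*l/m).
    cf = []
    for l in range(m):
        w = cmath.exp(2j * math.pi * l / m)
        value = 1.0 + 0j
        for p in probs:
            value *= (1.0 - p) + p * w
        cf.append(value)
    # The inverse DFT recovers P(K = k) for k = 0..n.
    pmf = []
    for k in range(m):
        total = sum(cf[l] * cmath.exp(-2j * math.pi * l * k / m) for l in range(m))
        # Clamp tiny negative values caused by floating-point round-off.
        pmf.append(max(total.real / m, 0.0))
    return pmf

if __name__ == "__main__":
    print(poisson_binomial_pmf([0.1, 0.7, 0.4]))
```

Replacing the two nested loops with a fast Fourier transform routine gives the same probabilities at lower cost for long probability vectors, which is the kind of speed-up the description attributes to FFTW.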

span-18

spStack:Bayesian Geostatistics Using Predictive Stacking

Fits Bayesian hierarchical spatial process models for point-referenced Gaussian, Poisson, binomial, and binary data using stacking of predictive densities. It involves sampling from analytically available posterior distributions conditional upon some candidate values of the spatial process parameters and subsequently assimilating inference from these individual posterior distributions using Bayesian predictive stacking. Our algorithm is highly parallelizable and hence much faster than traditional Markov chain Monte Carlo algorithms while delivering competitive predictive performance. See Zhang, Tang, and Banerjee (2024) <doi:10.48550/arXiv.2304.12414>, and Pan, Zhang, Bradley, and Banerjee (2024) <doi:10.48550/arXiv.2406.04655> for details.

Maintained by Soumyakanti Pan. Last updated 9 days ago.

openblas cpp

5.3 match 4.95 score 6 scripts

jfrench

smerc:Statistical Methods for Regional Counts

Implements statistical methods for analyzing the counts of areal data, with a focus on the detection of spatial clusters and clustering. The package has a heavy emphasis on spatial scan methods, which were first introduced by Kulldorff and Nagarwalla (1995) <doi:10.1002/sim.4780140809> and Kulldorff (1997) <doi:10.1080/03610929708831995>.

Maintained by Joshua French. Last updated 5 months ago.

cpp

4.3 match 3 stars 6.11 score 45 scripts 3 dependents

abreu-uma

ecpdist:Extended Chen-Poisson Lifetime Distribution

Computes the Extended Chen-Poisson (ecp) distribution, survival, density, hazard, cumulative hazard and quantile functions. It also allows generating a pseudo-random sample from this distribution. The corresponding graphics are available. Functions to obtain measures of skewness and kurtosis, k-th raw moments, conditional k-th moments and mean residual life function were added. For details about the ecp distribution, see Sousa-Ferreira, I., Abreu, A.M. & Rocha, C. (2023). <doi:10.57805/revstat.v21i2.405>.

Maintained by Ana Abreu. Last updated 6 months ago.

6.9 match 3.74 score 1 script

johnnyzhz

WebPower:Basic and Advanced Statistical Power Analysis

This is a collection of tools for conducting both basic and advanced statistical power analysis including correlation, proportion, t-test, one-way ANOVA, two-way ANOVA, linear regression, logistic regression, Poisson regression, mediation analysis, longitudinal data analysis, structural equation modeling and multilevel modeling. It also serves as the engine for conducting power analysis online at <https://webpower.psychstat.org>.

Maintained by Zhiyong Zhang. Last updated 6 months ago.

4.6 match 8 stars 5.52 score 128 scripts

ewenharrison

finalfit:Quickly Create Elegant Regression Results Tables and Plots when Modelling

Generate regression results tables and plots in final format for publication. Explore models and export directly to PDF and 'Word' using 'RMarkdown'.

Maintained by Ewen Harrison. Last updated 6 months ago.

2.2 match 270 stars 11.43 score 1.0k scripts

majkamichal

naivebayes:High Performance Implementation of the Naive Bayes Algorithm

In this implementation of the Naive Bayes classifier, the following class-conditional distributions are available: 'Bernoulli', 'Categorical', 'Gaussian', 'Poisson', 'Multinomial' and a non-parametric representation of the class-conditional density estimated via Kernel Density Estimation. The implemented classifiers handle missing data and can take advantage of sparse data.

Maintained by Michal Majka. Last updated 1 month ago.

classification-model datascience machine-learning naive-bayes

2.4 match 37 stars 10.47 score 1.0k scripts 6 dependents

sthomas522

hmclearn:Fit Statistical Models Using Hamiltonian Monte Carlo

Provides users with a framework to learn the intricacies of the Hamiltonian Monte Carlo algorithm with hands-on experience by tuning and fitting their own models. All of the code is written in R. Theoretical references are listed below: Neal, Radford (2011) "Handbook of Markov Chain Monte Carlo" ISBN: 978-1420079418, Betancourt, Michael (2017) "A Conceptual Introduction to Hamiltonian Monte Carlo" <arXiv:1701.02434>, Thomas, S., Tu, W. (2020) "Learning Hamiltonian Monte Carlo in R" <arXiv:2006.16194>, Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2013) "Bayesian Data Analysis" ISBN: 978-1439840955, Agresti, Alan (2015) "Foundations of Linear and Generalized Linear Models" ISBN: 978-1118730034, Pinheiro, J., Bates, D. (2006) "Mixed-effects Models in S and S-Plus" ISBN: 978-1441903174.

Maintained by Samuel Thomas. Last updated 4 years ago.

4.4 match 11 stars 5.64 score 16 scripts

hwborchers

pracma:Practical Numerical Math Functions

Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses 'MATLAB' function names where appropriate to simplify porting.

Maintained by Hans W. Borchers. Last updated 1 year ago.

2.0 match 29 stars 12.34 score 6.6k scripts 931 dependents

unuran

Runuran:R Interface to the 'UNU.RAN' Random Variate Generators

Interface to the 'UNU.RAN' library for Universal Non-Uniform RANdom variate generators. It thus allows building non-uniform random number generators from quite arbitrary distributions. In particular, it provides an algorithm for fast numerical inversion for distributions with a given density function. In addition, the package contains densities, distribution functions and quantiles for a couple of distributions.

Maintained by Josef Leydold. Last updated 5 months ago.

3.5 match 6.87 score 180 scripts 8 dependents

cran

poisFErobust:Poisson Fixed Effects Robust

Computation of robust standard errors of Poisson fixed effects models, following Wooldridge (1999).

Maintained by Evan Wright. Last updated 5 years ago.

14.3 match 1.70 score 7 scripts

kgoldfeld

simstudy:Simulation of Study Data

Simulates data sets in order to explore modeling techniques or better understand data generating processes. The user specifies a set of relationships between covariates, and generates data based on these specifications. The final data sets can represent data from randomized control trials, repeated measure (longitudinal) designs, and cluster randomized trials. Missingness can be generated using various mechanisms (MCAR, MAR, NMAR).

Maintained by Keith Goldfeld. Last updated 8 months ago.

data-generation data-simulation simulation statistical-models cpp

2.2 match 82 stars 11.00 score 972 scripts 1 dependent

cran

BlakerCI:Blaker's Binomial and Poisson Confidence Limits

Fast and accurate calculation of Blaker's binomial and Poisson confidence limits (and some related stuff).

Maintained by Jan Klaschka. Last updated 6 years ago.

18.5 match 1.30 score 8 scripts

jamesyang007

adelie:Group Lasso and Elastic Net Solver for Generalized Linear Models

Extremely efficient procedures for fitting the entire group lasso and group elastic net regularization path for GLMs, multinomial, the Cox model and multi-task Gaussian models. Similar to the R package 'glmnet' in scope of models, and in computational speed. This package provides R bindings to the C++ code underlying the corresponding Python package 'adelie'. These bindings offer a general purpose group elastic net solver, a wide range of matrix classes that can exploit special structure to allow large-scale inputs, and an assortment of generalized linear model classes for fitting various types of data. The package is an implementation of Yang, J. and Hastie, T. (2024) <doi:10.48550/arXiv.2405.08631>.

Maintained by Trevor Hastie. Last updated 15 days ago.

cpp openmp

4.0 match 6 stars 5.86 score 3 scripts

bioc

PIPETS:Poisson Identification of PEaks from Term-Seq data

PIPETS provides statistically robust analysis for 3'-seq/term-seq data. It utilizes a sliding window approach to apply a Poisson Distribution test to identify genomic positions with termination read coverage that is significantly higher than the surrounding signal. PIPETS then condenses proximal signal and produces strand specific results that contain all significant termination peaks.

Maintained by Quinlan Furumo. Last updated 5 months ago.

sequencing transcription generegulation peakdetection genetics transcriptomics coverage

6.1 match 3.85 score 2 scripts

flyaflya

causact:Fast, Easy, and Visual Bayesian Inference

Accelerate Bayesian analytics workflows in 'R' through interactive modelling, visualization, and inference. Define probabilistic graphical models using directed acyclic graphs (DAGs) as a unifying language for business stakeholders, statisticians, and programmers. This package relies on interfacing with the 'numpyro' python package.

Maintained by Adam Fleischhacker. Last updated 2 months ago.

bayesian-inference dags posterior-probability probabilistic-graphical-models probabilistic-programming

3.3 match 45 stars 7.15 score 52 scripts

bioc

CAEN:Category encoding method for selecting feature genes for the classification of single-cell RNA-seq

With the development of high-throughput techniques, more and more gene expression analyses tend to replace hybridization-based microarrays with the revolutionary technology. The novel method encodes the category again by employing the rank of samples for each gene in each class. We then consider the correlation coefficient of gene and class with rank of sample and new rank of category. The genes with the highest correlation coefficients are considered the feature genes that are most effective for classifying the samples.

Maintained by Zhou Yan. Last updated 5 months ago.

differentialexpression sequencing classification rnaseq atacseq singlecell geneexpression ripseq

5.1 match 4.60 score 2 scripts

cran

NHPoisson:Modelling and Validation of Non Homogeneous Poisson Processes

Tools for modelling, ML estimation, validation analysis and simulation of non homogeneous Poisson processes in time.

Maintained by Ana C. Cebrian. Last updated 5 years ago.

8.6 match 2 stars 2.71 score 43 scripts 2 dependents

jobago

tlm:Effects under Linear, Logistic and Poisson Regression Models with Transformed Variables

Computation of effects under linear, logistic and Poisson regression models with transformed variables. Logarithm and power transformations are allowed. Effects can be displayed both numerically and graphically in both the original and the transformed space of the variables. The methods are described in Barrera-Gomez and Basagana (2015) <doi:10.1097/EDE.0000000000000247>.

Maintained by Jose Barrera-Gomez. Last updated 2 months ago.

8.3 match 2.81 score 13 scripts

spatialstatisticsupna

bigDM:Scalable Bayesian Disease Mapping Models for High-Dimensional Data

Implements several spatial and spatio-temporal scalable disease mapping models for high-dimensional count data using the INLA technique for approximate Bayesian inference in latent Gaussian models (Orozco-Acosta et al., 2021 <doi:10.1016/j.spasta.2021.100496>; Orozco-Acosta et al., 2023 <doi:10.1016/j.cmpb.2023.107403> and Vicente et al., 2023 <doi:10.1007/s11222-023-10263-x>). The creation and development of this package has been supported by Project MTM2017-82553-R (AEI/FEDER, UE) and Project PID2020-113125RB-I00/MCIN/AEI/10.13039/501100011033. It has also been partially funded by the Public University of Navarra (project PJUPNA2001).

Maintained by Aritz Adin. Last updated 7 months ago.

4.8 match 15 stars 4.88 score 10 scripts

psirusteam

TeachingSampling:Selection of Samples and Parameter Estimation in Finite Population

Allows the user to draw probabilistic samples and make inferences from a finite population based on several sampling designs.

Maintained by Hugo Andres Gutierrez Rojas. Last updated 5 years ago.

4.0 match 4 stars 5.80 score 217 scripts 4 dependents

spatstat

spatstat.explore:Exploratory Data Analysis for the 'spatstat' Family

Functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.

Maintained by Adrian Baddeley. Last updated 1 month ago.

cluster-detection confidence-intervals hypothesis-testing k-function roc-curves scan-statistics significance-testing simulation-envelopes spatial-analysis spatial-data-analysis spatial-sharpening spatial-smoothing spatial-statistics

2.3 match 1 star 10.17 score 67 scripts 148 dependents

r-econometrics

lfe:Linear Group Fixed Effects

Transforms away factors with many levels prior to doing an OLS. Useful for estimating linear models with multiple group fixed effects, and for estimating linear models which use factors with many levels as pure control variables. See Gaure (2013) <doi:10.1016/j.csda.2013.03.024>. Includes support for instrumental variables, conditional F statistics for weak instruments, robust and multi-way clustered standard errors, as well as limited mobility bias correction (Gaure 2014 <doi:10.1002/sta4.68>). Since version 3.0, it provides dedicated functions to estimate Poisson models.

Maintained by Mauricio Vargas Sepulveda. Last updated 1 year ago.

openblas

2.2 match 10.30 score 1.8k scripts 5 dependents

waternumbers

anomalous:Anomaly Detection using the CAPA and PELT Algorithms

Implementations of the univariate CAPA <doi:10.1002/sam.11586> and PELT <doi:10.1080/01621459.2012.737745> algorithms along with various cost functions for different distributions and models. The modular design, using R6 classes, favours ease of extension (for example user-written cost functions) over the performance of other implementations (e.g. <doi:10.32614/CRAN.package.changepoint>, <doi:10.32614/CRAN.package.anomaly>).

Maintained by Paul Smith. Last updated 3 months ago.

cpp

4.9 match 4.61 score 18 scripts

scott-foster

fishMod:Fits Poisson-Sum-of-Gammas GLMs, Tweedie GLMs, and Delta Log-Normal Models

Fits models to catch and effort data. Single-species models are 1) delta log-normal, 2) Tweedie, or 3) Poisson-gamma (G)LMs.

Maintained by Scott D. Foster. Last updated 5 months ago.

cpp

8.3 match 1 star 2.68 score 9 scripts 4 dependents

vladimirholy

gasmodel:Generalized Autoregressive Score Models

Estimation, forecasting, and simulation of generalized autoregressive score (GAS) models of Creal, Koopman, and Lucas (2013) <doi:10.1002/jae.1279> and Harvey (2013) <doi:10.1017/cbo9781139540933>. Model specification allows for various data types and distributions, different parametrizations, exogenous variables, joint and separate modeling of exogenous variables and dynamics, higher score and autoregressive orders, custom and unconditional initial values of time-varying parameters, fixed and bounded values of coefficients, and missing values. Model estimation is performed by the maximum likelihood method.

Maintained by Vladimír Holý. Last updated 1 year ago.

dcs gas time-series

4.1 match 14 stars 5.45 score 2 scripts

yanyachen

MLmetrics:Machine Learning Evaluation Metrics

A collection of evaluation metrics, including loss, score and utility functions, that measure regression, classification and ranking performance.

Maintained by Yachen Yan. Last updated 11 months ago.

2.0 match 69 stars 11.09 score 2.2k scripts 20 dependents

nanxstats

msaenet:Multi-Step Adaptive Estimation Methods for Sparse Regressions

Multi-step adaptive elastic-net (MSAENet) algorithm for feature selection in high-dimensional regressions proposed in Xiao and Xu (2015) <DOI:10.1080/00949655.2015.1016944>, with support for multi-step adaptive MCP-net (MSAMNet) and multi-step adaptive SCAD-net (MSASNet) methods.

Maintained by Nan Xiao. Last updated 8 months ago.

false-positive-control high-dimensional-data linear-regression machine-learning variable-selection

3.7 match 13 stars 6.01 score 52 scripts

kingaa

pomp:Statistical Inference for Partially Observed Markov Processes

Tools for data analysis with partially observed Markov process (POMP) models (also known as stochastic dynamical systems, hidden Markov models, and nonlinear, non-Gaussian, state-space models). The package provides facilities for implementing POMP models, simulating them, and fitting them to time series data by a variety of frequentist and Bayesian methods. It is also a versatile platform for implementation of inference methods for general POMP models.

Maintained by Aaron A. King. Last updated 1 month ago.

abc b-spline differential-equations dynamical-systems iterated-filtering likelihood likelihood-free markov-chain-monte-carlo markov-model mathematical-modelling measurement-error particle-filter sequential-monte-carlo simulation-based-inference sobol-sequence state-space statistical-inference stochastic-processes time-series openblas

1.9 match 115 stars 11.81 score 1.3k scripts 4 dependents

florianhartig

DHARMa:Residual Diagnostics for Hierarchical (Multi-Level / Mixed) Regression Models

The 'DHARMa' package uses a simulation-based approach to create readily interpretable scaled (quantile) residuals for fitted (generalized) linear mixed models. Currently supported are linear and generalized linear (mixed) models from 'lme4' (classes 'lmerMod', 'glmerMod'), 'glmmTMB', 'GLMMadaptive', and 'spaMM'; phylogenetic linear models from 'phylolm' (classes 'phylolm' and 'phyloglm'); generalized additive models ('gam' from 'mgcv'); 'glm' (including 'negbin' from 'MASS', but excluding quasi-distributions) and 'lm' model classes. Moreover, externally created simulations, e.g. posterior predictive simulations from Bayesian software such as 'JAGS', 'STAN', or 'BUGS' can be processed as well. The resulting residuals are standardized to values between 0 and 1 and can be interpreted as intuitively as residuals from a linear regression. The package also provides a number of plot and test functions for typical model misspecification problems, such as over/underdispersion, zero-inflation, and residual spatial, phylogenetic and temporal autocorrelation.

Maintained by Florian Hartig. Last updated 11 days ago.

glmm regression regression-diagnostics residual

1.5 match 226 stars 14.74 score 2.8k scripts 10 dependents

poissonconsulting

poissontemplate:'pkgdown' Templates for Poisson Consulting Packages

This is a private template for use by Poisson Consulting packages. Please don't use it for your own code.

Maintained by Joe Thorley. Last updated 2 months ago.

5.1 match 4.30 score

dwarton

ecostats:Code and Data Accompanying the Eco-Stats Text (Warton 2022)

Functions and data supporting the Eco-Stats text (Warton, 2022, Springer), and solutions to exercises. Functions include tools for using simulation envelopes in diagnostic plots, and a function for diagnostic plots of multivariate linear models. Datasets mentioned in the package are included here (where not available elsewhere) and there is a vignette for each chapter of the text with solutions to exercises.

Maintained by David Warton. Last updated 1 year ago.

3.3 match 8 stars 6.58 score 53 scripts

stopsack

risks:Estimate Risk Ratios and Risk Differences using Regression

Risk ratios and risk differences are estimated using regression models that allow for binary, categorical, and continuous exposures and confounders. Implemented are marginal standardization after fitting logistic models (g-computation) with delta-method and bootstrap standard errors, Miettinen's case-duplication approach (Schouten et al. 1993, <doi:10.1002/sim.4780121808>), log-binomial (Poisson) models with empirical variance (Zou 2004, <doi:10.1093/aje/kwh090>), binomial models with starting values from Poisson models (Spiegelman and Hertzmark 2005, <doi:10.1093/aje/kwi188>), and others.

Maintained by Konrad Stopsack. Last updated 11 months ago.

binomial biostatistics epidemiology regression-models

4.4 match 5 stars 4.95 score 12 scripts

handcock

degreenet:Models for Skewed Count Distributions Relevant to Networks

Likelihood-based inference for skewed count distributions, typically of degrees used in network modeling. "degreenet" is a part of the "statnet" suite of packages for network analysis. See Jones and Handcock <doi:10.1098/rspb.2003.2369>.

Maintained by Mark S. Handcock. Last updated 6 months ago.

12.4 match 1 star 1.75 score 28 scripts

daphnegiorgi

IBMPopSim:Individual Based Model Population Simulation

Simulation of the random evolution of heterogeneous populations using stochastic Individual-Based Models (IBMs) <doi:10.48550/arXiv.2303.06183>. The package enables users to simulate population evolution, in which individuals are characterized by their age and some characteristics, and the population is modified by different types of events, including births/arrivals, death/exit events, or changes of characteristics. The frequency at which an event can occur to an individual can depend on their age and characteristics, but also on the characteristics of other individuals (interactions). Such models have a wide range of applications. For instance, IBMs can be used for simulating the evolution of a heterogeneous insurance portfolio with selection or for validating mortality forecasts. This package overcomes the limitations of time-consuming IBM simulations by implementing new efficient algorithms based on thinning methods, which are compiled using the 'Rcpp' package while providing a user-friendly interface.

Maintained by Daphné Giorgi. Last updated 5 months ago.

cpp

3.7 match 8 stars 5.83 score 14 scripts