Showing 200 of 456 results
hanase
BMA:Bayesian Model Averaging
Package for Bayesian model averaging and variable selection for linear models, generalized linear models and survival models (Cox regression).
Maintained by Hana Sevcikova. Last updated 2 months ago.
35.3 match 37 stars 9.38 score 152 scripts 14 dependents
benjaminschlegel
glm.predict:Predicted Values and Discrete Changes for Regression Models
Functions to calculate predicted values and the difference between the two cases with confidence interval for lm() [linear model], glm() [generalized linear model], glm.nb() [negative binomial model], polr() [ordinal logistic model], vglm() [generalized ordinal logistic model], multinom() [multinomial model], tobit() [tobit model], svyglm() [survey-weighted generalised linear models] and lmer() [linear multilevel models] using Monte Carlo simulations or bootstrap. Reference: Bennet A. Zelner (2009) <doi:10.1002/smj.783>.
Maintained by Benjamin E. Schlegel. Last updated 7 months ago.
47.9 match 1 stars 5.10 score 55 scripts
sinnweja
haplo.stats:Statistical Analysis of Haplotypes with Traits and Covariates when Linkage Phase is Ambiguous
Routines for the analysis of indirectly measured haplotypes. The statistical methods assume that all subjects are unrelated and that haplotypes are ambiguous (due to unknown linkage phase of the genetic markers). The main functions are: haplo.em(), haplo.glm(), haplo.score(), and haplo.power(); all of which have detailed examples in the vignette.
Maintained by Jason P. Sinnwell. Last updated 6 months ago.
36.4 match 2 stars 5.98 score 96 scripts 12 dependents
friendly
vcdExtra:'vcd' Extensions and Additions
Provides additional data sets, methods and documentation to complement the 'vcd' package for Visualizing Categorical Data and the 'gnm' package for Generalized Nonlinear Models. In particular, 'vcdExtra' extends mosaic, assoc and sieve plots from 'vcd' to handle 'glm()' and 'gnm()' models and adds a 3D version in 'mosaic3d'. Additionally, methods are provided for comparing and visualizing lists of 'glm' and 'loglm' objects. This package is now a support package for the book, "Discrete Data Analysis with R" by Michael Friendly and David Meyer.
Maintained by Michael Friendly. Last updated 5 months ago.
categorical-data-visualization, generalized-linear-models, mosaic-plots
18.1 match 24 stars 10.34 score 472 scripts 3 dependents
jamesyang007
adelie:Group Lasso and Elastic Net Solver for Generalized Linear Models
Extremely efficient procedures for fitting the entire group lasso and group elastic net regularization path for GLMs, multinomial, the Cox model and multi-task Gaussian models. Similar to the R package 'glmnet' in scope of models, and in computational speed. This package provides R bindings to the C++ code underlying the corresponding Python package 'adelie'. These bindings offer a general purpose group elastic net solver, a wide range of matrix classes that can exploit special structure to allow large-scale inputs, and an assortment of generalized linear model classes for fitting various types of data. The package is an implementation of Yang, J. and Hastie, T. (2024) <doi:10.48550/arXiv.2405.08631>.
Maintained by Trevor Hastie. Last updated 15 days ago.
30.1 match 6 stars 5.86 score 3 scripts
tomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 year ago.
21.5 match 3 stars 8.13 score 7.8k scripts 11 dependents
ikosmidis
brglm2:Bias Reduction in Generalized Linear Models
Estimation and inference from generalized linear models based on various methods for bias reduction and maximum penalized likelihood with powers of the Jeffreys prior as penalty. The 'brglmFit' fitting method can achieve reduction of estimation bias by solving either the mean bias-reducing adjusted score equations in Firth (1993) <doi:10.1093/biomet/80.1.27> and Kosmidis and Firth (2009) <doi:10.1093/biomet/asp055>, or the median bias-reduction adjusted score equations in Kenne et al. (2017) <doi:10.1093/biomet/asx046>, or through the direct subtraction of an estimate of the bias of the maximum likelihood estimator from the maximum likelihood estimates as in Cordeiro and McCullagh (1991) <https://www.jstor.org/stable/2345592>. See Kosmidis et al. (2020) <doi:10.1007/s11222-019-09860-6> for more details. Estimation in all cases takes place via a quasi Fisher scoring algorithm, and S3 methods for the construction of confidence intervals for the reduced-bias estimates are provided. In the special case of generalized linear models for binomial and multinomial responses (both ordinal and nominal), the adjusted score approaches to mean and median bias reduction have been found to return estimates with improved frequentist properties, that are also always finite, even in cases where the maximum likelihood estimates are infinite (e.g. complete and quasi-complete separation; see Kosmidis and Firth, 2020 <doi:10.1093/biomet/asaa052>, for a proof for mean bias reduction in logistic regression).
Maintained by Ioannis Kosmidis. Last updated 6 months ago.
adjusted-score-equations, algorithms, bias-reducing-adjustments, bias-reduction, estimation, glm, logistic-regression, nominal-responses, ordinal-responses, regression, regression-algorithms, statistics
16.0 match 32 stars 10.41 score 106 scripts 10 dependents
trevorhastie
glmnet:Lasso and Elastic-Net Regularized Generalized Linear Models
Extremely efficient procedures for fitting the entire lasso or elastic-net regularization path for linear regression, logistic and multinomial regression models, Poisson regression, Cox model, multiple-response Gaussian, and the grouped multinomial regression; see <doi:10.18637/jss.v033.i01> and <doi:10.18637/jss.v039.i05>. There are two new and important additions. The family argument can be a GLM family object, which opens the door to any programmed family (<doi:10.18637/jss.v106.i01>). This comes with a modest computational cost, so when the built-in families suffice, they should be used instead. The other novelty is the relax option, which refits each of the active sets in the path unpenalized. The algorithm uses cyclical coordinate descent in a path-wise fashion, as described in the papers cited.
Maintained by Trevor Hastie. Last updated 2 years ago.
10.3 match 82 stars 15.15 score 22k scripts 736 dependents
bioc
ALDEx2:Analysis Of Differential Abundance Taking Sample and Scale Variation Into Account
A differential abundance analysis for the comparison of two or more conditions. Useful for analyzing data from standard RNA-seq or meta-RNA-seq assays as well as selected and unselected values from in-vitro sequence selections. Uses a Dirichlet-multinomial model to infer abundance from counts, optimized for three or more experimental replicates. The method infers biological and sampling variation to calculate the expected false discovery rate, given the variation, based on a Wilcoxon Rank Sum test and Welch's t-test (via aldex.ttest), a Kruskal-Wallis test (via aldex.kw), a generalized linear model (via aldex.glm), or a correlation test (via aldex.corr). All tests report predicted p-values and posterior Benjamini-Hochberg corrected p-values. ALDEx2 also calculates expected standardized effect sizes for paired or unpaired study designs. ALDEx2 can now be used to estimate the effect of scale on the results and report on the scale-dependent robustness of results.
Maintained by Greg Gloor. Last updated 5 months ago.
differentialexpression, rnaseq, transcriptomics, geneexpression, dnaseq, chipseq, bayesian, sequencing, software, microbiome, metagenomics, immunooncology, scale simulation, posterior p-value
14.4 match 28 stars 10.70 score 424 scripts 3 dependents
ikosmidis
enrichwith:Methods to Enrich R Objects with Extra Components
Provides the "enrich" method to enrich list-like R objects with new, relevant components. The current version has methods for enriching objects of class 'family', 'link-glm', 'lm', 'glm' and 'betareg'. The resulting objects preserve their class, so all methods associated with them still apply. The package also provides the 'enriched_glm' function that has the same interface as 'glm' but results in objects of class 'enriched_glm'. In addition to the usual components in a `glm` object, 'enriched_glm' objects carry an object-specific simulate method and functions to compute the scores, the observed and expected information matrix, the first-order bias, as well as model densities, probabilities, and quantiles at arbitrary parameter values. The package can also be used to produce customizable source code templates for the structured implementation of methods to compute new components and enrich arbitrary objects.
Maintained by Ioannis Kosmidis. Last updated 5 years ago.
20.8 match 6 stars 7.35 score 16 scripts 12 dependents
castroloj
glm.deploy:'C' and 'Java' Source Code Generator for Fitted Glm Objects
Provides two functions that generate source code implementing the predict function of fitted glm objects. In this version, code can be generated for either 'C' or 'Java'. The idea is to provide a tool for the easy and fast deployment of glm predictive models into production. The source code generated by this package implements two functions/methods. One of these implements the equivalent of predict(type="response"), while the second implements predict(type="link"). Source code is written to disk as a .c or .java file in the specified path. In the case of C, a .h file is also generated.
Maintained by Oscar Castro-Lopez. Last updated 6 years ago.
46.1 match 2 stars 3.04 score 11 scripts
boost-r
mboost:Model-Based Boosting
Functional gradient descent algorithm (boosting) for optimizing general risk functions utilizing component-wise (penalised) least squares estimates or regression trees as base-learners for fitting generalized linear, additive and interaction models to potentially high-dimensional data. Models and algorithms are described in <doi:10.1214/07-STS242>, a hands-on tutorial is available from <doi:10.1007/s00180-012-0382-5>. The package allows user-specified loss functions and base-learners.
Maintained by Torsten Hothorn. Last updated 4 months ago.
boosting-algorithms, gam, glm, machine-learning, mboost, modelling, r-language, tutorials, variable-selection, openblas
11.0 match 72 stars 12.70 score 540 scripts 27 dependents
gavinsimpson
gratia:Graceful 'ggplot'-Based Graphics and Other Functions for GAMs Fitted Using 'mgcv'
Graceful 'ggplot'-based graphics and utility functions for working with generalized additive models (GAMs) fitted using the 'mgcv' package. Provides a reimplementation of the plot() method for GAMs that 'mgcv' provides, as well as 'tidyverse' compatible representations of estimated smooths.
Maintained by Gavin L. Simpson. Last updated 4 days ago.
distributional-regression, gam, gamm, generalized-additive-mixed-models, generalized-additive-models, ggplot2, glm, lm, mgcv, penalized-spline, random-effects, smoothing, splines
11.0 match 216 stars 12.68 score 1.6k scripts 1 dependents
ecpolley
SuperLearner:Super Learner Prediction
Implements the super learner prediction method and contains a library of prediction algorithms to be used in the super learner.
Maintained by Eric Polley. Last updated 1 year ago.
10.5 match 273 stars 13.07 score 2.1k scripts 36 dependents
bioc
glmGamPoi:Fit a Gamma-Poisson Generalized Linear Model
Fit linear models to overdispersed count data. The package can estimate the overdispersion and fit repeated models for matrix input. It is designed to handle large input datasets as they typically occur in single cell RNA-seq experiments.
Maintained by Constantin Ahlmann-Eltze. Last updated 1 month ago.
regression, rnaseq, software, singlecell, gamma-poisson, glm, negative-binomial-regression, on-disk, openblas, cpp
11.0 match 110 stars 12.11 score 1.0k scripts 4 dependents
kkholst
mets:Analysis of Multivariate Event Times
Implementation of various statistical models for multivariate event history data <doi:10.1007/s10985-013-9244-x>. Including multivariate cumulative incidence models <doi:10.1002/sim.6016>, and bivariate random effects probit models (Liability models) <doi:10.1016/j.csda.2015.01.014>. Modern methods for survival analysis, including regression modelling (Cox, Fine-Gray, Ghosh-Lin, Binomial regression) with fast computation of influence functions.
Maintained by Klaus K. Holst. Last updated 22 hours ago.
multivariate-time-to-event, survival-analysis, time-to-event, fortran, openblas, cpp
9.8 match 14 stars 13.47 score 236 scripts 42 dependents
vitomuggeo
segmented:Regression Models with Break-Points / Change-Points Estimation (with Possibly Random Effects)
Fitting regression models where, in addition to possible linear terms, one or more covariates have segmented (i.e., broken-line or piece-wise linear) or stepmented (i.e. piece-wise constant) effects. Multiple breakpoints for the same variable are allowed. The estimation method is discussed in Muggeo (2003, <doi:10.1002/sim.1545>) and illustrated in Muggeo (2008, <https://www.r-project.org/doc/Rnews/Rnews_2008-1.pdf>). An approach for hypothesis testing is presented in Muggeo (2016, <doi:10.1080/00949655.2016.1149855>), and interval estimation for the breakpoint is discussed in Muggeo (2017, <doi:10.1111/anzs.12200>). Segmented mixed models, i.e. random effects in the change point, are discussed in Muggeo (2014, <doi:10.1177/1471082X13504721>). Estimation of piecewise-constant relationships and changepoints (mean-shift models) is discussed in Fasola et al. (2018, <doi:10.1007/s00180-017-0740-4>).
Maintained by Vito M. R. Muggeo. Last updated 15 days ago.
12.3 match 9 stars 10.03 score 1.2k scripts 203 dependents
cran
MASS:Support Functions and Datasets for Venables and Ripley's MASS
Functions and datasets to support Venables and Ripley, "Modern Applied Statistics with S" (4th edition, 2002).
Maintained by Brian Ripley. Last updated 15 days ago.
11.6 match 19 stars 10.53 score 11k dependents
tidymodels
broom:Convert Statistical Objects into Tidy Tibbles
Summarizes key information about statistical objects in tidy tibbles. This makes it easy to report results, create plots and consistently work with large numbers of models at once. Broom provides three verbs that each provide different types of information about a model. tidy() summarizes information about model components such as coefficients of a regression. glance() reports information about an entire model, such as goodness of fit measures like AIC and BIC. augment() adds information about individual observations to a dataset, such as fitted values or influence measures.
Maintained by Simon Couch. Last updated 4 months ago.
5.5 match 1.5k stars 21.56 score 37k scripts 1.4k dependents
ecospat
ecospat:Spatial Ecology Miscellaneous Methods
Collection of R functions and data sets for the support of spatial ecology analyses with a focus on pre, core and post modelling analyses of species distribution, niche quantification and community assembly. Written by current and former members and collaborators of the ecospat group of Antoine Guisan, Department of Ecology and Evolution (DEE) and Institute of Earth Surface Dynamics (IDYST), University of Lausanne, Switzerland. Read Di Cola et al. (2016) <doi:10.1111/ecog.02671> for details.
Maintained by Olivier Broennimann. Last updated 1 month ago.
11.8 match 32 stars 9.35 score 418 scripts 1 dependents
willtownes
glmpca:Dimension Reduction of Non-Normally Distributed Data
Implements a generalized version of principal components analysis (GLM-PCA) for dimension reduction of non-normally distributed data such as counts or binary matrices. Townes FW, Hicks SC, Aryee MJ, Irizarry RA (2019) <doi:10.1186/s13059-019-1861-6>. Townes FW (2019) <arXiv:1907.02647>.
Maintained by F. William Townes. Last updated 11 months ago.
10.1 match 94 stars 9.24 score 258 scripts 4 dependents
amrei-stammann
alpaca:Fit GLM's with High-Dimensional k-Way Fixed Effects
Provides a routine to partial out factors with many levels during the optimization of the log-likelihood function of the corresponding generalized linear model (glm). The package is based on the algorithm described in Stammann (2018) <arXiv:1707.01815> and is restricted to glm's that are based on maximum likelihood estimation and nonlinear. It also offers an efficient algorithm to recover estimates of the fixed effects in a post-estimation routine and includes robust and multi-way clustered standard errors. Further the package provides analytical bias corrections for binary choice models derived by Fernandez-Val and Weidner (2016) <doi:10.1016/j.jeconom.2015.12.014> and Hinz, Stammann, and Wanner (2020) <arXiv:2004.12655>.
Maintained by Amrei Stammann. Last updated 6 months ago.
11.9 match 45 stars 7.01 score 105 scripts
jinseob2kim
jstable:Create Tables from Different Types of Regression
Create regression tables from generalized linear model (GLM), generalized estimating equation (GEE), generalized linear mixed-effects model (GLMM), Cox proportional hazards model, survey-weighted generalized linear model (svyglm) and survey-weighted Cox model results for publication.
Maintained by Jinseob Kim. Last updated 10 days ago.
8.3 match 26 stars 9.98 score 199 scripts 1 dependents
alexpkeil1
qgcomp:Quantile G-Computation
G-computation for a set of time-fixed exposures with quantile-based basis functions, possibly under linearity and homogeneity assumptions. This approach estimates a regression line corresponding to the expected change in the outcome (on the link basis) given a simultaneous increase in the quantile-based category for all exposures. Works with continuous, binary, and right-censored time-to-event outcomes. Reference: Alexander P. Keil, Jessie P. Buckley, Katie M. OBrien, Kelly K. Ferguson, Shanshan Zhao, and Alexandra J. White (2019) A quantile-based g-computation approach to addressing the effects of exposure mixtures; <doi:10.1289/EHP5838>.
Maintained by Alexander Keil. Last updated 3 days ago.
exposure, exposure-mixture, exposure-mixtures, quantile-gcomputation, survival
9.4 match 37 stars 8.73 score 70 scripts 2 dependents
diystat
NBPSeq:Negative Binomial Models for RNA-Sequencing Data
Negative Binomial (NB) models for two-group comparisons and regression inferences from RNA-Sequencing Data.
Maintained by Yanming Di. Last updated 11 years ago.
16.8 match 1 stars 4.88 score 17 scripts 3 dependents
hojsgaard
doBy:Groupwise Statistics, LSmeans, Linear Estimates, Utilities
Utility package containing: 1) Facilities for working with grouped data: 'do' something to data stratified 'by' some variables. 2) LSmeans (least-squares means), general linear estimates. 3) Restrict functions to a smaller domain. 4) Miscellaneous other utilities.
Maintained by Søren Højsgaard. Last updated 3 days ago.
5.4 match 1 stars 14.94 score 3.2k scripts 939 dependents
mlr-org
mlr3learners:Recommended Learners for 'mlr3'
Recommended Learners for 'mlr3'. Extends 'mlr3' with interfaces to essential machine learning packages on CRAN. This includes, but is not limited to: (penalized) linear and logistic regression, linear and quadratic discriminant analysis, k-nearest neighbors, naive Bayes, support vector machines, and gradient boosting.
Maintained by Marc Becker. Last updated 4 months ago.
classification, learners, machine-learning, mlr3, regression
7.0 match 91 stars 11.51 score 1.5k scripts 10 dependents
nerler
JointAI:Joint Analysis and Imputation of Incomplete Data
Joint analysis and imputation of incomplete data in the Bayesian framework, using (generalized) linear (mixed) models and extensions thereof, survival models, or joint models for longitudinal and survival data, as described in Erler, Rizopoulos and Lesaffre (2021) <doi:10.18637/jss.v100.i20>. Incomplete covariates, if present, are automatically imputed. The package performs some preprocessing of the data and creates a 'JAGS' model, which will then automatically be passed to 'JAGS' <https://mcmc-jags.sourceforge.io/> with the help of the package 'rjags'.
Maintained by Nicole S. Erler. Last updated 12 months ago.
bayesian, generalized-linear-models, glm, glmm, imputation, imputations, jags, joint-analysis, linear-mixed-models, linear-regression-models, mcmc-sample, mcmc-sampling, missing-data, missing-values, survival, cpp
11.0 match 28 stars 7.30 score 59 scripts 1 dependents
pachadotdev
gravity:Estimation Methods for Gravity Models
A wrapper of different standard estimation methods for gravity models. This package provides estimation methods for log-log models and multiplicative models.
Maintained by Mauricio Vargas. Last updated 4 months ago.
bvu, bvw, ddm, econometrics, glm, gpml, gravity, international-trade, lm, maximum-likelihood, nbpml, nls, ols, ppml, sils, tobit, trade
11.0 match 35 stars 6.98 score 55 scripts
vadimtyuryaev
RegrCoeffsExplorer:Efficient Visualization of Regression Coefficients for lm(), glm(), and glmnet() Objects
The visualization tool offers a nuanced understanding of regression dynamics, going beyond traditional per-unit interpretation of continuous variables versus categorical ones. It highlights the impact of unit changes as well as larger shifts like interquartile changes, acknowledging the distribution of empirical data. Furthermore, it generates visualizations depicting alterations in Odds Ratios for predictors across minimum, first quartile, median, third quartile, and maximum values, aiding in comprehending predictor-outcome interplay within empirical data distributions, particularly in logistic regression frameworks.
Maintained by Vadim Tyuryaev. Last updated 2 months ago.
coefficients-of-linear-regression, confidence-intervals, empirical-data, glm, glmnet, lasso-regression, lm, postselectioninference, regression-analysis, regularized-linear-regression, regularized-logistic-regression, selectiveinference, statistics-for-data-science, visualization
15.1 match 1 stars 4.90 score 4 scripts
matteo21q
jomo:Multilevel Joint Modelling Multiple Imputation
Similarly to Schafer's package 'pan', 'jomo' is a package for multilevel joint modelling multiple imputation (Carpenter and Kenward, 2013) <doi:10.1002/9781119942283>. Novel aspects of 'jomo' are the possibility of handling binary and categorical data through latent normal variables, the option to use cluster-specific covariance matrices and to impute compatibly with the substantive model.
Maintained by Matteo Quartagno. Last updated 2 years ago.
7.7 match 3 stars 9.58 score 126 scripts 154 dependents
youyifong
kyotil:Utility Functions for Statistical Analysis Report Generation and Monte Carlo Studies
Helper functions for creating formatted summary of regression models, writing publication-ready tables to latex files, and running Monte Carlo experiments.
Maintained by Youyi Fong. Last updated 6 days ago.
9.0 match 7.87 score 236 scripts 7 dependents
lme4
lme4:Linear Mixed-Effects Models using 'Eigen' and S4
Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the 'Eigen' C++ library for numerical linear algebra and 'RcppEigen' "glue".
Maintained by Ben Bolker. Last updated 1 day ago.
3.4 match 647 stars 20.69 score 35k scripts 1.5k dependents
ewenharrison
finalfit:Quickly Create Elegant Regression Results Tables and Plots when Modelling
Generate regression results tables and plots in final format for publication. Explore models and export directly to PDF and 'Word' using 'RMarkdown'.
Maintained by Ewen Harrison. Last updated 6 months ago.
6.0 match 270 stars 11.43 score 1.0k scripts
mpascariu
ungroup:Penalized Composite Link Model for Efficient Estimation of Smooth Distributions from Coarsely Binned Data
Versatile method for ungrouping histograms (binned count data) assuming that counts are Poisson distributed and that the underlying sequence on a fine grid to be estimated is smooth. The method is based on the composite link model and estimation is achieved by maximizing a penalized likelihood. Smooth detailed sequences of counts and rates are so estimated from the binned counts. Ungrouping binned data can be desirable for many reasons: Bins can be too coarse to allow for accurate analysis; comparisons can be hindered when different grouping approaches are used in different histograms; and the last interval is often wide and open-ended and, thus, covers a lot of information in the tail area. Age-at-death distributions grouped in age classes and abridged life tables are examples of binned data. Because of modest assumptions, the approach is suitable for many demographic and epidemiological applications. For a detailed description of the method and applications see Rizzi et al. (2015) <doi:10.1093/aje/kwv020>.
Maintained by Marius D. Pascariu. Last updated 1 year ago.
distributions, glm, smoothing, ungrouping, cpp
11.0 match 14 stars 5.96 score 65 scripts
moviedo5
fda.usc:Functional Data Analysis and Utilities for Statistical Computing
Routines for exploratory and descriptive analysis of functional data such as depth measurements, atypical curves detection, regression models, supervised classification, unsupervised classification and functional analysis of variance.
Maintained by Manuel Oviedo de la Fuente. Last updated 4 months ago.
functional-data-analysis, fortran
6.5 match 12 stars 9.72 score 560 scripts 22 dependents
sbgraves237
Ecfun:Functions for 'Ecdat'
Functions and vignettes to update data sets in 'Ecdat' and to create, manipulate, plot, and analyze those and similar data sets.
Maintained by Spencer Graves. Last updated 3 months ago.
7.9 match 7.94 score 85 scripts 4 dependents
mqbssppe
poisson.glm.mix:Fit High Dimensional Mixtures of Poisson GLMs
Mixtures of Poisson Generalized Linear Models for high dimensional count data clustering. The (multivariate) responses can be partitioned into set of blocks. Three different parameterizations of the linear predictor are considered. The models are estimated according to the EM algorithm with an efficient initialization scheme <doi:10.1016/j.csda.2014.07.005>.
Maintained by Panagiotis Papastamoulis. Last updated 2 years ago.
40.3 match 1.52 score 11 scripts 1 dependents
guyabel
tidycat:Expand Tidy Output for Categorical Parameter Estimates
Create additional rows and columns on broom::tidy() output to allow for easier control on categorical parameter estimates.
Maintained by Guy J. Abel. Last updated 1 year ago.
data-visualization, data-viz, glm, model-comparison, regression-analysis, regression-models, statistical-analysis, statistical-modeling
11.0 match 4 stars 5.53 score 56 scripts 1 dependents
xsswang
remiod:Reference-Based Multiple Imputation for Ordinal/Binary Response
Reference-based multiple imputation of ordinal and binary responses under Bayesian framework, as described in Wang and Liu (2022) <arXiv:2203.02771>. Methods for missing-not-at-random include Jump-to-Reference (J2R), Copy Reference (CR), and Delta Adjustment which can generate tipping point analysis.
Maintained by Tony Wang. Last updated 2 years ago.
bayesian, control-based, copy-reference, delta-adjustment, generalized-linear-models, glm, jags, jump-to-reference, mcmc, missing-at-random, missing-data, missing-not-at-random, multiple-imputation, non-ignorable, ordinal-regression, pattern-mixture-model, reference-based, statistics, cpp
14.0 match 4.30 score 3 scripts
rapporter
pander:An R 'Pandoc' Writer
Contains some functions catching all messages, 'stdout' and other useful information while evaluating R code and other helpers to return user specified text elements (like: header, paragraph, table, image, lists etc.) in 'pandoc' markdown or several type of R objects similarly automatically transformed to markdown format. Also capable of exporting/converting (the resulting) complex 'pandoc' documents to e.g. HTML, 'PDF', 'docx' or 'odt'. This latter reporting feature is supported in brew syntax or with a custom reference class with a smarty caching 'backend'.
Maintained by Gergely Daróczi. Last updated 14 days ago.
literate-programming, markdown, pandoc, pandoc-markdown, reproducible-research, rmarkdown, cpp
3.6 match 297 stars 16.60 score 7.6k scripts 108 dependents
jared-fowler
prettyglm:Pretty Summaries of Generalized Linear Model Coefficients
One of the main advantages of using Generalised Linear Models is their interpretability. The goal of 'prettyglm' is to provide a set of functions which easily create beautiful coefficient summaries which can readily be shared and explained. 'prettyglm' helps users create coefficient summaries which include categorical base levels, variable importance and type III p.values. 'prettyglm' also creates beautiful relativity plots for categorical, continuous and splined coefficients.
Maintained by Jared Fowler. Last updated 1 year ago.
classification, classification-model, data-science, data-visualization, glm, linear-models, regression, regression-analysis, regression-model, regression-models, statistical-models
12.5 match 3 stars 4.73 score 36 scripts
zhuwang46
mpath:Regularized Linear Models
Algorithms compute robust estimators for loss functions in the concave convex (CC) family by the iteratively reweighted convex optimization (IRCO), an extension of the iteratively reweighted least squares (IRLS). The IRCO reduces the weight of the observation that leads to a large loss; it also provides weights to help identify outliers. Applications include robust (penalized) generalized linear models and robust support vector machines. The package also contains penalized Poisson, negative binomial, zero-inflated Poisson, zero-inflated negative binomial regression models and robust models with non-convex loss functions. Wang et al. (2014) <doi:10.1002/sim.6314>, Wang et al. (2015) <doi:10.1002/bimj.201400143>, Wang et al. (2016) <doi:10.1177/0962280214530608>, Wang (2021) <doi:10.1007/s11749-021-00770-2>, Wang (2020) <arXiv:2010.02848>.
Maintained by Zhu Wang. Last updated 3 years ago.
8.8 match 1 stars 6.67 score 131 scripts 4 dependents
martin3141
spant:MR Spectroscopy Analysis Tools
Tools for reading, visualising and processing Magnetic Resonance Spectroscopy data. The package includes methods for spectral fitting: Wilson (2021) <DOI:10.1002/mrm.28385> and spectral alignment: Wilson (2018) <DOI:10.1002/mrm.27605>.
Maintained by Martin Wilson. Last updated 29 days ago.
brain, mri, mrs, mrshub, spectroscopy, fortran
6.8 match 24 stars 8.55 score 81 scripts
egenn
rtemis:Machine Learning and Visualization
Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.
Maintained by E.D. Gennatas. Last updated 1 month ago.
data-science, data-visualization, machine-learning, machine-learning-library, visualization
7.7 match 145 stars 7.09 score 50 scripts 2 dependents
daniel-gerhard
mcprofile:Testing Generalized Linear Hypotheses for Generalized Linear Model Parameters by Profile Deviance
Calculation of signed root deviance profiles for linear combinations of parameters in a generalized linear model. Multiple tests and simultaneous confidence intervals are provided.
Maintained by Daniel Gerhard. Last updated 4 years ago.
11.0 match 1 stars 4.88 score 51 scripts 1 dependents
valentint
robust:Port of the S+ "Robust Library"
Methods for robust statistics, state of the art in the early 2000s, notably for robust regression and robust multivariate analysis.
Maintained by Valentin Todorov. Last updated 7 months ago.
7.1 match 7.51 score 572 scripts 8 dependents
cran
boot:Bootstrap Functions (Originally by Angelo Canty for S)
Functions and datasets for bootstrapping from the book "Bootstrap Methods and Their Application" by A. C. Davison and D. V. Hinkley (1997, CUP), originally written by Angelo Canty for S.
Maintained by Alessandra R. Brazzale. Last updated 7 months ago.
6.5 match 2 stars 8.21 score 2.3k dependents
christophergandrud
coreSim:Core Functionality for Simulating Quantities of Interest from Generalised Linear Models
Core functions for simulating quantities of interest from generalised linear models (GLM). This package will form the backbone of a series of other packages that improve the interpretation of GLM estimates.
Maintained by Christopher Gandrud. Last updated 8 years ago.
generalised-linear-modelsglmsimulating-quantitiessimulation
13.5 match 5 stars 3.88 score 9 scripts 1 dependents
kwstat
agridat:Agricultural Datasets
Datasets from books, papers, and websites related to agriculture. Example graphics and analyses are included. Data come from small-plot trials, multi-environment trials, uniformity trials, yield monitors, and more.
Maintained by Kevin Wright. Last updated 26 days ago.
4.7 match 125 stars 11.02 score 1.7k scripts 2 dependents
leifeld
texreg:Conversion of R Regression Output to LaTeX or HTML Tables
Converts coefficients, standard errors, significance stars, and goodness-of-fit statistics of statistical models into LaTeX tables or HTML tables/MS Word documents or to nicely formatted screen output for the R console for easy model comparison. A list of several models can be combined in a single table. The output is highly customizable. New model types can be easily implemented. Details can be found in Leifeld (2013), JStatSoft <doi:10.18637/jss.v055.i08>.
Maintained by Philip Leifeld. Last updated 2 months ago.
html-tableslatexlatex-tablesregressionreportingtabletexreg
3.7 match 113 stars 14.09 score 1.8k scripts 67 dependents
jaredlander
coefplot:Plots Coefficients from Fitted Models
Plots the coefficients from model objects. This very quickly shows the user the point estimates and confidence intervals for fitted models.
Maintained by Jared P. Lander. Last updated 3 years ago.
6.3 match 27 stars 8.28 score 744 scripts 1 dependents
stephenslab
fastglmpca:Fast Algorithms for Generalized Principal Component Analysis
Implements fast, scalable optimization algorithms for fitting generalized principal components analysis (GLM-PCA) models, as described in "A Generalization of Principal Components Analysis to the Exponential Family" Collins M, Dasgupta S, Schapire RE (2002, ISBN:9780262271738), and subsequently "Feature Selection and Dimension Reduction for Single-Cell RNA-Seq Based on a Multinomial Model" Townes FW, Hicks SC, Aryee MJ, Irizarry RA (2019) <doi:10.1186/s13059-019-1861-6>.
Maintained by Eric Weine. Last updated 3 days ago.
8.7 match 11 stars 5.72 score 16 scripts
bioc
MAST:Model-based Analysis of Single Cell Transcriptomics
Methods and models for handling zero-inflated single cell assay data.
Maintained by Andrew McDavid. Last updated 5 months ago.
geneexpressiondifferentialexpressiongenesetenrichmentrnaseqtranscriptomicssinglecell
3.9 match 230 stars 12.75 score 1.8k scripts 5 dependents
haghish
mlim:Single and Multiple Imputation with Automated Machine Learning
Machine learning algorithms have been used for single imputation of missing data and, most recently, for multiple imputation. However, this is the first attempt at using automated machine learning algorithms for both single and multiple imputation. Automated machine learning is a procedure for fine-tuning the model automatically, performing a random search for a model that results in less error without overfitting the data. The main idea is to allow the model to set its own parameters for imputing each variable separately instead of setting fixed predefined parameters to impute all variables of the dataset. Using automated machine learning, the package fine-tunes an Elastic Net (default) or Gradient Boosting, Random Forest, Deep Learning, Extreme Gradient Boosting, or Stacked Ensemble machine learning model (from one or a combination of other supported algorithms) for imputing the missing observations. This procedure has been implemented for the first time by this package and is expected to outperform other packages for imputing missing data that do not fine-tune their models. Multiple imputation is implemented via bootstrapping without letting duplicated observations harm the cross-validation procedure, which is the way imputed variables are evaluated. Most notably, the package implements an automated procedure for imputing imbalanced data (the class rarity problem), which arises when a factor variable has a level that is far more prevalent than the other(s). This is known to result in biased predictions and hence biased imputation of missing data. The autobalancing procedure ensures that, instead of focusing on maximizing accuracy (minimizing classification error) in imputing factor variables, a fairer procedure and imputation method is practiced.
Maintained by E. F. Haghish. Last updated 8 months ago.
automatic-machine-learningautomlclassimbalancedata-scienceelastic-netextreme-gradient-boostinggbmglmgradient-boostinggradient-boosting-machineimputationimputation-algorithmimputation-methodsmachine-learningmissing-datamultipleimputationstack-ensemble
11.0 match 31 stars 4.49 score 7 scripts
julierennes
misaem:Linear Regression and Logistic Regression with Missing Covariates
Estimate parameters of linear regression and logistic regression with missing covariates, and perform model selection and prediction, using EM-type algorithms. Jiang W., Josse J., Lavielle M., TraumaBase Group (2020) <doi:10.1016/j.csda.2019.106907>.
Maintained by Julie Josse. Last updated 4 years ago.
11.8 match 1 stars 4.20 score 32 scripts
aariq
bumbl:Tools for Modeling Bumblebee Colony Growth and Decline
Bumblebee colonies grow during worker production, then decline after switching to production of reproductive individuals (drones and gynes). This package provides tools for modeling and visualizing this pattern by identifying a switchpoint with a growth rate before and a decline rate after the switchpoint. The mathematical models fit by bumbl are described in Crone and Williams (2016) <doi:10.1111/ele.12581>.
Maintained by Eric R. Scott. Last updated 2 years ago.
bumblebeedemographyglmswitchpoint
11.0 match 3 stars 4.48 score 8 scripts
gkremling
gofreg:Bootstrap-Based Goodness-of-Fit Tests for Parametric Regression
Provides statistical methods to check whether a parametric family of conditional density functions fits a given dataset of covariates and response variables. Different test statistics can be used to determine the goodness-of-fit of the assumed model, see Andrews (1997) <doi:10.2307/2171880>, Bierens & Wang (2012) <doi:10.1017/S0266466611000168>, Dikta & Scheer (2021) <doi:10.1007/978-3-030-73480-0> and Kremling & Dikta (2024) <doi:10.48550/arXiv.2409.20262>. As proposed in these papers, the corresponding p-values are approximated using a parametric bootstrap method.
Maintained by Gitte Kremling. Last updated 5 months ago.
9.0 match 5.30 score 9 scripts
r-forge
robustbase:Basic Robust Statistics
"Essential" Robust Statistics. Tools for analyzing data with robust methods. This includes regression methodology including model selections and multivariate statistics where we strive to cover the book "Robust Statistics, Theory and Methods" by 'Maronna, Martin and Yohai'; Wiley 2006.
Maintained by Martin Maechler. Last updated 4 months ago.
3.5 match 13.33 score 1.7k scripts 480 dependents
bioc
apeglm:Approximate posterior estimation for GLM coefficients
apeglm provides Bayesian shrinkage estimators for effect sizes for a variety of GLM models, using approximation of the posterior for individual coefficients.
Maintained by Anqi Zhu. Last updated 5 months ago.
immunooncologysequencingrnaseqdifferentialexpressiongeneexpressionbayesiancpp
5.3 match 8.64 score 700 scripts 9 dependents
bioc
CytoGLMM:Conditional Differential Analysis for Flow and Mass Cytometry Experiments
The CytoGLMM R package implements two multiple regression strategies: A bootstrapped generalized linear model (GLM) and a generalized linear mixed model (GLMM). Most current data analysis tools compare expressions across many computationally discovered cell types. CytoGLMM focuses on just one cell type. Our narrower field of application allows us to define a more specific statistical model with easier to control statistical guarantees. As a result, CytoGLMM finds differential proteins in flow and mass cytometry data while reducing biases arising from marker correlations and safeguarding against false discoveries induced by patient heterogeneity.
Maintained by Christof Seiler. Last updated 5 months ago.
flowcytometryproteomicssinglecellcellbasedassayscellbiologyimmunooncologyregressionstatisticalmethodsoftware
8.1 match 2 stars 5.68 score 1 scripts 1 dependents
myaseen208
StroupGLMM:R Codes and Datasets for Generalized Linear Mixed Models: Modern Concepts, Methods and Applications by Walter W. Stroup
R Codes and Datasets for Stroup, W. W. (2012). Generalized Linear Mixed Models Modern Concepts, Methods and Applications, CRC Press.
Maintained by Muhammad Yaseen. Last updated 5 months ago.
11.0 match 14 stars 4.15 score 2 scripts
cardiomoon
moonBook:Functions and Datasets for the Book by Keon-Woong Moon
Several analysis-related functions for the book entitled "R statistics and graph for medical articles" (written in Korean), version 1, by Keon-Woong Moon with Korean demographic data with several plot functions.
Maintained by Keon-Woong Moon. Last updated 1 year ago.
4.7 match 37 stars 9.66 score 278 scripts 5 dependents
sciviews
modelit:Statistical Models for 'SciViews::R'
Create and use statistical models (linear, general, nonlinear...) with extensions to support rich-formatted tables, equations and plots for the 'SciViews::R' dialect.
Maintained by Philippe Grosjean. Last updated 4 months ago.
13.7 match 1 stars 3.30 score 8 scripts
cran
catalytic:Tools for Applying Catalytic Priors in Statistical Modeling
To improve estimation accuracy and stability in statistical modeling, catalytic prior distributions are employed, integrating observed data with synthetic data generated from a simpler model's predictive distribution. This approach enhances model robustness, stability, and flexibility in complex data scenarios. The catalytic prior distributions are introduced by 'Huang et al.' (2020, <doi:10.1073/pnas.1920913117>), Li and Huang (2023, <doi:10.48550/arXiv.2312.01411>).
Maintained by Dongming Huang. Last updated 3 months ago.
14.0 match 3.18 score
ddalthorp
dwp:Density-Weighted Proportion
Fit a Poisson regression to carcass distance data and integrate over the searched area at a wind farm to estimate the fraction of carcasses falling in the searched area and format the output for use as the dwp parameter in the 'GenEst' or 'eoa' package for estimating bird and bat mortality, following Dalthorp, et al. (2022) <arXiv:2201.10064>.
Maintained by Daniel Dalthorp. Last updated 2 years ago.
16.2 match 1 stars 2.70 score
crsh
papaja:Prepare American Psychological Association Journal Articles with R Markdown
Tools to create dynamic, submission-ready manuscripts, which conform to American Psychological Association manuscript guidelines. We provide R Markdown document formats for manuscripts (PDF and Word) and revision letters (PDF). Helper functions facilitate reporting statistical analyses or create publication-ready tables and plots.
Maintained by Frederik Aust. Last updated 16 days ago.
apaapa-guidelinesjournalmanuscriptpsychologyreproducible-paperreproducible-researchrmarkdown
3.7 match 662 stars 11.74 score 1.7k scripts 1 dependents
wwbrannon
sqlscore:Utilities for Generating SQL Queries from Model Objects
Provides utilities for generating SQL queries (particularly CREATE TABLE statements) from R model objects. The most important use case is generating SQL to score a generalized linear model or related model represented as an R object, in which case the package handles parsing formula operators and including the model's response function.
Maintained by William Brannon. Last updated 6 years ago.
11.0 match 13 stars 3.81 score 8 scripts
cwatson
brainGraph:Graph Theory Analysis of Brain MRI Data
A set of tools for performing graph theory analysis of brain MRI data. It works with data from a Freesurfer analysis (cortical thickness, volumes, local gyrification index, surface area), diffusion tensor tractography data (e.g., from FSL) and resting-state fMRI data (e.g., from DPABI). It contains a graphical user interface for graph visualization and data exploration, along with several functions for generating useful figures.
Maintained by Christopher G. Watson. Last updated 1 year ago.
brain-connectivitybrain-imagingcomplex-networksconnectomeconnectomicsfmrigraph-theorymrinetwork-analysisneuroimagingneurosciencestatisticstractography
5.3 match 188 stars 7.86 score 107 scripts 3 dependents
bioc
msmsTests:LC-MS/MS Differential Expression Tests
Statistical tests for label-free LC-MS/MS data by spectral counts, to discover differentially expressed proteins between two biological conditions. Three tests are available: Poisson GLM regression, quasi-likelihood GLM regression, and the negative binomial of the edgeR package. The three models admit blocking factors to control for nuisance variables. To assure a good level of reproducibility a post-test filter is available, where we may set the minimum effect size considered biologically relevant, and the minimum expression of the most abundant condition.
Maintained by Josep Gregori i Font. Last updated 5 months ago.
immunooncologysoftwaremassspectrometryproteomics
8.2 match 5.03 score 15 scripts 1 dependents
lotze
COMPoissonReg:Conway-Maxwell Poisson (COM-Poisson) Regression
Fit Conway-Maxwell Poisson (COM-Poisson or CMP) regression models to count data (Sellers & Shmueli, 2010) <doi:10.1214/09-AOAS306>. The package provides functions for model estimation, dispersion testing, and diagnostics. Zero-inflated CMP regression (Sellers & Raim, 2016) <doi:10.1016/j.csda.2016.01.007> is also supported.
Maintained by Andrew Raim. Last updated 1 year ago.
6.3 match 9 stars 6.63 score 53 scripts 3 dependents
bioc
mirTarRnaSeq:mirTarRnaSeq
The mirTarRnaSeq R package can be used for interactive mRNA miRNA sequencing statistical analysis. This package utilizes expression or differential expression mRNA and miRNA sequencing results and performs interactive correlation and various GLM (Regular GLM, Multivariate GLM, and Interaction GLM) analyses between mRNA and miRNA experiments. These experiments can be time point experiments and/or condition experiments.
Maintained by Mercedeh Movassagh. Last updated 5 months ago.
mirnaregressionsoftwaresequencingsmallrnatimecoursedifferentialexpression
10.1 match 4.00 score 9 scripts
gksmyth
statmod:Statistical Modeling
A collection of algorithms and functions to aid statistical modeling. Includes limiting dilution analysis (aka ELDA), growth curve comparisons, mixed linear models, heteroscedastic regression, inverse-Gaussian probability calculations, Gauss quadrature and a secure convergence algorithm for nonlinear models. Also includes advanced generalized linear model functions including Tweedie and Digamma distributional families, secure convergence and exact distributional calculations for unit deviances.
Maintained by Gordon Smyth. Last updated 2 years ago.
4.0 match 1 stars 9.62 score 2.2k scripts 849 dependents
luisagi
enmpa:Ecological Niche Modeling using Presence-Absence Data
A set of tools to perform Ecological Niche Modeling with presence-absence data. It includes algorithms for data partitioning, model fitting, calibration, evaluation, selection, and prediction. Other functions help to explore signals of ecological niche using univariate and multivariate analyses, and model features such as variable response curves and variable importance. Unique characteristics of this package are the ability to exclude models with concave quadratic responses, and the option to clamp model predictions to specific variables. These tools are implemented following principles proposed in Cobos et al., (2022) <doi:10.17161/bi.v17i.15985>, Cobos et al., (2019) <doi:10.7717/peerj.6281>, and Peterson et al., (2008) <doi:10.1016/j.ecolmodel.2007.11.008>.
Maintained by Luis F. Arias-Giraldo. Last updated 3 months ago.
8.7 match 5 stars 4.35 score 5 scripts
danlwarren
ENMTools:Analysis of Niche Evolution using Niche and Distribution Models
Constructing niche models and analyzing patterns of niche evolution. Acts as an interface for many popular modeling algorithms, and allows users to conduct Monte Carlo tests to address basic questions in evolutionary ecology and biogeography. Warren, D.L., R.E. Glor, and M. Turelli (2008) <doi:10.1111/j.1558-5646.2008.00482.x> Glor, R.E., and D.L. Warren (2011) <doi:10.1111/j.1558-5646.2010.01177.x> Warren, D.L., R.E. Glor, and M. Turelli (2010) <doi:10.1111/j.1600-0587.2009.06142.x> Cardillo, M., and D.L. Warren (2016) <doi:10.1111/geb.12455> D.L. Warren, L.J. Beaumont, R. Dinnage, and J.B. Baumgartner (2019) <doi:10.1111/ecog.03900>.
Maintained by Dan Warren. Last updated 2 months ago.
5.4 match 105 stars 6.91 score 126 scripts
cliffordlai
bestglm:Best Subset GLM and Regression Utilities
Best subset glm using information criteria or cross-validation, carried out using the 'leaps' algorithm (Furnival and Wilson, 1974) <doi:10.2307/1267601> or complete enumeration (Morgan and Tatar, 1972) <doi:10.1080/00401706.1972.10488918>. Implements PCR and PLS using AIC/BIC. Implements one-standard deviation rule for use with the 'caret' package.
Maintained by Yuanhao Lai. Last updated 5 years ago.
7.1 match 5.29 score 418 scripts 5 dependents
amices
mice:Multivariate Imputation by Chained Equations
Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm as described in Van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>. Each variable has its own imputation model. Built-in imputation models are provided for continuous data (predictive mean matching, normal), binary data (logistic regression), unordered categorical data (polytomous logistic regression) and ordered categorical data (proportional odds). MICE can also impute continuous two-level data (normal model, pan, second-level variables). Passive imputation can be used to maintain consistency between variables. Various diagnostic plots are available to inspect the quality of the imputations.
Maintained by Stef van Buuren. Last updated 5 days ago.
chained-equationsfcsimputationmicemissing-datamissing-valuesmultiple-imputationmultivariate-datacpp
2.3 match 462 stars 16.50 score 10k scripts 154 dependents
jpmml
r2pmml:Convert R Models to PMML
R wrapper for the JPMML-R library <https://github.com/jpmml/jpmml-r>, which converts R models to Predictive Model Markup Language (PMML).
Maintained by Villu Ruusmann. Last updated 11 days ago.
5.8 match 74 stars 6.29 score 35 scripts
cran
lctools:Local Correlation, Spatial Inequalities, Geographically Weighted Regression and Other Tools
Provides researchers and educators with easy-to-learn, user-friendly tools for calculating key spatial statistics and for applying simple as well as advanced methods of spatial analysis to real data. These include: Local Pearson and Geographically Weighted Pearson Correlation Coefficients, Spatial Inequality Measures (Gini, Spatial Gini, LQ, Focal LQ), Spatial Autocorrelation (Global and Local Moran's I), several Geographically Weighted Regression techniques and other Spatial Analysis tools (other geographically weighted statistics). This package also contains functions for measuring the significance of each statistic calculated, mainly based on Monte Carlo simulations.
Maintained by Stamatis Kalogirou. Last updated 12 months ago.
11.9 match 1 stars 3.03 score 53 scripts
pat-s
oddsratio:Odds Ratio Calculation for GAM(M)s & GLM(M)s
Simplified odds ratio calculation of GAM(M)s & GLM(M)s. Provides structured output (data frame) of all predictors and their corresponding odds ratios and confidence intervals for further analyses. It helps to avoid false references of predictors and increments by specifying these parameters in a list instead of using 'exp(coef(model))' (standard approach of odds ratio calculation for GLMs) which just returns a plain numeric output. For GAM(M)s, odds ratio calculation is highly simplified with this package since it takes care of the multiple 'predict()' calls of the chosen predictor while holding other predictors constant. Also, this package allows odds ratio calculation of percentage steps across the whole predictor distribution range for GAM(M)s. In both cases, confidence intervals are returned additionally. Calculated odds ratios of GAM(M)s can be inserted into the smooth function plot.
Maintained by Patrick Schratz. Last updated 11 months ago.
odds-ratioprobabilitystatistics
4.8 match 31 stars 7.48 score 81 scripts 1 dependents
bioc
snpStats:SnpMatrix and XSnpMatrix classes and methods
Classes and statistical methods for large SNP association studies. This extends the earlier snpMatrix package, allowing for uncertainty in genotypes.
Maintained by David Clayton. Last updated 5 months ago.
microarraysnpgeneticvariabilityzlib
3.8 match 9.41 score 674 scripts 17 dependents
pauleilers
JOPS:Practical Smoothing with P-Splines
Functions and data to reproduce all plots in the book "Practical Smoothing. The Joys of P-splines" by Paul H.C. Eilers and Brian D. Marx (2021, ISBN:978-1108482950).
Maintained by Paul Eilers. Last updated 2 years ago.
10.4 match 1 stars 3.43 score 296 scripts 3 dependents
boennecd
parglm:Parallel GLM
Provides a parallel estimation method for generalized linear models without compiling with a multithreaded LAPACK or BLAS.
Maintained by Benjamin Christoffersen. Last updated 3 years ago.
generalized-linear-modelsparallel-computingopenblascpp
5.5 match 11 stars 6.41 score 39 scripts 4 dependents
lrberge
fixest:Fast Fixed-Effects Estimations
Fast and user-friendly estimation of econometric models with multiple fixed-effects. Includes ordinary least squares (OLS), generalized linear models (GLM) and the negative binomial. The core of the package is based on optimized parallel C++ code, scaling especially well for large data sets. The method to obtain the fixed-effects coefficients is based on Berge (2018) <https://github.com/lrberge/fixest/blob/master/_DOCS/FENmlm_paper.pdf>. Further provides tools to export and view the results of several estimations with intuitive design to cluster the standard-errors.
Maintained by Laurent Berge. Last updated 7 months ago.
2.4 match 387 stars 14.69 score 3.8k scripts 25 dependents
evilgraham
flatr:Transforms Contingency Tables to Data Frames, and Analyses Them
Contingency tables are a pain to work with when you want to run regressions. This package takes them and flattens them into a long data frame, so you can more easily analyse them! You can also calculate other related statistics. All of this is done in a 'tidy' manner, so it should tie in nicely with the 'tidyverse' series of packages.
Maintained by Scott D. Graham. Last updated 7 years ago.
contingency-tableglmregressiontidytidy-data
11.0 match 3 stars 3.18 score 6 scripts
alexisderumigny
CondCopulas:Estimation and Inference for Conditional Copula Models
Provides functions for the estimation of conditional copula models, various estimators of conditional Kendall's tau (proposed in Derumigny and Fermanian (2019a, 2019b, 2020) <doi:10.1515/demo-2019-0016>, <doi:10.1016/j.csda.2019.01.013>, <doi:10.1016/j.jmva.2020.104610>), and test procedures for the simplifying assumption (proposed in Derumigny and Fermanian (2017) <doi:10.1515/demo-2017-0011> and Derumigny, Fermanian and Min (2022) <doi:10.1002/cjs.11742>).
Maintained by Alexis Derumigny. Last updated 6 months ago.
conditional-copulasconditional-kendalls-taucopulasr-pkgsimplifying-assumption
7.4 match 2 stars 4.70 score 7 scripts
pauljohn32
rockchalk:Regression Estimation and Presentation
A collection of functions for interpretation and presentation of regression analysis. These functions are used to produce the statistics lectures in <https://pj.freefaculty.org/guides/>. Includes regression diagnostics, regression tables, and plots of interactions and "moderator" variables. The emphasis is on "mean-centered" and "residual-centered" predictors. The vignette 'rockchalk' offers a fairly comprehensive overview. The vignette 'Rstyle' has advice about coding in R. The package title 'rockchalk' refers to our school motto, 'Rock Chalk Jayhawk, Go K.U.'.
Maintained by Paul E. Johnson. Last updated 3 years ago.
4.8 match 7.13 score 584 scripts 18 dependents
cran
gplm:Generalized Partial Linear Models (GPLM)
Provides functions for estimating a generalized partial linear model, a semiparametric variant of the generalized linear model (GLM) which replaces the linear predictor by the sum of a linear and a nonparametric function.
Maintained by Marlene Mueller. Last updated 9 years ago.
17.0 match 2.00 score
sinhrks
ggfortify:Data Visualization Tools for Statistical Analysis Results
Unified plotting tools for statistics commonly used, such as GLM, time series, PCA families, clustering and survival analysis. The package offers a single plotting interface for these analysis results and plots in a unified style using 'ggplot2'.
Maintained by Yuan Tang. Last updated 9 months ago.
2.3 match 529 stars 14.49 score 9.1k scripts 22 dependents
moskante
MixedPsy:Statistical Tools for the Analysis of Psychophysical Data
Tools for the analysis of psychophysical data in R. This package estimates the Point of Subjective Equivalence (PSE) and the Just Noticeable Difference (JND), either from a psychometric function or from a Generalized Linear Mixed Model (GLMM). Additionally, the package allows plotting the fitted models and the response data, simulating psychometric functions of different shapes, and simulating data sets. For a description of the use of GLMMs applied to psychophysical data, refer to Moscatelli et al. (2012).
Maintained by Alessandro Moscatelli. Last updated 25 days ago.
8.8 match 5 stars 3.70 score 9 scripts
ikosmidis
brglm:Bias Reduction in Binomial-Response Generalized Linear Models
Fit generalized linear models with binomial responses using either an adjusted-score approach to bias reduction or maximum penalized likelihood where penalization is by Jeffreys invariant prior. These procedures return estimates with improved frequentist properties (bias, mean squared error) that are always finite even in cases where the maximum likelihood estimates are infinite (data separation). Fitting takes place by fitting generalized linear models on iteratively updated pseudo-data. The interface is essentially the same as 'glm'. More flexibility is provided by the fact that custom pseudo-data representations can be specified and used for model fitting. Functions are provided for the construction of confidence intervals for the reduced-bias estimates.
Maintained by Ioannis Kosmidis. Last updated 4 years ago.
4.6 match 6 stars 7.14 score 86 scripts 11 dependents
mdonoghoe
glm2:Fitting Generalized Linear Models
Fits generalized linear models using the same model specification as glm in the stats package, but with a modified default fitting method that provides greater stability for models that may fail to converge using glm.
Maintained by Mark W. Donoghoe. Last updated 7 years ago.
5.6 match 1 stars 5.78 score 270 scripts 24 dependents
nrs02004
SGL:Fit a GLM (or Cox Model) with a Combination of Lasso and Group Lasso Regularization
Fit a regularized generalized linear model via penalized maximum likelihood. The model is fit for a path of values of the penalty parameter. Fits linear, logistic and Cox models.
Maintained by Noah Simon. Last updated 5 years ago.
7.8 match 6 stars 4.11 score 71 scripts 1 dependents
davidgohel
flextable:Functions for Tabular Reporting
Use a grammar for creating and customizing pretty tables. The following formats are supported: 'HTML', 'PDF', 'RTF', 'Microsoft Word', 'Microsoft PowerPoint' and R 'Grid Graphics'. 'R Markdown', 'Quarto' and the package 'officer' can be used to produce the result files. The syntax is the same for the user regardless of the type of output to be produced. A set of functions allows the creation, definition of cell arrangement, addition of headers or footers, formatting and definition of cell content with text and/or images. The package also offers a set of high-level functions that allow tabular reporting of statistical models and the creation of complex cross tabulations.
Maintained by David Gohel. Last updated 1 month ago.
docxhtml5ms-office-documentsrmarkdowntable
1.9 match 583 stars 17.04 score 7.3k scripts 119 dependents
cumulocity-iot
pmml:Generate PMML for Various Models
The Predictive Model Markup Language (PMML) is an XML-based language which provides a way for applications to define machine learning, statistical and data mining models and to share models between PMML compliant applications. More information about the PMML industry standard and the Data Mining Group can be found at <http://dmg.org/>. The generated PMML can be imported into any PMML consuming application, such as Zementis Predictive Analytics products. The package isofor (used for anomaly detection) can be installed with devtools::install_github("gravesee/isofor").
Maintained by Dmitriy Bolotov. Last updated 3 years ago.
4.0 match 20 stars 7.98 score 560 scripts 1 dependents
clbustos
dominanceanalysis:Dominance Analysis
Dominance analysis is a method for comparing the relative importance of predictors in multiple regression models: ordinary least squares, generalized linear models, hierarchical linear models, beta regression and dynamic linear models. The main principles and methods of dominance analysis are described in Budescu, D. V. (1993) <doi:10.1037/0033-2909.114.3.542> and Azen, R., & Budescu, D. V. (2003) <doi:10.1037/1082-989X.8.2.129> for ordinary least squares regression. Subsequently, the extensions for multivariate regression, logistic regression and hierarchical linear models were described in Azen, R., & Budescu, D. V. (2006) <doi:10.3102/10769986031002157>, Azen, R., & Traxel, N. (2009) <doi:10.3102/1076998609332754> and Luo, W., & Azen, R. (2013) <doi:10.3102/1076998612458319>, respectively.
Maintained by Claudio Bustos Navarrete. Last updated 1 year ago.
5.5 match 25 stars 5.75 score 45 scripts
mlr-org
mlr3extralearners:Extra Learners For mlr3
Extra learners for use in mlr3.
Maintained by Sebastian Fischer. Last updated 4 months ago.
3.4 match 94 stars 9.16 score 474 scripts
bioc
DESeq2:Differential gene expression analysis based on the negative binomial distribution
Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.
Maintained by Michael Love. Last updated 10 days ago.
sequencingrnaseqchipseqgeneexpressiontranscriptionnormalizationdifferentialexpressionbayesianregressionprincipalcomponentclusteringimmunooncologyopenblascpp
1.9 match 375 stars 16.11 score 17k scripts 115 dependents
msesia
knockoff:The Knockoff Filter for Controlled Variable Selection
The knockoff filter is a general procedure for controlling the false discovery rate (FDR) when performing variable selection. For more information, see the website below and the accompanying paper: Candes et al., "Panning for gold: model-X knockoffs for high-dimensional controlled variable selection", J. R. Statist. Soc. B (2018) 80, 3, pp. 551-577.
Maintained by Matteo Sesia. Last updated 3 years ago.
5.6 match 2 stars 5.35 score 248 scripts 5 dependents
cran
RCAL:Regularized Calibrated Estimation
Regularized calibrated estimation for causal inference and missing-data problems with high-dimensional data, based on Tan (2020a) <doi:10.1093/biomet/asz059>, Tan (2020b) <doi:10.1214/19-AOS1824> and Sun and Tan (2020) <arXiv:2009.09286>.
Maintained by Zhiqiang Tan. Last updated 4 years ago.
8.5 match 3.49 score 17 scripts 1 dependents
florianhartig
DHARMa:Residual Diagnostics for Hierarchical (Multi-Level / Mixed) Regression Models
The 'DHARMa' package uses a simulation-based approach to create readily interpretable scaled (quantile) residuals for fitted (generalized) linear mixed models. Currently supported are linear and generalized linear (mixed) models from 'lme4' (classes 'lmerMod', 'glmerMod'), 'glmmTMB', 'GLMMadaptive', and 'spaMM'; phylogenetic linear models from 'phylolm' (classes 'phylolm' and 'phyloglm'); generalized additive models ('gam' from 'mgcv'); 'glm' (including 'negbin' from 'MASS', but excluding quasi-distributions) and 'lm' model classes. Moreover, externally created simulations, e.g. posterior predictive simulations from Bayesian software such as 'JAGS', 'STAN', or 'BUGS' can be processed as well. The resulting residuals are standardized to values between 0 and 1 and can be interpreted as intuitively as residuals from a linear regression. The package also provides a number of plot and test functions for typical model misspecification problems, such as over/underdispersion, zero-inflation, and residual spatial, phylogenetic and temporal autocorrelation.
Maintained by Florian Hartig. Last updated 11 days ago.
glmm regression regression-diagnostics residual
2.0 match 226 stars 14.74 score 2.8k scripts 10 dependents
bertcarnell
tornado:Plots for Model Sensitivity and Variable Importance
Draws tornado plots for model sensitivity to univariate changes. Implements methods for many modeling methods including linear models, generalized linear models, survival regression models, and arbitrary machine learning models in the caret package. Also draws variable importance plots.
Maintained by Rob Carnell. Last updated 7 months ago.
explanability regression sensitivity-analysis
6.1 match 7 stars 4.85 score 4 scripts
bioc
compcodeR:RNAseq data simulation, differential expression analysis and performance comparison of differential expression methods
This package provides extensive functionality for comparing results obtained by different methods for differential expression analysis of RNAseq data. It also contains functions for simulating count data. Finally, it provides convenient interfaces to several packages for performing the differential expression analysis. These can also be used as templates for setting up and running a user-defined differential analysis workflow within the framework of the package.
Maintained by Charlotte Soneson. Last updated 3 months ago.
immunooncology rnaseq differentialexpression
3.6 match 11 stars 8.06 score 26 scripts
flxzimmer
mlpwr:A Power Analysis Toolbox to Find Cost-Efficient Study Designs
We implement a surrogate modeling algorithm to guide simulation-based sample size planning. The method is described in detail in our paper (Zimmer & Debelak (2023) <doi:10.1037/met0000611>). It supports multiple study design parameters and optimization with respect to a cost function. It can find optimal designs that correspond to a desired statistical power or that fulfill a cost constraint. We also provide a tutorial paper (Zimmer et al. (2023) <doi:10.3758/s13428-023-02269-0>).
Maintained by Felix Zimmer. Last updated 5 months ago.
5.0 match 4 stars 5.83 score 16 scripts
dqksnow
subsampling:Optimal Subsampling Methods for Statistical Models
Balancing computational and statistical efficiency, subsampling techniques offer a practical solution for handling large-scale data analysis. Subsampling methods enhance statistical modeling for massive datasets by efficiently drawing representative subsamples from the full dataset based on tailored sampling probabilities. These probabilities are optimized for specific goals, such as minimizing the variance of coefficient estimates or reducing prediction error.
Maintained by Qingkai Dong. Last updated 4 months ago.
5.1 match 1 stars 5.60 score 6 scripts
dwarton
ecostats:Code and Data Accompanying the Eco-Stats Text (Warton 2022)
Functions and data supporting the Eco-Stats text (Warton, 2022, Springer), and solutions to exercises. Functions include tools for using simulation envelopes in diagnostic plots, and a function for diagnostic plots of multivariate linear models. Datasets mentioned in the package are included here (where not available elsewhere) and there is a vignette for each chapter of the text with solutions to exercises.
Maintained by David Warton. Last updated 1 year ago.
4.3 match 8 stars 6.58 score 53 scripts
openpharma
beeca:Binary Endpoint Estimation with Covariate Adjustment
Performs estimation of marginal treatment effects for binary outcomes when using logistic regression working models with covariate adjustment (see discussions in Magirr et al (2024) <https://osf.io/9mp58/>). Implements the variance estimators of Ge et al (2011) <doi:10.1177/009286151104500409> and Ye et al (2023) <doi:10.1080/24754269.2023.2205802>.
Maintained by Alex Przybylski. Last updated 4 months ago.
5.2 match 6 stars 5.48 score 8 scripts
melff
memisc:Management of Survey Data and Presentation of Analysis Results
An infrastructure for the management of survey data including value labels, definable missing values, recoding of variables, production of code books, and import of (subsets of) 'SPSS' and 'Stata' files is provided. Further, the package can produce tables and data frames of arbitrary descriptive statistics and (almost) publication-ready tables of regression model estimates, which can be exported to 'LaTeX' and HTML.
Maintained by Martin Elff. Last updated 10 days ago.
2.3 match 46 stars 12.34 score 1.2k scripts 13 dependents
alexpkeil1
qgcompint:Quantile G-Computation Extensions for Effect Measure Modification
G-computation for a set of time-fixed exposures with quantile-based basis functions, possibly under linearity and homogeneity assumptions. Effect measure modification in this method is a way to assess how the effect of the mixture varies by a binary, categorical or continuous variable. Reference: Alexander P. Keil, Jessie P. Buckley, Katie M. OBrien, Kelly K. Ferguson, Shanshan Zhao, and Alexandra J. White (2019) A quantile-based g-computation approach to addressing the effects of exposure mixtures; <doi:10.1289/EHP5838>.
Maintained by Alexander Keil. Last updated 3 days ago.
5.6 match 4 stars 4.89 score 13 scripts
bioc
multiHiCcompare:Normalize and detect differences between Hi-C datasets when replicates of each experimental condition are available
multiHiCcompare provides functions for joint normalization and difference detection in multiple Hi-C datasets. This extension of the original HiCcompare package now allows for Hi-C experiments with more than 2 groups and multiple samples per group. multiHiCcompare operates on processed Hi-C data in the form of sparse upper triangular matrices. It accepts four column (chromosome, region1, region2, IF) tab-separated text files storing chromatin interaction matrices. multiHiCcompare provides cyclic loess and fast loess (fastlo) methods adapted to jointly normalizing Hi-C data. Additionally, it provides a general linear model (GLM) framework adapting the edgeR package to detect differences in Hi-C data in a distance dependent manner.
Maintained by Mikhail Dozmorov. Last updated 5 months ago.
software hic sequencing normalization
3.7 match 9 stars 7.30 score 37 scripts 2 dependents
jinli22
spm2:Spatial Predictive Modeling
An updated and extended version of the 'spm' package, introducing further novel functions for modern statistical methods (i.e., generalised linear models, glmnet, generalised least squares), thin plate splines, support vector machine, kriging methods (i.e., simple kriging, universal kriging, block kriging, kriging with an external drift), and novel hybrid methods (228 hybrids plus numerous variants) of modern statistical methods or machine learning methods with mathematical and/or univariate geostatistical methods for spatial predictive modelling. For each method, two functions are provided: one for assessing the predictive errors and accuracy of the method based on cross-validation, and the other for generating spatial predictions. It also contains a couple of functions for data preparation and predictive accuracy assessment.
Maintained by Jin Li. Last updated 2 years ago.
13.0 match 2.08 score 2 scripts 2 dependents
f-rousset
spaMM:Mixed-Effect Models, with or without Spatial Random Effects
Inference based on models with or without spatially-correlated random effects, multivariate responses, or non-Gaussian random effects (e.g., Beta). Variation in residual variance (heteroscedasticity) can itself be represented by a mixed-effect model. Both classical geostatistical models (Rousset and Ferdy 2014 <doi:10.1111/ecog.00566>), and Markov random field models on irregular grids (as considered in the 'INLA' package, <https://www.r-inla.org>), can be fitted, with distinct computational procedures exploiting the sparse matrix representations for the latter case and other autoregressive models. Laplace approximations are used for likelihood or restricted likelihood. Penalized quasi-likelihood and other variants discussed in the h-likelihood literature (Lee and Nelder 2001 <doi:10.1093/biomet/88.4.987>) are also implemented.
Maintained by François Rousset. Last updated 9 months ago.
5.4 match 4.94 score 208 scripts 5 dependents
tidymodels
butcher:Model Butcher
Provides a set of S3 generics to axe components of fitted model objects and help reduce the size of model objects saved to disk.
Maintained by Julia Silge. Last updated 12 days ago.
2.3 match 132 stars 11.54 score 146 scripts 13 dependents
bioc
glmSparseNet:Network Centrality Metrics for Elastic-Net Regularized Models
glmSparseNet is an R-package that generalizes sparse regression models when the features (e.g. genes) have a graph structure (e.g. protein-protein interactions), by including network-based regularizers. glmSparseNet uses the glmnet R-package, by including centrality measures of the network as penalty weights in the regularization. The current version implements regularization based on node degree, i.e. the strength and/or number of its associated edges, either by promoting hubs in the solution or orphan genes in the solution. All the glmnet distribution families are supported, namely "gaussian", "poisson", "binomial", "multinomial", "cox", and "mgaussian".
Maintained by André Veríssimo. Last updated 5 months ago.
software statisticalmethod dimensionreduction regression classification survival network graphandnetwork
3.4 match 6 stars 7.42 score 41 scripts 1 dependents
cardiomoon
ztable:Zebra-Striped Tables in LaTeX and HTML Formats
Makes zebra-striped tables (tables with alternating row colors) in LaTeX and HTML formats easily from a data.frame, matrix, lm, aov, anova, glm, coxph, nls, fitdistr, mytable and cbind.mytable objects.
Maintained by Keon-Woong Moon. Last updated 2 years ago.
3.2 match 21 stars 7.90 score 212 scripts 2 dependents
fbertran
plsRglm:Partial Least Squares Regression for Generalized Linear Models
Provides (weighted) Partial least squares Regression for generalized linear models and repeated k-fold cross-validation of such models using various criteria <arXiv:1810.01005>. It allows for missing data in the explanatory variables. Bootstrap confidence interval constructions are also available.
Maintained by Frederic Bertrand. Last updated 2 years ago.
3.3 match 16 stars 7.75 score 103 scripts 5 dependents
mathijsdeen
ClusterBootstrap:Analyze Clustered Data with Generalized Linear Models using the Cluster Bootstrap
Provides functionality for the analysis of clustered data using the cluster bootstrap.
Maintained by Mathijs Deen. Last updated 4 years ago.
6.9 match 2 stars 3.60 score 8 scripts
atahk
pscl:Political Science Computational Laboratory
Bayesian analysis of item-response theory (IRT) models, roll call analysis; computing highest density regions; maximum likelihood estimation of zero-inflated and hurdle models for count data; goodness-of-fit measures for GLMs; data sets used in writing and teaching; seats-votes curves.
Maintained by Simon Jackman. Last updated 1 year ago.
1.9 match 67 stars 13.28 score 2.7k scripts 54 dependents
bioc
RCM:Fit row-column association models with the negative binomial distribution for the microbiome
Combines ideas of log-linear analysis of contingency tables, flexible response function estimation and empirical Bayes dispersion estimation for explorative visualization of microbiome datasets. The package includes unconstrained as well as constrained analysis. In addition, diagnostic plots to detect lack of fit are available.
Maintained by Stijn Hawinkel. Last updated 5 months ago.
metagenomics dimensionreduction microbiome visualization ordination phyloseq rcm
3.5 match 16 stars 6.90 score 25 scripts
merliseclyde
BAS:Bayesian Variable Selection and Model Averaging using Bayesian Adaptive Sampling
Package for Bayesian Variable Selection and Model Averaging in linear models and generalized linear models using stochastic or deterministic sampling without replacement from posterior distributions. Prior distributions on coefficients are from Zellner's g-prior or mixtures of g-priors corresponding to the Zellner-Siow Cauchy Priors or the mixture of g-priors from Liang et al (2008) <DOI:10.1198/016214507000001337> for linear models or mixtures of g-priors from Li and Clyde (2019) <DOI:10.1080/01621459.2018.1469992> in generalized linear models. Other model selection criteria include AIC, BIC and Empirical Bayes estimates of g. Sampling probabilities may be updated based on the sampled models using sampling w/out replacement or an efficient MCMC algorithm which samples models using a tree structure of the model space as an efficient hash table. See Clyde, Ghosh and Littman (2010) <DOI:10.1198/jcgs.2010.09049> for details on the sampling algorithms. Uniform priors over all models or beta-binomial prior distributions on model size are allowed, and for large p truncated priors on the model space may be used to enforce sampling models that are full rank. The user may force variables to always be included in addition to imposing constraints that higher order interactions are included only if their parents are included in the model. This material is based upon work supported by the National Science Foundation under Division of Mathematical Sciences grant 1106891. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Maintained by Merlise Clyde. Last updated 4 months ago.
bayesian bayesian-inference generalized-linear-models linear-regression logistic-regression mcmc model-selection poisson-regression predictive-modeling regression variable-selection fortran openblas
2.3 match 44 stars 10.81 score 420 scripts 3 dependents
r-spatial
spatialreg:Spatial Regression Analysis
A collection of all the estimation functions for spatial cross-sectional models (on lattice/areal data using spatial weights matrices) contained up to now in 'spdep'. These model fitting functions include maximum likelihood methods for cross-sectional models proposed by 'Cliff' and 'Ord' (1973, ISBN:0850860369) and (1981, ISBN:0850860814), fitting methods initially described by 'Ord' (1975) <doi:10.1080/01621459.1975.10480272>. The models are further described by 'Anselin' (1988) <doi:10.1007/978-94-015-7799-1>. Spatial two stage least squares and spatial general method of moment models initially proposed by 'Kelejian' and 'Prucha' (1998) <doi:10.1023/A:1007707430416> and (1999) <doi:10.1111/1468-2354.00027> are provided. Impact methods and MCMC fitting methods proposed by 'LeSage' and 'Pace' (2009) <doi:10.1201/9781420064254> are implemented for the family of cross-sectional spatial regression models. Methods for fitting the log determinant term in maximum likelihood and MCMC fitting are compared by 'Bivand et al.' (2013) <doi:10.1111/gean.12008>, and model fitting methods by 'Bivand' and 'Piras' (2015) <doi:10.18637/jss.v063.i18>; both of these articles include extensive lists of references. A recent review is provided by 'Bivand', 'Millo' and 'Piras' (2021) <doi:10.3390/math9111276>. 'spatialreg' >= 1.1-* corresponded to 'spdep' >= 1.1-1, in which the model fitting functions were deprecated and passed through to 'spatialreg', but masked those in 'spatialreg'. From versions 1.2-*, the functions have been made defunct in 'spdep'. From version 1.3-6, add Anselin-Kelejian (1997) test to `stsls` for residual spatial autocorrelation <doi:10.1177/016001769702000109>.
Maintained by Roger Bivand. Last updated 2 days ago.
bayesian impacts maximum-likelihood spatial-dependence spatial-econometrics spatial-regression openblas
1.9 match 46 stars 12.92 score 916 scripts 24 dependents
zuoyi93
ProSGPV:Penalized Regression with Second-Generation P-Values
Implementation of penalized regression with second-generation p-values for variable selection. The algorithm can handle linear regression, GLM, and Cox regression. S3 methods print(), summary(), coef(), predict(), and plot() are available for the algorithm. Technical details can be found at Zuo et al. (2021) <doi:10.1080/00031305.2021.1946150>.
Maintained by Yi Zuo. Last updated 4 years ago.
5.1 match 5 stars 4.70 score 9 scripts
khliland
mixlm:Mixed Model ANOVA and Statistics for Education
The main functions perform mixed models analysis by least squares or REML by adding the function r() to formulas of lm() and glm(). A collection of text-book statistics for higher education is also included, e.g. modifications of the functions lm(), glm() and associated summaries from the package 'stats'.
Maintained by Kristian Hovde Liland. Last updated 30 days ago.
4.1 match 5.87 score 56 scripts 3 dependents
richjjackson
psc:Personalised Synthetic Controls
Allows the comparison of data cohorts (DC) against a Counter Factual Model (CFM) and measures the difference in terms of an efficacy parameter. Allows the application of Personalised Synthetic Controls.
Maintained by Richard Jackson. Last updated 4 months ago.
5.7 match 1 stars 4.23 score 24 scripts
mandymejia
BayesfMRI:Spatial Bayesian Methods for Task Functional MRI Studies
Performs a spatial Bayesian general linear model (GLM) for task functional magnetic resonance imaging (fMRI) data on the cortical surface. Additional models include group analysis and inference to detect thresholded areas of activation. Includes direct support for the 'CIFTI' neuroimaging file format. For more information see A. F. Mejia, Y. R. Yue, D. Bolin, F. Lindgren, M. A. Lindquist (2020) <doi:10.1080/01621459.2019.1611582> and D. Spencer, Y. R. Yue, D. Bolin, S. Ryan, A. F. Mejia (2022) <doi:10.1016/j.neuroimage.2022.118908>.
Maintained by Amanda Mejia. Last updated 7 days ago.
4.1 match 26 stars 5.77 score 19 scripts
cran
clusterSEs:Calculate Cluster-Robust p-Values and Confidence Intervals
Calculate p-values and confidence intervals using cluster-adjusted t-statistics (based on Ibragimov and Muller (2010) <DOI:10.1198/jbes.2009.08046>), pairs cluster bootstrapped t-statistics, and wild cluster bootstrapped t-statistics (the latter two techniques based on Cameron, Gelbach, and Miller (2008) <DOI:10.1162/rest.90.3.414>). Procedures are included for use with GLM, ivreg, plm (pooling or fixed effects), and mlogit models.
Maintained by Justin Esarey. Last updated 4 years ago.
13.4 match 2 stars 1.78 score 1 dependents
psychbruce
bruceR:Broadly Useful Convenient and Efficient R Functions
Broadly useful convenient and efficient R functions that bring users concise and elegant R data analyses. This package includes easy-to-use functions for (1) basic R programming (e.g., set working directory to the path of currently opened file; import/export data from/to files in any format; print tables to Microsoft Word); (2) multivariate computation (e.g., compute scale sums/means/... with reverse scoring); (3) reliability analyses and factor analyses; (4) descriptive statistics and correlation analyses; (5) t-test, multi-factor analysis of variance (ANOVA), simple-effect analysis, and post-hoc multiple comparison; (6) tidy report of statistical models (to R Console and Microsoft Word); (7) mediation and moderation analyses (PROCESS); and (8) additional toolbox for statistics and graphics.
Maintained by Han-Wu-Shuang Bao. Last updated 9 months ago.
anova data-analysis data-science linear-models linear-regression multilevel-models statistics toolbox
3.0 match 176 stars 7.87 score 316 scripts 3 dependents
bioc
edgeR:Empirical Analysis of Digital Gene Expression Data in R
Differential expression analysis of sequence count data. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models, quasi-likelihood, and gene set enrichment. Can perform differential analyses of any type of omics data that produces read counts, including RNA-seq, ChIP-seq, ATAC-seq, Bisulfite-seq, SAGE, CAGE, metabolomics, or proteomics spectral counts. RNA-seq analyses can be conducted at the gene or isoform level, and tests can be conducted for differential exon or transcript usage.
Maintained by Yunshun Chen. Last updated 4 days ago.
alternativesplicing batcheffect bayesian biomedicalinformatics cellbiology chipseq clustering coverage differentialexpression differentialmethylation differentialsplicing dnamethylation epigenetics functionalgenomics geneexpression genesetenrichment genetics immunooncology multiplecomparison normalization pathways proteomics qualitycontrol regression rnaseq sage sequencing singlecell systemsbiology timecourse transcription transcriptomics openblas
1.8 match 13.40 score 17k scripts 255 dependents
bioc
msqrob2:Robust statistical inference for quantitative LC-MS proteomics
msqrob2 provides a robust linear mixed model framework for assessing differential abundance in MS-based quantitative proteomics experiments. Our workflows can start from raw peptide intensities or summarised protein expression values. The model parameter estimates can be stabilized by ridge regression, empirical Bayes variance estimation and robust M-estimation. msqrob2's hurdle workflow can handle missing data without having to rely on hard-to-verify imputation assumptions, and outcompetes state-of-the-art methods with and without imputation for both high and low missingness. It builds on the QFeatures infrastructure for quantitative mass spectrometry data to store the model results together with the raw data and preprocessed data.
Maintained by Lieven Clement. Last updated 17 days ago.
proteomics massspectrometry differentialexpression multiplecomparison regression experimentaldesign software immunooncology normalization timecourse preprocessing
3.4 match 10 stars 6.94 score 83 scripts
nilotpalsanyal
BHMSMAfMRI:Bayesian Hierarchical Multi-Subject Multiscale Analysis of Functional MRI (fMRI) Data
Package BHMSMAfMRI performs Bayesian hierarchical multi-subject multiscale analysis of fMRI data as described in Sanyal & Ferreira (2012) <DOI:10.1016/j.neuroimage.2012.08.041>, or other multiscale data, using wavelet based prior that borrows strength across subjects and provides posterior smoothed images of the effect sizes and samples from the posterior distribution.
Maintained by Nilotpal Sanyal. Last updated 2 years ago.
bayesian-hierarchical-models fmri-data-analysis multiscale-data wavelet-transform openblas cpp openmp
8.3 match 2.81 score 13 scripts
agbarnett
season:Seasonal Analysis of Health Data
Routines for the seasonal analysis of health data, including regression models, time-stratified case-crossover, plotting functions and residual checks, see Barnett and Dobson (2010) ISBN 978-3-642-10748-1. Thanks to Yuming Guo for checking the case-crossover code.
Maintained by Adrian Barnett. Last updated 3 years ago.
3.9 match 2 stars 5.85 score 70 scripts
tidymodels
tidypredict:Run Predictions Inside the Database
It parses a fitted 'R' model object, and returns a formula in 'Tidy Eval' code that calculates the predictions. It works with several database back-ends because it leverages 'dplyr' and 'dbplyr' for the final 'SQL' translation of the algorithm. It currently supports lm(), glm(), randomForest(), ranger(), earth(), xgb.Booster.complete(), cubist(), and ctree() models.
Maintained by Emil Hvitfeldt. Last updated 3 months ago.
2.0 match 261 stars 11.03 score 241 scripts 2 dependents
insightsengineering
tern:Create Common TLGs Used in Clinical Trials
Tables, Listings, and Graphs (TLG) library for common outputs used in clinical trials.
Maintained by Joe Zhu. Last updated 2 months ago.
clinical-trials graphs listings nest outputs tables
1.8 match 79 stars 12.62 score 186 scripts 9 dependents
bxc147
Epi:Statistical Analysis in Epidemiology
Functions for demographic and epidemiological analysis in the Lexis diagram, i.e. register and cohort follow-up data. In particular representation, manipulation, rate estimation and simulation for multistate data - the Lexis suite of functions, which includes interfaces to 'mstate', 'etm' and 'cmprsk' packages. Contains functions for Age-Period-Cohort and Lee-Carter modeling and a function for interval censored data and some useful functions for tabulation and plotting, as well as a number of epidemiological data sets.
Maintained by Bendix Carstensen. Last updated 2 months ago.
2.3 match 4 stars 9.65 score 708 scripts 11 dependents
cran
smurf:Sparse Multi-Type Regularized Feature Modeling
Implementation of the SMuRF algorithm of Devriendt et al. (2021) <doi:10.1016/j.insmatheco.2020.11.010> to fit generalized linear models (GLMs) with multiple types of predictors via regularized maximum likelihood.
Maintained by Tom Reynkens. Last updated 20 days ago.
6.7 match 3.21 score 27 scripts 1 dependents
syedhaider5
chicane:Capture Hi-C Analysis Engine
Toolkit for processing and calling interactions in capture Hi-C data. Converts BAM files into counts of reads linking restriction fragments, and identifies pairs of fragments that interact more than expected by chance. Significant interactions are identified by comparing the observed read count to the expected background rate from a count regression model.
Maintained by Syed Haider. Last updated 3 years ago.
7.8 match 2.75 score 28 scripts
gmcmacran
GlmSimulatoR:Creates Ideal Data for Generalized Linear Models
Creates ideal data for all distributions in the generalized linear model framework.
Maintained by Greg McMahan. Last updated 8 months ago.
4.2 match 1 stars 5.12 score 53 scripts
steve-the-bayesian
BoomSpikeSlab:MCMC for Spike and Slab Regression
Spike and slab regression with a variety of residual error distributions corresponding to Gaussian, Student T, probit, logit, SVM, and a few others. Spike and slab regression is Bayesian regression with prior distributions containing a point mass at zero. The posterior updates the amount of mass on this point, leading to a posterior distribution that is actually sparse, in the sense that if you sample from it many coefficients are actually zeros. Sampling from this posterior distribution is an elegant way to handle Bayesian variable selection and model averaging. See <DOI:10.1504/IJMMNO.2014.059942> for an explanation of the Gaussian case.
Maintained by Steven L. Scott. Last updated 1 year ago.
3.9 match 6 stars 5.46 score 95 scripts 5 dependents
suyusung
arm:Data Analysis Using Regression and Multilevel/Hierarchical Models
Functions to accompany A. Gelman and J. Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models, Cambridge University Press, 2007.
Maintained by Yu-Sung Su. Last updated 4 months ago.
1.7 match 25 stars 12.38 score 3.3k scripts 89 dependents
scheike
timereg:Flexible Regression Models for Survival Data
Programs for Martinussen and Scheike (2006), `Dynamic Regression Models for Survival Data', Springer Verlag. Plus more recent developments. Additive survival model, semiparametric proportional odds model, fast cumulative residuals, excess risk models and more. Flexible competing risks regression including GOF-tests. Two-stage frailty modelling. PLS for the additive risk model. Lasso in the 'ahaz' package.
Maintained by Thomas Scheike. Last updated 6 months ago.
2.0 match 31 stars 10.42 score 289 scripts 44 dependents
alexanderrobitzsch
miceadds:Some Additional Multiple Imputation Functions, Especially for 'mice'
Contains functions for multiple imputation which complement existing functionality in R. In particular, several imputation methods for the mice package (van Buuren & Groothuis-Oudshoorn, 2011, <doi:10.18637/jss.v045.i03>) are implemented. Main features of the miceadds package include plausible value imputation (Mislevy, 1991, <doi:10.1007/BF02294457>), multilevel imputation for variables at any level or with any number of hierarchical and non-hierarchical levels (Grund, Luedtke & Robitzsch, 2018, <doi:10.1177/1094428117703686>; van Buuren, 2018, Ch.7, <doi:10.1201/9780429492259>), imputation using partial least squares (PLS) for high dimensional predictors (Robitzsch, Pham & Yanagida, 2016), nested multiple imputation (Rubin, 2003, <doi:10.1111/1467-9574.00217>), substantive model compatible imputation (Bartlett et al., 2015, <doi:10.1177/0962280214521348>), and features for the generation of synthetic datasets (Reiter, 2005, <doi:10.1111/j.1467-985X.2004.00343.x>; Nowok, Raab, & Dibben, 2016, <doi:10.18637/jss.v074.i11>).
Maintained by Alexander Robitzsch. Last updated 14 days ago.
missing-data multiple-imputation openblas cpp
2.3 match 16 stars 9.16 score 542 scripts 9 dependents
spatstat
spatstat.model:Parametric Statistical Modelling and Inference for the 'spatstat' Family
Functionality for parametric statistical modelling and inference for spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Supports parametric modelling, formal statistical inference, and model validation. Parametric models include Poisson point processes, Cox point processes, Neyman-Scott cluster processes, Gibbs point processes and determinantal point processes. Models can be fitted to data using maximum likelihood, maximum pseudolikelihood, maximum composite likelihood and the method of minimum contrast. Fitted models can be simulated and predicted. Formal inference includes hypothesis tests (quadrat counting tests, Cressie-Read tests, Clark-Evans test, Berman test, Diggle-Cressie-Loosmore-Ford test, scan test, studentised permutation test, segregation test, ANOVA tests of fitted models, adjusted composite likelihood ratio test, envelope tests, Dao-Genton test, balanced independent two-stage test), confidence intervals for parameters, and prediction intervals for point counts. Model validation techniques include leverage, influence, partial residuals, added variable plots, diagnostic plots, pseudoscore residual plots, model compensators and Q-Q plots.
Maintained by Adrian Baddeley. Last updated 6 days ago.
analysis-of-variance cluster-process confidence-intervals cox-process determinantal-point-processes gibbs-process influence leverage model-diagnostics neyman-scott parameter-estimation poisson-process spatial-analysis spatial-modelling spatial-point-processes statistical-inference
2.3 match 5 stars 9.09 score 6 scripts 46 dependents
dmurdoch
ellipse:Functions for Drawing Ellipses and Ellipse-Like Confidence Regions
Contains various routines for drawing ellipses and ellipse-like confidence regions, implementing the plots described in Murdoch and Chow (1996, <doi:10.2307/2684435>). There are also routines implementing the profile plots described in Bates and Watts (1988, <doi:10.1002/9780470316757>).
Maintained by Duncan Murdoch. Last updated 2 years ago.
1.8 match 4 stars 11.13 score 1.2k scripts 256 dependents
cran
MuMIn:Multi-Model Inference
Tools for model selection and model averaging with support for a wide range of statistical models. Automated model selection through subsets of the maximum model, with optional constraints for model inclusion. Averaging of model parameters and predictions based on model weights derived from information criteria (AICc and the like) or custom model weighting schemes.
Maintained by Kamil Bartoń. Last updated 9 months ago.
2.3 match 8 stars 8.84 score 5.6k scripts 27 dependents
hannahlowens
voluModel:Modeling Species Distributions in Three Dimensions
Facilitates modeling species' ecological niches and geographic distributions based on occurrences and environments that have a vertical as well as horizontal component, and projecting models into three-dimensional geographic space. Working in three dimensions is useful in an aquatic context when the organisms one wishes to model can be found across a wide range of depths in the water column. The package also contains functions to automatically generate marine model training regions using machine learning, and interpolate and smooth patchily sampled environmental rasters using thin plate splines. Davis Rabosky AR, Cox CL, Rabosky DL, Title PO, Holmes IA, Feldman A, McGuire JA (2016) <doi:10.1038/ncomms11484>. Nychka D, Furrer R, Paige J, Sain S (2021) <doi:10.5065/D6W957CT>. Pateiro-Lopez B, Rodriguez-Casal A (2022) <https://CRAN.R-project.org/package=alphahull>.
Maintained by Hannah L. Owens. Last updated 18 hours ago.
3.0 match 9 stars 6.60 score 35 scripts
rstudio
tfprobability:Interface to 'TensorFlow Probability'
Interface to 'TensorFlow Probability', a 'Python' library built on 'TensorFlow' that makes it easy to combine probabilistic models and deep learning on modern hardware ('TPU', 'GPU'). 'TensorFlow Probability' includes a wide selection of probability distributions and bijectors, probabilistic layers, variational inference, Markov chain Monte Carlo, and optimizers such as Nelder-Mead, BFGS, and SGLD.
Maintained by Tomasz Kalinowski. Last updated 3 years ago.
2.3 match 54 stars 8.63 score 221 scripts 3 dependents
sachaepskamp
qgraph:Graph Plotting Methods, Psychometric Data Visualization and Graphical Model Estimation
Fork of qgraph - Weighted network visualization and analysis, as well as Gaussian graphical model computation. See Epskamp et al. (2012) <doi:10.18637/jss.v048.i04>.
Maintained by Sacha Epskamp. Last updated 1 year ago.
1.7 match 69 stars 11.43 score 1.2k scripts 63 dependents
angieshen6
BayesPPD:Bayesian Power Prior Design
Bayesian power/type I error calculation and model fitting using the power prior and the normalized power prior for generalized linear models. Detailed examples of applying the package are available at <doi:10.32614/RJ-2023-016>. Models for time-to-event outcomes are implemented in the R package 'BayesPPDSurv'. The Bayesian clinical trial design methodology is described in Chen et al. (2011) <doi:10.1111/j.1541-0420.2011.01561.x>, and Psioda and Ibrahim (2019) <doi:10.1093/biostatistics/kxy009>. The normalized power prior is described in Duan et al. (2006) <doi:10.1002/env.752> and Ibrahim et al. (2015) <doi:10.1002/sim.6728>.
Maintained by Yueqi Shen. Last updated 2 months ago.
7.8 match 2.48 score 7 scripts 1 dependent
cvoeten
permutes:Permutation Tests for Time Series Data
Helps you determine the analysis window to use when analyzing densely-sampled time-series data, such as EEG data, using permutation testing (Maris & Oostenveld, 2007) <doi:10.1016/j.jneumeth.2007.03.024>. These permutation tests can help identify the timepoints where significance of an effect begins and ends, and the results can be plotted in various types of heatmap for reporting. Mixed-effects models are supported using an implementation of the approach by Lee & Braun (2012) <doi:10.1111/j.1541-0420.2011.01675.x>.
Maintained by Cesko C. Voeten. Last updated 2 years ago.
4.5 match 4.23 score 16 scripts
batss-dev
BATSS:Bayesian Adaptive Trial Simulator Software (BATSS) for Generalised Linear Models
Defines operating characteristics of Bayesian Adaptive Trials considering a generalised linear model response via Monte Carlo simulations of Bayesian GLMs fitted via integrated nested Laplace approximations (INLA).
Maintained by Dominique-Laurent Couturier. Last updated 5 months ago.
4.6 match 2 stars 4.15 score
statsgary
OddsPlotty:Odds Plot to Visualise a Logistic Regression Model
Uses the outputs of a logistic regression model, from caret <https://CRAN.R-project.org/package=caret>, to build an odds plot. This allows for the rapid visualisation of odds ratios and works best with the outputs of caret's GLM model class, by returning the final trained model.
Maintained by Gary Hutson. Last updated 27 days ago.
3.0 match 17 stars 6.39 score 48 scripts 1 dependent
mages
ChainLadder:Statistical Methods and Models for Claims Reserving in General Insurance
Various statistical methods and models which are typically used for the estimation of outstanding claims reserves in general insurance, including those to estimate the claims development result as required under Solvency II.
Maintained by Markus Gesmann. Last updated 1 month ago.
1.9 match 82 stars 10.04 score 196 scripts 2 dependents
winvector
wrapr:Wrap R Tools for Debugging and Parametric Programming
Tools for writing and debugging R code. Provides: '%.>%' dot-pipe (an 'S3' configurable pipe), unpack/to (R style multiple assignment/return), 'build_frame()'/'draw_frame()' ('data.frame' example tools), 'qc()' (quoting concatenate), ':=' (named map builder), 'let()' (converts non-standard evaluation interfaces to parametric standard evaluation interfaces, inspired by 'gtools::strmacro()' and 'base::bquote()'), and more.
Maintained by John Mount. Last updated 2 years ago.
1.7 match 137 stars 11.11 score 390 scripts 12 dependents
harrison4192
autostats:Auto Stats
Automatically do statistical exploration. Create formulas using 'tidyselect' syntax, and then determine cross-validated model accuracy and variable contributions using 'glm' and 'xgboost'. Contains additional helper functions to create and modify formulas. Has a flagship function to quickly determine relationships between categorical and continuous variables in the data set.
Maintained by Harrison Tietze. Last updated 10 days ago.
2.8 match 6 stars 6.76 score 5 scripts 2 dependents
marc-girondot
HelpersMG:Tools for Environmental Analyses, Ecotoxicology and Various R Functions
Contains miscellaneous functions useful for managing 'NetCDF' files (see <https://en.wikipedia.org/wiki/NetCDF>), getting the moon phase, the times of sunrise and sunset, and the tide level, analysing and reconstructing periodic time series of temperature with an irregular sinusoidal pattern, showing scales and wind roses in plots with a change of text color, running the Metropolis-Hastings algorithm for Bayesian MCMC analysis, plotting graphs or boxplots with error bars, searching files on disk by their names or their content, and reading the contents of all files in a folder at one time.
Maintained by Marc Girondot. Last updated 2 months ago.
4.0 match 4 stars 4.59 score 160 scripts 4 dependents
rolkra
explore:Simplifies Exploratory Data Analysis
Interactive data exploration with one line of code, automated reporting, or an easy-to-remember set of tidy functions for low-code exploratory data analysis.
Maintained by Roland Krasser. Last updated 3 months ago.
data-exploration data-visualisation decision-trees eda rmarkdown shiny tidy
1.6 match 228 stars 11.43 score 221 scripts 1 dependent
sinnweja
pleio:Pleiotropy Test for Multiple Traits on a Genetic Marker
Perform tests for pleiotropy of multiple traits of various variable types on genotypes for a genetic marker.
Maintained by Jason Sinnwell. Last updated 1 year ago.
6.0 match 3.00 score 7 scripts
sachsmc
stdReg2:Regression Standardization for Causal Inference
Contains more modern tools for causal inference using regression standardization. Four general classes of models are implemented: generalized linear models, conditional generalized estimating equation models, Cox proportional hazards models, and shared frailty gamma-Weibull models. Methodological details are described in Sjölander, A. (2016) <doi:10.1007/s10654-016-0157-3>. Also includes functionality for doubly robust estimation for generalized linear models in some special cases, and the ability to implement custom models.
Maintained by Michael C Sachs. Last updated 15 days ago.
3.5 match 2 stars 5.08 score 9 scripts
bips-hb
neuralnet:Training of Neural Networks
Training of neural networks using backpropagation, resilient backpropagation with (Riedmiller, 1994) or without weight backtracking (Riedmiller and Braun, 1993) or the modified globally convergent version by Anastasiadis et al. (2005). The package allows flexible settings through custom-choice of error and activation function. Furthermore, the calculation of generalized weights (Intrator O & Intrator N, 1993) is implemented.
Maintained by Marvin N. Wright. Last updated 4 years ago.
1.7 match 32 stars 10.73 score 2.9k scripts 38 dependents
american-institutes-for-research
EdSurvey:Analysis of NCES Education Survey and Assessment Data
Functions to read in and analyze education survey and assessment data from the National Center for Education Statistics (NCES) <https://nces.ed.gov/>, including National Assessment of Educational Progress (NAEP) data <https://nces.ed.gov/nationsreportcard/> and data from the International Assessment Database: Organisation for Economic Co-operation and Development (OECD) <https://www.oecd.org/en/about/directorates/directorate-for-education-and-skills.html>, including Programme for International Student Assessment (PISA), Teaching and Learning International Survey (TALIS), Programme for the International Assessment of Adult Competencies (PIAAC), and International Association for the Evaluation of Educational Achievement (IEA) <https://www.iea.nl/>, including Trends in International Mathematics and Science Study (TIMSS), TIMSS Advanced, Progress in International Reading Literacy Study (PIRLS), International Civic and Citizenship Study (ICCS), International Computer and Information Literacy Study (ICILS), and Civic Education Study (CivEd).
Maintained by Paul Bailey. Last updated 14 days ago.
2.3 match 10 stars 7.86 score 139 scripts 1 dependent
smtorres
GLMpack:Data and Code to Accompany Generalized Linear Models, 2nd Edition
Contains all the data and functions used in Generalized Linear Models, 2nd edition, by Jeff Gill and Michelle Torres. Examples to create all models, tables, and plots are included for each data set.
Maintained by Michelle Torres. Last updated 6 years ago.
8.8 match 1 star 2.00 score
easystats
report:Automated Reporting of Results and Statistical Models
The aim of the 'report' package is to bridge the gap between R’s output and the formatted results contained in your manuscript. This package converts statistical models and data frames into textual reports suited for publication, ensuring standardization and quality in results reporting.
Maintained by Rémi Thériault. Last updated 1 month ago.
anovas apa automated-report-generation automatic bayesian describe easystats hacktoberfest manuscript models report reporting reports scientific statsmodels
1.2 match 698 stars 14.48 score 1.1k scripts 3 dependents
sujit-sahu
bmstdr:Bayesian Modeling of Spatio-Temporal Data with R
Fits, validates and compares a number of Bayesian models for spatial and space time point referenced and areal unit data. Model fitting is done using several packages: 'rstan', 'INLA', 'spBayes', 'spTimer', 'spTDyn', 'CARBayes' and 'CARBayesST'. Model comparison is performed using the DIC and WAIC, and K-fold cross-validation where the user is free to select their own subset of data rows for validation. Sahu (2022) <doi:10.1201/9780429318443> describes the methods in detail.
Maintained by Sujit K. Sahu. Last updated 1 year ago.
bayesian modelling spatio-temporal-data cpp
3.5 match 15 stars 4.95 score 12 scripts
genentech
psborrow2:Bayesian Dynamic Borrowing Analysis and Simulation
Bayesian dynamic borrowing is an approach to supplementing a randomized, controlled trial analysis with external data, in which the external data are incorporated in a dynamic way (e.g., based on similarity of outcomes); see Viele 2013 <doi:10.1002/pst.1589> for an overview. This package implements the hierarchical commensurate prior approach to dynamic borrowing as described in Hobbs 2011 <doi:10.1111/j.1541-0420.2011.01564.x>. There are three main functionalities. First, 'psborrow2' provides a user-friendly interface for applying dynamic borrowing on the study results and handles the Markov Chain Monte Carlo sampling on behalf of the user. Second, 'psborrow2' provides a simulation framework to compare different borrowing parameters (e.g. full borrowing, no borrowing, dynamic borrowing) and other trial and borrowing characteristics (e.g. sample size, covariates) in a unified way. Third, 'psborrow2' provides a set of functions to generate data for simulation studies, and also allows the user to specify their own data generation process. This package is designed to use the sampling functions from 'cmdstanr' which can be installed from <https://stan-dev.r-universe.dev>.
Maintained by Matt Secrest. Last updated 1 month ago.
bayesian-dynamic-borrowing psborrow2 simulation-study
2.2 match 18 stars 7.87 score 16 scripts
tlverse
sl3:Pipelines for Machine Learning and Super Learning
A modern implementation of the Super Learner prediction algorithm, coupled with a general purpose framework for composing arbitrary pipelines for machine learning tasks.
Maintained by Jeremy Coyle. Last updated 4 months ago.
data-science ensemble-learning ensemble-model machine-learning model-selection regression stacking statistics
1.7 match 100 stars 9.94 score 748 scripts 7 dependents
singmann
afex:Analysis of Factorial Experiments
Convenience functions for analyzing factorial experiments using ANOVA or mixed models. aov_ez(), aov_car(), and aov_4() allow specification of between, within (i.e., repeated-measures), or mixed (i.e., split-plot) ANOVAs for data in long format (i.e., one observation per row), automatically aggregating multiple observations per individual and cell of the design. mixed() fits mixed models using lme4::lmer() and computes p-values for all fixed effects using either Kenward-Roger or Satterthwaite approximation for degrees of freedom (LMM only), parametric bootstrap (LMMs and GLMMs), or likelihood ratio tests (LMMs and GLMMs). afex_plot() provides a high-level interface for interaction or one-way plots using ggplot2, combining raw data and model estimates. afex uses type 3 sums of squares as default (imitating commercial statistical software).
Maintained by Henrik Singmann. Last updated 7 months ago.
1.2 match 123 stars 14.50 score 1.4k scripts 15 dependents
juhkim111
MGLM:Multivariate Response Generalized Linear Models
Provides functions that (1) fit multivariate discrete distributions, (2) generate random numbers from multivariate discrete distributions, and (3) run regression and penalized regression on the multivariate categorical response data. Implemented models include: multinomial logit model, Dirichlet multinomial model, generalized Dirichlet multinomial model, and negative multinomial model. Making the best of the minorization-maximization (MM) algorithm and Newton-Raphson method, we derive and implement stable and efficient algorithms to find the maximum likelihood estimates. On a multi-core machine, multi-threading is supported.
Maintained by Juhyun Kim. Last updated 3 years ago.
3.6 match 4 stars 4.65 score 53 scripts 1 dependent
maebruck
chantrics:Loglikelihood Adjustments for Econometric Models
Adjusts the loglikelihood of common econometric models for clustered data based on the estimation process suggested in Chandler and Bate (2007) <doi:10.1093/biomet/asm015>, using the 'chandwich' package <https://cran.r-project.org/package=chandwich>, and provides convenience functions for inference on the adjusted models.
Maintained by Theo Bruckbauer. Last updated 3 years ago.
clustering econometrics likelihood likelihood-ratio-test loglikelihood-adjustment maximum-likelihood
4.5 match 3.70 score 4 scripts
vaudigier
micemd:Multiple Imputation by Chained Equations with Multilevel Data
Addons for the 'mice' package to perform multiple imputation using chained equations with two-level data. Includes imputation methods dedicated to sporadically and systematically missing values. Imputation of continuous, binary or count variables is available. Following the recommendations of Audigier, V. et al (2018) <doi:10.1214/18-STS646>, the choice of the imputation method for each variable can be facilitated by a default choice tuned according to the structure of the incomplete dataset. Allows parallel calculation and overimputation for 'mice'.
Maintained by Vincent Audigier. Last updated 1 year ago.
5.4 match 1 star 3.08 score 80 scripts 1 dependent
evolecolgroup
tidysdm:Species Distribution Models with Tidymodels
Fit species distribution models (SDMs) using the 'tidymodels' framework, which provides a standardised interface to define models and process their outputs. 'tidysdm' expands 'tidymodels' by providing methods for spatial objects, models and metrics specific to SDMs, as well as a number of specialised functions to process occurrences for contemporary and palaeo datasets. The full functionalities of the package are described in Leonardi et al. (2023) <doi:10.1101/2023.07.24.550358>.
Maintained by Andrea Manica. Last updated 8 days ago.
species-distribution-modelling tidymodels
1.9 match 31 stars 8.82 score 51 scripts
wjbraun
DAAG:Data Analysis and Graphics Data and Functions
Functions and data sets used in examples and exercises in the text Maindonald, J.H. and Braun, W.J. (2003, 2007, 2010) "Data Analysis and Graphics Using R", and in an upcoming Maindonald, Braun, and Andrews text that builds on this earlier text.
Maintained by W. John Braun. Last updated 11 months ago.
2.0 match 8.25 score 1.2k scripts 1 dependent
opisthokonta
chainbinomial:Chain Binomial Models for Analysis of Infectious Disease Data
Implements the chain binomial model for analysis of infectious disease data. Contains functions for calculating probabilities of the final size of infectious disease outbreaks using the method from D. Ludwig (1975) <doi:10.1016/0025-5564(75)90119-4> and for outbreaks that are not concluded, from Lindstrøm et al. (2024) <doi:10.48550/arXiv.2403.03948>. The package also contains methods for estimation and regression analysis of secondary attack rates.
Maintained by Jonas Christoffer Lindstrøm. Last updated 2 months ago.
3.1 match 5.15 score 5 scripts
pbiecek
breakDown:Model Agnostic Explainers for Individual Predictions
Model agnostic tool for decomposition of predictions from black boxes. Break Down Table shows contributions of every variable to a final prediction. Break Down Plot presents variable contributions in a concise graphical way. This package works for binary classifiers and general regression models.
Maintained by Przemyslaw Biecek. Last updated 1 year ago.
data-science iml interpretability machine-learning visual-explanations xai
1.8 match 103 stars 8.90 score 91 scripts 2 dependents
eheinzen
elo:Ranking Teams by Elo Rating and Comparable Methods
A flexible framework for calculating Elo ratings and resulting rankings of any two-team-per-matchup system (chess, sports leagues, 'Go', etc.). This implementation is capable of evaluating a variety of matchups, Elo rating updates, and win probabilities, all based on the basic Elo rating system. It also includes methods to benchmark performance, including logistic regression and Markov chain models.
Maintained by Ethan Heinzen. Last updated 1 year ago.
elo elo-rating logistic-regression markov-chain markov-model ranking sports-analytics cpp
2.3 match 37 stars 7.05 score 153 scripts
edoardocostantini
gspcr:Generalized Supervised Principal Component Regression
Generalization of supervised principal component regression (SPCR; Bair et al., 2006, <doi:10.1198/016214505000000628>) to support continuous, binary, and discrete variables as outcomes and predictors (inspired by the 'superpc' R package <https://cran.r-project.org/package=superpc>).
Maintained by Edoardo Costantini. Last updated 12 months ago.
3.8 match 1 star 4.18 score 10 scripts
cran
HiddenMarkov:Hidden Markov Models
Contains functions for the analysis of Discrete Time Hidden Markov Models, Markov Modulated GLMs and the Markov Modulated Poisson Process. It includes functions for simulation, parameter estimation, and the Viterbi algorithm. See the topic "HiddenMarkov" for an introduction to the package, and "Change Log" for a list of recent changes. The algorithms are based on those of Walter Zucchini.
Maintained by David Harte. Last updated 2 months ago.
4.1 match 3.79 score 59 scripts 3 dependents
koalaverse
sure:Surrogate Residuals for Ordinal and General Regression Models
An implementation of the surrogate approach to residuals and diagnostics for ordinal and general regression models; for details, see Liu and Zhang (2017, <doi:https://doi.org/10.1080/01621459.2017.1292915>) and Greenwell et al. (2017, <https://journal.r-project.org/archive/2018/RJ-2018-004/index.html>). These residuals can be used to construct standard residual plots for model diagnostics (e.g., residual-vs-fitted value plots, residual-vs-covariate plots, Q-Q plots, etc.). The package also provides an 'autoplot' function for producing standard diagnostic plots using 'ggplot2' graphics. The package currently supports cumulative link models from packages 'MASS', 'ordinal', 'rms', and 'VGAM'. Support for binary regression models using the standard 'glm' function is also available.
Maintained by Brandon Greenwell. Last updated 12 days ago.
categorical-data diagnostics ordinal-regression residuals
2.8 match 9 stars 5.58 score 47 scripts 1 dependent
isglobal-brge
SNPassoc:SNPs-Based Whole Genome Association Studies
Functions to perform most of the common analyses in genome association studies are implemented. These analyses include descriptive statistics and exploratory analysis of missing values, calculation of Hardy-Weinberg equilibrium, analysis of association based on generalized linear models (either for quantitative or binary traits), and analysis of multiple SNPs (haplotype and epistasis analysis). Permutation test and related tests (sum statistic and truncated product) are also implemented. Exact distributions of the max-statistic and the genetic risk-allele score can also be estimated. The methods are described in Gonzalez JR et al., 2007 <doi: 10.1093/bioinformatics/btm025>.
Maintained by Dolors Pelegri. Last updated 5 months ago.
1.7 match 16 stars 9.14 score 89 scripts 6 dependents
richarddmorey
BayesFactor:Computation of Bayes Factors for Common Designs
A suite of functions for computing various Bayes factors for simple designs, including contingency tables, one- and two-sample designs, one-way designs, general ANOVA designs, and linear regression.
Maintained by Richard D. Morey. Last updated 1 year ago.
1.1 match 133 stars 13.70 score 1.7k scripts 21 dependents
arvsjo
stdReg:Regression Standardization
Contains functionality for regression standardization. Four general classes of models are allowed: generalized linear models, conditional generalized estimating equation models, Cox proportional hazards models and shared frailty gamma-Weibull models. Sjolander, A. (2016) <doi:10.1007/s10654-016-0157-3>.
Maintained by Arvid Sjolander. Last updated 4 years ago.
5.3 match 2.80 score 53 scripts 1 dependent
brian-j-smith
MachineShop:Machine Learning Models and Tools
Meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Approaches for model fitting and prediction of numerical, categorical, or censored time-to-event outcomes include traditional regression models, regularization methods, tree-based methods, support vector machines, neural networks, ensembles, data preprocessing, filtering, and model tuning and selection. Performance metrics are provided for model assessment and can be estimated with independent test sets, split sampling, cross-validation, or bootstrap resampling. Resample estimation can be executed in parallel for faster processing and nested in cases of model tuning and selection. Modeling results can be summarized with descriptive statistics; calibration curves; variable importance; partial dependence plots; confusion matrices; and ROC, lift, and other performance curves.
Maintained by Brian J Smith. Last updated 7 months ago.
classification-models machine-learning predictive-modeling regression-models survival-models
1.9 match 61 stars 7.95 score 121 scripts
luca-scr
dispmod:Modelling Dispersion in GLM
Functions for estimating Gaussian dispersion regression models (Aitkin, 1987 <doi:10.2307/2347792>), overdispersed binomial logit models (Williams, 1987 <doi:10.2307/2347977>), and overdispersed Poisson log-linear models (Breslow, 1984 <doi:10.2307/2347661>), using a quasi-likelihood approach.
Maintained by Luca Scrucca. Last updated 7 years ago.
7.3 match 2.02 score 21 scripts
ipums
ipumsr:An R Interface for Downloading, Reading, and Handling IPUMS Data
An easy way to work with census, survey, and geographic data provided by IPUMS in R. Generate and download data through the IPUMS API and load IPUMS files into R with their associated metadata to make analysis easier. IPUMS data describing 1.4 billion individuals drawn from over 750 censuses and surveys is available free of charge from the IPUMS website <https://www.ipums.org>.
Maintained by Derek Burk. Last updated 17 days ago.
1.3 match 28 stars 11.07 score 720 scripts 2 dependents