R-universe search: racing

kjhealy

gssrdoc:Document General Social Survey Variable

The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains a tibble with information on the survey variables, together with every variable documented as an R help page. For more information on the GSS see <http://gss.norc.org>.

Maintained by Kieran Healy. Last updated 12 months ago.

101.1 match 2.28 score 38 scripts

robinhankin

hyper2:The Hyperdirichlet Distribution, Mark 2

A suite of routines for the hyperdirichlet distribution and reified Bradley-Terry; supersedes the 'hyperdirichlet' package; uses 'disordR' discipline <doi:10.48550/ARXIV.2210.03856>. To cite in publications please use Hankin 2017 <doi:10.32614/rj-2017-061>, and for Generalized Plackett-Luce likelihoods use Hankin 2024 <doi:10.18637/jss.v109.i08>.

Maintained by Robin K. S. Hankin. Last updated 1 hour ago.

cpp

19.6 match 5 stars 7.91 score 38 scripts 1 dependent

wjbraun

DAAG:Data Analysis and Graphics Data and Functions

Functions and data sets used in examples and exercises in the text Maindonald, J.H. and Braun, W.J. (2003, 2007, 2010) "Data Analysis and Graphics Using R", and in an upcoming Maindonald, Braun, and Andrews text that builds on this earlier text.

Maintained by W. John Braun. Last updated 12 months ago.

16.0 match 8.35 score 1.2k scripts 1 dependent

alanarnholt

BSDA:Basic Statistics and Data Analysis

Data sets for book "Basic Statistics and Data Analysis" by Larry J. Kitchens.

Maintained by Alan T. Arnholt. Last updated 2 years ago.

13.9 match 7 stars 9.11 score 1.3k scripts 6 dependents

openintrostat

openintro:Datasets and Supplemental Functions from 'OpenIntro' Textbooks and Labs

Supplemental functions and data for 'OpenIntro' resources, which include open-source textbooks and resources for introductory statistics (<https://www.openintro.org/>). The package contains datasets used in our open-source textbooks along with custom plotting functions for reproducing book figures. Note that many functions and examples include color transparency; some plotting elements may not show up properly (or at all) when run in some versions of the Windows operating system.

Maintained by Mine Çetinkaya-Rundel. Last updated 3 months ago.

data openintro

10.7 match 240 stars 11.29 score 6.0k scripts

openintrostat

usdata:Data on the States and Counties of the United States

Demographic data on the United States at the county and state levels spanning multiple years.

Maintained by Mine Çetinkaya-Rundel. Last updated 10 months ago.

data openintro

17.4 match 9 stars 6.89 score 294 scripts 1 dependent

kylegrealis

nascaR.data:NASCAR Race Data

A collection of NASCAR race, driver, owner and manufacturer data across the three major NASCAR divisions: NASCAR Cup Series, NASCAR Xfinity Series, and NASCAR Craftsman Truck Series. The curated data begins with the 1949 season and extends through the end of the 2024 season. Explore race, season, or career performance for drivers, teams, and manufacturers throughout NASCAR's history. Data was sourced with permission from DriverAverages.com.

Maintained by Kyle Grealis. Last updated 28 days ago.

data-science data-visualization racing

24.9 match 5 stars 4.70 score 2 scripts

mlopez-ibanez

irace:Iterated Racing for Automatic Algorithm Configuration

Iterated race is an extension of the Iterated F-race method for the automatic configuration of optimization algorithms, that is, (offline) tuning their parameters by finding the most appropriate settings given a set of instances of an optimization problem. M. López-Ibáñez, J. Dubois-Lacoste, L. Pérez Cáceres, T. Stützle, and M. Birattari (2016) <doi:10.1016/j.orp.2016.09.002>.

Maintained by Manuel López-Ibáñez. Last updated 2 months ago.

algorithm-configuration hyperparameter-tuning irace optimization-algorithms

10.9 match 63 stars 10.20 score 103 scripts 1 dependent

jacobkap

predictrace:Predict the Race and Gender of a Given Name Using Census and Social Security Administration Data

Predicts the most common race of a surname based on U.S. Census data, and the most common gender of a first name based on U.S. Social Security Administration data.

Maintained by Jacob Kaplan. Last updated 2 years ago.

21.6 match 12 stars 5.03 score 18 scripts

alexwhitworth

synthACS:Synthetic Microdata and Spatial MicroSimulation Modeling for ACS Data

Provides access to curated American Community Survey (ACS) base tables via a wrapper to library(acs). Builds synthetic micro-datasets at any user-specified geographic level with ten default attributes; and, conducts spatial microsimulation modeling (SMSM) via simulated annealing. SMSM is conducted in parallel by default. Lastly, we provide functionality for data-extensibility of micro-datasets <doi:10.18637/jss.v104.i07>.

Maintained by Alex Whitworth. Last updated 2 years ago.

acs acs-data microsimulation spatial-data-analysis cpp

24.8 match 5 stars 4.29 score 78 scripts

svmiller

dragracer:Data Sets for RuPaul's Drag Race

These are data sets for the hit TV show, RuPaul's Drag Race. Data right now include episode-level data, contestant-level data, and episode-contestant-level data. This is a work in progress, and a love letter of a kind to RuPaul's Drag Race and the performers that have appeared on the show. This may not be the most productive use of my time, but I have tenure and what are you going to do about it? I think there is at least some value in this package if it allows the show's fandom to learn more about the R programming language around its contents.

Maintained by Steve Miller. Last updated 2 years ago.

rupaul-drag-race

21.1 match 21 stars 5.02 score 9 scripts

lightbluetitan

usdatasets:A Comprehensive Collection of U.S. Datasets

Provides a diverse collection of U.S. datasets encompassing various fields such as crime, economics, education, finance, energy, healthcare, and more. It serves as a valuable resource for researchers and analysts seeking to perform in-depth analyses and derive insights from U.S.-specific data.

Maintained by Renzo Caceres Rossi. Last updated 6 months ago.

13.7 match 7 stars 5.99 score 141 scripts

tidymodels

finetune:Additional Functions for Model Tuning

The ability to tune models is important. 'finetune' enhances the 'tune' package by providing more specialized methods for finding reasonable values of model tuning parameters. Two racing methods described by Kuhn (2014) <arXiv:1405.6974> are included. An iterative search method using generalized simulated annealing (Bohachevsky, Johnson and Stein, 1986) <doi:10.1080/00401706.1986.10488128> is also included.

Maintained by Max Kuhn. Last updated 8 months ago.

9.4 match 62 stars 8.36 score 704 scripts 1 dependent

r-forge

Sleuth3:Data Sets from Ramsey and Schafer's "Statistical Sleuth (3rd Ed)"

Data sets from Ramsey, F.L. and Schafer, D.W. (2013), "The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed)", Cengage Learning.

Maintained by Berwin A Turlach. Last updated 1 year ago.

12.5 match 6.29 score 522 scripts

sensitivequestions

list:Statistical Methods for the Item Count Technique and List Experiment

Allows researchers to conduct multivariate statistical analyses of survey data with list experiments. This survey methodology is also known as the item count technique or the unmatched count technique and is an alternative to the commonly used randomized response method. The package implements the methods developed by Imai (2011) <doi:10.1198/jasa.2011.ap10415>, Blair and Imai (2012) <doi:10.1093/pan/mpr048>, Blair, Imai, and Lyall (2013) <doi:10.1111/ajps.12086>, Imai, Park, and Greene (2014) <doi:10.1093/pan/mpu017>, Aronow, Coppock, Crawford, and Green (2015) <doi:10.1093/jssam/smu023>, Chou, Imai, and Rosenfeld (2017) <doi:10.1177/0049124117729711>, and Blair, Chou, and Imai (2018) <https://imai.fas.harvard.edu/research/files/listerror.pdf>. This includes a Bayesian MCMC implementation of regression for the standard and multiple sensitive item list experiment designs and a random effects setup, a Bayesian MCMC hierarchical regression model with up to three hierarchical groups, the combined list experiment and endorsement experiment regression model, a joint model of the list experiment that enables the analysis of the list experiment as a predictor in outcome regression models, a method for combining list experiments with direct questions, and methods for diagnosing and adjusting for response error. In addition, the package implements the statistical test that is designed to detect certain failures of list experiments, and a placebo test for the list experiment using data from direct questions.

Maintained by Graeme Blair. Last updated 1 year ago.

openblas

10.8 match 7 stars 6.60 score 191 scripts

dfe-analytical-services

dfeR:Common Department for Education Analysis Tasks

Preferred methods for common analytical tasks that are undertaken across the Department, including number formatting, project templates and curated reference data.

Maintained by Cam Race. Last updated 1 months ago.

9.0 match 12 stars 6.89 score 8 scripts

christopherkenny

censable:Making Census Data More Usable

Creates a common framework for organizing, naming, and gathering population, age, race, and ethnicity data from the Census Bureau. Accesses the API <https://www.census.gov/data/developers/data-sets.html>. Provides tools for adding information to existing data to line up with Census data.

Maintained by Christopher T. Kenny. Last updated 10 months ago.

10.9 match 8 stars 5.30 score 42 scripts 4 dependents

chris-prener

biscale:Tools and Palettes for Bivariate Thematic Mapping

Provides a 'ggplot2' centric approach to bivariate mapping. This is a technique that maps two quantities simultaneously rather than the single value that most thematic maps display. The package provides a suite of tools for calculating breaks using multiple different approaches, a selection of palettes appropriate for bivariate mapping and scale functions for 'ggplot2' calls that add those palettes to maps. Tools for creating bivariate legends are also included.

Maintained by Christopher Prener. Last updated 3 years ago.

6.8 match 122 stars 8.53 score 466 scripts

r-forge

Sleuth2:Data Sets from Ramsey and Schafer's "Statistical Sleuth (2nd Ed)"

Data sets from Ramsey, F.L. and Schafer, D.W. (2002), "The Statistical Sleuth: A Course in Methods of Data Analysis (2nd ed)", Duxbury.

Maintained by Berwin A Turlach. Last updated 1 year ago.

8.5 match 5.79 score 191 scripts

atahk

pscl:Political Science Computational Laboratory

Bayesian analysis of item-response theory (IRT) models, roll call analysis; computing highest density regions; maximum likelihood estimation of zero-inflated and hurdle models for count data; goodness-of-fit measures for GLMs; data sets used in writing and teaching; seats-votes curves.

Maintained by Simon Jackman. Last updated 1 year ago.

3.4 match 67 stars 13.28 score 2.7k scripts 54 dependents

sehellmann

dynConfiR:Dynamic Models for Confidence and Response Time Distributions

Provides density functions for the joint distribution of choice, response time and confidence for discrete confidence judgments as well as functions for parameter fitting, prediction and simulation for various dynamical models of decision confidence. All models are explained in detail by Hellmann et al. (2023; Preprint available at <https://osf.io/9jfqr/>, published version: <doi:10.1037/rev0000411>). Implemented models are the dynaViTE model, dynWEV model, the 2DSD model (Pleskac & Busemeyer, 2010, <doi:10.1037/a0019737>), and various race models. C++ code for dynWEV and 2DSD is based on the 'rtdists' package by Henrik Singmann.

Maintained by Sebastian Hellmann. Last updated 11 days ago.

cpp

7.2 match 3 stars 5.47 score 18 scripts

friendly

vcdExtra:'vcd' Extensions and Additions

Provides additional data sets, methods and documentation to complement the 'vcd' package for Visualizing Categorical Data and the 'gnm' package for Generalized Nonlinear Models. In particular, 'vcdExtra' extends mosaic, assoc and sieve plots from 'vcd' to handle 'glm()' and 'gnm()' models and adds a 3D version in 'mosaic3d'. Additionally, methods are provided for comparing and visualizing lists of 'glm' and 'loglm' objects. This package is now a support package for the book, "Discrete Data Analysis with R" by Michael Friendly and David Meyer.

Maintained by Michael Friendly. Last updated 8 days ago.

categorical-data-visualization generalized-linear-models mosaic-plots

3.6 match 24 stars 10.85 score 472 scripts 3 dependents

guyabel

migest:Methods for the Indirect Estimation of Bilateral Migration

Tools for estimating, measuring and working with migration data.

Maintained by Guy J. Abel. Last updated 2 months ago.

demography migration population

6.7 match 32 stars 5.80 score 86 scripts

cran

MASS:Support Functions and Datasets for Venables and Ripley's MASS

Functions and datasets to support Venables and Ripley, "Modern Applied Statistics with S" (4th edition, 2002).

Maintained by Brian Ripley. Last updated 1 months ago.

3.6 match 19 stars 10.64 score 11k dependents

ampl-psych

EMC2:Bayesian Hierarchical Analysis of Cognitive Models of Choice

Fit Bayesian (hierarchical) cognitive models using a linear modeling language interface using particle Metropolis Markov chain Monte Carlo sampling with Gibbs steps. The diffusion decision model (DDM), linear ballistic accumulator model (LBA), racing diffusion model (RDM), and the lognormal race model (LNR) are supported. Additionally, users can specify their own likelihood function and/or opt for non-hierarchical estimation, as well as for a diagonal, blocked or full multivariate normal group-level distribution to test individual differences. Prior specification is facilitated through methods that visualize the (implied) prior. A wide range of plotting functions assist in assessing model convergence and posterior inference. Models can be easily evaluated using functions that plot posterior predictions or using relative model comparison metrics such as information criteria or Bayes factors. References: Stevenson et al. (2024) <doi:10.31234/osf.io/2e4dq>.

Maintained by Niek Stevenson. Last updated 22 days ago.

cpp

4.6 match 13 stars 8.25 score 392 scripts

zmeers

flourishcharts:'Flourish' for 'R' and 'Python'

Interactive data visualization for data practitioners. 'flourishcharts' allows users to visualize their data using 'Flourish' graphs that are grounded in data storytelling principles. Users can create racing bar & line charts, as well as other interactive elements commonly found in 'D3' graphics, easily in 'R' and 'Python'. The package relies on an enterprise API provided by 'Flourish', a data visualization platform <https://developers.flourish.studio/api/introduction/>.

Maintained by Zoe Meers. Last updated 5 months ago.

12.6 match 3.00 score 5 scripts

helske

KFAS:Kalman Filter and Smoother for Exponential Family State Space Models

State space modelling is an efficient and flexible framework for statistical inference of a broad class of time series and other data. KFAS includes computationally efficient functions for Kalman filtering, smoothing, forecasting, and simulation of multivariate exponential family state space models, with observations from Gaussian, Poisson, binomial, negative binomial, and gamma distributions. See the paper by Helske (2017) <doi:10.18637/jss.v078.i10> for details.

Maintained by Jouni Helske. Last updated 7 months ago.

dynamic-linear-model exponential-family fortran gaussian-models state-space time-series openblas

3.4 match 64 stars 10.97 score 242 scripts 16 dependents

trinker

wakefield:Generate Random Data Sets

Generates random data sets including: data.frames, lists, and vectors.

Maintained by Tyler Rinker. Last updated 5 years ago.

data-generation wakefield

5.2 match 256 stars 7.13 score 209 scripts

lightbluetitan

crimedatasets:A Comprehensive Collection of Crime-Related Datasets

A comprehensive collection of datasets exclusively focused on crimes, criminal activities, and related topics. This package serves as a valuable resource for researchers, analysts, and students interested in crime analysis, criminology, social and economic studies related to criminal behavior. Datasets span global and local contexts, with a mix of tabular and spatial data.

Maintained by Renzo Caceres Rossi. Last updated 4 months ago.

7.5 match 8 stars 4.90 score 3 scripts

r-computing-lab

discord:Functions for Discordant Kinship Modeling

Functions for discordant kinship modeling (and other sibling-based quasi-experimental designs). Currently, the package contains data restructuring functions and functions for generating biometrically informed data for kin pairs.

Maintained by S. Mason Garrison. Last updated 1 year ago.

7.6 match 4.83 score 34 scripts

hanase

BMA:Bayesian Model Averaging

Package for Bayesian model averaging and variable selection for linear models, generalized linear models and survival models (cox regression).

Maintained by Hana Sevcikova. Last updated 3 months ago.

fortran

3.8 match 38 stars 9.40 score 152 scripts 14 dependents

rstudio

promises:Abstractions for Promise-Based Asynchronous Programming

Provides fundamental abstractions for doing asynchronous programming in R using promises. Asynchronous programming is useful for allowing a single R process to orchestrate multiple tasks in the background while also attending to something else. Semantics are similar to 'JavaScript' promises, but with a syntax that is idiomatic R.

Maintained by Joe Cheng. Last updated 2 months ago.

cpp

2.0 match 204 stars 17.10 score 688 scripts 2.6k dependents

projectmosaic

mosaicData:Project MOSAIC Data Sets

Data sets from Project MOSAIC (<http://www.mosaic-web.org>) used to teach mathematics, statistics, computation and modeling. Funded by the NSF, Project MOSAIC is a community of educators working to tie together aspects of quantitative work that students in science, technology, engineering and mathematics will need in their professional lives, but which are usually taught in isolation, if at all.

Maintained by Randall Pruim. Last updated 1 year ago.

4.0 match 6 stars 8.33 score 632 scripts 8 dependents

insightsengineering

chevron:Standard TLGs for Clinical Trials Reporting

Provides standard table, listing, and graph (TLG) libraries used in clinical trials. This package implements a structure to reformat the data with 'dunlin', create reporting tables using 'rtables' and 'tern' with standardized input arguments to enable quick generation of standard outputs. It also provides comprehensive data checks and script generation functionality.

Maintained by Joe Zhu. Last updated 1 months ago.

clinical-trials graphs listings nest reporting tables

4.0 match 14 stars 8.30 score 12 scripts

clevelandclinicqhs

sociome:Operationalizing Social Determinants of Health Data for Researchers

Accesses raw data via API and calculates social determinants of health measures for user-specified locations in the US, returning them in tidyverse- and sf-compatible data frames.

Maintained by Nik Krieger. Last updated 1 year ago.

6.9 match 30 stars 4.65 score 8 scripts 1 dependent

chris-prener

areal:Areal Weighted Interpolation

A pipeable, transparent implementation of areal weighted interpolation with support for interpolating multiple variables in a single function call. These tools provide a full-featured workflow for validation and estimation that fits into both modern data management (e.g. tidyverse) and spatial data (e.g. sf) frameworks.

Maintained by Christopher Prener. Last updated 3 years ago.

3.5 match 93 stars 8.88 score 106 scripts 4 dependents

tidyverts

tsibbledata:Diverse Datasets for 'tsibble'

Provides diverse datasets in the 'tsibble' data structure. These datasets are useful for learning and demonstrating how tidy temporal data can be tidied, visualised, and forecasted.

Maintained by Mitchell OHara-Wild. Last updated 5 months ago.

dataset tsibble

3.6 match 25 stars 8.44 score 740 scripts 2 dependents

bayes-rules

bayesrules:Datasets and Supplemental Functions from Bayes Rules! Book

Provides datasets and functions used for analysis and visualizations in the Bayes Rules! book (<https://www.bayesrulesbook.com>). The package contains a set of functions that summarize and plot Bayesian models from some conjugate families and another set of functions for evaluation of some Bayesian models.

Maintained by Mine Dogucu. Last updated 3 years ago.

bayesian-statistics data

3.8 match 72 stars 8.06 score 466 scripts

mlr-org

mlr3tuning:Hyperparameter Optimization for 'mlr3'

Hyperparameter optimization package of the 'mlr3' ecosystem. It features highly configurable search spaces via the 'paradox' package and finds optimal hyperparameter configurations for any 'mlr3' learner. 'mlr3tuning' works with several optimization algorithms e.g. Random Search, Iterated Racing, Bayesian Optimization (in 'mlr3mbo') and Hyperband (in 'mlr3hyperband'). Moreover, it can automatically optimize learners and estimate the performance of optimized models with nested resampling.

Maintained by Marc Becker. Last updated 4 months ago.

bbotk hyperparameter-optimization hyperparameter-tuning machine-learning mlr3 optimization tune tuning

2.4 match 55 stars 11.53 score 384 scripts 11 dependents

vlyubchich

lawstat:Tools for Biostatistics, Public Policy, and Law

Statistical tests widely utilized in biostatistics, public policy, and law. Along with the well-known tests for equality of means and variances, randomness, and measures of relative variability, the package contains new robust tests of symmetry, omnibus and directional tests of normality, and their graphical counterparts such as robust QQ plot, robust trend tests for variances, etc. All implemented tests and methods are illustrated by simulations and real-life examples from legal statistics, economics, and biostatistics.

Maintained by Yulia R. Gel. Last updated 2 years ago.

3.5 match 7.34 score 484 scripts 6 dependents

njlyon0

dndR:Dungeons & Dragons Functions for Players and Dungeon Masters

The goal of 'dndR' is to provide a suite of Dungeons & Dragons related functions. This package is meant to be useful both to players and Dungeon Masters (DMs). All functions currently focus on Fifth Edition (a.k.a. "5e") but once the next edition is published functions will likely be expanded to include any rule changes.

Maintained by Nicholas Lyon. Last updated 11 months ago.

data-science dungeons-and-dragons ttrpg

3.6 match 17 stars 6.98 score 16 scripts

elipousson

getACS:Help Wrangling American Community Survey Data from tidycensus

A package with helper functions for working with Census data downloaded with the tidycensus package.

Maintained by Eli Pousson. Last updated 2 months ago.

american-community-survey tidycensus

5.3 match 4 stars 4.68 score 10 scripts

bioc

sevenbridges:Seven Bridges Platform API Client and Common Workflow Language Tool Builder in R

R client and utilities for Seven Bridges platform API, from Cancer Genomics Cloud to other Seven Bridges supported platforms.

Maintained by Phil Webster. Last updated 5 months ago.

software dataimport thirdpartyclient api-client bioconductor bioinformatics cloud common-workflow-language sevenbridges

3.3 match 35 stars 7.40 score 24 scripts

christopherkenny

ppmf:Read Census Privacy Protected Microdata Files

Implements data processing described in <doi:10.1126/sciadv.abk3283> to align modern differentially private data with formatting of older US Census data releases. The primary goal is to read in Census Privacy Protected Microdata Files data in a reproducible way. This includes tools for aggregating to relevant levels of geography by creating geographic identifiers which match the US Census Bureau's numbering. Additionally, there are tools for grouping race numeric identifiers into categories, consistent with OMB (Office of Management and Budget) classifications. Functions exist for downloading and linking to existing sources of privacy protected microdata.

Maintained by Christopher T. Kenny. Last updated 2 years ago.

9.0 match 1 star 2.70 score 3 scripts

mlr-org

bbotk:Black-Box Optimization Toolkit

Features highly configurable search spaces via the 'paradox' package and optimizes every user-defined objective function. The package includes several optimization algorithms e.g. Random Search, Iterated Racing, Bayesian Optimization (in 'mlr3mbo') and Hyperband (in 'mlr3hyperband'). bbotk is the base package of 'mlr3tuning', 'mlr3fselect' and 'miesmuschel'.

Maintained by Marc Becker. Last updated 4 months ago.

bbotk black-box-optimization data-science hyperparameter-optimization hyperparameter-tuning machine-learning mlr3 optimization

2.4 match 22 stars 9.83 score 166 scripts 14 dependents

rpruim

fastR2:Foundations and Applications of Statistics Using R (2nd Edition)

Data sets and utilities to accompany the second edition of "Foundations and Applications of Statistics: an Introduction using R" (R Pruim, published by AMS, 2017), a text covering topics from probability and mathematical statistics at an advanced undergraduate level. R is integrated throughout, and access to all the R code in the book is provided via the snippet() function.

Maintained by Randall Pruim. Last updated 1 year ago.

4.0 match 13 stars 5.85 score 108 scripts

homerhanumat

tigerstats:R Functions for Elementary Statistics

A collection of data sets and functions that are useful in the teaching of statistics at an elementary level to students who may have little or no previous experience with the command line. The functions for elementary inferential procedures follow a uniform interface for user input. Some of the functions are instructional applets that can only be run on the R Studio integrated development environment with package 'manipulate' installed. Other instructional applets are Shiny apps that may be run locally. In teaching, the package is used alongside the packages 'mosaic', 'mosaicData' and 'abd', which are therefore listed as dependencies.

Maintained by Homer White. Last updated 5 years ago.

4.0 match 16 stars 5.74 score 327 scripts

kosukeimai

wru:Who are You? Bayesian Prediction of Racial Category Using Surname, First Name, Middle Name, and Geolocation

Predicts individual race/ethnicity using surname, first name, middle name, geolocation, and other attributes, such as gender and age. The method utilizes Bayes' Rule (with optional measurement error correction) to compute the posterior probability of each racial category for any given individual. The package implements methods described in Imai and Khanna (2016) "Improving Ecological Inference by Predicting Individual Ethnicity from Voter Registration Records" Political Analysis <DOI:10.1093/pan/mpw001> and Imai, Olivella, and Rosenman (2022) "Addressing census data problems in race imputation via fully Bayesian Improved Surname Geocoding and name supplements" <DOI:10.1126/sciadv.adc9824>. The package also incorporates the data described in Rosenman, Olivella, and Imai (2023) "Race and ethnicity data for first, middle, and surnames" <DOI:10.1038/s41597-023-02202-2>.

Maintained by Brandon Bertelsen. Last updated 10 months ago.

cpp

2.9 match 133 stars 7.54 score 146 scripts

tidymodels

shinymodels:Interactive Assessments of Models

Launch a 'shiny' application for 'tidymodels' results. For classification or regression models, the app can be used to determine if there is lack of fit or poorly predicted points.

Maintained by Simon Couch. Last updated 5 months ago.

shiny

3.5 match 48 stars 6.21 score 48 scripts

dtkaplan

LSTbook:Data and Software for "Lessons in Statistical Thinking"

"Lessons in Statistical Thinking" D.T. Kaplan (2014) <https://dtkaplan.github.io/Lessons-in-statistical-thinking/> is a textbook for a first or second course in statistics that embraces data wrangling, causal reasoning, modeling, statistical adjustment, and simulation. 'LSTbook' supports the student-centered, tidy, pipeline-oriented computing style featured in the book.

Maintained by Daniel Kaplan. Last updated 5 days ago.

3.4 match 4 stars 6.32 score 27 scripts

tyee001

VGAMdata:Data Supporting the 'VGAM' Package

Mainly data sets to accompany the VGAM package and the book "Vector Generalized Linear and Additive Models: With an Implementation in R" (Yee, 2015) <DOI:10.1007/978-1-4939-2818-7>. These are used to illustrate vector generalized linear and additive models (VGLMs/VGAMs), and associated models (Reduced-Rank VGLMs, Quadratic RR-VGLMs, Row-Column Interaction Models, and constrained and unconstrained ordination models in ecology). This package now contains some old VGAM family functions which have been replaced by newer ones (often because they are now special cases).

Maintained by Thomas Yee. Last updated 2 months ago.

7.0 match 1 star 2.94 score 95 scripts 1 dependent

teachinglab

tlShiny:Supplies essential functions to Teaching Lab dashboards

A bunch of random functions I use in developing dashboards. Needs to vastly reduce the number of dependencies at the moment.

Maintained by Duncan Gates. Last updated 4 days ago.

6.8 match 3.04 score

a2-ai

scicalc:Scientific Calculations for Quantitative Clinical Pharmacology and Pharmacometrics Analysis

Utility functions helpful for reproducible scientific calculations.

Maintained by Matthew Smith. Last updated 2 months ago.

5.0 match 1 star 4.04 score 4 scripts

openintrostat

cherryblossom:Cherry Blossom Run Race Results

Race results of the Cherry Blossom Run, which is an annual road race that takes place in Washington, DC.

Maintained by Mine Çetinkaya-Rundel. Last updated 1 year ago.

data openintro

3.9 match 6 stars 5.11 score 28 scripts 1 dependent

svmiller

peacesciencer:Tools and Data for Quantitative Peace Science Research

These are useful tools and data sets for the study of quantitative peace science. The goal for this package is to include tools and data sets for doing original research that mimics well what a user would have to previously get from a software package that may not be well-sourced or well-supported. Those software bundles were useful to the extent that they encouraged replications of long-standing analyses by starting the data-generating process from scratch. However, a lot of the functionality can be done relatively quickly and more transparently in the R programming language.

Maintained by Steve Miller. Last updated 19 days ago.

eugene peace-science

3.6 match 29 stars 5.49 score 211 scripts

shabbychef

ohenery:Modeling of Ordinal Random Variables via Softmax Regression

Supports the modeling of ordinal random variables, like the outcomes of races, via Softmax regression, under the Harville <doi:10.1080/01621459.1973.10482425> and Henery <doi:10.1111/j.2517-6161.1981.tb01153.x> models.

Maintained by Steven E. Pav. Last updated 1 year ago.

cpp

4.5 match 6 stars 4.26 score 6 scripts 2 dependents

statmanrobin

Stat2Data:Datasets for Stat2

Datasets for the textbook Stat2: Modeling with Regression and ANOVA (second edition). The package also includes data for the first edition, Stat2: Building Models for a World of Data and a few functions for plotting diagnostics.

Maintained by Robin Lock. Last updated 6 years ago.

3.8 match 5 stars 4.94 score 544 scripts

usepa

httk:High-Throughput Toxicokinetics

Pre-made models that can be rapidly tailored to various chemicals and species using chemical-specific in vitro data and physiological information. These tools allow incorporation of chemical toxicokinetics ("TK") and in vitro-in vivo extrapolation ("IVIVE") into bioinformatics, as described by Pearce et al. (2017) (<doi:10.18637/jss.v079.i04>). Chemical-specific in vitro data characterizing toxicokinetics have been obtained from relatively high-throughput experiments. The chemical-independent ("generic") physiologically-based ("PBTK") and empirical (for example, one compartment) "TK" models included here can be parameterized with in vitro data or in silico predictions which are provided for thousands of chemicals, multiple exposure routes, and various species. High throughput toxicokinetics ("HTTK") is the combination of in vitro data and generic models. We establish the expected accuracy of HTTK for chemicals without in vivo data through statistical evaluation of HTTK predictions for chemicals where in vivo data do exist. The models are systems of ordinary differential equations that are developed in MCSim and solved using compiled (C-based) code for speed. A Monte Carlo sampler is included for simulating human biological variability (Ring et al., 2017 <doi:10.1016/j.envint.2017.06.004>) and propagating parameter uncertainty (Wambaugh et al., 2019 <doi:10.1093/toxsci/kfz205>). Empirically calibrated methods are included for predicting tissue:plasma partition coefficients and volume of distribution (Pearce et al., 2017 <doi:10.1007/s10928-017-9548-7>). These functions and data provide a set of tools for using IVIVE to convert concentrations from high-throughput screening experiments (for example, Tox21, ToxCast) to real-world exposures via reverse dosimetry (also known as "RTK") (Wetmore et al., 2015 <doi:10.1093/toxsci/kfv171>).

Maintained by John Wambaugh. Last updated 2 months ago.

comptox ord

1.7 match 27 stars 10.22 score 307 scripts 1 dependents

salernos

svycdiff:Controlled Difference Estimation for Complex Surveys

Estimates the population average controlled difference for a given outcome between levels of a binary treatment, exposure, or other group membership variable of interest for clustered, stratified survey samples where sample selection depends on the comparison group. Provides three methods for estimation, namely outcome modeling and two factorizations of inverse probability weighting. Under stronger assumptions, these methods estimate the causal population average treatment effect. Salerno et al., (2024) <doi:10.48550/arXiv.2406.19597>.

Maintained by Stephen Salerno. Last updated 26 days ago.

3.6 match 1 stars 4.54 score 4 scripts

tslumley

rimu:Responses in Multiplex

Tools for manipulating, exploring, and visualising multiple-response data, including scored or ranked responses. Conversions to and from factors, lists, strings, matrices; reordering, lumping, flattening; set operations; tables; frequency and co-occurrence plots.

Maintained by Thomas Lumley. Last updated 1 years ago.

3.3 match 4 stars 4.78 score 10 scripts

impaug

ComradesM:The Comrades Marathon 1921 to 2019

Datasets related to the Comrades Marathon used in the book Antony Unwin (2024, ISBN:978-0367674007) "Getting (more out of) Graphics". The main dataset contains the times of every runner that finished in the time limit for each year the race was run.

Maintained by Antony Unwin. Last updated 7 months ago.

9.0 match 1.70 score

kjhealy

covdata:COVID-19 Data

COVID-19 related data from the ECDC, the COVID-19 Tracking Project, the New York Times, the Human Mortality Database, and Apple. Packaged for R.

Maintained by Kieran Healy. Last updated 2 years ago.

3.2 match 83 stars 4.73 score 129 scripts

cran

ei:Ecological Inference

Software accompanying Gary King's book: A Solution to the Ecological Inference Problem. (1997). Princeton University Press. ISBN 978-0691012407.

Maintained by James Honaker. Last updated 8 years ago.

7.6 match 1 stars 2.00 score

rtdists

rtdists:Response Time Distributions

Provides response time distributions (density/PDF, distribution function/CDF, quantile function, and random generation): (a) Ratcliff diffusion model (Ratcliff & McKoon, 2008, <doi:10.1162/neco.2008.12-06-420>) based on C code by Andreas and Jochen Voss and (b) linear ballistic accumulator (LBA; Brown & Heathcote, 2008, <doi:10.1016/j.cogpsych.2007.12.002>) with different distributions underlying the drift rate.

Maintained by Henrik Singmann. Last updated 3 years ago.

cpp

1.7 match 46 stars 8.85 score 116 scripts 2 dependents

bayesball

ProbBayes:Probability and Bayesian Modeling

Functions and datasets to accompany J. Albert and J. Hu, "Probability and Bayesian Modeling", CRC Press, (2019, ISBN: 1138492566).

Maintained by Jim Albert. Last updated 4 years ago.

3.5 match 5 stars 4.30 score 80 scripts

ctn-0094

public.ctn0094extra:Helper Files for the CTN-0094 Relational Database

Engineered features and "helper" functions ancillary to the 'public.ctn0094data' package, extending this package for ease of use (see <https://CRAN.R-project.org/package=public.ctn0094data>). The 'public.ctn0094data' package contains harmonized datasets from some of the National Institute on Drug Abuse's Clinical Trials Network (NIDA's CTN) projects. Specifically, the CTN-0094 project harmonizes and de-identifies clinical trials data from the CTN-0027, CTN-0030, and CTN-0051 studies for opioid use disorder. This current version is built from 'public.ctn0094data' v. 1.0.6.

Maintained by Gabriel Odom. Last updated 1 years ago.

3.6 match 3.88 score 15 scripts

bioc

GenomicDataCommons:NIH / NCI Genomic Data Commons Access

Programmatically access the NIH / NCI Genomic Data Commons RESTful service.

Maintained by Sean Davis. Last updated 2 months ago.

dataimport sequencing api-client bioconductor bioinformatics cancer core-services data-science genomics nci tcga vignette

1.2 match 87 stars 11.94 score 238 scripts 12 dependents

marchionnilab

covid19census:Extracts Covid-19 and other demographic metrics regarding U.S.A and Italy

Package with functions to scrape data regarding the COVID-19 epidemic in the U.S.A. and Italy, as well as datasets with related indexes.

Maintained by claudio_zanettini. Last updated 4 years ago.

6.6 match 2 stars 2.00 score 5 scripts

arilamstein

choroplethr:Simplify the Creation of Choropleth Maps in R

Choropleths are thematic maps where geographic regions, such as states, are colored according to some metric, such as the number of people who live in that state. This package simplifies this process by 1. Providing ready-made functions for creating choropleths of common maps. 2. Providing data and API connections to interesting data sources for making choropleths. 3. Providing a framework for creating choropleths from arbitrary shapefiles. 4. Overlaying those maps over reference maps from Google Maps.

Maintained by Zhaochen He. Last updated 15 hours ago.

1.7 match 3 stars 7.37 score 860 scripts 1 dependents

lhvanegasp

glmtoolbox:Set of Tools to Data Analysis using Generalized Linear Models

Set of tools for the statistical analysis of data using: (1) normal linear models; (2) generalized linear models; (3) negative binomial regression models as alternatives to Poisson regression models in the presence of overdispersion; (4) beta-binomial and random-clumped binomial regression models as alternatives to binomial regression models in the presence of overdispersion; (5) zero-inflated and zero-altered regression models to deal with excess zeros in count data; (6) generalized nonlinear models; (7) generalized estimating equations for cluster-correlated data.

Maintained by Luis Hernando Vanegas. Last updated 8 months ago.

4.0 match 1 stars 3.10 score 149 scripts

cran

glarma:Generalized Linear Autoregressive Moving Average Models

Functions are provided for estimation, testing, diagnostic checking and forecasting of generalized linear autoregressive moving average (GLARMA) models for discrete valued time series with regression variables. These are a class of observation driven non-linear non-Gaussian state space models. The state vector consists of a linear regression component plus an observation driven component consisting of an autoregressive-moving average (ARMA) filter of past predictive residuals. Currently three distributions (Poisson, negative binomial and binomial) can be used for the response series. Three options (Pearson, score-type and unscaled) for the residuals in the observation driven component are available. Estimation is via maximum likelihood (conditional on initializing values for the ARMA process) optimized using Fisher scoring or Newton Raphson iterative methods. Likelihood ratio and Wald tests for the observation driven component allow testing for serial dependence in generalized linear model settings. Graphical diagnostics including model fits, autocorrelation functions and probability integral transform residuals are included in the package. Several standard data sets are included in the package.

Maintained by William T.M. Dunsmuir. Last updated 4 days ago.

3.8 match 2 stars 3.26 score 1 dependents

lbenitesanchez

FMsmsnReg:Regression Models with Finite Mixtures of Skew Heavy-Tailed Errors

Fit linear regression models where the random errors follow a finite mixture of skew heavy-tailed distributions.

Maintained by Luis Benites Sanchez. Last updated 8 years ago.

3.6 match 3 stars 3.18 score 9 scripts

rkabacoff

qacBase:Functions to Facilitate Exploratory Data Analysis

Functions for descriptive statistics, data management, and data visualization.

Maintained by Kabacoff Robert. Last updated 3 years ago.

eda statistics

2.2 match 1 stars 5.13 score 45 scripts

dfe-analytical-services

shinyGovstyle:Custom Gov Style Inputs for Shiny

Collection of 'shiny' application styling that is based on the GOV.UK Design System. See <https://design-system.service.gov.uk/components/> for details.

Maintained by Ross Wyatt. Last updated 8 days ago.

1.6 match 44 stars 6.77 score 25 scripts

feddelegrand7

ddplot:Create D3 Based SVG Graphics

Create 'D3' based 'SVG' ('Scalable Vector Graphics') graphics using a simple 'R' API. The package aims to simplify the creation of many 'SVG' plot types using a straightforward 'R' API. The package relies on the 'r2d3' 'R' package and the 'D3' 'JavaScript' library. See <https://rstudio.github.io/r2d3/> and <https://d3js.org/> respectively.

Maintained by Mohamed El Fodil Ihaddaden. Last updated 2 years ago.

d3 d3js dataviz javascript

1.9 match 44 stars 5.64 score 7 scripts

nlsy-links

NlsyLinks:Utilities and Kinship Information for Research with the NLSY

Utilities and kinship information for behavior genetics and developmental research using the National Longitudinal Survey of Youth (NLSY; <https://www.nlsinfo.org/>).

Maintained by S. Mason Garrison. Last updated 23 days ago.

behavior-genetics kinship-information national-longitudinal-survey nlsy

1.3 match 7 stars 7.49 score 185 scripts

pdhoff

eigenmodel:Semiparametric Factor and Regression Models for Symmetric Relational Data

Estimation of the parameters in a model for symmetric relational data (e.g., the above-diagonal part of a square matrix), using a model-based eigenvalue decomposition and regression. Missing data is accommodated, and a posterior mean for missing data is calculated under the assumption that the data are missing at random. The marginal distribution of the relational data can be arbitrary, and is fit with an ordered probit specification. See Hoff (2007) <arXiv:0711.1146> for details on the model.

Maintained by Peter Hoff. Last updated 6 years ago.

3.4 match 2.83 score 22 scripts 6 dependents

pattaroc

nephro:Utilities for Nephrology

Set of functions to estimate kidney function and other traits of interest in nephrology.

Maintained by Cristian Pattaro. Last updated 2 months ago.

3.4 match 4 stars 2.82 score 11 scripts

fangzhou-xie

rethnicity:Predicting Ethnic Group from Names

Implementation of the race/ethnicity prediction method, described in "rethnicity: An R package for predicting ethnicity from names" by Fangzhou Xie (2022) <doi:10.1016/j.softx.2021.100965> and "Rethnicity: Predicting Ethnicity from Names" by Fangzhou Xie (2021) <doi:10.48550/arXiv.2109.09228>.

Maintained by Fangzhou Xie. Last updated 20 days ago.

ethnicity-classifier ethnicity-prediction lstm cpp

1.7 match 9 stars 5.66 score 17 scripts

dkangeyan

OneSampleLogRankTest:One-Sample Log-Rank Test

The log-rank test is performed to compare survival outcomes between two groups. When there is no proper control group or obtaining such data is cumbersome, a one-sample log-rank test can be applied. This package performs the one-sample log-rank test as described in Finkelstein et al. (2003) <doi:10.1093/jnci/djt227> and a variation of the test for small sample sizes, which is detailed in FD Liddell (1984) <doi:10.1136/jech.38.1.85>. The visualization function in the package generates a Kaplan-Meier curve comparing the survival curve of the general population against that of the population of interest.

Maintained by Divy Kangeyan. Last updated 1 years ago.

3.5 match 2.70 score

ekstroem

isdals:Datasets for Introduction to Statistical Data Analysis for the Life Sciences

Provides datasets for the book "Introduction to Statistical Data Analysis for the Life Sciences, Second edition" by Ekstrøm and Sørensen (2014).

Maintained by Claus Ekstrom. Last updated 2 years ago.

3.8 match 2.51 score 108 scripts 1 dependents

juba

robservable:Import an Observable Notebook as HTML Widget

Allows loading and displaying an Observable notebook (online JavaScript notebooks powered by <https://observablehq.com>) as an HTML Widget in an R session, 'shiny' application or 'rmarkdown' document.

Maintained by Julien Barnier. Last updated 7 months ago.

htmlwidgets observable

1.3 match 164 stars 6.99 score 40 scripts

dietrichson

ProPublicaR:Access Functions for ProPublica's APIs

Provides wrapper functions to access ProPublica's Congress and Campaign Finance APIs. The Congress API provides near real-time access to legislative data from the House of Representatives, the Senate and the Library of Congress. The Campaign Finance API provides data from United States Federal Election Commission filings and other sources. The API covers summary information for candidates and committees, as well as certain types of itemized data. For more information about these APIs go to: <https://www.propublica.org/datastore/apis>.

Maintained by Aleksander Dietrichson. Last updated 2 years ago.

2.0 match 12 stars 4.38 score 1 scripts

homerhanumat

tigerData:GC Statistics Datasets

A small, informal collection of datasets useful in undergraduate statistics courses.

Maintained by Homer White. Last updated 2 months ago.

4.0 match 2.18 score 6 scripts

usaid-oha-si

mindthegap:Mind the Gap

Package to tidy UNAIDS estimates (from the EDMS database) as well as plot trends in UNAIDS 95 goals and ART coverage gap by country.

Maintained by Karishma Srikanth. Last updated 3 months ago.

1.5 match 5 stars 5.51 score 13 scripts

sgibb

readBrukerFlexData:Reads Mass Spectrometry Data in Bruker *flex Format

Reads data files acquired by Bruker Daltonics' matrix-assisted laser desorption/ionization-time-of-flight mass spectrometer of the *flex series.

Maintained by Sebastian Gibb. Last updated 6 months ago.

bruker-daltonics import maldi-tof-ms

1.6 match 13 stars 5.04 score 11 scripts 5 dependents

jrnold

smss:Datasets for Agresti and Finlay's "Statistical Methods for the Social Sciences"

Datasets used in "Statistical Methods for the Social Sciences" (SMSS) by Alan Agresti and Barbara Finlay.

Maintained by Jeffrey B. Arnold. Last updated 9 years ago.

3.8 match 2.10 score 25 scripts

rolkra

codebreaker:Retro Logic Game

Logic game in the style of the early 1980s home computers that can be played in the R console. This game is inspired by Mastermind, a game that became popular in the 1970s. Can you break the code?

Maintained by Roland Krasser. Last updated 2 years ago.

1.8 match 11 stars 3.74 score 6 scripts

vinhdizzo

IRexamples:Collection of Practical Institutional Research Examples and Tutorials

Provides examples of code for analyzing data or accomplishing tasks that may be useful to institutional or educational researchers.

Maintained by Vinh Nguyen. Last updated 2 years ago.

1.3 match 4 stars 5.00 score 4 scripts

cjgeyer

CatDataAnalysis:Datasets for Categorical Data Analysis by Agresti

Datasets used in the book "Categorical Data Analysis" by Agresti (2012, ISBN:978-0-470-46363-5) but not printed in the book. Datasets and help pages were automatically produced from the source <https://users.stat.ufl.edu/~aa/cda/data.html> by the R script foo.R, which can be found in the GitHub repository.

Maintained by Charles J. Geyer. Last updated 3 years ago.

6.6 match 1.00 score 5 scripts

jgoungounga

xhaz:Excess Hazard Modelling Considering Inappropriate Mortality Rates

Fits relative survival regression models with or without proportional excess hazards and with the additional possibility to correct for background mortality by one or more parameter(s). These models are relevant when the observed mortality in the studied group is not comparable to that of the general population or in population-based studies where the available life tables used for net survival estimation are insufficiently stratified. In the latter case, the proposed model by Touraine et al. (2020) <doi:10.1177/0962280218823234> can be used. The user can also fit a model that relaxes the proportional expected hazards assumption considered in the Touraine et al. excess hazard model. This extension was proposed by Mba et al. (2020) <doi:10.1186/s12874-020-01139-z> to allow non-proportional effects of the additional variable on the general population mortality. In non-population-based studies, researchers can identify non-comparability source of bias in terms of expected mortality of selected individuals. An excess hazard model correcting this selection bias is presented in Goungounga et al. (2019) <doi:10.1186/s12874-019-0747-3>. This class of model with a random effect at the cluster level on excess hazard is presented in Goungounga et al. (2023) <doi:10.1002/bimj.202100210>.

Maintained by Juste Goungounga. Last updated 9 months ago.

2.1 match 3.04 score 11 scripts

christopherkenny

royale:Clash Royale API

R interface to the official API for Clash Royale <https://developer.clashroyale.com/#/>.

Maintained by Christopher T. Kenny. Last updated 1 years ago.

3.8 match 1.70 score 4 scripts

sammorrissette

SC2API:Blizzard SC2 API Wrapper

A wrapper for Blizzard's Starcraft II (a 2010 real-time strategy game) Application Programming Interface (API). All documented API calls are implemented in an easy-to-use and consistent manner.

Maintained by Samuel Morrissette. Last updated 5 years ago.

1.5 match 1 stars 3.70 score 4 scripts

ecmerkle

smdata:Data to Accompany Smithson & Merkle, 2013

Contains data files to accompany Smithson & Merkle (2013), Generalized Linear Models for Categorical and Continuous Limited Dependent Variables.

Maintained by Ed Merkle. Last updated 7 years ago.

3.6 match 1.46 score 29 scripts

lmiratrix

elec:Collection of Functions for Statistical Election Audits

This is a bizarre collection of functions written to do various sorts of statistical election audits. There are also functions to generate simulated voting data, and simulated "truth" so as to do simulations to check characteristics of these methods.

Maintained by Luke Miratrix. Last updated 3 years ago.

2.3 match 2.30 score 20 scripts

economic

epiextractr:Tools to use Economic Policy Institute Microdata Extracts

Tools to download and load the EPI microdata extracts from microdata.epi.org.

Maintained by Ben Zipperer. Last updated 7 months ago.

1.2 match 5 stars 4.26 score 36 scripts

skranz

RelationalContracts:Characterize relational contracts in repeated or stochastic games

Characterize relational contracts in repeated or stochastic games. Can also analyse repeated negotiation equilibria.

Maintained by Sebastian Kranz. Last updated 4 years ago.

dynamic-game economics game-theory hold-up nash-equilibrium repeated-game stochastic-game

2.0 match 4 stars 2.48 score 15 scripts

jorgecastillomateo

RecordTest:Inference Tools in Time Series Based on Record Statistics

Statistical tools based on the probabilistic properties of the record occurrence in a sequence of independent and identically distributed continuous random variables. In particular, tools to prepare a time series as well as distribution-free trend and change-point tests and graphical tools to study the record occurrence. Details about the implemented tools can be found in Castillo-Mateo et al. (2023a) <doi:10.18637/jss.v106.i05> and Castillo-Mateo et al. (2023b) <doi:10.1016/j.atmosres.2023.106934>.

Maintained by Jorge Castillo-Mateo. Last updated 2 years ago.

hypothesis-testing record-breaking

1.2 match 1 stars 3.70 score 3 scripts

cran

IncomPair:Comparison of Means for the Incomplete Paired Data

Implements a variety of nonparametric and parametric methods that are commonly used when the data set is a mixture of paired observations and independent samples. The package also calculates and returns values of different tests with their corresponding p-values. Bhoj, D. S. (1991) <doi:10.1002/bimj.4710330108> "Testing equality of means in the presence of correlation and missing data". Dubnicka, S. R., Blair, R. C., and Hettmansperger, T. P. (2002) <doi:10.22237/jmasm/1020254460> "Rank-based procedures for mixed paired and two-sample designs". Einsporn, R. L. and Habtzghi, D. (2013) <https://pdfs.semanticscholar.org/89a3/90bafeb2bc41ed4414533cfd5ab84a6b54b6.pdf> "Combining paired and two-sample data using a permutation test". Ekbohm, G. (1976) <doi:10.1093/biomet/63.2.299> "On comparing means in the paired case with incomplete data on both responses". Lin, P. E. and Stivers, L. E. (1974) <doi:10.1093/biomet/61.2.325> "On difference of means with incomplete data". Maritz, J. S. (1995) <doi:10.1111/j.1467-842x.1995.tb00649.x> "A permutation paired test allowing for missing values".

Maintained by Desale Habtzghi. Last updated 5 years ago.

4.0 match 1.00 score

modeloriented

fairmodels:Flexible Tool for Bias Detection, Visualization, and Mitigation

Measure fairness metrics in one place for many models. Check how large a model's bias is towards different races, sexes, nationalities, etc. Use measures such as Statistical Parity and Equal odds to detect discrimination against unprivileged groups. Visualize the bias using a heatmap, radar plot, biplot, bar chart (and more!). Various pre-processing and post-processing bias mitigation algorithms are implemented. The package also supports calculating fairness metrics for regression models. Find more details in (Wiśniewski, Biecek (2021)) <arXiv:2104.00507>.

Maintained by Jakub Wiśniewski. Last updated 2 months ago.

explain-classifiers explainable-ml fairness fairness-comparison fairness-ml model-evaluation

0.5 match 87 stars 7.73 score 51 scripts 1 dependents

cran

acc:Exploring Accelerometer Data

Processes accelerometer data from uni-axial and tri-axial devices, and generates data summaries. Also includes functions to plot, analyze, and simulate accelerometer data.

Maintained by Jaejoon Song. Last updated 8 years ago.

openblas cpp

1.8 match 1 stars 1.78 score

thlytras

rspiro:Implementation of Spirometry Equations

Implementation of various spirometry equations in R, currently the GLI-2012 (Global Lung Initiative; Quanjer et al. 2012 <doi:10.1183/09031936.00080312>), the race-neutral GLI global 2022 (Global Lung Initiative; Bowerman et al. 2023 <doi:10.1164/rccm.202205-0963OC>), the NHANES3 (National Health and Nutrition Examination Survey; Hankinson et al. 1999 <doi:10.1164/ajrccm.159.1.9712108>) and the JRS 2014 (Japanese Respiratory Society; Kubota et al. 2014 <doi:10.1016/j.resinv.2014.03.003>) equations. Also the GLI-2017 diffusing capacity equations <doi:10.1183/13993003.00010-2017> are implemented. Contains user-friendly functions to calculate predicted and LLN (Lower Limit of Normal) values for different spirometric parameters such as FEV1 (Forced Expiratory Volume in 1 second), FVC (Forced Vital Capacity), etc, and to convert absolute spirometry measurements to percent (%) predicted and z-scores.

Maintained by Theodore Lytras. Last updated 15 days ago.

0.5 match 15 stars 5.53 score 28 scripts

cran

SimSST:Simulated Stop Signal Task Data

Stop signal task data of go and stop trials is generated per participant. The simulation process is based on the generally non-independent horse race model and a fixed stop signal delay or tracking method. Each of the go and stop processes is assumed to have an exponentially modified Gaussian (ExG) or Shifted Wald (SW) distribution. The output data can be converted to 'BEESTS' software input data, enabling researchers to test and evaluate various brain stopping processes manifested by ExG or SW distributional parameters of interest. Methods are described in: Soltanifar M (2020) <https://hdl.handle.net/1807/101208>, Matzke D, Love J, Wiecki TV, Brown SD, Logan GD and Wagenmakers E-J (2013) <doi:10.3389/fpsyg.2013.00918>, Logan GD, Van Zandt T, Verbruggen F, Wagenmakers EJ. (2014) <doi:10.1037/a0035230>.

Maintained by Chel Hee Lee. Last updated 2 years ago.

0.5 match 1 stars 2.70 score

bbartholdy

hitchr:A random sample generator based on The Hitchhiker's Guide to the Galaxy

Generates random samples containing races described in The Hitchhiker's Guide to the Galaxy.

Maintained by Bjørn Peare Bartholdy. Last updated 3 years ago.

hitchhikers-guide missing-data sample-generation

0.6 match 1.70 score

cran

BCRA:Breast Cancer Risk Assessment

Functions provide risk projections of invasive breast cancer based on Gail model according to National Cancer Institute's Breast Cancer Risk Assessment Tool algorithm for specified race/ethnic groups and age intervals. Gail MH, Brinton LA, et al (1989) <doi:10.1093/jnci/81.24.1879>. Marthew PB, Gail MH, et al (2016) <doi:10.1093/jnci/djw215>.

Maintained by Fanni Zhang. Last updated 5 years ago.

0.5 match 1 stars 1.30 score

cran

FairMclus:Clustering for Data with Sensitive Attribute

Clustering for categorical and mixed-type data, to prevent classification biases due to race, gender or other sensitive attributes. This algorithm is an extension of the methodology proposed by "Santos & Heras (2020) <doi:10.28945/4643>".

Maintained by Carlos Santos-Mangudo. Last updated 3 years ago.

0.5 match 1.00 score

tfliaoui

iIneq:Computing Individual Components of the Gini and the Theil Indices

Computes individual contributions to the overall Gini and Theil's T and Theil's L measures and their decompositions by groups such as race, gender, and national origin, with the three functions iGini(), iTheilT(), and iTheilL(). For details, see Tim F. Liao (2019) <doi:10.1177/0049124119875961>.

Maintained by Tim Liao. Last updated 4 years ago.

0.5 match 1.00 score