R-universe search: tabulations

mlverse

tabulate:Pretty Console Output for Tables

Generates pretty console output for tables allowing for full customization of cell colors, font type, borders and many other attributes. It also supports 'multibyte' characters and nested tables.

Maintained by Daniel Falbel. Last updated 3 years ago.

cpp

53.4 match 39 stars 5.29 score 33 scripts

insightsengineering

tern:Create Common TLGs Used in Clinical Trials

Tables, Listings, and Graphs (TLG) library for common outputs used in clinical trials.

Maintained by Joe Zhu. Last updated 2 months ago.

clinical-trials graphs listings nest outputs tables

20.2 match 79 stars 12.62 score 186 scripts 9 dependents

skhiggins

tabulator:Efficient Tabulation with Stata-Like Output

Efficient tabulation with Stata-like output. For each unique value of the variable, it shows the number of observations with that value, proportion of observations with that value, and cumulative proportion, in descending order of frequency. Accepts data.table, tibble, or data.frame as input. Efficient with big data: if you give it a data.table, tab() uses data.table syntax.

Maintained by Sean Higgins. Last updated 4 years ago.

55.8 match 12 stars 4.14 score 23 scripts

insightsengineering

rtables:Reporting Tables

Reporting tables often have structure that goes beyond simple rectangular data. The 'rtables' package provides a framework for declaring complex multi-level tabulations and then applying them to data. This framework models both tabulation and the resulting tables as hierarchical, tree-like objects which support sibling sub-tables, arbitrary splitting or grouping of data in row and column dimensions, cells containing multiple values, and the concept of contextual summary computations. A convenient pipe-able interface is provided for declaring table layouts and the corresponding computations, and then applying them to data.

Maintained by Joe Zhu. Last updated 2 months ago.

pharmaceuticals tables

15.8 match 232 stars 13.65 score 238 scripts 17 dependents

sfirke

janitor:Simple Tools for Examining and Cleaning Dirty Data

The main janitor functions can: perfectly format data.frame column names; provide quick counts of variable combinations (i.e., frequency tables and crosstabs); and explore duplicate records. Other janitor functions nicely format the tabulation results. These tabulate-and-report functions approximate popular features of SPSS and Microsoft Excel. This package follows the principles of the "tidyverse" and works well with the pipe function %>%. janitor was built with beginning-to-intermediate R users in mind and is optimized for user-friendliness.

Maintained by Sam Firke. Last updated 3 months ago.

data-analysis data-cleaning data-science dirty-data excel pivot-tables spss tabulations tidyverse

10.8 match 1.4k stars 19.15 score 35k scripts 231 dependents

davidgohel

flextable:Functions for Tabular Reporting

Use a grammar for creating and customizing pretty tables. The following formats are supported: 'HTML', 'PDF', 'RTF', 'Microsoft Word', 'Microsoft PowerPoint' and R 'Grid Graphics'. 'R Markdown', 'Quarto' and the package 'officer' can be used to produce the result files. The syntax is the same for the user regardless of the type of output to be produced. A set of functions allows the creation, definition of cell arrangement, addition of headers or footers, formatting and definition of cell content with text and/or images. The package also offers a set of high-level functions that allow tabular reporting of statistical models and the creation of complex cross tabulations.

Maintained by David Gohel. Last updated 1 month ago.

docx html5 ms-office-documents rmarkdown table

11.4 match 583 stars 17.04 score 7.3k scripts 119 dependents

poissonconsulting

ypr:Yield Per Recruit

An implementation of equilibrium-based yield per recruit methods. Yield per recruit methods can be used to estimate the optimal yield for a fish population as described by Walters and Martell (2004) <isbn:0-691-11544-3>. The yield can be based on the number of fish caught (or harvested) or biomass caught for all fish or just large (trophy) individuals.

Maintained by Joe Thorley. Last updated 2 months ago.

yield-per-recruit

12.4 match 7 stars 7.84 score 55 scripts 1 dependent

eoda-dev

rtabulator:R Bindings for 'Tabulator JS'

Provides R bindings for 'Tabulator JS' <https://tabulator.info/>. Makes it a breeze to create highly customizable interactive tables in 'rmarkdown' documents and 'shiny' applications. It includes filtering, grouping, editing, input validation, history recording, column formatters, packaged themes and more.

Maintained by Stefan Kuethe. Last updated 4 months ago.

bindings htmlwidgets rlang shiny spreadsheet table tabulator-js

21.2 match 11 stars 4.44 score 9 scripts

dcomtois

summarytools:Tools to Quickly and Neatly Summarize Data

Data frame summaries, cross-tabulations, weight-enabled frequency tables and common descriptive (univariate) statistics in concise tables available in a variety of formats (plain ASCII, Markdown and HTML). A good point-of-entry for exploring data, both for experienced and new R users.

Maintained by Dominic Comtois. Last updated 1 day ago.

descriptive-statistics frequency-table html-report markdown pander pandoc pandoc-markdown rmarkdown rstudio

5.2 match 526 stars 14.52 score 2.9k scripts 6 dependents

shannonpileggi

gtreg:Regulatory Tables for Clinical Research

Creates tables suitable for regulatory agency submission by leveraging the 'gtsummary' package as the back end. Tables can be exported to HTML, Word, PDF and more. Highly customized outputs are available by utilizing existing styling functions from 'gtsummary' as well as custom options designed for regulatory tables.

Maintained by Shannon Pileggi. Last updated 25 days ago.

9.4 match 37 stars 6.92 score 30 scripts

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 year ago.

7.8 match 3 stars 8.20 score 7.8k scripts 11 dependents

larmarange

ggstats:Extension to 'ggplot2' for Plotting Stats

Provides new statistics, new geometries and new positions for 'ggplot2' and a suite of functions to facilitate the creation of statistical plots.

Maintained by Joseph Larmarange. Last updated 6 days ago.

4.9 match 37 stars 13.08 score 190 scripts 156 dependents

njtierney

naniar:Data Structures, Summaries, and Visualisations for Missing Data

Missing values are ubiquitous in data and need to be explored and handled in the initial stages of analysis. 'naniar' provides data structures and functions that facilitate the plotting of missing values and examination of imputations. This allows missing data dependencies to be explored with minimal deviation from the common work patterns of 'ggplot2' and tidy data. The work is fully discussed at Tierney & Cook (2023) <doi:10.18637/jss.v105.i07>.

Maintained by Nicholas Tierney. Last updated 4 days ago.

data-visualisation ggplot2 missing-data missingness tidy-data

4.0 match 657 stars 15.63 score 5.1k scripts 9 dependents

ggobi

GGally:Extension to 'ggplot2'

The R package 'ggplot2' is a plotting system based on the grammar of graphics. 'GGally' extends 'ggplot2' by adding several functions to reduce the complexity of combining geometric objects with transformed data. Some of these functions include a pairwise plot matrix, a two group pairwise plot matrix, a parallel coordinates plot, a survival plot, and several functions to plot networks.

Maintained by Barret Schloerke. Last updated 10 months ago.

3.7 match 597 stars 16.15 score 17k scripts 154 dependents

bioc

BiocGenerics:S4 generic functions used in Bioconductor

The package defines many S4 generic functions used in Bioconductor.

Maintained by Hervé Pagès. Last updated 1 month ago.

infrastructure bioconductor-package core-package

4.1 match 12 stars 14.22 score 612 scripts 2.2k dependents

wraff

wrProteo:Proteomics Data Analysis Functions

Data analysis of proteomics experiments by mass spectrometry is supported by this collection of functions mostly dedicated to the analysis of (bottom-up) quantitative (XIC) data. Fasta-formatted proteomes (e.g. from UniProt Consortium <doi:10.1093/nar/gky1049>) can be read with automatic parsing and multiple annotation types (like species origin, abbreviated gene names, etc.) extracted. Initial results from multiple software for protein (and peptide) quantitation can be imported (to a common format): MaxQuant (Tyanova et al 2016 <doi:10.1038/nprot.2016.136>), Dia-NN (Demichev et al 2020 <doi:10.1038/s41592-019-0638-x>), Fragpipe (da Veiga et al 2020 <doi:10.1038/s41592-020-0912-y>), ionbot (Degroeve et al 2021 <doi:10.1101/2021.07.02.450686>), MassChroq (Valot et al 2011 <doi:10.1002/pmic.201100120>), OpenMS (Strauss et al 2021 <doi:10.1038/nmeth.3959>), ProteomeDiscoverer (Orsburn 2021 <doi:10.3390/proteomes9010015>), Proline (Bouyssie et al 2020 <doi:10.1093/bioinformatics/btaa118>), AlphaPept (preprint Strauss et al <doi:10.1101/2021.07.23.453379>) and Wombat-P (Bouyssie et al 2023 <doi:10.1021/acs.jproteome.3c00636>). Meta-data provided by initial analysis software and/or in sdrf format can be integrated to the analysis. Quantitative proteomics measurements frequently contain multiple NA values, due to physical absence of given peptides in some samples, limitations in sensitivity or other reasons. Help is provided to inspect the data graphically to investigate the nature of NA-values via their respective replicate measurements and to help/confirm the choice of NA-replacement algorithms. Meta-data in sdrf-format (Perez-Riverol et al 2020 <doi:10.1021/acs.jproteome.0c00376>) or similar tabular formats can be imported and included. Missing values can be inspected and imputed based on the concept of NA-neighbours or other methods. Dedicated filtering and statistical testing using the framework of package 'limma' <doi:10.18129/B9.bioc.limma> can be run, enhanced by multiple rounds of NA-replacements to provide robustness towards rare stochastic events. Multi-species samples, as frequently used in benchmark-tests (e.g. Navarro et al 2016 <doi:10.1038/nbt.3685>, Ramus et al 2016 <doi:10.1016/j.jprot.2015.11.011>), can be run with special options considering such sub-groups during normalization and testing. Subsequently, ROC curves (Hand and Till 2001 <doi:10.1023/A:1010920819831>) can be constructed to compare multiple analysis approaches. As a detailed example, the data-set from Ramus et al 2016 <doi:10.1016/j.jprot.2015.11.011>, quantified by MaxQuant, ProteomeDiscoverer, and Proline, is provided with a detailed analysis of heterologous spike-in proteins.

Maintained by Wolfgang Raffelsberger. Last updated 4 months ago.

15.4 match 3.67 score 17 scripts 1 dependent

cran

epiDisplay:Epidemiological Data Display Package

Package for data exploration and result presentation. Full 'epicalc' package with data management functions is available at '<https://medipe.psu.ac.th/epicalc/>'.

Maintained by Virasakdi Chongsuvivatwong. Last updated 3 years ago.

9.8 match 1 star 5.44 score 758 scripts 2 dependents

tagteam

Publish:Format Output of Various Routines in a Suitable Way for Reports and Publication

A bunch of convenience functions that transform the results of some basic statistical analyses into table format nearly ready for publication. This includes descriptive tables, tables of logistic regression and Cox regression results as well as forest plots.

Maintained by Thomas A. Gerds. Last updated 11 days ago.

5.1 match 15 stars 10.11 score 274 scripts 36 dependents

cdcgov

surveytable:Formatted Survey Estimates

Short and understandable commands that generate tabulated, formatted, and rounded survey estimates. Mostly a wrapper for the 'survey' package (Lumley (2004) <doi:10.18637/jss.v009.i08> <https://CRAN.R-project.org/package=survey>) that identifies low-precision estimates using the National Center for Health Statistics (NCHS) presentation standards (Parker et al. (2017) <https://www.cdc.gov/nchs/data/series/sr_02/sr02_175.pdf>, Parker et al. (2023) <doi:10.15620/cdc:124368>).

Maintained by Alex Strashny. Last updated 4 days ago.

estimates formatted-output pretty-print survey tables

7.3 match 6 stars 6.71 score 19 scripts

sciviews

tabularise:Create Tabular Outputs from R

Create rich-formatted tabular outputs from R that can be incorporated into R Markdown/Quarto documents with correct output at least in HTML, LaTeX/PDF, Word and PowerPoint formats for various R objects.

Maintained by Philippe Grosjean. Last updated 9 months ago.

sciviews tabulation

10.0 match 4.56 score 12 scripts 4 dependents

gianmarcoalberti

CAinterprTools:Graphical Aid in Correspondence Analysis Interpretation and Significance Testings

Plots a range of information related to the interpretation of Correspondence Analysis results. It provides the facility to plot the contribution of row and column categories to the principal dimensions, the quality of points display on selected dimensions, the correlation of row and column categories to selected dimensions, etc. It also helps assess which dimensions are important for interpreting the data structure by means of different statistics and tests. The package also offers the facility to plot the permuted distribution of the table total inertia as well as of the inertia accounted for by pairs of selected dimensions. Different facilities are also provided that aim to produce interpretation-oriented scatterplots. Reference: Alberti 2015 <doi:10.1016/j.softx.2015.07.001>.

Maintained by Gianmarco Alberti. Last updated 5 years ago.

16.8 match 2.52 score 33 scripts

rspatial

terra:Spatial Data Analysis

Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).

Maintained by Robert J. Hijmans. Last updated 1 day ago.

geospatial raster spatial vector onetbb proj gdal geos cpp

2.3 match 559 stars 17.64 score 17k scripts 851 dependents

gdemin

expss:Tables, Labels and Some Useful Functions from Spreadsheets and 'SPSS' Statistics

The package computes and displays tables with support for 'SPSS'-style labels, multiple and nested banners, weights, multiple-response variables and significance testing. There are facilities for nice output of tables in 'knitr', 'Shiny', '*.xlsx' files, R and 'Jupyter' notebooks. Methods for labelled variables add value labels support to base R functions and to some functions from other packages. Additionally, the package brings popular data transformation functions from 'SPSS' Statistics and 'Excel': 'RECODE', 'COUNT', 'COUNTIF', 'VLOOKUP', etc. These functions are very useful for data processing in marketing research surveys. The package is intended to help people move data processing from 'Excel' and 'SPSS' to R.

Maintained by Gregory Demin. Last updated 11 months ago.

excel labels labels-support msexcel pivot-tables recode spss spss-statistics tables variable-labels vlookup

3.5 match 84 stars 11.00 score 1.8k scripts 4 dependents

rspatial

raster:Geographic Data Analysis and Modeling

Reading, writing, manipulating, analyzing and modeling of spatial data. This package has been superseded by the "terra" package <https://CRAN.R-project.org/package=terra>.

Maintained by Robert J. Hijmans. Last updated 2 months ago.

cpp

2.3 match 164 stars 17.05 score 58k scripts 555 dependents

marketbridge

zctaCrosswalk:Crosswalk Between 2020 Census ZIP Code Tabulation Areas (ZCTAs), States and Counties

Contains the US Census Bureau's 2020 ZCTA to County Relationship File, as well as convenience functions to translate between States, Counties and ZIP Code Tabulation Areas (ZCTAs).

Maintained by Ari Lamstein. Last updated 2 years ago.

6.7 match 5 stars 5.39 score 11 scripts 1 dependent

joon-e

tidycomm:Data Modification and Analysis for Communication Research

Provides convenience functions for common data modification and analysis tasks in communication research. This includes functions for univariate and bivariate data analysis, index generation and reliability computation, and intercoder reliability tests. All functions follow the style and syntax of the tidyverse, and are constructed to perform their computations on multiple variables at once. Functions for univariate and bivariate data analysis comprise summary statistics for continuous and categorical variables, as well as several tests of bivariate association including effect sizes. Functions for data modification comprise index generation and automated reliability analysis of index variables. Functions for intercoder reliability comprise tests of several intercoder reliability estimates, including simple and mean pairwise percent agreement, Krippendorff's Alpha (Krippendorff 2004, ISBN: 9780761915454), and various Kappa coefficients (Brennan & Prediger 1981 <doi: 10.1177/001316448104100307>; Cohen 1960 <doi: 10.1177/001316446002000104>; Fleiss 1971 <doi: 10.1037/h0031619>).

Maintained by Julian Unkel. Last updated 10 months ago.

5.4 match 15 stars 6.59 score 52 scripts

nicolas-robette

descriptio:Descriptive Statistical Analysis

Description of statistical associations between variables: measures of local and global association between variables (phi, Cramér V, correlations, eta-squared, Goodman and Kruskal tau, permutation tests, etc.), multiple graphical representations of the associations between variables (using 'ggplot2') and weighted statistics.

Maintained by Nicolas Robette. Last updated 6 months ago.

7.0 match 4 stars 5.00 score 11 scripts 3 dependents

lindbrook

packageRank:Computation and Visualization of Package Download Counts and Percentile Ranks

Compute and visualize package download counts and percentile ranks from Posit/RStudio's CRAN mirror.

Maintained by lindbrook. Last updated 4 days ago.

bioconductor-packages

5.6 match 28 stars 6.13 score 27 scripts

bioc

VariantAnnotation:Annotation of Genetic Variants

Annotate variants, compute amino acid coding changes, predict coding outcomes.

Maintained by Bioconductor Package Maintainer. Last updated 2 months ago.

dataimport sequencing snp annotation genetics variantannotation curl bzip2 xz-utils zlib

3.0 match 11.39 score 1.9k scripts 152 dependents

sparklyr

sparklyr:R Interface to Apache Spark

R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.

Maintained by Edgar Ruiz. Last updated 10 days ago.

apache-spark distributed dplyr ide livy machine-learning remote-clusters spark sparklyr

2.3 match 959 stars 15.16 score 4.0k scripts 21 dependents

matthieugomez

statar:Tools Inspired by 'Stata' to Manipulate Tabular Data

A set of tools inspired by 'Stata' to explore data.frames ('summarize', 'tabulate', 'xtile', 'pctile', 'binscatter', elapsed quarters/month, lead/lag).

Maintained by Matthieu Gomez. Last updated 2 years ago.

4.0 match 54 stars 8.40 score 226 scripts 1 dependent

ipums

ipumsr:An R Interface for Downloading, Reading, and Handling IPUMS Data

An easy way to work with census, survey, and geographic data provided by IPUMS in R. Generate and download data through the IPUMS API and load IPUMS files into R with their associated metadata to make analysis easier. IPUMS data describing 1.4 billion individuals drawn from over 750 censuses and surveys is available free of charge from the IPUMS website <https://www.ipums.org>.

Maintained by Derek Burk. Last updated 19 days ago.

3.0 match 28 stars 11.07 score 720 scripts 2 dependents

rsquaredacademy

descriptr:Generate Descriptive Statistics

Generate descriptive statistics such as measures of location, dispersion, frequency tables, cross tables, group summaries and multiple one/two way tables.

Maintained by Aravind Hebbali. Last updated 4 months ago.

descriptive-statistics eda summary-statistics

4.5 match 34 stars 7.37 score 221 scripts

henrikbengtsson

matrixStats:Functions that Apply to Rows and Columns of Matrices (and to Vectors)

High-performing functions operating on rows and columns of matrices, e.g. col / rowMedians(), col / rowRanks(), and col / rowSds(). Functions are optimized per data type and for subsetted calculations such that both memory usage and processing time are minimized. There are also optimized vector-based methods, e.g. binMeans(), madDiff() and weightedMedian().

Maintained by Henrik Bengtsson. Last updated 2 months ago.

matrix performance vector

1.8 match 208 stars 18.09 score 20k scripts 2.3k dependents

jalvesaq

descr:Descriptive Statistics

Weighted frequency and contingency tables of categorical variables and of the comparison of the mean value of a numerical variable by the levels of a factor, and methods to produce xtable objects of the tables and to plot them. There are also functions to facilitate the character encoding conversion of objects, to quickly convert fixed width files into csv ones, and to export a data.frame to a text file with the necessary R and SPSS codes to reread the data.

Maintained by Jakson Aquino. Last updated 1 year ago.

3.7 match 18 stars 8.80 score 692 scripts 4 dependents

mayoverse

arsenal:An Arsenal of 'R' Functions for Large-Scale Statistical Summaries

An Arsenal of 'R' functions for large-scale statistical summaries, which are streamlined to work within the latest reporting tools in 'R' and 'RStudio' and which use formulas and versatile summary statistics for summary tables and models. The primary functions include tableby(), a Table-1-like summary of multiple variable types 'by' the levels of one or more categorical variables; paired(), a Table-1-like summary of multiple variable types paired across two time points; modelsum(), which performs simple model fits on one or more endpoints for many variables (univariate or adjusted for covariates); freqlist(), a powerful frequency table across many categorical variables; comparedf(), a function for comparing data.frames; and write2(), a function to output tables to a document.

Maintained by Ethan Heinzen. Last updated 7 months ago.

baseline-characteristics descriptive-statistics modeling paired-comparisons reporting statistics tableone

2.4 match 225 stars 13.45 score 1.2k scripts 16 dependents

cjvanlissa

tidySEM:Tidy Structural Equation Modeling

A tidy workflow for generating, estimating, reporting, and plotting structural equation models using 'lavaan', 'OpenMx', or 'Mplus'. Throughout this workflow, elements of syntax, results, and graphs are represented as 'tidy' data, making them easy to customize. Includes functionality to estimate latent class analyses, and to plot 'dagitty' and 'igraph' objects.

Maintained by Caspar J. van Lissa. Last updated 7 days ago.

3.0 match 58 stars 10.69 score 330 scripts 1 dependent

insightsengineering

tern.mmrm:Tables and Graphs for Mixed Models for Repeated Measures (MMRM)

Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see for example Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E>. This package provides an interface for fitting MMRM within the 'tern' <https://cran.r-project.org/package=tern> framework by Zhu et al. (2023) and for tabulating results easily using 'rtables' <https://cran.r-project.org/package=rtables> by Becker et al. (2023). It builds on 'mmrm' <https://cran.r-project.org/package=mmrm> by Sabanés Bové et al. (2023) for the actual MMRM computations.

Maintained by Joe Zhu. Last updated 6 months ago.

graphs listings statistical-engineering tables

4.4 match 6 stars 7.26 score 8 scripts 1 dependent

goranbrostrom

eha:Event History Analysis

Parametric proportional hazards fitting with left truncation and right censoring for common families of distributions, piecewise constant hazards, and discrete models. Parametric accelerated failure time models for left truncated and right censored data. Proportional hazards models for tabular and register data. Sampling of risk sets in Cox regression, selections in the Lexis diagram, bootstrapping. Broström (2022) <doi:10.1201/9780429503764>.

Maintained by Göran Broström. Last updated 9 months ago.

fortran openblas

3.3 match 7 stars 9.76 score 308 scripts 10 dependents

insightsengineering

tern.gee:Tables and Graphs for Generalized Estimating Equations (GEE) Model Fits

Generalized estimating equations (GEE) are a popular choice for analyzing longitudinal binary outcomes. This package provides an interface for fitting GEE, currently for logistic regression, within the 'tern' <https://cran.r-project.org/package=tern> framework (Zhu, Sabanés Bové et al., 2023) and for tabulating results easily using 'rtables' <https://cran.r-project.org/package=rtables> (Becker, Waddell et al., 2023). It builds on 'geepack' <doi:10.18637/jss.v015.i02> (Højsgaard, Halekoh and Yan, 2006) for the actual GEE model fitting.

Maintained by Joe Zhu. Last updated 7 months ago.

4.5 match 8 stars 7.00 score 3 scripts 1 dependent

pfizer-opensource

zippeR:Working with United States ZIP Code and ZIP Code Tabulation Area Data

Provides a set of functions for working with American postal codes, which are known as ZIP Codes. These include accessing ZIP Code to ZIP Code Tabulation Area (ZCTA) crosswalks, retrieving demographic data for ZCTAs, and tabulating demographic data for three-digit ZCTAs.

Maintained by Christopher Prener. Last updated 19 days ago.

4.8 match 11 stars 6.52 score 5 scripts 1 dependent

benmbutler

powdR:Full Pattern Summation of X-Ray Powder Diffraction Data

Full pattern summation of X-ray powder diffraction data as described in Chipera and Bish (2002) <doi:10.1107/S0021889802017405> and Butler and Hillier (2021) <doi:10.1016/j.cageo.2020.104662>. Derives quantitative estimates of crystalline and amorphous phase concentrations in complex mixtures.

Maintained by Benjamin Butler. Last updated 3 years ago.

5.6 match 12 stars 5.56 score 30 scripts

sebkrantz

collapse:Advanced and Fast Data Transformation

A C/C++ based package for advanced data transformation and statistical computing in R that is extremely fast, class-agnostic, robust and programmer friendly. Core functionality includes a rich set of S3 generic grouped and weighted statistical functions for vectors, matrices and data frames, which provide efficient low-level vectorizations, OpenMP multithreading, and skip missing values by default. These are integrated with fast grouping and ordering algorithms (also callable from C), and efficient data manipulation functions. The package also provides a flexible and rigorous approach to time series and panel data in R. It further includes fast functions for common statistical procedures, detailed (grouped, weighted) summary statistics, powerful tools to work with nested data, fast data object conversions, functions for memory efficient R programming, and helpers to effectively deal with variable labels, attributes, and missing data. It is well integrated with base R classes, 'dplyr'/'tibble', 'data.table', 'sf', 'units', 'plm' (panel-series and data frames), and 'xts'/'zoo'.

Maintained by Sebastian Krantz. Last updated 6 days ago.

data-aggregation data-analysis data-manipulation data-processing data-science data-transformation econometrics high-performance panel-data scientific-computing statistics time-series weighted weights cpp openmp

1.9 match 672 stars 16.63 score 708 scripts 97 dependents

aphalo

photobiologyPlants:Plant Photobiology Related Functions and Data

Provides functions for quantifying visible (VIS) and ultraviolet (UV) radiation in relation to the photoreceptors Phytochromes, Cryptochromes, and UVR8 which are present in plants. It also includes data sets on the optical properties of plants. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 2 months ago.

5.6 match 5.52 score 55 scripts

plambertuliege

degross:Density Estimation from GROuped Summary Statistics

Estimation of a density from grouped (tabulated) summary statistics evaluated in each of the big bins (or classes) partitioning the support of the variable. These statistics include class frequencies and central moments of order one up to four. The log-density is modelled using a linear combination of penalised B-splines. The multinomial log-likelihood involving the frequencies adds up to a roughness penalty based on the differences in the coefficients of neighbouring B-splines and the log of a root-n approximation of the sampling density of the observed vector of central moments in each class. The so-obtained penalized log-likelihood is maximized using the EM algorithm to get an estimate of the spline parameters and, consequently, of the variable density and related quantities such as quantiles, see Lambert, P. (2021) <arXiv:2107.03883> for details.

Maintained by Philippe Lambert. Last updated 2 years ago.

10.2 match 2 stars 3.00 score 4 scripts

uupharmacometrics

xpose4:Diagnostics for Nonlinear Mixed-Effect Models

A model building aid for nonlinear mixed-effects (population) model analysis using NONMEM, facilitating data set checkout, exploration and visualization, model diagnostics, candidate covariate identification and model comparison. The methods are described in Keizer et al. (2013) <doi:10.1038/psp.2013.24>, and Jonsson et al. (1999) <doi:10.1016/s0169-2607(98)00067-4>.

Maintained by Andrew C. Hooker. Last updated 1 year ago.

diagnostics nonmem pharmacometrics population-model xpose

4.1 match 35 stars 7.30 score 315 scripts

michbur

biogram:N-Gram Analysis of Biological Sequences

Tools for extraction and analysis of various n-grams (k-mers) derived from biological sequences (proteins or nucleic acids). Contains QuiPT (quick permutation test) for fast feature-filtering of the n-gram data.

Maintained by Michal Burdukiewicz. Last updated 7 months ago.

biological-sequences ngram-analysis

4.0 match 10 stars 7.50 score 87 scripts 3 dependents

davidgohel

officer:Manipulation of Microsoft Word and PowerPoint Documents

Access and manipulate 'Microsoft Word', 'RTF' and 'Microsoft PowerPoint' documents from R. The package focuses on tabular and graphical reporting from R; it also provides two functions that let users get document content into data objects. A set of functions lets users add and remove images, tables and paragraphs of text in new or existing documents. The package does not require any installation of Microsoft products to be able to write Microsoft files.

Maintained by David Gohel. Last updated 1 month ago.

ms-office-documents powerpoint word

1.9 match 630 stars 15.79 score 4.1k scripts 137 dependents

statist7

sitar:Super Imposition by Translation and Rotation Growth Curve Analysis

Functions for fitting and plotting SITAR (Super Imposition by Translation And Rotation) growth curve models. SITAR is a shape-invariant model with a regression B-spline mean curve and subject-specific random effects on both the measurement and age scales. The model was first described by Lindstrom (1995) <doi:10.1002/sim.4780141807> and developed as the SITAR method by Cole et al (2010) <doi:10.1093/ije/dyq115>.

Maintained by Tim Cole. Last updated 2 months ago.

3.4 match 13 stars 8.69 score 58 scripts 3 dependents

strohne

volker:High-Level Functions for Tabulating, Charting and Reporting Survey Data

Craft polished tables and plots in Markdown reports. Simply choose whether to treat your data as counts or metrics, and the package will automatically generate well-designed default tables and plots for you. Boiled down to the basics, with labeling features and simple interactive reports. All functions are 'tidyverse' compatible.

Maintained by Jakob Jünger. Last updated 3 days ago.

4.1 match 5 stars 7.16 score 125 scripts

ropensci

targets:Dynamic Function-Oriented 'Make'-Like Declarative Pipelines

Pipeline tools coordinate the pieces of computationally demanding analysis projects. The 'targets' package is a 'Make'-like pipeline tool for statistics and data science in R. The package skips costly runtime for tasks that are already up to date, orchestrates the necessary computation with implicit parallel computing, and abstracts files as R objects. If all the current output matches the current upstream code and data, then the whole pipeline is up to date, and the results are more trustworthy than otherwise. The methodology in this package borrows from GNU 'Make' (2015, ISBN:978-9881443519) and 'drake' (2018, <doi:10.21105/joss.00550>).

Maintained by William Michael Landau. Last updated 2 days ago.

data-science high-performance-computing make peer-reviewed pipeline r-targetopia reproducibility reproducible-research targets workflow

1.9 match 973 stars 15.20 score 4.6k scripts 22 dependents

bioc

S4Vectors:Foundation of vector-like and list-like containers in Bioconductor

The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantics of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).

Maintained by Hervé Pagès. Last updated 1 month ago.

infrastructure datarepresentation bioconductor-package core-package

1.8 match 18 stars 16.05 score 1.0k scripts 1.9k dependents

r-lib

bit64:A S3 Class for Vectors of 64bit Integers

Package 'bit64' provides serializable S3 atomic 64bit (signed) integers. These are useful for handling database keys and exact counting in +-2^63. WARNING: do not use them as replacement for 32bit integers, integer64 are not supported for subscripting by R-core and they have different semantics when combined with double, e.g. integer64 + double => integer64. Class integer64 can be used in vectors, matrices, arrays and data.frames. Methods are available for coercion from and to logicals, integers, doubles, characters and factors as well as many elementwise and summary functions. Many fast algorithmic operations such as 'match' and 'order' support interactive data exploration and manipulation and optionally leverage caching.

Maintained by Michael Chirico. Last updated 4 days ago.

1.8 match 35 stars 14.91 score 1.5k scripts 3.2k dependents

easystats

see:Model Visualisation Toolbox for 'easystats' and 'ggplot2'

Provides plotting utilities supporting packages in the 'easystats' ecosystem (<https://github.com/easystats/easystats>) and some extra themes, geoms, and scales for 'ggplot2'. Color scales are based on <https://materialui.co/>. References: Lüdecke et al. (2021) <doi:10.21105/joss.03393>.

Maintained by Indrajeet Patil. Last updated 5 days ago.

data-visualization easystats ggplot2 hacktoberfest plotting see statistics visualisation visualization

2.0 match 902 stars 13.22 score 2.0k scripts 3 dependents

bioc

SIM:Integrated Analysis on two human genomic datasets

Finds associations between two human genomic datasets.

Maintained by Renee X. de Menezes. Last updated 5 months ago.

microarray visualization

6.0 match 4.30 score 3 scripts

pauljohn32

rockchalk:Regression Estimation and Presentation

A collection of functions for interpretation and presentation of regression analysis. These functions are used to produce the statistics lectures in <https://pj.freefaculty.org/guides/>. Includes regression diagnostics, regression tables, and plots of interactions and "moderator" variables. The emphasis is on "mean-centered" and "residual-centered" predictors. The vignette 'rockchalk' offers a fairly comprehensive overview. The vignette 'Rstyle' has advice about coding in R. The package title 'rockchalk' refers to our school motto, 'Rock Chalk Jayhawk, Go K.U.'.

Maintained by Paul E. Johnson. Last updated 3 years ago.

3.6 match 7.13 score 584 scripts 18 dependents

pharmaverse

sdtmchecks:Data Quality Checks for Study Data Tabulation Model (SDTM) Datasets

A series of checks to identify common issues in Study Data Tabulation Model (SDTM) datasets. These checks are intended to be generalizable, actionable, and meaningful for analysis.

Maintained by Will Harris. Last updated 3 months ago.

3.3 match 21 stars 7.66 score 15 scripts

bioc

spiky:Spike-in calibration for cell-free MeDIP

spiky implements methods and model generation for cfMeDIP (cell-free methylated DNA immunoprecipitation) with spike-in controls. CfMeDIP is an enrichment protocol which avoids destructive conversion of scarce template, making it ideal as a "liquid biopsy," but creating certain challenges in comparing results across specimens, subjects, and experiments. The use of synthetic spike-in standard oligos allows diagnostics performed with cfMeDIP to quantitatively compare samples across subjects, experiments, and time points in both relative and absolute terms.

Maintained by Tim Triche. Last updated 5 months ago.

differentialmethylation dnamethylation normalization preprocessing qualitycontrol sequencing

5.2 match 2 stars 4.90 score 3 scripts

vincentarelbundock

modelsummary:Summary Tables and Plots for Statistical Models and Data: Beautiful, Customizable, and Publication-Ready

Create beautiful and customizable tables to summarize several statistical models side-by-side. Draw coefficient plots, multi-level cross-tabs, dataset summaries, balance tables (a.k.a. "Table 1s"), and correlation matrices. This package supports dozens of statistical models, and it can produce tables in HTML, LaTeX, Word, Markdown, PDF, PowerPoint, Excel, RTF, JPG, or PNG. Tables can easily be embedded in 'Rmarkdown' or 'knitr' dynamic documents. Details can be found in Arel-Bundock (2022) <doi:10.18637/jss.v103.i01>.

Maintained by Vincent Arel-Bundock. Last updated 15 days ago.

1.9 match 926 stars 13.41 score 6.2k scripts 2 dependents

bioc

ggbio:Visualization tools for genomic data

The ggbio package extends and specializes the grammar of graphics for biological data. The graphics are designed to answer common scientific questions, in particular those often asked of high throughput genomics data. All core Bioconductor data structures are supported, where appropriate. The package supports detailed views of particular genomic regions, as well as genome-wide overviews. Supported overviews include ideograms and grand linear views. High-level plots include sequence fragment length, edge-linked interval to data view, mismatch pileup, and several splicing summaries.

Maintained by Michael Lawrence. Last updated 5 months ago.

infrastructure visualization

2.0 match 111 stars 12.26 score 734 scripts 17 dependents

bioc

goSorensen:Statistical inference based on the Sorensen-Dice dissimilarity and the Gene Ontology (GO)

This package implements inferential methods to compare gene lists in terms of their biological meaning as expressed in the GO. The compared gene lists are characterized by cross-tabulation frequency tables of enriched GO items. Dissimilarity between gene lists is evaluated using the Sorensen-Dice index. The fundamental guiding principle is that two gene lists are taken as similar if they share a great proportion of common enriched GO items.

Maintained by Pablo Flores. Last updated 5 months ago.

annotation go genesetenrichment software microarray pathways geneexpression multiplecomparison graphandnetwork reactome clustering kegg

5.4 match 4.56 score 12 scripts

statistikat

simPop:Simulation of Complex Synthetic Data Information

Tools and methods to simulate populations for surveys based on auxiliary data. The tools include model-based methods, calibration and combinatorial optimization algorithms, see Templ, Kowarik and Meindl (2017) <doi:10.18637/jss.v079.i10> and Templ (2017) <doi:10.1007/978-3-319-50272-4>. The package was developed with support of the International Household Survey Network, DFID Trust Fund TF011722 and funds from the World Bank.

Maintained by Matthias Templ. Last updated 4 months ago.

cpp

3.8 match 31 stars 6.51 score 104 scripts

sticsrpacks

SticsRFiles:Read and Modify 'STICS' Input/Output Files

Manipulating input and output files of the 'STICS' crop model. Files are either 'JavaSTICS' XML files or text files used by the model 'fortran' executable. Most basic functionalities are reading or writing parameter names and values in both XML or text input files, and getting data from output files. Advanced functionalities include XML files generation from XML templates and/or spreadsheets, or text files generation from XML files by using 'xslt' transformation.

Maintained by Patrice Lecharpentier. Last updated 18 days ago.

2.9 match 4 stars 8.27 score 124 scripts

mhahsler

arules:Mining Association Rules and Frequent Itemsets

Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.

Maintained by Michael Hahsler. Last updated 1 month ago.

arules association-rules frequent-itemsets

1.7 match 194 stars 13.99 score 3.3k scripts 28 dependents

juba

questionr:Functions to Make Surveys Processing Easier

Set of functions to make the processing and analysis of surveys easier: interactive shiny apps and addins for data recoding, contingency tables, dataset metadata handling, and several convenience functions.

Maintained by Julien Barnier. Last updated 1 day ago.

1.9 match 83 stars 12.62 score 1.1k scripts 19 dependents

rfastofficial

Rfast:A Collection of Efficient and Extremely Fast R Functions

A collection of fast (utility) functions for data analysis. Column and row wise means, medians, variances, minimums, maximums, many t, F and G-square tests, many regressions (normal, logistic, Poisson), are some of the many fast functions. References: a) Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>. b) Tsagris M. and Papadakis M. (2018). Forward regression in R: from the extreme slow to the extreme fast. Journal of Data Science, 16(4): 771--780. <doi:10.6339/JDS.201810_16(4).00006>. c) Chatzipantsiou C., Dimitriadis M., Papadakis M. and Tsagris M. (2020). Extremely Efficient Permutation and Bootstrap Hypothesis Tests Using R. Journal of Modern Applied Statistical Methods, 18(2), eP2898. <doi:10.48550/arXiv.1806.10947>. d) Tsagris M., Papadakis M., Alenazi A. and Alzeley O. (2024). Computationally Efficient Outlier Detection for High-Dimensional Data Using the MDP Algorithm. Computation, 12(9): 185. <doi:10.3390/computation12090185>. e) Tsagris M. and Papadakis M. (2025). Fast and light-weight energy statistics using the R package Rfast. <doi:10.48550/arXiv.2501.02849>.

Maintained by Manos Papadakis. Last updated 17 days ago.

openblas cpp openmp

1.9 match 147 stars 12.54 score 1.2k scripts 166 dependents

randrescastaneda

joyn:Tool for Diagnosis of Tables Joins and Complementary Join Features

Tool for diagnosing table joins. It combines the speed of `collapse` and `data.table`, the flexibility of `dplyr`, and the diagnosis and features of the `merge` command in `Stata`.

Maintained by R.Andres Castaneda. Last updated 3 months ago.

join merge

3.3 match 9 stars 7.00 score 31 scripts

massimoaria

bibliometrix:Comprehensive Science Mapping Analysis

Tool for quantitative research in scientometrics and bibliometrics. It implements the comprehensive workflow for science mapping analysis proposed in Aria M. and Cuccurullo C. (2017) <doi:10.1016/j.joi.2017.08.007>. 'bibliometrix' provides various routines for importing bibliographic data from 'SCOPUS', 'Clarivate Analytics Web of Science' (<https://www.webofknowledge.com/>), 'Digital Science Dimensions' (<https://www.dimensions.ai/>), 'OpenAlex' (<https://openalex.org/>), 'Cochrane Library' (<https://www.cochranelibrary.com/>), 'Lens' (<https://lens.org>), and 'PubMed' (<https://pubmed.ncbi.nlm.nih.gov/>) databases, performing bibliometric analysis and building networks for co-citation, coupling, scientific collaboration and co-word analysis.

Maintained by Massimo Aria. Last updated 8 days ago.

bibliometric-analysis bibliometrics citation citation-network citations co-authors co-occurence co-word-analysis correspondence-analysis coupling isi-web journal manuscript quantitative-analysis scholars science science-mapping scientific scientometrics scopus

1.8 match 545 stars 12.54 score 518 scripts 2 dependents

walkerke

tigris:Load Census TIGER/Line Shapefiles

Download TIGER/Line shapefiles from the United States Census Bureau (<https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html>) and load into R as 'sf' objects.

Maintained by Kyle Walker. Last updated 4 months ago.

1.7 match 331 stars 12.87 score 5.3k scripts 16 dependents

opengeos

whitebox:'WhiteboxTools' R Frontend

An R frontend for the 'WhiteboxTools' library, which is an advanced geospatial data analysis platform developed by Prof. John Lindsay at the University of Guelph's Geomorphometry and Hydrogeomatics Research Group. 'WhiteboxTools' can be used to perform common geographical information systems (GIS) analysis operations, such as cost-distance analysis, distance buffering, and raster reclassification. Remote sensing and image processing tasks include image enhancement (e.g. panchromatic sharpening, contrast adjustments), image mosaicing, numerous filtering operations, simple classification (k-means), and common image transformations. 'WhiteboxTools' also contains advanced tooling for spatial hydrological analysis (e.g. flow-accumulation, watershed delineation, stream network analysis, sink removal), terrain analysis (e.g. common terrain indices such as slope, curvatures, wetness index, hillshading; hypsometric analysis; multi-scale topographic position analysis), and LiDAR data processing. Suggested citation: Lindsay (2016) <doi:10.1016/j.cageo.2016.07.003>.

Maintained by Andrew Brown. Last updated 5 months ago.

geomorphometry geoprocessing geospatial gis hydrology remote-sensing rstudio

2.3 match 173 stars 9.65 score 203 scripts 2 dependents

dmenne

breathtestcore:Core Functions to Read and Fit 13c Time Series from Breath Tests

Reads several formats of 13C data (IRIS/Wagner, BreathID) and CSV. Creates artificial sample data for testing. Fits Maes/Ghoos, Bluck-Coward self-correcting formula using 'nls', 'nlme'. Methods to fit breath test curves with Bayesian Stan methods are refactored to package 'breathteststan'. For a Shiny GUI, see package 'dmenne/breathtestshiny' on github.

Maintained by Dieter Menne. Last updated 2 months ago.

13c breath breath-test gastroenterology medical stan

3.5 match 2 stars 6.19 score 64 scripts 1 dependents

cran

tseriesTARMA:Analysis of Nonlinear Time Series Through Threshold Autoregressive Moving Average Models (TARMA) Models

Routines for nonlinear time series analysis based on Threshold Autoregressive Moving Average (TARMA) models. It provides functions and methods for: TARMA model fitting and forecasting, including robust estimators, see Goracci et al. JBES (2025) <doi:10.1080/07350015.2024.2412011>; tests for threshold effects, see Giannerini et al. JoE (2024) <doi:10.1016/j.jeconom.2023.01.004>, Goracci et al. Statistica Sinica (2023) <doi:10.5705/ss.202021.0120>, Angelini et al. (2024) <doi:10.48550/arXiv.2308.00444>; unit-root tests based on TARMA models, see Chan et al. Statistica Sinica (2024) <doi:10.5705/ss.202022.0125>.

Maintained by Simone Giannerini. Last updated 5 months ago.

fortran openblas

7.1 match 3.06 score

urswilke

datadaptor:Modify Labelled Data Sets With Excel Files

An R package to modify labelled data sets with commands in Excel files. The commands in this package allow users to create new variables and to modify the labels of the variables, as well as the variables themselves. The goal is to provide an easy and concise syntax, and to allow for fast systematic data entry using Excel for advanced users. The commands work on the variables inside the data.frame environment (as they do inside dplyr verbs, for example), thus providing an approach that might ease the use for people without in-depth programming experience.

Maintained by Urs Wilke. Last updated 27 days ago.

4.2 match 5.13 score 1 dependents

dbosak01

procs:Recreates Some 'SAS®' Procedures in 'R'

Contains functions to simulate the most commonly used 'SAS®' procedures. Specifically, the package aims to simulate the functionality of 'proc freq', 'proc means', 'proc ttest', 'proc reg', 'proc transpose', 'proc sort', and 'proc print'. The simulation will include recreating all statistics with the highest fidelity possible.

Maintained by David Bosak. Last updated 10 months ago.

2.8 match 6 stars 7.57 score 37 scripts 2 dependents

ncss-tech

aqp:Algorithms for Quantitative Pedology

The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information, freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offers a convenient platform for bridging the gap between pedometric theory and practice.

Maintained by Dylan Beaudette. Last updated 29 days ago.

digital-soil-mapping ncss-tech nrcs pedology pedometrics soil soil-survey usda

1.8 match 55 stars 11.77 score 1.2k scripts 2 dependents

jinkim3

kim:A Toolkit for Behavioral Scientists

A collection of functions for analyzing data typically collected or used by behavioral scientists. Examples of the functions include a function that compares groups in a factorial experimental design, a function that conducts two-way analysis of variance (ANOVA), and a function that cleans a data set generated by Qualtrics surveys. Some of the functions will require installing additional package(s). Such packages and other references are cited within the section describing the relevant functions. Many functions in this package rely heavily on these two popular R packages: Dowle et al. (2021) <https://CRAN.R-project.org/package=data.table>. Wickham et al. (2021) <https://CRAN.R-project.org/package=ggplot2>.

Maintained by Jin Kim. Last updated 19 days ago.

4.5 match 7 stars 4.66 score 3 scripts

ssnn-airr

alakazam:Immunoglobulin Clonal Lineage and Diversity Analysis

Provides methods for high-throughput adaptive immune receptor repertoire sequencing (AIRR-Seq; Rep-Seq) analysis. In particular, immunoglobulin (Ig) sequence lineage reconstruction, lineage topology analysis, diversity profiling, amino acid property analysis and gene usage. Citations: Gupta and Vander Heiden, et al (2017) <doi:10.1093/bioinformatics/btv359>, Stern, Yaari and Vander Heiden, et al (2014) <doi:10.1126/scitranslmed.3008879>.

Maintained by Susanna Marquez. Last updated 3 months ago.

software annotationdata cpp

2.3 match 9.09 score 424 scripts 7 dependents

bioc

sparseMatrixStats:Summary Statistics for Rows and Columns of Sparse Matrices

High performance functions for row and column operations on sparse matrices. For example: col / rowMeans2, col / rowMedians, col / rowVars etc. Currently, the optimizations are limited to data in the column sparse format. This package is inspired by the matrixStats package by Henrik Bengtsson.

Maintained by Constantin Ahlmann-Eltze. Last updated 5 months ago.

infrastructure software datarepresentation cpp

1.7 match 54 stars 11.98 score 174 scripts 126 dependents

bioc

DelayedMatrixStats:Functions that Apply to Rows and Columns of 'DelayedMatrix' Objects

A port of the 'matrixStats' API for use with DelayedMatrix objects from the 'DelayedArray' package. High-performing functions operating on rows and columns of DelayedMatrix objects, e.g. col / rowMedians(), col / rowRanks(), and col / rowSds(). Functions are optimized per data type and for subsetted calculations such that both memory usage and processing time are minimized.

Maintained by Peter Hickey. Last updated 2 months ago.

infrastructure datarepresentation software

1.7 match 16 stars 11.86 score 211 scripts 112 dependents

cran

MARVEL:Revealing Splicing Dynamics at Single-Cell Resolution

Alternative splicing represents an additional and underappreciated layer of complexity underlying gene expression profiles. Nevertheless, there remains hitherto a paucity of software to investigate splicing dynamics at single-cell resolution. 'MARVEL' enables splicing analysis of single-cell RNA-sequencing data generated from plate- and droplet-based library preparation methods.

Maintained by Sean Wen. Last updated 2 years ago.

7.5 match 2.71 score 51 scripts

bioc

MatrixGenerics:S4 Generic Summary Statistic Functions that Operate on Matrix-Like Objects

S4 generic functions modeled after the 'matrixStats' API for alternative matrix implementations. Packages with alternative matrix implementations can depend on this package and implement the generic functions that are defined here for a useful set of row and column summary statistics. Other package developers can import this package and handle different matrix implementations without worrying about incompatibilities.

Maintained by Peter Hickey. Last updated 2 months ago.

infrastructure software bioconductor-package core-package

1.7 match 12 stars 11.64 score 129 scripts 1.3k dependents

gianmarcoalberti

chisquare:Chi-Square and G-Square Test of Independence, Power and Residual Analysis, Measures of Categorical Association

Provides the facility to perform the chi-square and G-square test of independence, calculates the retrospective power of the traditional chi-square test, computes permutation and Monte Carlo p-values, and provides measures of association for tables of any size such as Phi, Phi corrected, odds ratio with 95 percent CI and p-value, Yule's Q and Y, adjusted contingency coefficient, Cramer's V, V corrected, V standardised, bias-corrected V, W, Cohen's w, Goodman-Kruskal's lambda, and tau. It also calculates standardised, moment-corrected standardised, and adjusted standardised residuals, and their significance, as well as the Quetelet Index, IJ association factor, and adjusted standardised counts. It also computes the chi-square-maximising version of the input table. Different outputs are returned in nicely formatted tables.

Maintained by Gianmarco Alberti. Last updated 5 months ago.

9.9 match 2.00 score 1 scripts

sdctools

sdcMicro:Statistical Disclosure Control Methods for Anonymization of Data and Risk Estimation

Data from statistical agencies and other institutions are mostly confidential. This package, introduced in Templ, Kowarik and Meindl (2017) <doi:10.18637/jss.v067.i04>, can be used for the generation of anonymized (micro)data, i.e. for the creation of public- and scientific-use files. The theoretical basis for the methods implemented can be found in Templ (2017) <doi:10.1007/978-3-319-50272-4>. Various risk estimation and anonymization methods are included. Note that the package includes a graphical user interface, published in Meindl and Templ (2019) <doi:10.3390/a12090191>, that allows users to apply various methods of this package.

Maintained by Matthias Templ. Last updated 27 days ago.

cpp

2.0 match 83 stars 9.89 score 258 scripts

kurthornik

clue:Cluster Ensembles

CLUster Ensembles.

Maintained by Kurt Hornik. Last updated 4 months ago.

2.0 match 2 stars 9.85 score 496 scripts 401 dependents

tntp

tntpr:Data Analysis Tools Customized for TNTP

An assortment of functions and templates customized to meet the needs of data analysts at the non-profit organization TNTP. Includes functions for branded colors and plots, credentials management, repository set-up, and other common analytic tasks.

Maintained by Dustin Pashouwer. Last updated 4 months ago.

3.4 match 7 stars 5.83 score 13 scripts

laresbernardo

lares:Analytics & Machine Learning Sidekick

Auxiliary package for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of family functions, like Machine Learning, Data Wrangling, Marketing Mix Modeling (Robyn), Exploratory, API, and Scrapper, it helps the analyst or data scientist to get quick and robust results, without the need of repetitive coding or advanced R programming skills.

Maintained by Bernardo Lares. Last updated 24 days ago.

analytics api automation automl data-science descriptive-statistics h2o machine-learning marketing mmm predictive-modeling puzzle rlanguage robyn visualization

2.0 match 233 stars 9.84 score 185 scripts 1 dependents

eogrady21

vprr:Processing and Visualization of Video Plankton Recorder Data

An oceanographic data processing package for analyzing and visualizing Video Plankton Recorder data. This package was developed at 'Bedford Institute of Oceanography'. Functions are designed to process automated image classification output and create organized and easily portable data products.

Maintained by Emily OGrady. Last updated 1 month ago.

3.5 match 2 stars 5.61 score 17 scripts

grunwaldlab

poppr:Genetic Analysis of Populations with Mixed Reproduction

Population genetic analyses for hierarchical analysis of partially clonal populations built upon the architecture of the 'adegenet' package. Originally described in Kamvar, Tabima, and Grünwald (2014) <doi:10.7717/peerj.281> with version 2.0 described in Kamvar, Brooks, and Grünwald (2015) <doi:10.3389/fgene.2015.00208>.

Maintained by Zhian N. Kamvar. Last updated 10 months ago.

clonality genetic-analysis genetic-distances minimum-spanning-networks multilocus-genotypes multilocus-lineages population-genetics populations openmp

1.8 match 69 stars 10.84 score 672 scripts

wraff

wrMisc:Analyze Experimental High-Throughput (Omics) Data

The efficient treatment and convenient analysis of experimental high-throughput (omics) data is facilitated by this collection of diverse functions. Several functions address advanced object-conversions, like manipulating lists of lists or lists of arrays, reorganizing lists to arrays or into separate vectors, merging of multiple entries, etc. Another set of functions provides speed-optimized calculation of standard deviation (sd), coefficient of variance (CV) or standard error of the mean (SEM) for data in matrixes or means per line with respect to additional grouping (e.g. n groups of replicates). A group of functions facilitates dealing with non-redundant information, by indexing unique, adding counters to redundant or eliminating lines with respect to redundancy in a given reference-column, etc. Help is provided to identify very closely matching numeric values to generate (partial) distance matrixes for very big data in a memory efficient manner or to reduce the complexity of large data-sets by combining very close values. Other functions help aligning a matrix or data.frame to a reference using partial matching or to mine an experimental setup to extract patterns of replicate samples. Large experimental datasets often need some additional filtering, and adequate functions are provided. Convenient data normalization is supported in various different modes; parameter estimation via permutations or bootstrap as well as flexible testing of multiple pair-wise combinations using the framework of 'limma' is provided, too. Batch reading (or writing) of sets of files and combining data to arrays is supported, too.

Maintained by Wolfgang Raffelsberger. Last updated 7 months ago.

4.2 match 4.44 score 33 scripts 4 dependents

r-gregmisc

gmodels:Various R Programming Tools for Model Fitting

Various R programming tools for model fitting.

Maintained by Gregory R. Warnes. Last updated 3 months ago.

1.8 match 1 stars 10.01 score 3.5k scripts 30 dependents

quanteda

quanteda.textstats:Textual Statistics for the Quantitative Analysis of Textual Data

Textual statistics functions formerly in the 'quanteda' package. Textual statistics for characterizing and comparing textual data. Includes functions for measuring term and document frequency, the co-occurrence of words, similarity and distance between features and documents, feature entropy, keyword occurrence, readability, and lexical diversity. These functions extend the 'quanteda' package and are specially designed for sparse textual data.

Maintained by Kenneth Benoit. Last updated 6 months ago.

onetbb cpp

2.0 match 15 stars 8.91 score 916 scripts 10 dependents

magnusdv

pedtools:Creating and Working with Pedigrees and Marker Data

A comprehensive collection of tools for creating, manipulating and visualising pedigrees and genetic marker data. Pedigrees can be read from text files or created on the fly with built-in functions. A range of utilities enable modifications like adding or removing individuals, breaking loops, and merging pedigrees. An online tool for creating pedigrees interactively, based on 'pedtools', is available at <https://magnusdv.shinyapps.io/quickped>. 'pedtools' is the hub of the 'pedsuite', a collection of packages for pedigree analysis. A detailed presentation of the 'pedsuite' is given in the book 'Pedigree Analysis in R' (Vigeland, 2021, ISBN:9780128244302).

Maintained by Magnus Dehli Vigeland. Last updated 2 months ago.

2.0 match 25 stars 8.83 score 60 scripts 18 dependents

opensafely-core

osutils:Useful Functions for OpenSAFELY

Contains functions that are often needed when using the OpenSAFELY platform <https://www.opensafely.org/>, such as redaction and low-memory processing.

Maintained by William Hulme. Last updated 3 months ago.

10.3 match 1.70 score 1 scripts

spatstat

spatstat.explore:Exploratory Data Analysis for the 'spatstat' Family

Functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.

Maintained by Adrian Baddeley. Last updated 1 month ago.

cluster-detection confidence-intervals hypothesis-testing k-function roc-curves scan-statistics significance-testing simulation-envelopes spatial-analysis spatial-data-analysis spatial-sharpening spatial-smoothing spatial-statistics

1.7 match 1 stars 10.17 score 67 scripts 148 dependents

chstock

DTComPair:Comparison of Binary Diagnostic Tests in a Paired Study Design

Comparison of the accuracy of two binary diagnostic tests in a "paired" study design, i.e. when each test is applied to each subject in the study.

Maintained by Christian Stock. Last updated 5 months ago.

clinical-epidemiology comparative-analysis diagnosis diagnostic-accuracy-studies diagnostic-likelihood-ratio diagnostic-tests medicine predictive-value sensitivity specificity

3.4 match 1 stars 5.07 score 47 scripts

muschellij2

neurobase:'Neuroconductor' Base Package with Helper Functions for 'nifti' Objects

Base package for 'Neuroconductor', which includes many helper functions that interact with objects of class 'nifti', implemented by package 'oro.nifti', for reading/writing and also other manipulation functions.

Maintained by John Muschelli. Last updated 1 month ago.

2.0 match 5 stars 8.49 score 486 scripts 7 dependents

joshobrien

rasterDT:Fast Raster Summary and Manipulation

Fast alternatives to several relatively slow 'raster' package functions. For large rasters, the functions run from 5 to approximately 100 times faster than the 'raster' package functions they replace. The 'fasterize' package, on which one function in this package depends, includes an implementation of the scan line algorithm attributed to Wylie et al. (1967) <doi:10.1145/1465611.1465619>.

Maintained by Joshua OBrien. Last updated 2 years ago.

3.7 match 27 stars 4.61 score 8 scripts 1 dependents

ropensci

USAboundaries:Historical and Contemporary Boundaries of the United States of America

The boundaries for geographical units in the United States of America contained in this package include state, county, congressional district, and zip code tabulation area. Contemporary boundaries are provided by the U.S. Census Bureau (public domain). Historical boundaries for the years from 1629 to 2000 are provided from the Newberry Library's 'Atlas of Historical County Boundaries' (licensed CC BY-NC-SA). Additional data is provided in the 'USAboundariesData' package; this package provides an interface to access that data.

Maintained by Lincoln Mullen. Last updated 3 years ago.

digital-history history spatial-data

2.3 match 58 stars 7.33 score 1.2k scripts

cran

poliscidata:Datasets and Functions Featured in Pollock and Edwards, an R Companion to Essentials of Political Analysis, Second Edition

Bundles the datasets and functions used in the textbook by Philip Pollock and Barry Edwards, an R Companion to Essentials of Political Analysis, Second Edition.

Maintained by Barry Edwards. Last updated 5 years ago.

7.3 match 2.33 score 213 scripts

edm44

hydraulics:Basic Pipe and Open Channel Hydraulics

Functions for basic hydraulic calculations related to water flow in circular pipes both flowing full (under pressure), and partially full (gravity flow), and trapezoidal open channels. For pressure flow this includes friction loss calculations by solving the Darcy-Weisbach equation for head loss, flow or diameter, plotting a Moody diagram, matching a pump characteristic curve to a system curve, and solving for flows in a pipe network using the Hardy-Cross method. The Darcy-Weisbach friction factor is calculated using the Colebrook (or Colebrook-White) equation, the basis of the Moody diagram, the original citation being Colebrook (1939) <doi:10.1680/ijoti.1939.13150>. For gravity flow, the Manning equation is used, again solving for missing parameters. The derivation of and solutions using the Darcy-Weisbach equation and the Manning equation are outlined in many fluid mechanics texts such as Finnemore and Maurer (2024, ISBN:978-1-264-78729-6). Some gradually- and rapidly-varied flow functions are included. For the Manning equation solutions, this package uses modifications of original code from the 'iemisc' package by Irucka Embry.

Maintained by Ed Maurer. Last updated 4 months ago.

3.3 match 9 stars 5.13 score 8 scripts

bergsmat

tablet:Tabulate Descriptive Statistics in Multiple Formats

Creates a table of descriptive statistics for factor and numeric columns in a data frame. Displays these by groups, if any. Highly customizable, with support for 'html' and 'pdf' provided by 'kableExtra'. Respects original column order, column labels, and factor level order. See ?tablet.data.frame and vignettes.

Maintained by Tim Bergsma. Last updated 4 months ago.

3.0 match 3 stars 5.57 score 26 scripts

stulacy

epitab:Flexible Contingency Tables for Epidemiology

Builds contingency tables that cross-tabulate multiple categorical variables and also calculates various summary measures. Export to a variety of formats is supported, including: 'HTML', 'LaTeX', and 'Excel'.

Maintained by Stuart Lacy. Last updated 7 years ago.

3.8 match 1 stars 4.41 score 17 scripts

trinker

textshape:Tools for Reshaping Text

Tools that can be used to reshape and restructure text data.

Maintained by Tyler Rinker. Last updated 12 months ago.

data-reshaping manipulation sentence-boundary-detection text-data text-formating tidy

1.8 match 50 stars 9.18 score 266 scripts 34 dependents

environmentalinformatics-marburg

satellite:Handling and Manipulating Remote Sensing Data

Herein, we provide a broad variety of functions which are useful for handling, manipulating, and visualizing satellite-based remote sensing data. These operations range from mere data import and layer handling (e.g. subsetting), over Raster* typical data wrangling (e.g. crop, extend), to more sophisticated (pre-)processing tasks typically applied to satellite imagery (e.g. atmospheric and topographic correction). This functionality is complemented by full access to the satellite layers' metadata at any stage and the documentation of performed actions in a separate log file. Currently available sensors include Landsat 4-5 (TM), 7 (ETM+), and 8 (OLI/TIRS Combined), and additional compatibility is ensured for the Landsat Global Land Survey data set.

Maintained by Florian Detsch. Last updated 1 year ago.

cpp

1.7 match 22 stars 9.88 score 61 scripts 27 dependents

zrmacc

SurrogateRegression:Surrogate Outcome Regression Analysis

Performs estimation and inference on a partially missing target outcome (e.g. gene expression in an inaccessible tissue) while borrowing information from a correlated surrogate outcome (e.g. gene expression in an accessible tissue). Rather than regarding the surrogate outcome as a proxy for the target outcome, this package jointly models the target and surrogate outcomes within a bivariate regression framework. Unobserved values of either outcome are treated as missing data. In contrast to imputation-based inference, no assumptions are required regarding the relationship between the target and surrogate outcomes. Estimation in the presence of bilateral outcome missingness is performed via an expectation conditional maximization either algorithm. In the case of unilateral target missingness, estimation is performed using an accelerated least squares procedure. A flexible association test is provided for evaluating hypotheses about the target regression parameters. For additional details, see: McCaw ZR, Gaynor SM, Sun R, Lin X: "Leveraging a surrogate outcome to improve inference on a partially missing target outcome" <doi:10.1111/biom.13629>.

Maintained by Zachary McCaw. Last updated 5 months ago.

openblas cpp

4.0 match 1 stars 4.08 score 12 scripts

oakleyj

SHELF:Tools to Support the Sheffield Elicitation Framework

Implements various methods for eliciting a probability distribution for a single parameter from an expert or a group of experts. The expert provides a small number of probability judgements, corresponding to points on his or her cumulative distribution function. A range of parametric distributions can then be fitted and displayed, with feedback provided in the form of fitted probabilities and percentiles. For multiple experts, a weighted linear pool can be calculated. Also includes functions for eliciting beliefs about population distributions; eliciting multivariate distributions using a Gaussian copula; eliciting a Dirichlet distribution; eliciting distributions for variance parameters in a random effects meta-analysis model; survival extrapolation. R Shiny apps for most of the methods are included.

Maintained by Jeremy Oakley. Last updated 15 days ago.

1.8 match 19 stars 8.90 score 73 scripts 3 dependents

bioc

beadarray:Quality assessment and low-level analysis for Illumina BeadArray data

The package is able to read bead-level data (raw TIFFs and text files) output by BeadScan as well as bead-summary data from BeadStudio. Methods for quality assessment and low-level analysis are provided.

Maintained by Mark Dunning. Last updated 5 months ago.

microarray onechannel qualitycontrol preprocessing

2.0 match 7.88 score 70 scripts 4 dependents

olihawkins

tabbycat:Tabulate and Summarise Categorical Data

Functions for tabulating and summarising categorical variables. Most functions are designed to work with dataframes, and use the 'tidyverse' idiom of taking the dataframe as the first argument so they work within pipelines. Equivalent functions that operate directly on vectors are also provided where it makes sense. This package aims to make exploratory data analysis involving categorical variables quicker, simpler and more robust.

Maintained by Oliver Hawkins. Last updated 2 years ago.

3.6 match 36 stars 4.26 score 2 scripts

thothorn

libcoin:Linear Test Statistics for Permutation Inference

Basic infrastructure for linear test statistics and permutation inference in the framework of Strasser and Weber (1999) <https://epub.wu.ac.at/102/>. This package must not be used by end-users. CRAN package 'coin' implements all user interfaces and is ready to be used by anyone.

Maintained by Torsten Hothorn. Last updated 1 year ago.

openblas

2.3 match 1 stars 6.81 score 25 scripts 171 dependents

ropensci

rdhs:API Client and Dataset Management for the Demographic and Health Survey (DHS) Data

Provides a client for (1) querying the DHS API for survey indicators and metadata (<https://api.dhsprogram.com/#/index.html>), (2) identifying surveys and datasets for analysis, (3) downloading survey datasets from the DHS website, (4) loading datasets and associated metadata into R, and (5) extracting variables and combining datasets for pooled analysis.

Maintained by OJ Watson. Last updated 17 days ago.

dataset dhs dhs-api extract peer-reviewed survey-data

1.5 match 35 stars 10.07 score 286 scripts 3 dependents

muschellij2

fslr:Wrapper Functions for 'FSL' ('FMRIB' Software Library) from Functional MRI of the Brain ('FMRIB')

Wrapper functions that interface with 'FSL' <http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/>, a powerful and commonly-used 'neuroimaging' software, using system commands. The goal is to be able to interface with 'FSL' completely in R, where you pass R objects of class 'nifti', implemented by package 'oro.nifti', and the function executes an 'FSL' command and returns an R object of class 'nifti' if desired.

Maintained by John Muschelli. Last updated 1 month ago.

fsl fslr neuroimaging neuroimaging-analysis neuroimaging-data-science

1.9 match 41 stars 8.01 score 420 scripts

ai-sdc

acro:A Tool for Semi-Automating the Statistical Disclosure Control of Research Outputs

Assists researchers and output checkers by distinguishing between research output that is safe to publish, output that requires further analysis, and output that cannot be published because of substantial disclosure risk. A paper about the tool was presented at the UNECE Expert Meeting on Statistical Data Confidentiality 2023; see <https://uwe-repository.worktribe.com/output/11060964>.

Maintained by Jim Smith. Last updated 10 days ago.

data-privacy data-protection privacy privacy-tools statistical-disclosure-control statistical-software

3.5 match 1 stars 4.11 score 1 scripts

huismanj

qpNCA:Noncompartmental Pharmacokinetic Analysis by qPharmetra

Computes noncompartmental pharmacokinetic parameters for drug concentration profiles. For each profile, data imputations and adjustments are made as necessary and basic parameters are estimated. Supports single dose, multi-dose, and multi-subject data. Supports steady-state calculations and various routes of drug administration. See ?qpNCA and vignettes. Methodology follows Rowland and Tozer (2011, ISBN:978-0-683-07404-8), Gabrielsson and Weiner (1997, ISBN:978-91-9765-100-4), and Gibaldi and Perrier (1982, ISBN:978-0824710422).

Maintained by Jan Huisman. Last updated 4 years ago.

3.8 match 3.83 score 34 scripts

uligges

klaR:Classification and Visualization

Miscellaneous functions for classification and visualization, e.g. regularized discriminant analysis, sknn() kernel-density naive Bayes, an interface to 'svmlight' and stepclass() wrapper variable selection for supervised classification, partimat() visualization of classification rules and shardsplot() of cluster results as well as kmodes() clustering for categorical data, corclust() variable clustering, variable extraction from different variable clustering models and weight of evidence preprocessing.

Maintained by Uwe Ligges. Last updated 1 year ago.

1.9 match 5 stars 7.61 score 1.4k scripts 13 dependents

egenn

rtemis:Machine Learning and Visualization

Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.

Maintained by E.D. Gennatas. Last updated 1 month ago.

data-science data-visualization machine-learning machine-learning-library visualization

2.0 match 145 stars 7.09 score 50 scripts 2 dependents

projectmosaic

mosaicCore:Common Utilities for Other MOSAIC-Family Packages

Common utilities used in other MOSAIC-family packages are collected here.

Maintained by Randall Pruim. Last updated 1 year ago.

2.0 match 1 stars 7.07 score 113 scripts 26 dependents

insightsengineering

tern.rbmi:Create Interface for 'RBMI' and 'tern'

'RBMI' implements standard and reference based multiple imputation methods for continuous longitudinal endpoints (Gower-Page et al. (2022) <doi:10.21105/joss.04251>). This package provides an interface for 'RBMI' that uses the 'tern' <https://cran.r-project.org/package=tern> framework by Zhu et al. (2023) and tabulates results easily using 'rtables' <https://cran.r-project.org/package=rtables> by Becker et al. (2023).

Maintained by Joe Zhu. Last updated 10 days ago.

graphs listings tables

2.2 match 3 stars 6.53 score 3 scripts

kestrel99

pmxTools:Pharmacometric and Pharmacokinetic Toolkit

Pharmacometric tools for common data analytical tasks; closed-form solutions for calculating concentrations at given times after dosing based on compartmental PK models (1-compartment, 2-compartment and 3-compartment, covering infusions, zero- and first-order absorption, and lag times, after single doses and at steady state, per Bertrand & Mentre (2008) <http://lixoft.com/wp-content/uploads/2016/03/PKPDlibrary.pdf>); parametric simulation from NONMEM-generated parameter estimates and other output; and parsing, tabulating and plotting results generated by Perl-speaks-NONMEM (PsN).

Maintained by Justin Wilkins. Last updated 7 months ago.

nonmem pharmacokinetics simulation

2.1 match 30 stars 6.40 score 84 scripts

finnishcancerregistry

popEpi:Functions for Epidemiological Analysis using Population Data

Enables computation of epidemiological statistics, including those where counts or mortality rates of the reference population are used. Currently supported: excess hazard models (Dickman, Sloggett, Hills, and Hakulinen (2012) <doi:10.1002/sim.1597>), rates, mean survival times, relative/net survival (in particular the Ederer II (Ederer and Heise (1959)) and Pohar Perme (Pohar Perme, Stare, and Esteve (2012) <doi:10.1111/j.1541-0420.2011.01640.x>) estimators), and standardized incidence and mortality ratios, all of which can be easily adjusted for by covariates such as age. Fast splitting and aggregation of 'Lexis' objects (from package 'Epi') and other computations achieved using 'data.table'.

Maintained by Joonas Miettinen. Last updated 2 months ago.

adjust-estimates age-adjusting direct-adjusting epidemiology indirect-adjusting survival

1.7 match 8 stars 8.05 score 117 scripts 1 dependents

brockk

escalation:A Modular Approach to Dose-Finding Clinical Trials

Methods for working with dose-finding clinical trials. We provide implementations of many dose-finding clinical trial designs, including the continual reassessment method (CRM) by O'Quigley et al. (1990) <doi:10.2307/2531628>, the toxicity probability interval (TPI) design by Ji et al. (2007) <doi:10.1177/1740774507079442>, the modified TPI (mTPI) design by Ji et al. (2010) <doi:10.1177/1740774510382799>, the Bayesian optimal interval design (BOIN) by Liu & Yuan (2015) <doi:10.1111/rssc.12089>, EffTox by Thall & Cook (2004) <doi:10.1111/j.0006-341X.2004.00218.x>; the design of Wages & Tait (2015) <doi:10.1080/10543406.2014.920873>, and the 3+3 described by Korn et al. (1994) <doi:10.1002/sim.4780131802>. All designs are implemented with a common interface. We also offer optional additional classes to tailor the behaviour of all designs, including avoiding skipping doses, stopping after n patients have been treated at the recommended dose, stopping when a toxicity condition is met, or demanding that n patients are treated before stopping is allowed. By daisy-chaining together these classes using the pipe operator from 'magrittr', it is simple to tailor the behaviour of a dose-finding design so it behaves how the trialist wants. Having provided a flexible interface for specifying designs, we then provide functions to run simulations and calculate dose-paths for future cohorts of patients.

Maintained by Kristian Brock. Last updated 2 months ago.

1.7 match 15 stars 7.91 score 67 scripts

nmautoverse

NMdata:Preparation, Checking and Post-Processing Data for PK/PD Modeling

Efficient tools for preparation, checking and post-processing of data in PK/PD (pharmacokinetics/pharmacodynamics) modeling, with focus on use of Nonmem. Attention is paid to ensure consistency, traceability, and Nonmem compatibility of Data. Rigorously checks final Nonmem datasets. Implemented in 'data.table', but easily integrated with 'base' and 'tidyverse'.

Maintained by Philip Delff. Last updated 3 days ago.

nonmem pharmacometrics

1.8 match 17 stars 7.69 score 88 scripts 2 dependents

roelandkindt

BiodiversityR:Package for Community Ecology and Suitability Analysis

Graphical User Interface (via the R-Commander) and utility functions (often based on the vegan package) for statistical analysis of biodiversity and ecological communities, including species accumulation curves, diversity indices, Renyi profiles, GLMs for analysis of species abundance and presence-absence, distance matrices, Mantel tests, and cluster, constrained and unconstrained ordination analysis. A book on biodiversity and community ecology analysis is available for free download from the website. In 2012, methods for (ensemble) suitability modelling and mapping were expanded in the package.

Maintained by Roeland Kindt. Last updated 2 months ago.

1.8 match 16 stars 7.42 score 390 scripts 2 dependents

jeffreyevans

yaImpute:Nearest Neighbor Observation Imputation and Evaluation Tools

Performs nearest neighbor-based imputation using one or more alternative approaches to processing multivariate data. These include methods based on canonical correlation analysis, canonical correspondence analysis, and a multivariate adaptation of the random forest classification and regression techniques of Leo Breiman and Adele Cutler. Additional methods are also offered. The package includes functions for comparing the results from running alternative techniques, detecting imputation targets that are notably distant from reference observations, detecting and correcting for bias, bootstrapping and building ensemble imputations, and mapping results.

Maintained by Jeffrey S. Evans. Last updated 6 months ago.

imputation cpp

1.8 match 3 stars 7.40 score 94 scripts 12 dependents

cvoeten

buildmer:Stepwise Elimination and Term Reordering for Mixed-Effects Regression

Finds the largest possible regression model that will still converge for various types of regression analyses (including mixed models and generalized additive models) and then optionally performs stepwise elimination similar to the forward and backward effect-selection methods in SAS, based on the change in log-likelihood or its significance, Akaike's Information Criterion, the Bayesian Information Criterion, the explained deviance, or the F-test of the change in R².

Maintained by Cesko C. Voeten. Last updated 1 year ago.

2.3 match 5.82 score 200 scripts

ddalthorp

GenEst:Generalized Mortality Estimator

Command-line and 'shiny' GUI implementation of the GenEst models for estimating bird and bat mortality at wind and solar power facilities, following Dalthorp, et al. (2018) <doi:10.3133/tm7A2>.

Maintained by Daniel Dalthorp. Last updated 2 years ago.

cpp

1.7 match 7 stars 7.81 score 55 scripts 2 dependents

bioc

ropls:PCA, PLS(-DA) and OPLS(-DA) for multivariate analysis and feature selection of omics data

Latent variable modeling with Principal Component Analysis (PCA) and Partial Least Squares (PLS) is a powerful approach for visualization, regression, classification, and feature selection of omics data where the number of variables exceeds the number of samples and with multicollinearity among variables. Orthogonal Partial Least Squares (OPLS) enables separate modeling of the variation correlated with (predictive of) the factor of interest and the uncorrelated (orthogonal) variation. While performing similarly to PLS, OPLS facilitates interpretation. Successful applications of these chemometrics techniques include spectroscopic data such as Raman spectroscopy, nuclear magnetic resonance (NMR), mass spectrometry (MS) in metabolomics and proteomics, but also transcriptomics data. In addition to scores, loadings and weights plots, the package provides metrics and graphics to determine the optimal number of components (e.g. with the R2 and Q2 coefficients), check the validity of the model by permutation testing, detect outliers, and perform feature selection (e.g. with Variable Importance in Projection or regression coefficients). The package can be accessed via a user interface on the Workflow4Metabolomics.org online resource for computational metabolomics (built upon the Galaxy environment).

Maintained by Etienne A. Thevenot. Last updated 5 months ago.

regression classification principalcomponent transcriptomics proteomics metabolomics lipidomics massspectrometry immunooncology

1.7 match 7.55 score 210 scripts 8 dependents

trinker

qdapTools:Tools for the 'qdap' Package

A collection of tools associated with the 'qdap' package that may be useful outside of the context of text analysis.

Maintained by Tyler Rinker. Last updated 2 years ago.

1.8 match 16 stars 7.04 score 408 scripts 5 dependents

sem-in-r

seminr:Building and Estimating Structural Equation Models

A powerful, easy-to-use syntax for specifying and estimating complex Structural Equation Models. Models can be estimated using Partial Least Squares Path Modeling or Covariance-Based Structural Equation Modeling or covariance based Confirmatory Factor Analysis. Methods described in Ray, Danks, and Valdez (2021).

Maintained by Nicholas Patrick Danks. Last updated 3 years ago.

common-factors composites construct pls-models

1.7 match 62 stars 7.46 score 284 scripts

ibecav

CGPfunctions:Powell Miscellaneous Functions for Teaching and Learning Statistics

Miscellaneous functions useful for teaching statistics as well as actually practicing the art. They typically are not new methods but rather wrappers around either base R or other packages.

Maintained by Chuck Powell. Last updated 4 years ago.

1.7 match 27 stars 7.28 score 122 scripts

petermeissner

tabit:Simple Tabulation Made Simple

Simple tabulation should be dead simple. This package is an opinionated approach to easy tabulations while also providing exact numbers and allowing for re-usability. This is achieved by providing tabulations as data.frames with columns for values, optional variable names, frequency counts including and excluding NAs and percentages for counts including and excluding NAs. Also, values are automatically sorted in decreasing order of frequency counts to allow for fast skimming of the most important information.

Maintained by Peter Meissner. Last updated 5 years ago.

4.1 match 2 stars 3.00 score 3 scripts

bioc

SynExtend:Tools for Working With Synteny Objects

Shared order between genomic sequences provides a great deal of information. Synteny objects produced by the R package DECIPHER provide quantitative information about that shared order. SynExtend provides tools for extracting information from Synteny objects.

Maintained by Nicholas Cooley. Last updated 3 days ago.

genetics clustering comparativegenomics dataimport fortran openmp

1.9 match 1 stars 6.42 score 77 scripts

gavinrozzi

zipcodeR:Data & Functions for Working with US ZIP Codes

Make working with ZIP codes in R painless with an integrated dataset of U.S. ZIP codes and functions for working with them. Search ZIP codes by multiple geographies, including state, county, city & across time zones. Also included are functions for relating ZIP codes to Census data, geocoding & distance calculations.

Maintained by Gavin Rozzi. Last updated 1 year ago.

1.6 match 80 stars 7.31 score 176 scripts

gianmarcoalberti

caplot:Correspondence Analysis with Geometric Frequency Interpretation

Performs Correspondence Analysis on the given dataframe and plots the results in a scatterplot that emphasizes the geometric interpretation aspect of the analysis, following Borg-Groenen (2005) and Yelland (2010). It is particularly useful for highlighting the relationships between a selected row (or column) category and the column (or row) categories. See Borg-Groenen (2005, ISBN:978-0-387-28981-6); Yelland (2010) <doi:10.3888/tmj.12-4>.

Maintained by Gianmarco Alberti. Last updated 2 years ago.

6.8 match 1.70 score 1 scripts

alexanderrobitzsch

BIFIEsurvey:Tools for Survey Statistics in Educational Assessment

Contains tools for survey statistics (especially in educational assessment) for datasets with replication designs (jackknife, bootstrap, replicate weights; see Kolenikov, 2010; Pfefferman & Rao, 2009a, 2009b, <doi:10.1016/S0169-7161(09)70003-3>, <doi:10.1016/S0169-7161(09)70037-9>; Shao, 1996, <doi:10.1080/02331889708802523>). Descriptive statistics, linear and logistic regression, path models for manifest variables with measurement error correction and two-level hierarchical regressions for weighted samples are included. Statistical inference can be conducted for multiply imputed datasets and nested multiply imputed datasets and is particularly suited for the analysis of plausible values (for details see George, Oberwimmer & Itzlinger-Bruneforth, 2016; Bruneforth, Oberwimmer & Robitzsch, 2016; Robitzsch, Pham & Yanagida, 2016). The package development was supported by BIFIE (Federal Institute for Educational Research, Innovation and Development of the Austrian School System; Salzburg, Austria).

Maintained by Alexander Robitzsch. Last updated 11 months ago.

survey-statistics openblas cpp

2.3 match 4 stars 4.99 score 85 scripts 1 dependents

murrayefford

openCR:Open Population Capture-Recapture

Non-spatial and spatial open-population capture-recapture analysis.

Maintained by Murray Efford. Last updated 5 months ago.

cpp

1.9 match 4 stars 5.98 score 53 scripts

stats-uoa

s20x:Functions for University of Auckland Course STATS 201/208 Data Analysis

A set of functions used in teaching STATS 201/208 Data Analysis at the University of Auckland. The functions are designed to make parts of R more accessible to a large undergraduate population who are mostly not statistics majors.

Maintained by James Curran. Last updated 2 years ago.

1.8 match 3 stars 6.40 score 211 scripts 3 dependents

psolymos

mefa:Multivariate Data Handling in Ecology and Biogeography

A framework package that aims to provide a standardized computational environment for specialist work via object classes to represent the data coded by samples, taxa and segments (i.e. subpopulations, repeated measures). It supports easy processing of the data along with cross tabulation and relational data tables for samples and taxa. An object of class 'mefa' is a project specific compendium of the data and can be easily used in further analyses. Methods are provided for extraction, aggregation, conversion, plotting, summary and reporting of 'mefa' objects. Reports can be generated in plain text or LaTeX format. The vignette contains worked examples.

Maintained by Peter Solymos. Last updated 10 months ago.

2.3 match 2 stars 4.82 score 111 scripts 2 dependents

middleton-lab

abd:The Analysis of Biological Data

The abd package contains data sets and sample code for The Analysis of Biological Data by Michael Whitlock and Dolph Schluter (2009; Roberts & Company Publishers).

Maintained by Kevin M. Middleton. Last updated 11 months ago.

2.0 match 6 stars 5.53 score 182 scripts 1 dependents

bioc

ontoProc:processing of ontologies of anatomy, cell lines, and so on

Support harvesting of diverse bioinformatic ontologies, making particular use of the ontologyIndex package on CRAN. We provide snapshots of key ontologies for terms about cells, cell lines, chemical compounds, and anatomy, to help analyze genome-scale experiments, particularly cell x compound screens. Another purpose is to strengthen development of compelling use cases for richer interfaces to emerging ontologies.

Maintained by Vincent Carey. Last updated 3 days ago.

infrastructure go bioinformatics genomics ontology

1.7 match 3 stars 6.37 score 75 scripts 2 dependents

ashesitr

reservr:Fit Distributions and Neural Networks to Censored and Truncated Data

Define distribution families and fit them to interval-censored and interval-truncated data, where the truncation bounds may depend on the individual observation. The defined distributions feature density, probability, sampling and fitting methods as well as efficient implementations of the log-density log f(x) and log-probability log P(x0 <= X <= x1) for use in 'TensorFlow' neural networks via the 'tensorflow' package. Allows training parametric neural networks on interval-censored and interval-truncated data with flexible parameterization. Applications include Claims Development in Non-Life Insurance, e.g. modelling reporting delay distributions from incomplete data, see Bücher, Rosenstock (2022) <doi:10.1007/s13385-022-00314-4>.

Maintained by Alexander Rosenstock. Last updated 9 months ago.

openblas cpp openmp

2.0 match 5 stars 5.35 score 9 scripts

richardhooijmaijers

R3port:Report Functions to Create HTML and PDF Files

Create and combine HTML and PDF reports from within R. Tables and listings can be designed for reporting, and R plots can be included.

Maintained by Richard Hooijmaijers. Last updated 1 year ago.

1.9 match 10 stars 5.71 score 34 scripts 1 dependent

psolymos

mefa4:Multivariate Data Handling with S4 Classes and Sparse Matrices

An S4 update of the 'mefa' package using sparse matrices for enhanced efficiency. Sparse array-like objects are supported via lists of sparse matrices.

Maintained by Peter Solymos. Last updated 6 months ago.

data-manipulation ecology sparse-matrices

2.0 match 5.34 score 368 scripts 2 dependents

jinkim3

exploratory:A Tool for Large-Scale Exploratory Analyses

Conduct numerous exploratory analyses in an instant with a point-and-click interface. With one simple command, this tool launches a Shiny App on the local machine. Drag and drop variables in a data set to categorize them as possible independent, dependent, moderating, or mediating variables. Then run dozens (or hundreds) of analyses instantly to uncover any statistically significant relationships among variables. Any relationship thus uncovered should be tested in follow-up studies. This tool is designed only to facilitate exploratory analyses and should NEVER be used for p-hacking. Many of the functions used in this package are previous versions of functions in the R Packages 'kim' and 'ezr'. Selected References: Chang et al. (2021) <https://CRAN.R-project.org/package=shiny>. Dowle et al. (2021) <https://CRAN.R-project.org/package=data.table>. Kim (2023) <https://jinkim.science/docs/kim.pdf>. Kim (2021) <doi:10.5281/zenodo.4619237>. Kim (2020) <https://CRAN.R-project.org/package=ezr>. Simmons et al. (2011) <doi:10.1177/0956797611417632>. Tingley et al. (2019) <https://CRAN.R-project.org/package=mediation>. Wickham et al. (2020) <https://CRAN.R-project.org/package=ggplot2>.

Maintained by Jin Kim. Last updated 1 year ago.

2.3 match 5 stars 4.75 score 45 scripts

clewerenz

ilabelled:Simple Handling of Labelled Data

Simple handling of survey data. Smart handling of meta-information such as variable labels, value labels and scale levels. Easy access and validation of meta-information. Usage of value labels and values respectively for subsetting and recoding data.

Maintained by Christof Lewerenz. Last updated 2 months ago.

1.7 match 2 stars 6.02 score 13 scripts

ellessenne

rsimsum:Analysis of Simulation Studies Including Monte Carlo Error

Summarise results from simulation studies and compute Monte Carlo standard errors of commonly used summary statistics. This package is modelled on the 'simsum' user-written command in 'Stata' (White I.R., 2010 <https://www.stata-journal.com/article.html?article=st0200>), further extending it with additional performance measures and functionality.

Maintained by Alessandro Gasparini. Last updated 10 months ago.

biostatistics monte-carlo-error simulation simulation-study simulations statistics

1.3 match 28 stars 7.70 score 148 scripts

noaa-ocm

SWMPrExtension:Functions for Analyzing and Plotting Estuary Monitoring Data

Tools for performing routine analysis and plotting tasks with environmental data from the System Wide Monitoring Program of the National Estuarine Research Reserve System <https://cdmo.baruch.sc.edu/>. This package builds on the functionality of the 'SWMPr' package <https://cran.r-project.org/package=SWMPr>, which is used to retrieve and organize the data. The combined set of tools addresses common challenges associated with continuous time series data for environmental decision making, and is intended for use in annual reporting activities. References: Beck, Marcus W. (2016) <ISSN 2073-4859><https://journal.r-project.org/archive/2016-1/beck.pdf>. Rudis, Bob (2014) <https://rud.is/b/2014/11/16/moving-the-earth-well-alaska-hawaii-with-r/>. United States Environmental Protection Agency (2015) <https://cfpub.epa.gov/si/si_public_record_Report.cfm?Lab=OWOW&dirEntryId=327030>. United States Environmental Protection Agency (2012) <https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.646.1973&rep=rep1&type=pdf>.

Maintained by Matt Dornback. Last updated 2 years ago.

2.0 match 12 stars 5.10 score 42 scripts

urswilke

crosstabser:Generate Crosstabs of Labelled Data Sets with Excel Files

An R package to use commands in Excel files to generate crosstabs of labelled data sets (usually survey data). The crosstabs can be printed to the console, and serve as an input for an app to plot them interactively.

Maintained by Urs Wilke. Last updated 17 days ago.

2.3 match 4.48 score

simonmoulds

lulcc:Land Use Change Modelling in R

Classes and methods for spatially explicit land use change modelling in R.

Maintained by Simon Moulds. Last updated 5 years ago.

1.8 match 41 stars 5.37 score 38 scripts

reginalexavier

OpenLand:Quantitative Analysis and Visualization of LUCC

Tools for the analysis of land use and cover (LUC) time series. It includes support for loading spatiotemporal raster data and synthesized spatial plotting. Several LUC change (LUCC) metrics in regular or irregular time intervals can be extracted and visualized through one- and multistep sankey and chord diagrams. A complete intensity analysis according to Aldwaik and Pontius (2012) <doi:10.1016/j.landurbplan.2012.02.010> is implemented, including tools for the generation of standardized multilevel output graphics.

Maintained by Reginal Exavier. Last updated 11 months ago.

geography geospatial intensity-analysis land-use-and-land-cover-change luc-maps lulc plot rasters

1.7 match 22 stars 5.80 score 19 scripts

tmsalab

cIRT:Choice Item Response Theory

Jointly model the accuracy of cognitive responses and item choices within a Bayesian hierarchical framework as described by Culpepper and Balamuta (2015) <doi:10.1007/s11336-015-9484-7>. In addition, the package contains the datasets used within the analysis of the paper.

Maintained by James Joseph Balamuta. Last updated 3 years ago.

armadillo bayesian choice cognitive-diagnostic-models gibbs-sampling item-response-theory rcpparmadillo openblas cpp openmp

1.9 match 4 stars 5.14 score 23 scripts

bioc

cTRAP:Identification of candidate causal perturbations from differential gene expression data

Compare differential gene expression results with those from known cellular perturbations (such as gene knock-down, overexpression or small molecules) derived from the Connectivity Map. Such analyses make it possible not only to infer the molecular causes of the observed difference in gene expression but also to identify small molecules that could drive or revert specific transcriptomic alterations.

Maintained by Nuno Saraiva-Agostinho. Last updated 5 months ago.

differentialexpression geneexpression rnaseq transcriptomics pathways immunooncology genesetenrichment bioconductor bioinformatics cmap gene-expression l1000

1.9 match 5 stars 5.08 score 16 scripts

rdinnager

slimr:Create, Run and Post-Process 'SLiM' Population Genetics Forward Simulations

Lets you write 'SLiM' scripts (population genomics simulation) in your favourite R IDE, using a syntax as close as possible to the original 'SLiM' language. It offers many tools to manipulate those scripts, run them in the 'SLiM' software from R, and capture and post-process their output, after or even during a simulation.

Maintained by Russell Dinnage. Last updated 4 months ago.

2.0 match 8 stars 4.70 score 42 scripts

bemts-hhs

nemsqar:National Emergency Medical Service Quality Alliance Measure Calculations

Designed to automate the calculation of Emergency Medical Service (EMS) quality metrics, 'nemsqar' implements measures defined by the National EMS Quality Alliance (NEMSQA). By providing reliable, evidence-based quality assessments, the package supports EMS agencies, healthcare providers, and researchers in evaluating and improving patient outcomes. Users can find details on all approved NEMSQA measures at <https://www.nemsqa.org/measures>. Full technical specifications, including documentation and pseudocode used to develop 'nemsqar', are available on the NEMSQA website after creating a user profile at <https://www.nemsqa.org>.

Maintained by Nicolas Foss. Last updated 3 days ago.

ems nemsqa quality trauma

2.0 match 5 stars 4.70 score

bioimaginggroup

bioimagetools:Tools for Microscopy Imaging

Tools for 3D imaging, mostly for biology/microscopy. Read and write TIFF stacks. Functions for segmentation, filtering and analyzing 3D point patterns.

Maintained by Volker Schmid. Last updated 3 years ago.

biology microscopy

1.7 match 4 stars 5.30 score 33 scripts 1 dependent

bioc

CMA:Synthesis of microarray-based classification

This package provides a comprehensive collection of various microarray-based classification algorithms both from Machine Learning and Statistics. Variable Selection, Hyperparameter tuning, Evaluation and Comparison can be performed combined or stepwise in a user-friendly environment.

Maintained by Roman Hornung. Last updated 5 months ago.

classification decisiontree

1.8 match 5.09 score 61 scripts

statisticsnorway

GaussSuppression:Tabular Data Suppression using Gaussian Elimination

A statistical disclosure control tool to protect tables by suppression using the Gaussian elimination secondary suppression algorithm (Langsrud, 2024) <doi:10.1007/978-3-031-69651-0_6>. A suggestion is to start by working with functions SuppressSmallCounts() and SuppressDominantCells(). These functions use primary suppression functions for the minimum frequency rule and the dominance rule, respectively. Novel functionality for suppression of disclosive cells is also included. General primary suppression functions can be supplied as input to the general workhorse function, GaussSuppressionFromData(). Suppressed frequencies can be replaced by synthetic decimal numbers as described in Langsrud (2019) <doi:10.1007/s11222-018-9848-9>.

Maintained by Øyvind Langsrud. Last updated 3 days ago.

1.3 match 2 stars 6.61 score 50 scripts

lhdjung

scrutiny:Error Detection in Science

Test published summary statistics for consistency (Brown and Heathers, 2017, <doi:10.1177/1948550616673876>; Allard, 2018, <https://aurelienallard.netlify.app/post/anaytic-grimmer-possibility-standard-deviations/>; Heathers and Brown, 2019, <https://osf.io/5vb3u/>). The package also provides infrastructure for implementing new error detection techniques.

Maintained by Lukas Jung. Last updated 6 months ago.

1.3 match 8 stars 6.52 score 38 scripts

cran

epitools:Epidemiology Tools

Tools for training and practicing epidemiologists including methods for two-way and multi-way contingency tables.

Maintained by Adam Omidpanah. Last updated 5 years ago.

1.8 match 2 stars 4.89 score 12 dependents

elipousson

mapmaryland:Easy Access to Maryland Spatial Data

A small collection of data sources and utility functions for working with state and county data sources in Maryland.

Maintained by Eli Pousson. Last updated 5 months ago.

maryland rspatial

3.4 match 3 stars 2.48 score 4 scripts

jhmaindonald

gamclass:Functions and Data for a Course on Modern Regression and Classification

Functions and data are provided that support a course that emphasizes statistical issues of inference and generalizability. The functions are designed to make it straightforward to illustrate the use of cross-validation, the training/test approach, simulation, and model-based estimates of accuracy. Methods considered are Generalized Additive Modeling, Linear and Quadratic Discriminant Analysis, Tree-based methods, and Random Forests.

Maintained by John Maindonald. Last updated 2 years ago.

1.8 match 4.82 score 44 scripts

lhdjung

moder:Mode Estimation

Determines single or multiple modes (most frequent values). Checks if missing values make this impossible, and returns 'NA' in this case. Dependency-free source code. See Franzese and Iuliano (2019) <doi:10.1016/B978-0-12-809633-8.20354-3>.

Maintained by Lukas Jung. Last updated 1 year ago.

1.9 match 4.48 score 15 scripts

bioc

ptairMS:Pre-processing PTR-TOF-MS Data

This package implements a suite of methods to preprocess data from PTR-TOF-MS instruments (HDF5 format) and generates the 'sample by features' table of peak intensities in addition to the sample and feature metadata (as a single ExpressionSet object for subsequent statistical analysis). The package also provides useful tools for cohort management, such as analyzing data progressively, visualization tools and quality control. The steps include calibration, expiration detection, peak detection and quantification, feature alignment, missing value imputation and feature annotation. Applications to exhaled air and cell culture in headspace are described in the vignettes and examples. This package was used for the data analysis of the Gassin Delyle study on adults undergoing invasive mechanical ventilation in the intensive care unit due to severe COVID-19 or non-COVID-19 acute respiratory distress syndrome (ARDS), and made it possible to identify four potential biomarkers of the infection.

Maintained by Camille Roquencourt. Last updated 5 months ago.

software massspectrometry preprocessing metabolomics peakdetection alignment cpp

1.6 match 7 stars 5.15 score 3 scripts

cran

stoppingrule:Create and Evaluate Stopping Rules for Safety Monitoring

Provides functions for creating, displaying, and evaluating stopping rules for safety monitoring in clinical studies.

Maintained by Michael J. Martens. Last updated 1 month ago.

3.6 match 2.30 score

jinkim3

ezr:Easy Use of R via Shiny App for Basic Analyses of Experimental Data

Runs a Shiny App in the local machine for basic statistical and graphical analyses. The point-and-click interface of Shiny App enables obtaining the same analysis outputs (e.g., plots and tables) more quickly, as compared with typing the required code in R, especially for users without much experience or expertise with coding. Examples of possible analyses include tabulating descriptive statistics for a variable, creating histograms by experimental groups, and creating a scatter plot and calculating the correlation between two variables.

Maintained by Jin Kim. Last updated 4 years ago.

2.8 match 1 star 3.00 score 2 scripts

cran

diffval:Vegetation Patterns

Find, visualize and explore patterns of differential taxa in vegetation data (namely in a phytosociological table), using the Differential Value (DiffVal). Patterns are searched through mathematical optimization algorithms. Ultimately, Total Differential Value (TDV) optimization aims at obtaining classifications of vegetation data based on differential taxa, as in the traditional geobotanical approach. The Gurobi optimizer, as well as the R package 'gurobi', can be installed from <https://www.gurobi.com/products/gurobi-optimizer/>. The useful vignette Gurobi Installation Guide, from package 'prioritizr', can be found here: <https://prioritizr.net/articles/gurobi_installation_guide.html>.

Maintained by Tiago Monteiro-Henriques. Last updated 2 years ago.

4.8 match 1.70 score

mjuraska

seqDesign:Simulation and Group-Sequential Monitoring of Randomized Treatment Efficacy Trials with Time-to-Event Endpoints

A broad spectrum of both event-driven and fixed follow-up preventive vaccine efficacy trial designs, including designs of Gilbert, Grove et al. (2011, Statistical Communications in Infectious Diseases), are implemented, with application generally to individual-randomized clinical trials with multiple active treatment groups and a shared control group, and a study endpoint that is a time-to-event endpoint subject to right-censoring. The design accommodates the following features: (1) the possibility that the efficacy of the treatment/vaccine groups may take time to accrue while the multiple treatment administrations/vaccinations are given, (2) hazard ratio and cumulative incidence-based treatment/vaccine efficacy parameters and multiple estimation/hypothesis testing procedures are available, (3) interim/group-sequential monitoring of each treatment group for potential harm, non-efficacy (lack of benefit), efficacy (benefit), and high efficacy, (4) arbitrary alpha spending functions for different monitoring outcomes, (5) arbitrary timing of interim looks, separate for each monitoring outcome, in terms of either event accrual or calendar time, (6) flexible analysis cohort characterization (intention-to-treat vs. per-protocol/as-treated; counting only events for analysis that occur after a specific point in study time), and (7) division of the trial into two stages of time periods where each treatment is first evaluated for efficacy in the first stage of follow-up, and, if and only if it shows significant treatment efficacy in stage one, it is evaluated for longer-term durability of efficacy in stage two.
The package produces plots and tables describing operating characteristics of a specified design including a description of monitoring boundaries on multiple scales for the different outcomes; event accrual since trial initiation; probabilities of stopping early for potential harm, non-efficacy, etc.; an unconditional power for intention-to-treat and per-protocol analyses; calendar time to crossing a monitoring boundary or reaching the target number of endpoints if no boundary is crossed; trial duration; unconditional power for comparing treatment efficacies; and the distribution of the number of endpoints within an arbitrary study time interval (e.g., events occurring after the treatments/vaccinations are given), useful as input parameters for the design of studies of the association of biomarkers with a clinical outcome (surrogate endpoint problem). The code can be used for a single active treatment versus control design and for a single-stage design.

Maintained by Michal Juraska. Last updated 2 years ago.

1.7 match 2 stars 4.60 score 7 scripts

cran

misty:Miscellaneous Functions 'T. Yanagida'

Miscellaneous functions for (1) data management (e.g., grand-mean and group-mean centering, coding variables and reverse coding items, scale and cluster scores, reading and writing Excel and SPSS files), (2) descriptive statistics (e.g., frequency table, cross tabulation, effect size measures), (3) missing data (e.g., descriptive statistics for missing data, missing data pattern, Little's test of Missing Completely at Random, and auxiliary variable analysis), (4) multilevel data (e.g., multilevel descriptive statistics, within-group and between-group correlation matrix, multilevel confirmatory factor analysis, level-specific fit indices, cross-level measurement equivalence evaluation, multilevel composite reliability, and multilevel R-squared measures), (5) item analysis (e.g., confirmatory factor analysis, coefficient alpha and omega, between-group and longitudinal measurement equivalence evaluation), (6) statistical analysis (e.g., bootstrap confidence intervals, collinearity and residual diagnostics, dominance analysis, between- and within-subject analysis of variance, latent class analysis, t-test, z-test, sample size determination), and (7) functions to interact with 'Blimp' and 'Mplus'.

Maintained by Takuya Yanagida. Last updated 6 days ago.

2.8 match 1 star 2.82 score 1 dependent

rhartmano

labelr:Label Data Frames, Variables, and Values

Create and use data frame labels for data frame objects (frame labels), their columns (name labels), and individual values of a column (value labels). Value labels include one-to-one and many-to-one labels for nominal and ordinal variables, as well as numerical range-based value labels for continuous variables. Convert value-labeled variables so each value is replaced by its corresponding value label. Add values-converted-to-labels columns to a value-labeled data frame while preserving parent columns. Filter and subset a value-labeled data frame using labels, while returning results in terms of values. Overlay labels in place of values in common R commands to increase interpretability. Generate tables of value frequencies, with categories expressed as raw values or as labels. Access data frames that show value-to-label mappings for easy reference.

Maintained by Robert Hartman. Last updated 7 months ago.

1.3 match 3 stars 5.65 score 10 scripts

geomarker-io

codec:Community Data Explorer for Cincinnati

This repository serves as the definition of the CoDEC data specifications and provides helpers to create, validate, release, and read CoDEC data.

Maintained by Cole Brokamp. Last updated 23 days ago.

1.8 match 4 stars 4.15 score 27 scripts

bioc

iterativeBMAsurv:The Iterative Bayesian Model Averaging (BMA) Algorithm For Survival Analysis

The iterative Bayesian Model Averaging (BMA) algorithm for survival analysis is a variable selection method for applying survival analysis to microarray data.

Maintained by Ka Yee Yeung. Last updated 5 months ago.

microarray

2.3 match 3.30 score 8 scripts

jhhmuc

pairwise:Rasch Model Parameters by Pairwise Algorithm

Performs the explicit calculation -- not estimation! -- of the Rasch item parameters for dichotomous and polytomous item responses, using a pairwise comparison approach. Person parameters (WLE) are calculated according to Warm's weighted likelihood approach.

Maintained by Joerg-Henrik Heine. Last updated 2 years ago.

1.9 match 3.96 score 38 scripts 1 dependent

rwoldford

eikosograms:The Picture of Probability

An eikosogram (ancient Greek for probability picture) divides the unit square into rectangular regions whose areas, sides, and widths, represent various probabilities associated with the values of one or more categorical variates. Rectangle areas are joint probabilities, widths are always marginal (though possibly joint margins, i.e. marginal joint distributions of two or more variates), and heights of rectangles are always conditional probabilities. Eikosograms embed the rules of probability and are useful for introducing elementary probability theory, including axioms, marginal, conditional, and joint probabilities, and their relationships (including Bayes theorem as a completely trivial consequence). They are markedly superior to Venn diagrams for this purpose, especially in distinguishing probabilistic independence, mutually exclusive events, coincident events, and associations. They also are useful for identifying and understanding conditional independence structure. As data analysis tools, eikosograms display categorical data in a manner similar to Mosaic plots, especially when only two variates are involved (the only case in which they are essentially identical, though eikosograms purposely disallow spacing between rectangles). Unlike Mosaic plots, eikosograms do not alternate axes as each new categorical variate (beyond two) is introduced. Instead, only one categorical variate, designated the "response", presents on the vertical axis and all others, designated the "conditioning" variates, appear on the horizontal. In this way, conditional probability appears only as height and marginal probabilities as widths. The eikosogram is therefore much better suited to a response model analysis (e.g. logistic model) than is a Mosaic plot. Mosaic plots are better suited to log-linear style modelling as in discrete multivariate analysis. Of course, eikosograms are also suited to discrete multivariate analysis with each variate in turn appearing as the response. 
This makes it better suited than Mosaic plots to discrete graphical models based on conditional independence graphs (i.e. "Bayesian Networks" or "BayesNets"). The eikosogram and its superiority to Venn diagrams in teaching probability is described in W.H. Cherry and R.W. Oldford (2003) <https://math.uwaterloo.ca/~rwoldfor/papers/eikosograms/paper.pdf>, its value in exploring conditional independence structure and relation to graphical and log-linear models is described in R.W. Oldford (2003) <https://math.uwaterloo.ca/~rwoldfor/papers/eikosograms/independence/paper.pdf>, and a number of problems, puzzles, and paradoxes that are easily explained with eikosograms are given in R.W. Oldford (2003) <https://math.uwaterloo.ca/~rwoldfor/papers/eikosograms/examples/paper.pdf>.

Maintained by Wayne Oldford. Last updated 6 years ago.

1.5 match 4 stars 4.92 score 14 scripts

bioc

AssessORF:Assess Gene Predictions Using Proteomics and Evolutionary Conservation

In order to assess the quality of a set of predicted genes for a genome, evidence must first be mapped to that genome. Next, each gene must be categorized based on how strong the evidence is for or against that gene. The AssessORF package provides the functions and class structures necessary for accomplishing those tasks, using proteomic hits and evolutionarily conserved start codons as the forms of evidence.

Maintained by Deepank Korandla. Last updated 5 months ago.

comparativegenomics geneprediction genomeannotation genetics proteomics qualitycontrol visualization

1.8 match 4.18 score 3 scripts

christopherkenny

cvap:Citizen Voting Age Population

Works with the Citizen Voting Age Population special tabulation from the US Census Bureau <https://www.census.gov/programs-surveys/decennial-census/about/voting-rights/cvap.html>. Provides tools to download and process raw data. Also provides a downloading interface to processed data. Implements a very basic approach to estimate block level citizen voting age population from block group data.

Maintained by Christopher T. Kenny. Last updated 12 months ago.

2.2 match 2 stars 3.30 score 7 scripts

psychbruce

PsychWordVec:Word Embedding Research Framework for Psychological Science

An integrative toolbox of word embedding research that provides: (1) a collection of 'pre-trained' static word vectors in the '.RData' compressed format <https://psychbruce.github.io/WordVector_RData.pdf>; (2) a series of functions to process, analyze, and visualize word vectors; (3) a range of tests to examine conceptual associations, including the Word Embedding Association Test <doi:10.1126/science.aal4230> and the Relative Norm Distance <doi:10.1073/pnas.1720347115>, with permutation test of significance; (4) a set of training methods to locally train (static) word vectors from text corpora, including 'Word2Vec' <arXiv:1301.3781>, 'GloVe' <doi:10.3115/v1/D14-1162>, and 'FastText' <arXiv:1607.04606>; (5) a group of functions to download 'pre-trained' language models (e.g., 'GPT', 'BERT') and extract contextualized (dynamic) word vectors (based on the R package 'text').

Maintained by Han-Wu-Shuang Bao. Last updated 1 year ago.

1.8 match 22 stars 4.04 score 10 scripts

bioc

phenomis:Postprocessing and univariate analysis of omics data

The 'phenomis' package provides methods to perform post-processing (i.e. quality control and normalization) as well as univariate statistical analysis of single and multi-omics data sets. These methods include quality control metrics, signal drift and batch effect correction, intensity transformation, univariate hypothesis testing, but also clustering (as well as annotation of metabolomics data). The data are handled in the standard Bioconductor formats (i.e. SummarizedExperiment and MultiAssayExperiment for single and multi-omics datasets, respectively; the alternative ExpressionSet and MultiDataSet formats are also supported for convenience). As a result, all methods can be readily chained as workflows. The pipeline can be further enriched by multivariate analysis and feature selection, by using the 'ropls' and 'biosigner' packages, which support the same formats. Data can be conveniently imported from and exported to text files. Although the methods were initially targeted to metabolomics data, most of the methods can be applied to other types of omics data (e.g., transcriptomics, proteomics).

Maintained by Etienne A. Thevenot. Last updated 5 months ago.

batcheffect clustering coverage kegg massspectrometry metabolomics normalization proteomics qualitycontrol sequencing statisticalmethod transcriptomics

1.6 match 4.40 score 6 scripts

core-bioinformatics

noisyr:Noise Quantification in High Throughput Sequencing Output

Quantifies and removes technical noise from high-throughput sequencing data. Two approaches are used, one based on the count matrix, and one using the alignment BAM files directly. Contains several options for every step of the process, as well as tools to quality check and assess the stability of output.

Maintained by Ilias Moutsopoulos. Last updated 3 years ago.

1.7 match 9 stars 4.13 score 5 scripts 1 dependent

william-swl

baizer:Useful Functions for Data Processing

In ancient Chinese mythology, Bai Ze is a divine creature that knows the needs of everything. 'baizer' provides data processing functions frequently used by the author. Hope this package also knows what you want!

Maintained by William Song. Last updated 1 year ago.

dataframe numbers strings tidyverse

1.8 match 6 stars 3.95 score 5 scripts 1 dependent

florianjansen

vegdata:Access Vegetation Databases and Treat Taxonomy

Handling of vegetation data from different sources (Turboveg 2.0 <https://www.synbiosys.alterra.nl/turboveg/>, the German national repository <https://www.vegetweb.de>, and others). Taxonomic harmonization (given appropriate taxonomic lists, e.g. the German taxonomic standard list "GermanSL", <https://germansl.infinitenature.org>).

Maintained by Florian Jansen. Last updated 1 year ago.

1.8 match 2 stars 3.84 score 38 scripts 3 dependents

cran

ibmdbR:IBM in-Database Analytics for R

Functionality required to efficiently use R with IBM(R) Db2(R) Warehouse offerings (formerly IBM dashDB(R)) and IBM Db2 for z/OS(R) in conjunction with IBM Db2 Analytics Accelerator for z/OS. Many basic and complex R operations are pushed down into the database, which removes the main memory boundary of R and allows full use of parallel processing in the underlying database. For executing R functions in parallel in a multi-node environment, the idaTApply() function requires the 'SparkR' package (<https://spark.apache.org/docs/latest/sparkr.html>). The optional 'ggplot2' package is needed for the plot.idaLm() function only.

Maintained by Shaikh Quader. Last updated 1 year ago.

1.8 match 2 stars 3.82 score 66 scripts

r4epi

epitabulate:Tables for Epidemiological Analysis

Produces tables for descriptive epidemiological analysis. These tables describe counts of variables in either line-list or survey data (with appropriate confidence intervals), with additional functionality to calculate odds, risk, and incidence rate ratios directly from a linelist across several variables. This package is part of the 'R4EPIs' project <https://R4epis.netlify.com>.

Maintained by Alexander Spina. Last updated 2 years ago.

2.0 match 8 stars 3.38 score 3 scripts 1 dependent

marianschmidt

msSPChelpR:Helper Functions for Second Primary Cancer Analyses

A collection of helper functions for analyzing Second Primary Cancer data, including functions to reshape data, to calculate patient states and analyze cancer incidence.

Maintained by Marian Eberl. Last updated 1 year ago.

1.6 match 2 stars 4.18 score 15 scripts

cran

designmatch:Matched Samples that are Balanced and Representative by Design

Includes functions for the construction of matched samples that are balanced and representative by design. Among others, these functions can be used for matching in observational studies with treated and control units, with cases and controls, in related settings with instrumental variables, and in discontinuity designs. Also, they can be used for the design of randomized experiments, for example, for matching before randomization. By default, 'designmatch' uses the 'highs' optimization solver, but its performance is greatly enhanced by the 'Gurobi' optimization solver and its associated R interface. For their installation, please follow the instructions at <https://www.gurobi.com/documentation/quickstart.html> and <https://www.gurobi.com/documentation/7.0/refman/r_api_overview.html>. We have also included directions in the gurobi_installation file in the inst folder.

Maintained by Jose R. Zubizarreta. Last updated 2 years ago.

3.6 match 1 star 1.85 score 71 scripts

michaelhallquist

MplusAutomation:An R Package for Facilitating Large-Scale Latent Variable Analyses in Mplus

Leverages the R language to automate latent variable model estimation and interpretation using 'Mplus', a powerful latent variable modeling program developed by Muthen and Muthen (<https://www.statmodel.com>). Specifically, this package provides routines for creating related groups of models, running batches of models, and extracting and tabulating model parameters and fit statistics.

Maintained by Michael Hallquist. Last updated 2 months ago.

0.5 match 86 stars 12.96 score 664 scripts 13 dependents

cran

caroline:A Collection of Database, Data Structure, Visualization, and Utility Functions for R

The caroline R library contains dozens of functions useful for: database migration (dbWriteTable2), database-style joins & aggregation (nerge, groupBy, & bestBy), data structure conversion (nv, tab2df), legend table making (sstable & leghead), automatic legend positioning for scatter and box plots (), plot annotation (labsegs & mvlabs), data visualization (pies, sparge, confound.grid & raPlot), character string manipulation (m & pad), file I/O (write.delim), batch scripting, data exploration, and more. The package's greatest contributions lie in the database-style merge, aggregation and interface functions as well as in its extensive use and propagation of row, column and vector names in most functions.

Maintained by David Schruth. Last updated 5 months ago.

2.0 match 3.29 score 108 scripts 3 dependents

matei-ionita

Cleanet:Automated doublet detection and classification for cytometry data

Automated method for doublet detection in flow or mass cytometry data, based on simulating doublets and finding events whose protein expression patterns are similar to the simulated doublets.

Maintained by Matei Ionita. Last updated 3 months ago.

cpp

1.8 match 3.70 score

bioc

blacksheepr:Outlier Analysis for pairwise differential comparison

Blacksheep is a tool for outlier analysis in the context of pairwise comparisons, aimed at finding characteristics that distinguish two groups. It was designed for biological applications such as phosphoproteomics or transcriptomics, but it can be used for any data that can be represented as a 2D table containing two subpopulations to compare.

Maintained by RugglesLab. Last updated 5 months ago.

sequencing rnaseq geneexpression transcription differentialexpression transcriptomics

1.5 match 4.30 score 6 scripts

fmicompbio

swissknife:Handy code shared in the FMI CompBio group

A collection of useful R functions performing various tasks that might be re-usable and worth sharing.

Maintained by Michael Stadler. Last updated 2 months ago.

cpp

1.7 match 8 stars 3.76 score 12 scripts

cran

kequate:The Kernel Method of Test Equating

Implements the kernel method of test equating as defined in von Davier, A. A., Holland, P. W. and Thayer, D. T. (2004) <doi:10.1007/b97446> and Andersson, B. and Wiberg, M. (2017) <doi:10.1007/s11336-016-9528-7> using the CB, EG, SG, NEAT CE/PSE and NEC designs, supporting Gaussian, logistic and uniform kernels and unsmoothed and pre-smoothed input data.

Maintained by Björn Andersson. Last updated 3 years ago.

1.9 match 2 stars 3.37 score 29 scripts

cran

phase:Analyse Biological Time-Series Data

Compiles functions to trim, bin, visualise, and analyse activity/sleep time-series data collected from the Drosophila Activity Monitor (DAM) system (Trikinetics, USA). The following methods were used to compute periodograms - Chi-square periodogram: Sokolove and Bushell (1978) <doi:10.1016/0022-5193(78)90022-X>, Lomb-Scargle periodogram: Lomb (1976) <doi:10.1007/BF00648343>, Scargle (1982) <doi:10.1086/160554> and Ruf (1999) <doi:10.1076/brhm.30.2.178.1422>, and Autocorrelation: Eijzenbach et al. (1986) <doi:10.1111/j.1440-1681.1986.tb00943.x>. Identification of activity peaks is done after using a Savitzky-Golay filter (Savitzky and Golay (1964) <doi:10.1021/ac60214a047>) to smooth raw activity data. Three methods to estimate anticipation of activity are used based on the following papers - Slope method: Fernandez et al. (2020) <doi:10.1016/j.cub.2020.04.025>, Harrisingh method: Harrisingh et al. (2007) <doi:10.1523/JNEUROSCI.3680-07.2007>, and Stoleru method: Stoleru et al. (2004) <doi:10.1038/nature02926>. Rose plots and circular analysis are based on methods from - Batschelet (1981) <ISBN:0120810506> and Zar (2010) <ISBN:0321656865>.

Maintained by Lakshman Abhilash. Last updated 2 years ago.

3.3 match 1.78 score

numbersman77

reporttools:Generate "LaTeX" Tables of Descriptive Statistics

These functions are especially helpful when writing reports of data analysis using "Sweave".

Maintained by Kaspar Rufibach. Last updated 3 years ago.

1.8 match 2 stars 3.35 score 113 scripts

openvolley

ovlytics:Functions and Algorithms for Volleyball Analytics

Analytical functions for volleyball analytics, to be used in conjunction with the datavolley and peranavolley packages.

Maintained by Ben Raymond. Last updated 3 months ago.

1.9 match 3.13 score 9 scripts 3 dependents

urswilke

pyramidi:Generate and Manipulate Midi Data in R Data Frames

Imports the Python libraries miditapyr and mido to read MIDI file data into pandas DataFrames. These can then be imported into R via reticulate. The event-based MIDI data is widened to facilitate the manipulation and plotting of note-based structures as in music21. The data frame format allows for an easy implementation of many music data manipulations.

Maintained by Urs Wilke. Last updated 1 year ago.

midi

1.7 match 8 stars 3.33 score 27 scripts

cran

RcmdrPlugin.TeachStat:R Commander Plugin for Teaching Statistical Methods

R Commander plugin for teaching statistical methods. It adds a new menu that makes it easier to teach the main concepts of the principal statistical methods.

Maintained by Manuel A. Mosquera Rodríguez. Last updated 1 year ago.

5.6 match 1.00 score

willzywiec

criticality:Modeling Fissile Material Operations in Nuclear Facilities

A collection of functions for modeling fissile material operations in nuclear facilities, based on Zywiec et al (2021) <doi:10.1016/j.ress.2020.107322>.

Maintained by William Zywiec. Last updated 2 years ago.

5.3 match 1.00 score

bpoconnor

Crosstabs.Loglinear:Cross Tabulation and Loglinear Analyses of Categorical Data

Provides 'SPSS'- and 'SAS'-like output for cross tabulations of two categorical variables (CROSSTABS) and for hierarchical loglinear analyses of two or more categorical variables (LOGLINEAR). The methods are described in Agresti (2013, ISBN:978-0-470-46363-5), Azen & Walker (2021, ISBN:9780429330308), Field (2018, ISBN:9781526440273), Norusis (2012, ISBN:978-0-321-74843-0), Nussbaum (2015, ISBN:978-1-84872-603-1), Stevens (2009, ISBN:978-0-8058-5903-4), Tabachnick & Fidell (2019, ISBN:9780134790541), and von Eye & Mun (2013, ISBN:978-1-118-14640-8).

Maintained by Brian OConnor. Last updated 2 years ago.

5.2 match 1.00 score

tmsalab

fourPNO:Bayesian 4 Parameter Item Response Model

Estimates Barton & Lord's (1981) <doi:10.1002/j.2333-8504.1981.tb01255.x> four-parameter IRT model with lower and upper asymptotes using the Bayesian formulation described by Culpepper (2016) <doi:10.1007/s11336-015-9477-6>.

Maintained by Steven Andrew Culpepper. Last updated 5 years ago.

armadillo cognitive-diagnostic-models gibbs-sampler item-response-theory rcpp rcpparmadillo openblas cpp openmp

1.9 match 1 star 2.70 score 5 scripts

windwill

cascsim:Casualty Actuarial Society Individual Claim Simulator

An open-source insurance claim simulation engine sponsored by the Casualty Actuarial Society. It generates individual insurance claims including open claims, reopened claims, incurred but not reported claims and future claims. It also includes claim data fitting functions to help set simulation assumptions. It is useful for claim-level reserving analysis. Parodi (2013) <https://www.actuaries.org.uk/documents/triangle-free-reserving-non-traditional-framework-estimating-reserves-and-reserve-uncertainty>.

Maintained by Kailan Shang. Last updated 5 years ago.

1.7 match 2.99 score 98 scripts

grafmoni

SGB:Simplicial Generalized Beta Regression

Main properties and regression procedures using a generalization of the Dirichlet distribution called Simplicial Generalized Beta distribution. It is a new distribution on the simplex (i.e. on the space of compositions or positive vectors with sum of components equal to 1). The Dirichlet distribution can be constructed from a random vector of independent Gamma variables divided by their sum. The SGB follows the same construction with generalized Gamma instead of Gamma variables. The Dirichlet exponents are supplemented by an overall shape parameter and a vector of scales. The scale vector is itself a composition and can be modeled with auxiliary variables through a log-ratio transformation. Graf, M. (2017, ISBN: 978-84-947240-0-8). See also the vignette enclosed in the package.

Maintained by Monique Graf. Last updated 1 year ago.

1.7 match 3.00 score 2 scripts

bxc147

Epi:Statistical Analysis in Epidemiology

Functions for demographic and epidemiological analysis in the Lexis diagram, i.e. register and cohort follow-up data. In particular, it covers representation, manipulation, rate estimation and simulation for multistate data through the Lexis suite of functions, which includes interfaces to the 'mstate', 'etm' and 'cmprsk' packages. It also contains functions for Age-Period-Cohort and Lee-Carter modeling, a function for interval-censored data, some useful functions for tabulation and plotting, and a number of epidemiological data sets.

Maintained by Bendix Carstensen. Last updated 2 months ago.

0.5 match 4 stars 9.65 score 708 scripts 11 dependents