reshape:Flexibly Reshape Data
Flexibly restructure and aggregate data using just two functions: melt and cast.
Maintained by Hadley Wickham. Last updated 3 years ago.
datawizard:Easy Data Wrangling and Statistical Transformations
A lightweight package to assist in key steps involved in any data analysis workflow: (1) wrangling the raw data to get it in the needed form, (2) applying preprocessing steps and statistical transformations, and (3) computing statistical summaries of data properties and distributions. It is also the data wrangling backend for packages in the 'easystats' ecosystem. References: Patil et al. (2022) <doi:10.21105/joss.04684>.
Maintained by Etienne Bacher. Last updated 10 days ago.
textshape:Tools for Reshaping Text
Tools that can be used to reshape and restructure text data.
Maintained by Tyler Rinker. Last updated 12 months ago.
splitstackshape:Stack and Reshape Datasets After Splitting Concatenated Values
Online data collection tools like Google Forms often export multiple-response questions with data concatenated in cells. The concat.split (cSplit) family of functions splits such data into separate cells. The package also includes functions to stack groups of columns and to reshape wide data, even when the data are "unbalanced"---something which reshape (from base R) does not handle, and which melt and dcast from reshape2 do not easily handle.
Maintained by Ananda Mahto. Last updated 6 years ago.
tabshiftr:Reshape Disorganised Messy Data
Helps the user to build and register schema descriptions of disorganised (messy) tables. Disorganised tables are tables that are not in a topologically coherent form, where packages such as 'tidyr' could be used for reshaping. The schema description documents the arrangement of input tables and is used to reshape them into a standardised (tidy) output format.
Maintained by Steffen Ehrmann. Last updated 30 days ago.
mets:Analysis of Multivariate Event Times
Implementation of various statistical models for multivariate event history data <doi:10.1007/s10985-013-9244-x>. Including multivariate cumulative incidence models <doi:10.1002/sim.6016>, and bivariate random effects probit models (Liability models) <doi:10.1016/j.csda.2015.01.014>. Modern methods for survival analysis, including regression modelling (Cox, Fine-Gray, Ghosh-Lin, Binomial regression) with fast computation of influence functions.
Maintained by Klaus K. Holst. Last updated 3 days ago.
reshape2:Flexibly Reshape Data: A Reboot of the Reshape Package
Flexibly restructure and aggregate data using just two functions: melt and 'dcast' (or 'acast').
Maintained by Hadley Wickham. Last updated 4 years ago.
Hmisc:Harrell Miscellaneous
Contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, simulation, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX and html code, recoding variables, caching, simplified parallel computing, encrypting and decrypting data using a safe workflow, general moving window statistical estimation, and assistance in interpreting principal component analysis.
Maintained by Frank E Harrell Jr. Last updated 9 hours ago.
reticulate:Interface to 'Python'
Interface to 'Python' modules, classes, and functions. When calling into 'Python', R data types are automatically converted to their equivalent 'Python' types. When values are returned from 'Python' to R they are converted back to R types. Compatible with all versions of 'Python' >= 2.7.
Maintained by Tomasz Kalinowski. Last updated 2 days ago.
data.table:Extension of `data.frame`
Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, friendly and fast character-separated-value read/write. Offers a natural and flexible syntax, for faster development.
Maintained by Tyson Barrett. Last updated 2 days ago.
pracma:Practical Numerical Math Functions
Provides a large number of functions from numerical analysis and linear algebra, numerical optimization, differential equations, time series, plus some well-known special mathematical functions. Uses 'MATLAB' function names where appropriate to simplify porting.
Maintained by Hans W. Borchers. Last updated 1 year ago.
dtrackr:Track your Data Pipelines
Track and document 'dplyr' data pipelines. As you filter, mutate, and join your way through a data set, 'dtrackr' seamlessly keeps track of your data flow and makes publication ready documentation of a data pipeline simple.
Maintained by Robert Challen. Last updated 5 months ago.
memisc:Management of Survey Data and Presentation of Analysis Results
An infrastructure for the management of survey data including value labels, definable missing values, recoding of variables, production of code books, and import of (subsets of) 'SPSS' and 'Stata' files is provided. Further, the package can produce tables and data frames of arbitrary descriptive statistics and (almost) publication-ready tables of regression model estimates, which can be exported to 'LaTeX' and HTML.
Maintained by Martin Elff. Last updated 11 days ago.
CVXR:Disciplined Convex Optimization
An object-oriented modeling language for disciplined convex programming (DCP) as described in Fu, Narasimhan, and Boyd (2020, <doi:10.18637/jss.v094.i14>). It allows the user to formulate convex optimization problems in a natural way following mathematical convention and DCP rules. The system analyzes the problem, verifies its convexity, converts it into a canonical form, and hands it off to an appropriate solver to obtain the solution. Interfaces to solvers on CRAN and elsewhere are provided, both commercial and open source.
Maintained by Anqi Fu. Last updated 4 months ago.
mlr3torch:Deep Learning with 'mlr3'
Deep Learning library that extends the mlr3 framework by building upon the 'torch' package. It allows users to conveniently build, train, and evaluate deep learning models without having to worry about low level details. Custom architectures can be created using the graph language defined in 'mlr3pipelines'.
Maintained by Sebastian Fischer. Last updated 1 month ago.
fastai:Interface to 'fastai'
The 'fastai' library simplifies training fast and accurate neural networks using modern best practices. It is based on research into deep learning best practices undertaken at '', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.
Maintained by Turgut Abdullayev. Last updated 11 months ago.
msSPChelpR:Helper Functions for Second Primary Cancer Analyses
A collection of helper functions for analyzing Second Primary Cancer data, including functions to reshape data, to calculate patient states and analyze cancer incidence.
Maintained by Marian Eberl. Last updated 1 year ago.
torch:Tensors and Neural Networks with 'GPU' Acceleration
Provides functionality to define and train neural networks similar to 'PyTorch' by Paszke et al (2019) <doi:10.48550/arXiv.1912.01703> but written entirely in R using the 'libtorch' library. Also supports low-level tensor operations and 'GPU' acceleration.
Maintained by Daniel Falbel. Last updated 6 days ago.
tidyfst:Tidy Verbs for Fast Data Manipulation
A toolkit of tidy data manipulation verbs with 'data.table' as the backend. Combining the merits of syntax elegance from 'dplyr' and computing performance from 'data.table', 'tidyfst' intends to provide users with state-of-the-art data manipulation tools with the least pain. This package is an extension of 'data.table'. While enjoying a tidy syntax, it also wraps combinations of efficient functions to facilitate frequently-used data operations.
Maintained by Tian-Yuan Huang. Last updated 6 months ago.
igraph:Network Analysis and Visualization
Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.
Maintained by Kirill Müller. Last updated 13 hours ago.
polars:Lightning-Fast 'DataFrame' Library
Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.
Maintained by Soren Welling. Last updated 3 days ago.
keras:R Interface to 'Keras'
Interface to 'Keras', a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.
Maintained by Tomasz Kalinowski. Last updated 11 months ago.
forcis:An R Client to Access the FORCIS Database
Provides an interface to the FORCIS database on global foraminifera distribution. This package allows users to download and handle FORCIS data. It is part of the FRB-CESAB working group FORCIS.
Maintained by Nicolas Casajus. Last updated 11 days ago.
lsr:Companion to "Learning Statistics with R"
A collection of tools intended to make introductory statistics easier to teach, including wrappers for common hypothesis tests and basic data manipulation. It accompanies Navarro, D. J. (2015). Learning Statistics with R: A Tutorial for Psychology Students and Other Beginners, Version 0.6.
Maintained by Danielle Navarro. Last updated 3 years ago.
rstatix:Pipe-Friendly Framework for Basic Statistical Tests
Provides a simple and intuitive pipe-friendly framework, coherent with the 'tidyverse' design philosophy, for performing basic statistical tests, including t-test, Wilcoxon test, ANOVA, Kruskal-Wallis and correlation analyses. The output of each test is automatically transformed into a tidy data frame to facilitate visualization. Additional functions are available for reshaping, reordering, manipulating and visualizing a correlation matrix. Functions are also included to facilitate the analysis of factorial experiments, including purely 'within-Ss' designs (repeated measures), purely 'between-Ss' designs, and mixed 'within-and-between-Ss' designs. It's also possible to compute several effect size metrics, including "eta squared" for ANOVA, "Cohen's d" for t-test and 'Cramer V' for the association between categorical variables. The package contains helper functions for identifying univariate and multivariate outliers, assessing normality and homogeneity of variances.
Maintained by Alboukadel Kassambara. Last updated 2 years ago.
cpp11armadillo:An 'Armadillo' Interface
Provides function declarations and inline function definitions that facilitate communication between R and the 'Armadillo' 'C++' library for linear algebra and scientific computing. This implementation is detailed in Vargas Sepulveda and Schneider Malamud (2024) <doi:10.48550/arXiv.2408.11074>.
Maintained by Mauricio Vargas Sepulveda. Last updated 26 days ago.
dmtools:Tools for Clinical Data Management
Checks datasets from EDC (Electronic Data Capture) systems in clinical trials. 'dmtools' reshapes your dataset into a tidy view and checks events. You can reshape the dataset and choose your target to check, for example, the laboratory reference range.
Maintained by Konstantin Ryabov. Last updated 2 years ago.
tfprobability:Interface to 'TensorFlow Probability'
Interface to 'TensorFlow Probability', a 'Python' library built on 'TensorFlow' that makes it easy to combine probabilistic models and deep learning on modern hardware ('TPU', 'GPU'). 'TensorFlow Probability' includes a wide selection of probability distributions and bijectors, probabilistic layers, variational inference, Markov chain Monte Carlo, and optimizers such as Nelder-Mead, BFGS, and SGLD.
Maintained by Tomasz Kalinowski. Last updated 3 years ago.
collapse:Advanced and Fast Data Transformation
A C/C++ based package for advanced data transformation and statistical computing in R that is extremely fast, class-agnostic, robust and programmer friendly. Core functionality includes a rich set of S3 generic grouped and weighted statistical functions for vectors, matrices and data frames, which provide efficient low-level vectorizations, OpenMP multithreading, and skip missing values by default. These are integrated with fast grouping and ordering algorithms (also callable from C), and efficient data manipulation functions. The package also provides a flexible and rigorous approach to time series and panel data in R. It further includes fast functions for common statistical procedures, detailed (grouped, weighted) summary statistics, powerful tools to work with nested data, fast data object conversions, functions for memory efficient R programming, and helpers to effectively deal with variable labels, attributes, and missing data. It is well integrated with base R classes, 'dplyr'/'tibble', 'data.table', 'sf', 'units', 'plm' (panel-series and data frames), and 'xts'/'zoo'.
Maintained by Sebastian Krantz. Last updated 6 days ago.
gcplyr:Wrangle and Analyze Growth Curve Data
Easy wrangling and model-free analysis of microbial growth curve data, as commonly output by plate readers. Tools for reshaping common plate reader outputs into 'tidy' formats and merging them with design information, making data easy to work with using 'gcplyr' and other packages. Also streamlines common growth curve processing steps, like smoothing and calculating derivatives, and facilitates model-free characterization and analysis of growth data.
Maintained by Mike Blazanin. Last updated 2 months ago.
MultiAssayExperiment:Software for the integration of multi-omics experiments in Bioconductor
Harmonize data management of multiple experimental assays performed on an overlapping set of specimens. It provides a familiar Bioconductor user experience by extending concepts from SummarizedExperiment, supporting an open-ended mix of standard data classes for individual assays, and allowing subsetting by genomic ranges or rownames. Facilities are provided for reshaping data into wide and long formats for adaptability to graphing and downstream analysis.
Maintained by Marcel Ramos. Last updated 2 months ago.
furniture:Furniture for Quantitative Scientists
Contains four main functions (i.e., four pieces of furniture): table1() which produces a well-formatted table of descriptive statistics common as Table 1 in research articles, tableC() which produces a well-formatted table of correlations, tableF() which provides frequency counts, and washer() which is helpful in cleaning up the data. These furniture-themed functions are designed to simplify common tasks in quantitative analysis. Other data summary and cleaning tools are also available.
Maintained by Tyson S. Barrett. Last updated 1 year ago.
chromatographR:Chromatographic Data Analysis Toolset
Tools for high-throughput analysis of HPLC-DAD/UV chromatograms (or similar data). Includes functions for preprocessing, alignment, peak-finding and fitting, peak-table construction, data-visualization, etc. Preprocessing and peak-table construction follow the rough formula laid out in 'alsace' (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C., 2015. <doi:10.1093/bioinformatics/btv299>). Alignment of chromatograms is available using parametric time warping (as implemented in the 'ptw' package) (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C. 2015. <doi:10.1093/bioinformatics/btv299>) or variable penalty dynamic time warping (as implemented in 'VPdtw') (Clifford, D., & Stone, G. 2012. <doi:10.18637/jss.v047.i08>). Peak-finding uses the algorithm by Tom O'Haver. Peaks are then fitted to a gaussian or exponential-gaussian hybrid peak shape using non-linear least squares (Lan, K. & Jorgenson, J. W. 2001. <doi:10.1016/S0021-9673(01)00594-5>). See the vignette for more details and suggested workflow.
Maintained by Ethan Bass. Last updated 9 days ago.
bayestestR:Understand and Describe Bayesian Models and Posterior Distributions
Provides utilities to describe posterior distributions and Bayesian models. It includes point-estimates such as Maximum A Posteriori (MAP), measures of dispersion (Highest Density Interval - HDI; Kruschke, 2015 <doi:10.1016/C2012-0-00477-2>) and indices used for null-hypothesis testing (such as ROPE percentage, pd and Bayes factors). References: Makowski et al. (2021) <doi:10.21105/joss.01541>.
Maintained by Dominique Makowski. Last updated 10 hours ago.
rsyntax:Extract Semantic Relations from Text by Querying and Reshaping Syntax
Various functions for querying and reshaping dependency trees, as for instance created with the 'spacyr' or 'udpipe' packages. This enables the automatic extraction of useful semantic relations from texts, such as quotes (who said what) and clauses (who did what). Method proposed in Van Atteveldt et al. (2017) <doi:10.1017/pan.2016.12>.
Maintained by Kasper Welbers. Last updated 3 years ago.
listarrays:A Toolbox for Working with R Arrays in a Functional Programming Style
A toolbox for R arrays. Flexibly split, bind, reshape, modify, subset and name arrays.
Maintained by Tomasz Kalinowski. Last updated 11 months ago.
parameters:Processing of Model Parameters
Utilities for processing the parameters of various statistical models. Beyond computing p values, CIs, and other indices for a wide variety of models (see list of supported models using the function 'insight::supported_models()'), this package implements features like bootstrapping or simulating of parameters and models, feature reduction (feature extraction and variable selection) as well as functions to describe data and variable characteristics (e.g. skewness, kurtosis, smoothness or distribution).
Maintained by Daniel Lüdecke. Last updated 2 days ago.
naniar:Data Structures, Summaries, and Visualisations for Missing Data
Missing values are ubiquitous in data and need to be explored and handled in the initial stages of analysis. 'naniar' provides data structures and functions that facilitate the plotting of missing values and examination of imputations. This allows missing data dependencies to be explored with minimal deviation from the common work patterns of 'ggplot2' and tidy data. The work is fully discussed at Tierney & Cook (2023) <doi:10.18637/jss.v105.i07>.
Maintained by Nicholas Tierney. Last updated 4 days ago.
cdata:Fluid Data Transformations
Supplies higher-order coordinatized data specification and fluid transform operators that include pivot and anti-pivot as special cases. The methodology is described in 'Zumel', 2018, "Fluid data reshaping with 'cdata'", <DOI:10.5281/zenodo.1173299>. This package introduces the idea of explicit control table specification of data transforms. Works on in-memory data or on remote data using 'rquery' and 'SQL' database interfaces.
Maintained by John Mount. Last updated 6 months ago.
tameDP:Import targets and PLHIV data from COP Target Setting Tool (formerly Data Pack)
Import PSNUxIM targets and PLHIV data from COP Data Pack. The purpose is to make the data tidy and more usable than their current structure in the Excel data packs.
Maintained by Aaron Chafetz. Last updated 1 year ago.
panelr:Regression Models and Utilities for Repeated Measures and Panel Data
Provides an object type and associated tools for storing and wrangling panel data. Implements several methods for creating regression models that take advantage of the unique aspects of panel data. Among other capabilities, automates the "within-between" (also known as "between-within" and "hybrid") panel regression specification that combines the desirable aspects of both fixed effects and random effects econometric models and fits them as multilevel models (Allison, 2009 <doi:10.4135/9781412993869.d33>; Bell & Jones, 2015 <doi:10.1017/psrm.2014.7>). These models can also be estimated via generalized estimating equations (GEE; McNeish, 2019 <doi:10.1080/00273171.2019.1602504>) and Bayesian estimation is (optionally) supported via 'Stan'. Supports estimation of asymmetric effects models via first differences (Allison, 2019 <doi:10.1177/2378023119826441>) as well as a generalized linear model extension thereof using GEE.
Maintained by Jacob A. Long. Last updated 1 year ago.
beezdiscounting:Behavioral Economic Easy Discounting
Facilitates some of the analyses performed in studies of behavioral economic discounting. The package supports scoring of the 27-Item Monetary Choice Questionnaire (see Kaplan et al., 2016; <doi:10.1007/s40614-016-0070-9>), calculating k values (Mazur's simple hyperbolic and exponential) using nonlinear regression, calculating various Area Under the Curve (AUC) measures, plotting regression curves for both fit-to-group and two-stage approaches, checking for unsystematic discounting (Johnson & Bickel, 2008; <doi:10.1037/1064-1297.16.3.264>) and scoring of the minute discounting task (see Koffarnus & Bickel, 2014; <doi:10.1037/a0035973>) using the Qualtrics 5-trial discounting template (see the Qualtrics Minute Discounting User Guide; <doi:10.13140/RG.2.2.26495.79527>), which is also available as a .qsf file in this package.
Maintained by Brent A. Kaplan. Last updated 2 months ago.
lessR:Less Code, More Results
Each function replaces multiple standard R functions. For example, two function calls, Read() and CountAll(), generate summary statistics for all variables in the data frame, plus histograms and bar charts as appropriate. Other functions provide for summary statistics via pivot tables, a comprehensive regression analysis, ANOVA and t-test, visualizations including the Violin/Box/Scatter plot for a numerical variable, bar chart, histogram, box plot, density curves, calibrated power curve, reading multiple data formats with the same function call, variable labels, time series with aggregation and forecasting, color themes, and Trellis (facet) graphics. Also includes a confirmatory factor analysis of multiple indicator measurement models, pedagogical routines for data simulation such as for the Central Limit Theorem, generation and rendering of regression instructions for interpretative output, and interactive visualizations.
Maintained by David W. Gerbing. Last updated 1 month ago.
mrf2d:Markov Random Field Models for Image Analysis
Model fitting, sampling and visualization for the (Hidden) Markov Random Field model with pairwise interactions and general interaction structure from Freguglia, Garcia & Bicas (2020) <doi:10.1002/env.2613>, which has many popular models used in 2-dimensional lattices as particular cases, like the Ising Model and Potts Model. A complete manuscript describing the package is available in Freguglia & Garcia (2022) <doi:10.18637/jss.v101.i08>.
Maintained by Victor Freguglia. Last updated 2 years ago.
meteoland:Landscape Meteorology Tools
Functions to estimate weather variables at any position of a landscape [De Caceres et al. (2018) <doi:10.1016/j.envsoft.2018.08.003>].
Maintained by Miquel De Cáceres. Last updated 2 months ago.
gaston:Genetic Data Handling (QC, GRM, LD, PCA) & Linear Mixed Models
Manipulation of genetic data (SNPs). Computation of GRM and dominance matrix, LD, heritability with efficient algorithms for linear mixed model (AIREML). Dandine et al <doi:10.1159/000488519>.
Maintained by Hervé Perdry. Last updated 1 year ago.
ideanet:Integrating Data Exchange and Analysis for Networks ('ideanet')
A suite of convenient tools for social network analysis geared toward students, entry-level users, and non-expert practitioners. ‘ideanet’ features unique functions for the processing and measurement of sociocentric and egocentric network data. These functions automatically generate node- and system-level measures commonly used in the analysis of these types of networks. Outputs from these functions maximize the ability of novice users to employ network measurements in further analyses while making all users less prone to common data analytic errors. Additionally, ‘ideanet’ features an R Shiny graphic user interface that allows novices to explore network data with minimal need for coding.
Maintained by Tom Wolff. Last updated 3 days ago.
keras3:R Interface to 'Keras'
Interface to 'Keras', a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.
Maintained by Tomasz Kalinowski. Last updated 4 days ago.
R.utils:Various Programming Utilities
Utility functions useful when programming and developing R packages.
Maintained by Henrik Bengtsson. Last updated 1 year ago.
RcmdrMisc:R Commander Miscellaneous Functions
Various statistical, graphics, and data-management functions used by the Rcmdr package in the R Commander GUI for R.
Maintained by John Fox. Last updated 1 year ago.
nc:Named Capture to Data Tables
User-friendly functions for extracting a data table (row for each match, column for each group) from non-tabular text data using regular expressions, and for melting columns that match a regular expression. Patterns are defined using a readable syntax that makes it easy to build complex patterns in terms of simpler, re-usable sub-patterns. Named R arguments are translated to column names in the output; capture groups without names are used internally in order to provide a standard interface to three regular expression 'C' libraries ('PCRE', 'RE2', 'ICU'). Output can also include numeric columns via user-specified type conversion functions.
Maintained by Toby Hocking. Last updated 2 months ago.
vegclust:Fuzzy Clustering of Vegetation Data
A set of functions to: (1) perform fuzzy clustering of vegetation data (De Caceres et al, 2010) <doi:10.1111/j.1654-1103.2010.01211.x>; (2) assess ecological community similarity on the basis of structure and composition (De Caceres et al, 2013) <doi:10.1111/2041-210X.12116>.
Maintained by Miquel De Cáceres. Last updated 8 months ago.
dataMojo:Reshape Data Table
A grammar of data manipulation with 'data.table', providing a consistent series of utility functions that help you solve the most common data manipulation challenges.
Maintained by Jiena McLellan. Last updated 2 years ago.
HDF5Array:HDF5 datasets as array-like objects in R
The HDF5Array package is an HDF5 backend for DelayedArray objects. It implements the HDF5Array, H5SparseMatrix, H5ADMatrix, and TENxMatrix classes, 4 convenient and memory-efficient array-like containers for representing and manipulating either: (1) a conventional (a.k.a. dense) HDF5 dataset, (2) an HDF5 sparse matrix (stored in CSR/CSC/Yale format), (3) the central matrix of an h5ad file (or any matrix in the /layers group), or (4) a 10x Genomics sparse matrix. All these containers are DelayedArray extensions and thus support all operations (delayed or block-processed) supported by DelayedArray objects.
Maintained by Hervé Pagès. Last updated 26 days ago.
TimeSpaceAnalysis:Statistical tools for time-space analysis
Use Geometric Data Analysis approaches (e.g. MCA or MFA), time pattern analysis (see "time sequence clustering") and place chronology analysis (see "time geography").
Maintained by Fabian Mundt. Last updated 7 days ago.
skimr:Compact and Flexible Summaries of Data
A simple to use summary function that can be used with pipes and displays nicely in the console. The default summary statistics may be modified by the user as can the default formatting. Support for data frames and vectors is included, and users can implement their own skim methods for specific object types as described in a vignette. Default summaries include support for inline spark graphs. Instructions for managing these on specific operating systems are given in the "Using skimr" vignette and the README.
Maintained by Elin Waring. Last updated 2 months ago.
fda:Functional Data Analysis
These functions were developed to support functional data analysis as described in Ramsay, J. O. and Silverman, B. W. (2005) Functional Data Analysis. New York: Springer and in Ramsay, J. O., Hooker, Giles, and Graves, Spencer (2009). Functional Data Analysis with R and Matlab (Springer). The package includes data sets and script files that work through many examples, including all but one of the 76 figures in this latter book. Matlab versions are available by ftp.
Maintained by James Ramsay. Last updated 4 months ago.
forestry:Reshape Data Tree
'forestry' provides a series of utility functions to help reshape the hierarchy of a data tree and reform its structure.
Maintained by Jiena McLellan. Last updated 5 years ago.
HH:Statistical Analysis and Data Display: Heiberger and Holland
Support software for Statistical Analysis and Data Display (Second Edition, Springer, ISBN 978-1-4939-2121-8, 2015) and (First Edition, Springer, ISBN 0-387-40270-5, 2004) by Richard M. Heiberger and Burt Holland. This contemporary presentation of statistical methods features extensive use of graphical displays for exploring data and for displaying the analysis. The second edition includes redesigned graphics and additional chapters. The authors emphasize how to construct and interpret graphs, discuss principles of graphical design, and show how accompanying traditional tabular results are used to confirm the visual impressions derived directly from the graphs. Many of the graphical formats are novel and appear here for the first time in print. All chapters have exercises. All functions introduced in the book are in the package. R code for all examples, both graphs and tables, in the book is included in the scripts directory of the package.
Maintained by Richard M. Heiberger. Last updated 1 month ago.
matlab:'MATLAB' Emulation Package
Emulate 'MATLAB' code using 'R'.
Maintained by P. Roebuck. Last updated 9 months ago.
tsbox:Class-Agnostic Time Series
Time series toolkit with identical behavior for all time series classes: 'ts','xts', 'data.frame', 'data.table', 'tibble', 'zoo', 'timeSeries', 'tsibble', 'tis' or 'irts'. Also converts reliably between these classes.
Maintained by Christoph Sax. Last updated 5 months ago.
bipartite:Visualising Bipartite Networks and Calculating Some (Ecological) Indices
Functions to visualise webs and calculate a series of indices commonly used to describe patterns in (ecological) webs. It focuses on webs consisting of only two levels (bipartite), e.g. pollination webs or predator-prey-webs. Visualisation is important to get an idea of what we are actually looking at, while the indices summarise different aspects of the web's topology.
Maintained by Carsten F. Dormann. Last updated 7 days ago.
psychmeta:Psychometric Meta-Analysis Toolkit
Tools for computing bare-bones and psychometric meta-analyses and for generating psychometric data for use in meta-analysis simulations. Supports bare-bones, individual-correction, and artifact-distribution methods for meta-analyzing correlations and d values. Includes tools for converting effect sizes, computing sporadic artifact corrections, reshaping meta-analytic databases, computing multivariate corrections for range variation, and more.
Maintained by Jeffrey A. Dahlke. Last updated 9 months ago.
inti:Tools and Statistical Procedures in Plant Science
The 'inti' package is part of the 'inkaverse' project for developing different procedures and tools used in plant science and experimental designs. The main aim of the package is to support researchers during the planning of experiments and data collection (tarpuy()), data analysis and graphics (yupana()), and technical writing.
Maintained by Flavio Lozano-Isla. Last updated 2 days ago.
matlab2r:Translation Layer from MATLAB to R
Allows users familiar with MATLAB to use MATLAB-named functions in R. Several basic MATLAB functions are written in this package to mimic the behavior of their original counterparts, with more to come as this package grows.
Maintained by Waldir Leoncio. Last updated 2 years ago.
Rcmdr:R Commander
A platform-independent basic-statistics GUI (graphical user interface) for R, based on the tcltk package.
Maintained by John Fox. Last updated 5 months ago.
mulea:Enrichment Analysis Using Multiple Ontologies and False Discovery Rate
Background - Traditional gene set enrichment analyses are typically limited to a few ontologies and do not account for the interdependence of gene sets or terms, resulting in overcorrected p-values. To address these challenges, we introduce mulea, an R package offering comprehensive overrepresentation and functional enrichment analysis. Results - mulea employs a progressive empirical false discovery rate (eFDR) method, specifically designed for interconnected biological data, to accurately identify significant terms within diverse ontologies. mulea expands beyond traditional tools by incorporating a wide range of ontologies, encompassing Gene Ontology, pathways, regulatory elements, genomic locations, and protein domains. This flexibility enables researchers to tailor enrichment analysis to their specific questions, such as identifying enriched transcriptional regulators in gene expression data or overrepresented protein domains in protein sets. To facilitate seamless analysis, mulea provides gene sets (in standardised GMT format) for 27 model organisms, covering 22 ontology types from 16 databases and various identifiers resulting in almost 900 files. Additionally, the muleaData ExperimentData Bioconductor package simplifies access to these pre-defined ontologies. Finally, mulea's architecture allows for easy integration of user-defined ontologies, or GMT files from external sources (e.g., MSigDB or Enrichr), expanding its applicability across diverse research areas. Conclusions - mulea is distributed as a CRAN R package. It offers researchers a powerful and flexible toolkit for functional enrichment analysis, addressing limitations of traditional tools with its progressive eFDR and by supporting a variety of ontologies. Overall, mulea fosters the exploration of diverse biological questions across various model organisms.
Maintained by Tamas Stirling. Last updated 3 months ago.
sdtmval:Validate SDTM Domains
Provides a set of tools to assist statistical programmers in validating Study Data Tabulation Model (SDTM) domain data sets. Statistical programmers are required to validate that a SDTM data set domain has been programmed correctly, per the SDTM Implementation Guide (SDTMIG) by 'CDISC' (<>), study specification, and study protocol using a process called double programming. Double programming involves two different programmers independently converting the raw electronic data cut (EDC) data into a SDTM domain data table and comparing their results to ensure accurate standardization of the data. One of these attempts is termed 'production' and the other 'validation'. Generally, production runs are the official programs for submittals and these are written in 'SAS'. Validation runs can be programmed in another language, in this case 'R'.
Maintained by Stephen Knapp. Last updated 1 year ago.
BatchGetSymbols:Downloads and Organizes Financial Data for Multiple Tickers
Makes it easy to download financial data from Yahoo Finance <>.
Maintained by Marcelo Perlin. Last updated 3 years ago.
superb:Summary Plots with Adjusted Error Bars
Computes standard error and confidence interval of various descriptive statistics under various designs and sampling schemes. The main function, superb(), returns a plot. It can also be used to obtain a dataframe with the statistics and their precision intervals so that other plotting environments (e.g., Excel) can be used. See Cousineau and colleagues (2021) <doi:10.1177/25152459211035109> or Cousineau (2017) <doi:10.5709/acp-0214-z> for a review as well as Cousineau (2005) <doi:10.20982/tqmp.01.1.p042>, Morey (2008) <doi:10.20982/tqmp.04.2.p061>, Baguley (2012) <doi:10.3758/s13428-011-0123-7>, Cousineau & Laurencelle (2016) <doi:10.1037/met0000055>, Cousineau & O'Brien (2014) <doi:10.3758/s13428-013-0441-z>, Calderini & Harding <doi:10.20982/tqmp.15.1.p001> for specific references.
Maintained by Denis Cousineau. Last updated 2 months ago.
cholera:Amend, Augment and Aid Analysis of John Snow's Cholera Map
Amends errors, augments data and aids analysis of John Snow's map of the 1854 London cholera outbreak.
Maintained by lindbrook. Last updated 1 day ago.
mintyr:Streamlined Data Processing Tools for Genomic Selection
A toolkit for genomic selection in animal breeding with emphasis on multi-breed and multi-trait nested grouping operations. Streamlines iterative analysis workflows when working with the 'ASReml-R' package. Includes utility functions for phenotypic data processing commonly used by animal breeders.
Maintained by Guo Meng. Last updated 3 months ago.
gsalib:Utility Functions for 'GATK'
Provides utility functions used by the Genome Analysis Toolkit ('GATK') to load tables and plot data. The 'GATK' is a toolkit for variant discovery in high-throughput sequencing data.
Maintained by Louis Bergelson. Last updated 1 year ago.
pixiedust:Tables so Beautifully Fine-Tuned You Will Believe It's Magic
The introduction of the 'broom' package has made converting model objects into data frames as simple as a single function. While the 'broom' package focuses on providing tidy data frames that can be used in advanced analysis, it deliberately stops short of providing functionality for reporting models in publication-ready tables. 'pixiedust' provides this functionality with a programming interface intended to be similar to 'ggplot2's system of layers with fine tuned control over each cell of the table. Options for output include printing to the console and to the common markdown formats (markdown, HTML, and LaTeX). With a little 'pixiedust' (and happy thoughts) tables can really fly.
Maintained by Benjamin Nutter. Last updated 1 year ago.
gsw:Gibbs Sea Water Functions
Provides an interface to the Gibbs 'SeaWater' ('TEOS-10') C library, version 3.06-16-0 (commit '657216dd4f5ea079b5f0e021a4163e2d26893371', dated 2022-10-11, available at <>), which stems from 'Matlab' and other code written by members of Working Group 127 of 'SCOR'/'IAPSO' (Scientific Committee on Oceanic Research / International Association for the Physical Sciences of the Oceans).
Maintained by Dan Kelley. Last updated 8 days ago.
threeBrain:Your Advanced 3D Brain Visualization
A fast, interactive, cross-platform, and easy-to-share 'WebGL'-based 3D brain viewer that visualizes 'FreeSurfer' and/or 'AFNI/SUMA' surfaces. The viewer widget can be either standalone or embedded into 'R-shiny' applications. The standalone version only requires a web browser with 'WebGL2' support (for example, 'Chrome', 'Firefox', 'Safari'), and can be inserted into any website. The 'R-shiny' support allows the 3D viewer to be dynamically generated from reactive user inputs. Please check the publication by Wang, Magnotti, Zhang, and Beauchamp (2023, <doi:10.1523/ENEURO.0328-23.2023>) for electrode localization. This viewer has been fully adopted by 'RAVE' <>, an interactive toolbox to analyze 'iEEG' data by Magnotti, Wang, and Beauchamp (2020, <doi:10.1016/j.neuroimage.2020.117341>). Please check citation("threeBrain") for details.
Maintained by Zhengjia Wang. Last updated 2 days ago.
wql:Exploring Water Quality Monitoring Data
Functions to assist in the processing and exploration of data from environmental monitoring programs. The package name stands for "water quality" and reflects the original focus on time series data for physical and chemical properties of water, as well as the biota. Intended for programs that sample approximately monthly, quarterly or annually at discrete stations, a feature of many legacy data sets. Most of the functions should be useful for analysis of similar-frequency time series regardless of the subject matter.
Maintained by Jemma Stachelek. Last updated 2 months ago.
gophr:Utility functions related to working with the MER Structured Dataset
This package contains a number of functions for working with the PEPFAR MSD.
Maintained by Aaron Chafetz. Last updated 4 months ago.
myClim:Microclimatic Data Processing
Handles microclimatic data in R. The 'myClim' workflow begins with reading data, primarily from microclimatic dataloggers, but it can also read meteorological station data from files. Cleaning the time step, setting the time zone, and collecting metadata are the next steps of the workflow. With 'myClim' tools one can crop, join, downscale, and convert microclimatic data formats, sort them into localities, request descriptive characteristics and compute microclimatic variables. Handy plotting functions are provided with smart defaults.
Maintained by Vojtěch Kalčík. Last updated 13 days ago.
DCEmgmt:DCE Data Reshaping and Processing
Prepare the results of a DCE to be analysed through choice models. 'DCEmgmt' reshapes DCE data from wide to long format considering the special characteristics of a DCE. 'DCEmgmt' includes the function 'DCEestm' which estimates choice models once the database has been reshaped with 'DCEmgmt'.
Maintained by Daniel Perez-Troncoso. Last updated 3 years ago.
cvTools:Cross-validation tools for regression models
Tools that allow developers to write functions for cross-validation with minimal programming effort and assist users with model selection.
Maintained by Andreas Alfons. Last updated 13 years ago.
Ymisc:Miscellaneous Functions
The author's personal R package that contains miscellaneous functions. The current version of the package contains miscellaneous functions for brain data to compute Asymmetry Index (AI) and bilateral (L+R) measures and reshape the data.
Maintained by Yoo Ri Hwang. Last updated 1 year ago.
drcSeedGerm:Utilities for Data Analyses in Seed Germination/Emergence Assays
Utility functions to be used to analyse datasets obtained from seed germination/emergence assays. Fits several types of seed germination/emergence models, including those reported in Onofri et al. (2018) "Hydrothermal-time-to-event models for seed germination", European Journal of Agronomy, 101, 129-139 <doi:10.1016/j.eja.2018.08.011>. Contains several datasets for practicing.
Maintained by Andrea Onofri. Last updated 2 months ago.
madness:Automatic Differentiation of Multivariate Operations
An object that supports automatic differentiation of matrix- and multidimensional-valued functions with respect to multidimensional independent variables. Automatic differentiation is via 'forward accumulation'.
Maintained by Steven E. Pav. Last updated 4 years ago.
rtemis:Machine Learning and Visualization
Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.
Maintained by E.D. Gennatas. Last updated 1 month ago.
isobar:Analysis and quantitation of isobarically tagged MSMS proteomics data
isobar provides methods for preprocessing, normalization, and report generation for the analysis of quantitative mass spectrometry proteomics data labeled with isobaric tags, such as iTRAQ and TMT. Features modules for integrating and validating PTM-centric datasets (isobar-PTM). More information on
Maintained by Florian P Breitwieser. Last updated 5 months ago.
soiltexture:Functions for Soil Texture Plot, Classification and Transformation
"The Soil Texture Wizard" is a set of R functions designed to produce texture triangles (also called texture plots, texture diagrams, texture ternary plots), and to classify and transform soil texture data. These functions allow plotting virtually any soil texture triangle (classification) in any triangle geometry (isosceles, right-angled triangles, etc.). This set of functions is expected to be useful to people using soil texture data from different soil texture classifications or different particle size systems. Many (> 15) texture triangles from all around the world are predefined in the package. A simple text-based graphical user interface is provided: soiltexture_gui().
Maintained by Julien Moeys. Last updated 1 year ago.
multiGSEA:Combining GSEA-based pathway enrichment with multi omics data integration
Extracted features from pathways derived from 8 different databases (KEGG, Reactome, Biocarta, etc.) can be used on transcriptomic, proteomic, and/or metabolomic level to calculate a combined GSEA-based enrichment score.
Maintained by Sebastian Canzler. Last updated 2 months ago.
MicSim:Performing Continuous-Time Microsimulation
This toolkit allows performing continuous-time microsimulation for a wide range of life science (demography, social sciences, epidemiology) applications. Individual life-courses are specified by a continuous-time multi-state model.
Maintained by Sabine Zinn. Last updated 2 years ago.
KOFM:Test the Kronecker Product Structure in Tensor Factor Models
Tests whether a tensor time series following a Tucker-decomposition factor model has a Kronecker product structure. Supplementary functions for tensor reshape and its reversal are also included.
Maintained by Zetai Cen. Last updated 2 months ago.
h5mread:A fast HDF5 reader
The main function in the h5mread package is h5mread(), which allows reading arbitrary data from an HDF5 dataset into R, similarly to what the h5read() function from the rhdf5 package does. In the case of h5mread(), the implementation has been optimized to make it as fast and memory-efficient as possible.
Maintained by Hervé Pagès. Last updated 2 months ago.
narray:Subset- And Name-Aware Array Utility Functions
Stacking arrays according to dimension names, subset-aware splitting and mapping of functions, intersecting along arbitrary dimensions, converting to and from data.frames, and many other helper functions.
Maintained by Michael Schubert. Last updated 2 months ago.
acledR:Manipulate ACLED Data
Tools for working with data from ACLED (Armed Conflict Location and Event Data). Functions include simplified access to ACLED's API (<>), methods for keeping local versions of ACLED data up-to-date, and functions for common ACLED data transformations.
Maintained by Trey Billing. Last updated 18 days ago.
mousetrap:Process and Analyze Mouse-Tracking Data
Mouse-tracking, the analysis of mouse movements in computerized experiments, is a method that is becoming increasingly popular in the cognitive sciences. The mousetrap package offers functions for importing, preprocessing, analyzing, aggregating, and visualizing mouse-tracking data. An introduction to mouse-tracking analyses using mousetrap can be found in Wulff, Kieslich, Henninger, Haslbeck, & Schulte-Mecklenbeck (2023) <doi:10.31234/> (preprint: <>).
Maintained by Pascal J. Kieslich. Last updated 1 year ago.
mzinspectr:Read and Analyze Mass Spectrometry Alignment Files
A few functions for analyzing MS-DIAL alignments in R. Includes functions for feature normalization, subtraction of blanks, and mass library (msp) search.
Maintained by Ethan Bass. Last updated 5 months ago.
eatGADS:Data Management of Large Hierarchical Data
Import 'SPSS' data, handle and change 'SPSS' meta data, store and access large hierarchical data in 'SQLite' data bases.
Maintained by Benjamin Becker. Last updated 23 days ago.
artMS:Analytical R tools for Mass Spectrometry
artMS provides a set of tools for the analysis of proteomics label-free datasets. It takes as input the MaxQuant search result output (evidence.txt file) and performs quality control, relative quantification using MSstats, downstream analysis and integration. artMS also provides a set of functions to re-format the output and make it compatible with other analytical tools, including SAINTq, SAINTexpress, Phosfate, and PHOTON.
Maintained by David Jimenez-Morales. Last updated 5 months ago.
terms:T-matrix for Electromagnetic Radiation with Multiple Scatterers
A set of Fortran modules/routines for T-matrix-based calculations of light scattering by clusters of individual scatterers.
Maintained by Baptiste Auguié. Last updated 7 months ago.
fsbrain:Managing and Visualizing Brain Surface Data
Provides high-level access to neuroimaging data from standard software packages like 'FreeSurfer' <> on the level of subjects and groups. Load morphometry data, surfaces and brain parcellations based on atlases. Mask data using labels, load data for specific atlas regions only, and visualize data and statistical results directly in 'R'.
Maintained by Tim Schäfer. Last updated 4 months ago.
simcausal:Simulating Longitudinal Data with Causal Inference Applications
A flexible tool for simulating complex longitudinal data using structural equations, with emphasis on problems in causal inference. Specify interventions and simulate from intervened data generating distributions. Define and evaluate treatment-specific means, the average treatment effects and coefficients from working marginal structural models. User interface designed to facilitate the conduct of transparent and reproducible simulation studies, and allows concise expression of complex functional dependencies for a large number of time-varying nodes. See the package vignette for more information, documentation and examples.
Maintained by Oleg Sofrygin. Last updated 8 months ago.
PP:Person Parameter Estimation
The PP package includes estimation of (MLE, WLE, MAP, EAP, ROBUST) person parameters for the 1,2,3,4-PL model and the GPCM (generalized partial credit model). The parameters are estimated under the assumption that the item parameters are known and fixed. The package is useful e.g. in the case that items from an item pool / item bank with known item parameters are administered to a new population of test-takers and an ability estimation for every test-taker is needed.
Maintained by Jan Steinfeld. Last updated 2 years ago.
wavethresh:Wavelets Statistics and Transforms
Performs 1, 2 and 3D real and complex-valued wavelet transforms, nondecimated transforms, wavelet packet transforms, nondecimated wavelet packet transforms, multiple wavelet transforms, complex-valued wavelet transforms, wavelet shrinkage for various kinds of data, locally stationary wavelet time series, nonstationary multiscale transfer function modeling, density estimation.
Maintained by Guy Nason. Last updated 7 months ago.
psychTools:Tools to Accompany the 'psych' Package for Psychological Research
Support functions, data sets, and vignettes for the 'psych' package. Contains several of the biggest data sets for the 'psych' package as well as four vignettes. A few helper functions for file manipulation are included as well. For more information, see the <> web page.
Maintained by William Revelle. Last updated 12 months ago.
MultiRNAflow:An R package for integrated analysis of temporal RNA-seq data with multiple biological conditions
Our R package MultiRNAflow provides an easy-to-use unified framework for automatically performing both unsupervised and supervised (DE) analyses of datasets with an arbitrary number of biological conditions and time points. In particular, our code performs an in-depth downstream analysis of DE information, e.g. identifying temporal patterns across biological conditions and DE genes which are specific to a biological condition at each time point.
Maintained by Rodolphe Loubaton. Last updated 5 months ago.
simulist:Simulate Disease Outbreak Line List and Contacts Data
Tools to simulate realistic raw case data for an epidemic in the form of line lists and contacts using a branching process. Simulated outbreaks are parameterised with epidemiological parameters and can have age-structured populations, age-stratified hospitalisation and death risk and time-varying case fatality risk.
Maintained by Joshua W. Lambert. Last updated 3 days ago.
eatTools:Miscellaneous Functions for the Analysis of Educational Assessments
Miscellaneous functions for data cleaning and data analysis of educational assessments. Includes functions for descriptive analyses, character vector manipulations and weighted statistics. Mainly a lightweight dependency for the packages 'eatRep', 'eatGADS', 'eatPrep' and 'eatModel' (which will be subsequently submitted to 'CRAN'). The function for defining (weighted) contrasts in weighted effect coding refers to te Grotenhuis et al. (2017) <doi:10.1007/s00038-016-0901-1>. Functions for weighted statistics refer to Wolter (2007) <doi:10.1007/978-0-387-35099-8>.
Maintained by Sebastian Weirich. Last updated 3 months ago.
DGM:Dynamic Graphical Models
Dynamic graphical models for multivariate time series data to estimate directed dynamic networks in functional magnetic resonance imaging (fMRI), see Schwab et al. (2017) <doi:10.1016/j.neuroimage.2018.03.074>.
Maintained by Simon Schwab. Last updated 3 years ago.
iNZightTools:Tools for 'iNZight'
Provides a collection of wrapper functions for common variable and dataset manipulation workflows primarily used by 'iNZight', a graphical user interface providing easy exploration and visualisation of data for students of statistics, available in both desktop and online versions. Additionally, many of the functions return the 'tidyverse' code used to obtain the result in an effort to bridge the gap between GUI and coding.
Maintained by Tom Elliott. Last updated 3 months ago.
biogrowth:Modelling of Population Growth
Modelling of population growth under static and dynamic environmental conditions. Includes functions for model fitting and making predictions under isothermal and dynamic conditions. The methods (algorithms & models) are based on predictive microbiology (see Perez-Rodriguez and Valero (2012, ISBN:978-1-4614-5519-6)).
Maintained by Alberto Garre. Last updated 2 days ago.
spiky:Spike-in calibration for cell-free MeDIP
spiky implements methods and model generation for cfMeDIP (cell-free methylated DNA immunoprecipitation) with spike-in controls. CfMeDIP is an enrichment protocol which avoids destructive conversion of scarce template, making it ideal as a "liquid biopsy," but creating certain challenges in comparing results across specimens, subjects, and experiments. The use of synthetic spike-in standard oligos allows diagnostics performed with cfMeDIP to quantitatively compare samples across subjects, experiments, and time points in both relative and absolute terms.
Maintained by Tim Triche. Last updated 5 months ago.
mand:Multivariate Analysis for Neuroimaging Data
Several functions can be used to analyze neuroimaging data using multivariate methods based on the 'msma' package. Includes the functions used in the book "Multivariate Analysis for Neuroimaging Data" (2021, ISBN-13: 978-0367255329).
Maintained by Atsushi Kawaguchi. Last updated 2 years ago.
neopolars:R Bindings for the 'polars' Rust Library
Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.
Maintained by Tatsuya Shima. Last updated 1 day ago.
perry:Resampling-Based Prediction Error Estimation for Regression Models
Tools that allow developers to write functions for prediction error estimation with minimal programming effort and assist users with model selection in regression problems.
Maintained by Andreas Alfons. Last updated 3 years ago.
DoE.base:Full Factorials, Orthogonal Arrays and Base Utilities for DoE Packages
Creates full factorial experimental designs and designs based on orthogonal arrays for (industrial) experiments. Provides diverse quality criteria. Provides utility functions for the class design, which is also used by other packages for designed experiments.
Maintained by Ulrike Groemping. Last updated 1 year ago.
sift:Intelligently Peruse Data
Facilitate extraction of key information from common datasets.
Maintained by Scott McKenzie. Last updated 4 years ago.
quest:Prepare Questionnaire Data for Analysis
Offers a suite of functions to prepare questionnaire data for analysis (perhaps other types of data as well). By data preparation, I mean data analytic tasks to get your raw data ready for statistical modeling (e.g., regression). There are functions to investigate missing data, reshape data, validate responses, recode variables, score questionnaires, center variables, aggregate by groups, shift scores (i.e., leads or lags), etc. It provides functions for both single level and multilevel (i.e., grouped) data. With a few exceptions (e.g., ncases()), functions without an "s" at the end of their primary word (e.g., center_by()) act on atomic vectors, while functions with an "s" at the end of their primary word (e.g., centers_by()) act on multiple columns of a data.frame.
Maintained by David Disabato. Last updated 1 year ago.
m61r:Package About Data Manipulation in Pure Base R
Data manipulation in one package and in base R. Minimal. No dependencies. 'dplyr' and 'tidyr'-like in one place. Nothing other than base R is used to build the package.
Maintained by Jean-Marie Lepioufle. Last updated 5 months ago.
icdcomorbid:Mapping ICD Codes to Comorbidity
Provides tools for mapping International Classification of Diseases codes to comorbidity, enabling the identification and analysis of various medical conditions within healthcare data.
Maintained by April Nguyen. Last updated 7 months ago.
austraits:Helpful functions to access the AusTraits database and wrangle data from other databases
`austraits` allows users to **access, explore and wrangle data** from relational databases. It is also an R interface to AusTraits, the Australian plant trait database. This package contains functions for joining data from various tables, filtering to specific records, combining multiple databases and visualising the distribution of the data. We expect this package will assist users in working with `` databases.
Maintained by Fonti Kar. Last updated 2 months ago.
drcte:Statistical Approaches for Time-to-Event Data in Agriculture
A specific and comprehensive framework for the analyses of time-to-event data in agriculture. Fit non-parametric and parametric time-to-event models. Compare time-to-event curves for different experimental groups. Plots and other displays. It is particularly tailored to the analyses of data from germination and emergence assays. The methods are described in Onofri et al. (2022) "A unified framework for the analysis of germination, emergence, and other time-to-event data in weed science", Weed Science, 70, 259-271 <doi:10.1017/wsc.2022.8>.
Maintained by Andrea Onofri. Last updated 1 year ago.
DistatisR:DiSTATIS Three Way Metric Multidimensional Scaling
Implement DiSTATIS and CovSTATIS (three-way multidimensional scaling). DiSTATIS and CovSTATIS are used to analyze multiple distance/covariance matrices collected on the same set of observations. These methods are based on Abdi, H., Williams, L.J., Valentin, D., & Bennani-Dosse, M. (2012) <doi:10.1002/wics.198>.
Maintained by Herve Abdi. Last updated 1 year ago.
prettyR:Pretty Descriptive Stats
Functions for conventionally formatting descriptive stats, reshaping data frames and formatting R output as HTML.
Maintained by Jim Lemon. Last updated 6 years ago.
figma:Web Client/Wrapper to the 'Figma API'
An easy-to-use web client/wrapper for the 'Figma API' <>. It allows you to bring all data from a 'Figma' file to your 'R' session. This includes the data of all objects that you have drawn in this file, and their respective canvas/page metadata.
Maintained by Pedro Faria. Last updated 2 years ago.
psidread:Streamline Building Panel Data from Panel Study of Income Dynamics ('PSID') Raw Files
Streamline the management, creation, and formatting of panel data from the Panel Study of Income Dynamics ('PSID') <> using this user-friendly tool. Simply define variable names and input code book details directly from the 'PSID' official website, and this toolbox will efficiently facilitate the data preparation process, transforming raw 'PSID' files into a well-organized format ready for further analysis.
Maintained by Shuyi Qiu. Last updated 1 year ago.
novelqualcodes:Visualise the Path to a Stopping Point in Qualitative Interviews Based on Novel Codes
In semi-structured interviews that use the 'framework' method, it is not always clear how refinements to interview questions affect the decision of when to stop interviews. The trend of 'novel' and 'duplicate' interview codes (novel codes are information that other interviewees have not previously mentioned) provides insight into the richness of qualitative information. This package provides tools to visualise when refinements occur and how that affects the trends of novel and duplicate codes. These visualisations, when used progressively as new interviews are finished, can help the researcher to decide on a stopping point for their interviews.
Maintained by Desi Quintans. Last updated 1 year ago.
stablespec:Stable Specification Search in Structural Equation Models
An exploratory and heuristic approach for specification search in Structural Equation Modeling. The basic idea is to subsample the original data and then search for optimal models on each subset. Optimality is defined through two objectives: model fit and parsimony. As these objectives are conflicting, we apply a multi-objective optimization method, specifically NSGA-II, to obtain optimal models for the whole range of model complexities. From these optimal models, we consider only the relevant model specifications (structures), i.e., those that are both stable (occur frequently) and parsimonious, and use those to infer a causal model.
Maintained by Ridho Rahmadi. Last updated 8 years ago.
twoway:Analysis of Two-Way Tables
Carries out analyses of two-way tables with one observation per cell, together with graphical displays for an additive fit and a diagnostic plot for removable 'non-additivity' via a power transformation of the response. It implements Tukey's Exploratory Data Analysis (1973) <ISBN: 978-0201076165> methods, including a 1-degree-of-freedom test for row*column 'non-additivity', linear in the row and column effects.
Maintained by Michael Friendly. Last updated 5 months ago.
eurodata:Fast and Easy Eurostat Data Import and Search
Interface to Eurostat’s API (SDMX 2.1) with fast data.table-based import of data, labels, and metadata. On top of the core functionality, data search and data description/comparison functions are also provided. Use <> — a point-and-click app for rapid and easy generation of richly-commented R code — to import a Eurostat dataset or its subset (based on the eurodata::importData() function).
Maintained by Aleksander Rutkowski. Last updated 4 months ago.
reschola:The Schola Empirica Package
A collection of utilities, themes and templates for data analysis at Schola Empirica.
Maintained by Jan Netík. Last updated 5 months ago.
HARplus:Enhanced R Package for 'GEMPACK' .har and .sl4 Files
Provides tools for processing and analyzing .har and .sl4 files, making it easier for 'GEMPACK' users and 'GTAP' researchers to handle large economic datasets. It simplifies the management of multiple experiment results, enabling faster and more efficient comparisons without complexity. Users can extract, restructure, and merge data seamlessly, ensuring compatibility across different tools. The processed data can be exported and used in 'R', 'Stata', 'Python', 'Julia', or any software that supports Text, CSV, or 'Excel' formats.
Maintained by Pattawee Puangchit. Last updated 15 hours ago.
combiroc:Selection and Ranking of Omics Biomarkers Combinations Made Easy
Provides functions and a workflow for easily calculating the specificity, sensitivity and ROC curves of biomarker combinations. Allows ranking and selecting multi-marker signatures as well as finding the best performing sub-signatures, now also from single-cell RNA-seq datasets. The method used was first published as a Shiny app and described in Mazzara et al. (2017) <doi:10.1038/srep45477>, further described in Bombaci & Rossi (2019) <doi:10.1007/978-1-4939-9164-8_16>, and widely expanded as a package as presented in the bioRxiv preprint Ferrari et al. <doi:10.1101/2022.01.17.476603>.
Maintained by Riccardo L. Rossi. Last updated 9 months ago.
USAID OHA Office. Munging of mission weekly HFR data.
Maintained by Aaron Chafetz. Last updated 2 years ago.
RFPNG:Very Fast PNG Image Reader/Writer for 24/32bpp Images
Wraps 'fpng', a very fast C++ .PNG image reader/writer for 24/32bpp images
Maintained by David Schoch. Last updated 1 year ago.
nlsur:Estimating Nonlinear Least Squares for Equation Systems
Estimation of Nonlinear Least Squares (NLS), Feasible Generalized NLS (FGNLS) and Iterative FGNLS (IFGNLS) for Equation Systems.
Maintained by Jan Marvin Garbuszus. Last updated 7 months ago.
LFApp:Shiny Apps for Lateral Flow Assays
Shiny apps for the quantitative analysis of images from lateral flow assays (LFAs). The images are segmented and background corrected and color intensities are extracted. The apps can be used to import and export intensity data and to calibrate LFAs by means of linear, loess, or gam models. The calibration models can further be saved and applied to intensity data from new images for determining concentrations.
Maintained by Filip Paskali. Last updated 10 months ago.
eatRep:Educational Assessment Tools for Replication Methods
Replication methods to compute basic statistical operations (means, standard deviations, frequency tables, percentiles, mean comparisons using weighted effect coding, generalized linear models, and linear multilevel models) in complex survey designs comprising multiple imputed or nested imputed variables and/or a clustered sampling structure, both of which require special procedures, at least for estimating standard errors. See the package documentation for a more detailed description along with references.
Maintained by Sebastian Weirich. Last updated 17 days ago.
telefit:Estimation and Prediction for Remote Effects Spatial Process Models
Implementation of the remote effects spatial process (RESP) model for teleconnection. The RESP model is a geostatistical model that allows a spatially-referenced variable (like average precipitation) to be influenced by covariates defined on a remote domain (like sea surface temperatures). The RESP model is introduced in Hewitt et al. (2018) <doi:10.1002/env.2523>. Sample code for working with the RESP model is available at <>. This material is based upon work supported by the National Science Foundation under grant number AGS 1419558. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Maintained by Joshua Hewitt. Last updated 5 years ago.
GenderInfer:This is a Collection of Functions to Analyse Gender Differences
Implementation of functions that combine binomial calculation and data visualisation to analyse the differences in publishing authorship by gender, as described in Day et al. (2020) <doi:10.1039/C9SC04090K>. It should only be used when self-reported gender is unavailable.
Maintained by Rita Giordano. Last updated 3 years ago.
academictwitteR:Access the Twitter Academic Research Product Track V2 API Endpoint
Package to query the Twitter Academic Research Product Track, providing access to full-archive search and other v2 API endpoints. Functions are written with academic research in mind. They provide flexibility in how the user wishes to store collected data, and encourage regular storage of data to mitigate loss when collecting large volumes of tweets. They also provide workarounds to manage and reshape the format in which data is provided on the client side.
Maintained by Christopher Barrie. Last updated 2 years ago.
demic:Dynamic Estimator of Microbial Communities
Multi-sample algorithm based on contigs and coverage values, to infer the relative distances of contigs from the replication origin and to accurately compare bacterial growth rates between samples. Yuan Gao and Hongzhe Li (2018) <doi:10.1038/s41592-018-0182-0>.
Maintained by Charlie Bushman. Last updated 1 year ago.
SIGN:Similarity Identification in Gene Expression
Provides a classification framework to use expression patterns of pathways as features to identify similarity between biological samples. It provides a new measure for quantifying similarity between expression patterns of pathways.
Maintained by Benjamin Haibe-Kains. Last updated 6 years ago.
Peanut:Parameterized Bayesian Networks, Abstract Classes
Provides support for learning conditional probability tables parameterized using CPTtools. It provides an object-oriented layer on top of CPTtools to facilitate calculations with parameterized models for Bayesian networks. Peanut is a collection of abstract classes and generic functions defining a protocol, with the intent that the protocol can be implemented with different Bayes net engines. The companion package PNetica provides an implementation using Netica and RNetica.
Maintained by Russell Almond. Last updated 2 years ago.
biobroom:Turn Bioconductor objects into tidy data frames
This package contains methods for converting standard objects constructed by bioinformatics packages, especially those in Bioconductor, to tidy data. It thus serves as a complement to the broom package, and follows the same tidy, augment, glance division of tidying methods. Tidying data makes it easy to recombine, reshape and visualize bioinformatics analyses.
Maintained by John D. Storey. Last updated 5 months ago.
GEVcdn:GEV Conditional Density Estimation Network
Implements a flexible nonlinear modelling framework for nonstationary generalized extreme value analysis in hydroclimatology following Cannon (2010) <doi:10.1002/hyp.7506>.
Maintained by Alex J. Cannon. Last updated 5 years ago.
rNeighborQTL:Interval Mapping for Quantitative Trait Loci Underlying Neighbor Effects
To enable quantitative trait loci mapping of neighbor effects, this package extends a single-marker regression to interval mapping. The theoretical background of the method is described in Sato et al. (2021) <doi:10.1093/g3journal/jkab017>.
Maintained by Yasuhiro Sato. Last updated 4 years ago.
pwrRasch:Statistical Power Simulation for Testing the Rasch Model
Statistical power simulation for testing the Rasch Model based on a three-way analysis of variance design with mixed classification.
Maintained by Takuya Yanagida. Last updated 9 years ago.
IPEDSuploadables:Transforms Institutional Data into Text Files for IPEDS Automated Import/Upload
Starting from user-supplied institutional data, these scripts transform, aggregate, and reshape the information to produce key-value pair data files that are able to be uploaded to IPEDS (Integrated Postsecondary Education Data System) through their submission portal <>. Starting data specifications can be found in the vignettes. Final files are saved locally to a location of the user's choice. User-friendly readable files can also be produced for purposes of data review and validation.
Maintained by Alison Lanski. Last updated 3 months ago.
samadb:South Africa Macroeconomic Database API
An R API providing access to a relational database with macroeconomic time series data for South Africa, obtained from the South African Reserve Bank (SARB) and Statistics South Africa (STATSSA), and updated on a weekly basis via the EconData <> platform and automated scraping of the SARB and STATSSA websites. The database is maintained at the Department of Economics at Stellenbosch University.
Maintained by Sebastian Krantz. Last updated 10 months ago.
ugatsdb:Uganda Time Series Database API
An R API providing easy access to a relational database with macroeconomic, financial and development related time series data for Uganda. Overall more than 5000 series at varying frequency (daily, monthly, quarterly, annual in fiscal or calendar years) can be accessed through the API. The data is provided by the Bank of Uganda, the Ugandan Ministry of Finance, Planning and Economic Development, the IMF and the World Bank. The database is being updated once a month.
Maintained by Sebastian Krantz. Last updated 2 years ago.
CaDENCE:Conditional Density Estimation Network Construction and Evaluation
Parameters of a user-specified probability distribution are modelled by a multi-layer perceptron artificial neural network. This framework can be used to implement probabilistic nonlinear models including mixture density networks, heteroscedastic regression models, zero-inflated models, etc. following Cannon (2012) <doi:10.1016/j.cageo.2011.08.023>.
Maintained by Alex J. Cannon. Last updated 7 years ago.
kwb.waterParcel:Lagrangian Sampling Interpretation
Conservative tracers within the water can be used to interpret the results of a sampling campaign following one water parcel (Lagrangian Sampling).
Maintained by Malte Zamzow. Last updated 2 years ago.
polypoly:Helper Functions for Orthogonal Polynomials
Tools for reshaping, plotting, and manipulating matrices of orthogonal polynomials.
Maintained by Tristan Mahr. Last updated 2 years ago.
csodata:Download Data from the CSO 'PxStat' API
Imports 'PxStat' data in JSON-stat format and (optionally) reshapes it into wide format. The Central Statistics Office (CSO) is the national statistical institute of Ireland and 'PxStat' is the CSO's online database of Official Statistics. This database contains current and historical data series compiled from CSO statistical releases and is accessed at <>. The CSO 'PxStat' Application Programming Interface (API), which is accessed in this package, provides access to 'PxStat' data in JSON-stat format at <>. This dissemination tool gives developers machine-to-machine access to CSO 'PxStat' data.
Maintained by Conor Crowley. Last updated 10 months ago.
ggseqplot:Render Sequence Plots using 'ggplot2'
A set of wrapper functions that reproduce most of the sequence plots rendered with TraMineR::seqplot(). Whereas 'TraMineR' uses base R to produce the plots, this library draws on 'ggplot2'. The plots are produced on the basis of a sequence object defined with TraMineR::seqdef(). The package automates the reshaping and plotting of sequence data. Resulting plots are of class 'ggplot', i.e. components can be added and tweaked using '+' and regular 'ggplot2' functions.
Maintained by Marcel Raab. Last updated 4 months ago.
plotdap:Easily Visualize Data from 'ERDDAP' Servers via the 'rerddap' Package
Easily visualize and animate 'tabledap' and 'griddap' objects obtained via the 'rerddap' package in a simple one-line command, using either base graphics or 'ggplot2' graphics. 'plotdap' handles extracting and reshaping the data, map projections and continental outlines. Optionally the data can be animated through time using the 'gganimate' package.
Maintained by Roy Mendelssohn. Last updated 1 year ago.
RMM:Revenue Management Modeling
RMM fits Revenue Management Models using the RDE (Robust Demand Estimation) method introduced in <doi:10.2139/ssrn.3598259>, a customer choice-based Revenue Management Model. Furthermore, either a multinomial model or a conditional logit model can be selected as the RDE model.
Maintained by Chul Kim. Last updated 3 years ago.
broomstick:Convert Decision Tree Objects into Tidy Data Frames
Convert decision tree objects into tidy data frames using the framework laid out by the package broom. This means that decision tree output can be easily reshaped, processed, and combined with tools like 'dplyr', 'tidyr' and 'ggplot2'. Like the package broom, broomstick provides three S3 generics: tidy, which summarises decision-tree-specific features and returns the variable importance table; augment, which adds columns to the original data such as predictions and residuals; and glance, which provides a one-row summary of model-level statistics.
Maintained by Nicholas Tierney. Last updated 1 year ago.
eufmdis.adapt:Analyse 'EuFMDiS' Output Files via a Shiny App
Analyses 'EuFMDiS' output files in a Shiny app. The distributions of relevant output parameters are described in the form of tables (quantiles) and plots. The app is called using eufmdis.adapt::run_adapt().
Maintained by Ian Kopacka. Last updated 2 years ago.
velociraptr:Fossil Analysis
Functions for downloading, reshaping, culling, cleaning, and analyzing fossil data from the Paleobiology Database <>.
Maintained by Andrew A Zaffos. Last updated 6 years ago.
