Showing 200 of 248 total results
mlverse
tabulate: Pretty Console Output for Tables
Generates pretty console output for tables, allowing full customization of cell colors, font type, borders and many other attributes. It also supports 'multibyte' characters and nested tables.
Maintained by Daniel Falbel. Last updated 3 years ago.
53.4 match 39 stars 5.29 score 33 scripts
insightsengineering
tern: Create Common TLGs Used in Clinical Trials
Table, Listings, and Graphs (TLG) library for common outputs used in clinical trials.
Maintained by Joe Zhu. Last updated 2 months ago.
clinical-trials graphs listings nest outputs tables
20.2 match 79 stars 12.62 score 186 scripts 9 dependents
skhiggins
tabulator: Efficient Tabulation with Stata-Like Output
Efficient tabulation with Stata-like output. For each unique value of the variable, it shows the number of observations with that value, proportion of observations with that value, and cumulative proportion, in descending order of frequency. Accepts data.table, tibble, or data.frame as input. Efficient with big data: if you give it a data.table, tab() uses data.table syntax.
Maintained by Sean Higgins. Last updated 4 years ago.
55.8 match 12 stars 4.14 score 23 scripts
insightsengineering
rtables: Reporting Tables
Reporting tables often have structure that goes beyond simple rectangular data. The 'rtables' package provides a framework for declaring complex multi-level tabulations and then applying them to data. This framework models both tabulation and the resulting tables as hierarchical, tree-like objects which support sibling sub-tables, arbitrary splitting or grouping of data in row and column dimensions, cells containing multiple values, and the concept of contextual summary computations. A convenient pipe-able interface is provided for declaring table layouts and the corresponding computations, and then applying them to data.
Maintained by Joe Zhu. Last updated 2 months ago.
15.8 match 232 stars 13.65 score 238 scripts 17 dependents
sfirke
janitor: Simple Tools for Examining and Cleaning Dirty Data
The main janitor functions can: perfectly format data.frame column names; provide quick counts of variable combinations (i.e., frequency tables and crosstabs); and explore duplicate records. Other janitor functions nicely format the tabulation results. These tabulate-and-report functions approximate popular features of SPSS and Microsoft Excel. This package follows the principles of the "tidyverse" and works well with the pipe function %>%. janitor was built with beginning-to-intermediate R users in mind and is optimized for user-friendliness.
Maintained by Sam Firke. Last updated 3 months ago.
data-analysis data-cleaning data-science dirty-data excel pivot-tables spss tabulations tidyverse
10.8 match 1.4k stars 19.15 score 35k scripts 231 dependents
davidgohel
flextable: Functions for Tabular Reporting
Use a grammar for creating and customizing pretty tables. The following formats are supported: 'HTML', 'PDF', 'RTF', 'Microsoft Word', 'Microsoft PowerPoint' and R 'Grid Graphics'. 'R Markdown', 'Quarto' and the package 'officer' can be used to produce the result files. The syntax is the same for the user regardless of the type of output to be produced. A set of functions allows the creation, definition of cell arrangement, addition of headers or footers, formatting and definition of cell content with text and or images. The package also offers a set of high-level functions that allow tabular reporting of statistical models and the creation of complex cross tabulations.
Maintained by David Gohel. Last updated 1 month ago.
docx html5 ms-office-documents rmarkdown table
11.4 match 583 stars 17.04 score 7.3k scripts 119 dependents
poissonconsulting
ypr: Yield Per Recruit
An implementation of equilibrium-based yield per recruit methods. Yield per recruit methods can be used to estimate the optimal yield for a fish population as described by Walters and Martell (2004) <isbn:0-691-11544-3>. The yield can be based on the number of fish caught (or harvested) or biomass caught for all fish or just large (trophy) individuals.
Maintained by Joe Thorley. Last updated 2 months ago.
12.4 match 7 stars 7.84 score 55 scripts 1 dependents
eoda-dev
rtabulator: R Bindings for 'Tabulator JS'
Provides R bindings for 'Tabulator JS' <https://tabulator.info/>. Makes it a breeze to create highly customizable interactive tables in 'rmarkdown' documents and 'shiny' applications. It includes filtering, grouping, editing, input validation, history recording, column formatters, packaged themes and more.
Maintained by Stefan Kuethe. Last updated 4 months ago.
bindings htmlwidgets rlang shiny spreadsheet table tabulator-js
21.2 match 11 stars 4.44 score 9 scripts
dcomtois
summarytools: Tools to Quickly and Neatly Summarize Data
Data frame summaries, cross-tabulations, weight-enabled frequency tables and common descriptive (univariate) statistics in concise tables available in a variety of formats (plain ASCII, Markdown and HTML). A good point-of-entry for exploring data, both for experienced and new R users.
Maintained by Dominic Comtois. Last updated 1 day ago.
descriptive-statistics frequency-table html-report markdown pander pandoc pandoc-markdown rmarkdown rstudio
5.2 match 526 stars 14.52 score 2.9k scripts 6 dependents
shannonpileggi
gtreg: Regulatory Tables for Clinical Research
Creates tables suitable for regulatory agency submission by leveraging the 'gtsummary' package as the back end. Tables can be exported to HTML, Word, PDF and more. Highly customized outputs are available by utilizing existing styling functions from 'gtsummary' as well as custom options designed for regulatory tables.
Maintained by Shannon Pileggi. Last updated 25 days ago.
9.4 match 37 stars 6.92 score 30 scripts
tomasfryda
h2o: R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 year ago.
7.8 match 3 stars 8.20 score 7.8k scripts 11 dependents
larmarange
ggstats: Extension to 'ggplot2' for Plotting Stats
Provides new statistics, new geometries and new positions for 'ggplot2' and a suite of functions to facilitate the creation of statistical plots.
Maintained by Joseph Larmarange. Last updated 6 days ago.
4.9 match 37 stars 13.08 score 190 scripts 156 dependents
njtierney
naniar: Data Structures, Summaries, and Visualisations for Missing Data
Missing values are ubiquitous in data and need to be explored and handled in the initial stages of analysis. 'naniar' provides data structures and functions that facilitate the plotting of missing values and examination of imputations. This allows missing data dependencies to be explored with minimal deviation from the common work patterns of 'ggplot2' and tidy data. The work is fully discussed at Tierney & Cook (2023) <doi:10.18637/jss.v105.i07>.
Maintained by Nicholas Tierney. Last updated 4 days ago.
data-visualisation ggplot2 missing-data missingness tidy-data
4.0 match 657 stars 15.63 score 5.1k scripts 9 dependents
ggobi
GGally: Extension to 'ggplot2'
The R package 'ggplot2' is a plotting system based on the grammar of graphics. 'GGally' extends 'ggplot2' by adding several functions to reduce the complexity of combining geometric objects with transformed data. Some of these functions include a pairwise plot matrix, a two group pairwise plot matrix, a parallel coordinates plot, a survival plot, and several functions to plot networks.
Maintained by Barret Schloerke. Last updated 10 months ago.
3.7 match 597 stars 16.15 score 17k scripts 154 dependents
bioc
BiocGenerics: S4 generic functions used in Bioconductor
The package defines many S4 generic functions used in Bioconductor.
Maintained by Hervé Pagès. Last updated 1 month ago.
infrastructure bioconductor-package core-package
4.1 match 12 stars 14.22 score 612 scripts 2.2k dependents
cran
epiDisplay: Epidemiological Data Display Package
Package for data exploration and result presentation. Full 'epicalc' package with data management functions is available at '<https://medipe.psu.ac.th/epicalc/>'.
Maintained by Virasakdi Chongsuvivatwong. Last updated 3 years ago.
9.8 match 1 stars 5.44 score 758 scripts 2 dependents
tagteam
Publish: Format Output of Various Routines in a Suitable Way for Reports and Publication
A bunch of convenience functions that transform the results of some basic statistical analyses into table format nearly ready for publication. This includes descriptive tables, tables of logistic regression and Cox regression results as well as forest plots.
Maintained by Thomas A. Gerds. Last updated 11 days ago.
5.1 match 15 stars 10.11 score 274 scripts 36 dependents
cdcgov
surveytable: Formatted Survey Estimates
Short and understandable commands that generate tabulated, formatted, and rounded survey estimates. Mostly a wrapper for the 'survey' package (Lumley (2004) <doi:10.18637/jss.v009.i08> <https://CRAN.R-project.org/package=survey>) that identifies low-precision estimates using the National Center for Health Statistics (NCHS) presentation standards (Parker et al. (2017) <https://www.cdc.gov/nchs/data/series/sr_02/sr02_175.pdf>, Parker et al. (2023) <doi:10.15620/cdc:124368>).
Maintained by Alex Strashny. Last updated 4 days ago.
estimates formatted-output pretty-print survey tables
7.3 match 6 stars 6.71 score 19 scripts
sciviews
tabularise: Create Tabular Outputs from R
Create rich-formatted tabular outputs from R that can be incorporated into R Markdown/Quarto documents with correct output at least in HTML, LaTeX/PDF, Word and PowerPoint formats for various R objects.
Maintained by Philippe Grosjean. Last updated 9 months ago.
10.0 match 4.56 score 12 scripts 4 dependents
gianmarcoalberti
CAinterprTools: Graphical Aid in Correspondence Analysis Interpretation and Significance Testings
Allows to plot a number of information related to the interpretation of Correspondence Analysis' results. It provides the facility to plot the contribution of rows and columns categories to the principal dimensions, the quality of points display on selected dimensions, the correlation of row and column categories to selected dimensions, etc. It also allows to assess which dimension(s) is important for the data structure interpretation by means of different statistics and tests. The package also offers the facility to plot the permuted distribution of the table total inertia as well as of the inertia accounted for by pairs of selected dimensions. Different facilities are also provided that aim to produce interpretation-oriented scatterplots. Reference: Alberti 2015 <doi:10.1016/j.softx.2015.07.001>.
Maintained by Gianmarco Alberti. Last updated 5 years ago.
16.8 match 2.52 score 33 scripts
rspatial
terra: Spatial Data Analysis
Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).
Maintained by Robert J. Hijmans. Last updated 23 hours ago.
geospatial raster spatial vector onetbb proj gdal geos cpp
2.3 match 559 stars 17.64 score 17k scripts 851 dependents
gdemin
expss: Tables, Labels and Some Useful Functions from Spreadsheets and 'SPSS' Statistics
Package computes and displays tables with support for 'SPSS'-style labels, multiple and nested banners, weights, multiple-response variables and significance testing. There are facilities for nice output of tables in 'knitr', 'Shiny', '*.xlsx' files, R and 'Jupyter' notebooks. Methods for labelled variables add value labels support to base R functions and to some functions from other packages. Additionally, the package brings popular data transformation functions from 'SPSS' Statistics and 'Excel': 'RECODE', 'COUNT', 'COUNTIF', 'VLOOKUP', etc. These functions are very useful for data processing in marketing research surveys. The package is intended to help people move data processing from 'Excel' and 'SPSS' to R.
Maintained by Gregory Demin. Last updated 11 months ago.
excel labels labels-support msexcel pivot-tables recode spss spss-statistics tables variable-labels vlookup
3.5 match 84 stars 11.00 score 1.8k scripts 4 dependents
rspatial
raster: Geographic Data Analysis and Modeling
Reading, writing, manipulating, analyzing and modeling of spatial data. This package has been superseded by the "terra" package <https://CRAN.R-project.org/package=terra>.
Maintained by Robert J. Hijmans. Last updated 2 months ago.
2.3 match 164 stars 17.05 score 58k scripts 555 dependents
marketbridge
zctaCrosswalk: Crosswalk Between 2020 Census ZIP Code Tabulation Areas (ZCTAs), States and Counties
Contains the US Census Bureau's 2020 ZCTA to County Relationship File, as well as convenience functions to translate between States, Counties and ZIP Code Tabulation Areas (ZCTAs).
Maintained by Ari Lamstein. Last updated 2 years ago.
6.7 match 5 stars 5.39 score 11 scripts 1 dependents
nicolas-robette
descriptio: Descriptive Statistical Analysis
Description of statistical associations between variables : measures of local and global association between variables (phi, Cramér V, correlations, eta-squared, Goodman and Kruskal tau, permutation tests, etc.), multiple graphical representations of the associations between variables (using 'ggplot2') and weighted statistics.
Maintained by Nicolas Robette. Last updated 6 months ago.
7.0 match 4 stars 5.00 score 11 scripts 3 dependents
lindbrook
packageRank: Computation and Visualization of Package Download Counts and Percentile Ranks
Compute and visualize package download counts and percentile ranks from Posit/RStudio's CRAN mirror.
Maintained by lindbrook. Last updated 4 days ago.
5.6 match 28 stars 6.13 score 27 scripts
bioc
VariantAnnotation: Annotation of Genetic Variants
Annotate variants, compute amino acid coding changes, predict coding outcomes.
Maintained by Bioconductor Package Maintainer. Last updated 2 months ago.
dataimport sequencing snp annotation genetics variantannotation curl bzip2 xz-utils zlib
3.0 match 11.39 score 1.9k scripts 152 dependents
sparklyr
sparklyr: R Interface to Apache Spark
R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.
Maintained by Edgar Ruiz. Last updated 10 days ago.
apache-spark distributed dplyr ide livy machine-learning remote-clusters spark sparklyr
2.3 match 959 stars 15.16 score 4.0k scripts 21 dependents
matthieugomez
statar: Tools Inspired by 'Stata' to Manipulate Tabular Data
A set of tools inspired by 'Stata' to explore data.frames ('summarize', 'tabulate', 'xtile', 'pctile', 'binscatter', elapsed quarters/month, lead/lag).
Maintained by Matthieu Gomez. Last updated 2 years ago.
4.0 match 54 stars 8.40 score 226 scripts 1 dependents
ipums
ipumsr: An R Interface for Downloading, Reading, and Handling IPUMS Data
An easy way to work with census, survey, and geographic data provided by IPUMS in R. Generate and download data through the IPUMS API and load IPUMS files into R with their associated metadata to make analysis easier. IPUMS data describing 1.4 billion individuals drawn from over 750 censuses and surveys is available free of charge from the IPUMS website <https://www.ipums.org>.
Maintained by Derek Burk. Last updated 19 days ago.
3.0 match 28 stars 11.07 score 720 scripts 2 dependents
rsquaredacademy
descriptr: Generate Descriptive Statistics
Generate descriptive statistics such as measures of location, dispersion, frequency tables, cross tables, group summaries and multiple one/two way tables.
Maintained by Aravind Hebbali. Last updated 4 months ago.
descriptive-statistics eda summary-statistics
4.5 match 34 stars 7.37 score 221 scripts
henrikbengtsson
matrixStats: Functions that Apply to Rows and Columns of Matrices (and to Vectors)
High-performing functions operating on rows and columns of matrices, e.g. col / rowMedians(), col / rowRanks(), and col / rowSds(). Functions optimized per data type and for subsetted calculations such that both memory usage and processing time is minimized. There are also optimized vector-based methods, e.g. binMeans(), madDiff() and weightedMedian().
Maintained by Henrik Bengtsson. Last updated 2 months ago.
1.8 match 208 stars 18.09 score 20k scripts 2.3k dependents
jalvesaq
descr: Descriptive Statistics
Weighted frequency and contingency tables of categorical variables and of the comparison of the mean value of a numerical variable by the levels of a factor, and methods to produce xtable objects of the tables and to plot them. There are also functions to facilitate the character encoding conversion of objects, to quickly convert fixed width files into csv ones, and to export a data.frame to a text file with the necessary R and SPSS codes to reread the data.
Maintained by Jakson Aquino. Last updated 1 year ago.
3.7 match 18 stars 8.80 score 692 scripts 4 dependents
mayoverse
arsenal: An Arsenal of 'R' Functions for Large-Scale Statistical Summaries
An Arsenal of 'R' functions for large-scale statistical summaries, which are streamlined to work within the latest reporting tools in 'R' and 'RStudio' and which use formulas and versatile summary statistics for summary tables and models. The primary functions include tableby(), a Table-1-like summary of multiple variable types 'by' the levels of one or more categorical variables; paired(), a Table-1-like summary of multiple variable types paired across two time points; modelsum(), which performs simple model fits on one or more endpoints for many variables (univariate or adjusted for covariates); freqlist(), a powerful frequency table across many categorical variables; comparedf(), a function for comparing data.frames; and write2(), a function to output tables to a document.
Maintained by Ethan Heinzen. Last updated 7 months ago.
baseline-characteristics descriptive-statistics modeling paired-comparisons reporting statistics tableone
2.4 match 225 stars 13.45 score 1.2k scripts 16 dependents
cjvanlissa
tidySEM: Tidy Structural Equation Modeling
A tidy workflow for generating, estimating, reporting, and plotting structural equation models using 'lavaan', 'OpenMx', or 'Mplus'. Throughout this workflow, elements of syntax, results, and graphs are represented as 'tidy' data, making them easy to customize. Includes functionality to estimate latent class analyses, and to plot 'dagitty' and 'igraph' objects.
Maintained by Caspar J. van Lissa. Last updated 7 days ago.
3.0 match 58 stars 10.69 score 330 scripts 1 dependents
insightsengineering
tern.mmrm: Tables and Graphs for Mixed Models for Repeated Measures (MMRM)
Mixed models for repeated measures (MMRM) are a popular choice for analyzing longitudinal continuous outcomes in randomized clinical trials and beyond; see for example Cnaan, Laird and Slasor (1997) <doi:10.1002/(SICI)1097-0258(19971030)16:20%3C2349::AID-SIM667%3E3.0.CO;2-E>. This package provides an interface for fitting MMRM within the 'tern' <https://cran.r-project.org/package=tern> framework by Zhu et al. (2023) and tabulate results easily using 'rtables' <https://cran.r-project.org/package=rtables> by Becker et al. (2023). It builds on 'mmrm' <https://cran.r-project.org/package=mmrm> by Sabanés Bové et al. (2023) for the actual MMRM computations.
Maintained by Joe Zhu. Last updated 6 months ago.
graphs listings statistical-engineering tables
4.4 match 6 stars 7.26 score 8 scripts 1 dependents
goranbrostrom
eha: Event History Analysis
Parametric proportional hazards fitting with left truncation and right censoring for common families of distributions, piecewise constant hazards, and discrete models. Parametric accelerated failure time models for left truncated and right censored data. Proportional hazards models for tabular and register data. Sampling of risk sets in Cox regression, selections in the Lexis diagram, bootstrapping. Broström (2022) <doi:10.1201/9780429503764>.
Maintained by Göran Broström. Last updated 9 months ago.
3.3 match 7 stars 9.76 score 308 scripts 10 dependents
insightsengineering
tern.gee: Tables and Graphs for Generalized Estimating Equations (GEE) Model Fits
Generalized estimating equations (GEE) are a popular choice for analyzing longitudinal binary outcomes. This package provides an interface for fitting GEE, currently for logistic regression, within the 'tern' <https://cran.r-project.org/package=tern> framework (Zhu, Sabanés Bové et al., 2023) and tabulate results easily using 'rtables' <https://cran.r-project.org/package=rtables> (Becker, Waddell et al., 2023). It builds on 'geepack' <doi:10.18637/jss.v015.i02> (Højsgaard, Halekoh and Yan, 2006) for the actual GEE model fitting.
Maintained by Joe Zhu. Last updated 7 months ago.
4.5 match 8 stars 7.00 score 3 scripts 1 dependents
pfizer-opensource
zippeR: Working with United States ZIP Code and ZIP Code Tabulation Area Data
Provides a set of functions for working with American postal codes, which are known as ZIP Codes. These include accessing ZIP Code to ZIP Code Tabulation Area (ZCTA) crosswalks, retrieving demographic data for ZCTAs, and tabulating demographic data for three-digit ZCTAs.
Maintained by Christopher Prener. Last updated 19 days ago.
4.8 match 11 stars 6.52 score 5 scripts 1 dependents
benmbutler
powdR: Full Pattern Summation of X-Ray Powder Diffraction Data
Full pattern summation of X-ray powder diffraction data as described in Chipera and Bish (2002) <doi:10.1107/S0021889802017405> and Butler and Hillier (2021) <doi:10.1016/j.cageo.2020.104662>. Derives quantitative estimates of crystalline and amorphous phase concentrations in complex mixtures.
Maintained by Benjamin Butler. Last updated 3 years ago.
5.6 match 12 stars 5.56 score 30 scripts
sebkrantz
collapse: Advanced and Fast Data Transformation
A C/C++ based package for advanced data transformation and statistical computing in R that is extremely fast, class-agnostic, robust and programmer friendly. Core functionality includes a rich set of S3 generic grouped and weighted statistical functions for vectors, matrices and data frames, which provide efficient low-level vectorizations, OpenMP multithreading, and skip missing values by default. These are integrated with fast grouping and ordering algorithms (also callable from C), and efficient data manipulation functions. The package also provides a flexible and rigorous approach to time series and panel data in R. It further includes fast functions for common statistical procedures, detailed (grouped, weighted) summary statistics, powerful tools to work with nested data, fast data object conversions, functions for memory efficient R programming, and helpers to effectively deal with variable labels, attributes, and missing data. It is well integrated with base R classes, 'dplyr'/'tibble', 'data.table', 'sf', 'units', 'plm' (panel-series and data frames), and 'xts'/'zoo'.
Maintained by Sebastian Krantz. Last updated 6 days ago.
data-aggregation data-analysis data-manipulation data-processing data-science data-transformation econometrics high-performance panel-data scientific-computing statistics time-series weighted weights cpp openmp
1.9 match 672 stars 16.63 score 708 scripts 97 dependents
aphalo
photobiologyPlants: Plant Photobiology Related Functions and Data
Provides functions for quantifying visible (VIS) and ultraviolet (UV) radiation in relation to the photoreceptors Phytochromes, Cryptochromes, and UVR8 which are present in plants. It also includes data sets on the optical properties of plants. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.
Maintained by Pedro J. Aphalo. Last updated 2 months ago.
5.6 match 5.52 score 55 scripts
plambertuliege
degross: Density Estimation from GROuped Summary Statistics
Estimation of a density from grouped (tabulated) summary statistics evaluated in each of the big bins (or classes) partitioning the support of the variable. These statistics include class frequencies and central moments of order one up to four. The log-density is modelled using a linear combination of penalised B-splines. The multinomial log-likelihood involving the frequencies adds up to a roughness penalty based on the differences in the coefficients of neighbouring B-splines and the log of a root-n approximation of the sampling density of the observed vector of central moments in each class. The so-obtained penalized log-likelihood is maximized using the EM algorithm to get an estimate of the spline parameters and, consequently, of the variable density and related quantities such as quantiles, see Lambert, P. (2021) <arXiv:2107.03883> for details.
Maintained by Philippe Lambert. Last updated 2 years ago.
10.2 match 2 stars 3.00 score 4 scripts
uupharmacometrics
xpose4: Diagnostics for Nonlinear Mixed-Effect Models
A model building aid for nonlinear mixed-effects (population) model analysis using NONMEM, facilitating data set checkout, exploration and visualization, model diagnostics, candidate covariate identification and model comparison. The methods are described in Keizer et al. (2013) <doi:10.1038/psp.2013.24>, and Jonsson et al. (1999) <doi:10.1016/s0169-2607(98)00067-4>.
Maintained by Andrew C. Hooker. Last updated 1 year ago.
diagnostics nonmem pharmacometrics population-model xpose
4.1 match 35 stars 7.30 score 315 scripts
michbur
biogram: N-Gram Analysis of Biological Sequences
Tools for extraction and analysis of various n-grams (k-mers) derived from biological sequences (proteins or nucleic acids). Contains QuiPT (quick permutation test) for fast feature-filtering of the n-gram data.
Maintained by Michal Burdukiewicz. Last updated 7 months ago.
biological-sequences ngram-analysis
4.0 match 10 stars 7.50 score 87 scripts 3 dependents
davidgohel
officer: Manipulation of Microsoft Word and PowerPoint Documents
Access and manipulate 'Microsoft Word', 'RTF' and 'Microsoft PowerPoint' documents from R. The package focuses on tabular and graphical reporting from R; it also provides two functions that let users get document content into data objects. A set of functions lets users add and remove images, tables and paragraphs of text in new or existing documents. The package does not require any installation of Microsoft products to be able to write Microsoft files.
Maintained by David Gohel. Last updated 1 month ago.
ms-office-documents powerpoint word
1.9 match 630 stars 15.79 score 4.1k scripts 137 dependents
statist7
sitar: Super Imposition by Translation and Rotation Growth Curve Analysis
Functions for fitting and plotting SITAR (Super Imposition by Translation And Rotation) growth curve models. SITAR is a shape-invariant model with a regression B-spline mean curve and subject-specific random effects on both the measurement and age scales. The model was first described by Lindstrom (1995) <doi:10.1002/sim.4780141807> and developed as the SITAR method by Cole et al (2010) <doi:10.1093/ije/dyq115>.
Maintained by Tim Cole. Last updated 2 months ago.
3.4 match 13 stars 8.69 score 58 scripts 3 dependents
strohne
volker: High-Level Functions for Tabulating, Charting and Reporting Survey Data
Craft polished tables and plots in Markdown reports. Simply choose whether to treat your data as counts or metrics, and the package will automatically generate well-designed default tables and plots for you. Boiled down to the basics, with labeling features and simple interactive reports. All functions are 'tidyverse' compatible.
Maintained by Jakob Jünger. Last updated 3 days ago.
4.1 match 5 stars 7.16 score 125 scripts
ropensci
targets: Dynamic Function-Oriented 'Make'-Like Declarative Pipelines
Pipeline tools coordinate the pieces of computationally demanding analysis projects. The 'targets' package is a 'Make'-like pipeline tool for statistics and data science in R. The package skips costly runtime for tasks that are already up to date, orchestrates the necessary computation with implicit parallel computing, and abstracts files as R objects. If all the current output matches the current upstream code and data, then the whole pipeline is up to date, and the results are more trustworthy than otherwise. The methodology in this package borrows from GNU 'Make' (2015, ISBN:978-9881443519) and 'drake' (2018, <doi:10.21105/joss.00550>).
Maintained by William Michael Landau. Last updated 2 days ago.
data-science high-performance-computing make peer-reviewed pipeline r-targetopia reproducibility reproducible-research targets workflow
1.9 match 973 stars 15.20 score 4.6k scripts 22 dependents
bioc
S4Vectors: Foundation of vector-like and list-like containers in Bioconductor
The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantic of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).
Maintained by Hervé Pagès. Last updated 1 month ago.
infrastructure datarepresentation bioconductor-package core-package
1.8 match 18 stars 16.05 score 1.0k scripts 1.9k dependents
r-lib
bit64: A S3 Class for Vectors of 64bit Integers
Package 'bit64' provides serializable S3 atomic 64bit (signed) integers. These are useful for handling database keys and exact counting in +-2^63. WARNING: do not use them as replacement for 32bit integers, integer64 are not supported for subscripting by R-core and they have different semantics when combined with double, e.g. integer64 + double => integer64. Class integer64 can be used in vectors, matrices, arrays and data.frames. Methods are available for coercion from and to logicals, integers, doubles, characters and factors as well as many elementwise and summary functions. Many fast algorithmic operations such as 'match' and 'order' support interactive data exploration and manipulation and optionally leverage caching.
Maintained by Michael Chirico. Last updated 4 days ago.
1.8 match 35 stars 14.91 score 1.5k scripts 3.2k dependents
easystats
see: Model Visualisation Toolbox for 'easystats' and 'ggplot2'
Provides plotting utilities supporting packages in the 'easystats' ecosystem (<https://github.com/easystats/easystats>) and some extra themes, geoms, and scales for 'ggplot2'. Color scales are based on <https://materialui.co/>. References: Lüdecke et al. (2021) <doi:10.21105/joss.03393>.
Maintained by Indrajeet Patil. Last updated 5 days ago.
data-visualization easystats ggplot2 hacktoberfest plotting see statistics visualisation visualization
2.0 match 902 stars 13.22 score 2.0k scripts 3 dependents
bioc
SIM: Integrated Analysis on two human genomic datasets
Finds associations between two human genomic datasets.
Maintained by Renee X. de Menezes. Last updated 5 months ago.
6.0 match 4.30 score 3 scripts
pauljohn32
rockchalk: Regression Estimation and Presentation
A collection of functions for interpretation and presentation of regression analysis. These functions are used to produce the statistics lectures in <https://pj.freefaculty.org/guides/>. Includes regression diagnostics, regression tables, and plots of interactions and "moderator" variables. The emphasis is on "mean-centered" and "residual-centered" predictors. The vignette 'rockchalk' offers a fairly comprehensive overview. The vignette 'Rstyle' has advice about coding in R. The package title 'rockchalk' refers to our school motto, 'Rock Chalk Jayhawk, Go K.U.'.
Maintained by Paul E. Johnson. Last updated 3 years ago.
3.6 match 7.13 score 584 scripts 18 dependents
pharmaverse
sdtmchecks:Data Quality Checks for Study Data Tabulation Model (SDTM) Datasets
A series of checks to identify common issues in Study Data Tabulation Model (SDTM) datasets. These checks are intended to be generalizable, actionable, and meaningful for analysis.
Maintained by Will Harris. Last updated 3 months ago.
3.3 match 21 stars 7.66 score 15 scripts
bioc
spiky:Spike-in calibration for cell-free MeDIP
spiky implements methods and model generation for cfMeDIP (cell-free methylated DNA immunoprecipitation) with spike-in controls. CfMeDIP is an enrichment protocol which avoids destructive conversion of scarce template, making it ideal as a "liquid biopsy," but creating certain challenges in comparing results across specimens, subjects, and experiments. The use of synthetic spike-in standard oligos allows diagnostics performed with cfMeDIP to quantitatively compare samples across subjects, experiments, and time points in both relative and absolute terms.
Maintained by Tim Triche. Last updated 5 months ago.
differentialmethylation dnamethylation normalization preprocessing qualitycontrol sequencing
5.2 match 2 stars 4.90 score 3 scripts
vincentarelbundock
modelsummary:Summary Tables and Plots for Statistical Models and Data: Beautiful, Customizable, and Publication-Ready
Create beautiful and customizable tables to summarize several statistical models side-by-side. Draw coefficient plots, multi-level cross-tabs, dataset summaries, balance tables (a.k.a. "Table 1s"), and correlation matrices. This package supports dozens of statistical models, and it can produce tables in HTML, LaTeX, Word, Markdown, PDF, PowerPoint, Excel, RTF, JPG, or PNG. Tables can easily be embedded in 'Rmarkdown' or 'knitr' dynamic documents. Details can be found in Arel-Bundock (2022) <doi:10.18637/jss.v103.i01>.
Maintained by Vincent Arel-Bundock. Last updated 15 days ago.
1.9 match 926 stars 13.41 score 6.2k scripts 2 dependents
bioc
ggbio:Visualization tools for genomic data
The ggbio package extends and specializes the grammar of graphics for biological data. The graphics are designed to answer common scientific questions, in particular those often asked of high throughput genomics data. All core Bioconductor data structures are supported, where appropriate. The package supports detailed views of particular genomic regions, as well as genome-wide overviews. Supported overviews include ideograms and grand linear views. High-level plots include sequence fragment length, edge-linked interval to data view, mismatch pileup, and several splicing summaries.
Maintained by Michael Lawrence. Last updated 5 months ago.
2.0 match 111 stars 12.26 score 734 scripts 17 dependents
bioc
goSorensen:Statistical inference based on the Sorensen-Dice dissimilarity and the Gene Ontology (GO)
This package implements inferential methods to compare gene lists in terms of their biological meaning as expressed in the GO. The compared gene lists are characterized by cross-tabulation frequency tables of enriched GO items. Dissimilarity between gene lists is evaluated using the Sorensen-Dice index. The fundamental guiding principle is that two gene lists are taken as similar if they share a great proportion of common enriched GO items.
Maintained by Pablo Flores. Last updated 5 months ago.
annotation go genesetenrichment software microarray pathways geneexpression multiplecomparison graphandnetwork reactome clustering kegg
5.4 match 4.56 score 12 scripts
statistikat
simPop:Simulation of Complex Synthetic Data Information
Tools and methods to simulate populations for surveys based on auxiliary data. The tools include model-based methods, calibration and combinatorial optimization algorithms; see Templ, Kowarik and Meindl (2017) <doi:10.18637/jss.v079.i10> and Templ (2017) <doi:10.1007/978-3-319-50272-4>. The package was developed with support of the International Household Survey Network, DFID Trust Fund TF011722 and funds from the World Bank.
Maintained by Matthias Templ. Last updated 4 months ago.
3.8 match 31 stars 6.51 score 104 scripts
sticsrpacks
SticsRFiles:Read and Modify 'STICS' Input/Output Files
Manipulating input and output files of the 'STICS' crop model. Files are either 'JavaSTICS' XML files or text files used by the model 'fortran' executable. Most basic functionalities are reading or writing parameter names and values in both XML or text input files, and getting data from output files. Advanced functionalities include XML files generation from XML templates and/or spreadsheets, or text files generation from XML files by using 'xslt' transformation.
Maintained by Patrice Lecharpentier. Last updated 18 days ago.
2.9 match 4 stars 8.27 score 124 scripts
mhahsler
arules:Mining Association Rules and Frequent Itemsets
Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.
Maintained by Michael Hahsler. Last updated 1 month ago.
arules association-rules frequent-itemsets
1.7 match 194 stars 13.99 score 3.3k scripts 28 dependents
juba
questionr:Functions to Make Surveys Processing Easier
Set of functions to make the processing and analysis of surveys easier: interactive shiny apps and addins for data recoding, contingency tables, dataset metadata handling, and several convenience functions.
Maintained by Julien Barnier. Last updated 1 day ago.
1.9 match 83 stars 12.62 score 1.1k scripts 19 dependents
randrescastaneda
joyn:Tool for Diagnosis of Tables Joins and Complementary Join Features
Tool for diagnosing table joins. It combines the speed of `collapse` and `data.table`, the flexibility of `dplyr`, and the diagnosis and features of the `merge` command in `Stata`.
Maintained by R.Andres Castaneda. Last updated 3 months ago.
3.3 match 9 stars 7.00 score 31 scripts
massimoaria
bibliometrix:Comprehensive Science Mapping Analysis
Tool for quantitative research in scientometrics and bibliometrics. It implements the comprehensive workflow for science mapping analysis proposed in Aria M. and Cuccurullo C. (2017) <doi:10.1016/j.joi.2017.08.007>. 'bibliometrix' provides various routines for importing bibliographic data from 'SCOPUS', 'Clarivate Analytics Web of Science' (<https://www.webofknowledge.com/>), 'Digital Science Dimensions' (<https://www.dimensions.ai/>), 'OpenAlex' (<https://openalex.org/>), 'Cochrane Library' (<https://www.cochranelibrary.com/>), 'Lens' (<https://lens.org>), and 'PubMed' (<https://pubmed.ncbi.nlm.nih.gov/>) databases, performing bibliometric analysis and building networks for co-citation, coupling, scientific collaboration and co-word analysis.
Maintained by Massimo Aria. Last updated 7 days ago.
bibliometric-analysis bibliometrics citation citation-network citations co-authors co-occurence co-word-analysis correspondence-analysis coupling isi-web journal manuscript quantitative-analysis scholars science science-mapping scientific scientometrics scopus
1.8 match 545 stars 12.54 score 518 scripts 2 dependents
walkerke
tigris:Load Census TIGER/Line Shapefiles
Download TIGER/Line shapefiles from the United States Census Bureau (<https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html>) and load into R as 'sf' objects.
Maintained by Kyle Walker. Last updated 4 months ago.
1.7 match 331 stars 12.87 score 5.3k scripts 16 dependents
opengeos
whitebox:'WhiteboxTools' R Frontend
An R frontend for the 'WhiteboxTools' library, which is an advanced geospatial data analysis platform developed by Prof. John Lindsay at the University of Guelph's Geomorphometry and Hydrogeomatics Research Group. 'WhiteboxTools' can be used to perform common geographical information systems (GIS) analysis operations, such as cost-distance analysis, distance buffering, and raster reclassification. Remote sensing and image processing tasks include image enhancement (e.g. panchromatic sharpening, contrast adjustments), image mosaicing, numerous filtering operations, simple classification (k-means), and common image transformations. 'WhiteboxTools' also contains advanced tooling for spatial hydrological analysis (e.g. flow-accumulation, watershed delineation, stream network analysis, sink removal), terrain analysis (e.g. common terrain indices such as slope, curvatures, wetness index, hillshading; hypsometric analysis; multi-scale topographic position analysis), and LiDAR data processing. Suggested citation: Lindsay (2016) <doi:10.1016/j.cageo.2016.07.003>.
Maintained by Andrew Brown. Last updated 5 months ago.
geomorphometry geoprocessing geospatial gis hydrology remote-sensing rstudio
2.3 match 173 stars 9.65 score 203 scripts 2 dependents
dmenne
breathtestcore:Core Functions to Read and Fit 13c Time Series from Breath Tests
Reads several formats of 13C data (IRIS/Wagner, BreathID) and CSV. Creates artificial sample data for testing. Fits Maes/Ghoos, Bluck-Coward self-correcting formula using 'nls', 'nlme'. Methods to fit breath test curves with Bayesian Stan methods are refactored to package 'breathteststan'. For a Shiny GUI, see package 'dmenne/breathtestshiny' on GitHub.
Maintained by Dieter Menne. Last updated 2 months ago.
13c breath breath-test gastroenterology medical stan
3.5 match 2 stars 6.19 score 64 scripts 1 dependents
cran
tseriesTARMA:Analysis of Nonlinear Time Series Through Threshold Autoregressive Moving Average Models (TARMA) Models
Routines for nonlinear time series analysis based on Threshold Autoregressive Moving Average (TARMA) models. It provides functions and methods for: TARMA model fitting and forecasting, including robust estimators, see Goracci et al. JBES (2025) <doi:10.1080/07350015.2024.2412011>; tests for threshold effects, see Giannerini et al. JoE (2024) <doi:10.1016/j.jeconom.2023.01.004>, Goracci et al. Statistica Sinica (2023) <doi:10.5705/ss.202021.0120>, Angelini et al. (2024) <doi:10.48550/arXiv.2308.00444>; unit-root tests based on TARMA models, see Chan et al. Statistica Sinica (2024) <doi:10.5705/ss.202022.0125>.
Maintained by Simone Giannerini. Last updated 5 months ago.
7.1 match 3.06 score
urswilke
datadaptor:Modify Labelled Data Sets With Excel Files
An R package to modify labelled data sets with commands in Excel files. The commands in this package make it possible to create new variables and to modify variable labels as well as the variables themselves. The goal is to provide an easy & concise syntax, and to allow for fast systematic data entry using Excel for advanced users. The commands work on the variables inside the data.frame environment (e.g. as inside dplyr verbs), thus providing an approach that may be easier to use for people without in-depth programming experience.
Maintained by Urs Wilke. Last updated 27 days ago.
4.2 match 5.13 score 1 dependents
dbosak01
procs:Recreates Some 'SAS®' Procedures in 'R'
Contains functions to simulate the most commonly used 'SAS®' procedures. Specifically, the package aims to simulate the functionality of 'proc freq', 'proc means', 'proc ttest', 'proc reg', 'proc transpose', 'proc sort', and 'proc print'. The simulation will include recreating all statistics with the highest fidelity possible.
Maintained by David Bosak. Last updated 10 months ago.
2.8 match 6 stars 7.57 score 37 scripts 2 dependents
ncss-tech
aqp:Algorithms for Quantitative Pedology
The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information; freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offers a convenient platform for bridging the gap between pedometric theory and practice.
Maintained by Dylan Beaudette. Last updated 29 days ago.
digital-soil-mapping ncss-tech nrcs pedology pedometrics soil soil-survey usda
1.8 match 55 stars 11.77 score 1.2k scripts 2 dependents
jinkim3
kim:A Toolkit for Behavioral Scientists
A collection of functions for analyzing data typically collected or used by behavioral scientists. Examples of the functions include a function that compares groups in a factorial experimental design, a function that conducts two-way analysis of variance (ANOVA), and a function that cleans a data set generated by Qualtrics surveys. Some of the functions will require installing additional package(s). Such packages and other references are cited within the section describing the relevant functions. Many functions in this package rely heavily on these two popular R packages: Dowle et al. (2021) <https://CRAN.R-project.org/package=data.table>. Wickham et al. (2021) <https://CRAN.R-project.org/package=ggplot2>.
Maintained by Jin Kim. Last updated 19 days ago.
4.5 match 7 stars 4.66 score 3 scripts
ssnn-airr
alakazam:Immunoglobulin Clonal Lineage and Diversity Analysis
Provides methods for high-throughput adaptive immune receptor repertoire sequencing (AIRR-Seq; Rep-Seq) analysis. In particular, immunoglobulin (Ig) sequence lineage reconstruction, lineage topology analysis, diversity profiling, amino acid property analysis and gene usage. Citations: Gupta and Vander Heiden, et al (2017) <doi:10.1093/bioinformatics/btv359>, Stern, Yaari and Vander Heiden, et al (2014) <doi:10.1126/scitranslmed.3008879>.
Maintained by Susanna Marquez. Last updated 3 months ago.
2.3 match 9.09 score 424 scripts 7 dependents
bioc
sparseMatrixStats:Summary Statistics for Rows and Columns of Sparse Matrices
High performance functions for row and column operations on sparse matrices. For example: col / rowMeans2, col / rowMedians, col / rowVars etc. Currently, the optimizations are limited to data in the column sparse format. This package is inspired by the matrixStats package by Henrik Bengtsson.
Maintained by Constantin Ahlmann-Eltze. Last updated 5 months ago.
infrastructure software datarepresentation cpp
1.7 match 54 stars 11.98 score 174 scripts 126 dependents
bioc
DelayedMatrixStats:Functions that Apply to Rows and Columns of 'DelayedMatrix' Objects
A port of the 'matrixStats' API for use with DelayedMatrix objects from the 'DelayedArray' package. High-performing functions operating on rows and columns of DelayedMatrix objects, e.g. col / rowMedians(), col / rowRanks(), and col / rowSds(). Functions optimized per data type and for subsetted calculations such that both memory usage and processing time are minimized.
Maintained by Peter Hickey. Last updated 2 months ago.
infrastructure datarepresentation software
1.7 match 16 stars 11.86 score 211 scripts 112 dependents
cran
MARVEL:Revealing Splicing Dynamics at Single-Cell Resolution
Alternative splicing represents an additional and underappreciated layer of complexity underlying gene expression profiles. Nevertheless, there remains hitherto a paucity of software to investigate splicing dynamics at single-cell resolution. 'MARVEL' enables splicing analysis of single-cell RNA-sequencing data generated from plate- and droplet-based library preparation methods.
Maintained by Sean Wen. Last updated 2 years ago.
7.5 match 2.71 score 51 scripts
bioc
MatrixGenerics:S4 Generic Summary Statistic Functions that Operate on Matrix-Like Objects
S4 generic functions modeled after the 'matrixStats' API for alternative matrix implementations. Packages with alternative matrix implementations can depend on this package and implement the generic functions that are defined here for a useful set of row and column summary statistics. Other package developers can import this package and handle different matrix implementations without worrying about incompatibilities.
Maintained by Peter Hickey. Last updated 2 months ago.
infrastructure software bioconductor-package core-package
1.7 match 12 stars 11.64 score 129 scripts 1.3k dependents
gianmarcoalberti
chisquare:Chi-Square and G-Square Test of Independence, Power and Residual Analysis, Measures of Categorical Association
Provides the facility to perform the chi-square and G-square test of independence, calculates the retrospective power of the traditional chi-square test, computes permutation and Monte Carlo p-values, and provides measures of association for tables of any size such as Phi, Phi corrected, odds ratio with 95 percent CI and p-value, Yule's Q and Y, adjusted contingency coefficient, Cramer's V, V corrected, V standardised, bias-corrected V, W, Cohen's w, Goodman-Kruskal's lambda, and tau. It also calculates standardised, moment-corrected standardised, and adjusted standardised residuals, and their significance, as well as the Quetelet Index, IJ association factor, and adjusted standardised counts. It also computes the chi-square-maximising version of the input table. Different outputs are returned in nicely formatted tables.
Maintained by Gianmarco Alberti. Last updated 5 months ago.
9.9 match 2.00 score 1 scripts
sdctools
sdcMicro:Statistical Disclosure Control Methods for Anonymization of Data and Risk Estimation
Data from statistical agencies and other institutions are mostly confidential. This package, introduced in Templ, Kowarik and Meindl (2017) <doi:10.18637/jss.v067.i04>, can be used for the generation of anonymized (micro)data, i.e. for the creation of public- and scientific-use files. The theoretical basis for the methods implemented can be found in Templ (2017) <doi:10.1007/978-3-319-50272-4>. Various risk estimation and anonymization methods are included. Note that the package includes a graphical user interface, published in Meindl and Templ (2019) <doi:10.3390/a12090191>, that gives access to various methods of this package.
Maintained by Matthias Templ. Last updated 26 days ago.
2.0 match 83 stars 9.89 score 258 scripts
kurthornik
clue:Cluster Ensembles
CLUster Ensembles.
Maintained by Kurt Hornik. Last updated 4 months ago.
2.0 match 2 stars 9.85 score 496 scripts 401 dependents
tntp
tntpr:Data Analysis Tools Customized for TNTP
An assortment of functions and templates customized to meet the needs of data analysts at the non-profit organization TNTP. Includes functions for branded colors and plots, credentials management, repository set-up, and other common analytic tasks.
Maintained by Dustin Pashouwer. Last updated 4 months ago.
3.4 match 7 stars 5.83 score 13 scripts
laresbernardo
lares:Analytics & Machine Learning Sidekick
Auxiliary package for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of family functions, like Machine Learning, Data Wrangling, Marketing Mix Modeling (Robyn), Exploratory, API, and Scrapper, it helps the analyst or data scientist to get quick and robust results, without the need of repetitive coding or advanced R programming skills.
Maintained by Bernardo Lares. Last updated 24 days ago.
analytics api automation automl data-science descriptive-statistics h2o machine-learning marketing mmm predictive-modeling puzzle rlanguage robyn visualization
2.0 match 233 stars 9.84 score 185 scripts 1 dependents
eogrady21
vprr:Processing and Visualization of Video Plankton Recorder Data
An oceanographic data processing package for analyzing and visualizing Video Plankton Recorder data. This package was developed at 'Bedford Institute of Oceanography'. Functions are designed to process automated image classification output and create organized and easily portable data products.
Maintained by Emily OGrady. Last updated 1 month ago.
3.5 match 2 stars 5.61 score 17 scripts
grunwaldlab
poppr:Genetic Analysis of Populations with Mixed Reproduction
Population genetic analyses for hierarchical analysis of partially clonal populations built upon the architecture of the 'adegenet' package. Originally described in Kamvar, Tabima, and Grünwald (2014) <doi:10.7717/peerj.281> with version 2.0 described in Kamvar, Brooks, and Grünwald (2015) <doi:10.3389/fgene.2015.00208>.
Maintained by Zhian N. Kamvar. Last updated 10 months ago.
clonality genetic-analysis genetic-distances minimum-spanning-networks multilocus-genotypes multilocus-lineages population-genetics populations openmp
1.8 match 69 stars 10.84 score 672 scripts
r-gregmisc
gmodels:Various R Programming Tools for Model Fitting
Various R programming tools for model fitting.
Maintained by Gregory R. Warnes. Last updated 3 months ago.
1.8 match 1 stars 10.01 score 3.5k scripts 30 dependents
quanteda
quanteda.textstats:Textual Statistics for the Quantitative Analysis of Textual Data
Textual statistics functions formerly in the 'quanteda' package. Textual statistics for characterizing and comparing textual data. Includes functions for measuring term and document frequency, the co-occurrence of words, similarity and distance between features and documents, feature entropy, keyword occurrence, readability, and lexical diversity. These functions extend the 'quanteda' package and are specially designed for sparse textual data.
Maintained by Kenneth Benoit. Last updated 6 months ago.
2.0 match 15 stars 8.91 score 916 scripts 10 dependents
magnusdv
pedtools:Creating and Working with Pedigrees and Marker Data
A comprehensive collection of tools for creating, manipulating and visualising pedigrees and genetic marker data. Pedigrees can be read from text files or created on the fly with built-in functions. A range of utilities enable modifications like adding or removing individuals, breaking loops, and merging pedigrees. An online tool for creating pedigrees interactively, based on 'pedtools', is available at <https://magnusdv.shinyapps.io/quickped>. 'pedtools' is the hub of the 'pedsuite', a collection of packages for pedigree analysis. A detailed presentation of the 'pedsuite' is given in the book 'Pedigree Analysis in R' (Vigeland, 2021, ISBN:9780128244302).
Maintained by Magnus Dehli Vigeland. Last updated 2 months ago.
2.0 match 25 stars 8.83 score 60 scripts 18 dependents
opensafely-core
osutils:Useful Functions for OpenSAFELY
Contains functions that are often needed when using the OpenSAFELY platform <https://www.opensafely.org/>, such as redaction and low-memory processing.
Maintained by William Hulme. Last updated 3 months ago.
10.3 match 1.70 score 1 scripts
spatstat
spatstat.explore:Exploratory Data Analysis for the 'spatstat' Family
Functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.
Maintained by Adrian Baddeley. Last updated 1 month ago.
cluster-detection confidence-intervals hypothesis-testing k-function roc-curves scan-statistics significance-testing simulation-envelopes spatial-analysis spatial-data-analysis spatial-sharpening spatial-smoothing spatial-statistics
1.7 match 1 stars 10.17 score 67 scripts 148 dependents
chstock
DTComPair:Comparison of Binary Diagnostic Tests in a Paired Study Design
Comparison of the accuracy of two binary diagnostic tests in a "paired" study design, i.e. when each test is applied to each subject in the study.
Maintained by Christian Stock. Last updated 5 months ago.
clinical-epidemiology comparative-analysis diagnosis diagnostic-accuracy-studies diagnostic-likelihood-ratio diagnostic-tests medicine predictive-value sensitivity specificity
3.4 match 1 stars 5.07 score 47 scripts
muschellij2
neurobase:'Neuroconductor' Base Package with Helper Functions for 'nifti' Objects
Base package for 'Neuroconductor', which includes many helper functions that interact with objects of class 'nifti', implemented by package 'oro.nifti', for reading/writing and also other manipulation functions.
Maintained by John Muschelli. Last updated 1 month ago.
2.0 match 5 stars 8.49 score 486 scripts 7 dependents
joshobrien
rasterDT:Fast Raster Summary and Manipulation
Fast alternatives to several relatively slow 'raster' package functions. For large rasters, the functions run from 5 to approximately 100 times faster than the 'raster' package functions they replace. The 'fasterize' package, on which one function in this package depends, includes an implementation of the scan line algorithm attributed to Wylie et al. (1967) <doi:10.1145/1465611.1465619>.
Maintained by Joshua OBrien. Last updated 2 years ago.
3.7 match 27 stars 4.61 score 8 scripts 1 dependents
ropensci
USAboundaries:Historical and Contemporary Boundaries of the United States of America
The boundaries for geographical units in the United States of America contained in this package include state, county, congressional district, and zip code tabulation area. Contemporary boundaries are provided by the U.S. Census Bureau (public domain). Historical boundaries for the years from 1629 to 2000 are provided from the Newberry Library's 'Atlas of Historical County Boundaries' (licensed CC BY-NC-SA). Additional data is provided in the 'USAboundariesData' package; this package provides an interface to access that data.
Maintained by Lincoln Mullen. Last updated 3 years ago.
digital-history history spatial-data
2.3 match 58 stars 7.33 score 1.2k scripts
cran
poliscidata:Datasets and Functions Featured in Pollock and Edwards, an R Companion to Essentials of Political Analysis, Second Edition
Bundles the datasets and functions used in the textbook by Philip Pollock and Barry Edwards, an R Companion to Essentials of Political Analysis, Second Edition.
Maintained by Barry Edwards. Last updated 5 years ago.
7.3 match 2.33 score 213 scripts
bergsmat
tablet:Tabulate Descriptive Statistics in Multiple Formats
Creates a table of descriptive statistics for factor and numeric columns in a data frame. Displays these by groups, if any. Highly customizable, with support for 'html' and 'pdf' provided by 'kableExtra'. Respects original column order, column labels, and factor level order. See ?tablet.data.frame and vignettes.
Maintained by Tim Bergsma. Last updated 4 months ago.
3.0 match 3 stars 5.57 score 26 scripts
stulacy
epitab:Flexible Contingency Tables for Epidemiology
Builds contingency tables that cross-tabulate multiple categorical variables and also calculates various summary measures. Export to a variety of formats is supported, including: 'HTML', 'LaTeX', and 'Excel'.
Maintained by Stuart Lacy. Last updated 7 years ago.
3.8 match 1 stars 4.41 score 17 scripts
trinker
textshape:Tools for Reshaping Text
Tools that can be used to reshape and restructure text data.
Maintained by Tyler Rinker. Last updated 12 months ago.
data-reshaping manipulation sentence-boundary-detection text-data text-formating tidy
1.8 match 50 stars 9.18 score 266 scripts 34 dependents
environmentalinformatics-marburg
satellite:Handling and Manipulating Remote Sensing Data
Herein, we provide a broad variety of functions which are useful for handling, manipulating, and visualizing satellite-based remote sensing data. These operations range from mere data import and layer handling (e.g. subsetting), through Raster*-typical data wrangling (e.g. crop, extend), to more sophisticated (pre-)processing tasks typically applied to satellite imagery (e.g. atmospheric and topographic correction). This functionality is complemented by full access to the satellite layers' metadata at any stage and the documentation of performed actions in a separate log file. Currently available sensors include Landsat 4-5 (TM), 7 (ETM+), and 8 (OLI/TIRS Combined), and additional compatibility is ensured for the Landsat Global Land Survey data set.
Maintained by Florian Detsch. Last updated 1 year ago.
1.7 match 22 stars 9.88 score 61 scripts 27 dependents
oakleyj
SHELF:Tools to Support the Sheffield Elicitation Framework
Implements various methods for eliciting a probability distribution for a single parameter from an expert or a group of experts. The expert provides a small number of probability judgements, corresponding to points on his or her cumulative distribution function. A range of parametric distributions can then be fitted and displayed, with feedback provided in the form of fitted probabilities and percentiles. For multiple experts, a weighted linear pool can be calculated. Also includes functions for eliciting beliefs about population distributions; eliciting multivariate distributions using a Gaussian copula; eliciting a Dirichlet distribution; eliciting distributions for variance parameters in a random effects meta-analysis model; survival extrapolation. R Shiny apps for most of the methods are included.
Maintained by Jeremy Oakley. Last updated 15 days ago.
1.8 match 19 stars 8.90 score 73 scripts 3 dependents
bioc
beadarray:Quality assessment and low-level analysis for Illumina BeadArray data
The package is able to read bead-level data (raw TIFFs and text files) output by BeadScan as well as bead-summary data from BeadStudio. Methods for quality assessment and low-level analysis are provided.
Maintained by Mark Dunning. Last updated 5 months ago.
microarray onechannel qualitycontrol preprocessing
2.0 match 7.88 score 70 scripts 4 dependents
olihawkins
tabbycat:Tabulate and Summarise Categorical Data
Functions for tabulating and summarising categorical variables. Most functions are designed to work with dataframes, and use the 'tidyverse' idiom of taking the dataframe as the first argument so they work within pipelines. Equivalent functions that operate directly on vectors are also provided where it makes sense. This package aims to make exploratory data analysis involving categorical variables quicker, simpler and more robust.
Maintained by Oliver Hawkins. Last updated 2 years ago.
3.6 match 36 stars 4.26 score 2 scripts
thothorn
libcoin:Linear Test Statistics for Permutation Inference
Basic infrastructure for linear test statistics and permutation inference in the framework of Strasser and Weber (1999) <https://epub.wu.ac.at/102/>. This package must not be used by end-users. CRAN package 'coin' implements all user interfaces and is ready to be used by anyone.
Maintained by Torsten Hothorn. Last updated 1 year ago.
2.3 match 1 stars 6.81 score 25 scripts 171 dependentsropensci
rdhs:API Client and Dataset Management for the Demographic and Health Survey (DHS) Data
Provides a client for (1) querying the DHS API for survey indicators and metadata (<https://api.dhsprogram.com/#/index.html>), (2) identifying surveys and datasets for analysis, (3) downloading survey datasets from the DHS website, (4) loading datasets and associated metadata into R, and (5) extracting variables and combining datasets for pooled analysis.
Maintained by OJ Watson. Last updated 17 days ago.
datasetdhsdhs-apiextractpeer-reviewedsurvey-data
1.5 match 35 stars 10.07 score 286 scripts 3 dependentsmuschellij2
fslr:Wrapper Functions for 'FSL' ('FMRIB' Software Library) from Functional MRI of the Brain ('FMRIB')
Wrapper functions that interface with 'FSL' <http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/>, a powerful and commonly-used 'neuroimaging' software, using system commands. The goal is to be able to interface with 'FSL' completely in R, where you pass R objects of class 'nifti', implemented by package 'oro.nifti', and the function executes an 'FSL' command and returns an R object of class 'nifti' if desired.
Maintained by John Muschelli. Last updated 1 month ago.
fslfslrneuroimagingneuroimaging-analysisneuroimaging-data-science
1.9 match 41 stars 8.01 score 420 scriptsai-sdc
acro:A Tool for Semi-Automating the Statistical Disclosure Control of Research Outputs
Assists researchers and output checkers by distinguishing between research output that is safe to publish, output that requires further analysis, and output that cannot be published because of substantial disclosure risk. A paper about the tool was presented at the UNECE Expert Meeting on Statistical Data Confidentiality 2023; see <https://uwe-repository.worktribe.com/output/11060964>.
Maintained by Jim Smith. Last updated 10 days ago.
data-privacydata-protectionprivacyprivacy-toolsstatistical-disclosure-controlstatistical-software
3.5 match 1 stars 4.11 score 1 scriptshuismanj
qpNCA:Noncompartmental Pharmacokinetic Analysis by qPharmetra
Computes noncompartmental pharmacokinetic parameters for drug concentration profiles. For each profile, data imputations and adjustments are made as necessary and basic parameters are estimated. Supports single dose, multi-dose, and multi-subject data. Supports steady-state calculations and various routes of drug administration. See ?qpNCA and vignettes. Methodology follows Rowland and Tozer (2011, ISBN:978-0-683-07404-8), Gabrielsson and Weiner (1997, ISBN:978-91-9765-100-4), and Gibaldi and Perrier (1982, ISBN:978-0824710422).
Maintained by Jan Huisman. Last updated 4 years ago.
3.8 match 3.83 score 34 scriptsuligges
klaR:Classification and Visualization
Miscellaneous functions for classification and visualization, e.g. regularized discriminant analysis, sknn() kernel-density naive Bayes, an interface to 'svmlight' and stepclass() wrapper variable selection for supervised classification, partimat() visualization of classification rules and shardsplot() of cluster results as well as kmodes() clustering for categorical data, corclust() variable clustering, variable extraction from different variable clustering models and weight of evidence preprocessing.
Maintained by Uwe Ligges. Last updated 1 year ago.
1.9 match 5 stars 7.61 score 1.4k scripts 13 dependentsegenn
rtemis:Machine Learning and Visualization
Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.
Maintained by E.D. Gennatas. Last updated 1 month ago.
data-sciencedata-visualizationmachine-learningmachine-learning-libraryvisualization
2.0 match 145 stars 7.09 score 50 scripts 2 dependentsprojectmosaic
mosaicCore:Common Utilities for Other MOSAIC-Family Packages
Common utilities used in other MOSAIC-family packages are collected here.
Maintained by Randall Pruim. Last updated 1 year ago.
2.0 match 1 stars 7.07 score 113 scripts 26 dependentsinsightsengineering
tern.rbmi:Create Interface for 'RBMI' and 'tern'
'RBMI' implements standard and reference based multiple imputation methods for continuous longitudinal endpoints (Gower-Page et al. (2022) <doi:10.21105/joss.04251>). This package provides an interface for 'RBMI' that uses the 'tern' <https://cran.r-project.org/package=tern> framework by Zhu et al. (2023) and tabulates results easily using 'rtables' <https://cran.r-project.org/package=rtables> by Becker et al. (2023).
Maintained by Joe Zhu. Last updated 10 days ago.
2.2 match 3 stars 6.53 score 3 scriptskestrel99
pmxTools:Pharmacometric and Pharmacokinetic Toolkit
Pharmacometric tools for common data analytical tasks; closed-form solutions for calculating concentrations at given times after dosing based on compartmental PK models (1-compartment, 2-compartment and 3-compartment, covering infusions, zero- and first-order absorption, and lag times, after single doses and at steady state, per Bertrand & Mentre (2008) <http://lixoft.com/wp-content/uploads/2016/03/PKPDlibrary.pdf>); parametric simulation from NONMEM-generated parameter estimates and other output; and parsing, tabulating and plotting results generated by Perl-speaks-NONMEM (PsN).
Maintained by Justin Wilkins. Last updated 7 months ago.
nonmempharmacokineticssimulation
2.1 match 30 stars 6.40 score 84 scriptsfinnishcancerregistry
popEpi:Functions for Epidemiological Analysis using Population Data
Enables computation of epidemiological statistics, including those where counts or mortality rates of the reference population are used. Currently supported: excess hazard models (Dickman, Sloggett, Hills, and Hakulinen (2012) <doi:10.1002/sim.1597>), rates, mean survival times, relative/net survival (in particular the Ederer II (Ederer and Heise (1959)) and Pohar Perme (Pohar Perme, Stare, and Esteve (2012) <doi:10.1111/j.1541-0420.2011.01640.x>) estimators), and standardized incidence and mortality ratios, all of which can be easily adjusted for by covariates such as age. Fast splitting and aggregation of 'Lexis' objects (from package 'Epi') and other computations achieved using 'data.table'.
Maintained by Joonas Miettinen. Last updated 1 month ago.
adjust-estimatesage-adjustingdirect-adjustingepidemiologyindirect-adjustingsurvival
1.7 match 8 stars 8.05 score 117 scripts 1 dependentsnmautoverse
NMdata:Preparation, Checking and Post-Processing Data for PK/PD Modeling
Efficient tools for preparation, checking and post-processing of data in PK/PD (pharmacokinetics/pharmacodynamics) modeling, with focus on use of Nonmem. Attention is paid to ensure consistency, traceability, and Nonmem compatibility of Data. Rigorously checks final Nonmem datasets. Implemented in 'data.table', but easily integrated with 'base' and 'tidyverse'.
Maintained by Philip Delff. Last updated 2 days ago.
1.8 match 17 stars 7.69 score 88 scripts 2 dependentsroelandkindt
BiodiversityR:Package for Community Ecology and Suitability Analysis
Graphical User Interface (via the R-Commander) and utility functions (often based on the vegan package) for statistical analysis of biodiversity and ecological communities, including species accumulation curves, diversity indices, Renyi profiles, GLMs for analysis of species abundance and presence-absence, distance matrices, Mantel tests, and cluster, constrained and unconstrained ordination analysis. A book on biodiversity and community ecology analysis is available for free download from the website. In 2012, methods for (ensemble) suitability modelling and mapping were expanded in the package.
Maintained by Roeland Kindt. Last updated 2 months ago.
1.8 match 16 stars 7.42 score 390 scripts 2 dependentsjeffreyevans
yaImpute:Nearest Neighbor Observation Imputation and Evaluation Tools
Performs nearest neighbor-based imputation using one or more alternative approaches to processing multivariate data. These include methods based on canonical correlation analysis, canonical correspondence analysis, and a multivariate adaptation of the random forest classification and regression techniques of Leo Breiman and Adele Cutler. Additional methods are also offered. The package includes functions for comparing the results from running alternative techniques, detecting imputation targets that are notably distant from reference observations, detecting and correcting for bias, bootstrapping and building ensemble imputations, and mapping results.
Maintained by Jeffrey S. Evans. Last updated 6 months ago.
1.8 match 3 stars 7.40 score 94 scripts 12 dependentscvoeten
buildmer:Stepwise Elimination and Term Reordering for Mixed-Effects Regression
Finds the largest possible regression model that will still converge for various types of regression analyses (including mixed models and generalized additive models) and then optionally performs stepwise elimination similar to the forward and backward effect-selection methods in SAS, based on the change in log-likelihood or its significance, Akaike's Information Criterion, the Bayesian Information Criterion, the explained deviance, or the F-test of the change in R².
Maintained by Cesko C. Voeten. Last updated 1 year ago.
2.3 match 5.82 score 200 scriptsddalthorp
GenEst:Generalized Mortality Estimator
Command-line and 'shiny' GUI implementation of the GenEst models for estimating bird and bat mortality at wind and solar power facilities, following Dalthorp, et al. (2018) <doi:10.3133/tm7A2>.
Maintained by Daniel Dalthorp. Last updated 2 years ago.
1.7 match 7 stars 7.81 score 55 scripts 2 dependentsbioc
ropls:PCA, PLS(-DA) and OPLS(-DA) for multivariate analysis and feature selection of omics data
Latent variable modeling with Principal Component Analysis (PCA) and Partial Least Squares (PLS) are powerful methods for visualization, regression, classification, and feature selection of omics data where the number of variables exceeds the number of samples and with multicollinearity among variables. Orthogonal Partial Least Squares (OPLS) makes it possible to model separately the variation correlated with the factor of interest (predictive) and the uncorrelated (orthogonal) variation. While performing similarly to PLS, OPLS facilitates interpretation. Successful applications of these chemometrics techniques include spectroscopic data such as Raman spectroscopy, nuclear magnetic resonance (NMR), mass spectrometry (MS) in metabolomics and proteomics, but also transcriptomics data. In addition to scores, loadings and weights plots, the package provides metrics and graphics to determine the optimal number of components (e.g. with the R2 and Q2 coefficients), check the validity of the model by permutation testing, detect outliers, and perform feature selection (e.g. with Variable Importance in Projection or regression coefficients). The package can be accessed via a user interface on the Workflow4Metabolomics.org online resource for computational metabolomics (built upon the Galaxy environment).
Maintained by Etienne A. Thevenot. Last updated 5 months ago.
regressionclassificationprincipalcomponenttranscriptomicsproteomicsmetabolomicslipidomicsmassspectrometryimmunooncology
1.7 match 7.55 score 210 scripts 8 dependentstrinker
qdapTools:Tools for the 'qdap' Package
A collection of tools associated with the 'qdap' package that may be useful outside of the context of text analysis.
Maintained by Tyler Rinker. Last updated 2 years ago.
1.8 match 16 stars 7.04 score 408 scripts 5 dependentssem-in-r
seminr:Building and Estimating Structural Equation Models
A powerful, easy to use syntax for specifying and estimating complex Structural Equation Models. Models can be estimated using Partial Least Squares Path Modeling or Covariance-Based Structural Equation Modeling or covariance based Confirmatory Factor Analysis. Methods described in Ray, Danks, and Valdez (2021).
Maintained by Nicholas Patrick Danks. Last updated 3 years ago.
common-factorscompositesconstructpls-models
1.7 match 62 stars 7.46 score 284 scriptsibecav
CGPfunctions:Powell Miscellaneous Functions for Teaching and Learning Statistics
Miscellaneous functions useful for teaching statistics as well as actually practicing the art. They typically are not new methods but rather wrappers around either base R or other packages.
Maintained by Chuck Powell. Last updated 4 years ago.
1.7 match 27 stars 7.28 score 122 scriptspetermeissner
tabit:Simple Tabulation Made Simple
Simple tabulation should be dead simple. This package is an opinionated approach to easy tabulations while also providing exact numbers and allowing for re-usability. This is achieved by providing tabulations as data.frames with columns for values, optional variable names, frequency counts including and excluding NAs and percentages for counts including and excluding NAs. Also, values are automatically sorted in decreasing order of frequency counts to allow for fast skimming of the most important information.
Maintained by Peter Meissner. Last updated 5 years ago.
4.1 match 2 stars 3.00 score 3 scriptsbioc
SynExtend:Tools for Working With Synteny Objects
Shared order between genomic sequences provides a great deal of information. Synteny objects produced by the R package DECIPHER provide quantitative information about that shared order. SynExtend provides tools for extracting information from Synteny objects.
Maintained by Nicholas Cooley. Last updated 3 days ago.
geneticsclusteringcomparativegenomicsdataimportfortranopenmp
1.9 match 1 stars 6.42 score 77 scriptsgavinrozzi
zipcodeR:Data & Functions for Working with US ZIP Codes
Make working with ZIP codes in R painless with an integrated dataset of U.S. ZIP codes and functions for working with them. Search ZIP codes by multiple geographies, including state, county, city & across time zones. Also included are functions for relating ZIP codes to Census data, geocoding & distance calculations.
Maintained by Gavin Rozzi. Last updated 1 year ago.
1.6 match 80 stars 7.31 score 176 scriptsgianmarcoalberti
caplot:Correspondence Analysis with Geometric Frequency Interpretation
Performs Correspondence Analysis on the given dataframe and plots the results in a scatterplot that emphasizes the geometric interpretation aspect of the analysis, following Borg-Groenen (2005) and Yelland (2010). It is particularly useful for highlighting the relationships between a selected row (or column) category and the column (or row) categories. See Borg-Groenen (2005, ISBN:978-0-387-28981-6); Yelland (2010) <doi:10.3888/tmj.12-4>.
Maintained by Gianmarco Alberti. Last updated 2 years ago.
6.8 match 1.70 score 1 scriptsalexanderrobitzsch
BIFIEsurvey:Tools for Survey Statistics in Educational Assessment
Contains tools for survey statistics (especially in educational assessment) for datasets with replication designs (jackknife, bootstrap, replicate weights; see Kolenikov, 2010; Pfefferman & Rao, 2009a, 2009b, <doi:10.1016/S0169-7161(09)70003-3>, <doi:10.1016/S0169-7161(09)70037-9>; Shao, 1996, <doi:10.1080/02331889708802523>). Descriptive statistics, linear and logistic regression, path models for manifest variables with measurement error correction and two-level hierarchical regressions for weighted samples are included. Statistical inference can be conducted for multiply imputed datasets and nested multiply imputed datasets and is particularly suited for the analysis of plausible values (for details see George, Oberwimmer & Itzlinger-Bruneforth, 2016; Bruneforth, Oberwimmer & Robitzsch, 2016; Robitzsch, Pham & Yanagida, 2016). The package development was supported by BIFIE (Federal Institute for Educational Research, Innovation and Development of the Austrian School System; Salzburg, Austria).
Maintained by Alexander Robitzsch. Last updated 11 months ago.
2.3 match 4 stars 4.99 score 85 scripts 1 dependentsmurrayefford
openCR:Open Population Capture-Recapture
Non-spatial and spatial open-population capture-recapture analysis.
Maintained by Murray Efford. Last updated 5 months ago.
1.9 match 4 stars 5.98 score 53 scriptsstats-uoa
s20x:Functions for University of Auckland Course STATS 201/208 Data Analysis
A set of functions used in teaching STATS 201/208 Data Analysis at the University of Auckland. The functions are designed to make parts of R more accessible to a large undergraduate population who are mostly not statistics majors.
Maintained by James Curran. Last updated 2 years ago.
1.8 match 3 stars 6.40 score 211 scripts 3 dependentspsolymos
mefa:Multivariate Data Handling in Ecology and Biogeography
A framework package that aims to provide a standardized computational environment for specialist work via object classes to represent the data coded by samples, taxa and segments (i.e. subpopulations, repeated measures). It supports easy processing of the data along with cross tabulation and relational data tables for samples and taxa. An object of class 'mefa' is a project specific compendium of the data and can be easily used in further analyses. Methods are provided for extraction, aggregation, conversion, plotting, summary and reporting of 'mefa' objects. Reports can be generated in plain text or LaTeX format. Vignette contains worked examples.
Maintained by Peter Solymos. Last updated 10 months ago.
2.3 match 2 stars 4.82 score 111 scripts 2 dependentsmiddleton-lab
abd:The Analysis of Biological Data
The abd package contains data sets and sample code for The Analysis of Biological Data by Michael Whitlock and Dolph Schluter (2009; Roberts & Company Publishers).
Maintained by Kevin M. Middleton. Last updated 11 months ago.
2.0 match 6 stars 5.53 score 182 scripts 1 dependentsbioc
ontoProc:processing of ontologies of anatomy, cell lines, and so on
Support harvesting of diverse bioinformatic ontologies, making particular use of the ontologyIndex package on CRAN. We provide snapshots of key ontologies for terms about cells, cell lines, chemical compounds, and anatomy, to help analyze genome-scale experiments, particularly cell x compound screens. Another purpose is to strengthen development of compelling use cases for richer interfaces to emerging ontologies.
Maintained by Vincent Carey. Last updated 3 days ago.
infrastructuregobioinformaticsgenomicsontology
1.7 match 3 stars 6.37 score 75 scripts 2 dependentsashesitr
reservr:Fit Distributions and Neural Networks to Censored and Truncated Data
Define distribution families and fit them to interval-censored and interval-truncated data, where the truncation bounds may depend on the individual observation. The defined distributions feature density, probability, sampling and fitting methods as well as efficient implementations of the log-density log f(x) and log-probability log P(x0 <= X <= x1) for use in 'TensorFlow' neural networks via the 'tensorflow' package. Allows training parametric neural networks on interval-censored and interval-truncated data with flexible parameterization. Applications include Claims Development in Non-Life Insurance, e.g. modelling reporting delay distributions from incomplete data, see Bücher, Rosenstock (2022) <doi:10.1007/s13385-022-00314-4>.
Maintained by Alexander Rosenstock. Last updated 9 months ago.
2.0 match 5 stars 5.35 score 9 scriptsrichardhooijmaijers
R3port:Report Functions to Create HTML and PDF Files
Create and combine HTML and PDF reports from within R. Possibility to design tables and listings for reporting and also include R plots.
Maintained by Richard Hooijmaijers. Last updated 1 year ago.
1.9 match 10 stars 5.71 score 34 scripts 1 dependentspsolymos
mefa4:Multivariate Data Handling with S4 Classes and Sparse Matrices
An S4 update of the 'mefa' package using sparse matrices for enhanced efficiency. Sparse array-like objects are supported via lists of sparse matrices.
Maintained by Peter Solymos. Last updated 6 months ago.
data-manipulationecologysparse-matrices
2.0 match 5.34 score 368 scripts 2 dependentsclewerenz
ilabelled:Simple Handling of Labelled Data
Simple handling of survey data. Smart handling of meta-information such as variable labels, value labels and scale levels. Easy access and validation of meta-information. Usage of value labels and values respectively for subsetting and recoding data.
Maintained by Christof Lewerenz. Last updated 2 months ago.
1.7 match 2 stars 6.02 score 13 scriptsellessenne
rsimsum:Analysis of Simulation Studies Including Monte Carlo Error
Summarise results from simulation studies and compute Monte Carlo standard errors of commonly used summary statistics. This package is modelled on the 'simsum' user-written command in 'Stata' (White I.R., 2010 <https://www.stata-journal.com/article.html?article=st0200>), further extending it with additional performance measures and functionality.
Maintained by Alessandro Gasparini. Last updated 10 months ago.
biostatisticsmonte-carlo-errorsimulationsimulation-studysimulationsstatistics
1.3 match 28 stars 7.70 score 148 scriptsurswilke
crosstabser:Generate Crosstabs of Labelled Data Sets with Excel Files
An R package to use commands in Excel files to generate crosstabs of labelled data sets (usually survey data). The crosstabs can be printed to the console, and serve as an input for an app to plot them interactively.
Maintained by Urs Wilke. Last updated 17 days ago.
2.3 match 4.48 scoresimonmoulds
lulcc:Land Use Change Modelling in R
Classes and methods for spatially explicit land use change modelling in R.
Maintained by Simon Moulds. Last updated 5 years ago.
1.8 match 41 stars 5.37 score 38 scriptsreginalexavier
OpenLand:Quantitative Analysis and Visualization of LUCC
Tools for the analysis of land use and cover (LUC) time series. It includes support for loading spatiotemporal raster data and synthesized spatial plotting. Several LUC change (LUCC) metrics in regular or irregular time intervals can be extracted and visualized through one- and multistep sankey and chord diagrams. A complete intensity analysis according to Aldwaik and Pontius (2012) <doi:10.1016/j.landurbplan.2012.02.010> is implemented, including tools for the generation of standardized multilevel output graphics.
Maintained by Reginal Exavier. Last updated 11 months ago.
geographygeospatialintensity-analysisland-use-and-land-cover-changeluc-mapslulcplotrasters
1.7 match 22 stars 5.80 score 19 scriptstmsalab
cIRT:Choice Item Response Theory
Jointly model the accuracy of cognitive responses and item choices within a Bayesian hierarchical framework as described by Culpepper and Balamuta (2015) <doi:10.1007/s11336-015-9484-7>. In addition, the package contains the datasets used within the analysis of the paper.
Maintained by James Joseph Balamuta. Last updated 3 years ago.
armadillobayesianchoicecognitive-diagnostic-modelsgibbs-samplingitem-response-theoryrcpparmadilloopenblascppopenmp
1.9 match 4 stars 5.14 score 23 scriptsbioc
cTRAP:Identification of candidate causal perturbations from differential gene expression data
Compare differential gene expression results with those from known cellular perturbations (such as gene knock-down, overexpression or small molecules) derived from the Connectivity Map. Such analyses make it possible not only to infer the molecular causes of the observed difference in gene expression but also to identify small molecules that could drive or revert specific transcriptomic alterations.
Maintained by Nuno Saraiva-Agostinho. Last updated 5 months ago.
differentialexpressiongeneexpressionrnaseqtranscriptomicspathwaysimmunooncologygenesetenrichmentbioconductorbioinformaticscmapgene-expressionl1000
1.9 match 5 stars 5.08 score 16 scriptsrdinnager
slimr:Create, Run and Post-Process 'SLiM' Population Genetics Forward Simulations
Lets you write 'SLiM' scripts (population genomics simulation) using your favourite R IDE, using a syntax as close as possible to the original 'SLiM' language. It offers many tools to manipulate those scripts, run them in the 'SLiM' software from R, and capture and post-process their output, after or even during a simulation.
Maintained by Russell Dinnage. Last updated 4 months ago.
2.0 match 8 stars 4.70 score 42 scriptsbemts-hhs
nemsqar:National Emergency Medical Service Quality Alliance Measure Calculations
Designed to automate the calculation of Emergency Medical Service (EMS) quality metrics, 'nemsqar' implements measures defined by the National EMS Quality Alliance (NEMSQA). By providing reliable, evidence-based quality assessments, the package supports EMS agencies, healthcare providers, and researchers in evaluating and improving patient outcomes. Users can find details on all approved NEMSQA measures at <https://www.nemsqa.org/measures>. Full technical specifications, including documentation and pseudocode used to develop 'nemsqar', are available on the NEMSQA website after creating a user profile at <https://www.nemsqa.org>.
Maintained by Nicolas Foss. Last updated 3 days ago.
2.0 match 5 stars 4.70 scorebioimaginggroup
bioimagetools:Tools for Microscopy Imaging
Tools for 3D imaging, mostly for biology/microscopy. Read and write TIFF stacks. Functions for segmentation, filtering and analyzing 3D point patterns.
Maintained by Volker Schmid. Last updated 3 years ago.
1.7 match 4 stars 5.30 score 33 scripts 1 dependentsbioc
CMA:Synthesis of microarray-based classification
This package provides a comprehensive collection of various microarray-based classification algorithms both from Machine Learning and Statistics. Variable Selection, Hyperparameter tuning, Evaluation and Comparison can be performed combined or stepwise in a user-friendly environment.
Maintained by Roman Hornung. Last updated 5 months ago.
1.8 match 5.09 score 61 scriptsstatisticsnorway
GaussSuppression:Tabular Data Suppression using Gaussian Elimination
A statistical disclosure control tool to protect tables by suppression using the Gaussian elimination secondary suppression algorithm (Langsrud, 2024) <doi:10.1007/978-3-031-69651-0_6>. A suggestion is to start by working with functions SuppressSmallCounts() and SuppressDominantCells(). These functions use primary suppression functions for the minimum frequency rule and the dominance rule, respectively. Novel functionality for suppression of disclosive cells is also included. General primary suppression functions can be supplied as input to the general workhorse function, GaussSuppressionFromData(). Suppressed frequencies can be replaced by synthetic decimal numbers as described in Langsrud (2019) <doi:10.1007/s11222-018-9848-9>.
Maintained by Øyvind Langsrud. Last updated 2 days ago.
1.3 match 2 stars 6.61 score 50 scriptslhdjung
scrutiny:Error Detection in Science
Test published summary statistics for consistency (Brown and Heathers, 2017, <doi:10.1177/1948550616673876>; Allard, 2018, <https://aurelienallard.netlify.app/post/anaytic-grimmer-possibility-standard-deviations/>; Heathers and Brown, 2019, <https://osf.io/5vb3u/>). The package also provides infrastructure for implementing new error detection techniques.
Maintained by Lukas Jung. Last updated 6 months ago.
1.3 match 8 stars 6.52 score 38 scriptscran
epitools:Epidemiology Tools
Tools for training and practicing epidemiologists including methods for two-way and multi-way contingency tables.
Maintained by Adam Omidpanah. Last updated 5 years ago.
1.8 match 2 stars 4.89 score 12 dependentselipousson
mapmaryland:Easy Access to Maryland Spatial Data
A small collection of data sources and utility functions for working with state and county data sources in Maryland.
Maintained by Eli Pousson. Last updated 5 months ago.
3.4 match 3 stars 2.48 score 4 scriptsjhmaindonald
gamclass:Functions and Data for a Course on Modern Regression and Classification
Functions and data are provided that support a course that emphasizes statistical issues of inference and generalizability. The functions are designed to make it straightforward to illustrate the use of cross-validation, the training/test approach, simulation, and model-based estimates of accuracy. Methods considered are Generalized Additive Modeling, Linear and Quadratic Discriminant Analysis, Tree-based methods, and Random Forests.
Maintained by John Maindonald. Last updated 2 years ago.
1.8 match 4.82 score 44 scriptslhdjung
moder:Mode Estimation
Determines single or multiple modes (most frequent values). Checks if missing values make this impossible, and returns 'NA' in this case. Dependency-free source code. See Franzese and Iuliano (2019) <doi:10.1016/B978-0-12-809633-8.20354-3>.
Maintained by Lukas Jung. Last updated 1 year ago.
1.9 match 4.48 score 15 scriptsbioc
ptairMS:Pre-processing PTR-TOF-MS Data
This package implements a suite of methods to preprocess data from PTR-TOF-MS instruments (HDF5 format) and generates the 'sample by features' table of peak intensities in addition to the sample and feature metadata (as a single ExpressionSet object for subsequent statistical analysis). This package also provides useful tools for cohort management, such as progressive data analysis, visualization tools and quality control. The steps include calibration, expiration detection, peak detection and quantification, feature alignment, missing value imputation and feature annotation. Applications to exhaled air and cell culture in headspace are described in the vignettes and examples. This package was used for data analysis of the Gassin Delyle study on adults undergoing invasive mechanical ventilation in the intensive care unit due to severe COVID-19 or non-COVID-19 acute respiratory distress syndrome (ARDS), and made it possible to identify four potential biomarkers of the infection.
Maintained by Camille Roquencourt. Last updated 5 months ago.
softwaremassspectrometrypreprocessingmetabolomicspeakdetectionalignmentcpp
1.6 match 7 stars 5.15 score 3 scriptscran
stoppingrule:Create and Evaluate Stopping Rules for Safety Monitoring
Provides functions for creating, displaying, and evaluating stopping rules for safety monitoring in clinical studies.
Maintained by Michael J. Martens. Last updated 1 month ago.
3.6 match 2.30 scorejinkim3
ezr:Easy Use of R via Shiny App for Basic Analyses of Experimental Data
Runs a Shiny App on the local machine for basic statistical and graphical analyses. The point-and-click interface of Shiny App enables obtaining the same analysis outputs (e.g., plots and tables) more quickly, as compared with typing the required code in R, especially for users without much experience or expertise with coding. Examples of possible analyses include tabulating descriptive statistics for a variable, creating histograms by experimental groups, and creating a scatter plot and calculating the correlation between two variables.
Maintained by Jin Kim. Last updated 4 years ago.
2.8 match 1 stars 3.00 score 2 scriptscran
diffval:Vegetation Patterns
Find, visualize and explore patterns of differential taxa in vegetation data (namely in a phytosociological table), using the Differential Value (DiffVal). Patterns are searched through mathematical optimization algorithms. Ultimately, Total Differential Value (TDV) optimization aims at obtaining classifications of vegetation data based on differential taxa, as in the traditional geobotanical approach. The Gurobi optimizer, as well as the R package 'gurobi', can be installed from <https://www.gurobi.com/products/gurobi-optimizer/>. The useful vignette Gurobi Installation Guide, from package 'prioritizr', can be found here: <https://prioritizr.net/articles/gurobi_installation_guide.html>.
Maintained by Tiago Monteiro-Henriques. Last updated 2 years ago.
4.8 match 1.70 scorerhartmano
labelr:Label Data Frames, Variables, and Values
Create and use data frame labels for data frame objects (frame labels), their columns (name labels), and individual values of a column (value labels). Value labels include one-to-one and many-to-one labels for nominal and ordinal variables, as well as numerical range-based value labels for continuous variables. Convert value-labeled variables so each value is replaced by its corresponding value label. Add values-converted-to-labels columns to a value-labeled data frame while preserving parent columns. Filter and subset a value-labeled data frame using labels, while returning results in terms of values. Overlay labels in place of values in common R commands to increase interpretability. Generate tables of value frequencies, with categories expressed as raw values or as labels. Access data frames that show value-to-label mappings for easy reference.
Maintained by Robert Hartman. Last updated 7 months ago.
1.3 match 3 stars 5.65 score 10 scripts
geomarker-io
codec:Community Data Explorer for Cincinnati
This repository serves as the definition of the CoDEC data specifications and provides helpers to create, validate, release, and read CoDEC data.
Maintained by Cole Brokamp. Last updated 23 days ago.
1.8 match 4 stars 4.15 score 27 scripts
bioc
iterativeBMAsurv:The Iterative Bayesian Model Averaging (BMA) Algorithm For Survival Analysis
The iterative Bayesian Model Averaging (BMA) algorithm for survival analysis is a variable selection method for applying survival analysis to microarray data.
Maintained by Ka Yee Yeung. Last updated 5 months ago.
2.3 match 3.30 score 8 scripts
jhhmuc
pairwise:Rasch Model Parameters by Pairwise Algorithm
Performs the explicit calculation -- not estimation! -- of the Rasch item parameters for dichotomous and polytomous item responses, using a pairwise comparison approach. Person parameters (WLE) are calculated according to Warm's weighted likelihood approach.
Maintained by Joerg-Henrik Heine. Last updated 2 years ago.
1.9 match 3.96 score 38 scripts 1 dependents
bioc
AssessORF:Assess Gene Predictions Using Proteomics and Evolutionary Conservation
In order to assess the quality of a set of predicted genes for a genome, evidence must first be mapped to that genome. Next, each gene must be categorized based on how strong the evidence is for or against that gene. The AssessORF package provides the functions and class structures necessary for accomplishing those tasks, using proteomic hits and evolutionarily conserved start codons as the forms of evidence.
Maintained by Deepank Korandla. Last updated 5 months ago.
comparativegenomics geneprediction genomeannotation genetics proteomics qualitycontrol visualization
1.8 match 4.18 score 3 scripts
christopherkenny
cvap:Citizen Voting Age Population
Works with the Citizen Voting Age Population special tabulation from the US Census Bureau <https://www.census.gov/programs-surveys/decennial-census/about/voting-rights/cvap.html>. Provides tools to download and process raw data. Also provides a downloading interface to processed data. Implements a very basic approach to estimate block level citizen voting age population from block group data.
Maintained by Christopher T. Kenny. Last updated 12 months ago.
2.2 match 2 stars 3.30 score 7 scripts
psychbruce
PsychWordVec:Word Embedding Research Framework for Psychological Science
An integrative toolbox of word embedding research that provides: (1) a collection of 'pre-trained' static word vectors in the '.RData' compressed format <https://psychbruce.github.io/WordVector_RData.pdf>; (2) a series of functions to process, analyze, and visualize word vectors; (3) a range of tests to examine conceptual associations, including the Word Embedding Association Test <doi:10.1126/science.aal4230> and the Relative Norm Distance <doi:10.1073/pnas.1720347115>, with permutation test of significance; (4) a set of training methods to locally train (static) word vectors from text corpora, including 'Word2Vec' <arXiv:1301.3781>, 'GloVe' <doi:10.3115/v1/D14-1162>, and 'FastText' <arXiv:1607.04606>; (5) a group of functions to download 'pre-trained' language models (e.g., 'GPT', 'BERT') and extract contextualized (dynamic) word vectors (based on the R package 'text').
Maintained by Han-Wu-Shuang Bao. Last updated 1 years ago.
bert cosine-similarity fasttext glove gpt language-model natural-language-processing nlp pretrained-models psychology semantic-analysis text-analysis text-mining tsne word-embeddings word-vectors word2vec openjdk
1.8 match 22 stars 4.04 score 10 scripts
bioc
phenomis:Postprocessing and univariate analysis of omics data
The 'phenomis' package provides methods to perform post-processing (i.e. quality control and normalization) as well as univariate statistical analysis of single and multi-omics data sets. These methods include quality control metrics, signal drift and batch effect correction, intensity transformation, univariate hypothesis testing, but also clustering (as well as annotation of metabolomics data). The data are handled in the standard Bioconductor formats (i.e. SummarizedExperiment and MultiAssayExperiment for single and multi-omics datasets, respectively; the alternative ExpressionSet and MultiDataSet formats are also supported for convenience). As a result, all methods can be readily chained as workflows. The pipeline can be further enriched by multivariate analysis and feature selection, by using the 'ropls' and 'biosigner' packages, which support the same formats. Data can be conveniently imported from and exported to text files. Although the methods were initially targeted to metabolomics data, most of the methods can be applied to other types of omics data (e.g., transcriptomics, proteomics).
Maintained by Etienne A. Thevenot. Last updated 5 months ago.
batcheffect clustering coverage kegg massspectrometry metabolomics normalization proteomics qualitycontrol sequencing statisticalmethod transcriptomics
1.6 match 4.40 score 6 scripts
core-bioinformatics
noisyr:Noise Quantification in High Throughput Sequencing Output
Quantifies and removes technical noise from high-throughput sequencing data. Two approaches are used, one based on the count matrix, and one using the alignment BAM files directly. Contains several options for every step of the process, as well as tools to quality check and assess the stability of output.
Maintained by Ilias Moutsopoulos. Last updated 3 years ago.
1.7 match 9 stars 4.13 score 5 scripts 1 dependents
william-swl
baizer:Useful Functions for Data Processing
In ancient Chinese mythology, Bai Ze is a divine creature that knows the needs of everything. 'baizer' provides data processing functions frequently used by the author. Hope this package also knows what you want!
Maintained by William Song. Last updated 1 years ago.
dataframe numbers strings tidyverse
1.8 match 6 stars 3.95 score 5 scripts 1 dependents
florianjansen
vegdata:Access Vegetation Databases and Treat Taxonomy
Handles vegetation data from different sources (Turboveg 2.0 <https://www.synbiosys.alterra.nl/turboveg/>, the German national repository <https://www.vegetweb.de>, and others). Performs taxonomic harmonization, given appropriate taxonomic lists (e.g. the German taxonomic standard list "GermanSL", <https://germansl.infinitenature.org>).
Maintained by Florian Jansen. Last updated 1 years ago.
1.8 match 2 stars 3.84 score 38 scripts 3 dependents
cran
ibmdbR:IBM in-Database Analytics for R
Functionality required to efficiently use R with IBM(R) Db2(R) Warehouse offerings (formerly IBM dashDB(R)) and IBM Db2 for z/OS(R) in conjunction with IBM Db2 Analytics Accelerator for z/OS. Many basic and complex R operations are pushed down into the database, which removes the main memory boundary of R and allows full use of parallel processing in the underlying database. To execute R functions in parallel in a multi-node environment, the idaTApply() function requires the 'SparkR' package (<https://spark.apache.org/docs/latest/sparkr.html>). The optional 'ggplot2' package is needed for the plot.idaLm() function only.
Maintained by Shaikh Quader. Last updated 1 years ago.
1.8 match 2 stars 3.82 score 66 scripts
r4epi
epitabulate:Tables for Epidemiological Analysis
Produces tables for descriptive epidemiological analysis. These tables describe counts of variables in either line-list or survey data (with appropriate confidence intervals), with additional functionality to calculate odds, risk, and incidence rate ratios directly from a linelist across several variables. This package is part of the 'R4EPIs' project <https://R4epis.netlify.com>.
Maintained by Alexander Spina. Last updated 2 years ago.
2.0 match 8 stars 3.38 score 3 scripts 1 dependents
marianschmidt
msSPChelpR:Helper Functions for Second Primary Cancer Analyses
A collection of helper functions for analyzing Second Primary Cancer data, including functions to reshape data, to calculate patient states and analyze cancer incidence.
Maintained by Marian Eberl. Last updated 1 years ago.
1.6 match 2 stars 4.18 score 15 scripts
michaelhallquist
MplusAutomation:An R Package for Facilitating Large-Scale Latent Variable Analyses in Mplus
Leverages the R language to automate latent variable model estimation and interpretation using 'Mplus', a powerful latent variable modeling program developed by Muthen and Muthen (<https://www.statmodel.com>). Specifically, this package provides routines for creating related groups of models, running batches of models, and extracting and tabulating model parameters and fit statistics.
Maintained by Michael Hallquist. Last updated 2 months ago.
0.5 match 86 stars 12.96 score 664 scripts 13 dependents
cran
caroline:A Collection of Database, Data Structure, Visualization, and Utility Functions for R
The caroline R library contains dozens of functions useful for: database migration (dbWriteTable2), database style joins & aggregation (nerge, groupBy, & bestBy), data structure conversion (nv, tab2df), legend table making (sstable & leghead), automatic legend positioning for scatter and box plots (), plot annotation (labsegs & mvlabs), data visualization (pies, sparge, confound.grid & raPlot), character string manipulation (m & pad), file I/O (write.delim), batch scripting, data exploration, and more. The package's greatest contributions lie in the database style merge, aggregation and interface functions as well as in its extensive use and propagation of row, column and vector names in most functions.
Maintained by David Schruth. Last updated 5 months ago.
2.0 match 3.29 score 108 scripts 3 dependents
matei-ionita
Cleanet:Automated doublet detection and classification for cytometry data
Automated method for doublet detection in flow or mass cytometry data, based on simulating doublets and finding events whose protein expression patterns are similar to the simulated doublets.
Maintained by Matei Ionita. Last updated 3 months ago.
1.8 match 3.70 score
bioc
blacksheepr:Outlier Analysis for pairwise differential comparison
Blacksheep is a tool designed for outlier analysis in the context of pairwise comparisons, in an effort to find characteristics that distinguish two groups. This tool was designed for biological applications such as phosphoproteomics or transcriptomics, but it can be used for any data that can be represented by a 2D table and has two subpopulations within the table to compare.
Maintained by RugglesLab. Last updated 5 months ago.
sequencing rnaseq geneexpression transcription differentialexpression transcriptomics
1.5 match 4.30 score 6 scripts
fmicompbio
swissknife:Handy code shared in the FMI CompBio group
A collection of useful R functions performing various tasks that might be re-usable and worth sharing.
Maintained by Michael Stadler. Last updated 2 months ago.
1.7 match 8 stars 3.76 score 12 scripts
cran
kequate:The Kernel Method of Test Equating
Implements the kernel method of test equating as defined in von Davier, A. A., Holland, P. W. and Thayer, D. T. (2004) <doi:10.1007/b97446> and Andersson, B. and Wiberg, M. (2017) <doi:10.1007/s11336-016-9528-7> using the CB, EG, SG, NEAT CE/PSE and NEC designs, supporting Gaussian, logistic and uniform kernels and unsmoothed and pre-smoothed input data.
Maintained by Björn Andersson. Last updated 3 years ago.
1.9 match 2 stars 3.37 score 29 scripts
numbersman77
reporttools:Generate "LaTeX" Tables of Descriptive Statistics
These functions are especially helpful when writing reports of data analysis using "Sweave".
Maintained by Kaspar Rufibach. Last updated 3 years ago.
1.8 match 2 stars 3.35 score 113 scripts
openvolley
ovlytics:Functions and Algorithms for Volleyball Analytics
Analytical functions for volleyball analytics, to be used in conjunction with the datavolley and peranavolley packages.
Maintained by Ben Raymond. Last updated 3 months ago.
1.9 match 3.13 score 9 scripts 3 dependents
urswilke
pyramidi:Generate and Manipulate Midi Data in R Data Frames
Imports the Python libraries miditapyr and mido to read MIDI file data into pandas DataFrames. These can then be imported into R via reticulate. The event-based MIDI data is widened to facilitate the manipulation and plotting of note-based structures as in music21. The data frame format allows for an easy implementation of many music data manipulations.
Maintained by Urs Wilke. Last updated 1 years ago.
1.7 match 8 stars 3.33 score 27 scripts
cran
RcmdrPlugin.TeachStat:R Commander Plugin for Teaching Statistical Methods
R Commander plugin for teaching statistical methods. It adds a new menu that makes it easier to teach the main concepts of the main statistical methods.
Maintained by Manuel A. Mosquera Rodríguez. Last updated 1 years ago.
5.6 match 1.00 score
willzywiec
criticality:Modeling Fissile Material Operations in Nuclear Facilities
A collection of functions for modeling fissile material operations in nuclear facilities, based on Zywiec et al (2021) <doi:10.1016/j.ress.2020.107322>.
Maintained by William Zywiec. Last updated 2 years ago.
5.3 match 1.00 score
bpoconnor
Crosstabs.Loglinear:Cross Tabulation and Loglinear Analyses of Categorical Data
Provides 'SPSS'- and 'SAS'-like output for cross tabulations of two categorical variables (CROSSTABS) and for hierarchical loglinear analyses of two or more categorical variables (LOGLINEAR). The methods are described in Agresti (2013, ISBN:978-0-470-46363-5), Azen & Walker (2021, ISBN:9780429330308), Field (2018, ISBN:9781526440273), Norusis (2012, ISBN:978-0-321-74843-0), Nussbaum (2015, ISBN:978-1-84872-603-1), Stevens (2009, ISBN:978-0-8058-5903-4), Tabachnick & Fidell (2019, ISBN:9780134790541), and von Eye & Mun (2013, ISBN:978-1-118-14640-8).
Maintained by Brian OConnor. Last updated 2 years ago.
5.2 match 1.00 score
tmsalab
fourPNO:Bayesian 4 Parameter Item Response Model
Estimates Barton & Lord's (1981) <doi:10.1002/j.2333-8504.1981.tb01255.x> four-parameter IRT model with lower and upper asymptotes, using the Bayesian formulation described by Culpepper (2016) <doi:10.1007/s11336-015-9477-6>.
Maintained by Steven Andrew Culpepper. Last updated 5 years ago.
armadillo cognitive-diagnostic-models gibbs-sampler item-response-theory rcpp rcpparmadillo openblas cpp openmp
1.9 match 1 stars 2.70 score 5 scripts
windwill
cascsim:Casualty Actuarial Society Individual Claim Simulator
It is an open source insurance claim simulation engine sponsored by the Casualty Actuarial Society. It generates individual insurance claims including open claims, reopened claims, incurred but not reported claims and future claims. It also includes claim data fitting functions to help set simulation assumptions. It is useful for claim level reserving analysis. Parodi (2013) <https://www.actuaries.org.uk/documents/triangle-free-reserving-non-traditional-framework-estimating-reserves-and-reserve-uncertainty>.
Maintained by Kailan Shang. Last updated 5 years ago.
1.7 match 2.99 score 98 scripts
bxc147
Epi:Statistical Analysis in Epidemiology
Functions for demographic and epidemiological analysis in the Lexis diagram, i.e. register and cohort follow-up data. In particular representation, manipulation, rate estimation and simulation for multistate data - the Lexis suite of functions, which includes interfaces to 'mstate', 'etm' and 'cmprsk' packages. Contains functions for Age-Period-Cohort and Lee-Carter modeling and a function for interval censored data and some useful functions for tabulation and plotting, as well as a number of epidemiological data sets.
Maintained by Bendix Carstensen. Last updated 2 months ago.
0.5 match 4 stars 9.65 score 708 scripts 11 dependents