R-universe search: mtcars

stemangiola

tidyHeatmap:A Tidy Implementation of Heatmap

A tidy implementation of heatmaps, currently based on the (great) package 'ComplexHeatmap'. The goal of this package is to interface a tidy data frame with this powerful tool. Some of the advantages are: row and/or column colour annotations are easy to integrate by specifying a single parameter (column names); custom grouping of rows is easy to specify by providing a grouped tbl, for example df %>% group_by(...); label size is adjusted to the total number of rows and columns; Brewer and Viridis palettes are used by default.

Maintained by Stefano Mangiola. Last updated 1 month ago.

assaydomain infrastructure brewer complexheatmap custom-palette dplyr graphviz heatmap mtcars plotting rstudio scale tibble tidy tidy-data-frame tidybulk tidyverse viridis

10.0 match 335 stars 10.23 score 197 scripts 1 dependent

rolkra

explore:Simplifies Exploratory Data Analysis

Interactive data exploration with one line of code, automated reporting, or an easy-to-remember set of tidy functions for low-code exploratory data analysis.

Maintained by Roland Krasser. Last updated 3 months ago.

data-exploration data-visualisation decision-trees eda rmarkdown shiny tidy

8.8 match 228 stars 11.43 score 221 scripts 1 dependent

ropensci

drake:A Pipeline Toolkit for Reproducible Computation at Scale

A general-purpose computational engine for data analysis, drake rebuilds intermediate data objects when their dependencies change, and it skips work when the results are already up to date. Not every execution starts from scratch; there is native support for parallel and distributed computing; and completed projects have tangible evidence that they are reproducible. Extensive documentation, from beginner-friendly tutorials to practical examples and more, is available at the reference website <https://docs.ropensci.org/drake/> and the online manual <https://books.ropensci.org/drake/>.

Maintained by William Michael Landau. Last updated 3 months ago.

data-science drake high-performance-computing makefile peer-reviewed pipeline reproducibility reproducible-research ropensci workflow

4.8 match 1.3k stars 11.49 score 1.7k scripts 1 dependent

danchaltiel

crosstable:Crosstables for Descriptive Analyses

Create descriptive tables for continuous and categorical variables. Apply summary statistics and counting functions, with or without a grouping variable, and create beautiful reports using 'rmarkdown' or 'officer'. You can also compute effect sizes and statistical tests if needed.

Maintained by Dan Chaltiel. Last updated 2 months ago.

descriptive-statistics flextable frequency-table html-report msword officer

5.3 match 116 stars 10.37 score 340 scripts

openanalytics

editbl:'DT' Extension for CRUD (Create, Read, Update, Delete) Applications in 'shiny'

The core of this package is a function eDT() which enhances DT::datatable() such that it can be used to interactively modify data in 'shiny'. By the use of generic 'dplyr' methods it supports many types of data storage, with relational databases ('dbplyr') being the main use case.

Maintained by Jasper Schelfhout. Last updated 2 months ago.

7.5 match 23 stars 6.52 score 12 scripts

funkyheatmap

funkyheatmap:Generating Funky Heatmaps for Data Frames

Allows generating heatmap-like visualisations for data frames. Funky heatmaps can be fine-tuned by providing annotations of the columns and rows, which allows assigning multiple palettes or geometries or grouping rows and columns together in categories. Saelens et al. (2019) <doi:10.1038/s41587-019-0071-9>.

Maintained by Robrecht Cannoodt. Last updated 1 month ago.

3.8 match 171 stars 8.37 score 76 scripts

tripartio

ale:Interpretable Machine Learning and Statistical Inference with Accumulated Local Effects (ALE)

Accumulated Local Effects (ALE) were initially developed as a model-agnostic approach for global explanations of the results of black-box machine learning algorithms. ALE has a key advantage over other approaches like partial dependency plots (PDP) and SHapley Additive exPlanations (SHAP): its values represent a clean functional decomposition of the model. As such, ALE values are not affected by the presence or absence of interactions among variables in a model. Moreover, its computation is relatively rapid. This package reimplements the algorithms for calculating ALE data and develops highly interpretable visualizations for plotting these ALE values. It also extends the original ALE concept to add bootstrap-based confidence intervals and ALE-based statistics that can be used for statistical inference. For more details, see Okoli, Chitu. 2023. “Statistical Inference Using Machine Learning and Classical Techniques Based on Accumulated Local Effects (ALE).” arXiv. <arXiv:2310.09877>. <doi:10.48550/arXiv.2310.09877>.

Maintained by Chitu Okoli. Last updated 26 days ago.

4.7 match 4 stars 6.41 score 27 scripts

vegawidget

vegawidget:'Htmlwidget' for 'Vega' and 'Vega-Lite'

'Vega' and 'Vega-Lite' parse text in 'JSON' notation to render chart-specifications into 'HTML'. This package is used to facilitate the rendering. It also provides a means to interact with signals, events, and datasets in a 'Vega' chart using 'JavaScript' or 'Shiny'.

Maintained by Ian Lyttle. Last updated 1 year ago.

3.8 match 68 stars 8.04 score 49 scripts 4 dependents

eutwt

versus:Compare Data Frames

A toolset for interactively exploring the differences between two data frames.

Maintained by Ryan Dickerson. Last updated 9 months ago.

7.1 match 7 stars 4.02 score 4 scripts

stibu81

ibawds:Functions and Datasets for the Data Science Course at IBAW

A collection of useful functions and datasets for the Data Science Course at IBAW.

Maintained by Stefan Lanz. Last updated 9 days ago.

data-science-learning educational-resources

5.4 match 2 stars 4.26 score 8 scripts

urswilke

datadaptor:Modify Labelled Data Sets With Excel Files

An R package to modify labelled data sets with commands in Excel files. The commands in this package let you create new variables and modify both the variables themselves and their labels. The goal is to provide an easy and concise syntax, and to allow advanced users fast, systematic data entry using Excel. The commands work on the variables inside the data.frame environment (as they do inside dplyr verbs, for example), an approach that may make the package easier to use for people without in-depth programming experience.

Maintained by Urs Wilke. Last updated 27 days ago.

4.0 match 5.13 score 1 dependent

sabainter

SSVS:Functions for Stochastic Search Variable Selection (SSVS)

Functions for performing stochastic search variable selection (SSVS) for binary and continuous outcomes and visualizing the results. SSVS is a Bayesian variable selection method used to estimate the probability that individual predictors should be included in a regression model. Using MCMC estimation, the method samples thousands of regression models in order to characterize the model uncertainty regarding both the predictor set and the regression parameters. For details see Bainter, McCauley, Wager, and Losin (2020) Improving practices for selecting a subset of important predictors in psychology: An application to predicting pain, Advances in Methods and Practices in Psychological Science 3(1), 66-80 <DOI:10.1177/2515245919885617>.

Maintained by Sierra Bainter. Last updated 1 month ago.

4.0 match 8 stars 5.05 score 14 scripts

erblast

easyalluvial:Generate Alluvial Plots with a Single Line of Code

Alluvial plots are similar to Sankey diagrams and visualise categorical data over multiple dimensions as flows (Rosvall M, Bergstrom CT (2010) Mapping Change in Large Networks. PLoS ONE 5(1): e8694. <doi:10.1371/journal.pone.0008694>). Their graphical grammar, however, is a bit more complex than that of regular x/y plots. The 'ggalluvial' package did a great job of translating that grammar into 'ggplot2' syntax and gives you many options to tweak the appearance of an alluvial plot; however, there still remains a multi-layered complexity that makes it difficult to use 'ggalluvial' for explorative data analysis. 'easyalluvial' provides a simple interface to this package that allows you to produce a decent alluvial plot from any dataframe in either long or wide format from a single line of code while also handling continuous data. It is meant to allow a quick visualisation of entire dataframes with a focus on different colouring options that can make alluvial plots a great tool for data exploration.

Maintained by Bjoern Koneswarakantha. Last updated 1 year ago.

3.3 match 110 stars 6.13 score 81 scripts 1 dependent

agdamsbo

REDCapCAST:REDCap Metadata Casting and Castellated Data Handling

Casting metadata for REDCap database creation and handling of castellated data using repeated instruments and longitudinal projects in 'REDCap'. Keeps a focused data export approach by allowing export of only the required data from the database. Also supports casting new REDCap databases based on datasets from other sources. Originally forked from the R part of 'REDCapRITS' by Paul Egeler. See <https://github.com/pegeler/REDCapRITS>. 'REDCap' (Research Electronic Data Capture) is a secure, web-based software platform designed to support data capture for research studies, providing 1) an intuitive interface for validated data capture; 2) audit trails for tracking data manipulation and export procedures; 3) automated export procedures for seamless data downloads to common statistical packages; and 4) procedures for data integration and interoperability with external sources (Harris et al (2009) <doi:10.1016/j.jbi.2008.08.010>; Harris et al (2019) <doi:10.1016/j.jbi.2019.103208>).

Maintained by Andreas Gammelgaard Damsbo. Last updated 7 days ago.

3.3 match 1 star 5.84 score 12 scripts

kiangkiangkiang

ggESDA:Exploratory Symbolic Data Analysis with 'ggplot2'

Implements an extension of 'ggplot2' that visualizes symbolic data with multiple plots, which can be adjusted through more general and flexible input arguments. It also provides a function to transform classical data into symbolic data, using either a clustering algorithm or a customized method.

Maintained by Bo-Syue Jiang. Last updated 2 years ago.

4.0 match 21 stars 4.02 score 9 scripts

nbarrowman

vtree:Display Information About Nested Subsets of a Data Frame

A tool for calculating and drawing "variable trees". Variable trees display information about nested subsets of a data frame.

Maintained by Nick Barrowman. Last updated 1 day ago.

data-science data-visualization exploratory-data-analysis statistics

2.0 match 76 stars 7.09 score 65 scripts

coolbutuseless

emphatic:Exploratory Analysis of Tabular Data using Colour Highlighting

Tools for exploratory analysis of tabular data using colour highlighting. Highlighting is displayed in any console supporting 'ANSI' colours, and can be converted to 'HTML', 'typst', 'latex' and 'SVG'. 'quarto' and 'rmarkdown' rendering are directly supported. It is also possible to add colour to regular expression matches and highlight differences between two arbitrary R objects.

Maintained by Mike Cheng. Last updated 3 months ago.

1.3 match 141 stars 7.55 score 12 scripts

mingzehuang

latentcor:Fast Computation of Latent Correlations for Mixed Data

The first stand-alone R package for computation of latent correlation that takes into account all variable types (continuous/binary/ordinal/zero-inflated), comes with an optimized memory footprint, and is computationally efficient, essentially making latent correlation estimation almost as fast as rank-based correlation estimation. The estimation is based on latent copula Gaussian models. For continuous/binary types, see Fan, J., Liu, H., Ning, Y., and Zou, H. (2017). For ternary type, see Quan X., Booth J.G. and Wells M.T. (2018) <arXiv:1809.06255>. For truncated type or zero-inflated type, see Yoon G., Carroll R.J. and Gaynanova I. (2020) <doi:10.1093/biomet/asaa007>. For approximation method of computation, see Yoon G., Müller C.L. and Gaynanova I. (2021) <doi:10.1080/10618600.2021.1882468>. The latter method uses multi-linear interpolation originally implemented in the R package <https://cran.r-project.org/package=chebpol>.

Maintained by Mingze Huang. Last updated 2 years ago.

data-analysis data-mining data-processing data-science data-structures machine-learning mixed-types statistics

1.3 match 16 stars 6.65 score 46 scripts 1 dependent

bioc

AnVILAz:R / Bioconductor Support for the AnVIL Azure Platform

The AnVIL is a cloud computing resource developed in part by the National Human Genome Research Institute. The AnVILAz package supports end-users and developers using the AnVIL platform in the Azure cloud. The package provides a programmatic interface to AnVIL resources, including workspaces, notebooks, tables, and workflows. The package also provides utilities for managing resources, including copying files to and from Azure Blob Storage, and creating shared access signatures (SAS) for secure access to Azure resources.

Maintained by Marcel Ramos. Last updated 5 months ago.

software infrastructure thirdpartyclient

1.5 match 5.45 score 5 scripts

junhuili1017

StepReg:Stepwise Regression Analysis

Stepwise regression is a statistical technique used for model selection. This package streamlines stepwise regression analysis by supporting multiple regression types, incorporating popular selection strategies, and offering essential metrics. It enables users to apply multiple selection strategies and metrics in a single function call, visualize variable selection processes, and export results in various formats. However, StepReg should not be used for statistical inference unless the variable selection process is explicitly accounted for, as it can compromise the validity of the results. This limitation does not apply when StepReg is used for prediction purposes. We validated StepReg's accuracy using public datasets within the SAS software environment. Additionally, StepReg features an interactive Shiny application to enhance usability and accessibility.

Maintained by Junhui Li. Last updated 16 days ago.

1.3 match 1 star 6.52 score 34 scripts

tin900

vvauditor:Creates Assertion Tests

Offers a comprehensive set of assertion tests to help users validate the integrity of their data. These tests can be used to check for specific conditions or properties within a dataset and help ensure that data is accurate and reliable. The package is designed to make it easy to add quality control checks to data analysis workflows and to aid in identifying and correcting any errors or inconsistencies in data.

Maintained by Tomer Iwan. Last updated 1 month ago.

1.9 match 4.03 score 7 scripts

gdemin

maditr:Fast Data Aggregation, Modification, and Filtering with Pipes and 'data.table'

Provides a pipe-style interface for 'data.table'. The package preserves all 'data.table' features without significant impact on performance. The 'let' and 'take' functions are simplified interfaces for the most common data manipulation tasks. For example, you can write 'take(mtcars, mean(mpg), by = am)' for aggregation or 'let(mtcars, hp_wt = hp/wt, hp_wt_mpg = hp_wt/mpg)' for modification. Use 'take_if/let_if' for conditional aggregation/modification. Additionally, there are some conveniences such as automatic 'data.frame' conversion to 'data.table'.
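
The two calls quoted in the description above can be tried as follows. This is a minimal sketch, not taken from the package documentation: it assumes 'maditr' is installed (the calls are skipped otherwise), and the base R aggregate() line is added here only as a reference point for the same per-group mean. The variable names base_agg, agg and mod are illustrative.

```r
# Base R reference: mean mpg per transmission type (am), for comparison.
base_agg <- aggregate(mpg ~ am, data = mtcars, FUN = mean)
print(base_agg)

# The calls quoted in the package description; run only if 'maditr' is available.
if (requireNamespace("maditr", quietly = TRUE)) {
  library(maditr)

  # Aggregation: mean mpg by am.
  agg <- take(mtcars, mean(mpg), by = am)
  print(agg)

  # Modification: add two derived columns, the second built from the first.
  mod <- let(mtcars, hp_wt = hp / wt, hp_wt_mpg = hp_wt / mpg)
  print(head(mod))
}
```

The guard around the 'maditr' calls keeps the snippet runnable on a machine without the package; in that case only the base R summary is printed.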

Maintained by Gregory Demin. Last updated 4 months ago.

data-table magrittr pipes

0.8 match 61 stars 8.98 score 248 scripts 7 dependents

techtonique

learningmachine:Machine Learning with Explanations and Uncertainty Quantification

Regression-based Machine Learning with explanations and uncertainty quantification.

Maintained by T. Moudiki. Last updated 4 months ago.

conformal-prediction machine-learning machine-learning-algorithms machinelearning statistical-learning uncertainty-quantification cpp

1.3 match 5 stars 5.57 score 21 scripts

cbhurley

PairViz:Visualization using Graph Traversal

Improving graphics by ameliorating order effects, using Eulerian tours and Hamiltonian decompositions of graphs. References for the methods presented here are C.B. Hurley and R.W. Oldford (2010) <doi:10.1198/jcgs.2010.09136> and C.B. Hurley and R.W. Oldford (2011) <doi:10.1007/s00180-011-0229-5>.

Maintained by Catherine Hurley. Last updated 3 years ago.

1.2 match 1 star 5.75 score 42 scripts 3 dependents

nimble-dev

nimbleMacros:Macros Generating 'nimble' Code

Macros to generate 'nimble' code from a concise syntax. Included are macros for generating linear modeling code using a formula-based syntax and for building for() loops. For more details review the 'nimble' manual: <https://r-nimble.org/html_manual/cha-writing-models.html#subsec:macros>.

Maintained by Ken Kellner. Last updated 5 days ago.

1.3 match 4.98 score

cienciadedatos

datos:Translates Several Practice Datasets into Spanish

Provides a Spanish translation of the following datasets: 'airlines', 'airports', 'AwardsManagers', 'babynames', 'Batting', 'credit_data', 'diamonds', 'faithful', 'fueleconomy', 'Fielding', 'flights', 'gapminder', 'gss_cat', 'iris', 'Managers', 'mpg', 'mtcars', 'atmos', 'palmerpenguins', 'People', 'Pitching', 'planes', 'presidential', 'table1', 'table2', 'table3', 'table4a', 'table4b', 'table5', 'vehicles', 'weather', 'who'.

Maintained by Riva Quiroga. Last updated 1 year ago.

0.5 match 48 stars 8.12 score 354 scripts

cienciadedatos

dados:Translate Datasets to Portuguese

This package provides a Portuguese translation of the following datasets: 'airlines', 'airports', 'ames_raw', 'AwardsManagers', 'babynames', 'Batting', 'diamonds', 'faithful', 'fueleconomy', 'Fielding', 'flights', 'gapminder', 'gss_cat', 'iris', 'Managers', 'mpg', 'mtcars', 'atmos', 'penguins', 'People', 'Pitching', 'pixarfilms', 'planes', 'presidential', 'table1', 'table2', 'table3', 'table4a', 'table4b', 'table5', 'vehicles', 'weather', 'who'.

Maintained by Riva Quiroga. Last updated 7 months ago.

0.5 match 46 stars 7.13 score 266 scripts