R-universe search: exports:replace

Showing 19 of total 19 results (show query)

tidyverse

tidyr:Tidy Messy Data

Tools to help to create tidy data, where each column is a variable, each row is an observation, and each cell contains a single value. 'tidyr' contains tools for changing the shape (pivoting) and hierarchy (nesting and 'unnesting') of a dataset, turning deeply nested lists into rectangular data frames ('rectangling'), and extracting values out of string columns. It also includes tools for working with missing values (both implicit and explicit).

Maintained by Hadley Wickham. Last updated 25 days ago.

tidy-data cpp

1.4k stars 22.88 score 168k scripts 5.5k dependents

sebkrantz

collapse:Advanced and Fast Data Transformation

A C/C++ based package for advanced data transformation and statistical computing in R that is extremely fast, class-agnostic, robust and programmer friendly. Core functionality includes a rich set of S3 generic grouped and weighted statistical functions for vectors, matrices and data frames, which provide efficient low-level vectorizations, OpenMP multithreading, and skip missing values by default. These are integrated with fast grouping and ordering algorithms (also callable from C), and efficient data manipulation functions. The package also provides a flexible and rigorous approach to time series and panel data in R. It further includes fast functions for common statistical procedures, detailed (grouped, weighted) summary statistics, powerful tools to work with nested data, fast data object conversions, functions for memory efficient R programming, and helpers to effectively deal with variable labels, attributes, and missing data. It is well integrated with base R classes, 'dplyr'/'tibble', 'data.table', 'sf', 'units', 'plm' (panel-series and data frames), and 'xts'/'zoo'.

Maintained by Sebastian Krantz. Last updated 7 days ago.

data-aggregation data-analysis data-manipulation data-processing data-science data-transformation econometrics high-performance panel-data scientific-computing statistics time-series weighted weights cpp openmp

672 stars 16.68 score 708 scripts 99 dependents

sparklyr

sparklyr:R Interface to Apache Spark

R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.

Maintained by Edgar Ruiz. Last updated 10 days ago.

apache-spark distributed dplyr ide livy machine-learning remote-clusters spark sparklyr

959 stars 15.20 score 4.0k scripts 21 dependents

thomasp85

tidygraph:A Tidy API for Graph Manipulation

A graph, while not "tidy" in itself, can be thought of as two tidy data frames describing node and edge data respectively. 'tidygraph' provides an approach to manipulate these two virtual data frames using the API defined in the 'dplyr' package, as well as provides tidy interfaces to a lot of common graph algorithms.

Maintained by Thomas Lin Pedersen. Last updated 2 months ago.

graph-algorithms graph-manipulation igraph network-analysis tidyverse cpp

553 stars 14.74 score 4.6k scripts 136 dependents

dieghernan

tidyterra:'tidyverse' Methods and 'ggplot2' Helpers for 'terra' Objects

Extension of the 'tidyverse' for 'SpatRaster' and 'SpatVector' objects of the 'terra' package. It includes also new 'geom_' functions that provide a convenient way of visualizing 'terra' objects with 'ggplot2'.

Maintained by Diego Hernangómez. Last updated 4 days ago.

terra ggplot-extension r-spatial rspatial

190 stars 13.59 score 1.9k scripts 25 dependents

markfairbanks

tidytable:Tidy Interface to 'data.table'

A tidy interface to 'data.table', giving users the speed of 'data.table' while using tidyverse-like syntax.

Maintained by Mark Fairbanks. Last updated 2 months ago.

460 stars 11.39 score 732 scripts 11 dependents

nathaneastwood

poorman:A Poor Man's Dependency Free Recreation of 'dplyr'

A replication of key functionality from 'dplyr' and the wider 'tidyverse' using only 'base'.

Maintained by Nathan Eastwood. Last updated 1 years ago.

base-r data-manipulation grammar

342 stars 10.79 score 156 scripts 27 dependents

elbersb

tidylog:Logging for 'dplyr' and 'tidyr' Functions

Provides feedback about 'dplyr' and 'tidyr' operations.

Maintained by Benjamin Elbers. Last updated 10 months ago.

dplyr tidyr tidyverse wrapper-functions

593 stars 10.23 score 1.7k scripts

ben519

mltools:Machine Learning Tools

A collection of machine learning helper functions, particularly assisting in the Exploratory Data Analysis phase. Makes heavy use of the 'data.table' package for optimal speed and memory efficiency. Highlights include a versatile bin_data() function, sparsify() for converting a data.table to sparse matrix format with one-hot encoding, fast evaluation metrics, and empirical_cdf() for calculating empirical Multivariate Cumulative Distribution Functions.

Maintained by Ben Gorman. Last updated 4 years ago.

exploratory-data-analysis machine-learning

72 stars 9.67 score 1.2k scripts 13 dependents

nepem-ufsc

metan:Multi Environment Trials Analysis

Performs stability analysis of multi-environment trial data using parametric and non-parametric methods. Parametric methods includes Additive Main Effects and Multiplicative Interaction (AMMI) analysis by Gauch (2013) <doi:10.2135/cropsci2013.04.0241>, Ecovalence by Wricke (1965), Genotype plus Genotype-Environment (GGE) biplot analysis by Yan & Kang (2003) <doi:10.1201/9781420040371>, geometric adaptability index by Mohammadi & Amri (2008) <doi:10.1007/s10681-007-9600-6>, joint regression analysis by Eberhart & Russel (1966) <doi:10.2135/cropsci1966.0011183X000600010011x>, genotypic confidence index by Annicchiarico (1992), Murakami & Cruz's (2004) method, power law residuals (POLAR) statistics by Doring et al. (2015) <doi:10.1016/j.fcr.2015.08.005>, scale-adjusted coefficient of variation by Doring & Reckling (2018) <doi:10.1016/j.eja.2018.06.007>, stability variance by Shukla (1972) <doi:10.1038/hdy.1972.87>, weighted average of absolute scores by Olivoto et al. (2019a) <doi:10.2134/agronj2019.03.0220>, and multi-trait stability index by Olivoto et al. (2019b) <doi:10.2134/agronj2019.03.0221>. Non-parametric methods includes superiority index by Lin & Binns (1988) <doi:10.4141/cjps88-018>, nonparametric measures of phenotypic stability by Huehn (1990) <doi:10.1007/BF00024241>, TOP third statistic by Fox et al. (1990) <doi:10.1007/BF00040364>. Functions for computing biometrical analysis such as path analysis, canonical correlation, partial correlation, clustering analysis, and tools for inspecting, manipulating, summarizing and plotting typical multi-environment trial data are also provided.

Maintained by Tiago Olivoto. Last updated 21 days ago.

2 stars 9.48 score 1.3k scripts 2 dependents

insightsengineering

random.cdisc.data:Create Random ADaM Datasets

A set of functions to create random Analysis Data Model (ADaM) datasets and cached dataset. ADaM dataset specifications are described by the Clinical Data Interchange Standards Consortium (CDISC) Analysis Data Model Team.

Maintained by Joe Zhu. Last updated 6 months ago.

cdisc dataset

33 stars 8.60 score 52 scripts

shichenxie

scorecard:Credit Risk Scorecard

The `scorecard` package makes the development of credit risk scorecard easier and efficient by providing functions for some common tasks, such as data partition, variable selection, woe binning, scorecard scaling, performance evaluation and report generation. These functions can also used in the development of machine learning models. The references including: 1. Refaat, M. (2011, ISBN: 9781447511199). Credit Risk Scorecard: Development and Implementation Using SAS. 2. Siddiqi, N. (2006, ISBN: 9780471754510). Credit risk scorecards. Developing and Implementing Intelligent Credit Scoring.

Maintained by Shichen Xie. Last updated 12 months ago.

binning credit-scoring release scorecard woe woebinning

164 stars 8.07 score 94 scripts

bioc

BioNERO:Biological Network Reconstruction Omnibus

BioNERO aims to integrate all aspects of biological network inference in a single package, including data preprocessing, exploratory analyses, network inference, and analyses for biological interpretations. BioNERO can be used to infer gene coexpression networks (GCNs) and gene regulatory networks (GRNs) from gene expression data. Additionally, it can be used to explore topological properties of protein-protein interaction (PPI) networks. GCN inference relies on the popular WGCNA algorithm. GRN inference is based on the "wisdom of the crowds" principle, which consists in inferring GRNs with multiple algorithms (here, CLR, GENIE3 and ARACNE) and calculating the average rank for each interaction pair. As all steps of network analyses are included in this package, BioNERO makes users avoid having to learn the syntaxes of several packages and how to communicate between them. Finally, users can also identify consensus modules across independent expression sets and calculate intra and interspecies module preservation statistics between different networks.

Maintained by Fabricio Almeida-Silva. Last updated 5 months ago.

software geneexpression generegulation systemsbiology graphandnetwork preprocessing network networkinference

27 stars 7.78 score 50 scripts 1 dependents

cbroeckl

RAMClustR:Mass Spectrometry Metabolomics Feature Clustering and Interpretation

A feature clustering algorithm for non-targeted mass spectrometric metabolomics data. This method is compatible with gas and liquid chromatography coupled mass spectrometry, including indiscriminant tandem mass spectrometry <DOI: 10.1021/ac501530d> data.

Maintained by Helge Hecht. Last updated 7 months ago.

massspectrometry metabolomics

12 stars 7.46 score 20 scripts

ropensci

taxlist:Handling Taxonomic Lists

Handling taxonomic lists through objects of class 'taxlist'. This package provides functions to import species lists from 'Turboveg' (<https://www.synbiosys.alterra.nl/turboveg/>) and the possibility to create backups from resulting R-objects. Also quick displays are implemented as summary-methods.

Maintained by Miguel Alvarez. Last updated 6 months ago.

12 stars 7.07 score 81 scripts 2 dependents

statisfactions

simpr:Flexible 'Tidyverse'-Friendly Simulations

A general, 'tidyverse'-friendly framework for simulation studies, design analysis, and power analysis. Specify data generation, define varying parameters, generate data, fit models, and tidy model results in a single pipeline, without needing loops or custom functions.

Maintained by Ethan Brown. Last updated 9 months ago.

43 stars 6.89 score 30 scripts

eth-mds

ricu:Intensive Care Unit Data with R

Focused on (but not exclusive to) data sets hosted on PhysioNet (<https://physionet.org>), 'ricu' provides utilities for download, setup and access of intensive care unit (ICU) data sets. In addition to functions for running arbitrary queries against available data sets, a system for defining clinical concepts and encoding their representations in tabular ICU data is presented.

Maintained by Nicolas Bennett. Last updated 10 months ago.

39 stars 5.65 score 77 scripts

roaldarbol

animovement:An R toolbox for analysing animal movement across space and time

An R toolbox for analysing animal movement across space and time.

Maintained by Mikkel Roald-Arbøl. Last updated 3 months ago.

animal-behaviour animal-movement neuroethology neuroscience

10 stars 4.81 score 8 scripts

poissonconsulting

mcmcdata:Manipulate MCMC Samples and Data Frames

Manipulates Monte Carlo Markov Chain samples and associated data frames.

Maintained by Joe Thorley. Last updated 2 months ago.

1 stars 3.56 score 4 scripts 4 dependents