Showing 200 of total 376 results (show query)
tidyverse
dplyr:A Grammar of Data Manipulation
A fast, consistent tool for working with data frame like objects, both in memory and out of memory.
Maintained by Hadley Wickham. Last updated 13 days ago.
10.1 match 4.8k stars 24.68 score 659k scripts 7.8k dependentsmjwoods
RNetCDF:Interface to 'NetCDF' Datasets
An interface to the 'NetCDF' file formats designed by Unidata for efficient storage of array-oriented scientific data and descriptions. Most capabilities of 'NetCDF' version 4 are supported. Optional conversions of time units are enabled by 'UDUNITS' version 2, also from Unidata.
Maintained by Milton Woods. Last updated 8 days ago.
16.0 match 24 stars 10.26 score 540 scripts 23 dependentsdankelley
oce:Analysis of Oceanographic Data
Supports the analysis of Oceanographic data, including 'ADCP' measurements, measurements made with 'argo' floats, 'CTD' measurements, sectional data, sea-level time series, coastline and topographic data, etc. Provides specialized functions for calculating seawater properties such as potential temperature in either the 'UNESCO' or 'TEOS-10' equation of state. Produces graphical displays that conform to the conventions of the Oceanographic literature. This package is discussed extensively by Kelley (2018) "Oceanographic Analysis with R" <doi:10.1007/978-1-4939-8844-0>.
Maintained by Dan Kelley. Last updated 1 days ago.
10.2 match 146 stars 15.42 score 4.2k scripts 18 dependentsrpolars
polars:Lightning-Fast 'DataFrame' Library
Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.
Maintained by Soren Welling. Last updated 3 days ago.
12.0 match 499 stars 12.01 score 1.0k scripts 2 dependentsr-lib
tidyselect:Select from a Set of Strings
A backend for the selecting functions of the 'tidyverse'. It makes it easy to implement select-like functions in your own packages in a way that is consistent with other 'tidyverse' interfaces for selection.
Maintained by Lionel Henry. Last updated 4 months ago.
6.8 match 130 stars 18.31 score 1.9k scripts 8.2k dependentscynkra
dm:Relational Data Models
Provides tools for working with multiple related tables, stored as data frames or in a relational database. Multiple tables (data and metadata) are stored in a compound object, which can then be manipulated with a pipe-friendly syntax.
Maintained by Kirill Müller. Last updated 2 months ago.
data-modeldata-warehousingdatawarehousingdbidbplyrrelational-databases
7.3 match 511 stars 14.81 score 410 scripts 8 dependentsr4ss
r4ss:R Code for Stock Synthesis
A collection of R functions for use with Stock Synthesis, a fisheries stock assessment modeling platform written in ADMB by Dr. Richard D. Methot at the NOAA Northwest Fisheries Science Center. The functions include tools for summarizing and plotting results, manipulating files, visualizing model parameterizations, and various other common stock assessment tasks. This version of '{r4ss}' is compatible with Stock Synthesis versions 3.24 through 3.30 (specifically version 3.30.23.1, from December 2024). Support for 3.24 models is only through the core functions for reading output and plotting.
Maintained by Ian G. Taylor. Last updated 5 days ago.
fisheriesfisheries-stock-assessmentstock-synthesis
8.6 match 43 stars 11.38 score 1.0k scripts 2 dependentsbtskinner
crosswalkr:Rename and Encode Data Frames Using External Crosswalk Files
A pair of functions for renaming and encoding data frames using external crosswalk files. It is especially useful when constructing master data sets from multiple smaller data sets that do not name or encode variables consistently across files. Based on similar commands in 'Stata'.
Maintained by Benjamin Skinner. Last updated 1 years ago.
17.0 match 9 stars 5.26 score 20 scriptssebkrantz
collapse:Advanced and Fast Data Transformation
A C/C++ based package for advanced data transformation and statistical computing in R that is extremely fast, class-agnostic, robust and programmer friendly. Core functionality includes a rich set of S3 generic grouped and weighted statistical functions for vectors, matrices and data frames, which provide efficient low-level vectorizations, OpenMP multithreading, and skip missing values by default. These are integrated with fast grouping and ordering algorithms (also callable from C), and efficient data manipulation functions. The package also provides a flexible and rigorous approach to time series and panel data in R. It further includes fast functions for common statistical procedures, detailed (grouped, weighted) summary statistics, powerful tools to work with nested data, fast data object conversions, functions for memory efficient R programming, and helpers to effectively deal with variable labels, attributes, and missing data. It is well integrated with base R classes, 'dplyr'/'tibble', 'data.table', 'sf', 'units', 'plm' (panel-series and data frames), and 'xts'/'zoo'.
Maintained by Sebastian Krantz. Last updated 6 days ago.
data-aggregationdata-analysisdata-manipulationdata-processingdata-sciencedata-transformationeconometricshigh-performancepanel-datascientific-computingstatisticstime-seriesweightedweightscppopenmp
5.4 match 672 stars 16.63 score 708 scripts 97 dependentsr-gregmisc
gdata:Various R Programming Tools for Data Manipulation
Various R programming tools for data manipulation, including medical unit conversions, combining objects, character vector operations, factor manipulation, obtaining information about R objects, generating fixed-width format files, extracting components of date & time objects, operations on columns of data frames, matrix operations, operations on vectors, operations on data frames, value of last evaluated expression, and a resample() wrapper for sample() that ensures consistent behavior for both scalar and vector arguments.
Maintained by Arni Magnusson. Last updated 2 months ago.
6.3 match 9 stars 13.62 score 4.5k scripts 124 dependentsmarkfairbanks
tidytable:Tidy Interface to 'data.table'
A tidy interface to 'data.table', giving users the speed of 'data.table' while using tidyverse-like syntax.
Maintained by Mark Fairbanks. Last updated 2 months ago.
7.0 match 458 stars 11.41 score 732 scripts 10 dependentsyihui
xfun:Supporting Functions for Packages Maintained by 'Yihui Xie'
Miscellaneous functions commonly used in other packages maintained by 'Yihui Xie'.
Maintained by Yihui Xie. Last updated 3 days ago.
3.8 match 145 stars 18.18 score 916 scripts 4.4k dependentsdieghernan
tidyterra:'tidyverse' Methods and 'ggplot2' Helpers for 'terra' Objects
Extension of the 'tidyverse' for 'SpatRaster' and 'SpatVector' objects of the 'terra' package. It includes also new 'geom_' functions that provide a convenient way of visualizing 'terra' objects with 'ggplot2'.
Maintained by Diego Hernangómez. Last updated 1 days ago.
terraggplot-extensionr-spatialrspatial
5.0 match 191 stars 13.62 score 1.9k scripts 25 dependentstidymodels
recipes:Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps provides preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.
Maintained by Max Kuhn. Last updated 6 days ago.
3.6 match 584 stars 18.71 score 7.2k scripts 380 dependentsjosesamos
starschemar:Obtaining Stars from Flat Tables
Data in multidimensional systems is obtained from operational systems and is transformed to adapt it to the new structure. Frequently, the operations to be performed aim to transform a flat table into a star schema. Transformations can be carried out using professional extract, transform and load tools or tools intended for data transformation for end users. With the tools mentioned, this transformation can be carried out, but it requires a lot of work. The main objective of this package is to define transformations that allow obtaining stars from flat tables easily. In addition, it includes basic data cleaning, dimension enrichment, incremental data refresh and query operations, adapted to this context.
Maintained by Jose Samos. Last updated 11 months ago.
11.4 match 7 stars 5.66 score 11 scripts 2 dependentstidyverse
dbplyr:A 'dplyr' Back End for Databases
A 'dplyr' back end for databases that allows you to work with remote database tables as if they are in-memory data frames. Basic features works with any database that has a 'DBI' back end; more advanced features require 'SQL' translation to be provided by the package author.
Maintained by Hadley Wickham. Last updated 3 months ago.
3.3 match 481 stars 19.72 score 5.2k scripts 736 dependentstidyverse
dtplyr:Data Table Back-End for 'dplyr'
Provides a data.table backend for 'dplyr'. The goal of 'dtplyr' is to allow you to write 'dplyr' code that is automatically translated to the equivalent, but usually much faster, data.table code.
Maintained by Hadley Wickham. Last updated 2 months ago.
3.9 match 671 stars 16.27 score 2.5k scripts 147 dependentsstan-dev
posterior:Tools for Working with Posterior Distributions
Provides useful tools for both users and developers of packages for fitting Bayesian models or working with output from Bayesian models. The primary goals of the package are to: (a) Efficiently convert between many different useful formats of draws (samples) from posterior or prior distributions. (b) Provide consistent methods for operations commonly performed on draws, for example, subsetting, binding, or mutating draws. (c) Provide various summaries of draws in convenient formats. (d) Provide lightweight implementations of state of the art posterior inference diagnostics. References: Vehtari et al. (2021) <doi:10.1214/20-BA1221>.
Maintained by Paul-Christian Bürkner. Last updated 10 days ago.
3.9 match 168 stars 16.13 score 3.3k scripts 342 dependentsropensci
git2r:Provides Access to Git Repositories
Interface to the 'libgit2' library, which is a pure C implementation of the 'Git' core methods. Provides access to 'Git' repositories to extract data and running some basic 'Git' commands.
Maintained by Stefan Widgren. Last updated 12 days ago.
gitgit-clientlibgit2libgit2-library
4.5 match 218 stars 13.86 score 836 scripts 49 dependentsrich-iannone
DiagrammeR:Graph/Network Visualization
Build graph/network structures using functions for stepwise addition and deletion of nodes and edges. Work with data available in tables for bulk addition of nodes, edges, and associated metadata. Use graph selections and traversals to apply changes to specific nodes or edges. A wide selection of graph algorithms allow for the analysis of graphs. Visualize the graphs and take advantage of any aesthetic properties assigned to nodes and edges.
Maintained by Richard Iannone. Last updated 2 months ago.
graphgraph-functionsnetwork-graphproperty-graphvisualization
4.0 match 1.7k stars 15.18 score 3.8k scripts 87 dependentseitsupi
neopolars:R Bindings for the 'polars' Rust Library
Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.
Maintained by Tatsuya Shima. Last updated 1 days ago.
12.2 match 40 stars 4.86 score 1 scriptshadley
reshape:Flexibly Reshape Data
Flexibly restructure and aggregate data using just two functions: melt and cast.
Maintained by Hadley Wickham. Last updated 3 years ago.
6.0 match 9.83 score 21k scripts 231 dependentsbioc
TRONCO:TRONCO, an R package for TRanslational ONCOlogy
The TRONCO (TRanslational ONCOlogy) R package collects algorithms to infer progression models via the approach of Suppes-Bayes Causal Network, both from an ensemble of tumors (cross-sectional samples) and within an individual patient (multi-region or single-cell samples). The package provides parallel implementation of algorithms that process binary matrices where each row represents a tumor sample and each column a single-nucleotide or a structural variant driving the progression; a 0/1 value models the absence/presence of that alteration in the sample. The tool can import data from plain, MAF or GISTIC format files, and can fetch it from the cBioPortal for cancer genomics. Functions for data manipulation and visualization are provided, as well as functions to import/export such data to other bioinformatics tools for, e.g, clustering or detection of mutually exclusive alterations. Inferred models can be visualized and tested for their confidence via bootstrap and cross-validation. TRONCO is used for the implementation of the Pipeline for Cancer Inference (PICNIC).
Maintained by Luca De Sano. Last updated 5 months ago.
biomedicalinformaticsbayesiangraphandnetworksomaticmutationnetworkinferencenetworkclusteringdataimportsinglecellimmunooncologyalgorithmscancer-inferencetumors
9.0 match 30 stars 6.50 score 38 scriptsplotly
plotly:Create Interactive Web Graphics via 'plotly.js'
Create interactive web graphics from 'ggplot2' graphs and/or a custom interface to the (MIT-licensed) JavaScript library 'plotly.js' inspired by the grammar of graphics.
Maintained by Carson Sievert. Last updated 3 months ago.
d3jsdata-visualizationggplot2javascriptplotlyshinywebgl
3.0 match 2.6k stars 19.43 score 93k scripts 797 dependentsropensci
beautier:'BEAUti' from R
'BEAST2' (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool, that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. 'BEAUti 2' (which is part of 'BEAST2') is a GUI tool that allows users to specify the many possible setups and generates the XML file 'BEAST2' needs to run. This package provides a way to create 'BEAST2' input files without active user input, but using R function calls instead.
Maintained by Richèl J.C. Bilderbeek. Last updated 23 days ago.
bayesianbeastbeast2beautiphylogenetic-inferencephylogenetics
6.6 match 13 stars 8.76 score 198 scripts 5 dependentsbioc
RCy3:Functions to Access and Control Cytoscape
Vizualize, analyze and explore networks using Cytoscape via R. Anything you can do using the graphical user interface of Cytoscape, you can now do with a single RCy3 function.
Maintained by Alex Pico. Last updated 5 months ago.
visualizationgraphandnetworkthirdpartyclientnetwork
4.3 match 52 stars 13.39 score 628 scripts 15 dependentsnathaneastwood
poorman:A Poor Man's Dependency Free Recreation of 'dplyr'
A replication of key functionality from 'dplyr' and the wider 'tidyverse' using only 'base'.
Maintained by Nathan Eastwood. Last updated 1 years ago.
base-rdata-manipulationgrammar
5.3 match 341 stars 10.79 score 156 scripts 27 dependentshadley
plyr:Tools for Splitting, Applying and Combining Data
A set of tools that solves a common set of problems: you need to break a big problem down into manageable pieces, operate on each piece and then put all the pieces back together. For example, you might want to fit a model to each spatial location or time point in your study, summarise data by panels or collapse high-dimensional arrays to simpler summary statistics. The development of 'plyr' has been generously supported by 'Becton Dickinson'.
Maintained by Hadley Wickham. Last updated 4 months ago.
3.0 match 500 stars 18.16 score 83k scripts 3.3k dependentskwb-r
kwb.utils:General Utility Functions Developed at KWB
This package contains some small helper functions that aim at improving the quality of code developed at Kompetenzzentrum Wasser gGmbH (KWB).
Maintained by Hauke Sonnenberg. Last updated 12 months ago.
7.3 match 8 stars 7.33 score 12 scripts 78 dependentsuupharmacometrics
xpose:Diagnostics for Pharmacometric Models
Diagnostics for non-linear mixed-effects (population) models from 'NONMEM' <https://www.iconplc.com/solutions/technologies/nonmem/>. 'xpose' facilitates data import, creation of numerical run summary and provide 'ggplot2'-based graphics for data exploration and model diagnostics.
Maintained by Benjamin Guiastrennec. Last updated 2 months ago.
diagnosticsggplot2nonmempharmacometricsxpose
4.8 match 62 stars 11.02 score 183 scripts 6 dependentsjuba
questionr:Functions to Make Surveys Processing Easier
Set of functions to make the processing and analysis of surveys easier : interactive shiny apps and addins for data recoding, contingency tables, dataset metadata handling, and several convenience functions.
Maintained by Julien Barnier. Last updated 1 days ago.
4.1 match 83 stars 12.62 score 1.1k scripts 19 dependentsropensci
BaseSet:Working with Sets the Tidy Way
Implements a class and methods to work with sets, doing intersection, union, complementary sets, power sets, cartesian product and other set operations in a "tidy" way. These set operations are available for both classical sets and fuzzy sets. Import sets from several formats or from other several data structures.
Maintained by Lluís Revilla Sancho. Last updated 26 days ago.
bioconductorbioconductor-packagesets
9.0 match 11 stars 5.69 score 5 scriptsbioc
clusterProfiler:A universal enrichment tool for interpreting omics data
This package supports functional characteristics of both coding and non-coding genomics data for thousands of species with up-to-date gene annotation. It provides a univeral interface for gene functional annotation from a variety of sources and thus can be applied in diverse scenarios. It provides a tidy interface to access, manipulate, and visualize enrichment results to help users achieve efficient data interpretation. Datasets obtained from multiple treatments and time points can be analyzed and compared in a single run, easily revealing functional consensus and differences among distinct conditions.
Maintained by Guangchuang Yu. Last updated 4 months ago.
annotationclusteringgenesetenrichmentgokeggmultiplecomparisonpathwaysreactomevisualizationenrichment-analysisgsea
3.0 match 1.1k stars 17.03 score 11k scripts 48 dependentsmomx
Momocs:Morphometrics using R
The goal of 'Momocs' is to provide a complete, convenient, reproducible and open-source toolkit for 2D morphometrics. It includes most common 2D morphometrics approaches on outlines, open outlines, configurations of landmarks, traditional morphometrics, and facilities for data preparation, manipulation and visualization with a consistent grammar throughout. It allows reproducible, complex morphometrics analyses and other morphometrics approaches should be easy to plug in, or develop from, on top of this canvas.
Maintained by Vincent Bonhomme. Last updated 1 years ago.
6.9 match 51 stars 7.42 score 346 scriptsbioc
tidybulk:Brings transcriptomics to the tidyverse
This is a collection of utility functions that allow to perform exploration of and calculations to RNA sequencing data, in a modular, pipe-friendly and tidy fashion.
Maintained by Stefano Mangiola. Last updated 5 months ago.
assaydomaininfrastructurernaseqdifferentialexpressiongeneexpressionnormalizationclusteringqualitycontrolsequencingtranscriptiontranscriptomicsbioconductorbulk-transcriptional-analysesdeseq2differential-expressionedgerensembl-idsentrezgene-symbolsgseamds-dimensionspcapiperedundancytibbletidytidy-datatidyversetranscriptstsne
5.3 match 168 stars 9.48 score 172 scripts 1 dependentselbersb
tidylog:Logging for 'dplyr' and 'tidyr' Functions
Provides feedback about 'dplyr' and 'tidyr' operations.
Maintained by Benjamin Elbers. Last updated 9 months ago.
dplyrtidyrtidyversewrapper-functions
4.7 match 593 stars 10.23 score 1.7k scriptssatijalab
SeuratObject:Data Structures for Single Cell Data
Defines S4 classes for single-cell genomic data and associated information, such as dimensionality reduction embeddings, nearest-neighbor graphs, and spatially-resolved coordinates. Provides data access methods and R-native hooks to ensure the Seurat object is familiar to other R users. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, and Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031> for more details.
Maintained by Paul Hoffman. Last updated 1 years ago.
4.1 match 25 stars 11.69 score 1.2k scripts 88 dependentsbioc
gdsfmt:R Interface to CoreArray Genomic Data Structure (GDS) Files
Provides a high-level R interface to CoreArray Genomic Data Structure (GDS) data files. GDS is portable across platforms with hierarchical structure to store multiple scalable array-oriented data sets with metadata information. It is suited for large-scale datasets, especially for data which are much larger than the available random-access memory. The gdsfmt package offers the efficient operations specifically designed for integers of less than 8 bits, since a diploid genotype, like single-nucleotide polymorphism (SNP), usually occupies fewer bits than a byte. Data compression and decompression are available with relatively efficient random access. It is also allowed to read a GDS file in parallel with multiple R processes supported by the package parallel.
Maintained by Xiuwen Zheng. Last updated 2 days ago.
infrastructuredataimportbioinformaticsgds-formatgenomicscpp
4.3 match 18 stars 11.34 score 920 scripts 29 dependentsbioc
S4Vectors:Foundation of vector-like and list-like containers in Bioconductor
The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantic of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).
Maintained by Hervé Pagès. Last updated 1 months ago.
infrastructuredatarepresentationbioconductor-packagecore-package
3.0 match 18 stars 16.05 score 1.0k scripts 1.9k dependentsr-lib
lifecycle:Manage the Life Cycle of your Package Functions
Manage the life cycle of your exported functions with shared conventions, documentation badges, and user-friendly deprecation warnings.
Maintained by Lionel Henry. Last updated 26 days ago.
3.0 match 95 stars 15.84 score 54 scripts 14k dependentsbioc
dada2:Accurate, high-resolution sample inference from amplicon sequencing data
The dada2 package infers exact amplicon sequence variants (ASVs) from high-throughput amplicon sequencing data, replacing the coarser and less accurate OTU clustering approach. The dada2 pipeline takes as input demultiplexed fastq files, and outputs the sequence variants and their sample-wise abundances after removing substitution and chimera errors. Taxonomic classification is available via a native implementation of the RDP naive Bayesian classifier, and species-level assignment to 16S rRNA gene fragments by exact matching.
Maintained by Benjamin Callahan. Last updated 5 months ago.
immunooncologymicrobiomesequencingclassificationmetagenomicsampliconbioconductorbioinformaticsmetabarcodingtaxonomycpp
3.6 match 485 stars 13.17 score 3.0k scripts 4 dependentsworkflowr
workflowr:A Framework for Reproducible and Collaborative Data Science
Provides a workflow for your analysis projects by combining literate programming ('knitr' and 'rmarkdown') and version control ('Git', via 'git2r') to generate a website containing time-stamped, versioned, and documented results.
Maintained by John Blischak. Last updated 4 months ago.
gitproject-managementrmarkdownwebsiteworkflow
4.0 match 845 stars 11.83 score 566 scriptsalexzwanenburg
familiar:End-to-End Automated Machine Learning and Model Evaluation
Single unified interface for end-to-end modelling of regression, categorical and time-to-event (survival) outcomes. Models created using familiar are self-containing, and their use does not require additional information such as baseline survival, feature clustering, or feature transformation and normalisation parameters. Model performance, calibration, risk group stratification, (permutation) variable importance, individual conditional expectation, partial dependence, and more, are assessed automatically as part of the evaluation process and exported in tabular format and plotted, and may also be computed manually using export and plot functions. Where possible, metrics and values obtained during the evaluation process come with confidence intervals.
Maintained by Alex Zwanenburg. Last updated 6 months ago.
aiexplainable-aimachine-learningsurvival-analysistabular-data
9.1 match 31 stars 5.05 score 18 scriptshojsgaard
doBy:Groupwise Statistics, LSmeans, Linear Estimates, Utilities
Utility package containing: 1) Facilities for working with grouped data: 'do' something to data stratified 'by' some variables. 2) LSmeans (least-squares means), general linear estimates. 3) Restrict functions to a smaller domain. 4) Miscellaneous other utilities.
Maintained by Søren Højsgaard. Last updated 4 days ago.
3.1 match 1 stars 14.94 score 3.2k scripts 939 dependentsadrientaudiere
MiscMetabar:Miscellaneous Functions for Metabarcoding Analysis
Facilitate the description, transformation, exploration, and reproducibility of metabarcoding analyses. 'MiscMetabar' is mainly built on top of the 'phyloseq', 'dada2' and 'targets' R packages. It helps to build reproducible and robust bioinformatics pipelines in R. 'MiscMetabar' makes ecological analysis of alpha and beta-diversity easier, more reproducible and more powerful by integrating a large number of tools. Important features are described in Taudière A. (2023) <doi:10.21105/joss.06038>.
Maintained by Adrien Taudière. Last updated 26 days ago.
sequencingmicrobiomemetagenomicsclusteringclassificationvisualizationampliconamplicon-sequencingbiodiversity-informaticsecologyilluminametabarcodingngs-analysis
7.1 match 17 stars 6.44 score 23 scriptstiledb-inc
tiledb:Modern Database Engine for Complex Data Based on Multi-Dimensional Arrays
The modern database 'TileDB' introduces a powerful on-disk format for storing and accessing any complex data based on multi-dimensional arrays. It supports dense and sparse arrays, dataframes and key-values stores, cloud storage ('S3', 'GCS', 'Azure'), chunked arrays, multiple compression, encryption and checksum filters, uses a fully multi-threaded implementation, supports parallel I/O, data versioning ('time travel'), metadata and groups. It is implemented as an embeddable cross-platform C++ library with APIs from several languages, and integrations. This package provides the R support.
Maintained by Isaiah Norton. Last updated 4 days ago.
arrayhdfss3storage-managertiledbcpp
3.8 match 107 stars 11.96 score 306 scripts 4 dependentsbioc
decoupleR:decoupleR: Ensemble of computational methods to infer biological activities from omics data
Many methods allow us to extract biological activities from omics data using information from prior knowledge resources, reducing the dimensionality for increased statistical power and better interpretability. Here, we present decoupleR, a Bioconductor package containing different statistical methods to extract these signatures within a unified framework. decoupleR allows the user to flexibly test any method with any resource. It incorporates methods that take into account the sign and weight of network interactions. decoupleR can be used with any omic, as long as its features can be linked to a biological process based on prior knowledge. For example, in transcriptomics gene sets regulated by a transcription factor, or in phospho-proteomics phosphosites that are targeted by a kinase.
Maintained by Pau Badia-i-Mompel. Last updated 5 months ago.
differentialexpressionfunctionalgenomicsgeneexpressiongeneregulationnetworksoftwarestatisticalmethodtranscription
4.0 match 230 stars 11.27 score 316 scripts 3 dependentsinsightsengineering
cards:Analysis Results Data
Construct CDISC (Clinical Data Interchange Standards Consortium) compliant Analysis Results Data objects. These objects are used and re-used to construct summary tables, visualizations, and written reports. The package also exports utilities for working with these objects and creating new Analysis Results Data objects.
Maintained by Daniel D. Sjoberg. Last updated 15 days ago.
3.9 match 39 stars 11.41 score 100 scripts 20 dependentsthomasp85
tidygraph:A Tidy API for Graph Manipulation
A graph, while not "tidy" in itself, can be thought of as two tidy data frames describing node and edge data respectively. 'tidygraph' provides an approach to manipulate these two virtual data frames using the API defined in the 'dplyr' package, as well as provides tidy interfaces to a lot of common graph algorithms.
Maintained by Thomas Lin Pedersen. Last updated 1 months ago.
graph-algorithmsgraph-manipulationigraphnetwork-analysistidyversecpp
3.0 match 553 stars 14.74 score 4.6k scripts 136 dependentsycphs
openxlsx:Read, Write and Edit xlsx Files
Simplifies the creation of Excel .xlsx files by providing a high level interface to writing, styling and editing worksheets. Through the use of 'Rcpp', read/write times are comparable to the 'xlsx' and 'XLConnect' packages with the added benefit of removing the dependency on Java.
Maintained by Jan Marvin Garbuszus. Last updated 2 months ago.
2.3 match 232 stars 18.98 score 20k scripts 270 dependentsgergness
srvyr:'dplyr'-Like Syntax for Summary Statistics of Survey Data
Use piping, verbs like 'group_by' and 'summarize', and other 'dplyr' inspired syntactic style when calculating summary statistics on survey data using functions from the 'survey' package.
Maintained by Greg Freedman Ellis. Last updated 1 months ago.
3.0 match 215 stars 13.88 score 1.8k scripts 15 dependentsips-lmu
emuR:Main Package of the EMU Speech Database Management System
Provide the EMU Speech Database Management System (EMU-SDMS) with database management, data extraction, data preparation and data visualization facilities. See <https://ips-lmu.github.io/The-EMU-SDMS-Manual/> for more details.
Maintained by Markus Jochim. Last updated 1 years ago.
6.0 match 24 stars 6.89 score 135 scripts 1 dependentsjprybylski
xpose.xtras:Extra Functionality for the 'xpose' Package
Adding some at-present missing functionality, or functions unlikely to be added to the base 'xpose' package. This includes some diagnostic plots that have been missing in translation from 'xpose4', but also some useful features that truly extend the capabilities of what can be done with 'xpose'. These extensions include the concept of a set of 'xpose' objects, and diagnostics for likelihood-based models.
Maintained by John Prybylski. Last updated 4 months ago.
6.8 match 6.01 score 5 scriptsr-lib
fs:Cross-Platform File System Operations Based on 'libuv'
A cross-platform interface to file system operations, built on top of the 'libuv' C library.
Maintained by Gábor Csárdi. Last updated 4 months ago.
2.0 match 370 stars 20.26 score 8.1k scripts 5.2k dependentscran
CytobankAPI:Cytobank API Wrapper for R
Tools to interface with Cytobank's API via R, organized by endpoints that represent various areas of Cytobank functionality. Learn more about Cytobank at <https://www.beckman.com/flow-cytometry/software>.
Maintained by Stu Blair. Last updated 2 years ago.
13.5 match 3.00 scoreyulab-smu
tidytree:A Tidy Tool for Phylogenetic Tree Data Manipulation
Phylogenetic tree generally contains multiple components including node, edge, branch and associated data. 'tidytree' provides an approach to convert tree object to tidy data frame as well as provides tidy interfaces to manipulate tree data.
Maintained by Guangchuang Yu. Last updated 8 months ago.
phylogenetic-treetidyversetree-data
3.0 match 54 stars 13.25 score 584 scripts 128 dependentskkholst
mets:Analysis of Multivariate Event Times
Implementation of various statistical models for multivariate event history data <doi:10.1007/s10985-013-9244-x>. Including multivariate cumulative incidence models <doi:10.1002/sim.6016>, and bivariate random effects probit models (Liability models) <doi:10.1016/j.csda.2015.01.014>. Modern methods for survival analysis, including regression modelling (Cox, Fine-Gray, Ghosh-Lin, Binomial regression) with fast computation of influence functions.
Maintained by Klaus K. Holst. Last updated 3 days ago.
multivariate-time-to-eventsurvival-analysistime-to-eventfortranopenblascpp
2.9 match 14 stars 13.47 score 236 scripts 42 dependentsropensci
rsat:Dealing with Multiplatform Satellite Images
Downloading, customizing, and processing time series of satellite images for a region of interest. 'rsat' functions allow a unified access to multispectral images from Landsat, MODIS and Sentinel repositories. 'rsat' also offers capabilities for customizing satellite images, such as tile mosaicking, image cropping and new variables computation. Finally, 'rsat' covers the processing, including cloud masking, compositing and gap-filling/smoothing time series of images (Militino et al., 2018 <doi:10.3390/rs10030398> and Militino et al., 2019 <doi:10.1109/TGRS.2019.2904193>).
Maintained by Unai Pérez - Goya. Last updated 11 months ago.
5.3 match 54 stars 7.45 score 52 scriptstbates
umx:Structural Equation Modeling and Twin Modeling in R
Quickly create, run, and report structural equation models, and twin models. See '?umx' for help, and umx_open_CRAN_page("umx") for NEWS. Timothy C. Bates, Michael C. Neale, Hermine H. Maes, (2019). umx: A library for Structural Equation and Twin Modelling in R. Twin Research and Human Genetics, 22, 27-41. <doi:10.1017/thg.2019.2>.
Maintained by Timothy C. Bates. Last updated 2 days ago.
behavior-geneticsgeneticsopenmxpsychologysemstatisticsstructural-equation-modelingtutorialstwin-modelsumx
4.1 match 44 stars 9.45 score 472 scriptsr-spatial
sf:Simple Features for R
Support for simple feature access, a standardized way to encode and analyze spatial vector data. Binds to 'GDAL' <doi: 10.5281/zenodo.5884351> for reading and writing data, to 'GEOS' <doi: 10.5281/zenodo.11396894> for geometrical operations, and to 'PROJ' <doi: 10.5281/zenodo.5884394> for projection conversions and datum transformations. Uses by default the 's2' package for geometry operations on geodetic (long/lat degree) coordinates.
Maintained by Edzer Pebesma. Last updated 16 days ago.
1.7 match 1.4k stars 22.42 score 117k scripts 1.2k dependentsmolgenis
dsTidyverseClient:'DataSHIELD' 'Tidyverse' Clientside Package
Implementation of selected 'Tidyverse' functions within 'DataSHIELD', an open-source federated analysis solution in R. Currently, 'DataSHIELD' contains very limited tools for data manipulation, so the aim of this package is to improve the researcher experience by implementing essential functions for data manipulation, including subsetting, filtering, grouping, and renaming variables. This is the clientside package which should be installed locally, and is used in conjuncture with the serverside package 'dsTidyverse' which is installed on the remote server holding the data. For more information, see <https://www.tidyverse.org/>, <https://datashield.org/> and <https://github.com/molgenis/ds-tidyverse>.
Maintained by Tim Cadman. Last updated 18 days ago.
7.0 match 1 stars 5.43 score 2 scriptsbioc
hermes:Preprocessing, analyzing, and reporting of RNA-seq data
Provides classes and functions for quality control, filtering, normalization and differential expression analysis of pre-processed `RNA-seq` data. Data can be imported from `SummarizedExperiment` as well as `matrix` objects and can be annotated from `BioMart`. Filtering for genes without too low expression or containing required annotations, as well as filtering for samples with sufficient correlation to other samples or total number of reads is supported. The standard normalization methods including cpm, rpkm and tpm can be used, and 'DESeq2` as well as voom differential expression analyses are available.
Maintained by Daniel Sabanés Bové. Last updated 5 months ago.
rnaseqdifferentialexpressionnormalizationpreprocessingqualitycontrolrna-seqstatistical-engineering
4.9 match 11 stars 7.77 score 48 scripts 1 dependentsoliverehmer
act:Aligned Corpus Toolkit
The Aligned Corpus Toolkit (act) is designed for linguists that work with time aligned transcription data. It offers functions to import and export various annotation file formats ('ELAN' .eaf, 'EXMARaLDA .exb and 'Praat' .TextGrid files), create print transcripts in the style of conversation analysis, search transcripts (span searches across multiple annotations, search in normalized annotations, make concordances etc.), export and re-import search results (.csv and 'Excel' .xlsx format), create cuts for the search results (print transcripts, audio/video cuts using 'FFmpeg' and video sub titles in 'Subrib title' .srt format), modify the data in a corpus (search/replace, delete, filter etc.), interact with 'Praat' using 'Praat'-scripts, and exchange data with the 'rPraat' package. The package is itself written in R and may be expanded by other users.
Maintained by Oliver Ehmer. Last updated 2 years ago.
5.6 match 4 stars 6.65 score 184 scriptsmelff
memisc:Management of Survey Data and Presentation of Analysis Results
An infrastructure for the management of survey data including value labels, definable missing values, recoding of variables, production of code books, and import of (subsets of) 'SPSS' and 'Stata' files is provided. Further, the package allows to produce tables and data frames of arbitrary descriptive statistics and (almost) publication-ready tables of regression model estimates, which can be exported to 'LaTeX' and HTML.
Maintained by Martin Elff. Last updated 11 days ago.
3.0 match 46 stars 12.34 score 1.2k scripts 13 dependentsfrankportman
bayesAB:Fast Bayesian Methods for AB Testing
A suite of functions that allow the user to analyze A/B test data in a Bayesian framework. Intended to be a drop-in replacement for common frequentist hypothesis test such as the t-test and chi-sq test.
Maintained by Frank Portman. Last updated 4 years ago.
ab-testingbayesian-methodsbayesian-testscpp
4.9 match 308 stars 7.43 score 88 scriptsjeromeecoac
seewave:Sound Analysis and Synthesis
Functions for analysing, manipulating, displaying, editing and synthesizing time waves (particularly sound). This package processes time analysis (oscillograms and envelopes), spectral content, resonance quality factor, entropy, cross correlation and autocorrelation, zero-crossing, dominant frequency, analytic signal, frequency coherence, 2D and 3D spectrograms and many other analyses. See Sueur et al. (2008) <doi:10.1080/09524622.2008.9753600> and Sueur (2018) <doi:10.1007/978-3-319-77647-7>.
Maintained by Jerome Sueur. Last updated 1 years ago.
4.0 match 18 stars 8.88 score 880 scripts 23 dependentsdgerbing
lessR:Less Code, More Results
Each function replaces multiple standard R functions. For example, two function calls, Read() and CountAll(), generate summary statistics for all variables in the data frame, plus histograms and bar charts as appropriate. Other functions provide for summary statistics via pivot tables, a comprehensive regression analysis, ANOVA and t-test, visualizations including the Violin/Box/Scatter plot for a numerical variable, bar chart, histogram, box plot, density curves, calibrated power curve, reading multiple data formats with the same function call, variable labels, time series with aggregation and forecasting, color themes, and Trellis (facet) graphics. Also includes a confirmatory factor analysis of multiple indicator measurement models, pedagogical routines for data simulation such as for the Central Limit Theorem, generation and rendering of regression instructions for interpretative output, and interactive visualizations.
Maintained by David W. Gerbing. Last updated 1 months ago.
4.8 match 6 stars 7.47 score 394 scripts 3 dependentsrpahl
container:Extending Base 'R' Lists
Extends the functionality of base 'R' lists and provides specialized data structures 'deque', 'set', 'dict', and 'dict.table', the latter to extend the 'data.table' package.
Maintained by Roman Pahl. Last updated 2 months ago.
containerdata-structuresdequedictsets
5.0 match 16 stars 7.13 score 140 scriptsjosesamos
rolap:Obtaining Star Databases from Flat Tables
Data in multidimensional systems is obtained from operational systems and is transformed to adapt it to the new structure. Frequently, the operations to be performed aim to transform a flat table into a ROLAP (Relational On-Line Analytical Processing) star database. The main objective of the package is to allow the definition of these transformations easily. The implementation of the multidimensional database obtained can be exported to work with multidimensional analysis tools on spreadsheets or relational databases.
Maintained by Jose Samos. Last updated 1 years ago.
5.7 match 5 stars 6.12 score 25 scripts 1 dependentsbioc
ILoReg:ILoReg: a tool for high-resolution cell population identification from scRNA-Seq data
ILoReg is a tool for identification of cell populations from scRNA-seq data. In particular, ILoReg is useful for finding cell populations with subtle transcriptomic differences. The method utilizes a self-supervised learning method, called Iteratitive Clustering Projection (ICP), to find cluster probabilities, which are used in noise reduction prior to PCA and the subsequent hierarchical clustering and t-SNE steps. Additionally, functions for differential expression analysis to find gene markers for the populations and gene expression visualization are provided.
Maintained by Johannes Smolander. Last updated 5 months ago.
singlecellsoftwareclusteringdimensionreductionrnaseqvisualizationtranscriptomicsdatarepresentationdifferentialexpressiontranscriptiongeneexpression
7.1 match 5 stars 4.88 score 2 scriptsbioc
tidyFlowCore:tidyFlowCore: Bringing flowCore to the tidyverse
tidyFlowCore bridges the gap between flow cytometry analysis using the flowCore Bioconductor package and the tidy data principles advocated by the tidyverse. It provides a suite of dplyr-, ggplot2-, and tidyr-like verbs specifically designed for working with flowFrame and flowSet objects as if they were tibbles; however, your data remain flowCore data structures under this layer of abstraction. tidyFlowCore enables intuitive and streamlined analysis workflows that can leverage both the Bioconductor and tidyverse ecosystems for cytometry data.
Maintained by Timothy Keyes. Last updated 5 months ago.
singlecellflowcytometryinfrastructure
8.0 match 1 stars 4.30 score 7 scriptssatijalab
Seurat:Tools for Single Cell Genomics
A toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. 'Seurat' aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031>, and Hao, Hao, et al (2020) <doi:10.1101/2020.10.12.335331> for more details.
Maintained by Paul Hoffman. Last updated 1 years ago.
human-cell-atlassingle-cell-genomicssingle-cell-rna-seqcpp
2.0 match 2.4k stars 16.86 score 50k scripts 73 dependentsbioc
gDRutils:A package with helper functions for processing drug response data
This package contains utility functions used throughout the gDR platform to fit data, manipulate data, and convert and validate data structures. This package also has the necessary default constants for gDR platform. Many of the functions are utilized by the gDRcore package.
Maintained by Arkadiusz Gladki. Last updated 4 days ago.
4.5 match 2 stars 7.40 score 3 scripts 3 dependentsmitchelloharawild
vitae:Curriculum Vitae for R Markdown
Provides templates and functions to simplify the production and maintenance of curriculum vitae.
Maintained by Mitchell OHara-Wild. Last updated 9 months ago.
3.0 match 1.2k stars 10.78 score 556 scriptskwb-r
dwc.wells:A Package for Condition Predictions for Drinking Water Wells
This package allows to predict the condition of a drinking water well based on ML models. The models are trained with results from pump tests and a large set of input variables e.g. the well material, the age and the number of regenerations.
Maintained by Michael Rustler. Last updated 3 years ago.
10.7 match 3.00 score 7 scriptsspesenti
SWIM:Scenario Weights for Importance Measurement
An efficient sensitivity analysis for stochastic models based on Monte Carlo samples. Provides weights on simulated scenarios from a stochastic model, such that stressed random variables fulfil given probabilistic constraints (e.g. specified values for risk measures), under the new scenario weights. Scenario weights are selected by constrained minimisation of the relative entropy to the baseline model. The 'SWIM' package is based on Pesenti S.M., Millossovich P., Tsanakas A. (2019) "Reverse Sensitivity Testing: What does it take to break the model" <openaccess.city.ac.uk/id/eprint/18896/> and Pesenti S.M. (2021) "Reverse Sensitivity Analysis for Risk Modelling" <https://www.ssrn.com/abstract=3878879>.
Maintained by Silvana M. Pesenti. Last updated 3 years ago.
5.0 match 8 stars 6.38 score 20 scriptsthinkr-open
fusen:Build a Package from Rmarkdown Files
Use Rmarkdown First method to build your package. Start your package with documentation, functions, examples and tests in the same unique file. Everything can be set from the Rmarkdown template file provided in your project, then inflated as a package. Inflating the template copies the relevant chunks and sections in the appropriate files required for package development.
Maintained by Vincent Guyader. Last updated 2 months ago.
3.3 match 163 stars 9.45 score 35 scriptsdopatendo
ILSAmerge:Merge and Download International Large-Scale Assessments (ILSA) Data
Merges and downloads 'SPSS' data from different International Large-Scale Assessments (ILSA), including: Trends in International Mathematics and Science Study (TIMSS), Progress in International Reading Literacy Study (PIRLS), and others.
Maintained by Andrés Christiansen. Last updated 21 days ago.
5.3 match 2 stars 5.86 score 12 scriptspaul-buerkner
brms:Bayesian Regression Models using 'Stan'
Fit Bayesian generalized (non-)linear multivariate multilevel models using 'Stan' for full Bayesian inference. A wide range of distributions and link functions are supported, allowing users to fit -- among others -- linear, robust linear, count data, survival, response times, ordinal, zero-inflated, hurdle, and even self-defined mixture models all in a multilevel context. Further modeling options include both theory-driven and data-driven non-linear terms, auto-correlation structures, censoring and truncation, meta-analytic standard errors, and quite a few more. In addition, all parameters of the response distribution can be predicted in order to perform distributional regression. Prior specifications are flexible and explicitly encourage users to apply prior distributions that actually reflect their prior knowledge. Models can easily be evaluated and compared using several methods assessing posterior or prior predictions. References: Bürkner (2017) <doi:10.18637/jss.v080.i01>; Bürkner (2018) <doi:10.32614/RJ-2018-017>; Bürkner (2021) <doi:10.18637/jss.v100.i05>; Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>.
Maintained by Paul-Christian Bürkner. Last updated 3 days ago.
bayesian-inferencebrmsmultilevel-modelsstanstatistical-models
1.9 match 1.3k stars 16.61 score 13k scripts 34 dependentsserafinialessio
dformula:Data Manipulation using Formula
A tool for manipulating data using the generic formula. A single formula allows to easily add, replace and remove variables before running the analysis.
Maintained by Alessio Serafini. Last updated 8 months ago.
8.4 match 3.70 score 1 scriptsr-lib
usethis:Automate Package and Project Setup
Automate package and project setup tasks that are otherwise performed manually. This includes setting up unit testing, test coverage, continuous integration, Git, 'GitHub', licenses, 'Rcpp', 'RStudio' projects, and more.
Maintained by Jennifer Bryan. Last updated 11 days ago.
1.8 match 869 stars 17.54 score 5.6k scripts 336 dependentsbioc
genefu:Computation of Gene Expression-Based Signatures in Breast Cancer
This package contains functions implementing various tasks usually required by gene expression analysis, especially in breast cancer studies: gene mapping between different microarray platforms, identification of molecular subtypes, implementation of published gene signatures, gene selection, and survival analysis.
Maintained by Benjamin Haibe-Kains. Last updated 4 months ago.
differentialexpressiongeneexpressionvisualizationclusteringclassification
4.1 match 7.42 score 193 scripts 3 dependentshope-data-science
tidyft:Fast and Memory Efficient Data Operations in Tidy Syntax
Tidy syntax for 'data.table', using modification by reference whenever possible. This toolkit is designed for big data analysis in high-performance desktop or laptop computers. The syntax of the package is similar or identical to 'tidyverse'. It is user friendly, memory efficient and time saving. For more information, check its ancestor package 'tidyfst'.
Maintained by Tian-Yuan Huang. Last updated 6 months ago.
4.9 match 35 stars 6.25 score 34 scriptshelixcn
phylotools:Phylogenetic Tools for Eco-Phylogenetics
A collection of tools for building RAxML supermatrix using PHYLIP or aligned FASTA files. These functions will be useful for building large phylogenies using multiple markers.
Maintained by Jinlong Zhang. Last updated 5 months ago.
4.1 match 11 stars 7.31 score 368 scriptstidyverse
googledrive:An Interface to Google Drive
Manage Google Drive files from R.
Maintained by Jennifer Bryan. Last updated 7 months ago.
2.0 match 329 stars 14.97 score 2.1k scripts 164 dependentsropensci
phonfieldwork:Linguistic Phonetic Fieldwork Tools
There are a lot of different typical tasks that have to be solved during phonetic research and experiments. This includes creating a presentation that will contain all stimuli, renaming and concatenating multiple sound files recorded during a session, automatic annotation in 'Praat' TextGrids (this is one of the sound annotation standards provided by 'Praat' software, see Boersma & Weenink 2020 <https://www.fon.hum.uva.nl/praat/>), creating an html table with annotations and spectrograms, and converting multiple formats ('Praat' TextGrid, 'ELAN', 'EXMARaLDA', 'Audacity', subtitles '.srt', and 'FLEx' flextext). All of these tasks can be solved by a mixture of different tools (any programming language has programs for automatic renaming, and Praat contains scripts for concatenating and renaming files, etc.). 'phonfieldwork' provides a functionality that will make it easier to solve those tasks independently of any additional tools. You can also compare the functionality with other packages: 'rPraat' <https://CRAN.R-project.org/package=rPraat>, 'textgRid' <https://CRAN.R-project.org/package=textgRid>.
Maintained by George Moroz. Last updated 8 months ago.
audacityeafelanexbexmaraldafieldworkflextextphoneticsphonologypraatsrt-subtitlestextgrid
4.5 match 20 stars 6.68 score 20 scriptsbioc
esATAC:An Easy-to-use Systematic pipeline for ATACseq data analysis
This package provides a framework and complete preset pipeline for quantification and analysis of ATAC-seq Reads. It covers raw sequencing reads preprocessing (FASTQ files), reads alignment (Rbowtie2), aligned reads file operations (SAM, BAM, and BED files), peak calling (F-seq), genome annotations (Motif, GO, SNP analysis) and quality control report. The package is managed by dataflow graph. It is easy for user to pass variables seamlessly between processes and understand the workflow. Users can process FASTQ files through end-to-end preset pipeline which produces a pretty HTML report for quality control and preliminary statistical results, or customize workflow starting from any intermediate stages with esATAC functions easily and flexibly.
Maintained by Zheng Wei. Last updated 5 months ago.
immunooncologysequencingdnaseqqualitycontrolalignmentpreprocessingcoverageatacseqdnaseseqatac-seqbioconductorpipelinecppopenjdk
4.9 match 23 stars 6.11 score 3 scriptstidyverse
googlesheets4:Access Google Sheets using the Sheets API V4
Interact with Google Sheets through the Sheets API v4 <https://developers.google.com/sheets/api>. "API" is an acronym for "application programming interface"; the Sheets API allows users to interact with Google Sheets programmatically, instead of via a web browser. The "v4" refers to the fact that the Sheets API is currently at version 4. This package can read and write both the metadata and the cell data in a Sheet.
Maintained by Jennifer Bryan. Last updated 8 months ago.
google-drivegoogle-sheetsspreadsheet
2.0 match 363 stars 14.55 score 7.0k scripts 144 dependentsbioc
MicrobiotaProcess:A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework
MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces MPSE class, this make it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under the unified and common framework (tidy-like framework).
Maintained by Shuangbin Xu. Last updated 5 months ago.
visualizationmicrobiomesoftwaremultiplecomparisonfeatureextractionmicrobiome-analysismicrobiome-data
3.0 match 183 stars 9.70 score 126 scripts 1 dependentsepiverse-trace
linelist:Tagging and Validating Epidemiological Data
Provides tools to help storing and handling case line list data. The 'linelist' class adds a tagging system to classical 'data.frame' objects to identify key epidemiological data such as dates of symptom onset, epidemiological case definition, age, gender or disease outcome. Once tagged, these variables can be seamlessly used in downstream analyses, making data pipelines more robust and reliable.
Maintained by Hugo Gruson. Last updated 23 days ago.
datadata-structuresepidemiologyepiverseoutbreakssdg-3structured-data
3.3 match 8 stars 8.80 score 61 scripts 2 dependentsbioc
plyinteractions:Extending tidy verbs to genomic interactions
Operate on `GInteractions` objects as tabular data using `dplyr`-like verbs. The functions and methods in `plyinteractions` provide a grammatical approach to manipulate `GInteractions`, to facilitate their integration in genomic analysis workflows.
Maintained by Jacques Serizay. Last updated 5 months ago.
6.0 match 4.75 score 14 scriptsbioc
GenomeInfoDb:Utilities for manipulating chromosome names, including modifying them to follow a particular naming style
Contains data and functions that define and allow translation between different chromosome sequence naming conventions (e.g., "chr1" versus "1"), including a function that attempts to place sequence names in their natural, rather than lexicographic, order.
Maintained by Hervé Pagès. Last updated 2 months ago.
geneticsdatarepresentationannotationgenomeannotationbioconductor-packagecore-package
1.7 match 32 stars 16.46 score 1.3k scripts 1.7k dependentsmlr-org
mlr3pipelines:Preprocessing Operators and Pipelines for 'mlr3'
Dataflow programming toolkit that enriches 'mlr3' with a diverse set of pipelining operators ('PipeOps') that can be composed into graphs. Operations exist for data preprocessing, model fitting, and ensemble learning. Graphs can themselves be treated as 'mlr3' 'Learners' and can therefore be resampled, benchmarked, and tuned.
Maintained by Martin Binder. Last updated 9 days ago.
baggingdata-sciencedataflow-programmingensemble-learningmachine-learningmlr3pipelinespreprocessingstacking
2.3 match 141 stars 12.36 score 448 scripts 7 dependentseasystats
datawizard:Easy Data Wrangling and Statistical Transformations
A lightweight package to assist in key steps involved in any data analysis workflow: (1) wrangling the raw data to get it in the needed form, (2) applying preprocessing steps and statistical transformations, and (3) compute statistical summaries of data properties and distributions. It is also the data wrangling backend for packages in 'easystats' ecosystem. References: Patil et al. (2022) <doi:10.21105/joss.04684>.
Maintained by Etienne Bacher. Last updated 10 days ago.
datadplyrhacktoberfestjanitormanipulationreshapetidyrwrangling
1.9 match 222 stars 14.71 score 436 scripts 119 dependentswinvector
rquery:Relational Query Generator for Data Manipulation at Scale
A piped query generator based on Edgar F. Codd's relational algebra, and on production experience using 'SQL' and 'dplyr' at big data scale. The design represents an attempt to make 'SQL' more teachable by denoting composition by a sequential pipeline notation instead of nested queries or functions. The implementation delivers reliable high performance data processing on large data systems such as 'Spark', databases, and 'data.table'. Package features include: data processing trees or pipelines as observable objects (able to report both columns produced and columns used), optimized 'SQL' generation as an explicit user visible table modeling step, plus explicit query reasoning and checking.
Maintained by John Mount. Last updated 2 years ago.
2.9 match 110 stars 9.53 score 126 scripts 3 dependentscmmr
rbiom:Read/Write, Analyze, and Visualize 'BIOM' Data
A toolkit for working with Biological Observation Matrix ('BIOM') files. Read/write all 'BIOM' formats. Compute rarefaction, alpha diversity, and beta diversity (including 'UniFrac'). Summarize counts by taxonomic level. Subset based on metadata. Generate visualizations and statistical analyses. CPU intensive operations are coded in C for speed.
Maintained by Daniel P. Smith. Last updated 6 days ago.
3.0 match 15 stars 9.02 score 117 scripts 6 dependentsrdatatable
data.table:Extension of `data.frame`
Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, friendly and fast character-separated-value read/write. Offers a natural and flexible syntax, for faster development.
Maintained by Tyson Barrett. Last updated 4 hours ago.
1.1 match 3.7k stars 23.52 score 230k scripts 4.6k dependentsvincentarelbundock
modelsummary:Summary Tables and Plots for Statistical Models and Data: Beautiful, Customizable, and Publication-Ready
Create beautiful and customizable tables to summarize several statistical models side-by-side. Draw coefficient plots, multi-level cross-tabs, dataset summaries, balance tables (a.k.a. "Table 1s"), and correlation matrices. This package supports dozens of statistical models, and it can produce tables in HTML, LaTeX, Word, Markdown, PDF, PowerPoint, Excel, RTF, JPG, or PNG. Tables can easily be embedded in 'Rmarkdown' or 'knitr' dynamic documents. Details can be found in Arel-Bundock (2022) <doi:10.18637/jss.v103.i01>.
Maintained by Vincent Arel-Bundock. Last updated 15 days ago.
2.0 match 926 stars 13.41 score 6.2k scripts 2 dependentsa-maldet
labelmachine:Make Labeling of R Data Sets Easy
Assign meaningful labels to data frame columns. 'labelmachine' manages your label assignment rules in 'yaml' files and makes it easy to use the same labels in multiple projects.
Maintained by Adrian Maldet. Last updated 5 years ago.
5.0 match 7 stars 5.26 score 13 scriptsmateiz
mlflow:Interface to 'MLflow'
R interface to 'MLflow', open source platform for the complete machine learning life cycle, see <https://mlflow.org/>. This package supports installing 'MLflow', tracking experiments, creating and running projects, and saving and serving models.
Maintained by Matei Zaharia. Last updated 2 days ago.
4.3 match 1 stars 6.25 score 644 scriptsadibender
pammtools:Piece-Wise Exponential Additive Mixed Modeling Tools for Survival Analysis
The Piece-wise exponential (Additive Mixed) Model (PAMM; Bender and others (2018) <doi: 10.1177/1471082X17748083>) is a powerful model class for the analysis of survival (or time-to-event) data, based on Generalized Additive (Mixed) Models (GA(M)Ms). It offers intuitive specification and robust estimation of complex survival models with stratified baseline hazards, random effects, time-varying effects, time-dependent covariates and cumulative effects (Bender and others (2019)), as well as support for left-truncated, competing risks and recurrent events data. pammtools provides tidy workflow for survival analysis with PAMMs, including data simulation, transformation and other functions for data preprocessing and model post-processing as well as visualization.
Maintained by Andreas Bender. Last updated 2 months ago.
additive-modelspammpammtoolspiece-wise-exponentialsurvival-analysis
3.0 match 48 stars 8.78 score 310 scripts 8 dependentsmolgenis
dsTidyverse:'DataSHIELD' 'Tidyverse' Serverside Package
Implementation of selected 'Tidyverse' functions within 'DataSHIELD', an open-source federated analysis solution in R. Currently, DataSHIELD contains very limited tools for data manipulation, so the aim of this package is to improve the researcher experience by implementing essential functions for data manipulation, including subsetting, filtering, grouping, and renaming variables. This is the serverside package which should be installed on the server holding the data, and is used in conjuncture with the clientside package 'dsTidyverseClient' which is installed in the local R environment of the analyst. For more information, see <https://www.tidyverse.org/> and <https://datashield.org/>.
Maintained by Tim Cadman. Last updated 18 days ago.
5.8 match 2 stars 4.56 score 2 scriptspoissonconsulting
readwritesqlite:Enhanced Reading and Writing for 'SQLite' Databases
Reads and writes data frames to 'SQLite' databases while preserving time zones (for POSIXct columns), projections (for 'sfc' columns), units (for 'units' columns), levels (for factors and ordered factors) and classes for logical, Date and 'hms' columns. It also logs changes to tables and provides more informative error messages.
Maintained by Joe Thorley. Last updated 2 months ago.
dbilogmetadataposixctreadsfcsqliteunitswrite
4.0 match 38 stars 6.42 score 11 scripts 1 dependentstidyverse
duckplyr:A 'DuckDB'-Backed Version of 'dplyr'
A drop-in replacement for 'dplyr', powered by 'DuckDB' for performance. Offers convenient utilities for working with in-memory and larger-than-memory data while retaining full 'dplyr' compatibility.
Maintained by Kirill Müller. Last updated 5 days ago.
analyticsdataframedplyrduckdbperformance
2.3 match 309 stars 11.33 score 220 scriptsrundel
ghclass:Tools for Managing Classes on GitHub
Interface for the GitHub API that enables efficient management of courses on GitHub. It has a functionality for managing organizations, teams, repositories, and users on GitHub and helps automate most of the tedious and repetitive tasks around creating and distributing assignments.
Maintained by Colin Rundel. Last updated 1 months ago.
3.5 match 142 stars 7.32 score 70 scriptsbioc
sparrow:Take command of set enrichment analyses through a unified interface
Provides a unified interface to a variety of GSEA techniques from different bioconductor packages. Results are harmonized into a single object and can be interrogated uniformly for quick exploration and interpretation of results. Interactive exploration of GSEA results is enabled through a shiny app provided by a sparrow.shiny sibling package.
Maintained by Steve Lianoglou. Last updated 3 months ago.
genesetenrichmentpathwaysbioinformaticsgsea
3.8 match 21 stars 6.58 score 13 scriptsmiraisolutions
XLConnect:Excel Connector for R
Provides comprehensive functionality to read, write and format Excel data.
Maintained by Martin Studer. Last updated 18 days ago.
cross-platformexcelr-languagexlconnectopenjdk
2.0 match 130 stars 12.28 score 1.2k scripts 1 dependentspatzaw
ReDaMoR:Relational Data Modeler
The aim of this package is to manipulate relational data models in R. It provides functions to create, modify and export data models in json format. It also allows importing models created with 'MySQL Workbench' (<https://www.mysql.com/products/workbench/>). These functions are accessible through a graphical user interface made with 'shiny'. Constraints such as types, keys, uniqueness and mandatory fields are automatically checked and corrected when editing a model. Finally, real data can be confronted to a model to check their compatibility.
Maintained by Patrice Godard. Last updated 23 days ago.
3.8 match 17 stars 6.24 score 17 scripts 1 dependentsdamiendevienne
phylter:Detect and Remove Outliers in Phylogenomics Datasets
Analyzis and filtering of phylogenomics datasets. It takes an input either a collection of gene trees (then transformed to matrices) or directly a collection of gene matrices and performs an iterative process to identify what species in what genes are outliers, and whose elimination significantly improves the concordance between the input matrices. The methods builds upon the Distatis approach (Abdi et al. (2005) <doi:10.1101/2021.09.08.459421>), a generalization of classical multidimensional scaling to multiple distance matrices.
Maintained by Aurélie Siberchicot. Last updated 12 days ago.
phylogenetic-treesphylogeneticsphylogenomicscpp
4.0 match 9 stars 5.91 score 6 scriptsperson-c
easybio:Comprehensive Single-Cell Annotation and Transcriptomic Analysis Toolkit
Provides a comprehensive toolkit for single-cell annotation with the 'CellMarker2.0' database (see Xia Li, Peng Wang, Yunpeng Zhang (2023) <doi: 10.1093/nar/gkac947>). Streamlines biological label assignment in single-cell RNA-seq data and facilitates transcriptomic analysis, including preparation of TCGA<https://portal.gdc.cancer.gov/> and GEO<https://www.ncbi.nlm.nih.gov/geo/> datasets, differential expression analysis and visualization of enrichment analysis results. Additional utility functions support various bioinformatics workflows. See Wei Cui (2024) <doi: 10.1101/2024.09.14.609619> for more details.
Maintained by Wei Cui. Last updated 13 days ago.
limmageoqueryedgerfgseabioinformaticscellmarker2gsearna-seqsingle-cell
3.5 match 10 stars 6.62 score 35 scriptssomalogic
SomaDataIO:Input/Output 'SomaScan' Data
Load and export 'SomaScan' data via the 'Standard BioTools, Inc.' structured text file called an ADAT ('*.adat'). For file format see <https://github.com/SomaLogic/SomaLogic-Data/blob/main/README.md>. The package also exports auxiliary functions for manipulating, wrangling, and extracting relevant information from an ADAT object once in memory.
Maintained by Caleb Scheidel. Last updated 1 months ago.
adatproteomicsproteomics-data-analysissomascan
3.0 match 26 stars 7.71 score 132 scriptslwjohnst86
carpenter:Build Common Tables of Summary Statistics for Reports
Mainly used to build tables that are commonly presented for bio-medical/health research, such as basic characteristic tables or descriptive statistics.
Maintained by Luke Johnston. Last updated 6 years ago.
characteristic-tablesdescriptive-statisticssciencetabletables
4.9 match 9 stars 4.73 score 12 scriptsspatstat
spatstat.explore:Exploratory Data Analysis for the 'spatstat' Family
Functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.
Maintained by Adrian Baddeley. Last updated 1 months ago.
cluster-detectionconfidence-intervalshypothesis-testingk-functionroc-curvesscan-statisticssignificance-testingsimulation-envelopesspatial-analysisspatial-data-analysisspatial-sharpeningspatial-smoothingspatial-statistics
2.3 match 1 stars 10.17 score 67 scripts 148 dependentsmatthewheun
matsbyname:An Implementation of Matrix Mathematics that Respects Row and Column Names
An implementation of matrix mathematics wherein operations are performed "by name."
Maintained by Matthew Heun. Last updated 10 days ago.
3.4 match 2 stars 6.65 score 150 scripts 1 dependentsbioc
AlpsNMR:Automated spectraL Processing System for NMR
Reads Bruker NMR data directories both zipped and unzipped. It provides automated and efficient signal processing for untargeted NMR metabolomics. It is able to interpolate the samples, detect outliers, exclude regions, normalize, detect peaks, align the spectra, integrate peaks, manage metadata and visualize the spectra. After spectra proccessing, it can apply multivariate analysis on extracted data. Efficient plotting with 1-D data is also available. Basic reading of 1D ACD/Labs exported JDX samples is also available.
Maintained by Sergio Oller Moreno. Last updated 5 months ago.
softwarepreprocessingvisualizationclassificationcheminformaticsmetabolomicsdataimport
3.0 match 15 stars 7.59 score 12 scripts 1 dependentsd-score
dscore:D-Score for Child Development
The D-score summarizes the child's performance on a set of milestones into a single number. The package implements four Rasch model keys to convert milestone scores into a D-score. It provides tools to calculate the D-score and its precision from the child's milestone scores, to convert the D-score into the Development-for-Age Z-score (DAZ) using age-conditional references, and to map milestone names into a generic 9-position item naming convention.
Maintained by Stef van Buuren. Last updated 7 months ago.
child-developmentd-scoredazdevelopmental-trajectoriesgrowth-chartsrasch-modelcpp
3.3 match 8 stars 6.89 score 40 scriptsropensci
RefManageR:Straightforward 'BibTeX' and 'BibLaTeX' Bibliography Management
Provides tools for importing and working with bibliographic references. It greatly enhances the 'bibentry' class by providing a class 'BibEntry' which stores 'BibTeX' and 'BibLaTeX' references, supports 'UTF-8' encoding, and can be easily searched by any field, by date ranges, and by various formats for name lists (author by last names, translator by full names, etc.). Entries can be updated, combined, sorted, printed in a number of styles, and exported. 'BibTeX' and 'BibLaTeX' '.bib' files can be read into 'R' and converted to 'BibEntry' objects. Interfaces to 'NCBI Entrez', 'CrossRef', and 'Zotero' are provided for importing references and references can be created from locally stored 'PDF' files using 'Poppler'. Includes functions for citing and generating a bibliography with hyperlinks for documents prepared with 'RMarkdown' or 'RHTML'.
Maintained by Mathew W. McLean. Last updated 4 months ago.
1.9 match 115 stars 12.06 score 2.3k scripts 16 dependentsropensci
git2rdata:Store and Retrieve Data.frames in a Git Repository
The git2rdata package is an R package for writing and reading dataframes as plain text files. A metadata file stores important information. 1) Storing metadata allows to maintain the classes of variables. By default, git2rdata optimizes the data for file storage. The optimization is most effective on data containing factors. The optimization makes the data less human readable. The user can turn this off when they prefer a human readable format over smaller files. Details on the implementation are available in vignette("plain_text", package = "git2rdata"). 2) Storing metadata also allows smaller row based diffs between two consecutive commits. This is a useful feature when storing data as plain text files under version control. Details on this part of the implementation are available in vignette("version_control", package = "git2rdata"). Although we envisioned git2rdata with a git workflow in mind, you can use it in combination with other version control systems like subversion or mercurial. 3) git2rdata is a useful tool in a reproducible and traceable workflow. vignette("workflow", package = "git2rdata") gives a toy example. 4) vignette("efficiency", package = "git2rdata") provides some insight into the efficiency of file storage, git repository size and speed for writing and reading.
Maintained by Thierry Onkelinx. Last updated 2 months ago.
reproducible-researchversion-control
2.3 match 99 stars 10.03 score 216 scripts 4 dependentsdreamrs
datamods:Modules to Import and Manipulate Data in 'Shiny'
'Shiny' modules to import data into an application or 'addin' from various sources, and to manipulate them after that.
Maintained by Victor Perrier. Last updated 11 days ago.
1.9 match 144 stars 12.03 score 174 scripts 7 dependentsprestodb
RPresto:DBI Connector to Presto
Implements a 'DBI' compliant interface to Presto. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes: <https://prestodb.io/>.
Maintained by Jarod G.R. Meng. Last updated 1 months ago.
2.3 match 132 stars 9.73 score 25 scripts 4 dependentsbodysbobb
HARplus:Enhanced R Package for 'GEMPACK' .har and .sl4 Files
Provides tools for processing and analyzing .har and .sl4 files, making it easier for 'GEMPACK' users and 'GTAP' researchers to handle large economic datasets. It simplifies the management of multiple experiment results, enabling faster and more efficient comparisons without complexity. Users can extract, restructure, and merge data seamlessly, ensuring compatibility across different tools. The processed data can be exported and used in 'R', 'Stata', 'Python', 'Julia', or any software that supports Text, CSV, or 'Excel' formats.
Maintained by Pattawee Puangchit. Last updated 14 hours ago.
4.6 match 2 stars 4.70 scorestemangiola
tidyseurat:Brings Seurat to the Tidyverse
It creates an invisible layer that allow to see the 'Seurat' object as tibble and interact seamlessly with the tidyverse.
Maintained by Stefano Mangiola. Last updated 8 months ago.
assaydomaininfrastructurernaseqdifferentialexpressiongeneexpressionnormalizationclusteringqualitycontrolsequencingtranscriptiontranscriptomicsdplyrggplot2pcapurrrsctseuratsingle-cellsingle-cell-rna-seqtibbletidyrtidyversetranscriptstsneumap
2.3 match 158 stars 9.66 score 398 scripts 1 dependentsstatdivlab
corncob:Count Regression for Correlated Observations with the Beta-Binomial
Statistical modeling for correlated count data using the beta-binomial distribution, described in Martin et al. (2020) <doi:10.1214/19-AOAS1283>. It allows for both mean and overdispersion covariates.
Maintained by Amy D Willis. Last updated 6 months ago.
2.3 match 105 stars 9.64 score 248 scripts 1 dependentsqile0317
APackOfTheClones:Visualization of Clonal Expansion for Single Cell Immune Profiles
Visualize clonal expansion via circle-packing. 'APackOfTheClones' extends 'scRepertoire' to produce a publication-ready visualization of clonal expansion at a single cell resolution, by representing expanded clones as differently sized circles. The method was originally implemented by Murray Christian and Ben Murrell in the following immunology study: Ma et al. (2021) <doi:10.1126/sciimmunol.abg6356>.
Maintained by Qile Yang. Last updated 4 months ago.
clonal-analysisimmune-repertoireimmune-systemscrna-seqscrnaseqseuratsingle-cellsingle-cell-genomicscpp
3.3 match 15 stars 6.45 score 15 scriptsjacobbien
simulator:An Engine for Running Simulations
A framework for performing simulations such as those common in methodological statistics papers. The design principles of this package are described in greater depth in Bien, J. (2016) "The simulator: An Engine to Streamline Simulations," which is available at <arXiv:1607.00021>.
Maintained by Jacob Bien. Last updated 2 years ago.
3.0 match 52 stars 7.13 score 103 scriptssizespectrum
mizer:Dynamic Multi-Species Size Spectrum Modelling
A set of classes and methods to set up and run multi-species, trait based and community size spectrum ecological models, focused on the marine environment.
Maintained by Gustav Delius. Last updated 2 months ago.
ecosystem-modelfish-population-dynamicsfisheriesfisheries-managementmarine-ecosystempopulation-dynamicssimulationsize-structurespecies-interactionstransport-equationcpp
2.3 match 38 stars 9.43 score 207 scriptsdbosak01
libr:Libraries, Data Dictionaries, and a Data Step for R
Contains a set of functions to create data libraries, generate data dictionaries, and simulate a data step. The libname() function will load a directory of data into a library in one line of code. The dictionary() function will generate data dictionaries for individual data frames or an entire library. And the datestep() function will perform row-by-row data processing.
Maintained by David Bosak. Last updated 3 months ago.
2.5 match 27 stars 8.27 score 48 scripts 2 dependentsstatisfactions
simpr:Flexible 'Tidyverse'-Friendly Simulations
A general, 'tidyverse'-friendly framework for simulation studies, design analysis, and power analysis. Specify data generation, define varying parameters, generate data, fit models, and tidy model results in a single pipeline, without needing loops or custom functions.
Maintained by Ethan Brown. Last updated 8 months ago.
3.0 match 43 stars 6.89 score 30 scriptshelske
KFAS:Kalman Filter and Smoother for Exponential Family State Space Models
State space modelling is an efficient and flexible framework for statistical inference of a broad class of time series and other data. KFAS includes computationally efficient functions for Kalman filtering, smoothing, forecasting, and simulation of multivariate exponential family state space models, with observations from Gaussian, Poisson, binomial, negative binomial, and gamma distributions. See the paper by Helske (2017) <doi:10.18637/jss.v078.i10> for details.
Maintained by Jouni Helske. Last updated 6 months ago.
dynamic-linear-modelexponential-familyfortrangaussian-modelsstate-spacetime-seriesopenblas
1.9 match 64 stars 10.97 score 242 scripts 16 dependentsinzightvit
iNZightTools:Tools for 'iNZight'
Provides a collection of wrapper functions for common variable and dataset manipulation workflows primarily used by 'iNZight', a graphical user interface providing easy exploration and visualisation of data for students of statistics, available in both desktop and online versions. Additionally, many of the functions return the 'tidyverse' code used to obtain the result in an effort to bridge the gap between GUI and coding.
Maintained by Tom Elliott. Last updated 3 months ago.
3.9 match 1 stars 5.16 score 18 scripts 2 dependentsnjlyon0
supportR:Support Functions for Wrangling and Visualization
Suite of helper functions for data wrangling and visualization. The only theme for these functions is that they tend towards simple, short, and narrowly-scoped. These functions are built for tasks that often recur but are not large enough in scope to warrant an ecosystem of interdependent functions.
Maintained by Nicholas J Lyon. Last updated 4 months ago.
3.2 match 5 stars 6.22 score 15 scriptsbioc
tidySingleCellExperiment:Brings SingleCellExperiment to the Tidyverse
'tidySingleCellExperiment' is an adapter that abstracts the 'SingleCellExperiment' container in the form of a 'tibble'. This allows *tidy* data manipulation, nesting, and plotting. For example, a 'tidySingleCellExperiment' is directly compatible with functions from 'tidyverse' packages `dplyr` and `tidyr`, as well as plotting with `ggplot2` and `plotly`. In addition, the package provides various utility functions specific to single-cell omics data analysis (e.g., aggregation of cell-level data to pseudobulks).
Maintained by Stefano Mangiola. Last updated 5 months ago.
assaydomaininfrastructurernaseqdifferentialexpressionsinglecellgeneexpressionnormalizationclusteringqualitycontrolsequencingbioconductordplyrggplot2plotlysingle-cell-rna-seqsingle-cell-sequencingsinglecellexperimenttibbletidyrtidyverse
2.3 match 36 stars 8.86 score 125 scripts 2 dependentsdyfanjones
s3fs:'Amazon Web Service S3' File System
Access 'Amazon Web Service Simple Storage Service' ('S3') <https://aws.amazon.com/s3/> as if it were a file system. Interface based on the R package 'fs'.
Maintained by Dyfan Jones. Last updated 7 months ago.
3.8 match 43 stars 5.28 score 11 scriptsstan-dev
shinystan:Interactive Visual and Numerical Diagnostics and Posterior Analysis for Bayesian Models
A graphical user interface for interactive Markov chain Monte Carlo (MCMC) diagnostics and plots and tables helpful for analyzing a posterior sample. The interface is powered by the 'Shiny' web application framework from 'RStudio' and works with the output of MCMC programs written in any programming language (and has extended functionality for 'Stan' models fit using the 'rstan' and 'rstanarm' packages).
Maintained by Jonah Gabry. Last updated 3 years ago.
bayesianbayesian-data-analysisbayesian-inferencebayesian-methodsbayesian-statisticsmcmcshiny-appsstanstatistical-graphics
1.5 match 200 stars 13.13 score 1.6k scripts 15 dependentssamuel-marsh
scCustomize:Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing
Collection of functions created and/or curated to aid in the visualization and analysis of single-cell data using 'R'. 'scCustomize' aims to provide 1) Customized visualizations for aid in ease of use and to create more aesthetic and functional visuals. 2) Improve speed/reproducibility of common tasks/pieces of code in scRNA-seq analysis with a single or group of functions. For citation please use: Marsh SE (2021) "Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing" <doi:10.5281/zenodo.5706430> RRID:SCR_024675.
Maintained by Samuel Marsh. Last updated 3 months ago.
customizationggplot2scrna-seqseuratsingle-cellsingle-cell-genomicssingle-cell-rna-seqvisualization
2.3 match 242 stars 8.75 score 1.1k scriptsnlmixr2
rxode2:Facilities for Simulating from ODE-Based Models
Facilities for running simulations from ordinary differential equation ('ODE') models, such as pharmacometrics and other compartmental models. A compilation manager translates the ODE model into C, compiles it, and dynamically loads the object code into R for improved computational efficiency. An event table object facilitates the specification of complex dosing regimens (optional) and sampling schedules. NB: The use of this package requires both C and Fortran compilers, for details on their use with R please see Section 6.3, Appendix A, and Appendix D in the "R Administration and Installation" manual. Also the code is mostly released under GPL. The 'VODE' and 'LSODA' are in the public domain. The information is available in the inst/COPYRIGHTS.
Maintained by Matthew L. Fidler. Last updated 30 days ago.
1.8 match 40 stars 11.24 score 220 scripts 13 dependentscraddm
eegUtils:Utilities for Electroencephalographic (EEG) Analysis
Electroencephalography data processing and visualization tools. Includes import functions for 'BioSemi' (.BDF), 'Neuroscan' (.CNT), 'Brain Vision Analyzer' (.VHDR), 'EEGLAB' (.set) and 'Fieldtrip' (.mat). Many preprocessing functions such as referencing, epoching, filtering, and ICA are available. There are a variety of visualizations possible, including timecourse and topographical plotting.
Maintained by Matt Craddock. Last updated 5 months ago.
eegeeg-analysiseeg-dataeeg-signalseeg-signals-processingopenblascppopenmp
3.0 match 106 stars 6.54 score 82 scriptsstocnet
manynet:Many Ways to Make, Modify, Map, Mark, and Measure Myriad Networks
Many tools for making, modifying, mapping, marking, measuring, and motifs and memberships of many different types of networks. All functions operate with matrices, edge lists, and 'igraph', 'network', and 'tidygraph' objects, and on one-mode, two-mode (bipartite), and sometimes three-mode networks. The package includes functions for importing and exporting, creating and generating networks, modifying networks and node and tie attributes, and describing and visualizing networks with sensible defaults.
Maintained by James Hollway. Last updated 3 months ago.
diffusion-modelsgraphsnetwork-analysis
3.0 match 13 stars 6.41 score 35 scripts 1 dependentsdoi-usgs
hydroloom:Utilities to Weave Hydrologic Fabrics
A collection of utilities that support creation of network attributes for hydrologic networks. Methods and algorithms implemented are documented in Moore et al. (2019) <doi:10.3133/ofr20191096>), Cormen and Leiserson (2022) <ISBN:9780262046305> and Verdin and Verdin (1999) <doi:10.1016/S0022-1694(99)00011-6>.
Maintained by David Blodgett. Last updated 2 months ago.
2.3 match 28 stars 8.53 score 19 scripts 6 dependentsmattheaphy
actxps:Create Actuarial Experience Studies: Prepare Data, Summarize Results, and Create Reports
Experience studies are used by actuaries to explore historical experience across blocks of business and to inform assumption setting activities. This package provides functions for preparing data, creating studies, visualizing results, and beginning assumption development. Experience study methods, including exposure calculations, are described in: Atkinson & McGarry (2016) "Experience Study Calculations" <https://www.soa.org/49378a/globalassets/assets/files/research/experience-study-calculations.pdf>. The limited fluctuation credibility method used by the 'exp_stats()' function is described in: Herzog (1999, ISBN:1-56698-374-6) "Introduction to Credibility Theory".
Maintained by Matt Heaphy. Last updated 2 months ago.
3.0 match 14 stars 6.38 score 23 scriptsbioc
tidySummarizedExperiment:Brings SummarizedExperiment to the Tidyverse
The tidySummarizedExperiment package provides a set of tools for creating and manipulating tidy data representations of SummarizedExperiment objects. SummarizedExperiment is a widely used data structure in bioinformatics for storing high-throughput genomic data, such as gene expression or DNA sequencing data. The tidySummarizedExperiment package introduces a tidy framework for working with SummarizedExperiment objects. It allows users to convert their data into a tidy format, where each observation is a row and each variable is a column. This tidy representation simplifies data manipulation, integration with other tidyverse packages, and enables seamless integration with the broader ecosystem of tidy tools for data analysis.
Maintained by Stefano Mangiola. Last updated 5 months ago.
assaydomaininfrastructurernaseqdifferentialexpressiongeneexpressionnormalizationclusteringqualitycontrolsequencingtranscriptiontranscriptomics
2.3 match 26 stars 8.44 score 196 scripts 1 dependentsrepboxr
repboxUtils:Utility functions shared by several repbox packages
Utility functions shared by several repbox packages
Maintained by Sebastian Kranz. Last updated 30 days ago.
4.5 match 4.21 score 9 dependentshope-data-science
tidyfst:Tidy Verbs for Fast Data Manipulation
A toolkit of tidy data manipulation verbs with 'data.table' as the backend. Combining the merits of syntax elegance from 'dplyr' and computing performance from 'data.table', 'tidyfst' intends to provide users with state-of-the-art data manipulation tools with least pain. This package is an extension of 'data.table'. While enjoying a tidy syntax, it also wraps combinations of efficient functions to facilitate frequently-used data operations.
Maintained by Tian-Yuan Huang. Last updated 6 months ago.
1.9 match 98 stars 10.09 score 118 scripts 4 dependentsmrc-ide
orderly2:Orderly Next Generation
Distributed reproducible computing framework, adopting ideas from git, docker and other software. By defining a lightweight interface around the inputs and outputs of an analysis, a lot of the repetitive work for reproducible research can be automated. We define a simple format for organising and describing work that facilitates collaborative reproducible research and acknowledges that all analyses are run multiple times over their lifespans.
Maintained by Rich FitzJohn. Last updated 2 months ago.
2.3 match 8 stars 8.30 score 49 scripts 2 dependentsdanchaltiel
crosstable:Crosstables for Descriptive Analyses
Create descriptive tables for continuous and categorical variables. Apply summary statistics and counting function, with or without a grouping variable, and create beautiful reports using 'rmarkdown' or 'officer'. You can also compute effect sizes and statistical tests if needed.
Maintained by Dan Chaltiel. Last updated 2 months ago.
descriptive-statisticsflextablefrequency-tablehtml-reportmswordofficer
1.8 match 116 stars 10.37 score 340 scriptsmarce10
warbleR:Streamline Bioacoustic Analysis
Functions aiming to facilitate the analysis of the structure of animal acoustic signals in 'R'. 'warbleR' makes use of the basic sound analysis tools from the packages 'tuneR' and 'seewave', and offers new tools for explore and quantify acoustic signal structure. The package allows to organize and manipulate multiple sound files, create spectrograms of complete recordings or individual signals in different formats, run several measures of acoustic structure, and characterize different structural levels in acoustic signals.
Maintained by Marcelo Araya-Salas. Last updated 2 months ago.
animal-acoustic-signalsaudio-processingbioacousticsspectrogramstreamline-analysiscpp
1.7 match 54 stars 11.01 score 270 scripts 4 dependentsrstudio
distill:'R Markdown' Format for Scientific and Technical Writing
Scientific and technical article format for the web. 'Distill' articles feature attractive, reader-friendly typography, flexible layout options for visualizations, and full support for footnotes and citations.
Maintained by Christophe Dervieux. Last updated 1 years ago.
1.9 match 422 stars 9.90 score 402 scripts 6 dependentskonstantinryabov
dmtools:Tools for Clinical Data Management
For checking the dataset from EDC(Electronic Data Capture) in clinical trials. 'dmtools' reshape your dataset in a tidy view and check events. You can reshape the dataset and choose your target to check, for example, the laboratory reference range.
Maintained by Konstantin Ryabov. Last updated 2 years ago.
cdiscclinical-data-managementlaboratory-reference-range-validate
4.3 match 1 stars 4.32 score 14 scriptspridiltal
staplr:A Toolkit for PDF Files
Provides functions to manipulate PDF files: fill out PDF forms; merge multiple PDF files into one; remove selected pages from a file; rename multiple files in a directory; rotate entire pdf document; rotate selected pages of a pdf file; Select pages from a file; splits single input PDF document into individual pages; splits single input PDF document into parts from given points.
Maintained by Priyanga Dilini Talagala. Last updated 1 years ago.
2.5 match 267 stars 7.31 score 36 scripts 2 dependentsvubiostat
redcapAPI:Interface to 'REDCap'
Access data stored in 'REDCap' databases using the Application Programming Interface (API). 'REDCap' (Research Electronic Data CAPture; <https://projectredcap.org>, Harris, et al. (2009) <doi:10.1016/j.jbi.2008.08.010>, Harris, et al. (2019) <doi:10.1016/j.jbi.2019.103208>) is a web application for building and managing online surveys and databases developed at Vanderbilt University. The API allows users to access data and project meta data (such as the data dictionary) from the web programmatically. The 'redcapAPI' package facilitates the process of accessing data with options to prepare an analysis-ready data set consistent with the definitions in a database's data dictionary.
Maintained by Shawn Garbett. Last updated 9 days ago.
1.8 match 22 stars 10.47 score 134 scripts 2 dependentswinvector
rqdatatable:'rquery' for 'data.table'
Implements the 'rquery' piped Codd-style query algebra using 'data.table'. This allows for a high-speed in memory implementation of Codd-style data manipulation tools.
Maintained by John Mount. Last updated 2 years ago.
2.3 match 38 stars 8.10 score 142 scripts 2 dependentsbioc
ribor:An R Interface for Ribo Files
The ribor package provides an R Interface for .ribo files. It provides functionality to read the .ribo file, which is of HDF5 format, and performs common analyses on its contents.
Maintained by Michael Geng. Last updated 5 months ago.
4.0 match 4.51 score 32 scriptspolmine
polmineR:Verbs and Nouns for Corpus Analysis
Package for corpus analysis using the Corpus Workbench ('CWB', <https://cwb.sourceforge.io>) as an efficient back end for indexing and querying large corpora. The package offers functionality to flexibly create subcorpora and to carry out basic statistical operations (count, co-occurrences etc.). The original full text of documents can be reconstructed and inspected at any time. Beyond that, the package is intended to serve as an interface to packages implementing advanced statistical procedures. Respective data structures (document-term matrices, term-co-occurrence matrices etc.) can be created based on the indexed corpora.
Maintained by Andreas Blaette. Last updated 1 years ago.
2.3 match 49 stars 7.96 score 311 scriptsmbojan
intergraph:Coercion Routines for Network Data Objects
Functions implemented in this package allow to coerce (i.e. convert) network data between classes provided by other R packages. Currently supported classes are those defined in packages: network and igraph.
Maintained by Michał Bojanowski. Last updated 1 years ago.
1.8 match 21 stars 9.95 score 724 scripts 20 dependentstalgalili
installr:Using R to Install Stuff on Windows OS (Such As: R, 'Rtools', 'RStudio', 'Git', and More!)
R is great for installing software. Through the 'installr' package you can automate the updating of R (on Windows, using updateR()) and install new software. Software installation is initiated through a GUI (just run installr()), or through functions such as: install.Rtools(), install.pandoc(), install.git(), and many more. The updateR() command performs the following: finding the latest R version, downloading it, running the installer, deleting the installation file, copy and updating old packages to the new R installation.
Maintained by Tal Galili. Last updated 1 years ago.
1.8 match 273 stars 10.19 score 1.2k scriptscanmod
macpan2:Fast and Flexible Compartmental Modelling
Fast and flexible compartmental modelling with Template Model Builder.
Maintained by Steve Walker. Last updated 2 days ago.
compartmental-modelsepidemiologyforecastingmixed-effectsmodel-fittingoptimizationsimulationsimulation-modelingcpp
2.0 match 4 stars 8.89 score 246 scripts 1 dependentspoissonconsulting
mcmcdata:Manipulate MCMC Samples and Data Frames
Manipulates Monte Carlo Markov Chain samples and associated data frames.
Maintained by Joe Thorley. Last updated 2 months ago.
5.0 match 1 stars 3.56 score 4 scripts 4 dependentsetiennebacher
tidypolars:Get the Power of Polars with the Syntax of the Tidyverse
Polars is a cross-language tool for manipulating very large data. However, one drawback is that the R implementation has a syntax that will look odd to many R users who are not used to Python syntax. The objective of tidypolars is to improve the ease-of-use of Polars in R by providing tidyverse syntax to polars.
Maintained by Etienne Bacher. Last updated 6 days ago.
2.3 match 198 stars 7.86 score 30 scriptsamerican-institutes-for-research
EdSurvey:Analysis of NCES Education Survey and Assessment Data
Read in and analyze functions for education survey and assessment data from the National Center for Education Statistics (NCES) <https://nces.ed.gov/>, including National Assessment of Educational Progress (NAEP) data <https://nces.ed.gov/nationsreportcard/> and data from the International Assessment Database: Organisation for Economic Co-operation and Development (OECD) <https://www.oecd.org/en/about/directorates/directorate-for-education-and-skills.html>, including Programme for International Student Assessment (PISA), Teaching and Learning International Survey (TALIS), Programme for the International Assessment of Adult Competencies (PIAAC), and International Association for the Evaluation of Educational Achievement (IEA) <https://www.iea.nl/>, including Trends in International Mathematics and Science Study (TIMSS), TIMSS Advanced, Progress in International Reading Literacy Study (PIRLS), International Civic and Citizenship Study (ICCS), International Computer and Information Literacy Study (ICILS), and Civic Education Study (CivEd).
Maintained by Paul Bailey. Last updated 16 days ago.
2.3 match 10 stars 7.86 score 139 scripts 1 dependentsjbengler
tidyplots:Tidy Plots for Scientific Papers
The goal of 'tidyplots' is to streamline the creation of publication-ready plots for scientific papers. It allows to gradually add, remove and adjust plot components using a consistent and intuitive syntax.
Maintained by Jan Broder Engler. Last updated 4 days ago.
1.9 match 482 stars 9.40 score 85 scriptsdavidwpierce
ncdf4:Interface to Unidata netCDF (Version 4 or Earlier) Format Data Files
Provides a high-level R interface to data files written using Unidata's netCDF library (version 4 or earlier), which are binary data files that are portable across platforms and include metadata information in addition to the data sets. Using this package, netCDF files (either version 4 or "classic" version 3) can be opened and data sets read in easily. It is also easy to create new netCDF dimensions, variables, and files, in either version 3 or 4 format, and manipulate existing netCDF files. This package replaces the former ncdf package, which only worked with netcdf version 3 files. For various reasons the names of the functions have had to be changed from the names in the ncdf package. The old ncdf package is still available at the URL given below, if you need to have backward compatibility. It should be possible to have both the ncdf and ncdf4 packages installed simultaneously without a problem. However, the ncdf package does not provide an interface for netcdf version 4 files.
Maintained by David Pierce. Last updated 7 months ago.
1.8 match 3 stars 9.71 score 8.9k scripts 181 dependentschristopherkenny
name:Tools for Working with Names
A system for organizing column names in data. Aimed at supporting a prefix-based and suffix-based column naming scheme. Extends 'dplyr' functionality to add ordering by function and more explicit renaming.
Maintained by Christopher T. Kenny. Last updated 3 years ago.
2.8 match 2 stars 6.28 score 19k scriptsstevenmmortimer
salesforcer:An Implementation of 'Salesforce' APIs Using Tidy Principles
Functions connecting to the 'Salesforce' Platform APIs (REST, SOAP, Bulk 1.0, Bulk 2.0, Metadata, Reports and Dashboards) <https://trailhead.salesforce.com/content/learn/modules/api_basics/api_basics_overview>. "API" is an acronym for "application programming interface". Most all calls from these APIs are supported as they use CSV, XML or JSON data that can be parsed into R data structures. For more details please see the 'Salesforce' API documentation and this package's website <https://stevenmmortimer.github.io/salesforcer/> for more information, documentation, and examples.
Maintained by Steven M. Mortimer. Last updated 4 months ago.
api-wrappersr-languager-programmingsalesforcesalesforce-apis
1.9 match 82 stars 9.27 score 191 scriptsinsileco
inSilecoMisc:inSileco Miscellaneous Functions
A set of miscellaneous R functions written by our inSileco group.
Maintained by Kevin Cazelles. Last updated 3 years ago.
5.3 match 4 stars 3.30 score 4 scriptsrolkra
explore:Simplifies Exploratory Data Analysis
Interactive data exploration with one line of code, automated reporting or use an easy to remember set of tidy functions for low code exploratory data analysis.
Maintained by Roland Krasser. Last updated 3 months ago.
data-explorationdata-visualisationdecision-treesedarmarkdownshinytidy
1.5 match 228 stars 11.43 score 221 scripts 1 dependentsinsightrx
PKPDsim:Tools for Performing Pharmacokinetic-Pharmacodynamic Simulations
Simulate dose regimens for pharmacokinetic-pharmacodynamic (PK-PD) models described by differential equation (DE) systems. Simulation using ADVAN-style analytical equations is also supported (Abuhelwa et al. (2015) <doi:10.1016/j.vascn.2015.03.004>).
Maintained by Ron Keizer. Last updated 20 days ago.
odepharmacodynamicspharmacokineticspharmacometricscpp
1.8 match 36 stars 9.47 score 100 scriptsbioc
sesame:SEnsible Step-wise Analysis of DNA MEthylation BeadChips
Tools For analyzing Illumina Infinium DNA methylation arrays. SeSAMe provides utilities to support analyses of multiple generations of Infinium DNA methylation BeadChips, including preprocessing, quality control, visualization and inference. SeSAMe features accurate detection calling, intelligent inference of ethnicity, sex and advanced quality control routines.
Maintained by Wanding Zhou. Last updated 2 months ago.
dnamethylationmethylationarraypreprocessingqualitycontrolbioinformaticsdna-methylationmicroarray
1.9 match 69 stars 9.08 score 258 scripts 1 dependentsdataoneorg
dataone:R Interface to the DataONE REST API
Provides read and write access to data and metadata from the DataONE network <https://www.dataone.org> of data repositories. Each DataONE repository implements a consistent repository application programming interface. Users call methods in R to access these remote repository functions, such as methods to query the metadata catalog, get access to metadata for particular data packages, and read the data objects from the data repository. Users can also insert and update data objects on repositories that support these methods.
Maintained by Matthew B. Jones. Last updated 3 years ago.
1.7 match 36 stars 9.93 score 472 scripts 3 dependentspecanproject
PEcAn.MA:PEcAn Functions Used for Meta-Analysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PECAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. The PEcAn.MA package contains the functions used in the Bayesian meta-analysis of trait data.
Maintained by David LeBauer. Last updated 2 days ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
1.7 match 216 stars 9.88 score 7 scripts 7 dependentstomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 years ago.
2.0 match 3 stars 8.20 score 7.8k scripts 11 dependentsctu-bern
redcaptools:Tools for exporting and working with REDCap data
Tools for exporting and working with REDCap data (e.g. adding labels, formatting dates).
Maintained by Alan G Haynes. Last updated 4 months ago.
3.6 match 4 stars 4.51 score 9 scriptsbioc
immunogenViewer:Visualization and evaluation of protein immunogens
Plots protein properties and visualizes position of peptide immunogens within protein sequence. Allows evaluation of immunogens based on structural and functional annotations to infer suitability for antibody-based methods aiming to detect native proteins.
Maintained by Katharina Waury. Last updated 1 months ago.
featureextractionproteomicssoftwarevisualization
3.5 match 4.65 score 10 scriptsvegawidget
vegawidget:'Htmlwidget' for 'Vega' and 'Vega-Lite'
'Vega' and 'Vega-Lite' parse text in 'JSON' notation to render chart-specifications into 'HTML'. This package is used to facilitate the rendering. It also provides a means to interact with signals, events, and datasets in a 'Vega' chart using 'JavaScript' or 'Shiny'.
Maintained by Ian Lyttle. Last updated 1 years ago.
2.0 match 68 stars 8.04 score 49 scripts 4 dependentsnrcan
PlotFTIR:Plot FTIR Spectra
The goal of 'PlotFTIR' is to easily and quickly kick-start the production of journal-quality Fourier Transform Infra-Red (FTIR) spectral plots in R using 'ggplot2'. The produced plots can be published directly or further modified by 'ggplot2' functions. L'objectif de 'PlotFTIR' est de démarrer facilement et rapidement la production des tracés spectraux de spectroscopie infrarouge à transformée de Fourier (IRTF) de qualité journal dans R à l'aide de 'ggplot2'. Les tracés produits peuvent être publiés directement ou modifiés davantage par les fonctions 'ggplot2'.
Maintained by Philip Bulsink. Last updated 1 months ago.
3.2 match 4.98 score 5 scriptscenterforassessment
SGP:Student Growth Percentiles & Percentile Growth Trajectories
An analytic framework for the calculation of norm- and criterion-referenced academic growth estimates using large scale, longitudinal education assessment data as developed in Betebenner (2009) <doi:10.1111/j.1745-3992.2009.00161.x>.
Maintained by Damian W. Betebenner. Last updated 2 months ago.
percentile-growth-projectionsquantile-regressionsgpsgp-analysesstudent-growth-percentilesstudent-growth-projections
1.6 match 20 stars 9.69 score 88 scriptsappelmar
gdalcubes:Earth Observation Data Cubes from Satellite Image Collections
Processing collections of Earth observation images as on-demand multispectral, multitemporal raster data cubes. Users define cubes by spatiotemporal extent, resolution, and spatial reference system and let 'gdalcubes' automatically apply cropping, reprojection, and resampling using the 'Geospatial Data Abstraction Library' ('GDAL'). Implemented functions on data cubes include reduction over space and time, applying arithmetic expressions on pixel band values, moving window aggregates over time, filtering by space, time, bands, and predicates on pixel values, exporting data cubes as 'netCDF' or 'GeoTIFF' files, plotting, and extraction from spatial and or spatiotemporal features. All computational parts are implemented in C++, linking to the 'GDAL', 'netCDF', 'CURL', and 'SQLite' libraries. See Appel and Pebesma (2019) <doi:10.3390/data4030092> for further details.
Maintained by Marius Appel. Last updated 1 years ago.
remote-sensingsatellite-imageryspatial-analysisgdalnetcdfcpp
1.9 match 124 stars 8.39 score 356 scriptsdkaschek
dMod:Dynamic Modeling and Parameter Estimation in ODE Models
The framework provides functions to generate ODEs of reaction networks, parameter transformations, observation functions, residual functions, etc. The framework follows the paradigm that derivative information should be used for optimization whenever possible. Therefore, all major functions produce and can handle expressions for symbolic derivatives.
Maintained by Daniel Kaschek. Last updated 10 days ago.
1.9 match 20 stars 8.35 score 251 scriptsobiba
opalr:'Opal' Data Repository Client and 'DataSHIELD' Utils
Data integration Web application for biobanks by 'OBiBa'. 'Opal' is the core database application for biobanks. Participant data, once collected from any data source, must be integrated and stored in a central data repository under a uniform model. 'Opal' is such a central repository. It can import, process, validate, query, analyze, report, and export data. 'Opal' is typically used in a research center to analyze the data acquired at assessment centres. Its ultimate purpose is to achieve seamless data-sharing among biobanks. This 'Opal' client allows to interact with 'Opal' web services and to perform operations on the R server side. 'DataSHIELD' administration tools are also provided.
Maintained by Yannick Marcon. Last updated 2 months ago.
2.0 match 3 stars 7.76 score 179 scripts 2 dependentsbranchlab
metasnf:Meta Clustering with Similarity Network Fusion
Framework to facilitate patient subtyping with similarity network fusion and meta clustering. The similarity network fusion (SNF) algorithm was introduced by Wang et al. (2014) in <doi:10.1038/nmeth.2810>. SNF is a data integration approach that can transform high-dimensional and diverse data types into a single similarity network suitable for clustering with minimal loss of information from each initial data source. The meta clustering approach was introduced by Caruana et al. (2006) in <doi:10.1109/ICDM.2006.103>. Meta clustering involves generating a wide range of cluster solutions by adjusting clustering hyperparameters, then clustering the solutions themselves into a manageable number of qualitatively similar solutions, and finally characterizing representative solutions to find ones that are best for the user's specific context. This package provides a framework to easily transform multi-modal data into a wide range of similarity network fusion-derived cluster solutions as well as to visualize, characterize, and validate those solutions. Core package functionality includes easy customization of distance metrics, clustering algorithms, and SNF hyperparameters to generate diverse clustering solutions; calculation and plotting of associations between features, between patients, and between cluster solutions; and standard cluster validation approaches including resampled measures of cluster stability, standard metrics of cluster quality, and label propagation to evaluate generalizability in unseen data. Associated vignettes guide the user through using the package to identify patient subtypes while adhering to best practices for unsupervised learning.
Maintained by Prashanth S Velayudhan. Last updated 5 days ago.
bioinformaticsclusteringmetaclusteringsnf
1.9 match 8 stars 8.21 score 30 scriptsbioc
PDATK:Pancreatic Ductal Adenocarcinoma Tool-Kit
Pancreatic ductal adenocarcinoma (PDA) has a relatively poor prognosis and is one of the most lethal cancers. Molecular classification of gene expression profiles holds the potential to identify meaningful subtypes which can inform therapeutic strategy in the clinical setting. The Pancreatic Cancer Adenocarcinoma Tool-Kit (PDATK) provides an S4 class-based interface for performing unsupervised subtype discovery, cross-cohort meta-clustering, gene-expression-based classification, and subsequent survival analysis to identify prognostically useful subtypes in pancreatic cancer and beyond. Two novel methods, Consensus Subtypes in Pancreatic Cancer (CSPC) and Pancreatic Cancer Overall Survival Predictor (PCOSP) are included for consensus-based meta-clustering and overall-survival prediction, respectively. Additionally, four published subtype classifiers and three published prognostic gene signatures are included to allow users to easily recreate published results, apply existing classifiers to new data, and benchmark the relative performance of new methods. The use of existing Bioconductor classes as input to all PDATK classes and methods enables integration with existing Bioconductor datasets, including the 21 pancreatic cancer patient cohorts available in the MetaGxPancreas data package. PDATK has been used to replicate results from Sandhu et al (2019) [https://doi.org/10.1200/cci.18.00102] and an additional paper is in the works using CSPC to validate subtypes from the included published classifiers, both of which use the data available in MetaGxPancreas. The inclusion of subtype centroids and prognostic gene signatures from these and other publications will enable researchers and clinicians to classify novel patient gene expression data, allowing the direct clinical application of the classifiers included in PDATK. Overall, PDATK provides a rich set of tools to identify and validate useful prognostic and molecular subtypes based on gene-expression data, benchmark new classifiers against existing ones, and apply discovered classifiers on novel patient data to inform clinical decision making.
Maintained by Benjamin Haibe-Kains. Last updated 5 months ago.
geneexpressionpharmacogeneticspharmacogenomicssoftwareclassificationsurvivalclusteringgeneprediction
3.5 match 1 stars 4.31 score 17 scriptspachadotdev
analogsea:Interface to 'DigitalOcean'
Provides a set of functions for interacting with the 'DigitalOcean' API <https://www.digitalocean.com/>, including creating images, destroying them, rebooting, getting details on regions, and available images.
Maintained by Mauricio Vargas. Last updated 2 years ago.
2.0 match 159 stars 7.56 score 100 scripts 1 dependentsbioc
MOFA2:Multi-Omics Factor Analysis v2
The MOFA2 package contains a collection of tools for training and analysing multi-omic factor analysis (MOFA). MOFA is a probabilistic factor model that aims to identify principal axes of variation from data sets that can comprise multiple omic layers and/or groups of samples. Additional time or space information on the samples can be incorporated using the MEFISTO framework, which is part of MOFA2. Downstream analysis functions to inspect molecular features underlying each factor, vizualisation, imputation etc are available.
Maintained by Ricard Argelaguet. Last updated 5 months ago.
dimensionreductionbayesianvisualizationfactor-analysismofamulti-omics
1.5 match 319 stars 10.02 score 502 scriptscredibilitylab
groundhog:Version-Control for CRAN, GitHub, and GitLab Packages
Make R scripts reproducible, by ensuring that every time a given script is run, the same version of the used packages are loaded (instead of whichever version the user running the script happens to have installed). This is achieved by using the command groundhog.library() instead of the base command library(), and including a date in the call. The date is used to call on the same version of the package every time (the most recent version available at that date). Load packages from CRAN, GitHub, or Gitlab.
Maintained by Uri Simonsohn. Last updated 26 days ago.
2.0 match 80 stars 7.51 score 264 scriptsronkeizer
vpc:Create Visual Predictive Checks
Visual predictive checks are a commonly used diagnostic plot in pharmacometrics, showing how certain statistics (percentiles) for observed data compare to those same statistics for data simulated from a model. The package can generate VPCs for continuous, categorical, censored, and (repeated) time-to-event data.
Maintained by Ron Keizer. Last updated 9 months ago.
1.7 match 36 stars 9.01 score 318 scripts 11 dependentsbioc
MANOR:CGH Micro-Array NORmalization
Importation, normalization, visualization, and quality control functions to correct identified sources of variability in array-CGH experiments.
Maintained by Pierre Neuvial. Last updated 4 days ago.
microarraytwochanneldataimportqualitycontrolpreprocessingcopynumbervariationnormalization
3.0 match 4.98 score 1 scriptsdoi-usgs
sbtools:USGS ScienceBase Tools
Tools for interacting with U.S. Geological Survey ScienceBase <https://www.sciencebase.gov> interfaces. ScienceBase is a data cataloging and collaborative data management platform. Functions included for querying ScienceBase, and creating and fetching datasets.
Maintained by David Blodgett. Last updated 10 months ago.
1.9 match 21 stars 7.94 score 127 scripts 2 dependentsbioc
lipidr:Data Mining and Analysis of Lipidomics Datasets
lipidr an easy-to-use R package implementing a complete workflow for downstream analysis of targeted and untargeted lipidomics data. lipidomics results can be imported into lipidr as a numerical matrix or a Skyline export, allowing integration into current analysis frameworks. Data mining of lipidomics datasets is enabled through integration with Metabolomics Workbench API. lipidr allows data inspection, normalization, univariate and multivariate analysis, displaying informative visualizations. lipidr also implements a novel Lipid Set Enrichment Analysis (LSEA), harnessing molecular information such as lipid class, total chain length and unsaturation.
Maintained by Ahmed Mohamed. Last updated 5 months ago.
lipidomicsmassspectrometrynormalizationqualitycontrolvisualizationbioconductor
2.0 match 29 stars 7.44 score 40 scriptsgenentech
psborrow2:Bayesian Dynamic Borrowing Analysis and Simulation
Bayesian dynamic borrowing is an approach to incorporating external data to supplement a randomized, controlled trial analysis in which external data are incorporated in a dynamic way (e.g., based on similarity of outcomes); see Viele 2013 <doi:10.1002/pst.1589> for an overview. This package implements the hierarchical commensurate prior approach to dynamic borrowing as described in Hobbes 2011 <doi:10.1111/j.1541-0420.2011.01564.x>. There are three main functionalities. First, 'psborrow2' provides a user-friendly interface for applying dynamic borrowing on the study results handles the Markov Chain Monte Carlo sampling on behalf of the user. Second, 'psborrow2' provides a simulation framework to compare different borrowing parameters (e.g. full borrowing, no borrowing, dynamic borrowing) and other trial and borrowing characteristics (e.g. sample size, covariates) in a unified way. Third, 'psborrow2' provides a set of functions to generate data for simulation studies, and also allows the user to specify their own data generation process. This package is designed to use the sampling functions from 'cmdstanr' which can be installed from <https://stan-dev.r-universe.dev>.
Maintained by Matt Secrest. Last updated 1 months ago.
bayesian-dynamic-borrowingpsborrow2simulation-study
1.9 match 18 stars 7.87 score 16 scriptsropensci
beastier:Call 'BEAST2'
'BEAST2' (<https://www.beast2.org>) is a widely used Bayesian phylogenetic tool, that uses DNA/RNA/protein data and many model priors to create a posterior of jointly estimated phylogenies and parameters. 'BEAST2' is a command-line tool. This package provides a way to call 'BEAST2' from an 'R' function call.
Maintained by Richèl J.C. Bilderbeek. Last updated 23 days ago.
bayesianbeastbeast2phylogenetic-inferencephylogeneticsopenjdk
1.9 match 11 stars 7.87 score 47 scripts 4 dependentsbioc
mistyR:Multiview Intercellular SpaTial modeling framework
mistyR is an implementation of the Multiview Intercellular SpaTialmodeling framework (MISTy). MISTy is an explainable machine learning framework for knowledge extraction and analysis of single-cell, highly multiplexed, spatially resolved data. MISTy facilitates an in-depth understanding of marker interactions by profiling the intra- and intercellular relationships. MISTy is a flexible framework able to process a custom number of views. Each of these views can describe a different spatial context, i.e., define a relationship among the observed expressions of the markers, such as intracellular regulation or paracrine regulation, but also, the views can also capture cell-type specific relationships, capture relations between functional footprints or focus on relations between different anatomical regions. Each MISTy view is considered as a potential source of variability in the measured marker expressions. Each MISTy view is then analyzed for its contribution to the total expression of each marker and is explained in terms of the interactions with other measurements that led to the observed contribution.
Maintained by Jovan Tanevski. Last updated 5 months ago.
softwarebiomedicalinformaticscellbiologysystemsbiologyregressiondecisiontreesinglecellspatialbioconductorbiologyintercellularmachine-learningmodularmolecular-biologymultiviewspatial-transcriptomics
1.9 match 51 stars 7.87 score 160 scriptsmarkbravington
mvbutils:General utilities, workspace organization, code and docu editing, live package maintenance, etc
Hierarchical workspace tree, code editing and backup, easy package prep, editing of packages while loaded, per-object lazy-loading, easy documentation, macro functions, and miscellaneous utilities. Needed by debug package.
Maintained by Mark V. Bravington. Last updated 6 days ago.
2.3 match 6.53 score 138 scripts 18 dependents