R-universe search: collapse

sebkrantz

collapse:Advanced and Fast Data Transformation

A C/C++ based package for advanced data transformation and statistical computing in R that is extremely fast, class-agnostic, robust and programmer friendly. Core functionality includes a rich set of S3 generic grouped and weighted statistical functions for vectors, matrices and data frames, which provide efficient low-level vectorizations, OpenMP multithreading, and skip missing values by default. These are integrated with fast grouping and ordering algorithms (also callable from C), and efficient data manipulation functions. The package also provides a flexible and rigorous approach to time series and panel data in R. It further includes fast functions for common statistical procedures, detailed (grouped, weighted) summary statistics, powerful tools to work with nested data, fast data object conversions, functions for memory efficient R programming, and helpers to effectively deal with variable labels, attributes, and missing data. It is well integrated with base R classes, 'dplyr'/'tibble', 'data.table', 'sf', 'units', 'plm' (panel-series and data frames), and 'xts'/'zoo'.

Maintained by Sebastian Krantz. Last updated 9 days ago.

data-aggregation data-analysis data-manipulation data-processing data-science data-transformation econometrics high-performance panel-data scientific-computing statistics time-series weighted weights cpp openmp

88.2 match 672 stars 16.63 score 708 scripts 97 dependents

emmanuelparadis

ape:Analyses of Phylogenetics and Evolution

Functions for reading, writing, plotting, and manipulating phylogenetic trees, analyses of comparative data in a phylogenetic framework, ancestral character analyses, analyses of diversification and macroevolution, computing distances from DNA sequences, reading and writing nucleotide sequences as well as importing from BioConductor, and several tools such as Mantel's test, generalized skyline plots, graphical exploration of phylogenetic data (alex, trex, kronoviz), estimation of absolute evolutionary rates and clock-like trees using mean path lengths and penalized likelihood, dating trees with non-contemporaneous sequences, translating DNA into AA sequences, and assessing sequence alignments. Phylogeny estimation can be done with the NJ, BIONJ, ME, MVR, SDM, and triangle methods, and several methods handling incomplete distance matrices (NJ*, BIONJ*, MVR*, and the corresponding triangle method). Some functions call external applications (PhyML, Clustal, T-Coffee, Muscle) whose results are returned into R.

Maintained by Emmanuel Paradis. Last updated 3 days ago.

openblas cpp

10.5 match 64 stars 17.22 score 13k scripts 599 dependents

bioc

ORFik:Open Reading Frames in Genomics

R package for analysis of transcript and translation features through manipulation of sequence data and NGS data like Ribo-Seq, RNA-Seq, TCP-Seq and CAGE. It is generalized in the sense that any transcript region can be analysed, although, as the name hints, it was made with investigation of ribosomal patterns over Open Reading Frames (ORFs) as its primary use case. ORFik is extremely fast through use of C++, data.table and GenomicRanges. The package allows reassigning transcript starts using CAGE-Seq data, automatic shifting of Ribo-Seq reads, finding Open Reading Frames for whole genomes, and much more.

Maintained by Haakon Tjeldnes. Last updated 30 days ago.

immunooncology software sequencing riboseq rnaseq functionalgenomics coverage alignment dataimport cpp

15.5 match 33 stars 10.63 score 115 scripts 2 dependents

solivella

lda:Collapsed Gibbs Sampling Methods for Topic Models

Implements latent Dirichlet allocation (LDA) and related models. This includes (but is not limited to) sLDA, corrLDA, and the mixed-membership stochastic blockmodel. Inference for all of these models is implemented via a fast collapsed Gibbs sampler written in C. Utility functions for reading/writing data typically used in topic models, as well as tools for examining posterior distributions, are also included.

Maintained by Santiago Olivella. Last updated 11 months ago.

16.1 match 7.62 score 548 scripts 11 dependents

cran

nlme:Linear and Nonlinear Mixed Effects Models

Fit and compare Gaussian linear and nonlinear mixed-effects models.

Maintained by R Core Team. Last updated 2 months ago.

fortran

7.0 match 6 stars 13.00 score 13k scripts 8.7k dependents

friendly

vcdExtra:'vcd' Extensions and Additions

Provides additional data sets, methods and documentation to complement the 'vcd' package for Visualizing Categorical Data and the 'gnm' package for Generalized Nonlinear Models. In particular, 'vcdExtra' extends mosaic, assoc and sieve plots from 'vcd' to handle 'glm()' and 'gnm()' models and adds a 3D version in 'mosaic3d'. Additionally, methods are provided for comparing and visualizing lists of 'glm' and 'loglm' objects. This package is now a support package for the book, "Discrete Data Analysis with R" by Michael Friendly and David Meyer.

Maintained by Michael Friendly. Last updated 5 months ago.

categorical-data-visualization generalized-linear-models mosaic-plots

8.7 match 24 stars 10.34 score 472 scripts 3 dependents

bioc

ggtree:an R package for visualization of tree and annotation data

'ggtree' extends the 'ggplot2' plotting system, which implements the grammar of graphics. 'ggtree' is designed for visualization and annotation of phylogenetic trees and other tree-like structures with their annotation data.

Maintained by Guangchuang Yu. Last updated 5 months ago.

alignment annotation clustering dataimport multiplesequencealignment phylogenetics reproducibleresearch software visualization annotations ggplot2 phylogenetic-trees

5.3 match 864 stars 16.86 score 5.1k scripts 109 dependents

talgalili

dendextend:Extending 'dendrogram' Functionality in R

Offers a set of functions for extending 'dendrogram' objects in R, letting you visualize and compare trees of 'hierarchical clusterings'. You can (1) adjust a tree's graphical parameters - the color, size, type, etc. of its branches, nodes and labels; (2) visually and statistically compare different 'dendrograms' to one another.

Maintained by Tal Galili. Last updated 2 months ago.

5.1 match 154 stars 17.02 score 6.0k scripts 164 dependents

bioc

fgsea:Fast Gene Set Enrichment Analysis

The package implements an algorithm for fast gene set enrichment analysis. The fast algorithm makes it possible to run more permutations and obtain more fine-grained p-values, which in turn allows the use of accurate standard approaches to multiple hypothesis correction.

Maintained by Alexey Sergushichev. Last updated 3 months ago.

geneexpression differentialexpression genesetenrichment pathways cpp

5.1 match 387 stars 16.25 score 3.9k scripts 101 dependents

liamrevell

phytools:Phylogenetic Tools for Comparative Biology (and Other Things)

A wide range of methods for phylogenetic analysis - concentrated in phylogenetic comparative biology, but also including numerous techniques for visualizing, analyzing, manipulating, reading or writing, and even inferring phylogenetic trees. Included among the functions in phylogenetic comparative biology are various methods for ancestral state reconstruction, model-fitting, and simulation of phylogenies and trait data. A broad range of plotting methods for phylogenies and comparative data includes (but is not restricted to) methods for mapping trait evolution on trees, for projecting trees into phenotype space or onto a geographic map, and for visualizing correlated speciation between trees. Lastly, numerous functions are designed for reading, writing, analyzing, inferring, simulating, and manipulating phylogenetic trees and comparative data. For instance, there are functions for computing consensus phylogenies from a set, for simulating phylogenetic trees and data under a range of models, for randomly or non-randomly attaching species or clades to a tree, as well as for a wide range of other manipulations and analyses that phylogenetic biologists might find useful in their research.

Maintained by Liam J. Revell. Last updated 30 days ago.

5.8 match 220 stars 13.84 score 4.8k scripts 77 dependents

tidyverse

glue:Interpreted String Literals

An implementation of interpreted string literals, inspired by Python's Literal String Interpolation <https://www.python.org/dev/peps/pep-0498/> and Docstrings <https://www.python.org/dev/peps/pep-0257/> and Julia's Triple-Quoted String Literals <https://docs.julialang.org/en/v1.3/manual/strings/#Triple-Quoted-String-Literals-1>.

Maintained by Jennifer Bryan. Last updated 6 months ago.

string-interpolation strings

3.5 match 729 stars 21.76 score 57k scripts 14k dependents

tidyverse

forcats:Tools for Working with Categorical Variables (Factors)

Helpers for reordering factor levels (including moving specified levels to front, ordering by first appearance, reversing, and randomly shuffling), and tools for modifying factor levels (including collapsing rare levels into other, 'anonymising', and manually 'recoding').

Maintained by Hadley Wickham. Last updated 1 year ago.

factor tidyverse

4.0 match 555 stars 18.77 score 21k scripts 1.2k dependents

berndbischl

BBmisc:Miscellaneous Helper Functions for B. Bischl

Miscellaneous helper functions for and from B. Bischl and some other guys, mainly for package development.

Maintained by Bernd Bischl. Last updated 2 years ago.

7.0 match 20 stars 10.59 score 980 scripts 69 dependents

tidyverse

dplyr:A Grammar of Data Manipulation

A fast, consistent tool for working with data frame like objects, both in memory and out of memory.

Maintained by Hadley Wickham. Last updated 16 days ago.

data-manipulation grammar cpp

3.0 match 4.8k stars 24.68 score 659k scripts 7.8k dependents

kwb-r

kwb.utils:General Utility Functions Developed at KWB

This package contains some small helper functions that aim at improving the quality of code developed at Kompetenzzentrum Wasser gGmbH (KWB).

Maintained by Hauke Sonnenberg. Last updated 12 months ago.

9.9 match 8 stars 7.33 score 12 scripts 78 dependents

adeelk93

collapsibleTree:Interactive Collapsible Tree Diagrams using 'D3.js'

Interactive Reingold-Tilford tree diagrams created using 'D3.js', where every node can be expanded and collapsed by clicking on it. Tooltips and color gradients can be mapped to nodes using a numeric column in the source data frame. See 'collapsibleTree' website for more information and examples.

Maintained by Adeel Khan. Last updated 1 year ago.

d3js htmlwidgets shiny

8.7 match 159 stars 8.25 score 472 scripts 6 dependents

poissonconsulting

nlist:Lists of Numeric Atomic Objects

Create and manipulate numeric list ('nlist') objects. An 'nlist' is an S3 list of uniquely named numeric objects. A numeric object is an integer or double vector, matrix or array. An 'nlists' object is an S3 class list of 'nlist' objects with the same names, dimensionalities and typeofs. Numeric list objects are of interest because they are the raw data inputs for analytic engines such as 'JAGS', 'STAN' and 'TMB'. Numeric list objects, which are useful for storing multiple realizations of simulated data sets, can be converted to coda::mcmc and coda::mcmc.list objects.

Maintained by Joe Thorley. Last updated 2 months ago.

data-frame natomic nlist nlists

9.0 match 6 stars 7.23 score 13 scripts 12 dependents

spatstat

spatstat.explore:Exploratory Data Analysis for the 'spatstat' Family

Functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.

Maintained by Adrian Baddeley. Last updated 3 days ago.

cluster-detection confidence-intervals hypothesis-testing k-function roc-curves scan-statistics significance-testing simulation-envelopes spatial-analysis spatial-data-analysis spatial-sharpening spatial-smoothing spatial-statistics

6.3 match 1 star 10.18 score 67 scripts 149 dependents

highlanderlab

SIMplyBee:'AlphaSimR' Extension for Simulating Honeybee Populations and Breeding Programmes

An extension of the 'AlphaSimR' package (<https://cran.r-project.org/package=AlphaSimR>) for stochastic simulations of honeybee populations and breeding programmes. 'SIMplyBee' enables simulation of individual bees that form a colony, which includes a queen, fathers (drones the queen mated with), virgin queens, workers, and drones. Multiple colonies can be merged into a population of colonies, such as an apiary or a whole country of colonies. Functions enable operations on castes, colony, or colonies, to ease 'R' scripting of whole populations. All 'AlphaSimR' functionality with respect to genomes and genetic and phenotype values is available and further extended for honeybees, including haplo-diploidy, complementary sex determiner locus, colony events (swarming, supersedure, etc.), and colony phenotype values.

Maintained by Jana Obšteter. Last updated 6 months ago.

cpp openmp

10.0 match 2 stars 6.24 score 18 scripts

christopherkenny

censable:Making Census Data More Usable

Creates a common framework for organizing, naming, and gathering population, age, race, and ethnicity data from the Census Bureau. Accesses the API <https://www.census.gov/data/developers/data-sets.html>. Provides tools for adding information to existing data to line up with Census data.

Maintained by Christopher T. Kenny. Last updated 10 months ago.

10.4 match 8 stars 5.78 score 42 scripts 4 dependents

epiforecasts

EpiNow2:Estimate Real-Time Case Counts and Time-Varying Epidemiological Parameters

Estimates the time-varying reproduction number, rate of spread, and doubling time using a range of open-source tools (Abbott et al. (2020) <doi:10.12688/wellcomeopenres.16006.1>), and current best practices (Gostic et al. (2020) <doi:10.1101/2020.06.18.20134858>). It aims to help users avoid some of the limitations of naive implementations in a framework that is informed by community feedback and is actively supported.

Maintained by Sebastian Funk. Last updated 28 days ago.

backcalculation covid-19 gaussian-processes open-source reproduction-number stan cpp

4.9 match 120 stars 11.88 score 210 scripts

kharchenkolab

pagoda2:Single Cell Analysis and Differential Expression

Analyzing and interactively exploring large-scale single-cell RNA-seq datasets. 'pagoda2' primarily performs normalization and differential gene expression analysis, with an interactive application for exploring single-cell RNA-seq datasets. It performs basic tasks such as cell size normalization and gene variance normalization, and can be used to identify subpopulations and run differential expression within individual samples. 'pagoda2' was written to rapidly process modern large-scale scRNAseq datasets of approximately 1e6 cells. The companion web application allows users to explore which gene expression patterns form the different subpopulations within their data. The package also serves as the primary method for preprocessing data for conos, <https://github.com/kharchenkolab/conos>. This package interacts with data available through the 'p2data' package, which is available in a 'drat' repository. To access this data package, see the instructions at <https://github.com/kharchenkolab/pagoda2>. The size of the 'p2data' package is approximately 6 MB.

Maintained by Evan Biederstedt. Last updated 1 year ago.

scrna-seq single-cell single-cell-rna-seq transcriptomics openblas cpp openmp

7.0 match 222 stars 8.00 score 282 scripts

bioc

Biostrings:Efficient manipulation of biological strings

Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences.

Maintained by Hervé Pagès. Last updated 27 days ago.

sequencematching alignment sequencing genetics dataimport datarepresentation infrastructure bioconductor-package core-package

3.0 match 61 stars 17.83 score 8.6k scripts 1.2k dependents

bioc

DESeq2:Differential gene expression analysis based on the negative binomial distribution

Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.

Maintained by Michael Love. Last updated 14 days ago.

sequencing rnaseq chipseq geneexpression transcription normalization differentialexpression bayesian regression principalcomponent clustering immunooncology openblas cpp

3.1 match 375 stars 16.11 score 17k scripts 115 dependents

kharchenkolab

sccore:Core Utilities for Single-Cell RNA-Seq

Core utilities for single-cell RNA-seq data analysis. Contained within are utility functions for working with differential expression (DE) matrices and count matrices, a collection of functions for manipulating and plotting data via 'ggplot2', and functions to work with cell graphs and cell embeddings. Graph-based methods include embedding kNN cell graphs into a UMAP <doi:10.21105/joss.00861>, collapsing vertices of each cluster in the graph, and propagating graph labels.

Maintained by Evan Biederstedt. Last updated 1 year ago.

cpp

7.5 match 12 stars 6.44 score 36 scripts 9 dependents

bioc

edgeR:Empirical Analysis of Digital Gene Expression Data in R

Differential expression analysis of sequence count data. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models, quasi-likelihood, and gene set enrichment. Can perform differential analyses of any type of omics data that produces read counts, including RNA-seq, ChIP-seq, ATAC-seq, Bisulfite-seq, SAGE, CAGE, metabolomics, or proteomics spectral counts. RNA-seq analyses can be conducted at the gene or isoform level, and tests can be conducted for differential exon or transcript usage.

Maintained by Yunshun Chen. Last updated 8 days ago.

alternativesplicing batcheffect bayesian biomedicalinformatics cellbiology chipseq clustering coverage differentialexpression differentialmethylation differentialsplicing dnamethylation epigenetics functionalgenomics geneexpression genesetenrichment genetics immunooncology multiplecomparison normalization pathways proteomics qualitycontrol regression rnaseq sage sequencing singlecell systemsbiology timecourse transcription transcriptomics openblas

3.5 match 13.40 score 17k scripts 255 dependents

bioc

autonomics:Unified Statistical Modeling of Omics Data

This package unifies access to statistical modeling of omics data: across linear modeling engines (lm, lme, lmer, limma, and wilcoxon), across coding systems (treatment, difference, deviation, etc.), across model formulae (with/without intercept, random effect, interaction or nesting), across omics platforms (microarray, rnaseq, msproteomics, affinity proteomics, metabolomics), across projection methods (pca, pls, sma, lda, spls, opls), and across clustering methods (hclust, pam, cmeans). It also provides a fast enrichment analysis implementation and an intuitive contrastogram visualisation to summarize contrast effects in complex designs.

Maintained by Aditya Bhagwat. Last updated 2 months ago.

software dataimport preprocessing dimensionreduction principalcomponent regression differentialexpression genesetenrichment transcriptomics transcription geneexpression rnaseq microarray proteomics metabolomics massspectrometry

7.9 match 5.95 score 5 scripts

bioc

DuplexDiscovereR:Analysis of the data from RNA duplex probing experiments

DuplexDiscovereR is a package designed for analyzing data from RNA cross-linking and proximity ligation protocols such as SPLASH, PARIS, LIGR-seq, and others. DuplexDiscovereR accepts input in the form of chimerically or split-aligned reads. It includes procedures for alignment classification, filtering, and efficient clustering of individual chimeric reads into duplex groups (DGs). Once DGs are identified, the package predicts RNA duplex formation and their hybridization energies. Additional metrics, such as p-values for the random ligation hypothesis or mean DG alignment scores, can be calculated to rank the final set of RNA duplexes. Data from multiple experiments or replicates can be processed separately and further compared to check the reproducibility of the experimental method.

Maintained by Egor Semenchenko. Last updated 2 months ago.

sequencing transcriptomics structuralprediction clustering splicedalignment

10.2 match 1 star 4.60 score 5 scripts

bioc

IRanges:Foundation of integer range manipulation in Bioconductor

Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.

Maintained by Hervé Pagès. Last updated 1 month ago.

infrastructure datarepresentation bioconductor-package core-package

3.0 match 22 stars 15.09 score 2.1k scripts 1.8k dependents

pik-piam

magclass:Data Class and Tools for Handling Spatial-Temporal Data

Data class for increased interoperability when working with spatial-temporal data, together with corresponding functions and methods (conversions, basic calculations and basic data manipulation). The class distinguishes between spatial, temporal and other dimensions to facilitate the development and interoperability of tools built for it. Additional features are name-based addressing of data and internal consistency checks (e.g. checking for the right data order in calculations).

Maintained by Jan Philipp Dietrich. Last updated 13 days ago.

4.0 match 5 stars 11.16 score 412 scripts 56 dependents

tidyverse

tidyr:Tidy Messy Data

Tools to help to create tidy data, where each column is a variable, each row is an observation, and each cell contains a single value. 'tidyr' contains tools for changing the shape (pivoting) and hierarchy (nesting and 'unnesting') of a dataset, turning deeply nested lists into rectangular data frames ('rectangling'), and extracting values out of string columns. It also includes tools for working with missing values (both implicit and explicit).

Maintained by Hadley Wickham. Last updated 16 days ago.

tidy-data cpp

1.8 match 1.4k stars 22.88 score 168k scripts 5.5k dependents

matthewheun

matsindf:Matrices in Data Frames

Provides functions to collapse a tidy data frame into matrices in a data frame and expand a data frame of matrices into a tidy data frame.

Maintained by Matthew Heun. Last updated 12 days ago.

7.4 match 5 stars 5.48 score 40 scripts

jsilve24

fido:Bayesian Multinomial Logistic Normal Regression

Provides methods for fitting and inspection of Bayesian Multinomial Logistic Normal Models using MAP estimation and Laplace Approximation as developed in Silverman et al. (2022) <https://www.jmlr.org/papers/v23/19-882.html>. Key functionality is implemented in C++ for scalability. 'fido' replaces the previous package 'stray'.

Maintained by Justin Silverman. Last updated 21 days ago.

cpp openmp

4.9 match 20 stars 8.31 score 103 scripts

metrumresearchgroup

mrgsolve:Simulate from ODE-Based Models

Fast simulation from ordinary differential equation (ODE) based models typically employed in quantitative pharmacology and systems biology.

Maintained by Kyle T Baron. Last updated 1 month ago.

mrgsolve ode openblas cpp

3.7 match 138 stars 10.90 score 1.2k scripts 3 dependents

arg0naut91

neatRanges:Tidy Up Date/Time Ranges

Collapse, partition, combine, fill gaps in and expand date/time ranges.

Maintained by Aljaz Jelenko. Last updated 3 years ago.

collapse date-range intervals partitioning timestamp-ranges cpp

12.3 match 3 stars 3.18 score 2 scripts

dipterix

dipsaus:A Dipping Sauce for Data Analysis and Visualizations

Works as an "add-on" to packages like 'shiny', 'future', as well as 'rlang', and provides utility functions. Just like dipping sauce adding flavors to potato chips or pita bread, 'dipsaus' for data analysis and visualizations adds handy functions and enhancements to popular packages. The goal is to provide simple solutions that are frequently asked for online, such as how to synchronize 'shiny' inputs without freezing the app, or how to get memory size on 'Linux' or 'MacOS' systems. The enhancements roughly fall into these four categories: 1. 'shiny' input widgets; 2. high-performance computing using the 'future' package; 3. modifying R calls and converting among numbers, strings, and other objects; 4. utility functions to get system information such as CPU chip-set, memory limit, etc.

Maintained by Zhengjia Wang. Last updated 8 days ago.

cpp

4.8 match 13 stars 7.90 score 85 scripts 3 dependents

business-science

tibbletime:Time Aware Tibbles

Built on top of the 'tibble' package, 'tibbletime' is an extension that allows for the creation of time aware tibbles. Some immediate advantages of this include: the ability to perform time-based subsetting on tibbles, quickly summarising and aggregating results by time periods, and creating columns that can be used as 'dplyr' time-based groups.

Maintained by Davis Vaughan. Last updated 4 months ago.

periodicity tibble time time-series timeseries cpp

3.6 match 177 stars 10.51 score 644 scripts 2 dependents

egenn

rtemis:Machine Learning and Visualization

Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.

Maintained by E.D. Gennatas. Last updated 1 month ago.

data-science data-visualization machine-learning machine-learning-library visualization

5.2 match 145 stars 7.09 score 50 scripts 2 dependents

lyzander

tableHTML:A Tool to Create HTML Tables

A tool to create and style HTML tables with CSS. These can be exported and used in any application that accepts HTML (e.g. 'shiny', 'rmarkdown', 'PowerPoint'). It also provides functions to create CSS files (which also work with shiny).

Maintained by Theo Boutaris. Last updated 2 years ago.

4.2 match 22 stars 8.80 score 280 scripts 1 dependent

biogenies

tidysq:Tidy Processing and Analysis of Biological Sequences

A tidy approach to analysis of biological sequences. All processing and data-storage functions are heavily optimized to allow the fastest and most efficient data storage.

Maintained by Dominik Rafacz. Last updated 3 months ago.

bioconductor bioinformatics biological-sequences fasta s3 sequences tibble tidy tidyverse vctrs cpp

4.9 match 40 stars 7.56 score 38 scripts

ebailey78

shinyBS:Extra Twitter Bootstrap Components for Shiny

Adds easy access to additional Twitter Bootstrap components to Shiny.

Maintained by Eric Bailey. Last updated 9 years ago.

3.0 match 183 stars 12.19 score 3.1k scripts 101 dependents

ms609

TreeTools:Create, Modify and Analyse Phylogenetic Trees

Efficient implementations of functions for the creation, modification and analysis of phylogenetic trees. Applications include: generation of trees with specified shapes; tree rearrangement; analysis of tree shape; rooting of trees and extraction of subtrees; calculation and depiction of split support; plotting the position of rogue taxa (Klopfstein & Spasojevic 2019) <doi:10.1371/journal.pone.0212942>; calculation of ancestor-descendant relationships, of 'stemwardness' (Asher & Smith, 2022) <doi:10.1093/sysbio/syab072>, and of tree balance (Mir et al. 2013, Lemant et al. 2022) <doi:10.1016/j.mbs.2012.10.005>, <doi:10.1093/sysbio/syac027>; artificial extinction (Asher & Smith, 2022) <doi:10.1093/sysbio/syab072>; import and export of trees from Newick, Nexus (Maddison et al. 1997) <doi:10.1093/sysbio/46.4.590>, and TNT <https://www.lillo.org.ar/phylogeny/tnt/> formats; and analysis of splits and cladistic information.

Maintained by Martin R. Smith. Last updated 1 month ago.

evolutionary-biology phylogenetic-trees phylogenetics cpp

3.7 match 21 stars 9.92 score 124 scripts 10 dependents

ludvigolsen

groupdata2:Creating Groups from Data

Methods for dividing data into groups. Create balanced partitions and cross-validation folds. Perform time series windowing and general grouping and splitting of data. Balance existing groups with up- and downsampling or collapse them to fewer groups.

Maintained by Ludvig Renbo Olsen. Last updated 3 months ago.

balance cross-validation data data-frame fold group-factor groups participants partition split staircase

4.0 match 27 stars 9.04 score 338 scripts 7 dependents

r-lib

cli:Helpers for Developing Command Line Interfaces

A suite of tools to build attractive command line interfaces ('CLIs'), from semantic elements: headings, lists, alerts, paragraphs, etc. Supports custom themes via a 'CSS'-like language. It also contains a number of lower level 'CLI' elements: rules, boxes, trees, and 'Unicode' symbols with 'ASCII' alternatives. It supports ANSI colors and text styles as well.

Maintained by Gábor Csárdi. Last updated 2 days ago.

cli

1.9 match 664 stars 19.34 score 1.4k scripts 14k dependents

bioc

extraChIPs:Additional functions for working with ChIP-Seq data

This package builds on existing tools and adds some simple but extremely useful capabilities for working with ChIP-Seq data. The focus is on detecting differential binding windows/regions. One set of functions focusses on set-operations retaining mcols for GRanges objects, whilst another group of functions aids visualisation of results. Coercion to tibble objects is also implemented.

Maintained by Stevie Pederson. Last updated 19 days ago.

chipseq hic sequencing coverage

5.4 match 7 stars 6.67 score 25 scripts

tidymodels

recipes:Preprocessing and Feature Engineering Steps for Modeling

A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.

Maintained by Max Kuhn. Last updated 1 day ago.

1.9 match 584 stars 18.73 score 7.2k scripts 382 dependents

haozhu233

kableExtra:Construct Complex Table with 'kable' and Pipe Syntax

Build complex HTML or 'LaTeX' tables using 'kable()' from 'knitr' and the piping syntax from 'magrittr'. Function 'kable()' is a lightweight table generator coming from 'knitr'. This package simplifies the way to manipulate the HTML or 'LaTeX' code generated by 'kable()' and allows users to construct complex tables and customize styles using a readable syntax.

Maintained by Hao Zhu. Last updated 13 days ago.

html kable kableextra knitr latex rmarkdown

1.8 match 702 stars 19.35 score 55k scripts 163 dependents

bioc

QSutils:Quasispecies Diversity

Set of utility functions for viral quasispecies analysis with NGS data. Most functions are equally useful for metagenomic studies. There are three main types: (1) data manipulation and exploration—functions useful for converting reads to haplotypes and frequencies, repairing reads, intersecting strand haplotypes, and visualizing haplotype alignments. (2) diversity indices—functions to compute diversity and entropy, in which incidence, abundance, and functional indices are considered. (3) data simulation—functions useful for generating random viral quasispecies data.

Maintained by Mercedes Guerrero-Murillo. Last updated 5 months ago.

software genetics dnaseq geneticvariability sequencing alignment sequencematching dataimport

6.2 match 5.56 score 8 scripts 1 dependents

tidymodels

embed:Extra Recipes for Encoding Predictors

Predictors can be converted to one or more numeric representations using a variety of methods. Effect encodings using simple generalized linear models <doi:10.48550/arXiv.1611.09477> or nonlinear models <doi:10.48550/arXiv.1604.06737> can be used. There are also functions for dimension reduction and other approaches.

Maintained by Emil Hvitfeldt. Last updated 2 months ago.

3.7 match 142 stars 9.35 score 1.1k scripts

tomaskrehlik

frequencyConnectedness:Spectral Decomposition of Connectedness Measures

Accompanies a paper (Barunik, Krehlik (2018) <doi:10.1093/jjfinec/nby001>) dedicated to spectral decomposition of connectedness measures and their interpretation. We implement all the developed estimators as well as the historical counterparts. For more information, see the help or the GitHub page (<https://github.com/tomaskrehlik/frequencyConnectedness>).

Maintained by Tomas Krehlik. Last updated 2 years ago.

5.8 match 100 stars 5.88 score 50 scripts 1 dependents

rstudio

bslib:Custom 'Bootstrap' 'Sass' Themes for 'shiny' and 'rmarkdown'

Simplifies custom 'CSS' styling of both 'shiny' and 'rmarkdown' via 'Bootstrap' 'Sass'. Supports 'Bootstrap' 3, 4 and 5 as well as their various 'Bootswatch' themes. An interactive widget is also provided for previewing themes in real time.

Maintained by Carson Sievert. Last updated 14 days ago.

bootstrap htmltools rmarkdown sass shiny

1.9 match 511 stars 18.02 score 5.1k scripts 4.3k dependents

skranz

RTutor:Interactive R problem sets with automatic testing of solutions and automatic hints

Interactive R problem sets with automatic testing of solutions and automatic hints

Maintained by Sebastian Kranz. Last updated 1 year ago.

economics learn-to-code problem-set rstudio rtutor shiny teaching

5.8 match 205 stars 5.83 score 111 scripts 1 dependents

bioc

sesame:SEnsible Step-wise Analysis of DNA MEthylation BeadChips

Tools for analyzing Illumina Infinium DNA methylation arrays. SeSAMe provides utilities to support analyses of multiple generations of Infinium DNA methylation BeadChips, including preprocessing, quality control, visualization and inference. SeSAMe features accurate detection calling, intelligent inference of ethnicity and sex, and advanced quality control routines.

Maintained by Wanding Zhou. Last updated 2 months ago.

dnamethylation methylationarray preprocessing qualitycontrol bioinformatics dna-methylation microarray

3.7 match 69 stars 9.08 score 258 scripts 1 dependents

andrisignorell

DescTools:Tools for Descriptive Statistics

A collection of miscellaneous basic statistic functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox, which facilitates the (notoriously time consuming) first descriptive tasks in data analysis, consisting of calculating descriptive statistics, drawing graphical summaries and reporting the results. The package furthermore contains functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources written partly by Titans of R. The reason for collecting them here was primarily to have them consolidated in ONE instead of dozens of packages (which themselves might depend on other packages which are not needed at all), and to provide a common and consistent interface as far as function and arguments naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in the absence of convincing alternatives). The 'BigCamelCase' style was consequently applied to functions borrowed from contributed R packages as well.

Maintained by Andri Signorell. Last updated 3 days ago.

fortran cpp

2.0 match 87 stars 16.70 score 7.7k scripts 99 dependents

r-spatial

stars:Spatiotemporal Arrays, Raster and Vector Data Cubes

Reading, manipulating, writing and plotting spatiotemporal arrays (raster and vector data cubes) in 'R', using 'GDAL' bindings provided by 'sf', and 'NetCDF' bindings by 'ncmeta' and 'RNetCDF'.

Maintained by Edzer Pebesma. Last updated 1 month ago.

raster satellite-images spatial

1.8 match 571 stars 18.27 score 7.2k scripts 137 dependents

gforge

htmlTable:Advanced Tables for Markdown/HTML

Tables with state-of-the-art layout elements such as row spanners, column spanners, table spanners, zebra striping, and more. While allowing advanced layout, the underlying CSS structure is simple in order to maximize compatibility with common word processors. The package also contains a few text formatting functions that help with outputting text compatible with HTML/LaTeX.

Maintained by Max Gordon. Last updated 8 months ago.

knitr table

2.0 match 79 stars 15.32 score 1.3k scripts 763 dependents

bioc

RCy3:Functions to Access and Control Cytoscape

Visualize, analyze and explore networks using Cytoscape via R. Anything you can do using the graphical user interface of Cytoscape, you can now do with a single RCy3 function.

Maintained by Alex Pico. Last updated 5 months ago.

visualization graphandnetwork thirdpartyclient network

2.3 match 52 stars 13.42 score 628 scripts 17 dependents

atorus-research

Tplyr:A Traceability Focused Grammar of Clinical Data Summary

A traceability focused tool created to simplify the data manipulation necessary to create clinical summaries.

Maintained by Mike Stackhouse. Last updated 1 year ago.

pharma tables

3.1 match 95 stars 9.49 score 138 scripts 2 dependents

setempler

miscset:Miscellaneous Tools Set

A collection of miscellaneous methods to simplify various tasks, including plotting, data.frame and matrix transformations, environment functions, regular expression methods, and string and logical operations, as well as numerical and statistical tools. Most of the methods are simple but useful wrappers of common base R functions, which extend S3 generics or provide default values for important parameters.

Maintained by Sven E. Templer. Last updated 8 years ago.

miscellaneous cpp

6.8 match 1 stars 4.40 score 50 scripts

bioc

iSEE:Interactive SummarizedExperiment Explorer

Create an interactive Shiny-based graphical user interface for exploring data stored in SummarizedExperiment objects, including row- and column-level metadata. The interface supports transmission of selections between plots and tables, code tracking, interactive tours, interactive or programmatic initialization, preservation of app state, and extensibility to new panel types via S4 classes. Special attention is given to single-cell data in a SingleCellExperiment object with visualization of dimensionality reduction results.

Maintained by Kevin Rue-Albrecht. Last updated 13 days ago.

cellbasedassays clustering dimensionreduction featureextraction geneexpression gui immunooncology shinyapps singlecell transcription transcriptomics visualization dimension-reduction feature-extraction gene-expression hacktoberfest human-cell-atlas shiny single-cell

2.3 match 225 stars 12.86 score 380 scripts 9 dependents

dgkf

ggpackets:Package Plot Layers for Easier Portability and Modularization

Create groups of 'ggplot2' layers that can be easily migrated from one plot to another, reducing redundant code and improving the ability to format many plots that draw from the same source 'ggpacket' layers.

Maintained by Doug Kelkhoff. Last updated 17 days ago.

ggplot plotting

3.9 match 69 stars 7.34 score 12 scripts 1 dependents

nutterb

pixiedust:Tables so Beautifully Fine-Tuned You Will Believe It's Magic

The introduction of the 'broom' package has made converting model objects into data frames as simple as a single function. While the 'broom' package focuses on providing tidy data frames that can be used in advanced analysis, it deliberately stops short of providing functionality for reporting models in publication-ready tables. 'pixiedust' provides this functionality with a programming interface intended to be similar to 'ggplot2's system of layers with fine tuned control over each cell of the table. Options for output include printing to the console and to the common markdown formats (markdown, HTML, and LaTeX). With a little 'pixiedust' (and happy thoughts) tables can really fly.

Maintained by Benjamin Nutter. Last updated 1 year ago.

3.5 match 180 stars 8.01 score 94 scripts

mlr-org

mlr3pipelines:Preprocessing Operators and Pipelines for 'mlr3'

Dataflow programming toolkit that enriches 'mlr3' with a diverse set of pipelining operators ('PipeOps') that can be composed into graphs. Operations exist for data preprocessing, model fitting, and ensemble learning. Graphs can themselves be treated as 'mlr3' 'Learners' and can therefore be resampled, benchmarked, and tuned.

Maintained by Martin Binder. Last updated 12 days ago.

bagging data-science dataflow-programming ensemble-learning machine-learning mlr3 pipelines preprocessing stacking

2.3 match 141 stars 12.36 score 448 scripts 7 dependents

lrberge

stringmagic:Character String Operations and Interpolation, Magic Edition

Performs complex string operations compactly and efficiently. Supports string interpolation jointly with over 50 string operations. Also enhances regular string functions (like grep() and co). See an introduction at <https://lrberge.github.io/stringmagic/>.

Maintained by Laurent R Berge. Last updated 7 months ago.

interpolation string cpp

2.6 match 15 stars 10.56 score 37 scripts 33 dependents

atlasoflivingaustralia

galah:Biodiversity Data from the GBIF Node Network

The Global Biodiversity Information Facility ('GBIF', <https://www.gbif.org>) sources data from an international network of data providers, known as 'nodes'. Several of these nodes - the "living atlases" (<https://living-atlases.gbif.org>) - maintain their own web services using software originally developed by the Atlas of Living Australia ('ALA', <https://www.ala.org.au>). 'galah' enables the R community to directly access data and resources hosted by 'GBIF' and its partner nodes.

Maintained by Martin Westgate. Last updated 1 month ago.

3.0 match 43 stars 9.17 score 275 scripts 1 dependents

datastorm-open

visNetwork:Network Visualization using 'vis.js' Library

Provides an R interface to the 'vis.js' JavaScript charting library. It allows an interactive visualization of networks.

Maintained by Benoit Thieurmel. Last updated 2 years ago.

1.8 match 549 stars 15.14 score 4.1k scripts 195 dependents

strengejacke

ggeffects:Create Tidy Data Frames of Marginal Effects for 'ggplot' from Model Outputs

Computes marginal effects and adjusted predictions from statistical models and returns the results as tidy data frames. These data frames are ready to use with the 'ggplot2'-package. Effects and predictions can be calculated for many different models. Interaction terms, splines and polynomial terms are also supported. The main functions are ggpredict(), ggemmeans() and ggeffect(). There is a generic plot()-method to plot the results using 'ggplot2'.

Maintained by Daniel Lüdecke. Last updated 8 days ago.

estimated-marginal-means hacktoberfest marginal-effects prediction

1.8 match 588 stars 15.55 score 3.6k scripts 7 dependents

philchalmers

mirt:Multidimensional Item Response Theory

Analysis of discrete response data using unidimensional and multidimensional item analysis models under the Item Response Theory paradigm (Chalmers (2012) <doi:10.18637/jss.v048.i06>). Exploratory and confirmatory item factor analysis models are estimated with quadrature (EM) or stochastic (MHRM) methods. Confirmatory bi-factor and two-tier models are available for modeling item testlets using dimension reduction EM algorithms, while multiple group analyses and mixed effects designs are included for detecting differential item, bundle, and test functioning, and for modeling item and person covariates. Finally, latent class models such as the DINA, DINO, multidimensional latent class, mixture IRT models, and zero-inflated response models are supported, as well as a wide family of probabilistic unfolding models.

Maintained by Phil Chalmers. Last updated 14 days ago.

irt mirt openblas cpp openmp

1.8 match 210 stars 14.98 score 2.5k scripts 40 dependents

dipterix

ravetools:Signal and Image Processing Toolbox for Analyzing Intracranial Electroencephalography Data

Implements fast and memory-efficient Notch-filter, Welch-periodogram, discrete wavelet spectrogram for minutes of high-resolution signals, fast 3D convolution, image registration, and 3D mesh manipulation, providing a fundamental toolbox for intracranial Electroencephalography (iEEG) pipelines. Documentation and examples about the 'RAVE' project are provided at <https://rave.wiki>, and in the paper by John F. Magnotti, Zhengjia Wang, Michael S. Beauchamp (2020) <doi:10.1016/j.neuroimage.2020.117341>; see 'citation("ravetools")' for details.

Maintained by Zhengjia Wang. Last updated 7 days ago.

fftw3 cpp

5.3 match 3 stars 5.13 score 20 scripts 1 dependents

jmbarbone

fuj:Functions and Utilities for Jordan

Provides core functions and utilities for packages and other code developed by Jordan Mark Barbone.

Maintained by Jordan Mark Barbone. Last updated 10 days ago.

6.0 match 2 stars 4.48 score 8 scripts 1 dependents

btskinner

crosswalkr:Rename and Encode Data Frames Using External Crosswalk Files

A pair of functions for renaming and encoding data frames using external crosswalk files. It is especially useful when constructing master data sets from multiple smaller data sets that do not name or encode variables consistently across files. Based on similar commands in 'Stata'.

Maintained by Benjamin Skinner. Last updated 1 year ago.

crosswalk encode labels rename

5.0 match 9 stars 5.26 score 20 scripts

bioc

Gviz:Plotting data and annotation information along genomic coordinates

Genomic data analysis requires integrated visualization of known genomic information and new experimental data. Gviz uses the biomaRt and the rtracklayer packages to perform live annotation queries to Ensembl and UCSC and translates this to e.g. gene/transcript structures in viewports of the grid graphics package. This results in genomic information plotted together with your data.

Maintained by Robert Ivanek. Last updated 5 months ago.

visualization microarray sequencing

2.0 match 79 stars 13.08 score 1.4k scripts 48 dependents

nicchr

fastplyr:Fast Alternatives to 'tidyverse' Functions

A full set of fast data manipulation tools with a tidy front-end and a fast back-end using 'collapse' and 'cheapr'.

Maintained by Nick Christofides. Last updated 27 days ago.

cpp

4.1 match 23 stars 6.32 score 36 scripts 1 dependents

bioc

scde:Single Cell Differential Expression

The scde package implements a set of statistical methods for analyzing single-cell RNA-seq data. scde fits individual error models for single-cell RNA-seq measurements. These models can then be used for assessment of differential expression between groups of cells, as well as other types of analysis. The scde package also contains the pagoda framework which applies pathway and gene set overdispersion analysis to identify and characterize putative cell subpopulations based on transcriptional signatures. The overall approach to the differential expression analysis is detailed in the following publication: "Bayesian approach to single-cell differential expression analysis" (Kharchenko PV, Silberstein L, Scadden DT, Nature Methods, doi: 10.1038/nmeth.2967). The overall approach to subpopulation identification and characterization is detailed in the following pre-print: "Characterizing transcriptional heterogeneity through pathway and gene set overdispersion analysis" (Fan J, Salathia N, Liu R, Kaeser G, Yung Y, Herman J, Kaper F, Fan JB, Zhang K, Chun J, and Kharchenko PV, Nature Methods, doi:10.1038/nmeth.3734).

Maintained by Evan Biederstedt. Last updated 5 months ago.

immunooncology rnaseq statisticalmethod differentialexpression bayesian transcription software analysis bioinformatics heterogenity ngs single-cell transcriptomics openblas cpp openmp

3.5 match 173 stars 7.53 score 141 scripts

bioc

SingleMoleculeFootprinting:Analysis tools for Single Molecule Footprinting (SMF) data

SingleMoleculeFootprinting provides functions to analyze Single Molecule Footprinting (SMF) data. Following the workflow exemplified in its vignette, the user will be able to perform basic data analysis of SMF data with minimal coding effort. Starting from an aligned bam file, we show how to perform quality controls over sequencing libraries, extract methylation information at the single molecule level accounting for the two possible kinds of SMF experiments (single enzyme or double enzyme), classify single molecules based on their patterns of molecular occupancy, and plot SMF information at a given genomic location.

Maintained by Guido Barzaghi. Last updated 1 month ago.

dnamethylation coverage nucleosomepositioning datarepresentation epigenetics methylseq qualitycontrol sequencing

4.0 match 2 stars 6.43 score 27 scripts

tagteam

riskRegression:Risk Regression Models and Prediction Scores for Survival Analysis with Competing Risks

Implementation of the following methods for event history analysis. Risk regression models for survival endpoints also in the presence of competing risks are fitted using binomial regression based on a time sequence of binary event status variables. A formula interface for the Fine-Gray regression model and an interface for the combination of cause-specific Cox regression models. A toolbox for assessing and comparing performance of risk predictions (risk markers and risk prediction models). Prediction performance is measured by the Brier score and the area under the ROC curve for binary, possibly time-dependent outcomes. Inverse probability of censoring weighting and pseudo values are used to deal with right censored data. Lists of risk markers and lists of risk models are assessed simultaneously. Cross-validation repeatedly splits the data, trains the risk prediction models on one part of each split and then summarizes and compares the performance across splits.

Maintained by Thomas Alexander Gerds. Last updated 20 days ago.

openblas cpp

2.0 match 46 stars 13.00 score 736 scripts 35 dependents

edgarsantos-fernandez

SSNbayes:Bayesian Spatio-Temporal Analysis in Stream Networks

Fits Bayesian spatio-temporal models and makes predictions on stream networks using the approach by Santos-Fernandez, Edgar, et al. (2022). "Bayesian spatio-temporal models for stream networks" and Santos-Fernandez, Edgar, et al. (2023). "SSNbayes: An R Package for Bayesian Spatio-Temporal Modelling on Stream Networks". In these models, spatial dependence is captured using stream distance and flow connectivity, while temporal autocorrelation is modelled using vector autoregression methods.

Maintained by Edgar Santos-Fernandez. Last updated 2 months ago.

4.8 match 17 stars 5.41 score 6 scripts

rdoctaskforce

pkgcond:Classed Error and Warning Conditions

This provides utilities for creating classed error and warning conditions based on where the error originated.

Maintained by Andrew Redd. Last updated 4 years ago.

5.0 match 5 stars 5.19 score 41 scripts 5 dependents

bioc

methylSig:MethylSig: Differential Methylation Testing for WGBS and RRBS Data

MethylSig is a package for testing for differentially methylated cytosines (DMCs) or regions (DMRs) in whole-genome bisulfite sequencing (WGBS) or reduced representation bisulfite sequencing (RRBS) experiments. MethylSig uses a beta binomial model to test for significant differences between groups of samples. Several options exist for either site-specific or sliding window tests, and variance estimation.

Maintained by Raymond G. Cavalcante. Last updated 5 months ago.

dnamethylation differentialmethylation epigenetics regression methylseq differential-methylation dna-methylation

3.5 match 17 stars 7.37 score 23 scripts

choi-phd

lordif:Logistic Ordinal Regression Differential Item Functioning using IRT

Performs analysis of Differential Item Functioning (DIF) for dichotomous and polytomous items using an iterative hybrid of ordinal logistic regression and item response theory (IRT) according to Choi, Gibbons, and Crane (2011) <doi:10.18637/jss.v039.i08>.

Maintained by Seung W. Choi. Last updated 2 months ago.

5.0 match 1 stars 5.12 score 35 scripts 1 dependents

skyebend

networkDynamic:Dynamic Extensions for Network Objects

Simple interface routines to facilitate the handling of network objects with complex intertemporal data. This is a part of the "statnet" suite of packages for network analysis.

Maintained by Skye Bender-deMoll. Last updated 4 months ago.

3.9 match 3 stars 6.47 score 132 scripts 11 dependents

polar-fhir

fhircrackr:Handling HL7 FHIR® Resources in R

Useful tools for conveniently downloading FHIR resources in xml format and converting them to R data.frames. The package uses FHIR-search to download bundles from a FHIR server, provides functions to save and read xml-files containing such bundles and allows flattening the bundles to data.frames using XPath expressions. FHIR® is the registered trademark of HL7 and is used with the permission of HL7. Use of the FHIR trademark does not constitute endorsement of this product by HL7.

Maintained by Julia Palm. Last updated 14 days ago.

fhir fhir-client

3.3 match 33 stars 7.63 score 46 scripts

tidyverts

tsibble:Tidy Temporal Data Frames and Tools

Provides a 'tbl_ts' class (the 'tsibble') for temporal data in a data- and model-oriented format. The 'tsibble' provides tools to easily manipulate and analyse temporal data, such as filling in time gaps and aggregating over calendar periods.

Maintained by Earo Wang. Last updated 2 months ago.

1.8 match 538 stars 14.47 score 4.4k scripts 42 dependents

bioc

microbiome:Microbiome Analytics

Utilities for microbiome analysis.

Maintained by Leo Lahti. Last updated 5 months ago.

metagenomics microbiome sequencing systemsbiology hitchip hitchip-atlas human-microbiome microbiology microbiome-analysis phyloseq population-study

2.0 match 293 stars 12.51 score 2.0k scripts 5 dependents

tidyverse

dbplyr:A 'dplyr' Back End for Databases

A 'dplyr' back end for databases that allows you to work with remote database tables as if they are in-memory data frames. Basic features work with any database that has a 'DBI' back end; more advanced features require 'SQL' translation to be provided by the package author.

Maintained by Hadley Wickham. Last updated 4 months ago.

database

1.3 match 481 stars 19.72 score 5.2k scripts 736 dependents

christophergandrud

networkD3:D3 JavaScript Network Graphs from R

Creates 'D3' 'JavaScript' network, tree, dendrogram, and Sankey graphs from 'R'.

Maintained by Christopher Gandrud. Last updated 6 years ago.

d3js networks

1.8 match 654 stars 13.55 score 3.4k scripts 31 dependents

r-hyperspec

hyperSpec:Work with Hyperspectral Data, i.e. Spectra + Meta Information (Spatial, Time, Concentration, ...)

Comfortable ways to work with hyperspectral data sets, i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each of the spectra. The spectra can be data as obtained in XRF, UV/VIS, Fluorescence, AES, NIR, IR, Raman, NMR, MS, etc. More generally, any data that is recorded over a discretized variable, e.g. absorbance = f(wavelength), stored as a vector of absorbance values for discrete wavelengths is suitable.

Maintained by Claudia Beleites. Last updated 10 months ago.

data-wrangling hyperspectral imaging infrared nmr raman spectroscopy uv-vis xrf

3.0 match 16 stars 8.13 score 233 scripts 2 dependents

ijlyttle

bsplus:Adds Functionality to the R Markdown + Shiny Bootstrap Framework

The Bootstrap framework lets you add some JavaScript functionality to your web site by adding attributes to your HTML tags - Bootstrap takes care of the JavaScript <https://getbootstrap.com/docs/3.3/javascript/>. If you are using R Markdown or Shiny, you can use these functions to create collapsible sections, accordion panels, modals, tooltips, popovers, and an accordion sidebar framework (not described at Bootstrap site). Please note this package was designed for Bootstrap 3.3.

Maintained by Ian Lyttle. Last updated 2 years ago.

bootstrap3 rmarkdown shiny

2.8 match 147 stars 8.80 score 295 scripts 15 dependents

lz100

spsComps:'systemPipeShiny' UI and Server Components

The systemPipeShiny (SPS) framework comes with many UI and server components. However, installing the whole framework is heavy and takes some time. If you would like to use UI and server components from SPS in your own Shiny apps, do not hesitate to try this package.

Maintained by Le Zhang. Last updated 1 year ago.

3.7 match 31 stars 6.56 score 65 scripts 4 dependents

hneth

ds4psy:Data Science for Psychologists

All datasets and functions required for the examples and exercises of the book "Data Science for Psychologists" (by Hansjoerg Neth, Konstanz University, 2023), freely available at <https://bookdown.org/hneth/ds4psy/>. The book and course introduce principles and methods of data science to students of psychology and other biological or social sciences. The 'ds4psy' package primarily provides datasets, but also functions for data generation and manipulation (e.g., of text and time data) and graphics that are used in the book and its exercises. All functions included in 'ds4psy' are designed to be explicit and instructive, rather than efficient or elegant.

Maintained by Hansjoerg Neth. Last updated 1 month ago.

data-literacy data-science education exploratory-data-analysis psychology social-sciences visualisation

3.5 match 22 stars 6.79 score 70 scripts

peekxc

simplextree:Provides Tools for Working with General Simplicial Complexes

Provides an interface to a Simplex Tree data structure, which is a data structure aimed at enabling efficient manipulation of simplicial complexes of any dimension. The Simplex Tree data structure was originally introduced by Jean-Daniel Boissonnat and Clément Maria (2014) <doi:10.1007/s00453-014-9887-3>.

Maintained by Matt Piekenbrock. Last updated 1 year ago.

rcpp simplicial-complex topological-data-analysis topology cpp

5.3 match 15 stars 4.56 score 16 scripts 1 dependents

bioc

reconsi:Resampling Collapsed Null Distributions for Simultaneous Inference

Improves simultaneous inference under dependence of tests by estimating a collapsed null distribution through resampling. Accounting for the dependence between tests increases the power while reducing the variability of the false discovery proportion. This dependence is common in genomics applications, e.g. when combining flow cytometry measurements with microbiome sequence counts.

Maintained by Stijn Hawinkel. Last updated 5 months ago.

metagenomics microbiome multiplecomparison flowcytometry

5.1 match 2 stars 4.60 score 2 scripts

datasketch

shinypanels:Shiny Layout with Collapsible Panels

Create 'Shiny Apps' with collapsible vertical panels. This package provides a new visual arrangement for elements on top of 'Shiny'. Use the expand and collapse capabilities to leverage web applications with many elements to focus the user attention on the panel of interest.

Maintained by Juan Pablo Marin Diaz. Last updated 9 months ago.

shiny

3.9 match 80 stars 6.01 score 43 scripts

philchalmers

SimDesign:Structure for Organizing Monte Carlo Simulation Designs

Provides tools to safely and efficiently organize and execute Monte Carlo simulation experiments in R. The package controls the structure and back-end of Monte Carlo simulation experiments by utilizing a generate-analyse-summarise workflow. The workflow safeguards against common simulation coding issues: it automatically re-simulates non-convergent results, prevents inadvertent overwriting of simulation files, catches error and warning messages during execution, implicitly supports parallel processing with high-quality random number generation, and provides tools for managing high-performance computing (HPC) array jobs submitted to schedulers such as SLURM. For a pedagogical introduction to the package see Sigal and Chalmers (2016) <doi:10.1080/10691898.2016.1246953>. For a more in-depth overview of the package and its design philosophy see Chalmers and Adkins (2020) <doi:10.20982/tqmp.16.4.p248>.

Maintained by Phil Chalmers. Last updated 2 days ago.

monte-carlo-simulation simulation simulation-framework

1.8 match 62 stars 13.38 score 253 scripts 46 dependents

mlr-org

mlr3misc:Helper Functions for 'mlr3'

Frequently used helper functions and assertions used in 'mlr3' and its companion packages. Comes with helper functions for functional programming, for printing, to work with 'data.table', as well as some generally useful 'R6' classes. This package also supersedes the package 'BBmisc'.

Maintained by Marc Becker. Last updated 4 months ago.

machine-learning miscellaneous mlr3

2.3 match 12 stars 10.28 score 302 scripts 42 dependents

wrathematics

kazaam:Tools for Tall Distributed Matrices

Many data science problems reduce to operations on very tall, skinny matrices. However, sometimes these matrices can be so tall that they are difficult to work with, or do not even fit into main memory. One strategy to deal with such objects is to distribute their rows across several processors. To this end, we offer an 'S4' class for tall, skinny, distributed matrices, called the 'shaq'. We also provide many useful numerical methods and statistics operations for operating on these distributed objects. The naming is a bit "tongue-in-cheek", with the class a play on the fact that 'Shaquille' 'ONeal' ('Shaq') is very tall, and he starred in the film 'Kazaam'.

Maintained by Drew Schmidt. Last updated 8 years ago.

openblas

6.0 match 3.82 score 133 scripts

mbq

vistla:Detecting Influence Paths with Information Theory

Traces information spread through interactions between features, utilising information theory measures and a higher-order generalisation of the concept of widest paths in graphs. In particular, 'vistla' can be used to better understand the results of high-throughput biomedical experiments, by organising the effects of the investigated intervention in a tree-like hierarchy from direct to indirect ones, following the plausible information relay circuits. Due to its higher-order nature, 'vistla' can handle multi-modality and assign multiple roles to a single feature.

Maintained by Miron B. Kursa. Last updated 28 days ago.

openmp

4.8 match 4.78 score 3 scripts

rmgpanw

gtexr:Query the GTEx Portal API

A convenient R interface to the Genotype-Tissue Expression (GTEx) Portal API. For more information on the API, see <https://gtexportal.org/api/v2/redoc>.

Maintained by Alasdair Warwick. Last updated 6 months ago.

api-wrapper bioinformatics eqtl gtex sqtl

3.5 match 5 stars 6.41 score 5 scripts

truecluster

ff:Memory-Efficient Storage of Large Data on Disk and Fast Access Functions

The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) in main memory - the effective virtual memory consumption per ff object. ff supports R's standard atomic data types 'double', 'logical', 'raw' and 'integer' and non-standard atomic types boolean (1 bit), quad (2 bit unsigned), nibble (4 bit unsigned), byte (1 byte signed with NAs), ubyte (1 byte unsigned), short (2 byte signed with NAs), ushort (2 byte unsigned), single (4 byte float with NAs). For example 'quad' allows efficient storage of genomic data as an 'A','T','G','C' factor. The unsigned types support 'circular' arithmetic. There is also support for close-to-atomic types 'factor', 'ordered', 'POSIXct', 'Date' and custom close-to-atomic types. ff not only has native C-support for vectors, matrices and arrays with flexible dimorder (major column-order, major row-order and generalizations for arrays). There is also a ffdf class not unlike data.frames and import/export filters for csv files. ff objects store raw data in binary flat files in native encoding, and complement this with metadata stored in R as physical and virtual attributes. ff objects have well-defined hybrid copying semantics, which gives rise to certain performance improvements through virtualization. ff objects can be stored and reopened across R sessions. ff files can be shared by multiple ff R objects (using different data en/de-coding schemes) in the same process or from multiple R processes to exploit parallelism. A wide choice of finalizer options allows to work with 'permanent' files as well as creating/removing 'temporary' ff files completely transparent to the user. On certain OS/Filesystem combinations, creating the ff files works without notable delay thanks to using sparse file allocation. 
Several access optimization techniques such as Hybrid Index Preprocessing and Virtualization are implemented to achieve good performance even with large datasets, for example virtual matrix transpose without touching a single byte on disk. Further, to reduce disk I/O, 'logicals' and non-standard data types are stored natively and compactly in binary flat files, i.e. logicals take up exactly 2 bits to represent TRUE, FALSE and NA. Beyond basic access functions, the ff package also provides compatibility functions that facilitate writing code for ff and ram objects and support for batch processing on ff objects (e.g. as.ram, as.ff, ffapply). ff interfaces closely with functionality from package 'bit': chunked looping, fast bit operations and coercions between different objects that can store subscript information ('bit', 'bitwhich', ff 'boolean', ri range index, hi hybrid index). This allows working interactively with selections of large datasets and quickly modifying selection criteria. Further high-performance enhancements can be made available upon request.

Maintained by Jens Oehlschlägel. Last updated 2 months ago.

cpp

1.9 match 27 stars 12.01 score 764 scripts 71 dependents

cran

shiny.blueprint:Palantir's 'Blueprint' for 'Shiny' Apps

Easily use 'Blueprint', the popular 'React' library from Palantir, in your 'Shiny' app. 'Blueprint' provides a rich set of UI components for creating visually appealing applications and is optimized for building complex, data-dense web interfaces. This package provides most components from the underlying library, as well as special wrappers for some components to make it easy to use them in 'R' without writing 'JavaScript' code.

Maintained by Jakub Sobolewski. Last updated 10 months ago.

6.0 match 3.70 score

dnychka

fields:Tools for Spatial Data

For curve, surface and function fitting with an emphasis on splines, spatial data, geostatistics, and spatial statistics. The major methods include cubic and thin plate splines, Kriging, and compactly supported covariance functions for large data sets. The splines and Kriging methods are supported by functions that can determine the smoothing parameter (nugget and sill variance) and other covariance function parameters by cross validation and also by restricted maximum likelihood. For Kriging there is an easy to use function that also estimates the correlation scale (range parameter). A major feature is that any covariance function implemented in R and following a simple format can be used for spatial prediction. There are also many useful functions for plotting and working with spatial data as images. This package also contains an implementation of sparse matrix methods for large spatial data sets and currently requires the sparse matrix (spam) package. Use help(fields) to get started and for an overview. The fields source code is deliberately commented and provides useful explanations of numerical details as a companion to the manual pages. The commented source code can be viewed by expanding the source code version and looking in the R subdirectory. The reference for fields can be generated by the citation function in R and has DOI <doi:10.5065/D6W957CT>. Development of this package was supported in part by the National Science Foundation Grant 1417857, the National Center for Atmospheric Research, and Colorado School of Mines. See the Fields URL for a vignette on using this package and some background on spatial statistics.

Maintained by Douglas Nychka. Last updated 9 months ago.

fortran

1.8 match 15 stars 12.60 score 7.7k scripts 295 dependents

bioc

minfi:Analyze Illumina Infinium DNA methylation arrays

Tools to analyze & visualize Illumina Infinium methylation arrays.

Maintained by Kasper Daniel Hansen. Last updated 4 months ago.

immunooncology dnamethylation differentialmethylation epigenetics microarray methylationarray multichannel twochannel dataimport normalization preprocessing qualitycontrol

1.7 match 60 stars 12.83 score 996 scripts 26 dependents

bioc

maftools:Summarize, Analyze and Visualize MAF Files

Analyze and visualize Mutation Annotation Format (MAF) files from large scale sequencing studies. This package provides various functions to perform the most commonly used analyses in cancer genomics and to create feature-rich, customizable visualizations with minimal effort.

Maintained by Anand Mayakonda. Last updated 5 months ago.

datarepresentation dnaseq visualization drivermutation variantannotation featureextraction classification somaticmutation sequencing functionalgenomics survival bioinformatics cancer-genome-atlas cancer-genomics genomics maf-files tcga curl bzip2 xz-utils zlib

1.5 match 459 stars 14.63 score 948 scripts 18 dependents

ouhscbbmc

REDCapR:Interaction Between R and REDCap

Encapsulates functions to streamline calls from R to the REDCap API. REDCap (Research Electronic Data CAPture) is a web application for building and managing online surveys and databases developed at Vanderbilt University. The Application Programming Interface (API) offers an avenue to access and modify data programmatically, improving the capacity for literate and reproducible programming.

Maintained by Will Beasley. Last updated 2 months ago.

redcap redcap-api

1.8 match 118 stars 12.36 score 438 scripts 6 dependents

insightsengineering

tern:Create Common TLGs Used in Clinical Trials

Tables, Listings, and Graphs (TLG) library for common outputs used in clinical trials.

Maintained by Joe Zhu. Last updated 2 months ago.

clinical-trials graphs listings nest outputs tables

1.7 match 83 stars 12.50 score 186 scripts 9 dependents

bayesball

LearnBayes:Learning Bayesian Inference

Contains functions for summarizing basic one and two parameter posterior distributions and predictive distributions. It contains MCMC algorithms for summarizing posterior distributions defined by the user. It also contains functions for regression models, hierarchical models, Bayesian tests, and illustrations of Gibbs sampling.

Maintained by Jim Albert. Last updated 7 years ago.

1.9 match 38 stars 11.34 score 690 scripts 31 dependents

bioc

hopach:Hierarchical Ordered Partitioning and Collapsing Hybrid (HOPACH)

The HOPACH clustering algorithm builds a hierarchical tree of clusters by recursively partitioning a data set, while ordering and possibly collapsing clusters at each level. The algorithm uses the Mean/Median Split Silhouette (MSS) criterion to identify the level of the tree with maximally homogeneous clusters. It also runs the tree down to produce a final ordered list of the elements. The non-parametric bootstrap allows one to estimate the probability that each element belongs to each cluster (fuzzy clustering).

Maintained by Katherine S. Pollard. Last updated 5 months ago.

clustering

3.4 match 6.05 score 54 scripts 5 dependents

bnosac

udpipe:Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

This natural language processing toolkit provides language-agnostic 'tokenization', 'parts of speech tagging', 'lemmatization' and 'dependency parsing' of raw text. Next to text parsing, the package also allows you to train annotation models based on data of 'treebanks' in 'CoNLL-U' format as provided at <https://universaldependencies.org/format.html>. The techniques are explained in detail in the paper: 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at <doi:10.18653/v1/K17-3009>. The toolkit also contains functionalities for commonly used data manipulations on texts which are enriched with the output of the parser. These include functionalities and algorithms for collocations, token co-occurrence, document term matrix handling, term frequency inverse document frequency calculations, information retrieval metrics (Okapi BM25), handling of multi-word expressions, keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns), sentiment scoring and semantic similarity analysis.

Maintained by Jan Wijffels. Last updated 2 years ago.

conll dependency-parser lemmatization natural-language-processing nlp pos-tagging r-pkg rcpp text-mining tokenizer udpipe cpp

1.8 match 215 stars 11.83 score 1.2k scripts 9 dependents

tdhock

nc:Named Capture to Data Tables

User-friendly functions for extracting a data table (row for each match, column for each group) from non-tabular text data using regular expressions, and for melting columns that match a regular expression. Patterns are defined using a readable syntax that makes it easy to build complex patterns in terms of simpler, re-usable sub-patterns. Named R arguments are translated to column names in the output; capture groups without names are used internally in order to provide a standard interface to three regular expression 'C' libraries ('PCRE', 'RE2', 'ICU'). Output can also include numeric columns via user-specified type conversion functions.

Maintained by Toby Hocking. Last updated 2 months ago.

3.0 match 16 stars 6.85 score 46 scripts

markfairbanks

tidytable:Tidy Interface to 'data.table'

A tidy interface to 'data.table', giving users the speed of 'data.table' while using tidyverse-like syntax.

Maintained by Mark Fairbanks. Last updated 2 months ago.

1.8 match 458 stars 11.41 score 732 scripts 10 dependents

ncss-tech

aqp:Algorithms for Quantitative Pedology

The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information, freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offers a convenient platform for bridging the gap between pedometric theory and practice.

Maintained by Dylan Beaudette. Last updated 1 month ago.

digital-soil-mapping ncss-tech nrcs pedology pedometrics soil soil-survey usda

1.7 match 55 stars 11.90 score 1.2k scripts 2 dependents

modeloriented

shapviz:SHAP Visualizations

Visualizations for SHAP (SHapley Additive exPlanations), such as waterfall plots, force plots, various types of importance plots, dependence plots, and interaction plots. These plots act on a 'shapviz' object created from a matrix of SHAP values and a corresponding feature dataset. Wrappers for the R packages 'xgboost', 'lightgbm', 'fastshap', 'shapr', 'h2o', 'treeshap', 'DALEX', and 'kernelshap' are added for convenience. By separating visualization and computation, it is possible to display factor variables in graphs, even if the SHAP values are calculated by a model that requires numerical features. The plots are inspired by those provided by the 'shap' package in Python, but there is no dependency on it.

Maintained by Michael Mayer. Last updated 2 months ago.

explainable-ai machine-learning shap shapley-value visualization xai

2.0 match 89 stars 9.95 score 250 scripts

johanngb

ruv:Detect and Remove Unwanted Variation using Negative Controls

Implements the 'RUV' (Remove Unwanted Variation) algorithms. These algorithms attempt to adjust for systematic errors of unknown origin in high-dimensional data. The algorithms were originally developed for use with genomic data, especially microarray data, but may be useful with other types of high-dimensional data as well. These algorithms were proposed in Gagnon-Bartsch and Speed (2012) <doi:10.1093/biostatistics/kxr034>, Gagnon-Bartsch, Jacob and Speed (2013), and Molania et al. (2019) <doi:10.1093/nar/gkz433>. The algorithms require the user to specify a set of negative control variables, as described in the references. The algorithms included in this package are 'RUV-2', 'RUV-4', 'RUV-inv', 'RUV-rinv', 'RUV-I', and 'RUV-III', along with various supporting algorithms.

Maintained by Johann Gagnon-Bartsch. Last updated 6 years ago.

4.5 match 2 stars 4.36 score 94 scripts 7 dependents

bioc

CoreGx:Classes and Functions to Serve as the Basis for Other 'Gx' Packages

A collection of functions and classes which serve as the foundation for our lab's suite of R packages, such as 'PharmacoGx' and 'RadioGx'. This package was created to abstract shared functionality from other lab package releases to increase ease of maintainability and reduce code repetition in current and future 'Gx' suite programs. Major features include a 'CoreSet' class, from which 'RadioSet' and 'PharmacoSet' are derived, along with get and set methods for each respective slot. Additional functions related to fitting and plotting dose response curves, quantifying statistical correlation and calculating area under the curve (AUC) or survival fraction (SF) are included. For more details please see the included documentation, as well as: Smirnov, P., Safikhani, Z., El-Hachem, N., Wang, D., She, A., Olsen, C., Freeman, M., Selby, H., Gendoo, D., Grossman, P., Beck, A., Aerts, H., Lupien, M., Goldenberg, A. (2015) <doi:10.1093/bioinformatics/btv723>. Manem, V., Labie, M., Smirnov, P., Kofia, V., Freeman, M., Koritzinsky, M., Abazeed, M., Haibe-Kains, B., Bratman, S. (2018) <doi:10.1101/449793>.

Maintained by Benjamin Haibe-Kains. Last updated 5 months ago.

software pharmacogenomics classification survival

3.0 match 6.53 score 63 scripts 6 dependents

gforge

Gmisc:Descriptive Statistics, Transition Plots, and More

Tools for making the descriptive "Table 1" used in medical articles, a transition plot for showing changes between categories (also known as a Sankey diagram), flow charts by extending the grid package, a method for variable selection based on the SVD, Bézier lines with arrows complementing the ones in the 'grid' package, and more.

Maintained by Max Gordon. Last updated 2 years ago.

cpp

1.9 match 50 stars 10.40 score 233 scripts 2 dependents

divdyn

divDyn:Diversity Dynamics using Fossil Sampling Data

Functions to describe sampling and diversity dynamics of fossil occurrence datasets (e.g. from the Paleobiology Database). The package includes methods to calculate range- and occurrence-based metrics of taxonomic richness, extinction and origination rates, along with traditional sampling measures. A powerful subsampling tool is also included that implements frequently used sampling standardization methods in a multiple-bin framework. The plotting of time series and the occurrence data can be simplified by the functions incorporated in the package, as well as other calculations, such as environmental affinities and extinction selectivity testing. Details can be found in: Kocsis, A.T.; Reddin, C.J.; Alroy, J. and Kiessling, W. (2019) <doi:10.1101/423780>.

Maintained by Adam T. Kocsis. Last updated 4 months ago.

diversity extinction fossil-data occurrences origination paleobiology cpp

3.0 match 11 stars 6.48 score 137 scripts

funwithr

MonteCarlo:Automatic Parallelized Monte Carlo Simulations

Simplifies Monte Carlo simulation studies by automatically setting up loops to run over parameter grids and parallelising the Monte Carlo repetitions. It also generates LaTeX tables.

Maintained by Christian Hendrik Leschinski. Last updated 6 years ago.

3.2 match 34 stars 6.09 score 36 scripts

saviviro

sstvars:Toolkit for Reduced Form and Structural Smooth Transition Vector Autoregressive Models

Penalized and non-penalized maximum likelihood estimation of smooth transition vector autoregressive models with various types of transition weight functions, conditional distributions, and identification methods. Constrained estimation with various types of constraints is available. Residual based model diagnostics, forecasting, simulations, and calculation of impulse response functions, generalized impulse response functions, and generalized forecast error variance decompositions. See Heather Anderson, Farshid Vahid (1998) <doi:10.1016/S0304-4076(97)00076-6>, Helmut Lütkepohl, Aleksei Netšunajev (2017) <doi:10.1016/j.jedc.2017.09.001>, Markku Lanne, Savi Virolainen (2025) <doi:10.48550/arXiv.2403.14216>, Savi Virolainen (2025) <doi:10.48550/arXiv.2404.19707>.

Maintained by Savi Virolainen. Last updated 19 days ago.

openblas cpp openmp

3.0 match 4 stars 6.36 score 41 scripts

ropensci

rdhs:API Client and Dataset Management for the Demographic and Health Survey (DHS) Data

Provides a client for (1) querying the DHS API for survey indicators and metadata (<https://api.dhsprogram.com/#/index.html>), (2) identifying surveys and datasets for analysis, (3) downloading survey datasets from the DHS website, (4) loading datasets and associated metadata into R, and (5) extracting variables and combining datasets for pooled analysis.

Maintained by OJ Watson. Last updated 20 days ago.

dataset dhs dhs-api extract peer-reviewed survey-data

1.9 match 35 stars 10.07 score 286 scripts 3 dependents

jknowles

merTools:Tools for Analyzing Mixed Effect Regression Models

Provides methods for extracting results from mixed-effect model objects fit with the 'lme4' package. Allows construction of prediction intervals efficiently from large scale linear and generalized linear mixed-effects models. This method draws from the simulation framework used in the Gelman and Hill (2007) textbook: Data Analysis Using Regression and Multilevel/Hierarchical Models.

Maintained by Jared E. Knowles. Last updated 1 year ago.

1.8 match 105 stars 10.49 score 768 scripts

ms609

TreeDist:Calculate and Map Distances Between Phylogenetic Trees

Implements measures of tree similarity, including information-based generalized Robinson-Foulds distances (Phylogenetic Information Distance, Clustering Information Distance, Matching Split Information Distance; Smith 2020) <doi:10.1093/bioinformatics/btaa614>; Jaccard-Robinson-Foulds distances (Bocker et al. 2013) <doi:10.1007/978-3-642-40453-5_13>, including the Nye et al. (2006) metric <doi:10.1093/bioinformatics/bti720>; the Matching Split Distance (Bogdanowicz & Giaro 2012) <doi:10.1109/TCBB.2011.48>; Maximum Agreement Subtree distances; the Kendall-Colijn (2016) distance <doi:10.1093/molbev/msw124>, and the Nearest Neighbour Interchange (NNI) distance, approximated per Li et al. (1996) <doi:10.1007/3-540-61332-3_168>. Includes tools for visualizing mappings of tree space (Smith 2022) <doi:10.1093/sysbio/syab100>, for identifying islands of trees (Silva and Wilkinson 2021) <doi:10.1093/sysbio/syab015>, for calculating the median of sets of trees, and for computing the information content of trees and splits.

Maintained by Martin R. Smith. Last updated 1 month ago.

phylogenetics tree-distance phylogenetic-trees tree-distances trees cpp

1.8 match 32 stars 10.32 score 97 scripts 5 dependents

louissirugue

directotree:Creates an Interactive Tree Structure of a Directory

Represents the content of a directory as an interactive collapsible tree. Offers the possibility to assign a text (e.g., a 'Readme.txt') to each folder (represented as a clickable node), so that when the user hovers the pointer over a node, the corresponding text is displayed as a tooltip.

Maintained by Louis Sirugue. Last updated 6 years ago.

collapsible-tree data-visualization directotree tree

8.0 match 2 stars 2.30 score 1 script

yonicd

d3Tree:Create Interactive Collapsible Trees with the JavaScript 'D3' Library

Create and customize interactive collapsible 'D3' trees using the 'D3' JavaScript library and the 'htmlwidgets' package. These trees can be used directly from the R console, from 'RStudio', in Shiny apps and R Markdown documents. When in Shiny the tree layout is observed by the server and can be used as a reactive filter of structured data.

Maintained by Jonathan Sidi. Last updated 1 year ago.

d3js hierarchy htmlwidgets query shiny

3.4 match 87 stars 5.46 score 33 scripts

dcomtois

summarytools:Tools to Quickly and Neatly Summarize Data

Data frame summaries, cross-tabulations, weight-enabled frequency tables and common descriptive (univariate) statistics in concise tables available in a variety of formats (plain ASCII, Markdown and HTML). A good point-of-entry for exploring data, both for experienced and new R users.

Maintained by Dominic Comtois. Last updated 4 days ago.

descriptive-statistics frequency-table html-report markdown pander pandoc pandoc-markdown rmarkdown rstudio

1.3 match 526 stars 14.52 score 2.9k scripts 6 dependents

colearendt

tidyjson:Tidy Complex 'JSON'

Turn complex 'JSON' data into tidy data frames.

Maintained by Cole Arendt. Last updated 2 years ago.

1.7 match 192 stars 10.64 score 522 scripts 7 dependents

muschellij2

rscopus:Scopus Database 'API' Interface

Uses Elsevier 'Scopus' API <https://dev.elsevier.com/sc_apis.html> to download information about authors and their citations.

Maintained by John Muschelli. Last updated 1 year ago.

bibliometrics scopus scopus-api

1.9 match 77 stars 9.33 score 124 scripts 3 dependents

bioc

plotgardener:Coordinate-Based Genomic Visualization Package for R

Coordinate-based genomic visualization package for R. It grants users the ability to programmatically produce complex, multi-paneled figures. Tailored for genomics, plotgardener allows users to visualize large complex genomic datasets and provides exquisite control over how plots are placed and arranged on a page.

Maintained by Nicole Kramer. Last updated 5 months ago.

visualization genomeannotation functionalgenomics genomeassembly hic cpp

1.7 match 309 stars 10.17 score 167 scripts 3 dependents

stevenmmortimer

salesforcer:An Implementation of 'Salesforce' APIs Using Tidy Principles

Functions connecting to the 'Salesforce' Platform APIs (REST, SOAP, Bulk 1.0, Bulk 2.0, Metadata, Reports and Dashboards) <https://trailhead.salesforce.com/content/learn/modules/api_basics/api_basics_overview>. "API" is an acronym for "application programming interface". Almost all calls from these APIs are supported as they use CSV, XML or JSON data that can be parsed into R data structures. For more details, documentation, and examples, please see the 'Salesforce' API documentation and this package's website <https://stevenmmortimer.github.io/salesforcer/>.

Maintained by Steven M. Mortimer. Last updated 4 months ago.

api-wrappers r-language r-programming salesforce salesforce-apis

1.9 match 82 stars 9.27 score 191 scripts

poissonconsulting

mcmcr:Manipulate MCMC Samples

Functions and classes to store, manipulate and summarise Markov Chain Monte Carlo (MCMC) samples. For more information see Brooks et al. (2011) <isbn:978-1-4200-7941-8>.

Maintained by Joe Thorley. Last updated 2 months ago.

coda mcmc

2.3 match 17 stars 7.66 score 111 scripts 10 dependents

bioc

derfinder:Annotation-agnostic differential expression analysis of RNA-seq data at base-pair resolution via the DER Finder approach

This package provides functions for annotation-agnostic differential expression analysis of RNA-seq data. Two implementations of the DER Finder approach are included in this package: (1) single base-level F-statistics and (2) DER identification at the expressed regions-level. The DER Finder approach can also be used to identify differentially bound ChIP-seq peaks.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

differentialexpression sequencing rnaseq chipseq differentialpeakcalling software immunooncology coverage annotation-agnostic bioconductor derfinder

1.7 match 42 stars 10.03 score 78 scripts 6 dependents

desiquintans

librarian:Install, Update, Load Packages from CRAN, 'GitHub', and 'Bioconductor' in One Step

Automatically install, update, and load 'CRAN', 'GitHub', and 'Bioconductor' packages in a single function call. By accepting bare unquoted names for packages, it's easy to add or remove packages from the list.

Maintained by Desi Quintans. Last updated 3 months ago.

2.3 match 54 stars 7.63 score 410 scripts 1 dependent

eth-mds

ricu:Intensive Care Unit Data with R

Focused on (but not exclusive to) data sets hosted on PhysioNet (<https://physionet.org>), 'ricu' provides utilities for download, setup and access of intensive care unit (ICU) data sets. In addition to functions for running arbitrary queries against available data sets, a system for defining clinical concepts and encoding their representations in tabular ICU data is presented.

Maintained by Nicolas Bennett. Last updated 10 months ago.

3.0 match 39 stars 5.65 score 77 scripts

bioc

OmicsMLRepoR:Search harmonized metadata created under the OmicsMLRepo project

This package provides functions to browse the harmonized metadata for large omics databases. This package also supports data navigation if the metadata incorporates ontology.

Maintained by Sehyun Oh. Last updated 1 month ago.

software infrastructure datarepresentation

3.1 match 5.40 score 14 scripts

taddylab

distrom:Distributed Multinomial Regression

Fast distributed/parallel estimation for multinomial logistic regression via Poisson factorization and the 'gamlr' package. For details see: Taddy (2015, AoAS), Distributed Multinomial Regression, <arXiv:1311.6139>.

Maintained by Nelson Rayl. Last updated 7 months ago.

3.0 match 19 stars 5.58 score 44 scripts 3 dependents

usaid-oha-si

tameDP:Import targets and PLHIV data from COP Target Setting Tool (formerly Data Pack)

Import PSNUxIM targets and PLHIV data from COP Data Pack. The purpose is to make the data tidy and more usable than their current structure in the Excel data packs.

Maintained by Aaron Chafetz. Last updated 1 year ago.

3.4 match 1 star 4.92 score 46 scripts

rte-antares-rpackage

manipulateWidget:Add Even More Interactivity to Interactive Charts

Like package 'manipulate' does for static graphics, this package helps to easily add controls like sliders, pickers, checkboxes, etc. that can be used to modify the input data or the parameters of an interactive chart created with package 'htmlwidgets'.

Maintained by Veronique Bachelier. Last updated 3 years ago.

graphical htmlwidgets interactive-charts manipulate rte shiny tyndp

1.9 match 129 stars 8.82 score 143 scripts 1 dependent

wadpac

GGIR:Raw Accelerometer Data Analysis

A tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from 'GENEActiv' <https://activinsights.com/>, binary (.gt3x) and .csv-export data from 'Actigraph' <https://theactigraph.com> devices, and binary (.cwa) and .csv-export data from 'Axivity' <https://axivity.com>. These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data files from any other sensor brand, provided that the data is stored in csv format. The package also allows for external function embedding.

Maintained by Vincent T van Hees. Last updated 5 days ago.

accelerometer activity-recognition circadian-rhythm movement-sensor sleep

1.3 match 109 stars 13.20 score 342 scripts 3 dependents

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 year ago.

2.0 match 3 stars 8.20 score 7.8k scripts 11 dependents

ohdsi

CohortConstructor:Build and Manipulate Study Cohorts Using a Common Data Model

Create and manipulate study cohorts in data mapped to the Observational Medical Outcomes Partnership Common Data Model.

Maintained by Edward Burn. Last updated 6 days ago.

1.7 match 2 stars 9.71 score 207 scripts 2 dependents

cdriveraus

ctsem:Continuous Time Structural Equation Modelling

Hierarchical continuous (and discrete) time state space modelling, for linear and nonlinear systems measured by continuous variables, with limited support for binary data. The subject specific dynamic system is modelled as a stochastic differential equation (SDE) or difference equation; measurement models are typically multivariate normal factor models. Linear mixed effects SDEs estimated via maximum likelihood and optimization are the default. Nonlinearities (state dependent parameters) and random effects on all parameters are possible, using either max likelihood / max a posteriori optimization (with optional importance sampling) or Stan's Hamiltonian Monte Carlo sampling. See <https://github.com/cdriveraus/ctsem/raw/master/vignettes/hierarchicalmanual.pdf> for details. Priors may be used. For the conceptual overview of the hierarchical Bayesian linear SDE approach, see <https://www.researchgate.net/publication/324093594_Hierarchical_Bayesian_Continuous_Time_Dynamic_Modeling>. Exogenous inputs may also be included; for an overview of such possibilities see <https://www.researchgate.net/publication/328221807_Understanding_the_Time_Course_of_Interventions_with_Continuous_Time_Dynamic_Models>. Stan based functions are not available on 32 bit Windows systems at present. <https://cdriver.netlify.app/> contains some tutorial blog posts.

Maintained by Charles Driver. Last updated 14 days ago.

stochastic-differential-equations time-series cpp

1.7 match 42 stars 9.58 score 366 scripts 1 dependent

bioc

spikeLI:Affymetrix Spike-in Langmuir Isotherm Data Analysis Tool

SpikeLI is a package that performs the analysis of the Affymetrix spike-in data using the Langmuir Isotherm. The aim of this package is to show the advantages of a physical-chemistry based analysis of the Affymetrix microarray data compared to the traditional methods. The spike-in (or Latin square) data for the HGU95 and HGU133 chipsets have been downloaded from the Affymetrix web site. The model used in the spikeLI package is described in detail in E. Carlon and T. Heim, Physica A 362, 433 (2006).

Maintained by Enrico Carlon. Last updated 5 months ago.

microarray qualitycontrol

4.8 match 3.30 score

kbroman

broman:Karl Broman's R Code

Miscellaneous R functions, including functions related to graphics (mostly for base graphics), permutation tests, running mean/median, and general utilities.

Maintained by Karl W Broman. Last updated 10 months ago.

1.8 match 183 stars 8.80 score 648 scripts 1 dependent

bioc

CellBench:Construct Benchmarks for Single Cell Analysis Methods

This package contains infrastructure for benchmarking analysis methods and access to single cell mixture benchmarking data. It provides a framework for organising analysis methods and testing combinations of methods in a pipeline without explicitly laying out each combination. It also provides utilities for sampling and filtering SingleCellExperiment objects, constructing lists of functions with varying parameters, and multithreaded evaluation of analysis methods.

Maintained by Shian Su. Last updated 5 months ago.

software infrastructure singlecell benchmark bioinformatics

1.8 match 31 stars 8.73 score 98 scripts

insitro

AllelicSeries:Allelic Series Test

Implementation of gene-level rare variant association tests targeting allelic series: genes where increasingly deleterious mutations have increasingly large phenotypic effects. The COding-variant Allelic Series Test (COAST) operates on the benign missense variants (BMVs), deleterious missense variants (DMVs), and protein truncating variants (PTVs) within a gene. COAST uses a set of adjustable weights that tailor the test towards rejecting the null hypothesis for genes where the average magnitude of effect increases monotonically from BMVs to DMVs to PTVs. See McCaw ZR, O’Dushlaine C, Somineni H, Bereket M, Klein C, Karaletsos T, Casale FP, Koller D, Soare TW. (2023) "An allelic series rare variant association test for candidate gene discovery" <doi:10.1016/j.ajhg.2023.07.001>.

Maintained by Zachary McCaw. Last updated 1 months ago.

openblas cpp openmp

2.3 match 13 stars 6.97 score 8 scripts

bioc

SRAdb:A compilation of metadata from NCBI SRA and tools

The Sequence Read Archive (SRA) is the largest public repository of sequencing data from the next generation of sequencing platforms including Roche 454 GS System, Illumina Genome Analyzer, Applied Biosystems SOLiD System, Helicos Heliscope, and others. However, finding data of interest can be challenging using current tools. SRAdb is an attempt to make access to the metadata associated with submission, study, sample, experiment and run much more feasible. This is accomplished by parsing all the NCBI SRA metadata into a SQLite database that can be stored and queried locally. Full-text search in the package makes querying metadata very flexible and powerful. fastq and sra files can be downloaded for doing alignment locally. Besides the ftp protocol, SRAdb has functions supporting the fasp protocol (ascp from Aspera Connect) for faster downloading of large data files over long distances. The SQLite database is updated regularly as new data is added to SRA and can be downloaded at will for the most up-to-date metadata.

Maintained by Jack Zhu. Last updated 3 months ago.

infrastructure sequencing dataimport

2.0 match 2 stars 7.81 score 200 scripts

bupaverse

bupaR:Business Process Analysis in R

Comprehensive Business Process Analysis toolkit. Creates an S3 class for event log objects, and related handler functions. Imports related packages for filtering event data, computation of descriptive statistics, handling of 'Petri Net' objects and visualization of process maps. See also packages 'edeaR', 'processmapR', 'eventdataR' and 'processmonitR'.

Maintained by Gert Janssenswillen. Last updated 2 years ago.

1.7 match 55 stars 9.07 score 389 scripts 11 dependents

mqbssppe

fabMix:Overfitting Bayesian Mixtures of Factor Analyzers with Parsimonious Covariance and Unknown Number of Components

Model-based clustering of multivariate continuous data using Bayesian mixtures of factor analyzers (Papastamoulis (2019) <DOI:10.1007/s11222-019-09891-z> (2018) <DOI:10.1016/j.csda.2018.03.007>). The number of clusters is estimated using overfitting mixture models (Rousseau and Mengersen (2011) <DOI:10.1111/j.1467-9868.2011.00781.x>): suitable prior assumptions ensure that asymptotically the extra components will have zero posterior weight; therefore, the inference is based on the "alive" components. A Gibbs sampler is implemented in order to (approximately) sample from the posterior distribution of the overfitting mixture. A prior parallel tempering scheme is also available, which allows running multiple parallel chains with different prior distributions on the mixture weights. These chains run in parallel and can swap states using a Metropolis-Hastings move. Eight different parameterizations give rise to parsimonious representations of the covariance per cluster (following McNicholas and Murphy (2008) <DOI:10.1007/s11222-008-9056-0>). The model parameterization and number of factors are selected according to the Bayesian Information Criterion. Identifiability issues related to label switching are dealt with by post-processing the simulated output with the Equivalence Classes Representatives algorithm (Papastamoulis and Iliopoulos (2010) <DOI:10.1198/jcgs.2010.09008>, Papastamoulis (2016) <DOI:10.18637/jss.v069.c01>).

Maintained by Panagiotis Papastamoulis. Last updated 1 years ago.

openblas cpp openmp

7.4 match 2.09 score 41 scripts 1 dependents

donadelnal

RQdeltaCT:Relative Quantification of Gene Expression using Delta Ct Methods

The commonly used methods for relative quantification of gene expression levels obtained in real-time PCR (Polymerase Chain Reaction) experiments are the delta Ct methods, encompassing 2^-dCt and 2^-ddCt methods, originally proposed by Kenneth J. Livak and Thomas D. Schmittgen (2001) <doi:10.1006/meth.2001.1262>. The main idea is to normalise gene expression values using an endogenous control gene, present gene expression levels in linear form by using the 2^-(value) transformation, and calculate differences in gene expression levels between groups of samples (or technical replicates of a single sample). The 'RQdeltaCT' package offers functions that cover both methods for comparison of either independent groups of samples or groups with paired samples, together with importing expression datasets, performing multi-step quality control of data, enabling numerous data visualisations, enriching the standard workflow with additional useful analyses (correlation analysis, Receiver Operating Characteristic analysis, logistic regression), and conveniently exporting the obtained results in table and image formats. The package has been designed to be friendly to non-experts in R programming.

Maintained by Daniel Zalewski. Last updated 1 months ago.

3.3 match 4.70 score 4 scripts
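
The delta Ct arithmetic described in the RQdeltaCT entry can be illustrated with a short Python sketch. This is not the package's API: the function names and the Ct values below are hypothetical and chosen only to show how 2^-dCt and 2^-ddCt are computed.

```python
def delta_ct(ct_target, ct_reference):
    """dCt: target-gene Ct normalised to the endogenous control gene."""
    return ct_target - ct_reference


def fold_change_ddct(dct_sample, dct_control):
    """2^-ddCt: expression in the sample relative to the control group."""
    ddct = dct_sample - dct_control
    return 2.0 ** (-ddct)


# Hypothetical Ct values for one target gene and one reference gene.
treated = delta_ct(ct_target=24.0, ct_reference=18.0)  # dCt = 6.0
control = delta_ct(ct_target=26.0, ct_reference=18.0)  # dCt = 8.0

print(2.0 ** (-treated))                    # 2^-dCt for the treated sample
print(fold_change_ddct(treated, control))   # 2^-ddCt, treated vs control
```

A lower Ct means more template, so a negative ddCt translates into a fold change above 1.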

markvanderloo

accumulate:Split-Apply-Combine with Dynamic Groups

Estimate group aggregates, where one can set user-defined conditions that each group of records must satisfy to be suitable for aggregation. If a group of records is not suitable, it is expanded using a collapsing scheme defined by the user. A paper on this package was published in the Journal of Statistical Software <doi:10.18637/jss.v112.i04>.

Maintained by Mark van der Loo. Last updated 1 days ago.

2.9 match 9 stars 5.35 score 3 scripts
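
The idea in the accumulate entry (aggregate a group only if it passes a user-defined test, otherwise fall back to a coarser group) can be sketched in Python as follows. This is a minimal illustration under assumptions, not the package's interface: `accumulate_sketch`, the record fields, and the "at least 3 records" condition are all hypothetical.

```python
from statistics import mean


def accumulate_sketch(records, levels, test, aggregate):
    """For each finest-level group, aggregate over the first level of the
    collapsing scheme (finest to coarsest) whose record pool passes `test`."""
    result = {}
    finest = levels[0]
    for target in sorted({finest(r) for r in records}):
        # Any record of the target group; used to derive its coarser keys.
        representative = next(r for r in records if finest(r) == target)
        result[target] = None  # stays None if no level is suitable
        for depth, key in enumerate(levels):
            pool = [r for r in records if key(r) == key(representative)]
            if test(pool):
                result[target] = (depth, aggregate(pool))
                break
    return result


# Hypothetical data: sectors nested within regions.
records = [
    {"region": "N", "sector": "N1", "turnover": 10.0},
    {"region": "N", "sector": "N1", "turnover": 14.0},
    {"region": "N", "sector": "N1", "turnover": 12.0},
    {"region": "N", "sector": "N2", "turnover": 20.0},
    {"region": "S", "sector": "S1", "turnover": 5.0},
]
levels = [lambda r: r["sector"], lambda r: r["region"]]  # collapsing scheme
out = accumulate_sketch(
    records,
    levels,
    test=lambda pool: len(pool) >= 3,  # hypothetical suitability condition
    aggregate=lambda pool: mean(r["turnover"] for r in pool),
)
print(out)
```

Each result records the collapsing level that was used, so a reader can tell which estimates come from the group itself and which from an expanded group.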

bioc

AneuFinder:Analysis of Copy Number Variation in Single-Cell-Sequencing Data

AneuFinder implements functions for copy-number detection, breakpoint detection, and karyotype and heterogeneity analysis in single-cell whole genome sequencing and strand-seq data.

Maintained by Aaron Taudt. Last updated 5 months ago.

immunooncology software sequencing singlecell copynumbervariation genomicvariation hiddenmarkovmodel wholegenome cpp

2.0 match 17 stars 7.70 score 37 scripts

pharmar

riskmetric:Risk Metrics to Evaluating R Packages

Facilities for assessing R packages against a number of metrics to help quantify their robustness.

Maintained by Eli Miller. Last updated 11 hours ago.

1.7 match 167 stars 8.91 score 43 scripts

dereckmezquita

stenographer:Flexible and Customisable Logging System

A comprehensive logging framework for R applications that provides hierarchical logging levels, database integration, and contextual logging capabilities. The package supports 'SQLite' storage for persistent logs, provides colour-coded console output for better readability, includes parallel processing support, and implements structured error reporting with 'JSON' formatting.

Maintained by Dereck Mezquita. Last updated 2 months ago.

3.0 match 3 stars 5.08 score 1 scripts

bioc

cmapR:CMap Tools in R

The Connectivity Map (CMap) is a massive resource of perturbational gene expression profiles built by researchers at the Broad Institute and funded by the NIH Library of Integrated Network-Based Cellular Signatures (LINCS) program. Please visit https://clue.io for more information. The cmapR package implements methods to parse, manipulate, and write common CMap data objects, such as annotated matrices and collections of gene sets.

Maintained by Ted Natoli. Last updated 5 months ago.

dataimport datarepresentation geneexpression bioconductor bioinformatics cmap

1.7 match 90 stars 8.86 score 298 scripts

keyatm

keyATM:Keyword Assisted Topic Models

Fits keyword assisted topic models (keyATM) using collapsed Gibbs samplers. The keyATM combines the latent Dirichlet allocation (LDA) models with a small number of keywords selected by researchers in order to improve the interpretability and topic classification of the LDA. The keyATM can also incorporate covariates and directly model time trends. The keyATM is proposed in Eshima, Imai, and Sasaki (2024) <doi:10.1111/ajps.12779>.

Maintained by Shusei Eshima. Last updated 11 months ago.

latent-dirichlet-allocation natural-language-processing political-science rcpp rcppeigen social-science topic-models cpp

2.4 match 106 stars 6.30 score 63 scripts

bioc

BaalChIP:BaalChIP: Bayesian analysis of allele-specific transcription factor binding in cancer genomes

The package offers functions to process multiple ChIP-seq BAM files and detect allele-specific events. It computes allele counts at individual variants (SNPs/SNVs), implements extensive QC steps to remove problematic variants, and uses a Bayesian framework to identify statistically significant allele-specific events. BaalChIP is able to account for copy number differences between the two alleles, a known phenotypical feature of cancer samples.

Maintained by Ines de Santiago. Last updated 5 months ago.

software chipseq bayesian sequencing

3.8 match 4.00 score 5 scripts

flr

FLa4a:A Simple and Robust Statistical Catch at Age Model

A simple and robust statistical Catch at Age model that is specifically designed for stocks with intermediate levels of data quantity and quality.

Maintained by Ernesto Jardim. Last updated 8 days ago.

2.3 match 12 stars 6.66 score 177 scripts 2 dependents

kvasilopoulos

exuber:Econometric Analysis of Explosive Time Series

Testing for and dating periods of explosive dynamics (exuberance) in time series using the univariate and panel recursive unit root tests proposed by Phillips et al. (2015) <doi:10.1111/iere.12132> and Pavlidis et al. (2016) <doi:10.1007/s11146-015-9531-2>. The recursive least-squares algorithm utilizes the matrix inversion lemma to avoid matrix inversion, which results in significant speed improvements. Also supports simulation of a variety of periodically-collapsing bubble processes. Details can be found in Vasilopoulos et al. (2022) <doi:10.18637/jss.v103.i10>.

Maintained by Kostas Vasilopoulos. Last updated 1 years ago.

dickey-fuller explosive-dynamics simulation time-series openblas cpp

2.2 match 29 stars 6.83 score 77 scripts
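
The exuber entry mentions recursive least squares that uses the matrix inversion lemma to avoid inverting a matrix at every step. A generic version of that update (the Sherman-Morrison form) is sketched below in plain Python. This is an illustration of the textbook technique, not exuber's code; the function name `rls_update` and the example data are assumptions.

```python
def rls_update(P, beta, x, y):
    """One recursive least-squares step via the Sherman-Morrison identity.

    P    : current inverse of X'X (symmetric, as a list of lists)
    beta : current coefficient estimates
    x, y : new regressor row and its response
    """
    n = len(x)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(x[i] * Px[i] for i in range(n))
    # (X'X + x x')^-1 = P - (P x)(P x)' / (1 + x' P x); no inversion needed.
    P_new = [[P[i][j] - Px[i] * Px[j] / denom for j in range(n)]
             for i in range(n)]
    resid = y - sum(x[i] * beta[i] for i in range(n))
    # The gain vector P_new x simplifies to P x / denom.
    beta_new = [beta[i] + Px[i] / denom * resid for i in range(n)]
    return P_new, beta_new


# Start from an exact fit to (t, y) = (0, 1), (1, 3) with regressors [1, t]:
# X'X = [[2, 1], [1, 1]], whose inverse is [[1, -1], [-1, 2]].
P0 = [[1.0, -1.0], [-1.0, 2.0]]
beta0 = [1.0, 2.0]

# Add the observation (t, y) = (2, 6) without re-inverting X'X.
P1, beta1 = rls_update(P0, beta0, x=[1.0, 2.0], y=6.0)
print(beta1)  # intercept and slope of the OLS fit to all three points
```

Each update costs a few vector products, whereas refitting from scratch would require forming and inverting X'X again for every expanding window.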

juanv66x

qvirus:Quantum Computing for Analyzing CD4 Lymphocytes and Antiretroviral Therapy

Resources, tutorials, and code snippets dedicated to exploring the intersection of quantum computing and artificial intelligence (AI) in the context of analyzing Cluster of Differentiation 4 (CD4) lymphocytes and optimizing antiretroviral therapy (ART) for human immunodeficiency virus (HIV). With the emergence of quantum artificial intelligence and the development of small-scale quantum computers, there's an unprecedented opportunity to revolutionize the understanding of HIV dynamics and treatment strategies. This project leverages the R package 'qsimulatR' (Ostmeyer and Urbach, 2023, <https://CRAN.R-project.org/package=qsimulatR>), a quantum computer simulator, to explore these applications in quantum computing techniques, addressing the challenges in studying CD4 lymphocytes and enhancing ART efficacy.

Maintained by Juan Pablo Acuña González. Last updated 14 days ago.

2.8 match 5.43 score 15 scripts

isglobal-brge

SNPassoc:SNPs-Based Whole Genome Association Studies

Implements functions to perform most of the common analyses in genome association studies. These analyses include descriptive statistics and exploratory analysis of missing values, calculation of Hardy-Weinberg equilibrium, analysis of association based on generalized linear models (either for quantitative or binary traits), and analysis of multiple SNPs (haplotype and epistasis analysis). Permutation tests and related tests (sum statistic and truncated product) are also implemented. Exact distributions of the max-statistic and the genetic risk-allele score can also be estimated. The methods are described in Gonzalez JR et al., 2007 <doi:10.1093/bioinformatics/btm025>.

Maintained by Dolors Pelegri. Last updated 5 months ago.

1.6 match 16 stars 9.14 score 89 scripts 6 dependents
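
As an illustration of the Hardy-Weinberg equilibrium check mentioned in the SNPassoc entry, the sketch below computes the classical chi-square statistic from genotype counts. It is a generic textbook version with a hypothetical function name; the package itself may use a different procedure (for example an exact test).

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with the
    counts expected under Hardy-Weinberg equilibrium.

    Assumes both alleles are present (otherwise an expected count is zero).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical genotype counts (AA, AB, BB).
print(hwe_chi_square(25, 50, 25))  # counts match HWE proportions exactly
print(hwe_chi_square(40, 20, 40))  # heterozygote deficit
```

The statistic is compared against a chi-square distribution with one degree of freedom; large values indicate departure from equilibrium.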

wraff

wrMisc:Analyze Experimental High-Throughput (Omics) Data

This collection of diverse functions facilitates the efficient treatment and convenient analysis of experimental high-throughput (omics) data. Several functions address advanced object conversions, like manipulating lists of lists or lists of arrays, reorganizing lists to arrays or into separate vectors, merging multiple entries, etc. Another set of functions provides speed-optimized calculation of standard deviation (sd), coefficient of variation (CV) or standard error of the mean (SEM) for data in matrices, or means per line with respect to additional grouping (e.g. n groups of replicates). A group of functions facilitates dealing with non-redundant information, by indexing unique entries, adding counters to redundant entries, or eliminating lines with respect to redundancy in a given reference column, etc. Help is provided to identify very closely matching numeric values, to generate (partial) distance matrices for very big data in a memory-efficient manner, or to reduce the complexity of large datasets by combining very close values. Other functions help align a matrix or data.frame to a reference using partial matching, or mine an experimental setup to extract patterns of replicate samples. Large experimental datasets often need additional filtering, for which suitable functions are provided. Convenient data normalization is supported in various modes, and parameter estimation via permutations or bootstrap, as well as flexible testing of multiple pairwise combinations using the framework of 'limma', is provided too. Batch reading (or writing) of sets of files and combining data to arrays is also supported.

Maintained by Wolfgang Raffelsberger. Last updated 7 months ago.

3.3 match 4.44 score 33 scripts 4 dependents

taxonomicallyinformedannotation

tima:Taxonomically Informed Metabolite Annotation

This package provides the infrastructure to perform Taxonomically Informed Metabolite Annotation.

Maintained by Adriano Rutz. Last updated 8 hours ago.

metabolite annotation chemotaxonomy scoring system natural products computational metabolomics taxonomic distance specialized metabolome

2.3 match 9 stars 6.55 score 32 scripts 2 dependents

bioc

UPDhmm:Detecting Uniparental Disomy through NGS trio data

Uniparental disomy (UPD) is a genetic condition where an individual inherits both copies of a chromosome or part of it from one parent, rather than one copy from each parent. This package contains a HMM for detecting UPDs through HTS (High Throughput Sequencing) data from trio assays. By analyzing the genotypes in the trio, the model infers a hidden state (normal, father isodisomy, mother isodisomy, father heterodisomy and mother heterodisomy).

Maintained by Marta Sevilla. Last updated 5 months ago.

software hiddenmarkovmodel genetics

3.2 match 1 stars 4.54 score 3 scripts

sciviews

svBase:Base Objects like Data Frames for 'SciViews::R'

Functions to manipulate the three main classes of "data frames" for 'SciViews::R': data.frame, data.table and tibble. Allows selecting the preferred one and converting more carefully between the three, taking care of correct presentation of row names and data.table's keys. Provides a more homogeneous way of creating these three data frames and of printing them on the R console.

Maintained by Philippe Grosjean. Last updated 9 months ago.

data-frame sciviews

3.3 match 4.38 score 3 scripts 8 dependents

martin3141

spant:MR Spectroscopy Analysis Tools

Tools for reading, visualising and processing Magnetic Resonance Spectroscopy data. The package includes methods for spectral fitting: Wilson (2021) <DOI:10.1002/mrm.28385> and spectral alignment: Wilson (2018) <DOI:10.1002/mrm.27605>.

Maintained by Martin Wilson. Last updated 1 months ago.

brain mri mrs mrshub spectroscopy fortran

1.7 match 25 stars 8.52 score 81 scripts

poissonconsulting

universals:S3 Generics for Bayesian Analyses

Provides S3 generic methods and some default implementations for Bayesian analyses that generate Markov Chain Monte Carlo (MCMC) samples. The purpose of 'universals' is to reduce package dependencies and conflicts. The 'nlist' package implements many of the methods for its 'nlist' class.

Maintained by Joe Thorley. Last updated 2 months ago.

generics model-fitting s3

2.3 match 4 stars 6.37 score 1 scripts 20 dependents

statisticsnorway

SSBtools:Algorithms and Tools for Tabular Statistics and Hierarchical Computations

Includes general data manipulation functions, algorithms for statistical disclosure control (Langsrud, 2024) <doi:10.1007/978-3-031-69651-0_6> and functions for hierarchical computations by sparse model matrices (Langsrud, 2023) <doi:10.32614/RJ-2023-088>.

Maintained by Øyvind Langsrud. Last updated 5 days ago.

statistics

1.9 match 7 stars 7.62 score 68 scripts 7 dependents

usdaforestservice

FIESTAutils:Utility Functions for Forest Inventory Estimation and Analysis

A set of tools for data wrangling, spatial data analysis, statistical modeling (including direct, model-assisted, photo-based, and small area tools), and USDA Forest Service database tools. These tools aim to help foresters, analysts, and scientists extract and perform analyses on USDA Forest Service data.

Maintained by Grayson White. Last updated 5 days ago.

cpp

2.3 match 8 stars 6.33 score 1 dependents

poissonconsulting

tidyplus:Additional 'tidyverse' Functions

Provides functions such as str_crush(), add_missing_column(), coalesce_data() and drop_na_all() that complement 'tidyverse' functionality or functions that provide alternative behaviors such as if_else2() and str_detect2().

Maintained by Ayla Pearson. Last updated 2 months ago.

2.3 match 9 stars 6.33 score 1 scripts 4 dependents

samhforbes

eyetrackingR:Eye-Tracking Data Analysis

Addresses tasks along the pipeline from raw data to analysis and visualization for eye-tracking data. Offers several popular types of analyses, including linear and growth curve time analyses, onset-contingent reaction time analyses, as well as several non-parametric bootstrapping approaches. For references to the approach see Mirman, Dixon & Magnuson (2008) <doi:10.1016/j.jml.2007.11.006>, and Barr (2008) <doi:10.1016/j.jml.2007.09.002>.

Maintained by Samuel Forbes. Last updated 2 years ago.

1.8 match 22 stars 7.84 score 60 scripts

ctn-0094

DOPE:Drug Ontology Parsing Engine

Provides information on drug names (brand, generic and street) for drugs tracked by the DEA. There are functions that will search synonyms and return the drug names and types. The vignettes have extensive information on the work done to create the data for the package.

Maintained by Raymond Balise. Last updated 4 years ago.

1.8 match 21 stars 7.83 score 31 scripts

jonesor

Rage:Life History Metrics from Matrix Population Models

Functions for calculating life history metrics using matrix population models ('MPMs'). Described in Jones et al. (2021) <doi:10.1101/2021.04.26.441330>.

Maintained by Owen Jones. Last updated 3 months ago.

1.7 match 11 stars 8.17 score 62 scripts 1 dependents

green-striped-gecko

dartR.base:Analysing 'SNP' and 'Silicodart' Data - Basic Functions

Facilitates the import and analysis of 'SNP' (single nucleotide 'polymorphism') and 'silicodart' (presence/absence) data. The main focus is on data generated by 'DarT' (Diversity Arrays Technology); however, data from other sequencing platforms can be used once 'SNP' or related fragment presence/absence data from any source is imported. Genetic datasets are stored in a derived 'genlight' format (package 'adegenet') that allows for a very compact storage of data and metadata. Functions are available for importing and exporting 'SNP' and 'silicodart' data, and for reporting and filtering on various criteria (e.g. 'callrate', 'heterozygosity', 'reproducibility', maximum allele frequency). Additional functions are available for visualization (e.g. Principal Coordinates Analysis) and for creating a spatial representation using maps. 'dartR.base' is the 'base' package of the 'dartRverse' suite of packages. To install the other packages, we recommend installing the 'dartRverse' package, which supports the installation of all packages in the 'dartRverse'. To cite 'dartR', you can find the information by typing citation('dartR.base') in the console.

Maintained by Bernd Gruber. Last updated 16 days ago.

3.6 match 3.84 score 17 scripts 5 dependents

bioc

genefu:Computation of Gene Expression-Based Signatures in Breast Cancer

This package contains functions implementing various tasks usually required by gene expression analysis, especially in breast cancer studies: gene mapping between different microarray platforms, identification of molecular subtypes, implementation of published gene signatures, gene selection, and survival analysis.

Maintained by Benjamin Haibe-Kains. Last updated 4 months ago.

differentialexpression geneexpression visualization clustering classification

1.9 match 7.42 score 193 scripts 3 dependents

branchlab

metasnf:Meta Clustering with Similarity Network Fusion

Framework to facilitate patient subtyping with similarity network fusion and meta clustering. The similarity network fusion (SNF) algorithm was introduced by Wang et al. (2014) in <doi:10.1038/nmeth.2810>. SNF is a data integration approach that can transform high-dimensional and diverse data types into a single similarity network suitable for clustering with minimal loss of information from each initial data source. The meta clustering approach was introduced by Caruana et al. (2006) in <doi:10.1109/ICDM.2006.103>. Meta clustering involves generating a wide range of cluster solutions by adjusting clustering hyperparameters, then clustering the solutions themselves into a manageable number of qualitatively similar solutions, and finally characterizing representative solutions to find ones that are best for the user's specific context. This package provides a framework to easily transform multi-modal data into a wide range of similarity network fusion-derived cluster solutions as well as to visualize, characterize, and validate those solutions. Core package functionality includes easy customization of distance metrics, clustering algorithms, and SNF hyperparameters to generate diverse clustering solutions; calculation and plotting of associations between features, between patients, and between cluster solutions; and standard cluster validation approaches including resampled measures of cluster stability, standard metrics of cluster quality, and label propagation to evaluate generalizability in unseen data. Associated vignettes guide the user through using the package to identify patient subtypes while adhering to best practices for unsupervised learning.

Maintained by Prashanth S Velayudhan. Last updated 8 days ago.

bioinformatics clustering metaclustering snf

1.7 match 8 stars 8.21 score 30 scripts

darwin-eu

DrugUtilisation:Summarise Patient-Level Drug Utilisation in Data Mapped to the OMOP Common Data Model

Summarise patient-level drug utilisation cohorts using data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model. New users and prevalent users cohorts can be generated and their characteristics, indication and drug use summarised.

Maintained by Martí Català. Last updated 2 months ago.

1.7 match 8.20 score 156 scripts 2 dependents

graemetlloyd

Claddis:Measuring Morphological Diversity and Evolutionary Tempo

Measures morphological diversity from discrete character data and estimates evolutionary tempo on phylogenetic trees. Imports morphological data from #NEXUS (Maddison et al. (1997) <doi:10.1093/sysbio/46.4.590>) format with read_nexus_matrix(), and writes to both #NEXUS and TNT format (Goloboff et al. (2008) <doi:10.1111/j.1096-0031.2008.00217.x>). Main functions are test_rates(), which implements AIC and likelihood ratio tests for discrete character rates introduced across Lloyd et al. (2012) <doi:10.1111/j.1558-5646.2011.01460.x>, Brusatte et al. (2014) <doi:10.1016/j.cub.2014.08.034>, Close et al. (2015) <doi:10.1016/j.cub.2015.06.047>, and Lloyd (2016) <doi:10.1111/bij.12746>, and calculate_morphological_distances(), which implements multiple discrete character distance metrics from Gower (1971) <doi:10.2307/2528823>, Wills (1998) <doi:10.1006/bijl.1998.0255>, Lloyd (2016) <doi:10.1111/bij.12746>, and Hopkins and St John (2018) <doi:10.1098/rspb.2018.1784>. This also includes the GED correction from Lehmann et al. (2019) <doi:10.1111/pala.12430>. Multiple functions implement morphospace plots: plot_chronophylomorphospace() implements Sakamoto and Ruta (2012) <doi:10.1371/journal.pone.0039752>, plot_morphospace() implements Wills et al. (1994) <doi:10.1017/S009483730001263X>, plot_changes_on_tree() implements Wang and Lloyd (2016) <doi:10.1098/rspb.2016.0214>, and plot_morphospace_stack() implements Foote (1993) <doi:10.1017/S0094837300015864>. Other functions include safe_taxonomic_reduction(), which implements Wilkinson (1995) <doi:10.1093/sysbio/44.4.501>, map_dollo_changes(), which implements the Dollo stochastic character mapping of Tarver et al. (2018) <doi:10.1093/gbe/evy096>, and estimate_ancestral_states(), which implements the ancestral state options of Lloyd (2018) <doi:10.1111/pala.12380>. calculate_tree_length() and reconstruct_ancestral_states() implement the generalised algorithms from Swofford and Maddison (1992; no doi).

Maintained by Graeme T. Lloyd. Last updated 7 months ago.

1.8 match 13 stars 7.81 score 77 scripts 2 dependents

dwbapst

paleotree:Paleontological and Phylogenetic Analyses of Evolution

Provides tools for transforming, a posteriori time-scaling, and modifying phylogenies containing extinct (i.e. fossil) lineages. In particular, most users are interested in the functions timePaleoPhy, bin_timePaleoPhy, cal3TimePaleoPhy and bin_cal3TimePaleoPhy, which date cladograms of fossil taxa using stratigraphic data. This package also contains a large number of likelihood functions for estimating sampling and diversification rates from different types of data available from the fossil record (e.g. range data, occurrence data, etc). paleotree users can also simulate diversification and sampling in the fossil record using the function simFossilRecord, which is a detailed simulator for branching birth-death-sampling processes composed of discrete taxonomic units arranged in ancestor-descendant relationships. Users can use simFossilRecord to simulate diversification in incompletely sampled fossil records, under various models of morphological differentiation (i.e. the various patterns by which morphotaxa originate from one another), and with time-dependent, longevity-dependent and/or diversity-dependent rates of diversification, extinction and sampling. Additional functions allow users to translate simulated ancestor-descendant data from simFossilRecord into standard time-scaled phylogenies or unscaled cladograms that reflect the relationships among taxon units.

Maintained by David W. Bapst. Last updated 8 months ago.

1.8 match 21 stars 7.53 score 216 scripts 2 dependents

cdcgov

surveytable:Formatted Survey Estimates

Short and understandable commands that generate tabulated, formatted, and rounded survey estimates. Mostly a wrapper for the 'survey' package (Lumley (2004) <doi:10.18637/jss.v009.i08> <https://CRAN.R-project.org/package=survey>) that identifies low-precision estimates using the National Center for Health Statistics (NCHS) presentation standards (Parker et al. (2017) <https://www.cdc.gov/nchs/data/series/sr_02/sr02_175.pdf>, Parker et al. (2023) <doi:10.15620/cdc:124368>).

Maintained by Alex Strashny. Last updated 7 days ago.

estimates formatted-output pretty-print survey tables

2.0 match 6 stars 6.71 score 19 scripts

usaid-oha-si

selfdestructin5:Creates SI OHA Mission Director Briefers

Creates a series of data frames that can be passed to gt() to create the PEPFAR summary tables.

Maintained by Tim Essam. Last updated 29 days ago.

3.4 match 1 stars 3.98 score 21 scripts

bioc

BioNERO:Biological Network Reconstruction Omnibus

BioNERO aims to integrate all aspects of biological network inference in a single package, including data preprocessing, exploratory analyses, network inference, and analyses for biological interpretations. BioNERO can be used to infer gene coexpression networks (GCNs) and gene regulatory networks (GRNs) from gene expression data. Additionally, it can be used to explore topological properties of protein-protein interaction (PPI) networks. GCN inference relies on the popular WGCNA algorithm. GRN inference is based on the "wisdom of the crowds" principle, which consists of inferring GRNs with multiple algorithms (here, CLR, GENIE3 and ARACNE) and calculating the average rank for each interaction pair. As all steps of network analyses are included in this package, BioNERO spares users from having to learn the syntaxes of several packages and how to communicate between them. Finally, users can also identify consensus modules across independent expression sets and calculate intra- and interspecies module preservation statistics between different networks.

Maintained by Fabricio Almeida-Silva. Last updated 5 months ago.

software geneexpression generegulation systemsbiology graphandnetwork preprocessing network networkinference

1.7 match 27 stars 7.78 score 50 scripts 1 dependents
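
The "wisdom of the crowds" step described in the BioNERO entry (rank each interaction within every algorithm, then average the ranks) can be sketched in Python as follows. This is only an illustration: the function name, the three placeholder methods and their scores are hypothetical and do not reproduce CLR, GENIE3 or ARACNE output.

```python
def average_rank(score_tables):
    """Average the per-method rank of each (regulator, target) pair.

    score_tables: list of dicts mapping an edge to a score, where a higher
    score means a stronger predicted interaction. Ties are not treated
    specially in this sketch.
    """
    ranks = {}
    for scores in score_tables:
        ordered = sorted(scores, key=scores.get, reverse=True)
        for rank, edge in enumerate(ordered, start=1):
            ranks.setdefault(edge, []).append(rank)
    return {edge: sum(r) / len(r) for edge, r in ranks.items()}


# Hypothetical scores from three placeholder inference methods.
method_a = {("TF1", "g1"): 0.9, ("TF1", "g2"): 0.5, ("TF2", "g1"): 0.1}
method_b = {("TF1", "g1"): 0.2, ("TF1", "g2"): 0.8, ("TF2", "g1"): 0.4}
method_c = {("TF1", "g1"): 0.7, ("TF1", "g2"): 0.6, ("TF2", "g1"): 0.3}

consensus = average_rank([method_a, method_b, method_c])
for edge, rank in sorted(consensus.items(), key=lambda kv: kv[1]):
    print(edge, round(rank, 2))
```

Averaging ranks rather than raw scores sidesteps the fact that different algorithms report scores on incomparable scales.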

beckerbenj

eatGADS:Data Management of Large Hierarchical Data

Import 'SPSS' data, handle and change 'SPSS' metadata, store and access large hierarchical data in 'SQLite' databases.

Maintained by Benjamin Becker. Last updated 26 days ago.

1.8 match 1 stars 7.36 score 34 scripts 1 dependents

dipterix

filearray:File-Backed Array for Out-of-Memory Computation

Stores large arrays in files to avoid occupying large amounts of memory. Implemented with super fast gigabyte-level multi-threaded reading/writing via 'OpenMP'. Supports multiple non-character data types (double, float, complex, integer, logical, and raw).

Maintained by Zhengjia Wang. Last updated 2 days ago.

array big-data memory-map out-of-memory outofmemory cpp

2.0 match 17 stars 6.58 score 10 scripts 3 dependents

skranz

xglue:Extended glue. Write collapse and by Uses a template file with a special syntax.

Extended glue. Add blocks to the string to be glued that specify collapse operations on vectors, and allow operating on a grouped data frame.

Maintained by Sebastian Kranz. Last updated 4 years ago.

glue stringinterpolation

5.3 match 6 stars 2.48 score 6 scripts

davidchall

ipaddress:Data Analysis for IP Addresses and Networks

Classes and functions for working with IP (Internet Protocol) addresses and networks, inspired by the Python 'ipaddress' module. Offers full support for both IPv4 and IPv6 (Internet Protocol versions 4 and 6) address spaces. It is specifically designed to work well with the 'tidyverse'.

Maintained by David Hall. Last updated 1 years ago.

cyber data-analysis ip-address ipv4 ipv6 vctrs cpp

1.9 match 32 stars 7.02 score 27 scripts 2 dependents
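
For orientation, the Python standard-library 'ipaddress' module that the entry above cites as its inspiration works as shown below. This illustrates the Python module only, not the R package's functions; the addresses are documentation and private-range examples.

```python
import ipaddress

# Parse IPv4 and IPv6 addresses with the same constructor.
addr4 = ipaddress.ip_address("192.168.0.1")
addr6 = ipaddress.ip_address("2001:db8::8a2e:370:7334")

# Parse a network in CIDR notation.
net = ipaddress.ip_network("192.168.0.0/24")

print(addr4.version, addr6.version)  # protocol version of each address
print(addr4 in net)                  # membership test against the network
print(net.num_addresses)             # size of the address block
```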

cran

tensorA:Advanced Tensor Arithmetic with Named Indices

Provides convenience functions for advanced linear algebra with tensors and computation with data sets of tensors on a higher level abstraction. It includes Einstein and Riemann summing conventions, dragging, co- and contravariate indices, parallel computations on sequences of tensors.

Maintained by K. Gerald van den Boogaart. Last updated 1 year ago.

2.3 match 5.83 score 399 dependents

sdanzige

ADAPTS:Automated Deconvolution Augmentation of Profiles for Tissue Specific Cells

Tools to construct (or add to) cell-type signature matrices using flow sorted or single cell samples and deconvolve bulk gene expression data. Useful for assessing the quality of single cell RNAseq experiments, estimating the accuracy of signature matrices, and determining cell-type spillover. Please cite: Danziger SA et al. (2019) ADAPTS: Automated Deconvolution Augmentation of Profiles for Tissue Specific cells <doi:10.1371/journal.pone.0224693>.

Maintained by Samuel A Danziger. Last updated 3 years ago.

2.0 match 2 stars 6.56 score 40 scripts 1 dependent

thibautjombart

treespace:Statistical Exploration of Landscapes of Phylogenetic Trees

Tools for the exploration of distributions of phylogenetic trees. This package includes a 'shiny' interface which can be started from R using treespaceServer(). For further details see Jombart et al. (2017) <DOI:10.1111/1755-0998.12676>.

Maintained by Michelle Kendall. Last updated 2 years ago.

cpp

1.8 match 28 stars 7.39 score 63 scripts

mikeblazanin

gcplyr:Wrangle and Analyze Growth Curve Data

Easy wrangling and model-free analysis of microbial growth curve data, as commonly output by plate readers. Tools for reshaping common plate reader outputs into 'tidy' formats and merging them with design information, making data easy to work with using 'gcplyr' and other packages. Also streamlines common growth curve processing steps, like smoothing and calculating derivatives, and facilitates model-free characterization and analysis of growth data. See methods at <https://mikeblazanin.github.io/gcplyr/>.

Maintained by Mike Blazanin. Last updated 2 months ago.

dplyr ggplot2 tidyverse

1.7 match 30 stars 7.53 score 75 scripts

samhforbes

PupillometryR:A Unified Pipeline for Pupillometry Data

Provides a unified pipeline to clean, prepare, plot, and run basic analyses on pupillometry experiments.

Maintained by Samuel Forbes. Last updated 1 year ago.

1.7 match 44 stars 7.58 score 288 scripts 1 dependent

openanalytics

clinDataReview:Clinical Data Review Tool

Creation of interactive tables, listings and figures ('TLFs') and associated report for exploratory analysis of data in a clinical trial, e.g. for clinical oversight activities. Interactive figures include sunburst, treemap, scatterplot, line plot and barplot of counts data. Interactive tables include table of summary statistics (as counts of adverse events, enrollment table) and listings. Possibility to compare data (summary table or listing) across two data batches/sets. A clinical data review report is created via study-specific configuration files and template 'R Markdown' reports contained in the package.

Maintained by Laure Cougnaud. Last updated 9 months ago.

1.8 match 11 stars 7.10 score 36 scripts

bioc

openCyto:Hierarchical Gating Pipeline for flow cytometry data

This package is designed to facilitate automated gating methods applied in a sequential way to mimic the manual gating strategy.

Maintained by Mike Jiang. Last updated 5 months ago.

immunooncology flowcytometry dataimport preprocessing datarepresentation cpp

1.7 match 7.62 score 404 scripts 1 dependent

bioc

SpatialDecon:Deconvolution of mixed cells from spatial and/or bulk gene expression data

Using spatial or bulk gene expression data, estimates abundance of mixed cell types within each observation. Based on "Advances in mixed cell deconvolution enable quantification of cell types in spatial transcriptomic data", Danaher (2022). Designed for use with the NanoString GeoMx platform, but applicable to any gene expression data.

Maintained by Maddy Griswold. Last updated 5 months ago.

immunooncology featureextraction geneexpression transcriptomics spatial

1.7 match 36 stars 7.40 score 58 scripts

capitalone

dataCompareR:Compare Two Data Frames and Summarise the Difference

Easy comparison of two tabular data objects in R. Specifically designed to show differences between two sets of data in a useful way that should make it easier to understand the differences, and if necessary, help you work out how to remedy them. Aims to offer a more useful output than all.equal() when your two data sets do not match, but isn't intended to replace all.equal() as a way to test for equality.

Maintained by Sarah Johnston. Last updated 2 years ago.

compare-data data data-analysis data-science

1.8 match 76 stars 7.24 score 76 scripts

jonesor

Rcompadre:Utilities for using the 'COM(P)ADRE' Matrix Model Database

Utility functions for interacting with the 'COMPADRE' and 'COMADRE' databases of matrix population models. Described in Jones et al. (2021) <doi:10.1101/2021.04.26.441330>.

Maintained by Owen Jones. Last updated 5 months ago.

1.6 match 11 stars 7.74 score 55 scripts 2 dependents

darunabas

phyloregion:Biogeographic Regionalization and Macroecology

Computational infrastructure for biogeography, community ecology, and biodiversity conservation (Daru et al. 2020) <doi:10.1111/2041-210X.13478>. It is based on the methods described in Daru et al. (2020) <doi:10.1038/s41467-020-15921-6>. The original conceptual work is described in Daru et al. (2017) <doi:10.1016/j.tree.2017.08.013> on patterns and processes of biogeographical regionalization. Additionally, the package contains fast and efficient functions to compute more standard conservation measures such as phylogenetic diversity, phylogenetic endemism, evolutionary distinctiveness and global endangerment, as well as compositional turnover (e.g., beta diversity).

Maintained by Barnabas H. Daru. Last updated 5 months ago.

software

1.8 match 18 stars 7.21 score 50 scripts 1 dependent

cran

EHR:Electronic Health Record (EHR) Data Processing and Analysis Tool

Process and analyze electronic health record (EHR) data. The 'EHR' package provides modules to perform diverse medication-related studies using data from EHR databases. In particular, the package includes modules to perform pharmacokinetic/pharmacodynamic (PK/PD) analyses using EHRs, as outlined in Choi, Beck, McNeer, Weeks, Williams, James, Niu, Abou-Khalil, Birdwell, Roden, Stein, Bejan, Denny, and Van Driest (2020) <doi:10.1002/cpt.1787>. Additional modules will be added in the future. In addition, this package provides various functions useful for performing a Phenome Wide Association Study (PheWAS) to explore associations between drug exposure and phenotypes obtained from EHR data, as outlined in Choi, Carroll, Beck, Mosley, Roden, Denny, and Van Driest (2018) <doi:10.1093/bioinformatics/bty306>.

Maintained by Leena Choi. Last updated 2 years ago.

3.5 match 3.60 score

uzh-peg

microxanox:Oxic-Anoxic Regime Shifts in Microbial Communities

Model to simulate a three functional group system with four chemical substrates using a set of ordinary differential equations. Simulations can be run individually or over a parameter range, to find stable states. The model features multiple species per functional group, where the number is only limited by computational constraints. The R package is constructed in such a way that the results contain the input parameters used, so that saved results can be loaded again and the simulation repeated.

Maintained by Owen L. Petchey. Last updated 1 year ago.

3.3 match 3.85 score 35 scripts

bioc

alevinQC:Generate QC Reports For Alevin Output

Generate QC reports summarizing the output from an alevin, alevin-fry, or simpleaf run. Reports can be generated as html or pdf files, or as shiny applications.

Maintained by Charlotte Soneson. Last updated 3 months ago.

qualitycontrol singlecell cpp

1.8 match 31 stars 6.89 score 21 scripts

kbhoehn

dowser:B Cell Receptor Phylogenetics Toolkit

Provides a set of functions for inferring, visualizing, and analyzing B cell phylogenetic trees. Provides methods to 1) reconstruct unmutated ancestral sequences, 2) build B cell phylogenetic trees using multiple methods, 3) visualize trees with metadata at the tips, 4) reconstruct intermediate sequences, and 5) detect biased ancestor-descendant relationships among metadata types. Workflow examples are available at the documentation site (see URL). Citations: Hoehn et al (2022) <doi:10.1371/journal.pcbi.1009885>, Hoehn et al (2021) <doi:10.1101/2021.01.06.425648>.

Maintained by Kenneth Hoehn. Last updated 2 months ago.

1.8 match 6.81 score 84 scripts

steverozen

ICAMS:In-Depth Characterization and Analysis of Mutational Signatures ('ICAMS')

Analysis and visualization of experimentally elucidated mutational signatures -- the kind of analysis and visualization in Boot et al., "In-depth characterization of the cisplatin mutational signature in human cell lines and in esophageal and liver tumors", Genome Research 2018, <doi:10.1101/gr.230219.117> and "Characterization of colibactin-associated mutational signature in an Asian oral squamous cell carcinoma and in other mucosal tumor types", Genome Research 2020 <doi:10.1101/gr.255620.119>. 'ICAMS' stands for In-depth Characterization and Analysis of Mutational Signatures. 'ICAMS' has functions to read in variant call files (VCFs) and to collate the corresponding catalogs of mutational spectra and to analyze and plot catalogs of mutational spectra and signatures. Handles both "counts-based" and "density-based" (i.e. representation as mutations per megabase) mutational spectra or signatures.

Maintained by Steve Rozen. Last updated 3 years ago.

2.3 match 8 stars 5.41 score 128 scripts