Showing 200 of total 332 results (show query)
sebkrantz
collapse:Advanced and Fast Data Transformation
A C/C++ based package for advanced data transformation and statistical computing in R that is extremely fast, class-agnostic, robust and programmer friendly. Core functionality includes a rich set of S3 generic grouped and weighted statistical functions for vectors, matrices and data frames, which provide efficient low-level vectorizations, OpenMP multithreading, and skip missing values by default. These are integrated with fast grouping and ordering algorithms (also callable from C), and efficient data manipulation functions. The package also provides a flexible and rigorous approach to time series and panel data in R. It further includes fast functions for common statistical procedures, detailed (grouped, weighted) summary statistics, powerful tools to work with nested data, fast data object conversions, functions for memory efficient R programming, and helpers to effectively deal with variable labels, attributes, and missing data. It is well integrated with base R classes, 'dplyr'/'tibble', 'data.table', 'sf', 'units', 'plm' (panel-series and data frames), and 'xts'/'zoo'.
Maintained by Sebastian Krantz. Last updated 9 days ago.
data-aggregationdata-analysisdata-manipulationdata-processingdata-sciencedata-transformationeconometricshigh-performancepanel-datascientific-computingstatisticstime-seriesweightedweightscppopenmp
88.2 match 672 stars 16.63 score 708 scripts 97 dependentsemmanuelparadis
ape:Analyses of Phylogenetics and Evolution
Functions for reading, writing, plotting, and manipulating phylogenetic trees, analyses of comparative data in a phylogenetic framework, ancestral character analyses, analyses of diversification and macroevolution, computing distances from DNA sequences, reading and writing nucleotide sequences as well as importing from BioConductor, and several tools such as Mantel's test, generalized skyline plots, graphical exploration of phylogenetic data (alex, trex, kronoviz), estimation of absolute evolutionary rates and clock-like trees using mean path lengths and penalized likelihood, dating trees with non-contemporaneous sequences, translating DNA into AA sequences, and assessing sequence alignments. Phylogeny estimation can be done with the NJ, BIONJ, ME, MVR, SDM, and triangle methods, and several methods handling incomplete distance matrices (NJ*, BIONJ*, MVR*, and the corresponding triangle method). Some functions call external applications (PhyML, Clustal, T-Coffee, Muscle) whose results are returned into R.
Maintained by Emmanuel Paradis. Last updated 3 days ago.
10.5 match 64 stars 17.22 score 13k scripts 599 dependentsbioc
ORFik:Open Reading Frames in Genomics
R package for analysis of transcript and translation features through manipulation of sequence data and NGS data like Ribo-Seq, RNA-Seq, TCP-Seq and CAGE. It is generalized in the sense that any transcript region can be analysed, as the name hints to it was made with investigation of ribosomal patterns over Open Reading Frames (ORFs) as it's primary use case. ORFik is extremely fast through use of C++, data.table and GenomicRanges. Package allows to reassign starts of the transcripts with the use of CAGE-Seq data, automatic shifting of RiboSeq reads, finding of Open Reading Frames for whole genomes and much more.
Maintained by Haakon Tjeldnes. Last updated 30 days ago.
immunooncologysoftwaresequencingriboseqrnaseqfunctionalgenomicscoveragealignmentdataimportcpp
15.5 match 33 stars 10.63 score 115 scripts 2 dependentssolivella
lda:Collapsed Gibbs Sampling Methods for Topic Models
Implements latent Dirichlet allocation (LDA) and related models. This includes (but is not limited to) sLDA, corrLDA, and the mixed-membership stochastic blockmodel. Inference for all of these models is implemented via a fast collapsed Gibbs sampler written in C. Utility functions for reading/writing data typically used in topic models, as well as tools for examining posterior distributions are also included.
Maintained by Santiago Olivella. Last updated 11 months ago.
16.1 match 7.62 score 548 scripts 11 dependentscran
nlme:Linear and Nonlinear Mixed Effects Models
Fit and compare Gaussian linear and nonlinear mixed-effects models.
Maintained by R Core Team. Last updated 2 months ago.
7.0 match 6 stars 13.00 score 13k scripts 8.7k dependentsfriendly
vcdExtra:'vcd' Extensions and Additions
Provides additional data sets, methods and documentation to complement the 'vcd' package for Visualizing Categorical Data and the 'gnm' package for Generalized Nonlinear Models. In particular, 'vcdExtra' extends mosaic, assoc and sieve plots from 'vcd' to handle 'glm()' and 'gnm()' models and adds a 3D version in 'mosaic3d'. Additionally, methods are provided for comparing and visualizing lists of 'glm' and 'loglm' objects. This package is now a support package for the book, "Discrete Data Analysis with R" by Michael Friendly and David Meyer.
Maintained by Michael Friendly. Last updated 5 months ago.
categorical-data-visualizationgeneralized-linear-modelsmosaic-plots
8.7 match 24 stars 10.34 score 472 scripts 3 dependentsbioc
ggtree:an R package for visualization of tree and annotation data
'ggtree' extends the 'ggplot2' plotting system which implemented the grammar of graphics. 'ggtree' is designed for visualization and annotation of phylogenetic trees and other tree-like structures with their annotation data.
Maintained by Guangchuang Yu. Last updated 5 months ago.
alignmentannotationclusteringdataimportmultiplesequencealignmentphylogeneticsreproducibleresearchsoftwarevisualizationannotationsggplot2phylogenetic-trees
5.3 match 864 stars 16.86 score 5.1k scripts 109 dependentstalgalili
dendextend:Extending 'dendrogram' Functionality in R
Offers a set of functions for extending 'dendrogram' objects in R, letting you visualize and compare trees of 'hierarchical clusterings'. You can (1) Adjust a tree's graphical parameters - the color, size, type, etc of its branches, nodes and labels. (2) Visually and statistically compare different 'dendrograms' to one another.
Maintained by Tal Galili. Last updated 2 months ago.
5.1 match 154 stars 17.02 score 6.0k scripts 164 dependentsbioc
fgsea:Fast Gene Set Enrichment Analysis
The package implements an algorithm for fast gene set enrichment analysis. Using the fast algorithm allows to make more permutations and get more fine grained p-values, which allows to use accurate stantard approaches to multiple hypothesis correction.
Maintained by Alexey Sergushichev. Last updated 3 months ago.
geneexpressiondifferentialexpressiongenesetenrichmentpathwayscpp
5.1 match 387 stars 16.25 score 3.9k scripts 101 dependentstidyverse
glue:Interpreted String Literals
An implementation of interpreted string literals, inspired by Python's Literal String Interpolation <https://www.python.org/dev/peps/pep-0498/> and Docstrings <https://www.python.org/dev/peps/pep-0257/> and Julia's Triple-Quoted String Literals <https://docs.julialang.org/en/v1.3/manual/strings/#Triple-Quoted-String-Literals-1>.
Maintained by Jennifer Bryan. Last updated 6 months ago.
3.5 match 729 stars 21.76 score 57k scripts 14k dependentstidyverse
forcats:Tools for Working with Categorical Variables (Factors)
Helpers for reordering factor levels (including moving specified levels to front, ordering by first appearance, reversing, and randomly shuffling), and tools for modifying factor levels (including collapsing rare levels into other, 'anonymising', and manually 'recoding').
Maintained by Hadley Wickham. Last updated 1 years ago.
4.0 match 555 stars 18.77 score 21k scripts 1.2k dependentsberndbischl
BBmisc:Miscellaneous Helper Functions for B. Bischl
Miscellaneous helper functions for and from B. Bischl and some other guys, mainly for package development.
Maintained by Bernd Bischl. Last updated 2 years ago.
7.0 match 20 stars 10.59 score 980 scripts 69 dependentstidyverse
dplyr:A Grammar of Data Manipulation
A fast, consistent tool for working with data frame like objects, both in memory and out of memory.
Maintained by Hadley Wickham. Last updated 16 days ago.
3.0 match 4.8k stars 24.68 score 659k scripts 7.8k dependentskwb-r
kwb.utils:General Utility Functions Developed at KWB
This package contains some small helper functions that aim at improving the quality of code developed at Kompetenzzentrum Wasser gGmbH (KWB).
Maintained by Hauke Sonnenberg. Last updated 12 months ago.
9.9 match 8 stars 7.33 score 12 scripts 78 dependentsadeelk93
collapsibleTree:Interactive Collapsible Tree Diagrams using 'D3.js'
Interactive Reingold-Tilford tree diagrams created using 'D3.js', where every node can be expanded and collapsed by clicking on it. Tooltips and color gradients can be mapped to nodes using a numeric column in the source data frame. See 'collapsibleTree' website for more information and examples.
Maintained by Adeel Khan. Last updated 1 years ago.
8.7 match 159 stars 8.25 score 472 scripts 6 dependentspoissonconsulting
nlist:Lists of Numeric Atomic Objects
Create and manipulate numeric list ('nlist') objects. An 'nlist' is an S3 list of uniquely named numeric objects. An numeric object is an integer or double vector, matrix or array. An 'nlists' object is a S3 class list of 'nlist' objects with the same names, dimensionalities and typeofs. Numeric list objects are of interest because they are the raw data inputs for analytic engines such as 'JAGS', 'STAN' and 'TMB'. Numeric lists objects, which are useful for storing multiple realizations of of simulated data sets, can be converted to coda::mcmc and coda::mcmc.list objects.
Maintained by Joe Thorley. Last updated 2 months ago.
9.0 match 6 stars 7.23 score 13 scripts 12 dependentsspatstat
spatstat.explore:Exploratory Data Analysis for the 'spatstat' Family
Functionality for exploratory data analysis and nonparametric analysis of spatial data, mainly spatial point patterns, in the 'spatstat' family of packages. (Excludes analysis of spatial data on a linear network, which is covered by the separate package 'spatstat.linnet'.) Methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported.
Maintained by Adrian Baddeley. Last updated 3 days ago.
cluster-detectionconfidence-intervalshypothesis-testingk-functionroc-curvesscan-statisticssignificance-testingsimulation-envelopesspatial-analysisspatial-data-analysisspatial-sharpeningspatial-smoothingspatial-statistics
6.3 match 1 stars 10.18 score 67 scripts 149 dependentshighlanderlab
SIMplyBee:'AlphaSimR' Extension for Simulating Honeybee Populations and Breeding Programmes
An extension of the 'AlphaSimR' package (<https://cran.r-project.org/package=AlphaSimR>) for stochastic simulations of honeybee populations and breeding programmes. 'SIMplyBee' enables simulation of individual bees that form a colony, which includes a queen, fathers (drones the queen mated with), virgin queens, workers, and drones. Multiple colony can be merged into a population of colonies, such as an apiary or a whole country of colonies. Functions enable operations on castes, colony, or colonies, to ease 'R' scripting of whole populations. All 'AlphaSimR' functionality with respect to genomes and genetic and phenotype values is available and further extended for honeybees, including haplo-diploidy, complementary sex determiner locus, colony events (swarming, supersedure, etc.), and colony phenotype values.
Maintained by Jana Obลกteter. Last updated 6 months ago.
10.0 match 2 stars 6.24 score 18 scriptschristopherkenny
censable:Making Census Data More Usable
Creates a common framework for organizing, naming, and gathering population, age, race, and ethnicity data from the Census Bureau. Accesses the API <https://www.census.gov/data/developers/data-sets.html>. Provides tools for adding information to existing data to line up with Census data.
Maintained by Christopher T. Kenny. Last updated 10 months ago.
10.4 match 8 stars 5.78 score 42 scripts 4 dependentsepiforecasts
EpiNow2:Estimate Real-Time Case Counts and Time-Varying Epidemiological Parameters
Estimates the time-varying reproduction number, rate of spread, and doubling time using a range of open-source tools (Abbott et al. (2020) <doi:10.12688/wellcomeopenres.16006.1>), and current best practices (Gostic et al. (2020) <doi:10.1101/2020.06.18.20134858>). It aims to help users avoid some of the limitations of naive implementations in a framework that is informed by community feedback and is actively supported.
Maintained by Sebastian Funk. Last updated 28 days ago.
backcalculationcovid-19gaussian-processesopen-sourcereproduction-numberstancpp
4.9 match 120 stars 11.88 score 210 scriptskharchenkolab
pagoda2:Single Cell Analysis and Differential Expression
Analyzing and interactively exploring large-scale single-cell RNA-seq datasets. 'pagoda2' primarily performs normalization and differential gene expression analysis, with an interactive application for exploring single-cell RNA-seq datasets. It performs basic tasks such as cell size normalization, gene variance normalization, and can be used to identify subpopulations and run differential expression within individual samples. 'pagoda2' was written to rapidly process modern large-scale scRNAseq datasets of approximately 1e6 cells. The companion web application allows users to explore which gene expression patterns form the different subpopulations within your data. The package also serves as the primary method for preprocessing data for conos, <https://github.com/kharchenkolab/conos>. This package interacts with data available through the 'p2data' package, which is available in a 'drat' repository. To access this data package, see the instructions at <https://github.com/kharchenkolab/pagoda2>. The size of the 'p2data' package is approximately 6 MB.
Maintained by Evan Biederstedt. Last updated 1 years ago.
scrna-seqsingle-cellsingle-cell-rna-seqtranscriptomicsopenblascppopenmp
7.0 match 222 stars 8.00 score 282 scriptsbioc
Biostrings:Efficient manipulation of biological strings
Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences.
Maintained by Hervรฉ Pagรจs. Last updated 27 days ago.
sequencematchingalignmentsequencinggeneticsdataimportdatarepresentationinfrastructurebioconductor-packagecore-package
3.0 match 61 stars 17.83 score 8.6k scripts 1.2k dependentsbioc
DESeq2:Differential gene expression analysis based on the negative binomial distribution
Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.
Maintained by Michael Love. Last updated 14 days ago.
sequencingrnaseqchipseqgeneexpressiontranscriptionnormalizationdifferentialexpressionbayesianregressionprincipalcomponentclusteringimmunooncologyopenblascpp
3.1 match 375 stars 16.11 score 17k scripts 115 dependentskharchenkolab
sccore:Core Utilities for Single-Cell RNA-Seq
Core utilities for single-cell RNA-seq data analysis. Contained within are utility functions for working with differential expression (DE) matrices and count matrices, a collection of functions for manipulating and plotting data via 'ggplot2', and functions to work with cell graphs and cell embeddings. Graph-based methods include embedding kNN cell graphs into a UMAP <doi:10.21105/joss.00861>, collapsing vertices of each cluster in the graph, and propagating graph labels.
Maintained by Evan Biederstedt. Last updated 1 years ago.
7.5 match 12 stars 6.44 score 36 scripts 9 dependentsbioc
edgeR:Empirical Analysis of Digital Gene Expression Data in R
Differential expression analysis of sequence count data. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models, quasi-likelihood, and gene set enrichment. Can perform differential analyses of any type of omics data that produces read counts, including RNA-seq, ChIP-seq, ATAC-seq, Bisulfite-seq, SAGE, CAGE, metabolomics, or proteomics spectral counts. RNA-seq analyses can be conducted at the gene or isoform level, and tests can be conducted for differential exon or transcript usage.
Maintained by Yunshun Chen. Last updated 8 days ago.
alternativesplicingbatcheffectbayesianbiomedicalinformaticscellbiologychipseqclusteringcoveragedifferentialexpressiondifferentialmethylationdifferentialsplicingdnamethylationepigeneticsfunctionalgenomicsgeneexpressiongenesetenrichmentgeneticsimmunooncologymultiplecomparisonnormalizationpathwaysproteomicsqualitycontrolregressionrnaseqsagesequencingsinglecellsystemsbiologytimecoursetranscriptiontranscriptomicsopenblas
3.5 match 13.40 score 17k scripts 255 dependentsbioc
autonomics:Unified Statistical Modeling of Omics Data
This package unifies access to Statistal Modeling of Omics Data. Across linear modeling engines (lm, lme, lmer, limma, and wilcoxon). Across coding systems (treatment, difference, deviation, etc). Across model formulae (with/without intercept, random effect, interaction or nesting). Across omics platforms (microarray, rnaseq, msproteomics, affinity proteomics, metabolomics). Across projection methods (pca, pls, sma, lda, spls, opls). Across clustering methods (hclust, pam, cmeans). It provides a fast enrichment analysis implementation. And an intuitive contrastogram visualisation to summarize contrast effects in complex designs.
Maintained by Aditya Bhagwat. Last updated 2 months ago.
softwaredataimportpreprocessingdimensionreductionprincipalcomponentregressiondifferentialexpressiongenesetenrichmenttranscriptomicstranscriptiongeneexpressionrnaseqmicroarrayproteomicsmetabolomicsmassspectrometry
7.9 match 5.95 score 5 scriptsbioc
DuplexDiscovereR:Analysis of the data from RNA duplex probing experiments
DuplexDiscovereR is a package designed for analyzing data from RNA cross-linking and proximity ligation protocols such as SPLASH, PARIS, LIGR-seq, and others. DuplexDiscovereR accepts input in the form of chimerically or split-aligned reads. It includes procedures for alignment classification, filtering, and efficient clustering of individual chimeric reads into duplex groups (DGs). Once DGs are identified, the package predicts RNA duplex formation and their hybridization energies. Additional metrics, such as p-values for random ligation hypothesis or mean DG alignment scores, can be calculated to rank final set of RNA duplexes. Data from multiple experiments or replicates can be processed separately and further compared to check the reproducibility of the experimental method.
Maintained by Egor Semenchenko. Last updated 2 months ago.
sequencingtranscriptomicsstructuralpredictionclusteringsplicedalignment
10.2 match 1 stars 4.60 score 5 scriptsbioc
IRanges:Foundation of integer range manipulation in Bioconductor
Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.
Maintained by Hervรฉ Pagรจs. Last updated 1 months ago.
infrastructuredatarepresentationbioconductor-packagecore-package
3.0 match 22 stars 15.09 score 2.1k scripts 1.8k dependentspik-piam
magclass:Data Class and Tools for Handling Spatial-Temporal Data
Data class for increased interoperability working with spatial-temporal data together with corresponding functions and methods (conversions, basic calculations and basic data manipulation). The class distinguishes between spatial, temporal and other dimensions to facilitate the development and interoperability of tools build for it. Additional features are name-based addressing of data and internal consistency checks (e.g. checking for the right data order in calculations).
Maintained by Jan Philipp Dietrich. Last updated 13 days ago.
4.0 match 5 stars 11.16 score 412 scripts 56 dependentstidyverse
tidyr:Tidy Messy Data
Tools to help to create tidy data, where each column is a variable, each row is an observation, and each cell contains a single value. 'tidyr' contains tools for changing the shape (pivoting) and hierarchy (nesting and 'unnesting') of a dataset, turning deeply nested lists into rectangular data frames ('rectangling'), and extracting values out of string columns. It also includes tools for working with missing values (both implicit and explicit).
Maintained by Hadley Wickham. Last updated 16 days ago.
1.8 match 1.4k stars 22.88 score 168k scripts 5.5k dependentsmatthewheun
matsindf:Matrices in Data Frames
Provides functions to collapse a tidy data frame into matrices in a data frame and expand a data frame of matrices into a tidy data frame.
Maintained by Matthew Heun. Last updated 12 days ago.
7.4 match 5 stars 5.48 score 40 scriptsjsilve24
fido:Bayesian Multinomial Logistic Normal Regression
Provides methods for fitting and inspection of Bayesian Multinomial Logistic Normal Models using MAP estimation and Laplace Approximation as developed in Silverman et. Al. (2022) <https://www.jmlr.org/papers/v23/19-882.html>. Key functionality is implemented in C++ for scalability. 'fido' replaces the previous package 'stray'.
Maintained by Justin Silverman. Last updated 21 days ago.
4.9 match 20 stars 8.31 score 103 scriptsmetrumresearchgroup
mrgsolve:Simulate from ODE-Based Models
Fast simulation from ordinary differential equation (ODE) based models typically employed in quantitative pharmacology and systems biology.
Maintained by Kyle T Baron. Last updated 1 months ago.
3.7 match 138 stars 10.90 score 1.2k scripts 3 dependentsarg0naut91
neatRanges:Tidy Up Date/Time Ranges
Collapse, partition, combine, fill gaps in and expand date/time ranges.
Maintained by Aljaz Jelenko. Last updated 3 years ago.
collapsedate-rangeintervalspartitioningtimestamp-rangescpp
12.3 match 3 stars 3.18 score 2 scriptsdipterix
dipsaus:A Dipping Sauce for Data Analysis and Visualizations
Works as an "add-on" to packages like 'shiny', 'future', as well as 'rlang', and provides utility functions. Just like dipping sauce adding flavors to potato chips or pita bread, 'dipsaus' for data analysis and visualizations adds handy functions and enhancements to popular packages. The goal is to provide simple solutions that are frequently asked for online, such as how to synchronize 'shiny' inputs without freezing the app, or how to get memory size on 'Linux' or 'MacOS' system. The enhancements roughly fall into these four categories: 1. 'shiny' input widgets; 2. high-performance computing using the 'future' package; 3. modify R calls and convert among numbers, strings, and other objects. 4. utility functions to get system information such like CPU chip-set, memory limit, etc.
Maintained by Zhengjia Wang. Last updated 8 days ago.
4.8 match 13 stars 7.90 score 85 scripts 3 dependentsbusiness-science
tibbletime:Time Aware Tibbles
Built on top of the 'tibble' package, 'tibbletime' is an extension that allows for the creation of time aware tibbles. Some immediate advantages of this include: the ability to perform time-based subsetting on tibbles, quickly summarising and aggregating results by time periods, and creating columns that can be used as 'dplyr' time-based groups.
Maintained by Davis Vaughan. Last updated 4 months ago.
periodicitytibbletimetime-seriestimeseriescpp
3.6 match 177 stars 10.51 score 644 scripts 2 dependentsegenn
rtemis:Machine Learning and Visualization
Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.
Maintained by E.D. Gennatas. Last updated 1 months ago.
data-sciencedata-visualizationmachine-learningmachine-learning-libraryvisualization
5.2 match 145 stars 7.09 score 50 scripts 2 dependentslyzander
tableHTML:A Tool to Create HTML Tables
A tool to create and style HTML tables with CSS. These can be exported and used in any application that accepts HTML (e.g. 'shiny', 'rmarkdown', 'PowerPoint'). It also provides functions to create CSS files (which also work with shiny).
Maintained by Theo Boutaris. Last updated 2 years ago.
4.2 match 22 stars 8.80 score 280 scripts 1 dependentsbiogenies
tidysq:Tidy Processing and Analysis of Biological Sequences
A tidy approach to analysis of biological sequences. All processing and data-storage functions are heavily optimized to allow the fastest and most efficient data storage.
Maintained by Dominik Rafacz. Last updated 3 months ago.
bioconductorbioinformaticsbiological-sequencesfastas3sequencestibbletidytidyversevctrscpp
4.9 match 40 stars 7.56 score 38 scriptsebailey78
shinyBS:Extra Twitter Bootstrap Components for Shiny
Adds easy access to additional Twitter Bootstrap components to Shiny.
Maintained by Eric Bailey. Last updated 9 years ago.
3.0 match 183 stars 12.19 score 3.1k scripts 101 dependentsms609
TreeTools:Create, Modify and Analyse Phylogenetic Trees
Efficient implementations of functions for the creation, modification and analysis of phylogenetic trees. Applications include: generation of trees with specified shapes; tree rearrangement; analysis of tree shape; rooting of trees and extraction of subtrees; calculation and depiction of split support; plotting the position of rogue taxa (Klopfstein & Spasojevic 2019) <doi:10.1371/journal.pone.0212942>; calculation of ancestor-descendant relationships, of 'stemwardness' (Asher & Smith, 2022) <doi:10.1093/sysbio/syab072>, and of tree balance (Mir et al. 2013, Lemant et al. 2022) <doi:10.1016/j.mbs.2012.10.005>, <doi:10.1093/sysbio/syac027>; artificial extinction (Asher & Smith, 2022) <doi:10.1093/sysbio/syab072>; import and export of trees from Newick, Nexus (Maddison et al. 1997) <doi:10.1093/sysbio/46.4.590>, and TNT <https://www.lillo.org.ar/phylogeny/tnt/> formats; and analysis of splits and cladistic information.
Maintained by Martin R. Smith. Last updated 1 months ago.
evolutionary-biologyphylogenetic-treesphylogeneticscpp
3.7 match 21 stars 9.92 score 124 scripts 10 dependentsludvigolsen
groupdata2:Creating Groups from Data
Methods for dividing data into groups. Create balanced partitions and cross-validation folds. Perform time series windowing and general grouping and splitting of data. Balance existing groups with up- and downsampling or collapse them to fewer groups.
Maintained by Ludvig Renbo Olsen. Last updated 3 months ago.
balancecross-validationdatadata-framefoldgroup-factorgroupsparticipantspartitionsplitstaircase
4.0 match 27 stars 9.04 score 338 scripts 7 dependentsr-lib
cli:Helpers for Developing Command Line Interfaces
A suite of tools to build attractive command line interfaces ('CLIs'), from semantic elements: headings, lists, alerts, paragraphs, etc. Supports custom themes via a 'CSS'-like language. It also contains a number of lower level 'CLI' elements: rules, boxes, trees, and 'Unicode' symbols with 'ASCII' alternatives. It support ANSI colors and text styles as well.
Maintained by Gรกbor Csรกrdi. Last updated 2 days ago.
1.9 match 664 stars 19.34 score 1.4k scripts 14k dependentsbioc
extraChIPs:Additional functions for working with ChIP-Seq data
This package builds on existing tools and adds some simple but extremely useful capabilities for working wth ChIP-Seq data. The focus is on detecting differential binding windows/regions. One set of functions focusses on set-operations retaining mcols for GRanges objects, whilst another group of functions are to aid visualisation of results. Coercion to tibble objects is also implemented.
Maintained by Stevie Pederson. Last updated 19 days ago.
5.4 match 7 stars 6.67 score 25 scriptstidymodels
recipes:Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps provides preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.
Maintained by Max Kuhn. Last updated 1 days ago.
1.9 match 584 stars 18.73 score 7.2k scripts 382 dependentshaozhu233
kableExtra:Construct Complex Table with 'kable' and Pipe Syntax
Build complex HTML or 'LaTeX' tables using 'kable()' from 'knitr' and the piping syntax from 'magrittr'. Function 'kable()' is a light weight table generator coming from 'knitr'. This package simplifies the way to manipulate the HTML or 'LaTeX' codes generated by 'kable()' and allows users to construct complex tables and customize styles using a readable syntax.
Maintained by Hao Zhu. Last updated 13 days ago.
htmlkablekableextraknitrlatexrmarkdown
1.8 match 702 stars 19.35 score 55k scripts 163 dependentsbioc
QSutils:Quasispecies Diversity
Set of utility functions for viral quasispecies analysis with NGS data. Most functions are equally useful for metagenomic studies. There are three main types: (1) data manipulation and explorationโfunctions useful for converting reads to haplotypes and frequencies, repairing reads, intersecting strand haplotypes, and visualizing haplotype alignments. (2) diversity indicesโfunctions to compute diversity and entropy, in which incidence, abundance, and functional indices are considered. (3) data simulationโfunctions useful for generating random viral quasispecies data.
Maintained by Mercedes Guerrero-Murillo. Last updated 5 months ago.
softwaregeneticsdnaseqgeneticvariabilitysequencingalignmentsequencematchingdataimport
6.2 match 5.56 score 8 scripts 1 dependentstidymodels
embed:Extra Recipes for Encoding Predictors
Predictors can be converted to one or more numeric representations using a variety of methods. Effect encodings using simple generalized linear models <doi:10.48550/arXiv.1611.09477> or nonlinear models <doi:10.48550/arXiv.1604.06737> can be used. There are also functions for dimension reduction and other approaches.
Maintained by Emil Hvitfeldt. Last updated 2 months ago.
3.7 match 142 stars 9.35 score 1.1k scriptstomaskrehlik
frequencyConnectedness:Spectral Decomposition of Connectedness Measures
Accompanies a paper (Barunik, Krehlik (2018) <doi:10.1093/jjfinec/nby001>) dedicated to spectral decomposition of connectedness measures and their interpretation. We implement all the developed estimators as well as the historical counterparts. For more information, see the help or GitHub page (<https://github.com/tomaskrehlik/frequencyConnectedness>) for relevant information.
Maintained by Tomas Krehlik. Last updated 2 years ago.
5.8 match 100 stars 5.88 score 50 scripts 1 dependentsrstudio
bslib:Custom 'Bootstrap' 'Sass' Themes for 'shiny' and 'rmarkdown'
Simplifies custom 'CSS' styling of both 'shiny' and 'rmarkdown' via 'Bootstrap' 'Sass'. Supports 'Bootstrap' 3, 4 and 5 as well as their various 'Bootswatch' themes. An interactive widget is also provided for previewing themes in real time.
Maintained by Carson Sievert. Last updated 14 days ago.
bootstraphtmltoolsrmarkdownsassshiny
1.9 match 511 stars 18.02 score 5.1k scripts 4.3k dependentsskranz
RTutor:Interactive R problem sets with automatic testing of solutions and automatic hints
Interactive R problem sets with automatic testing of solutions and automatic hints
Maintained by Sebastian Kranz. Last updated 1 years ago.
economicslearn-to-codeproblem-setrstudiortutorshinyteaching
5.8 match 205 stars 5.83 score 111 scripts 1 dependentsbioc
sesame:SEnsible Step-wise Analysis of DNA MEthylation BeadChips
Tools For analyzing Illumina Infinium DNA methylation arrays. SeSAMe provides utilities to support analyses of multiple generations of Infinium DNA methylation BeadChips, including preprocessing, quality control, visualization and inference. SeSAMe features accurate detection calling, intelligent inference of ethnicity, sex and advanced quality control routines.
Maintained by Wanding Zhou. Last updated 2 months ago.
dnamethylationmethylationarraypreprocessingqualitycontrolbioinformaticsdna-methylationmicroarray
3.7 match 69 stars 9.08 score 258 scripts 1 dependentsr-spatial
stars:Spatiotemporal Arrays, Raster and Vector Data Cubes
Reading, manipulating, writing and plotting spatiotemporal arrays (raster and vector data cubes) in 'R', using 'GDAL' bindings provided by 'sf', and 'NetCDF' bindings by 'ncmeta' and 'RNetCDF'.
Maintained by Edzer Pebesma. Last updated 1 months ago.
1.8 match 571 stars 18.27 score 7.2k scripts 137 dependentsgforge
htmlTable:Advanced Tables for Markdown/HTML
Tables with state-of-the-art layout elements such as row spanners, column spanners, table spanners, zebra striping, and more. While allowing advanced layout, the underlying css-structure is simple in order to maximize compatibility with common word processors. The package also contains a few text formatting functions that help outputting text compatible with HTML/LaTeX.
Maintained by Max Gordon. Last updated 8 months ago.
2.0 match 79 stars 15.32 score 1.3k scripts 763 dependentsbioc
RCy3:Functions to Access and Control Cytoscape
Vizualize, analyze and explore networks using Cytoscape via R. Anything you can do using the graphical user interface of Cytoscape, you can now do with a single RCy3 function.
Maintained by Alex Pico. Last updated 5 months ago.
visualizationgraphandnetworkthirdpartyclientnetwork
2.3 match 52 stars 13.42 score 628 scripts 17 dependentsatorus-research
Tplyr:A Traceability Focused Grammar of Clinical Data Summary
A traceability focused tool created to simplify the data manipulation necessary to create clinical summaries.
Maintained by Mike Stackhouse. Last updated 1 years ago.
3.1 match 95 stars 9.49 score 138 scripts 2 dependentssetempler
miscset:Miscellaneous Tools Set
A collection of miscellaneous methods to simplify various tasks, including plotting, data.frame and matrix transformations, environment functions, regular expression methods, and string and logical operations, as well as numerical and statistical tools. Most of the methods are simple but useful wrappers of common base R functions, which extend S3 generics or provide default values for important parameters.
Maintained by Sven E. Templer. Last updated 8 years ago.
6.8 match 1 stars 4.40 score 50 scriptsbioc
iSEE:Interactive SummarizedExperiment Explorer
Create an interactive Shiny-based graphical user interface for exploring data stored in SummarizedExperiment objects, including row- and column-level metadata. The interface supports transmission of selections between plots and tables, code tracking, interactive tours, interactive or programmatic initialization, preservation of app state, and extensibility to new panel types via S4 classes. Special attention is given to single-cell data in a SingleCellExperiment object with visualization of dimensionality reduction results.
Maintained by Kevin Rue-Albrecht. Last updated 13 days ago.
cellbasedassaysclusteringdimensionreductionfeatureextractiongeneexpressionguiimmunooncologyshinyappssinglecelltranscriptiontranscriptomicsvisualizationdimension-reductionfeature-extractiongene-expressionhacktoberfesthuman-cell-atlasshinysingle-cell
2.3 match 225 stars 12.86 score 380 scripts 9 dependentsdgkf
ggpackets:Package Plot Layers for Easier Portability and Modularization
Create groups of 'ggplot2' layers that can be easily migrated from one plot to another, reducing redundant code and improving the ability to format many plots that draw from the same source 'ggpacket' layers.
Maintained by Doug Kelkhoff. Last updated 17 days ago.
3.9 match 69 stars 7.34 score 12 scripts 1 dependentsnutterb
pixiedust:Tables so Beautifully Fine-Tuned You Will Believe It's Magic
The introduction of the 'broom' package has made converting model objects into data frames as simple as a single function. While the 'broom' package focuses on providing tidy data frames that can be used in advanced analysis, it deliberately stops short of providing functionality for reporting models in publication-ready tables. 'pixiedust' provides this functionality with a programming interface intended to be similar to 'ggplot2's system of layers with fine tuned control over each cell of the table. Options for output include printing to the console and to the common markdown formats (markdown, HTML, and LaTeX). With a little 'pixiedust' (and happy thoughts) tables can really fly.
Maintained by Benjamin Nutter. Last updated 1 years ago.
3.5 match 180 stars 8.01 score 94 scriptsmlr-org
mlr3pipelines:Preprocessing Operators and Pipelines for 'mlr3'
Dataflow programming toolkit that enriches 'mlr3' with a diverse set of pipelining operators ('PipeOps') that can be composed into graphs. Operations exist for data preprocessing, model fitting, and ensemble learning. Graphs can themselves be treated as 'mlr3' 'Learners' and can therefore be resampled, benchmarked, and tuned.
Maintained by Martin Binder. Last updated 12 days ago.
baggingdata-sciencedataflow-programmingensemble-learningmachine-learningmlr3pipelinespreprocessingstacking
2.3 match 141 stars 12.36 score 448 scripts 7 dependentslrberge
stringmagic:Character String Operations and Interpolation, Magic Edition
Performs complex string operations compactly and efficiently. Supports string interpolation jointly with over 50 string operations. Also enhances regular string functions (like grep() and co). See an introduction at <https://lrberge.github.io/stringmagic/>.
Maintained by Laurent R Berge. Last updated 7 months ago.
2.6 match 15 stars 10.56 score 37 scripts 33 dependentsatlasoflivingaustralia
galah:Biodiversity Data from the GBIF Node Network
The Global Biodiversity Information Facility ('GBIF', <https://www.gbif.org>) sources data from an international network of data providers, known as 'nodes'. Several of these nodes - the "living atlases" (<https://living-atlases.gbif.org>) - maintain their own web services using software originally developed by the Atlas of Living Australia ('ALA', <https://www.ala.org.au>). 'galah' enables the R community to directly access data and resources hosted by 'GBIF' and its partner nodes.
Maintained by Martin Westgate. Last updated 1 months ago.
3.0 match 43 stars 9.17 score 275 scripts 1 dependentsdatastorm-open
visNetwork:Network Visualization using 'vis.js' Library
Provides an R interface to the 'vis.js' JavaScript charting library. It allows an interactive visualization of networks.
Maintained by Benoit Thieurmel. Last updated 2 years ago.
1.8 match 549 stars 15.14 score 4.1k scripts 195 dependentsstrengejacke
ggeffects:Create Tidy Data Frames of Marginal Effects for 'ggplot' from Model Outputs
Compute marginal effects and adjusted predictions from statistical models and returns the result as tidy data frames. These data frames are ready to use with the 'ggplot2'-package. Effects and predictions can be calculated for many different models. Interaction terms, splines and polynomial terms are also supported. The main functions are ggpredict(), ggemmeans() and ggeffect(). There is a generic plot()-method to plot the results using 'ggplot2'.
Maintained by Daniel Lรผdecke. Last updated 8 days ago.
estimated-marginal-meanshacktoberfestmarginal-effectsprediction
1.8 match 588 stars 15.55 score 3.6k scripts 7 dependentsphilchalmers
mirt:Multidimensional Item Response Theory
Analysis of discrete response data using unidimensional and multidimensional item analysis models under the Item Response Theory paradigm (Chalmers (2012) <doi:10.18637/jss.v048.i06>). Exploratory and confirmatory item factor analysis models are estimated with quadrature (EM) or stochastic (MHRM) methods. Confirmatory bi-factor and two-tier models are available for modeling item testlets using dimension reduction EM algorithms, while multiple group analyses and mixed effects designs are included for detecting differential item, bundle, and test functioning, and for modeling item and person covariates. Finally, latent class models such as the DINA, DINO, multidimensional latent class, mixture IRT models, and zero-inflated response models are supported, as well as a wide family of probabilistic unfolding models.
Maintained by Phil Chalmers. Last updated 14 days ago.
1.8 match 210 stars 14.98 score 2.5k scripts 40 dependentsdipterix
ravetools:Signal and Image Processing Toolbox for Analyzing Intracranial Electroencephalography Data
Implemented fast and memory-efficient Notch-filter, Welch-periodogram, discrete wavelet spectrogram for minutes of high-resolution signals, fast 3D convolution, image registration, 3D mesh manipulation; providing fundamental toolbox for intracranial Electroencephalography (iEEG) pipelines. Documentation and examples about 'RAVE' project are provided at <https://rave.wiki>, and the paper by John F. Magnotti, Zhengjia Wang, Michael S. Beauchamp (2020) <doi:10.1016/j.neuroimage.2020.117341>; see 'citation("ravetools")' for details.
Maintained by Zhengjia Wang. Last updated 7 days ago.
5.3 match 3 stars 5.13 score 20 scripts 1 dependentsjmbarbone
fuj:Functions and Utilities for Jordan
Provides core functions and utilities for packages and other code developed by Jordan Mark Barbone.
Maintained by Jordan Mark Barbone. Last updated 10 days ago.
6.0 match 2 stars 4.48 score 8 scripts 1 dependentsbtskinner
crosswalkr:Rename and Encode Data Frames Using External Crosswalk Files
A pair of functions for renaming and encoding data frames using external crosswalk files. It is especially useful when constructing master data sets from multiple smaller data sets that do not name or encode variables consistently across files. Based on similar commands in 'Stata'.
Maintained by Benjamin Skinner. Last updated 1 years ago.
5.0 match 9 stars 5.26 score 20 scriptsbioc
Gviz:Plotting data and annotation information along genomic coordinates
Genomic data analyses requires integrated visualization of known genomic information and new experimental data. Gviz uses the biomaRt and the rtracklayer packages to perform live annotation queries to Ensembl and UCSC and translates this to e.g. gene/transcript structures in viewports of the grid graphics package. This results in genomic information plotted together with your data.
Maintained by Robert Ivanek. Last updated 5 months ago.
visualizationmicroarraysequencing
2.0 match 79 stars 13.08 score 1.4k scripts 48 dependentsnicchr
fastplyr:Fast Alternatives to 'tidyverse' Functions
A full set of fast data manipulation tools with a tidy front-end and a fast back-end using 'collapse' and 'cheapr'.
Maintained by Nick Christofides. Last updated 27 days ago.
4.1 match 23 stars 6.32 score 36 scripts 1 dependentsbioc
scde:Single Cell Differential Expression
The scde package implements a set of statistical methods for analyzing single-cell RNA-seq data. scde fits individual error models for single-cell RNA-seq measurements. These models can then be used for assessment of differential expression between groups of cells, as well as other types of analysis. The scde package also contains the pagoda framework which applies pathway and gene set overdispersion analysis to identify and characterize putative cell subpopulations based on transcriptional signatures. The overall approach to the differential expression analysis is detailed in the following publication: "Bayesian approach to single-cell differential expression analysis" (Kharchenko PV, Silberstein L, Scadden DT, Nature Methods, doi: 10.1038/nmeth.2967). The overall approach to subpopulation identification and characterization is detailed in the following pre-print: "Characterizing transcriptional heterogeneity through pathway and gene set overdispersion analysis" (Fan J, Salathia N, Liu R, Kaeser G, Yung Y, Herman J, Kaper F, Fan JB, Zhang K, Chun J, and Kharchenko PV, Nature Methods, doi:10.1038/nmeth.3734).
Maintained by Evan Biederstedt. Last updated 5 months ago.
immunooncologyrnaseqstatisticalmethoddifferentialexpressionbayesiantranscriptionsoftwareanalysisbioinformaticsheterogenityngssingle-celltranscriptomicsopenblascppopenmp
3.5 match 173 stars 7.53 score 141 scriptsbioc
SingleMoleculeFootprinting:Analysis tools for Single Molecule Footprinting (SMF) data
SingleMoleculeFootprinting provides functions to analyze Single Molecule Footprinting (SMF) data. Following the workflow exemplified in its vignette, the user will be able to perform basic data analysis of SMF data with minimal coding effort. Starting from an aligned bam file, we show how to perform quality controls over sequencing libraries, extract methylation information at the single molecule level accounting for the two possible kind of SMF experiments (single enzyme or double enzyme), classify single molecules based on their patterns of molecular occupancy, plot SMF information at a given genomic location.
Maintained by Guido Barzaghi. Last updated 1 months ago.
dnamethylationcoveragenucleosomepositioningdatarepresentationepigeneticsmethylseqqualitycontrolsequencing
4.0 match 2 stars 6.43 score 27 scriptstagteam
riskRegression:Risk Regression Models and Prediction Scores for Survival Analysis with Competing Risks
Implementation of the following methods for event history analysis. Risk regression models for survival endpoints also in the presence of competing risks are fitted using binomial regression based on a time sequence of binary event status variables. A formula interface for the Fine-Gray regression model and an interface for the combination of cause-specific Cox regression models. A toolbox for assessing and comparing performance of risk predictions (risk markers and risk prediction models). Prediction performance is measured by the Brier score and the area under the ROC curve for binary possibly time-dependent outcome. Inverse probability of censoring weighting and pseudo values are used to deal with right censored data. Lists of risk markers and lists of risk models are assessed simultaneously. Cross-validation repeatedly splits the data, trains the risk prediction models on one part of each split and then summarizes and compares the performance across splits.
Maintained by Thomas Alexander Gerds. Last updated 20 days ago.
2.0 match 46 stars 13.00 score 736 scripts 35 dependentsedgarsantos-fernandez
SSNbayes:Bayesian Spatio-Temporal Analysis in Stream Networks
Fits Bayesian spatio-temporal models and makes predictions on stream networks using the approach by Santos-Fernandez, Edgar, et al. (2022)."Bayesian spatio-temporal models for stream networks" and Santos-Fernandez, Edgar, et al. (2023). "SSNbayes: An R Package for Bayesian Spatio-Temporal Modelling on Stream Networks". In these models, spatial dependence is captured using stream distance and flow connectivity, while temporal autocorrelation is modelled using vector autoregression methods.
Maintained by Edgar Santos-Fernandez. Last updated 2 months ago.
4.8 match 17 stars 5.41 score 6 scriptsrdoctaskforce
pkgcond:Classed Error and Warning Conditions
This provides utilities for creating classed error and warning conditions based on where the error originated.
Maintained by Andrew Redd. Last updated 4 years ago.
5.0 match 5 stars 5.19 score 41 scripts 5 dependentsbioc
methylSig:MethylSig: Differential Methylation Testing for WGBS and RRBS Data
MethylSig is a package for testing for differentially methylated cytosines (DMCs) or regions (DMRs) in whole-genome bisulfite sequencing (WGBS) or reduced representation bisulfite sequencing (RRBS) experiments. MethylSig uses a beta binomial model to test for significant differences between groups of samples. Several options exist for either site-specific or sliding window tests, and variance estimation.
Maintained by Raymond G. Cavalcante. Last updated 5 months ago.
dnamethylationdifferentialmethylationepigeneticsregressionmethylseqdifferential-methylationdna-methylation
3.5 match 17 stars 7.37 score 23 scriptschoi-phd
lordif:Logistic Ordinal Regression Differential Item Functioning using IRT
Performs analysis of Differential Item Functioning (DIF) for dichotomous and polytomous items using an iterative hybrid of ordinal logistic regression and item response theory (IRT) according to Choi, Gibbons, and Crane (2011) <doi:10.18637/jss.v039.i08>.
Maintained by Seung W. Choi. Last updated 2 months ago.
5.0 match 1 stars 5.12 score 35 scripts 1 dependentsskyebend
networkDynamic:Dynamic Extensions for Network Objects
Simple interface routines to facilitate the handling of network objects with complex intertemporal data. This is a part of the "statnet" suite of packages for network analysis.
Maintained by Skye Bender-deMoll. Last updated 4 months ago.
3.9 match 3 stars 6.47 score 132 scripts 11 dependentspolar-fhir
fhircrackr:Handling HL7 FHIRยฎ Resources in R
Useful tools for conveniently downloading FHIR resources in xml format and converting them to R data.frames. The package uses FHIR-search to download bundles from a FHIR server, provides functions to save and read xml-files containing such bundles and allows flattening the bundles to data.frames using XPath expressions. FHIRยฎ is the registered trademark of HL7 and is used with the permission of HL7. Use of the FHIR trademark does not constitute endorsement of this product by HL7.
Maintained by Julia Palm. Last updated 14 days ago.
3.3 match 33 stars 7.63 score 46 scriptstidyverts
tsibble:Tidy Temporal Data Frames and Tools
Provides a 'tbl_ts' class (the 'tsibble') for temporal data in an data- and model-oriented format. The 'tsibble' provides tools to easily manipulate and analyse temporal data, such as filling in time gaps and aggregating over calendar periods.
Maintained by Earo Wang. Last updated 2 months ago.
1.8 match 538 stars 14.47 score 4.4k scripts 42 dependentsbioc
microbiome:Microbiome Analytics
Utilities for microbiome analysis.
Maintained by Leo Lahti. Last updated 5 months ago.
metagenomicsmicrobiomesequencingsystemsbiologyhitchiphitchip-atlashuman-microbiomemicrobiologymicrobiome-analysisphyloseqpopulation-study
2.0 match 293 stars 12.51 score 2.0k scripts 5 dependentstidyverse
dbplyr:A 'dplyr' Back End for Databases
A 'dplyr' back end for databases that allows you to work with remote database tables as if they are in-memory data frames. Basic features works with any database that has a 'DBI' back end; more advanced features require 'SQL' translation to be provided by the package author.
Maintained by Hadley Wickham. Last updated 4 months ago.
1.3 match 481 stars 19.72 score 5.2k scripts 736 dependentschristophergandrud
networkD3:D3 JavaScript Network Graphs from R
Creates 'D3' 'JavaScript' network, tree, dendrogram, and Sankey graphs from 'R'.
Maintained by Christopher Gandrud. Last updated 6 years ago.
1.8 match 654 stars 13.55 score 3.4k scripts 31 dependentsr-hyperspec
hyperSpec:Work with Hyperspectral Data, i.e. Spectra + Meta Information (Spatial, Time, Concentration, ...)
Comfortable ways to work with hyperspectral data sets, i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each of the spectra. The spectra can be data as obtained in XRF, UV/VIS, Fluorescence, AES, NIR, IR, Raman, NMR, MS, etc. More generally, any data that is recorded over a discretized variable, e.g. absorbance = f(wavelength), stored as a vector of absorbance values for discrete wavelengths is suitable.
Maintained by Claudia Beleites. Last updated 10 months ago.
data-wranglinghyperspectralimaginginfrarednmrramanspectroscopyuv-visxrf
3.0 match 16 stars 8.13 score 233 scripts 2 dependentsijlyttle
bsplus:Adds Functionality to the R Markdown + Shiny Bootstrap Framework
The Bootstrap framework lets you add some JavaScript functionality to your web site by adding attributes to your HTML tags - Bootstrap takes care of the JavaScript <https://getbootstrap.com/docs/3.3/javascript/>. If you are using R Markdown or Shiny, you can use these functions to create collapsible sections, accordion panels, modals, tooltips, popovers, and an accordion sidebar framework (not described at Bootstrap site). Please note this package was designed for Bootstrap 3.3.
Maintained by Ian Lyttle. Last updated 2 years ago.
2.8 match 147 stars 8.80 score 295 scripts 15 dependentslz100
spsComps:'systemPipeShiny' UI and Server Components
The systemPipeShiny (SPS) framework comes with many UI and server components. However, installing the whole framework is heavy and takes some time. If you would like to use UI and server components from SPS in your own Shiny apps, do not hesitate to try this package.
Maintained by Le Zhang. Last updated 1 years ago.
3.7 match 31 stars 6.56 score 65 scripts 4 dependentshneth
ds4psy:Data Science for Psychologists
All datasets and functions required for the examples and exercises of the book "Data Science for Psychologists" (by Hansjoerg Neth, Konstanz University, 2023), freely available at <https://bookdown.org/hneth/ds4psy/>. The book and course introduce principles and methods of data science to students of psychology and other biological or social sciences. The 'ds4psy' package primarily provides datasets, but also functions for data generation and manipulation (e.g., of text and time data) and graphics that are used in the book and its exercises. All functions included in 'ds4psy' are designed to be explicit and instructive, rather than efficient or elegant.
Maintained by Hansjoerg Neth. Last updated 1 months ago.
data-literacydata-scienceeducationexploratory-data-analysispsychologysocial-sciencesvisualisation
3.5 match 22 stars 6.79 score 70 scriptspeekxc
simplextree:Provides Tools for Working with General Simplicial Complexes
Provides an interface to a Simplex Tree data structure, which is a data structure aimed at enabling efficient manipulation of simplicial complexes of any dimension. The Simplex Tree data structure was originally introduced by Jean-Daniel Boissonnat and Clรฉment Maria (2014) <doi:10.1007/s00453-014-9887-3>.
Maintained by Matt Piekenbrock. Last updated 1 years ago.
rcppsimplicial-complextopological-data-analysistopologycpp
5.3 match 15 stars 4.56 score 16 scripts 1 dependentsbioc
reconsi:Resampling Collapsed Null Distributions for Simultaneous Inference
Improves simultaneous inference under dependence of tests by estimating a collapsed null distribution through resampling. Accounting for the dependence between tests increases the power while reducing the variability of the false discovery proportion. This dependence is common in genomics applications, e.g. when combining flow cytometry measurements with microbiome sequence counts.
Maintained by Stijn Hawinkel. Last updated 5 months ago.
metagenomicsmicrobiomemultiplecomparisonflowcytometry
5.1 match 2 stars 4.60 score 2 scriptsdatasketch
shinypanels:Shiny Layout with Collapsible Panels
Create 'Shiny Apps' with collapsible vertical panels. This package provides a new visual arrangement for elements on top of 'Shiny'. Use the expand and collapse capabilities to leverage web applications with many elements to focus the user attention on the panel of interest.
Maintained by Juan Pablo Marin Diaz. Last updated 9 months ago.
3.9 match 80 stars 6.01 score 43 scriptsphilchalmers
SimDesign:Structure for Organizing Monte Carlo Simulation Designs
Provides tools to safely and efficiently organize and execute Monte Carlo simulation experiments in R. The package controls the structure and back-end of Monte Carlo simulation experiments by utilizing a generate-analyse-summarise workflow. The workflow safeguards against common simulation coding issues, such as automatically re-simulating non-convergent results, prevents inadvertently overwriting simulation files, catches error and warning messages during execution, implicitly supports parallel processing with high-quality random number generation, and provides tools for managing high-performance computing (HPC) array jobs submitted to schedulers such as SLURM. For a pedagogical introduction to the package see Sigal and Chalmers (2016) <doi:10.1080/10691898.2016.1246953>. For a more in-depth overview of the package and its design philosophy see Chalmers and Adkins (2020) <doi:10.20982/tqmp.16.4.p248>.
Maintained by Phil Chalmers. Last updated 2 days ago.
monte-carlo-simulationsimulationsimulation-framework
1.8 match 62 stars 13.38 score 253 scripts 46 dependentsmlr-org
mlr3misc:Helper Functions for 'mlr3'
Frequently used helper functions and assertions used in 'mlr3' and its companion packages. Comes with helper functions for functional programming, for printing, to work with 'data.table', as well as some generally useful 'R6' classes. This package also supersedes the package 'BBmisc'.
Maintained by Marc Becker. Last updated 4 months ago.
machine-learningmiscellaneousmlr3
2.3 match 12 stars 10.28 score 302 scripts 42 dependentswrathematics
kazaam:Tools for Tall Distributed Matrices
Many data science problems reduce to operations on very tall, skinny matrices. However, sometimes these matrices can be so tall that they are difficult to work with, or do not even fit into main memory. One strategy to deal with such objects is to distribute their rows across several processors. To this end, we offer an 'S4' class for tall, skinny, distributed matrices, called the 'shaq'. We also provide many useful numerical methods and statistics operations for operating on these distributed objects. The naming is a bit "tongue-in-cheek", with the class a play on the fact that 'Shaquille' 'ONeal' ('Shaq') is very tall, and he starred in the film 'Kazaam'.
Maintained by Drew Schmidt. Last updated 8 years ago.
6.0 match 3.82 score 133 scriptsmbq
vistla:Detecting Influence Paths with Information Theory
Traces information spread through interactions between features, utilising information theory measures and a higher-order generalisation of the concept of widest paths in graphs. In particular, 'vistla' can be used to better understand the results of high-throughput biomedical experiments, by organising the effects of the investigated intervention in a tree-like hierarchy from direct to indirect ones, following the plausible information relay circuits. Due to its higher-order nature, 'vistla' can handle multi-modality and assign multiple roles to a single feature.
Maintained by Miron B. Kursa. Last updated 28 days ago.
4.8 match 4.78 score 3 scriptsrmgpanw
gtexr:Query the GTEx Portal API
A convenient R interface to the Genotype-Tissue Expression (GTEx) Portal API. For more information on the API, see <https://gtexportal.org/api/v2/redoc>.
Maintained by Alasdair Warwick. Last updated 6 months ago.
api-wrapperbioinformaticseqtlgtexsqtl
3.5 match 5 stars 6.41 score 5 scriptscran
shiny.blueprint:Palantir's 'Blueprint' for 'Shiny' Apps
Easily use 'Blueprint', the popular 'React' library from Palantir, in your 'Shiny' app. 'Blueprint' provides a rich set of UI components for creating visually appealing applications and is optimized for building complex, data-dense web interfaces. This package provides most components from the underlying library, as well as special wrappers for some components to make it easy to use them in 'R' without writing 'JavaScript' code.
Maintained by Jakub Sobolewski. Last updated 10 months ago.
6.0 match 3.70 scorebioc
minfi:Analyze Illumina Infinium DNA methylation arrays
Tools to analyze & visualize Illumina Infinium methylation arrays.
Maintained by Kasper Daniel Hansen. Last updated 4 months ago.
immunooncologydnamethylationdifferentialmethylationepigeneticsmicroarraymethylationarraymultichanneltwochanneldataimportnormalizationpreprocessingqualitycontrol
1.7 match 60 stars 12.83 score 996 scripts 26 dependentsbioc
maftools:Summarize, Analyze and Visualize MAF Files
Analyze and visualize Mutation Annotation Format (MAF) files from large scale sequencing studies. This package provides various functions to perform most commonly used analyses in cancer genomics and to create feature rich customizable visualzations with minimal effort.
Maintained by Anand Mayakonda. Last updated 5 months ago.
datarepresentationdnaseqvisualizationdrivermutationvariantannotationfeatureextractionclassificationsomaticmutationsequencingfunctionalgenomicssurvivalbioinformaticscancer-genome-atlascancer-genomicsgenomicsmaf-filestcgacurlbzip2xz-utilszlib
1.5 match 459 stars 14.63 score 948 scripts 18 dependentsouhscbbmc
REDCapR:Interaction Between R and REDCap
Encapsulates functions to streamline calls from R to the REDCap API. REDCap (Research Electronic Data CAPture) is a web application for building and managing online surveys and databases developed at Vanderbilt University. The Application Programming Interface (API) offers an avenue to access and modify data programmatically, improving the capacity for literate and reproducible programming.
Maintained by Will Beasley. Last updated 2 months ago.
1.8 match 118 stars 12.36 score 438 scripts 6 dependentsinsightsengineering
tern:Create Common TLGs Used in Clinical Trials
Table, Listings, and Graphs (TLG) library for common outputs used in clinical trials.
Maintained by Joe Zhu. Last updated 2 months ago.
clinical-trialsgraphslistingsnestoutputstables
1.7 match 83 stars 12.50 score 186 scripts 9 dependentsbayesball
LearnBayes:Learning Bayesian Inference
Contains functions for summarizing basic one and two parameter posterior distributions and predictive distributions. It contains MCMC algorithms for summarizing posterior distributions defined by the user. It also contains functions for regression models, hierarchical models, Bayesian tests, and illustrations of Gibbs sampling.
Maintained by Jim Albert. Last updated 7 years ago.
1.9 match 38 stars 11.34 score 690 scripts 31 dependentsbioc
hopach:Hierarchical Ordered Partitioning and Collapsing Hybrid (HOPACH)
The HOPACH clustering algorithm builds a hierarchical tree of clusters by recursively partitioning a data set, while ordering and possibly collapsing clusters at each level. The algorithm uses the Mean/Median Split Silhouette (MSS) criteria to identify the level of the tree with maximally homogeneous clusters. It also runs the tree down to produce a final ordered list of the elements. The non-parametric bootstrap allows one to estimate the probability that each element belongs to each cluster (fuzzy clustering).
Maintained by Katherine S. Pollard. Last updated 5 months ago.
3.4 match 6.05 score 54 scripts 5 dependentsbnosac
udpipe:Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit
This natural language processing toolkit provides language-agnostic 'tokenization', 'parts of speech tagging', 'lemmatization' and 'dependency parsing' of raw text. Next to text parsing, the package also allows you to train annotation models based on data of 'treebanks' in 'CoNLL-U' format as provided at <https://universaldependencies.org/format.html>. The techniques are explained in detail in the paper: 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at <doi:10.18653/v1/K17-3009>. The toolkit also contains functionalities for commonly used data manipulations on texts which are enriched with the output of the parser. Namely functionalities and algorithms for collocations, token co-occurrence, document term matrix handling, term frequency inverse document frequency calculations, information retrieval metrics (Okapi BM25), handling of multi-word expressions, keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns) sentiment scoring and semantic similarity analysis.
Maintained by Jan Wijffels. Last updated 2 years ago.
conlldependency-parserlemmatizationnatural-language-processingnlppos-taggingr-pkgrcpptext-miningtokenizerudpipecpp
1.8 match 215 stars 11.83 score 1.2k scripts 9 dependentstdhock
nc:Named Capture to Data Tables
User-friendly functions for extracting a data table (row for each match, column for each group) from non-tabular text data using regular expressions, and for melting columns that match a regular expression. Patterns are defined using a readable syntax that makes it easy to build complex patterns in terms of simpler, re-usable sub-patterns. Named R arguments are translated to column names in the output; capture groups without names are used internally in order to provide a standard interface to three regular expression 'C' libraries ('PCRE', 'RE2', 'ICU'). Output can also include numeric columns via user-specified type conversion functions.
Maintained by Toby Hocking. Last updated 2 months ago.
3.0 match 16 stars 6.85 score 46 scriptsmarkfairbanks
tidytable:Tidy Interface to 'data.table'
A tidy interface to 'data.table', giving users the speed of 'data.table' while using tidyverse-like syntax.
Maintained by Mark Fairbanks. Last updated 2 months ago.
1.8 match 458 stars 11.41 score 732 scripts 10 dependentsncss-tech
aqp:Algorithms for Quantitative Pedology
The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information; freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offer a convenient platform for bridging the gap between pedometric theory and practice.
Maintained by Dylan Beaudette. Last updated 1 months ago.
digital-soil-mappingncss-technrcspedologypedometricssoilsoil-surveyusda
1.7 match 55 stars 11.90 score 1.2k scripts 2 dependentsmodeloriented
shapviz:SHAP Visualizations
Visualizations for SHAP (SHapley Additive exPlanations), such as waterfall plots, force plots, various types of importance plots, dependence plots, and interaction plots. These plots act on a 'shapviz' object created from a matrix of SHAP values and a corresponding feature dataset. Wrappers for the R packages 'xgboost', 'lightgbm', 'fastshap', 'shapr', 'h2o', 'treeshap', 'DALEX', and 'kernelshap' are added for convenience. By separating visualization and computation, it is possible to display factor variables in graphs, even if the SHAP values are calculated by a model that requires numerical features. The plots are inspired by those provided by the 'shap' package in Python, but there is no dependency on it.
Maintained by Michael Mayer. Last updated 2 months ago.
explainable-aimachine-learningshapshapley-valuevisualizationxai
2.0 match 89 stars 9.95 score 250 scriptsjohanngb
ruv:Detect and Remove Unwanted Variation using Negative Controls
Implements the 'RUV' (Remove Unwanted Variation) algorithms. These algorithms attempt to adjust for systematic errors of unknown origin in high-dimensional data. The algorithms were originally developed for use with genomic data, especially microarray data, but may be useful with other types of high-dimensional data as well. These algorithms were proposed in Gagnon-Bartsch and Speed (2012) <doi:10.1093/nar/gkz433>, Gagnon-Bartsch, Jacob and Speed (2013), and Molania, et. al. (2019) <doi:10.1093/nar/gkz433>. The algorithms require the user to specify a set of negative control variables, as described in the references. The algorithms included in this package are 'RUV-2', 'RUV-4', 'RUV-inv', 'RUV-rinv', 'RUV-I', and RUV-III', along with various supporting algorithms.
Maintained by Johann Gagnon-Bartsch. Last updated 6 years ago.
4.5 match 2 stars 4.36 score 94 scripts 7 dependentsbioc
CoreGx:Classes and Functions to Serve as the Basis for Other 'Gx' Packages
A collection of functions and classes which serve as the foundation for our lab's suite of R packages, such as 'PharmacoGx' and 'RadioGx'. This package was created to abstract shared functionality from other lab package releases to increase ease of maintainability and reduce code repetition in current and future 'Gx' suite programs. Major features include a 'CoreSet' class, from which 'RadioSet' and 'PharmacoSet' are derived, along with get and set methods for each respective slot. Additional functions related to fitting and plotting dose response curves, quantifying statistical correlation and calculating area under the curve (AUC) or survival fraction (SF) are included. For more details please see the included documentation, as well as: Smirnov, P., Safikhani, Z., El-Hachem, N., Wang, D., She, A., Olsen, C., Freeman, M., Selby, H., Gendoo, D., Grossman, P., Beck, A., Aerts, H., Lupien, M., Goldenberg, A. (2015) <doi:10.1093/bioinformatics/btv723>. Manem, V., Labie, M., Smirnov, P., Kofia, V., Freeman, M., Koritzinksy, M., Abazeed, M., Haibe-Kains, B., Bratman, S. (2018) <doi:10.1101/449793>.
Maintained by Benjamin Haibe-Kains. Last updated 5 months ago.
softwarepharmacogenomicsclassificationsurvival
3.0 match 6.53 score 63 scripts 6 dependentsgforge
Gmisc:Descriptive Statistics, Transition Plots, and More
Tools for making the descriptive "Table 1" used in medical articles, a transition plot for showing changes between categories (also known as a Sankey diagram), flow charts by extending the grid package, a method for variable selection based on the SVD, Bรฉzier lines with arrows complementing the ones in the 'grid' package, and more.
Maintained by Max Gordon. Last updated 2 years ago.
1.9 match 50 stars 10.40 score 233 scripts 2 dependentsdivdyn
divDyn:Diversity Dynamics using Fossil Sampling Data
Functions to describe sampling and diversity dynamics of fossil occurrence datasets (e.g. from the Paleobiology Database). The package includes methods to calculate range- and occurrence-based metrics of taxonomic richness, extinction and origination rates, along with traditional sampling measures. A powerful subsampling tool is also included that implements frequently used sampling standardization methods in a multiple bin-framework. The plotting of time series and the occurrence data can be simplified by the functions incorporated in the package, as well as other calculations, such as environmental affinities and extinction selectivity testing. Details can be found in: Kocsis, A.T.; Reddin, C.J.; Alroy, J. and Kiessling, W. (2019) <doi:10.1101/423780>.
Maintained by Adam T. Kocsis. Last updated 4 months ago.
diversityextinctionfossil-dataoccurrencesoriginationpaleobiologycpp
3.0 match 11 stars 6.48 score 137 scriptsfunwithr
MonteCarlo:Automatic Parallelized Monte Carlo Simulations
Simplifies Monte Carlo simulation studies by automatically setting up loops to run over parameter grids and parallelising the Monte Carlo repetitions. It also generates LaTeX tables.
Maintained by Christian Hendrik Leschinski. Last updated 6 years ago.
3.2 match 34 stars 6.09 score 36 scriptssaviviro
sstvars:Toolkit for Reduced Form and Structural Smooth Transition Vector Autoregressive Models
Penalized and non-penalized maximum likelihood estimation of smooth transition vector autoregressive models with various types of transition weight functions, conditional distributions, and identification methods. Constrained estimation with various types of constraints is available. Residual based model diagnostics, forecasting, simulations, and calculation of impulse response functions, generalized impulse response functions, and generalized forecast error variance decompositions. See Heather Anderson, Farshid Vahid (1998) <doi:10.1016/S0304-4076(97)00076-6>, Helmut Lรผtkepohl, Aleksei Netลกunajev (2017) <doi:10.1016/j.jedc.2017.09.001>, Markku Lanne, Savi Virolainen (2025) <doi:10.48550/arXiv.2403.14216>, Savi Virolainen (2025) <doi:10.48550/arXiv.2404.19707>.
Maintained by Savi Virolainen. Last updated 19 days ago.
3.0 match 4 stars 6.36 score 41 scriptsropensci
rdhs:API Client and Dataset Management for the Demographic and Health Survey (DHS) Data
Provides a client for (1) querying the DHS API for survey indicators and metadata (<https://api.dhsprogram.com/#/index.html>), (2) identifying surveys and datasets for analysis, (3) downloading survey datasets from the DHS website, (4) loading datasets and associate metadata into R, and (5) extracting variables and combining datasets for pooled analysis.
Maintained by OJ Watson. Last updated 20 days ago.
datasetdhsdhs-apiextractpeer-reviewedsurvey-data
1.9 match 35 stars 10.07 score 286 scripts 3 dependentsjknowles
merTools:Tools for Analyzing Mixed Effect Regression Models
Provides methods for extracting results from mixed-effect model objects fit with the 'lme4' package. Allows construction of prediction intervals efficiently from large scale linear and generalized linear mixed-effects models. This method draws from the simulation framework used in the Gelman and Hill (2007) textbook: Data Analysis Using Regression and Multilevel/Hierarchical Models.
Maintained by Jared E. Knowles. Last updated 1 years ago.
1.8 match 105 stars 10.49 score 768 scriptsms609
TreeDist:Calculate and Map Distances Between Phylogenetic Trees
Implements measures of tree similarity, including information-based generalized Robinson-Foulds distances (Phylogenetic Information Distance, Clustering Information Distance, Matching Split Information Distance; Smith 2020) <doi:10.1093/bioinformatics/btaa614>; Jaccard-Robinson-Foulds distances (Bocker et al. 2013) <doi:10.1007/978-3-642-40453-5_13>, including the Nye et al. (2006) metric <doi:10.1093/bioinformatics/bti720>; the Matching Split Distance (Bogdanowicz & Giaro 2012) <doi:10.1109/TCBB.2011.48>; Maximum Agreement Subtree distances; the Kendall-Colijn (2016) distance <doi:10.1093/molbev/msw124>, and the Nearest Neighbour Interchange (NNI) distance, approximated per Li et al. (1996) <doi:10.1007/3-540-61332-3_168>. Includes tools for visualizing mappings of tree space (Smith 2022) <doi:10.1093/sysbio/syab100>, for identifying islands of trees (Silva and Wilkinson 2021) <doi:10.1093/sysbio/syab015>, for calculating the median of sets of trees, and for computing the information content of trees and splits.
Maintained by Martin R. Smith. Last updated 1 months ago.
phylogeneticstree-distancephylogenetic-treestree-distancestreescpp
1.8 match 32 stars 10.32 score 97 scripts 5 dependentslouissirugue
directotree:Creates an Interactive Tree Structure of a Directory
Represents the content of a directory as an interactive collapsible tree. Offers the possibility to assign a text (e.g., a 'Readme.txt') to each folder (represented as a clickable node), so that when the user hovers the pointer over a node, the corresponding text is displayed as a tooltip.
Maintained by Louis Sirugue. Last updated 6 years ago.
collapsible-treedata-visualizationdirectotreetree
8.0 match 2 stars 2.30 score 1 scriptsyonicd
d3Tree:Create Interactive Collapsible Trees with the JavaScript 'D3' Library
Create and customize interactive collapsible 'D3' trees using the 'D3' JavaScript library and the 'htmlwidgets' package. These trees can be used directly from the R console, from 'RStudio', in Shiny apps and R Markdown documents. When in Shiny the tree layout is observed by the server and can be used as a reactive filter of structured data.
Maintained by Jonathan Sidi. Last updated 1 years ago.
d3jshierarchyhtmlwidgetsqueryshiny
3.4 match 87 stars 5.46 score 33 scriptsdcomtois
summarytools:Tools to Quickly and Neatly Summarize Data
Data frame summaries, cross-tabulations, weight-enabled frequency tables and common descriptive (univariate) statistics in concise tables available in a variety of formats (plain ASCII, Markdown and HTML). A good point-of-entry for exploring data, both for experienced and new R users.
Maintained by Dominic Comtois. Last updated 4 days ago.
descriptive-statisticsfrequency-tablehtml-reportmarkdownpanderpandocpandoc-markdownrmarkdownrstudio
1.3 match 526 stars 14.52 score 2.9k scripts 6 dependentscolearendt
tidyjson:Tidy Complex 'JSON'
Turn complex 'JSON' data into tidy data frames.
Maintained by Cole Arendt. Last updated 2 years ago.
1.7 match 192 stars 10.64 score 522 scripts 7 dependentsmuschellij2
rscopus:Scopus Database 'API' Interface
Uses Elsevier 'Scopus' API <https://dev.elsevier.com/sc_apis.html> to download information about authors and their citations.
Maintained by John Muschelli. Last updated 1 years ago.
1.9 match 77 stars 9.33 score 124 scripts 3 dependentsbioc
plotgardener:Coordinate-Based Genomic Visualization Package for R
Coordinate-based genomic visualization package for R. It grants users the ability to programmatically produce complex, multi-paneled figures. Tailored for genomics, plotgardener allows users to visualize large complex genomic datasets and provides exquisite control over how plots are placed and arranged on a page.
Maintained by Nicole Kramer. Last updated 5 months ago.
visualizationgenomeannotationfunctionalgenomicsgenomeassemblyhiccpp
1.7 match 309 stars 10.17 score 167 scripts 3 dependentsstevenmmortimer
salesforcer:An Implementation of 'Salesforce' APIs Using Tidy Principles
Functions connecting to the 'Salesforce' Platform APIs (REST, SOAP, Bulk 1.0, Bulk 2.0, Metadata, Reports and Dashboards) <https://trailhead.salesforce.com/content/learn/modules/api_basics/api_basics_overview>. "API" is an acronym for "application programming interface". Most all calls from these APIs are supported as they use CSV, XML or JSON data that can be parsed into R data structures. For more details please see the 'Salesforce' API documentation and this package's website <https://stevenmmortimer.github.io/salesforcer/> for more information, documentation, and examples.
Maintained by Steven M. Mortimer. Last updated 4 months ago.
api-wrappersr-languager-programmingsalesforcesalesforce-apis
1.9 match 82 stars 9.27 score 191 scriptspoissonconsulting
mcmcr:Manipulate MCMC Samples
Functions and classes to store, manipulate and summarise Monte Carlo Markov Chain (MCMC) samples. For more information see Brooks et al. (2011) <isbn:978-1-4200-7941-8>.
Maintained by Joe Thorley. Last updated 2 months ago.
2.3 match 17 stars 7.66 score 111 scripts 10 dependentsbioc
derfinder:Annotation-agnostic differential expression analysis of RNA-seq data at base-pair resolution via the DER Finder approach
This package provides functions for annotation-agnostic differential expression analysis of RNA-seq data. Two implementations of the DER Finder approach are included in this package: (1) single base-level F-statistics and (2) DER identification at the expressed regions-level. The DER Finder approach can also be used to identify differentially bounded ChIP-seq peaks.
Maintained by Leonardo Collado-Torres. Last updated 3 months ago.
differentialexpressionsequencingrnaseqchipseqdifferentialpeakcallingsoftwareimmunooncologycoverageannotation-agnosticbioconductorderfinder
1.7 match 42 stars 10.03 score 78 scripts 6 dependentsdesiquintans
librarian:Install, Update, Load Packages from CRAN, 'GitHub', and 'Bioconductor' in One Step
Automatically install, update, and load 'CRAN', 'GitHub', and 'Bioconductor' packages in a single function call. By accepting bare unquoted names for packages, it's easy to add or remove packages from the list.
Maintained by Desi Quintans. Last updated 3 months ago.
2.3 match 54 stars 7.63 score 410 scripts 1 dependentseth-mds
ricu:Intensive Care Unit Data with R
Focused on (but not exclusive to) data sets hosted on PhysioNet (<https://physionet.org>), 'ricu' provides utilities for download, setup and access of intensive care unit (ICU) data sets. In addition to functions for running arbitrary queries against available data sets, a system for defining clinical concepts and encoding their representations in tabular ICU data is presented.
Maintained by Nicolas Bennett. Last updated 10 months ago.
3.0 match 39 stars 5.65 score 77 scriptsbioc
OmicsMLRepoR:Search harmonized metadata created under the OmicsMLRepo project
This package provides functions to browse the harmonized metadata for large omics databases. This package also supports data navigation if the metadata incorporates ontology.
Maintained by Sehyun Oh. Last updated 1 months ago.
softwareinfrastructuredatarepresentation
3.1 match 5.40 score 14 scriptstaddylab
distrom:Distributed Multinomial Regression
Fast distributed/parallel estimation for multinomial logistic regression via Poisson factorization and the 'gamlr' package. For details see: Taddy (2015, AoAS), Distributed Multinomial Regression, <arXiv:1311.6139>.
Maintained by Nelson Rayl. Last updated 7 months ago.
3.0 match 19 stars 5.58 score 44 scripts 3 dependentsusaid-oha-si
tameDP:Import targets and PLHIV data from COP Target Setting Tool (formerly Data Pack)
Import PSNUxIM targets and PLHIV data from COP Data Pack. The purpose is to make the data tidy and more usable than their current structure in the Excel data packs.
Maintained by Aaron Chafetz. Last updated 1 years ago.
3.4 match 1 stars 4.92 score 46 scriptsrte-antares-rpackage
manipulateWidget:Add Even More Interactivity to Interactive Charts
Like package 'manipulate' does for static graphics, this package helps to easily add controls like sliders, pickers, checkboxes, etc. that can be used to modify the input data or the parameters of an interactive chart created with package 'htmlwidgets'.
Maintained by Veronique Bachelier. Last updated 3 years ago.
graphicalhtmlwidgetsinteractive-chartsmanipulaterteshinytyndp
1.9 match 129 stars 8.82 score 143 scripts 1 dependentswadpac
GGIR:Raw Accelerometer Data Analysis
A tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from 'GENEActiv' <https://activinsights.com/>, binary (.gt3x) and .csv-export data from 'Actigraph' <https://theactigraph.com> devices, and binary (.cwa) and .csv-export data from 'Axivity' <https://axivity.com>. These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data file from any other sensor brand providing that the data is stored in csv format. Also the package allows for external function embedding.
Maintained by Vincent T van Hees. Last updated 5 days ago.
accelerometeractivity-recognitioncircadian-rhythmmovement-sensorsleep
1.3 match 109 stars 13.20 score 342 scripts 3 dependentstomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 years ago.
2.0 match 3 stars 8.20 score 7.8k scripts 11 dependentsohdsi
CohortConstructor:Build and Manipulate Study Cohorts Using a Common Data Model
Create and manipulate study cohorts in data mapped to the Observational Medical Outcomes Partnership Common Data Model.
Maintained by Edward Burn. Last updated 6 days ago.
1.7 match 2 stars 9.71 score 207 scripts 2 dependentscdriveraus
ctsem:Continuous Time Structural Equation Modelling
Hierarchical continuous (and discrete) time state space modelling, for linear and nonlinear systems measured by continuous variables, with limited support for binary data. The subject specific dynamic system is modelled as a stochastic differential equation (SDE) or difference equation, measurement models are typically multivariate normal factor models. Linear mixed effects SDE's estimated via maximum likelihood and optimization are the default. Nonlinearities, (state dependent parameters) and random effects on all parameters are possible, using either max likelihood / max a posteriori optimization (with optional importance sampling) or Stan's Hamiltonian Monte Carlo sampling. See <https://github.com/cdriveraus/ctsem/raw/master/vignettes/hierarchicalmanual.pdf> for details. Priors may be used. For the conceptual overview of the hierarchical Bayesian linear SDE approach, see <https://www.researchgate.net/publication/324093594_Hierarchical_Bayesian_Continuous_Time_Dynamic_Modeling>. Exogenous inputs may also be included, for an overview of such possibilities see <https://www.researchgate.net/publication/328221807_Understanding_the_Time_Course_of_Interventions_with_Continuous_Time_Dynamic_Models> . Stan based functions are not available on 32 bit Windows systems at present. <https://cdriver.netlify.app/> contains some tutorial blog posts.
Maintained by Charles Driver. Last updated 14 days ago.
stochastic-differential-equationstime-seriescpp
1.7 match 42 stars 9.58 score 366 scripts 1 dependentsbioc
spikeLI:Affymetrix Spike-in Langmuir Isotherm Data Analysis Tool
SpikeLI is a package that performs the analysis of the Affymetrix spike-in data using the Langmuir Isotherm. The aim of this package is to show the advantages of a physical-chemistry based analysis of the Affymetrix microarray data compared to the traditional methods. The spike-in (or Latin square) data for the HGU95 and HGU133 chipsets have been downloaded from the Affymetrix web site. The model used in the spikeLI package is described in details in E. Carlon and T. Heim, Physica A 362, 433 (2006).
Maintained by Enrico Carlon. Last updated 5 months ago.
4.8 match 3.30 scorekbroman
broman:Karl Broman's R Code
Miscellaneous R functions, including functions related to graphics (mostly for base graphics), permutation tests, running mean/median, and general utilities.
Maintained by Karl W Broman. Last updated 10 months ago.
1.8 match 183 stars 8.80 score 648 scripts 1 dependentsbioc
CellBench:Construct Benchmarks for Single Cell Analysis Methods
This package contains infrastructure for benchmarking analysis methods and access to single cell mixture benchmarking data. It provides a framework for organising analysis methods and testing combinations of methods in a pipeline without explicitly laying out each combination. It also provides utilities for sampling and filtering SingleCellExperiment objects, constructing lists of functions with varying parameters, and multithreaded evaluation of analysis methods.
Maintained by Shian Su. Last updated 5 months ago.
softwareinfrastructuresinglecellbenchmarkbioinformatics
1.8 match 31 stars 8.73 score 98 scriptsinsitro
AllelicSeries:Allelic Series Test
Implementation of gene-level rare variant association tests targeting allelic series: genes where increasingly deleterious mutations have increasingly large phenotypic effects. The COding-variant Allelic Series Test (COAST) operates on the benign missense variants (BMVs), deleterious missense variants (DMVs), and protein truncating variants (PTVs) within a gene. COAST uses a set of adjustable weights that tailor the test towards rejecting the null hypothesis for genes where the average magnitude of effect increases monotonically from BMVs to DMVs to PTVs. See McCaw ZR, OโDushlaine C, Somineni H, Bereket M, Klein C, Karaletsos T, Casale FP, Koller D, Soare TW. (2023) "An allelic series rare variant association test for candidate gene discovery" <doi:10.1016/j.ajhg.2023.07.001>.
Maintained by Zachary McCaw. Last updated 1 months ago.
2.3 match 13 stars 6.97 score 8 scriptsbioc
SRAdb:A compilation of metadata from NCBI SRA and tools
The Sequence Read Archive (SRA) is the largest public repository of sequencing data from the next generation of sequencing platforms including Roche 454 GS System, Illumina Genome Analyzer, Applied Biosystems SOLiD System, Helicos Heliscope, and others. However, finding data of interest can be challenging using current tools. SRAdb is an attempt to make access to the metadata associated with submission, study, sample, experiment and run much more feasible. This is accomplished by parsing all the NCBI SRA metadata into a SQLite database that can be stored and queried locally. Fulltext search in the package make querying metadata very flexible and powerful. fastq and sra files can be downloaded for doing alignment locally. Beside ftp protocol, the SRAdb has funcitons supporting fastp protocol (ascp from Aspera Connect) for faster downloading large data files over long distance. The SQLite database is updated regularly as new data is added to SRA and can be downloaded at will for the most up-to-date metadata.
Maintained by Jack Zhu. Last updated 3 months ago.
infrastructuresequencingdataimport
2.0 match 2 stars 7.81 score 200 scriptsbupaverse
bupaR:Business Process Analysis in R
Comprehensive Business Process Analysis toolkit. Creates S3-class for event log objects, and related handler functions. Imports related packages for filtering event data, computation of descriptive statistics, handling of 'Petri Net' objects and visualization of process maps. See also packages 'edeaR','processmapR', 'eventdataR' and 'processmonitR'.
Maintained by Gert Janssenswillen. Last updated 2 years ago.
1.7 match 55 stars 9.07 score 389 scripts 11 dependentsmarkvanderloo
accumulate:Split-Apply-Combine with Dynamic Groups
Estimate group aggregates, where one can set user-defined conditions that each group of records must satisfy to be suitable for aggregation. If a group of records is not suitable, it is expanded using a collapsing scheme defined by the user. A paper on this package was published in the Journal of Statistical Software <doi:10.18637/jss.v112.i04>.
Maintained by Mark van der Loo. Last updated 1 days ago.
2.9 match 9 stars 5.35 score 3 scriptsbioc
AneuFinder:Analysis of Copy Number Variation in Single-Cell-Sequencing Data
AneuFinder implements functions for copy-number detection, breakpoint detection, and karyotype and heterogeneity analysis in single-cell whole genome sequencing and strand-seq data.
Maintained by Aaron Taudt. Last updated 5 months ago.
immunooncologysoftwaresequencingsinglecellcopynumbervariationgenomicvariationhiddenmarkovmodelwholegenomecpp
2.0 match 17 stars 7.70 score 37 scriptspharmar
riskmetric:Risk Metrics to Evaluating R Packages
Facilities for assessing R packages against a number of metrics to help quantify their robustness.
Maintained by Eli Miller. Last updated 11 hours ago.
1.7 match 167 stars 8.91 score 43 scriptsdereckmezquita
stenographer:Flexible and Customisable Logging System
A comprehensive logging framework for R applications that provides hierarchical logging levels, database integration, and contextual logging capabilities. The package supports 'SQLite' storage for persistent logs, provides colour-coded console output for better readability, includes parallel processing support, and implements structured error reporting with 'JSON' formatting.
Maintained by Dereck Mezquita. Last updated 2 months ago.
3.0 match 3 stars 5.08 score 1 scriptsbioc
cmapR:CMap Tools in R
The Connectivity Map (CMap) is a massive resource of perturbational gene expression profiles built by researchers at the Broad Institute and funded by the NIH Library of Integrated Network-Based Cellular Signatures (LINCS) program. Please visit https://clue.io for more information. The cmapR package implements methods to parse, manipulate, and write common CMap data objects, such as annotated matrices and collections of gene sets.
Maintained by Ted Natoli. Last updated 5 months ago.
dataimportdatarepresentationgeneexpressionbioconductorbioinformaticscmap
1.7 match 90 stars 8.86 score 298 scriptskeyatm
keyATM:Keyword Assisted Topic Models
Fits keyword assisted topic models (keyATM) using collapsed Gibbs samplers. The keyATM combines the latent dirichlet allocation (LDA) models with a small number of keywords selected by researchers in order to improve the interpretability and topic classification of the LDA. The keyATM can also incorporate covariates and directly model time trends. The keyATM is proposed in Eshima, Imai, and Sasaki (2024) <doi:10.1111/ajps.12779>.
Maintained by Shusei Eshima. Last updated 11 months ago.
latent-dirichlet-allocationnatural-language-processingpolitical-sciencercpprcppeigensocial-sciencetopic-modelscpp
2.4 match 106 stars 6.30 score 63 scriptsbioc
BaalChIP:BaalChIP: Bayesian analysis of allele-specific transcription factor binding in cancer genomes
The package offers functions to process multiple ChIP-seq BAM files and detect allele-specific events. Computes allele counts at individual variants (SNPs/SNVs), implements extensive QC steps to remove problematic variants, and utilizes a bayesian framework to identify statistically significant allele- specific events. BaalChIP is able to account for copy number differences between the two alleles, a known phenotypical feature of cancer samples.
Maintained by Ines de Santiago. Last updated 5 months ago.
softwarechipseqbayesiansequencing
3.8 match 4.00 score 5 scriptsflr
FLa4a:A Simple and Robust Statistical Catch at Age Model
A simple and robust statistical Catch at Age model that is specifically designed for stocks with intermediate levels of data quantity and quality.
Maintained by Ernesto Jardim. Last updated 8 days ago.
2.3 match 12 stars 6.66 score 177 scripts 2 dependentskvasilopoulos
exuber:Econometric Analysis of Explosive Time Series
Testing for and dating periods of explosive dynamics (exuberance) in time series using the univariate and panel recursive unit root tests proposed by Phillips et al. (2015) <doi:10.1111/iere.12132> and Pavlidis et al. (2016) <doi:10.1007/s11146-015-9531-2>.The recursive least-squares algorithm utilizes the matrix inversion lemma to avoid matrix inversion which results in significant speed improvements. Simulation of a variety of periodically-collapsing bubble processes. Details can be found in Vasilopoulos et al. (2022) <doi:10.18637/jss.v103.i10>.
Maintained by Kostas Vasilopoulos. Last updated 1 years ago.
dickey-fullerexplosive-dynamicssimulationtime-seriesopenblascpp
2.2 match 29 stars 6.83 score 77 scriptsjuanv66x
qvirus:Quantum Computing for Analyzing CD4 Lymphocytes and Antiretroviral Therapy
Resources, tutorials, and code snippets dedicated to exploring the intersection of quantum computing and artificial intelligence (AI) in the context of analyzing Cluster of Differentiation 4 (CD4) lymphocytes and optimizing antiretroviral therapy (ART) for human immunodeficiency virus (HIV). With the emergence of quantum artificial intelligence and the development of small-scale quantum computers, there's an unprecedented opportunity to revolutionize the understanding of HIV dynamics and treatment strategies. This project leverages the R package 'qsimulatR' (Ostmeyer and Urbach, 2023, <https://CRAN.R-project.org/package=qsimulatR>), a quantum computer simulator, to explore these applications in quantum computing techniques, addressing the challenges in studying CD4 lymphocytes and enhancing ART efficacy.
Maintained by Juan Pablo Acuรฑa Gonzรกlez. Last updated 14 days ago.
2.8 match 5.43 score 15 scriptsisglobal-brge
SNPassoc:SNPs-Based Whole Genome Association Studies
Functions to perform most of the common analysis in genome association studies are implemented. These analyses include descriptive statistics and exploratory analysis of missing values, calculation of Hardy-Weinberg equilibrium, analysis of association based on generalized linear models (either for quantitative or binary traits), and analysis of multiple SNPs (haplotype and epistasis analysis). Permutation test and related tests (sum statistic and truncated product) are also implemented. Max-statistic and genetic risk-allele score exact distributions are also possible to be estimated. The methods are described in Gonzalez JR et al., 2007 <doi: 10.1093/bioinformatics/btm025>.
Maintained by Dolors Pelegri. Last updated 5 months ago.
1.6 match 16 stars 9.14 score 89 scripts 6 dependentstaxonomicallyinformedannotation
tima:Taxonomically Informed Metabolite Annotation
This package provides the infrastructure to perform Taxonomically Informed Metabolite Annotation.
Maintained by Adriano Rutz. Last updated 8 hours ago.
metabolite annotationchemotaxonomyscoring systemnatural productscomputational metabolomicstaxonomic distancespecialized metabolome
2.3 match 9 stars 6.55 score 32 scripts 2 dependentsbioc
UPDhmm:Detecting Uniparental Disomy through NGS trio data
Uniparental disomy (UPD) is a genetic condition where an individual inherits both copies of a chromosome or part of it from one parent, rather than one copy from each parent. This package contains a HMM for detecting UPDs through HTS (High Throughput Sequencing) data from trio assays. By analyzing the genotypes in the trio, the model infers a hidden state (normal, father isodisomy, mother isodisomy, father heterodisomy and mother heterodisomy).
Maintained by Marta Sevilla. Last updated 5 months ago.
softwarehiddenmarkovmodelgenetics
3.2 match 1 stars 4.54 score 3 scriptssciviews
svBase:Base Objects like Data Frames for 'SciViews::R'
Functions to manipulated the three main classes of "data frames" for 'SciViews::R': data.frame, data.table and tibble. Allow to select the preferred one, and to convert more carefully between the three, taking care of correct presentation of row names and data.table's keys. More homogeneous way of creating these three data frames and of printing them on the R console.
Maintained by Philippe Grosjean. Last updated 9 months ago.
3.3 match 4.38 score 3 scripts 8 dependentsmartin3141
spant:MR Spectroscopy Analysis Tools
Tools for reading, visualising and processing Magnetic Resonance Spectroscopy data. The package includes methods for spectral fitting: Wilson (2021) <DOI:10.1002/mrm.28385> and spectral alignment: Wilson (2018) <DOI:10.1002/mrm.27605>.
Maintained by Martin Wilson. Last updated 1 months ago.
brainmrimrsmrshubspectroscopyfortran
1.7 match 25 stars 8.52 score 81 scriptspoissonconsulting
universals:S3 Generics for Bayesian Analyses
Provides S3 generic methods and some default implementations for Bayesian analyses that generate Markov Chain Monte Carlo (MCMC) samples. The purpose of 'universals' is to reduce package dependencies and conflicts. The 'nlist' package implements many of the methods for its 'nlist' class.
Maintained by Joe Thorley. Last updated 2 months ago.
2.3 match 4 stars 6.37 score 1 scripts 20 dependentsstatisticsnorway
SSBtools:Algorithms and Tools for Tabular Statistics and Hierarchical Computations
Includes general data manipulation functions, algorithms for statistical disclosure control (Langsrud, 2024) <doi:10.1007/978-3-031-69651-0_6> and functions for hierarchical computations by sparse model matrices (Langsrud, 2023) <doi:10.32614/RJ-2023-088>.
Maintained by รyvind Langsrud. Last updated 5 days ago.
1.9 match 7 stars 7.62 score 68 scripts 7 dependentsusdaforestservice
FIESTAutils:Utility Functions for Forest Inventory Estimation and Analysis
A set of tools for data wrangling, spatial data analysis, statistical modeling (including direct, model-assisted, photo-based, and small area tools), and USDA Forest Service data base tools. These tools are aimed to help Foresters, Analysts, and Scientists extract and perform analyses on USDA Forest Service data.
Maintained by Grayson White. Last updated 5 days ago.
2.3 match 8 stars 6.33 score 1 dependentspoissonconsulting
tidyplus:Additional 'tidyverse' Functions
Provides functions such as str_crush(), add_missing_column(), coalesce_data() and drop_na_all() that complement 'tidyverse' functionality or functions that provide alternative behaviors such as if_else2() and str_detect2().
Maintained by Ayla Pearson. Last updated 2 months ago.
2.3 match 9 stars 6.33 score 1 scripts 4 dependentssamhforbes
eyetrackingR:Eye-Tracking Data Analysis
Addresses tasks along the pipeline from raw data to analysis and visualization for eye-tracking data. Offers several popular types of analyses, including linear and growth curve time analyses, onset-contingent reaction time analyses, as well as several non-parametric bootstrapping approaches. For references to the approach see Mirman, Dixon & Magnuson (2008) <doi:10.1016/j.jml.2007.11.006>, and Barr (2008) <doi:10.1016/j.jml.2007.09.002>.
Maintained by Samuel Forbes. Last updated 2 years ago.
1.8 match 22 stars 7.84 score 60 scriptsctn-0094
DOPE:Drug Ontology Parsing Engine
Provides information on drug names (brand, generic and street) for drugs tracked by the DEA. There are functions that will search synonyms and return the drug names and types. The vignettes have extensive information on the work done to create the data for the package.
Maintained by Raymond Balise. Last updated 4 years ago.
1.8 match 21 stars 7.83 score 31 scriptsjonesor
Rage:Life History Metrics from Matrix Population Models
Functions for calculating life history metrics using matrix population models ('MPMs'). Described in Jones et al. (2021) <doi:10.1101/2021.04.26.441330>.
Maintained by Owen Jones. Last updated 3 months ago.
1.7 match 11 stars 8.17 score 62 scripts 1 dependentsbioc
genefu:Computation of Gene Expression-Based Signatures in Breast Cancer
This package contains functions implementing various tasks usually required by gene expression analysis, especially in breast cancer studies: gene mapping between different microarray platforms, identification of molecular subtypes, implementation of published gene signatures, gene selection, and survival analysis.
Maintained by Benjamin Haibe-Kains. Last updated 4 months ago.
differentialexpressiongeneexpressionvisualizationclusteringclassification
1.9 match 7.42 score 193 scripts 3 dependentsbranchlab
metasnf:Meta Clustering with Similarity Network Fusion
Framework to facilitate patient subtyping with similarity network fusion and meta clustering. The similarity network fusion (SNF) algorithm was introduced by Wang et al. (2014) in <doi:10.1038/nmeth.2810>. SNF is a data integration approach that can transform high-dimensional and diverse data types into a single similarity network suitable for clustering with minimal loss of information from each initial data source. The meta clustering approach was introduced by Caruana et al. (2006) in <doi:10.1109/ICDM.2006.103>. Meta clustering involves generating a wide range of cluster solutions by adjusting clustering hyperparameters, then clustering the solutions themselves into a manageable number of qualitatively similar solutions, and finally characterizing representative solutions to find ones that are best for the user's specific context. This package provides a framework to easily transform multi-modal data into a wide range of similarity network fusion-derived cluster solutions as well as to visualize, characterize, and validate those solutions. Core package functionality includes easy customization of distance metrics, clustering algorithms, and SNF hyperparameters to generate diverse clustering solutions; calculation and plotting of associations between features, between patients, and between cluster solutions; and standard cluster validation approaches including resampled measures of cluster stability, standard metrics of cluster quality, and label propagation to evaluate generalizability in unseen data. Associated vignettes guide the user through using the package to identify patient subtypes while adhering to best practices for unsupervised learning.
Maintained by Prashanth S Velayudhan. Last updated 8 days ago.
bioinformaticsclusteringmetaclusteringsnf
1.7 match 8 stars 8.21 score 30 scriptsdarwin-eu
DrugUtilisation:Summarise Patient-Level Drug Utilisation in Data Mapped to the OMOP Common Data Model
Summarise patient-level drug utilisation cohorts using data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model. New users and prevalent users cohorts can be generated and their characteristics, indication and drug use summarised.
Maintained by Martรญ Catalร . Last updated 2 months ago.
1.7 match 8.20 score 156 scripts 2 dependentscdcgov
surveytable:Formatted Survey Estimates
Short and understandable commands that generate tabulated, formatted, and rounded survey estimates. Mostly a wrapper for the 'survey' package (Lumley (2004) <doi:10.18637/jss.v009.i08> <https://CRAN.R-project.org/package=survey>) that identifies low-precision estimates using the National Center for Health Statistics (NCHS) presentation standards (Parker et al. (2017) <https://www.cdc.gov/nchs/data/series/sr_02/sr02_175.pdf>, Parker et al. (2023) <doi:10.15620/cdc:124368>).
Maintained by Alex Strashny. Last updated 7 days ago.
estimatesformatted-outputpretty-printsurveytables
2.0 match 6 stars 6.71 score 19 scriptsusaid-oha-si
selfdestructin5:Creates SI OHA Mission Director Briefers
Creates a series of data frames that can be passed to a gt() to create the PEPFAR summary tables.
Maintained by Tim Essam. Last updated 29 days ago.
3.4 match 1 stars 3.98 score 21 scriptsbioc
BioNERO:Biological Network Reconstruction Omnibus
BioNERO aims to integrate all aspects of biological network inference in a single package, including data preprocessing, exploratory analyses, network inference, and analyses for biological interpretations. BioNERO can be used to infer gene coexpression networks (GCNs) and gene regulatory networks (GRNs) from gene expression data. Additionally, it can be used to explore topological properties of protein-protein interaction (PPI) networks. GCN inference relies on the popular WGCNA algorithm. GRN inference is based on the "wisdom of the crowds" principle, which consists in inferring GRNs with multiple algorithms (here, CLR, GENIE3 and ARACNE) and calculating the average rank for each interaction pair. As all steps of network analyses are included in this package, BioNERO makes users avoid having to learn the syntaxes of several packages and how to communicate between them. Finally, users can also identify consensus modules across independent expression sets and calculate intra and interspecies module preservation statistics between different networks.
Maintained by Fabricio Almeida-Silva. Last updated 5 months ago.
softwaregeneexpressiongeneregulationsystemsbiologygraphandnetworkpreprocessingnetworknetworkinference
1.7 match 27 stars 7.78 score 50 scripts 1 dependentsbeckerbenj
eatGADS:Data Management of Large Hierarchical Data
Import 'SPSS' data, handle and change 'SPSS' meta data, store and access large hierarchical data in 'SQLite' data bases.
Maintained by Benjamin Becker. Last updated 26 days ago.
1.8 match 1 stars 7.36 score 34 scripts 1 dependentsdipterix
filearray:File-Backed Array for Out-of-Memory Computation
Stores large arrays in files to avoid occupying large memories. Implemented with super fast gigabyte-level multi-threaded reading/writing via 'OpenMP'. Supports multiple non-character data types (double, float, complex, integer, logical, and raw).
Maintained by Zhengjia Wang. Last updated 2 days ago.
arraybig-datamemory-mapout-of-memoryoutofmemorycpp
2.0 match 17 stars 6.58 score 10 scripts 3 dependentsskranz
xglue:Extended glue. Write collapse and by Uses a template file with a special syntax.
Extended glue. Add in the string to be glued blocks that specify collapse operations on vectors, and allow to operate on a grouped data frame
Maintained by Sebastian Kranz. Last updated 4 years ago.
5.3 match 6 stars 2.48 score 6 scriptsdavidchall
ipaddress:Data Analysis for IP Addresses and Networks
Classes and functions for working with IP (Internet Protocol) addresses and networks, inspired by the Python 'ipaddress' module. Offers full support for both IPv4 and IPv6 (Internet Protocol versions 4 and 6) address spaces. It is specifically designed to work well with the 'tidyverse'.
Maintained by David Hall. Last updated 1 years ago.
cyberdata-analysisip-addressipv4ipv6vctrscpp
1.9 match 32 stars 7.02 score 27 scripts 2 dependentscran
tensorA:Advanced Tensor Arithmetic with Named Indices
Provides convenience functions for advanced linear algebra with tensors and computation with data sets of tensors on a higher level abstraction. It includes Einstein and Riemann summing conventions, dragging, co- and contravariate indices, parallel computations on sequences of tensors.
Maintained by K. Gerald van den Boogaart. Last updated 1 years ago.
2.3 match 5.83 score 399 dependentssdanzige
ADAPTS:Automated Deconvolution Augmentation of Profiles for Tissue Specific Cells
Tools to construct (or add to) cell-type signature matrices using flow sorted or single cell samples and deconvolve bulk gene expression data. Useful for assessing the quality of single cell RNAseq experiments, estimating the accuracy of signature matrices, and determining cell-type spillover. Please cite: Danziger SA et al. (2019) ADAPTS: Automated Deconvolution Augmentation of Profiles for Tissue Specific cells <doi:10.1371/journal.pone.0224693>.
Maintained by Samuel A Danziger. Last updated 3 years ago.
2.0 match 2 stars 6.56 score 40 scripts 1 dependentsthibautjombart
treespace:Statistical Exploration of Landscapes of Phylogenetic Trees
Tools for the exploration of distributions of phylogenetic trees. This package includes a 'shiny' interface which can be started from R using treespaceServer(). For further details see Jombart et al. (2017) <DOI:10.1111/1755-0998.12676>.
Maintained by Michelle Kendall. Last updated 2 years ago.
1.8 match 28 stars 7.39 score 63 scriptsmikeblazanin
gcplyr:Wrangle and Analyze Growth Curve Data
Easy wrangling and model-free analysis of microbial growth curve data, as commonly output by plate readers. Tools for reshaping common plate reader outputs into 'tidy' formats and merging them with design information, making data easy to work with using 'gcplyr' and other packages. Also streamlines common growth curve processing steps, like smoothing and calculating derivatives, and facilitates model-free characterization and analysis of growth data. See methods at <https://mikeblazanin.github.io/gcplyr/>.
Maintained by Mike Blazanin. Last updated 2 months ago.
1.7 match 30 stars 7.53 score 75 scriptssamhforbes
PupillometryR:A Unified Pipeline for Pupillometry Data
Provides a unified pipeline to clean, prepare, plot, and run basic analyses on pupillometry experiments.
Maintained by Samuel Forbes. Last updated 1 years ago.
1.7 match 44 stars 7.58 score 288 scripts 1 dependentsopenanalytics
clinDataReview:Clinical Data Review Tool
Creation of interactive tables, listings and figures ('TLFs') and associated report for exploratory analysis of data in a clinical trial, e.g. for clinical oversight activities. Interactive figures include sunburst, treemap, scatterplot, line plot and barplot of counts data. Interactive tables include table of summary statistics (as counts of adverse events, enrollment table) and listings. Possibility to compare data (summary table or listing) across two data batches/sets. A clinical data review report is created via study-specific configuration files and template 'R Markdown' reports contained in the package.
Maintained by Laure Cougnaud. Last updated 9 months ago.
1.8 match 11 stars 7.10 score 36 scriptsbioc
openCyto:Hierarchical Gating Pipeline for flow cytometry data
This package is designed to facilitate the automated gating methods in sequential way to mimic the manual gating strategy.
Maintained by Mike Jiang. Last updated 5 months ago.
immunooncologyflowcytometrydataimportpreprocessingdatarepresentationcpp
1.7 match 7.62 score 404 scripts 1 dependentsbioc
SpatialDecon:Deconvolution of mixed cells from spatial and/or bulk gene expression data
Using spatial or bulk gene expression data, estimates abundance of mixed cell types within each observation. Based on "Advances in mixed cell deconvolution enable quantification of cell types in spatial transcriptomic data", Danaher (2022). Designed for use with the NanoString GeoMx platform, but applicable to any gene expression data.
Maintained by Maddy Griswold. Last updated 5 months ago.
immunooncologyfeatureextractiongeneexpressiontranscriptomicsspatial
1.7 match 36 stars 7.40 score 58 scriptscapitalone
dataCompareR:Compare Two Data Frames and Summarise the Difference
Easy comparison of two tabular data objects in R. Specifically designed to show differences between two sets of data in a useful way that should make it easier to understand the differences, and if necessary, help you work out how to remedy them. Aims to offer a more useful output than all.equal() when your two data sets do not match, but isn't intended to replace all.equal() as a way to test for equality.
Maintained by Sarah Johnston. Last updated 2 years ago.
compare-datadatadata-analysisdata-science
1.8 match 76 stars 7.24 score 76 scriptsjonesor
Rcompadre:Utilities for using the 'COM(P)ADRE' Matrix Model Database
Utility functions for interacting with the 'COMPADRE' and 'COMADRE' databases of matrix population models. Described in Jones et al. (2021) <doi:10.1101/2021.04.26.441330>.
Maintained by Owen Jones. Last updated 5 months ago.
1.6 match 11 stars 7.74 score 55 scripts 2 dependentsdarunabas
phyloregion:Biogeographic Regionalization and Macroecology
Computational infrastructure for biogeography, community ecology, and biodiversity conservation (Daru et al. 2020) <doi:10.1111/2041-210X.13478>. It is based on the methods described in Daru et al. (2020) <doi:10.1038/s41467-020-15921-6>. The original conceptual work is described in Daru et al. (2017) <doi:10.1016/j.tree.2017.08.013> on patterns and processes of biogeographical regionalization. Additionally, the package contains fast and efficient functions to compute more standard conservation measures such as phylogenetic diversity, phylogenetic endemism, evolutionary distinctiveness and global endangerment, as well as compositional turnover (e.g., beta diversity).
Maintained by Barnabas H. Daru. Last updated 5 months ago.
1.8 match 18 stars 7.21 score 50 scripts 1 dependentsuzh-peg
microxanox:Oxic-Anoxic Regime Shifts in Microbial Communities
Model to simulate a three functional group system with four chemical substrates using a set of ordinary differential equations. Simulations can be run individually or over a parameter range, to find stable states. The model features multiple species per functional group, where the number is only limited by computational constraints. The R package is constructed in such a way, that the results contain the input parameter used, so that a saved results can be loaded again and thesimulation be repeated.
Maintained by Owen L. Petchey. Last updated 1 years ago.
3.3 match 3.85 score 35 scriptsbioc
alevinQC:Generate QC Reports For Alevin Output
Generate QC reports summarizing the output from an alevin, alevin-fry, or simpleaf run. Reports can be generated as html or pdf files, or as shiny applications.
Maintained by Charlotte Soneson. Last updated 3 months ago.
1.8 match 31 stars 6.89 score 21 scriptskbhoehn
dowser:B Cell Receptor Phylogenetics Toolkit
Provides a set of functions for inferring, visualizing, and analyzing B cell phylogenetic trees. Provides methods to 1) reconstruct unmutated ancestral sequences, 2) build B cell phylogenetic trees using multiple methods, 3) visualize trees with metadata at the tips, 4) reconstruct intermediate sequences, 5) detect biased ancestor-descendant relationships among metadata types Workflow examples available at documentation site (see URL). Citations: Hoehn et al (2022) <doi:10.1371/journal.pcbi.1009885>, Hoehn et al (2021) <doi:10.1101/2021.01.06.425648>.
Maintained by Kenneth Hoehn. Last updated 2 months ago.
1.8 match 6.81 score 84 scripts