R-universe search: guide

tidyverse

ggplot2:Create Elegant Data Visualisations Using the Grammar of Graphics

A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.

Maintained by Thomas Lin Pedersen. Last updated 10 days ago.

data-visualisation visualisation

33.7 match 6.6k stars 25.10 score 645k scripts 7.5k dependents

teunbrand

legendry:Extended Legends and Axes for 'ggplot2'

A 'ggplot2' extension that focusses on expanding the plotter's arsenal of guides. Guides in 'ggplot2' include axes and legends. 'legendry' offers new axes and annotation options, as well as new legends and colour displays.

Maintained by Teun van den Brand. Last updated 12 days ago.

axis axis-customization ggplot-extension ggplot2 legend visualization

60.6 match 227 stars 7.83 score 29 scripts 2 dependents

rudeboybert

fivethirtyeight:Data and Code Behind the Stories and Interactives at 'FiveThirtyEight'

Datasets and code published by the data journalism website 'FiveThirtyEight' available at <https://github.com/fivethirtyeight/data>. Note that while we received guidance from editors at 'FiveThirtyEight', this package is not officially published by 'FiveThirtyEight'.

Maintained by Albert Y. Kim. Last updated 2 years ago.

data-science datajournalism fivethirtyeight statistics

14.3 match 453 stars 10.98 score 1.7k scripts

ggobi

tourr:Tour Methods for Multivariate Data Visualisation

Implements geodesic interpolation and basis generation functions that allow you to create new tour methods from R.

Maintained by Dianne Cook. Last updated 17 days ago.

13.1 match 65 stars 11.17 score 426 scripts 9 dependents

ndphillips

yarrr:A Companion to the e-Book "YaRrr!: The Pirate's Guide to R"

Contains a mixture of functions and data sets referred to in the introductory e-book "YaRrr!: The Pirate's Guide to R". The latest version of the e-book is available for free at <https://www.thepiratesguidetor.com>.

Maintained by Nathaniel Phillips. Last updated 12 months ago.

12.2 match 78 stars 10.67 score 1.2k scripts 2 dependents

r-lib

styler:Non-Invasive Pretty Printing of R Code

Pretty-prints R code without changing the user's formatting intent.

Maintained by Lorenz Walthert. Last updated 1 month ago.

pretty-print

7.7 match 754 stars 16.15 score 940 scripts 62 dependents

animint

animint2:Animated Interactive Grammar of Graphics

Functions are provided for defining animated, interactive data visualizations in R code, and rendering on a web page. The 2018 Journal of Computational and Graphical Statistics paper, <doi:10.1080/10618600.2018.1513367> describes the concepts implemented.

Maintained by Toby Hocking. Last updated 28 days ago.

13.5 match 64 stars 8.87 score 173 scripts

t-kalinowski

keras:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.

Maintained by Tomasz Kalinowski. Last updated 11 months ago.

10.0 match 10.93 score 10k scripts 55 dependents

davidgohel

ggiraph:Make 'ggplot2' Graphics Interactive

Create interactive 'ggplot2' graphics using 'htmlwidgets'.

Maintained by David Gohel. Last updated 3 months ago.

libpng cpp

7.4 match 819 stars 14.39 score 4.1k scripts 34 dependents

mjskay

ggdist:Visualizations of Distributions and Uncertainty

Provides primitives for visualizing distributions using 'ggplot2' that are particularly tuned for visualizing uncertainty in either a frequentist or Bayesian mode. Both analytical distributions (such as frequentist confidence distributions or Bayesian priors) and distributions represented as samples (such as bootstrap distributions or Bayesian posterior samples) are easily visualized. Visualization primitives include but are not limited to: points with multiple uncertainty intervals, eye plots (Spiegelhalter D., 1999) <https://ideas.repec.org/a/bla/jorssa/v162y1999i1p45-58.html>, density plots, gradient plots, dot plots (Wilkinson L., 1999) <doi:10.1080/00031305.1999.10474474>, quantile dot plots (Kay M., Kola T., Hullman J., Munson S., 2016) <doi:10.1145/2858036.2858558>, complementary cumulative distribution function barplots (Fernandes M., Walls L., Munson S., Hullman J., Kay M., 2018) <doi:10.1145/3173574.3173718>, and fit curves with multiple uncertainty ribbons.

Maintained by Matthew Kay. Last updated 4 months ago.

ggplot2 uncertainty uncertainty-visualization visualization cpp

6.5 match 856 stars 15.24 score 3.1k scripts 61 dependents

pasraia

RRphylo:Phylogenetic Ridge Regression Methods for Comparative Studies

Functions for phylogenetic analysis (Castiglione et al., 2018 <doi:10.1111/2041-210X.12954>). The functions perform the estimation of phenotypic evolutionary rates, identification of phenotypic evolutionary rate shifts, quantification of direction and size of evolutionary change in multivariate traits, the computation of ontogenetic shape vectors and test for morphological convergence.

Maintained by Silvia Castiglione. Last updated 7 months ago.

12.0 match 10 stars 7.48 score 83 scripts

csdaw

ggprism:A 'ggplot2' Extension Inspired by 'GraphPad Prism'

Provides various themes, palettes, and other functions that are used to customise ggplots to look like they were made in 'GraphPad Prism'. The 'Prism'-look is achieved with theme_prism() and scale_fill|colour_prism(), axes can be changed with custom guides like guide_prism_minor(), and significance indicators added with add_pvalue().

Maintained by Charlotte Dawson. Last updated 12 months ago.

ggplot-extension ggplot2 prism

8.2 match 175 stars 10.56 score 1.1k scripts 5 dependents

rvlenth

emmeans:Estimated Marginal Means, aka Least-Squares Means

Obtain estimated marginal means (EMMs) for many linear, generalized linear, and mixed models. Compute contrasts or linear functions of EMMs, trends, and comparisons of slopes. Plots and other displays. Least-squares means are discussed, and the term "estimated marginal means" is suggested, in Searle, Speed, and Milliken (1980) Population marginal means in the linear model: An alternative to least squares means, The American Statistician 34(4), 216-221 <doi:10.1080/00031305.1980.10483031>.

Maintained by Russell V. Lenth. Last updated 4 days ago.

4.5 match 377 stars 19.19 score 13k scripts 187 dependents

cbielow

PTXQC:Quality Report Generation for MaxQuant and mzTab Results

Generates Proteomics (PTX) quality control (QC) reports for shotgun LC-MS data analyzed with the MaxQuant software suite (from .txt files) or mzTab files (ideally from OpenMS 'QualityControl' tool). Reports are customizable (target thresholds, subsetting) and available in HTML or PDF format. Published in J. Proteome Res., Proteomics Quality Control: Quality Control Software for MaxQuant Results (2015) <doi:10.1021/acs.jproteome.5b00780>.

Maintained by Chris Bielow. Last updated 1 year ago.

drag-and-drop hacktoberfest heatmap match-between-runs maxquant metric mztab openms proteomics quality-control quality-metrics report

8.6 match 42 stars 9.35 score 105 scripts 1 dependent

atsa-es

MARSS:Multivariate Autoregressive State-Space Modeling

The MARSS package provides maximum-likelihood parameter estimation for constrained and unconstrained linear multivariate autoregressive state-space (MARSS) models, including partially deterministic models. MARSS models are a class of dynamic linear models (DLMs) and vector autoregressive (VAR) models. Fitting is available via Expectation-Maximization (EM), BFGS (using optim), and 'TMB' (using the 'marssTMB' companion package). Functions are provided for parametric and innovations bootstrapping, Kalman filtering and smoothing, model selection criteria including bootstrap AICb, confidence intervals via the Hessian approximation or bootstrapping, and all conditional residual types. See the user guide for examples of dynamic factor analysis, dynamic linear models, outlier and shock detection, and multivariate AR-p models. Online workshops (lectures, eBook, and computer labs) at <https://atsa-es.github.io/>.

Maintained by Elizabeth Eli Holmes. Last updated 1 year ago.

multivariate-timeseries state-space-models statistics time-series

7.6 match 52 stars 10.34 score 596 scripts 3 dependents

bioc

derfinder:Annotation-agnostic differential expression analysis of RNA-seq data at base-pair resolution via the DER Finder approach

This package provides functions for annotation-agnostic differential expression analysis of RNA-seq data. Two implementations of the DER Finder approach are included in this package: (1) single base-level F-statistics and (2) DER identification at the expressed regions-level. The DER Finder approach can also be used to identify differentially bounded ChIP-seq peaks.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

differentialexpression sequencing rnaseq chipseq differentialpeakcalling software immunooncology coverage annotation-agnostic bioconductor derfinder

7.7 match 42 stars 10.03 score 78 scripts 6 dependents

bioc

CRISPRseek:Design of guide RNAs in CRISPR genome-editing systems

The package encompasses functions to find potential guide RNAs for the CRISPR-based genome-editing systems including the Base Editors and the Prime Editors when supplied with target sequences as input. Users have the flexibility to filter resulting guide RNAs based on parameters such as the absence of restriction enzyme cut sites or the lack of paired guide RNAs. The package also facilitates genome-wide exploration for off-targets, offering features to score and rank off-targets, retrieve flanking sequences, and indicate whether the hits are located within exon regions. All detected guide RNAs are annotated with the cumulative scores of the top5 and topN off-targets together with detailed information such as mismatch sites and restriction enzyme cut sites. The package also outputs INDELs and their frequencies for Cas9 targeted sites.

Maintained by Lihua Julie Zhu. Last updated 7 days ago.

immunooncology generegulation sequencematching crispr

10.1 match 7.18 score 51 scripts 2 dependents

aphalo

ggspectra:Extensions to 'ggplot2' for Radiation Spectra

Additional annotations, stats, geoms and scales for plotting "light" spectra with 'ggplot2', together with specializations of ggplot() and autoplot() methods for spectral data and waveband definitions stored in objects of classes defined in package 'photobiology'. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 3 days ago.

dataviz ggplot2-autoplot ggplot2-enhancementes ggplot2-geoms ggplot2-scales ggplot2-stats light r4photobiology-suite radiation spectra

8.9 match 5 stars 8.09 score 390 scripts 1 dependent

bioc

limma:Linear Models for Microarray and Omics Data

Data analysis, linear models and differential expression for omics data.

Maintained by Gordon Smyth. Last updated 6 days ago.

exonarray geneexpression transcription alternativesplicing differentialexpression differentialsplicing genesetenrichment dataimport bayesian clustering regression timecourse microarray micrornaarray mrnamicroarray onechannel proprietaryplatforms twochannel sequencing rnaseq batcheffect multiplecomparison normalization preprocessing qualitycontrol biomedicalinformatics cellbiology cheminformatics epigenetics functionalgenomics genetics immunooncology metabolomics proteomics systemsbiology transcriptomics

5.2 match 13.81 score 16k scripts 585 dependents

bioc

edgeR:Empirical Analysis of Digital Gene Expression Data in R

Differential expression analysis of sequence count data. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models, quasi-likelihood, and gene set enrichment. Can perform differential analyses of any type of omics data that produces read counts, including RNA-seq, ChIP-seq, ATAC-seq, Bisulfite-seq, SAGE, CAGE, metabolomics, or proteomics spectral counts. RNA-seq analyses can be conducted at the gene or isoform level, and tests can be conducted for differential exon or transcript usage.

Maintained by Yunshun Chen. Last updated 6 days ago.

alternativesplicing batcheffect bayesian biomedicalinformatics cellbiology chipseq clustering coverage differentialexpression differentialmethylation differentialsplicing dnamethylation epigenetics functionalgenomics geneexpression genesetenrichment genetics immunooncology multiplecomparison normalization pathways proteomics qualitycontrol regression rnaseq sage sequencing singlecell systemsbiology timecourse transcription transcriptomics openblas

5.2 match 13.40 score 17k scripts 255 dependents

datastorm-open

rAmCharts:JavaScript Charts Tool

Provides an R interface for using 'AmCharts' Library. Based on 'htmlwidgets', it provides a global architecture to generate 'JavaScript' source code for charts. Most classes in the library have their equivalent in R with S4 classes; for those classes, not all properties have been referenced but can easily be added in the constructors. Complex properties (e.g. 'JavaScript' object) can be passed as a named list. See examples at <https://datastorm-open.github.io/introduction_ramcharts/> and <https://www.amcharts.com/> for more information about the library. The package includes the free version of 'AmCharts' Library. Its only limitation is a small link to the web site displayed on your charts. If you enjoy this library, do not hesitate to refer to this page <https://www.amcharts.com/online-store/> to purchase a licence, and thus support its creators and get a period of Priority Support. See also <https://www.amcharts.com/about/> for more information about 'AmCharts' company.

Maintained by Benoit Thieurmel. Last updated 2 months ago.

9.5 match 49 stars 7.17 score 153 scripts 4 dependents

inbo

INBOtheme:Themes for ggplot2

Several themes for the ggplot2 package, among them themes complying with the style guide for the Research Institute for Nature and Forest (INBO) and Elsevier journals.

Maintained by Thierry Onkelinx. Last updated 2 years ago.

ggplot2 ggplot2-themes

12.6 match 3 stars 5.20 score 356 scripts

bioc

cBioPortalData:Exposes and Makes Available Data from the cBioPortal Web Resources

The cBioPortalData R package accesses study datasets from the cBio Cancer Genomics Portal. It accesses the data either from the pre-packaged zip / tar files or from the API interface that was recently implemented by the cBioPortal Data Team. The package can provide data in either tabular format or as a MultiAssayExperiment object that uses familiar Bioconductor data representations.

Maintained by Marcel Ramos. Last updated 10 days ago.

software infrastructure thirdpartyclient bioconductor-package nci-itcr u24ca289073

6.3 match 33 stars 10.15 score 147 scripts 4 dependents

daroczig

logger:A Lightweight, Modern and Flexible Logging Utility

Inspired by the 'futile.logger' R package and 'logging' Python module, this utility provides a flexible and extensible way of formatting and delivering log messages with low overhead.

Maintained by Gergely Daróczi. Last updated 2 months ago.

3.8 match 298 stars 16.88 score 1.5k scripts 98 dependents

projectmosaic

mosaicCalc:R-Language Based Calculus Operations for Teaching

Software to support the introductory *MOSAIC Calculus* textbook (<https://www.mosaic-web.org/MOSAIC-Calculus/>), one of many data- and modeling-oriented educational resources developed by Project MOSAIC (<https://www.mosaic-web.org/>). Provides symbolic and numerical differentiation and integration, as well as support for applied linear algebra (for data science), and differential equations/dynamics. Includes grammar-of-graphics-based functions for drawing vector fields, trajectories, etc. The software is suitable for general use, but intended mainly for teaching calculus.

Maintained by Daniel Kaplan. Last updated 21 days ago.

6.9 match 13 stars 8.68 score 546 scripts

jimbrig

rtraining:R Training Resources, Guides, Tips, and Knowledge Base

Houses various material related to teaching R.

Maintained by Jimmy Briggs. Last updated 2 years ago.

best-practices curation developer-tools development development-environment guide knowledge package-development setup shiny-apps tips-and-tricks training training-materials walkthrough

16.2 match 4 stars 3.60 score 6 scripts

rikenbit

guidedPLS:Supervised Dimensional Reduction by Guided Partial Least Squares

Guided partial least squares (guided-PLS) is the combination of partial least squares by singular value decomposition (PLS-SVD) and guided principal component analysis (guided-PCA). For the details of the methods, see the reference section of GitHub README.md <https://github.com/rikenbit/guidedPLS>.

Maintained by Koki Tsuyuzaki. Last updated 2 years ago.

14.4 match 4.00 score

aphalo

photobiology:Photobiological Calculations

Definitions of classes, methods, operators and functions for use in photobiology and radiation meteorology and climatology. Calculation of effective (weighted) and not-weighted irradiances/doses, fluence rates, transmittance, reflectance, absorptance, absorbance and diverse ratios and other derived quantities from spectral data. Local maxima and minima: peaks, valleys and spikes. Conversion between energy- and photon-based units. Wavelength interpolation. Astronomical calculations related to solar angles and day length. Colours and vision. This package is part of the 'r4photobiology' suite, Aphalo, P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 3 days ago.

light photobiology quantification r4photobiology-suite radiation spectra sun-position

6.0 match 4 stars 9.35 score 604 scripts 12 dependents

quanteda

quanteda:Quantitative Analysis of Textual Data

A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and n-grams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.

Maintained by Kenneth Benoit. Last updated 2 months ago.

corpus natural-language-processing quanteda text-analytics onetbb cpp

3.3 match 851 stars 16.68 score 5.4k scripts 51 dependents

bioc

GUIDEseq:GUIDE-seq and PEtag-seq analysis pipeline

The package implements GUIDE-seq and PEtag-seq analysis workflow including functions for filtering UMI and reads with low coverage, obtaining unique insertion sites (proxy of cleavage sites), estimating the locations of the insertion sites, aka, peaks, merging estimated insertion sites from plus and minus strand, and performing off target search of the extended regions around insertion sites with mismatches and indels.

Maintained by Lihua Julie Zhu. Last updated 5 months ago.

immunooncology generegulation sequencing workflowstep crispr

12.5 match 4.45 score 14 scripts

bioc

gemini:GEMINI: Variational inference approach to infer genetic interactions from pairwise CRISPR screens

GEMINI uses log-fold changes to model sample-dependent and independent effects, and uses a variational Bayes approach to infer these effects. The inferred effects are used to score and identify genetic interactions, such as lethality and recovery. More details can be found in Zamanighomi et al. 2019 (in press).

Maintained by Sidharth Jain. Last updated 5 months ago.

software crispr bayesian dataimport computational-biology genetic-interactions

9.1 match 15 stars 6.02 score 9 scripts

hneth

riskyr:Rendering Risk Literacy more Transparent

Risk-related information (like the prevalence of conditions, the sensitivity and specificity of diagnostic tests, or the effectiveness of interventions or treatments) can be expressed in terms of frequencies or probabilities. By providing a toolbox of corresponding metrics and representations, 'riskyr' computes, translates, and visualizes risk-related information in a variety of ways. Adopting multiple complementary perspectives provides insights into the interplay between key parameters and renders teaching and training programs on risk literacy more transparent.

Maintained by Hansjoerg Neth. Last updated 10 months ago.

2x2-matrix bayesian-inference contingency-table representation risk risk-literacy visualization

7.5 match 19 stars 7.36 score 80 scripts

aphalo

gginnards:Explore the Innards of 'ggplot2' Objects

Extensions to 'ggplot2' providing low-level debug tools: statistics and geometries echoing their data argument. Layer manipulation: deletion, insertion, extraction and reordering of layers. Deletion of unused variables from the data object embedded in "ggplot" objects.

Maintained by Pedro J. Aphalo. Last updated 3 months ago.

dataviz debugging ggplot2-enhancementes ggplot2-layer-manipulation inspection

6.0 match 25 stars 9.03 score 378 scripts 3 dependents

r-forge

numDeriv:Accurate Numerical Derivatives

Methods for calculating (usually) accurate numerical first and second order derivatives. Accurate calculations are done using Richardson's extrapolation or, when applicable, a complex step derivative is available. A simple difference method is also provided. Simple difference is (usually) less accurate but is much quicker than Richardson's extrapolation and provides a useful cross-check. Methods are provided for real scalar and vector valued functions.

Maintained by Paul Gilbert. Last updated 2 months ago.

3.8 match 1 star 14.10 score 1.2k scripts 3.1k dependents

kornl

mutoss:Unified Multiple Testing Procedures

Designed to ease the application and comparison of multiple hypothesis testing procedures for FWER, gFWER, FDR and FDX. Methods are standardized and usable by the accompanying 'mutossGUI'.

Maintained by Kornelius Rohmeyer. Last updated 12 months ago.

6.3 match 4 stars 8.44 score 24 scripts 16 dependents

rpolars

polars:Lightning-Fast 'DataFrame' Library

Lightning-fast 'DataFrame' library written in 'Rust'. Convert R data to 'Polars' data and vice versa. Perform fast, lazy, larger-than-memory and optimized data queries. 'Polars' is interoperable with the package 'arrow', as both are based on the 'Apache Arrow' Columnar Format.

Maintained by Soren Welling. Last updated 4 days ago.

arrow polars rust

4.4 match 499 stars 12.01 score 1.0k scripts 2 dependents

c4tb

shinyExprPortal:A Configurable 'shiny' Portal for Sharing Analysis of Molecular Expression Data

Enables deploying configuration file-based 'shiny' apps with minimal programming for interactive exploration and analysis showcase of molecular expression data. For exploration, supports visualization of correlations between rows of an expression matrix and a table of observations, such as clinical measures, and comparison of changes in expression over time. For showcase, enables visualizing the results of differential expression from packages such as 'limma', co-expression modules from 'WGCNA' and lower dimensional projections.

Maintained by Rafael Henkin. Last updated 8 months ago.

bioinformatics data-analysis transcriptomics

9.8 match 5 stars 5.30 score 8 scripts

spatstat

spatstat:Spatial Point Pattern Analysis, Model-Fitting, Simulation, Tests

Comprehensive open-source toolbox for analysing Spatial Point Patterns. Focused mainly on two-dimensional point patterns, including multitype/marked points, in any spatial region. Also supports three-dimensional point patterns, space-time point patterns in any number of dimensions, point patterns on a linear network, and patterns of other geometrical objects. Supports spatial covariate data such as pixel images. Contains over 3000 functions for plotting spatial data, exploratory data analysis, model-fitting, simulation, spatial sampling, model diagnostics, and formal inference. Data types include point patterns, line segment patterns, spatial windows, pixel images, tessellations, and linear networks. Exploratory methods include quadrat counts, K-functions and their simulation envelopes, nearest neighbour distance and empty space statistics, Fry plots, pair correlation function, kernel smoothed intensity, relative risk estimation with cross-validated bandwidth selection, mark correlation functions, segregation indices, mark dependence diagnostics, and kernel estimates of covariate effects. Formal hypothesis tests of random pattern (chi-squared, Kolmogorov-Smirnov, Monte Carlo, Diggle-Cressie-Loosmore-Ford, Dao-Genton, two-stage Monte Carlo) and tests for covariate effects (Cox-Berman-Waller-Lawson, Kolmogorov-Smirnov, ANOVA) are also supported. Parametric models can be fitted to point pattern data using the functions ppm(), kppm(), slrm(), dppm() similar to glm(). Types of models include Poisson, Gibbs and Cox point processes, Neyman-Scott cluster processes, and determinantal point processes. Models may involve dependence on covariates, inter-point interaction, cluster formation and dependence on marks. Models are fitted by maximum likelihood, logistic regression, minimum contrast, and composite likelihood methods. A model can be fitted to a list of point patterns (replicated point pattern data) using the function mppm(). The model can include random effects and fixed effects depending on the experimental design, in addition to all the features listed above. Fitted point process models can be simulated, automatically. Formal hypothesis tests of a fitted model are supported (likelihood ratio test, analysis of deviance, Monte Carlo tests) along with basic tools for model selection (stepwise(), AIC()) and variable selection (sdr). Tools for validating the fitted model include simulation envelopes, residuals, residual plots and Q-Q plots, leverage and influence diagnostics, partial residuals, and added variable plots.

Maintained by Adrian Baddeley. Last updated 2 months ago.

cluster-process cox-point-process gibbs-process kernel-density network-analysis point-process poisson-process spatial-analysis spatial-data spatial-data-analysis spatial-statistics spatstat statistical-methods statistical-models statistical-tests statistics

3.1 match 200 stars 16.32 score 5.5k scripts 41 dependents

icosa-grid

icosa:Global Triangular and Penta-Hexagonal Grids Based on Tessellated Icosahedra

Implementation of icosahedral grids in three dimensions. The spherical-triangular tessellation can be set to create grids with custom resolutions. Both the primary triangular and their inverted penta-hexagonal grids can be calculated. Additional functions are provided that allow plotting of the grids and associated data, the interaction of the grids with other raster and vector objects, and treating the grids as graphs.

Maintained by Adam T. Kocsis. Last updated 7 months ago.

grid cpp

9.4 match 4 stars 5.41 score 65 scripts

loosolab

wilson:Web-Based Interactive Omics Visualization

Tool-set of modules for creating web-based applications that use plot based strategies to visualize and analyze multi-omics data. This package utilizes the 'shiny' and 'plotly' frameworks to provide a user friendly dashboard for interactive plotting.

Maintained by Hendrik Schultheis. Last updated 4 years ago.

11.8 match 2 stars 4.30 score 7 scripts

atlas-aai

ratlas:ATLAS Formatting Functions and Templates

Provides templates, formatting tools, and 'ggplot2' themes tailored for the Accessible Teaching, Learning, and Assessment Systems (ATLAS) organization. These templates facilitate the creation of topic guides and technical reports, while the formatting functions enable users to customize numbers and tables to meet specific requirements. Additionally, the themes ensure a uniform visual style across graphics.

Maintained by W. Jake Thompson. Last updated 4 months ago.

bookdown ggplot2 ggplot2-themes rmarkdown rmarkdown-template

6.9 match 29 stars 7.28 score 20 scripts

epiverse-trace

epiparameter:Classes and Helper Functions for Working with Epidemiological Parameters

Classes and helper functions for loading, extracting, converting, manipulating, plotting and aggregating epidemiological parameters for infectious diseases. Epidemiological parameters extracted from the literature are loaded from the 'epiparameterDB' R package.

Maintained by Joshua W. Lambert. Last updated 2 months ago.

data-access data-package epidemiology epiverse probability-distribution

5.0 match 33 stars 9.84 score 102 scripts 1 dependent

ecpolley

SuperLearner:Super Learner Prediction

Implements the super learner prediction method and contains a library of prediction algorithms to be used in the super learner.

Maintained by Eric Polley. Last updated 1 year ago.

3.8 match 274 stars 12.85 score 2.1k scripts 36 dependents

r-forge

GPArotation:GPA Factor Rotation

Gradient Projection Algorithm Rotation for Factor Analysis. See '?GPArotation.Intro' for more details.

Maintained by Paul Gilbert. Last updated 2 months ago.

3.8 match 1 star 12.66 score 1.1k scripts 362 dependents

cbhurley

PairViz:Visualization using Graph Traversal

Improving graphics by ameliorating order effects, using Eulerian tours and Hamiltonian decompositions of graphs. References for the methods presented here are C.B. Hurley and R.W. Oldford (2010) <doi:10.1198/jcgs.2010.09136> and C.B. Hurley and R.W. Oldford (2011) <doi:10.1007/s00180-011-0229-5>.

Maintained by Catherine Hurley. Last updated 3 years ago.

8.2 match 1 star 5.75 score 42 scripts 3 dependents

andrewcparnell

simmr:A Stable Isotope Mixing Model

Fits Stable Isotope Mixing Models (SIMMs) and is meant as a longer-term replacement for the previous widely-used package SIAR. SIMMs are used to infer dietary proportions of organisms consuming various food sources from observations on the stable isotope values taken from the organisms' tissue samples. However, SIMMs can also be used in other scenarios, such as in sediment mixing or the composition of fatty acids. The main functions are simmr_load() and simmr_mcmc(). The two vignettes contain a quick start and a full listing of all the features. The methods used are detailed in the papers Parnell et al 2010 <doi:10.1371/journal.pone.0009672>, and Parnell et al 2013 <doi:10.1002/env.2221>.

Maintained by Emma Govan. Last updated 11 months ago.

openblas cpp jags

6.3 match 31 stars 7.58 score 81 scripts

bioc

MultiAssayExperiment:Software for the integration of multi-omics experiments in Bioconductor

Harmonize data management of multiple experimental assays performed on an overlapping set of specimens. It provides a familiar Bioconductor user experience by extending concepts from SummarizedExperiment, supporting an open-ended mix of standard data classes for individual assays, and allowing subsetting by genomic ranges or rownames. Facilities are provided for reshaping data into wide and long formats for adaptability to graphing and downstream analysis.

Maintained by Marcel Ramos. Last updated 2 months ago.

infrastructure datarepresentation bioconductor bioconductor-package genomics nci-itcr tcga u24ca289073

3.1 match 71 stars 14.95 score 670 scripts 127 dependents

facebook

prophet:Automatic Forecasting Procedure

Implements a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.

Maintained by Sean Taylor. Last updated 5 months ago.

forecasting python cpp

3.0 match 19k stars 15.53 score 976 scripts 13 dependents

klmr

box:Write Reusable, Composable and Modular R Code

A modern module system for R. Organise code into hierarchical, composable, reusable modules, and use it effortlessly across projects via a flexible, declarative dependency loading syntax.

Maintained by Konrad Rudolph. Last updated 13 days ago.

modules packages

3.8 match 888 stars 12.39 score 47 scripts 4 dependents

cynkra

dm:Relational Data Models

Provides tools for working with multiple related tables, stored as data frames or in a relational database. Multiple tables (data and metadata) are stored in a compound object, which can then be manipulated with a pipe-friendly syntax.

Maintained by Kirill Müller. Last updated 2 months ago.

data-model data-warehousing datawarehousing dbi dbplyr relational-databases

3.1 match 511 stars 14.81 score 410 scripts 8 dependents

bioc

scMultiSim:Simulation of Multi-Modality Single Cell Data Guided By Gene Regulatory Networks and Cell-Cell Interactions

scMultiSim simulates paired single cell RNA-seq, single cell ATAC-seq and RNA velocity data, while incorporating mechanisms of gene regulatory networks, chromatin accessibility and cell-cell interactions. It allows users to tune various parameters controlling the amount of each biological factor, variation of gene-expression levels, the influence of chromatin accessibility on RNA sequence data, and so on. It can be used to benchmark various computational methods for single cell multi-omics data, and to assist in experimental design of wet-lab experiments.

Maintained by Hechen Li. Last updated 5 months ago.

singlecell transcriptomics geneexpression sequencing experimentaldesign

6.4 match 23 stars 7.15 score 11 scripts

prioritizr

prioritizr:Systematic Conservation Prioritization in R

Systematic conservation prioritization using mixed integer linear programming (MILP). It provides a flexible interface for building and solving conservation planning problems. Once built, conservation planning problems can be solved using a variety of commercial and open-source exact algorithm solvers. By using exact algorithm solvers, solutions can be generated that are guaranteed to be optimal (or within a pre-specified optimality gap). Furthermore, conservation problems can be constructed to optimize the spatial allocation of different management actions or zones, meaning that conservation practitioners can identify solutions that benefit multiple stakeholders. To solve large-scale or complex conservation planning problems, users should install the Gurobi optimization software (available from <https://www.gurobi.com/>) and the 'gurobi' R package (see Gurobi Installation Guide vignette for details). Users can also install the IBM CPLEX software (<https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer>) and the 'cplexAPI' R package (available at <https://github.com/cran/cplexAPI>). Additionally, the 'rcbc' R package (available at <https://github.com/dirkschumacher/rcbc>) can be used to generate solutions using the CBC optimization software (<https://github.com/coin-or/Cbc>). For further details, see Hanson et al. (2025) <doi:10.1111/cobi.14376>.

Maintained by Richard Schuster. Last updated 11 days ago.

biodiversity conservation conservation-planner optimization prioritization solver spatial cpp

3.8 match 124 stars 11.82 score 584 scripts 2 dependents

chjackson

flexsurv:Flexible Parametric Survival and Multi-State Models

Flexible parametric models for time-to-event data, including the Royston-Parmar spline model, generalized gamma and generalized F distributions. Any user-defined parametric distribution can be fitted, given at least an R function defining the probability density or hazard. There are also tools for fitting and predicting from fully parametric multi-state models, based on either cause-specific hazards or mixture models.

Maintained by Christopher Jackson. Last updated 2 months ago.

cpp

3.3 match 57 stars 13.31 score 632 scripts 43 dependents

oscarkjell

text:Analyses of Text using Transformers Models from HuggingFace, Natural Language Processing and Machine Learning

Link R with Transformers from Hugging Face to transform text variables to word embeddings, where the word embeddings are used to statistically test the mean difference between sets of texts, compute semantic similarity scores between texts, predict numerical variables, and visualize statistically significant words according to various dimensions, etc. For more information see <https://www.r-text.org>.

Maintained by Oscar Kjell. Last updated 4 days ago.

deep-learning machine-learning nlp transformers openjdk

3.3 match 146 stars 13.16 score 436 scripts 1 dependent

bioc

Gviz:Plotting data and annotation information along genomic coordinates

Genomic data analysis requires integrated visualization of known genomic information and new experimental data. Gviz uses the biomaRt and the rtracklayer packages to perform live annotation queries to Ensembl and UCSC and translates this to e.g. gene/transcript structures in viewports of the grid graphics package. This results in genomic information plotted together with your data.

Maintained by Robert Ivanek. Last updated 5 months ago.

visualization microarray sequencing

3.3 match 79 stars 13.08 score 1.4k scripts 48 dependents

aphalo

photobiologyWavebands:Waveband Definitions for UV, VIS, and IR Radiation

Constructors of waveband objects for commonly used biological spectral weighting functions (BSWFs) and for different wavebands describing named ranges of wavelengths in the ultraviolet (UV), visible (VIS) and infrared (IR) regions of the electromagnetic spectrum. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 8 months ago.

6.7 match 1 star 6.53 score 378 scripts 3 dependents

aphalo

SunCalcMeeus:Sun Position and Daylight Calculations

Compute the position of the sun, and local solar time using Meeus' formulae. Compute day and/or night length using different twilight definitions or arbitrary sun elevation angles. This package is part of the 'r4photobiology' suite, Aphalo, P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>. Algorithms from Meeus (1998, ISBN:0943396611).

Maintained by Pedro J. Aphalo. Last updated 2 months ago.

6.7 match 1 star 6.49 score 6 scripts 13 dependents

eblondel

rsdmx:Tools for Reading SDMX Data and Metadata

Set of classes and methods to read data and metadata documents exchanged through the Statistical Data and Metadata Exchange (SDMX) framework, currently focusing on the SDMX XML standard format (SDMX-ML).

Maintained by Emmanuel Blondel. Last updated 18 days ago.

api datastructures dsd read readsdmx sdmx sdmx-format sdmx-provider sdmx-standards statistics timeseries web-services

4.7 match 105 stars 9.22 score 4 dependents

r-quantities

errors:Uncertainty Propagation for R Vectors

Support for measurement errors in R vectors, matrices and arrays: automatic uncertainty propagation and reporting. Documentation about 'errors' is provided in the paper by Ucar, Pebesma & Azcorra (2018, <doi:10.32614/RJ-2018-075>), included in this package as a vignette; see 'citation("errors")' for details.

Maintained by Iñaki Ucar. Last updated 2 months ago.

error-propagation uncertainty

5.3 match 49 stars 8.18 score 86 scripts 4 dependents

bioc

minfi:Analyze Illumina Infinium DNA methylation arrays

Tools to analyze & visualize Illumina Infinium methylation arrays.

Maintained by Kasper Daniel Hansen. Last updated 4 months ago.

immunooncology dnamethylation differentialmethylation epigenetics microarray methylationarray multichannel twochannel dataimport normalization preprocessing qualitycontrol

3.3 match 60 stars 12.83 score 996 scripts 26 dependents

genentech

psborrow2:Bayesian Dynamic Borrowing Analysis and Simulation

Bayesian dynamic borrowing is an approach to incorporating external data to supplement a randomized, controlled trial analysis in which external data are incorporated in a dynamic way (e.g., based on similarity of outcomes); see Viele 2013 <doi:10.1002/pst.1589> for an overview. This package implements the hierarchical commensurate prior approach to dynamic borrowing as described in Hobbs 2011 <doi:10.1111/j.1541-0420.2011.01564.x>. There are three main functionalities. First, 'psborrow2' provides a user-friendly interface for applying dynamic borrowing on the study results and handles the Markov Chain Monte Carlo sampling on behalf of the user. Second, 'psborrow2' provides a simulation framework to compare different borrowing parameters (e.g. full borrowing, no borrowing, dynamic borrowing) and other trial and borrowing characteristics (e.g. sample size, covariates) in a unified way. Third, 'psborrow2' provides a set of functions to generate data for simulation studies, and also allows the user to specify their own data generation process. This package is designed to use the sampling functions from 'cmdstanr', which can be installed from <https://stan-dev.r-universe.dev>.

Maintained by Matt Secrest. Last updated 1 month ago.

bayesian-dynamic-borrowing psborrow2 simulation-study

5.4 match 18 stars 7.87 score 16 scripts

banboo-data

r4googleads:'Google Ads API' Interface

Interface for the 'Google Ads API'. 'Google Ads' is an online advertising service that enables advertisers to display advertising to web users (see <https://developers.google.com/google-ads/> for more information).

Maintained by Johannes Burkhardt. Last updated 3 years ago.

google-ads-api marketing-analytics marketing-automation

8.6 match 4 stars 4.78 score 6 scripts

bioc

bsseq:Analyze, manage and store whole-genome methylation data

A collection of tools for analyzing and visualizing whole-genome methylation data from sequencing. This includes whole-genome bisulfite sequencing and Oxford nanopore data.

Maintained by Kasper Daniel Hansen. Last updated 3 months ago.

dnamethylation cpp

3.3 match 37 stars 12.26 score 676 scripts 15 dependents

bioc

OmnipathR:OmniPath web service client and more

A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on github).

Maintained by Denes Turei. Last updated 19 days ago.

graphandnetwork network pathways software thirdpartyclient dataimport datarepresentation genesignaling generegulation systemsbiology transcriptomics singlecell annotation kegg complexes enzyme-ptm networks networks-biology omnipath proteins quarto

4.1 match 126 stars 9.90 score 226 scripts 2 dependents

thej022214

OUwie:Analysis of Evolutionary Rates in an OU Framework

Estimates rates for continuous character evolution under Brownian motion and a new set of Ornstein-Uhlenbeck based Hansen models that allow both the strength of the pull and stochastic motion to vary across selective regimes. Beaulieu et al (2012).

Maintained by Jeremy Beaulieu. Last updated 1 month ago.

4.8 match 9 stars 8.37 score 161 scripts

gbm-developers

gbm:Generalized Boosted Regression Models

An implementation of extensions to Freund and Schapire's AdaBoost algorithm and Friedman's gradient boosting machine. Includes regression methods for least squares, absolute loss, t-distribution loss, quantile regression, logistic, multinomial logistic, Poisson, Cox proportional hazards partial likelihood, AdaBoost exponential loss, Huberized hinge loss, and Learning to Rank measures (LambdaMart). Originally developed by Greg Ridgeway. Newer version available at github.com/gbm-developers/gbm3.

Maintained by Greg Ridgeway. Last updated 9 months ago.

cpp

2.9 match 52 stars 13.85 score 6.8k scripts 91 dependents

nsaph-software

CausalGPS:Matching on Generalized Propensity Scores with Continuous Exposures

Provides a framework for estimating causal effects of a continuous exposure using observational data, and implementing matching and weighting on the generalized propensity score. Wu, X., Mealli, F., Kioumourtzoglou, M.A., Dominici, F. and Braun, D., 2022. Matching on generalized propensity scores with continuous exposures. Journal of the American Statistical Association, pp.1-29.

Maintained by Naeem Khoshnevis. Last updated 9 months ago.

cpp openmp

5.3 match 24 stars 7.67 score 39 scripts

ndphillips

FFTrees:Generate, Visualise, and Evaluate Fast-and-Frugal Decision Trees

Create, visualize, and test fast-and-frugal decision trees (FFTs) using the algorithms and methods described by Phillips, Neth, Woike & Gaissmaier (2017), <doi:10.1017/S1930297500006239>. FFTs are simple and transparent decision trees for solving binary classification problems. FFTs can be preferable to more complex algorithms because they require very little information, are easy to understand and communicate, and are robust against overfitting.

Maintained by Hansjoerg Neth. Last updated 5 months ago.

4.1 match 136 stars 9.53 score 144 scripts

kkholst

mets:Analysis of Multivariate Event Times

Implementation of various statistical models for multivariate event history data <doi:10.1007/s10985-013-9244-x>. Including multivariate cumulative incidence models <doi:10.1002/sim.6016>, and bivariate random effects probit models (Liability models) <doi:10.1016/j.csda.2015.01.014>. Modern methods for survival analysis, including regression modelling (Cox, Fine-Gray, Ghosh-Lin, Binomial regression) with fast computation of influence functions.

Maintained by Klaus K. Holst. Last updated 3 days ago.

multivariate-time-to-event survival-analysis time-to-event fortran openblas cpp

2.9 match 14 stars 13.47 score 236 scripts 42 dependents

bioc

bumphunter:Bump Hunter

Tools for finding bumps in genomic data

Maintained by Tamilselvi Guharaj. Last updated 5 months ago.

dnamethylation epigenetics infrastructure multiplecomparison immunooncology

3.3 match 16 stars 11.74 score 210 scripts 42 dependents

jtextor

dagitty:Graphical Analysis of Structural Causal Models

A port of the web-based software 'DAGitty', available at <https://dagitty.net>, for analyzing structural causal models (also known as directed acyclic graphs or DAGs). This package computes covariate adjustment sets for estimating causal effects, enumerates instrumental variables, derives testable implications (d-separation and vanishing tetrads), generates equivalent models, and includes a simple facility for data simulation.

Maintained by Johannes Textor. Last updated 3 months ago.

3.0 match 302 stars 12.83 score 1.7k scripts 11 dependents

ropensci

unifir:A Unifying API for Calling the 'Unity' '3D' Video Game Engine

Functions for the creation and manipulation of scenes and objects within the 'Unity' '3D' video game engine (<https://unity.com/>). Specific focuses include the creation and import of terrain data and 'GameObjects' as well as scene management.

Maintained by Michael Mahoney. Last updated 1 year ago.

unifir unity unity3d visualization

6.3 match 29 stars 6.16 score 11 scripts 1 dependent

mschubert

clustermq:Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)

Evaluate arbitrary function calls using workers on HPC schedulers in a single line of code. All processing is done on the network without accessing the file system. Remote schedulers are supported via SSH.

Maintained by Michael Schubert. Last updated 25 days ago.

cluster high-performance-computing lsf sge slurm ssh zeromq3 cpp

3.8 match 149 stars 10.23 score 253 scripts

mazamascience

MazamaSpatialUtils:Spatial Data Download and Utility Functions

A suite of conversion functions to create internally standardized spatial polygons data frames. Utility functions use these data sets to return values such as country, state, time zone, watershed, etc. associated with a set of longitude/latitude pairs. (They also make cool maps.)

Maintained by Jonathan Callahan. Last updated 5 months ago.

4.7 match 5 stars 8.09 score 282 scripts 2 dependents

bioc

gDRstyle:A package with style requirements for the gDR suite

This package serves as a helper for the whole gDR suite. It supports good development practices by maintaining style requirements and style tests for the other packages. It also contains build helpers to ensure that all package requirements are met.

Maintained by Arkadiusz Gladki. Last updated 1 month ago.

software infrastructure

6.2 match 2 stars 6.10 score 2 scripts

stan-dev

rstantools:Tools for Developing R Packages Interfacing with 'Stan'

Provides various tools for developers of R packages interfacing with 'Stan' <https://mc-stan.org>, including functions to set up the required package structure, S3 generics and default methods to unify function naming across 'Stan'-based R packages, and vignettes with recommendations for developers.

Maintained by Jonah Gabry. Last updated 2 months ago.

bayesian-data-analysis bayesian-statistics developer-tools stan

2.9 match 50 stars 13.09 score 134 scripts 222 dependents

biodiverse

unmarked:Models for Data from Unmarked Animals

Fits hierarchical models of animal abundance and occurrence to data collected using survey methods such as point counts, site occupancy sampling, distance sampling, removal sampling, and double observer sampling. Parameters governing the state and observation processes can be modeled as functions of covariates. References: Kellner et al. (2023) <doi:10.1111/2041-210X.14123>, Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.

Maintained by Ken Kellner. Last updated 2 days ago.

openblas cpp openmp

2.9 match 4 stars 13.03 score 652 scripts 12 dependents

snoweye

pbdZMQ:Programming with Big Data -- Interface to 'ZeroMQ'

'ZeroMQ' is a well-known library for high-performance asynchronous messaging in scalable, distributed applications. This package provides high level R wrapper functions to easily utilize 'ZeroMQ'. We mainly focus on interactive client/server programming frameworks. For convenience, a minimal 'ZeroMQ' library (4.2.2) is shipped with 'pbdZMQ', which can be used if no system installation of 'ZeroMQ' is available. A few wrapper functions compatible with 'rzmq' are also provided.

Maintained by Wei-Chen Chen. Last updated 6 months ago.

zeromq3

3.8 match 17 stars 9.92 score 46 scripts 26 dependents

bioc

recount3:Explore and download data from the recount3 project

The recount3 package enables access to a large amount of uniformly processed RNA-seq data from human and mouse. You can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junction level with sample metadata and QC statistics. In addition, we provide access to sample coverage BigWig files.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

coverage differentialexpression geneexpression rnaseq sequencing software dataimport annotation-agnostic bioconductor count derfinder exon gene human illumina junction mouse recount recount3

4.6 match 33 stars 8.03 score 216 scripts

ngreifer

cobalt:Covariate Balance Tables and Plots

Generate balance tables and plots for covariates of groups preprocessed through matching, weighting or subclassification, for example, using propensity scores. Includes integration with 'MatchIt', 'WeightIt', 'MatchThem', 'twang', 'Matching', 'optmatch', 'CBPS', 'ebal', 'cem', 'sbw', and 'designmatch' for assessing balance on the output of their preprocessing functions. Users can also specify data for balance assessment not generated through the above packages. Also included are methods for assessing balance in clustered or multiply imputed data sets or data sets with multi-category, continuous, or longitudinal treatments.

Maintained by Noah Greifer. Last updated 11 months ago.

causal-inference propensity-scores

2.9 match 75 stars 12.98 score 1.0k scripts 8 dependents

bioc

plotgardener:Coordinate-Based Genomic Visualization Package for R

Coordinate-based genomic visualization package for R. It grants users the ability to programmatically produce complex, multi-paneled figures. Tailored for genomics, plotgardener allows users to visualize large complex genomic datasets and provides exquisite control over how plots are placed and arranged on a page.

Maintained by Nicole Kramer. Last updated 5 months ago.

visualization genomeannotation functionalgenomics genomeassembly hic cpp

3.6 match 308 stars 10.16 score 167 scripts 3 dependents

aphalo

photobiologyPlants:Plant Photobiology Related Functions and Data

Provides functions for quantifying visible (VIS) and ultraviolet (UV) radiation in relation to the photoreceptors Phytochromes, Cryptochromes, and UVR8 which are present in plants. It also includes data sets on the optical properties of plants. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 2 months ago.

6.7 match 5.52 score 55 scripts

kbroman

qtl:Tools for Analyzing QTL Experiments

Analysis of experimental crosses to identify genes (called quantitative trait loci, QTLs) contributing to variation in quantitative traits. Broman et al. (2003) <doi:10.1093/bioinformatics/btg112>.

Maintained by Karl W Broman. Last updated 7 months ago.

openblas

2.9 match 80 stars 12.79 score 2.4k scripts 29 dependents

plotly

plotly:Create Interactive Web Graphics via 'plotly.js'

Create interactive web graphics from 'ggplot2' graphs and/or a custom interface to the (MIT-licensed) JavaScript library 'plotly.js' inspired by the grammar of graphics.

Maintained by Carson Sievert. Last updated 3 months ago.

d3js data-visualization ggplot2 javascript plotly shiny webgl

1.9 match 2.6k stars 19.43 score 93k scripts 797 dependents

hneth

unikn:Graphical Elements of the University of Konstanz's Corporate Design

Define and use graphical elements of corporate design manuals in R. The 'unikn' package provides color functions (by defining dedicated colors and color palettes, and commands for finding, changing, viewing, and using them) and styled text elements (e.g., for marking, underlining, or plotting colored titles). The pre-defined range of colors and text decoration functions is based on the corporate design of the University of Konstanz <https://www.uni-konstanz.de/>, but can be adapted and extended for other purposes or institutions.

Maintained by Hansjoerg Neth. Last updated 3 months ago.

branding color color-palette colorscheme corporate-design palette text-decoration university-colors visual-identity

4.1 match 39 stars 8.82 score 156 scripts 2 dependents

wrathematics

getPass:Masked User Input

A micro-package for reading "passwords", i.e. reading user input with masking, so that the input is not displayed as it is typed. Currently we have support for 'RStudio', the command line (every OS), and any platform where 'tcltk' is present.

Maintained by Drew Schmidt. Last updated 1 year ago.

3.3 match 48 stars 10.84 score 348 scripts 65 dependents

bergsmat

yamlet:Versatile Curation of Table Metadata

A YAML-based mechanism for working with table metadata. Supports compact syntax for creating, modifying, viewing, exporting, importing, displaying, and plotting metadata coded as column attributes. The 'yamlet' dialect is valid 'YAML' with defaults and conventions chosen to improve readability. See ?yamlet, ?decorate, ?modify, ?io_csv, and ?ggplot.decorated.

Maintained by Tim Bergsma. Last updated 23 days ago.

6.0 match 2 stars 5.99 score 60 scripts 1 dependent

thomasp85

patchwork:The Composer of Plots

The 'ggplot2' package provides a strong API for sequentially building up a plot, but does not concern itself with composition of multiple plots. 'patchwork' is a package that expands the API to allow for arbitrarily complex composition of plots by, among others, providing mathematical operators for combining multiple plots. Other packages that try to address this need (but with a different approach) are 'gridExtra' and 'cowplot'.

Maintained by Thomas Lin Pedersen. Last updated 6 months ago.

ggplot-extension ggplot2 visualization

1.8 match 2.5k stars 19.79 score 82k scripts 657 dependents

joblion

rtkore:'STK++' Core Library Integration to 'R' using 'Rcpp'

'STK++' <http://www.stkpp.org> is a collection of C++ classes for statistics, clustering, linear algebra, arrays (with an 'Eigen'-like API), regression, dimension reduction, etc. The integration of the library to 'R' is using 'Rcpp'. The 'rtkore' package includes the header files from the 'STK++' core library. All files contain only template classes and/or inline functions. 'STK++' is licensed under the GNU LGPL version 2 or later. 'rtkore' (the 'stkpp' integration into 'R') is licensed under the GNU GPL version 2 or later. See file LICENSE.note for details.

Maintained by Serge Iovleff. Last updated 10 months ago.

cpp

9.1 match 3.90 score 25 scripts 2 dependents

quanteda

spacyr:Wrapper to the 'spaCy' 'NLP' Library

An R wrapper to the 'Python' 'spaCy' 'NLP' library, from <https://spacy.io>.

Maintained by Kenneth Benoit. Last updated 1 month ago.

extract-entities nlp spacy speech-tagging

3.3 match 253 stars 10.68 score 408 scripts 6 dependents

lme4

lme4:Linear Mixed-Effects Models using 'Eigen' and S4

Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the 'Eigen' C++ library for numerical linear algebra and 'RcppEigen' "glue".

Maintained by Ben Bolker. Last updated 3 days ago.

cpp

1.7 match 647 stars 20.69 score 35k scripts 1.5k dependents

wrathematics

ngram:Fast n-Gram 'Tokenization'

An n-gram is a sequence of n "words" taken, in order, from a body of text. This is a collection of utilities for creating, displaying, summarizing, and "babbling" n-grams. The 'tokenization' and "babbling" are handled by very efficient C code, which can even be built as its own standalone library. The babbler is a simple Markov chain. The package also offers a vignette with complete example 'workflows' and information about the utilities offered in the package.

Maintained by Drew Schmidt. Last updated 1 year ago.

ngram text text-mining

3.3 match 71 stars 10.45 score 844 scripts 7 dependents

bioc

oligo:Preprocessing tools for oligonucleotide arrays

A package to analyze oligonucleotide arrays (expression/SNP/tiling/exon) at probe-level. It currently supports Affymetrix (CEL files) and NimbleGen arrays (XYS files).

Maintained by Benilton Carvalho. Last updated 8 days ago.

microarray onechannel twochannel preprocessing snp differentialexpression exonarray geneexpression dataimport zlib

3.3 match 3 stars 10.42 score 528 scripts 10 dependents

insightsengineering

teal.goshawk:Longitudinal Visualization `teal` Modules

Modules that produce web interfaces through which longitudinal visualizations can be dynamically modified and displayed. These include box plots, correlation plots, density distribution plots, line plots, scatter plots and spaghetti plots with accompanying summaries. Data are expected in ADaM structure. Requires analysis subject level (ADSL) and analysis laboratory (ADLB) data sets. Beyond core variables, the Limit of Quantification flag variable (LOQFL) is expected with levels 'Y', 'N' or NA.

Maintained by Nick Paszty. Last updated 20 days ago.

modules nest

5.3 match 3 stars 6.59 score 2 scripts

bioc

methylKit:DNA methylation analysis from high-throughput bisulfite sequencing results

methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to deal with sequencing data from RRBS and its variants, but also target-capture methods and whole genome bisulfite sequencing. It also has functions to analyze base-pair resolution 5hmC data from experimental protocols such as oxBS-Seq and TAB-Seq. Methylation calling can be performed directly from Bismark aligned BAM files.

Maintained by Altuna Akalin. Last updated 17 days ago.

dnamethylation sequencing methylseq genome-biology methylation statistical-analysis visualization curl bzip2 xz-utils zlib cpp

2.9 match 220 stars 11.80 score 578 scripts 3 dependents

bioc

CrispRVariants:Tools for counting and visualising mutations in a target location

CrispRVariants provides tools for analysing the results of a CRISPR-Cas9 mutagenesis sequencing experiment, or other sequencing experiments where variants within a given region are of interest. These tools allow users to localize variant allele combinations with respect to any genomic location (e.g. the Cas9 cut site), plot allele combinations and calculate mutation rates with flexible filtering of unrelated variants.

Maintained by Helen Lindsay. Last updated 5 months ago.

immunooncology crispr genomicvariation variantdetection geneticvariability datarepresentation visualization sequencing

6.2 match 5.51 score 32 scripts

thomasp85

ggraph:An Implementation of Grammar of Graphics for Graphs and Networks

The grammar of graphics as implemented in ggplot2 is a poor fit for graph and network visualizations due to its reliance on tabular data input. ggraph is an extension of the ggplot2 API tailored to graph visualizations and provides the same flexible approach to building up plots layer by layer.

Maintained by Thomas Lin Pedersen. Last updated 1 year ago.

ggplot-extension ggplot2 graph-visualization network-visualization visualization cpp

2.0 match 1.1k stars 16.96 score 9.2k scripts 111 dependents

epiverse-trace

epidemics:Composable Epidemic Scenario Modelling

A library of compartmental epidemic models taken from the published literature, and classes to represent affected populations, public health response measures including non-pharmaceutical interventions on social contacts, non-pharmaceutical and pharmaceutical interventions that affect disease transmissibility, vaccination regimes, and disease seasonality, which can be combined to compose epidemic scenario models.

Maintained by Rosalind Eggo. Last updated 9 months ago.

decision-support epidemic-modelling epidemic-simulations epidemiology epiverse infectious-disease-dynamics model-library non-pharmaceutical-interventions rcpp rcppeigen scenario-analysis vaccination cpp

4.5 match 9 stars 7.48 score 59 scripts

nsaph-software

GPCERF:Gaussian Processes for Estimating Causal Exposure Response Curves

Provides a non-parametric Bayesian framework based on Gaussian process priors for estimating causal effects of a continuous exposure and detecting change points in the causal exposure response curves using observational data. Ren, B., Wu, X., Braun, D., Pillai, N., & Dominici, F. (2021). "Bayesian modeling for exposure response curve via gaussian processes: Causal effects of exposure to air pollution on health outcomes." arXiv preprint <doi:10.48550/arXiv.2105.03454>.

Maintained by Boyu Ren. Last updated 11 months ago.

cpp

5.3 match 9 stars 6.33 score 16 scripts

pbreheny

visreg:Visualization of Regression Models

Provides a convenient interface for constructing plots to visualize the fit of regression models arising from a wide variety of models in R ('lm', 'glm', 'coxph', 'rlm', 'gam', 'locfit', 'lmer', 'randomForest', etc.)

Maintained by Patrick Breheny. Last updated 17 days ago.

3.1 match 61 stars 10.64 score 2.4k scripts

thomasgstewart

tangram.pipe:Row-by-Row Table Building

Builds tables with customizable rows. Users can specify the type of data to use for each row, as well as how to handle missing data and the types of comparison tests to run on the table columns.

Maintained by Andrew Guide. Last updated 3 years ago.

9.2 match 1 stars 3.60 score 1 scripts

cran

propagate:Propagation of Uncertainty

Propagation of uncertainty using higher-order Taylor expansion and Monte Carlo simulation.

Maintained by Andrej-Nikolai Spiess. Last updated 7 years ago.

cpp

6.9 match 2 stars 4.82 score 183 scripts 3 dependents
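
As a rough illustration of the two approaches named in the 'propagate' description, the base-R sketch below propagates uncertainty through a hypothetical function f(x, y) = x * y, once with a Taylor approximation and once with Monte Carlo simulation. It does not call the 'propagate' package or mirror its API. The function, the means, and the standard deviations are illustrative assumptions, the inputs are assumed independent and normally distributed, and only the first-order Taylor term is used (the package description mentions higher-order expansion).

```r
# Hypothetical example: propagate uncertainty through f(x, y) = x * y.
f <- function(x, y) x * y

mu    <- c(x = 5,   y = 2)     # assumed input means
sigma <- c(x = 0.1, y = 0.05)  # assumed input standard deviations (independent)

# First-order Taylor expansion: var(f) ~ sum((df/dx_i)^2 * sigma_i^2).
# Analytic partial derivatives of x * y, evaluated at the means.
grad <- c(x = mu[["y"]], y = mu[["x"]])
sd_taylor <- sqrt(sum(grad^2 * sigma^2))

# Monte Carlo simulation: sample the inputs, evaluate f, take the spread.
set.seed(1)
n <- 1e5
sims <- f(rnorm(n, mu[["x"]], sigma[["x"]]),
          rnorm(n, mu[["y"]], sigma[["y"]]))
sd_mc <- sd(sims)

# The two estimates should be close for this nearly linear case.
c(taylor = sd_taylor, monte_carlo = sd_mc)
```

For strongly non-linear functions or large input uncertainties the two estimates can diverge, which is one reason to compare them.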

rte-antares-rpackage

antaresEditObject:Edit an 'Antares' Simulation

Edit an 'Antares' simulation before running it: create new areas, links, thermal clusters or binding constraints, or edit existing ones. Update 'Antares' general and optimization settings. 'Antares' is an open source power system generator; more information is available at <https://antares-simulator.org/>.

Maintained by Tatiana Vargas. Last updated 28 days ago.

antares-simulation cluster energy monte-carlo-simulation rte

3.8 match 8 stars 8.76 score 101 scripts

rstudio

rmarkdown:Dynamic Documents for R

Convert R Markdown documents into a variety of formats.

Maintained by Yihui Xie. Last updated 4 months ago.

literate-programming markdown pandoc rmarkdown

1.5 match 2.9k stars 21.79 score 14k scripts 3.7k dependents

sbg

sevenbridges2:The 'Seven Bridges Platform' API Client

R client and utilities for 'Seven Bridges Platform' API, from 'Cancer Genomics Cloud' to other 'Seven Bridges' supported platforms. API documentation is hosted publicly at <https://docs.sevenbridges.com/docs/the-api>.

Maintained by Marko Trifunovic. Last updated 20 days ago.

api-client bioinformatics cloud sevenbridges

5.5 match 2 stars 5.90 score 4 scripts

ohdsi

PatientLevelPrediction:Develop Clinical Prediction Models Using the Common Data Model

A user friendly way to create patient level prediction models using the Observational Medical Outcomes Partnership Common Data Model. Given a cohort of interest and an outcome of interest, the package can use data in the Common Data Model to build a large set of features. These features can then be used to fit a predictive model with a number of machine learning algorithms. This is further described in Reps (2017) <doi:10.1093/jamia/ocy032>.

Maintained by Egill Fridgeirsson. Last updated 9 days ago.

hades openjdk

3.0 match 190 stars 10.85 score 297 scripts

mrc-ide

odin:ODE Generation and Integration

Generate systems of ordinary differential equations (ODE) and integrate them, using a domain specific language (DSL). The DSL uses R's syntax, but compiles to C in order to efficiently solve the system. A solver is not provided, but instead interfaces to the packages 'deSolve' and 'dde' are generated. With these, while solving the differential equations, no allocations are done and the calculations remain entirely in compiled code. Alternatively, a model can be transpiled to R for use in contexts where a C compiler is not present. After compilation, models can be inspected to return information about parameters and outputs, or intermediate values after calculations. 'odin' is not targeted at any particular domain and is suitable for any system that can be expressed primarily as mathematical expressions. Additional support is provided for working with delays (delay differential equations, DDE), using interpolated functions during integration, and for integrating quantities that represent arrays.

Maintained by Rich FitzJohn. Last updated 9 months ago.

infrastructure

3.3 match 106 stars 9.74 score 290 scripts 3 dependents

bioc

goseq:Gene Ontology analyser for RNA-seq and other length biased data

Detects Gene Ontology and/or other user defined categories which are over/under represented in RNA-seq data.

Maintained by Federico Marini. Last updated 5 months ago.

immunooncology sequencing go geneexpression transcription rnaseq differentialexpression annotation genesetenrichment kegg pathways software

3.3 match 1 stars 9.67 score 636 scripts 9 dependents

mlopez-ibanez

irace:Iterated Racing for Automatic Algorithm Configuration

Iterated race is an extension of the Iterated F-race method for the automatic configuration of optimization algorithms, that is, (offline) tuning their parameters by finding the most appropriate settings given a set of instances of an optimization problem. M. López-Ibáñez, J. Dubois-Lacoste, L. Pérez Cáceres, T. Stützle, and M. Birattari (2016) <doi:10.1016/j.orp.2016.09.002>.

Maintained by Manuel López-Ibáñez. Last updated 1 month ago.

algorithm-configuration hyperparameter-tuning irace optimization-algorithms

3.1 match 63 stars 10.28 score 103 scripts 1 dependents

bioc

pcaExplorer:Interactive Visualization of RNA-seq Data Using a Principal Components Approach

This package provides functionality for interactive visualization of RNA-seq datasets based on Principal Components Analysis. The methods provided allow for quick information extraction and effective data exploration. A Shiny application encapsulates the whole analysis.

Maintained by Federico Marini. Last updated 3 months ago.

immunooncology visualization rnaseq dimensionreduction principalcomponent qualitycontrol gui reportwriting shinyapps bioconductor principal-components reproducible-research rna-seq-analysis rna-seq-data shiny transcriptome user-friendly

3.3 match 56 stars 9.63 score 180 scripts

bioc

standR:Spatial transcriptome analyses of Nanostring's DSP data in R

standR is a user-friendly R package providing functions to assist in conducting good-practice analysis of Nanostring's GeoMX DSP data. All functions in the package are built on the SpatialExperiment object, allowing integration into various spatial transcriptomics-related packages from Bioconductor. standR allows data inspection, quality control, normalization, batch correction and evaluation with informative visualizations.

Maintained by Ning Liu. Last updated 1 month ago.

spatial transcriptomics geneexpression differentialexpression qualitycontrol normalization experimenthubsoftware

4.3 match 18 stars 7.39 score 45 scripts

bioc

gCrisprTools:Suite of Functions for Pooled Crispr Screen QC and Analysis

Set of tools for evaluating pooled high-throughput screening experiments, typically employing CRISPR/Cas9 or shRNA expression cassettes. Contains methods for interrogating library and cassette behavior within an experiment, identifying differentially abundant cassettes, aggregating signals to identify candidate targets for empirical validation, hypothesis testing, and comprehensive reporting. Version 2.0 extends these applications to include a variety of tools for contextualizing and integrating signals across many experiments, incorporates extended signal enrichment methodologies via the "sparrow" package, and streamlines many formal requirements to aid in interpretability.

Maintained by Russell Bainer. Last updated 5 months ago.

immunooncology crispr pooledscreens experimentaldesign biomedicalinformatics cellbiology functionalgenomics pharmacogenomics pharmacogenetics systemsbiology differentialexpression genesetenrichment genetics multiplecomparison normalization preprocessing qualitycontrol rnaseq regression software visualization

6.7 match 4.78 score 8 scripts

mmedl94

lionfish:Interactive 'tourr' Using 'python'

Extends the functionality of the 'tourr' package with an interactive graphical user interface. The interactivity allows users to refine their 'tourr' results by manual intervention, which allows for integration of expert knowledge and aids the interpretation of results. For more information on 'tourr' see Wickham et al. (2011) <doi:10.18637/jss.v040.i02> or <https://github.com/ggobi/tourr>.

Maintained by Matthias Medl. Last updated 5 days ago.

data-sience data-visualization dimensionality-reduction exploratory-data-analysis interactive interactive-visualizations tourr

5.3 match 1 stars 5.96 score

rexyai

RestRserve:A Framework for Building HTTP API

Makes it easy to create high-performance, full-featured HTTP APIs from R functions. Provides high-level classes such as 'Request', 'Response', 'Application', and 'Middleware' in order to streamline server-side application development. Out of the box, it serves requests using the 'Rserve' package, but it is flexible enough to integrate with other HTTP servers such as 'httpuv'.

Maintained by Dmitry Selivanov. Last updated 4 days ago.

http-server openapi rest-api swagger-ui cpp

3.3 match 283 stars 9.56 score 95 scripts 1 dependents

eblondel

cleangeo:Cleaning Geometries from Spatial Objects

Provides a set of utility tools to inspect spatial objects and to facilitate handling and reporting of topology errors and geometry validity issues with sp objects. It also provides a geometry cleaner that will fix all geometry problems and eliminate (or at least reduce) the likelihood of having issues when doing spatial data processing.

Maintained by Emmanuel Blondel. Last updated 2 years ago.

cleaning cleaning-geometries gis sp spatial

4.7 match 45 stars 6.82 score 99 scripts 1 dependents

qinwf

jiebaR:Chinese Text Segmentation

Chinese text segmentation, keyword extraction and part-of-speech tagging for R.

Maintained by Qin Wenfeng. Last updated 5 years ago.

chinese chinese-text-segmentation cppjieba jieba lexical-analysis nlp cpp

3.1 match 348 stars 10.18 score 456 scripts 6 dependents

snoweye

phyclust:Phylogenetic Clustering (Phyloclustering)

Phylogenetic clustering (phyloclustering) is an evolutionary Continuous Time Markov Chain model-based approach to identify population structure from molecular data without assuming linkage equilibrium. The package phyclust (Chen 2011) provides a convenient implementation of phyloclustering for DNA and SNP data, capable of clustering individuals into subpopulations and identifying molecular sequences representative of those subpopulations. It is designed in C for performance, interfaced with R for visualization, and incorporates other popular open source programs including ms (Hudson 2002) <doi:10.1093/bioinformatics/18.2.337>, seq-gen (Rambaut and Grassly 1997) <doi:10.1093/bioinformatics/13.3.235>, Hap-Clustering (Tzeng 2005) <doi:10.1002/gepi.20063> and PAML baseml (Yang 1997, 2007) <doi:10.1093/bioinformatics/13.5.555>, <doi:10.1093/molbev/msm088>, for simulating data, additional analyses, and searching for the best tree. See the phyclust website for more information, documentation and examples.

Maintained by Wei-Chen Chen. Last updated 2 years ago.

3.8 match 9 stars 8.45 score 126 scripts 8 dependents

bcjaeger

table.glue:Make and Apply Customized Rounding Specifications for Tables

Translate double and integer valued data into character values formatted for tabulation in manuscripts or other types of academic reports.

Maintained by Byron Jaeger. Last updated 4 months ago.

5.3 match 7 stars 5.92 score 60 scripts

microsoft

finnts:Microsoft Finance Time Series Forecasting Framework

Automated time series forecasting developed by Microsoft Finance. The Microsoft Finance Time Series Forecasting Framework, aka Finn, can be used to forecast any component of the income statement, balance sheet, or any other area of interest to finance. Finn can be used to forecast any numerical quantity over time. While it can be applied outside of the finance domain, Finn was built to meet the needs of financial analysts to better forecast their businesses within a company, and has many built-in features that are specific to the needs of financial forecasters. Happy forecasting!

Maintained by Mike Tokic. Last updated 26 days ago.

business data-science feature-selection finance finnts forecasting machine-learning microsoft time-series

3.3 match 193 stars 9.45 score 39 scripts

mikewlcheung

metaSEM:Meta-Analysis using Structural Equation Modeling

A collection of functions for conducting meta-analysis using a structural equation modeling (SEM) approach via the 'OpenMx' and 'lavaan' packages. It also implements various procedures to perform meta-analytic structural equation modeling on the correlation and covariance matrices, see Cheung (2015) <doi:10.3389/fpsyg.2014.01521>.

Maintained by Mike Cheung. Last updated 10 days ago.

meta-analysis meta-analytic-sem missing-data multilevel-models multivariate-analysis structural-equation-modeling structural-equation-models

3.3 match 30 stars 9.43 score 208 scripts 1 dependents

kimberlywebb

COMBO:Correcting Misclassified Binary Outcomes in Association Studies

Use frequentist and Bayesian methods to estimate parameters from a binary outcome misclassification model. These methods correct for the problem of "label switching" by assuming that the sum of outcome sensitivity and specificity is at least 1. A description of the analysis methods is available in Hochstedler and Wells (2023) <doi:10.48550/arXiv.2303.10215>.

Maintained by Kimberly Hochstedler Webb. Last updated 20 days ago.

jags cpp

6.2 match 1 stars 5.08 score 4 scripts

biostatomics

Coxmos:Cox MultiBlock Survival

This software package provides Cox survival analysis for high-dimensional and multiblock datasets. It encompasses a suite of functions ranging from classical Cox regression to newer analyses, including the Cox proportional hazards model, stepwise Cox regression, Elastic-Net Cox regression, Sparse Partial Least Squares Cox regression (sPLS-COX) incorporating three distinct strategies, and two Multiblock-PLS Cox regression (MB-sPLS-COX) methods. The tool is designed to handle high-dimensional data and provides tools for cross-validation, plot generation, and additional resources for interpreting results. While references are available within the corresponding functions, key literature is mentioned below. Terry M Therneau (2024) <https://CRAN.R-project.org/package=survival>, Noah Simon et al. (2011) <doi:10.18637/jss.v039.i05>, Philippe Bastien et al. (2005) <doi:10.1016/j.csda.2004.02.005>, Philippe Bastien (2008) <doi:10.1016/j.chemolab.2007.09.009>, Philippe Bastien et al. (2014) <doi:10.1093/bioinformatics/btu660>, Kassu Mehari Beyene and Anouar El Ghouch (2020) <doi:10.1002/sim.8671>, Florian Rohart et al. (2017) <doi:10.1371/journal.pcbi.1005752>.

Maintained by Pedro Salguero García. Last updated 12 days ago.

5.9 match 1 stars 5.30 score 5 scripts

prodriguezsosa

conText:'a la Carte' on Text (ConText) Embedding Regression

A fast, flexible and transparent framework to estimate context-specific word and short document embeddings using the 'a la carte' embeddings approach developed by Khodak et al. (2018) <arXiv:1805.05388> and evaluate hypotheses about covariate effects on embeddings using the regression framework developed by Rodriguez et al. (2021) <https://github.com/prodriguezsosa/EmbeddingRegression>.

Maintained by Pedro L. Rodriguez. Last updated 11 months ago.

3.3 match 104 stars 9.40 score 1.7k scripts

selcukorkmaz

PubChemR:Interface to the 'PubChem' Database for Chemical Data Retrieval

Provides an interface to the 'PubChem' database via the PUG REST <https://pubchem.ncbi.nlm.nih.gov/docs/pug-rest> and PUG View <https://pubchem.ncbi.nlm.nih.gov/docs/pug-view> services. This package allows users to automatically access chemical and biological data from 'PubChem', including compounds, substances, assays, and various other data types. Functions are available to retrieve data in different formats, perform searches, and access detailed annotations.

Maintained by Selcuk Korkmaz. Last updated 6 months ago.

5.5 match 2 stars 5.62 score 23 scripts

karissawhiting

cbioportalR:Browse and Query Clinical and Genomic Data from cBioPortal

Provides R users with direct access to genomic and clinical data from the 'cBioPortal' web resource via user-friendly functions that wrap 'cBioPortal's' existing API endpoints <https://www.cbioportal.org/api/swagger-ui/index.html>. Users can browse and query genomic data on mutations, copy number alterations and fusions, as well as data on tumor mutational burden ('TMB'), microsatellite instability status ('MSI'), 'FACETS' and select clinical data points (depending on the study). See <https://www.cbioportal.org/> and Gao et al., (2013) <doi:10.1126/scisignal.2004088> for more information on the cBioPortal web resource.

Maintained by Karissa Whiting. Last updated 4 months ago.

4.6 match 21 stars 6.70 score 20 scripts

bioc

megadepth:megadepth: BigWig and BAM related utilities

This package provides an R interface to Megadepth by Christopher Wilks available at https://github.com/ChristopherWilks/megadepth. It is particularly useful for computing the coverage of a set of genomic regions across bigWig or BAM files. With this package, you can build base-pair coverage matrices for regions or annotations of your choice from BigWig files. Megadepth was used to create the raw files provided by https://bioconductor.org/packages/recount3.

Maintained by David Zhang. Last updated 3 months ago.

software coverage dataimport transcriptomics rnaseq preprocessing bam bigwig daspter megadepth recount2 recount3

4.6 match 12 stars 6.69 score 7 scripts 3 dependents

mazamascience

AirMonitor:Air Quality Data Analysis

Utilities for working with hourly air quality monitoring data with a focus on small particulates (PM2.5). A compact data model is structured as a list with two dataframes. A 'meta' dataframe contains spatial and measuring device metadata associated with deployments at known locations. A 'data' dataframe contains a 'datetime' column followed by columns of measurements associated with each "device-deployment". Algorithms to calculate NowCast and the associated Air Quality Index (AQI) are defined at the US Environmental Protection Agency AirNow program: <https://document.airnow.gov/technical-assistance-document-for-the-reporting-of-daily-air-quailty.pdf>.

Maintained by Jonathan Callahan. Last updated 6 months ago.

4.7 match 7 stars 6.57 score 178 scripts

atlasoflivingaustralia

galah:Biodiversity Data from the GBIF Node Network

The Global Biodiversity Information Facility ('GBIF', <https://www.gbif.org>) sources data from an international network of data providers, known as 'nodes'. Several of these nodes - the "living atlases" (<https://living-atlases.gbif.org>) - maintain their own web services using software originally developed by the Atlas of Living Australia ('ALA', <https://www.ala.org.au>). 'galah' enables the R community to directly access data and resources hosted by 'GBIF' and its partner nodes.

Maintained by Martin Westgate. Last updated 1 month ago.

3.3 match 43 stars 9.17 score 275 scripts 1 dependents

aphalo

photobiologyLEDs:Spectral Data for Light-Emitting-Diodes

Spectral emission data for some frequently used light emitting diodes available as electronic components. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 4 months ago.

5.9 match 2 stars 5.13 score 34 scripts

matthewblackwell

Amelia:A Program for Missing Data

A tool that "multiply imputes" missing data in a single cross-section (such as a survey), from a time series (like variables collected for each year in a country), or from a time-series-cross-sectional data set (such as collected by years for each of several countries). Amelia II implements our bootstrapping-based algorithm that gives essentially the same answers as the standard IP or EMis approaches, is usually considerably faster than existing approaches and can handle many more variables. Unlike Amelia I and other statistically rigorous imputation software, it virtually never crashes (but please let us know if you find to the contrary!). The program also generalizes existing approaches by allowing for trends in time series across observations within a cross-sectional unit, as well as priors that allow experts to incorporate beliefs they have about the values of missing cells in their data. Amelia II also includes useful diagnostics of the fit of multiple imputation models. The program works from the R command line or via a graphical user interface that does not require users to know R.

Maintained by Matthew Blackwell. Last updated 4 months ago.

openblas cpp

3.3 match 1 stars 9.06 score 1.4k scripts 7 dependents

aphalo

photobiologyFilters:Spectral Transmittance and Spectral Reflectance Data

Spectral 'transmittance' data for frequently used filters and similar materials. Plastic sheets and films; photography filters; theatrical gels; machine-vision filters; various types of window glass; optical glass and some laboratory plastics and glassware. Spectral reflectance data for frequently encountered materials. Part of the 'r4photobiology' suite, Aphalo P. J. (2015) <doi:10.19232/uv4pb.2015.1.14>.

Maintained by Pedro J. Aphalo. Last updated 7 days ago.

5.9 match 5.08 score 40 scripts

juba

scatterD3:D3 JavaScript Scatterplot from R

Creates 'D3' 'JavaScript' scatterplots from 'R' with interactive features: panning, zooming, tooltips, etc.

Maintained by Julien Barnier. Last updated 7 months ago.

d3 d3js htmlwidgets shiny

3.3 match 160 stars 8.98 score 125 scripts 4 dependents

bioc

recount:Explore and download data from the recount project

Explore and download data from the recount project available at https://jhubiostatistics.shinyapps.io/recount/. Using the recount package you can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level, the raw counts, the phenotype metadata used, the urls to the sample coverage bigWig files or the mean coverage bigWig file for a particular study. The RangedSummarizedExperiment objects can be used by different packages for performing differential expression analysis. Using http://bioconductor.org/packages/derfinder you can perform annotation-agnostic differential expression analyses with the data from the recount project as described at http://www.nature.com/nbt/journal/v35/n4/full/nbt.3838.html.

Maintained by Leonardo Collado-Torres. Last updated 3 months ago.

coverage differentialexpression geneexpression rnaseq sequencing software dataimport immunooncology annotation-agnostic bioconductor count derfinder deseq2 exon gene human illumina junction recount

3.1 match 41 stars 9.57 score 498 scripts 3 dependents

business-science

anomalize:Tidy Anomaly Detection

The 'anomalize' package enables a "tidy" workflow for detecting anomalies in data. The main functions are time_decompose(), anomalize(), and time_recompose(). When combined, they make it simple to decompose time series, detect anomalies, and create bands separating the "normal" data from the anomalous data at scale (i.e. for multiple time series). Time series decomposition is used to remove trend and seasonal components via the time_decompose() function; methods include seasonal decomposition of time series by Loess ("stl") and seasonal decomposition by piecewise medians ("twitter"). The anomalize() function implements two methods for anomaly detection of residuals: the interquartile range ("iqr") and generalized extreme studentized deviation ("gesd"). These methods are based on those used in the 'forecast' package and the Twitter 'AnomalyDetection' package. Refer to the associated functions for specific references for these methods.

Maintained by Matt Dancho. Last updated 1 year ago.

anomaly anomaly-detection decomposition detect-anomalies iqr time-series

3.1 match 339 stars 9.56 score 332 scripts
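
The interquartile-range ("iqr") rule mentioned in the 'anomalize' description can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: `flag_iqr()` is a hypothetical helper, the multiplier `k = 3` is an assumed threshold, and the simulated vector merely stands in for the remainder left after trend and seasonality have been removed.

```r
# Hypothetical helper: flag values outside [Q1 - k * IQR, Q3 + k * IQR].
# The multiplier k = 3 is an assumption chosen for illustration.
flag_iqr <- function(resid, k = 3) {
  q <- quantile(resid, c(0.25, 0.75), names = FALSE)
  spread <- q[2] - q[1]  # interquartile range
  resid < q[1] - k * spread | resid > q[2] + k * spread
}

set.seed(42)
resid <- rnorm(100)              # stand-in for decomposition residuals
resid[c(10, 60)] <- c(12, -15)   # inject two obvious anomalies

# Positions flagged as anomalous.
which(flag_iqr(resid))
```

A smaller `k` flags more points; the choice trades missed anomalies against false alarms.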

nsj3

riojaPlot:Stratigraphic Diagrams in R

Stratigraphic diagrams in R.

Maintained by Steve Juggins. Last updated 2 months ago.

6.5 match 18 stars 4.60 score 11 scripts

matthias-studer

WeightedCluster:Clustering of Weighted Data

Clusters state sequences and weighted data. It provides an optimized weighted PAM algorithm as well as functions for aggregating replicated cases, computing cluster quality measures for a range of clustering solutions and plotting (fuzzy) clusters of state sequences. Parametric bootstrap methods to validate typologies of sequences are also provided. Finally, it provides fuzzy and crisp CLARA algorithms to cluster large databases with sequence analysis.

Maintained by Matthias Studer. Last updated 3 months ago.

cpp

5.3 match 5.55 score 106 scripts 4 dependents

bioc

iCOBRA:Comparison and Visualization of Ranking and Assignment Methods

This package provides functions for calculation and visualization of performance metrics for evaluation of ranking and binary classification (assignment) methods. Various types of performance plots can be generated programmatically. The package also contains a shiny application for interactive exploration of results.

Maintained by Charlotte Soneson. Last updated 3 months ago.

classification visualization

3.3 match 14 stars 8.86 score 192 scripts 1 dependents

bioc

cageminer:Candidate Gene Miner

This package aims to integrate GWAS-derived SNPs and coexpression networks to mine candidate genes associated with a particular phenotype. For that, users must define a set of guide genes, which are known genes involved in the studied phenotype. Additionally, the mined candidates can be given a score that favors candidates that are hubs and/or transcription factors. The scores can then be used to rank and select the top n most promising genes for downstream experiments.

Maintained by Fabrício Almeida-Silva. Last updated 5 months ago.

software snp functionalprediction genomewideassociation geneexpression networkenrichment variantannotation functionalgenomics network

6.8 match 1 stars 4.30 score 5 scripts

bioc

ChIPanalyser:ChIPanalyser: Predicting Transcription Factor Binding Sites

ChIPanalyser is a package to predict and understand TF binding by utilizing a statistical thermodynamic model. The model incorporates 4 main factors thought to drive TF binding: Chromatin State, Binding energy, Number of bound molecules and a scaling factor modulating TF binding affinity. Taken together, ChIPanalyser produces ChIP-like profiles that closely mimic the patterns seen in real ChIP-seq data.

Maintained by Patrick C.N. Martin. Last updated 5 months ago.

software biologicalquestion workflowstep transcription sequencing chiponchip coverage alignment chipseq sequencematching dataimport peakdetection

6.7 match 4.38 score 12 scripts

bioc

Cardinal:A mass spectrometry imaging toolbox for statistical analysis

Implements statistical & computational tools for analyzing mass spectrometry imaging datasets, including methods for efficient pre-processing, spatial segmentation, and classification.

Maintained by Kylie Ariel Bemis. Last updated 3 months ago.

software infrastructure proteomics lipidomics massspectrometry imagingmassspectrometry immunooncology normalization clustering classification regression

2.8 match 47 stars 10.34 score 200 scripts

vegandevs

lmodel2:Model II Regression

Computes model II simple linear regression using ordinary least squares (OLS), major axis (MA), standard major axis (SMA), and ranged major axis (RMA).

Maintained by Jari Oksanen. Last updated 3 months ago.

3.0 match 3 stars 9.68 score 235 scripts 18 dependents
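
To make the difference between three of the methods listed in the 'lmodel2' description concrete, the base-R sketch below computes OLS, major axis (MA), and standard major axis (SMA) slopes from the sample variances and covariance using the standard textbook formulas. It does not use the 'lmodel2' package, omits ranged major axis, and runs on simulated data that is purely an assumption for illustration.

```r
# Simulated data (an assumption for illustration): y is roughly 2 * x.
set.seed(7)
x <- rnorm(200)
y <- 2 * x + rnorm(200, sd = 0.5)

sxx <- var(x)
syy <- var(y)
sxy <- cov(x, y)

slopes <- c(
  # Ordinary least squares: minimises vertical distances.
  ols = sxy / sxx,
  # Major axis: minimises perpendicular distances (first principal axis).
  ma  = (syy - sxx + sqrt((syy - sxx)^2 + 4 * sxy^2)) / (2 * sxy),
  # Standard major axis: ratio of standard deviations, signed by the covariance.
  sma = sign(sxy) * sqrt(syy / sxx)
)

# Every fitted line passes through the point (mean(x), mean(y)).
intercepts <- mean(y) - slopes * mean(x)

rbind(slope = slopes, intercept = intercepts)
```

The OLS slope equals the SMA slope multiplied by the correlation coefficient, so its magnitude can never exceed that of the SMA slope.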

tgoodbody

sgsR:Structurally Guided Sampling

Structurally guided sampling (SGS) approaches for airborne laser scanning (ALS; LIDAR). Primary functions provide means to generate data-driven stratifications & methods for allocating samples. Intermediate functions for calculating and extracting important information about input covariates and samples are also included. Processing outcomes are intended to help forest and environmental management practitioners better optimize field sample placement as well as assess and augment existing sample networks in the context of data distributions and conditions. ALS data is the primary intended use case, however any rasterized remote sensing data can be used, enabling data-driven stratifications and sampling approaches.

Maintained by Tristan RH Goodbody. Last updated 14 days ago.

3.8 match 46 stars 7.50 score 34 scripts

theomichelot

moveHMM:Animal Movement Modelling using Hidden Markov Models

Provides tools for animal movement modelling using hidden Markov models. These include processing of tracking data, fitting hidden Markov models to movement data, visualization of data and fitted model, decoding of the state process, etc. <doi:10.1111/2041-210X.12578>.

Maintained by Theo Michelot. Last updated 1 year ago.

openblas cpp

3.3 match 38 stars 8.63 score 112 scripts

lindanab

mecor:Measurement Error Correction in Linear Models with a Continuous Outcome

Covariate measurement error correction is implemented by means of regression calibration by Carroll RJ, Ruppert D, Stefanski LA & Crainiceanu CM (2006, ISBN:1584886331), efficient regression calibration by Spiegelman D, Carroll RJ & Kipnis V (2001) <doi:10.1002/1097-0258(20010115)20:1%3C139::AID-SIM644%3E3.0.CO;2-K> and maximum likelihood estimation by Bartlett JW, Stavola DBL & Frost C (2009) <doi:10.1002/sim.3713>. Outcome measurement error correction is implemented by means of the method of moments by Buonaccorsi JP (2010, ISBN:1420066560) and efficient method of moments by Keogh RH, Carroll RJ, Tooze JA, Kirkpatrick SI & Freedman LS (2014) <doi:10.1002/sim.7011>. Standard error estimation of the corrected estimators is implemented by means of the Delta method by Rosner B, Spiegelman D & Willett WC (1990) <doi:10.1093/oxfordjournals.aje.a115715> and Rosner B, Spiegelman D & Willett WC (1992) <doi:10.1093/oxfordjournals.aje.a116453>, the Fieller method described by Buonaccorsi JP (2010, ISBN:1420066560), and the Bootstrap by Carroll RJ, Ruppert D, Stefanski LA & Crainiceanu CM (2006, ISBN:1584886331).

Maintained by Linda Nab. Last updated 3 years ago.

linear-models measurement-error statistics

5.6 match 6 stars 5.07 score 13 scripts

predictiveecology

NetLogoR:Build and Run Spatially Explicit Agent-Based Models

Build and run spatially explicit agent-based models using only the R platform. 'NetLogoR' follows the same framework as the 'NetLogo' software (Wilensky (1999) <http://ccl.northwestern.edu/netlogo/>) and is a translation in R of the structure and functions of 'NetLogo'. 'NetLogoR' provides new R classes to define model agents and functions to implement spatially explicit agent-based models in the R environment. This package lets users benefit from the fast and easy coding phase of the highly developed 'NetLogo' framework, coupled with the versatility, power and massive resources of the R software. Examples of two models from the NetLogo software repository (Ants <http://ccl.northwestern.edu/netlogo/models/Ants> and Wolf-Sheep-Predation <http://ccl.northwestern.edu/netlogo/models/WolfSheepPredation>), and a third, Butterfly, from Railsback and Grimm (2012) <https://www.railsback-grimm-abm-book.com/>, all written using 'NetLogoR', are available. The 'NetLogo' code of the original version of these models is provided alongside. A programming guide inspired by the 'NetLogo' Programming Guide (<https://ccl.northwestern.edu/netlogo/docs/programming.html>) and a dictionary of equivalences for 'NetLogo' primitives (<https://ccl.northwestern.edu/netlogo/docs/dictionary.html>) are also available. NOTE: To increment 'time', these functions can use a for loop or can be integrated with a discrete event simulator, such as 'SpaDES' (<https://cran.r-project.org/package=SpaDES>). The suggested package 'fastshp' can be installed with 'install.packages("fastshp", repos = ("<https://rforge.net>"), type = "source")'.

Maintained by Eliot J B McIntire. Last updated 4 months ago.

4.1 match 38 stars 6.94 score 19 scripts

snoweye

EMCluster:EM Algorithm for Model-Based Clustering of Finite Mixture Gaussian Distribution

EM algorithms and several efficient initialization methods for model-based clustering of finite mixture Gaussian distribution with unstructured dispersion in both unsupervised and semi-supervised learning.

Maintained by Wei-Chen Chen. Last updated 6 months ago.

openblas

3.8 match 18 stars 7.53 score 123 scripts 2 dependents

bmcclintock

momentuHMM:Maximum Likelihood Analysis of Animal Movement Behavior Using Multivariate Hidden Markov Models

Extended tools for analyzing telemetry data using generalized hidden Markov models. Features of momentuHMM (pronounced "momentum") include data pre-processing and visualization, fitting HMMs to location and auxiliary biotelemetry or environmental data, biased and correlated random walk movement models, discrete- or continuous-time HMMs, continuous- or discrete-space movement models, approximate Langevin diffusion models, hierarchical HMMs, multiple imputation for incorporating location measurement error and missing data, user-specified design matrices and constraints for covariate modelling of parameters, random effects, decoding of the state process, visualization of fitted models, model checking and selection, and simulation. See McClintock and Michelot (2018) <doi:10.1111/2041-210X.12995>.

Maintained by Brett McClintock. Last updated 1 month ago.

openblas cpp

3.3 match 43 stars 8.47 score 162 scripts

bioc

multiMiR:Integration of multiple microRNA-target databases with their disease and drug associations

A collection of microRNAs/targets from external resources, including validated microRNA-target databases (miRecords, miRTarBase and TarBase), predicted microRNA-target databases (DIANA-microT, ElMMo, MicroCosm, miRanda, miRDB, PicTar, PITA and TargetScan) and microRNA-disease/drug databases (miR2Disease, Pharmaco-miR VerSe and PhenomiR).

Maintained by Spencer Mahaffey. Last updated 5 months ago.

mirnadata homo_sapiens_data mus_musculus_data rattus_norvegicus_data organismdata microrna-sequence sql

3.3 match 20 stars 8.45 score 141 scripts

bioc

ClassifyR:A framework for cross-validated classification problems, with applications to differential variability and differential distribution testing

The software formalises a framework for classification and survival model evaluation in R. There are four stages: data transformation, feature selection, model training, and prediction. The requirements of variable types and variable order are fixed, but specialised variables for functions can also be provided. The framework is wrapped in a driver loop that reproducibly carries out a number of cross-validation schemes. Functions for differential mean, differential variability, and differential distribution are included. Additional functions may be developed by the user, by creating an interface to the framework.

Maintained by Dario Strbenac. Last updated 7 days ago.

classification survival cpp

3.3 match 5 stars 8.36 score 45 scripts 3 dependents

thomasgstewart

repello:Reports from Trello in R

Creates reports from Trello, a collaborative project organization and list-making application <https://trello.com/>. Reports are created by comparing individual Trello board cards from two different points in time and documenting any changes made to the cards.

Maintained by Andrew Guide. Last updated 4 years ago.

9.2 match 3.00 score

bioc

GeneTonic:Enjoy Analyzing And Integrating The Results From Differential Expression Analysis And Functional Enrichment Analysis

This package provides functionality to combine the existing pieces of the transcriptome data and results, making it easier to generate insightful observations and hypotheses. Its usage is made easy with a Shiny application, combining the benefits of interactivity and reproducibility, e.g. by capturing the features and gene sets of interest highlighted during the live session, and creating an HTML report as an artifact where text, code, and output coexist. Using the GeneTonicList as a standardized container for all the required components, it is possible to simplify the generation of multiple visualizations and summaries.

Maintained by Federico Marini. Last updated 2 months ago.

gui geneexpression software transcription transcriptomics visualization differentialexpression pathways reportwriting genesetenrichment annotation go shinyapps bioconductor bioconductor-package data-exploration data-visualization functional-enrichment-analysis gene-expression pathway-analysis reproducible-research rna-seq-analysis rna-seq-data shiny transcriptome user-friendly

3.3 match 77 stars 8.28 score 37 scripts 1 dependent

theharmonylab

topics:Creating and Significance Testing Language Features for Visualisation

Implements differential language analysis with statistical tests and offers various language visualization techniques for n-grams and topics. It also supports the 'text' package. For more information, visit <https://r-topics.org/> and <https://www.r-text.org/>.

Maintained by Oscar Kjell. Last updated 8 days ago.

openjdk

3.3 match 5 stars 8.28 score 22 scripts 2 dependents

hneth

ds4psy:Data Science for Psychologists

All datasets and functions required for the examples and exercises of the book "Data Science for Psychologists" (by Hansjoerg Neth, Konstanz University, 2023), freely available at <https://bookdown.org/hneth/ds4psy/>. The book and course introduce principles and methods of data science to students of psychology and other biological or social sciences. The 'ds4psy' package primarily provides datasets, but also functions for data generation and manipulation (e.g., of text and time data) and graphics that are used in the book and its exercises. All functions included in 'ds4psy' are designed to be explicit and instructive, rather than efficient or elegant.

Maintained by Hansjoerg Neth. Last updated 1 month ago.

data-literacy data-science education exploratory-data-analysis psychology social-sciences visualisation

4.0 match 22 stars 6.79 score 70 scripts

r-lib

httr:Tools for Working with URLs and HTTP

Useful tools for working with HTTP organised by HTTP verbs (GET(), POST(), etc). Configuration functions make it easy to control additional request components (authenticate(), add_headers() and so on).

Maintained by Hadley Wickham. Last updated 1 year ago.

api curl http

1.3 match 989 stars 20.56 score 29k scripts 4.3k dependents

r-spatial

link2GI:Linking Geographic Information Systems, Remote Sensing and Other Command Line Tools

Functions and tools for using open GIS and remote sensing command-line interfaces in a reproducible environment.

Maintained by Chris Reudenbach. Last updated 4 months ago.

3.0 match 26 stars 9.05 score 78 scripts 1 dependent

hneth

unicol:The Colors of your University

Most universities use specific color combinations to express their unique brand identity. The 'unicol' package provides the colors and color palettes of various universities for easy plotting and printing in R. We collect and provide a diverse range of color palettes for creating scientific visualizations.

Maintained by Hansjoerg Neth. Last updated 7 months ago.

branding color color-palettes color-schemes corporate-design university-colors visual-identity

4.1 match 9 stars 6.58 score 10 scripts

resplab

predtools:Prediction Model Tools

Provides additional functions for evaluating predictive models, including plotting calibration curves and model-based Receiver Operating Characteristic (mROC) based on Sadatsafavi et al (2021) <arXiv:2003.00316>.

Maintained by Amin Adibi. Last updated 2 years ago.

cpp

4.0 match 9 stars 6.74 score 77 scripts

flujoo

gm:Create Music with Ease

Provides a simple and intuitive high-level language for music representation. Generates and embeds music scores and audio files in 'RStudio', 'R Markdown' documents, and R 'Jupyter Notebooks'. Internally, uses 'MusicXML' <https://github.com/w3c/musicxml> to represent music, and 'MuseScore' <https://musescore.org/> to convert 'MusicXML'.

Maintained by Renfei Mao. Last updated 8 months ago.

algorithmic-composition music-programming musicxml

3.3 match 207 stars 8.06 score 35 scripts

bioc

matter:Out-of-core statistical computing and signal processing

Toolbox for larger-than-memory scientific computing and visualization, providing efficient out-of-core data structures using files or shared memory, for dense and sparse vectors, matrices, and arrays, with applications to nonuniformly sampled signals and images.

Maintained by Kylie A. Bemis. Last updated 3 months ago.

infrastructure datarepresentation dataimport dimensionreduction preprocessing cpp

2.8 match 57 stars 9.52 score 64 scripts 2 dependents

glenndavis52

munsellinterpol:Interpolate Munsell Renotation Data from Hue Value/Chroma to CIE/RGB

Methods for interpolating data in the Munsell color system following the ASTM D-1535 standard. Hues and chromas with decimal values can be interpolated and converted to/from the Munsell color system and CIE xyY, CIE XYZ, CIE Lab, CIE Luv, or RGB. Includes ISCC-NBS color block lookup. Based on the work by Paul Centore, "The Munsell and Kubelka-Munk Toolbox".

Maintained by Glenn Davis. Last updated 2 months ago.

6.7 match 2 stars 4.01 score 43 scripts 1 dependent

bioc

ATACseqQC:ATAC-seq Quality Control

ATAC-seq, an assay for Transposase-Accessible Chromatin using sequencing, is a rapid and sensitive method for chromatin accessibility analysis. It was developed as an alternative method to MNase-seq, FAIRE-seq and DNAse-seq. Compared to the other methods, ATAC-seq requires less biological sample and less processing time. In the process of analyzing several ATAC-seq datasets produced in our labs, we learned some of the unique aspects of the quality assessment for ATAC-seq data. To help users quickly assess whether their ATAC-seq experiment is successful, we developed the ATACseqQC package, partially following the guideline published in Nature Methods 2013 (Greenleaf et al.), including diagnostic plots of fragment size distribution, proportion of mitochondrial reads, nucleosome positioning pattern, and CTCF or other Transcription Factor footprints.

Maintained by Jianhong Ou. Last updated 2 months ago.

sequencing dnaseq atacseq generegulation qualitycontrol coverage nucleosomepositioning immunooncology

3.8 match 7.12 score 146 scripts 1 dependent

snoweye

pbdMPI:R Interface to MPI for HPC Clusters (Programming with Big Data Project)

A simplified, efficient interface to MPI for HPC clusters. It is a derivation and rethinking of the Rmpi package. pbdMPI embraces the prevalent parallel programming style on HPC clusters. Beyond the interface, a collection of functions for global work with distributed data and resource-independent RNG reproducibility is included. It is based on S4 classes and methods.

Maintained by Wei-Chen Chen. Last updated 6 months ago.

openmpi

3.8 match 2 stars 7.11 score 179 scripts 3 dependents

brian-j-smith

MachineShop:Machine Learning Models and Tools

Meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Approaches for model fitting and prediction of numerical, categorical, or censored time-to-event outcomes include traditional regression models, regularization methods, tree-based methods, support vector machines, neural networks, ensembles, data preprocessing, filtering, and model tuning and selection. Performance metrics are provided for model assessment and can be estimated with independent test sets, split sampling, cross-validation, or bootstrap resampling. Resample estimation can be executed in parallel for faster processing and nested in cases of model tuning and selection. Modeling results can be summarized with descriptive statistics; calibration curves; variable importance; partial dependence plots; confusion matrices; and ROC, lift, and other performance curves.

Maintained by Brian J Smith. Last updated 7 months ago.

classification-models machine-learning predictive-modeling regression-models survival-models

3.3 match 62 stars 7.95 score 121 scripts

gjwgit

rattle:Graphical User Interface for Data Science in R

The R Analytic Tool To Learn Easily (Rattle) provides a collection of utility functions for the data scientist. A Gnome (RGtk2) based graphical interface is included with the aim to provide a simple and intuitive introduction to R for data science, allowing a user to quickly load data from a CSV file (or via ODBC), transform and explore the data, build and evaluate models, and export models as PMML (Predictive Model Markup Language) or as scores. A key aspect of the GUI is that all R commands are logged and commented through the log tab. This can be saved as a standalone R script file and as an aid for the user to learn R or to copy-and-paste directly into R itself. Note that RGtk2 and cairoDevice have been archived on CRAN. See <https://rattle.togaware.com> for installation instructions.

Maintained by Graham Williams. Last updated 3 years ago.

3.1 match 16 stars 8.48 score 3.0k scripts 3 dependents

collinerickson

GauPro:Gaussian Process Fitting

Fits a Gaussian process model to data. Gaussian processes are commonly used in computer experiments to fit an interpolating model. The model is stored as an 'R6' object and can be easily updated with new data. There are options to run in parallel, and 'Rcpp' has been used to speed up calculations. For more info about Gaussian process software, see Erickson et al. (2018) <doi:10.1016/j.ejor.2017.10.002>.

Maintained by Collin Erickson. Last updated 17 hours ago.

openblas cpp openmp

3.1 match 16 stars 8.44 score 104 scripts 1 dependent

mazamascience

MazamaLocationUtils:Manage Spatial Metadata for Known Locations

Utility functions for discovering and managing metadata associated with spatially unique "known locations". Applications include all fields of environmental monitoring (e.g. air and water quality) where data are collected at stationary sites.

Maintained by Jonathan Callahan. Last updated 3 months ago.

4.7 match 5.64 score 108 scripts

desmarais-lab

NetworkInference:Inferring Latent Diffusion Networks

This is an R implementation of the netinf algorithm (Gomez Rodriguez, Leskovec, and Krause, 2010) <doi:10.1145/1835804.1835933>. Given a set of events that spread between a set of nodes, the algorithm infers the most likely stable diffusion network underlying the diffusion process.

Maintained by Fridolin Linder. Last updated 6 years ago.

diffusion diffusion-network netinf-algorithm network-analysis cpp

4.5 match 24 stars 5.90 score 33 scripts

rstudio

bookdown:Authoring Books and Technical Documents with R Markdown

Output formats and utilities for authoring books and technical documents with R Markdown.

Maintained by Yihui Xie. Last updated 3 days ago.

book bookdown epub gitbook html latex rmarkdown

1.5 match 3.9k stars 17.51 score 1.7k scripts 136 dependents

thinkr-open

gitlabr:Access to the 'GitLab' API

Provides R functions to access the API of the project and repository management web application 'GitLab'. For many common tasks (repository file access, issue assignment and status, commenting) convenience wrappers are provided, and in addition the full API can be used by specifying request locations. 'GitLab' is open-source software and can be self-hosted or used on <https://about.gitlab.com>.

Maintained by Sébastien Rochette. Last updated 10 months ago.

gitlab

3.1 match 40 stars 8.40 score 69 scripts 1 dependent

spatpomp-org

spatPomp:Inference for Spatiotemporal Partially Observed Markov Processes

Inference on panel data using spatiotemporal partially-observed Markov process (SpatPOMP) models. The 'spatPomp' package extends 'pomp' to include algorithms taking advantage of the spatial structure in order to assist with handling high dimensional processes. See Asfaw et al. (2024) <doi:10.48550/arXiv.2101.01157> for further description of the package.

Maintained by Edward Ionides. Last updated 4 months ago.

3.5 match 2 stars 7.38 score 93 scripts

bioc

NanoMethViz:Visualise methylation data from Oxford Nanopore sequencing

NanoMethViz is a toolkit for visualising methylation data from Oxford Nanopore sequencing. It can be used to explore methylation patterns from reads derived from Oxford Nanopore direct DNA sequencing with methylation called by callers including nanopolish, f5c and megalodon. The plots in this package allow the visualisation of methylation profiles aggregated over experimental groups and across classes of genomic features.

Maintained by Shian Su. Last updated 7 days ago.

software longread visualization differentialmethylation dnamethylation epigenetics dataimport zlib cpp

3.8 match 26 stars 6.95 score 11 scripts

bioc

debrowser:Interactive Differential Expression Analysis Browser

Bioinformatics platform containing interactive plots and tables for differential gene and region expression studies. Allows visualizing expression data in greater depth, interactively and faster. By changing the parameters, users can easily explore different parts of the data in ways not previously possible. Manually creating and inspecting these plots takes time. With DEBrowser, users can prepare plots without writing any code. Differential expression, PCA and clustering analyses are performed on site and the results are shown in various plots such as scatter, bar, box, volcano and MA plots and heatmaps.

Maintained by Alper Kucukural. Last updated 5 months ago.

sequencing chipseq rnaseq differentialexpression geneexpression clustering immunooncology

3.3 match 61 stars 7.80 score 65 scripts

mbannert

timeseriesdb:A Time Series Database for Official Statistics with R and PostgreSQL

Archive and manage time series data from official statistics. The 'timeseriesdb' package was designed to manage a large catalog of time series from official statistics which are typically published on a monthly, quarterly or yearly basis. Thus 'timeseriesdb' is optimized to handle updates caused by data revision as well as elaborate, multi-lingual meta information.

Maintained by Matthias Bannert. Last updated 6 months ago.

3.8 match 24 stars 6.89 score 26 scripts

bioc

wateRmelon:Illumina DNA methylation array normalization and metrics

15 flavours of betas and three performance metrics, with methods for objects produced by methylumi and minfi packages.

Maintained by Leo C Schalkwyk. Last updated 4 months ago.

dnamethylation microarray twochannel preprocessing qualitycontrol

3.3 match 7.75 score 247 scripts 2 dependents

abbvie-external

OmicNavigator:Open-Source Software for 'Omic' Data Analysis and Visualization

A tool for interactive exploration of the results from 'omics' experiments to facilitate novel discoveries from high-throughput biology. The software includes R functions for the 'bioinformatician' to deposit study metadata and the outputs from statistical analyses (e.g. differential expression, enrichment). These results are then exported to an interactive JavaScript dashboard that can be interrogated on the user's local machine or deployed online to be explored by collaborators. The dashboard includes 'sortable' tables, interactive plots including network visualization, and fine-grained filtering based on statistical significance.

Maintained by John Blischak. Last updated 4 days ago.

bioinformatics genomics omics opencpu

3.3 match 34 stars 7.68 score 31 scripts

insightsengineering

rtables.officer:Exporting Tools for 'rtables'

Designed to create and display complex tables with R, the 'rtables' R package allows cells in an 'rtables' object to contain any high-dimensional data structure, which can then be displayed with cell-specific formatting instructions. Additionally, the 'rtables.officer' package supports export formats related to the Microsoft Office software suite, including Microsoft Word ('docx') and Microsoft PowerPoint ('pptx').

Maintained by Joe Zhu. Last updated 27 days ago.

pharmaceuticals tables

3.8 match 1 star 6.80 score 3 scripts 7 dependents

jaredhuling

fastglm:Fast and Stable Fitting of Generalized Linear Models using 'RcppEigen'

Fits generalized linear models efficiently using 'RcppEigen'. The iteratively reweighted least squares implementation utilizes the step-halving approach of Marschner (2011) <doi:10.32614/RJ-2011-012> to help safeguard against convergence issues.

Maintained by Jared Huling. Last updated 3 years ago.

cpp

3.0 match 57 stars 8.47 score 59 scripts 13 dependents

bioc

lute:Framework for cell size scale factor normalized bulk transcriptomics deconvolution experiments

Provides a framework for adjusting for cell type size when performing bulk transcriptomics deconvolution. The main framework function provides a means of reference normalization using cell size scale factors. It allows for marker selection and deconvolution using non-negative least squares (NNLS) by default. The framework is extensible for other marker selection and deconvolution algorithms, and users may reuse the generics, methods, and classes for these when developing new algorithms.

Maintained by Sean K Maden. Last updated 5 months ago.

rnaseq sequencing singlecell coverage transcriptomics normalization

4.8 match 2 stars 5.26 score 3 scripts

epiverse-trace

finalsize:Calculate the Final Size of an Epidemic

Calculate the final size of a susceptible-infectious-recovered epidemic in a population with demographic variation in contact patterns and susceptibility to disease, as discussed in Miller (2012) <doi:10.1007/s11538-012-9749-6>.

Maintained by Rosalind Eggo. Last updated 1 month ago.

epidemic-modelling epidemiology epiverse outbreak-analysis rcpp sdg-3 cpp

3.1 match 11 stars 8.11 score 46 scripts

s6juncheng

ggpval:Annotate Statistical Tests for 'ggplot2'

Automatically performs desired statistical tests (e.g. wilcox.test(), t.test()) to compare between groups, and adds the resulting p-values to the plot with an annotation bar. Group differences are frequently visualized with boxplots, bar plots, etc., and statistical test results often need to be annotated on these plots. This package provides a convenient function that works on 'ggplot2' objects, performs the desired statistical test between groups of interest and annotates the test results on the plot.

Maintained by Jun Cheng. Last updated 3 years ago.

3.3 match 46 stars 7.55 score 306 scripts

rdatatable

data.table:Extension of `data.frame`

Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, friendly and fast character-separated-value read/write. Offers a natural and flexible syntax, for faster development.

Maintained by Tyson Barrett. Last updated 9 hours ago.

1.1 match 3.7k stars 23.52 score 230k scripts 4.6k dependents

bcgov

fasstr:Analyze, Summarize, and Visualize Daily Streamflow Data

The Flow Analysis Summary Statistics Tool for R, 'fasstr', provides various functions to tidy and screen daily stream discharge data, calculate and visualize various summary statistics and metrics, and compute annual trending and volume frequency analyses. It features useful function arguments for filtering and handling dates, customizing data and metrics, and the ability to pull daily data directly from the Water Survey of Canada hydrometric database (<https://collaboration.cmc.ec.gc.ca/cmc/hydrometrics/www/>).

Maintained by Jon Goetz. Last updated 6 months ago.

bcgov for frequency-analysis hydat hydrology streamflow summary-statistics trends water

3.3 match 57 stars 7.48 score 89 scripts

lifewatch

sdmpredictors:Species Distribution Modelling Predictor Datasets

Terrestrial and marine predictors for species distribution modelling from multiple sources, including WorldClim <https://www.worldclim.org/>, ENVIREM <https://envirem.github.io/>, Bio-ORACLE <https://bio-oracle.org/> and MARSPEC <http://www.marspec.org/>.

Maintained by Salvador Fernandez. Last updated 2 years ago.

bio-oracle lifewatch lifewatchvliz species-distribution-modelling

3.3 match 30 stars 7.47 score 218 scripts

jiscah

sequoia:Pedigree Inference from SNPs

Multi-generational pedigree inference from incomplete data on hundreds of SNPs, including parentage assignment and sibship clustering. See Huisman (2017) (<DOI:10.1111/1755-0998.12665>) for more information.

Maintained by Jisca Huisman. Last updated 9 months ago.

pedigree pedigree-reconstruction pedigrees sequoia snp snp-data fortran

3.3 match 26 stars 7.40 score 79 scripts

mazamascience

MazamaTimeSeries:Core Functionality for Environmental Time Series

Utility functions for working with environmental time series data from known locations. The compact data model is structured as a list with two dataframes. A 'meta' dataframe contains spatial and measuring device metadata associated with deployments at known locations. A 'data' dataframe contains a 'datetime' column followed by columns of measurements associated with each "device-deployment". Ephemerides calculations are based on code originally found in NOAA's "Solar Calculator" <https://gml.noaa.gov/grad/solcalc/>.

Maintained by Jonathan Callahan. Last updated 1 year ago.

timeseries

4.7 match 5.27 score 62 scripts 1 dependent

gagolews

FuzzyNumbers:Tools to Deal with Fuzzy Numbers

S4 classes and methods to deal with fuzzy numbers. They allow for computing any arithmetic operations (e.g., by using the Zadeh extension principle), performing approximation of arbitrary fuzzy numbers by trapezoidal and piecewise linear ones, preparing plots for publications, computing possibility and necessity values for comparisons, etc.

Maintained by Marek Gagolewski. Last updated 3 years ago.

3.3 match 10 stars 7.37 score 91 scripts 17 dependents

rbgramacy

tgp:Bayesian Treed Gaussian Process Models

Bayesian nonstationary, semiparametric nonlinear regression and design by treed Gaussian processes (GPs) with jumps to the limiting linear model (LLM). Special cases also implemented include Bayesian linear models, CART, treed linear models, stationary separable and isotropic GPs, and GP single-index models. Provides 1-d and 2-d plotting functions (with projection and slice capabilities) and tree drawing, designed for visualization of tgp-class output. Sensitivity analysis and multi-resolution models are supported. Sequential experimental design and adaptive sampling functions are also provided, including ALM, ALC, and expected improvement. The latter supports derivative-free optimization of noisy black-box functions. For details and tutorials, see Gramacy (2007) <doi:10.18637/jss.v019.i09> and Gramacy & Taddy (2010) <doi:10.18637/jss.v033.i06>.

Maintained by Robert B. Gramacy. Last updated 6 months ago.

openblas cpp

3.3 match 9 stars 7.36 score 203 scripts 12 dependents

r-quantities

quantities:Quantity Calculus for R Vectors

Integration of the 'units' and 'errors' packages for a complete quantity calculus system for R vectors, matrices and arrays, with automatic propagation, conversion, derivation and simplification of magnitudes and uncertainties. Documentation about 'units' and 'errors' is provided in the papers by Pebesma, Mailund & Hiebert (2016, <doi:10.32614/RJ-2016-061>) and by Ucar, Pebesma & Azcorra (2018, <doi:10.32614/RJ-2018-075>), included in those packages as vignettes; see 'citation("quantities")' for details.

Maintained by Iñaki Ucar. Last updated 2 months ago.

quantity-calculus uncertainty-propagation units-of-measurement cpp

3.3 match 26 stars 7.36 score 49 scripts 1 dependent

beniaminogreen

zoomerjoin:Superlatively Fast Fuzzy Joins

Empowers users to fuzzily-merge data frames with millions or tens of millions of rows in minutes with low memory usage. The package uses the locality sensitive hashing algorithms developed by Datar, Immorlica, Indyk and Mirrokni (2004) <doi:10.1145/997817.997857>, and Broder (1998) <doi:10.1109/SEQUEN.1997.666900> to avoid having to compare every pair of records in each dataset, resulting in fuzzy-merges that finish in linear time.

Maintained by Beniamino Green. Last updated 2 months ago.

blazinglyfast fuzzyjoin join rust zoomer cargo

3.3 match 103 stars 7.31 score 11 scripts

glenndavis52

colorSpec:Color Calculations with Emphasis on Spectral Data

Calculate with spectral properties of light sources, materials, cameras, eyes, and scanners. Build complex systems from simpler parts using a spectral product algebra. For light sources, compute CCT, CRI, SSI, and IES TM-30 reports. For object colors, compute optimal colors and Logvinenko coordinates. Work with the standard CIE illuminants and color matching functions, and read spectra from text files, including CGATS files. Estimate a spectrum from its response. A user guide and 9 vignettes are included.

Maintained by Glenn Davis. Last updated 1 month ago.

3.8 match 2 stars 6.34 score 73 scripts 5 dependents

geobosh

cvar:Compute Expected Shortfall and Value at Risk for Continuous Distributions

Compute expected shortfall (ES) and Value at Risk (VaR) from a quantile function, distribution function, random number generator or probability density function. ES is also known as Conditional Value at Risk (CVaR). Virtually any continuous distribution can be specified. The functions are vectorized over the arguments. The computations are done directly from the definitions, see e.g. Acerbi and Tasche (2002) <doi:10.1111/1468-0300.00091>. Some support for GARCH models is provided, as well.

Maintained by Georgi N. Boshnakov. Last updated 2 years ago.

expected-shortfall locations-scale-transformations quantile quantile-functions risk value-at-risk

3.0 match 6 stars 8.05 score 27 scripts 52 dependents

stc04003

rocTree:Receiver Operating Characteristic (ROC)-Guided Classification and Survival Tree

Receiver Operating Characteristic (ROC)-guided survival trees and ensemble algorithms are implemented, providing a unified framework for tree-structured analysis with censored survival outcomes. A time-invariant partition scheme on the survivor population was considered to incorporate time-dependent covariates. Motivated by ideas of randomized tests, generalized time-dependent ROC curves were used to evaluate the performance of survival trees and establish the optimality of the target hazard/survival function. The optimality of the target hazard function motivates us to use a weighted average of the time-dependent area under the curve (AUC) on a set of time points to evaluate the prediction performance of survival trees and to guide splitting and pruning. A detailed description of the implemented methods can be found in Sun et al. (2019) <arXiv:1809.05627>.

Maintained by Sy Han Chiou. Last updated 4 years ago.

decision-trees cpp

7.1 match 5 stars 3.40 score 7 scripts

ggpmxdevelopment

ggPMX:'ggplot2' Based Tool to Facilitate Diagnostic Plots for NLME Models

At Novartis, we aimed at standardizing the set of diagnostic plots used for modeling activities in order to reduce the overall effort required for generating such plots. For this, we developed guidance that proposes an adequate set of diagnostics and a toolbox, called 'ggPMX', to execute them. 'ggPMX' is a toolbox that can generate all diagnostic plots at a quality sufficient for publication and submissions using a few lines of code. This package focuses on plots recommended by ISoP <doi:10.1002/psp4.12161>. While not required, you can get/install the 'R' 'lixoftConnectors' package in the 'Monolix' installation, as described at the following URL <https://monolix.lixoft.com/monolix-api/lixoftconnectors_installation/>. When 'lixoftConnectors' is available, 'R' can use 'Monolix' directly to create the required Chart Data instead of exporting it from the 'Monolix' GUI.

Maintained by Matthew Fidler. Last updated 1 year ago.

pharmacometrics pmx reporting

3.3 match 39 stars 7.23 score 80 scripts

ms-quality-hub

rmzqc:Creation, Reading and Validation of 'mzqc' Files

Reads, writes and validates 'mzQC' files. The 'mzQC' format is a standardized file format for the exchange, transmission, and archiving of quality metrics derived from biological mass spectrometry data, as defined by the HUPO-PSI (Human Proteome Organisation - Proteomics Standards Initiative) Quality Control working group. See <https://hupo-psi.github.io/mzQC/> for details.

Maintained by Chris Bielow. Last updated 11 months ago.

hacktoberfest mass-spectrometry mzqc quality-control

4.2 match 2 stars 5.73 score 10 scripts 3 dependents

rmgpanw

gtexr:Query the GTEx Portal API

A convenient R interface to the Genotype-Tissue Expression (GTEx) Portal API. For more information on the API, see <https://gtexportal.org/api/v2/redoc>.

Maintained by Alasdair Warwick. Last updated 6 months ago.

api-wrapper bioinformatics eqtl gtex sqtl

3.8 match 5 stars 6.41 score 5 scripts

gagolews

TurtleGraphics:Turtle Graphics

An implementation of turtle graphics <http://en.wikipedia.org/wiki/Turtle_graphics>. Turtle graphics comes from Papert's language Logo and has been used to teach concepts of computer programming.

Maintained by Marek Gagolewski. Last updated 3 years ago.

3.3 match 23 stars 7.21 score 117 scripts 2 dependents

macroecology

letsR:Data Handling and Analysis in Macroecology

Handling, processing, and analyzing geographic data on species' distributions and environmental variables. Read Vilela & Villalobos (2015) <doi:10.1111/2041-210X.12401> for details.

Maintained by Bruno Vilela. Last updated 2 months ago.

2.7 match 29 stars 8.87 score 104 scripts

sfcheung

stdmod:Standardized Moderation Effect and Its Confidence Interval

Functions for computing a standardized moderation effect in moderated regression and forming its confidence interval by nonparametric bootstrapping as proposed in Cheung, Cheung, Lau, Hui, and Vong (2022) <doi:10.1037/hea0001188>. Also includes simple-to-use functions for computing conditional effects (unstandardized or standardized) and plotting moderation effects.

Maintained by Shu Fai Cheung. Last updated 6 months ago.

bootstrapping confidence-interval effect-sizes moderation regression standardization standardized-moderation

4.3 match 1 stars 5.62 score 46 scripts

phil8192

obAnalytics:Limit Order Book Analytics

Data processing, visualisation and analysis of Limit Order Book event data.

Maintained by Philip Stubbings. Last updated 6 years ago.

bitcoin limit-order-book trading visualisation

3.8 match 152 stars 6.36 score 30 scripts

bioc

dStruct:Identifying differentially reactive regions from RNA structurome profiling data

dStruct identifies differentially reactive regions from RNA structurome profiling data. dStruct is compatible with a broad range of structurome profiling technologies, e.g., SHAPE-MaP, DMS-MaPseq, Structure-Seq, SHAPE-Seq, etc. See Choudhary et al., Genome Biology, 2019 for the underlying method.

Maintained by Krishna Choudhary. Last updated 5 months ago.

statisticalmethod structuralprediction sequencing software

4.9 match 2 stars 4.86 score 12 scripts