R-universe search: exports:show

Currently serving 26345 packages, 22653 articles, and 64231 datasets by 1265 organizations, 13663 maintainers and 22193 contributors.

Not sure what to search for? Why not try: maps, bayesian, ecology, climate, genome, gam, spatial, database, pdf, shiny, rstudio, machine learning, prediction, birds, fish, sports, ...

Organizations

vimc

lcbc-uio

stan-dev

pharmaverse

r-spatial

tidyverse

ropengov

rstudio

r-lib

ropensci

bioc

r-forge

kwb-r

pik-piam

hypertidy

poissonconsulting

mrc-ide

tidymodels

pecanproject

insightsengineering

thinkr-open

inbo

mlr-org

ggseg

ohdsi

modeloriented

paws-r

predictiveecology

flr

ropenspain

bnosac

sciviews

mrcieu

openvolley

repboxr

rmi-pacta

epiverse-trace

nlmixr2

frbcesab

yulab-smu

ices-tools-prod

riatelab

azure

statnet

bips-hb

mlverse

appsilon

rjdverse

epiforecasts

cloudyr

tmsalab

openpharma

bupaverse

hubverse-org

dreamrs

usepa

usaid-oha-si

business-science

certe-medical-epidemiology

merck

darwin-eu

ambiorix-web

easystats

coatless-rpkg

hugheylab

uscbiostats

rsquaredacademy

rikenbit

spatstat

bluegreen-labs

nutriverse

r-dbi

traitecoevo

ocbe-uio

ipeagit

epicentre-msf

cogdisreslab

rspatial

reconhub

data-cleaning

apache

biometris

terminological

aus-doh-safety-and-quality

humaniverse

nflverse

gesistsa

ifpri

ctu-bern

a2-ai

mazamascience

tlverse

atsa-es

piecepackr

cynkra

rformassspectrometry

Want to learn more about r-universe? Have a look at ropensci.org/r-universe or updates from the rOpenSci blog:

Better documentation for R-universe! (February 28, 2025)
R-Universe Named an R Consortium Top-Level Project (December 3, 2024)
Capturing Screenshots Programmatically With R (September 10, 2024)
Navigating the R ecosystem using R-universe (September 24, 2024)
A fresh new look for R-universe! (June 12, 2024)
R-Universe Documentation Gets a Boost from Google Season of Docs (April 12, 2024)
R-universe now builds MacOS ARM64 binaries for use on Apple Silicon (aka M1/M2/M3) systems (January 14, 2024)
R-universe now builds WASM binaries for all R packages (November 17, 2023)
The rOpenSci Multiverse (November 6, 2023)
CRAN-ial Expansion: Taking Your R Package Development to New Frontiers with R-Universe (September 19, 2023)
Meeting the Stars of the R-Universe: The R-Universe Against Diseases. (September 15, 2023)
My Life with the R-universe (August 1, 2023)
New cran.dev shortlinks to package information and documentation (July 26, 2023)
Meeting the Stars of the R-Universe: PEcAn, an Open Source Project to Take Care of the Planet (June 6, 2023)
Downloading snapshots and creating stable R packages repositories using r-universe (May 31, 2023)
How r-universe searches for packages on CRAN / Bioconductor (April 3, 2023)
Meeting the Stars of the R-Universe: Researching Our Brain with the Magic of the R-Universe (March 30, 2023)
Meeting the Stars of the R-universe: ThinkR's Approach to Contributing to a Growing and Friendly R Community (February 28, 2023)
Discovering and learning everything there is to know about R packages using r-universe (February 27, 2023)
New preferred repo name for r-universe registries (February 7, 2023)
Improved permanent URL schema for r-universe.dev (January 30, 2023)
postdoc 1.0: minimal and uncluttered HTML package manuals (November 29, 2022)
Meeting the stars of the R-universe: R Community, Exchange and Learn (November 23, 2022)
Searching and browsing the R universe (March 23, 2022)
A Blend of Package Build Failures (January 31, 2022)
How renv restores packages from r-universe for reproducibility or production (January 6, 2022)
RSS feeds of package updates in r-universe (November 24, 2021)
How I Test cffr on (about) 2,000 Packages using GitHub Actions and R-universe (November 23, 2021)
Generating and customizing badges in r-universe (October 14, 2021)
rOpenSci docs are now built on r-universe (September 3, 2021)
How to create your personal CRAN-like repository on R-universe (June 22, 2021)
Publishing and browsing articles on R-universe (April 9, 2021)
rOpenSci's R-universe Project (May 25, 2021)
A first look at the R-universe build infrastructure (March 4, 2021)
Moving away from Travis CI (November 19, 2020)
How to precompute package vignettes or pkgdown articles (December 8, 2019)

Showing 200 of 756 total results

rcppcore

Rcpp: Seamless R and C++ Integration

The 'Rcpp' package provides R functions as well as C++ classes which offer a seamless integration of R and C++. Many R data types and objects can be mapped back and forth to C++ equivalents which facilitates both writing of new code as well as easier integration of third-party libraries. Documentation about 'Rcpp' is provided by several vignettes included in this package, via the 'Rcpp Gallery' site at <https://gallery.rcpp.org>, the paper by Eddelbuettel and Francois (2011, <doi:10.18637/jss.v040.i08>), the book by Eddelbuettel (2013, <doi:10.1007/978-1-4614-6868-4>) and the paper by Eddelbuettel and Balamuta (2018, <doi:10.1080/00031305.2017.1375990>); see 'citation("Rcpp")' for details.

Maintained by Dirk Eddelbuettel. Last updated 2 days ago.

c-plus-plus c-plus-plus-11 c-plus-plus-14 c-plus-plus-17 c-plus-plus-20 rcpp cpp

755 stars 22.62 score 11k scripts 13k dependents

tidyverse

lubridate: Make Dealing with Dates a Little Easier

Functions to work with date-times and time-spans: fast and user friendly parsing of date-time data, extraction and updating of components of a date-time (years, months, days, hours, minutes, and seconds), algebraic manipulation on date-time and time-span objects. The 'lubridate' package has a consistent and memorable syntax that makes working with dates easy and fun.

Maintained by Vitalie Spinu. Last updated 4 months ago.

date date-time

757 stars 20.95 score 135k scripts 1.9k dependents

r-dbi

DBI: R Database Interface

A database interface definition for communication between R and relational database management systems. All classes in this package are virtual and need to be extended by the various R/DBMS implementations.

Maintained by Kirill Müller. Last updated 11 days ago.

database interface

302 stars 20.87 score 19k scripts 2.9k dependents

lme4

lme4: Linear Mixed-Effects Models using 'Eigen' and S4

Fit linear and generalized linear mixed-effects models. The models and their components are represented using S4 classes and methods. The core computational algorithms are implemented using the 'Eigen' C++ library for numerical linear algebra and 'RcppEigen' "glue".

Maintained by Ben Bolker. Last updated 4 days ago.

cpp

647 stars 20.68 score 35k scripts 1.5k dependents

stan-dev

rstan: R Interface to Stan

User-facing R functions are provided to parse, compile, test, estimate, and analyze Stan models by accessing the header-only Stan library provided by the 'StanHeaders' package. The Stan project develops a probabilistic programming language that implements full Bayesian statistical inference via Markov Chain Monte Carlo, rough Bayesian inference via 'variational' approximation, and (optionally penalized) maximum likelihood estimation via optimization. In all three cases, automatic differentiation is used to quickly and accurately evaluate gradients without burdening the user with the need to derive the partial derivatives.

Maintained by Ben Goodrich. Last updated 4 days ago.

bayesian-data-analysis bayesian-inference bayesian-statistics mcmc stan cpp

1.1k stars 18.84 score 14k scripts 281 dependents

r-dbi

RSQLite: SQLite Interface for R

Embeds the SQLite database engine in R and provides an interface compliant with the DBI package. The source for the SQLite engine and for various extensions in a recent version is included. System libraries will never be consulted because this package relies on static linking for the plugins it includes; this also ensures a consistent experience across all installations.

Maintained by Kirill Müller. Last updated 1 month ago.

database sqlite3 cpp

331 stars 18.76 score 8.1k scripts 1.1k dependents

edzer

sp: Classes and Methods for Spatial Data

Classes and methods for spatial data; the classes document where the spatial location information resides, for 2D or 3D data. Utility functions are provided, e.g. for plotting data as maps, spatial selection, as well as methods for retrieving coordinates, for subsetting, print, summary, etc. From this version, 'rgdal', 'maptools', and 'rgeos' are no longer used at all, see <https://r-spatial.org/r/2023/05/15/evolution4.html> for details.

Maintained by Edzer Pebesma. Last updated 2 months ago.

127 stars 18.63 score 35k scripts 1.3k dependents

bioc

Biostrings: Efficient manipulation of biological strings

Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences.

Maintained by Hervé Pagès. Last updated 1 month ago.

sequencematching alignment sequencing genetics dataimport datarepresentation infrastructure bioconductor-package core-package

62 stars 17.77 score 8.6k scripts 1.2k dependents

bioc

GenomicRanges: Representation and manipulation of genomic intervals

The ability to efficiently represent and manipulate genomic annotations and alignments is playing a central role when it comes to analyzing high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.

Maintained by Hervé Pagès. Last updated 4 months ago.

genetics infrastructure datarepresentation sequencing annotation genomeannotation coverage bioconductor-package core-package

44 stars 17.68 score 13k scripts 1.3k dependents

bioc

BiocParallel: Bioconductor facilities for parallel evaluation

This package provides modified versions and novel implementation of functions for parallel evaluation, tailored to use with Bioconductor objects.

Maintained by Martin Morgan. Last updated 1 month ago.

infrastructure bioconductor-package core-package u24ca289073 cpp

67 stars 17.31 score 7.3k scripts 1.1k dependents

daattali

shinyjs: Easily Improve the User Experience of Your Shiny Apps in Seconds

Perform common useful JavaScript operations in Shiny apps that will greatly improve your apps without having to know any JavaScript. Examples include: hiding an element, disabling an input, resetting an input back to its original value, delaying code execution by a few seconds, and many more useful functions for both the end user and the developer. 'shinyjs' can also be used to easily call your own custom JavaScript functions from R.

Maintained by Dean Attali. Last updated 7 months ago.

javascript shiny shiny-r

740 stars 17.28 score 8.9k scripts 400 dependents

r-forge

Matrix: Sparse and Dense Matrix Classes and Methods

A rich hierarchy of sparse and dense matrix classes, including general, symmetric, triangular, and diagonal matrices with numeric, logical, or pattern entries. Efficient methods for operating on such matrices, often wrapping the 'BLAS', 'LAPACK', and 'SuiteSparse' libraries.

Maintained by Martin Maechler. Last updated 19 days ago.

openblas

1 star 17.23 score 33k scripts 12k dependents

bioc

SummarizedExperiment: A container (S4 class) for matrix-like assays

The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.

Maintained by Hervé Pagès. Last updated 5 months ago.

genetics infrastructure sequencing annotation coverage genomeannotation bioconductor-package core-package

34 stars 16.84 score 8.6k scripts 1.2k dependents

bioc

Biobase: Biobase: Base functions for Bioconductor

Functions that are needed by many other packages or which replace R functions.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

infrastructure bioconductor-package core-package

9 stars 16.45 score 6.6k scripts 1.8k dependents

bioc

GenomeInfoDb: Utilities for manipulating chromosome names, including modifying them to follow a particular naming style

Contains data and functions that define and allow translation between different chromosome sequence naming conventions (e.g., "chr1" versus "1"), including a function that attempts to place sequence names in their natural, rather than lexicographic, order.

Maintained by Hervé Pagès. Last updated 2 months ago.

genetics datarepresentation annotation genomeannotation bioconductor-package core-package

32 stars 16.32 score 1.3k scripts 1.7k dependents

r-dbi

odbc: Connect to ODBC Compatible Databases (using the DBI Interface)

A DBI-compatible interface to ODBC databases.

Maintained by Hadley Wickham. Last updated 1 day ago.

database odbc unixodbc cpp

396 stars 16.31 score 2.9k scripts 23 dependents

r-forge

colorspace: A Toolbox for Manipulating and Assessing Colors and Palettes

Carries out mapping between assorted color spaces including RGB, HSV, HLS, CIEXYZ, CIELUV, HCL (polar CIELUV), CIELAB, and polar CIELAB. Qualitative, sequential, and diverging color palettes based on HCL colors are provided along with corresponding ggplot2 color scales. Color palette choice is aided by an interactive app (with either a Tcl/Tk or a shiny graphical user interface) and shiny apps with an HCL color picker and a color vision deficiency emulator. Plotting functions for displaying and assessing palettes include color swatches, visualizations of the HCL space, and trajectories in HCL and/or RGB spectrum. Color manipulation functions include: desaturation, lightening/darkening, mixing, and simulation of color vision deficiencies (deutanomaly, protanomaly, tritanomaly). Details can be found on the project web page at <https://colorspace.R-Forge.R-project.org/> and in the accompanying scientific paper: Zeileis et al. (2020, Journal of Statistical Software, <doi:10.18637/jss.v096.i01>).

Maintained by Achim Zeileis. Last updated 4 months ago.

16.23 score 8.2k scripts 8.1k dependents

joshuaulrich

quantmod: Quantitative Financial Modelling Framework

Specify, build, trade, and analyse quantitative financial trading strategies.

Maintained by Joshua M. Ulrich. Last updated 26 days ago.

algorithmic-trading charting data-import finance time-series

839 stars 16.17 score 8.1k scripts 343 dependents

bioc

DESeq2: Differential gene expression analysis based on the negative binomial distribution

Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.

Maintained by Michael Love. Last updated 23 days ago.

sequencing rnaseq chipseq geneexpression transcription normalization differentialexpression bayesian regression principalcomponent clustering immunooncology openblas cpp

375 stars 16.11 score 17k scripts 115 dependents

bioc

IRanges: Foundation of integer range manipulation in Bioconductor

Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure datarepresentation bioconductor-package core-package

22 stars 16.09 score 2.1k scripts 1.8k dependents

bioc

S4Vectors: Foundation of vector-like and list-like containers in Bioconductor

The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantics of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure datarepresentation bioconductor-package core-package

18 stars 16.05 score 1.0k scripts 1.9k dependents

bioc

biomaRt: Interface to BioMart databases (i.e. Ensembl)

In recent years a wealth of biological data has become available in public data repositories. Easy access to these valuable data resources and firm integration with data analysis is needed for comprehensive bioinformatics data analysis. biomaRt provides an interface to a growing collection of databases implementing the BioMart software suite (<http://www.biomart.org>). The package enables retrieval of large amounts of data in a uniform way without the need to know the underlying database schemas or write complex SQL queries. The most prominent examples of BioMart databases are maintained by Ensembl, which provides biomaRt users direct access to a diverse set of data and enables a wide range of powerful online queries from gene annotation to database mining.

Maintained by Mike Smith. Last updated 14 days ago.

annotation bioconductor biomart ensembl

38 stars 15.99 score 13k scripts 230 dependents

bioc

rhdf5: R Interface to HDF5

This package provides an interface between HDF5 and R. HDF5's main features are the ability to store and access very large and/or complex datasets and a wide variety of metadata on mass storage (disk) through a completely portable file format. The rhdf5 package is thus suited for the exchange of large and/or complex datasets between R and other software packages, and for letting R applications work on datasets that are larger than the available RAM.

Maintained by Mike Smith. Last updated 4 days ago.

infrastructure dataimport hdf5 rhdf5 openssl curl zlib cpp

62 stars 15.87 score 4.2k scripts 232 dependents

bioc

DelayedArray: A unified framework for working transparently with on-disk and in-memory array-like datasets

Wrapping an array-like object (typically an on-disk object) in a DelayedArray object allows one to perform common array operations on it without loading the object in memory. In order to reduce memory usage and optimize performance, operations on the object are either delayed or executed using a block processing mechanism. Note that this also works on in-memory array-like objects like DataFrame objects (typically with Rle columns), Matrix objects, ordinary arrays, and data frames.

Maintained by Hervé Pagès. Last updated 1 month ago.

infrastructure datarepresentation annotation genomeannotation bioconductor-package core-package u24ca289073

27 stars 15.59 score 538 scripts 1.2k dependents

bioc

Rsamtools: Binary alignment (BAM), FASTA, variant call (BCF), and tabix file import

This package provides an interface to the 'samtools', 'bcftools', and 'tabix' utilities for manipulating SAM (Sequence Alignment / Map), FASTA, binary variant call (BCF) and compressed indexed tab-delimited (tabix) files.

Maintained by Bioconductor Package Maintainer. Last updated 4 months ago.

dataimport sequencing coverage alignment qualitycontrol bioconductor-package core-package curl bzip2 xz-utils zlib cpp

28 stars 15.34 score 3.2k scripts 569 dependents

bioc

GenomicFeatures: Query the gene models of a given organism/assembly

Extract the genomic locations of genes, transcripts, exons, introns, and CDS, for the gene models stored in a TxDb object. A TxDb object is a small database that contains the gene models of a given organism/assembly. Bioconductor provides a small collection of TxDb objects in the form of ready-to-install TxDb packages for the most commonly studied organisms. Additionally, the user can easily make a TxDb object (or package) for the organism/assembly of their choice by using the tools from the txdbmaker package.

Maintained by H. Pagès. Last updated 5 months ago.

genetics infrastructure annotation sequencing genomeannotation bioconductor-package core-package

26 stars 15.34 score 5.3k scripts 339 dependents

bioc

GenomicAlignments: Representation and manipulation of short genomic alignments

Provides efficient containers for storing and manipulating short genomic alignments (typically obtained by aligning short reads to a reference genome). This includes read counting, computing the coverage, junction detection, and working with the nucleotide content of the alignments.

Maintained by Hervé Pagès. Last updated 5 months ago.

infrastructure dataimport genetics sequencing rnaseq snp coverage alignment immunooncology bioconductor-package core-package

10 stars 15.21 score 3.1k scripts 528 dependents

bioc

AnnotationDbi: Manipulation of SQLite-based annotations in Bioconductor

Implements a user-friendly interface for querying SQLite-based annotation data packages.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

annotation microarray sequencing genomeannotation bioconductor-package core-package

9 stars 15.05 score 3.6k scripts 769 dependents

bioc

DOSE: Disease Ontology Semantic and Enrichment analysis

This package implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively for measuring semantic similarities among DO terms and gene products. Enrichment analyses including hypergeometric model and gene set enrichment analysis are also implemented for discovering disease associations of high-throughput biological data.

Maintained by Guangchuang Yu. Last updated 5 months ago.

annotation visualization multiplecomparison genesetenrichment pathways software disease-ontology enrichment-analysis semantic-similarity

119 stars 14.97 score 2.0k scripts 61 dependents

bioc

MultiAssayExperiment: Software for the integration of multi-omics experiments in Bioconductor

Harmonize data management of multiple experimental assays performed on an overlapping set of specimens. It provides a familiar Bioconductor user experience by extending concepts from SummarizedExperiment, supporting an open-ended mix of standard data classes for individual assays, and allowing subsetting by genomic ranges or rownames. Facilities are provided for reshaping data into wide and long formats for adaptability to graphing and downstream analysis.

Maintained by Marcel Ramos. Last updated 2 months ago.

infrastructure datarepresentation bioconductor bioconductor-package genomics nci-itcr tcga u24ca289073

71 stars 14.95 score 670 scripts 127 dependents

r-dbi

RPostgres: C++ Interface to PostgreSQL

Fully DBI-compliant C++-backed interface to PostgreSQL <https://www.postgresql.org/>, an open-source relational database.

Maintained by Kirill Müller. Last updated 1 month ago.

database postgres postgresql cpp

338 stars 14.78 score 1.6k scripts 31 dependents

bioc

GEOquery: Get data from NCBI Gene Expression Omnibus (GEO)

The NCBI Gene Expression Omnibus (GEO) is a public repository of microarray data. Given the rich and varied nature of this resource, it is only natural to want to apply BioConductor tools to these data. GEOquery is the bridge between GEO and BioConductor.

Maintained by Sean Davis. Last updated 5 months ago.

microarray dataimport onechannel twochannel sage bioconductor bioinformatics data-science genomics ncbi-geo

92 stars 14.46 score 4.1k scripts 44 dependents

bioc

xcms: LC-MS and GC-MS Data Analysis

Framework for processing and visualization of chromatographically separated and single-spectra mass spectral data. Imports from AIA/ANDI NetCDF, mzXML, mzData and mzML files. Preprocesses data for high-throughput, untargeted analyte profiling.

Maintained by Steffen Neumann. Last updated 14 days ago.

immunooncology massspectrometry metabolomics bioconductor feature-detection mass-spectrometry peak-detection cpp

196 stars 14.31 score 984 scripts 11 dependents

bioc

BSgenome: Software infrastructure for efficient representation of full genomes and their SNPs

Infrastructure shared by all the Biostrings-based genome data packages.

Maintained by Hervé Pagès. Last updated 2 months ago.

genetics infrastructure datarepresentation sequencematching annotation snp bioconductor-package core-package

9 stars 14.12 score 1.2k scripts 267 dependents

leifeld

texreg: Conversion of R Regression Output to LaTeX or HTML Tables

Converts coefficients, standard errors, significance stars, and goodness-of-fit statistics of statistical models into LaTeX tables or HTML tables/MS Word documents or to nicely formatted screen output for the R console for easy model comparison. A list of several models can be combined in a single table. The output is highly customizable. New model types can be easily implemented. Details can be found in Leifeld (2013), JStatSoft <doi:10.18637/jss.v055.i08>.

Maintained by Philip Leifeld. Last updated 3 months ago.

html-tables latex latex-tables regression reporting table texreg

113 stars 14.09 score 1.8k scripts 67 dependents

bioc

ensembldb: Utilities to create and use Ensembl-based annotation databases

The package provides functions to create and use transcript-centric annotation databases/packages. The annotations for the databases are fetched directly from Ensembl using their Perl API. The functionality and data are similar to those of the TxDb packages from the GenomicFeatures package, but, in addition to retrieving all gene/transcript models and annotations from the database, ensembldb provides a filter framework for retrieving annotations for specific entries, such as genes encoded on a chromosome region or transcript models of lincRNA genes. EnsDb databases built with ensembldb also contain protein annotations and mappings between proteins and their encoding transcripts. Finally, ensembldb provides functions to map between genomic, transcript and protein coordinates.

Maintained by Johannes Rainer. Last updated 5 months ago.

genetics annotationdata sequencing coverage annotation bioconductor bioconductor-packages ensembl

35 stars 14.08 score 892 scripts 108 dependents

edzer

hexbin: Hexagonal Binning Routines

Binning and plotting functions for hexagonal bins.

Maintained by Edzer Pebesma. Last updated 5 months ago.

fortran

37 stars 14.00 score 2.4k scripts 114 dependents

mhahsler

arules: Mining Association Rules and Frequent Itemsets

Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.

Maintained by Michael Hahsler. Last updated 2 months ago.

arules association-rules frequent-itemsets

194 stars 13.99 score 3.3k scripts 28 dependents

bioc

phyloseq: Handling and analysis of high-throughput microbiome census data

phyloseq provides a set of classes and tools to facilitate the import, storage, analysis, and graphical display of microbiome census data.

Maintained by Paul J. McMurdie. Last updated 5 months ago.

immunooncology sequencing microbiome metagenomics clustering classification multiplecomparison geneticvariability

597 stars 13.90 score 8.4k scripts 37 dependents

bioc

AnnotationHub: Client to access AnnotationHub resources

This package provides a client for the Bioconductor AnnotationHub web resource. The AnnotationHub web resource provides a central location where genomic files (e.g., VCF, bed, wig) and other resources from standard locations (e.g., UCSC, Ensembl) can be discovered. The resource includes metadata about each resource, e.g., a textual description, tags, and date of modification. The client creates and manages a local cache of files retrieved by the user, helping with quick and reproducible access.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

infrastructure dataimport gui thirdpartyclient core-package u24ca289073

17 stars 13.88 score 2.7k scripts 104 dependents

biomodhub

biomod2: Ensemble Platform for Species Distribution Modeling

Functions for species distribution modeling, calibration and evaluation, ensemble of models, ensemble forecasting and visualization. The package can consistently run up to 10 single models on a presence/absence (or presence/pseudo-absence) dataset and combine them in ensemble models and ensemble projections. A range of other evaluation and visualisation tools is also available within the package.

Maintained by Maya Guéguen. Last updated 2 days ago.

95 stars 13.85 score 536 scripts 7 dependents

bioc

limma: Linear Models for Microarray and Omics Data

Data analysis, linear models and differential expression for omics data.

Maintained by Gordon Smyth. Last updated 10 days ago.

exonarray geneexpression transcription alternativesplicing differentialexpression differentialsplicing genesetenrichment dataimport bayesian clustering regression timecourse microarray micrornaarray mrnamicroarray onechannel proprietaryplatforms twochannel sequencing rnaseq batcheffect multiplecomparison normalization preprocessing qualitycontrol biomedicalinformatics cellbiology cheminformatics epigenetics functionalgenomics genetics immunooncology metabolomics proteomics systemsbiology transcriptomics

13.81 score 16k scripts 586 dependents

duckdb

duckdb: DBI Package for the DuckDB Database Management System

The DuckDB project is an embedded analytical data management system with support for the Structured Query Language (SQL). This package includes all of DuckDB and an R Database Interface (DBI) connector.

Maintained by Kirill Müller. Last updated 10 days ago.

database duckdb olap cpp

159 stars 13.80 score 1.7k scripts 46 dependents

bioc

BiocFileCache: Manage Files Across Sessions

This package creates a persistent on-disk cache of files that the user can add, update, and retrieve. It is useful for managing resources (such as custom Txdb objects) that are costly or difficult to create, web resources, and data files used across sessions.

Maintained by Lori Shepherd. Last updated 2 months ago.

dataimport core-package u24ca289073

13 stars 13.76 score 486 scripts 436 dependents

simsem

semTools: Useful Tools for Structural Equation Modeling

Provides miscellaneous tools for structural equation modeling, many of which extend the 'lavaan' package. For example, latent interactions can be estimated using product indicators (Lin et al., 2010, <doi:10.1080/10705511.2010.488999>) and simple effects probed; analytical power analyses can be conducted (Jak et al., 2021, <doi:10.3758/s13428-020-01479-0>); and scale reliability can be estimated based on estimated factor-model parameters.

Maintained by Terrence D. Jorgensen. Last updated 15 days ago.

79 stars 13.74 score 1.1k scripts 31 dependents

r-dbi

RMySQL: Database Interface and 'MySQL' Driver for R

Legacy 'DBI' interface to 'MySQL' / 'MariaDB' based on old code ported from S-PLUS. A modern 'MySQL' client written in 'C++' is available from the 'RMariaDB' package.

Maintained by Jeroen Ooms. Last updated 2 months ago.

database mysql

209 stars 13.68 score 3.7k scripts 15 dependents

bioc

SingleCellExperiment: S4 Classes for Single Cell Data

Defines an S4 class for storing data from single-cell experiments. This includes specialized methods to store and retrieve spike-in information, dimensionality reduction coordinates and size factors for each cell, along with the usual metadata for genes and libraries.

Maintained by Davide Risso. Last updated 21 days ago.

immunooncology datarepresentation dataimport infrastructure singlecell

13.53 score 15k scripts 285 dependents

bioc

edgeR:Empirical Analysis of Digital Gene Expression Data in R

Differential expression analysis of sequence count data. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models, quasi-likelihood, and gene set enrichment. Can perform differential analyses of any type of omics data that produces read counts, including RNA-seq, ChIP-seq, ATAC-seq, Bisulfite-seq, SAGE, CAGE, metabolomics, or proteomics spectral counts. RNA-seq analyses can be conducted at the gene or isoform level, and tests can be conducted for differential exon or transcript usage.

Maintained by Yunshun Chen. Last updated 18 days ago.

alternativesplicing batcheffect bayesian biomedicalinformatics cellbiology chipseq clustering coverage differentialexpression differentialmethylation differentialsplicing dnamethylation epigenetics functionalgenomics geneexpression genesetenrichment genetics immunooncology multiplecomparison normalization pathways proteomics qualitycontrol regression rnaseq sage sequencing singlecell systemsbiology timecourse transcription transcriptomics openblas

13.40 score 17k scripts 255 dependents

yulab-smu

tidytree:A Tidy Tool for Phylogenetic Tree Data Manipulation

A phylogenetic tree generally contains multiple components, including nodes, edges, branches and associated data. 'tidytree' provides an approach to convert a tree object to a tidy data frame and provides tidy interfaces to manipulate tree data.

Maintained by Guangchuang Yu. Last updated 8 months ago.

phylogenetic-tree tidyverse tree-data

56 stars 13.36 score 584 scripts 128 dependents

bioc

HDF5Array:HDF5 datasets as array-like objects in R

The HDF5Array package is an HDF5 backend for DelayedArray objects. It implements the HDF5Array, H5SparseMatrix, H5ADMatrix, and TENxMatrix classes, 4 convenient and memory-efficient array-like containers for representing and manipulating either: (1) a conventional (a.k.a. dense) HDF5 dataset, (2) an HDF5 sparse matrix (stored in CSR/CSC/Yale format), (3) the central matrix of an h5ad file (or any matrix in the /layers group), or (4) a 10x Genomics sparse matrix. All these containers are DelayedArray extensions and thus support all operations (delayed or block-processed) supported by DelayedArray objects.

Maintained by Hervé Pagès. Last updated 9 days ago.

infrastructure datarepresentation dataimport sequencing rnaseq coverage annotation genomeannotation singlecell immunooncology bioconductor-package core-package u24ca289073

12 stars 13.20 score 844 scripts 126 dependents

bioc

Gviz:Plotting data and annotation information along genomic coordinates

Genomic data analysis requires integrated visualization of known genomic information and new experimental data. Gviz uses the biomaRt and the rtracklayer packages to perform live annotation queries to Ensembl and UCSC and translates the results into, e.g., gene/transcript structures in viewports of the grid graphics package. This results in genomic information plotted together with your data.

Maintained by Robert Ivanek. Last updated 5 months ago.

visualization microarray sequencing

79 stars 13.08 score 1.4k scripts 48 dependents

bioc

ChIPseeker:ChIPseeker for ChIP peak Annotation, Comparison, and Visualization

This package implements functions to retrieve the nearest genes around a peak, annotate the genomic region of the peak, apply statistical methods to estimate the significance of overlap among ChIP peak data sets, and incorporate the GEO database so that users can compare their own dataset with those deposited in the database. The comparison can be used to infer cooperative regulation and thus to generate hypotheses. Several visualization functions are implemented to summarize the coverage of the peak experiment, the average profile and heatmap of peaks binding to TSS regions, genomic annotation, distance to TSS, and overlap of peaks or genes.

Maintained by Guangchuang Yu. Last updated 5 months ago.

annotation chipseq software visualization multiplecomparison atac-seq chip-seq comparison epigenetics epigenomics

234 stars 13.02 score 1.6k scripts 5 dependents

bioc

Spectra:Spectra Infrastructure for Mass Spectrometry Data

The Spectra package defines an efficient infrastructure for storing and handling mass spectrometry spectra and functionality to subset, process, visualize and compare spectra data. It provides different implementations (backends) to store mass spectrometry data. These comprise backends tuned for fast data access and processing and backends for very large data sets ensuring a small memory footprint.

Maintained by RforMassSpectrometry Package Maintainer. Last updated 21 days ago.

infrastructure proteomics massspectrometry metabolomics bioconductor hacktoberfest mass-spectrometry

41 stars 13.01 score 254 scripts 35 dependents

bioc

iSEE:Interactive SummarizedExperiment Explorer

Create an interactive Shiny-based graphical user interface for exploring data stored in SummarizedExperiment objects, including row- and column-level metadata. The interface supports transmission of selections between plots and tables, code tracking, interactive tours, interactive or programmatic initialization, preservation of app state, and extensibility to new panel types via S4 classes. Special attention is given to single-cell data in a SingleCellExperiment object with visualization of dimensionality reduction results.

Maintained by Kevin Rue-Albrecht. Last updated 22 days ago.

cellbasedassays clustering dimensionreduction featureextraction geneexpression gui immunooncology shinyapps singlecell transcription transcriptomics visualization dimension-reduction feature-extraction gene-expression hacktoberfest human-cell-atlas shiny single-cell

225 stars 12.86 score 380 scripts 9 dependents

bioc

minfi:Analyze Illumina Infinium DNA methylation arrays

Tools to analyze & visualize Illumina Infinium methylation arrays.

Maintained by Kasper Daniel Hansen. Last updated 4 months ago.

immunooncology dnamethylation differentialmethylation epigenetics microarray methylationarray multichannel twochannel dataimport normalization preprocessing qualitycontrol

60 stars 12.82 score 996 scripts 27 dependents

csgillespie

poweRlaw:Analysis of Heavy Tailed Distributions

An implementation of maximum likelihood estimators for a variety of heavy tailed distributions, including both the discrete and continuous power law distributions. Additionally, a goodness-of-fit based approach is used to estimate the lower cut-off for the scaling region.

Maintained by Colin Gillespie. Last updated 2 months ago.

clauset powerlaw

112 stars 12.79 score 332 scripts 32 dependents

spedygiorgio

markovchain:Easy Handling Discrete Time Markov Chains

Functions and S4 methods to create and manage discrete time Markov chains more easily. In addition, functions to perform statistical (fitting and drawing random variates) and probabilistic (analysis of their structural properties) analysis are provided. See Spedicato (2017) <doi:10.32614/RJ-2017-036>. Some functions for continuous-time Markov chains depend on the suggested ctmcd package.

Maintained by Giorgio Alfredo Spedicato. Last updated 5 months ago.

ctmc dtmc markov-chain markov-model r-programming rcpp openblas cpp

104 stars 12.78 score 712 scripts 4 dependents

bioc

EBImage:Image processing and analysis toolbox for R

EBImage provides general purpose functionality for image processing and analysis. In the context of (high-throughput) microscopy-based cellular assays, EBImage offers tools to segment cells and extract quantitative cellular descriptors. This allows the automation of such tasks using the R programming language and facilitates the use of other tools in the R environment for signal processing, statistical modeling, machine learning and visualization with image data.

Maintained by Andrzej Oleś. Last updated 5 months ago.

visualization bioinformatics image-analysis image-processing cpp

71 stars 12.77 score 1.5k scripts 33 dependents

bioc

MSnbase:Base Functions and Classes for Mass Spectrometry and Proteomics

MSnbase provides infrastructure for manipulation, processing and visualisation of mass spectrometry and proteomics data, ranging from raw to quantitative and annotated data.

Maintained by Laurent Gatto. Last updated 14 days ago.

immunooncology infrastructure proteomics massspectrometry qualitycontrol dataimport bioconductor bioinformatics mass-spectrometry proteomics-data visualisation cpp

131 stars 12.76 score 772 scripts 36 dependents

boost-r

mboost:Model-Based Boosting

Functional gradient descent algorithm (boosting) for optimizing general risk functions utilizing component-wise (penalised) least squares estimates or regression trees as base-learners for fitting generalized linear, additive and interaction models to potentially high-dimensional data. Models and algorithms are described in <doi:10.1214/07-STS242>, a hands-on tutorial is available from <doi:10.1007/s00180-012-0382-5>. The package allows user-specified loss functions and base-learners.

Maintained by Torsten Hothorn. Last updated 5 months ago.

boosting-algorithms gam glm machine-learning mboost modelling r-language tutorials variable-selection openblas

72 stars 12.70 score 540 scripts 27 dependents

bioc

rtracklayer:R interface to genome annotation files and the UCSC genome browser

Extensible framework for interacting with multiple genome browsers (currently UCSC built-in) and manipulating annotation tracks in various formats (currently GFF, BED, bedGraph, BED15, WIG, BigWig and 2bit built-in). The user may export/import tracks to/from the supported browsers, as well as query and modify the browser state, such as the current viewport.

Maintained by Michael Lawrence. Last updated 3 days ago.

annotation visualization dataimport zlib openssl curl

12.66 score 6.7k scripts 480 dependents

r-dbi

bigrquery:An Interface to Google's 'BigQuery' 'API'

Easily talk to Google's 'BigQuery' database from R.

Maintained by Hadley Wickham. Last updated 1 month ago.

bigquery database cpp

520 stars 12.47 score 1.8k scripts 4 dependents

bioc

SparseArray:High-performance sparse data representation and manipulation in R

The SparseArray package provides array-like containers for efficient in-memory representation of multidimensional sparse data in R (arrays and matrices). The package defines the SparseArray virtual class and two concrete subclasses: COO_SparseArray and SVT_SparseArray. Each subclass uses its own internal representation of the nonzero multidimensional data: the "COO layout" and the "SVT layout", respectively. SVT_SparseArray objects mimic as much as possible the behavior of ordinary matrix and array objects in base R. In particular, they support most of the "standard matrix and array API" defined in base R and in the matrixStats package from CRAN.

Maintained by Hervé Pagès. Last updated 10 days ago.

infrastructure datarepresentation bioconductor-package core-package openmp

9 stars 12.47 score 79 scripts 1.2k dependents

suyusung

arm:Data Analysis Using Regression and Multilevel/Hierarchical Models

Functions to accompany A. Gelman and J. Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models, Cambridge University Press, 2007.

Maintained by Yu-Sung Su. Last updated 5 months ago.

25 stars 12.38 score 3.3k scripts 89 dependents

asardaes

dtwclust:Time Series Clustering Along with Optimizations for the Dynamic Time Warping Distance

Time series clustering along with optimized techniques related to the Dynamic Time Warping distance and its corresponding lower bounds. Implementations of partitional, hierarchical, fuzzy, k-Shape and TADPole clustering are available. Functionality can be easily extended with custom distance measures and centroid definitions. Implementations of DTW barycenter averaging, a distance based on global alignment kernels, and the soft-DTW distance and centroid routines are also provided. All included distance functions have custom loops optimized for the calculation of cross-distance matrices, including parallelization support. Several cluster validity indices are included.

Maintained by Alexis Sarda. Last updated 8 months ago.

clustering dtw time-series openblas cpp

262 stars 12.35 score 406 scripts 14 dependents

melff

memisc:Management of Survey Data and Presentation of Analysis Results

An infrastructure for the management of survey data including value labels, definable missing values, recoding of variables, production of code books, and import of (subsets of) 'SPSS' and 'Stata' files is provided. Further, the package can produce tables and data frames of arbitrary descriptive statistics and (almost) publication-ready tables of regression model estimates, which can be exported to 'LaTeX' and HTML.

Maintained by Martin Elff. Last updated 23 days ago.

survey-data

46 stars 12.34 score 1.2k scripts 13 dependents

gaospecial

ggVennDiagram:A 'ggplot2' Implement of Venn Diagram

Easy-to-use functions to generate publication-quality Venn or upset plots for 2-7 sets. 'ggVennDiagram' plots Venn or upset diagrams using a well-defined geometry dataset and 'ggplot2'. The shapes for 2-4 set Venn diagrams use circles and ellipses, while the shapes for 4-7 set Venn diagrams use irregular polygons (4 sets has both forms), which were developed in and imported from another package, 'venn', authored by Adrian Dusa. Internal functions integrate the shape data with user-provided set data and calculate the geometry of every region/intersection, then plot the Venn diagram in four separate components: set edges, set labels, region edges, and region labels. From version 1.0, these components can be customized with ordinary 'ggplot2' grammar. From version 1.4.4, an unlimited number of sets is supported, as a plain upset plot is drawn automatically when the number of sets exceeds 7.

Maintained by Chun-Hui Gao. Last updated 5 months ago.

set-operations upset upsetplot venn-diagram venn-plot

292 stars 12.31 score 1.3k scripts 4 dependents

miraisolutions

XLConnect:Excel Connector for R

Provides comprehensive functionality to read, write and format Excel data.

Maintained by Martin Studer. Last updated 30 days ago.

cross-platform excel r-language xlconnect openjdk

130 stars 12.28 score 1.2k scripts 1 dependent

alexkz

kernlab:Kernel-Based Machine Learning Lab

Kernel-based machine learning methods for classification, regression, clustering, novelty detection, quantile regression and dimensionality reduction. Among other methods 'kernlab' includes Support Vector Machines, Spectral Clustering, Kernel PCA, Gaussian Processes and a QP solver.

Maintained by Alexandros Karatzoglou. Last updated 8 months ago.

openblas cpp

21 stars 12.26 score 7.8k scripts 487 dependents

bioc

bsseq:Analyze, manage and store whole-genome methylation data

A collection of tools for analyzing and visualizing whole-genome methylation data from sequencing. This includes whole-genome bisulfite sequencing and Oxford nanopore data.

Maintained by Kasper Daniel Hansen. Last updated 3 months ago.

dnamethylation cpp

37 stars 12.26 score 676 scripts 15 dependents

markmfredrickson

optmatch:Functions for Optimal Matching

Distance based bipartite matching using minimum cost flow, oriented to matching of treatment and control groups in observational studies ('Hansen' and 'Klopfer' 2006 <doi:10.1198/106186006X137047>). Routines are provided to generate distances from generalised linear models (propensity score matching), formulas giving variables on which to limit matched distances, stratified or exact matching directives, or calipers, alone or in combination.

Maintained by Josh Errickson. Last updated 4 months ago.

matching openblas cpp

47 stars 12.22 score 588 scripts 5 dependents

r-dbi

RMariaDB:Database Interface and MariaDB Driver

Implements a DBI-compliant interface to MariaDB (<https://mariadb.org/>) and MySQL (<https://www.mysql.com/>) databases.

Maintained by Kirill Müller. Last updated 1 month ago.

database mariadb mysql cpp

133 stars 12.20 score 792 scripts 10 dependents

alexiosg

rugarch:Univariate GARCH Models

ARFIMA, in-mean, external regressors and various GARCH flavors, with methods for fit, forecast, simulation, inference and plotting.

Maintained by Alexios Galanos. Last updated 3 months ago.

cpp

26 stars 12.13 score 1.3k scripts 15 dependents

tomoakin

RPostgreSQL:R Interface to the 'PostgreSQL' Database System

Database interface and 'PostgreSQL' driver for 'R'. This package provides a Database Interface 'DBI' compliant driver for 'R' to access 'PostgreSQL' database systems. In order to build and install this package from source, 'PostgreSQL' itself must be present on your system to provide 'PostgreSQL' functionality via its libraries and header files. These files are provided as the 'postgresql-devel' package under some Linux distributions. On 'macOS' and 'Microsoft Windows' systems the attached 'libpq' library source will be used.

Maintained by Tomoaki Nishiyama. Last updated 15 hours ago.

postgresql

66 stars 12.11 score 4.5k scripts 19 dependents

bioc

BiocSingular:Singular Value Decomposition for Bioconductor Packages

Implements exact and approximate methods for singular value decomposition and principal components analysis, in a framework that allows them to be easily switched within Bioconductor packages or workflows. Where possible, parallelization is achieved using the BiocParallel framework.

Maintained by Aaron Lun. Last updated 5 months ago.

software dimensionreduction principalcomponent bioconductor-package human-cell-atlas singular-value-decomposition cpp

7 stars 12.10 score 1.2k scripts 103 dependents

bioc

ShortRead:FASTQ input and manipulation

This package implements sampling, iteration, and input of FASTQ files. The package includes functions for filtering and trimming reads, and for generating a quality assessment report. Data are represented as DNAStringSet-derived objects, and easily manipulated for a diversity of purposes. The package also contains legacy support for early single-end, ungapped alignment formats.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

dataimport sequencing qualitycontrol bioconductor-package core-package zlib cpp

8 stars 12.08 score 1.8k scripts 49 dependents

bioc

slingshot:Tools for ordering single-cell sequencing

Provides functions for inferring continuous, branching lineage structures in low-dimensional data. Slingshot was designed to model developmental trajectories in single-cell RNA sequencing data and serve as a component in an analysis pipeline after dimensionality reduction and clustering. It is flexible enough to handle arbitrarily many branching events and allows for the incorporation of prior knowledge through supervised graph construction.

Maintained by Kelly Street. Last updated 5 months ago.

clustering differentialexpression geneexpression rnaseq sequencing software singlecell transcriptomics visualization

283 stars 12.01 score 1.0k scripts 4 dependents

bioc

QFeatures:Quantitative features for mass spectrometry data

The QFeatures infrastructure enables the management and processing of quantitative features for high-throughput mass spectrometry assays. It provides a familiar Bioconductor user experience to manage quantitative data across different assay levels (such as peptide spectrum matches, peptides and proteins) in a coherent and tractable format.

Maintained by Laurent Gatto. Last updated 24 days ago.

infrastructure massspectrometry proteomics metabolomics bioconductor mass-spectrometry

27 stars 11.87 score 278 scripts 49 dependents

bioc

graph:graph: A package to handle graph data structures

A package that implements some simple graph handling capabilities.

Maintained by Bioconductor Package Maintainer. Last updated 9 days ago.

graphandnetwork

11.86 score 764 scripts 339 dependents

r-forge

copula:Multivariate Dependence with Copulas

Classes (S4) of commonly used elliptical, Archimedean, extreme-value and other copula families, as well as their rotations, mixtures and asymmetrizations. Nested Archimedean copulas, related tools and special functions. Methods for density, distribution, random number generation, bivariate dependence measures, Rosenblatt transform, Kendall distribution function, perspective and contour plots. Fitting of copula models with potentially partly fixed parameters, including standard errors. Serial independence tests, copula specification tests (independence, exchangeability, radial symmetry, extreme-value dependence, goodness-of-fit) and model selection based on cross-validation. Empirical copula, smoothed versions, and non-parametric estimators of the Pickands dependence function.

Maintained by Martin Maechler. Last updated 23 days ago.

11.83 score 1.2k scripts 86 dependents

tiledb-inc

tiledb:Modern Database Engine for Complex Data Based on Multi-Dimensional Arrays

The modern database 'TileDB' introduces a powerful on-disk format for storing and accessing any complex data based on multi-dimensional arrays. It supports dense and sparse arrays, dataframes and key-value stores, cloud storage ('S3', 'GCS', 'Azure'), chunked arrays, multiple compression, encryption and checksum filters, uses a fully multi-threaded implementation, supports parallel I/O, data versioning ('time travel'), metadata and groups. It is implemented as an embeddable cross-platform C++ library with APIs from several languages, and integrations. This package provides the R support.

Maintained by Isaiah Norton. Last updated 21 hours ago.

array hdfs s3 storage-manager tiledb cpp

108 stars 11.79 score 306 scripts 4 dependents

kingaa

pomp:Statistical Inference for Partially Observed Markov Processes

Tools for data analysis with partially observed Markov process (POMP) models (also known as stochastic dynamical systems, hidden Markov models, and nonlinear, non-Gaussian, state-space models). The package provides facilities for implementing POMP models, simulating them, and fitting them to time series data by a variety of frequentist and Bayesian methods. It is also a versatile platform for implementation of inference methods for general POMP models.

Maintained by Aaron A. King. Last updated 8 days ago.

abc b-spline differential-equations dynamical-systems iterated-filtering likelihood likelihood-free markov-chain-monte-carlo markov-model mathematical-modelling measurement-error particle-filter sequential-monte-carlo simulation-based-inference sobol-sequence state-space statistical-inference stochastic-processes time-series openblas

114 stars 11.74 score 1.3k scripts 4 dependents

r-forge

coin:Conditional Inference Procedures in a Permutation Test Framework

Conditional inference procedures for the general independence problem including two-sample, K-sample (non-parametric ANOVA), correlation, censored, ordered and multivariate problems described in <doi:10.18637/jss.v028.i08>.

Maintained by Torsten Hothorn. Last updated 9 months ago.

11.70 score 1.6k scripts 73 dependents

satijalab

SeuratObject:Data Structures for Single Cell Data

Defines S4 classes for single-cell genomic data and associated information, such as dimensionality reduction embeddings, nearest-neighbor graphs, and spatially-resolved coordinates. Provides data access methods and R-native hooks to ensure the Seurat object is familiar to other R users. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, and Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031> for more details.

Maintained by Paul Hoffman. Last updated 2 years ago.

cpp

25 stars 11.69 score 1.2k scripts 88 dependents

luca-scr

GA:Genetic Algorithms

Flexible general-purpose toolbox implementing genetic algorithms (GAs) for stochastic optimisation. Binary, real-valued, and permutation representations are available to optimize a fitness function, i.e. a function provided by users depending on their objective function. Several genetic operators are available and can be combined to explore the best settings for the current task. Furthermore, users can define new genetic operators and easily evaluate their performances. Local search using general-purpose optimisation algorithms can be applied stochastically to exploit interesting regions. GAs can be run sequentially or in parallel, using an explicit master-slave parallelisation or a coarse-grain islands approach. For more details see Scrucca (2013) <doi:10.18637/jss.v053.i04> and Scrucca (2017) <doi:10.32614/RJ-2017-008>.

Maintained by Luca Scrucca. Last updated 7 months ago.

genetic-algorithm optimisation cpp

93 stars 11.58 score 624 scripts 52 dependents

bioc

systemPipeR:systemPipeR: Workflow Environment for Data Analysis and Report Generation

systemPipeR is a multipurpose data analysis workflow environment that unifies R with command-line tools. It enables scientists to analyze many types of large- or small-scale data on local or distributed computer systems with a high level of reproducibility, scalability and portability. At its core is a command-line interface (CLI) that adopts the Common Workflow Language (CWL). This design allows users to choose for each analysis step the optimal R or command-line software. It supports both end-to-end and partial execution of workflows with built-in restart functionalities. Efficient management of complex analysis tasks is accomplished by a flexible workflow control container class. Handling of large numbers of input samples and experimental designs is facilitated by consistent sample annotation mechanisms. As a multi-purpose workflow toolkit, systemPipeR enables users to run existing workflows, customize them or design entirely new ones while taking advantage of widely adopted data structures within the Bioconductor ecosystem. Another important core functionality is the generation of reproducible scientific analysis and technical reports. For result interpretation, systemPipeR offers a wide range of plotting functionality, while an associated Shiny App offers many useful functionalities for interactive result exploration. The vignettes linked from this page include (1) a general introduction, (2) a description of technical details, and (3) a collection of workflow templates.

Maintained by Thomas Girke. Last updated 5 months ago.

genetics infrastructure dataimport sequencing rnaseq riboseq chipseq methylseq snp geneexpression coverage genesetenrichment alignment qualitycontrol immunooncology reportwriting workflowstep workflowmanagement

53 stars 11.52 score 344 scripts 3 dependents

bioc

Rgraphviz:Provides plotting capabilities for R graph objects

Interfaces R with the AT&T Graphviz library for plotting R graph objects from the graph package.

Maintained by Kasper Daniel Hansen. Last updated 2 days ago.

graphandnetwork visualization zlib

11.51 score 1.2k scripts 107 dependents

rudjer

SparseM:Sparse Linear Algebra

Some basic linear algebra functionality for sparse matrices is provided: including Cholesky decomposition and backsolving as well as standard R subsetting and Kronecker products.

Maintained by Roger Koenker. Last updated 8 months ago.

fortran

3 stars 11.47 score 306 scripts 1.5k dependents

bioc

msa:Multiple Sequence Alignment

The 'msa' package provides a unified R/Bioconductor interface to the multiple sequence alignment algorithms ClustalW, ClustalOmega, and Muscle. All three algorithms are integrated in the package; therefore, they do not depend on any external software tools and are available for all major platforms. The multiple sequence alignment algorithms are complemented by a function for pretty-printing multiple sequence alignments using the LaTeX package TeXshade.

Maintained by Ulrich Bodenhofer. Last updated 1 month ago.

multiplesequencealignment alignment multiplecomparison sequencing cpp

17 stars 11.46 score 744 scripts 6 dependents

eddelbuettel

RProtoBuf:R Interface to the 'Protocol Buffers' 'API' (Version 2 or 3)

Protocol Buffers are a way of encoding structured data in an efficient yet extensible format. Google uses Protocol Buffers for almost all of its internal 'RPC' protocols and file formats. Additional documentation is available in two included vignettes, one of which corresponds to our 'JSS' paper (2016, <doi:10.18637/jss.v071.i02>). A sufficiently recent version of the 'Protocol Buffers' library is required; currently version 3.3.0 from 2017 is the stated minimum.

Maintained by Dirk Eddelbuettel. Last updated 12 days ago.

c-plus-plus protocol-buffers protobuf cpp

73 stars 11.44 score 126 scripts 21 dependents

bioc

destiny:Creates diffusion maps

Create and plot diffusion maps.

Maintained by Philipp Angerer. Last updated 4 months ago.

cellbiology cellbasedassays clustering software visualization diffusion-maps dimensionality-reduction cpp

82 stars 11.44 score 792 scripts 1 dependent

bioc

annotate:Annotation for microarrays

Using R environments for annotation.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

annotation pathways go

11.41 score 812 scripts 239 dependents

bioc

PharmacoGx:Analysis of Large-Scale Pharmacogenomic Data

Contains a set of functions to perform large-scale analysis of pharmaco-genomic data. These include the PharmacoSet object for storing the results of pharmacogenomic experiments, as well as a number of functions for computing common summaries of drug-dose response and correlating them with the molecular features in a cancer cell-line.

Maintained by Benjamin Haibe-Kains. Last updated 3 months ago.

geneexpression pharmacogenetics pharmacogenomics software classification datasets pharmacogenomic pharmacogx cpp

68 stars 11.39 score 442 scripts 3 dependents

bioc

biomformat:An interface package for the BIOM file format

This is an R package for interfacing with the BIOM format. This package includes basic tools for reading biom-format files, accessing and subsetting data tables from a biom object (which is more complex than a single table), as well as limited support for writing a biom-object back to a biom-format file. The design of this API is intended to match the Python API and other tools included with the biom-format project, but with a decidedly "R flavor" that should be familiar to R users. This includes S4 classes and methods, as well as extensions of common core functions/methods.

Maintained by Paul J. McMurdie. Last updated 5 months ago.

immunooncology dataimport metagenomics microbiome

7 stars 11.38 score 416 scripts 39 dependents

bioc

XVector:Foundation of external vector representation and manipulation in Bioconductor

Provides memory efficient S4 classes for storing sequences "externally" (e.g. behind an R external pointer, or on disk).

Maintained by Hervé Pagès. Last updated 3 months ago.

infrastructure datarepresentation bioconductor-package core-package zlib

2 stars 11.36 score 67 scripts 1.7k dependents

bioc

gdsfmt:R Interface to CoreArray Genomic Data Structure (GDS) Files

Provides a high-level R interface to CoreArray Genomic Data Structure (GDS) data files. GDS is portable across platforms with hierarchical structure to store multiple scalable array-oriented data sets with metadata information. It is suited for large-scale datasets, especially for data which are much larger than the available random-access memory. The gdsfmt package offers efficient operations specifically designed for integers of less than 8 bits, since a diploid genotype, like single-nucleotide polymorphism (SNP), usually occupies fewer bits than a byte. Data compression and decompression are available with relatively efficient random access. A GDS file can also be read in parallel by multiple R processes, supported by the package parallel.

Maintained by Xiuwen Zheng. Last updated 14 days ago.

infrastructure dataimport bioinformatics gds-format genomics cpp

18 stars 11.34 score 920 scripts 29 dependents
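
The point about sub-byte integers can be illustrated with a minimal sketch that stores four 2-bit genotype codes per byte. The function names, the bit order, and the use of code 3 for a missing genotype are assumptions made for this illustration; this is not the actual GDS on-disk layout.

```python
def pack2(values):
    """Pack integers in 0..3 into bytes, four values per byte (hypothetical layout)."""
    out = bytearray((len(values) + 3) // 4)
    for i, v in enumerate(values):
        if not 0 <= v <= 3:
            raise ValueError("each value must fit in 2 bits")
        # Value i occupies bits 2*(i % 4) and 2*(i % 4) + 1 of byte i // 4.
        out[i // 4] |= v << (2 * (i % 4))
    return bytes(out)


def unpack2(packed, count):
    """Recover the first `count` 2-bit values from a packed byte string."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(count)]


# Assumed coding: 0, 1, 2 = copies of the reference allele, 3 = missing.
genotypes = [0, 1, 2, 3, 2, 1]
packed = pack2(genotypes)
restored = unpack2(packed, len(genotypes))
```

Six genotypes fit in two bytes here, whereas one byte per genotype would need six.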

r-forge

Rmpfr:Interface R to MPFR - Multiple Precision Floating-Point Reliable

Arithmetic (via S4 classes and methods) for arbitrary precision floating point numbers, including transcendental ("special") functions. To this end, the package interfaces to the 'LGPL' licensed 'MPFR' (Multiple Precision Floating-Point Reliable) Library which itself is based on the 'GMP' (GNU Multiple Precision) Library.

Maintained by Martin Maechler. Last updated 4 months ago.

mpfr4 gmp

11.30 score 316 scripts 141 dependents

bioc

MAST:Model-based Analysis of Single Cell Transcriptomics

Methods and models for handling zero-inflated single cell assay data.

Maintained by Andrew McDavid. Last updated 5 months ago.

geneexpression differentialexpression genesetenrichment rnaseq transcriptomics singlecell

232 stars 11.28 score 1.8k scripts 5 dependents

bioc

ggcyto:Visualize Cytometry data with ggplot

With the dedicated fortify method implemented for flowSet, ncdfFlowSet and GatingSet classes, both raw and gated flow cytometry data can be plotted directly with ggplot. The ggcyto wrapper and some custom layers also make it easy to add gates and population statistics to the plot.

Maintained by Mike Jiang. Last updated 5 months ago.

immunooncology flowcytometry cellbasedassays infrastructure visualization

58 stars 11.25 score 362 scripts 5 dependents

pik-piam

magclass:Data Class and Tools for Handling Spatial-Temporal Data

Data class for increased interoperability working with spatial-temporal data together with corresponding functions and methods (conversions, basic calculations and basic data manipulation). The class distinguishes between spatial, temporal and other dimensions to facilitate the development and interoperability of tools built for it. Additional features are name-based addressing of data and internal consistency checks (e.g. checking for the right data order in calculations).

Maintained by Jan Philipp Dietrich. Last updated 22 days ago.

5 stars 11.16 score 412 scripts 56 dependents

bioc

affy:Methods for Affymetrix Oligonucleotide Arrays

The package contains functions for exploratory oligonucleotide array analysis. The dependence on tkWidgets concerns only a few convenience functions. 'affy' is fully functional without it.

Maintained by Robert D. Shear. Last updated 3 months ago.

microarray onechannel preprocessing

11.12 score 2.5k scripts 98 dependents

bioc

genefilter:genefilter: methods for filtering genes from high-throughput experiments

Some basic functions for filtering genes.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

microarray fortran cpp

11.11 score 2.4k scripts 143 dependents

fmichonneau

phylobase:Base Package for Phylogenetic Structures and Comparative Data

Provides a base S4 class for comparative methods, incorporating one or more trees and trait data.

Maintained by Francois Michonneau. Last updated 1 year ago.

phylogenetics cpp

18 stars 11.10 score 394 scripts 18 dependents

rkillick

changepoint:Methods for Changepoint Detection

Implements various mainstream and specialised changepoint methods for finding single and multiple changepoints within data. Many popular non-parametric and frequentist methods are included. The cpt.mean(), cpt.var(), cpt.meanvar() functions should be your first point of call.

Maintained by Rebecca Killick. Last updated 4 months ago.

changepoint segmentation

133 stars 11.05 score 736 scripts 40 dependents

bioc

S4Arrays:Foundation of array-like containers in Bioconductor

The S4Arrays package defines the Array virtual class to be extended by other S4 classes that wish to implement a container with an array-like semantic. It also provides: (1) low-level functionality meant to help the developer of such container to implement basic operations like display, subsetting, or coercion of their array-like objects to an ordinary matrix or array, and (2) a framework that facilitates block processing of array-like objects (typically on-disk objects).

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure datarepresentation bioconductor-package core-package

5 stars 10.99 score 8 scripts 1.2k dependents

bioc

DirichletMultinomial:Dirichlet-Multinomial Mixture Model Machine Learning for Microbiome Data

Dirichlet-multinomial mixture models can be used to describe variability in microbial metagenomic data. This package is an interface to code originally made available by Holmes, Harris, and Quince, 2012, PLoS ONE 7(2): 1-15, as discussed further in the man page for this package, ?DirichletMultinomial.

Maintained by Martin Morgan. Last updated 5 months ago.

immunooncology microbiome sequencing clustering classification metagenomics gsl

10 stars 10.91 score 125 scripts 26 dependents

eddelbuettel

nanotime:Nanosecond-Resolution Time Support for R

Full 64-bit resolution date and time functionality with nanosecond granularity is provided, with easy transition to and from the standard 'POSIXct' type. Three additional classes offer interval, period and duration functionality for nanosecond-resolution timestamps.

Maintained by Dirk Eddelbuettel. Last updated 2 months ago.

datetime datetimes nanosecond-resolution nanoseconds cpp

53 stars 10.91 score 134 scripts 17 dependents
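
To see why nanosecond timestamps call for a 64-bit integer rather than a double (the storage type behind 'POSIXct'), the sketch below round-trips a nanosecond count through a double. The timestamp value and the helper names are hypothetical, chosen only for illustration.

```python
def fits_int64(ns):
    """True if the nanosecond count fits in a signed 64-bit integer."""
    return -2**63 <= ns < 2**63


def double_roundtrip_error(ns):
    """Nanoseconds lost when the count is stored in a double and read back."""
    return abs(int(float(ns)) - ns)


# Hypothetical timestamp: nanoseconds since the Unix epoch (late 2023).
ns = 1_700_000_000_123_456_789

ok = fits_int64(ns)
lost = double_roundtrip_error(ns)
```

A double has a 53-bit significand, so near 1.7e18 adjacent representable values are 256 apart. The round trip therefore cannot preserve an odd nanosecond count, while the 64-bit integer holds it exactly.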

r-forge

MatrixModels:Modelling with Sparse and Dense Matrices

Generalized Linear Modelling with sparse and dense 'Matrix' matrices, using modular prediction and response module classes.

Maintained by Martin Maechler. Last updated 19 hours ago.

1 star 10.90 score 1.5k dependents

metrumresearchgroup

mrgsolve:Simulate from ODE-Based Models

Fast simulation from ordinary differential equation (ODE) based models typically employed in quantitative pharmacology and systems biology.

Maintained by Kyle T Baron. Last updated 9 days ago.

mrgsolve ode openblas cpp

138 stars 10.90 score 1.2k scripts 3 dependents

ecmerkle

blavaan:Bayesian Latent Variable Analysis

Fit a variety of Bayesian latent variable models, including confirmatory factor analysis, structural equation models, and latent growth curve models. References: Merkle & Rosseel (2018) <doi:10.18637/jss.v085.i04>; Merkle et al. (2021) <doi:10.18637/jss.v100.i06>.

Maintained by Edgar Merkle. Last updated 9 days ago.

bayesian-statistics factor-analysis growth-curve-models latent-variables missing-data multilevel-models multivariate-analysis path-analysis psychometrics statistical-modeling structural-equation-modeling cpp

92 stars 10.84 score 183 scripts 3 dependents

welch-lab

rliger:Linked Inference of Genomic Experimental Relationships

Uses an extension of nonnegative matrix factorization to identify shared and dataset-specific factors. See Welch J, Kozareva V, et al (2019) <doi:10.1016/j.cell.2019.05.006>, and Liu J, Gao C, Sodicoff J, et al (2020) <doi:10.1038/s41596-020-0391-8> for more details.

Maintained by Yichen Wang. Last updated 2 months ago.

nonnegative-matrix-factorization single-cell openblas cpp

408 stars 10.77 score 334 scripts 1 dependent

bioc

GWASTools:Tools for Genome Wide Association Studies

Classes for storing very large GWAS data sets and annotation, and functions for GWAS data cleaning and analysis.

Maintained by Stephanie M. Gogarten. Last updated 10 days ago.

snp geneticvariability qualitycontrol microarray

17 stars 10.67 score 396 scripts 5 dependents

tyee001

VGAM:Vector Generalized Linear and Additive Models

An implementation of about 6 major classes of statistical regression models. The central algorithm is Fisher scoring and iteratively reweighted least squares. At the heart of this package are the vector generalized linear and additive model (VGLM/VGAM) classes. VGLMs can be loosely thought of as multivariate GLMs. VGAMs are data-driven VGLMs that use smoothing. The book "Vector Generalized Linear and Additive Models: With an Implementation in R" (Yee, 2015) <DOI:10.1007/978-1-4939-2818-7> gives details of the statistical framework and the package. Currently only fixed-effects models are implemented. Many (100+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE. The other classes are RR-VGLMs (reduced-rank VGLMs), quadratic RR-VGLMs, doubly constrained RR-VGLMs, reduced-rank VGAMs, and RCIMs (row-column interaction models); these classes perform constrained and unconstrained quadratic ordination (CQO/UQO) models in ecology, as well as constrained additive ordination (CAO). Hauck-Donner effect detection is implemented. Note that these functions are subject to change; see the NEWS and ChangeLog files for latest changes.

Maintained by Thomas Yee. Last updated 1 month ago.

fortran

10 stars 10.67 score 3.6k scripts 169 dependents

r-lum

Luminescence:Comprehensive Luminescence Dating Data Analysis

A collection of various R functions for the purpose of Luminescence dating data analysis. This includes, amongst others, data import, export, application of age models, curve deconvolution, sequence analysis and plotting of equivalent dose distributions.

Maintained by Sebastian Kreutzer. Last updated 1 day ago.

bayesian-statistics data-science geochronology luminescence luminescence-dating open-science osl plotting radiofluorescence tl xsyg cpp

15 stars 10.67 score 178 scripts 8 dependents

zdebruine

RcppML:Rcpp Machine Learning Library

Fast machine learning algorithms including matrix factorization and divisive clustering for large sparse and dense matrices.

Maintained by Zach DeBruine. Last updated 2 years ago.

clustering matrix-factorization nmf rcpp rcppeigen sparse-matrix cpp openmp

107 stars 10.66 score 125 scripts 50 dependents

ohdsi

FeatureExtraction:Generating Features for a Cohort

An R interface for generating features for a cohort using data in the Common Data Model. Features can be constructed using default or custom-made feature definitions. Furthermore, it is possible to aggregate features and get summary statistics.

Maintained by Ger Inberg. Last updated 8 days ago.

hades openjdk

62 stars 10.64 score 209 scripts 2 dependents

predictiveecology

SpaDES.core:Core Utilities for Developing and Running Spatially Explicit Discrete Event Models

Provides the core framework for a discrete event system to implement a complete data-to-decisions, reproducible workflow. The core components facilitate the development of modular pieces, and enable the user to include additional functionality by running user-built modules. Includes conditional scheduling, restart after interruption, packaging of reusable modules, tools for developing arbitrary automated workflows, automated interweaving of modules of different temporal resolution, and tools for visualizing and understanding the within-project dependencies. The suggested package 'NLMR' can be installed from the repository (<https://PredictiveEcology.r-universe.dev>).

Maintained by Eliot J B McIntire. Last updated 1 month ago.

discrete-events-simulations simulation-framework simulation-modeling

10 stars 10.61 score 142 scripts 6 dependents

valentint

rrcov:Scalable Robust Estimators with High Breakdown Point

Robust Location and Scatter Estimation and Robust Multivariate Analysis with High Breakdown Point: principal component analysis (Filzmoser and Todorov (2013), <doi:10.1016/j.ins.2012.10.017>), linear and quadratic discriminant analysis (Todorov and Pires (2007)), multivariate tests (Todorov and Filzmoser (2010) <doi:10.1016/j.csda.2009.08.015>), outlier detection (Todorov et al. (2010) <doi:10.1007/s11634-010-0075-2>). See also Todorov and Filzmoser (2009) <urn:isbn:978-3838108148>, Todorov and Filzmoser (2010) <doi:10.18637/jss.v032.i03> and Boudt et al. (2019) <doi:10.1007/s11222-019-09869-x>.

Maintained by Valentin Todorov. Last updated 7 months ago.

fortran openblas

2 stars 10.57 score 484 scripts 96 dependents

bioc

seqLogo:Sequence logos for DNA sequence alignments

seqLogo takes the position weight matrix of a DNA sequence motif and plots the corresponding sequence logo as introduced by Schneider and Stephens (1990).

Maintained by Robert Ivanek. Last updated 5 months ago.

sequencematching

4 stars 10.57 score 304 scripts 29 dependents

bioc

ORFik:Open Reading Frames in Genomics

R package for analysis of transcript and translation features through manipulation of sequence data and NGS data like Ribo-Seq, RNA-Seq, TCP-Seq and CAGE. It is generalized in the sense that any transcript region can be analysed, although, as the name hints, it was made with the investigation of ribosomal patterns over Open Reading Frames (ORFs) as its primary use case. ORFik is extremely fast through use of C++, data.table and GenomicRanges. The package supports reassigning transcript starts using CAGE-Seq data, automatic shifting of Ribo-Seq reads, finding Open Reading Frames for whole genomes, and much more.

Maintained by Haakon Tjeldnes. Last updated 1 month ago.

immunooncology software sequencing riboseq rnaseq functionalgenomics coverage alignment dataimport cpp

33 stars 10.56 score 115 scripts 2 dependents

wrathematics

float:32-Bit Floats

R comes with a suite of utilities for linear algebra with "numeric" (double precision) vectors/matrices. However, sometimes single precision (or less!) is more than enough for a particular task. This package extends R's linear algebra facilities to include 32-bit float (single precision) data. Float vectors/matrices have half the precision of their "numeric"-type counterparts but are generally faster to numerically operate on, for a performance vs accuracy trade-off. The internal representation is an S4 class, which allows us to keep the syntax identical to that of base R. Interaction between floats and base types for binary operators is generally possible; in these cases, type promotion always defaults to the higher precision. The package ships with copies of the single precision 'BLAS' and 'LAPACK', which are automatically built in the event they are not available on the system.

Maintained by Drew Schmidt. Last updated 19 days ago.

float-matrix hpc linear-algebra matrix fortran openblas openmp

46 stars 10.53 score 228 scripts 42 dependents
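
The performance vs accuracy trade-off can be made concrete with a small sketch that rounds a double to the nearest 32-bit float using Python's standard `struct` module. The helper name `to_float32` is an assumption for this illustration and is unrelated to the package's own API.

```python
import struct


def to_float32(x):
    """Round a double to the nearest 32-bit float, returned as a Python float."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Storage per element: 4 bytes in single precision, 8 bytes in double precision.
single_bytes = struct.calcsize("f")
double_bytes = struct.calcsize("d")

# 0.1 is not exactly representable in either format; the single precision
# value is further from the double than the double is from the true value.
x = 0.1
rounding_error = abs(to_float32(x) - x)

# 0.5 is a power of two and survives the conversion exactly.
exact = to_float32(0.5) == 0.5
```

Single precision halves the memory per element, at the cost of a rounding error here on the order of 1e-9 relative to the double value.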

bioc

miloR:Differential neighbourhood abundance testing on a graph

Milo performs single-cell differential abundance testing. Cell states are modelled as representative neighbourhoods on a nearest neighbour graph. Hypothesis testing is performed using either a negative binomial generalized linear model or negative binomial generalized linear mixed model.

Maintained by Mike Morgan. Last updated 5 months ago.

singlecell multiplecomparison functionalgenomics software openblas cpp openmp

357 stars 10.49 score 340 scripts 1 dependent

crunch-io

crunch:Crunch.io Data Tools

The Crunch.io service <https://crunch.io/> provides a cloud-based data store and analytic engine, as well as an intuitive web interface. Using this package, analysts can interact with and manipulate Crunch datasets from within R. Importantly, this allows technical researchers to collaborate naturally with team members, managers, and clients who prefer a point-and-click interface.

Maintained by Greg Freedman Ellis. Last updated 7 days ago.

9 stars 10.47 score 200 scripts 2 dependents

wrathematics

ngram:Fast n-Gram 'Tokenization'

An n-gram is a sequence of n "words" taken, in order, from a body of text. This is a collection of utilities for creating, displaying, summarizing, and "babbling" n-grams. The 'tokenization' and "babbling" are handled by very efficient C code, which can even be built as its own standalone library. The babbler is a simple Markov chain. The package also offers a vignette with complete example 'workflows' and information about the utilities offered in the package.

Maintained by Drew Schmidt. Last updated 1 year ago.

ngram text text-mining

71 stars 10.45 score 844 scripts 7 dependents
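
The definition of an n-gram and the idea of a Markov-chain babbler can be sketched in a few lines. This is an illustration of the concepts under simple assumptions (whitespace tokenization, uniform sampling among observed continuations); the function names are hypothetical and this is not the package's C implementation.

```python
import random
from collections import defaultdict


def ngrams(text, n):
    """All runs of n consecutive whitespace-separated words, in order."""
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def babble(text, n, length, seed=0):
    """Generate up to `length` words from a Markov chain over n-grams (n >= 2)."""
    if n < 2:
        raise ValueError("this sketch needs n >= 2")
    grams = ngrams(text, n)
    if not grams:
        return []
    # Map each (n-1)-word prefix to the words observed right after it.
    followers = defaultdict(list)
    for gram in grams:
        followers[gram[:-1]].append(gram[-1])
    rng = random.Random(seed)
    out = list(rng.choice(grams)[:-1])  # start from a random observed prefix
    while len(out) < length:
        choices = followers.get(tuple(out[-(n - 1):]))
        if not choices:  # prefix only seen at the very end of the text
            break
        out.append(rng.choice(choices))
    return out[:length]


corpus = "the cat sat on the mat and the cat ran"
bigrams = ngrams(corpus, 2)
words = babble(corpus, 2, 6)
```

Each generated word depends only on the preceding n-1 words, which is what makes the babbler a simple Markov chain.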

bioc

ChemmineR:Cheminformatics Toolkit for R

ChemmineR is a cheminformatics package for analyzing drug-like small molecule data in R. Its latest version contains functions for efficient processing of large numbers of molecules, physicochemical/structural property predictions, structural similarity searching, classification and clustering of compound libraries with a wide spectrum of algorithms. In addition, it offers visualization functions for compound clustering results and chemical structures.

Maintained by Thomas Girke. Last updated 5 months ago.

cheminformatics biomedicalinformatics pharmacogenetics pharmacogenomics microtitreplateassay cellbasedassays visualization infrastructure dataimport clustering proteomics metabolomics cpp

15 stars 10.45 score 253 scripts 12 dependents

bioc

oligo:Preprocessing tools for oligonucleotide arrays

A package to analyze oligonucleotide arrays (expression/SNP/tiling/exon) at probe-level. It currently supports Affymetrix (CEL files) and NimbleGen arrays (XYS files).

Maintained by Benilton Carvalho. Last updated 20 days ago.

microarray onechannel twochannel preprocessing snp differentialexpression exonarray geneexpression dataimport zlib

3 stars 10.42 score 528 scripts 10 dependents

mhahsler

recommenderlab:Lab for Developing and Testing Recommender Algorithms

Provides a research infrastructure to develop and evaluate collaborative filtering recommender algorithms. This includes a sparse representation for user-item matrices, many popular algorithms, top-N recommendations, and cross-validation. Hahsler (2022) <doi:10.48550/arXiv.2205.12371>.

Maintained by Michael Hahsler. Last updated 3 days ago.

collaborative-filtering recommender-system

214 stars 10.42 score 840 scripts 2 dependents

adeverse

adegraphics:An S4 Lattice-Based Package for the Representation of Multivariate Data

Graphical functionalities for the representation of multivariate data. It is a complete re-implementation of the functions available in the 'ade4' package.

Maintained by Aurélie Siberchicot. Last updated 8 months ago.

9 stars 10.37 score 386 scripts 6 dependents

bioc

flowCore:flowCore: Basic structures for flow cytometry data

Provides S4 data structures and basic functions to deal with flow cytometry data.

Maintained by Mike Jiang. Last updated 5 months ago.

immunooncology infrastructure flowcytometry cellbasedassays cpp

10.34 score 1.7k scripts 59 dependents

agrdatasci

gdistance:Distances and Routes on Geographical Grids

Provides classes and functions to calculate various distance measures and routes in heterogeneous geographic spaces represented as grids. The package implements measures to model dispersal histories first presented by van Etten and Hijmans (2010) <doi:10.1371/journal.pone.0012060>. Least-cost distances as well as more complex distances based on (constrained) random walks can be calculated. The distances implemented in the package are used in geographical genetics, accessibility indicators, and may also have applications in other fields of geospatial analysis.

Maintained by Andrew Marx. Last updated 1 year ago.

17 stars 10.34 score 478 scripts 23 dependents

bioc

GSEABase:Gene set enrichment data structures and methods

This package provides classes and methods to support Gene Set Enrichment Analysis (GSEA).

Maintained by Bioconductor Package Maintainer. Last updated 2 months ago.

geneexpression genesetenrichment graphandnetwork go kegg

10.27 score 1.5k scripts 77 dependents

bioc

CAMERA:Collection of annotation related methods for mass spectrometry data

Annotation of peaklists generated by xcms, rule-based annotation of isotopes and adducts, isotope validation, and EIC correlation-based tagging of unknown adducts and fragments.

Maintained by Steffen Neumann. Last updated 5 months ago.

immunooncology massspectrometry metabolomics

11 stars 10.27 score 175 scripts 6 dependents

bioc

BASiCS:Bayesian Analysis of Single-Cell Sequencing data

Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model to perform statistical analyses of single-cell RNA sequencing datasets in the context of supervised experiments (where the groups of cells of interest are known a priori, e.g. experimental conditions or cell types). BASiCS performs built-in data normalisation (global scaling) and technical noise quantification (based on spike-in genes). BASiCS provides an intuitive detection criterion for highly (or lowly) variable genes within a single group of cells. Additionally, BASiCS can compare gene expression patterns between two or more pre-specified groups of cells. Unlike traditional differential expression tools, BASiCS quantifies changes in expression that lie beyond comparisons of means, also allowing the study of changes in cell-to-cell heterogeneity. The latter can be quantified via a biological over-dispersion parameter that measures the excess of variability that is observed with respect to Poisson sampling noise, after normalisation and technical noise removal. Due to the strong mean/over-dispersion confounding that is typically observed for scRNA-seq datasets, BASiCS also tests for changes in residual over-dispersion, defined by residual values with respect to a global mean/over-dispersion trend.

Maintained by Catalina Vallejos. Last updated 5 months ago.

immunooncology normalization sequencing rnaseq software geneexpression transcriptomics singlecell differentialexpression bayesian cellbiology bioconductor-package gene-expression rcpp rcpparmadillo scrna-seq single-cell openblas cpp openmp

83 stars 10.26 score 368 scripts 1 dependent

bioc

zinbwave:Zero-Inflated Negative Binomial Model for RNA-Seq Data

Implements a general and flexible zero-inflated negative binomial model that can be used to provide a low-dimensional representation of single-cell RNA-seq data. The model accounts for zero inflation (dropouts), over-dispersion, and the count nature of the data. The model also accounts for the difference in library sizes and optionally for batch effects and/or other covariates, avoiding the need to pre-normalize the data.

Maintained by Davide Risso. Last updated 5 months ago.

immunooncology dimensionreduction geneexpression rnaseq software transcriptomics sequencing singlecell

43 stars 10.21 score 190 scripts 6 dependents

bioc

BiocIO:Standard Input and Output for Bioconductor Packages

The `BiocIO` package contains high-level abstract classes and generics used by developers to build IO functionality within the Bioconductor suite of packages. Implements `import()` and `export()` standard generics for importing and exporting biological data formats. `import()` supports whole-file as well as chunk-wise iterative import. The `import()` interface optionally provides a standard mechanism for 'lazy' access via `filter()` (on row or element-like components of the file resource), `select()` (on column-like components of the file resource) and `collect()`. The `import()` interface optionally provides transparent access to remote (e.g. via https) as well as local access. Developers can register a file extension, e.g., `.loom` for dispatch from character-based URIs to specific `import()` / `export()` methods based on classes representing file types, e.g., `LoomFile()`.

Maintained by Marcel Ramos. Last updated 4 months ago.

annotation dataimport bioconductor-package core-package

1 star 10.20 score 19 scripts 492 dependents

bioc

AnnotationFilter:Facilities for Filtering Bioconductor Annotation Resources

This package provides class and other infrastructure to implement filters for manipulating Bioconductor annotation resources. The filters will be used by ensembldb, Organism.dplyr, and other packages.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

annotation infrastructure software bioconductor-package core-package

5 stars 10.19 score 45 scripts 160 dependents

bioc

BiocNeighbors:Nearest Neighbor Detection for Bioconductor Packages

Implements exact and approximate methods for nearest neighbor detection, in a framework that allows them to be easily switched within Bioconductor packages or workflows. Exact searches can be performed using the k-means for k-nearest neighbors algorithm or with vantage point trees. Approximate searches can be performed using the Annoy or HNSW libraries. Searching on either Euclidean or Manhattan distances is supported. Parallelization is achieved for all methods by using BiocParallel. Functions are also provided to search for all neighbors within a given distance.

Maintained by Aaron Lun. Last updated 24 days ago.

clustering classification cpp

10.14 score 646 scripts 89 dependents

geoffjentry

twitteR:R Based Twitter Client

Provides an interface to the Twitter web API.

Maintained by Jeff Gentry. Last updated 9 years ago.

254 stars 10.12 score 2.0k scripts 1 dependent

stewid

SimInf:A Framework for Data-Driven Stochastic Disease Spread Simulations

Provides an efficient and very flexible framework to conduct data-driven epidemiological modeling in realistic large scale disease spread simulations. The framework integrates infection dynamics in subpopulations as continuous-time Markov chains using the Gillespie stochastic simulation algorithm and incorporates available data such as births, deaths and movements as scheduled events at predefined time-points. Using C code for the numerical solvers and 'OpenMP' (if available) to divide work over multiple processors ensures high performance when simulating a sample outcome. One of our design goals was to make the package extendable and enable usage of the numerical solvers from other R extension packages in order to facilitate complex epidemiological research. The package contains template models and can be extended with user-defined models. For more details see the paper by Widgren, Bauer, Eriksson and Engblom (2019) <doi:10.18637/jss.v091.i12>. The package also provides functionality to fit models to time series data using the Approximate Bayesian Computation Sequential Monte Carlo ('ABC-SMC') algorithm of Toni and others (2009) <doi:10.1098/rsif.2008.0172>.

Maintained by Stefan Widgren. Last updated 17 days ago.

data-driven epidemiology high-performance-computing markov-chain mathematical-modelling gsl openmp

35 stars 10.09 score 227 scripts

mages

ChainLadder:Statistical Methods and Models for Claims Reserving in General Insurance

Various statistical methods and models which are typically used for the estimation of outstanding claims reserves in general insurance, including those to estimate the claims development result as required under Solvency II.

Maintained by Markus Gesmann. Last updated 2 months ago.

82 stars 10.04 score 196 scripts 2 dependents

insightsengineering

teal.data:Data Model for 'teal' Applications

Provides a 'teal_data' class as a unified data model for 'teal' applications focusing on reproducibility and relational data.

Maintained by Dawid Kaledkowski. Last updated 2 months ago.

data-model nest

11 stars 9.93 score 44 scripts 8 dependents

bioc

methylumi:Handle Illumina methylation data

This package provides classes for holding and manipulating Illumina methylation data. Based on eSet, it can contain MIAME information, sample information, feature information, and multiple matrices of data. An "intelligent" import function, methylumiR, can read the Illumina text files and create a MethyLumiSet. methylumIDAT can directly read raw IDAT files from HumanMethylation27 and HumanMethylation450 microarrays. Normalization, background correction, and quality control features for GoldenGate, Infinium, and Infinium HD arrays are also included.

Maintained by Sean Davis. Last updated 5 months ago.

dnamethylation twochannel preprocessing qualitycontrol cpgisland

9 stars 9.90 score 89 scripts 9 dependents

shinra-dev

memuse:Memory Estimation Utilities

How much RAM do you need to store a 100,000 by 100,000 matrix? How much RAM is your current R session using? How much RAM do you even have? Learn the scintillating answer to these and many more such questions with the 'memuse' package.

Maintained by Drew Schmidt. Last updated 2 years ago.

memory-estimation

46 stars 9.84 score 142 scripts 33 dependents
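
The first question in the description reduces to simple arithmetic, sketched below. The sketch assumes a dense matrix of 8-byte double precision values and counts only the raw data, ignoring any object overhead; the function names are hypothetical and not part of the package.

```python
def matrix_bytes(nrow, ncol, bytes_per_element=8):
    """Bytes needed for the raw data of a dense nrow x ncol matrix."""
    return nrow * ncol * bytes_per_element


def to_gib(n_bytes):
    """Convert a byte count to gibibytes (2**30 bytes)."""
    return n_bytes / 2**30


size = matrix_bytes(100_000, 100_000)
print(f"{size} bytes = {to_gib(size):.1f} GiB")  # 80000000000 bytes = 74.5 GiB
```

Under these assumptions the matrix needs about 74.5 GiB, more than most workstations have available.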

ubod

apcluster:Affinity Propagation Clustering

Implements Affinity Propagation clustering introduced by Frey and Dueck (2007) <DOI:10.1126/science.1136800>. The algorithms are largely analogous to the 'Matlab' code published by Frey and Dueck. The package further provides leveraged affinity propagation and an algorithm for exemplar-based agglomerative clustering that can also be used to join clusters obtained from affinity propagation. Various plotting functions are available for analyzing clustering results.

Maintained by Ulrich Bodenhofer. Last updated 11 months ago.

cpp

10 stars 9.81 score 270 scripts 25 dependents

prestodb

RPresto:DBI Connector to Presto

Implements a 'DBI' compliant interface to Presto. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes: <https://prestodb.io/>.

Maintained by Jarod G.R. Meng. Last updated 1 month ago.

132 stars 9.73 score 25 scripts 4 dependents

bioc

biocViews:Categorized views of R package repositories

Infrastructure to support 'views' used to classify Bioconductor packages. 'biocViews' are directed acyclic graphs of terms from a controlled vocabulary. There are three major classifications, corresponding to 'software', 'annotation', and 'experiment data' packages.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

infrastructure bioconductor-package core-package

4 stars 9.71 score 30 scripts 14 dependents

bioc

MicrobiotaProcess:A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework

MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces the MPSE class, which makes it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under a unified and common framework (tidy-like framework).

Maintained by Shuangbin Xu. Last updated 5 months ago.

visualization microbiome software multiplecomparison featureextraction microbiome-analysis microbiome-data

183 stars 9.70 score 126 scripts 1 dependent

sdctools

sdcMicro:Statistical Disclosure Control Methods for Anonymization of Data and Risk Estimation

Data from statistical agencies and other institutions are mostly confidential. This package, introduced in Templ, Kowarik and Meindl (2017) <doi:10.18637/jss.v067.i04>, can be used for the generation of anonymized (micro)data, i.e. for the creation of public- and scientific-use files. The theoretical basis for the methods implemented can be found in Templ (2017) <doi:10.1007/978-3-319-50272-4>. Various risk estimation and anonymization methods are included. Note that the package includes a graphical user interface, published in Meindl and Templ (2019) <doi:10.3390/a12090191>, that provides access to various methods of this package.

Maintained by Matthias Templ. Last updated 1 month ago.

cpp

84 stars 9.63 score 258 scripts

bioc

clusterExperiment:Compare Clusterings for Single-Cell Sequencing

Provides functionality for running and comparing many different clusterings of single-cell sequencing data or other large mRNA expression data sets.

Maintained by Elizabeth Purdom. Last updated 5 months ago.

clustering rnaseq sequencing software singlecell cpp

38 stars 9.62 score 192 scripts 1 dependents

bioc

cytomapper:Visualization of highly multiplexed imaging data in R

Highly multiplexed imaging acquires the single-cell expression of selected proteins in a spatially-resolved fashion. These measurements can be visualised across multiple length-scales. First, pixel-level intensities represent the spatial distributions of feature expression with highest resolution. Second, after segmentation, expression values or cell-level metadata (e.g. cell-type information) can be visualised on segmented cell areas. This package contains functions for the visualisation of multiplexed read-outs and cell-level information obtained by multiplexed imaging technologies. The main functions of this package allow 1. the visualisation of pixel-level information across multiple channels, 2. the display of cell-level information (expression and/or metadata) on segmentation masks and 3. gating and visualisation of single cells.

Maintained by Lasse Meyer. Last updated 5 months ago.

immunooncology software singlecell onechannel twochannel multiplecomparison normalization dataimport bioimaging imaging-mass-cytometry single-cell spatial-analysis

32 stars 9.61 score 354 scripts 5 dependents

edzer

intervals:Tools for Working with Points and Intervals

Tools for working with and comparing sets of points and intervals.

Maintained by Edzer Pebesma. Last updated 7 months ago.

cpp

11 stars 9.50 score 122 scripts 98 dependents

bioc

snpStats:SnpMatrix and XSnpMatrix classes and methods

Classes and statistical methods for large SNP association studies. This extends the earlier snpMatrix package, allowing for uncertainty in genotypes.

Maintained by David Clayton. Last updated 5 months ago.

microarray snp geneticvariability zlib

9.48 score 674 scripts 20 dependents

bioc

RcisTarget:RcisTarget Identify transcription factor binding motifs enriched on a list of genes or genomic regions

RcisTarget identifies transcription factor binding motifs (TFBS) over-represented on a gene list. In a first step, RcisTarget selects DNA motifs that are significantly over-represented in the surroundings of the transcription start site (TSS) of the genes in the gene-set. This is achieved by using a database that contains genome-wide cross-species rankings for each motif. The motifs are then annotated to TFs and those that have a high Normalized Enrichment Score (NES) are retained. Finally, for each motif and gene-set, RcisTarget predicts the candidate target genes (i.e. genes in the gene-set that are ranked above the leading edge).

Maintained by Gert Hulselmans. Last updated 5 months ago.

generegulation motifannotation transcriptomics transcription genesetenrichment genetarget

37 stars 9.47 score 191 scripts

bioc

bluster:Clustering Algorithms for Bioconductor

Wraps common clustering algorithms in an easily extended S4 framework. Backends are implemented for hierarchical, k-means and graph-based clustering. Several utilities are also provided to compare and evaluate clustering results.

Maintained by Aaron Lun. Last updated 5 months ago.

immunooncology software geneexpression transcriptomics singlecell clustering cpp

9.43 score 636 scripts 51 dependents

eagerai

fastai:Interface to 'fastai'

The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research into deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.

Maintained by Turgut Abdullayev. Last updated 12 months ago.

audio collaborative-filtering darknet darknet-image-classification fastai medical object-detection tabular text vision

118 stars 9.40 score 76 scripts

bioc

SpatialFeatureExperiment:Integrating SpatialExperiment with Simple Features in sf

A new S4 class integrating Simple Features with the R package sf to bring geospatial data analysis methods based on vector data to spatial transcriptomics. Also implements management of spatial neighborhood graphs and geometric operations. This package builds upon SpatialExperiment and SingleCellExperiment, hence methods for these parent classes can still be used.

Maintained by Lambda Moses. Last updated 2 months ago.

datarepresentation transcriptomics spatial

49 stars 9.40 score 322 scripts 1 dependents

bioc

ggmsa:Plot Multiple Sequence Alignment using 'ggplot2'

A visual exploration tool for multiple sequence alignment and associated data. Supports MSA of DNA, RNA, and protein sequences using 'ggplot2'. Multiple sequence alignments can easily be combined with other 'ggplot2' plots, such as phylogenetic trees visualized by 'ggtree', boxplots, genome maps and so on. More features: visualization of sequence logos, sequence bundles, RNA secondary structures and detection of sequence recombinations.

Maintained by Guangchuang Yu. Last updated 3 months ago.

software visualization alignment annotation multiplesequencealignment

210 stars 9.35 score 196 scripts 2 dependents

hauselin

ollamar:'Ollama' Language Models

An interface to easily run local language models with 'Ollama' <https://ollama.com> server and API endpoints (see <https://github.com/ollama/ollama/blob/main/docs/api.md> for details). It lets you run open-source large language models locally on your machine.

Maintained by Hause Lin. Last updated 4 days ago.

ai api llm llms ollama ollama-api

89 stars 9.32 score 74 scripts 5 dependents

ohdsi

Andromeda:Asynchronous Disk-Based Representation of Massive Data

Storing very large data objects on a local drive, while still making it possible to manipulate the data in an efficient manner.

Maintained by Martijn Schuemie. Last updated 7 months ago.

hades

11 stars 9.29 score 57 scripts 8 dependents

otoomet

maxLik:Maximum Likelihood Estimation and Related Tools

Functions for Maximum Likelihood (ML) estimation, non-linear optimization, and related tools. It includes a unified way to call different optimizers, and classes and methods to handle the results from the Maximum Likelihood viewpoint. It also includes a number of convenience tools for testing and developing your own models.

Maintained by Ott Toomet. Last updated 1 year ago.

9.14 score 480 scripts 110 dependents

david-cortes

MatrixExtra:Extra Methods for Sparse Matrices

Extends sparse matrix and vector classes from the 'Matrix' package by providing: (a) Methods and operators that work natively on CSR formats (compressed sparse row, a.k.a. 'RsparseMatrix') such as slicing/sub-setting, assignment, rbind(), mathematical operators for CSR and COO such as addition ("+") or sqrt(), and methods such as diag(); (b) Multi-threaded matrix multiplication and cross-product for many <sparse, dense> types, including the 'float32' type from 'float'; (c) Coercion methods between pairs of classes which are not present in 'Matrix', such as 'dgCMatrix' -> 'ngRMatrix', as well as convenience conversion functions; (d) Utility functions for sparse matrices such as sorting the indices or removing zero-valued entries; (e) Fast transposes that work by outputting in the opposite storage format; (f) Faster replacements for many 'Matrix' methods for all sparse types, such as slicing and elementwise multiplication; (g) Convenience functions for sparse objects, such as 'mapSparse' or a shorter 'show' method.

Maintained by David Cortes. Last updated 9 months ago.

csr sparse-matrix openblas cpp openmp

20 stars 9.08 score 84 scripts 29 dependents

insightsengineering

teal.code:Code Storage and Execution Class for 'teal' Applications

Introduction of 'qenv' S4 class, that facilitates code execution and reproducibility in 'teal' applications.

Maintained by Dawid Kaledkowski. Last updated 1 month ago.

nest shiny

12 stars 9.03 score 11 scripts 9 dependents

bioc

topGO:Enrichment Analysis for Gene Ontology

The topGO package provides tools for testing GO terms while accounting for the topology of the GO graph. Different test statistics and different methods for eliminating local similarities and dependencies between GO terms can be implemented and applied.

Maintained by Adrian Alexa. Last updated 5 months ago.

microarray visualization

8.96 score 2.0k scripts 20 dependents

bpfaff

urca:Unit Root and Cointegration Tests for Time Series Data

Unit root and cointegration tests encountered in applied econometric analysis are implemented.

Maintained by Bernhard Pfaff. Last updated 10 months ago.

fortran

6 stars 8.95 score 1.4k scripts 270 dependents

bioc

RaggedExperiment:Representation of Sparse Experiments and Assays Across Samples

This package provides a flexible representation of copy number, mutation, and other data that fit into the ragged array schema for genomic location data. The basic representation of such data provides a rectangular flat table interface to the user with range information in the rows and samples/specimen in the columns. The RaggedExperiment class derives from a GRangesList representation and provides a semblance of a rectangular dataset.

Maintained by Marcel Ramos. Last updated 4 months ago.

infrastructure datarepresentation copynumber core-package data-structure mutations u24ca289073

4 stars 8.93 score 76 scripts 14 dependents

bioc

marray:Exploratory analysis for two-color spotted microarray data

Class definitions for two-color spotted microarray data. Functions for data input, diagnostic plots, normalization and quality checking.

Maintained by Yee Hwa (Jean) Yang. Last updated 5 months ago.

microarray twochannel preprocessing

8.92 score 222 scripts 38 dependents

quanteda

quanteda.textstats:Textual Statistics for the Quantitative Analysis of Textual Data

Textual statistics functions formerly in the 'quanteda' package. Textual statistics for characterizing and comparing textual data. Includes functions for measuring term and document frequency, the co-occurrence of words, similarity and distance between features and documents, feature entropy, keyword occurrence, readability, and lexical diversity. These functions extend the 'quanteda' package and are specially designed for sparse textual data.

Maintained by Kenneth Benoit. Last updated 7 months ago.

onetbb cpp

15 stars 8.91 score 916 scripts 10 dependents

cran

XML:Tools for Parsing and Generating XML Within R and S-Plus

Many approaches for both reading and creating XML (and HTML) documents (including DTDs), both local and accessible via HTTP or FTP. Also offers access to an 'XPath' "interpreter".

Maintained by CRAN Team. Last updated 3 months ago.

libxml2

3 stars 8.87 score 1.3k dependents

bioc

AnVIL:Bioconductor on the AnVIL compute environment

The AnVIL is a cloud computing resource developed in part by the National Human Genome Research Institute. The AnVIL package provides end-user and developer functionality. For the end-user, AnVIL provides fast binary package installation, utilities for working with Terra / AnVIL table and data resources, and convenient functions for file movement to and from Google cloud storage. For developers, AnVIL provides programmatic access to the Terra, Leonardo, Rawls, and Dockstore RESTful programming interface, including helper functions to transform JSON responses to formats more amenable to manipulation in R.

Maintained by Marcel Ramos. Last updated 1 month ago.

infrastructure

8.85 score 250 scripts 11 dependents

kollerma

robustlmm:Robust Linear Mixed Effects Models

Implements the Robust Scoring Equations estimator to fit linear mixed effects models robustly. Robustness is achieved by modification of the scoring equations combined with the Design Adaptive Scale approach.

Maintained by Manuel Koller. Last updated 1 year ago.

openblas cpp

28 stars 8.79 score 138 scripts

flr

FLCore:Core Package of FLR, Fisheries Modelling in R

Core classes and methods for FLR, a framework for fisheries modelling and management strategy simulation in R. Developed by a team of fisheries scientists in various countries. More information can be found at <http://flr-project.org/>.

Maintained by Iago Mosqueira. Last updated 8 days ago.

fisheries flr fisheries-modelling

16 stars 8.78 score 956 scripts 23 dependents

r-forge

distr:Object Oriented Implementation of Distributions

S4-classes and methods for distributions.

Maintained by Peter Ruckdeschel. Last updated 2 months ago.

8.77 score 327 scripts 32 dependents

bioc

SeqVarTools:Tools for variant data

An interface to the fast-access storage format for VCF data provided in SeqArray, with tools for common operations and analysis.

Maintained by Stephanie M. Gogarten. Last updated 5 months ago.

snp geneticvariability sequencing genetics

3 stars 8.76 score 384 scripts 2 dependents

bart1

move:Visualizing and Analyzing Animal Track Data

Contains functions to access movement data stored in 'movebank.org' as well as tools to visualize and statistically analyze animal movement data, including functions to calculate dynamic Brownian Bridge Movement Models. Move helps address movement ecology questions.

Maintained by Bart Kranstauber. Last updated 4 months ago.

cpp

8.76 score 690 scripts 3 dependents

clementcalenge

adehabitatHR:Home Range Estimation

A collection of tools for the estimation of animal home ranges.

Maintained by Clement Calenge. Last updated 7 months ago.

6 stars 8.73 score 752 scripts 9 dependents

bioc

GenomicScores:Infrastructure to work with genomewide position-specific scores

Provide infrastructure to store and access genomewide position-specific scores within R and Bioconductor.

Maintained by Robert Castelo. Last updated 2 months ago.

infrastructure genetics annotation sequencing coverage annotationhubsoftware

8 stars 8.71 score 83 scripts 6 dependents

bioc

trackViewer:A R/Bioconductor package with web interface for drawing elegant interactive tracks or lollipop plot to facilitate integrated analysis of multi-omics data

Visualize mapped reads along with annotation as track layers for NGS dataset such as ChIP-seq, RNA-seq, miRNA-seq, DNA-seq, SNPs and methylation data.

Maintained by Jianhong Ou. Last updated 3 months ago.

visualization

8.71 score 145 scripts 2 dependents

pilaboratory

sads:Maximum Likelihood Models for Species Abundance Distributions

Maximum likelihood tools to fit and compare models of species abundance distributions and of species rank-abundance distributions.

Maintained by Paulo I. Prado. Last updated 1 year ago.

23 stars 8.66 score 244 scripts 3 dependents

bioc

pRoloc:A unifying bioinformatics framework for spatial proteomics

The pRoloc package implements machine learning and visualisation methods for the analysis and interrogation of quantitative mass spectrometry data to reliably infer protein sub-cellular localisation.

Maintained by Lisa Breckels. Last updated 1 month ago.

immunooncology proteomics massspectrometry classification clustering qualitycontrol bioconductor proteomics-data spatial-proteomics visualisation openblas cpp

15 stars 8.64 score 101 scripts 2 dependents

bioc

QuasR:Quantify and Annotate Short Reads in R

This package provides a framework for the quantification and analysis of Short Reads. It covers a complete workflow starting from raw sequence reads, over creation of alignments and quality control plots, to the quantification of genomic regions of interest. Read alignments are either generated through Rbowtie (data from DNA/ChIP/ATAC/Bis-seq experiments) or Rhisat2 (data from RNA-seq experiments that require spliced alignments), or can be provided in the form of bam files.

Maintained by Michael Stadler. Last updated 1 month ago.

genetics preprocessing sequencing chipseq rnaseq methylseq coverage alignment qualitycontrol immunooncology curl bzip2 xz-utils zlib cpp

6 stars 8.63 score 79 scripts 1 dependents

djvanderlaan

LaF:Fast Access to Large ASCII Files

Methods for fast access to large ASCII files. Currently the following file formats are supported: comma separated format (CSV) and fixed width format. It is assumed that the files are too large to fit into memory, although the package can also be used to efficiently access files that do fit into memory. Methods are provided to access and process files blockwise. Furthermore, an opened file can be accessed as one would an ordinary data.frame. The LaF vignette gives an overview of the functionality provided.

Maintained by Jan van der Laan. Last updated 4 months ago.

cpp

54 stars 8.62 score 61 scripts 5 dependents

actuaryzhang

cplm:Compound Poisson Linear Models

Likelihood-based and Bayesian methods for various compound Poisson linear models based on Zhang, Yanwei (2013) <doi:10.1007/s11222-012-9343-7>.

Maintained by Yanwei (Wayne) Zhang. Last updated 1 year ago.

openblas

16 stars 8.55 score 75 scripts 10 dependents

bioc

MsExperiment:Infrastructure for Mass Spectrometry Experiments

Infrastructure to store and manage all aspects related to a complete proteomics or metabolomics mass spectrometry (MS) experiment. The MsExperiment package provides light-weight and flexible containers for MS experiments building on the new MS infrastructure provided by the Spectra, QFeatures and related packages. Along with raw data representations, links to original data files and sample annotations, additional metadata or annotations can also be stored within the MsExperiment container. To guarantee maximum flexibility only minimal constraints are put on the type and content of the data within the containers.

Maintained by Laurent Gatto. Last updated 2 months ago.

infrastructure proteomics massspectrometry metabolomics experimentaldesign dataimport

5 stars 8.51 score 126 scripts 14 dependents

bioc

ReactomeGSA:Client for the Reactome Analysis Service for comparative multi-omics gene set analysis

The ReactomeGSA package uses Reactome's online analysis service to perform a multi-omics gene set analysis. The main advantage of this package is that the retrieved results can be visualized using Reactome's powerful web application. Since Reactome's analysis service also uses R to perform the actual gene set analysis, you will get similar results when using the same packages (such as limma and edgeR) locally. Therefore, if you only require a gene set analysis, different packages are more suited.

Maintained by Johannes Griss. Last updated 4 months ago.

genesetenrichment proteomics transcriptomics systemsbiology geneexpression reactome

22 stars 8.50 score 67 scripts 1 dependents

hmorlon

RPANDA:Phylogenetic ANalyses of DiversificAtion

Implements macroevolutionary analyses on phylogenetic trees. See Morlon et al. (2010) <DOI:10.1371/journal.pbio.1000493>, Morlon et al. (2011) <DOI:10.1073/pnas.1102543108>, Condamine et al. (2013) <DOI:10.1111/ele.12062>, Morlon et al. (2014) <DOI:10.1111/ele.12251>, Manceau et al. (2015) <DOI:10.1111/ele.12415>, Lewitus & Morlon (2016) <DOI:10.1093/sysbio/syv116>, Drury et al. (2016) <DOI:10.1093/sysbio/syw020>, Manceau et al. (2016) <DOI:10.1093/sysbio/syw115>, Morlon et al. (2016) <DOI:10.1111/2041-210X.12526>, Clavel & Morlon (2017) <DOI:10.1073/pnas.1606868114>, Drury et al. (2017) <DOI:10.1093/sysbio/syx079>, Lewitus & Morlon (2017) <DOI:10.1093/sysbio/syx095>, Drury et al. (2018) <DOI:10.1371/journal.pbio.2003563>, Clavel et al. (2019) <DOI:10.1093/sysbio/syy045>, Maliet et al. (2019) <DOI:10.1038/s41559-019-0908-0>, Billaud et al. (2019) <DOI:10.1093/sysbio/syz057>, Lewitus et al. (2019) <DOI:10.1093/sysbio/syz061>, Aristide & Morlon (2019) <DOI:10.1111/ele.13385>, Maliet et al. (2020) <DOI:10.1111/ele.13592>, Drury et al. (2021) <DOI:10.1371/journal.pbio.3001270>, Perez-Lamarque & Morlon (2022) <DOI:10.1111/mec.16478>, Perez-Lamarque et al. (2022) <DOI:10.1101/2021.08.30.458192>, Mazet et al. (2023) <DOI:10.1111/2041-210X.14195>, Drury et al. (2024) <DOI:10.1016/j.cub.2023.12.055>.

Maintained by Hélène Morlon. Last updated 3 months ago.

24 stars 8.50 score 255 scripts

bioc

vsn:Variance stabilization and calibration for microarray data

The package implements a method for normalising microarray intensities from single- and multiple-color arrays. It can also be used for data from other technologies, as long as they have a similar format. The method uses a robust variant of the maximum-likelihood estimator for an additive-multiplicative error model and affine calibration. The model incorporates a data calibration step (a.k.a. normalization), a model for the dependence of the variance on the mean intensity and a variance stabilizing data transformation. Differences between transformed intensities are analogous to "normalized log-ratios". However, in contrast to the latter, their variance is independent of the mean, and they are usually more sensitive and specific in detecting differential transcription.

Maintained by Wolfgang Huber. Last updated 5 months ago.

microarray onechannel twochannel preprocessing

8.49 score 924 scripts 51 dependents

bioc

pwalign:Perform pairwise sequence alignments

The two main functions in the package are pairwiseAlignment() and stringDist(). The former solves (Needleman-Wunsch) global alignment, (Smith-Waterman) local alignment, and (ends-free) overlap alignment problems. The latter computes the Levenshtein edit distance or pairwise alignment score matrix for a set of strings.

Maintained by Hervé Pagès. Last updated 10 days ago.

alignment sequencematching sequencing genetics bioconductor-package

1 star 8.48 score 27 scripts 104 dependents
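
The pwalign description above mentions computing the Levenshtein edit distance for a set of strings. As a rough illustration of that idea only (not pwalign's implementation or API), the following Python sketch builds a pairwise distance matrix with the standard dynamic-programming recurrence. The function names `levenshtein` and `string_dist` are hypothetical and chosen for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions all cost 1)."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def string_dist(strings):
    """Symmetric matrix of pairwise edit distances (hypothetical helper)."""
    n = len(strings)
    dist = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = levenshtein(strings[i], strings[j])
    return dist


if __name__ == "__main__":
    for row in string_dist(["ACGT", "ACGA", "AGGT"]):
        print(row)
```

Only two rows of the dynamic-programming table are kept at a time, so memory grows with the length of the shorter string rather than with the product of both lengths.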

bioc

ClassifyR:A framework for cross-validated classification problems, with applications to differential variability and differential distribution testing

The software formalises a framework for classification and survival model evaluation in R. There are four stages: data transformation, feature selection, model training, and prediction. The requirements of variable types and variable order are fixed, but specialised variables for functions can also be provided. The framework is wrapped in a driver loop that reproducibly carries out a number of cross-validation schemes. Functions for differential mean, differential variability, and differential distribution are included. Additional functions may be developed by the user, by creating an interface to the framework.

Maintained by Dario Strbenac. Last updated 4 days ago.

classification survival cpp

6 stars 8.46 score 45 scripts 3 dependents

bioc

multiMiR:Integration of multiple microRNA-target databases with their disease and drug associations

A collection of microRNAs/targets from external resources, including validated microRNA-target databases (miRecords, miRTarBase and TarBase), predicted microRNA-target databases (DIANA-microT, ElMMo, MicroCosm, miRanda, miRDB, PicTar, PITA and TargetScan) and microRNA-disease/drug databases (miR2Disease, Pharmaco-miR VerSe and PhenomiR).

Maintained by Spencer Mahaffey. Last updated 5 months ago.

mirnadata homo_sapiens_data mus_musculus_data rattus_norvegicus_data organismdata microrna-sequence sql

20 stars 8.45 score 141 scripts

bioc

ScaledMatrix:Creating a DelayedMatrix of Scaled and Centered Values

Provides delayed computation of a matrix of scaled and centered values. The result is equivalent to using the scale() function but avoids explicit realization of a dense matrix during block processing. This permits greater efficiency in common operations, most notably matrix multiplication.

Maintained by Aaron Lun. Last updated 2 months ago.

software datarepresentation

8.44 score 10 scripts 105 dependents
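
The ScaledMatrix description above refers to multiplying by a centered and scaled matrix without realizing it densely. The Python sketch below illustrates one way such a delayed product can work, assuming column-wise centering and scaling by the sample standard deviation (as R's scale() does by default). It is not the package's code, and the helper names `column_stats` and `scaled_matvec` are hypothetical.

```python
from math import sqrt


def column_stats(rows):
    """Column means and sample standard deviations (n - 1 denominator)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [
        sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(p)
    ]
    return means, sds


def scaled_matvec(rows, means, sds, v):
    """Compute S v, where S[i][j] = (X[i][j] - means[j]) / sds[j],
    without ever building S.

    Uses S v = X w - (means . w) with w[j] = v[j] / sds[j], so the
    original matrix X is only read, never copied or modified.
    """
    p = len(v)
    w = [v[j] / sds[j] for j in range(p)]
    offset = sum(means[j] * w[j] for j in range(p))
    return [sum(r[j] * w[j] for j in range(p)) - offset for r in rows]


if __name__ == "__main__":
    X = [[1.0, 10.0], [2.0, 14.0], [4.0, 12.0]]
    means, sds = column_stats(X)
    print(scaled_matvec(X, means, sds, [1.0, -1.0]))
```

Because X itself is left untouched, a sparse or file-backed X could keep its original representation. That is the kind of efficiency gain the description alludes to.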

bioc

UniProt.ws:R Interface to UniProt Web Services

The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. This package provides a collection of functions for retrieving, processing, and re-packaging UniProt web services. The package makes use of UniProt's modernized REST API and allows mapping of identifiers across different databases.

Maintained by Marcel Ramos. Last updated 3 months ago.

annotation infrastructure go kegg biocarta bioconductor-package core-package

4 stars 8.38 score 167 scripts 4 dependents

bioc

CompoundDb:Creating and Using (Chemical) Compound Annotation Databases

CompoundDb provides functionality to create and use (chemical) compound annotation databases from a variety of different sources such as LipidMaps, HMDB, ChEBI or MassBank. The database format also allows MS/MS spectra to be stored along with compound information. The package also provides a backend for Bioconductor's Spectra package and thus allows experimental MS/MS spectra to be matched against MS/MS spectra in the database. Databases can be stored in SQLite format and are thus portable.

Maintained by Johannes Rainer. Last updated 2 months ago.

massspectrometry metabolomics annotation databases mass-spectrometry

17 stars 8.35 score 69 scripts 1 dependents

bioc

csaw:ChIP-Seq Analysis with Windows

Detection of differentially bound regions in ChIP-seq data with sliding windows, with methods for normalization and proper FDR control.

Maintained by Aaron Lun. Last updated 2 months ago.

multiplecomparison chipseq normalization sequencing coverage genetics annotation differentialpeakcalling curl bzip2 xz-utils zlib cpp

8.32 score 498 scripts 7 dependents

bioc

rols:An R interface to the Ontology Lookup Service

The rols package is an interface to the Ontology Lookup Service (OLS) to access and query hundreds of ontologies directly from R.

Maintained by Laurent Gatto. Last updated 5 months ago.

immunooncology software annotation massspectrometry go

11 stars 8.30 score 89 scripts 5 dependents

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 year ago.

3 stars 8.20 score 7.8k scripts 11 dependents

cran

flexmix:Flexible Mixture Modeling

A general framework for finite mixtures of regression models using the EM algorithm is implemented. The E-step and all data handling are provided, while the M-step can be supplied by the user to easily define new models. Existing drivers implement mixtures of standard linear models, generalized linear models and model-based clustering.

Maintained by Bettina Gruen. Last updated 29 days ago.

5 stars 8.19 score 113 dependents

rdpeng

filehash:Simple Key-Value Database

Implements a simple key-value style database where character string keys are associated with data values that are stored on the disk. A simple interface is provided for inserting, retrieving, and deleting data from the database. Utilities are provided that allow 'filehash' databases to be treated much like environments and lists are already used in R. These utilities are provided to encourage interactive and exploratory analysis on large datasets. Three different file formats for representing the database are currently available and new formats can easily be incorporated by third parties for use in the 'filehash' framework.

Maintained by Roger D. Peng. Last updated 2 years ago.

24 stars 8.16 score 78 scripts 11 dependents

bioc

nullranges:Generation of null ranges via bootstrapping or covariate matching

Modular package for generation of sets of ranges representing the null hypothesis. These can take the form of bootstrap samples of ranges (using the block bootstrap framework of Bickel et al. 2010), or sets of control ranges that are matched across one or more covariates. nullranges is designed to be inter-operable with other packages for analysis of genomic overlap enrichment, including the plyranges Bioconductor package.

Maintained by Michael Love. Last updated 5 months ago.

visualization genesetenrichment functionalgenomics epigenetics generegulation genetarget genomeannotation annotation genomewideassociation histonemodification chipseq atacseq dnaseseq rnaseq hiddenmarkovmodel bioconductor bootstrap genomics matching statistics

27 stars 8.16 score 50 scripts 1 dependents

bioc

BioQC:Detect tissue heterogeneity in expression profiles with gene sets

BioQC performs quality control of high-throughput expression data based on tissue gene signatures. It can detect tissue heterogeneity in gene expression data. The core algorithm is a Wilcoxon-Mann-Whitney test that is optimised for high performance.

Maintained by Jitao David Zhang. Last updated 5 months ago.

geneexpression qualitycontrol statisticalmethod genesetenrichment cpp

5 stars 8.16 score 86 scripts

bioc

dreamlet:Scalable differential expression analysis of single cell transcriptomics datasets with complex study designs

Recent advances in single cell/nucleus transcriptomic technology have enabled collection of cohort-scale datasets to study cell type specific gene expression differences associated with disease state, stimulus, and genetic regulation. The scale of these data, complex study designs, and low read count per cell mean that characterizing cell type specific molecular mechanisms requires a user-friendly, purpose-built analytical framework. We have developed the dreamlet package that applies a pseudobulk approach and fits a regression model for each gene and cell cluster to test differential expression across individuals associated with a trait of interest. Use of precision-weighted linear mixed models enables accounting for repeated measures study designs, high dimensional batch effects, and varying sequencing depth or observed cells per biosample.

Maintained by Gabriel Hoffman. Last updated 3 days ago.

rnaseq geneexpression differentialexpression batcheffect qualitycontrol regression genesetenrichment generegulation epigenetics functionalgenomics transcriptomics normalization singlecell preprocessing sequencing immunooncology software cpp

12 stars 8.14 score 128 scripts