R-universe search: topic:infrastructure

bioc

Biostrings:Efficient manipulation of biological strings

Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences.

Maintained by Hervé Pagès. Last updated 1 month ago.

sequencematching alignment sequencing genetics dataimport datarepresentation infrastructure bioconductor-package core-package

62 stars 17.77 score 8.6k scripts 1.2k dependents

bioc

GenomicRanges:Representation and manipulation of genomic intervals

The ability to efficiently represent and manipulate genomic annotations and alignments is playing a central role when it comes to analyzing high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.

Maintained by Hervé Pagès. Last updated 4 months ago.

genetics infrastructure datarepresentation sequencing annotation genomeannotation coverage bioconductor-package core-package

44 stars 17.68 score 13k scripts 1.3k dependents

bioc

BiocParallel:Bioconductor facilities for parallel evaluation

This package provides modified versions and novel implementations of functions for parallel evaluation, tailored for use with Bioconductor objects.

Maintained by Martin Morgan. Last updated 1 month ago.

infrastructure bioconductor-package core-package u24ca289073 cpp

67 stars 17.31 score 7.3k scripts 1.1k dependents

bioc

SummarizedExperiment:A container (S4 class) for matrix-like assays

The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.

Maintained by Hervé Pagès. Last updated 5 months ago.

genetics infrastructure sequencing annotation coverage genomeannotation bioconductor-package core-package

34 stars 16.84 score 8.6k scripts 1.2k dependents

bioc

Biobase:Biobase: Base functions for Bioconductor

Functions that are needed by many other packages or which replace R functions.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

infrastructure bioconductor-package core-package

9 stars 16.45 score 6.6k scripts 1.8k dependents

bioc

IRanges:Foundation of integer range manipulation in Bioconductor

Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure datarepresentation bioconductor-package core-package

22 stars 16.09 score 2.1k scripts 1.8k dependents
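
The overlap queries mentioned in the IRanges description above can be pictured with a small sketch. This is not the IRanges API or its algorithm; it is a minimal Python illustration, assuming closed integer ranges given as `(start, end)` tuples, and the name `find_overlaps` is hypothetical.

```python
from bisect import bisect_right

def find_overlaps(query, subject):
    """Return sorted (query_index, subject_index) pairs of overlapping closed ranges."""
    # Sort subject indices by start so candidates can be cut off with a binary search.
    order = sorted(range(len(subject)), key=lambda i: subject[i][0])
    starts = [subject[i][0] for i in order]
    hits = []
    for qi, (q_start, q_end) in enumerate(query):
        # Only subject ranges that start at or before q_end can overlap the query.
        limit = bisect_right(starts, q_end)
        for si in order[:limit]:
            # Among those, keep the ones that end at or after q_start.
            if subject[si][1] >= q_start:
                hits.append((qi, si))
    return sorted(hits)

query = [(1, 5), (10, 12)]
subject = [(4, 8), (6, 9), (12, 20)]
hits = find_overlaps(query, subject)
```

Here `hits` pairs each query index with the subject indices it overlaps. A production implementation would typically use an interval tree or a similar index rather than filtering candidates linearly.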

bioc

S4Vectors:Foundation of vector-like and list-like containers in Bioconductor

The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantics of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure datarepresentation bioconductor-package core-package

18 stars 16.05 score 1.0k scripts 1.9k dependents

bioc

rhdf5:R Interface to HDF5

This package provides an interface between HDF5 and R. HDF5's main features are the ability to store and access very large and/or complex datasets and a wide variety of metadata on mass storage (disk) through a completely portable file format. The rhdf5 package is thus suited for the exchange of large and/or complex datasets between R and other software packages, and for letting R applications work on datasets that are larger than the available RAM.

Maintained by Mike Smith. Last updated 5 days ago.

infrastructure dataimport hdf5 rhdf5 openssl curl zlib cpp

62 stars 15.87 score 4.2k scripts 232 dependents

bioc

DelayedArray:A unified framework for working transparently with on-disk and in-memory array-like datasets

Wrapping an array-like object (typically an on-disk object) in a DelayedArray object allows one to perform common array operations on it without loading the object in memory. In order to reduce memory usage and optimize performance, operations on the object are either delayed or executed using a block processing mechanism. Note that this also works on in-memory array-like objects like DataFrame objects (typically with Rle columns), Matrix objects, ordinary arrays, and data frames.

Maintained by Hervé Pagès. Last updated 1 month ago.

infrastructure datarepresentation annotation genomeannotation bioconductor-package core-package u24ca289073

27 stars 15.59 score 538 scripts 1.2k dependents
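
The two ideas in the DelayedArray description above, delayed operations and block processing, can be sketched in a few lines. This is a conceptual Python illustration only, not the DelayedArray API: the class `DelayedRows` and its methods are hypothetical, and an in-memory list of rows stands in for an on-disk dataset.

```python
class DelayedRows:
    """Toy row-oriented container that records element-wise operations lazily."""

    def __init__(self, seed, ops=()):
        self.seed = seed          # any sequence of rows (stands in for on-disk data)
        self.ops = tuple(ops)     # pending element-wise functions, applied on read

    def apply(self, fn):
        # Delayed: nothing is computed here, the function is only recorded.
        return DelayedRows(self.seed, self.ops + (fn,))

    def read_block(self, start, stop):
        # Realize only rows [start, stop), applying the pending operations in order.
        block = []
        for row in self.seed[start:stop]:
            for fn in self.ops:
                row = [fn(x) for x in row]
            block.append(list(row))
        return block

    def col_sums(self, block_rows=2):
        # Block processing: walk the rows in chunks, holding one chunk at a time.
        # Assumes a non-empty seed with rows of equal length.
        totals = [0] * len(self.seed[0])
        for start in range(0, len(self.seed), block_rows):
            for row in self.read_block(start, start + block_rows):
                for j, value in enumerate(row):
                    totals[j] += value
        return totals

seed = [[1, 2], [3, 4], [5, 6]]
delayed = DelayedRows(seed).apply(lambda x: x * 10).apply(lambda x: x + 1)
sums = delayed.col_sums(block_rows=2)
```

The two `apply` calls leave `seed` untouched; the arithmetic only runs inside `read_block`, one block of rows at a time, when `col_sums` asks for data.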

bioc

GenomicFeatures:Query the gene models of a given organism/assembly

Extract the genomic locations of genes, transcripts, exons, introns, and CDS, for the gene models stored in a TxDb object. A TxDb object is a small database that contains the gene models of a given organism/assembly. Bioconductor provides a small collection of TxDb objects in the form of ready-to-install TxDb packages for the most commonly studied organisms. Additionally, the user can easily make a TxDb object (or package) for the organism/assembly of their choice by using the tools from the txdbmaker package.

Maintained by H. Pagès. Last updated 5 months ago.

genetics infrastructure annotation sequencing genomeannotation bioconductor-package core-package

26 stars 15.34 score 5.3k scripts 339 dependents

bioc

GenomicAlignments:Representation and manipulation of short genomic alignments

Provides efficient containers for storing and manipulating short genomic alignments (typically obtained by aligning short reads to a reference genome). This includes read counting, computing the coverage, junction detection, and working with the nucleotide content of the alignments.

Maintained by Hervé Pagès. Last updated 5 months ago.

infrastructure dataimport genetics sequencing rnaseq snp coverage alignment immunooncology bioconductor-package core-package

10 stars 15.21 score 3.1k scripts 528 dependents

bioc

MultiAssayExperiment:Software for the integration of multi-omics experiments in Bioconductor

Harmonize data management of multiple experimental assays performed on an overlapping set of specimens. It provides a familiar Bioconductor user experience by extending concepts from SummarizedExperiment, supporting an open-ended mix of standard data classes for individual assays, and allowing subsetting by genomic ranges or rownames. Facilities are provided for reshaping data into wide and long formats for adaptability to graphing and downstream analysis.

Maintained by Marcel Ramos. Last updated 2 months ago.

infrastructure datarepresentation bioconductor bioconductor-package genomics nci-itcr tcga u24ca289073

71 stars 14.95 score 670 scripts 127 dependents

bioc

BiocGenerics:S4 generic functions used in Bioconductor

The package defines many S4 generic functions used in Bioconductor.

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure bioconductor-package core-package

12 stars 14.22 score 612 scripts 2.2k dependents

bioc

BSgenome:Software infrastructure for efficient representation of full genomes and their SNPs

Infrastructure shared by all the Biostrings-based genome data packages.

Maintained by Hervé Pagès. Last updated 2 months ago.

genetics infrastructure datarepresentation sequencematching annotation snp bioconductor-package core-package

9 stars 14.12 score 1.2k scripts 267 dependents

bioc

AnnotationHub:Client to access AnnotationHub resources

This package provides a client for the Bioconductor AnnotationHub web resource. The AnnotationHub web resource provides a central location where genomic files (e.g., VCF, bed, wig) and other resources from standard locations (e.g., UCSC, Ensembl) can be discovered. The resource includes metadata about each resource, e.g., a textual description, tags, and date of modification. The client creates and manages a local cache of files retrieved by the user, helping with quick and reproducible access.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

infrastructure dataimport gui thirdpartyclient core-package u24ca289073

17 stars 13.88 score 2.7k scripts 104 dependents

bioc

SingleCellExperiment:S4 Classes for Single Cell Data

Defines a S4 class for storing data from single-cell experiments. This includes specialized methods to store and retrieve spike-in information, dimensionality reduction coordinates and size factors for each cell, along with the usual metadata for genes and libraries.

Maintained by Davide Risso. Last updated 22 days ago.

immunooncology datarepresentation dataimport infrastructure singlecell

13.53 score 15k scripts 285 dependents

bioc

HDF5Array:HDF5 datasets as array-like objects in R

The HDF5Array package is an HDF5 backend for DelayedArray objects. It implements the HDF5Array, H5SparseMatrix, H5ADMatrix, and TENxMatrix classes, 4 convenient and memory-efficient array-like containers for representing and manipulating either: (1) a conventional (a.k.a. dense) HDF5 dataset, (2) an HDF5 sparse matrix (stored in CSR/CSC/Yale format), (3) the central matrix of an h5ad file (or any matrix in the /layers group), or (4) a 10x Genomics sparse matrix. All these containers are DelayedArray extensions and thus support all operations (delayed or block-processed) supported by DelayedArray objects.

Maintained by Hervé Pagès. Last updated 10 days ago.

infrastructure datarepresentation dataimport sequencing rnaseq coverage annotation genomeannotation singlecell immunooncology bioconductor-package core-package u24ca289073

12 stars 13.20 score 844 scripts 126 dependents

bioc

Spectra:Spectra Infrastructure for Mass Spectrometry Data

The Spectra package defines an efficient infrastructure for storing and handling mass spectrometry spectra and functionality to subset, process, visualize and compare spectra data. It provides different implementations (backends) to store mass spectrometry data. These comprise backends tuned for fast data access and processing and backends for very large data sets ensuring a small memory footprint.

Maintained by RforMassSpectrometry Package Maintainer. Last updated 22 days ago.

infrastructure proteomics massspectrometry metabolomics bioconductor hacktoberfest mass-spectrometry

41 stars 13.01 score 254 scripts 35 dependents

bioc

mzR:parser for netCDF, mzXML and mzML and mzIdentML files (mass spectrometry data)

mzR provides a unified API to the common file formats and parsers available for mass spectrometry data. It comes with a subset of the proteowizard library for mzXML, mzML and mzIdentML. The netCDF reading code has previously been used in XCMS.

Maintained by Steffen Neumann. Last updated 2 months ago.

immunooncology infrastructure dataimport proteomics metabolomics massspectrometry zlib cpp

45 stars 12.77 score 204 scripts 44 dependents

bioc

MSnbase:Base Functions and Classes for Mass Spectrometry and Proteomics

MSnbase provides infrastructure for manipulation, processing and visualisation of mass spectrometry and proteomics data, ranging from raw to quantitative and annotated data.

Maintained by Laurent Gatto. Last updated 15 days ago.

immunooncology infrastructure proteomics massspectrometry qualitycontrol dataimport bioconductor bioinformatics mass-spectrometry proteomics-data visualisation cpp

131 stars 12.76 score 772 scripts 36 dependents

bioc

plyranges:A fluent interface for manipulating GenomicRanges

A dplyr-like interface for interacting with the common Bioconductor classes Ranges and GenomicRanges. By providing a grammatical and consistent way of manipulating these classes, their accessibility for new Bioconductor users is hopefully increased.

Maintained by Michael Love. Last updated 10 days ago.

infrastructure datarepresentation workflowstep coverage bioconductor data-analysis dplyr genomic-ranges genomics tidy-data

144 stars 12.66 score 1.9k scripts 20 dependents

bioc

SpatialExperiment:S4 Class for Spatially Resolved -omics Data

Defines an S4 class for storing data from spatial -omics experiments. The class extends SingleCellExperiment to support storage and retrieval of additional information from spot-based and molecule-based platforms, including spatial coordinates, images, and image metadata. A specialized constructor function is included for data from the 10x Genomics Visium platform.

Maintained by Dario Righelli. Last updated 5 months ago.

datarepresentation dataimport infrastructure immunooncology geneexpression transcriptomics singlecell spatial

59 stars 12.63 score 1.8k scripts 71 dependents

bioc

SNPRelate:Parallel Computing Toolset for Relatedness and Principal Component Analysis of SNP Data

Genome-wide association studies (GWAS) are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed an R package SNPRelate to provide a binary format for single-nucleotide polymorphism (SNP) data in GWAS utilizing CoreArray Genomic Data Structure (GDS) data files. The GDS format offers efficient operations specifically designed for integers with two bits, since a SNP could occupy only two bits. SNPRelate is also designed to accelerate two key computations on SNP data using parallel computing for multi-core symmetric multiprocessing computer architectures: Principal Component Analysis (PCA) and relatedness analysis using Identity-By-Descent measures. The SNP GDS format is also used by the GWASTools package with the support of S4 classes and generic functions. The extended GDS format is implemented in the SeqArray package to support the storage of single nucleotide variations (SNVs), insertion/deletion polymorphism (indel) and structural variation calls in whole-genome and whole-exome variant data.

Maintained by Xiuwen Zheng. Last updated 5 months ago.

infrastructure genetics statisticalmethod principalcomponent bioinformatics gds-format pca simd snp openblas cpp

105 stars 12.57 score 1.6k scripts 19 dependents

bioc

SparseArray:High-performance sparse data representation and manipulation in R

The SparseArray package provides array-like containers for efficient in-memory representation of multidimensional sparse data in R (arrays and matrices). The package defines the SparseArray virtual class and two concrete subclasses: COO_SparseArray and SVT_SparseArray. Each subclass uses its own internal representation of the nonzero multidimensional data: the "COO layout" and the "SVT layout", respectively. SVT_SparseArray objects mimic as much as possible the behavior of ordinary matrix and array objects in base R. In particular, they support most of the "standard matrix and array API" defined in base R and in the matrixStats package from CRAN.

Maintained by Hervé Pagès. Last updated 11 days ago.

infrastructure datarepresentation bioconductor-package core-package openmp

9 stars 12.47 score 79 scripts 1.2k dependents
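
As a rough picture of the "COO layout" named in the SparseArray description above, the sketch below stores only the nonzero entries of a matrix as coordinates plus values. It is a generic Python illustration of coordinate-list storage, not the internal representation of COO_SparseArray; the function names `to_coo` and `from_coo` are hypothetical, and the SVT layout is not shown.

```python
def to_coo(dense):
    """Convert a dense 2-D list into (shape, coords, values), keeping nonzeros only."""
    shape = (len(dense), len(dense[0]) if dense else 0)
    coords, values = [], []
    for i, row in enumerate(dense):
        for j, value in enumerate(row):
            if value != 0:
                coords.append((i, j))   # position of the nonzero entry
                values.append(value)    # its value, at the same list index
    return shape, coords, values

def from_coo(shape, coords, values):
    """Rebuild the dense 2-D list from the coordinate representation."""
    dense = [[0] * shape[1] for _ in range(shape[0])]
    for (i, j), value in zip(coords, values):
        dense[i][j] = value
    return dense

dense = [[0, 0, 7], [0, 0, 0], [5, 0, 0]]
shape, coords, values = to_coo(dense)
```

Only two of the nine cells are stored here, which is where the memory saving comes from when most entries are zero.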

bioc

ggbio:Visualization tools for genomic data

The ggbio package extends and specializes the grammar of graphics for biological data. The graphics are designed to answer common scientific questions, in particular those often asked of high throughput genomics data. All core Bioconductor data structures are supported, where appropriate. The package supports detailed views of particular genomic regions, as well as genome-wide overviews. Supported overviews include ideograms and grand linear views. High-level plots include sequence fragment length, edge-linked interval to data view, mismatch pileup, and several splicing summaries.

Maintained by Michael Lawrence. Last updated 5 months ago.

infrastructure visualization

111 stars 12.23 score 734 scripts 16 dependents

bioc

SeqArray:Data management of large-scale whole-genome sequence variant calls using GDS files

Data management of large-scale whole-genome sequencing variant calls with thousands of individuals: genotypic data (e.g., SNVs, indels and structural variation calls) and annotations in SeqArray GDS files are stored in an array-oriented and compressed manner, with efficient data access using the R programming language.

Maintained by Xiuwen Zheng. Last updated 6 days ago.

infrastructure datarepresentation sequencing genetics bioinformatics gds-format snp snv wes wgs cpp

45 stars 12.11 score 1.1k scripts 9 dependents

bioc

preprocessCore:A collection of pre-processing functions

A library of core preprocessing routines.

Maintained by Ben Bolstad. Last updated 5 months ago.

infrastructure openblas

19 stars 12.03 score 1.8k scripts 204 dependents

bioc

sparseMatrixStats:Summary Statistics for Rows and Columns of Sparse Matrices

High performance functions for row and column operations on sparse matrices. For example: col / rowMeans2, col / rowMedians, col / rowVars etc. Currently, the optimizations are limited to data in the column sparse format. This package is inspired by the matrixStats package by Henrik Bengtsson.

Maintained by Constantin Ahlmann-Eltze. Last updated 5 months ago.

infrastructure software datarepresentation cpp

54 stars 11.98 score 174 scripts 130 dependents

bioc

ExperimentHub:Client to access ExperimentHub resources

This package provides a client for the Bioconductor ExperimentHub web resource. ExperimentHub provides a central location where curated data from experiments, publications or training courses can be accessed. Each resource has associated metadata, tags and date of modification. The client creates and manages a local cache of files retrieved, enabling quick and reproducible access.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

infrastructure dataimport gui thirdpartyclient core-package u24ca289073

10 stars 11.94 score 764 scripts 57 dependents

bioc

QFeatures:Quantitative features for mass spectrometry data

The QFeatures infrastructure enables the management and processing of quantitative features for high-throughput mass spectrometry assays. It provides a familiar Bioconductor user experience to manage quantitative data across different assay levels (such as peptide spectrum matches, peptides and proteins) in a coherent and tractable format.

Maintained by Laurent Gatto. Last updated 25 days ago.

infrastructure massspectrometry proteomics metabolomics bioconductor mass-spectrometry

27 stars 11.87 score 278 scripts 49 dependents

bioc

DelayedMatrixStats:Functions that Apply to Rows and Columns of 'DelayedMatrix' Objects

A port of the 'matrixStats' API for use with DelayedMatrix objects from the 'DelayedArray' package. High-performing functions operating on rows and columns of DelayedMatrix objects, e.g. col / rowMedians(), col / rowRanks(), and col / rowSds(). Functions are optimized per data type and for subsetted calculations such that both memory usage and processing time are minimized.

Maintained by Peter Hickey. Last updated 3 months ago.

infrastructure datarepresentation software

16 stars 11.86 score 211 scripts 112 dependents

bioc

MatrixGenerics:S4 Generic Summary Statistic Functions that Operate on Matrix-Like Objects

S4 generic functions modeled after the 'matrixStats' API for alternative matrix implementations. Packages with alternative matrix implementations can depend on this package and implement the generic functions that are defined here for a useful set of row and column summary statistics. Other package developers can import this package and handle different matrix implementations without worrying about incompatibilities.

Maintained by Peter Hickey. Last updated 3 months ago.

infrastructure software bioconductor-package core-package

12 stars 11.64 score 129 scripts 1.3k dependents

bioc

bumphunter:Bump Hunter

Tools for finding bumps in genomic data

Maintained by Tamilselvi Guharaj. Last updated 5 months ago.

dnamethylation epigenetics infrastructure multiplecomparison immunooncology

16 stars 11.61 score 210 scripts 43 dependents

bioc

systemPipeR:systemPipeR: Workflow Environment for Data Analysis and Report Generation

systemPipeR is a multipurpose data analysis workflow environment that unifies R with command-line tools. It enables scientists to analyze many types of large- or small-scale data on local or distributed computer systems with a high level of reproducibility, scalability and portability. At its core is a command-line interface (CLI) that adopts the Common Workflow Language (CWL). This design allows users to choose for each analysis step the optimal R or command-line software. It supports both end-to-end and partial execution of workflows with built-in restart functionalities. Efficient management of complex analysis tasks is accomplished by a flexible workflow control container class. Handling of large numbers of input samples and experimental designs is facilitated by consistent sample annotation mechanisms. As a multi-purpose workflow toolkit, systemPipeR enables users to run existing workflows, customize them or design entirely new ones while taking advantage of widely adopted data structures within the Bioconductor ecosystem. Another important core functionality is the generation of reproducible scientific analysis and technical reports. For result interpretation, systemPipeR offers a wide range of plotting functionality, while an associated Shiny App offers many useful functionalities for interactive result exploration. The vignettes linked from this page include (1) a general introduction, (2) a description of technical details, and (3) a collection of workflow templates.

Maintained by Thomas Girke. Last updated 5 months ago.

genetics infrastructure dataimport sequencing rnaseq riboseq chipseq methylseq snp geneexpression coverage genesetenrichment alignment qualitycontrol immunooncology reportwriting workflowstep workflowmanagement

53 stars 11.52 score 344 scripts 3 dependents

bioc

XVector:Foundation of external vector representation and manipulation in Bioconductor

Provides memory efficient S4 classes for storing sequences "externally" (e.g. behind an R external pointer, or on disk).

Maintained by Hervé Pagès. Last updated 3 months ago.

infrastructure datarepresentation bioconductor-package core-package zlib

2 stars 11.36 score 67 scripts 1.7k dependents

bioc

gdsfmt:R Interface to CoreArray Genomic Data Structure (GDS) Files

Provides a high-level R interface to CoreArray Genomic Data Structure (GDS) data files. GDS is portable across platforms with hierarchical structure to store multiple scalable array-oriented data sets with metadata information. It is suited for large-scale datasets, especially for data which are much larger than the available random-access memory. The gdsfmt package offers efficient operations specifically designed for integers of less than 8 bits, since a diploid genotype, like single-nucleotide polymorphism (SNP), usually occupies fewer bits than a byte. Data compression and decompression are available with relatively efficient random access. A GDS file can also be read in parallel by multiple R processes, with support from the parallel package.

Maintained by Xiuwen Zheng. Last updated 14 days ago.

infrastructure dataimport bioinformatics gds-format genomics cpp

18 stars 11.34 score 920 scripts 29 dependents
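
The point in the gdsfmt description above, that a genotype needs fewer bits than a byte, can be made concrete with a small bit-packing sketch. This is plain Python and not the GDS file format: the bit order (lowest bits first) and the function names `pack_2bit` and `unpack_2bit` are assumptions made for illustration.

```python
def pack_2bit(values):
    """Pack integers in 0..3 into bytes, four values per byte (lowest bits first)."""
    packed = bytearray((len(values) + 3) // 4)
    for i, value in enumerate(values):
        if not 0 <= value <= 3:
            raise ValueError("each value must fit in two bits (0..3)")
        # Shift the value into its 2-bit slot within the target byte.
        packed[i // 4] |= value << (2 * (i % 4))
    return bytes(packed)

def unpack_2bit(packed, count):
    """Recover the first `count` two-bit values from packed bytes."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(count)]

genotypes = [0, 1, 2, 3, 2, 1]
packed = pack_2bit(genotypes)
restored = unpack_2bit(packed, len(genotypes))
```

Six values fit in two bytes instead of six, and `unpack_2bit` needs the original count because the last byte may be only partly used.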

bioc

ggcyto:Visualize Cytometry data with ggplot

With the dedicated fortify method implemented for flowSet, ncdfFlowSet and GatingSet classes, both raw and gated flow cytometry data can be plotted directly with ggplot. The ggcyto wrapper and some custom layers also make it easy to add gates and population statistics to the plot.

Maintained by Mike Jiang. Last updated 5 months ago.

immunooncology flowcytometry cellbasedassays infrastructure visualization

58 stars 11.25 score 362 scripts 5 dependents

bioc

Rhdf5lib:hdf5 library as an R package

Provides C and C++ hdf5 libraries.

Maintained by Mike Smith. Last updated 5 days ago.

infrastructure bioconductor hdf5 hdf5-library fortran zlib

6 stars 11.22 score 26 scripts 341 dependents

bioc

beachmat:Compiling Bioconductor to Handle Each Matrix Type

Provides a consistent C++ class interface for reading from a variety of commonly used matrix types. Ordinary matrices and several sparse/dense Matrix classes are directly supported, along with a subset of the delayed operations implemented in the DelayedArray package. All other matrix-like objects are supported by calling back into R.

Maintained by Aaron Lun. Last updated 16 days ago.

datarepresentation dataimport infrastructure bioconductor-package human-cell-atlas matrix-library cpp

4 stars 11.09 score 21 scripts 142 dependents

bioc

scater:Single-Cell Analysis Toolkit for Gene Expression Data in R

A collection of tools for doing various analyses of single-cell RNA-seq gene expression data, with a focus on quality control and visualization.

Maintained by Alan OCallaghan. Last updated 22 days ago.

immunooncology singlecell rnaseq qualitycontrol preprocessing normalization visualization dimensionreduction transcriptomics geneexpression sequencing software dataimport datarepresentation infrastructure coverage

11.07 score 12k scripts 43 dependents

bioc

S4Arrays:Foundation of array-like containers in Bioconductor

The S4Arrays package defines the Array virtual class to be extended by other S4 classes that wish to implement a container with array-like semantics. It also provides: (1) low-level functionality meant to help the developers of such containers implement basic operations like display, subsetting, or coercion of their array-like objects to an ordinary matrix or array, and (2) a framework that facilitates block processing of array-like objects (typically on-disk objects).

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure datarepresentation bioconductor-package core-package

5 stars 10.99 score 8 scripts 1.2k dependents

bioc

MsCoreUtils:Core Utils for Mass Spectrometry Data

MsCoreUtils defines low-level functions for mass spectrometry data and is independent of any high-level data structures. These functions include mass spectra processing functions (noise estimation, smoothing, binning, baseline estimation), quantitative aggregation functions (median polish, robust summarisation, ...), missing data imputation, data normalisation (quantiles, vsn, ...), and miscellaneous helper functions that are used across high-level data structures within the R for Mass Spectrometry packages.

Maintained by RforMassSpectrometry Package Maintainer. Last updated 8 days ago.

infrastructure proteomics massspectrometry metabolomics bioconductor mass-spectrometry utils

16 stars 10.57 score 41 scripts 71 dependents

bioc

ChemmineR:Cheminformatics Toolkit for R

ChemmineR is a cheminformatics package for analyzing drug-like small molecule data in R. Its latest version contains functions for efficient processing of large numbers of molecules, physicochemical/structural property predictions, structural similarity searching, classification and clustering of compound libraries with a wide spectrum of algorithms. In addition, it offers visualization functions for compound clustering results and chemical structures.

Maintained by Thomas Girke. Last updated 5 months ago.

cheminformatics biomedicalinformatics pharmacogenetics pharmacogenomics microtitreplateassay cellbasedassays visualization infrastructure dataimport clustering proteomics metabolomics cpp

15 stars 10.45 score 253 scripts 12 dependents

bioc

Cardinal:A mass spectrometry imaging toolbox for statistical analysis

Implements statistical & computational tools for analyzing mass spectrometry imaging datasets, including methods for efficient pre-processing, spatial segmentation, and classification.

Maintained by Kylie Ariel Bemis. Last updated 3 months ago.

software infrastructure proteomics lipidomics massspectrometry imagingmassspectrometry immunooncology normalization clustering classification regression

48 stars 10.32 score 200 scripts

bioc

illuminaio:Parsing Illumina Microarray Output Files

Tools for parsing Illumina's microarray output files, including IDAT.

Maintained by Kasper Daniel Hansen. Last updated 5 months ago.

infrastructure dataimport microarray proprietaryplatforms bioconductor

5 stars 10.28 score 58 scripts 37 dependents

stemangiola

tidyHeatmap:A Tidy Implementation of Heatmap

This is a tidy implementation for heatmap. At the moment it is based on the (great) package 'ComplexHeatmap'. The goal of this package is to interface a tidy data frame with this powerful tool. Some of the advantages are: Row and/or column colour annotations are easy to integrate by specifying just one parameter (column names). Custom grouping of rows is easy to specify by providing a grouped tbl. For example: df %>% group_by(...). Label size is adjusted by the total number of rows and columns. Default use of Brewer and Viridis palettes.

Maintained by Stefano Mangiola. Last updated 2 months ago.

assaydomain infrastructure brewer complexheatmap custom-palette dplyr graphviz heatmap mtcars plotting rstudio scale tibble tidy tidy-data-frame tidybulk tidyverse viridis

335 stars 10.23 score 197 scripts 1 dependent

bioc

AnnotationFilter:Facilities for Filtering Bioconductor Annotation Resources

This package provides class and other infrastructure to implement filters for manipulating Bioconductor annotation resources. The filters will be used by ensembldb, Organism.dplyr, and other packages.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

annotation infrastructure software bioconductor-package core-package

5 stars 10.19 score 45 scripts 160 dependents

bioc

cBioPortalData:Exposes and Makes Available Data from the cBioPortal Web Resources

The cBioPortalData R package accesses study datasets from the cBio Cancer Genomics Portal. It accesses the data either from the pre-packaged zip / tar files or from the API interface that was recently implemented by the cBioPortal Data Team. The package can provide data either in tabular format or as a MultiAssayExperiment object that uses familiar Bioconductor data representations.

Maintained by Marcel Ramos. Last updated 8 days ago.

software infrastructure thirdpartyclient bioconductor-package nci-itcr u24ca289073

33 stars 10.17 score 147 scripts 4 dependents

bioc

flowCore:flowCore: Basic structures for flow cytometry data

Provides S4 data structures and basic functions to deal with flow cytometry data.

Maintained by Mike Jiang. Last updated 5 months ago.

immunooncology infrastructure flowcytometry cellbasedassays cpp

10.17 score 1.7k scripts 59 dependents

bioc

zlibbioc:An R packaged zlib-1.2.5

This package uses the source code of zlib-1.2.5 to create libraries for systems that do not have these available via other means (most Linux and Mac users should have system-level access to zlib, and no direct need for this package). See the vignette for instructions on use.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

infrastructure bioconductor-package core-package

10.12 score 17 scripts 68 dependents

bioc

UCSC.utils:Low-level utilities to retrieve data from the UCSC Genome Browser

A set of low-level utilities to retrieve data from the UCSC Genome Browser. Most functions in the package access the data via the UCSC REST API but some of them query the UCSC MySQL server directly. Note that the primary purpose of the package is to support higher-level functionalities implemented in downstream packages like GenomeInfoDb or txdbmaker.

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure genomeassembly annotation genomeannotation dataimport bioconductor-package core-package

1 stars 10.09 score 4 scripts 1.7k dependents

bioc

BiocCheck:Bioconductor-specific package checks

BiocCheck guides maintainers through Bioconductor best practices. It runs Bioconductor-specific package checks by searching through package code, examples, and vignettes. Maintainers are required to address all errors, warnings, and most notes produced.

Maintained by Marcel Ramos. Last updated 1 months ago.

infrastructure bioconductor-package core-services

8 stars 10.03 score 114 scripts 6 dependents

bioc

rhdf5filters:HDF5 Compression Filters

Provides a collection of additional compression filters for HDF5 datasets. The package is intended to provide seamless integration with rhdf5; however, the compiled filters can also be used with external applications.

Maintained by Mike Smith. Last updated 5 days ago.

infrastructure dataimport compression filter-plugin hdf5

5 stars 9.90 score 4 scripts 233 dependents

bioc

GenVisR:Genomic Visualizations in R

Produce highly customizable publication quality graphics for genomic data primarily at the cohort level.

Maintained by Zachary Skidmore. Last updated 5 months ago.

infrastructure datarepresentation classification dnaseq

217 stars 9.87 score 76 scripts

mrc-ide

odin:ODE Generation and Integration

Generate systems of ordinary differential equations (ODE) and integrate them, using a domain specific language (DSL). The DSL uses R's syntax, but compiles to C in order to efficiently solve the system. A solver is not provided, but instead interfaces to the packages 'deSolve' and 'dde' are generated. With these, while solving the differential equations, no allocations are done and the calculations remain entirely in compiled code. Alternatively, a model can be transpiled to R for use in contexts where a C compiler is not present. After compilation, models can be inspected to return information about parameters and outputs, or intermediate values after calculations. 'odin' is not targeted at any particular domain and is suitable for any system that can be expressed primarily as mathematical expressions. Additional support is provided for working with delays (delay differential equations, DDE), using interpolated functions during integration, and for integrating quantities that represent arrays.

Maintained by Rich FitzJohn. Last updated 9 months ago.

infrastructure

106 stars 9.74 score 290 scripts 3 dependents

bioc

biocViews:Categorized views of R package repositories

Infrastructure to support 'views' used to classify Bioconductor packages. 'biocViews' are directed acyclic graphs of terms from a controlled vocabulary. There are three major classifications, corresponding to 'software', 'annotation', and 'experiment data' packages.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

infrastructure bioconductor-package core-package

4 stars 9.71 score 30 scripts 14 dependents

bioc

txdbmaker:Tools for making TxDb objects from genomic annotations

A set of tools for making TxDb objects from genomic annotations from various sources (e.g. UCSC, Ensembl, and GFF files). These tools allow the user to download the genomic locations of transcripts, exons, and CDS, for a given assembly, and to import them in a TxDb object. TxDb objects are implemented in the GenomicFeatures package, together with flexible methods for extracting the desired features in convenient formats.

Maintained by H. Pagès. Last updated 4 months ago.

infrastructure dataimport annotation genomeannotation genomeassembly genetics sequencing bioconductor-package core-package

3 stars 9.68 score 92 scripts 87 dependents

bioc

AnnotationForge:Tools for building SQLite-based annotation data packages

Provides code for generating Annotation packages and their databases. Packages produced are intended to be used with AnnotationDbi.

Maintained by Bioconductor Package Maintainer. Last updated 16 days ago.

annotation infrastructure bioconductor-package core-package

5 stars 9.62 score 143 scripts 19 dependents

bioc

tidybulk:Brings transcriptomics to the tidyverse

This is a collection of utility functions for exploring RNA sequencing data and performing calculations on it in a modular, pipe-friendly and tidy fashion.

Maintained by Stefano Mangiola. Last updated 11 days ago.

assaydomain infrastructure rnaseq differentialexpression geneexpression normalization clustering qualitycontrol sequencing transcription transcriptomics bioconductor bulk-transcriptional-analyses deseq2 differential-expression edger ensembl-ids entrez gene-symbols gsea mds-dimensions pca pipe redundancy tibble tidy tidy-data tidyverse transcripts tsne

171 stars 9.57 score 172 scripts 1 dependents

bioc

matter:Out-of-core statistical computing and signal processing

Toolbox for larger-than-memory scientific computing and visualization, providing efficient out-of-core data structures using files or shared memory, for dense and sparse vectors, matrices, and arrays, with applications to nonuniformly sampled signals and images.

Maintained by Kylie A. Bemis. Last updated 4 months ago.

infrastructure datarepresentation dataimport dimensionreduction preprocessing cpp

57 stars 9.52 score 64 scripts 2 dependents

stemangiola

tidyseurat:Brings Seurat to the Tidyverse

It creates an invisible layer that allows the 'Seurat' object to be viewed as a tibble and to interact seamlessly with the tidyverse.

Maintained by Stefano Mangiola. Last updated 8 months ago.

assaydomain infrastructure rnaseq differentialexpression geneexpression normalization clustering qualitycontrol sequencing transcription transcriptomics dplyr ggplot2 pca purrr sct seurat single-cell single-cell-rna-seq tibble tidyr tidyverse transcripts tsne umap

159 stars 9.48 score 398 scripts 1 dependents

bioc

MetaboCoreUtils:Core Utils for Metabolomics Data

MetaboCoreUtils defines metabolomics-related core functionality provided as low-level functions to allow a data structure-independent usage across various R packages. This includes functions to convert between ion (adduct) mass-to-charge ratios and compound masses, and functions to work with chemical formulas. The package also provides a set of adduct definitions and information on some commercially available internal standard mixes commonly used in MS experiments.

Maintained by Johannes Rainer. Last updated 5 months ago.

infrastructure metabolomics massspectrometry mass-spectrometry

9 stars 9.40 score 58 scripts 36 dependents

bioc

ProtGenerics:Generic infrastructure for Bioconductor mass spectrometry packages

S4 generic functions and classes needed by Bioconductor proteomics packages.

Maintained by Laurent Gatto. Last updated 2 months ago.

infrastructure proteomics massspectrometry bioconductor mass-spectrometry metabolomics

8 stars 9.36 score 4 scripts 188 dependents

rte-antares-rpackage

antaresRead:Import, Manipulate and Explore the Results of an 'Antares' Simulation

Import, manipulate and explore results generated by 'Antares', a powerful open source software developed by RTE (Réseau de Transport d’Électricité) to simulate and study electric power systems (more information about 'Antares' here: <https://antares-simulator.org/>).

Maintained by Tatiana Vargas. Last updated 3 days ago.

infrastructure dataimport adequacy bilan electricity energy hdf5 linear-algebra monte-carlo-simulation optimisation previsionnel rhdf5 rte simulation tyndp

13 stars 9.32 score 148 scripts 3 dependents

bioc

GenomicInteractions:Utilities for handling genomic interaction data

Utilities for handling genomic interaction data such as ChIA-PET or Hi-C, annotating genomic features with interaction information, and producing plots and summary statistics.

Maintained by Liz Ing-Simmons. Last updated 5 months ago.

software infrastructure dataimport datarepresentation hic

7 stars 9.31 score 162 scripts 5 dependents

bioc

basilisk:Freezing Python Dependencies Inside Bioconductor Packages

Installs a self-contained conda instance that is managed by the R/Bioconductor installation machinery. This aims to provide a consistent Python environment that can be used reliably by Bioconductor packages. Functions are also provided to enable smooth interoperability of multiple Python environments in a single R session.

Maintained by Aaron Lun. Last updated 9 days ago.

infrastructure

9.12 score 75 scripts 39 dependents

bioc

affyio:Tools for parsing Affymetrix data files

Routines for parsing Affymetrix data files based upon file format information. Primary focus is on accessing the CEL and CDF file formats.

Maintained by Ben Bolstad. Last updated 2 months ago.

microarray dataimport infrastructure zlib

4 stars 9.07 score 40 scripts 110 dependents

bioc

RaggedExperiment:Representation of Sparse Experiments and Assays Across Samples

This package provides a flexible representation of copy number, mutation, and other data that fit into the ragged array schema for genomic location data. The basic representation of such data provides a rectangular flat table interface to the user with range information in the rows and samples/specimen in the columns. The RaggedExperiment class derives from a GRangesList representation and provides a semblance of a rectangular dataset.

Maintained by Marcel Ramos. Last updated 4 months ago.

infrastructure datarepresentation copynumber core-package data-structure mutations u24ca289073

4 stars 8.93 score 76 scripts 14 dependents

bioc

tidySingleCellExperiment:Brings SingleCellExperiment to the Tidyverse

'tidySingleCellExperiment' is an adapter that abstracts the 'SingleCellExperiment' container in the form of a 'tibble'. This allows *tidy* data manipulation, nesting, and plotting. For example, a 'tidySingleCellExperiment' is directly compatible with functions from 'tidyverse' packages `dplyr` and `tidyr`, as well as plotting with `ggplot2` and `plotly`. In addition, the package provides various utility functions specific to single-cell omics data analysis (e.g., aggregation of cell-level data to pseudobulks).

Maintained by Stefano Mangiola. Last updated 5 months ago.

assaydomain infrastructure rnaseq differentialexpression singlecell geneexpression normalization clustering qualitycontrol sequencing bioconductor dplyr ggplot2 plotly single-cell-rna-seq single-cell-sequencing singlecellexperiment tibble tidyr tidyverse

36 stars 8.86 score 125 scripts 2 dependents

bioc

AnVIL:Bioconductor on the AnVIL compute environment

The AnVIL is a cloud computing resource developed in part by the National Human Genome Research Institute. The AnVIL package provides end-user and developer functionality. For the end-user, AnVIL provides fast binary package installation, utilities for working with Terra / AnVIL table and data resources, and convenient functions for file movement to and from Google cloud storage. For developers, AnVIL provides programmatic access to the Terra, Leonardo, Rawls, and Dockstore RESTful programming interfaces, including helper functions to transform JSON responses to formats more amenable to manipulation in R.

Maintained by Marcel Ramos. Last updated 1 months ago.

infrastructure

8.85 score 250 scripts 11 dependents

bioc

CellBench:Construct Benchmarks for Single Cell Analysis Methods

This package contains infrastructure for benchmarking analysis methods and access to single cell mixture benchmarking data. It provides a framework for organising analysis methods and testing combinations of methods in a pipeline without explicitly laying out each combination. It also provides utilities for sampling and filtering SingleCellExperiment objects, constructing lists of functions with varying parameters, and multithreaded evaluation of analysis methods.

Maintained by Shian Su. Last updated 5 months ago.

software infrastructure singlecell benchmark bioinformatics

31 stars 8.73 score 98 scripts

bioc

GenomicScores:Infrastructure to work with genomewide position-specific scores

Provide infrastructure to store and access genomewide position-specific scores within R and Bioconductor.

Maintained by Robert Castelo. Last updated 2 months ago.

infrastructure genetics annotation sequencing coverage annotationhubsoftware

8 stars 8.71 score 83 scripts 6 dependents

bioc

monocle:Clustering, differential expression, and trajectory analysis for single-cell RNA-Seq

Monocle performs differential expression and time-series analysis for single-cell expression experiments. It orders individual cells according to progress through a biological process, without knowing ahead of time which genes define progress through that process. Monocle also performs differential expression analysis, clustering, visualization, and other useful tasks on single cell expression data. It is designed to work with RNA-Seq and qPCR data, but could be used with other types as well.

Maintained by Cole Trapnell. Last updated 5 months ago.

immunooncology sequencing rnaseq geneexpression differentialexpression infrastructure dataimport datarepresentation visualization clustering multiplecomparison qualitycontrol cpp

8.71 score 1.6k scripts 2 dependents

bioc

BiocBaseUtils:General utility functions for developing Bioconductor packages

The package provides utility functions related to package development. These include functions that replace slots, and selectors for show methods. It aims to coalesce the various helper functions often re-used throughout the Bioconductor ecosystem.

Maintained by Marcel Ramos. Last updated 5 months ago.

software infrastructure bioconductor-package core-package

4 stars 8.68 score 3 scripts 159 dependents

bioc

MsExperiment:Infrastructure for Mass Spectrometry Experiments

Infrastructure to store and manage all aspects related to a complete proteomics or metabolomics mass spectrometry (MS) experiment. The MsExperiment package provides light-weight and flexible containers for MS experiments building on the new MS infrastructure provided by the Spectra, QFeatures and related packages. Along with raw data representations, links to original data files and sample annotations, additional metadata or annotations can also be stored within the MsExperiment container. To guarantee maximum flexibility only minimal constraints are put on the type and content of the data within the containers.

Maintained by Laurent Gatto. Last updated 2 months ago.

infrastructure proteomics massspectrometry metabolomics experimentaldesign dataimport

5 stars 8.51 score 126 scripts 14 dependents

bioc

tidySummarizedExperiment:Brings SummarizedExperiment to the Tidyverse

The tidySummarizedExperiment package provides a set of tools for creating and manipulating tidy data representations of SummarizedExperiment objects. SummarizedExperiment is a widely used data structure in bioinformatics for storing high-throughput genomic data, such as gene expression or DNA sequencing data. The tidySummarizedExperiment package introduces a tidy framework for working with SummarizedExperiment objects. It allows users to convert their data into a tidy format, where each observation is a row and each variable is a column. This tidy representation simplifies data manipulation, integration with other tidyverse packages, and enables seamless integration with the broader ecosystem of tidy tools for data analysis.

Maintained by Stefano Mangiola. Last updated 5 months ago.

assaydomain infrastructure rnaseq differentialexpression geneexpression normalization clustering qualitycontrol sequencing transcription transcriptomics

26 stars 8.44 score 196 scripts 1 dependents

bioc

PSMatch:Handling and Managing Peptide Spectrum Matches

The PSMatch package helps proteomics practitioners to load, handle and manage Peptide Spectrum Matches. It provides functions to model peptide-protein relations as adjacency matrices and connected components, visualise these as graphs and make informed decisions about shared peptide filtering. The package also provides functions to calculate and visualise MS2 fragment ions.

Maintained by Laurent Gatto. Last updated 5 months ago.

infrastructure proteomics massspectrometry mass-spectrometry peptide-spectrum-matches

3 stars 8.40 score 15 scripts 39 dependents

bioc

UniProt.ws:R Interface to UniProt Web Services

The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. This package provides a collection of functions for retrieving, processing, and re-packaging UniProt web services. The package makes use of UniProt's modernized REST API and allows mapping of identifiers across different databases.

Maintained by Marcel Ramos. Last updated 3 months ago.

annotation infrastructure go kegg biocarta bioconductor-package core-package

4 stars 8.38 score 167 scripts 4 dependents

bioc

rawrr:Direct Access to Orbitrap Data and Beyond

This package wraps the functionality of the Thermo Fisher Scientific RawFileReader .NET 8.0 assembly. Within the R environment, spectra and chromatograms are represented by S3 objects. The package provides basic functions to download and install the required third-party libraries. The package is developed, tested, and used at the Functional Genomics Center Zurich, Switzerland.

Maintained by Christian Panse. Last updated 4 months ago.

massspectrometry proteomics metabolomics infrastructure software fast mass-spectrometry multiplatform orbitrap-ms

56 stars 8.19 score 23 scripts 2 dependents

bioc

affxparser:Affymetrix File Parsing SDK

Package for parsing Affymetrix files (CDF, CEL, CHP, BPMAP, BAR). It provides methods for fast and memory efficient parsing of Affymetrix files using the Affymetrix Fusion SDK. Both ASCII- and binary-based files are supported. Currently, there are methods for reading chip definition files (CDF) and cell intensity files (CEL). These files can be read either in full or in part. For example, probe signals from a few probesets can be extracted very quickly from a set of CEL files into a convenient list structure.

Maintained by Kasper Daniel Hansen. Last updated 3 months ago.

infrastructure dataimport microarray proprietaryplatforms onechannel bioconductor cpp

7 stars 8.19 score 65 scripts 14 dependents

bioc

biovizBase:Basic graphic utilities for visualization of genomic data.

The biovizBase package is designed to provide a set of utilities, color schemes and conventions for genomic data. It serves as the base for various high-level packages for biological data visualization. This saves development effort and encourages consistency.

Maintained by Michael Lawrence. Last updated 5 months ago.

infrastructure visualization preprocessing

8.03 score 273 scripts 74 dependents

bioc

InteractionSet:Base Classes for Storing Genomic Interaction Data

Provides the GInteractions, InteractionSet and ContactMatrix objects and associated methods for storing and manipulating genomic interaction data from Hi-C and ChIA-PET experiments.

Maintained by Aaron Lun. Last updated 5 months ago.

infrastructure datarepresentation software hic cpp

7.95 score 250 scripts 36 dependents

bioc

ChemmineOB:R interface to a subset of OpenBabel functionalities

ChemmineOB provides an R interface to a subset of cheminformatics functionalities implemented by the OpenBabel C++ project. OpenBabel is an open source cheminformatics toolbox that includes utilities for structure format interconversions, descriptor calculations, compound similarity searching and more. ChemmineOB aims to make a subset of these utilities available from within R. For non-developers, ChemmineOB is primarily intended to be used from ChemmineR as an add-on package rather than used directly.

Maintained by Thomas Girke. Last updated 5 months ago.

cheminformatics biomedicalinformatics pharmacogenetics pharmacogenomics microtitreplateassay cellbasedassays visualization infrastructure dataimport clustering proteomics metabolomics openbabel cpp

10 stars 7.87 score 77 scripts 1 dependents

bioc

TreeSummarizedExperiment:TreeSummarizedExperiment: a S4 Class for Data with Tree Structures

TreeSummarizedExperiment has extended SingleCellExperiment to include hierarchical information on the rows or columns of the rectangular data.

Maintained by Ruizhu Huang. Last updated 5 months ago.

datarepresentation infrastructure

7.87 score 251 scripts 15 dependents

bioc

biodb:biodb, a library and a development framework for connecting to chemical and biological databases

The biodb package provides access to standard remote chemical and biological databases (ChEBI, KEGG, HMDB, ...), as well as to in-house local database files (CSV, SQLite), with easy retrieval of entries, access to web services, search of compounds by mass and/or name, and mass spectra matching for LCMS and MSMS. Its architecture as a development framework facilitates the development of new database connectors for local projects or inside separate published packages.

Maintained by Pierrick Roger. Last updated 5 months ago.

software infrastructure dataimport kegg biology cheminformatics chemistry databases cpp

11 stars 7.85 score 24 scripts 6 dependents

bioc

SRAdb:A compilation of metadata from NCBI SRA and tools

The Sequence Read Archive (SRA) is the largest public repository of sequencing data from the next generation of sequencing platforms including Roche 454 GS System, Illumina Genome Analyzer, Applied Biosystems SOLiD System, Helicos Heliscope, and others. However, finding data of interest can be challenging using current tools. SRAdb is an attempt to make access to the metadata associated with submission, study, sample, experiment and run much more feasible. This is accomplished by parsing all the NCBI SRA metadata into a SQLite database that can be stored and queried locally. Full-text search in the package makes querying metadata very flexible and powerful. fastq and sra files can be downloaded for doing alignment locally. Besides the ftp protocol, SRAdb has functions supporting the fasp protocol (ascp from Aspera Connect) for faster downloading of large data files over long distances. The SQLite database is updated regularly as new data is added to SRA and can be downloaded at will for the most up-to-date metadata.

Maintained by Jack Zhu. Last updated 4 months ago.

infrastructure sequencing dataimport

2 stars 7.81 score 200 scripts

bioc

MsFeatures:Functionality for Mass Spectrometry Features

The MsFeatures package defines functionality for Mass Spectrometry features. This includes functions to group (LC-MS) features based on some of their properties, such as retention time (coeluting features), or correlation of signals across samples. This package hence allows features to be grouped, and its results can be used as an input for the `QFeatures` package, which allows abundance levels of features to be aggregated within each group. This package defines concepts and functions for base and common data types; implementations for more specific data types are expected to be provided in the respective packages (such as `xcms`). All functionality of this package is implemented in a modular way, which allows combination of different grouping approaches and enables its re-use in other R packages.

Maintained by Johannes Rainer. Last updated 5 months ago.

infrastructure massspectrometry metabolomics

7 stars 7.70 score 32 scripts 12 dependents

bioc

BiocPkgTools:Collection of simple tools for learning about Bioconductor Packages

Bioconductor has a rich ecosystem of metadata around packages, usage, and build status. This package is a simple collection of functions to access that metadata from R. The goal is to expose metadata for data mining and value-added functionality such as package searching, text mining, and analytics on packages.

Maintained by Sean Davis. Last updated 25 days ago.

software infrastructure bioconductor metadata

21 stars 7.67 score 68 scripts

bioc

koinar:KoinaR - Remote machine learning inference using Koina

A client to simplify fetching predictions from the Koina web service. Koina is a model repository enabling the remote execution of models. Predictions are generated as a response to HTTP/S requests, the standard protocol used for nearly all web traffic.

Maintained by Ludwig Lautenbacher. Last updated 3 months ago.

massspectrometry proteomics infrastructure software bioinformatics deep-learning machine-learning mass-spectrometry python

34 stars 7.49 score 4 scripts

bioc

MGnifyR:R interface to EBI MGnify metagenomics resource

Utility package to facilitate integration and analysis of EBI MGnify data in R. The package can be used to import microbial data, for instance into TreeSummarizedExperiment (TreeSE). In TreeSE format, the data is directly compatible with the miaverse framework.

Maintained by Tuomas Borman. Last updated 2 days ago.

infrastructure dataimport metagenomics microbiome microbiomedata

21 stars 7.48 score 32 scripts

bioc

flowViz:Visualization for flow cytometry

Provides visualization tools for flow cytometry data.

Maintained by Mike Jiang. Last updated 5 months ago.

immunooncology infrastructure flowcytometry cellbasedassays visualization

7.44 score 231 scripts 12 dependents

bioc

gDRutils:A package with helper functions for processing drug response data

This package contains utility functions used throughout the gDR platform to fit data, manipulate data, and convert and validate data structures. This package also has the necessary default constants for the gDR platform. Many of the functions are utilized by the gDRcore package.

Maintained by Arkadiusz Gladki. Last updated 1 days ago.

software infrastructure

2 stars 7.42 score 3 scripts 3 dependents

ikosmidis

enrichwith:Methods to Enrich R Objects with Extra Components

Provides the "enrich" method to enrich list-like R objects with new, relevant components. The current version has methods for enriching objects of class 'family', 'link-glm', 'lm', 'glm' and 'betareg'. The resulting objects preserve their class, so all methods associated with them still apply. The package also provides the 'enriched_glm' function that has the same interface as 'glm' but results in objects of class 'enriched_glm'. In addition to the usual components in a `glm` object, 'enriched_glm' objects carry an object-specific simulate method and functions to compute the scores, the observed and expected information matrix, the first-order bias, as well as model densities, probabilities, and quantiles at arbitrary parameter values. The package can also be used to produce customizable source code templates for the structured implementation of methods to compute new components and enrich arbitrary objects.

Maintained by Ioannis Kosmidis. Last updated 5 years ago.

infrastructure

6 stars 7.40 score 16 scripts 13 dependents

bioc

gDRimport:Package for handling the import of dose-response data

The package is a part of the gDR suite. It helps to prepare raw drug response data for downstream processing. It mainly contains helper functions for importing/loading/validating dose-response data provided in different file formats.

Maintained by Arkadiusz Gladki. Last updated 2 days ago.

software infrastructure dataimport

3 stars 7.32 score 5 scripts 1 dependents

mrc-ide

rrq:Simple Redis Queue

Simple Redis queue in R.

Maintained by Rich FitzJohn. Last updated 4 months ago.

cluster infrastructure

24 stars 7.31 score 14 scripts 3 dependents

bioc

OrganismDbi:Software to enable the smooth interfacing of different database packages

The package enables a simple unified interface to several annotation packages, each of which has its own schema, by taking advantage of the fact that each of these packages implements a select method.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

annotation infrastructure

7.26 score 34 scripts 34 dependents

bioc

basilisk.utils:Basilisk Installation Utilities

Implements utilities for installation of the basilisk package, primarily for creation of the underlying Conda instance. This allows us to avoid re-writing the same R code in both the configure script (for centrally administered R installations) and in the lazy installation mechanism (for distributed package binaries). It is highly unlikely that developers - or, heaven forbid, end-users! - will need to interact with this package directly; they should be using the basilisk package instead.

Maintained by Aaron Lun. Last updated 2 months ago.

infrastructure

7.24 score 9 scripts 40 dependents

bioc

MsBackendMgf:Mass Spectrometry Data Backend for Mascot Generic Format (mgf) Files

Mass spectrometry (MS) data backend supporting import and export of MS/MS spectra data from Mascot Generic Format (mgf) files. Objects defined in this package are supposed to be used with the Spectra Bioconductor package. This package thus adds mgf file support to the Spectra package.

Maintained by RforMassSpectrometry Package Maintainer. Last updated 2 months ago.

infrastructure proteomics massspectrometry metabolomics dataimport

5 stars 7.20 score 35 scripts 4 dependents

bioc

TnT:Interactive Visualization for Genomic Features

An R interface to the TnT javascript library (https://github.com/tntvis) to provide interactive and flexible visualization of track-based genomic data.

Maintained by Jialin Ma. Last updated 5 months ago.

infrastructure visualization bioconductor genome-browser htmlwidgets shiny

14 stars 7.15 score 17 scripts

bioc

MsBackendMsp:Mass Spectrometry Data Backend for NIST msp Files

Mass spectrometry (MS) data backend supporting import and handling of MS/MS spectra from NIST MSP Format (msp) files. Import of data from files with different MSP *flavours* is supported. Objects from this package add support for MSP files to Bioconductor's Spectra package. This package is thus not supposed to be used without the Spectra package that provides a complete infrastructure for MS data handling.

Maintained by Johannes Rainer. Last updated 2 months ago.

infrastructure proteomics massspectrometry metabolomics dataimport mass-spectrometry

5 stars 7.12 score 37 scripts 3 dependents

bioc

BiocVersion:Set the appropriate version of Bioconductor packages

This package provides repository information for the appropriate version of Bioconductor.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

infrastructure

7.11 score 19 scripts 105 dependents

bioc

CuratedAtlasQueryR:Queries the Human Cell Atlas

Provides access to a copy of the Human Cell Atlas, but with harmonised metadata. This allows for uniform querying across numerous datasets within the Atlas using common fields such as cell type, tissue type, and patient ethnicity. Usage involves first querying the metadata table for cells of interest, and then downloading the corresponding cells into a SingleCellExperiment object.

Maintained by Stefano Mangiola. Last updated 5 months ago.

assaydomain infrastructure rnaseq differentialexpression geneexpression normalization clustering qualitycontrol sequencing transcription transcriptomics database duckdb hdf5 human-cell-atlas single-cell singlecellexperiment tidyverse

90 stars 7.04 score 41 scripts

bioc

systemPipeShiny:systemPipeShiny: An Interactive Framework for Workflow Management and Visualization

systemPipeShiny (SPS) extends the widely used systemPipeR (SPR) workflow environment with a versatile graphical user interface provided by a Shiny App. This allows non-R users, such as experimentalists, to run many of systemPipeR’s workflow designs, control, and visualization functionalities interactively without requiring knowledge of R. Most importantly, SPS has been designed as a general purpose framework for interacting with other R packages in an intuitive manner. Like most Shiny Apps, SPS can be used on both local computers as well as centralized server-based deployments that can be accessed remotely as a public web service for using SPR’s functionalities with community and/or private data. The framework can integrate many core packages from the R/Bioconductor ecosystem. Examples of SPS’ current functionalities include: (a) interactive creation of experimental designs and metadata using an easy to use tabular editor or file uploader; (b) visualization of workflow topologies combined with auto-generation of R Markdown preview for interactively designed workflows; (c) access to a wide range of data processing routines; and (d) an extendable set of visualization functionalities. Complex visual results can be managed on a 'Canvas Workbench' allowing users to organize and to compare plots in an efficient manner combined with a session snapshot feature to continue work at a later time. The present suite of pre-configured visualization examples. The modular design of SPR makes it easy to design custom functions without any knowledge of Shiny, as well as extending the environment in the future with contributions from the community.

Maintained by Le Zhang. Last updated 5 months ago.

shinyapps infrastructure dataimport sequencing qualitycontrol reportwriting experimentaldesign clustering bioconductor bioconductor-package data-visualization shiny systempiper

34 stars 7.04 score 36 scripts

bioc

SharedObject:Sharing R objects across multiple R processes without memory duplication

This package is developed to facilitate parallel computing in R. It can create an R object in the shared memory space and share the data across multiple R processes. It avoids the overhead of memory duplication and data transfer, which makes sharing large data objects across many clusters possible.

Maintained by Jiefei Wang. Last updated 5 months ago.

infrastructure sharedobject cpp

45 stars 7.03 score 6 scripts 1 dependents

bioc

fmcsR:Mismatch Tolerant Maximum Common Substructure Searching

The fmcsR package introduces an efficient maximum common substructure (MCS) algorithm combined with a novel matching strategy that allows for atom and/or bond mismatches in the substructures shared among two small molecules. The resulting flexible MCSs (FMCSs) are often larger than strict MCSs, resulting in the identification of more common features in their source structures, as well as a higher sensitivity in finding compounds with weak structural similarities. The fmcsR package provides several utilities to use the FMCS algorithm for pairwise compound comparisons, structure similarity searching and clustering.

Maintained by Thomas Girke. Last updated 5 months ago.

cheminformatics biomedicalinformatics pharmacogenetics pharmacogenomics microtitreplateassay cellbasedassays visualization infrastructure dataimport clustering proteomics metabolomics cpp

5 stars 7.03 score 60 scripts 1 dependents

bioc

dir.expiry:Managing Expiration for Cache Directories

Implements an expiration system for access to versioned directories. Directories that have not been accessed by a registered function within a certain time frame are deleted. This aims to reduce disk usage by eliminating obsolete caches generated by old versions of packages.

Maintained by Aaron Lun. Last updated 5 months ago.

software infrastructure

7.03 score 6 scripts 46 dependents

bioc

h5mread:A fast HDF5 reader

The main function in the h5mread package is h5mread(), which allows reading arbitrary data from an HDF5 dataset into R, similarly to what the h5read() function from the rhdf5 package does. In the case of h5mread(), the implementation has been optimized to make it as fast and memory-efficient as possible.

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure datarepresentation dataimport openssl curl zlib

1 stars 6.98 score 4 scripts 127 dependents

r-forge

oompaBase:Class Unions, Matrix Operations, and Color Schemes for OOMPA

Provides the class unions that must be preloaded in order for the basic tools in the OOMPA (Object-Oriented Microarray and Proteomics Analysis) project to be defined and loaded. It also includes vectorized operations for row-by-row means, variances, and t-tests. Finally, it provides new color schemes. Details on the packages in the OOMPA project can be found at <http://oompa.r-forge.r-project.org/>.

Maintained by Kevin R. Coombes. Last updated 2 months ago.

infrastructure

6.97 score 29 scripts 18 dependents

bioc

MetaboAnnotation:Utilities for Annotation of Metabolomics Data

High-level functions to assist in annotation of (metabolomics) data sets. These include functions to perform simple tentative annotations based on mass matching but also functions to consider m/z and retention times for annotation of LC-MS features given that respective reference values are available. In addition, the package provides high-level functions to simplify matching of LC-MS/MS spectra against spectral libraries and objects and functionality to represent and manage such matched data.

Maintained by Johannes Rainer. Last updated 3 months ago.

infrastructure metabolomics massspectrometry annotation mass-spectromtry

15 stars 6.90 score 35 scripts

bioc

TileDBArray:Using TileDB as a DelayedArray Backend

Implements a DelayedArray backend for reading and writing dense or sparse arrays in the TileDB format. The resulting TileDBArrays are compatible with all Bioconductor pipelines that can accept DelayedArray instances.

Maintained by Aaron Lun. Last updated 5 months ago.

datarepresentation infrastructure software

10 stars 6.89 score 26 scripts 1 dependents

stemangiola

tidygate:Interactively Gate Points

Interactively gate points on a scatter plot. Interactively drawn gates are recorded and can be applied programmatically to reproduce results exactly. Programmatic gating is based on the package gatepoints by Wajid Jawaid (who is also an author of this package).

Maintained by Stefano Mangiola. Last updated 6 months ago.

assaydomain infrastructure clustering datavis dataviz dplyr drawing facs gate ggplot2 interactive pipe programmatic seurat single-cell single-cell-rna-seq tibble tidy-data tidyverse

23 stars 6.89 score 14 scripts 1 dependents

bioc

RProtoBufLib:C++ headers and static libraries of Protocol buffers

This package provides the headers and static library of Protocol buffers for other R packages to compile and link against.

Maintained by Mike Jiang. Last updated 5 months ago.

infrastructure

6.86 score 61 dependents

bioc

GenomicFiles:Distributed computing by file or by range

This package provides infrastructure for parallel computations distributed 'by file' or 'by range'. User defined MAPPER and REDUCER functions provide added flexibility for data combination and manipulation.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

genetics infrastructure dataimport sequencing coverage

6.86 score 89 scripts 16 dependents

ropensci

mctq:Munich ChronoType Questionnaire Tools

A complete toolkit for processing the Munich ChronoType Questionnaire (MCTQ) in its three versions: standard, micro, and shift. The MCTQ is a quantitative and validated tool used to assess chronotypes based on individuals' sleep behavior. It was originally presented by Till Roenneberg, Anna Wirz-Justice, and Martha Merrow in 2003 (<doi:10.1177/0748730402239679>).

Maintained by Daniel Vartanian. Last updated 3 months ago.

infrastructure preprocessing visualization biological-rhythm chronobiology chronotype circadian-phenotype circadian-rhythm entrainment mctq peer-reviewed sleep temporal-phenotype

12 stars 6.85 score 28 scripts

bioc

GDSArray:Representing GDS files as array-like objects

GDS files are widely used to represent genotyping or sequence data. The GDSArray package implements the `GDSArray` class to represent nodes in GDS files in a matrix-like representation that allows easy manipulation (e.g., subsetting, mathematical transformation) in _R_. The data remains on disk until needed, so that very large files can be processed.

Maintained by Xiuwen Zheng. Last updated 10 days ago.

infrastructure datarepresentation sequencing genotypingarray

5 stars 6.78 score 8 scripts 2 dependents

bioc

rawDiag:Brings Orbitrap Mass Spectrometry Data to Life; Fast and Colorful

Optimizing methods for liquid chromatography coupled to mass spectrometry (LC-MS) poses a nontrivial challenge. The rawDiag package facilitates rational method optimization by generating MS operator-tailored diagnostic plots of scan-level metadata. The package is designed for use on the R shell or as a Shiny application on the Orbitrap instrument PC.

Maintained by Christian Panse. Last updated 5 months ago.

massspectrometry proteomics metabolomics infrastructure software shinyapps fast mass-spectrometry multiplatform orbitrap visualization

36 stars 6.71 score 18 scripts

bioc

chimeraviz:Visualization tools for gene fusions

chimeraviz manages data from fusion gene finders and provides useful visualization tools.

Maintained by Stian Lågstad. Last updated 5 months ago.

infrastructure alignment

37 stars 6.71 score 14 scripts

bioc

bioassayR:Cross-target analysis of small molecule bioactivity

bioassayR is a computational tool that enables simultaneous analysis of thousands of bioassay experiments performed over a diverse set of compounds and biological targets. Unique features include support for large-scale cross-target analyses of both public and custom bioassays, generation of high throughput screening fingerprints (HTSFPs), and an optional preloaded database that provides access to a substantial portion of publicly available bioactivity data.

Maintained by Thomas Girke. Last updated 5 months ago.

immunooncology microtitreplateassay cellbasedassays visualization infrastructure dataimport bioinformatics proteomics metabolomics

5 stars 6.70 score 46 scripts

rte-antares-rpackage

antaresProcessing:'Antares' Results Processing

Process results generated by 'Antares', a powerful open source software developed by RTE (Réseau de Transport d’Électricité) to simulate and study electric power systems (more information about 'Antares' here: <https://github.com/AntaresSimulatorTeam/Antares_Simulator>). This package provides functions to create new columns like net load, load factors, upward and downward margins or to compute aggregated statistics like economic surpluses of consumers, producers and sectors.

Maintained by Tatiana Vargas. Last updated 4 months ago.

infrastructure dataimport adequacy antares bilan datatable energy linear-algebra margins monte-carlo-simulation optimization previsionnel rte simulation surplus tyndp

8 stars 6.70 score 35 scripts 1 dependents

bioc

AnVILBase:Generic functions for interacting with the AnVIL ecosystem

Provides generic functions for interacting with the AnVIL ecosystem. Packages that use either GCP or Azure in AnVIL are built on top of AnVILBase. Extension packages will provide methods for interacting with other cloud providers.

Maintained by Marcel Ramos. Last updated 5 months ago.

software infrastructure

6.68 score 70 scripts 19 dependents

bioc

Modstrings:Working with modified nucleotide sequences

Representing nucleotide modifications in a nucleotide sequence is usually done via special characters from a number of sources. This represents a challenge to work with in R and the Biostrings package. The Modstrings package implements this functionality for RNA and DNA sequences containing modified nucleotides by translating the characters internally in order to work with the infrastructure of the Biostrings package. For this, the ModRNAString and ModDNAString classes and their derivatives are implemented, along with functions to construct and modify these objects despite the encoding issues. In addition, the conversion from sequences to list-like location information (and the reverse operation) is implemented as well.

Maintained by Felix G.M. Ernst. Last updated 5 months ago.

dataimport datarepresentation infrastructure sequencing software bioconductor biostrings dna dna-modifications modified-nucleotides nucleotides rna rna-modification-alphabet rna-modifications sequences

1 stars 6.64 score 5 scripts 8 dependents

bioc

BumpyMatrix:Bumpy Matrix of Non-Scalar Objects

Implements the BumpyMatrix class and several subclasses for holding non-scalar objects in each entry of the matrix. This is akin to a ragged array but the raggedness is in the third dimension, much like a bumpy surface - hence the name. Of particular interest is the BumpyDataFrameMatrix, where each entry is a Bioconductor data frame. This allows us to naturally represent multivariate data in a format that is compatible with two-dimensional containers like the SummarizedExperiment and MultiAssayExperiment objects.

Maintained by Aaron Lun. Last updated 3 months ago.

software infrastructure datarepresentation

1 stars 6.62 score 39 scripts 12 dependents

mrc-ide

context:Contexts for evaluating R expressions

Contexts for evaluating R expressions.

Maintained by Rich FitzJohn. Last updated 2 years ago.

cluster infrastructure

5 stars 6.59 score 1.7k scripts 1 dependents

bioc

Structstrings:Implementation of the dot bracket annotations with Biostrings

The Structstrings package implements the widely used dot bracket annotation for storing base pairing information in structured RNA. Structstrings uses the infrastructure provided by the Biostrings package and derives the DotBracketString and related classes from the BString class. From these, base pair tables can be produced for in-depth analysis. In addition, the loop indices of the base pairs can be retrieved as well. For better efficiency, information conversion is implemented in C, inspired to a large extent by the ViennaRNA package.

Maintained by Felix G.M. Ernst. Last updated 5 months ago.

dataimport datarepresentation infrastructure sequencing software alignment sequencematching bioconductor rna rna-structural-analysis rna-structure sequences structures

4 stars 6.46 score 3 scripts 4 dependents

bioc

MoleculeExperiment:Prioritising a molecule-level storage of Spatial Transcriptomics Data

MoleculeExperiment contains functions to create and work with objects from the new MoleculeExperiment class. We introduce this class for analysing molecule-based spatial transcriptomics data (e.g., Xenium by 10X, Cosmx SMI by Nanostring, and Merscope by Vizgen). This allows researchers to analyse spatial transcriptomics data at the molecule level, and to have standardised data formats across vendors.

Maintained by Shila Ghazanfar. Last updated 5 months ago.

dataimport datarepresentation infrastructure software spatial transcriptomics

12 stars 6.45 score 39 scripts

bioc

lpsymphony:Symphony integer linear programming solver in R

This package was derived from Rsymphony_0.1-17 from CRAN. These packages provide an R interface to SYMPHONY, an open-source linear programming solver written in C++. The main difference between this package and Rsymphony is that it includes the solver source code (SYMPHONY version 5.6), while Rsymphony expects to find header and library files on the users' system. Thus the intention of lpsymphony is to provide an easy to install interface to SYMPHONY. For Windows, precompiled DLLs are included in this package.

Maintained by Vladislav Kim. Last updated 5 months ago.

infrastructure thirdpartyclient coinor-symphony

6.44 score 16 scripts 3 dependents

bioc

aroma.light:Light-Weight Methods for Normalization and Visualization of Microarray Data using Only Basic R Data Types

Methods for microarray analysis that take basic data types such as matrices and lists of vectors. These methods can be used standalone, be utilized in other packages, or be wrapped up in higher-level classes.

Maintained by Henrik Bengtsson. Last updated 5 months ago.

infrastructure microarray onechannel twochannel multichannel visualization preprocessing bioconductor

1 stars 6.43 score 26 scripts 20 dependents

bioc

RNAmodR:Detection of post-transcriptional modifications in high throughput sequencing data

RNAmodR provides classes and workflows for loading/aggregating data from high-throughput sequencing aimed at detecting post-transcriptional modifications through analysis of specific patterns. In addition, utilities are provided to validate and visualize the results. The RNAmodR package provides core functionality from which specific analysis strategies can be easily implemented as a separate package.

Maintained by Felix G.M. Ernst. Last updated 5 months ago.

software infrastructure workflowstep visualization sequencing alkanilineseq bioconductor modifications ribomethseq rna rnamodr

3 stars 6.39 score 9 scripts 3 dependents

bioc

ontoProc:processing of ontologies of anatomy, cell lines, and so on

Support harvesting of diverse bioinformatic ontologies, making particular use of the ontologyIndex package on CRAN. We provide snapshots of key ontologies for terms about cells, cell lines, chemical compounds, and anatomy, to help analyze genome-scale experiments, particularly cell x compound screens. Another purpose is to strengthen development of compelling use cases for richer interfaces to emerging ontologies.

Maintained by Vincent Carey. Last updated 16 days ago.

infrastructure go bioinformatics genomics ontology

3 stars 6.37 score 75 scripts 2 dependents

bioc

SingleCellAlleleExperiment:S4 Class for Single Cell Data with Allele and Functional Levels for Immune Genes

Defines an S4 class that is based on SingleCellExperiment. In addition to the usual gene layer, the object can also store data for immune genes such as HLAs, Igs and KIRs at allele and functional level. The package is part of a workflow named single-cell ImmunoGenomic Diversity (scIGD), which first incorporates allele-aware quantification data for immune genes. This new data can then be used with the data structure and functionality implemented here for further data handling and data analysis.

Maintained by Jonas Schuck. Last updated 2 months ago.

datarepresentation infrastructure singlecell transcriptomics geneexpression genetics immunooncology dataimport

7 stars 6.18 score 12 scripts

bioc

Herper:The Herper package is a simple toolset to install and manage conda packages and environments from R

Many tools for data analysis are not available in R, but are present in public repositories like conda. The Herper package provides a comprehensive set of functions to interact with the conda package management system. With Herper users can install, manage and run conda packages from the comfort of their R session. Herper also provides an ad-hoc approach to handling external system requirements for R packages. For people developing packages with python conda dependencies we recommend using basilisk (https://bioconductor.org/packages/release/bioc/html/basilisk.html) to internally support these system requirements pre-hoc.

Maintained by Thomas Carroll. Last updated 5 months ago.

infrastructure software

5 stars 6.11 score 52 scripts

bioc

tidyomics:Easily install and load the tidyomics ecosystem

The tidyomics ecosystem is a set of packages for ’omic data analysis that work together in harmony; they share common data representations and API design, consistent with the tidyverse ecosystem. The tidyomics package is designed to make it easy to install and load core packages from the tidyomics ecosystem with a single command.

Maintained by Stefano Mangiola. Last updated 5 months ago.

assaydomain infrastructure rnaseq differentialexpression geneexpression normalization clustering qualitycontrol sequencing transcription transcriptomics cytometry genomics tidyverse

64 stars 6.11 score 5 scripts

bioc

gDRstyle:A package with style requirements for the gDR suite

This package serves as a helper package for the whole gDR suite. It supports good development practices by keeping style requirements and style tests for other packages. It also contains build helpers to ensure all package requirements are met.

Maintained by Arkadiusz Gladki. Last updated 2 months ago.

software infrastructure

2 stars 6.10 score 2 scripts

bioc

tkWidgets:R based tk widgets

Widgets to provide user interfaces. tcltk must be installed for the widgets to run.

Maintained by J. Zhang. Last updated 5 months ago.

infrastructure

6.04 score 72 scripts 6 dependents

jokergoo

bsub:Submitter and Monitor of the 'LSF Cluster'

It submits R code/R scripts/shell commands to 'LSF cluster' (<https://en.wikipedia.org/wiki/Platform_LSF>, the 'bsub' system) without leaving R. There is also an interactive 'shiny' application for monitoring job status.

Maintained by Zuguang Gu. Last updated 15 days ago.

software infrastructure

25 stars 6.01 score 27 scripts

reconverse

reportfactory:Lightweight Infrastructure for Handling Multiple R Markdown Documents

Provides an infrastructure for handling multiple R Markdown reports, including automated curation and time-stamping of outputs, parameterisation and provision of helper functions to manage dependencies.

Maintained by Thibaut Jombart. Last updated 2 years ago.

infrastructure knitr rmarkdown rmarkdown-document

84 stars 5.99 score 47 scripts

bioc

epivizrServer:WebSocket server infrastructure for epivizr apps and packages

This package provides objects to manage WebSocket connections to epiviz apps. Other epivizr packages use this infrastructure.

Maintained by Hector Corrada Bravo. Last updated 5 months ago.

infrastructure visualization

5.95 score 6 scripts 5 dependents

bioc

AnVILWorkflow:Run workflows implemented in Terra/AnVIL workspace

The AnVIL is a cloud computing resource developed in part by the National Human Genome Research Institute. The main cloud-based genomics platform deployed by the AnVIL project is Terra. The AnVILWorkflow package allows remote access to workflows implemented in Terra, enabling end users to utilize Terra/AnVIL-provided resources, such as data, workflows, and flexible/scalable computing resources, through conventional R functions.

Maintained by Sehyun Oh. Last updated 1 months ago.

infrastructure software anvil gcp terra workflows

6 stars 5.95 score 1 scripts

bioc

cummeRbund:Analysis, exploration, manipulation, and visualization of Cufflinks high-throughput sequencing data.

Allows for persistent storage, access, exploration, and manipulation of Cufflinks high-throughput sequencing data. In addition, provides numerous plotting functions for commonly used visualizations.

Maintained by Loyal A. Goff. Last updated 5 months ago.

highthroughputsequencing highthroughputsequencingdata rnaseq rnaseqdata geneexpression differentialexpression infrastructure dataimport datarepresentation visualization bioinformatics clustering multiplecomparisons qualitycontrol

5.92 score 209 scripts

bioc

tidySpatialExperiment:SpatialExperiment with tidy principles

tidySpatialExperiment provides a bridge between the SpatialExperiment package and the tidyverse ecosystem. It creates an invisible layer that allows you to interact with a SpatialExperiment object as if it were a tibble, enabling the use of functions from dplyr, tidyr, ggplot2 and plotly. Underneath, however, your data remains a SpatialExperiment object.

Maintained by William Hutchison. Last updated 5 months ago.

infrastructure rnaseq geneexpression sequencing spatial transcriptomics singlecell

6 stars 5.88 score 12 scripts

bioc

DFplyr:A `DataFrame` (`S4Vectors`) backend for `dplyr`

Provides `dplyr` verbs (`mutate`, `select`, `filter`, etc...) supporting `S4Vectors::DataFrame` objects. Importantly, this is achieved without conversion to an intermediate `tibble`. Adds grouping infrastructure to `DataFrame` which is respected by the transformation verbs.

Maintained by Jonathan Carroll. Last updated 5 months ago.

datarepresentation infrastructure software

21 stars 5.87 score 5 scripts

bioc

oligoClasses:Classes for high-throughput arrays supported by oligo and crlmm

This package contains class definitions, validity checks, and initialization methods for classes used by the oligo and crlmm packages.

Maintained by Benilton Carvalho. Last updated 5 months ago.

infrastructure

5.86 score 93 scripts 17 dependents

bioc

SpatialExperimentIO:Read in Xenium, CosMx, MERSCOPE or STARmapPLUS data as SpatialExperiment object

Read in imaging-based spatial transcriptomics technology data. Currently available modules are for Xenium by 10X Genomics, CosMx by Nanostring, MERSCOPE by Vizgen, or STARmapPLUS from Broad Institute. You can choose to read the data in as a SpatialExperiment or a SingleCellExperiment object.

Maintained by Yixing E. Dong. Last updated 2 months ago.

datarepresentation dataimport infrastructure transcriptomics singlecell spatial geneexpression

9 stars 5.81 score 16 scripts

bioc

MsBackendMassbank:Mass Spectrometry Data Backend for MassBank record Files

Mass spectrometry (MS) data backend supporting import and export of MS/MS library spectra from MassBank record files. Different backends are available that allow handling of data in plain MassBank text file format or direct interaction with MassBank SQL databases. Objects from this package are intended to be used with the Spectra Bioconductor package. This package thus adds MassBank support to the Spectra package.

Maintained by RforMassSpectrometry Package Maintainer. Last updated 1 months ago.

infrastructure massspectrometry metabolomics dataimport massbank spectra

3 stars 5.81 score 27 scripts

bioc

HuBMAPR:Interface to 'HuBMAP'

'HuBMAP' provides an open, global bio-molecular atlas of the human body at the cellular level. The `datasets()`, `samples()`, `donors()`, `publications()`, and `collections()` functions retrieve the information for each of these entity types. `*_details()` are available for individual entries of each entity type. `*_derived()` are available for retrieving derived datasets or samples for individual entries of each entity type. Data files can be accessed using `bulk_data_transfer()`.

Maintained by Christine Hou. Last updated 1 months ago.

software singlecell dataimport thirdpartyclient spatial infrastructure bioconductor-package client hubmap rstudio

3 stars 5.80 score 1 scripts

henrikbengtsson

aroma.affymetrix:Analysis of Large Affymetrix Microarray Data Sets

A cross-platform R framework that facilitates processing of any number of Affymetrix microarray samples regardless of computer system. The only parameter that limits the number of chips that can be processed is the amount of available disk space. The Aroma Framework has successfully been used in studies to process tens of thousands of arrays. This package has actively been used since 2006.

Maintained by Henrik Bengtsson. Last updated 1 years ago.

infrastructure proprietaryplatforms exonarray microarray onechannel gui dataimport datarepresentation preprocessing qualitycontrol visualization reportwriting acgh copynumbervariants differentialexpression geneexpression snp transcription affymetrix analysis copy-number dna expression hpc large-scale notebook reproducibility rna

10 stars 5.79 score 112 scripts 3 dependents

bioc

BiocFHIR:Illustration of FHIR ingestion and transformation using R

FHIR R4 bundles in JSON format are derived from https://synthea.mitre.org/downloads. Transformation inspired by a kaggle notebook published by Dr Alexander Scarlat, https://www.kaggle.com/code/drscarlat/fhir-starter-parse-healthcare-bundles-into-tables. This is a very limited illustration of some basic parsing and reorganization processes. Additional tooling will be required to move beyond the Synthea data illustrations.

Maintained by Vincent Carey. Last updated 5 months ago.

infrastructure dataimport datarepresentation fhir

4 stars 5.78 score 15 scripts

mrc-ide

cinterpolate:Interpolation From C

Simple interpolation methods designed to be used from C code. Supports constant, linear and spline interpolation. An R wrapper is included but this package is primarily designed to be used from C code using 'LinkingTo'. The spline calculations are classical cubic interpolation, e.g., Forsythe, Malcolm and Moler (1977) <ISBN: 9780131653320>.

Maintained by Rich FitzJohn. Last updated 6 months ago.

infrastructure openblas

9 stars 5.77 score 1 scripts 4 dependents

bioc

TENxIO:Import methods for 10X Genomics files

Provides a structured S4 approach to importing data files from the 10X pipelines. It mainly supports Single Cell Multiome ATAC + Gene Expression data among other data types. The main Bioconductor data representations used are SingleCellExperiment and RaggedExperiment.

Maintained by Marcel Ramos. Last updated 4 months ago.

software infrastructure dataimport singlecell bioconductor-package u24ca289073

5.77 score 7 scripts 3 dependents

bioc

MetID:Network-based prioritization of putative metabolite IDs

This package uses an innovative network-based approach that will enhance our ability to determine the identities of significant ions detected by LC-MS.

Maintained by Zhenzhi Li. Last updated 5 months ago.

assaydomain biologicalquestion infrastructure researchfield statisticalmethod technology workflowstep network kegg

1 stars 5.74 score 110 scripts

bioc

rexposome:Exposome exploration and outcome data analysis

A package for exploring the exposome and performing association analyses between exposures and health outcomes.

Maintained by Xavier Escribà Montagut. Last updated 5 months ago.

software biologicalquestion infrastructure dataimport datarepresentation biomedicalinformatics experimentaldesign multiplecomparison classification clustering

5.70 score 28 scripts 1 dependents

bioc

iSEEindex:iSEE extension for a landing page to a custom collection of data sets

This package provides an interface to any collection of data sets within a single iSEE web-application. The main functionality of this package is to define a custom landing page allowing app maintainers to list a custom collection of data sets that users can select from and directly load objects into an iSEE web-application.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

software infrastructure bioconductor hacktoberfest

2 stars 5.65 score 8 scripts

bioc

flowcatchR:Tools to analyze in vivo microscopy imaging data focused on tracking flowing blood cells

flowcatchR is a set of tools to analyze in vivo microscopy imaging data, focused on tracking flowing blood cells. It guides the steps from segmentation to calculation of features, filtering out particles not of interest, and also provides a set of utilities to help check the quality of the performed operations (e.g. how good the segmentation was). It allows investigating the issue of tracking flowing cells such as in blood vessels, to categorize the particles into flowing, rolling and adherent. This classification is applied in the study of phenomena such as hemostasis and thrombosis development. Moreover, flowcatchR presents an integrated workflow solution, based on the integration with a Shiny App and Jupyter notebooks, which is delivered alongside the package, and can enable fully reproducible bioimage analysis in the R environment.

Maintained by Federico Marini. Last updated 3 months ago.

software visualization cellbiology classification infrastructure gui shinyapps bioconductor fluorescence microscopy particles tracking

4 stars 5.62 score 8 scripts

bioc

gpuMagic:An openCL compiler with the capacity to compile R functions and run the code on GPU

The package aims to help users write openCL code with little or no effort. It is able to compile a user-defined R function and run it on a device such as a CPU or a GPU. The user can also write and run their openCL code directly by calling the .kernel function.

Maintained by Jiefei Wang. Last updated 5 months ago.

infrastructure ocl-icd cpp

10 stars 5.60 score 1 scripts

bioc

scviR:experimental interface from R to scvi-tools

This package defines interfaces from R to scvi-tools. A vignette works through the totalVI tutorial for analyzing CITE-seq data. Another vignette compares outputs of Chapter 12 of the OSCA book with analogous outputs based on totalVI quantifications. Future work will address other components of scvi-tools, with a focus on building understanding of probabilistic methods based on variational autoencoders.

Maintained by Vincent Carey. Last updated 5 months ago.

infrastructure singlecell dataimport bioconductor cite-seq scverse

6 stars 5.60 score 11 scripts

bioc

convert:Convert Microarray Data Objects

Define coerce methods for microarray data objects.

Maintained by Yee Hwa (Jean) Yang. Last updated 5 months ago.

infrastructure microarray twochannel

5.58 score 91 scripts 1 dependents

bioc

AnVILGCP:The GCP R Client for the AnVIL

The package provides a set of functions to interact with the Google Cloud Platform (GCP) services on the AnVIL platform. The package is designed to work with the AnVIL package. User-level interaction with this package should be minimal.

Maintained by Marcel Ramos. Last updated 2 months ago.

software infrastructure thirdpartyclient dataimport

5.56 score 27 scripts 3 dependents

bioc

eiR:Accelerated similarity searching of small molecules

The eiR package provides utilities for accelerated structure similarity searching of very large small molecule data sets using an embedding and indexing approach.

Maintained by Thomas Girke. Last updated 2 months ago.

cheminformatics biomedicalinformatics pharmacogenetics pharmacogenomics microtitreplateassay cellbasedassays visualization infrastructure dataimport clustering proteomics metabolomics

3 stars 5.51 score 12 scripts

bioc

VisiumIO:Import Visium data from the 10X Space Ranger pipeline

The package allows users to readily import spatial data obtained from either the 10X website or from the Space Ranger pipeline. Supported formats include tar.gz, h5, and mtx files. Multiple files can be imported at once with *List type of functions. The package represents data mainly as SpatialExperiment objects.

Maintained by Marcel Ramos. Last updated 2 months ago.

software infrastructure dataimport singlecell spatial bioconductor-package genomics u24ca289073

5.50 score 14 scripts 1 dependents

bioc

GenomicTuples:Representation and Manipulation of Genomic Tuples

GenomicTuples defines general purpose containers for storing genomic tuples. It aims to provide functionality for tuples of genomic co-ordinates that is analogous to that available for genomic ranges in the GenomicRanges Bioconductor package.

Maintained by Peter Hickey. Last updated 5 months ago.

infrastructure datarepresentation sequencing cpp

4 stars 5.48 score 7 scripts

bioc

genomeIntervals:Operations on genomic intervals

This package defines classes for representing genomic intervals and provides functions and methods for working with these. Note: The package provides the basic infrastructure for and is enhanced by the package 'girafe'.

Maintained by Julien Gagneur. Last updated 5 months ago.

dataimport infrastructure genetics

5.43 score 45 scripts 2 dependents

bioc

MsBackendSql:SQL-based Mass Spectrometry Data Backend

SQL-based mass spectrometry (MS) data backend that also supports storage and handling of very large data sets. Objects from this package are supposed to be used with the Spectra Bioconductor package. Through the MsBackendSql with its minimal memory footprint, this package thus provides an alternative MS data representation for very large or remote MS data sets.

Maintained by Johannes Rainer. Last updated 12 days ago.

infrastructure massspectrometry metabolomics dataimport proteomics

4 stars 5.41 score 16 scripts

bioc

OmicsMLRepoR:Search harmonized metadata created under the OmicsMLRepo project

This package provides functions to browse the harmonized metadata for large omics databases. This package also supports data navigation if the metadata incorporates ontology.

Maintained by Sehyun Oh. Last updated 6 days ago.

software infrastructure datarepresentation u24ca289073

5.40 score 14 scripts

bioc

iSEEhex:iSEE extension for summarising data points in hexagonal bins

This package provides panels summarising data points in hexagonal bins for `iSEE`. It is part of `iSEEu`, the iSEE universe of panels that extend the `iSEE` package.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

software infrastructure bioconductor iseeu shiny-r

5.38 score 7 scripts 2 dependents

bioc

iSEEde:iSEE extension for panels related to differential expression analysis

This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of differential expression results. This package does not perform differential expression. Instead, it provides methods to embed precomputed differential expression results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

software infrastructure differentialexpression bioconductor hacktoberfest iseeu

1 stars 5.38 score 15 scripts

bioc

ReUseData:Reusable and reproducible Data Management

ReUseData is an _R/Bioconductor_ software tool to provide a systematic and versatile approach for standardized and reproducible data management. ReUseData facilitates transformation of shell or other ad hoc scripts for data preprocessing into workflow-based data recipes. Evaluation of data recipes generates curated data files in their generic formats (e.g., VCF, bed). Both recipes and data are cached using database infrastructure for easy data management and reuse. Prebuilt data recipes are available through the ReUseData portal ("https://rcwl.org/dataRecipes/") with full annotation and user instructions. Pregenerated data are available through the ReUseData cloud bucket and are directly downloadable through "getCloudData()".

Maintained by Qian Liu. Last updated 5 months ago.

software infrastructure dataimport preprocessing immunooncology

4 stars 5.38 score 7 scripts

bioc

CardinalIO:Read and write mass spectrometry imaging files

Fast and efficient reading and writing of mass spectrometry imaging data files. Supports imzML and Analyze 7.5 formats. Provides ontologies for mass spectrometry imaging.

Maintained by Kylie Ariel Bemis. Last updated 5 months ago.

software infrastructure dataimport massspectrometry imagingmassspectrometry cpp

1 stars 5.32 score 3 scripts 1 dependents

bioc

QTLExperiment:S4 classes for QTL summary statistics and metadata

QTLExperiment defines an S4 class for storing and manipulating summary statistics from QTL mapping experiments in one or more states. It is based on the 'SummarizedExperiment' class and contains functions for creating, merging, and subsetting objects. 'QTLExperiment' also stores experiment metadata and has checks in place to ensure that transformations apply correctly.

Maintained by Amelia Dunstone. Last updated 9 days ago.

functionalgenomics dataimport datarepresentation infrastructure sequencing snp software

2 stars 5.32 score 14 scripts 1 dependents

bioc

SCArray:Large-scale single-cell omics data manipulation with GDS files

Provides large-scale single-cell omics data manipulation using Genomic Data Structure (GDS) files. It combines dense and sparse matrices stored in GDS files and the Bioconductor infrastructure framework (SingleCellExperiment and DelayedArray) to provide out-of-memory data storage and large-scale manipulation using the R programming language.

Maintained by Xiuwen Zheng. Last updated 5 days ago.

infrastructure datarepresentation dataimport singlecell rnaseq cpp

1 stars 5.32 score 9 scripts 1 dependents

bioc

AnVILAz:R / Bioconductor Support for the AnVIL Azure Platform

The AnVIL is a cloud computing resource developed in part by the National Human Genome Research Institute. The AnVILAz package supports end-users and developers using the AnVIL platform in the Azure cloud. The package provides a programmatic interface to AnVIL resources, including workspaces, notebooks, tables, and workflows. The package also provides utilities for managing resources, including copying files to and from Azure Blob Storage, and creating shared access signatures (SAS) for secure access to Azure resources.

Maintained by Marcel Ramos. Last updated 5 months ago.

software infrastructure thirdpartyclient

5.30 score 5 scripts

bioc

PhIPData:Container for PhIP-Seq Experiments

PhIPData defines an S4 class for phage-immunoprecipitation sequencing (PhIP-seq) experiments. Building upon the RangedSummarizedExperiment class, PhIPData enables users to coordinate metadata with experimental data in analyses. Additionally, PhIPData provides specialized methods to subset and identify beads-only samples, subset objects using virus aliases, and use existing peptide libraries to populate object parameters.

Maintained by Athena Chen. Last updated 5 months ago.

infrastructure datarepresentation sequencing coverage

6 stars 5.26 score 6 scripts 1 dependents

bioc

DelayedDataFrame:Delayed operation on DataFrame using standard DataFrame metaphor

Based on the standard DataFrame metaphor, this package aims to implement delayed operations on the DelayedDataFrame through a lazyIndex slot, which saves the mapping indexes for each column of the DelayedDataFrame. Methods such as show, validity check, [/[[ subsetting, and rbind/cbind are implemented for DelayedDataFrame to operate around lazyIndex. The listData slot stays untouched until a realization call, e.g., the DataFrame constructor or as.list(), is invoked.

Maintained by Qian Liu. Last updated 5 months ago.

infrastructure datarepresentation

2 stars 5.26 score 3 scripts 1 dependents

rformassspectrometry

SpectriPy:Integrating Spectra with Python's matchms

The SpectriPy package allows integration of Python-based MS analysis code with the Spectra package. Spectra objects can be converted into Python's matchms Spectrum objects. In addition, SpectriPy integrates and wraps the similarity scoring and processing/filtering functions from the matchms package into R.

Maintained by Johannes Rainer. Last updated 2 months ago.

infrastructure metabolomics massspectrometry mass-spectrometry python

8 stars 5.25 score 5 scripts

bioc

SpectraQL:MassQL support for Spectra

The Mass Spec Query Language (MassQL) is a domain-specific language that makes it possible to express a query and retrieve mass spectrometry (MS) data in a more natural and understandable way for MS users. It is inspired by SQL and is by design programming language agnostic. The SpectraQL package adds support for the MassQL query language to R, in particular to MS data represented by Spectra objects. Users can thus apply MassQL expressions to analyze and retrieve specific data from Spectra objects.

Maintained by Johannes Rainer. Last updated 5 months ago.

infrastructure proteomics massspectrometry metabolomics

7 stars 5.24 score 2 scripts

bioc

epivizr:R Interface to epiviz web app

This package provides connections to the epiviz web app (http://epiviz.cbcb.umd.edu) for interactive visualization of genomic data. Objects in R/bioc interactive sessions can be displayed in genome browser tracks or plots to be explored by navigation through genomic regions. Fundamental Bioconductor data structures are supported (e.g., GenomicRanges and RangedSummarizedExperiment objects), while providing an easy mechanism to support other data structures (through package epivizrData). Visualizations (using d3.js) can be easily added to the web app as well.

Maintained by Hector Corrada Bravo. Last updated 5 months ago.

visualization infrastructure gui

5.24 score 29 scripts 2 dependents

bioc

HubPub:Utilities to create and use Bioconductor Hubs

HubPub provides users with functionality to help with the Bioconductor Hub structures. The package provides the ability to create a skeleton of a Hub style package that the user can then populate with the necessary information. There are also functions to help add resources to the Hub package metadata files as well as publish data to the Bioconductor S3 bucket.

Maintained by Kayla Interdonato. Last updated 15 days ago.

dataimport infrastructure software thirdpartyclient bioconductor-package

3 stars 5.18 score 4 scripts

bioc

epivizrData:Data Management API for epiviz interactive visualization app

Serve data from Bioconductor Objects through a WebSocket connection.

Maintained by Hector Corrada Bravo. Last updated 5 months ago.

infrastructure visualization

1 stars 5.08 score 4 scripts 4 dependents

bioc

topdownr:Investigation of Fragmentation Conditions in Top-Down Proteomics

The topdownr package allows automatic and systematic investigation of fragmentation conditions. It creates Thermo Orbitrap Fusion Lumos method files to test hundreds of fragmentation conditions. Additionally, it provides functions to analyse and process the generated MS data and determine the best conditions to maximise overall fragment coverage.

Maintained by Sebastian Gibb. Last updated 5 months ago.

immunooncology infrastructure proteomics massspectrometry coverage mass-spectrometry topdown

1 stars 5.08 score

bioc

AllelicImbalance:Investigates Allele Specific Expression

Provides a framework for allele-specific expression investigation using RNA-seq data.

Maintained by Jesper R Gadin. Last updated 5 months ago.

genetics infrastructure sequencing

5.08 score 7 scripts

bioc

widgetTools:Creates an interactive tcltk widget

This package contains tools to support the construction of tcltk widgets.

Maintained by Jianhua Zhang. Last updated 5 months ago.

infrastructure

5.04 score 11 scripts 8 dependents

bioc

MsBackendMetaboLights:Retrieve Mass Spectrometry Data from MetaboLights

MetaboLights is one of the main public repositories for storage of metabolomics experiments, which includes analysis results as well as raw data. The MsBackendMetaboLights package provides functionality to retrieve and represent mass spectrometry (MS) data from MetaboLights. Data files are downloaded and cached locally, avoiding repetitive downloads. MS data from metabolomics experiments can thus be directly and seamlessly integrated into R-based analysis workflows with the Spectra and MsBackendMetaboLights packages.

Maintained by Johannes Rainer. Last updated 3 days ago.

infrastructure massspectrometry metabolomics dataimport proteomics mass-spectrometry metabolomics-data

2 stars 5.00 score 7 scripts

bioc

OSTA.data:OSTA book data

'OSTA.data' is a companion package for the "Orchestrating Spatial Transcriptomics Analysis" (OSTA) with Bioconductor online book. Throughout OSTA, we rely on a set of publicly available datasets that cover different sequencing- and imaging-based platforms, such as Visium, Visium HD, Xenium (10x Genomics) and CosMx (NanoString). In addition, we rely on scRNA-seq (Chromium) data for tasks such as spot deconvolution and label transfer (i.e., supervised clustering). These data have been deposited in an Open Science Framework (OSF) repository, and can be queried and downloaded using functions from the 'osfr' package. For convenience, we have implemented 'OSTA.data' to query and retrieve data from our OSF node, and cache retrieved Zip archives using 'BiocFileCache'.

Maintained by Yixing E. Dong. Last updated 1 months ago.

dataimport datarepresentation experimenthubsoftware infrastructure immunooncology geneexpression transcriptomics singlecell spatial

2 stars 5.00 score

bioc

VariantExperiment:A RangedSummarizedExperiment Container for VCF/GDS Data with GDS Backend

VariantExperiment is a Bioconductor package for saving data in VCF/GDS format into a RangedSummarizedExperiment object. The high-throughput genetic/genomic data are saved in GDSArray objects. The annotation data for features/samples are saved in DelayedDataFrame format with a mono-dimensional GDSArray in each column. The on-disk representation of both assay data and annotation data enables on-disk reading and processing and saves memory space significantly. The RangedSummarizedExperiment interface enables easy and common manipulations of high-throughput genetic/genomic data with the common SummarizedExperiment metaphor in R and Bioconductor.

Maintained by Qian Liu. Last updated 5 months ago.

infrastructure datarepresentation sequencing annotation genomeannotation genotypingarray

1 stars 5.00 score 2 scripts

bioc

iSEEpathways:iSEE extension for panels related to pathway analysis

This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of pathway analysis results. This package does not perform pathway analysis. Instead, it provides methods to embed precomputed pathway analysis results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

software infrastructure differentialexpression geneexpression gui visualization pathways genesetenrichment go shinyapps bioconductor hacktoberfest isee iseeu

1 stars 4.95 score 10 scripts

bioc

RSeqAn:R SeqAn

Headers and some wrapper functions from the SeqAn C++ library for ease of usage in R.

Maintained by August Guang. Last updated 5 months ago.

infrastructure software cpp

3 stars 4.95 score 2 scripts 1 dependents

bioc

BSgenomeForge:Forge your own BSgenome data package

A set of tools to forge BSgenome data packages. Supersedes the old seed-based tools from the BSgenome software package. This package allows the user to create a BSgenome data package in one function call, simplifying the old seed-based process.

Maintained by Hervé Pagès. Last updated 5 months ago.

infrastructure datarepresentation genomeassembly annotation genomeannotation sequencing alignment dataimport sequencematching bioconductor-package core-package

4 stars 4.90 score 6 scripts

bioc

beachmat.hdf5:beachmat bindings for HDF5-backed matrices

Extends beachmat to support initialization of tatami matrices from HDF5-backed arrays. This allows C++ code in downstream packages to directly call the HDF5 C/C++ library to access array data, without the need for block processing via DelayedArray. Some utilities are also provided for direct creation of an in-memory tatami matrix from an HDF5 file.

Maintained by Aaron Lun. Last updated 5 months ago.

datarepresentation dataimport infrastructure zlib cpp

4.88 score 6 scripts

bioc

rRDP:Interface to the RDP Classifier

This package installs and interfaces with the naive Bayesian classifier for 16S rRNA sequences developed by the Ribosomal Database Project (RDP). With this package, the classifier trained with the standard training set can be used, or a custom classifier can be trained.

Maintained by Michael Hahsler. Last updated 5 months ago.

genetics sequencing infrastructure classification microbiome immunooncology alignment sequencematching dataimport bayesian bioconductor bioinformatics openjdk

3 stars 4.88 score 6 scripts

hectorrdb

Ecume:Equality of 2 (or k) Continuous Univariate and Multivariate Distributions

We implement (or re-implement in R) a variety of statistical tools. They are focused on non-parametric two-sample (or k-sample) distribution comparisons in the univariate or multivariate case. See the vignette for more info.

Maintained by Hector Roux de Bezieux. Last updated 10 months ago.

software infrastructure

1 stars 4.86 score 16 scripts 3 dependents

bioc

rhdf5client:Access HDF5 content from HDF Scalable Data Service

This package provides functionality for reading data from the HDF Scalable Data Service (HSDS) from within R. The HSDSArray function bridges from HSDS to the user via the DelayedArray interface. Bioconductor manages an open HSDS instance graciously provided by John Readey of the HDF Group.

Maintained by Vincent Carey. Last updated 5 months ago.

dataimport software infrastructure

4.82 score 37 scripts 2 dependents

bioc

biodbChebi:biodbChebi, a library for connecting to the ChEBI Database

The biodbChebi library provides access to the ChEBI Database, using the biodb package framework. It allows entries to be retrieved by their accession number. Web services can be accessed for searching the database by name, mass or other fields.

Maintained by Pierrick Roger. Last updated 5 months ago.

software infrastructure dataimport

2 stars 4.78 score 3 scripts 1 dependents

bioc

MeSHDbi:DBI to construct MeSH-related package from sqlite file

The package is a unified implementation of MeSH.db, MeSH.AOR.db, and MeSH.PCR.db, and is also an interface to construct Gene-MeSH packages (MeSH.XXX.eg.db). loadMeSHDbiPkg imports an sqlite file and generates MeSH.XXX.eg.db.

Maintained by Koki Tsuyuzaki. Last updated 5 months ago.

annotation annotationdata infrastructure

4.76 score 32 scripts 3 dependents

bioc

plyinteractions:Extending tidy verbs to genomic interactions

Operate on `GInteractions` objects as tabular data using `dplyr`-like verbs. The functions and methods in `plyinteractions` provide a grammatical approach to manipulate `GInteractions`, to facilitate their integration in genomic analysis workflows.

Maintained by Jacques Serizay. Last updated 5 months ago.

software infrastructure

4.75 score 14 scripts

bioc

BufferedMatrix:A matrix data storage object held in temporary files

A tabular style data object where most data is stored outside main memory. A buffer is used to speed up access to data.

Maintained by Ben Bolstad. Last updated 3 months ago.

infrastructure

4.73 score 6 scripts 1 dependents

bioc

HoloFoodR:R interface to EBI HoloFood resource

Utility package to facilitate integration and analysis of EBI HoloFood data in R. This package streamlines access to the resource, allowing for direct loading of data into formats optimized for downstream analytics.

Maintained by Tuomas Borman. Last updated 1 months ago.

software infrastructure dataimport microbiome microbiomedata

1 stars 4.70 score 6 scripts

bioc

ChIPseqR:Identifying Protein Binding Sites in High-Throughput Sequencing Data

ChIPseqR identifies protein binding sites from ChIP-seq and nucleosome positioning experiments. The model used to describe binding events was developed to locate nucleosomes but should be flexible enough to handle other types of experiments as well.

Maintained by Peter Humburg. Last updated 5 months ago.

chipseq infrastructure

4.70 score 1 scripts

bioc

DelayedTensor:R package for sparse and out-of-core arithmetic and decomposition of Tensor

DelayedTensor performs Tensor arithmetic directly on DelayedArray objects. DelayedTensor provides some generic functions related to Tensor arithmetic/decomposition and dispatches them on the DelayedArray class. DelayedTensor also supports Tensor contraction via the einsum function, which is inspired by numpy einsum.

Maintained by Koki Tsuyuzaki. Last updated 5 months ago.

software infrastructure datarepresentation dimensionreduction

4 stars 4.68 score 3 scripts

bioc

beachmat.tiledb:beachmat bindings for TileDB-backed matrices

Extends beachmat to initialize tatami matrices from TileDB-backed arrays. This allows C++ code in downstream packages to directly call the TileDB C/C++ library to access array data, without the need for block processing via DelayedArray. Developers only need to import this package to automatically extend the capabilities of beachmat::initializeCpp to TileDBArray instances.

Maintained by Aaron Lun. Last updated 3 months ago.

datarepresentation dataimport infrastructure cpp

4.65 score 4 scripts

bioc

RTCA:Open-source toolkit to analyse data from xCELLigence System (RTCA)

Import, analyze and visualize data from Roche(R) xCELLigence RTCA systems. The package imports real-time cell electrical impedance data into R. As an alternative to the commercial software shipped along with the system, the Bioconductor package RTCA provides several unique transformation (normalization) strategies and various visualization tools.

Maintained by Jitao David Zhang. Last updated 5 months ago.

immunooncology cellbasedassays infrastructure visualization timecourse

4.60 score 4 scripts

bioc

terraTCGAdata:OpenAccess TCGA Data on Terra as MultiAssayExperiment

Leverage the existing open access TCGA data on Terra with well-established Bioconductor infrastructure. Make use of the Terra data model without learning its complexities. With a few functions, you can copy / download and generate a MultiAssayExperiment from the TCGA example workspaces provided by Terra.

Maintained by Marcel Ramos. Last updated 5 months ago.

software infrastructure dataimport bioconductor-package

4.60 score 4 scripts