R-universe search: topic:datarepresentation

bioc

Biostrings:Efficient manipulation of biological strings

Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences.

Maintained by Hervé Pagès. Last updated 1 month ago.

sequencematching alignment sequencing genetics dataimport datarepresentation infrastructure bioconductor-package core-package

62 stars 17.77 score 8.6k scripts 1.2k dependents

bioc

GenomicRanges:Representation and manipulation of genomic intervals

The ability to efficiently represent and manipulate genomic annotations and alignments is playing a central role when it comes to analyzing high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.

Maintained by Hervé Pagès. Last updated 4 months ago.

genetics infrastructure datarepresentation sequencing annotation genomeannotation coverage bioconductor-package core-package

44 stars 17.68 score 13k scripts 1.3k dependents

bioc

GenomeInfoDb:Utilities for manipulating chromosome names, including modifying them to follow a particular naming style

Contains data and functions that define and allow translation between different chromosome sequence naming conventions (e.g., "chr1" versus "1"), including a function that attempts to place sequence names in their natural, rather than lexicographic, order.

Maintained by Hervé Pagès. Last updated 2 months ago.

genetics datarepresentation annotation genomeannotation bioconductor-package core-package

32 stars 16.32 score 1.3k scripts 1.7k dependents
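
The difference between "natural" and lexicographic ordering of sequence names, mentioned in the GenomeInfoDb description above, can be illustrated with a minimal Python sketch. This is not the package's API or its algorithm: the `natural_key` helper and its rule (optional "chr" prefix stripped, numeric names first in numeric order, then the remaining names alphabetically) are simplifying assumptions made for illustration only.

```python
def natural_key(name):
    """Hypothetical sort key: numeric chromosomes first, in numeric order."""
    # Strip an optional "chr" prefix so "chr2" and "2" sort the same way.
    core = name[3:] if name.lower().startswith("chr") else name
    if core.isdigit():
        return (0, int(core), "")
    # Non-numeric names (X, Y, M, ...) follow, in plain alphabetical order.
    return (1, 0, core)


names = ["chr10", "chr2", "chrX", "chr1", "chrM"]

lexicographic = sorted(names)             # character-by-character comparison
natural = sorted(names, key=natural_key)  # numeric-aware comparison

print(lexicographic)
print(natural)
```

In the lexicographic result "chr10" sorts before "chr2" because strings are compared character by character; the numeric-aware key avoids that.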

bioc

IRanges:Foundation of integer range manipulation in Bioconductor

Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure datarepresentation bioconductor-package core-package

22 stars 16.09 score 2.1k scripts 1.8k dependents
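
As a rough illustration of the kind of range operations the IRanges description refers to, the following Python sketch finds overlaps between two sets of closed integer ranges and merges a set of ranges. It is a naive quadratic version written for clarity, not the efficient algorithms the package implements, and the function names `find_overlaps` and `reduce_ranges` are hypothetical. Treating adjacent ranges as mergeable is an assumption of this sketch.

```python
def find_overlaps(query, subject):
    """Return (query_index, subject_index) pairs of overlapping closed ranges."""
    hits = []
    for qi, (q_start, q_end) in enumerate(query):
        for si, (s_start, s_end) in enumerate(subject):
            # Two closed intervals overlap when each starts before the other ends.
            if q_start <= s_end and s_start <= q_end:
                hits.append((qi, si))
    return hits


def reduce_ranges(ranges):
    """Merge overlapping or adjacent closed ranges into a minimal sorted set."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # Extend the previous range instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


hits = find_overlaps([(1, 5), (10, 12)], [(4, 8), (13, 15)])
merged = reduce_ranges([(1, 5), (4, 8), (9, 10), (20, 25)])
print(hits)
print(merged)
```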

bioc

S4Vectors:Foundation of vector-like and list-like containers in Bioconductor

The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantics of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure datarepresentation bioconductor-package core-package

18 stars 16.05 score 1.0k scripts 1.9k dependents
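
Rle, one of the classes named in the S4Vectors description, stands for run-length encoding. The idea can be sketched in a few lines of Python. This shows only the encoding itself, under the assumption that runs are stored as (value, length) pairs; it says nothing about how the S4Vectors class is implemented, and the helper names are hypothetical.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the full sequence."""
    return [value for value, length in runs for _ in range(length)]


coverage = [0, 0, 0, 2, 2, 5, 0, 0]
runs = rle_encode(coverage)
print(runs)
print(rle_decode(runs) == coverage)
```

The representation pays off when the data contain long runs of identical values, since each run costs one pair regardless of its length.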

bioc

DelayedArray:A unified framework for working transparently with on-disk and in-memory array-like datasets

Wrapping an array-like object (typically an on-disk object) in a DelayedArray object allows one to perform common array operations on it without loading the object in memory. In order to reduce memory usage and optimize performance, operations on the object are either delayed or executed using a block processing mechanism. Note that this also works on in-memory array-like objects like DataFrame objects (typically with Rle columns), Matrix objects, ordinary arrays, and data frames.

Maintained by Hervé Pagès. Last updated 1 month ago.

infrastructure datarepresentation annotation genomeannotation bioconductor-package core-package u24ca289073

27 stars 15.59 score 538 scripts 1.2k dependents

bioc

MultiAssayExperiment:Software for the integration of multi-omics experiments in Bioconductor

Harmonize data management of multiple experimental assays performed on an overlapping set of specimens. It provides a familiar Bioconductor user experience by extending concepts from SummarizedExperiment, supporting an open-ended mix of standard data classes for individual assays, and allowing subsetting by genomic ranges or rownames. Facilities are provided for reshaping data into wide and long formats for adaptability to graphing and downstream analysis.

Maintained by Marcel Ramos. Last updated 2 months ago.

infrastructure datarepresentation bioconductor bioconductor-package genomics nci-itcr tcga u24ca289073

71 stars 14.95 score 670 scripts 127 dependents

bioc

maftools:Summarize, Analyze and Visualize MAF Files

Analyze and visualize Mutation Annotation Format (MAF) files from large-scale sequencing studies. This package provides various functions to perform the most commonly used analyses in cancer genomics and to create feature-rich, customizable visualizations with minimal effort.

Maintained by Anand Mayakonda. Last updated 5 months ago.

datarepresentation dnaseq visualization drivermutation variantannotation featureextraction classification somaticmutation sequencing functionalgenomics survival bioinformatics cancer-genome-atlas cancer-genomics genomics maf-files tcga curl bzip2 xz-utils zlib

459 stars 14.63 score 948 scripts 18 dependents

bioc

BSgenome:Software infrastructure for efficient representation of full genomes and their SNPs

Infrastructure shared by all the Biostrings-based genome data packages.

Maintained by Hervé Pagès. Last updated 2 months ago.

genetics infrastructure datarepresentation sequencematching annotation snp bioconductor-package core-package

9 stars 14.12 score 1.2k scripts 267 dependents

bioc

SingleCellExperiment:S4 Classes for Single Cell Data

Defines an S4 class for storing data from single-cell experiments. This includes specialized methods to store and retrieve spike-in information, dimensionality reduction coordinates and size factors for each cell, along with the usual metadata for genes and libraries.

Maintained by Davide Risso. Last updated 22 days ago.

immunooncology datarepresentation dataimport infrastructure singlecell

13.53 score 15k scripts 285 dependents

bioc

HDF5Array:HDF5 datasets as array-like objects in R

The HDF5Array package is an HDF5 backend for DelayedArray objects. It implements the HDF5Array, H5SparseMatrix, H5ADMatrix, and TENxMatrix classes, 4 convenient and memory-efficient array-like containers for representing and manipulating either: (1) a conventional (a.k.a. dense) HDF5 dataset, (2) an HDF5 sparse matrix (stored in CSR/CSC/Yale format), (3) the central matrix of an h5ad file (or any matrix in the /layers group), or (4) a 10x Genomics sparse matrix. All these containers are DelayedArray extensions and thus support all operations (delayed or block-processed) supported by DelayedArray objects.

Maintained by Hervé Pagès. Last updated 10 days ago.

infrastructure datarepresentation dataimport sequencing rnaseq coverage annotation genomeannotation singlecell immunooncology bioconductor-package core-package u24ca289073

12 stars 13.20 score 844 scripts 126 dependents

bioc

plyranges:A fluent interface for manipulating GenomicRanges

A dplyr-like interface for interacting with the common Bioconductor classes Ranges and GenomicRanges. Providing a grammatical and consistent way of manipulating these classes is intended to make them more accessible to new Bioconductor users.

Maintained by Michael Love. Last updated 10 days ago.

infrastructure datarepresentation workflowstep coverage bioconductor data-analysis dplyr genomic-ranges genomics tidy-data

144 stars 12.66 score 1.9k scripts 20 dependents

bioc

SpatialExperiment:S4 Class for Spatially Resolved -omics Data

Defines an S4 class for storing data from spatial -omics experiments. The class extends SingleCellExperiment to support storage and retrieval of additional information from spot-based and molecule-based platforms, including spatial coordinates, images, and image metadata. A specialized constructor function is included for data from the 10x Genomics Visium platform.

Maintained by Dario Righelli. Last updated 5 months ago.

datarepresentation dataimport infrastructure immunooncology geneexpression transcriptomics singlecell spatial

59 stars 12.63 score 1.8k scripts 71 dependents

bioc

SparseArray:High-performance sparse data representation and manipulation in R

The SparseArray package provides array-like containers for efficient in-memory representation of multidimensional sparse data in R (arrays and matrices). The package defines the SparseArray virtual class and two concrete subclasses: COO_SparseArray and SVT_SparseArray. Each subclass uses its own internal representation of the nonzero multidimensional data: the "COO layout" and the "SVT layout", respectively. SVT_SparseArray objects mimic as much as possible the behavior of ordinary matrix and array objects in base R. In particular, they support most of the "standard matrix and array API" defined in base R and in the matrixStats package from CRAN.

Maintained by Hervé Pagès. Last updated 11 days ago.

infrastructure datarepresentation bioconductor-package core-package openmp

9 stars 12.47 score 79 scripts 1.2k dependents
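
The "COO layout" named in the SparseArray description is commonly understood as a coordinate list: only the nonzero values are stored, together with their positions. The Python sketch below illustrates that general idea for a 2-D matrix. It is an assumption-laden illustration, not the internal representation used by COO_SparseArray, and the helpers `to_coo` and `from_coo` are hypothetical names.

```python
def to_coo(dense):
    """Convert a list-of-lists matrix to (shape, coords, values)."""
    coords, values = [], []
    for i, row in enumerate(dense):
        for j, value in enumerate(row):
            if value != 0:
                # Keep only nonzero entries, remembering where they were.
                coords.append((i, j))
                values.append(value)
    return (len(dense), len(dense[0])), coords, values


def from_coo(shape, coords, values):
    """Rebuild the dense matrix from its coordinate-list form."""
    nrow, ncol = shape
    dense = [[0] * ncol for _ in range(nrow)]
    for (i, j), value in zip(coords, values):
        dense[i][j] = value
    return dense


matrix = [[0, 0, 3], [4, 0, 0]]
shape, coords, values = to_coo(matrix)
print(shape, coords, values)
print(from_coo(shape, coords, values) == matrix)
```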

ropensci

treeio:Base Classes and Functions for Phylogenetic Tree Input and Output

'treeio' is an R package that makes it easier to import and store phylogenetic trees with associated data, and to link external data from different sources to a phylogeny. It also supports exporting a phylogenetic tree with heterogeneous associated data to a single tree file, and can serve as a platform for merging trees with associated data and converting file formats.

Maintained by Guangchuang Yu. Last updated 5 months ago.

software annotation clustering dataimport datarepresentation alignment multiplesequencealignment phylogenetics exporter parser phylogenetic-trees

102 stars 12.46 score 1.3k scripts 122 dependents

bioc

SeqArray:Data management of large-scale whole-genome sequence variant calls using GDS files

Data management of large-scale whole-genome sequencing variant calls with thousands of individuals: genotypic data (e.g., SNVs, indels and structural variation calls) and annotations in SeqArray GDS files are stored in an array-oriented and compressed manner, with efficient data access using the R programming language.

Maintained by Xiuwen Zheng. Last updated 6 days ago.

infrastructure datarepresentation sequencing genetics bioinformatics gds-format snp snv wes wgs cpp

45 stars 12.11 score 1.1k scripts 9 dependents

bioc

sparseMatrixStats:Summary Statistics for Rows and Columns of Sparse Matrices

High performance functions for row and column operations on sparse matrices. For example: col / rowMeans2, col / rowMedians, col / rowVars etc. Currently, the optimizations are limited to data in the column sparse format. This package is inspired by the matrixStats package by Henrik Bengtsson.

Maintained by Constantin Ahlmann-Eltze. Last updated 5 months ago.

infrastructure software datarepresentation cpp

54 stars 11.98 score 174 scripts 130 dependents

bioc

DelayedMatrixStats:Functions that Apply to Rows and Columns of 'DelayedMatrix' Objects

A port of the 'matrixStats' API for use with DelayedMatrix objects from the 'DelayedArray' package. High-performing functions operating on rows and columns of DelayedMatrix objects, e.g. col / rowMedians(), col / rowRanks(), and col / rowSds(). Functions are optimized per data type and for subsetted calculations such that both memory usage and processing time are minimized.

Maintained by Peter Hickey. Last updated 3 months ago.

infrastructure datarepresentation software

16 stars 11.86 score 211 scripts 112 dependents

bioc

XVector:Foundation of external vector representation and manipulation in Bioconductor

Provides memory efficient S4 classes for storing sequences "externally" (e.g. behind an R external pointer, or on disk).

Maintained by Hervé Pagès. Last updated 3 months ago.

infrastructure datarepresentation bioconductor-package core-package zlib

2 stars 11.36 score 67 scripts 1.7k dependents

bioc

zellkonverter:Conversion Between scRNA-seq Objects

Provides methods to convert between Python AnnData objects and SingleCellExperiment objects. These are primarily intended for use by downstream Bioconductor packages that wrap Python methods for single-cell data analysis. It also includes functions to read and write H5AD files used for saving AnnData objects to disk.

Maintained by Luke Zappia. Last updated 20 days ago.

singlecell dataimport datarepresentation bioconductor conversion scrna-seq

159 stars 11.25 score 660 scripts 4 dependents

bioc

beachmat:Compiling Bioconductor to Handle Each Matrix Type

Provides a consistent C++ class interface for reading from a variety of commonly used matrix types. Ordinary matrices and several sparse/dense Matrix classes are directly supported, along with a subset of the delayed operations implemented in the DelayedArray package. All other matrix-like objects are supported by calling back into R.

Maintained by Aaron Lun. Last updated 16 days ago.

datarepresentation dataimport infrastructure bioconductor-package human-cell-atlas matrix-library cpp

4 stars 11.09 score 21 scripts 142 dependents

bioc

scater:Single-Cell Analysis Toolkit for Gene Expression Data in R

A collection of tools for doing various analyses of single-cell RNA-seq gene expression data, with a focus on quality control and visualization.

Maintained by Alan OCallaghan. Last updated 22 days ago.

immunooncology singlecell rnaseq qualitycontrol preprocessing normalization visualization dimensionreduction transcriptomics geneexpression sequencing software dataimport datarepresentation infrastructure coverage

11.07 score 12k scripts 43 dependents

bioc

S4Arrays:Foundation of array-like containers in Bioconductor

The S4Arrays package defines the Array virtual class to be extended by other S4 classes that wish to implement a container with array-like semantics. It also provides: (1) low-level functionality meant to help developers of such containers implement basic operations like display, subsetting, or coercion of their array-like objects to an ordinary matrix or array, and (2) a framework that facilitates block processing of array-like objects (typically on-disk objects).

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure datarepresentation bioconductor-package core-package

5 stars 10.99 score 8 scripts 1.2k dependents

bioc

SC3:Single-Cell Consensus Clustering

A tool for unsupervised clustering and analysis of single cell RNA-Seq data.

Maintained by Vladimir Kiselev. Last updated 5 months ago.

immunooncology singlecell software classification clustering dimensionreduction supportvectormachine rnaseq visualization transcriptomics datarepresentation gui differentialexpression transcription bioconductor-package human-cell-atlas single-cell-rna-seq openblas cpp

125 stars 10.10 score 374 scripts 1 dependents

bioc

OmnipathR:OmniPath web service client and more

A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on github).

Maintained by Denes Turei. Last updated 1 month ago.

graphandnetwork network pathways software thirdpartyclient dataimport datarepresentation genesignaling generegulation systemsbiology transcriptomics singlecell annotation kegg complexes enzyme-ptm networks networks-biology omnipath proteins quarto

130 stars 9.90 score 226 scripts 2 dependents

bioc

GenVisR:Genomic Visualizations in R

Produce highly customizable publication quality graphics for genomic data primarily at the cohort level.

Maintained by Zachary Skidmore. Last updated 5 months ago.

infrastructure datarepresentation classification dnaseq

217 stars 9.87 score 76 scripts

bioc

matter:Out-of-core statistical computing and signal processing

Toolbox for larger-than-memory scientific computing and visualization, providing efficient out-of-core data structures using files or shared memory, for dense and sparse vectors, matrices, and arrays, with applications to nonuniformly sampled signals and images.

Maintained by Kylie A. Bemis. Last updated 4 months ago.

infrastructure datarepresentation dataimport dimensionreduction preprocessing cpp

57 stars 9.52 score 64 scripts 2 dependents

bioc

SpatialFeatureExperiment:Integrating SpatialExperiment with Simple Features in sf

A new S4 class integrating Simple Features with the R package sf to bring geospatial data analysis methods based on vector data to spatial transcriptomics. Also implements management of spatial neighborhood graphs and geometric operations. This package builds upon SpatialExperiment and SingleCellExperiment, hence methods for these parent classes can still be used.

Maintained by Lambda Moses. Last updated 2 months ago.

datarepresentation transcriptomics spatial

49 stars 9.40 score 322 scripts 1 dependents

bioc

GenomicInteractions:Utilities for handling genomic interaction data

Utilities for handling genomic interaction data such as ChIA-PET or Hi-C, annotating genomic features with interaction information, and producing plots and summary statistics.

Maintained by Liz Ing-Simmons. Last updated 5 months ago.

software infrastructure dataimport datarepresentation hic

7 stars 9.31 score 162 scripts 5 dependents

bioc

RaggedExperiment:Representation of Sparse Experiments and Assays Across Samples

This package provides a flexible representation of copy number, mutation, and other data that fit into the ragged array schema for genomic location data. The basic representation of such data provides a rectangular flat table interface to the user with range information in the rows and samples/specimen in the columns. The RaggedExperiment class derives from a GRangesList representation and provides a semblance of a rectangular dataset.

Maintained by Marcel Ramos. Last updated 4 months ago.

infrastructure datarepresentation copynumber core-package data-structure mutations u24ca289073

4 stars 8.93 score 76 scripts 14 dependents

bioc

RTCGA:The Cancer Genome Atlas Data Integration

The Cancer Genome Atlas (TCGA) Data Portal provides a platform for researchers to search, download, and analyze data sets generated by TCGA. It contains clinical information, genomic characterization data, and high level sequence analysis of the tumor genomes. The key is to understand genomics to improve cancer care. The RTCGA package offers download and integration of the variety and volume of TCGA data using the patient barcode as a key, which makes the data easier to obtain. This may have a beneficial influence on the development of science and the improvement of patients' treatment. Furthermore, the RTCGA package transforms TCGA data into a tidy form that is convenient to use.

Maintained by Marcin Kosinski. Last updated 5 months ago.

immunooncology software dataimport datarepresentation preprocessing rnaseq survival dnamethylation principalcomponent visualization

51 stars 8.91 score 106 scripts 1 dependents

bioc

assorthead:Assorted Header-Only C++ Libraries

Vendors an assortment of useful header-only C++ libraries. Bioconductor packages can use these libraries in their own C++ code by LinkingTo this package without introducing any additional dependencies. The use of a central repository avoids duplicate vendoring of libraries across multiple R packages, and enables better coordination of version updates across cohorts of interdependent C++ libraries.

Maintained by Aaron Lun. Last updated 26 days ago.

singlecell qualitycontrol normalization datarepresentation dataimport differentialexpression alignment

8.89 score 167 dependents

bioc

cmapR:CMap Tools in R

The Connectivity Map (CMap) is a massive resource of perturbational gene expression profiles built by researchers at the Broad Institute and funded by the NIH Library of Integrated Network-Based Cellular Signatures (LINCS) program. Please visit https://clue.io for more information. The cmapR package implements methods to parse, manipulate, and write common CMap data objects, such as annotated matrices and collections of gene sets.

Maintained by Ted Natoli. Last updated 5 months ago.

dataimport datarepresentation geneexpression bioconductor bioinformatics cmap

90 stars 8.86 score 298 scripts

bioc

scmap:A tool for unsupervised projection of single cell RNA-seq data

Single-cell RNA-seq (scRNA-seq) is widely used to investigate the composition of complex tissues since the technology allows researchers to define cell-types using unsupervised clustering of the transcriptome. However, due to differences in experimental methods and computational analyses, it is often challenging to directly compare the cells identified in two different experiments. scmap is a method for projecting cells from a scRNA-seq experiment on to the cell-types or individual cells identified in a different experiment.

Maintained by Vladimir Kiselev. Last updated 5 months ago.

immunooncology singlecell software classification supportvectormachine rnaseq visualization transcriptomics datarepresentation transcription sequencing preprocessing geneexpression dataimport bioconductor-package human-cell-atlas projection-mapping single-cell-rna-seq openblas cpp

95 stars 8.82 score 172 scripts

bioc

monocle:Clustering, differential expression, and trajectory analysis for single-cell RNA-Seq

Monocle performs differential expression and time-series analysis for single-cell expression experiments. It orders individual cells according to progress through a biological process, without knowing ahead of time which genes define progress through that process. Monocle also performs differential expression analysis, clustering, visualization, and other useful tasks on single cell expression data. It is designed to work with RNA-Seq and qPCR data, but could be used with other types as well.

Maintained by Cole Trapnell. Last updated 5 months ago.

immunooncology sequencing rnaseq geneexpression differentialexpression infrastructure dataimport datarepresentation visualization clustering multiplecomparison qualitycontrol cpp

8.71 score 1.6k scripts 2 dependents

bioc

alabaster.base:Save Bioconductor Objects to File

Save Bioconductor data structures into file artifacts, and load them back into memory. This is a more robust and portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.

Maintained by Aaron Lun. Last updated 24 days ago.

datarepresentation dataimport zlib cpp

3 stars 8.47 score 60 scripts 15 dependents

bioc

ScaledMatrix:Creating a DelayedMatrix of Scaled and Centered Values

Provides delayed computation of a matrix of scaled and centered values. The result is equivalent to using the scale() function but avoids explicit realization of a dense matrix during block processing. This permits greater efficiency in common operations, most notably matrix multiplication.

Maintained by Aaron Lun. Last updated 2 months ago.

software datarepresentation

8.44 score 10 scripts 105 dependents
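
The efficiency claim in the ScaledMatrix description can be motivated with a little algebra: multiplying a centered and scaled matrix by a vector does not require building the scaled matrix, because the centering and scaling can be folded into the vector. The Python sketch below shows that identity for a matrix-vector product. It is only an illustration of the general idea, not the package's implementation; the function names are hypothetical, and the per-column `center` and `scale` values are taken as given inputs.

```python
def scaled_matvec(x, center, scale, v):
    """Compute ((x - center) / scale) @ v without forming the scaled matrix."""
    # Fold the per-column scaling into the vector once.
    w = [vj / sj for vj, sj in zip(v, scale)]
    # The centering contributes the same constant to every output row.
    offset = sum(cj * wj for cj, wj in zip(center, w))
    return [sum(xij * wj for xij, wj in zip(row, w)) - offset for row in x]


def explicit_matvec(x, center, scale, v):
    """Reference version that materializes the scaled matrix first."""
    scaled = [[(xij - cj) / sj for xij, cj, sj in zip(row, center, scale)]
              for row in x]
    return [sum(a * b for a, b in zip(row, v)) for row in scaled]


x = [[1.0, 2.0], [3.0, 6.0]]
center = [2.0, 4.0]
scale = [1.0, 2.0]
v = [1.0, 1.0]

delayed = scaled_matvec(x, center, scale, v)
reference = explicit_matvec(x, center, scale, v)
print(delayed, reference)
```

If `x` is sparse, the delayed version only ever touches `x` itself, so the sparsity is preserved, whereas the explicit version produces a dense intermediate.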

bioc

openCyto:Hierarchical Gating Pipeline for flow cytometry data

This package is designed to facilitate applying automated gating methods in a sequential way to mimic the manual gating strategy.

Maintained by Mike Jiang. Last updated 3 days ago.

immunooncology flowcytometry dataimport preprocessing datarepresentation cpp

8.02 score 404 scripts 1 dependents

bioc

InteractionSet:Base Classes for Storing Genomic Interaction Data

Provides the GInteractions, InteractionSet and ContactMatrix objects and associated methods for storing and manipulating genomic interaction data from Hi-C and ChIA-PET experiments.

Maintained by Aaron Lun. Last updated 5 months ago.

infrastructure datarepresentation software hic cpp

7.95 score 250 scripts 36 dependents

bioc

flowWorkspace:Infrastructure for representing and interacting with gated and ungated cytometry data sets.

This package is designed to facilitate comparison of automated gating methods against manual gating done in flowJo. This package allows you to import basic flowJo workspaces into Bioconductor and replicate the gating from flowJo using the flowCore functionality. Gating hierarchies, groups of samples, compensation, and transformation are performed so that the output matches the flowJo analysis.

Maintained by Greg Finak. Last updated 22 days ago.

immunooncology flowcytometry dataimport preprocessing datarepresentation zlib openblas cpp

7.89 score 576 scripts 10 dependents

bioc

TreeSummarizedExperiment:TreeSummarizedExperiment: a S4 Class for Data with Tree Structures

TreeSummarizedExperiment has extended SingleCellExperiment to include hierarchical information on the rows or columns of the rectangular data.

Maintained by Ruizhu Huang. Last updated 5 months ago.

datarepresentation infrastructure

7.87 score 251 scripts 15 dependents

bioc

Rcpi:Molecular Informatics Toolkit for Compound-Protein Interaction in Drug Discovery

A molecular informatics toolkit with an integration of bioinformatics and chemoinformatics tools for drug discovery.

Maintained by Nan Xiao. Last updated 5 months ago.

software dataimport datarepresentation featureextraction cheminformatics biomedicalinformatics proteomics go systemsbiology bioconductor bioinformatics drug-discovery feature-extraction fingerprint molecular-descriptors protein-sequences

37 stars 7.81 score 29 scripts

bioc

PhyloProfile:PhyloProfile

PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights, such as gene age estimation or core gene identification.

Maintained by Vinh Tran. Last updated 8 days ago.

software visualization datarepresentation multiplecomparison functionalprediction dimensionreduction bioinformatics heatmap interactive-visualizations orthologs phylogenetic-profile shiny

33 stars 7.79 score 10 scripts

bioc

phantasus:Visual and interactive gene expression analysis

Phantasus is a web-application for visual and interactive gene expression analysis. Phantasus is based on Morpheus – a web-based software for heatmap visualisation and analysis, which was integrated with an R environment via OpenCPU API. Aside from basic visualization and filtering methods, R-based methods such as k-means clustering, principal component analysis or differential expression analysis with limma package are supported.

Maintained by Alexey Sergushichev. Last updated 5 months ago.

geneexpression gui visualization datarepresentation transcriptomics rnaseq microarray normalization clustering differentialexpression principalcomponent immunooncology

43 stars 7.68 score 15 scripts

bioc

CytoML:A GatingML Interface for Cross Platform Cytometry Data Sharing

Uses platform-specific implementations of the GatingML2.0 standard to exchange gated cytometry data with other software platforms.

Maintained by Mike Jiang. Last updated 22 days ago.

immunooncology flowcytometry dataimport datarepresentation zlib openblas libxml2 cpp

30 stars 7.60 score 132 scripts

bioc

CAGEfightR:Analysis of Cap Analysis of Gene Expression (CAGE) data using Bioconductor

CAGE is a widely used high throughput assay for measuring transcription start site (TSS) activity. CAGEfightR is an R/Bioconductor package for performing a wide range of common data analysis tasks for CAGE and 5'-end data in general. Core functionality includes: import of CAGE TSSs (CTSSs), tag (or unidirectional) clustering for TSS identification, bidirectional clustering for enhancer identification, annotation with transcript and gene models, correlation of TSS and enhancer expression, calculation of TSS shapes, quantification of CAGE expression as expression matrices and genome browser visualization.

Maintained by Malte Thodberg. Last updated 5 months ago.

software transcription coverage geneexpression generegulation peakdetection dataimport datarepresentation transcriptomics sequencing annotation genomebrowsers normalization preprocessing visualization

8 stars 7.46 score 67 scripts 1 dependents

bioc

GenomicDistributions:GenomicDistributions: fast analysis of genomic intervals with Bioconductor

If you have a set of genomic ranges, this package can help you with visualization and comparison. It produces several kinds of plots, for example: chromosome distribution plots, which visualize how your regions are distributed over chromosomes; feature distance distribution plots, which visualize how your regions are distributed relative to a feature of interest, like Transcription Start Sites (TSSs); genomic partition plots, which visualize how your regions overlap given genomic features such as promoters, introns, exons, or intergenic regions. It also makes it easy to compare one set of ranges to another.

Maintained by Kristyna Kupkova. Last updated 5 months ago.

software genomeannotation genomeassembly datarepresentation sequencing coverage functionalgenomics visualization

26 stars 7.44 score 25 scripts

bioc

cytolib:C++ infrastructure for representing and interacting with the gated cytometry data

This package provides the core data structure and API to represent and interact with the gated cytometry data.

Maintained by Mike Jiang. Last updated 2 months ago.

immunooncology flowcytometry dataimport preprocessing datarepresentation

7.39 score 7 scripts 60 dependents

bioc

cogena:co-expressed gene-set enrichment analysis

cogena is a workflow for co-expressed gene-set enrichment analysis. It aims to discover smaller scale, but highly correlated cellular events that may be of great biological relevance. A novel pipeline for drug discovery and drug repositioning based on the cogena workflow is proposed. Particularly, candidate drugs can be predicted based on the gene expression of disease-related data, or other similar drugs can be identified based on the gene expression of drug-related data. Moreover, the drug mode of action can be disclosed by the associated pathway analysis. In summary, cogena is a flexible workflow for various gene set enrichment analyses for co-expressed genes, with a focus on pathway/GO analysis and drug repositioning.

Maintained by Zhilong Jia. Last updated 5 months ago.

clustering genesetenrichment geneexpression visualization pathways kegg go microarray sequencing systemsbiology datarepresentation dataimport bioconductor bioinformatics

12 stars 7.36 score 32 scripts

bioc

DEP:Differential Enrichment analysis of Proteomics data

This package provides an integrated analysis workflow for robust and reproducible analysis of mass spectrometry proteomics data for differential protein expression or differential enrichment. It requires tabular input (e.g. txt files) as generated by quantitative analysis software for raw mass spectrometry data, such as MaxQuant or IsobarQuant. Functions are provided for data preparation, filtering, variance normalization and imputation of missing values, as well as statistical testing of differentially enriched / expressed proteins. It also includes tools to check intermediate steps in the workflow, such as normalization and missing value imputation. Finally, visualization tools are provided to explore the results, including heatmap, volcano plot and barplot representations. For scientists with limited experience in R, the package also contains wrapper functions that entail the complete analysis workflow and generate a report. Even easier to use are the interactive Shiny apps that are provided by the package.

Maintained by Arne Smits. Last updated 5 months ago.

immunooncology proteomics massspectrometry differentialexpression datarepresentation

7.10 score 628 scripts

bioc

alabaster.matrix:Load and Save Artifacts from File

Save matrices, arrays and similar objects into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.

Maintained by Aaron Lun. Last updated 24 days ago.

dataimport datarepresentation cpp

7.05 score 15 scripts 8 dependents

bioc

pipeComp:pipeComp pipeline benchmarking framework

A simple framework to facilitate the comparison of pipelines involving various steps and parameters. The `pipelineDefinition` class represents pipelines as, minimally, a set of functions consecutively executed on the output of the previous one, and optionally accompanied by step-wise evaluation and aggregation functions. Given such an object, a set of alternative parameters/methods, and benchmark datasets, the `runPipeline` function then proceeds through all combinations of arguments, avoiding recomputing the same step twice and compiling evaluations on the fly to avoid storing potentially large intermediate data.

Maintained by Pierre-Luc Germain. Last updated 5 months ago.

geneexpression transcriptomics clustering datarepresentation benchmark bioconductor pipeline-benchmarking pipelines single-cell-rna-seq

41 stars 7.02 score 43 scripts

bioc

h5mread:A fast HDF5 reader

The main function in the h5mread package is h5mread(), which allows reading arbitrary data from an HDF5 dataset into R, similarly to what the h5read() function from the rhdf5 package does. In the case of h5mread(), the implementation has been optimized to make it as fast and memory-efficient as possible.

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure datarepresentation dataimport openssl curl zlib

1 stars 6.98 score 4 scripts 127 dependents

bioc

TileDBArray:Using TileDB as a DelayedArray Backend

Implements a DelayedArray backend for reading and writing dense or sparse arrays in the TileDB format. The resulting TileDBArrays are compatible with all Bioconductor pipelines that can accept DelayedArray instances.

Maintained by Aaron Lun. Last updated 5 months ago.

datarepresentation infrastructure software

10 stars 6.89 score 26 scripts 1 dependents

bioc

ResidualMatrix:Creating a DelayedMatrix of Regression Residuals

Provides delayed computation of a matrix of residuals after fitting a linear model to each column of an input matrix. Also supports partial computation of residuals where selected factors are to be preserved in the output matrix. Implements a number of efficient methods for operating on the delayed matrix of residuals, most notably matrix multiplication and calculation of row/column sums or means.

Maintained by Aaron Lun. Last updated 3 months ago.

software datarepresentation regression batcheffect experimentaldesign

1 stars 6.83 score 6 scripts 10 dependents

bioc

GDSArray:Representing GDS files as array-like objects

GDS files are widely used to represent genotyping or sequence data. The GDSArray package implements the `GDSArray` class to represent nodes in GDS files in a matrix-like representation that allows easy manipulation (e.g., subsetting, mathematical transformation) in _R_. The data remains on disk until needed, so that very large files can be processed.

Maintained by Xiuwen Zheng. Last updated 10 days ago.

infrastructure datarepresentation sequencing genotypingarray

5 stars 6.78 score 8 scripts 2 dependents

bioc

VennDetail:A package for visualization and extract details

Provides a set of functions to generate high-resolution Venn and Vennpie plots, and to extract and combine details of these subsets with user datasets in data frames.

Maintained by Kai Guo. Last updated 5 months ago.

datarepresentation graphandnetwork extract venndiagram

29 stars 6.75 score 65 scripts

bioc

GOexpress:Visualise microarray and RNAseq data using gene ontology annotations

The package contains methods to visualise the expression profile of genes from a microarray or RNA-seq experiment, and offers a supervised clustering approach to identify GO terms containing genes with expression levels that best classify two or more predefined groups of samples. Annotations for the genes present in the expression dataset may be obtained from Ensembl through the biomaRt package, if not provided by the user. The default random forest framework is used to evaluate the capacity of each gene to cluster samples according to the factor of interest. Finally, GO terms are scored by averaging the rank (alternatively, score) of their respective gene sets to cluster the samples. P-values may be computed to assess the significance of GO term ranking. Visualisation functions include gene expression profiles, gene ontology-based heatmaps, and hierarchical clustering of experimental samples using gene expression data.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

software geneexpression transcription differentialexpression genesetenrichment datarepresentation clustering timecourse microarray sequencing rnaseq annotation multiplecomparison pathways go visualization immunooncology bioconductor bioconductor-package bioconductor-stats geneontology geneset-enrichment

9 stars 6.75 score 31 scripts

bioc

Modstrings:Working with modified nucleotide sequences

Representing nucleotide modifications in a nucleotide sequence is usually done via special characters from a number of sources. This represents a challenge to work with in R and the Biostrings package. The Modstrings package implements this functionality for RNA and DNA sequences containing modified nucleotides by translating the characters internally in order to work with the infrastructure of the Biostrings package. For this, the ModRNAString and ModDNAString classes and derivatives, and functions to construct and modify these objects despite the encoding issues, are implemented. In addition, the conversion from sequences to list-like location information (and the reverse operation) is implemented as well.

Maintained by Felix G.M. Ernst. Last updated 5 months ago.

dataimport datarepresentation infrastructure sequencing software bioconductor biostrings dna dna-modifications modified-nucleotides nucleotides rna rna-modification-alphabet rna-modifications sequences

1 stars 6.64 score 5 scripts 8 dependents

bioc

BumpyMatrix:Bumpy Matrix of Non-Scalar Objects

Implements the BumpyMatrix class and several subclasses for holding non-scalar objects in each entry of the matrix. This is akin to a ragged array but the raggedness is in the third dimension, much like a bumpy surface - hence the name. Of particular interest is the BumpyDataFrameMatrix, where each entry is a Bioconductor data frame. This allows us to naturally represent multivariate data in a format that is compatible with two-dimensional containers like the SummarizedExperiment and MultiAssayExperiment objects.

Maintained by Aaron Lun. Last updated 3 months ago.

software infrastructure datarepresentation

1 stars 6.62 score 39 scripts 12 dependents

bioc

SingleMoleculeFootprinting:Analysis tools for Single Molecule Footprinting (SMF) data

SingleMoleculeFootprinting provides functions to analyze Single Molecule Footprinting (SMF) data. Following the workflow exemplified in its vignette, the user will be able to perform basic data analysis of SMF data with minimal coding effort. Starting from an aligned BAM file, we show how to perform quality controls over sequencing libraries, extract methylation information at the single molecule level accounting for the two possible kinds of SMF experiments (single enzyme or double enzyme), classify single molecules based on their patterns of molecular occupancy, and plot SMF information at a given genomic location.

Maintained by Guido Barzaghi. Last updated 4 days ago.

dnamethylation coverage nucleosomepositioning datarepresentation epigenetics methylseq qualitycontrol sequencing

2 stars 6.46 score 27 scripts

bioc

Structstrings:Implementation of the dot bracket annotations with Biostrings

The Structstrings package implements the widely used dot bracket annotation for storing base pairing information in structured RNA. Structstrings uses the infrastructure provided by the Biostrings package and derives the DotBracketString and related classes from the BString class. From these, base pair tables can be produced for in-depth analysis. In addition, the loop indices of the base pairs can be retrieved as well. For better efficiency, information conversion is implemented in C, inspired to a large extent by the ViennaRNA package.
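
As an illustration of the dot bracket annotation itself, here is a minimal Python sketch that turns a dot bracket string into a list of base pairs with a stack. It is not Structstrings code: the function name `pair_table` is hypothetical, and the sketch assumes only the characters `(`, `)` and `.` are used.

```python
def pair_table(dot_bracket):
    """Return sorted 1-based (opening, closing) positions of paired bases.

    Hypothetical helper for illustration; handles only '(', ')' and '.'.
    """
    stack = []   # positions of '(' still waiting for a partner
    pairs = []
    for pos, char in enumerate(dot_bracket, start=1):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            # The most recent unmatched '(' pairs with this ')'.
            pairs.append((stack.pop(), pos))
        elif char != ".":
            raise ValueError(f"unexpected character {char!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


print(pair_table("((..))."))
```

Unpaired positions (the dots) simply do not appear in the result, so the pair list alone is enough to recover which bases are paired.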

Maintained by Felix G.M. Ernst. Last updated 5 months ago.

dataimport datarepresentation infrastructure sequencing software alignment sequencematching bioconductor rna rna-structural-analysis rna-structure sequences structures

4 stars 6.46 score 3 scripts 4 dependents

bioc

MultiDataSet:Implementation of MultiDataSet and ResultSet

Implementation of the BRGE's (Bioinformatic Research Group in Epidemiology from Center for Research in Environmental Epidemiology) MultiDataSet and ResultSet. MultiDataSet is designed for integrating multi-omics data sets and ResultSet is a container for omics results. This package contains base classes for the MEAL and rexposome packages.

Maintained by Xavier Escribà Montagut. Last updated 5 months ago.

software datarepresentation

6.45 score 28 scripts 10 dependents

bioc

MoleculeExperiment:Prioritising a molecule-level storage of Spatial Transcriptomics Data

MoleculeExperiment contains functions to create and work with objects from the new MoleculeExperiment class. We introduce this class for analysing molecule-based spatial transcriptomics data (e.g., Xenium by 10X, Cosmx SMI by Nanostring, and Merscope by Vizgen). This allows researchers to analyse spatial transcriptomics data at the molecule level, and to have standardised data formats across vendors.

Maintained by Shila Ghazanfar. Last updated 5 months ago.

dataimport datarepresentation infrastructure software spatial transcriptomics

12 stars 6.45 score 39 scripts

bioc

alabaster.se:Load and Save SummarizedExperiments from File

Save SummarizedExperiments into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.

Maintained by Aaron Lun. Last updated 5 months ago.

dataimport datarepresentation

6.39 score 8 scripts 7 dependents

bioc

alabaster.ranges:Load and Save Ranges-related Artifacts from File

Save GenomicRanges, IRanges and related data structures into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.

Maintained by Aaron Lun. Last updated 5 months ago.

dataimport datarepresentation

6.38 score 8 scripts 8 dependents

bioc

spatialHeatmap:spatialHeatmap: Visualizing Spatial Assays in Anatomical Images and Large-Scale Data Extensions

The spatialHeatmap package offers the primary functionality for visualizing cell-, tissue- and organ-specific assay data in spatial anatomical images. Additionally, it provides extended functionalities for large-scale data mining routines and co-visualizing bulk and single-cell data. A description of the project is available here: https://spatialheatmap.org.

Maintained by Jianhai Zhang. Last updated 4 months ago.

spatial visualization microarray sequencing geneexpression datarepresentation network clustering graphandnetwork cellbasedassays atacseq dnaseq tissuemicroarray singlecell cellbiology genetarget

5 stars 6.26 score 12 scripts

bioc

alabaster.schemas:Schemas for the Alabaster Framework

Stores all schemas required by various alabaster.* packages. No computation should be performed by this package, as that is handled by alabaster.base. We use a separate package instead of storing the schemas in alabaster.base itself, to avoid conflating management of the schemas with code maintenance.

Maintained by Aaron Lun. Last updated 5 months ago.

datarepresentation dataimport

6.24 score 17 dependents

bioc

ReportingTools:Tools for making reports in various formats

The ReportingTools software package enables users to easily display reports of analysis results generated from sources such as microarray and sequencing data. The package allows users to create HTML pages that may be viewed on a web browser such as Safari, or in other formats readable by programs such as Excel. Users can generate tables with sortable and filterable columns, make and display plots, and link table entries to other data sources such as NCBI or larger plots within the HTML page. Using the package, users can also produce a table of contents page to link various reports together for a particular project that can be viewed in a web browser. For more examples, please visit our site: http://research-pub.gene.com/ReportingTools.

Maintained by Jason A. Hackney. Last updated 5 months ago.

immunooncology software visualization microarray rnaseq go datarepresentation genesetenrichment

6.23 score 93 scripts 1 dependents

bioc

SingleCellAlleleExperiment:S4 Class for Single Cell Data with Allele and Functional Levels for Immune Genes

Defines an S4 class that is based on SingleCellExperiment. In addition to the usual gene layer, the object can also store data for immune genes such as HLAs, Igs and KIRs at the allele and functional levels. The package is part of a workflow named single-cell ImmunoGenomic Diversity (scIGD), which first incorporates allele-aware quantification data for immune genes. These new data can then be used with the data structure and functionality implemented here for further data handling and analysis.

Maintained by Jonas Schuck. Last updated 2 months ago.

datarepresentation infrastructure singlecell transcriptomics geneexpression genetics immunooncology dataimport

7 stars 6.18 score 12 scripts

bioc

Pedixplorer:Pedigree Functions

Routines to handle family data with a Pedigree object. The initial purpose was to create correlation structures that describe family relationships such as kinship and identity-by-descent, which can be used to model family data in mixed effects models, such as in the coxme function. Also includes a tool for Pedigree drawing which is focused on producing compact layouts without intervention. Recent additions include utilities to trim the Pedigree object with various criteria, and kinship for the X chromosome.

Maintained by Louis Le Nezet. Last updated 14 days ago.

software datarepresentation genetics graphandnetwork visualization kinship pedigree

2 stars 6.08 score 10 scripts

bioc

GloScope:Population-level Representation on scRNA-Seq data

This package aims to represent and summarize the entire single-cell profile of a sample. It allows researchers to perform important bioinformatic analyses at the sample level, such as visualization and quality control. The main functions estimate sample distributions, calculate statistical divergence among samples, and visualize the distance matrix through MDS plots.

Maintained by William Torous. Last updated 5 months ago.

datarepresentation qualitycontrol rnaseq sequencing software singlecell

3 stars 6.05 score 84 scripts

bioc

cummeRbund:Analysis, exploration, manipulation, and visualization of Cufflinks high-throughput sequencing data.

Allows for persistent storage, access, exploration, and manipulation of Cufflinks high-throughput sequencing data. In addition, provides numerous plotting functions for commonly used visualizations.

Maintained by Loyal A. Goff. Last updated 5 months ago.

highthroughputsequencing highthroughputsequencingdata rnaseq rnaseqdata geneexpression differentialexpression infrastructure dataimport datarepresentation visualization bioinformatics clustering multiplecomparisons qualitycontrol

5.92 score 209 scripts

bioc

DFplyr:A `DataFrame` (`S4Vectors`) backend for `dplyr`

Provides `dplyr` verbs (`mutate`, `select`, `filter`, etc...) supporting `S4Vectors::DataFrame` objects. Importantly, this is achieved without conversion to an intermediate `tibble`. Adds grouping infrastructure to `DataFrame` which is respected by the transformation verbs.

Maintained by Jonathan Carroll. Last updated 5 months ago.

datarepresentation infrastructure software

21 stars 5.87 score 5 scripts

bioc

rBiopaxParser:Parses BioPax files and represents them in R

Parses BioPAX files and represents them in R. At the moment, BioPAX level 2 and level 3 are supported.

Maintained by Frank Kramer. Last updated 5 months ago.

datarepresentation

10 stars 5.85 score 7 scripts

bioc

SpatialExperimentIO:Read in Xenium, CosMx, MERSCOPE or STARmapPLUS data as SpatialExperiment object

Read in imaging-based spatial transcriptomics technology data. Currently available modules are for Xenium by 10X Genomics, CosMx by Nanostring, MERSCOPE by Vizgen, or STARmapPLUS from Broad Institute. You can choose to read the data in as a SpatialExperiment or a SingleCellExperiment object.

Maintained by Yixing E. Dong. Last updated 2 months ago.

datarepresentation dataimport infrastructure transcriptomics singlecell spatial geneexpression

9 stars 5.81 score 16 scripts

henrikbengtsson

aroma.affymetrix:Analysis of Large Affymetrix Microarray Data Sets

A cross-platform R framework that facilitates processing of any number of Affymetrix microarray samples regardless of computer system. The only parameter that limits the number of chips that can be processed is the amount of available disk space. The Aroma Framework has successfully been used in studies to process tens of thousands of arrays. This package has actively been used since 2006.

Maintained by Henrik Bengtsson. Last updated 1 years ago.

infrastructure proprietaryplatforms exonarray microarray onechannel gui dataimport datarepresentation preprocessing qualitycontrol visualization reportwriting acgh copynumbervariants differentialexpression geneexpression snp transcription affymetrix analysis copy-number dna expression hpc large-scale notebook reproducibility rna

10 stars 5.79 score 112 scripts 3 dependents

bioc

bioCancer:Interactive Multi-Omics Cancers Data Visualization and Analysis

This package is a Shiny app for interactively visualizing and analysing multi-assay cancer genomic data.

Maintained by Karim Mezhoud. Last updated 5 months ago.

gui datarepresentation network multiplecomparison pathways reactome visualization geneexpression genetarget analysis biocancer-interface cancer cancer-studies rmarkdown

20 stars 5.78 score 7 scripts

bioc

BiocFHIR:Illustration of FHIR ingestion and transformation using R

FHIR R4 bundles in JSON format are derived from https://synthea.mitre.org/downloads. Transformation inspired by a kaggle notebook published by Dr Alexander Scarlat, https://www.kaggle.com/code/drscarlat/fhir-starter-parse-healthcare-bundles-into-tables. This is a very limited illustration of some basic parsing and reorganization processes. Additional tooling will be required to move beyond the Synthea data illustrations.

Maintained by Vincent Carey. Last updated 5 months ago.

infrastructure dataimport datarepresentation fhir

4 stars 5.78 score 15 scripts

bioc

TVTB:TVTB: The VCF Tool Box

The package provides S4 classes and methods to filter, summarise and visualise genetic variation data stored in VCF files. In particular, the package extends the FilterRules class (S4Vectors package) to define new classes of filter rules applicable to the various slots of VCF objects. Functionalities are integrated and demonstrated in a Shiny web-application, the Shiny Variant Explorer (tSVE).

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

software genetics geneticvariability genomicvariation datarepresentation gui dnaseq wholegenome visualization multiplecomparison dataimport variantannotation sequencing coverage alignment sequencematching

2 stars 5.76 score 16 scripts

bioc

rexposome:Exposome exploration and outcome data analysis

A package for exploring the exposome and performing association analyses between exposures and health outcomes.

Maintained by Xavier Escribà Montagut. Last updated 5 months ago.

software biologicalquestion infrastructure dataimport datarepresentation biomedicalinformatics experimentaldesign multiplecomparison classification clustering

5.70 score 28 scripts 1 dependents

bioc

cbpManager:Generate, manage, and edit data and metadata files suitable for the import in cBioPortal for Cancer Genomics

This R package provides an R Shiny application that enables the user to generate, manage, and edit data and metadata files suitable for import into cBioPortal for Cancer Genomics. Create cancer studies and edit their metadata. Upload mutation data of a patient that will be concatenated to the data_mutation_extended.txt file of the study. Create and edit clinical patient data, sample data, and timeline data. Create custom timeline tracks for patients.

Maintained by Arsenij Ustjanzew. Last updated 5 months ago.

immunooncology dataimport datarepresentation gui thirdpartyclient preprocessing visualization cancer-genomics cbioportal clinical-data filegenerator mutation-data patient-data

8 stars 5.51 score 1 scripts

bioc

CrispRVariants:Tools for counting and visualising mutations in a target location

CrispRVariants provides tools for analysing the results of a CRISPR-Cas9 mutagenesis sequencing experiment, or other sequencing experiments where variants within a given region are of interest. These tools allow users to localize variant allele combinations with respect to any genomic location (e.g. the Cas9 cut site), plot allele combinations and calculate mutation rates with flexible filtering of unrelated variants.

Maintained by Helen Lindsay. Last updated 5 months ago.

immunooncology crispr genomicvariation variantdetection geneticvariability datarepresentation visualization sequencing

5.51 score 32 scripts

bioc

GenomicTuples:Representation and Manipulation of Genomic Tuples

GenomicTuples defines general purpose containers for storing genomic tuples. It aims to provide functionality for tuples of genomic co-ordinates that is analogous to that available for genomic ranges in the GenomicRanges Bioconductor package.

Maintained by Peter Hickey. Last updated 5 months ago.

infrastructure datarepresentation sequencing cpp

4 stars 5.48 score 7 scripts

bioc

alabaster.sce:Load and Save SingleCellExperiment from File

Save SingleCellExperiment into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.

Maintained by Aaron Lun. Last updated 5 months ago.

dataimport datarepresentation

5.43 score 4 scripts 3 dependents

bioc

OmicsMLRepoR:Search harmonized metadata created under the OmicsMLRepo project

This package provides functions to browse the harmonized metadata for large omics databases. This package also supports data navigation if the metadata incorporates ontology.

Maintained by Sehyun Oh. Last updated 6 days ago.

software infrastructure datarepresentation u24ca289073

5.40 score 14 scripts

bioc

R4RNA:An R package for RNA visualization and analysis

A package for RNA basepair analysis, including the visualization of basepairs as arc diagrams for easy comparison and annotation of sequence and structure. Arc diagrams can additionally be projected onto multiple sequence alignments to assess basepair conservation and covariation, with numerical methods for computing statistics for each.

Maintained by Daniel Lai. Last updated 5 months ago.

alignment multiplesequencealignment preprocessing visualization dataimport datarepresentation multiplecomparison

5.36 score 19 scripts 4 dependents

bioc

SCArray:Large-scale single-cell omics data manipulation with GDS files

Provides large-scale single-cell omics data manipulation using Genomic Data Structure (GDS) files. It combines dense and sparse matrices stored in GDS files and the Bioconductor infrastructure framework (SingleCellExperiment and DelayedArray) to provide out-of-memory data storage and large-scale manipulation using the R programming language.

Maintained by Xiuwen Zheng. Last updated 5 days ago.

infrastructure datarepresentation dataimport singlecell rnaseq cpp

1 stars 5.32 score 9 scripts 1 dependents

bioc

QTLExperiment:S4 classes for QTL summary statistics and metadata

QTLExperiment defines an S4 class for storing and manipulating summary statistics from QTL mapping experiments in one or more states. It is based on the 'SummarizedExperiment' class and contains functions for creating, merging, and subsetting objects. 'QTLExperiment' also stores experiment metadata and has checks in place to ensure that transformations apply correctly.

Maintained by Amelia Dunstone. Last updated 9 days ago.

functionalgenomics dataimport datarepresentation infrastructure sequencing snp software

2 stars 5.32 score 14 scripts 1 dependents

bioc

omXplore:Vizualization tools for 'omics' datasets with R

This package contains a collection of functions (written as shiny modules) for the visualisation and the statistical analysis of omics data. These plots can be displayed individually or embedded in a global Shiny module. Additionally, it is possible to integrate third party modules into the main interface of the package omXplore.

Maintained by Samuel Wieczorek. Last updated 2 days ago.

software shinyapps massspectrometry datarepresentation gui qualitycontrol prostar2

5.32 score 23 scripts

bioc

PhIPData:Container for PhIP-Seq Experiments

PhIPData defines an S4 class for phage-immunoprecipitation sequencing (PhIP-seq) experiments. Building upon the RangedSummarizedExperiment class, PhIPData enables users to coordinate metadata with experimental data in analyses. Additionally, PhIPData provides specialized methods to subset and identify beads-only samples, subset objects using virus aliases, and use existing peptide libraries to populate object parameters.

Maintained by Athena Chen. Last updated 5 months ago.

infrastructure datarepresentation sequencing coverage

6 stars 5.26 score 6 scripts 1 dependents

bioc

DelayedDataFrame:Delayed operation on DataFrame using standard DataFrame metaphor

Building on the standard DataFrame metaphor, DelayedDataFrame implements delayed operations through a lazyIndex slot, which saves the mapping indexes for each column of the DelayedDataFrame. Methods such as show, validity check, [/[[ subsetting, and rbind/cbind are implemented for DelayedDataFrame to operate on lazyIndex. The listData slot stays untouched until a realization call, e.g., the DataFrame constructor or as.list(), is invoked.

Maintained by Qian Liu. Last updated 5 months ago.

infrastructure datarepresentation

2 stars 5.26 score 3 scripts 1 dependents

bioc

SplicingGraphs:Create, manipulate, visualize splicing graphs, and assign RNA-seq reads to them

This package allows the user to create, manipulate, and visualize splicing graphs and their bubbles based on a gene model for a given organism. Additionally it allows the user to assign RNA-seq reads to the edges of a set of splicing graphs, and to summarize them in different ways.

Maintained by H. Pagès. Last updated 5 months ago.

genetics annotation datarepresentation visualization sequencing rnaseq geneexpression alternativesplicing transcription immunooncology bioconductor-package

2 stars 5.26 score 8 scripts

bioc

DelayedRandomArray:Delayed Arrays of Random Values

Implements a DelayedArray of random values where the realization of the sampled values is delayed until they are needed. Reproducible sampling within any subarray is achieved by chunking, where each chunk is initialized with a different random seed and stream. The usual distributions in the stats package are supported, along with scalar, vector and array parameters.
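
The chunking idea can be sketched in a few lines of Python. This is an illustration of the general technique only, not the package's implementation: the names `CHUNK`, `chunk_values` and `lazy_slice`, the chunk length, and the way the per-chunk seed is derived are all assumptions made for the example.

```python
import random

CHUNK = 4  # hypothetical chunk length


def chunk_values(base_seed, chunk_index):
    """Generate every value of one chunk from a seed specific to that chunk."""
    # Assumed seed derivation: combine the base seed with the chunk index.
    rng = random.Random(base_seed * 1_000_003 + chunk_index)
    return [rng.random() for _ in range(CHUNK)]


def lazy_slice(base_seed, start, stop):
    """Realize elements [start, stop) of a virtual 1-D random vector."""
    out = []
    if start >= stop:
        return out
    # Only the chunks that overlap the requested range are generated.
    for chunk_index in range(start // CHUNK, (stop - 1) // CHUNK + 1):
        values = chunk_values(base_seed, chunk_index)
        first = chunk_index * CHUNK
        lo = max(start, first) - first
        hi = min(stop, first + CHUNK) - first
        out.extend(values[lo:hi])
    return out


full = lazy_slice(42, 0, 12)
part = lazy_slice(42, 3, 9)
print(part == full[3:9])
```

Because each chunk depends only on its own seed, any sub-range can be realized on demand and still agrees with the values a full realization would produce.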

Maintained by Aaron Lun. Last updated 3 months ago.

datarepresentation cpp

5.26 score 6 scripts 1 dependents

bioc

flowDensity:Sequential Flow Cytometry Data Gating

This package provides tools for automated sequential gating analogous to the manual gating strategy based on the density of the data.

Maintained by Mehrnoush Malek. Last updated 5 months ago.

bioinformatics flowcytometry cellbiology clustering cancer flowcytdata datarepresentation stemcell densitygating

5.17 score 83 scripts 3 dependents

bioc

DepecheR:Determination of essential phenotypic elements of clusters in high-dimensional entities

The purpose of this package is to identify traits in a dataset that can separate groups. This is done on two levels. First, clustering is performed, using an implementation of sparse K-means. Secondly, the generated clusters are used to predict outcomes of groups of individuals based on their distribution of observations in the different clusters. As certain clusters with separating information will be identified, and these clusters are defined by a sparse number of variables, this method can reduce the complexity of data, to only emphasize the data that actually matters.

Maintained by Jakob Theorell. Last updated 5 months ago.

software cellbasedassays transcription differentialexpression datarepresentation immunooncology transcriptomics classification clustering dimensionreduction featureextraction flowcytometry rnaseq singlecell visualization cpp

5.08 score 15 scripts

bioc

seahtrue:Seahtrue revives XF data for structured data analysis

Seahtrue organizes oxygen consumption and extracellular acidification analysis data from experiments performed on an XF analyzer into structured nested tibbles. This allows for detailed processing of raw data and advanced data visualization and statistics. Seahtrue introduces an open and reproducible way to analyze these XF experiments. It uses file paths to .xlsx files. These .xlsx files are supplied by the user and are generated by the user in the Wave software from Agilent from the assay result files (.asyr). The .xlsx file contains different sheets of important data for the experiment; 1. Assay Information - Details about how the experiment was set up. 2. Rate Data - Information about the OCR and ECAR rates. 3. Raw Data - The original raw data collected during the experiment. 4. Calibration Data - Data related to calibrating the instrument. Seahtrue focuses on getting the specific data needed for analysis. Once this data is extracted, it is prepared for calculations through preprocessing. To make sure everything is accurate, both the initial data and the preprocessed data go through thorough checks.

Maintained by Vincent de Boer. Last updated 5 months ago.

cellbasedassays functionalprediction datarepresentation dataimport cellbiology cheminformatics metabolomics microtitreplateassay visualization qualitycontrol batcheffect experimentaldesign preprocessing go

5.04 score 2 scripts

bioc

alabaster.spatial:Save and Load Spatial 'Omics Data to/from File

Save SpatialExperiment objects and their images into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.

Maintained by Aaron Lun. Last updated 5 months ago.

dataimport datarepresentation

5.02 score 5 scripts 1 dependents

bioc

VariantExperiment:A RangedSummarizedExperiment Container for VCF/GDS Data with GDS Backend

VariantExperiment is a Bioconductor package for saving data in VCF/GDS format into a RangedSummarizedExperiment object. The high-throughput genetic/genomic data are saved in GDSArray objects. The annotation data for features/samples are saved in DelayedDataFrame format with a mono-dimensional GDSArray in each column. The on-disk representation of both assay data and annotation data enables on-disk reading and processing and saves memory space significantly. The RangedSummarizedExperiment interface enables easy and common manipulations of high-throughput genetic/genomic data with the common SummarizedExperiment metaphor in R and Bioconductor.

Maintained by Qian Liu. Last updated 5 months ago.

infrastructure datarepresentation sequencing annotation genomeannotation genotypingarray

1 stars 5.00 score 2 scripts

bioc

OSTA.data:OSTA book data

'OSTA.data' is a companion package for the "Orchestrating Spatial Transcriptomics Analysis" (OSTA) with Bioconductor online book. Throughout OSTA, we rely on a set of publicly available datasets that cover different sequencing- and imaging-based platforms, such as Visium, Visium HD, Xenium (10x Genomics) and CosMx (NanoString). In addition, we rely on scRNA-seq (Chromium) data for tasks such as spot deconvolution and label transfer (i.e., supervised clustering). These data have been deposited in an Open Science Framework (OSF) repository, and can be queried and downloaded using functions from the 'osfr' package. For convenience, we have implemented 'OSTA.data' to query and retrieve data from our OSF node, and cache retrieved Zip archives using 'BiocFileCache'.

Maintained by Yixing E. Dong. Last updated 1 months ago.

dataimport datarepresentation experimenthubsoftware infrastructure immunooncology geneexpression transcriptomics singlecell spatial

2 stars 5.00 score

bioc

alabaster.string:Save and Load Biostrings to/from File

Save Biostrings objects to file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.

Maintained by Aaron Lun. Last updated 5 months ago.

dataimport datarepresentation

4.95 score 5 scripts 2 dependents

bioc

BSgenomeForge:Forge your own BSgenome data package

A set of tools to forge BSgenome data packages. Supersedes the old seed-based tools from the BSgenome software package. This package allows the user to create a BSgenome data package in one function call, simplifying the old seed-based process.

Maintained by Hervé Pagès. Last updated 5 months ago.

infrastructure datarepresentation genomeassembly annotation genomeannotation sequencing alignment dataimport sequencematching bioconductor-package core-package

4 stars 4.90 score 6 scripts

bioc

beachmat.hdf5:beachmat bindings for HDF5-backed matrices

Extends beachmat to support initialization of tatami matrices from HDF5-backed arrays. This allows C++ code in downstream packages to directly call the HDF5 C/C++ library to access array data, without the need for block processing via DelayedArray. Some utilities are also provided for direct creation of an in-memory tatami matrix from a HDF5 file.

Maintained by Aaron Lun. Last updated 5 months ago.

datarepresentation dataimport infrastructure zlib cpp

4.88 score 6 scripts

bioc

alabaster.mae:Load and Save MultiAssayExperiments

Save MultiAssayExperiments into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.

Maintained by Aaron Lun. Last updated 5 months ago.

dataimport datarepresentation

4.78 score 5 scripts 1 dependents

bioc

consensus:Cross-platform consensus analysis of genomic measurements via interlaboratory testing method

An implementation of the American Society for Testing and Materials (ASTM) Standard E691 for interlaboratory testing procedures, designed for cross-platform genomic measurements. Given three (3) or more genomic platforms or laboratory protocols, this package provides interlaboratory testing procedures giving per-locus comparisons for sensitivity and precision between platforms.

Maintained by Tim Peters. Last updated 5 months ago.

qualitycontrol regression datarepresentation geneexpression microarray rnaseq

4.70 score 10 scripts

bioc

weitrix:Tools for matrices with precision weights, test and explore weighted or sparse data

Data type and tools for working with matrices having precision weights and missing data. This package provides a common representation and tools that can be used with many types of high-throughput data. The meaning of the weights is compatible with usage in the base R function "lm" and the package "limma". Calibrate weights to account for known predictors of precision. Find rows with excess variability. Perform differential testing and find rows with the largest confident differences. Find PCA-like components of variation even with many missing values, rotated so that individual components may be meaningfully interpreted. DelayedArray matrices and BiocParallel are supported.

Maintained by Paul Harrison. Last updated 5 months ago.

software datarepresentation dimensionreduction geneexpression transcriptomics rnaseq singlecell regression

4.70 score 8 scripts

bioc

ILoReg:ILoReg: a tool for high-resolution cell population identification from scRNA-Seq data

ILoReg is a tool for identification of cell populations from scRNA-seq data. In particular, ILoReg is useful for finding cell populations with subtle transcriptomic differences. The method utilizes a self-supervised learning method, called Iterative Clustering Projection (ICP), to find cluster probabilities, which are used in noise reduction prior to PCA and the subsequent hierarchical clustering and t-SNE steps. Additionally, functions for differential expression analysis to find gene markers for the populations and gene expression visualization are provided.

Maintained by Johannes Smolander. Last updated 5 months ago.

singlecell software clustering dimensionreduction rnaseq visualization transcriptomics datarepresentation differentialexpression transcription geneexpression

5 stars 4.70 score 2 scripts

bioc

HicAggR:Set of 3D genomic interaction analysis tools

This package provides a set of functions useful in the analysis of 3D genomic interactions. It includes the import of standard HiC data formats into R and HiC normalisation procedures. The main objective of this package is to improve the visualization and quantification of HiC contacts through aggregation. The package can import 1D genomics data, such as peaks from ATACSeq or ChIPSeq, to create potential couples between features of interest under user-defined parameters such as the distance between pairs of features. It then extracts contact values from the HiC data for these couples and performs Aggregated Peak Analysis (APA) for visualization; it can also compare normalized contact values between conditions. Overall, the package integrates 1D genomics data with 3D genomics data, providing easy access to HiC contact values.

Maintained by Olivier Cuvier. Last updated 5 months ago.

software hic dataimport datarepresentation normalization visualization dna3dstructure atacseq chipseq dnaseseq rnaseq

4.70 score 3 scripts

bioc

DelayedTensor:R package for sparse and out-of-core arithmetic and decomposition of Tensor

DelayedTensor performs Tensor arithmetic directly on DelayedArray objects. DelayedTensor provides some generic functions related to Tensor arithmetic/decomposition and dispatches them on the DelayedArray class. DelayedTensor also supports Tensor contraction via the einsum function, which is inspired by numpy einsum.

Maintained by Koki Tsuyuzaki. Last updated 5 months ago.

software infrastructure datarepresentation dimensionreduction

4 stars 4.68 score 3 scripts

bioc

interactiveDisplayBase:Base package for enabling powerful shiny web displays of Bioconductor objects

The interactiveDisplayBase package contains the basic methods needed to generate interactive Shiny-based display methods for Bioconductor objects.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

go geneexpression microarray sequencing classification network qualitycontrol visualization genetics datarepresentation gui annotationdata shinyapps

4.67 score 5 scripts 1 dependents

bioc

alabaster.vcf:Save and Load Variant Data to/from File

Save variant calling SummarizedExperiment to file and load them back as VCF objects. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.

Maintained by Aaron Lun. Last updated 5 months ago.

dataimport datarepresentation

4.65 score 6 scripts 1 dependents

bioc

alabaster.bumpy:Save and Load BumpyMatrices to/from file

Save BumpyMatrix objects into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.

Maintained by Aaron Lun. Last updated 5 months ago.

dataimport datarepresentation

4.65 score 5 scripts 1 dependents

bioc

beachmat.tiledb:beachmat bindings for TileDB-backed matrices

Extends beachmat to initialize tatami matrices from TileDB-backed arrays. This allows C++ code in downstream packages to directly call the TileDB C/C++ library to access array data, without the need for block processing via DelayedArray. Developers only need to import this package to automatically extend the capabilities of beachmat::initializeCpp to TileDBArray instances.

Maintained by Aaron Lun. Last updated 3 months ago.

datarepresentation dataimport infrastructure cpp

4.65 score 4 scripts

bioc

cellmigRation:Track Cells, Analyze Cell Trajectories and Compute Migration Statistics

Import TIFF images of fluorescently labeled cells, and track cell movements over time. Parallelization is supported for image processing and for fast computation of cell trajectories. In-depth analysis of cell trajectories is enabled by 15 trajectory analysis functions.

Maintained by Waldir Leoncio. Last updated 5 months ago.

cellbiology datarepresentation dataimport bioconductor-package cell-tracking shiny trajectory-analysis

4.60 score 4 scripts

bioc

chromPlot:Global visualization tool of genomic data

Package designed to visualize genomic data along the chromosomes, where the vertical chromosomes are sorted by number, with sex chromosomes at the end.

Maintained by Karen Y. Orostica. Last updated 5 months ago.

datarepresentation functionalgenomics genetics sequencing annotation visualization

4.53 score 24 scripts

bioc

SQLDataFrame:Representation of SQL tables in DataFrame metaphor

Implements bindings for SQL tables that are compatible with Bioconductor S4 data structures, namely the DataFrame and DelayedArray. This allows SQL-derived data to be easily used inside other Bioconductor objects (e.g., SummarizedExperiments) while keeping everything on disk.

Maintained by Qian Liu. Last updated 5 months ago.

datarepresentation infrastructure software

2 stars 4.51 score 5 scripts

bioc

oposSOM:Comprehensive analysis of transcriptome data

This package translates microarray expression data into metadata of reduced dimension. It provides various sample-centered and group-centered visualizations, sample similarity analyses and functional enrichment analyses. The underlying SOM algorithm combines feature clustering, multidimensional scaling and dimension reduction, along with strong visualization capabilities. It enables extraction and description of functional expression modules inherent in the data.

Maintained by Henry Loeffler-Wirth. Last updated 5 months ago.

geneexpression differentialexpression genesetenrichment datarepresentation visualization cpp

4.48 score 7 scripts

bioc

updateObject:Find/fix old serialized S4 instances

A set of tools built around updateObject() to work with old serialized S4 instances. The package is primarily useful to package maintainers who want to update the serialized S4 instances included in their package. This is still work-in-progress.

Maintained by Hervé Pagès. Last updated 5 months ago.

infrastructure datarepresentation bioconductor-package core-package

1 stars 4.48 score 3 scripts

bioc

MACSQuantifyR:Fast treatment of MACSQuantify FACS data

Automatically processes the metadata of the MACSQuantify FACS sorter. It runs multiple modules: i) imports the raw file and allows graphical selection of duplicates in the well plate, ii) computes statistics on the data and iii) can compute the combination index.

Maintained by Raphaël Bonnet. Last updated 5 months ago.

dataimport preprocessing normalization flowcytometry datarepresentation gui

4.48 score 3 scripts

regisoc

kibior:A Simple Data Management and Sharing Tool

An interface to store, retrieve, search, join and share datasets, based on the Elasticsearch (ES) API. As a decentralized, FAIR and collaborative search engine and database effort, it proposes a simple push/pull/search mechanism based only on ES, a tool which can be deployed on nearly any hardware. It is a high-level R-ES binding that eases data usage using the 'elastic' package (S. Chamberlain (2020)) <https://docs.ropensci.org/elastic/>, extends joins from the 'dplyr' package (H. Wickham et al. (2020)) <https://dplyr.tidyverse.org/> and integrates specific biological format importation with Bioconductor packages such as 'rtracklayer' (M. Lawrence et al. (2009) <doi:10.1093/bioinformatics/btp328>) <http://bioconductor.org/packages/rtracklayer>, 'Biostrings' (H. Pagès et al. (2020) <doi:10.18129/B9.bioc.Biostrings>) <http://bioconductor.org/packages/Biostrings>, and 'Rsamtools' (M. Morgan et al. (2020) <doi:10.18129/B9.bioc.Rsamtools>) <http://bioconductor.org/packages/Rsamtools>, but also a long list of more common ones with 'rio' (C-h. Chan et al. (2018)) <https://cran.r-project.org/package=rio>.

Maintained by Régis Ongaro-Carcy. Last updated 4 years ago.

dataimport datarepresentation thirdpartyclient data-science database datasets elasticsearch elasticsearch-client push-pull search search-engine

3 stars 4.48 score 8 scripts

bioc

GA4GHclient:A Bioconductor package for accessing GA4GH API data servers

GA4GHclient provides an easy way to access public data servers through Global Alliance for Genomics and Health (GA4GH) genomics API. It provides low-level access to GA4GH API and translates response data into Bioconductor-based class objects.

Maintained by Welliton Souza. Last updated 5 months ago.

datarepresentation thirdpartyclient

1 stars 4.48 score 3 scripts 1 dependents

bioc

flowSpecs:Tools for processing of high-dimensional cytometry data

This package is intended to fill the role of conventional cytometry pre-processing software, for spectral decomposition, transformation, visualization and cleanup, and to aid further downstream analyses, such as with DepecheR, by enabling transformation of flowFrames and flowSets to dataframes. Functions for flowCore-compliant automatic 1D-gating/filtering are in the pipeline. The package name has been chosen both as it will deal with spectral cytometry and as it will hopefully give the user a nice pair of spectacles through which to view their data.

Maintained by Jakob Theorell. Last updated 5 months ago.

software cellbasedassays datarepresentation immunooncology flowcytometry singlecell visualization normalization dataimport

6 stars 4.38 score 7 scripts

bioc

chihaya:Save Delayed Operations to a HDF5 File

Saves the delayed operations of a DelayedArray to a HDF5 file. This enables efficient recovery of the DelayedArray's contents in other languages and analysis frameworks.

Maintained by Aaron Lun. Last updated 5 months ago.

dataimport datarepresentation zlib cpp

4.38 score 16 scripts

bioc

Spaniel:Spatial Transcriptomics Analysis

Spaniel includes a series of tools to aid the quality control and analysis of Spatial Transcriptomics data. Spaniel can import data from either the original Spatial Transcriptomics system or 10X Visium technology. The package contains functions to create a SingleCellExperiment Seurat object and provides a method of loading a histological image into R. The spanielPlot function allows visualisation of metrics contained within the S4 object overlaid onto the image of the tissue.

Maintained by Rachel Queen. Last updated 5 months ago.

singlecell rnaseq qualitycontrol preprocessing normalization visualization transcriptomics geneexpression sequencing software dataimport datarepresentation infrastructure coverage clustering

4.34 score 22 scripts

bioc

RImmPort:RImmPort: Enabling Ready-for-analysis Immunology Research Data

The RImmPort package simplifies access to ImmPort data for analysis in the R environment. It provides a standards-based interface to the ImmPort study data that is in a proprietary format.

Maintained by Zicheng Hu. Last updated 5 months ago.

biomedicalinformatics dataimport datarepresentation

4.33 score 27 scripts

yunuuuu

BPCellsArray:Using BPCells as a DelayedArray Backend

Implements a DelayedArray backend for reading and writing arrays in the BPCells storage layout. The resulting BPCells*Arrays are compatible with all Bioconductor pipelines that can accept DelayedArray instances.

Maintained by Yun Peng. Last updated 8 months ago.

software dataimport datarepresentation infrastructure single-cell

7 stars 4.32 score

bioc

alabaster.files:Wrappers to Save Common File Formats

Save common bioinformatics file formats within the alabaster framework. This includes BAM, BED, VCF, bigWig, bigBed, FASTQ, FASTA and so on. We save and load additional metadata for each file, and we support linkage between each file and its corresponding index.

Maintained by Aaron Lun. Last updated 5 months ago.

datarepresentation dataimport

4.32 score 21 scripts

bioc

Ularcirc:Shiny app for canonical and back splicing analysis (i.e. circular and mRNA analysis)

Ularcirc reads in STAR aligned splice junction files and provides visualisation and analysis tools for splicing analysis. Users can assess backsplice junctions and forward canonical junctions.

Maintained by David Humphreys. Last updated 5 months ago.

datarepresentation visualization genetics sequencing annotation coverage alternativesplicing differentialsplicing

4.30 score 4 scripts

bioc

OSAT:OSAT: Optimal Sample Assignment Tool

A sizable genomics study, such as a microarray study, often involves the use of multiple batches (groups) of experiments due to practical complications. To minimize batch effects, a careful experimental design should ensure the even distribution of biological groups and confounding factors across batches. OSAT (Optimal Sample Assignment Tool) is developed to facilitate the allocation of collected samples to different batches. With minimal steps, it produces a setup that optimizes the even distribution of samples in groups of biological interest into different batches, reducing the confounding or correlation between batches and the biological variables of interest. It can also optimize the even distribution of confounding factors across batches. Our tool can handle challenging instances where incomplete and unbalanced sample collections are involved as well as ideal balanced RCBD. OSAT provides a number of predefined layouts for some of the most commonly used genomics platforms. The related paper can be found at http://www.biomedcentral.com/1471-2164/13/689 .

Maintained by Li Yan. Last updated 5 months ago.

datarepresentation visualization experimentaldesign qualitycontrol

4.30 score 3 scripts

bioc

SpatialOmicsOverlay:Spatial Overlay for Omic Data from Nanostring GeoMx Data

Tools for NanoString Technologies GeoMx Technology. Package to easily graph on top of an OME-TIFF image. Plotting annotations can range from tissue segment to gene expression.

Maintained by Maddy Griswold. Last updated 5 months ago.

geneexpression transcription cellbasedassays dataimport transcriptomics proteomics proprietaryplatforms rnaseq spatial datarepresentation visualization openjdk

4.30 score 8 scripts

henrikbengtsson

aroma.core:Core Methods and Classes Used by 'aroma.*' Packages Part of the Aroma Framework

Core methods and classes used by higher-level 'aroma.*' packages part of the Aroma Project, e.g. 'aroma.affymetrix' and 'aroma.cn'.

Maintained by Henrik Bengtsson. Last updated 2 years ago.

microarray onechannel twochannel multichannel dataimport datarepresentation gui visualization preprocessing qualitycontrol acgh copynumbervariants

1 stars 4.30 score 16 scripts 6 dependents

bioc

TreeAndLeaf:Displaying binary trees with focus on dendrogram leaves

The TreeAndLeaf package combines unrooted and force-directed graph algorithms in order to layout binary trees, aiming to represent multiple layers of information onto dendrogram leaves.

Maintained by Milena A. Cardoso. Last updated 5 months ago.

infrastructure graphandnetwork software network visualization datarepresentation

4.20 score 16 scripts

bioc

cytoMEM:Marker Enrichment Modeling (MEM)

MEM, Marker Enrichment Modeling, automatically generates and displays quantitative labels for cell populations that have been identified from single-cell data. The input for MEM is a dataset that has pre-clustered or pre-gated populations with cells in rows and features in columns. Labels convey a list of measured features and the features' levels of relative enrichment on each population. MEM can be applied to a wide variety of data types and can compare between MEM labels from flow cytometry, mass cytometry, single cell RNA-seq, and spectral flow cytometry using RMSD.

Maintained by Jonathan Irish. Last updated 5 months ago.

proteomics systemsbiology classification flowcytometry datarepresentation dataimport cellbiology singlecell clustering

4.18 score 15 scripts

bioc

LoomExperiment:LoomExperiment container

The LoomExperiment package provides a means to easily convert the Bioconductor "Experiment" classes to loom files and vice versa.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

immunooncology datarepresentation dataimport infrastructure singlecell

4.16 score 73 scripts

bioc

ReducedExperiment:Containers and tools for dimensionally-reduced -omics representations

Provides SummarizedExperiment-like containers for storing and manipulating dimensionally-reduced assay data. The ReducedExperiment classes allow users to simultaneously manipulate their original dataset and their decomposed data, in addition to other method-specific outputs like feature loadings. Implements utilities and specialised classes for the application of stabilised independent component analysis (sICA) and weighted gene correlation network analysis (WGCNA).

Maintained by Jack Gisby. Last updated 2 months ago.

geneexpression infrastructure datarepresentation software dimensionreduction network bioconductor-package bioinformatics dimensionality-reduction

3 stars 4.13 score 8 scripts

bioc

nearBynding:Discern RNA structure proximal to protein binding

Provides a pipeline to discern RNA structure at and proximal to the site of protein binding within regions of the transcriptome defined by the user. CLIP protein-binding data can be input as either aligned BAM or peak-called bedGraph files. RNA structure can either be predicted internally from sequence or users have the option to input their own RNA structure data. RNA structure binding profiles can be visually and quantitatively compared across multiple formats.

Maintained by Veronica Busa. Last updated 5 months ago.

visualization motifdiscovery datarepresentation structuralprediction clustering multiplecomparison

4.08 score 12 scripts

bioc

MultimodalExperiment:Integrative Bulk and Single-Cell Experiment Container

MultimodalExperiment is an S4 class that integrates bulk and single-cell experiment data; it is optimally storage-efficient, and its methods are exceptionally fast. It effortlessly represents multimodal data of any nature and features normalized experiment, subject, sample, and cell annotations, which are related to underlying biological experiments through maps. Its coordination methods are opt-in and employ database-like join operations internally to deliver fast and flexible management of multimodal data.

Maintained by Lucas Schiffer. Last updated 5 months ago.

datarepresentation infrastructure singlecell

4.00 score 3 scripts

bioc

surfaltr:Rapid Comparison of Surface Protein Isoform Membrane Topologies Through surfaltr

Cell surface proteins form a major fraction of the druggable proteome and can be used for tissue-specific delivery of oligonucleotide/cell-based therapeutics. Alternatively spliced surface protein isoforms have been shown to differ in their subcellular localization and/or their transmembrane (TM) topology. Surface proteins are hydrophobic and remain difficult to study thereby necessitating the use of TM topology prediction methods such as TMHMM and Phobius. However, there exists a need for bioinformatic approaches to streamline batch processing of isoforms for comparing and visualizing topologies. To address this gap, we have developed an R package, surfaltr. It pairs inputted isoforms, either known alternatively spliced or novel, with their APPRIS annotated principal counterparts, predicts their TM topologies using TMHMM or Phobius, and generates a customizable graphical output. Further, surfaltr facilitates the prioritization of biologically diverse isoform pairs through the incorporation of three different ranking metrics and through protein alignment functions. Citations for programs mentioned here can be found in the vignette.

Maintained by Pooja Gangras. Last updated 5 months ago.

software visualization datarepresentation splicedalignment alignment multiplesequencealignment multiplecomparison

4.00 score 2 scripts

bioc

ExperimentSubset:Manages subsets of data with Bioconductor Experiment objects

Experiment objects such as the SummarizedExperiment or SingleCellExperiment are data containers for one or more matrix-like assays along with the associated row and column data. Often only a subset of the original data is needed for down-stream analysis. For example, filtering out poor quality samples will require excluding some columns before analysis. The ExperimentSubset object is a container to efficiently manage different subsets of the same data without having to make separate objects for each new subset.

Maintained by Irzam Sarfraz. Last updated 5 months ago.

infrastructure software dataimport datarepresentation

4.00 score 8 scripts

bioc

OMICsPCA:An R package for quantitative integration and analysis of multiple omics assays from heterogeneous samples

OMICsPCA is an analysis pipeline designed to integrate multi-OMICs experiments done on various subjects (e.g. cell lines, individuals), treatments (e.g. disease/control) or time points and to analyse such integrated data from various angles and perspectives. At its core, OMICsPCA uses Principal Component Analysis (PCA) to integrate multiomics experiments from various sources and thus has the ability to overcome data insufficiency issues by using the integrated data as representatives. OMICsPCA can be used in various applications, including analysis of the overall distribution of OMICs assays across various samples/individuals/time points; grouping assays by user-defined conditions; and identification of sources of variation and of similarity/dissimilarity between assays, variables or individuals.

Maintained by Subhadeep Das. Last updated 5 months ago.

immunooncology multiplecomparison principalcomponent datarepresentation workflow visualization dimensionreduction clustering biologicalquestion epigeneticsworkflow transcription geneticvariability gui biomedicalinformatics epigenetics functionalgenomics singlecell

4.00 score 1 scripts

bioc

alabaster:Umbrella for the Alabaster Framework

Umbrella for the alabaster suite, providing a single-line import for all alabaster.* packages. Installing this package ensures that all known alabaster.* packages are also installed, avoiding problems with missing packages when a staging method or loading function is dynamically requested. Obviously, this comes at the cost of needing to install more packages, so advanced users and application developers may prefer to install the required alabaster.* packages individually.

Maintained by Aaron Lun. Last updated 5 months ago.

datarepresentation dataimport

4.00 score 3 scripts

bioc

Omixer:Omixer: multivariate and reproducible sample randomization to proactively counter batch effects in omics studies

Omixer is a Bioconductor package for multivariate and reproducible sample randomization, which ensures optimal sample distribution across batches with well-documented methods. It outputs lab-friendly sample layouts, reducing the risk of sample mixups when manually pipetting randomized samples.

Maintained by Lucy Sinke. Last updated 5 months ago.

datarepresentation experimentaldesign qualitycontrol software visualization

4.00 score 2 scripts

bioc

doseR:doseR

The doseR package is a next generation sequencing package for sex chromosome dosage compensation which can be applied broadly to detect shifts in gene expression among an arbitrary number of pre-defined groups of loci. doseR is a differential gene expression package for count data that detects directional shifts in expression for multiple, specific subsets of genes, with broad utility in systems biology research. doseR has been prepared to manage the nature of the data and the desired set of inferences. doseR uses S4 classes to store count data from sequencing experiments. It contains functions to normalize and filter count data, as well as to plot and calculate statistics of count data. It contains a framework for linear modeling of count data. The package has been tested using real and simulated data.

Maintained by ake.vastermark. Last updated 5 months ago.

infrastructure software datarepresentation sequencing geneexpression systemsbiology differentialexpression

4.00 score 3 scripts

bioc

VCFArray:Representing on-disk / remote VCF files as array-like objects

VCFArray extends the DelayedArray to represent VCF data entries as array-like objects with on-disk / remote VCF file as backend. Data entries from VCF files, including info fields, FORMAT fields, and the fixed columns (REF, ALT, QUAL, FILTER) could be converted into VCFArray instances with different dimensions.

Maintained by Qian Liu. Last updated 5 months ago.

infrastructure datarepresentation sequencing variantannotation

1 stars 4.00 score 3 scripts

bioc

Rvisdiff:Interactive Graphs for Differential Expression

Creates a multi-graph web page which allows the interactive exploration of differential expression results. The graphical web interface presents results as a table which is integrated with five interactive graphs: MA-plot, volcano plot, box plot, lines plot and cluster heatmap. The graphical appearance and the information represented in the graphs can be customized by means of user controls. Final graphics can be exported in PNG format.

Maintained by David Barrios. Last updated 5 months ago.

software visualization rnaseq datarepresentation differentialexpression

4.00 score 2 scripts

bioc

KEGGlincs:Visualize all edges within a KEGG pathway and overlay LINCS data

See what is going on 'under the hood' of KEGG pathways by explicitly re-creating the pathway maps from information obtained from KGML files.

Maintained by Shana White. Last updated 5 months ago.

networkinference geneexpression datarepresentation thirdpartyclient cellbiology graphandnetwork pathways kegg network

4.00 score 3 scripts

bioc

CellTrails:Reconstruction, visualization and analysis of branching trajectories

CellTrails is an unsupervised algorithm for the de novo chronological ordering, visualization and analysis of single-cell expression data. CellTrails makes use of a geometrically motivated concept of lower-dimensional manifold learning, which exhibits a multitude of virtues that counteract intrinsic noise of single cell data caused by drop-outs, technical variance, and redundancy of predictive variables. CellTrails enables the reconstruction of branching trajectories and provides an intuitive graphical representation of expression patterns along all branches simultaneously. It allows the user to define and infer the expression dynamics of individual and multiple pathways towards distinct phenotypes.

Maintained by Daniel Ellwanger. Last updated 5 months ago.

immunooncology clustering datarepresentation differentialexpression dimensionreduction geneexpression sequencing singlecell software timecourse

4.00 score 7 scripts

bioc

SCArray.sat:Large-scale single-cell RNA-seq data analysis using GDS files and Seurat

Extends the Seurat classes and functions to support Genomic Data Structure (GDS) files as a DelayedArray backend for data representation. It relies on the implementation of GDS-based DelayedMatrix in the SCArray package to represent single cell RNA-seq data. The common optimized algorithms leveraging GDS-based and single cell-specific DelayedMatrix (SC_GDSMatrix) are implemented in the SCArray package. SCArray.sat introduces a new SCArrayAssay class (derived from the Seurat Assay), which wraps raw counts, normalized expressions and scaled data matrix based on GDS-specific DelayedMatrix. It is designed to integrate seamlessly with the Seurat package to provide common data analysis in the SeuratObject-based workflow. Compared with Seurat, SCArray.sat significantly reduces the memory usage without downsampling and can be applied to very large datasets.

Maintained by Xiuwen Zheng. Last updated 5 days ago.

datarepresentation dataimport singlecell rnaseq

1 stars 3.48 score 3 scripts

bioc

interactiveDisplay:Package for enabling powerful shiny web displays of Bioconductor objects

The interactiveDisplay package contains the methods needed to generate interactive Shiny-based displays of Bioconductor objects.

Maintained by Bioconductor Package Maintainer. Last updated 3 months ago.

go geneexpression microarray sequencing classification network qualitycontrol visualization genetics datarepresentation gui annotationdata shinyapps

3.48 score 4 scripts

bioc

R453Plus1Toolbox:A package for importing and analyzing data from Roche's Genome Sequencer System

The R453Plus1 Toolbox comprises useful functions for the analysis of data generated by Roche's 454 sequencing platform. It adds functions for quality assurance as well as for annotation and visualization of detected variants, complementing the software tools shipped by Roche with their product. Further, a pipeline for the detection of structural variants is provided.

Maintained by Hans-Ulrich Klein. Last updated 5 months ago.

sequencing infrastructure dataimport datarepresentation visualization qualitycontrol reportwriting

3.48 score 10 scripts

bioc

flowPlots:flowPlots: analysis plots and data class for gated flow cytometry data

Graphical displays with embedded statistical tests for gated ICS flow cytometry data, and a data class which stores "stacked" data and has methods for computing summary measures on stacked data, such as marginal and polyfunctional degree data.

Maintained by N. Hawkins. Last updated 5 months ago.

immunooncology flowcytometry cellbasedassays visualization datarepresentation

3.30 score 1 scripts

bioc

ipdDb:IPD IMGT/HLA and IPD KIR database for Homo sapiens

All alleles from the IPD IMGT/HLA <https://www.ebi.ac.uk/ipd/imgt/hla/> and IPD KIR <https://www.ebi.ac.uk/ipd/kir/> databases for Homo sapiens. Reference: Robinson J, Maccari G, Marsh SGE, Walter L, Blokhuis J, Bimber B, Parham P, De Groot NG, Bontrop RE, Guethlein LA, and Hammond JA. KIR Nomenclature in non-human species. Immunogenetics (2018), in preparation.

Maintained by Steffen Klasberg. Last updated 5 months ago.

genomicvariation sequencematching variantannotation datarepresentation annotationhubsoftware

3.30 score 4 scripts

bioc

MultiBaC:Multiomic Batch effect Correction

MultiBaC is a strategy to correct batch effects from multiomic datasets distributed across different labs or data acquisition events. MultiBaC is the first batch effect correction algorithm that deals with multiomics datasets. MultiBaC is able to remove batch effects across different omics generated within separate batches, provided that at least one common omic data type is included in all the batches considered.

Maintained by The package maintainer. Last updated 5 months ago.

software statisticalmethod principalcomponent datarepresentation geneexpression transcription batcheffect

3.30 score 7 scripts

bioc

BaseSpaceR:R SDK for BaseSpace RESTful API

A rich R interface to Illumina's BaseSpace cloud computing environment, enabling the fast development of data analysis and visualisation tools.

Maintained by Jared OConnell. Last updated 5 months ago.

infrastructure datarepresentation connecttools software dataimport highthroughputsequencing sequencing genetics

3.30 score 9 scripts

bioc

esetVis:Visualizations of expressionSet Bioconductor object

Utility functions for visualizing an expressionSet (or SummarizedExperiment) Bioconductor object, including spectral map, t-SNE and linear discriminant analysis. Static plots via the ggplot2 package and interactive plots via the ggvis or rbokeh packages are available.

Maintained by Laure Cougnaud. Last updated 5 months ago.

visualization datarepresentation dimensionreduction principalcomponent pathways

3.30 score 6 scripts

bioc

NeuCA:NEUral network-based single-Cell Annotation tool

NeuCA is a neural-network-based method for scRNA-seq data annotation. It can automatically adjust its classification strategy depending on cell type correlations, to accurately annotate cells. NeuCA can automatically utilize the structure information of the cell types through a hierarchical tree to improve the annotation accuracy. It is especially helpful when the data contain closely correlated cell types.

Maintained by Hao Feng. Last updated 3 days ago.

singlecell software classification neuralnetwork rnaseq transcriptomics datarepresentation transcription sequencing preprocessing geneexpression dataimport

3.18 score 3 scripts

henrikbengtsson

aroma.cn:Copy-Number Analysis of Large Microarray Data Sets

Methods for analyzing DNA copy-number data. Specifically, this package implements the multi-source copy-number normalization (MSCN) method for normalizing copy-number data obtained on various platforms and technologies. It also implements the TumorBoost method for normalizing paired tumor-normal SNP data.

Maintained by Henrik Bengtsson. Last updated 1 years ago.

proprietaryplatforms acgh copynumbervariants snp microarray onechannel twochannel dataimport datarepresentation preprocessing qualitycontrol

1 stars 2.70 score 9 scripts

ecamenen

MainExistingDatasets:Main Existing Human Datasets

Shiny for Open Science to visualize, share, and inventory the main existing human datasets for researchers.

Maintained by Etienne Camenen. Last updated 2 years ago.

biomedicalinformatics datarepresentation visualization

2.70 score 1 scripts

bioc

CTDquerier:Package for CTDbase data query, visualization and downstream analysis

Package to retrieve and visualize data from the Comparative Toxicogenomics Database (http://ctdbase.org/). The downloaded data is formatted as DataFrames for further downstream analyses.

Maintained by Xavier Escribà-Montagut. Last updated 5 months ago.

software biomedicalinformatics infrastructure dataimport datarepresentation genesetenrichment networkenrichment pathways network go kegg

2.30 score 2 scripts