Showing 19 of total 19 results (show query)
bioc
alabaster.base:Save Bioconductor Objects to File
Save Bioconductor data structures into file artifacts, and load them back into memory. This is a more robust and portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Maintained by Aaron Lun. Last updated 11 days ago.
datarepresentationdataimportzlibcpp
25.1 match 3 stars 8.47 score 60 scripts 15 dependentsbioc
glmGamPoi:Fit a Gamma-Poisson Generalized Linear Model
Fit linear models to overdispersed count data. The package can estimate the overdispersion and fit repeated models for matrix input. It is designed to handle large input datasets as they typically occur in single cell RNA-seq experiments.
Maintained by Constantin Ahlmann-Eltze. Last updated 1 months ago.
regressionrnaseqsoftwaresinglecellgamma-poissonglmnegative-binomial-regressionon-diskopenblascpp
10.0 match 110 stars 12.11 score 1.0k scripts 4 dependentsbioc
flowWorkspace:Infrastructure for representing and interacting with gated and ungated cytometry data sets.
This package is designed to facilitate comparison of automated gating methods against manual gating done in flowJo. This package allows you to import basic flowJo workspaces into BioConductor and replicate the gating from flowJo using the flowCore functionality. Gating hierarchies, groups of samples, compensation, and transformation are performed so that the output matches the flowJo analysis.
Maintained by Greg Finak. Last updated 9 days ago.
immunooncologyflowcytometrydataimportpreprocessingdatarepresentationzlibopenblascpp
13.7 match 7.89 score 576 scripts 10 dependentsbioc
DelayedArray:A unified framework for working transparently with on-disk and in-memory array-like datasets
Wrapping an array-like object (typically an on-disk object) in a DelayedArray object allows one to perform common array operations on it without loading the object in memory. In order to reduce memory usage and optimize performance, operations on the object are either delayed or executed using a block processing mechanism. Note that this also works on in-memory array-like objects like DataFrame objects (typically with Rle columns), Matrix objects, ordinary arrays and, data frames.
Maintained by Hervé Pagès. Last updated 30 days ago.
infrastructuredatarepresentationannotationgenomeannotationbioconductor-packagecore-packageu24ca289073
5.2 match 27 stars 15.59 score 538 scripts 1.2k dependentsbioc
Rarr:Read Zarr Files in R
The Zarr specification defines a format for chunked, compressed, N-dimensional arrays. It's design allows efficient access to subsets of the stored array, and supports both local and cloud storage systems. Rarr aims to implement this specifcation in R with minimal reliance on an external tools or libraries.
Maintained by Mike Smith. Last updated 17 days ago.
dataimportome-zarron-diskout-of-memoryzarrc-blosclibzstd
10.0 match 35 stars 6.68 score 21 scriptsadamlilith
fasterRaster:Faster Raster and Spatial Vector Processing Using 'GRASS GIS'
Processing of large-in-memory/large-on disk rasters and spatial vectors using 'GRASS GIS' <https://grass.osgeo.org/>. Most functions in the 'terra' package are recreated. Processing of medium-sized and smaller spatial objects will nearly always be faster using 'terra' or 'sf', but for large-in-memory/large-on-disk objects, 'fasterRaster' may be faster. To use most of the functions, you must have the stand-alone version (not the 'OSGeoW4' installer version) of 'GRASS GIS' 8.0 or higher.
Maintained by Adam B. Smith. Last updated 18 days ago.
aspectdistancefragmentationfragmentation-indicesgisgrassgrass-gisrasterraster-projectionrasterizeslopetopographyvectorization
6.1 match 58 stars 7.69 score 8 scriptstiledb-inc
tiledb:Modern Database Engine for Complex Data Based on Multi-Dimensional Arrays
The modern database 'TileDB' introduces a powerful on-disk format for storing and accessing any complex data based on multi-dimensional arrays. It supports dense and sparse arrays, dataframes and key-values stores, cloud storage ('S3', 'GCS', 'Azure'), chunked arrays, multiple compression, encryption and checksum filters, uses a fully multi-threaded implementation, supports parallel I/O, data versioning ('time travel'), metadata and groups. It is implemented as an embeddable cross-platform C++ library with APIs from several languages, and integrations. This package provides the R support.
Maintained by Isaiah Norton. Last updated 4 days ago.
arrayhdfss3storage-managertiledbcpp
2.3 match 107 stars 11.96 score 306 scripts 4 dependentsbioc
VCFArray:Representing on-disk / remote VCF files as array-like objects
VCFArray extends the DelayedArray to represent VCF data entries as array-like objects with on-disk / remote VCF file as backend. Data entries from VCF files, including info fields, FORMAT fields, and the fixed columns (REF, ALT, QUAL, FILTER) could be converted into VCFArray instances with different dimensions.
Maintained by Qian Liu. Last updated 5 months ago.
infrastructuredatarepresentationsequencingvariantannotation
6.2 match 1 stars 4.00 score 3 scriptswelch-lab
rliger:Linked Inference of Genomic Experimental Relationships
Uses an extension of nonnegative matrix factorization to identify shared and dataset-specific factors. See Welch J, Kozareva V, et al (2019) <doi:10.1016/j.cell.2019.05.006>, and Liu J, Gao C, Sodicoff J, et al (2020) <doi:10.1038/s41596-020-0391-8> for more details.
Maintained by Yichen Wang. Last updated 2 months ago.
nonnegative-matrix-factorizationsingle-cellopenblascpp
1.9 match 408 stars 10.77 score 334 scripts 1 dependentsbioc
VariantExperiment:A RangedSummarizedExperiment Container for VCF/GDS Data with GDS Backend
VariantExperiment is a Bioconductor package for saving data in VCF/GDS format into RangedSummarizedExperiment object. The high-throughput genetic/genomic data are saved in GDSArray objects. The annotation data for features/samples are saved in DelayedDataFrame format with mono-dimensional GDSArray in each column. The on-disk representation of both assay data and annotation data achieves on-disk reading and processing and saves memory space significantly. The interface of RangedSummarizedExperiment data format enables easy and common manipulations for high-throughput genetic/genomic data with common SummarizedExperiment metaphor in R and Bioconductor.
Maintained by Qian Liu. Last updated 5 months ago.
infrastructuredatarepresentationsequencingannotationgenomeannotationgenotypingarray
4.0 match 1 stars 5.00 score 2 scriptsbioc
HDF5Array:HDF5 datasets as array-like objects in R
The HDF5Array package is an HDF5 backend for DelayedArray objects. It implements the HDF5Array, H5SparseMatrix, H5ADMatrix, and TENxMatrix classes, 4 convenient and memory-efficient array-like containers for representing and manipulating either: (1) a conventional (a.k.a. dense) HDF5 dataset, (2) an HDF5 sparse matrix (stored in CSR/CSC/Yale format), (3) the central matrix of an h5ad file (or any matrix in the /layers group), or (4) a 10x Genomics sparse matrix. All these containers are DelayedArray extensions and thus support all operations (delayed or block-processed) supported by DelayedArray objects.
Maintained by Hervé Pagès. Last updated 25 days ago.
infrastructuredatarepresentationdataimportsequencingrnaseqcoverageannotationgenomeannotationsinglecellimmunooncologybioconductor-packagecore-packageu24ca289073
1.3 match 12 stars 13.19 score 844 scripts 123 dependentsbioc
DelayedDataFrame:Delayed operation on DataFrame using standard DataFrame metaphor
Based on the standard DataFrame metaphor, we are trying to implement the feature of delayed operation on the DelayedDataFrame, with a slot of lazyIndex, which saves the mapping indexes for each column of DelayedDataFrame. Methods like show, validity check, [/[[ subsetting, rbind/cbind are implemented for DelayedDataFrame to be operated around lazyIndex. The listData slot stays untouched until a realization call e.g., DataFrame constructor OR as.list() is invoked.
Maintained by Qian Liu. Last updated 5 months ago.
infrastructuredatarepresentation
3.1 match 2 stars 5.26 score 3 scripts 1 dependentsbioc
mbkmeans:Mini-batch K-means Clustering for Single-Cell RNA-seq
Implements the mini-batch k-means algorithm for large datasets, including support for on-disk data representation.
Maintained by Davide Risso. Last updated 5 months ago.
clusteringgeneexpressionrnaseqsoftwaretranscriptomicssequencingsinglecellhuman-cell-atlascpp
1.8 match 10 stars 7.41 score 54 scripts 2 dependentsbioc
methrix:Fast and efficient summarization of generic bedGraph files from Bisufite sequencing
Bedgraph files generated by Bisulfite pipelines often come in various flavors. Critical downstream step requires summarization of these files into methylation/coverage matrices. This step of data aggregation is done by Methrix, including many other useful downstream functions.
Maintained by Anand Mayakonda. Last updated 5 months ago.
dnamethylationsequencingcoveragebedgraphbioinformaticsdna-methylation
1.8 match 31 stars 7.51 score 39 scripts 1 dependentsbioc
dreamlet:Scalable differential expression analysis of single cell transcriptomics datasets with complex study designs
Recent advances in single cell/nucleus transcriptomic technology has enabled collection of cohort-scale datasets to study cell type specific gene expression differences associated disease state, stimulus, and genetic regulation. The scale of these data, complex study designs, and low read count per cell mean that characterizing cell type specific molecular mechanisms requires a user-frieldly, purpose-build analytical framework. We have developed the dreamlet package that applies a pseudobulk approach and fits a regression model for each gene and cell cluster to test differential expression across individuals associated with a trait of interest. Use of precision-weighted linear mixed models enables accounting for repeated measures study designs, high dimensional batch effects, and varying sequencing depth or observed cells per biosample.
Maintained by Gabriel Hoffman. Last updated 5 months ago.
rnaseqgeneexpressiondifferentialexpressionbatcheffectqualitycontrolregressiongenesetenrichmentgeneregulationepigeneticsfunctionalgenomicstranscriptomicsnormalizationsinglecellpreprocessingsequencingimmunooncologysoftwarecpp
1.3 match 12 stars 8.09 score 128 scriptsbioc
SCArray:Large-scale single-cell omics data manipulation with GDS files
Provides large-scale single-cell omics data manipulation using Genomic Data Structure (GDS) files. It combines dense and sparse matrices stored in GDS files and the Bioconductor infrastructure framework (SingleCellExperiment and DelayedArray) to provide out-of-memory data storage and large-scale manipulation using the R programming language.
Maintained by Xiuwen Zheng. Last updated 4 months ago.
infrastructuredatarepresentationdataimportsinglecellrnaseqcpp
1.9 match 1 stars 4.02 score 9 scripts 1 dependentsbioc
BiocFileCache:Manage Files Across Sessions
This package creates a persistent on-disk cache of files that the user can add, update, and retrieve. It is useful for managing resources (such as custom Txdb objects) that are costly or difficult to create, web resources, and data files used across sessions.
Maintained by Lori Shepherd. Last updated 2 months ago.
dataimportcore-packageu24ca289073
0.5 match 13 stars 13.76 score 486 scripts 429 dependentsbioc
SCArray.sat:Large-scale single-cell RNA-seq data analysis using GDS files and Seurat
Extends the Seurat classes and functions to support Genomic Data Structure (GDS) files as a DelayedArray backend for data representation. It relies on the implementation of GDS-based DelayedMatrix in the SCArray package to represent single cell RNA-seq data. The common optimized algorithms leveraging GDS-based and single cell-specific DelayedMatrix (SC_GDSMatrix) are implemented in the SCArray package. SCArray.sat introduces a new SCArrayAssay class (derived from the Seurat Assay), which wraps raw counts, normalized expressions and scaled data matrix based on GDS-specific DelayedMatrix. It is designed to integrate seamlessly with the Seurat package to provide common data analysis in the SeuratObject-based workflow. Compared with Seurat, SCArray.sat significantly reduces the memory usage without downsampling and can be applied to very large datasets.
Maintained by Xiuwen Zheng. Last updated 5 months ago.
datarepresentationdataimportsinglecellrnaseq
1.9 match 1 stars 3.40 score 3 scriptsbioc
S4Arrays:Foundation of array-like containers in Bioconductor
The S4Arrays package defines the Array virtual class to be extended by other S4 classes that wish to implement a container with an array-like semantic. It also provides: (1) low-level functionality meant to help the developer of such container to implement basic operations like display, subsetting, or coercion of their array-like objects to an ordinary matrix or array, and (2) a framework that facilitates block processing of array-like objects (typically on-disk objects).
Maintained by Hervé Pagès. Last updated 1 months ago.
infrastructuredatarepresentationbioconductor-packagecore-package
0.5 match 5 stars 10.99 score 8 scripts 1.2k dependents