Showing 46 of total 46 results (show query)
bioc
SummarizedExperiment:A container (S4 class) for matrix-like assays
The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.
Maintained by Hervé Pagès. Last updated 5 months ago.
geneticsinfrastructuresequencingannotationcoveragegenomeannotationbioconductor-packagecore-package
34 stars 16.84 score 8.6k scripts 1.2k dependentsbioc
IRanges:Foundation of integer range manipulation in Bioconductor
Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.
Maintained by Hervé Pagès. Last updated 2 months ago.
infrastructuredatarepresentationbioconductor-packagecore-package
22 stars 16.09 score 2.1k scripts 1.8k dependentsbioc
S4Vectors:Foundation of vector-like and list-like containers in Bioconductor
The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantic of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).
Maintained by Hervé Pagès. Last updated 2 months ago.
infrastructuredatarepresentationbioconductor-packagecore-package
18 stars 16.05 score 1.0k scripts 1.9k dependentsbioc
AnnotationDbi:Manipulation of SQLite-based annotations in Bioconductor
Implements a user-friendly interface for querying SQLite-based annotation data packages.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
annotationmicroarraysequencinggenomeannotationbioconductor-packagecore-package
9 stars 15.05 score 3.6k scripts 769 dependentsbioc
MultiAssayExperiment:Software for the integration of multi-omics experiments in Bioconductor
Harmonize data management of multiple experimental assays performed on an overlapping set of specimens. It provides a familiar Bioconductor user experience by extending concepts from SummarizedExperiment, supporting an open-ended mix of standard data classes for individual assays, and allowing subsetting by genomic ranges or rownames. Facilities are provided for reshaping data into wide and long formats for adaptability to graphing and downstream analysis.
Maintained by Marcel Ramos. Last updated 2 months ago.
infrastructuredatarepresentationbioconductorbioconductor-packagegenomicsnci-itcrtcgau24ca289073
71 stars 14.94 score 670 scripts 126 dependentsbioc
BiocGenerics:S4 generic functions used in Bioconductor
The package defines many S4 generic functions used in Bioconductor.
Maintained by Hervé Pagès. Last updated 2 months ago.
infrastructurebioconductor-packagecore-package
12 stars 14.22 score 612 scripts 2.2k dependentsvincentarelbundock
tinytable:Simple and Configurable Tables in 'HTML', 'LaTeX', 'Markdown', 'Word', 'PNG', 'PDF', and 'Typst' Formats
Create highly customized tables with this simple and dependency-free package. Data frames can be converted to 'HTML', 'LaTeX', 'Markdown', 'Word', 'PNG', 'PDF', or 'Typst' tables. The user interface is minimalist and easy to learn. The syntax is concise. 'HTML' tables can be customized using the flexible 'Bootstrap' framework, and 'LaTeX' code with the 'tabularray' package.
Maintained by Vincent Arel-Bundock. Last updated 9 days ago.
264 stars 12.26 score 562 scripts 10 dependentsbioc
biomformat:An interface package for the BIOM file format
This is an R package for interfacing with the BIOM format. This package includes basic tools for reading biom-format files, accessing and subsetting data tables from a biom object (which is more complex than a single table), as well as limited support for writing a biom-object back to a biom-format file. The design of this API is intended to match the python API and other tools included with the biom-format project, but with a decidedly "R flavor" that should be familiar to R users. This includes S4 classes and methods, as well as extensions of common core functions/methods.
Maintained by Paul J. McMurdie. Last updated 5 months ago.
immunooncologydataimportmetagenomicsmicrobiome
7 stars 11.39 score 416 scripts 40 dependentsbioc
universalmotif:Import, Modify, and Export Motifs with R
Allows for importing most common motif types into R for use by functions provided by other Bioconductor motif-related packages. Motifs can be exported into most major motif formats from various classes as defined by other Bioconductor packages. A suite of motif and sequence manipulation and analysis functions are included, including enrichment, comparison, P-value calculation, shuffling, trimming, higher-order motifs, and others.
Maintained by Benjamin Jean-Marie Tremblay. Last updated 5 months ago.
motifannotationmotifdiscoverydataimportgeneregulationmotif-analysismotif-enrichment-analysissequence-logocpp
28 stars 11.04 score 342 scripts 12 dependentswrathematics
float:32-Bit Floats
R comes with a suite of utilities for linear algebra with "numeric" (double precision) vectors/matrices. However, sometimes single precision (or less!) is more than enough for a particular task. This package extends R's linear algebra facilities to include 32-bit float (single precision) data. Float vectors/matrices have half the precision of their "numeric"-type counterparts but are generally faster to numerically operate on, for a performance vs accuracy trade-off. The internal representation is an S4 class, which allows us to keep the syntax identical to that of base R's. Interaction between floats and base types for binary operators is generally possible; in these cases, type promotion always defaults to the higher precision. The package ships with copies of the single precision 'BLAS' and 'LAPACK', which are automatically built in the event they are not available on the system.
Maintained by Drew Schmidt. Last updated 21 days ago.
float-matrixhpclinear-algebramatrixfortranopenblasopenmp
46 stars 10.53 score 228 scripts 42 dependentsbioc
Cardinal:A mass spectrometry imaging toolbox for statistical analysis
Implements statistical & computational tools for analyzing mass spectrometry imaging datasets, including methods for efficient pre-processing, spatial segmentation, and classification.
Maintained by Kylie Ariel Bemis. Last updated 3 months ago.
softwareinfrastructureproteomicslipidomicsmassspectrometryimagingmassspectrometryimmunooncologynormalizationclusteringclassificationregression
48 stars 10.32 score 200 scriptsbioc
flowCore:flowCore: Basic structures for flow cytometry data
Provides S4 data structures and basic functions to deal with flow cytometry data.
Maintained by Mike Jiang. Last updated 5 months ago.
immunooncologyinfrastructureflowcytometrycellbasedassayscpp
10.17 score 1.7k scripts 59 dependentsrobinhankin
Brobdingnag:Very Large Numbers in R
Very large numbers in R. Real numbers are held using their natural logarithms, plus a logical flag indicating sign. Functionality for complex numbers is also provided. The package includes a vignette that gives a step-by-step introduction to using S4 methods.
Maintained by Robin K. S. Hankin. Last updated 7 months ago.
5 stars 9.92 score 77 scripts 70 dependentsbioc
RcisTarget:RcisTarget Identify transcription factor binding motifs enriched on a list of genes or genomic regions
RcisTarget identifies transcription factor binding motifs (TFBS) over-represented on a gene list. In a first step, RcisTarget selects DNA motifs that are significantly over-represented in the surroundings of the transcription start site (TSS) of the genes in the gene-set. This is achieved by using a database that contains genome-wide cross-species rankings for each motif. The motifs that are then annotated to TFs and those that have a high Normalized Enrichment Score (NES) are retained. Finally, for each motif and gene-set, RcisTarget predicts the candidate target genes (i.e. genes in the gene-set that are ranked above the leading edge).
Maintained by Gert Hulselmans. Last updated 5 months ago.
generegulationmotifannotationtranscriptomicstranscriptiongenesetenrichmentgenetarget
37 stars 9.18 score 191 scriptsandreyshabalin
MatrixEQTL:Matrix eQTL: Ultra Fast eQTL Analysis via Large Matrix Operations
Matrix eQTL is designed for fast eQTL analysis on large datasets. Matrix eQTL can test for association between genotype and gene expression using linear regression with either additive or ANOVA genotype effects. The models can include covariates to account for factors as population stratification, gender, and clinical variables. It also supports models with heteroscedastic and/or correlated errors, false discovery rate estimation and separate treatment of local (cis) and distant (trans) eQTLs. For more details see Shabalin (2012) <doi:10.1093/bioinformatics/bts163>.
Maintained by Andrey A Shabalin. Last updated 2 years ago.
74 stars 8.31 score 612 scripts 3 dependentstomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 years ago.
3 stars 8.20 score 7.8k scripts 11 dependentsr-hyperspec
hyperSpec:Work with Hyperspectral Data, i.e. Spectra + Meta Information (Spatial, Time, Concentration, ...)
Comfortable ways to work with hyperspectral data sets, i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each of the spectra. The spectra can be data as obtained in XRF, UV/VIS, Fluorescence, AES, NIR, IR, Raman, NMR, MS, etc. More generally, any data that is recorded over a discretized variable, e.g. absorbance = f(wavelength), stored as a vector of absorbance values for discrete wavelengths is suitable.
Maintained by Claudia Beleites. Last updated 10 months ago.
data-wranglinghyperspectralimaginginfrarednmrramanspectroscopyuv-visxrf
16 stars 8.10 score 233 scripts 2 dependentspolmine
polmineR:Verbs and Nouns for Corpus Analysis
Package for corpus analysis using the Corpus Workbench ('CWB', <https://cwb.sourceforge.io>) as an efficient back end for indexing and querying large corpora. The package offers functionality to flexibly create subcorpora and to carry out basic statistical operations (count, co-occurrences etc.). The original full text of documents can be reconstructed and inspected at any time. Beyond that, the package is intended to serve as an interface to packages implementing advanced statistical procedures. Respective data structures (document-term matrices, term-co-occurrence matrices etc.) can be created based on the indexed corpora.
Maintained by Andreas Blaette. Last updated 1 years ago.
49 stars 7.96 score 311 scriptscran
timeSeries:Financial Time Series Objects (Rmetrics)
'S4' classes and various tools for financial time series: Basic functions such as scaling and sorting, subsetting, mathematical operations and statistical functions.
Maintained by Georgi N. Boshnakov. Last updated 6 months ago.
2 stars 7.89 score 146 dependentsbioc
flowWorkspace:Infrastructure for representing and interacting with gated and ungated cytometry data sets.
This package is designed to facilitate comparison of automated gating methods against manual gating done in flowJo. This package allows you to import basic flowJo workspaces into BioConductor and replicate the gating from flowJo using the flowCore functionality. Gating hierarchies, groups of samples, compensation, and transformation are performed so that the output matches the flowJo analysis.
Maintained by Greg Finak. Last updated 24 days ago.
immunooncologyflowcytometrydataimportpreprocessingdatarepresentationzlibopenblascpp
7.89 score 576 scripts 10 dependentsbioc
wateRmelon:Illumina DNA methylation array normalization and metrics
15 flavours of betas and three performance metrics, with methods for objects produced by methylumi and minfi packages.
Maintained by Leo C Schalkwyk. Last updated 4 months ago.
dnamethylationmicroarraytwochannelpreprocessingqualitycontrol
7.73 score 247 scripts 2 dependentsbioc
CytoML:A GatingML Interface for Cross Platform Cytometry Data Sharing
Uses platform-specific implemenations of the GatingML2.0 standard to exchange gated cytometry data with other software platforms.
Maintained by Mike Jiang. Last updated 24 days ago.
immunooncologyflowcytometrydataimportdatarepresentationzlibopenblaslibxml2cpp
30 stars 7.60 score 132 scriptsbioc
ncdfFlow:ncdfFlow: A package that provides HDF5 based storage for flow cytometry data.
Provides HDF5 storage based methods and functions for manipulation of flow cytometry data.
Maintained by Mike Jiang. Last updated 3 months ago.
immunooncologyflowcytometryzlibcpp
7.56 score 96 scripts 11 dependentsbioc
cola:A Framework for Consensus Partitioning
Subgroup classification is a basic task in genomic data analysis, especially for gene expression and DNA methylation data analysis. It can also be used to test the agreement to known clinical annotations, or to test whether there exist significant batch effects. The cola package provides a general framework for subgroup classification by consensus partitioning. It has the following features: 1. It modularizes the consensus partitioning processes that various methods can be easily integrated. 2. It provides rich visualizations for interpreting the results. 3. It allows running multiple methods at the same time and provides functionalities to straightforward compare results. 4. It provides a new method to extract features which are more efficient to separate subgroups. 5. It automatically generates detailed reports for the complete analysis. 6. It allows applying consensus partitioning in a hierarchical manner.
Maintained by Zuguang Gu. Last updated 2 months ago.
clusteringgeneexpressionclassificationsoftwareconsensus-clusteringcpp
61 stars 7.49 score 112 scriptsrobinhankin
onion:Octonions and Quaternions
Quaternions and Octonions are four- and eight- dimensional extensions of the complex numbers. They are normed division algebras over the real numbers and find applications in spatial rotations (quaternions), and string theory and relativity (octonions). The quaternions are noncommutative and the octonions nonassociative. See the package vignette for more details.
Maintained by Robin K. S. Hankin. Last updated 1 months ago.
6 stars 7.27 score 43 scripts 3 dependentsbioc
DRIMSeq:Differential transcript usage and tuQTL analyses with Dirichlet-multinomial model in RNA-seq
The package provides two frameworks. One for the differential transcript usage analysis between different conditions and one for the tuQTL analysis. Both are based on modeling the counts of genomic features (i.e., transcripts) with the Dirichlet-multinomial distribution. The package also makes available functions for visualization and exploration of the data and results.
Maintained by Malgorzata Nowicka. Last updated 5 months ago.
immunooncologysnpalternativesplicingdifferentialsplicinggeneticsrnaseqsequencingworkflowstepmultiplecomparisongeneexpressiondifferentialexpression
6.91 score 136 scripts 2 dependentsbioc
MetaboAnnotation:Utilities for Annotation of Metabolomics Data
High level functions to assist in annotation of (metabolomics) data sets. These include functions to perform simple tentative annotations based on mass matching but also functions to consider m/z and retention times for annotation of LC-MS features given that respective reference values are available. In addition, the function provides high-level functions to simplify matching of LC-MS/MS spectra against spectral libraries and objects and functionality to represent and manage such matched data.
Maintained by Johannes Rainer. Last updated 3 months ago.
infrastructuremetabolomicsmassspectrometryannotationmass-spectromtry
15 stars 6.90 score 35 scriptsbioc
GenomicFiles:Distributed computing by file or by range
This package provides infrastructure for parallel computations distributed 'by file' or 'by range'. User defined MAPPER and REDUCER functions provide added flexibility for data combination and manipulation.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
geneticsinfrastructuredataimportsequencingcoverage
6.86 score 89 scripts 16 dependentsandreyshabalin
filematrix:File-Backed Matrix Class with Convenient Read and Write Access
Interface for working with large matrices stored in files, not in computer memory. Supports multiple non-character data types (double, integer, logical and raw) of various sizes (e.g. 8 and 4 byte real values). Access to parts of the matrix is done by indexing, exactly as with usual R matrices. Supports very large matrices. Tested on multi-terabyte matrices. Allows for more than 2^32 rows or columns. Allows for quick addition of extra columns to a filematrix. Cross-platform as the package has R code only.
Maintained by Andrey A Shabalin. Last updated 6 years ago.
8 stars 6.51 score 45 scripts 2 dependentsbioc
annaffy:Annotation tools for Affymetrix biological metadata
Functions for handling data from Bioconductor Affymetrix annotation data packages. Produces compact HTML and text reports including experimental data and URL links to many online databases. Allows searching biological metadata using various criteria.
Maintained by Colin A. Smith. Last updated 5 months ago.
onechannelmicroarrayannotationgopathwaysreportwriting
5.64 score 60 scripts 3 dependentsbioc
bigmelon:Illumina methylation array analysis for large experiments
Methods for working with Illumina arrays using gdsfmt.
Maintained by Leonard C. Schalkwyk. Last updated 5 months ago.
dnamethylationmicroarraytwochannelpreprocessingqualitycontrolmethylationarraydataimportcpgisland
5.47 score 21 scriptsbioc
easyRNASeq:Count summarization and normalization for RNA-Seq data
Calculates the coverage of high-throughput short-reads against a genome of reference and summarizes it per feature of interest (e.g. exon, gene, transcript). The data can be normalized as 'RPKM' or by the 'DESeq' or 'edgeR' package.
Maintained by Nicolas Delhomme. Last updated 5 months ago.
geneexpressionrnaseqgeneticspreprocessingimmunooncology
5.43 score 15 scripts 1 dependentsbioc
VanillaICE:A Hidden Markov Model for high throughput genotyping arrays
Hidden Markov Models for characterizing chromosomal alteration in high throughput SNP arrays.
Maintained by Robert Scharpf. Last updated 5 months ago.
5.36 score 63 scripts 1 dependentscenterforstatistics-ugent
xnet:Two-Step Kernel Ridge Regression for Network Predictions
Fit a two-step kernel ridge regression model for predicting edges in networks, and carry out cross-validation using shortcuts for swift and accurate performance assessment (Stock et al, 2018 <doi:10.1093/bib/bby095> ).
Maintained by Joris Meys. Last updated 4 years ago.
11 stars 5.30 score 12 scriptsuchidamizuki
dibble:Dimensional Data Frames
Provides a 'dibble' that implements data cubes (derived from 'dimensional tibble'), and allows broadcasting by dimensional names.
Maintained by Mizuki Uchida. Last updated 1 months ago.
multidimensional-arraystidy-data
14 stars 5.05 score 8 scriptsfatelarico
FinNet:Quickly Build and Manipulate Financial Networks
Providing classes, methods, and functions to deal with financial networks. Users can easily store information about both physical and legal persons by using pre-made classes that are studied for integration with scraping packages such as 'rvest' and 'RSelenium'. Moreover, the package assists in creating various types of financial networks depending on the type of relation between its units depending on the relation under scrutiny (ownership, board interlocks, etc.), the desired tie type (valued or binary), and renders them in the most common formats (adjacency matrix, incidence matrix, edge list, 'igraph', 'network'). There are also ad-hoc functions for the Fiedler value, global network efficiency, and cascade-failure analysis.
Maintained by Fabio Ashtar Telarico. Last updated 5 months ago.
2 stars 4.78 score 7 scriptsbioc
scanMiRApp:scanMiR shiny application
A shiny interface to the scanMiR package. The application enables the scanning of transcripts and custom sequences for miRNA binding sites, the visualization of KdModels and binding results, as well as browsing predicted repression data. In addition contains the IndexedFst class for fast indexed reading of large GenomicRanges or data.frames, and some utilities for facilitating scans and identifying enriched miRNA-target pairs.
Maintained by Pierre-Luc Germain. Last updated 5 months ago.
mirnasequencematchingguishinyapps
4.76 score 19 scriptsbioc
BufferedMatrix:A matrix data storage object held in temporary files
A tabular style data object where most data is stored outside main memory. A buffer is used to speed up access to data.
Maintained by Ben Bolstad. Last updated 4 months ago.
4.73 score 6 scripts 1 dependentsbioc
BiocHail:basilisk and hail
Use hail via basilisk when appropriate, or via reticulate. This package can be used in terra.bio to interact with UK Biobank resources processed by hail.is.
Maintained by Vincent Carey. Last updated 4 months ago.
infrastructurebioconductorgeneticshail
6 stars 4.53 score 19 scriptsbioc
MultimodalExperiment:Integrative Bulk and Single-Cell Experiment Container
MultimodalExperiment is an S4 class that integrates bulk and single-cell experiment data; it is optimally storage-efficient, and its methods are exceptionally fast. It effortlessly represents multimodal data of any nature and features normalized experiment, subject, sample, and cell annotations, which are related to underlying biological experiments through maps. Its coordination methods are opt-in and employ database-like join operations internally to deliver fast and flexible management of multimodal data.
Maintained by Lucas Schiffer. Last updated 5 months ago.
datarepresentationinfrastructuresinglecell
4.00 score 3 scriptshenrikbengtsson
R.huge:Methods for Accessing Huge Amounts of Data [deprecated]
DEPRECATED. Do not start building new projects based on this package. Cross-platform alternatives are the following packages: bigmemory (CRAN), ff (CRAN), BufferedMatrix (Bioconductor). The main usage of it was inside the aroma.affymetrix package. (The package currently provides a class representing a matrix where the actual data is stored in a binary format on the local file system. This way the size limit of the data is set by the file system and not the memory.)
Maintained by Henrik Bengtsson. Last updated 1 years ago.
3.88 score 2 scripts 5 dependentslawremi
rsolr:R to Solr Interface
A comprehensive R API for querying Apache Solr databases. A Solr core is represented as a data frame or list that supports Solr-side filtering, sorting, transformation and aggregation, all through the familiar base R API. Queries are processed lazily, i.e., a query is only sent to the database when the data are required.
Maintained by Michael Lawrence. Last updated 3 years ago.
9 stars 3.65 score 6 scriptsadamkocsis
via:Virtual Arrays
The base class 'VirtualArray' is defined, which acts as a wrapper around lists allowing users to fold arbitrary sequential data into n-dimensional, R-style virtual arrays. The derived 'XArray' class is defined to be used for homogeneous lists that contain a single class of objects. The 'RasterArray' and 'SfArray' classes enable the use of stacked spatial data instead of lists.
Maintained by Adam T. Kocsis. Last updated 2 years ago.
3 stars 3.18 score 8 scriptscran
ibmdbR:IBM in-Database Analytics for R
Functionality required to efficiently use R with IBM(R) Db2(R) Warehouse offerings (formerly IBM dashDB(R)) and IBM Db2 for z/OS(R) in conjunction with IBM Db2 Analytics Accelerator for z/OS. Many basic and complex R operations are pushed down into the database, which removes the main memory boundary of R and allows to make full use of parallel processing in the underlying database. For executing R-functions in a multi-node environment in parallel the idaTApply() function requires the 'SparkR' package (<https://spark.apache.org/docs/latest/sparkr.html>). The optional 'ggplot2' package is needed for the plot.idaLm() function only.
Maintained by Shaikh Quader. Last updated 1 years ago.
2 stars 3.00 scoreapedrods
MAINT.Data:Model and Analyse Interval Data
Implements methodologies for modelling interval data by Normal and Skew-Normal distributions, considering appropriate parameterizations of the variance-covariance matrix that takes into account the intrinsic nature of interval data, and lead to four different possible configuration structures. The Skew-Normal parameters can be estimated by maximum likelihood, while Normal parameters may be estimated by maximum likelihood or robust trimmed maximum likelihood methods.
Maintained by Pedro Duarte Silva. Last updated 2 years ago.
1.15 score 14 scripts