R-universe search: exports:colnames

bioc

SummarizedExperiment:A container (S4 class) for matrix-like assays

The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.

Maintained by Hervé Pagès. Last updated 5 months ago.

genetics infrastructure sequencing annotation coverage genomeannotation bioconductor-package core-package

34 stars 16.84 score 8.6k scripts 1.2k dependents

bioc

IRanges:Foundation of integer range manipulation in Bioconductor

Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure datarepresentation bioconductor-package core-package

22 stars 16.09 score 2.1k scripts 1.8k dependents

bioc

S4Vectors:Foundation of vector-like and list-like containers in Bioconductor

The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantics of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure datarepresentation bioconductor-package core-package

18 stars 16.05 score 1.0k scripts 1.9k dependents

bioc

AnnotationDbi:Manipulation of SQLite-based annotations in Bioconductor

Implements a user-friendly interface for querying SQLite-based annotation data packages.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

annotation microarray sequencing genomeannotation bioconductor-package core-package

9 stars 15.05 score 3.6k scripts 769 dependents

bioc

MultiAssayExperiment:Software for the integration of multi-omics experiments in Bioconductor

Harmonize data management of multiple experimental assays performed on an overlapping set of specimens. It provides a familiar Bioconductor user experience by extending concepts from SummarizedExperiment, supporting an open-ended mix of standard data classes for individual assays, and allowing subsetting by genomic ranges or rownames. Facilities are provided for reshaping data into wide and long formats for adaptability to graphing and downstream analysis.

Maintained by Marcel Ramos. Last updated 2 months ago.

infrastructure datarepresentation bioconductor bioconductor-package genomics nci-itcr tcga u24ca289073

71 stars 14.94 score 670 scripts 126 dependents

bioc

BiocGenerics:S4 generic functions used in Bioconductor

The package defines many S4 generic functions used in Bioconductor.

Maintained by Hervé Pagès. Last updated 2 months ago.

infrastructure bioconductor-package core-package

12 stars 14.22 score 612 scripts 2.2k dependents

vincentarelbundock

tinytable:Simple and Configurable Tables in 'HTML', 'LaTeX', 'Markdown', 'Word', 'PNG', 'PDF', and 'Typst' Formats

Create highly customized tables with this simple and dependency-free package. Data frames can be converted to 'HTML', 'LaTeX', 'Markdown', 'Word', 'PNG', 'PDF', or 'Typst' tables. The user interface is minimalist and easy to learn. The syntax is concise. 'HTML' tables can be customized using the flexible 'Bootstrap' framework, and 'LaTeX' code with the 'tabularray' package.

Maintained by Vincent Arel-Bundock. Last updated 9 days ago.

264 stars 12.26 score 562 scripts 10 dependents

bioc

biomformat:An interface package for the BIOM file format

This is an R package for interfacing with the BIOM format. This package includes basic tools for reading biom-format files, accessing and subsetting data tables from a biom object (which is more complex than a single table), as well as limited support for writing a biom-object back to a biom-format file. The design of this API is intended to match the python API and other tools included with the biom-format project, but with a decidedly "R flavor" that should be familiar to R users. This includes S4 classes and methods, as well as extensions of common core functions/methods.

Maintained by Paul J. McMurdie. Last updated 5 months ago.

immunooncology dataimport metagenomics microbiome

7 stars 11.39 score 416 scripts 40 dependents

bioc

universalmotif:Import, Modify, and Export Motifs with R

Allows for importing most common motif types into R for use by functions provided by other Bioconductor motif-related packages. Motifs can be exported into most major motif formats from various classes as defined by other Bioconductor packages. A suite of motif and sequence manipulation and analysis functions are included, including enrichment, comparison, P-value calculation, shuffling, trimming, higher-order motifs, and others.

Maintained by Benjamin Jean-Marie Tremblay. Last updated 5 months ago.

motifannotation motifdiscovery dataimport generegulation motif-analysis motif-enrichment-analysis sequence-logo cpp

28 stars 11.04 score 342 scripts 12 dependents

wrathematics

float:32-Bit Floats

R comes with a suite of utilities for linear algebra with "numeric" (double precision) vectors/matrices. However, sometimes single precision (or less!) is more than enough for a particular task. This package extends R's linear algebra facilities to include 32-bit float (single precision) data. Float vectors/matrices have half the precision of their "numeric"-type counterparts but are generally faster to numerically operate on, for a performance vs accuracy trade-off. The internal representation is an S4 class, which allows us to keep the syntax identical to that of base R's. Interaction between floats and base types for binary operators is generally possible; in these cases, type promotion always defaults to the higher precision. The package ships with copies of the single precision 'BLAS' and 'LAPACK', which are automatically built in the event they are not available on the system.

Maintained by Drew Schmidt. Last updated 21 days ago.

float-matrix hpc linear-algebra matrix fortran openblas openmp

46 stars 10.53 score 228 scripts 42 dependents

bioc

Cardinal:A mass spectrometry imaging toolbox for statistical analysis

Implements statistical & computational tools for analyzing mass spectrometry imaging datasets, including methods for efficient pre-processing, spatial segmentation, and classification.

Maintained by Kylie Ariel Bemis. Last updated 3 months ago.

software infrastructure proteomics lipidomics massspectrometry imagingmassspectrometry immunooncology normalization clustering classification regression

48 stars 10.32 score 200 scripts

bioc

flowCore:flowCore: Basic structures for flow cytometry data

Provides S4 data structures and basic functions to deal with flow cytometry data.

Maintained by Mike Jiang. Last updated 5 months ago.

immunooncology infrastructure flowcytometry cellbasedassays cpp

10.17 score 1.7k scripts 59 dependents

robinhankin

Brobdingnag:Very Large Numbers in R

Very large numbers in R. Real numbers are held using their natural logarithms, plus a logical flag indicating sign. Functionality for complex numbers is also provided. The package includes a vignette that gives a step-by-step introduction to using S4 methods.

Maintained by Robin K. S. Hankin. Last updated 7 months ago.

5 stars 9.92 score 77 scripts 70 dependents

bioc

RcisTarget:RcisTarget Identify transcription factor binding motifs enriched on a list of genes or genomic regions

RcisTarget identifies transcription factor binding motifs (TFBS) over-represented on a gene list. In a first step, RcisTarget selects DNA motifs that are significantly over-represented in the surroundings of the transcription start site (TSS) of the genes in the gene-set. This is achieved by using a database that contains genome-wide cross-species rankings for each motif. The motifs are then annotated to TFs, and those that have a high Normalized Enrichment Score (NES) are retained. Finally, for each motif and gene-set, RcisTarget predicts the candidate target genes (i.e. genes in the gene-set that are ranked above the leading edge).

Maintained by Gert Hulselmans. Last updated 5 months ago.

generegulation motifannotation transcriptomics transcription genesetenrichment genetarget

37 stars 9.18 score 191 scripts

andreyshabalin

MatrixEQTL:Matrix eQTL: Ultra Fast eQTL Analysis via Large Matrix Operations

Matrix eQTL is designed for fast eQTL analysis on large datasets. Matrix eQTL can test for association between genotype and gene expression using linear regression with either additive or ANOVA genotype effects. The models can include covariates to account for factors such as population stratification, gender, and clinical variables. It also supports models with heteroscedastic and/or correlated errors, false discovery rate estimation and separate treatment of local (cis) and distant (trans) eQTLs. For more details see Shabalin (2012) <doi:10.1093/bioinformatics/bts163>.

Maintained by Andrey A Shabalin. Last updated 2 years ago.

74 stars 8.31 score 612 scripts 3 dependents

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 year ago.

3 stars 8.20 score 7.8k scripts 11 dependents

r-hyperspec

hyperSpec:Work with Hyperspectral Data, i.e. Spectra + Meta Information (Spatial, Time, Concentration, ...)

Comfortable ways to work with hyperspectral data sets, i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each of the spectra. The spectra can be data as obtained in XRF, UV/VIS, Fluorescence, AES, NIR, IR, Raman, NMR, MS, etc. More generally, any data that is recorded over a discretized variable, e.g. absorbance = f(wavelength), stored as a vector of absorbance values for discrete wavelengths is suitable.

Maintained by Claudia Beleites. Last updated 10 months ago.

data-wrangling hyperspectral imaging infrared nmr raman spectroscopy uv-vis xrf

16 stars 8.10 score 233 scripts 2 dependents

polmine

polmineR:Verbs and Nouns for Corpus Analysis

Package for corpus analysis using the Corpus Workbench ('CWB', <https://cwb.sourceforge.io>) as an efficient back end for indexing and querying large corpora. The package offers functionality to flexibly create subcorpora and to carry out basic statistical operations (count, co-occurrences etc.). The original full text of documents can be reconstructed and inspected at any time. Beyond that, the package is intended to serve as an interface to packages implementing advanced statistical procedures. Respective data structures (document-term matrices, term-co-occurrence matrices etc.) can be created based on the indexed corpora.

Maintained by Andreas Blaette. Last updated 1 year ago.

49 stars 7.96 score 311 scripts

cran

timeSeries:Financial Time Series Objects (Rmetrics)

'S4' classes and various tools for financial time series: Basic functions such as scaling and sorting, subsetting, mathematical operations and statistical functions.

Maintained by Georgi N. Boshnakov. Last updated 6 months ago.

2 stars 7.89 score 146 dependents

bioc

flowWorkspace:Infrastructure for representing and interacting with gated and ungated cytometry data sets.

This package is designed to facilitate comparison of automated gating methods against manual gating done in flowJo. This package allows you to import basic flowJo workspaces into Bioconductor and replicate the gating from flowJo using the flowCore functionality. Gating hierarchies, groups of samples, compensation, and transformation are performed so that the output matches the flowJo analysis.

Maintained by Greg Finak. Last updated 24 days ago.

immunooncology flowcytometry dataimport preprocessing datarepresentation zlib openblas cpp

7.89 score 576 scripts 10 dependents

bioc

wateRmelon:Illumina DNA methylation array normalization and metrics

15 flavours of betas and three performance metrics, with methods for objects produced by methylumi and minfi packages.

Maintained by Leo C Schalkwyk. Last updated 4 months ago.

dnamethylation microarray twochannel preprocessing qualitycontrol

7.73 score 247 scripts 2 dependents

bioc

CytoML:A GatingML Interface for Cross Platform Cytometry Data Sharing

Uses platform-specific implementations of the GatingML2.0 standard to exchange gated cytometry data with other software platforms.

Maintained by Mike Jiang. Last updated 24 days ago.

immunooncology flowcytometry dataimport datarepresentation zlib openblas libxml2 cpp

30 stars 7.60 score 132 scripts

bioc

ncdfFlow:ncdfFlow: A package that provides HDF5 based storage for flow cytometry data.

Provides HDF5 storage based methods and functions for manipulation of flow cytometry data.

Maintained by Mike Jiang. Last updated 3 months ago.

immunooncology flowcytometry zlib cpp

7.56 score 96 scripts 11 dependents

bioc

cola:A Framework for Consensus Partitioning

Subgroup classification is a basic task in genomic data analysis, especially for gene expression and DNA methylation data analysis. It can also be used to test the agreement with known clinical annotations, or to test whether there exist significant batch effects. The cola package provides a general framework for subgroup classification by consensus partitioning. It has the following features: 1. It modularizes the consensus partitioning processes so that various methods can be easily integrated. 2. It provides rich visualizations for interpreting the results. 3. It allows running multiple methods at the same time and provides functionality to compare results straightforwardly. 4. It provides a new method to extract features which separate subgroups more efficiently. 5. It automatically generates detailed reports for the complete analysis. 6. It allows applying consensus partitioning in a hierarchical manner.

Maintained by Zuguang Gu. Last updated 2 months ago.

clustering geneexpression classification software consensus-clustering cpp

61 stars 7.49 score 112 scripts

robinhankin

onion:Octonions and Quaternions

Quaternions and Octonions are four- and eight- dimensional extensions of the complex numbers. They are normed division algebras over the real numbers and find applications in spatial rotations (quaternions), and string theory and relativity (octonions). The quaternions are noncommutative and the octonions nonassociative. See the package vignette for more details.

Maintained by Robin K. S. Hankin. Last updated 1 month ago.

6 stars 7.27 score 43 scripts 3 dependents

bioc

DRIMSeq:Differential transcript usage and tuQTL analyses with Dirichlet-multinomial model in RNA-seq

The package provides two frameworks. One for the differential transcript usage analysis between different conditions and one for the tuQTL analysis. Both are based on modeling the counts of genomic features (i.e., transcripts) with the Dirichlet-multinomial distribution. The package also makes available functions for visualization and exploration of the data and results.

Maintained by Malgorzata Nowicka. Last updated 5 months ago.

immunooncology snp alternativesplicing differentialsplicing genetics rnaseq sequencing workflowstep multiplecomparison geneexpression differentialexpression

6.91 score 136 scripts 2 dependents

bioc

MetaboAnnotation:Utilities for Annotation of Metabolomics Data

High-level functions to assist in annotation of (metabolomics) data sets. These include functions to perform simple tentative annotations based on mass matching but also functions to consider m/z and retention times for annotation of LC-MS features given that respective reference values are available. In addition, the package provides high-level functions to simplify matching of LC-MS/MS spectra against spectral libraries and objects and functionality to represent and manage such matched data.

Maintained by Johannes Rainer. Last updated 3 months ago.

infrastructure metabolomics massspectrometry annotation mass-spectromtry

15 stars 6.90 score 35 scripts

bioc

GenomicFiles:Distributed computing by file or by range

This package provides infrastructure for parallel computations distributed 'by file' or 'by range'. User defined MAPPER and REDUCER functions provide added flexibility for data combination and manipulation.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

genetics infrastructure dataimport sequencing coverage

6.86 score 89 scripts 16 dependents

andreyshabalin

filematrix:File-Backed Matrix Class with Convenient Read and Write Access

Interface for working with large matrices stored in files, not in computer memory. Supports multiple non-character data types (double, integer, logical and raw) of various sizes (e.g. 8 and 4 byte real values). Access to parts of the matrix is done by indexing, exactly as with usual R matrices. Supports very large matrices. Tested on multi-terabyte matrices. Allows for more than 2^32 rows or columns. Allows for quick addition of extra columns to a filematrix. Cross-platform as the package has R code only.

Maintained by Andrey A Shabalin. Last updated 6 years ago.

8 stars 6.51 score 45 scripts 2 dependents

bioc

annaffy:Annotation tools for Affymetrix biological metadata

Functions for handling data from Bioconductor Affymetrix annotation data packages. Produces compact HTML and text reports including experimental data and URL links to many online databases. Allows searching biological metadata using various criteria.

Maintained by Colin A. Smith. Last updated 5 months ago.

onechannel microarray annotation go pathways reportwriting

5.64 score 60 scripts 3 dependents

bioc

bigmelon:Illumina methylation array analysis for large experiments

Methods for working with Illumina arrays using gdsfmt.

Maintained by Leonard C. Schalkwyk. Last updated 5 months ago.

dnamethylation microarray twochannel preprocessing qualitycontrol methylationarray dataimport cpgisland

5.47 score 21 scripts

bioc

easyRNASeq:Count summarization and normalization for RNA-Seq data

Calculates the coverage of high-throughput short-reads against a reference genome and summarizes it per feature of interest (e.g. exon, gene, transcript). The data can be normalized as 'RPKM' or by the 'DESeq' or 'edgeR' package.

Maintained by Nicolas Delhomme. Last updated 5 months ago.

geneexpression rnaseq genetics preprocessing immunooncology

5.43 score 15 scripts 1 dependent

bioc

VanillaICE:A Hidden Markov Model for high throughput genotyping arrays

Hidden Markov Models for characterizing chromosomal alteration in high throughput SNP arrays.

Maintained by Robert Scharpf. Last updated 5 months ago.

copynumbervariation

5.36 score 63 scripts 1 dependent

centerforstatistics-ugent

xnet:Two-Step Kernel Ridge Regression for Network Predictions

Fit a two-step kernel ridge regression model for predicting edges in networks, and carry out cross-validation using shortcuts for swift and accurate performance assessment (Stock et al, 2018 <doi:10.1093/bib/bby095> ).

Maintained by Joris Meys. Last updated 4 years ago.

11 stars 5.30 score 12 scripts

uchidamizuki

dibble:Dimensional Data Frames

Provides a 'dibble' that implements data cubes (derived from 'dimensional tibble'), and allows broadcasting by dimensional names.

Maintained by Mizuki Uchida. Last updated 1 month ago.

multidimensional-arrays tidy-data

14 stars 5.05 score 8 scripts

fatelarico

FinNet:Quickly Build and Manipulate Financial Networks

Provides classes, methods, and functions to deal with financial networks. Users can easily store information about both physical and legal persons by using pre-made classes that are designed for integration with scraping packages such as 'rvest' and 'RSelenium'. Moreover, the package assists in creating various types of financial networks, depending on the relation under scrutiny between its units (ownership, board interlocks, etc.) and the desired tie type (valued or binary), and renders them in the most common formats (adjacency matrix, incidence matrix, edge list, 'igraph', 'network'). There are also ad-hoc functions for the Fiedler value, global network efficiency, and cascade-failure analysis.

Maintained by Fabio Ashtar Telarico. Last updated 5 months ago.

cpp

2 stars 4.78 score 7 scripts

bioc

scanMiRApp:scanMiR shiny application

A shiny interface to the scanMiR package. The application enables the scanning of transcripts and custom sequences for miRNA binding sites, the visualization of KdModels and binding results, as well as browsing predicted repression data. In addition, it contains the IndexedFst class for fast indexed reading of large GenomicRanges or data.frames, and some utilities for facilitating scans and identifying enriched miRNA-target pairs.

Maintained by Pierre-Luc Germain. Last updated 5 months ago.

mirna sequencematching gui shinyapps

4.76 score 19 scripts

bioc

BufferedMatrix:A matrix data storage object held in temporary files

A tabular style data object where most data is stored outside main memory. A buffer is used to speed up access to data.

Maintained by Ben Bolstad. Last updated 4 months ago.

infrastructure

4.73 score 6 scripts 1 dependent

bioc

BiocHail:basilisk and hail

Use hail via basilisk when appropriate, or via reticulate. This package can be used in terra.bio to interact with UK Biobank resources processed by hail.is.

Maintained by Vincent Carey. Last updated 4 months ago.

infrastructure bioconductor genetics hail

6 stars 4.53 score 19 scripts

bioc

MultimodalExperiment:Integrative Bulk and Single-Cell Experiment Container

MultimodalExperiment is an S4 class that integrates bulk and single-cell experiment data; it is optimally storage-efficient, and its methods are exceptionally fast. It effortlessly represents multimodal data of any nature and features normalized experiment, subject, sample, and cell annotations, which are related to underlying biological experiments through maps. Its coordination methods are opt-in and employ database-like join operations internally to deliver fast and flexible management of multimodal data.

Maintained by Lucas Schiffer. Last updated 5 months ago.

datarepresentation infrastructure singlecell

4.00 score 3 scripts

clobatofern95

GPUmatrix:Basic Linear Algebra with GPU

GPUs are great resources for data analysis, especially in statistics and linear algebra. Unfortunately, very few packages connect R to the GPU, and none of them are transparent enough to run the computations on the GPU without substantial changes to the code. The maintenance of these packages is cumbersome: several of the earlier attempts have been removed from their respective repositories. It would be desirable to have a properly maintained R package that takes advantage of the GPU with minimal changes to the existing code. We have developed the GPUmatrix package (available on CRAN). GPUmatrix mimics the behavior of the Matrix package and extends R to use the GPU for computations. It includes single (FP32) and double (FP64) precision data types, and provides support for sparse matrices. It is easy to learn, and requires very few code changes to perform the operations on the GPU. GPUmatrix relies on either the Torch or Tensorflow R packages to perform the GPU operations. We have demonstrated its usefulness for several statistical and machine learning applications: non-negative matrix factorization, logistic regression and general linear models. We have also included a comparison of GPU and CPU performance on different matrix operations.

Maintained by Cesar Lobato-Fernandez. Last updated 1 year ago.

3.93 score 57 scripts 1 dependent

henrikbengtsson

R.huge:Methods for Accessing Huge Amounts of Data [deprecated]

DEPRECATED. Do not start building new projects based on this package. Cross-platform alternatives are the following packages: bigmemory (CRAN), ff (CRAN), BufferedMatrix (Bioconductor). The main usage of it was inside the aroma.affymetrix package. (The package currently provides a class representing a matrix where the actual data is stored in a binary format on the local file system. This way the size limit of the data is set by the file system and not the memory.)

Maintained by Henrik Bengtsson. Last updated 1 year ago.

3.88 score 2 scripts 5 dependents

lawremi

rsolr:R to Solr Interface

A comprehensive R API for querying Apache Solr databases. A Solr core is represented as a data frame or list that supports Solr-side filtering, sorting, transformation and aggregation, all through the familiar base R API. Queries are processed lazily, i.e., a query is only sent to the database when the data are required.

Maintained by Michael Lawrence. Last updated 3 years ago.

9 stars 3.65 score 6 scripts

adamkocsis

via:Virtual Arrays

The base class 'VirtualArray' is defined, which acts as a wrapper around lists allowing users to fold arbitrary sequential data into n-dimensional, R-style virtual arrays. The derived 'XArray' class is defined to be used for homogeneous lists that contain a single class of objects. The 'RasterArray' and 'SfArray' classes enable the use of stacked spatial data instead of lists.

Maintained by Adam T. Kocsis. Last updated 2 years ago.

3 stars 3.18 score 8 scripts

cran

ibmdbR:IBM in-Database Analytics for R

Functionality required to efficiently use R with IBM(R) Db2(R) Warehouse offerings (formerly IBM dashDB(R)) and IBM Db2 for z/OS(R) in conjunction with IBM Db2 Analytics Accelerator for z/OS. Many basic and complex R operations are pushed down into the database, which removes the main memory boundary of R and allows full use of parallel processing in the underlying database. For executing R functions in parallel in a multi-node environment, the idaTApply() function requires the 'SparkR' package (<https://spark.apache.org/docs/latest/sparkr.html>). The optional 'ggplot2' package is needed for the plot.idaLm() function only.

Maintained by Shaikh Quader. Last updated 1 year ago.

2 stars 3.00 score

apedrods

MAINT.Data:Model and Analyse Interval Data

Implements methodologies for modelling interval data by Normal and Skew-Normal distributions, considering appropriate parameterizations of the variance-covariance matrix that take into account the intrinsic nature of interval data, and lead to four different possible configuration structures. The Skew-Normal parameters can be estimated by maximum likelihood, while Normal parameters may be estimated by maximum likelihood or robust trimmed maximum likelihood methods.

Maintained by Pedro Duarte Silva. Last updated 2 years ago.

openblas cpp

1.15 score 14 scripts