Showing 50 of 50 results
bioc
Biostrings:Efficient manipulation of biological strings
Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences.
Maintained by Hervé Pagès. Last updated 1 month ago.
sequencematching, alignment, sequencing, genetics, dataimport, datarepresentation, infrastructure, bioconductor-package, core-package
62 stars 17.77 score 8.6k scripts 1.2k dependents
rspatial
terra:Spatial Data Analysis
Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).
Maintained by Robert J. Hijmans. Last updated 2 days ago.
geospatial, raster, spatial, vector, onetbb, proj, gdal, geos, cpp
559 stars 17.64 score 17k scripts 855 dependents
rspatial
raster:Geographic Data Analysis and Modeling
Reading, writing, manipulating, analyzing and modeling of spatial data. This package has been superseded by the "terra" package <https://CRAN.R-project.org/package=terra>.
Maintained by Robert J. Hijmans. Last updated 18 hours ago.
163 stars 17.23 score 58k scripts 562 dependents
quanteda
quanteda:Quantitative Analysis of Textual Data
A fast, flexible, and comprehensive framework for quantitative text analysis in R. Provides functionality for corpus management, creating and manipulating tokens and n-grams, exploring keywords in context, forming and manipulating sparse matrices of documents by features and feature co-occurrences, analyzing keywords, computing feature similarities and distances, applying content dictionaries, applying supervised and unsupervised machine learning, visually representing text and text analyses, and more.
Maintained by Kenneth Benoit. Last updated 3 months ago.
corpus, natural-language-processing, quanteda, text-analytics, onetbb, cpp
851 stars 16.65 score 5.4k scripts 52 dependents
bioc
IRanges:Foundation of integer range manipulation in Bioconductor
Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.
Maintained by Hervé Pagès. Last updated 2 months ago.
infrastructure, datarepresentation, bioconductor-package, core-package
22 stars 16.09 score 2.1k scripts 1.8k dependents
bioc
S4Vectors:Foundation of vector-like and list-like containers in Bioconductor
The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantics of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).
Maintained by Hervé Pagès. Last updated 2 months ago.
infrastructure, datarepresentation, bioconductor-package, core-package
18 stars 16.05 score 1.0k scripts 1.9k dependents
bioc
GenomicFeatures:Query the gene models of a given organism/assembly
Extract the genomic locations of genes, transcripts, exons, introns, and CDS, for the gene models stored in a TxDb object. A TxDb object is a small database that contains the gene models of a given organism/assembly. Bioconductor provides a small collection of TxDb objects in the form of ready-to-install TxDb packages for the most commonly studied organisms. Additionally, the user can easily make a TxDb object (or package) for the organism/assembly of their choice by using the tools from the txdbmaker package.
Maintained by H. Pagès. Last updated 5 months ago.
genetics, infrastructure, annotation, sequencing, genomeannotation, bioconductor-package, core-package
26 stars 15.34 score 5.3k scripts 339 dependents
bioc
AnnotationDbi:Manipulation of SQLite-based annotations in Bioconductor
Implements a user-friendly interface for querying SQLite-based annotation data packages.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
annotation, microarray, sequencing, genomeannotation, bioconductor-package, core-package
9 stars 15.05 score 3.6k scripts 769 dependents
bioc
BiocGenerics:S4 generic functions used in Bioconductor
The package defines many S4 generic functions used in Bioconductor.
Maintained by Hervé Pagès. Last updated 2 months ago.
infrastructure, bioconductor-package, core-package
12 stars 14.22 score 612 scripts 2.2k dependents
bioc
BSgenome:Software infrastructure for efficient representation of full genomes and their SNPs
Infrastructure shared by all the Biostrings-based genome data packages.
Maintained by Hervé Pagès. Last updated 2 months ago.
genetics, infrastructure, datarepresentation, sequencematching, annotation, snp, bioconductor-package, core-package
9 stars 14.12 score 1.2k scripts 267 dependents
bioc
AnnotationHub:Client to access AnnotationHub resources
This package provides a client for the Bioconductor AnnotationHub web resource. The AnnotationHub web resource provides a central location where genomic files (e.g., VCF, bed, wig) and other resources from standard locations (e.g., UCSC, Ensembl) can be discovered. The resource includes metadata about each resource, e.g., a textual description, tags, and date of modification. The client creates and manages a local cache of files retrieved by the user, helping with quick and reproducible access.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
infrastructure, dataimport, gui, thirdpartyclient, core-package, u24ca289073
17 stars 13.88 score 2.7k scripts 104 dependents
bioc
Gviz:Plotting data and annotation information along genomic coordinates
Genomic data analysis requires integrated visualization of known genomic information and new experimental data. Gviz uses the biomaRt and the rtracklayer packages to perform live annotation queries to Ensembl and UCSC and translates the results into, e.g., gene/transcript structures in viewports of the grid graphics package. This results in genomic information plotted together with your data.
Maintained by Robert Ivanek. Last updated 5 months ago.
visualization, microarray, sequencing
79 stars 13.08 score 1.4k scripts 48 dependents
bioc
systemPipeR:systemPipeR: Workflow Environment for Data Analysis and Report Generation
systemPipeR is a multipurpose data analysis workflow environment that unifies R with command-line tools. It enables scientists to analyze many types of large- or small-scale data on local or distributed computer systems with a high level of reproducibility, scalability and portability. At its core is a command-line interface (CLI) that adopts the Common Workflow Language (CWL). This design allows users to choose for each analysis step the optimal R or command-line software. It supports both end-to-end and partial execution of workflows with built-in restart functionalities. Efficient management of complex analysis tasks is accomplished by a flexible workflow control container class. Handling of large numbers of input samples and experimental designs is facilitated by consistent sample annotation mechanisms. As a multi-purpose workflow toolkit, systemPipeR enables users to run existing workflows, customize them or design entirely new ones while taking advantage of widely adopted data structures within the Bioconductor ecosystem. Another important core functionality is the generation of reproducible scientific analysis and technical reports. For result interpretation, systemPipeR offers a wide range of plotting functionality, while an associated Shiny App offers many useful functionalities for interactive result exploration. The vignettes linked from this page include (1) a general introduction, (2) a description of technical details, and (3) a collection of workflow templates.
Maintained by Thomas Girke. Last updated 5 months ago.
genetics, infrastructure, dataimport, sequencing, rnaseq, riboseq, chipseq, methylseq, snp, geneexpression, coverage, genesetenrichment, alignment, qualitycontrol, immunooncology, reportwriting, workflowstep, workflowmanagement
53 stars 11.52 score 344 scripts 3 dependents
metrumresearchgroup
mrgsolve:Simulate from ODE-Based Models
Fast simulation from ordinary differential equation (ODE) based models typically employed in quantitative pharmacology and systems biology.
Maintained by Kyle T Baron. Last updated 9 days ago.
138 stars 10.90 score 1.2k scripts 3 dependents
bioc
matter:Out-of-core statistical computing and signal processing
Toolbox for larger-than-memory scientific computing and visualization, providing efficient out-of-core data structures using files or shared memory, for dense and sparse vectors, matrices, and arrays, with applications to nonuniformly sampled signals and images.
Maintained by Kylie A. Bemis. Last updated 4 months ago.
infrastructure, datarepresentation, dataimport, dimensionreduction, preprocessing, cpp
57 stars 9.52 score 64 scripts 2 dependents
bioc
ProtGenerics:Generic infrastructure for Bioconductor mass spectrometry packages
S4 generic functions and classes needed by Bioconductor proteomics packages.
Maintained by Laurent Gatto. Last updated 2 months ago.
infrastructure, proteomics, massspectrometry, bioconductor, mass-spectrometry, metabolomics
8 stars 9.36 score 4 scripts 188 dependents
bioc
multtest:Resampling-based multiple hypothesis testing
Non-parametric bootstrap and permutation resampling-based multiple testing procedures (including empirical Bayes methods) for controlling the family-wise error rate (FWER), generalized family-wise error rate (gFWER), tail probability of the proportion of false positives (TPPFP), and false discovery rate (FDR). Several choices of bootstrap-based null distribution are implemented (centered, centered and scaled, quantile-transformed). Single-step and step-wise methods are available. Tests based on a variety of t- and F-statistics (including t-statistics based on regression parameters from linear and survival models as well as those based on correlation parameters) are included. When probing hypotheses with t-statistics, users may also select a potentially faster null distribution which is multivariate normal with mean zero and variance covariance matrix derived from the vector influence function. Results are reported in terms of adjusted p-values, confidence regions and test statistic cutoffs. The procedures are directly applicable to identifying differentially expressed genes in DNA microarray experiments.
Maintained by Katherine S. Pollard. Last updated 5 months ago.
microarray, differentialexpression, multiplecomparison
9.34 score 932 scripts 136 dependents
bioc
bamsignals:Extract read count signals from bam files
This package efficiently obtains count vectors from indexed bam files. It counts the number of reads in given genomic ranges and computes read profiles and coverage profiles. It also handles paired-end data.
Maintained by Johannes Helmuth. Last updated 5 months ago.
dataimport, sequencing, coverage, alignment, curl, bzip2, xz-utils, zlib, cpp
15 stars 8.95 score 31 scripts 11 dependents
bioc
RaggedExperiment:Representation of Sparse Experiments and Assays Across Samples
This package provides a flexible representation of copy number, mutation, and other data that fit into the ragged array schema for genomic location data. The basic representation of such data provides a rectangular flat table interface to the user with range information in the rows and samples/specimen in the columns. The RaggedExperiment class derives from a GRangesList representation and provides a semblance of a rectangular dataset.
Maintained by Marcel Ramos. Last updated 4 months ago.
infrastructure, datarepresentation, copynumber, core-package, data-structure, mutations, u24ca289073
4 stars 8.93 score 76 scripts 14 dependents
polmine
polmineR:Verbs and Nouns for Corpus Analysis
Package for corpus analysis using the Corpus Workbench ('CWB', <https://cwb.sourceforge.io>) as an efficient back end for indexing and querying large corpora. The package offers functionality to flexibly create subcorpora and to carry out basic statistical operations (count, co-occurrences etc.). The original full text of documents can be reconstructed and inspected at any time. Beyond that, the package is intended to serve as an interface to packages implementing advanced statistical procedures. Respective data structures (document-term matrices, term-co-occurrence matrices etc.) can be created based on the indexed corpora.
Maintained by Andreas Blaette. Last updated 1 year ago.
49 stars 7.96 score 311 scripts
cran
timeSeries:Financial Time Series Objects (Rmetrics)
'S4' classes and various tools for financial time series: Basic functions such as scaling and sorting, subsetting, mathematical operations and statistical functions.
Maintained by Georgi N. Boshnakov. Last updated 6 months ago.
2 stars 7.89 score 146 dependents
jellegoeman
penalized:L1 (Lasso and Fused Lasso) and L2 (Ridge) Penalized Estimation in GLMs and in the Cox Model
Fitting possibly high dimensional penalized regression models. The penalty structure can be any combination of an L1 penalty (lasso and fused lasso), an L2 penalty (ridge) and a positivity constraint on the regression coefficients. The supported regression models are linear, logistic and Poisson regression and the Cox Proportional Hazards model. Cross-validation routines allow optimization of the tuning parameters.
Maintained by Jelle Goeman. Last updated 3 years ago.
4 stars 7.09 score 429 scripts 17 dependents
robinhankin
disordR:Non-Ordered Vectors
Functionality for manipulating values of associative maps. The package is a dependency for mvp-type packages that use the STL map class: it traps plausible idiom that is ill-defined (implementation-specific) and returns an informative error, rather than returning a possibly incorrect result. To cite the package in publications please use Hankin (2022) <doi:10.48550/ARXIV.2210.03856>.
Maintained by Robin K. S. Hankin. Last updated 5 months ago.
1 star 6.59 score 20 dependents
bioc
MultiDataSet:Implementation of MultiDataSet and ResultSet
Implementation of the BRGE's (Bioinformatic Research Group in Epidemiology from Center for Research in Environmental Epidemiology) MultiDataSet and ResultSet. MultiDataSet is designed for integrating multi omics data sets and ResultSet is a container for omics results. This package contains base classes for MEAL and rexposome packages.
Maintained by Xavier Escribà Montagut. Last updated 5 months ago.
6.45 score 28 scripts 10 dependents
bioc
APL:Association Plots
APL is a package developed for computation of Association Plots (AP), a method for visualization and analysis of single cell transcriptomics data. The main focus of APL is the identification of genes characteristic for individual clusters of cells from input data. The package performs correspondence analysis (CA) and identifies cluster-specific genes using Association Plots. Additionally, APL computes cluster-specificity scores for all genes, which allows ranking the genes by their specificity for a selected cell cluster of interest.
Maintained by Clemens Kohl. Last updated 5 months ago.
statisticalmethod, dimensionreduction, singlecell, sequencing, rnaseq, geneexpression
15 stars 6.31 score 15 scripts
bioc
Pedixplorer:Pedigree Functions
Routines to handle family data with a Pedigree object. The initial purpose was to create correlation structures that describe family relationships such as kinship and identity-by-descent, which can be used to model family data in mixed effects models, such as in the coxme function. Also includes a tool for Pedigree drawing which is focused on producing compact layouts without intervention. Recent additions include utilities to trim the Pedigree object with various criteria, and kinship for the X chromosome.
Maintained by Louis Le Nezet. Last updated 13 days ago.
software, datarepresentation, genetics, graphandnetwork, visualization, kinship, pedigree
2 stars 6.08 score 10 scripts
bioc
scRNAseqApp:A single-cell RNAseq Shiny app-package
The scRNAseqApp is a Shiny app package designed for interactive visualization of single-cell data. It is an enhanced version derived from the ShinyCell, repackaged to accommodate multiple datasets. The app enables users to visualize data containing various types of information simultaneously, facilitating comprehensive analysis. Additionally, it includes a user management system to regulate database accessibility for different users.
Maintained by Jianhong Ou. Last updated 15 days ago.
visualization, singlecell, rnaseq, interactive-visualizations, multiple-users, shiny-apps, single-cell-rna-seq
4 stars 5.76 score 3 scripts
bbuchsbaum
neuroim:Data Structures and Handling for Neuroimaging Data
A collection of data structures that represent volumetric brain imaging data. The focus is on basic data handling for 3D and 4D neuroimaging data. In addition, there are functions to read and write NIFTI files and limited support for reading AFNI files.
Maintained by Bradley Buchsbaum. Last updated 4 years ago.
6 stars 5.64 score 48 scripts
neotomadb
neotoma2:Working with the Neotoma Paleoecology Database
Access and manipulation of data using the Neotoma Paleoecology Database. <https://api.neotomadb.org/api-docs/>.
Maintained by Dominguez Vidana Socorro. Last updated 8 months ago.
earthcube, neotoma, nsf, paleoecology
8 stars 5.35 score 56 scripts
bioc
DelayedDataFrame:Delayed operation on DataFrame using standard DataFrame metaphor
Based on the standard DataFrame metaphor, this package aims to implement delayed operations on the DelayedDataFrame class through a lazyIndex slot, which saves the mapping indexes for each column of a DelayedDataFrame. Methods such as show, validity check, [/[[ subsetting, and rbind/cbind are implemented for DelayedDataFrame to operate on lazyIndex. The listData slot stays untouched until a realization call, e.g., the DataFrame constructor or as.list(), is invoked.
Maintained by Qian Liu. Last updated 5 months ago.
infrastructure, datarepresentation
2 stars 5.26 score 3 scripts 1 dependent
bioc
HiTC:High Throughput Chromosome Conformation Capture analysis
The HiTC package was developed to explore high-throughput 'C' data such as 5C or Hi-C. Dedicated R classes as well as standard methods for quality controls, normalization, visualization, and further analysis are also provided.
Maintained by Nicolas Servant. Last updated 5 months ago.
sequencing, highthroughputsequencing, hic
5.23 score 42 scripts
bioc
epivizrData:Data Management API for epiviz interactive visualization app
Serve data from Bioconductor Objects through a WebSocket connection.
Maintained by Hector Corrada Bravo. Last updated 5 months ago.
1 star 5.08 score 4 scripts 4 dependents
bioc
casper:Characterization of Alternative Splicing based on Paired-End Reads
Infer alternative splicing from paired-end RNA-seq data. The model is based on counting paths across exons, rather than pairwise exon connections, and estimates the fragment size and start distributions non-parametrically, which improves estimation precision.
Maintained by David Rossell. Last updated 4 months ago.
immunooncology, geneexpression, differentialexpression, transcription, rnaseq, sequencing, cpp
5.02 score 66 scripts
bioc
EpiTxDb:Storing and accessing epitranscriptomic information using the AnnotationDbi interface
EpiTxDb facilitates the storage of epitranscriptomic information. More specifically, it can keep track of modification identity, position, the enzyme for introducing it on the RNA, a specifier which determines the position on the RNA to be modified and the literature references each modification is associated with.
Maintained by Felix G.M. Ernst. Last updated 5 months ago.
4.78 score 7 scripts
tpetzoldt
simecol:Simulation of Ecological (and Other) Dynamic Systems
An object oriented framework to simulate ecological (and other) dynamic systems. It can be used for differential equations, individual-based (or agent-based) and other models as well. It supports structuring of simulation scenarios (to avoid copy and paste) and aims to improve readability and re-usability of code.
Maintained by Thomas Petzoldt. Last updated 8 months ago.
4.76 score 190 scripts
robinhankin
jordan:A Suite of Routines for Working with Jordan Algebras
A Jordan algebra is an algebraic object originally designed to study observables in quantum mechanics. Jordan algebras are commutative but non-associative; they satisfy the Jordan identity. The package follows the ideas and notation of K. McCrimmon (2004, ISBN:0-387-95447-3) "A Taste of Jordan Algebras". To cite the package in publications, please use Hankin (2023) <doi:10.48550/arXiv.2303.06062>.
Maintained by Robin K. S. Hankin. Last updated 9 months ago.
4.60 score 2 scripts
bioc
GenomAutomorphism:Compute the automorphisms between DNA's Abelian group representations
This is an R package to compute the automorphisms between pairwise aligned DNA sequences represented as elements from a Genomic Abelian group. In a general scenario, anything from genomic regions to whole genomes from a given population (from any species or closely related species) can be algebraically represented as a direct sum of cyclic groups or, more specifically, Abelian p-groups. Basically, we propose the representation of multiple sequence alignments of length N bp as elements of a finite Abelian group created by the direct sum of homocyclic Abelian groups of prime-power order.
Maintained by Robersy Sanchez. Last updated 3 months ago.
mathematicalbiology, comparativegenomics, functionalgenomics, multiplesequencealignment, wholegenome, genetic-code, genetic-code-algebra, genome, genome-algebra
4.30 score 9 scripts
kamapu
vegtable:Handling Vegetation Data Sets
Import and handling of data from vegetation-plot databases, especially data stored in 'Turboveg 2' (<https://www.synbiosys.alterra.nl/turboveg/>). Import/export routines for exchanging data with 'Juice' (<https://www.sci.muni.cz/botany/juice/>) are also implemented.
Maintained by Miguel Alvarez. Last updated 9 months ago.
7 stars 4.23 score 49 scripts
reidt03
RadOnc:Analytical Tools for Radiation Oncology
Designed for the import, analysis, and visualization of dosimetric and volumetric data in Radiation Oncology, the tools herein enable import of dose-volume histogram information from multiple treatment planning system platforms and 3D structural representations and dosimetric information from 'DICOM-RT' files. These tools also enable subsequent visualization and statistical analysis of these data.
Maintained by Reid F. Thompson. Last updated 2 months ago.
8 stars 3.78 score 19 scripts
lawremi
rsolr:R to Solr Interface
A comprehensive R API for querying Apache Solr databases. A Solr core is represented as a data frame or list that supports Solr-side filtering, sorting, transformation and aggregation, all through the familiar base R API. Queries are processed lazily, i.e., a query is only sent to the database when the data are required.
Maintained by Michael Lawrence. Last updated 3 years ago.
9 stars 3.65 score 6 scripts
robinhankin
multivator:A Multivariate Emulator
A multivariate generalization of the emulator package.
Maintained by Robin K. S. Hankin. Last updated 2 years ago.
3.62 score 21 scripts
cnrakt
haplotypes:Manipulating DNA Sequences and Estimating Unambiguous Haplotype Network with Statistical Parsimony
Provides S4 classes and methods for reading and manipulating aligned DNA sequences, supporting indel coding methods (only the simple indel coding method is available in the current version), showing base substitutions and indels, calculating absolute pairwise distances between DNA sequences, and collapsing identical DNA sequences into haplotypes or inferring haplotypes using a user-provided absolute pairwise character difference matrix. This package also includes S4 classes and methods for estimating genealogical relationships among haplotypes using statistical parsimony and plotting parsimony networks.
Maintained by Caner Aktas. Last updated 2 years ago.
1 star 3.43 score 54 scripts
bioc
BaseSpaceR:R SDK for BaseSpace RESTful API
A rich R interface to Illumina's BaseSpace cloud computing environment, enabling the fast development of data analysis and visualisation tools.
Maintained by Jared OConnell. Last updated 5 months ago.
infrastructure, datarepresentation, connecttools, software, dataimport, highthroughputsequencing, sequencing, genetics
3.30 score 9 scripts
kylebaron
dmutate:Mutate Data Frames with Random Variates
Work within the 'dplyr' workflow to add random variates to your data frame. Variates can be added at any level of an existing column. Also, bounds can be specified for simulated variates.
Maintained by Kyle T Baron. Last updated 7 years ago.
1 star 3.24 score 35 scripts
andremueller
colorpatch:Optimized Rendering of Fold Changes and Confidence Values
Shows color patches for encoding fold changes (e.g. log ratios) together with confidence values within a single diagram. This is especially useful for rendering gene expression data as well as other types of differential experiments. In addition to different rendering methods (ggplot extensions), functionality for perceptually optimizing color palettes is provided. Furthermore, the package provides extension methods of the colorspace color-class in order to simplify the work with palettes (among others, length, as.list, and append are supported).
Maintained by Andre Mueller. Last updated 8 years ago.
3.23 score 34 scripts
lawremi
objectProperties:A Factory of Self-Describing Properties
Supports the definition of sets of properties on objects. Observers can listen to changes on individual properties or the set as a whole. The properties are meant to be fully self-describing. In support of this, there is a framework for defining enumerated types, as well as other bounded types, as S4 classes.
Maintained by Michael Lawrence. Last updated 3 years ago.
2.26 score 20 scripts 3 dependents
roustant
kergp:Gaussian Process Laboratory
Gaussian process regression with an emphasis on kernels. Quantitative and qualitative inputs are accepted. Some pre-defined kernels are available, such as radial or tensor-sum for quantitative inputs, and compound symmetry, low rank, group kernel for qualitative inputs. The user can define new kernels and composite kernels through a formula mechanism. Useful methods include parameter estimation by maximum likelihood, simulation, prediction and leave-one-out validation.
Maintained by Olivier Roustant. Last updated 4 months ago.
1 star 1.83 score 67 scripts
hsonne
pathlist:Package Supporting the Work with File Paths
This package implements an S4 class pathlist that internally stores a vector of file paths (as, e.g., received with dir()) as a matrix of path segments. I found out that this is the most compact form to store the paths. The main feature of the class is the dollar function that allows filtering paths by the value of their top-level folder. By applying the dollar operator repeatedly, you can easily narrow down the list of paths. The class implements functions length(), head(), tail(), summary(), and show().
Maintained by Hauke Sonnenberg. Last updated 6 years ago.
1.78 score 2 scripts 2 dependents