Biostrings: Efficient manipulation of biological strings
Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences.
Maintained by Hervé Pagès. Last updated 1 month ago.
62 stars 17.77 score 8.6k scripts 1.2k dependents
rspatial
terra: Spatial Data Analysis
Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).
Maintained by Robert J. Hijmans. Last updated 2 days ago.
559 stars 17.64 score 17k scripts 855 dependents
rspatial
raster: Geographic Data Analysis and Modeling
Reading, writing, manipulating, analyzing and modeling of spatial data. This package has been superseded by the "terra" package <>.
Maintained by Robert J. Hijmans. Last updated 16 hours ago.
163 stars 17.23 score 58k scripts 562 dependents
r-forge
Matrix: Sparse and Dense Matrix Classes and Methods
A rich hierarchy of sparse and dense matrix classes, including general, symmetric, triangular, and diagonal matrices with numeric, logical, or pattern entries. Efficient methods for operating on such matrices, often wrapping the 'BLAS', 'LAPACK', and 'SuiteSparse' libraries.
Maintained by Martin Maechler. Last updated 19 days ago.
1 star 17.23 score 33k scripts 12k dependents
bioc
S4Vectors: Foundation of vector-like and list-like containers in Bioconductor
The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantics of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).
Maintained by Hervé Pagès. Last updated 2 months ago.
18 stars 16.05 score 1.0k scripts 1.9k dependents
bioc
AnnotationDbi: Manipulation of SQLite-based annotations in Bioconductor
Implements a user-friendly interface for querying SQLite-based annotation data packages.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
9 stars 15.05 score 3.6k scripts 769 dependents
mhahsler
arules: Mining Association Rules and Frequent Itemsets
Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.
Maintained by Michael Hahsler. Last updated 2 months ago.
194 stars 13.99 score 3.3k scripts 28 dependents
ropensci
git2r: Provides Access to Git Repositories
Interface to the 'libgit2' library, which is a pure C implementation of the 'Git' core methods. Provides access to 'Git' repositories to extract data and run some basic 'Git' commands.
Maintained by Stefan Widgren. Last updated 13 hours ago.
218 stars 13.93 score 836 scripts 46 dependents
knausb
vcfR: Manipulate and Visualize VCF Data
Facilitates easy manipulation of variant call format (VCF) data. Functions are provided to rapidly read from and write to VCF files. Once VCF data is read into R a parser function extracts matrices of data. This information can then be used for quality control or other purposes. Additional functions provide visualization of genomic data. Once processing is complete data may be written to a VCF file (*.vcf.gz). It also may be converted into other popular R objects (e.g., genlight, DNAbin). VcfR provides a link between VCF data and familiar R software.
Maintained by Brian J. Knaus. Last updated 1 month ago.
256 stars 13.66 score 3.1k scripts 19 dependents
insightsengineering
rtables: Reporting Tables
Reporting tables often have structure that goes beyond simple rectangular data. The 'rtables' package provides a framework for declaring complex multi-level tabulations and then applying them to data. This framework models both tabulation and the resulting tables as hierarchical, tree-like objects which support sibling sub-tables, arbitrary splitting or grouping of data in row and column dimensions, cells containing multiple values, and the concept of contextual summary computations. A convenient pipe-able interface is provided for declaring table layouts and the corresponding computations, and then applying them to data.
Maintained by Joe Zhu. Last updated 3 months ago.
232 stars 13.65 score 238 scripts 17 dependents
brodieg
diffobj: Diffs for R Objects
Generate a colorized diff of two R objects for an intuitive visualization of their differences.
Maintained by Brodie Gaslam. Last updated 3 years ago.
231 stars 13.17 score 107 scripts 494 dependents
biodiverse
unmarked: Models for Data from Unmarked Animals
Fits hierarchical models of animal abundance and occurrence to data collected using survey methods such as point counts, site occupancy sampling, distance sampling, removal sampling, and double observer sampling. Parameters governing the state and observation processes can be modeled as functions of covariates. References: Kellner et al. (2023) <doi:10.1111/2041-210X.14123>, Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.
Maintained by Ken Kellner. Last updated 9 days ago.
4 stars 13.02 score 652 scripts 12 dependents
melff
memisc: Management of Survey Data and Presentation of Analysis Results
An infrastructure for the management of survey data including value labels, definable missing values, recoding of variables, production of code books, and import of (subsets of) 'SPSS' and 'Stata' files is provided. Further, the package can produce tables and data frames of arbitrary descriptive statistics and (almost) publication-ready tables of regression model estimates, which can be exported to 'LaTeX' and HTML.
Maintained by Martin Elff. Last updated 23 days ago.
46 stars 12.34 score 1.2k scripts 13 dependents
kaneplusplus
bigmemory: Manage Massive Matrices with Shared Memory and Memory-Mapped Files
Create, store, access, and manipulate massive matrices. Matrices are allocated to shared memory and may use memory-mapped files. Packages 'biganalytics', 'bigtabulate', 'synchronicity', and 'bigalgebra' provide advanced functionality.
Maintained by Michael J. Kane. Last updated 1 year ago.
127 stars 11.87 score 920 scripts 64 dependents
bioc
Rgraphviz: Provides plotting capabilities for R graph objects
Interfaces R with the AT&T Graphviz library for plotting R graph objects from the graph package.
Maintained by Kasper Daniel Hansen. Last updated 2 days ago.
11.51 score 1.2k scripts 107 dependents
r-forge
Rmpfr: Interface R to MPFR - Multiple Precision Floating-Point Reliable
Arithmetic (via S4 classes and methods) for arbitrary precision floating point numbers, including transcendental ("special") functions. To this end, the package interfaces to the 'LGPL' licensed 'MPFR' (Multiple Precision Floating-Point Reliable) Library which itself is based on the 'GMP' (GNU Multiple Precision) Library.
Maintained by Martin Maechler. Last updated 4 months ago.
11.30 score 316 scripts 141 dependents
fmichonneau
phylobase: Base Package for Phylogenetic Structures and Comparative Data
Provides a base S4 class for comparative methods, incorporating one or more trees and trait data.
Maintained by Francois Michonneau. Last updated 1 year ago.
18 stars 11.10 score 394 scripts 18 dependents
metrumresearchgroup
mrgsolve: Simulate from ODE-Based Models
Fast simulation from ordinary differential equation (ODE) based models typically employed in quantitative pharmacology and systems biology.
Maintained by Kyle T Baron. Last updated 9 days ago.
138 stars 10.90 score 1.2k scripts 3 dependents
zdebruine
RcppML: Rcpp Machine Learning Library
Fast machine learning algorithms including matrix factorization and divisive clustering for large sparse and dense matrices.
Maintained by Zach DeBruine. Last updated 2 years ago.
107 stars 10.66 score 125 scripts 50 dependents
bioc
flowCore: flowCore: Basic structures for flow cytometry data
Provides S4 data structures and basic functions to deal with flow cytometry data.
Maintained by Mike Jiang. Last updated 5 months ago.
10.34 score 1.7k scripts 59 dependents
edzer
intervals: Tools for Working with Points and Intervals
Tools for working with and comparing sets of points and intervals.
Maintained by Edzer Pebesma. Last updated 7 months ago.
11 stars 9.50 score 122 scripts 98 dependents
reinhardfurrer
spam: SPArse Matrix
Set of functions for sparse matrix algebra. Differences with other sparse matrix packages are: (1) we only support (essentially) one sparse matrix format, (2) based on transparent and simple structure(s), (3) tailored for MCMC calculations within G(M)RF, and (4) it is fast and scalable (with the extension package spam64). Documentation about 'spam' is provided by vignettes included in this package, see also Furrer and Sain (2010) <doi:10.18637/jss.v036.i10>; see 'citation("spam")' for details.
Maintained by Reinhard Furrer. Last updated 2 months ago.
1 star 9.36 score 420 scripts 439 dependents
polmine
polmineR: Verbs and Nouns for Corpus Analysis
Package for corpus analysis using the Corpus Workbench ('CWB', <>) as an efficient back end for indexing and querying large corpora. The package offers functionality to flexibly create subcorpora and to carry out basic statistical operations (count, co-occurrences etc.). The original full text of documents can be reconstructed and inspected at any time. Beyond that, the package is intended to serve as an interface to packages implementing advanced statistical procedures. Respective data structures (document-term matrices, term-co-occurrence matrices etc.) can be created based on the indexed corpora.
Maintained by Andreas Blaette. Last updated 1 year ago.
49 stars 7.96 score 311 scripts
cran
timeSeries: Financial Time Series Objects (Rmetrics)
'S4' classes and various tools for financial time series: Basic functions such as scaling and sorting, subsetting, mathematical operations and statistical functions.
Maintained by Georgi N. Boshnakov. Last updated 6 months ago.
2 stars 7.89 score 146 dependents
adamlilith
fasterRaster: Faster Raster and Spatial Vector Processing Using 'GRASS GIS'
Processing of large-in-memory/large-on-disk rasters and spatial vectors using 'GRASS GIS' <>. Most functions in the 'terra' package are recreated. Processing of medium-sized and smaller spatial objects will nearly always be faster using 'terra' or 'sf', but for large-in-memory/large-on-disk objects, 'fasterRaster' may be faster. To use most of the functions, you must have the stand-alone version (not the 'OSGeo4W' installer version) of 'GRASS GIS' 8.0 or higher.
Maintained by Adam B. Smith. Last updated 2 days ago.
57 stars 7.68 score 8 scripts
rikenbit
rTensor: Tools for Tensor Analysis and Decomposition
A set of tools for creation, manipulation, and modeling of tensors with an arbitrary number of modes. A tensor in the context of data analysis is a multidimensional array. rTensor does this by providing an S4 class 'Tensor' that wraps around the base 'array' class. rTensor provides common tensor operations as methods, including matrix unfolding, summing/averaging across modes, calculating the Frobenius norm, and taking the inner product between two tensors. Familiar array operations are overloaded, such as index subsetting via '[' and element-wise operations. rTensor also implements various tensor decompositions, including CP, GLRAM, MPCA, PVD, and Tucker. For tensors with 3 modes, rTensor also implements transpose, t-product, and t-SVD, as defined in Kilmer et al. (2013). Some auxiliary functions include the Khatri-Rao product, Kronecker product, and the Hadamard product for a list of matrices.
Maintained by Koki Tsuyuzaki. Last updated 2 years ago.
6 stars 7.65 score 278 scripts 30 dependents
spedygiorgio
lifecontingencies: Financial and Actuarial Mathematics for Life Contingencies
Classes and methods that allow the user to manage life tables and actuarial tables (including multiple decrement tables). Moreover, functions to easily perform demographic, financial and actuarial mathematics calculations on life contingencies insurances are included. See Spedicato (2013) <doi:10.18637/jss.v055.i10>.
Maintained by Giorgio Alfredo Spedicato. Last updated 6 months ago.
61 stars 7.06 score 156 scripts
fbertran
Cascade: Selection, Reverse-Engineering and Prediction in Cascade Networks
A modeling tool allowing gene selection, reverse engineering, and prediction in cascade networks. Jung, N., Bertrand, F., Bahram, S., Vallat, L., and Maumy-Bertrand, M. (2014) <doi:10.1093/bioinformatics/btt705>.
Maintained by Frederic Bertrand. Last updated 2 years ago.
1 star 6.56 score 40 scripts 2 dependents
gmbecker
switchr: Installing, Managing, and Switching Between Distinct Sets of Installed Packages
Provides an abstraction for managing, installing, and switching between sets of installed R packages. This allows users to maintain multiple package libraries simultaneously, e.g. to maintain strict, package-version-specific reproducibility of many analyses, or work within a development/production release paradigm. Introduces a generalized package installation process which supports multiple repository and non-repository sources and tracks package provenance.
Maintained by Gabriel Becker. Last updated 2 years ago.
59 stars 6.49 score 52 scripts
justincally
VicmapR: Access Victorian Spatial Data Through Web File Services (WFS)
Easily interfaces R to spatial datasets available through the Victorian Government's WFS (Web Feature Service): <>, which allows users to read in 'sf' data from these sources. VicmapR uses the lazy querying approach and code developed by Teucher et al. (2021) for the 'bcdata' R package <doi:10.21105/joss.02927>.
Maintained by Justin Cally. Last updated 7 months ago.
17 stars 6.14 score 18 scripts
cran
gaston: Genetic Data Handling (QC, GRM, LD, PCA) & Linear Mixed Models
Manipulation of genetic data (SNPs). Computation of GRM and dominance matrix, LD, heritability with efficient algorithms for linear mixed model (AIREML). Dandine et al <doi:10.1159/000488519>.
Maintained by Hervé Perdry. Last updated 1 year ago.
5 stars 6.02 score 12 dependents
bioc
qusage: qusage: Quantitative Set Analysis for Gene Expression
This package is an implementation of the Quantitative Set Analysis for Gene Expression (QuSAGE) method described in (Yaari G. et al, Nucl Acids Res, 2013). This is a novel Gene Set Enrichment-type test, which is designed to provide a faster, more accurate, and easier to understand test for gene expression studies. qusage accounts for inter-gene correlations using the Variance Inflation Factor technique proposed by Wu et al. (Nucleic Acids Res, 2012). In addition, rather than simply evaluating the deviation from a null hypothesis with a single number (a P value), qusage quantifies gene set activity with a complete probability density function (PDF). From this PDF, P values and confidence intervals can be easily extracted. Preserving the PDF also allows for post-hoc analysis (e.g., pair-wise comparisons of gene set activity) while maintaining statistical traceability. Finally, while qusage is compatible with individual gene statistics from existing methods (e.g., LIMMA), a Welch-based method is implemented that is shown to improve specificity. The QuSAGE package also includes a mixed effects model implementation, as described in (Turner JA et al, BMC Bioinformatics, 2015), and a meta-analysis framework as described in (Meng H, et al. PLoS Comput Biol. 2019). For questions, contact Chris Bolen or Steven Kleinstein.
Maintained by Christopher Bolen. Last updated 5 months ago.
5.65 score 185 scripts 1 dependent
anainesvs
pedigreemm: Pedigree-Based Mixed-Effects Models
Fit pedigree-based mixed-effects models.
Maintained by Ana Ines Vazquez. Last updated 1 year ago.
1 star 5.42 score 87 scripts 2 dependents
bioc
scanMiRApp: scanMiR shiny application
A shiny interface to the scanMiR package. The application enables the scanning of transcripts and custom sequences for miRNA binding sites, the visualization of KdModels and binding results, as well as browsing predicted repression data. In addition, it contains the IndexedFst class for fast indexed reading of large GenomicRanges or data.frames, and some utilities for facilitating scans and identifying enriched miRNA-target pairs.
Maintained by Pierre-Luc Germain. Last updated 5 months ago.
4.76 score 19 scripts
bioc
ChIPseqR: Identifying Protein Binding Sites in High-Throughput Sequencing Data
ChIPseqR identifies protein binding sites from ChIP-seq and nucleosome positioning experiments. The model used to describe binding events was developed to locate nucleosomes but should be flexible enough to handle other types of experiments as well.
Maintained by Peter Humburg. Last updated 5 months ago.
4.70 score 1 script
mikemahoney218
mvdf: A Minimum Viable Data Format for 3D Rendering via Blender
A small, self-contained, minimum viable data format providing a standard interface for using R as a front-end for the Blender 3D rendering program. The core approach centers around an S4 class, 'mvdf', with getter, setter, and validation methods designed to be extended for more specific rendering approaches.
Maintained by Michael Mahoney. Last updated 4 years ago.
16 stars 4.20 score 5 scripts
geobosh
pcts: Periodically Correlated and Periodically Integrated Time Series
Classes and methods for modelling and simulation of periodically correlated (PC) and periodically integrated time series. Compute theoretical periodic autocovariances and related properties of PC autoregressive moving average models. Some original methods including Boshnakov & Iqelan (2009) <doi:10.1111/j.1467-9892.2009.00617.x>, Boshnakov (1996) <doi:10.1111/j.1467-9892.1996.tb00281.x>.
Maintained by Georgi N. Boshnakov. Last updated 1 year ago.
3 stars 4.18 score 3 scripts
aalfons
simFrame: Simulation Framework
A general framework for statistical simulation, which allows researchers to make use of a wide range of simulation designs with minimal programming effort. The package provides functionality for drawing samples from a distribution or a finite population, for adding outliers and missing values, as well as for visualization of the simulation results. It follows a clear object-oriented design and supports parallel computing to increase computational performance.
Maintained by Andreas Alfons. Last updated 3 years ago.
2 stars 3.90 score 80 scripts
bioc
RegEnrich: Gene regulator enrichment analysis
This package is a pipeline to identify the key gene regulators in a biological process, for example in cell differentiation and in cell development after stimulation. There are four major steps in this pipeline: (1) differential expression analysis; (2) regulator-target network inference; (3) enrichment analysis; and (4) regulators scoring and ranking.
Maintained by Weiyang Tao. Last updated 5 months ago.
3.82 score 22 scripts
lawremi
rsolr: R to Solr Interface
A comprehensive R API for querying Apache Solr databases. A Solr core is represented as a data frame or list that supports Solr-side filtering, sorting, transformation and aggregation, all through the familiar base R API. Queries are processed lazily, i.e., a query is only sent to the database when the data are required.
Maintained by Michael Lawrence. Last updated 3 years ago.
9 stars 3.65 score 6 scripts
tconwell
html5: Creates Valid HTML5 Strings
Generates valid HTML tag strings for HTML5 elements documented by Mozilla. Attributes are passed as named lists, with names being the attribute name and values being the attribute value. Attribute values are automatically double-quoted. To declare a DOCTYPE, wrap html() with function doctype(). Mozilla's documentation for HTML5 is available here: <>. Elements marked as obsolete are not included.
Maintained by Timothy Conwell. Last updated 2 years ago.
1 star 3.65 score 1 script 3 dependents
robinhankin
multivator: A Multivariate Emulator
A multivariate generalization of the emulator package.
Maintained by Robin K. S. Hankin. Last updated 2 years ago.
3.62 score 21 scripts
bioc
hypergraph: A package providing hypergraph data structures
A package that implements some simple capabilities for representing and manipulating hypergraphs.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
3.38 score 2 scripts 4 dependents
bioc
diggit: Inference of Genetic Variants Driving Cellular Phenotypes
Inference of Genetic Variants Driving Cellular Phenotypes by the DIGGIT algorithm.
Maintained by Mariano J Alvarez. Last updated 5 months ago.
3.30 score 3 scripts
r-forge
pems.utils: Portable Emissions (and Other Mobile) Measurement System Utilities
Utility functions for the handling, analysis and visualisation of data from portable emissions measurement systems ('PEMS') and other similar mobile activity monitoring devices. The package includes a dedicated 'pems' data class that manages many of the quality control, unit handling and data archiving issues that can hinder efforts to standardise 'PEMS' research.
Maintained by Karl Ropkins. Last updated 3 months ago.
3.06 score 19 scripts
cran
ibmdbR: IBM in-Database Analytics for R
Functionality required to efficiently use R with IBM(R) Db2(R) Warehouse offerings (formerly IBM dashDB(R)) and IBM Db2 for z/OS(R) in conjunction with IBM Db2 Analytics Accelerator for z/OS. Many basic and complex R operations are pushed down into the database, which removes the main memory boundary of R and allows full use of parallel processing in the underlying database. To execute R functions in parallel in a multi-node environment, the idaTApply() function requires the 'SparkR' package (<>). The optional 'ggplot2' package is needed for the plot.idaLm() function only.
Maintained by Shaikh Quader. Last updated 1 year ago.
2 stars 3.00 score
svizcaya
gems: Generalized Multistate Simulation Model
Simulate and analyze multistate models with general hazard functions. gems provides functionality for the preparation of hazard functions and parameters, simulation from a general multistate model and predicting future events. The multistate model is not required to be a Markov model and may take the history of previous events into account. In the basic version, it allows simulation from transition-specific hazard functions whose parameters are multivariate normally distributed.
Maintained by Luisa Salazar Vizcaya. Last updated 8 years ago.
2.52 score 33 scripts
hsonne
pathlist: Package Supporting the Work with File Paths
This package implements an S4 class, pathlist, that internally stores a vector of file paths (as, e.g., received with dir()) as a matrix of path segments. I found out that this is the most compact form to store the paths. The main feature of the class is the dollar function that allows filtering paths by the value of their top-level folder. By applying the dollar operator repeatedly, you can easily narrow down the list of paths. The class implements functions length(), head(), tail(), summary(), and show().
Maintained by Hauke Sonnenberg. Last updated 6 years ago.
1.78 score 2 scripts 2 dependents
apedrods
MAINT.Data: Model and Analyse Interval Data
Implements methodologies for modelling interval data by Normal and Skew-Normal distributions, considering appropriate parameterizations of the variance-covariance matrix that take into account the intrinsic nature of interval data and lead to four different possible configuration structures. The Skew-Normal parameters can be estimated by maximum likelihood, while Normal parameters may be estimated by maximum likelihood or robust trimmed maximum likelihood methods.
Maintained by Pedro Duarte Silva. Last updated 2 years ago.
1.15 score 14 scripts