Showing 200 of 5046 results
r-lib
httr:Tools for Working with URLs and HTTP
Useful tools for working with HTTP organised by HTTP verbs (GET(), POST(), etc). Configuration functions make it easy to control additional request components (authenticate(), add_headers() and so on).
Maintained by Hadley Wickham. Last updated 1 year ago.
989 stars 20.56 score 29k scripts 4.3k dependents
tidyverse
tidyverse:Easily Install and Load the 'Tidyverse'
The 'tidyverse' is a set of packages that work in harmony because they share common data representations and 'API' design. This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step. Learn more about the 'tidyverse' at <https://www.tidyverse.org>.
Maintained by Hadley Wickham. Last updated 5 months ago.
1.7k stars 20.23 score 664k scripts 125 dependents
r-lib
devtools:Tools to Make Developing R Packages Easier
Collection of package development tools.
Maintained by Jennifer Bryan. Last updated 6 months ago.
2.4k stars 19.55 score 51k scripts 150 dependents
plotly
plotly:Create Interactive Web Graphics via 'plotly.js'
Create interactive web graphics from 'ggplot2' graphs and/or a custom interface to the (MIT-licensed) JavaScript library 'plotly.js' inspired by the grammar of graphics.
Maintained by Carson Sievert. Last updated 3 months ago.
d3jsdata-visualizationggplot2javascriptplotlyshinywebgl
2.6k stars 19.43 score 93k scripts 797 dependents
tidyverse
rvest:Easily Harvest (Scrape) Web Pages
Wrappers around the 'xml2' and 'httr' packages to make it easy to download, then manipulate, HTML and XML.
Maintained by Hadley Wickham. Last updated 5 months ago.
1.5k stars 19.33 score 29k scripts 549 dependents
r-lib
pkgdown:Make Static HTML Documentation for a Package
Generate an attractive and useful website from a source package. 'pkgdown' converts your documentation, vignettes, 'README', and more to 'HTML' making it easy to share information about your package online.
Maintained by Hadley Wickham. Last updated 10 days ago.
734 stars 18.46 score 588 scripts 162 dependents
bioc
Biostrings:Efficient manipulation of biological strings
Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences.
Maintained by Hervé Pagès. Last updated 1 month ago.
sequencematchingalignmentsequencinggeneticsdataimportdatarepresentationinfrastructurebioconductor-packagecore-package
62 stars 17.77 score 8.6k scripts 1.2k dependents
bioc
GenomicRanges:Representation and manipulation of genomic intervals
The ability to efficiently represent and manipulate genomic annotations and alignments is playing a central role when it comes to analyzing high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.
Maintained by Hervé Pagès. Last updated 4 months ago.
geneticsinfrastructuredatarepresentationsequencingannotationgenomeannotationcoveragebioconductor-packagecore-package
44 stars 17.68 score 13k scripts 1.3k dependents
r-lib
httr2:Perform HTTP Requests and Process the Responses
Tools for creating and modifying HTTP requests, then performing them and processing the results. 'httr2' is a modern re-imagining of 'httr' that uses a pipe-based interface and solves more of the problems that API wrapping packages face.
Maintained by Hadley Wickham. Last updated 2 days ago.
246 stars 17.64 score 1.9k scripts 1.1k dependents
r-lib
usethis:Automate Package and Project Setup
Automate package and project setup tasks that are otherwise performed manually. This includes setting up unit testing, test coverage, continuous integration, Git, 'GitHub', licenses, 'Rcpp', 'RStudio' projects, and more.
Maintained by Jennifer Bryan. Last updated 24 days ago.
869 stars 17.54 score 5.6k scripts 336 dependents
davidgohel
flextable:Functions for Tabular Reporting
Use a grammar for creating and customizing pretty tables. The following formats are supported: 'HTML', 'PDF', 'RTF', 'Microsoft Word', 'Microsoft PowerPoint' and R 'Grid Graphics'. 'R Markdown', 'Quarto' and the package 'officer' can be used to produce the result files. The syntax is the same for the user regardless of the type of output to be produced. A set of functions allows the creation, definition of cell arrangement, addition of headers or footers, formatting and definition of cell content with text and/or images. The package also offers a set of high-level functions that allow tabular reporting of statistical models and the creation of complex cross tabulations.
Maintained by David Gohel. Last updated 6 days ago.
docxhtml5ms-office-documentsrmarkdowntable
582 stars 17.09 score 7.3k scripts 124 dependents
bioc
clusterProfiler:A universal enrichment tool for interpreting omics data
This package supports functional characteristics of both coding and non-coding genomics data for thousands of species with up-to-date gene annotation. It provides a universal interface for gene functional annotation from a variety of sources and thus can be applied in diverse scenarios. It provides a tidy interface to access, manipulate, and visualize enrichment results to help users achieve efficient data interpretation. Datasets obtained from multiple treatments and time points can be analyzed and compared in a single run, easily revealing functional consensus and differences among distinct conditions.
Maintained by Guangchuang Yu. Last updated 4 months ago.
annotationclusteringgenesetenrichmentgokeggmultiplecomparisonpathwaysreactomevisualizationenrichment-analysisgsea
1.1k stars 17.03 score 11k scripts 48 dependents
satijalab
Seurat:Tools for Single Cell Genomics
A toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. 'Seurat' aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031>, and Hao, Hao, et al (2020) <doi:10.1101/2020.10.12.335331> for more details.
Maintained by Paul Hoffman. Last updated 1 year ago.
human-cell-atlassingle-cell-genomicssingle-cell-rna-seqcpp
2.4k stars 16.86 score 50k scripts 73 dependents
bioc
SummarizedExperiment:A container (S4 class) for matrix-like assays
The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.
Maintained by Hervé Pagès. Last updated 5 months ago.
geneticsinfrastructuresequencingannotationcoveragegenomeannotationbioconductor-packagecore-package
34 stars 16.84 score 8.6k scripts 1.2k dependents
bioc
GenomeInfoDb:Utilities for manipulating chromosome names, including modifying them to follow a particular naming style
Contains data and functions that define and allow translation between different chromosome sequence naming conventions (e.g., "chr1" versus "1"), including a function that attempts to place sequence names in their natural, rather than lexicographic, order.
Maintained by Hervé Pagès. Last updated 2 months ago.
geneticsdatarepresentationannotationgenomeannotationbioconductor-packagecore-package
32 stars 16.32 score 1.3k scripts 1.7k dependents
bioc
DESeq2:Differential gene expression analysis based on the negative binomial distribution
Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.
Maintained by Michael Love. Last updated 24 days ago.
sequencingrnaseqchipseqgeneexpressiontranscriptionnormalizationdifferentialexpressionbayesianregressionprincipalcomponentclusteringimmunooncologyopenblascpp
375 stars 16.11 score 17k scripts 115 dependents
bioc
biomaRt:Interface to BioMart databases (i.e. Ensembl)
In recent years a wealth of biological data has become available in public data repositories. Easy access to these valuable data resources and firm integration with data analysis is needed for comprehensive bioinformatics data analysis. biomaRt provides an interface to a growing collection of databases implementing the BioMart software suite (<http://www.biomart.org>). The package enables retrieval of large amounts of data in a uniform way without the need to know the underlying database schemas or write complex SQL queries. The most prominent examples of BioMart databases are maintained by Ensembl, which provides biomaRt users direct access to a diverse set of data and enables a wide range of powerful online queries from gene annotation to database mining.
Maintained by Mike Smith. Last updated 15 days ago.
annotationbioconductorbiomartensembl
38 stars 15.99 score 13k scripts 230 dependents
davidgohel
officer:Manipulation of Microsoft Word and PowerPoint Documents
Access and manipulate 'Microsoft Word', 'RTF' and 'Microsoft PowerPoint' documents from R. The package focuses on tabular and graphical reporting from R; it also provides two functions that let users get document content into data objects. A set of functions lets you add and remove images, tables and paragraphs of text in new or existing documents. The package does not require any installation of Microsoft products to be able to write Microsoft files.
Maintained by David Gohel. Last updated 6 days ago.
ms-office-documentspowerpointword
629 stars 15.85 score 4.1k scripts 142 dependents
bioc
enrichplot:Visualization of Functional Enrichment Result
The 'enrichplot' package implements several visualization methods for interpreting functional enrichment results obtained from ORA or GSEA analysis. It is mainly designed to work with the 'clusterProfiler' package suite. All the visualization methods are developed based on 'ggplot2' graphics.
Maintained by Guangchuang Yu. Last updated 3 months ago.
annotationgenesetenrichmentgokeggpathwayssoftwarevisualizationenrichment-analysispathway-analysis
239 stars 15.71 score 3.1k scripts 58 dependents
r-lib
gh:'GitHub' 'API'
Minimal client to access the 'GitHub' 'API'.
Maintained by Gábor Csárdi. Last updated 2 months ago.
224 stars 15.55 score 444 scripts 401 dependents
ropensci
rnaturalearth:World Map Data from Natural Earth
Facilitates mapping by making natural earth map data from <https://www.naturalearthdata.com/> more easily available to R users.
Maintained by Philippe Massicotte. Last updated 13 days ago.
234 stars 15.51 score 7.2k scripts 47 dependents
bioc
Rsamtools:Binary alignment (BAM), FASTA, variant call (BCF), and tabix file import
This package provides an interface to the 'samtools', 'bcftools', and 'tabix' utilities for manipulating SAM (Sequence Alignment / Map), FASTA, binary variant call (BCF) and compressed indexed tab-delimited (tabix) files.
Maintained by Bioconductor Package Maintainer. Last updated 4 months ago.
dataimportsequencingcoveragealignmentqualitycontrolbioconductor-packagecore-packagecurlbzip2xz-utilszlibcpp
28 stars 15.34 score 3.2k scripts 569 dependents
bioc
GenomicFeatures:Query the gene models of a given organism/assembly
Extract the genomic locations of genes, transcripts, exons, introns, and CDS, for the gene models stored in a TxDb object. A TxDb object is a small database that contains the gene models of a given organism/assembly. Bioconductor provides a small collection of TxDb objects in the form of ready-to-install TxDb packages for the most commonly studied organisms. Additionally, the user can easily make a TxDb object (or package) for the organism/assembly of their choice by using the tools from the txdbmaker package.
Maintained by H. Pagès. Last updated 5 months ago.
geneticsinfrastructureannotationsequencinggenomeannotationbioconductor-packagecore-package
26 stars 15.34 score 5.3k scripts 339 dependents
r-lib
covr:Test Coverage for Packages
Track and report code coverage for your package and (optionally) upload the results to a coverage service like 'Codecov' <https://about.codecov.io> or 'Coveralls' <https://coveralls.io>. Code coverage is a measure of the amount of code being exercised by a set of tests. It is an indirect measure of test quality and completeness. This package is compatible with any testing methodology or framework and tracks coverage of both R code and compiled C/C++/FORTRAN code.
Maintained by Jim Hester. Last updated 2 months ago.
codecovcoveragecoverage-reporttravis-ci
337 stars 15.25 score 2.3k scripts 9 dependents
bioc
GenomicAlignments:Representation and manipulation of short genomic alignments
Provides efficient containers for storing and manipulating short genomic alignments (typically obtained by aligning short reads to a reference genome). This includes read counting, computing the coverage, junction detection, and working with the nucleotide content of the alignments.
Maintained by Hervé Pagès. Last updated 5 months ago.
infrastructuredataimportgeneticssequencingrnaseqsnpcoveragealignmentimmunooncologybioconductor-packagecore-package
10 stars 15.21 score 3.1k scripts 528 dependents
sparklyr
sparklyr:R Interface to Apache Spark
R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.
Maintained by Edgar Ruiz. Last updated 11 days ago.
apache-sparkdistributeddplyridelivymachine-learningremote-clusterssparksparklyr
959 stars 15.20 score 4.0k scripts 21 dependents
bioc
AnnotationDbi:Manipulation of SQLite-based annotations in Bioconductor
Implements a user-friendly interface for querying SQLite-based annotation data packages.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
annotationmicroarraysequencinggenomeannotationbioconductor-packagecore-package
9 stars 15.05 score 3.6k scripts 769 dependents
bioc
DOSE:Disease Ontology Semantic and Enrichment analysis
This package implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively for measuring semantic similarities among DO terms and gene products. Enrichment analyses including hypergeometric model and gene set enrichment analysis are also implemented for discovering disease associations of high-throughput biological data.
Maintained by Guangchuang Yu. Last updated 5 months ago.
annotationvisualizationmultiplecomparisongenesetenrichmentpathwayssoftwaredisease-ontologyenrichment-analysissemantic-similarity
119 stars 14.97 score 2.0k scripts 61 dependents
tidyverse
googledrive:An Interface to Google Drive
Manage Google Drive files from R.
Maintained by Jennifer Bryan. Last updated 8 months ago.
329 stars 14.97 score 2.1k scripts 164 dependents
bioc
MultiAssayExperiment:Software for the integration of multi-omics experiments in Bioconductor
Harmonize data management of multiple experimental assays performed on an overlapping set of specimens. It provides a familiar Bioconductor user experience by extending concepts from SummarizedExperiment, supporting an open-ended mix of standard data classes for individual assays, and allowing subsetting by genomic ranges or rownames. Facilities are provided for reshaping data into wide and long formats for adaptability to graphing and downstream analysis.
Maintained by Marcel Ramos. Last updated 2 months ago.
infrastructuredatarepresentationbioconductorbioconductor-packagegenomicsnci-itcrtcgau24ca289073
71 stars 14.95 score 670 scripts 127 dependents
r-lib
gargle:Utilities for Working with Google APIs
Provides utilities for working with Google APIs <https://developers.google.com/apis-explorer>. This includes functions and classes for handling common credential types and for preparing, executing, and processing HTTP requests.
Maintained by Jennifer Bryan. Last updated 2 years ago.
113 stars 14.90 score 266 scripts 192 dependents
rstudio
rsconnect:Deploy Docs, Apps, and APIs to 'Posit Connect', 'shinyapps.io', and 'RPubs'
Programmatic deployment interface for 'RPubs', 'shinyapps.io', and 'Posit Connect'. Supported content types include R Markdown documents, Shiny applications, Plumber APIs, plots, and static web content.
Maintained by Aron Atkins. Last updated 29 days ago.
139 stars 14.90 score 3.1k scripts 6 dependents
ropensci
gert:Simple Git Client for R
Simple git client for R based on 'libgit2' <https://libgit2.org> with support for SSH and HTTPS remotes. All functions in 'gert' use basic R data types (such as vectors and data-frames) for their arguments and return values. User credentials are shared with command line 'git' through the git-credential store and ssh keys stored on disk or ssh-agent.
Maintained by Jeroen Ooms. Last updated 4 days ago.
154 stars 14.85 score 158 scripts 367 dependents
bioc
GSVA:Gene Set Variation Analysis for Microarray and RNA-Seq Data
Gene Set Variation Analysis (GSVA) is a non-parametric, unsupervised method for estimating variation of gene set enrichment through the samples of an expression data set. GSVA performs a change in coordinate systems, transforming the data from a gene by sample matrix to a gene-set by sample matrix, thereby allowing the evaluation of pathway enrichment for each sample. This new matrix of GSVA enrichment scores facilitates applying standard analytical methods like functional enrichment, survival analysis, clustering, CNV-pathway analysis or cross-tissue pathway analysis, in a pathway-centric manner.
Maintained by Robert Castelo. Last updated 8 days ago.
functionalgenomicsmicroarrayrnaseqpathwaysgenesetenrichmentgene-set-enrichmentgenomicspathway-enrichment-analysis
212 stars 14.74 score 1.6k scripts 19 dependents
florianhartig
DHARMa:Residual Diagnostics for Hierarchical (Multi-Level / Mixed) Regression Models
The 'DHARMa' package uses a simulation-based approach to create readily interpretable scaled (quantile) residuals for fitted (generalized) linear mixed models. Currently supported are linear and generalized linear (mixed) models from 'lme4' (classes 'lmerMod', 'glmerMod'), 'glmmTMB', 'GLMMadaptive', and 'spaMM'; phylogenetic linear models from 'phylolm' (classes 'phylolm' and 'phyloglm'); generalized additive models ('gam' from 'mgcv'); 'glm' (including 'negbin' from 'MASS', but excluding quasi-distributions) and 'lm' model classes. Moreover, externally created simulations, e.g. posterior predictive simulations from Bayesian software such as 'JAGS', 'STAN', or 'BUGS' can be processed as well. The resulting residuals are standardized to values between 0 and 1 and can be interpreted as intuitively as residuals from a linear regression. The package also provides a number of plot and test functions for typical model misspecification problems, such as over/underdispersion, zero-inflation, and residual spatial, phylogenetic and temporal autocorrelation.
Maintained by Florian Hartig. Last updated 25 days ago.
glmmregressionregression-diagnosticsresidual
226 stars 14.74 score 2.8k scripts 10 dependents
tidyverse
googlesheets4:Access Google Sheets using the Sheets API V4
Interact with Google Sheets through the Sheets API v4 <https://developers.google.com/sheets/api>. "API" is an acronym for "application programming interface"; the Sheets API allows users to interact with Google Sheets programmatically, instead of via a web browser. The "v4" refers to the fact that the Sheets API is currently at version 4. This package can read and write both the metadata and the cell data in a Sheet.
Maintained by Jennifer Bryan. Last updated 8 months ago.
google-drivegoogle-sheetsspreadsheet
363 stars 14.55 score 7.0k scripts 144 dependents
ropensci
osmdata:Import 'OpenStreetMap' Data as Simple Features or Spatial Objects
Download and import of 'OpenStreetMap' ('OSM') data as 'sf' or 'sp' objects. 'OSM' data are extracted from the 'Overpass' web server (<https://overpass-api.de/>) and processed with very fast 'C++' routines for return to 'R'.
Maintained by Mark Padgham. Last updated 1 month ago.
open0street0mapopenstreetmapoverpass0apiosmcpposm-dataoverpass-apipeer-reviewedcpp
322 stars 14.53 score 2.8k scripts 14 dependents
bioc
TCGAbiolinks:TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data
The aim of TCGAbiolinks is to: i) facilitate GDC open-access data retrieval, ii) prepare the data using the appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses and iv) easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.
Maintained by Tiago Chedraoui Silva. Last updated 1 month ago.
dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksequencingsurvivalsoftwarebiocbioconductorgdcintegrative-analysistcgatcga-datatcgabiolinks
310 stars 14.47 score 1.6k scripts 6 dependents
bioc
xcms:LC-MS and GC-MS Data Analysis
Framework for processing and visualization of chromatographically separated and single-spectra mass spectral data. Imports from AIA/ANDI NetCDF, mzXML, mzData and mzML files. Preprocesses data for high-throughput, untargeted analyte profiling.
Maintained by Steffen Neumann. Last updated 15 days ago.
immunooncologymassspectrometrymetabolomicsbioconductorfeature-detectionmass-spectrometrypeak-detectioncpp
196 stars 14.31 score 984 scripts 11 dependents
talgalili
heatmaply:Interactive Cluster Heat Maps Using 'plotly' and 'ggplot2'
Create interactive cluster 'heatmaps' that can be saved as a stand-alone HTML file, embedded in 'R Markdown' documents or in a 'Shiny' app, and available in the 'RStudio' viewer pane. Hover the mouse pointer over a cell to show details or drag a rectangle to zoom. A 'heatmap' is a popular graphical method for visualizing high-dimensional data, in which a table of numbers is encoded as a grid of colored cells. The rows and columns of the matrix are ordered to highlight patterns and are often accompanied by 'dendrograms'. 'Heatmaps' are used in many fields for visualizing observations, correlations, missing values patterns, and more. Interactive 'heatmaps' allow the inspection of specific values by hovering the mouse over a cell, as well as zooming into a region of the 'heatmap' by dragging a rectangle around the relevant area. This work is based on the 'ggplot2' and 'plotly.js' engine. It produces similar 'heatmaps' to 'heatmap.2' with the advantage of speed ('plotly.js' is able to handle larger matrices), the ability to zoom from the 'dendrogram' panes, and the placing of factor variables in the sides of the 'heatmap'.
Maintained by Tal Galili. Last updated 9 months ago.
d3-heatmapdendextenddendrogramggplot2heatmapplotly
386 stars 14.21 score 2.0k scripts 45 dependents
business-science
timetk:A Tool Kit for Working with Time Series
Easy visualization, wrangling, and feature engineering of time series data for forecasting and machine learning prediction. Consolidates and extends time series functionality from packages including 'dplyr', 'stats', 'xts', 'forecast', 'slider', 'padr', 'recipes', and 'rsample'.
Maintained by Matt Dancho. Last updated 1 year ago.
coercioncoercion-functionsdata-miningdplyrforecastforecastingforecasting-modelsmachine-learningseries-decompositionseries-signaturetibbletidytidyquanttidyversetimetime-seriestimeseries
626 stars 14.20 score 4.0k scripts 16 dependents
rstudio
pins:Pin, Discover, and Share Resources
Publish data sets, models, and other R objects, making it easy to share them across projects and with your colleagues. You can pin objects to a variety of "boards", including local folders (to share on a networked drive or with 'DropBox'), 'Posit Connect', 'AWS S3', and more.
Maintained by Julia Silge. Last updated 2 months ago.
azuregcloudrpinsrsconnects3storage
321 stars 14.17 score 1.9k scripts 17 dependents
dkahle
ggmap:Spatial Visualization with ggplot2
A collection of functions to visualize spatial data and models on top of static maps from various online sources (e.g., Google Maps and Stamen Maps). It includes tools common to those tasks, including functions for geolocation and routing.
Maintained by David Kahle. Last updated 1 year ago.
770 stars 14.17 score 12k scripts 31 dependents
doi-usgs
dataRetrieval:Retrieval Functions for USGS and EPA Hydrology and Water Quality Data
Collection of functions to help retrieve U.S. Geological Survey and U.S. Environmental Protection Agency water quality and hydrology data from web services. Data are discovered from National Water Information System <https://waterservices.usgs.gov/> and <https://waterdata.usgs.gov/nwis>. Water quality data are obtained from the Water Quality Portal <https://www.waterqualitydata.us/>.
Maintained by Laura DeCicco. Last updated 3 days ago.
286 stars 14.16 score 1.7k scripts 15 dependents
bioc
GOSemSim:GO-terms Semantic Similarity Measures
The semantic comparisons of Gene Ontology (GO) annotations provide quantitative ways to compute similarities between genes and gene groups, and have become an important basis for many bioinformatics analysis approaches. GOSemSim is an R package for semantic similarity computation among GO terms, sets of GO terms, gene products and gene clusters. GOSemSim implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang, respectively.
Maintained by Guangchuang Yu. Last updated 5 months ago.
annotationgoclusteringpathwaysnetworksoftwarebioinformaticsgene-ontologysemantic-similaritycpp
63 stars 14.12 score 708 scripts 68 dependents
bioc
BSgenome:Software infrastructure for efficient representation of full genomes and their SNPs
Infrastructure shared by all the Biostrings-based genome data packages.
Maintained by Hervé Pagès. Last updated 2 months ago.
geneticsinfrastructuredatarepresentationsequencematchingannotationsnpbioconductor-packagecore-package
9 stars 14.12 score 1.2k scripts 267 dependents
leifeld
texreg:Conversion of R Regression Output to LaTeX or HTML Tables
Converts coefficients, standard errors, significance stars, and goodness-of-fit statistics of statistical models into LaTeX tables or HTML tables/MS Word documents or to nicely formatted screen output for the R console for easy model comparison. A list of several models can be combined in a single table. The output is highly customizable. New model types can be easily implemented. Details can be found in Leifeld (2013), JStatSoft <doi:10.18637/jss.v055.i08>.
Maintained by Philip Leifeld. Last updated 3 months ago.
html-tableslatexlatex-tablesregressionreportingtabletexreg
113 stars 14.09 score 1.8k scripts 67 dependents
bioc
ensembldb:Utilities to create and use Ensembl-based annotation databases
The package provides functions to create and use transcript-centric annotation databases/packages. The annotations for the databases are fetched directly from Ensembl using their Perl API. The functionality and data are similar to those of the TxDb packages from the GenomicFeatures package, but, in addition to retrieving all gene/transcript models and annotations from the database, ensembldb provides a filter framework that allows retrieving annotations for specific entries like genes encoded on a chromosome region or transcript models of lincRNA genes. EnsDb databases built with ensembldb also contain protein annotations and mappings between proteins and their encoding transcripts. Finally, ensembldb provides functions to map between genomic, transcript and protein coordinates.
Maintained by Johannes Rainer. Last updated 5 months ago.
geneticsannotationdatasequencingcoverageannotationbioconductorbioconductor-packagesensembl
35 stars 14.08 score 892 scripts 108 dependents
walkerke
tidycensus:Load US Census Boundary and Attribute Data as 'tidyverse' and 'sf'-Ready Data Frames
An integrated R interface to several United States Census Bureau APIs (<https://www.census.gov/data/developers/data-sets.html>) and the US Census Bureau's geographic boundary files. Allows R users to return Census and ACS data as tidyverse-ready data frames, and optionally returns a list-column with feature geometry for mapping and spatial analysis.
Maintained by Kyle Walker. Last updated 2 months ago.
648 stars 14.02 score 7.5k scripts 10 dependents
bioc
phyloseq:Handling and analysis of high-throughput microbiome census data
phyloseq provides a set of classes and tools to facilitate the import, storage, analysis, and graphical display of microbiome census data.
Maintained by Paul J. McMurdie. Last updated 5 months ago.
immunooncologysequencingmicrobiomemetagenomicsclusteringclassificationmultiplecomparisongeneticvariability
597 stars 13.90 score 8.4k scripts 37 dependents
bioc
AnnotationHub:Client to access AnnotationHub resources
This package provides a client for the Bioconductor AnnotationHub web resource. The AnnotationHub web resource provides a central location where genomic files (e.g., VCF, bed, wig) and other resources from standard locations (e.g., UCSC, Ensembl) can be discovered. The resource includes metadata about each resource, e.g., a textual description, tags, and date of modification. The client creates and manages a local cache of files retrieved by the user, helping with quick and reproducible access.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
infrastructuredataimportguithirdpartyclientcore-packageu24ca289073
17 stars 13.88 score 2.7k scripts 104 dependents
config-i1
smooth:Forecasting Using State Space Models
Functions implementing Single Source of Error state space models for purposes of time series analysis and forecasting. The package includes ADAM (Svetunkov, 2023, <https://openforecast.org/adam/>), Exponential Smoothing (Hyndman et al., 2008, <doi: 10.1007/978-3-540-71918-2>), SARIMA (Svetunkov & Boylan, 2019 <doi: 10.1080/00207543.2019.1600764>), Complex Exponential Smoothing (Svetunkov & Kourentzes, 2018, <doi: 10.13140/RG.2.2.24986.29123>), Simple Moving Average (Svetunkov & Petropoulos, 2018 <doi: 10.1080/00207543.2017.1380326>) and several simulation functions. It also allows dealing with intermittent demand based on the iETS framework (Svetunkov & Boylan, 2019, <doi: 10.13140/RG.2.2.35897.06242>).
Maintained by Ivan Svetunkov. Last updated 12 days ago.
arimaarima-forecastingcesetsexponential-smoothingforecaststate-spacetime-seriesopenblascpp
90 stars 13.83 score 412 scripts 25 dependents
bioc
BiocFileCache:Manage Files Across Sessions
This package creates a persistent on-disk cache of files that the user can add, update, and retrieve. It is useful for managing resources (such as custom Txdb objects) that are costly or difficult to create, web resources, and data files used across sessions.
Maintained by Lori Shepherd. Last updated 2 months ago.
dataimportcore-packageu24ca289073
13 stars 13.76 score 486 scripts 436 dependents
ropensci
rentrez:'Entrez' in R
Provides an R interface to the NCBI's 'EUtils' API, allowing users to search databases like 'GenBank' <https://www.ncbi.nlm.nih.gov/genbank/> and 'PubMed' <https://pubmed.ncbi.nlm.nih.gov/>, process the results of those searches and pull data into their R sessions.
Maintained by David Winter. Last updated 4 years ago.
199 stars 13.72 score 784 scripts 98 dependents
ropensci
taxize:Taxonomic Information from Around the Web
Interacts with a suite of web application programming interfaces (API) for taxonomic tasks, such as getting database specific taxonomic identifiers, verifying species names, getting taxonomic hierarchies, fetching downstream and upstream taxonomic names, getting taxonomic synonyms, converting scientific to common names and vice versa, and more. Some of the services supported include 'NCBI E-utilities' (<https://www.ncbi.nlm.nih.gov/books/NBK25501/>), 'Encyclopedia of Life' (<https://eol.org/docs/what-is-eol/data-services>), 'Global Biodiversity Information Facility' (<https://techdocs.gbif.org/en/openapi/>), and many more. Links to the API documentation for other supported services are available in the documentation for their respective functions in this package.
Maintained by Zachary Foster. Last updated 25 days ago.
taxonomybiologynomenclaturejsonapiwebapi-clientidentifiersspeciesnamesapi-wrapperbiodiversitydarwincoredatataxize
274 stars 13.63 score 1.6k scripts 23 dependentsbioc
SingleCellExperiment:S4 Classes for Single Cell Data
Defines a S4 class for storing data from single-cell experiments. This includes specialized methods to store and retrieve spike-in information, dimensionality reduction coordinates and size factors for each cell, along with the usual metadata for genes and libraries.
Maintained by Davide Risso. Last updated 22 days ago.
immunooncologydatarepresentationdataimportinfrastructuresinglecell
13.53 score 15k scripts 285 dependentsbioc
KEGGREST:Client-side REST access to the Kyoto Encyclopedia of Genes and Genomes (KEGG)
A package that provides a client interface to the Kyoto Encyclopedia of Genes and Genomes (KEGG) REST API. Only for academic use by academic users belonging to academic institutions (see <https://www.kegg.jp/kegg/rest/>). Note that KEGGREST is based on KEGGSOAP by J. Zhang, R. Gentleman, and Marc Carlson, and KEGG (python package) by Aurelien Mazurie.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
annotationpathwaysthirdpartyclientkeggbioconductor-packagecore-package
10 stars 13.50 score 688 scripts 771 dependentsbioc
GEOquery:Get data from NCBI Gene Expression Omnibus (GEO)
The NCBI Gene Expression Omnibus (GEO) is a public repository of microarray data. Given the rich and varied nature of this resource, it is only natural to want to apply BioConductor tools to these data. GEOquery is the bridge between GEO and BioConductor.
Maintained by Sean Davis. Last updated 5 months ago.
microarraydataimportonechanneltwochannelsagebioconductorbioinformaticsdata-sciencegenomicsncbi-geo
93 stars 13.48 score 4.1k scripts 45 dependentsbioc
RCy3:Functions to Access and Control Cytoscape
Visualize, analyze and explore networks using Cytoscape via R. Anything you can do using the graphical user interface of Cytoscape, you can now do with a single RCy3 function.
Maintained by Alex Pico. Last updated 2 days ago.
visualizationgraphandnetworkthirdpartyclientnetwork
52 stars 13.47 score 628 scripts 17 dependentsropensci
RSelenium:R Bindings for 'Selenium WebDriver'
Provides a set of R bindings for the 'Selenium 2.0 WebDriver' (see <https://www.selenium.dev/documentation/> for more information) using the 'JsonWireProtocol' (see <https://github.com/SeleniumHQ/selenium/wiki/JsonWireProtocol> for more information). 'Selenium 2.0 WebDriver' allows driving a web browser natively, as a user would, either locally or on a remote machine using the Selenium server; it marks a leap forward in terms of web browser automation. Selenium automates web browsers (commonly referred to as browsers). Using RSelenium you can automate browsers locally or remotely.
Maintained by Jonathan Völkle. Last updated 2 years ago.
344 stars 13.38 score 1.9k scripts 12 dependentsbusiness-science
tidyquant:Tidy Quantitative Financial Analysis
Bringing business and financial analysis to the 'tidyverse'. The 'tidyquant' package provides a convenient wrapper to various 'xts', 'zoo', 'quantmod', 'TTR' and 'PerformanceAnalytics' package functions and returns the objects in the tidy 'tibble' format. The main advantage is being able to use quantitative functions with the 'tidyverse' functions including 'purrr', 'dplyr', 'tidyr', 'ggplot2', 'lubridate', etc. See the 'tidyquant' website for more information, documentation and examples.
Maintained by Matt Dancho. Last updated 1 months ago.
dplyrfinancial-analysisfinancial-datafinancial-statementsmultiple-stocksperformance-analysisperformanceanalyticsquantmodstockstock-exchangesstock-indexesstock-listsstock-performancestock-pricesstock-symboltidyversetime-seriestimeseriesxts
872 stars 13.34 score 5.2k scriptsropensci
rgbif:Interface to the Global Biodiversity Information Facility API
A programmatic interface to the Web Service methods provided by the Global Biodiversity Information Facility (GBIF; <https://www.gbif.org/developer/summary>). GBIF is a database of species occurrence records from sources all over the globe. rgbif includes functions for searching for taxonomic names, retrieving information on data providers, getting species occurrence records, getting counts of occurrence records, and using the GBIF tile map service to make rasters summarizing huge amounts of data.
Maintained by John Waller. Last updated 16 days ago.
gbifspecimensapiweb-servicesoccurrencesspeciestaxonomybiodiversitydatalifewatchoscibiospocc
161 stars 13.26 score 2.1k scripts 20 dependentsbioc
dada2:Accurate, high-resolution sample inference from amplicon sequencing data
The dada2 package infers exact amplicon sequence variants (ASVs) from high-throughput amplicon sequencing data, replacing the coarser and less accurate OTU clustering approach. The dada2 pipeline takes as input demultiplexed fastq files, and outputs the sequence variants and their sample-wise abundances after removing substitution and chimera errors. Taxonomic classification is available via a native implementation of the RDP naive Bayesian classifier, and species-level assignment to 16S rRNA gene fragments by exact matching.
Maintained by Benjamin Callahan. Last updated 5 months ago.
immunooncologymicrobiomesequencingclassificationmetagenomicsampliconbioconductorbioinformaticsmetabarcodingtaxonomycpp
487 stars 13.17 score 3.0k scripts 4 dependentsbioc
scran:Methods for Single-Cell RNA-Seq Data Analysis
Implements miscellaneous functions for interpretation of single-cell RNA-seq data. Methods are provided for assignment of cell cycle phase, detection of highly variable and significantly correlated genes, identification of marker genes, and other common tasks in routine single-cell analysis workflows.
Maintained by Aaron Lun. Last updated 5 months ago.
immunooncologynormalizationsequencingrnaseqsoftwaregeneexpressiontranscriptomicssinglecellclusteringbioconductor-packagehuman-cell-atlassingle-cell-rna-seqopenblascpp
41 stars 13.05 score 7.6k scripts 37 dependentsbioc
Gviz:Plotting data and annotation information along genomic coordinates
Genomic data analysis requires integrated visualization of known genomic information and new experimental data. Gviz uses the biomaRt and the rtracklayer packages to perform live annotation queries to Ensembl and UCSC and translates this to e.g. gene/transcript structures in viewports of the grid graphics package. This results in genomic information plotted together with your data.
Maintained by Robert Ivanek. Last updated 5 months ago.
visualizationmicroarraysequencing
79 stars 13.05 score 1.4k scripts 46 dependentsbioc
ChIPseeker:ChIPseeker for ChIP peak Annotation, Comparison, and Visualization
This package implements functions to retrieve the nearest genes around the peak, annotate the genomic region of the peak, estimate the statistical significance of overlap among ChIP peak data sets, and incorporate the GEO database so that users can compare their own dataset with those deposited in the database. The comparison can be used to infer cooperative regulation and thus can be used to generate hypotheses. Several visualization functions are implemented to summarize the coverage of the peak experiment, average profile and heatmap of peaks binding to TSS regions, genomic annotation, distance to TSS, and overlap of peaks or genes.
Maintained by Guangchuang Yu. Last updated 5 months ago.
annotationchipseqsoftwarevisualizationmultiplecomparisonatac-seqchip-seqcomparisonepigeneticsepigenomics
233 stars 13.05 score 1.6k scripts 5 dependentsropensci
piggyback:Managing Larger Data on a GitHub Repository
Helps store files as GitHub release assets, which is a convenient way for large/binary data files to piggyback onto public and private GitHub repositories. Includes functions for file downloads, uploads, and managing releases via the GitHub API.
Maintained by Carl Boettiger. Last updated 4 months ago.
data-storegit-lfspeer-reviewed
187 stars 12.98 score 187 scripts 12 dependentstidyverse
ellmer:Chat with Large Language Models
Chat with large language models from a range of providers including 'Claude' <https://claude.ai>, 'OpenAI' <https://chatgpt.com>, and more. Supports streaming, asynchronous calls, tool calling, and structured data extraction.
Maintained by Hadley Wickham. Last updated 18 hours ago.
407 stars 12.94 score 98 scripts 8 dependentsmichaelhallquist
MplusAutomation:An R Package for Facilitating Large-Scale Latent Variable Analyses in Mplus
Leverages the R language to automate latent variable model estimation and interpretation using 'Mplus', a powerful latent variable modeling program developed by Muthen and Muthen (<https://www.statmodel.com>). Specifically, this package provides routines for creating related groups of models, running batches of models, and extracting and tabulating model parameters and fit statistics.
Maintained by Michael Hallquist. Last updated 4 days ago.
86 stars 12.92 score 664 scripts 13 dependentswalkerke
tigris:Load Census TIGER/Line Shapefiles
Download TIGER/Line shapefiles from the United States Census Bureau (<https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html>) and load into R as 'sf' objects.
Maintained by Kyle Walker. Last updated 5 months ago.
331 stars 12.87 score 5.3k scripts 16 dependentsrinterface
bs4Dash:A 'Bootstrap 4' Version of 'shinydashboard'
Make 'Bootstrap 4' Shiny dashboards. Use the full power of 'AdminLTE3', a dashboard template built on top of 'Bootstrap 4' <https://github.com/ColorlibHQ/AdminLTE>.
Maintained by David Granjon. Last updated 7 months ago.
bootstrap4dashboard-templateshacktoberfest2022shinyshiny-appsshinydashboard
442 stars 12.87 score 1.2k scripts 15 dependentsbioc
iSEE:Interactive SummarizedExperiment Explorer
Create an interactive Shiny-based graphical user interface for exploring data stored in SummarizedExperiment objects, including row- and column-level metadata. The interface supports transmission of selections between plots and tables, code tracking, interactive tours, interactive or programmatic initialization, preservation of app state, and extensibility to new panel types via S4 classes. Special attention is given to single-cell data in a SingleCellExperiment object with visualization of dimensionality reduction results.
Maintained by Kevin Rue-Albrecht. Last updated 23 days ago.
cellbasedassaysclusteringdimensionreductionfeatureextractiongeneexpressionguiimmunooncologyshinyappssinglecelltranscriptiontranscriptomicsvisualizationdimension-reductionfeature-extractiongene-expressionhacktoberfesthuman-cell-atlasshinysingle-cell
225 stars 12.86 score 380 scripts 9 dependentsmarkedmondson1234
googleAuthR:Authenticate and Create Google APIs
Create R functions that interact with OAuth2 Google APIs <https://developers.google.com/apis-explorer/> easily, with auto-refresh and Shiny compatibility.
Maintained by Erik Grönroos. Last updated 10 months ago.
apiauthenticationgooglegoogleauthroauth2-flowshiny
178 stars 12.85 score 804 scripts 13 dependentsbioc
SingleR:Reference-Based Single-Cell RNA-Seq Annotation
Performs unbiased cell type recognition from single-cell RNA sequencing data, by leveraging reference transcriptomic datasets of pure cell types to infer the cell of origin of each single cell independently.
Maintained by Aaron Lun. Last updated 1 months ago.
softwaresinglecellgeneexpressiontranscriptomicsclassificationclusteringannotationbioconductorsinglercpp
184 stars 12.83 score 2.1k scripts 2 dependentsbioc
minfi:Analyze Illumina Infinium DNA methylation arrays
Tools to analyze & visualize Illumina Infinium methylation arrays.
Maintained by Kasper Daniel Hansen. Last updated 4 months ago.
immunooncologydnamethylationdifferentialmethylationepigeneticsmicroarraymethylationarraymultichanneltwochanneldataimportnormalizationpreprocessingqualitycontrol
60 stars 12.82 score 996 scripts 27 dependentstkonopka
umap:Uniform Manifold Approximation and Projection
Uniform manifold approximation and projection is a technique for dimension reduction. The algorithm was described by McInnes and Healy (2018) in <arXiv:1802.03426>. This package provides an interface for two implementations. One is written from scratch, including components for nearest-neighbor search and for embedding. The second implementation is a wrapper for 'python' package 'umap-learn' (requires separate installation, see vignette for more details).
Maintained by Tomasz Konopka. Last updated 11 months ago.
dimensionality-reductionumapcpp
132 stars 12.82 score 3.6k scripts 45 dependentsbioc
MSnbase:Base Functions and Classes for Mass Spectrometry and Proteomics
MSnbase provides infrastructure for manipulation, processing and visualisation of mass spectrometry and proteomics data, ranging from raw to quantitative and annotated data.
Maintained by Laurent Gatto. Last updated 15 days ago.
immunooncologyinfrastructureproteomicsmassspectrometryqualitycontroldataimportbioconductorbioinformaticsmass-spectrometryproteomics-datavisualisationcpp
131 stars 12.76 score 772 scripts 36 dependentsbioc
plyranges:A fluent interface for manipulating GenomicRanges
A dplyr-like interface for interacting with the common Bioconductor classes Ranges and GenomicRanges. By providing a grammatical and consistent way of manipulating these classes, their accessibility for new Bioconductor users is hopefully increased.
Maintained by Michael Love. Last updated 10 days ago.
infrastructuredatarepresentationworkflowstepcoveragebioconductordata-analysisdplyrgenomic-rangesgenomicstidy-data
144 stars 12.66 score 1.9k scripts 20 dependentsbioc
rtracklayer:R interface to genome annotation files and the UCSC genome browser
Extensible framework for interacting with multiple genome browsers (currently UCSC built-in) and manipulating annotation tracks in various formats (currently GFF, BED, bedGraph, BED15, WIG, BigWig and 2bit built-in). The user may export/import tracks to/from the supported browsers, as well as query and modify the browser state, such as the current viewport.
Maintained by Michael Lawrence. Last updated 4 days ago.
annotationvisualizationdataimportzlibopensslcurl
12.66 score 6.7k scripts 480 dependentsinsightsengineering
teal:Exploratory Web Apps for Analyzing Clinical Trials Data
A 'shiny' based interactive exploration framework for analyzing clinical trials data. 'teal' currently provides a dynamic filtering facility and different data viewers. 'teal' 'shiny' applications are built using standard 'shiny' modules.
Maintained by Dawid Kaledkowski. Last updated 1 months ago.
clinical-trialsnestshinywebapp
206 stars 12.65 score 176 scripts 5 dependentsbioc
SpatialExperiment:S4 Class for Spatially Resolved -omics Data
Defines an S4 class for storing data from spatial -omics experiments. The class extends SingleCellExperiment to support storage and retrieval of additional information from spot-based and molecule-based platforms, including spatial coordinates, images, and image metadata. A specialized constructor function is included for data from the 10x Genomics Visium platform.
Maintained by Dario Righelli. Last updated 5 months ago.
datarepresentationdataimportinfrastructureimmunooncologygeneexpressiontranscriptomicssinglecellspatial
59 stars 12.63 score 1.8k scripts 71 dependentshrbrmstr
ggalt:Extra Coordinate Systems, 'Geoms', Statistical Transformations, Scales and Fonts for 'ggplot2'
A compendium of new geometries, coordinate systems, statistical transformations, scales and fonts for 'ggplot2', including splines, 1d and 2d densities, univariate average shifted histograms, a new map coordinate system based on the 'PROJ.4'-library along with geom_cartogram() that mimics the original functionality of geom_map(), formatters for "bytes", a stat_stepribbon() function, increased 'plotly' compatibility and the 'StateFace' open source font 'ProPublica'. Further new functionality includes lollipop charts, dumbbell charts, the ability to encircle points and coordinate-system-based text annotations.
Maintained by Bob Rudis. Last updated 2 years ago.
geomggplot-extensionggplot2ggplot2-geomggplot2-scales
676 stars 12.60 score 2.3k scripts 7 dependentsmassimoaria
bibliometrix:Comprehensive Science Mapping Analysis
Tool for quantitative research in scientometrics and bibliometrics. It implements the comprehensive workflow for science mapping analysis proposed in Aria M. and Cuccurullo C. (2017) <doi:10.1016/j.joi.2017.08.007>. 'bibliometrix' provides various routines for importing bibliographic data from 'SCOPUS', 'Clarivate Analytics Web of Science' (<https://www.webofknowledge.com/>), 'Digital Science Dimensions' (<https://www.dimensions.ai/>), 'OpenAlex' (<https://openalex.org/>), 'Cochrane Library' (<https://www.cochranelibrary.com/>), 'Lens' (<https://lens.org>), and 'PubMed' (<https://pubmed.ncbi.nlm.nih.gov/>) databases, performing bibliometric analysis and building networks for co-citation, coupling, scientific collaboration and co-word analysis.
Maintained by Massimo Aria. Last updated 10 days ago.
bibliometric-analysisbibliometricscitationcitation-networkcitationsco-authorsco-occurenceco-word-analysiscorrespondence-analysiscouplingisi-webjournalmanuscriptquantitative-analysisscholarssciencescience-mappingscientificscientometricsscopus
545 stars 12.54 score 518 scripts 2 dependentsbioc
microbiome:Microbiome Analytics
Utilities for microbiome analysis.
Maintained by Leo Lahti. Last updated 5 months ago.
metagenomicsmicrobiomesequencingsystemsbiologyhitchiphitchip-atlashuman-microbiomemicrobiologymicrobiome-analysisphyloseqpopulation-study
293 stars 12.51 score 2.0k scripts 5 dependentsr-dbi
bigrquery:An Interface to Google's 'BigQuery' 'API'
Easily talk to Google's 'BigQuery' database from R.
Maintained by Hadley Wickham. Last updated 1 months ago.
520 stars 12.47 score 1.8k scripts 4 dependentscloudyr
aws.s3:'AWS S3' Client Package
A simple client package for the Amazon Web Services ('AWS') Simple Storage Service ('S3') 'REST' 'API' <https://aws.amazon.com/s3/>.
Maintained by Simon Urbanek. Last updated 5 years ago.
amazonawsaws-s3cloudyrs3s3-storage
383 stars 12.47 score 1.4k scripts 17 dependentsr-lib
credentials:Tools for Managing SSH and Git Credentials
Setup and retrieve HTTPS and SSH credentials for use with 'git' and other services. For HTTPS remotes the package interfaces the 'git-credential' utility which 'git' uses to store HTTP usernames and passwords. For SSH remotes we provide convenient functions to find or generate appropriate SSH keys. The package both helps the user to setup a local git installation, and also provides a back-end for git/ssh client libraries to authenticate with existing user credentials.
Maintained by Jeroen Ooms. Last updated 6 months ago.
72 stars 12.40 score 91 scripts 380 dependentsbioc
scDblFinder:scDblFinder
The scDblFinder package gathers various methods for the detection and handling of doublets/multiplets in single-cell sequencing data (i.e. multiple cells captured within the same droplet or reaction volume). It includes methods formerly found in the scran package, the new fast and comprehensive scDblFinder method, and a reimplementation of the Amulet detection method for single-cell ATAC-seq.
Maintained by Pierre-Luc Germain. Last updated 9 days ago.
preprocessingsinglecellrnaseqatacseqdoubletssingle-cell
184 stars 12.38 score 888 scripts 1 dependentsbioc
TFBSTools:Software Package for Transcription Factor Binding Site (TFBS) Analysis
TFBSTools is a package for the analysis and manipulation of transcription factor binding sites. It includes matrix conversion between Position Frequency Matrix (PFM), Position Weight Matrix (PWM) and Information Content Matrix (ICM). It can also scan putative TFBS from sequence/alignment, query the JASPAR database, and provide a wrapper for de novo motif discovery software.
Maintained by Ge Tan. Last updated 17 days ago.
motifannotationgeneregulationmotifdiscoverytranscriptionalignment
28 stars 12.36 score 1.1k scripts 18 dependentsouhscbbmc
REDCapR:Interaction Between R and REDCap
Encapsulates functions to streamline calls from R to the REDCap API. REDCap (Research Electronic Data CAPture) is a web application for building and managing online surveys and databases developed at Vanderbilt University. The Application Programming Interface (API) offers an avenue to access and modify data programmatically, improving the capacity for literate and reproducible programming.
Maintained by Will Beasley. Last updated 3 months ago.
118 stars 12.36 score 438 scripts 6 dependentsropensci
stplanr:Sustainable Transport Planning
Tools for transport planning with an emphasis on spatial transport data and non-motorized modes. The package was originally developed to support the 'Propensity to Cycle Tool', a publicly available strategic cycle network planning tool (Lovelace et al. 2017) <doi:10.5198/jtlu.2016.862>, but has since been extended to support public transport routing and accessibility analysis (Moreno-Monroy et al. 2017) <doi:10.1016/j.jtrangeo.2017.08.012> and routing with locally hosted routing engines such as 'OSRM' (Lowans et al. 2023) <doi:10.1016/j.enconman.2023.117337>. The main functions are for creating and manipulating geographic "desire lines" from origin-destination (OD) data (building on the 'od' package); calculating routes on the transport network locally and via interfaces to routing services such as <https://cyclestreets.net/> (Desjardins et al. 2021) <doi:10.1007/s11116-021-10197-1>; and calculating route segment attributes such as bearing. The package implements the 'travel flow aggregation' method described in Morgan and Lovelace (2020) <doi:10.1177/2399808320942779> and the 'OD jittering' method described in Lovelace et al. (2022) <doi:10.32866/001c.33873>. Further information on the package's aim and scope can be found in the vignettes and in a paper in the R Journal (Lovelace and Ellison 2018) <doi:10.32614/RJ-2018-053>, and in a paper outlining the landscape of open source software for geographic methods in transport planning (Lovelace, 2021) <doi:10.1007/s10109-020-00342-2>.
Maintained by Robin Lovelace. Last updated 7 months ago.
cyclecyclingdesire-linesorigin-destinationpeer-reviewedpubic-transportroute-networkroutesroutingspatialtransporttransport-planningtransportationwalking
427 stars 12.31 score 684 scripts 3 dependentsr-lib
keyring:Access the System Credential Store from R
Platform independent 'API' to access the operating system's credential store. Currently supports: 'Keychain' on 'macOS', Credential Store on 'Windows', the Secret Service 'API' on 'Linux', and simple, platform independent stores implemented with environment variables or encrypted files. Additional storage back-ends can be added easily.
Maintained by Gábor Csárdi. Last updated 27 days ago.
198 stars 12.29 score 976 scripts 56 dependentsbioc
bsseq:Analyze, manage and store whole-genome methylation data
A collection of tools for analyzing and visualizing whole-genome methylation data from sequencing. This includes whole-genome bisulfite sequencing and Oxford nanopore data.
Maintained by Kasper Daniel Hansen. Last updated 3 months ago.
37 stars 12.26 score 676 scripts 15 dependentsbioc
ReactomePA:Reactome Pathway Analysis
This package provides functions for pathway analysis based on REACTOME pathway database. It implements enrichment analysis, gene set enrichment analysis and several functions for visualization. This package is not affiliated with the Reactome team.
Maintained by Guangchuang Yu. Last updated 5 months ago.
pathwaysvisualizationannotationmultiplecomparisongenesetenrichmentreactomeenrichment-analysisreactome-pathway-analysisreactomepa
40 stars 12.25 score 1.5k scripts 7 dependentsbioc
ggbio:Visualization tools for genomic data
The ggbio package extends and specializes the grammar of graphics for biological data. The graphics are designed to answer common scientific questions, in particular those often asked of high throughput genomics data. All core Bioconductor data structures are supported, where appropriate. The package supports detailed views of particular genomic regions, as well as genome-wide overviews. Supported overviews include ideograms and grand linear views. High-level plots include sequence fragment length, edge-linked interval to data view, mismatch pileup, and several splicing summaries.
Maintained by Michael Lawrence. Last updated 5 months ago.
111 stars 12.23 score 734 scripts 16 dependentsstuart-lab
Signac:Analysis of Single-Cell Chromatin Data
A framework for the analysis and exploration of single-cell chromatin data. The 'Signac' package contains functions for quantifying single-cell chromatin data, computing per-cell quality control metrics, dimension reduction and normalization, visualization, and DNA sequence motif analysis. Reference: Stuart et al. (2021) <doi:10.1038/s41592-021-01282-5>.
Maintained by Tim Stuart. Last updated 7 months ago.
atacbioinformaticssingle-cellzlibcpp
355 stars 12.18 score 3.7k scripts 1 dependentsbioc
glmGamPoi:Fit a Gamma-Poisson Generalized Linear Model
Fit linear models to overdispersed count data. The package can estimate the overdispersion and fit repeated models for matrix input. It is designed to handle large input datasets as they typically occur in single cell RNA-seq experiments.
Maintained by Constantin Ahlmann-Eltze. Last updated 12 days ago.
regressionrnaseqsoftwaresinglecellgamma-poissonglmnegative-binomial-regressionon-diskopenblascpp
111 stars 12.16 score 1.0k scripts 4 dependentsrstudio
shinytest2:Testing for Shiny Applications
Automated unit testing of Shiny applications through a headless 'Chromium' browser.
Maintained by Barret Schloerke. Last updated 3 days ago.
108 stars 12.13 score 704 scripts 1 dependentsbioc
SeqArray:Data management of large-scale whole-genome sequence variant calls using GDS files
Data management of large-scale whole-genome sequencing variant calls with thousands of individuals: genotypic data (e.g., SNVs, indels and structural variation calls) and annotations in SeqArray GDS files are stored in an array-oriented and compressed manner, with efficient data access using the R programming language.
Maintained by Xiuwen Zheng. Last updated 6 days ago.
infrastructuredatarepresentationsequencinggeneticsbioinformaticsgds-formatsnpsnvweswgscpp
45 stars 12.11 score 1.1k scripts 9 dependentsbioc
ShortRead:FASTQ input and manipulation
This package implements sampling, iteration, and input of FASTQ files. The package includes functions for filtering and trimming reads, and for generating a quality assessment report. Data are represented as DNAStringSet-derived objects, and easily manipulated for a diversity of purposes. The package also contains legacy support for early single-end, ungapped alignment formats.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
dataimportsequencingqualitycontrolbioconductor-packagecore-packagezlibcpp
8 stars 12.08 score 1.8k scripts 49 dependentsropensci
RefManageR:Straightforward 'BibTeX' and 'BibLaTeX' Bibliography Management
Provides tools for importing and working with bibliographic references. It greatly enhances the 'bibentry' class by providing a class 'BibEntry' which stores 'BibTeX' and 'BibLaTeX' references, supports 'UTF-8' encoding, and can be easily searched by any field, by date ranges, and by various formats for name lists (author by last names, translator by full names, etc.). Entries can be updated, combined, sorted, printed in a number of styles, and exported. 'BibTeX' and 'BibLaTeX' '.bib' files can be read into 'R' and converted to 'BibEntry' objects. Interfaces to 'NCBI Entrez', 'CrossRef', and 'Zotero' are provided for importing references and references can be created from locally stored 'PDF' files using 'Poppler'. Includes functions for citing and generating a bibliography with hyperlinks for documents prepared with 'RMarkdown' or 'RHTML'.
Maintained by Mathew W. McLean. Last updated 4 months ago.
115 stars 12.06 score 2.3k scripts 16 dependentsropensci
rotl:Interface to the 'Open Tree of Life' API
An interface to the 'Open Tree of Life' API to retrieve phylogenetic trees, information about studies used to assemble the synthetic tree, and utilities to match taxonomic names to 'Open Tree identifiers'. The 'Open Tree of Life' aims at assembling a comprehensive phylogenetic tree for all named species.
Maintained by Francois Michonneau. Last updated 2 years ago.
metadataropensciphylogeneticsindependant-contrastsbiodiversitypeer-reviewedphylogenytaxonomy
40 stars 12.05 score 356 scripts 29 dependentsbioc
slingshot:Tools for ordering single-cell sequencing
Provides functions for inferring continuous, branching lineage structures in low-dimensional data. Slingshot was designed to model developmental trajectories in single-cell RNA sequencing data and serve as a component in an analysis pipeline after dimensionality reduction and clustering. It is flexible enough to handle arbitrarily many branching events and allows for the incorporation of prior knowledge through supervised graph construction.
Maintained by Kelly Street. Last updated 5 months ago.
clusteringdifferentialexpressiongeneexpressionrnaseqsequencingsoftwaresinglecelltranscriptomicsvisualization
283 stars 12.01 score 1.0k scripts 4 dependentsbioc
ExperimentHub:Client to access ExperimentHub resources
This package provides a client for the Bioconductor ExperimentHub web resource. ExperimentHub provides a central location where curated data from experiments, publications or training courses can be accessed. Each resource has associated metadata, tags and date of modification. The client creates and manages a local cache of files retrieved enabling quick and reproducible access.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
infrastructuredataimportguithirdpartyclientcore-packageu24ca289073
10 stars 11.94 score 764 scripts 57 dependentsbioc
GenomicDataCommons:NIH / NCI Genomic Data Commons Access
Programmatically access the NIH / NCI Genomic Data Commons RESTful service.
Maintained by Sean Davis. Last updated 2 months ago.
dataimportsequencingapi-clientbioconductorbioinformaticscancercore-servicesdata-sciencegenomicsncitcgavignette
87 stars 11.94 score 238 scripts 12 dependentsjinghuazhao
gap:Genetic Analysis Package
As first reported [Zhao, J. H. 2007. "gap: Genetic Analysis Package". J Stat Soft 23(8):1-18. <doi:10.18637/jss.v023.i08>], it is designed as an integrated package for genetic data analysis of both population and family data. Currently, it contains functions for sample size calculations of both population-based and family-based designs, probability of familial disease aggregation, kinship calculation, statistics in linkage analysis, and association analysis involving genetic markers including haplotype analysis with or without environmental covariates. Over the years, the package has been developed in between many projects, which is also in line with its name (gap).
Maintained by Jing Hua Zhao. Last updated 4 days ago.
12 stars 11.94 score 448 scripts 16 dependentspecanproject
PEcAn.DB:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PECAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by David LeBauer. Last updated 2 days ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
216 stars 11.89 score 127 scripts 27 dependentsbioc
QFeatures:Quantitative features for mass spectrometry data
The QFeatures infrastructure enables the management and processing of quantitative features for high-throughput mass spectrometry assays. It provides a familiar Bioconductor user experience to manage quantitative data across different assay levels (such as peptide spectrum matches, peptides and proteins) in a coherent and tractable format.
Maintained by Laurent Gatto. Last updated 25 days ago.
infrastructuremassspectrometryproteomicsmetabolomicsbioconductormass-spectrometry
27 stars 11.87 score 278 scripts 49 dependentsbioc
methylKit:DNA methylation analysis from high-throughput bisulfite sequencing results
methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to deal with sequencing data from RRBS and its variants, but also target-capture methods and whole genome bisulfite sequencing. It also has functions to analyze base-pair resolution 5hmC data from experimental protocols such as oxBS-Seq and TAB-Seq. Methylation calling can be performed directly from Bismark aligned BAM files.
Maintained by Altuna Akalin. Last updated 29 days ago.
dnamethylationsequencingmethylseqgenome-biologymethylationstatistical-analysisvisualizationcurlbzip2xz-utilszlibcpp
220 stars 11.80 score 578 scripts 3 dependentsmrkaye97
slackr:Send Messages, Images, R Objects and Files to 'Slack' Channels/Users
'Slack' <https://slack.com/> provides a service for teams to collaborate by sharing messages, images, links, files and more. Functions are provided that make it possible to interact with the 'Slack' platform 'API'. When you need to share information or data from R, rather than resort to copy/paste in e-mails or other services like 'Skype' <https://www.skype.com/en/>, you can use this package to send well-formatted output from multiple R objects and expressions to all teammates at the same time with little effort. You can also send images from the current graphics device, R objects, and upload files.
Maintained by Matt Kaye. Last updated 5 months ago.
306 stars 11.66 score 179 scriptshaleyjeppson
ggmosaic:Mosaic Plots in the 'ggplot2' Framework
Mosaic plots in the 'ggplot2' framework. Mosaic plot functionality is provided in a single 'ggplot2' layer by calling the geom 'mosaic'.
Maintained by Haley Jeppson. Last updated 6 months ago.
167 stars 11.63 score 1.8k scripts 4 dependentspecanproject
PEcAn.data.atmosphere:PEcAn Functions Used for Managing Climate Driver Data
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The PECAn.data.atmosphere package converts climate driver data into a standard format for models integrated into PEcAn. As a standalone package, it provides an interface to access diverse climate data sets.
Maintained by David LeBauer. Last updated 2 days ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
216 stars 11.61 score 64 scripts 14 dependentsbioc
bumphunter:Bump Hunter
Tools for finding bumps in genomic data
Maintained by Tamilselvi Guharaj. Last updated 5 months ago.
dnamethylationepigeneticsinfrastructuremultiplecomparisonimmunooncology
16 stars 11.61 score 210 scripts 43 dependentsworkflowr
workflowr:A Framework for Reproducible and Collaborative Data Science
Provides a workflow for your analysis projects by combining literate programming ('knitr' and 'rmarkdown') and version control ('Git', via 'git2r') to generate a website containing time-stamped, versioned, and documented results.
Maintained by John Blischak. Last updated 4 months ago.
gitproject-managementrmarkdownwebsiteworkflow
848 stars 11.53 score 566 scriptsurbananalyst
dodgr:Distances on Directed Graphs
Distances on dual-weighted directed graphs using priority-queue shortest paths (Padgham (2019) <doi:10.32866/6945>). Weighted directed graphs have weights from A to B which may differ from those from B to A. Dual-weighted directed graphs have two sets of such weights. A canonical example is a street network to be used for routing in which routes are calculated by weighting distances according to the type of way and mode of transport, yet lengths of routes must be calculated from direct distances.
Maintained by Mark Padgham. Last updated 20 hours ago.
distanceopenstreetmaproutershortest-pathsstreet-networkscpp
129 stars 11.52 score 229 scripts 4 dependentsbioc
systemPipeR:systemPipeR: Workflow Environment for Data Analysis and Report Generation
systemPipeR is a multipurpose data analysis workflow environment that unifies R with command-line tools. It enables scientists to analyze many types of large- or small-scale data on local or distributed computer systems with a high level of reproducibility, scalability and portability. At its core is a command-line interface (CLI) that adopts the Common Workflow Language (CWL). This design allows users to choose for each analysis step the optimal R or command-line software. It supports both end-to-end and partial execution of workflows with built-in restart functionalities. Efficient management of complex analysis tasks is accomplished by a flexible workflow control container class. Handling of large numbers of input samples and experimental designs is facilitated by consistent sample annotation mechanisms. As a multi-purpose workflow toolkit, systemPipeR enables users to run existing workflows, customize them or design entirely new ones while taking advantage of widely adopted data structures within the Bioconductor ecosystem. Another important core functionality is the generation of reproducible scientific analysis and technical reports. For result interpretation, systemPipeR offers a wide range of plotting functionality, while an associated Shiny App offers many useful functionalities for interactive result exploration. The vignettes linked from this page include (1) a general introduction, (2) a description of technical details, and (3) a collection of workflow templates.
Maintained by Thomas Girke. Last updated 5 months ago.
geneticsinfrastructuredataimportsequencingrnaseqriboseqchipseqmethylseqsnpgeneexpressioncoveragegenesetenrichmentalignmentqualitycontrolimmunooncologyreportwritingworkflowstepworkflowmanagement
53 stars 11.52 score 344 scripts 3 dependentsbioc
mia:Microbiome analysis
mia implements tools for microbiome analysis based on the SummarizedExperiment, SingleCellExperiment and TreeSummarizedExperiment infrastructure. Data wrangling and analysis in the context of taxonomic data is the main scope. Additional functions for common tasks are implemented, such as community indices calculation and summarization.
Maintained by Tuomas Borman. Last updated 2 days ago.
microbiomesoftwaredataimportanalysisbioconductorcpp
51 stars 11.51 score 316 scripts 5 dependentsr-lib
gmailr:Access the 'Gmail' 'RESTful' API
An interface to the 'Gmail' 'RESTful' API. Allows access to your 'Gmail' messages, threads, drafts and labels.
Maintained by Jennifer Bryan. Last updated 1 year ago.
230 stars 11.50 score 289 scripts 1 dependentsbioc
msa:Multiple Sequence Alignment
The 'msa' package provides a unified R/Bioconductor interface to the multiple sequence alignment algorithms ClustalW, ClustalOmega, and Muscle. All three algorithms are integrated in the package, therefore, they do not depend on any external software tools and are available for all major platforms. The multiple sequence alignment algorithms are complemented by a function for pretty-printing multiple sequence alignments using the LaTeX package TeXshade.
Maintained by Ulrich Bodenhofer. Last updated 1 month ago.
multiplesequencealignmentalignmentmultiplecomparisonsequencingcpp
17 stars 11.46 score 744 scripts 6 dependentsbioc
destiny:Creates diffusion maps
Create and plot diffusion maps.
Maintained by Philipp Angerer. Last updated 4 months ago.
cellbiologycellbasedassaysclusteringsoftwarevisualizationdiffusion-mapsdimensionality-reductioncpp
82 stars 11.44 score 792 scripts 1 dependentsrolkra
explore:Simplifies Exploratory Data Analysis
Interactive data exploration with one line of code, automated reporting, or an easy-to-remember set of tidy functions for low-code exploratory data analysis.
Maintained by Roland Krasser. Last updated 4 months ago.
data-explorationdata-visualisationdecision-treesedarmarkdownshinytidy
228 stars 11.43 score 221 scripts 1 dependentsbioc
annotate:Annotation for microarrays
Using R environments for annotation.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
11.41 score 812 scripts 239 dependentsbioc
VariantAnnotation:Annotation of Genetic Variants
Annotate variants, compute amino acid coding changes, predict coding outcomes.
Maintained by Bioconductor Package Maintainer. Last updated 3 months ago.
dataimportsequencingsnpannotationgeneticsvariantannotationcurlbzip2xz-utilszlib
11.39 score 1.9k scripts 152 dependentsbioc
PharmacoGx:Analysis of Large-Scale Pharmacogenomic Data
Contains a set of functions to perform large-scale analysis of pharmaco-genomic data. These include the PharmacoSet object for storing the results of pharmacogenomic experiments, as well as a number of functions for computing common summaries of drug-dose response and correlating them with the molecular features in a cancer cell-line.
Maintained by Benjamin Haibe-Kains. Last updated 3 months ago.
geneexpressionpharmacogeneticspharmacogenomicssoftwareclassificationdatasetspharmacogenomicpharmacogxcpp
68 stars 11.39 score 442 scripts 3 dependentsr4ss
r4ss:R Code for Stock Synthesis
A collection of R functions for use with Stock Synthesis, a fisheries stock assessment modeling platform written in ADMB by Dr. Richard D. Methot at the NOAA Northwest Fisheries Science Center. The functions include tools for summarizing and plotting results, manipulating files, visualizing model parameterizations, and various other common stock assessment tasks. This version of '{r4ss}' is compatible with Stock Synthesis versions 3.24 through 3.30 (specifically version 3.30.23.1, from December 2024). Support for 3.24 models is only through the core functions for reading output and plotting.
Maintained by Ian G. Taylor. Last updated 17 days ago.
fisheriesfisheries-stock-assessmentstock-synthesis
43 stars 11.38 score 1.0k scripts 2 dependentsdoi-usgs
nhdplusTools:NHDPlus Tools
Tools for traversing and working with National Hydrography Dataset Plus (NHDPlus) data. All methods implemented in 'nhdplusTools' are available in the NHDPlus documentation available from the US Environmental Protection Agency <https://www.epa.gov/waterdata/basic-information>.
Maintained by David Blodgett. Last updated 1 month ago.
87 stars 11.38 score 348 scripts 5 dependentsbioc
pathview:a tool set for pathway based data integration and visualization
Pathview is a tool set for pathway based data integration and visualization. It maps and renders a wide variety of biological data on relevant pathway graphs. All users need to do is supply their data and specify the target pathway. Pathview automatically downloads the pathway graph data, parses the data file, maps user data to the pathway, and renders the pathway graph with the mapped data. In addition, Pathview seamlessly integrates with pathway and gene set (enrichment) analysis tools for large-scale and fully automated analysis.
Maintained by Weijun Luo. Last updated 8 hours ago.
pathwaysgraphandnetworkvisualizationgenesetenrichmentdifferentialexpressiongeneexpressionmicroarrayrnaseqgeneticsmetabolomicsproteomicssystemsbiologysequencing
40 stars 11.37 score 1.6k scripts 10 dependentsropensci
biomartr:Genomic Data Retrieval
Perform large scale genomic data retrieval and functional annotation retrieval. This package aims to provide users with a standardized way to automate genome, proteome, 'RNA', coding sequence ('CDS'), 'GFF', and metagenome retrieval from 'NCBI RefSeq', 'NCBI Genbank', 'ENSEMBL', and 'UniProt' databases. Furthermore, an interface to the 'BioMart' database (Smedley et al. (2009) <doi:10.1186/1471-2164-10-22>) allows users to retrieve functional annotation for genomic loci. In addition, users can download entire databases such as 'NCBI RefSeq' (Pruitt et al. (2007) <doi:10.1093/nar/gkl842>), 'NCBI nr', 'NCBI nt', 'NCBI Genbank' (Benson et al. (2013) <doi:10.1093/nar/gks1195>), etc. with only one command.
Maintained by Hajk-Georg Drost. Last updated 2 months ago.
biomartgenomic-data-retrievalannotation-retrievaldatabase-retrievalncbiensemblbiological-data-retrievalensembl-serversgenomegenome-annotationgenome-retrievalgenomicsmeta-analysismetagenomicsncbi-genbankpeer-reviewedproteomesequenced-genomes
218 stars 11.35 score 129 scripts 3 dependentsjessecambon
tidygeocoder:Geocoding Made Easy
An intuitive interface for getting data from geocoding services.
Maintained by Jesse Cambon. Last updated 5 months ago.
287 stars 11.35 score 1.0k scripts 9 dependentsdsy109
mixtools:Tools for Analyzing Finite Mixture Models
Analyzes finite mixture models for various parametric and semiparametric settings. This includes mixtures of parametric distributions (normal, multivariate normal, multinomial, gamma), various Reliability Mixture Models (RMMs), mixtures-of-regressions settings (linear regression, logistic regression, Poisson regression, linear regression with changepoints, predictor-dependent mixing proportions, random effects regressions, hierarchical mixtures-of-experts), and tools for selecting the number of components (bootstrapping the likelihood ratio test statistic, mixturegrams, and model selection criteria). Bayesian estimation of mixtures-of-linear-regressions models is available as well as a novel data depth method for obtaining credible bands. This package is based upon work supported by the National Science Foundation under Grant No. SES-0518772 and the Chan Zuckerberg Initiative: Essential Open Source Software for Science (Grant No. 2020-255193).
Maintained by Derek Young. Last updated 10 months ago.
mixture-modelsmixture-of-expertssemiparametric-regression
20 stars 11.34 score 1.4k scripts 56 dependentsr-hub
rhub:Tools for R Package Developers
R-hub v2 uses GitHub Actions to run 'R CMD check' and similar package checks. The 'rhub' package helps you set up R-hub v2 for your R package, and start running checks.
Maintained by Gábor Csárdi. Last updated 22 days ago.
359 stars 11.33 score 191 scripts 1 dependentsropensci
ssh:Secure Shell (SSH) Client for R
Connect to a remote server over SSH to transfer files via SCP, set up a secure tunnel, or run a command or script on the host while streaming stdout and stderr directly to the client.
Maintained by Jeroen Ooms. Last updated 3 days ago.
129 stars 11.33 score 128 scripts 10 dependentsneuhausi
canvasXpress:Visualization Package for CanvasXpress in R
Enables creation of visualizations using the CanvasXpress framework in R. CanvasXpress is a standalone JavaScript library for reproducible research with complete tracking of data and end-user modifications stored in a single PNG image that can be played back. See <https://www.canvasxpress.org> for more information.
Maintained by Connie Brett. Last updated 21 hours ago.
analyticsbioinformaticschartchartingdashdashboarddata-analyticsdata-sciencedata-visualizationgenomicsgraphsjavascriptnetworknetwork-visualizationpythonreproducible-researchshinyvisualization
297 stars 11.28 score 145 scriptsbioc
MAST:Model-based Analysis of Single Cell Transcriptomics
Methods and models for handling zero-inflated single cell assay data.
Maintained by Andrew McDavid. Last updated 5 months ago.
geneexpressiondifferentialexpressiongenesetenrichmentrnaseqtranscriptomicssinglecell
232 stars 11.28 score 1.8k scripts 5 dependentsmrcieu
TwoSampleMR:Two Sample MR Functions and Interface to MRC Integrative Epidemiology Unit OpenGWAS Database
A package for performing Mendelian randomization using GWAS summary data. It uses the IEU OpenGWAS database <https://gwas.mrcieu.ac.uk/> to automatically obtain data, and a wide range of methods to run the analysis.
Maintained by Gibran Hemani. Last updated 1 day ago.
476 stars 11.27 score 1.7k scripts 1 dependentsjeroen
mongolite:Fast and Simple 'MongoDB' Client for R
High-performance MongoDB client based on 'mongo-c-driver' and 'jsonlite'. Includes support for aggregation, indexing, map-reduce, streaming, encryption, enterprise authentication, and GridFS. The online user manual provides an overview of the available methods in the package: <https://jeroen.github.io/mongolite/>.
Maintained by Jeroen Ooms. Last updated 7 days ago.
285 stars 11.25 score 860 scripts 10 dependentsbioc
zellkonverter:Conversion Between scRNA-seq Objects
Provides methods to convert between Python AnnData objects and SingleCellExperiment objects. These are primarily intended for use by downstream Bioconductor packages that wrap Python methods for single-cell data analysis. It also includes functions to read and write H5AD files used for saving AnnData objects to disk.
Maintained by Luke Zappia. Last updated 20 days ago.
singlecelldataimportdatarepresentationbioconductorconversionscrna-seq
159 stars 11.25 score 660 scripts 4 dependentspaws-r
paws:Amazon Web Services Software Development Kit
Interface to Amazon Web Services <https://aws.amazon.com>, including storage, database, and compute services, such as 'Simple Storage Service' ('S3'), 'DynamoDB' 'NoSQL' database, and 'Lambda' functions-as-a-service.
Maintained by Dyfan Jones. Last updated 16 days ago.
332 stars 11.25 score 177 scripts 12 dependentsbioc
karyoploteR:Plot customizable linear genomes displaying arbitrary data
karyoploteR creates karyotype plots of arbitrary genomes and offers a complete set of functions to plot arbitrary data on them. It mimics many R base graphics functions, coupling them with a coordinate change function that automatically maps the chromosome and data coordinates into the plot coordinates. In addition to the provided data plotting functions, it is easy to add new ones.
Maintained by Bernat Gel. Last updated 5 months ago.
visualizationcopynumbervariationsequencingcoveragednaseqchipseqmethylseqdataimportonechannelbioconductorbioinformaticsdata-visualizationgenomegenomics-visualizationplotting-in-r
307 stars 11.25 score 656 scripts 4 dependentsadeverse
adespatial:Multivariate Multiscale Spatial Analysis
Tools for the multiscale spatial analysis of multivariate data. Several methods are based on the use of a spatial weighting matrix and its eigenvector decomposition (Moran's Eigenvectors Maps, MEM). Several approaches are described in the review Dray et al (2012) <doi:10.1890/11-1183.1>.
Maintained by Aurélie Siberchicot. Last updated 9 days ago.
36 stars 11.16 score 398 scripts 2 dependentsazure
Microsoft365R:Interface to the 'Microsoft 365' Suite of Cloud Services
An interface to the 'Microsoft 365' (formerly known as 'Office 365') suite of cloud services, building on the framework supplied by the 'AzureGraph' package. Enables access from R to data stored in 'Teams', 'SharePoint Online' and 'OneDrive', including the ability to list drive folder contents, upload and download files, send messages, and retrieve data lists. Also provides a full-featured 'Outlook' email client, with the ability to send emails and manage emails and mail folders.
Maintained by Hong Ooi. Last updated 28 days ago.
azure-sdk-rmicrosoft-365microsoft-graph-apioffice-365onedriveonedrive-for-businesssharepoint-online
325 stars 11.14 score 88 scripts 7 dependentsbioc
genomation:Summary, annotation and visualization of genomic data
A package for summary and annotation of genomic intervals. Users can visualize and quantify genomic intervals over pre-defined functional regions, such as promoters, exons, introns, etc. The genomic intervals represent regions with a defined chromosome position, which may be associated with a score, such as aligned reads from HT-seq experiments, TF binding sites, methylation scores, etc. The package can use any tabular genomic feature data as long as it has minimal information on the locations of genomic intervals. In addition, it can use BAM or BigWig files as input.
Maintained by Altuna Akalin. Last updated 5 months ago.
annotationsequencingvisualizationcpgislandcpp
76 stars 11.13 score 738 scripts 5 dependentsbioc
genefilter:genefilter: methods for filtering genes from high-throughput experiments
Some basic functions for filtering genes.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
11.11 score 2.4k scripts 143 dependentsusepa
elevatr:Access Elevation Data from Various APIs
Several web services are available that provide access to elevation data. This package provides access to many of those services and returns elevation data either as an 'sf' simple features object from point elevation services or as a 'raster' object from raster elevation services. In future versions, 'elevatr' will drop support for 'raster' and will instead return 'terra' objects. Currently, the package supports access to the Amazon Web Services Terrain Tiles <https://registry.opendata.aws/terrain-tiles/>, the Open Topography Global Datasets API <https://opentopography.org/developers/>, and the USGS Elevation Point Query Service <https://apps.nationalmap.gov/epqs/>.
Maintained by Jeffrey Hollister. Last updated 7 months ago.
digital-elevation-modelelevation-dataelevatrepamapzen-elevation-servicer-language
206 stars 11.11 score 1.3k scripts 3 dependentsfmichonneau
phylobase:Base Package for Phylogenetic Structures and Comparative Data
Provides a base S4 class for comparative methods, incorporating one or more trees and trait data.
Maintained by Francois Michonneau. Last updated 1 year ago.
18 stars 11.10 score 394 scripts 18 dependentscovid19datahub
COVID19:COVID-19 Data Hub
Unified datasets for a better understanding of COVID-19.
Maintained by Emanuele Guidotti. Last updated 1 month ago.
2019-ncovcoronaviruscovid-19covid-datacovid19-data
252 stars 11.08 score 265 scriptsropengov
eurostat:Tools for Eurostat Open Data
Tools to download data from the Eurostat database <https://ec.europa.eu/eurostat> together with search and manipulation utilities.
Maintained by Leo Lahti. Last updated 1 month ago.
242 stars 11.07 score 892 scripts 4 dependentsbioc
scater:Single-Cell Analysis Toolkit for Gene Expression Data in R
A collection of tools for doing various analyses of single-cell RNA-seq gene expression data, with a focus on quality control and visualization.
Maintained by Alan OCallaghan. Last updated 22 days ago.
immunooncologysinglecellrnaseqqualitycontrolpreprocessingnormalizationvisualizationdimensionreductiontranscriptomicsgeneexpressionsequencingsoftwaredataimportdatarepresentationinfrastructurecoverage
11.07 score 12k scripts 43 dependentspaws-r
paws.common:Paws Low-Level Amazon Web Services API
Functions for making low-level API requests to Amazon Web Services <https://aws.amazon.com>. The functions handle building, signing, and sending requests, and receiving responses. They are designed to help build higher-level interfaces to individual services, such as Simple Storage Service (S3).
Maintained by Dyfan Jones. Last updated 16 days ago.
332 stars 11.07 score 39 dependentsipums
ipumsr:An R Interface for Downloading, Reading, and Handling IPUMS Data
An easy way to work with census, survey, and geographic data provided by IPUMS in R. Generate and download data through the IPUMS API and load IPUMS files into R with their associated metadata to make analysis easier. IPUMS data describing 1.4 billion individuals drawn from over 750 censuses and surveys is available free of charge from the IPUMS website <https://www.ipums.org>.
Maintained by Derek Burk. Last updated 1 month ago.
30 stars 11.05 score 720 scripts 2 dependentsbioc
universalmotif:Import, Modify, and Export Motifs with R
Allows for importing most common motif types into R for use by functions provided by other Bioconductor motif-related packages. Motifs can be exported into most major motif formats from various classes as defined by other Bioconductor packages. A suite of motif and sequence manipulation and analysis functions are included, including enrichment, comparison, P-value calculation, shuffling, trimming, higher-order motifs, and others.
Maintained by Benjamin Jean-Marie Tremblay. Last updated 5 months ago.
motifannotationmotifdiscoverydataimportgeneregulationmotif-analysismotif-enrichment-analysissequence-logocpp
28 stars 11.04 score 342 scripts 12 dependentsopenml
OpenML:Open Machine Learning and Open Data Platform
We provide an R interface to 'OpenML.org', an online machine learning platform where researchers can access open data, download and upload data sets, share their machine learning tasks and experiments, and organize them online to work and collaborate with other researchers. The R interface allows querying for data sets with specific properties, and downloading and uploading data sets, tasks, flows and runs. See <https://www.openml.org/guide/api> for more information.
Maintained by Giuseppe Casalicchio. Last updated 10 months ago.
arffbenchmarkingbenchmarking-suiteclassificationdata-sciencedatabasedatasetdatasetsmachine-learningmachine-learning-algorithmsopen-dataopen-scienceopendataopenmlopenscienceregressionreproducible-researchstatistics
97 stars 11.04 score 7.1k scriptsconfig-i1
greybox:Toolbox for Model Building and Forecasting
Implements functions and instruments for regression model building and its application to forecasting. The main scope of the package is variable selection and model specification for time series data. This includes promotional modelling, selection between different dynamic regressions with non-standard distributions of errors, selection based on cross validation, solutions to the fat regression model problem and more. Models developed in the package are tailored specifically for forecasting purposes. As a result, there are several methods that allow producing forecasts from these models and visualising them.
Maintained by Ivan Svetunkov. Last updated 15 days ago.
forecastingmodel-selectionmodel-selection-and-evaluationregressionregression-modelsstatisticscpp
30 stars 11.03 score 97 scripts 34 dependentsmhahsler
arulesViz:Visualizing Association Rules and Frequent Itemsets
Extends package 'arules' with various visualization techniques for association rules and itemsets. The package also includes several interactive visualizations for rule exploration. Michael Hahsler (2017) <doi:10.32614/RJ-2017-047>.
Maintained by Michael Hahsler. Last updated 7 months ago.
arulesassociation-rulesfrequent-itemsetsinteractive-visualizationsvisualization
54 stars 11.03 score 1.7k scripts 2 dependentsr-lib
jose:JavaScript Object Signing and Encryption
Read and write JSON Web Keys (JWK, rfc7517), generate and verify JSON Web Signatures (JWS, rfc7515) and encode/decode JSON Web Tokens (JWT, rfc7519) <https://datatracker.ietf.org/wg/jose/documents/>. These standards provide modern signing and encryption formats that are natively supported by browsers via the JavaScript WebCryptoAPI <https://www.w3.org/TR/WebCryptoAPI/#jose>, and used by services like OAuth 2.0, LetsEncrypt, and Github Apps.
Maintained by Jeroen Ooms. Last updated 6 months ago.
50 stars 11.00 score 63 scripts 35 dependentsbioc
CATALYST:Cytometry dATa anALYSis Tools
CATALYST provides tools for preprocessing of and differential discovery in cytometry data such as FACS, CyTOF, and IMC. Preprocessing includes i) normalization using bead standards, ii) single-cell deconvolution, and iii) bead-based compensation. For differential discovery, the package provides a number of convenient functions for data processing (e.g., clustering, dimension reduction), as well as a suite of visualizations for exploratory data analysis and exploration of results from differential abundance (DA) and state (DS) analysis in order to identify differences in composition and expression profiles at the subpopulation-level, respectively.
Maintained by Helena L. Crowell. Last updated 4 months ago.
clusteringdataimportdifferentialexpressionexperimentaldesignflowcytometryimmunooncologymassspectrometrynormalizationpreprocessingsinglecellsoftwarestatisticalmethodvisualization
67 stars 10.99 score 362 scripts 2 dependentsdavidgohel
officedown:Enhanced 'R Markdown' Format for 'Word' and 'PowerPoint'
Allows production of 'Microsoft' corporate documents from 'R Markdown' by reusing formatting defined in 'Microsoft Word' documents. You can reuse table styles and list styles, and also add column sections and landscape-oriented pages. Table and image captions as well as cross-references are transformed into 'Microsoft Word' fields, allowing documents to be edited and merged without breaking references; the syntax conforms to the 'bookdown' cross-reference definition. Objects generated by the 'officer' package are also supported in the 'knitr' chunks. 'Microsoft PowerPoint' presentations also benefit from this, as well as from the ability to produce editable vector graphics in 'PowerPoint' and to define placeholders where content is to be added.
Maintained by David Gohel. Last updated 11 days ago.
371 stars 10.93 score 342 scripts 7 dependentsropensci
CoordinateCleaner:Automated Cleaning of Occurrence Records from Biological Collections
Automated flagging of common spatial and temporal errors in biological and paleontological collection data, for use in conservation, ecology and paleontology. Includes automated tests to easily flag (and exclude) records assigned to country or province centroid, the open ocean, the headquarters of the Global Biodiversity Information Facility, urban areas or the location of biodiversity institutions (museums, zoos, botanical gardens, universities). It also identifies per-species outlier coordinates, zero coordinates, identical latitude/longitude and invalid coordinates, and implements an algorithm to identify data sets with a significant proportion of rounded coordinates. Especially suited for large data sets. The reference for the methodology is: Zizka et al. (2019) <doi:10.1111/2041-210X.13152>.
Maintained by Alexander Zizka. Last updated 1 year ago.
82 stars 10.93 score 306 scripts 3 dependentsbioc
infercnv:Infer Copy Number Variation from Single-Cell RNA-Seq Data
Using single-cell RNA-Seq expression to visualize CNV in cells.
Maintained by Christophe Georgescu. Last updated 5 months ago.
softwarecopynumbervariationvariantdetectionstructuralvariationgenomicvariationgeneticstranscriptomicsstatisticalmethodbayesianhiddenmarkovmodelsinglecelljagscpp
601 stars 10.92 score 674 scriptsbioc
EnrichedHeatmap:Making Enriched Heatmaps
Enriched heatmap is a special type of heatmap which visualizes the enrichment of genomic signals on specific target regions. Here we implement the enriched heatmap with the ComplexHeatmap package. Since this type of heatmap is just a normal heatmap with some special settings, the functionality of ComplexHeatmap makes it much easier to customize the heatmap and to concatenate it to a list of heatmaps to show correspondence between different data sources.
Maintained by Zuguang Gu. Last updated 5 months ago.
softwarevisualizationsequencinggenomeannotationcoveragecpp
190 stars 10.87 score 330 scripts 1 dependentsmarce10
warbleR:Streamline Bioacoustic Analysis
Functions to facilitate the analysis of the structure of animal acoustic signals in 'R'. 'warbleR' makes use of the basic sound analysis tools from the packages 'tuneR' and 'seewave', and offers new tools to explore and quantify acoustic signal structure. The package allows users to organize and manipulate multiple sound files, create spectrograms of complete recordings or individual signals in different formats, run several measures of acoustic structure, and characterize different structural levels in acoustic signals.
Maintained by Marcelo Araya-Salas. Last updated 2 months ago.
animal-acoustic-signals, audio-processing, bioacoustics, spectrogram, streamline-analysis, cpp
56 stars 10.86 score 270 scripts 4 dependents
michelnivard
gptstudio:Use Large Language Models Directly in your Development Environment
Large language models are readily accessible via API. This package lowers the barrier to use the API inside of your development environment. For more on the API, see <https://platform.openai.com/docs/introduction>.
Maintained by James Wade. Last updated 3 days ago.
chatgpt, gpt-3, rstudio, rstudio-addin
930 stars 10.85 score 43 scripts 1 dependent
bioc
ANCOMBC:Microbiome differential abundance and correlation analyses with bias correction
ANCOMBC is a package containing differential abundance (DA) and correlation analyses for microbiome data. Specifically, the package includes Analysis of Compositions of Microbiomes with Bias Correction 2 (ANCOM-BC2), Analysis of Compositions of Microbiomes with Bias Correction (ANCOM-BC), and Analysis of Composition of Microbiomes (ANCOM) for DA analysis, and Sparse Estimation of Correlations among Microbiomes (SECOM) for correlation analysis. Microbiome data are typically subject to two sources of biases: unequal sampling fractions (sample-specific biases) and differential sequencing efficiencies (taxon-specific biases). Methodologies included in the ANCOMBC package are designed to correct these biases and construct statistically consistent estimators.
Maintained by Huang Lin. Last updated 13 days ago.
differentialexpression, microbiome, normalization, sequencing, software, ancom, ancombc, ancombc2, correlation, differential-abundance-analysis, secom
120 stars 10.79 score 406 scripts 1 dependent
azure
AzureStor:Storage Management in 'Azure'
Manage storage in Microsoft's 'Azure' cloud: <https://azure.microsoft.com/en-us/product-categories/storage/>. On the admin side, 'AzureStor' includes features to create, modify and delete storage accounts. On the client side, it includes an interface to blob storage, file storage, and 'Azure Data Lake Storage Gen2': upload and download files and blobs; list containers and files/blobs; create containers; and so on. Authenticated access to storage is supported, via either a shared access key or a shared access signature (SAS). Part of the 'AzureR' family of packages.
Maintained by Hong Ooi. Last updated 2 years ago.
azure-data-lake, azure-sdk-r, azure-storage, azure-storage-blob, azure-storage-file
65 stars 10.74 score 298 scripts 4 dependents
bioc
muscat:Multi-sample multi-group scRNA-seq data analysis tools
`muscat` provides various methods and visualization tools for DS analysis in multi-sample, multi-group, multi-(cell-)subpopulation scRNA-seq data, including cell-level mixed models and methods based on aggregated “pseudobulk” data, as well as a flexible simulation platform that mimics both single and multi-sample scRNA-seq data.
Maintained by Helena L. Crowell. Last updated 5 months ago.
immunooncology, differentialexpression, sequencing, singlecell, software, statisticalmethod, visualization
184 stars 10.74 score 686 scripts 1 dependent
rstudio
pointblank:Data Validation and Organization of Metadata for Local and Remote Tables
Validate data in data frames, 'tibble' objects, 'Spark' 'DataFrames', and database tables. Validation pipelines can be made using easily-readable, consecutive validation steps. Upon execution of the validation plan, several reporting options are available. User-defined thresholds for failure rates allow for the determination of appropriate reporting actions. Many other workflows are available including an information management workflow, where the aim is to record, collect, and generate useful information on data tables.
Maintained by Richard Iannone. Last updated 3 days ago.
data-assertions, data-checker, data-dictionaries, data-frames, data-inference, data-management, data-profiler, data-quality, data-validation, data-verification, database-tables, easy-to-understand, reporting-tool, schema-validation, testing-tools, yaml-configuration
942 stars 10.73 score 284 scripts
apache
apache.sedona:R Interface for Apache Sedona
R interface for 'Apache Sedona' based on 'sparklyr' (<https://sedona.apache.org>).
Maintained by Apache Sedona. Last updated 5 hours ago.
cluster-computing, geospatial, java, python, scala, spatial-analysis, spatial-query, spatial-sql
2.0k stars 10.72 score 105 scripts
jimmyday12
fitzRoy:Easily Scrape and Process AFL Data
An easy package for scraping and processing Australian Rules Football (AFL) data. 'fitzRoy' provides a range of functions for accessing publicly available data from 'AFL Tables' <https://afltables.com/afl/afl_index.html>, 'Footy Wire' <https://www.footywire.com> and 'The Squiggle' <https://squiggle.com.au>. Further functions allow for easy processing, cleaning and transformation of this data into formats that can be used for analysis.
Maintained by James Day. Last updated 8 days ago.
136 stars 10.72 score 324 scripts
pecanproject
PEcAn.benchmark:PEcAn Functions Used for Benchmarking
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. The PEcAn.benchmark package provides utilities for comparing models and data, including a suite of statistical metrics and plots.
Maintained by Mike Dietze. Last updated 2 days ago.
bayesian, cyberinfrastructure, data-assimilation, data-science, ecosystem-model, ecosystem-science, forecasting, meta-analysis, national-science-foundation, pecan, plants
216 stars 10.72 score 416 scripts 11 dependents
mrcieu
ieugwasr:Interface to the 'OpenGWAS' Database API
Interface to the 'OpenGWAS' database API <https://api.opengwas.io/api/>. Includes a wrapper to make generic calls to the API, plus convenience functions for specific queries.
Maintained by Gibran Hemani. Last updated 16 days ago.
89 stars 10.71 score 404 scripts 6 dependents
bioc
ALDEx2:Analysis Of Differential Abundance Taking Sample and Scale Variation Into Account
A differential abundance analysis for the comparison of two or more conditions. Useful for analyzing data from standard RNA-seq or meta-RNA-seq assays as well as selected and unselected values from in-vitro sequence selections. Uses a Dirichlet-multinomial model to infer abundance from counts, optimized for three or more experimental replicates. The method infers biological and sampling variation to calculate the expected false discovery rate, given the variation, based on a Wilcoxon Rank Sum test and Welch's t-test (via aldex.ttest), a Kruskal-Wallis test (via aldex.kw), a generalized linear model (via aldex.glm), or a correlation test (via aldex.corr). All tests report predicted p-values and posterior Benjamini-Hochberg corrected p-values. ALDEx2 also calculates expected standardized effect sizes for paired or unpaired study designs. ALDEx2 can now be used to estimate the effect of scale on the results and report on the scale-dependent robustness of results.
Maintained by Greg Gloor. Last updated 5 months ago.
differentialexpression, rnaseq, transcriptomics, geneexpression, dnaseq, chipseq, bayesian, sequencing, software, microbiome, metagenomics, immunooncology, scale simulation, posterior p-value
28 stars 10.70 score 424 scripts 3 dependents
cjvanlissa
tidySEM:Tidy Structural Equation Modeling
A tidy workflow for generating, estimating, reporting, and plotting structural equation models using 'lavaan', 'OpenMx', or 'Mplus'. Throughout this workflow, elements of syntax, results, and graphs are represented as 'tidy' data, making them easy to customize. Includes functionality to estimate latent class analyses, and to plot 'dagitty' and 'igraph' objects.
Maintained by Caspar J. van Lissa. Last updated 20 days ago.
58 stars 10.69 score 330 scripts 1 dependent
doi-usgs
EGRET:Exploration and Graphics for RivEr Trends
Statistics and graphics for streamflow history, water quality trends, and the statistical modeling algorithm: Weighted Regressions on Time, Discharge, and Season (WRTDS).
Maintained by Laura DeCicco. Last updated 4 months ago.
usgs, water-quality, water-quality-data
90 stars 10.67 score 362 scripts 1 dependent
quanteda
readtext:Import and Handling for Plain and Formatted Text Files
Functions for importing and handling text files and formatted text files with additional meta-data, including '.csv', '.tab', '.json', '.xml', '.html', '.pdf', '.doc', '.docx', '.rtf', '.xls', '.xlsx', and others.
Maintained by Kenneth Benoit. Last updated 4 months ago.
122 stars 10.66 score 1.2k scripts 5 dependents
neonscience
neonUtilities:Utilities for Working with NEON Data
NEON data packages can be accessed through the NEON Data Portal <https://www.neonscience.org> or through the NEON Data API (see <https://data.neonscience.org/data-api> for documentation). Data delivered from the Data Portal are provided as monthly zip files packaged within a parent zip file, while individual files can be accessed from the API. This package provides tools that aid in discovering, downloading, and reformatting data prior to use in analyses. This includes downloading data via the API, merging data tables by type, and converting formats. For more information, see the readme file at <https://github.com/NEONScience/NEON-utilities>.
Maintained by Claire Lunch. Last updated 2 months ago.
57 stars 10.66 score 944 scripts 15 dependents
r-lum
Luminescence:Comprehensive Luminescence Dating Data Analysis
A collection of various R functions for the purpose of Luminescence dating data analysis. This includes, amongst others, data import, export, application of age models, curve deconvolution, sequence analysis and plotting of equivalent dose distributions.
Maintained by Sebastian Kreutzer. Last updated 19 hours ago.
bayesian-statistics, data-science, geochronology, luminescence, luminescence-dating, open-science, osl, plotting, radiofluorescence, tl, xsyg, cpp
15 stars 10.66 score 178 scripts 8 dependents
business-science
modeltime:The Tidymodels Extension for Time Series Modeling
The time series forecasting framework for use with the 'tidymodels' ecosystem. Models include ARIMA, Exponential Smoothing, and additional time series models from the 'forecast' and 'prophet' packages. Refer to "Forecasting Principles & Practice, Second edition" (<https://otexts.com/fpp2/>). Refer to "Prophet: forecasting at scale" (<https://research.facebook.com/blog/2017/02/prophet-forecasting-at-scale/>).
Maintained by Matt Dancho. Last updated 5 months ago.
arima, data-science, deep-learning, ets, forecasting, machine-learning, machine-learning-algorithms, modeltime, prophet, tbats, tidymodeling, tidymodels, time, time-series, time-series-analysis, timeseries, timeseries-forecasting
551 stars 10.61 score 1.1k scripts 7 dependents
bioc
tximeta:Transcript Quantification Import with Automatic Metadata
Transcript quantification import from Salmon and other quantifiers with automatic attachment of transcript ranges and release information, and other associated metadata. De novo transcriptomes can be linked to the appropriate sources with linkedTxomes and shared for computational reproducibility.
Maintained by Michael Love. Last updated 2 months ago.
annotation, genomeannotation, dataimport, preprocessing, rnaseq, singlecell, transcriptomics, transcription, geneexpression, functionalgenomics, reproducibleresearch, reportwriting, immunooncology
67 stars 10.58 score 466 scripts 1 dependent
bioc
Glimma:Interactive visualizations for gene expression analysis
This package produces interactive visualizations for RNA-seq data analysis, utilizing output from limma, edgeR, or DESeq2. It produces interactive htmlwidgets versions of popular RNA-seq analysis plots to enhance the exploration of analysis results by overlaying interactive features. The plots can be viewed in a web browser or embedded in notebook documents.
Maintained by Shian Su. Last updated 2 months ago.
differentialexpression, geneexpression, microarray, reportwriting, rnaseq, sequencing, visualization, differential-expression, interactive-visualizations
32 stars 10.58 score 600 scripts 1 dependent
bioc
ORFik:Open Reading Frames in Genomics
R package for analysis of transcript and translation features through manipulation of sequence data and NGS data like Ribo-Seq, RNA-Seq, TCP-Seq and CAGE. It is generalized in the sense that any transcript region can be analysed; as the name hints, it was made with investigation of ribosomal patterns over Open Reading Frames (ORFs) as its primary use case. ORFik is extremely fast through use of C++, data.table and GenomicRanges. The package allows reassignment of transcript starts using CAGE-Seq data, automatic shifting of RiboSeq reads, finding of Open Reading Frames for whole genomes and much more.
Maintained by Haakon Tjeldnes. Last updated 1 month ago.
immunooncology, software, sequencing, riboseq, rnaseq, functionalgenomics, coverage, alignment, dataimport, cpp
33 stars 10.56 score 115 scripts 2 dependents
bioc
DECIPHER:Tools for curating, analyzing, and manipulating biological sequences
A toolset for deciphering and managing biological sequences.
Maintained by Erik Wright. Last updated 18 days ago.
clustering, genetics, sequencing, dataimport, visualization, microarray, qualitycontrol, qpcr, alignment, wholegenome, microbiome, immunooncology, geneprediction, openmp
10.55 score 1.1k scripts 14 dependents
mlverse
chattr:Interact with Large Language Models in 'RStudio'
Enables user interactivity with large-language models ('LLM') inside the 'RStudio' integrated development environment (IDE). The user can interact with the model using the 'shiny' app included in this package, or directly in the 'R' console. It comes with back-ends for 'OpenAI', 'GitHub' 'Copilot', and 'LlamaGPT'.
Maintained by Edgar Ruiz. Last updated 2 months ago.
215 stars 10.55 score 71 scripts 1 dependent
datastorm-open
shinymanager:Authentication Management for 'Shiny' Applications
Simple and secure authentication mechanism for single 'Shiny' applications. Credentials can be stored in an encrypted 'SQLite' database or on your own SQL Database (Postgres, MySQL, ...). Source code of main application is protected until authentication is successful.
Maintained by Benoit Thieurmel. Last updated 11 months ago.
391 stars 10.51 score 316 scripts 2 dependents
bioc
ballgown:Flexible, isoform-level differential expression analysis
Tools for statistical analysis of assembled transcriptomes, including flexible differential expression analysis, visualization of transcript structures, and matching of assembled transcripts to annotation.
Maintained by Jack Fu. Last updated 5 months ago.
immunooncology, rnaseq, statisticalmethod, preprocessing, differentialexpression
145 stars 10.51 score 338 scripts 1 dependent
ropensci
gutenbergr:Download and Process Public Domain Works from Project Gutenberg
Download and process public domain works in the Project Gutenberg collection <https://www.gutenberg.org/>. Includes metadata for all Project Gutenberg works, so that they can be searched and retrieved.
Maintained by Jon Harmon. Last updated 3 months ago.
105 stars 10.50 score 1.1k scripts 1 dependent
bioc
miloR:Differential neighbourhood abundance testing on a graph
Milo performs single-cell differential abundance testing. Cell states are modelled as representative neighbourhoods on a nearest neighbour graph. Hypothesis testing is performed using either a negative binomial generalized linear model or negative binomial generalized linear mixed model.
Maintained by Mike Morgan. Last updated 5 months ago.
singlecell, multiplecomparison, functionalgenomics, software, openblas, cpp, openmp
362 stars 10.49 score 340 scripts 1 dependent
rstudio
vetiver:Version, Share, Deploy, and Monitor Models
The goal of 'vetiver' is to provide fluent tooling to version, share, deploy, and monitor a trained model. Functions handle both recording and checking the model's input data prototype, and predicting from a remote API endpoint. The 'vetiver' package is extensible, with generics that can support many kinds of models.
Maintained by Julia Silge. Last updated 6 months ago.
185 stars 10.48 score 466 scripts 1 dependent
bioc
celda:CEllular Latent Dirichlet Allocation
Celda is a suite of Bayesian hierarchical models for clustering single-cell RNA-sequencing (scRNA-seq) data. It is able to perform "bi-clustering" and simultaneously cluster genes into gene modules and cells into cell subpopulations. It also contains DecontX, a novel Bayesian method to computationally estimate and remove RNA contamination in individual cells without empty droplet information. A variety of scRNA-seq data visualization functions is also included.
Maintained by Joshua Campbell. Last updated 1 month ago.
singlecell, geneexpression, clustering, sequencing, bayesian, immunooncology, dataimport, cpp, openmp
147 stars 10.47 score 256 scripts 2 dependents
crunch-io
crunch:Crunch.io Data Tools
The Crunch.io service <https://crunch.io/> provides a cloud-based data store and analytic engine, as well as an intuitive web interface. Using this package, analysts can interact with and manipulate Crunch datasets from within R. Importantly, this allows technical researchers to collaborate naturally with team members, managers, and clients who prefer a point-and-click interface.
Maintained by Greg Freedman Ellis. Last updated 8 days ago.
9 stars 10.47 score 200 scripts 2 dependents
vubiostat
redcapAPI:Interface to 'REDCap'
Access data stored in 'REDCap' databases using the Application Programming Interface (API). 'REDCap' (Research Electronic Data CAPture; <https://projectredcap.org>, Harris, et al. (2009) <doi:10.1016/j.jbi.2008.08.010>, Harris, et al. (2019) <doi:10.1016/j.jbi.2019.103208>) is a web application for building and managing online surveys and databases developed at Vanderbilt University. The API allows users to access data and project meta data (such as the data dictionary) from the web programmatically. The 'redcapAPI' package facilitates the process of accessing data with options to prepare an analysis-ready data set consistent with the definitions in a database's data dictionary.
Maintained by Shawn Garbett. Last updated 22 days ago.
22 stars 10.47 score 134 scripts 2 dependents
bioc
GENESIS:GENetic EStimation and Inference in Structured samples (GENESIS): Statistical methods for analyzing genetic data from samples with population structure and/or relatedness
The GENESIS package provides methodology for estimating, inferring, and accounting for population and pedigree structure in genetic analyses. The current implementation provides functions to perform PC-AiR (Conomos et al., 2015, Gen Epi) and PC-Relate (Conomos et al., 2016, AJHG). PC-AiR performs a Principal Components Analysis on genome-wide SNP data for the detection of population structure in a sample that may contain known or cryptic relatedness. Unlike standard PCA, PC-AiR accounts for relatedness in the sample to provide accurate ancestry inference that is not confounded by family structure. PC-Relate uses ancestry representative principal components to adjust for population structure/ancestry and accurately estimate measures of recent genetic relatedness such as kinship coefficients, IBD sharing probabilities, and inbreeding coefficients. Additionally, functions are provided to perform efficient variance component estimation and mixed model association testing for both quantitative and binary phenotypes.
Maintained by Stephanie M. Gogarten. Last updated 2 months ago.
snp, geneticvariability, genetics, statisticalmethod, dimensionreduction, principalcomponent, genomewideassociation, qualitycontrol, biocviews
36 stars 10.44 score 342 scripts 1 dependent
bioc
UCell:Rank-based signature enrichment analysis for single-cell data
UCell is a package for evaluating gene signatures in single-cell datasets. UCell signature scores, based on the Mann-Whitney U statistic, are robust to dataset size and heterogeneity, and their calculation demands less computing time and memory than other available methods, enabling the processing of large datasets in a few minutes even on machines with limited computing power. UCell can be applied to any single-cell data matrix, and includes functions to directly interact with SingleCellExperiment and Seurat objects.
Maintained by Massimo Andreatta. Last updated 5 months ago.
singlecell, genesetenrichment, transcriptomics, geneexpression, cellbasedassays
143 stars 10.43 score 454 scripts 2 dependents
ropensci
rebird:R Client for the eBird Database of Bird Observations
A programmatic client for the eBird database (<https://ebird.org/home>), including functions for searching for bird observations by geographic location (latitude, longitude), eBird hotspots, location identifiers, by notable sightings, by region, and by taxonomic name.
Maintained by Sebastian Pardo. Last updated 2 months ago.
birds, birding, ebird, database, data, biology, observations, sightings, ornithology, ebird-api, ebird-webservices, spocc
90 stars 10.43 score 73 scripts 6 dependents
ropensci
robotstxt:A 'robots.txt' Parser and 'Webbot'/'Spider'/'Crawler' Permissions Checker
Provides functions to download and parse 'robots.txt' files. Ultimately the package makes it easy to check if bots (spiders, crawlers, scrapers, ...) are allowed to access specific resources on a domain.
Maintained by Jordan Bradford. Last updated 4 months ago.
crawler, peer-reviewed, robotstxt, scraper, spider, webscraping
68 stars 10.43 score 414 scripts 7 dependents
bioc
oligo:Preprocessing tools for oligonucleotide arrays
A package to analyze oligonucleotide arrays (expression/SNP/tiling/exon) at probe-level. It currently supports Affymetrix (CEL files) and NimbleGen arrays (XYS files).
Maintained by Benilton Carvalho. Last updated 21 days ago.
microarray, onechannel, twochannel, preprocessing, snp, differentialexpression, exonarray, geneexpression, dataimport, zlib
3 stars 10.42 score 528 scripts 10 dependents
bioc
scRepertoire:A toolkit for single-cell immune receptor profiling
scRepertoire is a toolkit for processing and analyzing single-cell T-cell receptor (TCR) and immunoglobulin (Ig) data. The scRepertoire framework supports use of 10x, AIRR, BD, MiXCR, Omniscope, TRUST4, and WAT3R single-cell formats. The functionality includes basic clonal analyses, repertoire summaries, distance-based clustering and interaction with the popular Seurat and SingleCellExperiment/Bioconductor R workflows.
Maintained by Nick Borcherding. Last updated 10 days ago.
software, immunooncology, singlecell, classification, annotation, sequencing, cpp
327 stars 10.42 score 240 scripts
azure
AzureAuth:Authentication Services for Azure Active Directory
Provides Azure Active Directory (AAD) authentication functionality for R users of Microsoft's 'Azure' cloud <https://azure.microsoft.com/>. Use this package to obtain 'OAuth' 2.0 tokens for services including Azure Resource Manager, Azure Storage and others. It supports both AAD v1.0 and v2.0, as well as multiple authentication methods, including device code and resource owner grant. Tokens are cached in a user-specific directory obtained using the 'rappdirs' package. The interface is based on the 'OAuth' framework in the 'httr' package, but customised and streamlined for Azure. Part of the 'AzureR' family of packages.
Maintained by Hong Ooi. Last updated 3 years ago.
azure, azure-active-directory, azure-sdk-r, oauth2
44 stars 10.42 score 57 scripts 23 dependents
posit-dev
connectapi:Utilities for Interacting with the 'Posit Connect' Server API
Provides a helpful 'R6' class and methods for interacting with the 'Posit Connect' Server API along with some meaningful utility functions for regular tasks. API documentation varies by 'Posit Connect' installation and version, but the latest documentation is also hosted publicly at <https://docs.posit.co/connect/api/>.
Maintained by Toph Allen. Last updated 4 days ago.
47 stars 10.42 score 252 scripts 1 dependent