Showing 200 of 1410 total results
tidyverse
tidyverse:Easily Install and Load the 'Tidyverse'
The 'tidyverse' is a set of packages that work in harmony because they share common data representations and 'API' design. This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step. Learn more about the 'tidyverse' at <https://www.tidyverse.org>.
Maintained by Hadley Wickham. Last updated 5 months ago.
1.7k stars 20.23 score 664k scripts 125 dependents
tidyverse
dbplyr:A 'dplyr' Back End for Databases
A 'dplyr' back end for databases that allows you to work with remote database tables as if they are in-memory data frames. Basic features work with any database that has a 'DBI' back end; more advanced features require 'SQL' translation to be provided by the package author.
Maintained by Hadley Wickham. Last updated 4 months ago.
481 stars 19.72 score 5.2k scripts 736 dependents
r-dbi
RSQLite:SQLite Interface for R
Embeds the SQLite database engine in R and provides an interface compliant with the DBI package. The source for the SQLite engine and for various extensions in a recent version is included. System libraries will never be consulted because this package relies on static linking for the plugins it includes; this also ensures a consistent experience across all installations.
Maintained by Kirill Müller. Last updated 2 days ago.
331 stars 18.78 score 8.1k scripts 1.1k dependents
bioc
clusterProfiler:A universal enrichment tool for interpreting omics data
This package supports functional characteristics of both coding and non-coding genomics data for thousands of species with up-to-date gene annotation. It provides a universal interface for gene functional annotation from a variety of sources and thus can be applied in diverse scenarios. It provides a tidy interface to access, manipulate, and visualize enrichment results to help users achieve efficient data interpretation. Datasets obtained from multiple treatments and time points can be analyzed and compared in a single run, easily revealing functional consensus and differences among distinct conditions.
Maintained by Guangchuang Yu. Last updated 4 months ago.
annotationclusteringgenesetenrichmentgokeggmultiplecomparisonpathwaysreactomevisualizationenrichment-analysisgsea
1.1k stars 17.03 score 11k scripts 48 dependents
r-dbi
odbc:Connect to ODBC Compatible Databases (using the DBI Interface)
A DBI-compatible interface to ODBC databases.
Maintained by Hadley Wickham. Last updated 4 days ago.
396 stars 16.31 score 2.9k scripts 23 dependents
bioc
biomaRt:Interface to BioMart databases (i.e. Ensembl)
In recent years a wealth of biological data has become available in public data repositories. Easy access to these valuable data resources and firm integration with data analysis is needed for comprehensive bioinformatics data analysis. biomaRt provides an interface to a growing collection of databases implementing the BioMart software suite (<http://www.biomart.org>). The package enables retrieval of large amounts of data in a uniform way without the need to know the underlying database schemas or write complex SQL queries. The most prominent examples of BioMart databases are maintained by Ensembl, which provides biomaRt users direct access to a diverse set of data and enables a wide range of powerful online queries from gene annotation to database mining.
Maintained by Mike Smith. Last updated 17 days ago.
annotationbioconductorbiomartensembl
38 stars 15.99 score 13k scripts 230 dependents
bioc
enrichplot:Visualization of Functional Enrichment Result
The 'enrichplot' package implements several visualization methods for interpreting functional enrichment results obtained from ORA or GSEA analysis. It is mainly designed to work with the 'clusterProfiler' package suite. All the visualization methods are developed based on 'ggplot2' graphics.
Maintained by Guangchuang Yu. Last updated 3 months ago.
annotationgenesetenrichmentgokeggpathwayssoftwarevisualizationenrichment-analysispathway-analysis
239 stars 15.71 score 3.1k scripts 58 dependents
bioc
GenomicFeatures:Query the gene models of a given organism/assembly
Extract the genomic locations of genes, transcripts, exons, introns, and CDS, for the gene models stored in a TxDb object. A TxDb object is a small database that contains the gene models of a given organism/assembly. Bioconductor provides a small collection of TxDb objects in the form of ready-to-install TxDb packages for the most commonly studied organisms. Additionally, the user can easily make a TxDb object (or package) for the organism/assembly of their choice by using the tools from the txdbmaker package.
Maintained by H. Pagès. Last updated 5 months ago.
geneticsinfrastructureannotationsequencinggenomeannotationbioconductor-packagecore-package
26 stars 15.34 score 5.3k scripts 339 dependents
sparklyr
sparklyr:R Interface to Apache Spark
R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.
Maintained by Edgar Ruiz. Last updated 13 days ago.
apache-sparkdistributeddplyridelivymachine-learningremote-clusterssparksparklyr
959 stars 15.20 score 4.0k scripts 21 dependents
bioc
AnnotationDbi:Manipulation of SQLite-based annotations in Bioconductor
Implements a user-friendly interface for querying SQLite-based annotation data packages.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
annotationmicroarraysequencinggenomeannotationbioconductor-packagecore-package
9 stars 15.05 score 3.6k scripts 769 dependents
bioc
DOSE:Disease Ontology Semantic and Enrichment analysis
This package implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively for measuring semantic similarities among DO terms and gene products. Enrichment analyses including hypergeometric model and gene set enrichment analysis are also implemented for discovering disease associations of high-throughput biological data.
Maintained by Guangchuang Yu. Last updated 5 months ago.
annotationvisualizationmultiplecomparisongenesetenrichmentpathwayssoftwaredisease-ontologyenrichment-analysissemantic-similarity
119 stars 14.97 score 2.0k scripts 61 dependents
r-dbi
RPostgres:C++ Interface to PostgreSQL
Fully DBI-compliant C++-backed interface to PostgreSQL <https://www.postgresql.org/>, an open-source relational database.
Maintained by Kirill Müller. Last updated 1 month ago.
338 stars 14.78 score 1.6k scripts 31 dependents
bioc
GSVA:Gene Set Variation Analysis for Microarray and RNA-Seq Data
Gene Set Variation Analysis (GSVA) is a non-parametric, unsupervised method for estimating variation of gene set enrichment through the samples of an expression data set. GSVA performs a change in coordinate systems, transforming the data from a gene by sample matrix to a gene-set by sample matrix, thereby allowing the evaluation of pathway enrichment for each sample. This new matrix of GSVA enrichment scores facilitates applying standard analytical methods like functional enrichment, survival analysis, clustering, CNV-pathway analysis or cross-tissue pathway analysis, in a pathway-centric manner.
Maintained by Robert Castelo. Last updated 10 days ago.
functionalgenomicsmicroarrayrnaseqpathwaysgenesetenrichmentgene-set-enrichmentgenomicspathway-enrichment-analysis
212 stars 14.74 score 1.6k scripts 19 dependents
bioc
TCGAbiolinks:TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data
The aim of TCGAbiolinks is to: i) facilitate the GDC open-access data retrieval, ii) prepare the data using the appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses and iv) easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.
Maintained by Tiago Chedraoui Silva. Last updated 1 month ago.
dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksequencingsurvivalsoftwarebiocbioconductorgdcintegrative-analysistcgatcga-datatcgabiolinks
310 stars 14.47 score 1.6k scripts 6 dependents
bioc
GOSemSim:GO-terms Semantic Similarity Measures
The semantic comparisons of Gene Ontology (GO) annotations provide quantitative ways to compute similarities between genes and gene groups, and have become an important basis for many bioinformatics analysis approaches. GOSemSim is an R package for semantic similarity computation among GO terms, sets of GO terms, gene products and gene clusters. GOSemSim implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively.
Maintained by Guangchuang Yu. Last updated 5 months ago.
annotationgoclusteringpathwaysnetworksoftwarebioinformaticsgene-ontologysemantic-similaritycpp
63 stars 14.12 score 708 scripts 68 dependents
bioc
ensembldb:Utilities to create and use Ensembl-based annotation databases
The package provides functions to create and use transcript-centric annotation databases/packages. The annotation for the databases is fetched directly from Ensembl using their Perl API. The functionality and data are similar to those of the TxDb packages from the GenomicFeatures package, but, in addition to retrieving all gene/transcript models and annotations from the database, ensembldb provides a filter framework that allows retrieving annotations for specific entries like genes encoded on a chromosome region or transcript models of lincRNA genes. EnsDb databases built with ensembldb also contain protein annotations and mappings between proteins and their encoding transcripts. Finally, ensembldb provides functions to map between genomic, transcript and protein coordinates.
Maintained by Johannes Rainer. Last updated 5 months ago.
geneticsannotationdatasequencingcoverageannotationbioconductorbioconductor-packagesensembl
35 stars 14.08 score 892 scripts 108 dependents
bioc
AnnotationHub:Client to access AnnotationHub resources
This package provides a client for the Bioconductor AnnotationHub web resource. The AnnotationHub web resource provides a central location where genomic files (e.g., VCF, bed, wig) and other resources from standard locations (e.g., UCSC, Ensembl) can be discovered. The resource includes metadata about each resource, e.g., a textual description, tags, and date of modification. The client creates and manages a local cache of files retrieved by the user, helping with quick and reproducible access.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
infrastructuredataimportguithirdpartyclientcore-packageu24ca289073
17 stars 13.88 score 2.7k scripts 104 dependents
bioc
BiocFileCache:Manage Files Across Sessions
This package creates a persistent on-disk cache of files that the user can add, update, and retrieve. It is useful for managing resources (such as custom Txdb objects) that are costly or difficult to create, web resources, and data files used across sessions.
Maintained by Lori Shepherd. Last updated 2 months ago.
dataimportcore-packageu24ca289073
13 stars 13.76 score 486 scripts 436 dependents
bioc
Gviz:Plotting data and annotation information along genomic coordinates
Genomic data analyses require integrated visualization of known genomic information and new experimental data. Gviz uses the biomaRt and the rtracklayer packages to perform live annotation queries to Ensembl and UCSC and translates the results into, e.g., gene/transcript structures in viewports of the grid graphics package. This results in genomic information plotted together with your data.
Maintained by Robert Ivanek. Last updated 5 months ago.
visualizationmicroarraysequencing
79 stars 13.05 score 1.4k scripts 46 dependents
bioc
ChIPseeker:ChIPseeker for ChIP peak Annotation, Comparison, and Visualization
This package implements functions to retrieve the nearest genes around the peak, annotate the genomic region of the peak, estimate the statistical significance of overlap among ChIP peak data sets, and incorporate the GEO database so that users can compare their own dataset with those deposited in the database. The comparison can be used to infer cooperative regulation and thus can be used to generate hypotheses. Several visualization functions are implemented to summarize the coverage of the peak experiment, average profile and heatmap of peaks binding to TSS regions, genomic annotation, distance to TSS, and overlap of peaks or genes.
Maintained by Guangchuang Yu. Last updated 5 months ago.
annotationchipseqsoftwarevisualizationmultiplecomparisonatac-seqchip-seqcomparisonepigeneticsepigenomics
233 stars 13.05 score 1.6k scripts 5 dependents
ggrothendieck
sqldf:Manipulate R Data Frames Using SQL
The sqldf() function is typically passed a single argument which is an SQL select statement where the table names are ordinary R data frame names. sqldf() transparently sets up a database, imports the data frames into that database, performs the SQL select or other statement and returns the result using a heuristic to determine which class to assign to each column of the returned data frame. The sqldf() or read.csv.sql() functions can also be used to read filtered files into R even if the original files are larger than R itself can handle. 'RSQLite', 'RH2', 'RMySQL' and 'RPostgreSQL' backends are supported.
Maintained by G. Grothendieck. Last updated 3 years ago.
250 stars 13.04 score 8.1k scripts 52 dependents
bioc
minfi:Analyze Illumina Infinium DNA methylation arrays
Tools to analyze & visualize Illumina Infinium methylation arrays.
Maintained by Kasper Daniel Hansen. Last updated 4 months ago.
immunooncologydnamethylationdifferentialmethylationepigeneticsmicroarraymethylationarraymultichanneltwochanneldataimportnormalizationpreprocessingqualitycontrol
60 stars 12.82 score 996 scripts 27 dependents
bioc
SpatialExperiment:S4 Class for Spatially Resolved -omics Data
Defines an S4 class for storing data from spatial -omics experiments. The class extends SingleCellExperiment to support storage and retrieval of additional information from spot-based and molecule-based platforms, including spatial coordinates, images, and image metadata. A specialized constructor function is included for data from the 10x Genomics Visium platform.
Maintained by Dario Righelli. Last updated 5 months ago.
datarepresentationdataimportinfrastructureimmunooncologygeneexpressiontranscriptomicssinglecellspatial
59 stars 12.63 score 1.8k scripts 71 dependents
ohdsi
DatabaseConnector:Connecting to Various Database Platforms
An R 'DataBase Interface' ('DBI') compatible interface to various database platforms ('PostgreSQL', 'Oracle', 'Microsoft SQL Server', 'Amazon Redshift', 'Microsoft Parallel Database Warehouse', 'IBM Netezza', 'Apache Impala', 'Google BigQuery', 'Snowflake', 'Spark', 'SQLite', and 'InterSystems IRIS'). Also includes support for fetching data as 'Andromeda' objects. Uses either 'Java Database Connectivity' ('JDBC') or other 'DBI' drivers to connect to databases.
Maintained by Martijn Schuemie. Last updated 2 months ago.
56 stars 12.63 score 772 scripts 11 dependents
bioc
TFBSTools:Software Package for Transcription Factor Binding Site (TFBS) Analysis
TFBSTools is a package for the analysis and manipulation of transcription factor binding sites. It includes matrix conversion between Position Frequency Matrix (PFM), Position Weight Matrix (PWM) and Information Content Matrix (ICM). It can also scan putative TFBS from sequence/alignment, query the JASPAR database and provides a wrapper for de novo motif discovery software.
Maintained by Ge Tan. Last updated 19 days ago.
motifannotationgeneregulationmotifdiscoverytranscriptionalignment
28 stars 12.36 score 1.1k scripts 18 dependents
bioc
ReactomePA:Reactome Pathway Analysis
This package provides functions for pathway analysis based on the REACTOME pathway database. It implements enrichment analysis, gene set enrichment analysis and several functions for visualization. This package is not affiliated with the Reactome team.
Maintained by Guangchuang Yu. Last updated 5 months ago.
pathwaysvisualizationannotationmultiplecomparisongenesetenrichmentreactomeenrichment-analysisreactome-pathway-analysisreactomepa
40 stars 12.25 score 1.5k scripts 7 dependents
bioc
ggbio:Visualization tools for genomic data
The ggbio package extends and specializes the grammar of graphics for biological data. The graphics are designed to answer common scientific questions, in particular those often asked of high throughput genomics data. All core Bioconductor data structures are supported, where appropriate. The package supports detailed views of particular genomic regions, as well as genome-wide overviews. Supported overviews include ideograms and grand linear views. High-level plots include sequence fragment length, edge-linked interval to data view, mismatch pileup, and several splicing summaries.
Maintained by Michael Lawrence. Last updated 5 months ago.
111 stars 12.23 score 734 scripts 16 dependents
r-dbi
RMariaDB:Database Interface and MariaDB Driver
Implements a DBI-compliant interface to MariaDB (<https://mariadb.org/>) and MySQL (<https://www.mysql.com/>) databases.
Maintained by Kirill Müller. Last updated 1 month ago.
133 stars 12.20 score 792 scripts 10 dependents
bioc
ExperimentHub:Client to access ExperimentHub resources
This package provides a client for the Bioconductor ExperimentHub web resource. ExperimentHub provides a central location where curated data from experiments, publications or training courses can be accessed. Each resource has associated metadata, tags and date of modification. The client creates and manages a local cache of files retrieved enabling quick and reproducible access.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
infrastructuredataimportguithirdpartyclientcore-packageu24ca289073
10 stars 11.94 score 764 scripts 57 dependents
pecanproject
PEcAn.DB:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by David LeBauer. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
216 stars 11.91 score 127 scripts 27 dependents
pecanproject
PEcAn.data.atmosphere:PEcAn Functions Used for Managing Climate Driver Data
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The PEcAn.data.atmosphere package converts climate driver data into a standard format for models integrated into PEcAn. As a standalone package, it provides an interface to access diverse climate data sets.
Maintained by David LeBauer. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
216 stars 11.63 score 64 scripts 14 dependents
bioc
bumphunter:Bump Hunter
Tools for finding bumps in genomic data
Maintained by Tamilselvi Guharaj. Last updated 5 months ago.
dnamethylationepigeneticsinfrastructuremultiplecomparisonimmunooncology
16 stars 11.61 score 210 scripts 43 dependents
darwin-eu
CDMConnector:Connect to an OMOP Common Data Model
Provides tools for working with observational health data in the Observational Medical Outcomes Partnership (OMOP) Common Data Model format with a pipe friendly syntax. Common data model database table references are stored in a single compound object along with metadata.
Maintained by Adam Black. Last updated 1 month ago.
12 stars 11.43 score 502 scripts 12 dependents
bioc
annotate:Annotation for microarrays
Using R environments for annotation.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
11.41 score 812 scripts 239 dependents
bioc
VariantAnnotation:Annotation of Genetic Variants
Annotate variants, compute amino acid coding changes, predict coding outcomes.
Maintained by Bioconductor Package Maintainer. Last updated 3 months ago.
dataimportsequencingsnpannotationgeneticsvariantannotationcurlbzip2xz-utilszlib
11.39 score 1.9k scripts 152 dependents
bioc
pathview:a tool set for pathway based data integration and visualization
Pathview is a tool set for pathway based data integration and visualization. It maps and renders a wide variety of biological data on relevant pathway graphs. All users need to do is supply their data and specify the target pathway. Pathview automatically downloads the pathway graph data, parses the data file, maps user data to the pathway, and renders the pathway graph with the mapped data. In addition, Pathview also seamlessly integrates with pathway and gene set (enrichment) analysis tools for large-scale and fully automated analysis.
Maintained by Weijun Luo. Last updated 3 days ago.
pathwaysgraphandnetworkvisualizationgenesetenrichmentdifferentialexpressiongeneexpressionmicroarrayrnaseqgeneticsmetabolomicsproteomicssystemsbiologysequencing
40 stars 11.37 score 1.6k scripts 10 dependents
ropensci
biomartr:Genomic Data Retrieval
Perform large scale genomic data retrieval and functional annotation retrieval. This package aims to provide users with a standardized way to automate genome, proteome, 'RNA', coding sequence ('CDS'), 'GFF', and metagenome retrieval from 'NCBI RefSeq', 'NCBI Genbank', 'ENSEMBL', and 'UniProt' databases. Furthermore, an interface to the 'BioMart' database (Smedley et al. (2009) <doi:10.1186/1471-2164-10-22>) allows users to retrieve functional annotation for genomic loci. In addition, users can download entire databases such as 'NCBI RefSeq' (Pruitt et al. (2007) <doi:10.1093/nar/gkl842>), 'NCBI nr', 'NCBI nt', 'NCBI Genbank' (Benson et al. (2013) <doi:10.1093/nar/gks1195>), etc. with only one command.
Maintained by Hajk-Georg Drost. Last updated 2 months ago.
biomartgenomic-data-retrievalannotation-retrievaldatabase-retrievalncbiensemblbiological-data-retrievalensembl-serversgenomegenome-annotationgenome-retrievalgenomicsmeta-analysismetagenomicsncbi-genbankpeer-reviewedproteomesequenced-genomes
218 stars 11.35 score 129 scripts 3 dependents
bioc
karyoploteR:Plot customizable linear genomes displaying arbitrary data
karyoploteR creates karyotype plots of arbitrary genomes and offers a complete set of functions to plot arbitrary data on them. It mimics many R base graphics functions, coupling them with a coordinate change function that automatically maps the chromosome and data coordinates into the plot coordinates. In addition to the provided data plotting functions, it is easy to add new ones.
Maintained by Bernat Gel. Last updated 5 months ago.
visualizationcopynumbervariationsequencingcoveragednaseqchipseqmethylseqdataimportonechannelbioconductorbioinformaticsdata-visualizationgenomegenomics-visualizationplotting-in-r
307 stars 11.25 score 656 scripts 4 dependents
bioc
genefilter:genefilter: methods for filtering genes from high-throughput experiments
Some basic functions for filtering genes.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
11.11 score 2.4k scripts 143 dependents
ohdsi
PatientLevelPrediction:Develop Clinical Prediction Models Using the Common Data Model
A user friendly way to create patient level prediction models using the Observational Medical Outcomes Partnership Common Data Model. Given a cohort of interest and an outcome of interest, the package can use data in the Common Data Model to build a large set of features. These features can then be used to fit a predictive model with a number of machine learning algorithms. This is further described in Reps (2017) <doi:10.1093/jamia/ocy032>.
Maintained by Egill Fridgeirsson. Last updated 24 days ago.
190 stars 10.85 score 297 scripts
apache
apache.sedona:R Interface for Apache Sedona
R interface for 'Apache Sedona' based on 'sparklyr' (<https://sedona.apache.org>).
Maintained by Apache Sedona. Last updated 5 hours ago.
cluster-computinggeospatialjavapythonscalaspatial-analysisspatial-queryspatial-sql
2.0k stars 10.73 score 105 scripts
pecanproject
PEcAn.benchmark:PEcAn Functions Used for Benchmarking
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. The PEcAn.benchmark package provides utilities for comparing models and data, including a suite of statistical metrics and plots.
Maintained by Mike Dietze. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
216 stars 10.73 score 416 scripts 11 dependents
rstudio
pointblank:Data Validation and Organization of Metadata for Local and Remote Tables
Validate data in data frames, 'tibble' objects, 'Spark' 'DataFrames', and database tables. Validation pipelines can be made using easily-readable, consecutive validation steps. Upon execution of the validation plan, several reporting options are available. User-defined thresholds for failure rates allow for the determination of appropriate reporting actions. Many other workflows are available including an information management workflow, where the aim is to record, collect, and generate useful information on data tables.
Maintained by Richard Iannone. Last updated 5 days ago.
data-assertionsdata-checkerdata-dictionariesdata-framesdata-inferencedata-managementdata-profilerdata-qualitydata-validationdata-verificationdatabase-tableseasy-to-understandreporting-toolschema-validationtesting-toolsyaml-configuration
942 stars 10.73 score 284 scripts
bioc
GWASTools:Tools for Genome Wide Association Studies
Classes for storing very large GWAS data sets and annotation, and functions for GWAS data cleaning and analysis.
Maintained by Stephanie M. Gogarten. Last updated 13 days ago.
snpgeneticvariabilityqualitycontrolmicroarray
17 stars 10.67 score 396 scripts 5 dependents
ohdsi
FeatureExtraction:Generating Features for a Cohort
An R interface for generating features for a cohort using data in the Common Data Model. Features can be constructed using default or custom-made feature definitions. Furthermore, it is possible to aggregate features and get the summary statistics.
Maintained by Ger Inberg. Last updated 11 days ago.
62 stars 10.64 score 209 scripts 2 dependents
bioc
tximeta:Transcript Quantification Import with Automatic Metadata
Transcript quantification import from Salmon and other quantifiers with automatic attachment of transcript ranges and release information, and other associated metadata. De novo transcriptomes can be linked to the appropriate sources with linkedTxomes and shared for computational reproducibility.
Maintained by Michael Love. Last updated 2 months ago.
annotationgenomeannotationdataimportpreprocessingrnaseqsinglecelltranscriptomicstranscriptiongeneexpressionfunctionalgenomicsreproducibleresearchreportwritingimmunooncology
67 stars 10.58 score 466 scripts 1 dependents
bioc
ORFik:Open Reading Frames in Genomics
R package for analysis of transcript and translation features through manipulation of sequence data and NGS data like Ribo-Seq, RNA-Seq, TCP-Seq and CAGE. It is generalized in the sense that any transcript region can be analysed; as the name hints, it was made with investigation of ribosomal patterns over Open Reading Frames (ORFs) as its primary use case. ORFik is extremely fast through use of C++, data.table and GenomicRanges. The package allows reassigning the starts of transcripts using CAGE-Seq data, automatic shifting of RiboSeq reads, finding Open Reading Frames for whole genomes and much more.
Maintained by Haakon Tjeldnes. Last updated 1 month ago.
immunooncologysoftwaresequencingriboseqrnaseqfunctionalgenomicscoveragealignmentdataimportcpp
33 stars 10.56 score 115 scripts 2 dependents
datastorm-open
shinymanager:Authentication Management for 'Shiny' Applications
Simple and secure authentication mechanism for single 'Shiny' applications. Credentials can be stored in an encrypted 'SQLite' database or on your own SQL Database (Postgres, MySQL, ...). Source code of main application is protected until authentication is successful.
Maintained by Benoit Thieurmel. Last updated 11 months ago.
391 stars 10.51 score 316 scripts 2 dependents
bioc
ballgown:Flexible, isoform-level differential expression analysis
Tools for statistical analysis of assembled transcriptomes, including flexible differential expression analysis, visualization of transcript structures, and matching of assembled transcripts to annotation.
Maintained by Jack Fu. Last updated 5 months ago.
immunooncologyrnaseqstatisticalmethodpreprocessingdifferentialexpression
145 stars 10.51 score 338 scripts 1 dependents
bioc
GENESIS:GENetic EStimation and Inference in Structured samples (GENESIS): Statistical methods for analyzing genetic data from samples with population structure and/or relatedness
The GENESIS package provides methodology for estimating, inferring, and accounting for population and pedigree structure in genetic analyses. The current implementation provides functions to perform PC-AiR (Conomos et al., 2015, Gen Epi) and PC-Relate (Conomos et al., 2016, AJHG). PC-AiR performs a Principal Components Analysis on genome-wide SNP data for the detection of population structure in a sample that may contain known or cryptic relatedness. Unlike standard PCA, PC-AiR accounts for relatedness in the sample to provide accurate ancestry inference that is not confounded by family structure. PC-Relate uses ancestry representative principal components to adjust for population structure/ancestry and accurately estimate measures of recent genetic relatedness such as kinship coefficients, IBD sharing probabilities, and inbreeding coefficients. Additionally, functions are provided to perform efficient variance component estimation and mixed model association testing for both quantitative and binary phenotypes.
Maintained by Stephanie M. Gogarten. Last updated 2 months ago.
snpgeneticvariabilitygeneticsstatisticalmethoddimensionreductionprincipalcomponentgenomewideassociationqualitycontrolbiocviews
36 stars 10.44 score 342 scripts 1 dependents
bioc
oligo:Preprocessing tools for oligonucleotide arrays
A package to analyze oligonucleotide arrays (expression/SNP/tiling/exon) at probe-level. It currently supports Affymetrix (CEL files) and NimbleGen arrays (XYS files).
Maintained by Benilton Carvalho. Last updated 23 days ago.
microarrayonechanneltwochannelpreprocessingsnpdifferentialexpressionexonarraygeneexpressiondataimportzlib
3 stars 10.42 score 528 scripts 10 dependents
egeulgen
pathfindR:Enrichment Analysis Utilizing Active Subnetworks
Enrichment analysis enables researchers to uncover mechanisms underlying a phenotype. However, conventional methods for enrichment analysis do not take into account protein-protein interaction information, resulting in incomplete conclusions. 'pathfindR' is a tool for enrichment analysis utilizing active subnetworks. The main function identifies active subnetworks in a protein-protein interaction network using a user-provided list of genes and associated p values. It then performs enrichment analyses on the identified subnetworks, identifying enriched terms (i.e. pathways or, more broadly, gene sets) that possibly underlie the phenotype of interest. 'pathfindR' also offers functionalities to cluster the enriched terms and identify representative terms in each cluster, to score the enriched terms per sample and to visualize analysis results. The enrichment, clustering and other methods implemented in 'pathfindR' are described in detail in Ulgen E, Ozisik O, Sezerman OU. 2019. 'pathfindR': An R Package for Comprehensive Identification of Enriched Pathways in Omics Data Through Active Subnetworks. Front. Genet. <doi:10.3389/fgene.2019.00858>.
Maintained by Ege Ulgen. Last updated 1 month ago.
active-subnetworksenrichmentpathwaypathway-enrichment-analysissubnetwork
187 stars 10.38 score 138 scriptsbcgov
bcdata:Search and Retrieve Data from the BC Data Catalogue
Search, query, and download tabular and 'geospatial' data from the British Columbia Data Catalogue (<https://catalogue.data.gov.bc.ca/>). Search catalogue data records based on keywords, data licence, sector, data format, and B.C. government organization. View metadata directly in R, download many data formats, and query 'geospatial' data available via the B.C. government Web Feature Service ('WFS') using 'dplyr' syntax.
Maintained by Andy Teucher. Last updated 5 days ago.
83 stars 10.36 score 186 scripts 4 dependentsbioc
pRoloc:A unifying bioinformatics framework for spatial proteomics
The pRoloc package implements machine learning and visualisation methods for the analysis and interrogation of quantitative mass spectrometry data to reliably infer protein sub-cellular localisation.
Maintained by Lisa Breckels. Last updated 4 days ago.
immunooncologyproteomicsmassspectrometryclassificationclusteringqualitycontrolbioconductorproteomics-dataspatial-proteomicsvisualisationopenblascpp
15 stars 10.31 score 101 scripts 2 dependentsbioc
GSEABase:Gene set enrichment data structures and methods
This package provides classes and methods to support Gene Set Enrichment Analysis (GSEA).
Maintained by Bioconductor Package Maintainer. Last updated 2 months ago.
geneexpressiongenesetenrichmentgraphandnetworkgokegg
10.27 score 1.5k scripts 77 dependentsbioc
graphite:GRAPH Interaction from pathway Topological Environment
Graph objects from pathway topology derived from KEGG, Panther, PathBank, PharmGKB, Reactome, SMPDB and WikiPathways databases.
Maintained by Gabriele Sales. Last updated 5 months ago.
pathwaysthirdpartyclientgraphandnetworknetworkreactomekeggmetabolomicsbioinformaticsmirrorpathway-analysis
8 stars 10.24 score 122 scripts 21 dependentsbioc
EDASeq:Exploratory Data Analysis and Normalization for RNA-Seq
Numerical and graphical summaries of RNA-Seq read data. Within-lane normalization procedures to adjust for GC-content effect (or other gene-level effects) on read counts: loess robust local regression, global-scaling, and full-quantile normalization (Risso et al., 2011). Between-lane normalization procedures to adjust for distributional differences between lanes (e.g., sequencing depth): global-scaling and full-quantile normalization (Bullard et al., 2010).
Maintained by Davide Risso. Last updated 5 months ago.
immunooncologysequencingrnaseqpreprocessingqualitycontroldifferentialexpression
5 stars 10.24 score 594 scripts 9 dependentsidigbio
ridigbio:Interface to the iDigBio Data API
An interface to iDigBio's search API that allows downloading specimen records. Searches are returned as a data.frame. Other functions such as the metadata end points return lists of information. iDigBio is a US project focused on digitizing and serving museum specimen collections on the web. See <https://www.idigbio.org> for information on iDigBio.
Maintained by Jesse Bennett. Last updated 20 days ago.
16 stars 10.23 score 63 scripts 7 dependentsbioc
zinbwave:Zero-Inflated Negative Binomial Model for RNA-Seq Data
Implements a general and flexible zero-inflated negative binomial model that can be used to provide a low-dimensional representation of single-cell RNA-seq data. The model accounts for zero inflation (dropouts), over-dispersion, and the count nature of the data. The model also accounts for the difference in library sizes and optionally for batch effects and/or other covariates, avoiding the need to pre-normalize the data.
Maintained by Davide Risso. Last updated 5 months ago.
immunooncologydimensionreductiongeneexpressionrnaseqsoftwaretranscriptomicssequencingsinglecell
43 stars 10.21 score 190 scripts 6 dependentsbioc
cBioPortalData:Exposes and Makes Available Data from the cBioPortal Web Resources
The cBioPortalData R package accesses study datasets from the cBio Cancer Genomics Portal. It accesses the data either from the pre-packaged zip / tar files or from the API that was recently implemented by the cBioPortal Data Team. The package can provide data either in tabular format or as a MultiAssayExperiment object that uses familiar Bioconductor data representations.
Maintained by Marcel Ramos. Last updated 10 days ago.
softwareinfrastructurethirdpartyclientbioconductor-packagenci-itcru24ca289073
33 stars 10.17 score 147 scripts 4 dependentsbioc
singleCellTK:Comprehensive and Interactive Analysis of Single Cell RNA-Seq Data
The Single Cell Toolkit (SCTK) in the singleCellTK package provides an interface to popular tools for importing, quality control, analysis, and visualization of single cell RNA-seq data. SCTK allows users to seamlessly integrate tools from various packages at different stages of the analysis workflow. A general "a la carte" workflow gives users access to multiple methods for data importing, calculation of general QC metrics, doublet detection, ambient RNA estimation and removal, filtering, normalization, batch correction or integration, dimensionality reduction, 2-D embedding, clustering, marker detection, differential expression, cell type labeling, pathway analysis, and data exporting. Curated workflows can be used to run Seurat and Celda. Streamlined quality control can be performed on the command line using the SCTK-QC pipeline. Users can analyze their data using commands in the R console or by using an interactive Shiny Graphical User Interface (GUI). Specific analyses or entire workflows can be summarized and shared with comprehensive HTML reports generated by Rmarkdown. Additional documentation and vignettes can be found at camplab.net/sctk.
Maintained by Joshua David Campbell. Last updated 1 month ago.
singlecellgeneexpressiondifferentialexpressionalignmentclusteringimmunooncologybatcheffectnormalizationqualitycontroldataimportgui
182 stars 10.17 score 252 scriptsropensci
rfishbase:R Interface to 'FishBase'
A programmatic interface to 'FishBase', re-written based on an accompanying 'RESTful' API. Access tables describing over 30,000 species of fish, their biology, ecology, morphology, and more. This package also supports experimental access to 'SeaLifeBase' data, which contains nearly 200,000 species records for all types of aquatic life not covered by 'FishBase.'
Maintained by Carl Boettiger. Last updated 3 months ago.
116 stars 10.11 score 764 scripts 2 dependentsropensci
spocc:Interface to Species Occurrence Data Sources
A programmatic interface to many species occurrence data sources, including Global Biodiversity Information Facility ('GBIF'), 'iNaturalist', 'eBird', Integrated Digitized 'Biocollections' ('iDigBio'), 'VertNet', Ocean 'Biogeographic' Information System ('OBIS'), and Atlas of Living Australia ('ALA'). Includes functionality for retrieving species occurrence data, and combining those data.
Maintained by Hannah Owens. Last updated 2 months ago.
specimensapiweb-servicesoccurrencesspeciestaxonomygbifinatvertnetebirdidigbioobisalaantwebbisondataecoengineinaturalistoccurrencespecies-occurrencespocc
118 stars 10.09 score 552 scripts 5 dependentsbioc
sva:Surrogate Variable Analysis
The sva package contains functions for removing batch effects and other unwanted variation in high-throughput experiments. Specifically, the sva package contains functions for identifying and building surrogate variables for high-dimensional data sets. Surrogate variables are covariates constructed directly from high-dimensional data (like gene expression/RNA sequencing/methylation/brain imaging data) that can be used in subsequent analyses to adjust for unknown, unmodeled, or latent sources of noise. The sva package can be used to remove artifacts in three ways: (1) identifying and estimating surrogate variables for unknown sources of variation in high-throughput experiments (Leek and Storey 2007 PLoS Genetics, 2008 PNAS), (2) directly removing known batch effects using ComBat (Johnson et al. 2007 Biostatistics) and (3) removing batch effects with known control probes (Leek 2014 bioRxiv). Removing batch effects and using surrogate variables in differential expression analysis have been shown to reduce dependence, stabilize error rate estimates, and improve reproducibility (see Leek and Storey 2007 PLoS Genetics, 2008 PNAS or Leek et al. 2011 Nat. Reviews Genetics).
Maintained by Jeffrey T. Leek. Last updated 5 months ago.
immunooncologymicroarraystatisticalmethodpreprocessingmultiplecomparisonsequencingrnaseqbatcheffectnormalization
10.04 score 3.2k scripts 50 dependentspecanproject
PEcAn.settings:PEcAn Settings package
Contains functions to read PEcAn settings files.
Maintained by David LeBauer. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
216 stars 10.03 score 54 scripts 17 dependentsbioc
BiocCheck:Bioconductor-specific package checks
BiocCheck guides maintainers through Bioconductor best practices. It runs Bioconductor-specific package checks by searching through package code, examples, and vignettes. Maintainers are required to address all errors, warnings, and most notes produced.
Maintained by Marcel Ramos. Last updated 1 month ago.
infrastructurebioconductor-packagecore-services
8 stars 10.03 score 114 scripts 6 dependentsbioc
singscore:Rank-based single-sample gene set scoring method
A simple single-sample gene signature scoring method that uses rank-based statistics to analyze the sample's gene expression profile. It scores the expression activities of gene sets at a single-sample level.
Maintained by Malvika Kharbanda. Last updated 5 months ago.
softwaregeneexpressiongenesetenrichmentbioinformatics
41 stars 10.03 score 124 scripts 4 dependentsbioc
derfinder:Annotation-agnostic differential expression analysis of RNA-seq data at base-pair resolution via the DER Finder approach
This package provides functions for annotation-agnostic differential expression analysis of RNA-seq data. Two implementations of the DER Finder approach are included in this package: (1) single base-level F-statistics and (2) DER identification at the expressed regions-level. The DER Finder approach can also be used to identify differentially bound ChIP-seq peaks.
Maintained by Leonardo Collado-Torres. Last updated 4 months ago.
differentialexpressionsequencingrnaseqchipseqdifferentialpeakcallingsoftwareimmunooncologycoverageannotation-agnosticbioconductorderfinder
42 stars 10.03 score 78 scripts 6 dependentspecanproject
PEcAn.assim.batch:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by Istem Fer. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 9.97 score 20 scripts 2 dependentsdarwin-eu
omopgenerics:Methods and Classes for the OMOP Common Data Model
Provides definitions of core classes and methods used by analytic pipelines that query the OMOP (Observational Medical Outcomes Partnership) common data model.
Maintained by Martí Català. Last updated 25 days ago.
9.97 score 193 scripts 16 dependentsbioc
goseq:Gene Ontology analyser for RNA-seq and other length biased data
Detects Gene Ontology and/or other user defined categories which are over/under represented in RNA-seq data.
Maintained by Federico Marini. Last updated 5 months ago.
immunooncologysequencinggogeneexpressiontranscriptionrnaseqdifferentialexpressionannotationgenesetenrichmentkeggpathwayssoftware
2 stars 9.97 score 636 scripts 9 dependentsdarwin-eu
PatientProfiles:Identify Characteristics of Patients in the OMOP Common Data Model
Identify the characteristics of patients in data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model.
Maintained by Marti Catala. Last updated 25 days ago.
1 star 9.97 score 225 scripts 9 dependentspecanproject
PEcAn.priors:PEcAn Functions Used to Estimate Priors from Data
Functions to estimate priors from data.
Maintained by David LeBauer. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 9.96 score 13 scripts 6 dependentsbioc
rGREAT:GREAT Analysis - Functional Enrichment on Genomic Regions
GREAT (Genomic Regions Enrichment of Annotations Tool) is a type of functional enrichment analysis directly performed on genomic regions. This package implements the GREAT algorithm (the local GREAT analysis) and also supports direct interaction with the GREAT web service (the online GREAT analysis). Both analyses can be viewed in a Shiny application. rGREAT by default supports more than 600 organisms and a large number of gene set collections, as well as user-provided gene sets and organisms. Additionally, it implements a general method for dealing with background regions.
Maintained by Zuguang Gu. Last updated 19 days ago.
genesetenrichmentgopathwayssoftwaresequencingwholegenomegenomeannotationcoveragecpp
86 stars 9.96 score 320 scripts 1 dependentsdarwin-eu
CodelistGenerator:Identify Relevant Clinical Codes and Evaluate Their Use
Generate a candidate code list for the Observational Medical Outcomes Partnership (OMOP) common data model based on string matching. For a given search strategy, a candidate code list will be returned.
Maintained by Edward Burn. Last updated 5 days ago.
14 stars 9.94 score 165 scripts 4 dependentspecanproject
PEcAn.MA:PEcAn Functions Used for Meta-Analysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. The PEcAn.MA package contains the functions used in the Bayesian meta-analysis of trait data.
Maintained by David LeBauer. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 9.92 score 7 scripts 7 dependentsbioc
RUVSeq:Remove Unwanted Variation from RNA-Seq Data
This package implements the remove unwanted variation (RUV) methods of Risso et al. (2014) for the normalization of RNA-Seq read counts between samples.
Maintained by Davide Risso. Last updated 5 months ago.
immunooncologydifferentialexpressionpreprocessingrnaseqsoftware
13 stars 9.91 score 482 scripts 5 dependentsbioc
OmnipathR:OmniPath web service client and more
A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on github).
Maintained by Denes Turei. Last updated 1 month ago.
graphandnetworknetworkpathwayssoftwarethirdpartyclientdataimportdatarepresentationgenesignalinggeneregulationsystemsbiologytranscriptomicssinglecellannotationkeggcomplexesenzyme-ptmnetworksnetworks-biologyomnipathproteinsquarto
130 stars 9.90 score 226 scripts 2 dependentsbioc
methylumi:Handle Illumina methylation data
This package provides classes for holding and manipulating Illumina methylation data. Based on eSet, it can contain MIAME information, sample information, feature information, and multiple matrices of data. An "intelligent" import function, methylumiR can read the Illumina text files and create a MethyLumiSet. methylumIDAT can directly read raw IDAT files from HumanMethylation27 and HumanMethylation450 microarrays. Normalization, background correction, and quality control features for GoldenGate, Infinium, and Infinium HD arrays are also included.
Maintained by Sean Davis. Last updated 5 months ago.
dnamethylationtwochannelpreprocessingqualitycontrolcpgisland
9 stars 9.90 score 89 scripts 9 dependentsbioc
PureCN:Copy number calling and SNV classification using targeted short read sequencing
This package estimates tumor purity, copy number, and loss of heterozygosity (LOH), and classifies single nucleotide variants (SNVs) by somatic status and clonality. PureCN is designed for targeted short read sequencing data, integrates well with standard somatic variant detection and copy number pipelines, and has support for tumor samples without matching normal samples.
Maintained by Markus Riester. Last updated 1 day ago.
copynumbervariationsoftwaresequencingvariantannotationvariantdetectioncoverageimmunooncologybioconductor-packagecell-free-dnacopy-numberlohtumor-heterogeneitytumor-mutational-burdentumor-purity
132 stars 9.88 score 40 scriptsbioc
GenVisR:Genomic Visualizations in R
Produce highly customizable publication quality graphics for genomic data primarily at the cohort level.
Maintained by Zachary Skidmore. Last updated 5 months ago.
infrastructuredatarepresentationclassificationdnaseq
217 stars 9.87 score 76 scriptsbioc
annotatr:Annotation of Genomic Regions to Genomic Annotations
Given a set of genomic sites/regions (e.g. ChIP-seq peaks, CpGs, differentially methylated CpGs or regions, SNPs, etc.) it is often of interest to investigate the intersecting genomic annotations. Such annotations include those relating to gene models (promoters, 5'UTRs, exons, introns, and 3'UTRs), CpGs (CpG islands, CpG shores, CpG shelves), or regulatory sequences such as enhancers. The annotatr package provides an easy way to summarize and visualize the intersection of genomic sites/regions with genomic annotations.
Maintained by Raymond G. Cavalcante. Last updated 5 months ago.
softwareannotationgenomeannotationfunctionalgenomicsvisualizationgenome-annotation
26 stars 9.76 score 246 scripts 5 dependentsbioc
RTCGAToolbox:A new tool for exporting TCGA Firehose data
Managing data from large-scale projects such as The Cancer Genome Atlas (TCGA) for further analysis is an important and time-consuming step for research projects. Several efforts, such as the Firehose project, make TCGA pre-processed data publicly available via web services and data portals, but using it requires managing, downloading and preparing the data for subsequent steps. We developed an open source and extensible R-based data client for Firehose pre-processed data and demonstrated its use with sample case studies. Results showed that RTCGAToolbox could improve data management for researchers who are interested in TCGA data. In addition, it can be integrated with other analysis pipelines for subsequent data analysis.
Maintained by Marcel Ramos. Last updated 3 months ago.
differentialexpressiongeneexpressionsequencing
18 stars 9.75 score 76 scripts 5 dependentsohdsi
CohortConstructor:Build and Manipulate Study Cohorts Using a Common Data Model
Create and manipulate study cohorts in data mapped to the Observational Medical Outcomes Partnership Common Data Model.
Maintained by Edward Burn. Last updated 9 hours ago.
2 stars 9.74 score 207 scripts 2 dependentsprestodb
RPresto:DBI Connector to Presto
Implements a 'DBI' compliant interface to Presto. Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes: <https://prestodb.io/>.
Maintained by Jarod G.R. Meng. Last updated 2 months ago.
132 stars 9.73 score 25 scripts 4 dependentsfemiguez
apsimx:Inspect, Read, Edit and Run 'APSIM' "Next Generation" and 'APSIM' Classic
The functions in this package inspect, read, edit and run files for 'APSIM' "Next Generation" ('JSON') and 'APSIM' "Classic" ('XML'). The files with an 'apsim' extension correspond to 'APSIM' Classic (7.x) - Windows only - and the ones with an 'apsimx' extension correspond to 'APSIM' "Next Generation". For more information about 'APSIM' see (<https://www.apsim.info/>) and for 'APSIM' next generation (<https://apsimnextgeneration.netlify.app/>).
Maintained by Fernando Miguez. Last updated 12 days ago.
59 stars 9.72 score 68 scripts 2 dependentspecanproject
PEcAnRTM:PEcAn Functions Used for Radiative Transfer Modeling
Functions for performing forward runs and inversions of radiative transfer models (RTMs). Inversions can be performed using maximum likelihood, or more complex hierarchical Bayesian methods. Underlying numerical analyses are optimized for speed using Fortran code.
Maintained by Alexey Shiklomanov. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsfortranjagscpp
216 stars 9.72 score 132 scriptscenterforassessment
SGP:Student Growth Percentiles & Percentile Growth Trajectories
An analytic framework for the calculation of norm- and criterion-referenced academic growth estimates using large scale, longitudinal education assessment data as developed in Betebenner (2009) <doi:10.1111/j.1745-3992.2009.00161.x>.
Maintained by Damian W. Betebenner. Last updated 13 days ago.
percentile-growth-projectionsquantile-regressionsgpsgp-analysesstudent-growth-percentilesstudent-growth-projections
20 stars 9.69 score 88 scriptsappsilon
shiny.telemetry:'Shiny' App Usage Telemetry
Enables instrumentation of 'Shiny' apps for tracking user session events such as input changes, browser type, and session duration. These events can be sent to any of the available storage backends and analyzed using the included 'Shiny' app to gain insights about app usage and adoption.
Maintained by André Veríssimo. Last updated 4 months ago.
67 stars 9.69 score 29 scriptsbioc
txdbmaker:Tools for making TxDb objects from genomic annotations
A set of tools for making TxDb objects from genomic annotations from various sources (e.g. UCSC, Ensembl, and GFF files). These tools allow the user to download the genomic locations of transcripts, exons, and CDS, for a given assembly, and to import them in a TxDb object. TxDb objects are implemented in the GenomicFeatures package, together with flexible methods for extracting the desired features in convenient formats.
Maintained by H. Pagès. Last updated 4 months ago.
infrastructuredataimportannotationgenomeannotationgenomeassemblygeneticssequencingbioconductor-packagecore-package
3 stars 9.68 score 92 scripts 87 dependentsbioc
TCGAutils:TCGA utility functions for data management
A suite of helper functions for checking and manipulating TCGA data including data obtained from the curatedTCGAData experiment package. These functions aim to simplify working with TCGA data and make it more manageable. Exported functions include those that import data from flat files into Bioconductor objects, convert row annotations, and translate identifiers via the GDC API.
Maintained by Marcel Ramos. Last updated 4 months ago.
softwareworkflowsteppreprocessingdataimportbioconductor-packagetcgau24ca289073utilities
27 stars 9.66 score 210 scripts 10 dependentsplangfelder
WGCNA:Weighted Correlation Network Analysis
Functions necessary to perform Weighted Correlation Network Analysis on high-dimensional data as originally described in Horvath and Zhang (2005) <doi:10.2202/1544-6115.1128> and Langfelder and Horvath (2008) <doi:10.1186/1471-2105-9-559>. Includes functions for rudimentary data cleaning, construction of correlation networks, module identification, summarization, and relating of variables and modules to sample traits. Also includes a number of utility functions for data manipulation and visualization.
Maintained by Peter Langfelder. Last updated 6 months ago.
54 stars 9.65 score 5.3k scripts 32 dependentsvimc
orderly:Lightweight Reproducible Reporting
Order, create and store reports from R. By defining a lightweight interface around the inputs and outputs of an analysis, a lot of the repetitive work for reproducible research can be automated. We define a simple format for organising and describing work that facilitates collaborative reproducible research and acknowledges that all analyses are run multiple times over their lifespans.
Maintained by Rich FitzJohn. Last updated 2 years ago.
117 stars 9.63 score 94 scripts 4 dependentsbioc
pcaExplorer:Interactive Visualization of RNA-seq Data Using a Principal Components Approach
This package provides functionality for interactive visualization of RNA-seq datasets based on Principal Components Analysis. The methods provided allow for quick information extraction and effective data exploration. A Shiny application encapsulates the whole analysis.
Maintained by Federico Marini. Last updated 3 months ago.
immunooncologyvisualizationrnaseqdimensionreductionprincipalcomponentqualitycontrolguireportwritingshinyappsbioconductorprincipal-componentsreproducible-researchrna-seq-analysisrna-seq-datashinytranscriptomeuser-friendly
56 stars 9.63 score 180 scriptsbioc
clusterExperiment:Compare Clusterings for Single-Cell Sequencing
Provides functionality for running and comparing many different clusterings of single-cell sequencing data or other large mRNA Expression data sets.
Maintained by Elizabeth Purdom. Last updated 5 months ago.
clusteringrnaseqsequencingsoftwaresinglecellcpp
38 stars 9.62 score 192 scripts 1 dependentsbioc
AnnotationForge:Tools for building SQLite-based annotation data packages
Provides code for generating Annotation packages and their databases. Packages produced are intended to be used with AnnotationDbi.
Maintained by Bioconductor Package Maintainer. Last updated 18 days ago.
annotationinfrastructurebioconductor-packagecore-package
5 stars 9.62 score 143 scripts 19 dependentsbioc
cytomapper:Visualization of highly multiplexed imaging data in R
Highly multiplexed imaging acquires the single-cell expression of selected proteins in a spatially-resolved fashion. These measurements can be visualised across multiple length-scales. First, pixel-level intensities represent the spatial distributions of feature expression with highest resolution. Second, after segmentation, expression values or cell-level metadata (e.g. cell-type information) can be visualised on segmented cell areas. This package contains functions for the visualisation of multiplexed read-outs and cell-level information obtained by multiplexed imaging technologies. The main functions of this package allow 1. the visualisation of pixel-level information across multiple channels, 2. the display of cell-level information (expression and/or metadata) on segmentation masks and 3. gating and visualisation of single cells.
Maintained by Lasse Meyer. Last updated 5 months ago.
immunooncologysoftwaresinglecellonechanneltwochannelmultiplecomparisonnormalizationdataimportbioimagingimaging-mass-cytometrysingle-cellspatial-analysis
32 stars 9.61 score 354 scripts 5 dependentsropensci
tidyhydat:Extract and Tidy Canadian 'Hydrometric' Data
Provides functions to access historical and real-time national 'hydrometric' data from Water Survey of Canada data sources (<https://dd.weather.gc.ca/hydrometric/csv/> and <https://collaboration.cmc.ec.gc.ca/cmc/hydrometrics/www/>) and then applies tidy data principles.
Maintained by Sam Albers. Last updated 20 days ago.
citzgovernment-datahydrologyhydrometricstidy-datawater-resources
71 stars 9.59 score 202 scripts 3 dependentsbioc
recount:Explore and download data from the recount project
Explore and download data from the recount project available at https://jhubiostatistics.shinyapps.io/recount/. Using the recount package you can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level, the raw counts, the phenotype metadata used, the urls to the sample coverage bigWig files or the mean coverage bigWig file for a particular study. The RangedSummarizedExperiment objects can be used by different packages for performing differential expression analysis. Using http://bioconductor.org/packages/derfinder you can perform annotation-agnostic differential expression analyses with the data from the recount project as described at http://www.nature.com/nbt/journal/v35/n4/full/nbt.3838.html.
Maintained by Leonardo Collado-Torres. Last updated 4 months ago.
coveragedifferentialexpressiongeneexpressionrnaseqsequencingsoftwaredataimportimmunooncologyannotation-agnosticbioconductorcountderfinderdeseq2exongenehumanilluminajunctionrecount
41 stars 9.57 score 498 scripts 3 dependentsrqtl
qtl2:Quantitative Trait Locus Mapping in Experimental Crosses
Provides a set of tools to perform quantitative trait locus (QTL) analysis in experimental crosses. It is a reimplementation of the 'R/qtl' package to better handle high-dimensional data and complex cross designs. Broman et al. (2019) <doi:10.1534/genetics.118.301595>.
Maintained by Karl W Broman. Last updated 23 days ago.
34 stars 9.48 score 1.1k scripts 5 dependentsbioc
SpatialFeatureExperiment:Integrating SpatialExperiment with Simple Features in sf
A new S4 class integrating Simple Features with the R package sf to bring geospatial data analysis methods based on vector data to spatial transcriptomics. Also implements management of spatial neighborhood graphs and geometric operations. This package builds upon SpatialExperiment and SingleCellExperiment, hence methods for these parent classes can still be used.
Maintained by Lambda Moses. Last updated 2 months ago.
datarepresentationtranscriptomicsspatial
49 stars 9.40 score 322 scripts 1 dependentsusepa
tcpl:ToxCast Data Analysis Pipeline
The ToxCast Data Analysis Pipeline ('tcpl') is an R package that manages, curve-fits, plots, and stores ToxCast data to populate its linked MySQL database, 'invitrodb'. The package was developed for the chemical screening data curated by the US EPA's Toxicity Forecaster (ToxCast) program, but 'tcpl' can be used to support diverse chemical screening efforts.
Maintained by Jason Brown. Last updated 12 days ago.
36 stars 9.39 score 90 scriptspecanproject
PEcAn.data.land:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by Mike Dietze. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 9.35 score 19 scripts 10 dependentsbioc
GenomicInteractions:Utilities for handling genomic interaction data
Utilities for handling genomic interaction data such as ChIA-PET or Hi-C, annotating genomic features with interaction information, and producing plots and summary statistics.
Maintained by Liz Ing-Simmons. Last updated 5 months ago.
softwareinfrastructuredataimportdatarepresentationhic
7 stars 9.31 score 162 scripts 5 dependentsohdsi
Andromeda:Asynchronous Disk-Based Representation of Massive Data
Storing very large data objects on a local drive, while still making it possible to manipulate the data in an efficient manner.
Maintained by Martijn Schuemie. Last updated 7 months ago.
11 stars 9.29 score 57 scripts 8 dependentsbioc
EWCE:Expression Weighted Celltype Enrichment
Used to determine which cell types are enriched within gene lists. The package provides tools for testing enrichments within simple gene lists (such as human disease associated genes) and those resulting from differential expression studies. The package does not depend upon any particular Single Cell Transcriptome dataset and user defined datasets can be loaded in and used in the analyses.
Maintained by Alan Murphy. Last updated 2 months ago.
geneexpressiontranscriptiondifferentialexpressiongenesetenrichmentgeneticsmicroarraymrnamicroarrayonechannelrnaseqbiomedicalinformaticsproteomicsvisualizationfunctionalgenomicssinglecelldeconvolutionsingle-cellsingle-cell-rna-seqtranscriptomics
56 stars 9.29 score 99 scriptsbioc
CNEr:CNE Detection and Visualization
Large-scale identification and advanced visualization of sets of conserved noncoding elements.
Maintained by Ge Tan. Last updated 5 months ago.
generegulationvisualizationdataimport
3 stars 9.28 score 35 scripts 19 dependentsbioc
IsoformSwitchAnalyzeR:Identify, Annotate and Visualize Isoform Switches with Functional Consequences from both short- and long-read RNA-seq data
Analysis of alternative splicing and isoform switches with predicted functional consequences (e.g. gain/loss of protein domains etc.) from quantification of all types of RNASeq by tools such as Kallisto, Salmon, StringTie, Cufflinks/Cuffdiff etc.
Maintained by Kristoffer Vitting-Seerup. Last updated 5 months ago.
geneexpressiontranscriptionalternativesplicingdifferentialexpressiondifferentialsplicingvisualizationstatisticalmethodtranscriptomevariantbiomedicalinformaticsfunctionalgenomicssystemsbiologytranscriptomicsrnaseqannotationfunctionalpredictiongenepredictiondataimportmultiplecomparisonbatcheffectimmunooncology
108 stars 9.26 score 125 scriptsbioc
RcisTarget:RcisTarget Identify transcription factor binding motifs enriched on a list of genes or genomic regions
RcisTarget identifies transcription factor binding motifs (TFBS) over-represented on a gene list. In a first step, RcisTarget selects DNA motifs that are significantly over-represented in the surroundings of the transcription start site (TSS) of the genes in the gene-set. This is achieved by using a database that contains genome-wide cross-species rankings for each motif. The motifs are then annotated to TFs and those that have a high Normalized Enrichment Score (NES) are retained. Finally, for each motif and gene-set, RcisTarget predicts the candidate target genes (i.e. genes in the gene-set that are ranked above the leading edge).
Maintained by Gert Hulselmans. Last updated 5 months ago.
generegulationmotifannotationtranscriptomicstranscriptiongenesetenrichmentgenetarget
37 stars 9.18 score 191 scriptspecanproject
PEcAn.allometry:PEcAn Allometry Functions
Synthesize allometric equations or fit allometries to data.
Maintained by Mike Dietze. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
216 stars 9.13 score 34 scriptsbioc
sesame:SEnsible Step-wise Analysis of DNA MEthylation BeadChips
Tools for analyzing Illumina Infinium DNA methylation arrays. SeSAMe provides utilities to support analyses of multiple generations of Infinium DNA methylation BeadChips, including preprocessing, quality control, visualization and inference. SeSAMe features accurate detection calling, intelligent inference of ethnicity, sex and advanced quality control routines.
Maintained by Wanding Zhou. Last updated 3 months ago.
dnamethylationmethylationarraypreprocessingqualitycontrolbioinformaticsdna-methylationmicroarray
69 stars 9.08 score 258 scripts 1 dependentsbioc
OUTRIDER:OUTRIDER - OUTlier in RNA-Seq fInDER
Identification of aberrant gene expression in RNA-seq data. Read count expectations are modeled by an autoencoder to control for confounders in the data. Given these expectations, the RNA-seq read counts are assumed to follow a negative binomial distribution with a gene-specific dispersion. Outliers are then identified as read counts that significantly deviate from this distribution. Furthermore, OUTRIDER provides useful plotting functions to analyze and visualize the results.
Maintained by Christian Mertes. Last updated 5 months ago.
immunooncologyrnaseqtranscriptomicsalignmentsequencinggeneexpressiongeneticscount-datadiagnosticsexpression-analysismendelian-geneticsoutlier-detectionrna-seqopenblascpp
50 stars 9.07 score 110 scripts 1 dependentspecanproject
PEcAn.qaqc:QAQC
PEcAn integration and model skill testing
Maintained by David LeBauer. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
216 stars 9.07 score 5 scriptsbioc
BatchQC:Batch Effects Quality Control Software
Sequencing and microarray samples often are collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. BatchQC is a software tool that streamlines batch preprocessing and evaluation by providing interactive diagnostics, visualizations, and statistical analyses to explore the extent to which batch variation impacts the data. BatchQC diagnostics help determine whether batch adjustment needs to be done, and how correction should be applied before proceeding with a downstream analysis. Moreover, BatchQC interactively applies multiple common batch effect approaches to the data and the user can quickly see the benefits of each method. BatchQC is developed as a Shiny App. The output is organized into multiple tabs and each tab features an important part of the batch effect analysis and visualization of the data. The BatchQC interface has the following analysis groups: Summary, Differential Expression, Median Correlations, Heatmaps, Circular Dendrogram, PCA Analysis, Shape, ComBat and SVA.
Maintained by Jessica Anderson. Last updated 13 days ago.
batcheffectgraphandnetworkmicroarraynormalizationprincipalcomponentsequencingsoftwarevisualizationqualitycontrolrnaseqpreprocessingdifferentialexpressionimmunooncology
7 stars 9.06 score 54 scriptsohdsi
Cyclops:Cyclic Coordinate Descent for Logistic, Poisson and Survival Analysis
This model fitting tool incorporates cyclic coordinate descent and majorization-minimization approaches to fit a variety of regression models found in large-scale observational healthcare data. Implementations focus on computational optimization and fine-scale parallelization to yield efficient inference in massive datasets. Please see: Suchard, Simpson, Zorych, Ryan and Madigan (2013) <doi:10.1145/2414416.2414791>.
Maintained by Marc A. Suchard. Last updated 4 months ago.
39 stars 9.05 score 73 scripts 4 dependentsbioc
bambu:Context-Aware Transcript Quantification from Long Read RNA-Seq data
bambu is an R package for multi-sample transcript discovery and quantification using long read RNA-Seq data. You can use bambu after read alignment to obtain expression estimates for known and novel transcripts and genes. The output from bambu can directly be used for visualisation and downstream analysis such as differential gene expression or transcript usage.
Maintained by Ying Chen. Last updated 2 months ago.
alignmentcoveragedifferentialexpressionfeatureextractiongeneexpressiongenomeannotationgenomeassemblyimmunooncologylongreadmultiplecomparisonnormalizationrnaseqregressionsequencingsoftwaretranscriptiontranscriptomicsbambubioconductorlong-readsnanoporenanopore-sequencingrna-seqrna-seq-analysistranscript-quantificationtranscript-reconstructioncpp
203 stars 9.04 score 91 scripts 1 dependentsbioc
Banksy:Spatial transcriptomic clustering
Banksy is an R package that incorporates spatial information to cluster cells in a feature space (e.g. gene expression). To incorporate spatial information, BANKSY computes the mean neighborhood expression and azimuthal Gabor filters that capture gene expression gradients. These features are combined with the cell's own expression to embed cells in a neighbor-augmented product space which can then be clustered, allowing for accurate and spatially-aware cell typing and tissue domain segmentation.
Maintained by Joseph Lee. Last updated 28 days ago.
clusteringspatialsinglecellgeneexpressiondimensionreductionclustering-algorithmsingle-cell-omicsspatial-omics
90 stars 9.03 score 248 scriptspecanproject
PEcAn.all:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by David LeBauer. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 9.02 score 266 scriptsbioc
scPipe:Pipeline for single cell multi-omic data pre-processing
A preprocessing pipeline for single cell RNA-seq/ATAC-seq data that starts from the fastq files and produces a feature count matrix with associated quality control information. It can process fastq data generated by CEL-seq, MARS-seq, Drop-seq, Chromium 10x and SMART-seq protocols.
Maintained by Shian Su. Last updated 4 months ago.
immunooncologysoftwaresequencingrnaseqgeneexpressionsinglecellvisualizationsequencematchingpreprocessingqualitycontrolgenomeannotationdataimportcurlbzip2xz-utilszlibcpp
68 stars 9.02 score 84 scriptsbioc
scone:Single Cell Overview of Normalized Expression data
SCONE is an R package for comparing and ranking the performance of different normalization schemes for single-cell RNA-seq and other high-throughput analyses.
Maintained by Davide Risso. Last updated 1 month ago.
immunooncologynormalizationpreprocessingqualitycontrolgeneexpressionrnaseqsoftwaretranscriptomicssequencingsinglecellcoverage
53 stars 9.00 score 104 scriptspecanproject
PEcAn.MAAT:PEcAn Package for Integration of the MAAT Model
This module provides functions to wrap the MAAT model into the PEcAn workflows.
Maintained by Shawn Serbin. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
216 stars 8.97 score 12 scriptsdexter-psychometrics
dexter:Data Management and Analysis of Tests
A system for the management, assessment, and psychometric analysis of data from educational and psychological tests.
Maintained by Jesse Koops. Last updated 21 days ago.
8 stars 8.97 score 135 scripts 2 dependentssym33
RecordLinkage:Record Linkage Functions for Linking and Deduplicating Data Sets
Provides functions for linking and deduplicating data sets. Methods based on a stochastic approach are implemented as well as classification algorithms from the machine learning domain. For details, see our paper "The RecordLinkage Package: Detecting Errors in Data" Sariyar M / Borg A (2010) <doi:10.32614/RJ-2010-017>.
Maintained by Murat Sariyar. Last updated 2 years ago.
6 stars 8.96 score 454 scripts 8 dependentsbioc
topGO:Enrichment Analysis for Gene Ontology
topGO package provides tools for testing GO terms while accounting for the topology of the GO graph. Different test statistics and different methods for eliminating local similarities and dependencies between GO terms can be implemented and applied.
Maintained by Adrian Alexa. Last updated 5 months ago.
8.96 score 2.0k scripts 20 dependentspecanproject
PEcAn.BIOCRO:PEcAn Package for Integration of the BioCro Model
This module provides functions to link BioCro to PEcAn.
Maintained by David LeBauer. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 8.96 score 23 scriptspecanproject
PEcAn.uncertainty:PEcAn Functions Used for Propagating and Partitioning Uncertainties in Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by David LeBauer. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 8.95 score 15 scripts 5 dependentsbioc
motifbreakR:A Package For Predicting The Disruptiveness Of Single Nucleotide Polymorphisms On Transcription Factor Binding Sites
We introduce motifbreakR, which allows the biologist to judge in the first place whether the sequence surrounding the polymorphism is a good match, and in the second place how much information is gained or lost in one allele of the polymorphism relative to another. MotifbreakR is both flexible and extensible over previous offerings, giving users a choice of algorithms for interrogation of genomes with motifs from public sources: 1) a weighted-sum probability matrix, 2) log-probabilities, and 3) weighted by relative entropy. MotifbreakR can predict effects for novel or previously described variants in public databases, making it suitable for tasks beyond the scope of its original design. Lastly, it can be used to interrogate any genome curated within Bioconductor (currently there are 32 species, a total of 109 versions).
Maintained by Simon Gert Coetzee. Last updated 5 months ago.
chipseqvisualizationmotifannotationtranscription
28 stars 8.89 score 103 scriptsericpante
marmap:Import, Plot and Analyze Bathymetric and Topographic Data
Import xyz data from the NOAA (National Oceanic and Atmospheric Administration, <https://www.noaa.gov>), GEBCO (General Bathymetric Chart of the Oceans, <https://www.gebco.net>) and other sources, plot xyz data to prepare publication-ready figures, analyze xyz data to extract transects, get depth / altitude based on geographical coordinates, or calculate z-constrained least-cost paths.
Maintained by Benoit Simon-Bouhet. Last updated 9 months ago.
32 stars 8.86 score 524 scripts 1 dependentspecanproject
PEcAn.workflow:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. This package provides workhorse functions that can be used to run the major steps of a PEcAn analysis.
Maintained by David LeBauer. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 8.85 score 15 scripts 4 dependentssherrillmix
taxonomizr:Functions to Work with NCBI Accessions and Taxonomy
Functions for assigning taxonomy to NCBI accession numbers and taxon IDs based on NCBI's accession2taxid and taxdump files. This package allows the user to download NCBI data dumps and create a local database for fast and local taxonomic assignment.
Maintained by Scott Sherrill-Mix. Last updated 19 days ago.
72 stars 8.85 score 255 scripts 2 dependentsusdaforestservice
FIESTA:Forest Inventory Estimation and Analysis
A research estimation tool for analysts that work with sample-based inventory data from the U.S. Department of Agriculture, Forest Service, Forest Inventory and Analysis (FIA) Program.
Maintained by Grayson White. Last updated 5 days ago.
30 stars 8.84 score 62 scriptspbiecek
archivist:Tools for Storing, Restoring and Searching for R Objects
Data exploration and modelling is a process in which a lot of data artifacts are produced, such as subsets, data aggregates, plots, statistical models, different versions of data sets and different versions of results. The more projects we work with, the more artifacts are produced and the harder it is to manage them. Archivist helps to store and manage artifacts created in R. It allows you to store selected artifacts as binary files together with their metadata and relations, and to share artifacts with others, either through a shared folder or GitHub. It allows you to look for already created artifacts by class, name, date of creation or other properties, and makes it easy to restore such artifacts. It also allows you to check whether a new artifact is an exact copy of one produced some time ago, which may be useful for testing or caching.
Maintained by Przemyslaw Biecek. Last updated 8 months ago.
74 stars 8.81 score 105 scripts 2 dependentsmountainmath
cansim:Accessing Statistics Canada Data Table and Vectors
Searches for, accesses, and retrieves Statistics Canada data tables, as well as individual vectors, as tidy data frames. This package enriches the tables with metadata, deals with encoding issues, allows for bilingual English or French language data retrieval, and bundles convenience functions to make it easier to work with retrieved table data. For more efficient data access the package allows for caching data in a local database and database level filtering, data manipulation and summarizing.
Maintained by Jens von Bergmann. Last updated 17 days ago.
45 stars 8.78 score 446 scriptspecanproject
PEcAn.data.remote:PEcAn Functions Used for Extracting Remote Sensing Data
PEcAn module for processing remote data. Python module requirements: requests, json, re, ast, pandas, sys. If any of these modules are missing, install using pip install <module name>.
Maintained by Bailey Morrison. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplants
216 stars 8.77 score 6 scripts 5 dependentspecanproject
PEcAn.ED2:PEcAn Package for Integration of ED2 Model
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation. This package provides functions to link the Ecosystem Demography Model, version 2, to PEcAn.
Maintained by Mike Dietze. Last updated 19 hours ago.
bayesiancyberinfrastructuredata-assimilationdata-scienceecosystem-modelecosystem-scienceforecastingmeta-analysisnational-science-foundationpecanplantsjagscpp
216 stars 8.76 score 145 scriptsbioc
CellBench:Construct Benchmarks for Single Cell Analysis Methods
This package contains infrastructure for benchmarking analysis methods and access to single cell mixture benchmarking data. It provides a framework for organising analysis methods and testing combinations of methods in a pipeline without explicitly laying out each combination. It also provides utilities for sampling and filtering SingleCellExperiment objects, constructing lists of functions with varying parameters, and multithreaded evaluation of analysis methods.
Maintained by Shian Su. Last updated 5 months ago.
softwareinfrastructuresinglecellbenchmarkbioinformatics
31 stars 8.73 score 98 scriptsbioc
qpgraph:Estimation of Genetic and Molecular Regulatory Networks from High-Throughput Genomics Data
Estimate gene and eQTL networks from high-throughput expression and genotyping assays.
Maintained by Robert Castelo. Last updated 3 days ago.
microarraygeneexpressiontranscriptionpathwaysnetworkinferencegraphandnetworkgeneregulationgeneticsgeneticvariabilitysnpsoftwareopenblas
3 stars 8.72 score 20 scripts 3 dependentsbioc
GenomicScores:Infrastructure to work with genomewide position-specific scores
Provide infrastructure to store and access genomewide position-specific scores within R and Bioconductor.
Maintained by Robert Castelo. Last updated 2 months ago.
infrastructuregeneticsannotationsequencingcoverageannotationhubsoftware
8 stars 8.71 score 83 scripts 6 dependentsbioc
Voyager:From geospatial to spatial omics
SpatialFeatureExperiment (SFE) is a new S4 class for working with spatial single-cell genomics data. The voyager package implements basic exploratory spatial data analysis (ESDA) methods for SFE. Univariate methods include univariate global spatial ESDA methods such as Moran's I, permutation testing for Moran's I, and correlograms. Bivariate methods include Lee's L and cross variogram. Multivariate methods include MULTISPATI PCA and multivariate local Geary's C recently developed by Anselin. The Voyager package also implements plotting functions to plot SFE data and ESDA results.
Maintained by Lambda Moses. Last updated 3 months ago.
geneexpressionspatialtranscriptomicsvisualizationbioconductoredaesdaexploratory-data-analysisomicsspatial-statisticsspatial-transcriptomics
88 stars 8.71 score 173 scriptsbioc
trackViewer:An R/Bioconductor package with web interface for drawing elegant interactive tracks or lollipop plot to facilitate integrated analysis of multi-omics data
Visualize mapped reads along with annotation as track layers for NGS dataset such as ChIP-seq, RNA-seq, miRNA-seq, DNA-seq, SNPs and methylation data.
Maintained by Jianhong Ou. Last updated 4 days ago.
8.68 score 145 scripts 2 dependentsbioc
gage:Generally Applicable Gene-set Enrichment for Pathway Analysis
GAGE is a published method for gene set (enrichment or GSEA) or pathway analysis. GAGE is generally applicable independent of microarray or RNA-Seq data attributes including sample sizes, experimental designs, assay platforms, and other types of heterogeneity, and consistently achieves superior performance over other frequently used methods. In the gage package, we provide functions for basic GAGE analysis, result processing and presentation. We have also built pipeline routines for multiple GAGE analyses in a batch, comparison between parallel analyses, and combined analysis of heterogeneous data from different sources/studies. In addition, we provide demo microarray data and commonly used gene set data based on KEGG pathways and GO terms. These functions and data are also useful for gene set analysis using other methods.
Maintained by Weijun Luo. Last updated 5 months ago.
pathwaysgodifferentialexpressionmicroarrayonechanneltwochannelrnaseqgeneticsmultiplecomparisongenesetenrichmentgeneexpressionsystemsbiologysequencing
5 stars 8.68 score 784 scripts 1 dependentsbcgov
bcmaps:Map Layers and Spatial Utilities for British Columbia
Various layers of B.C., including administrative boundaries, natural resource management boundaries, census boundaries etc. All layers are available in BC Albers (<https://spatialreference.org/ref/epsg/3005/>) equal-area projection, which is the B.C. government standard. The layers are sourced from the British Columbia and Canadian government under open licenses, including B.C. Data Catalogue (<https://data.gov.bc.ca>), the Government of Canada Open Data Portal (<https://open.canada.ca/en/using-open-data>), and Statistics Canada (<https://www.statcan.gc.ca/en/reference/licence>).
Maintained by Andy Teucher. Last updated 3 months ago.
73 stars 8.65 score 254 scriptsbioc
QuasR:Quantify and Annotate Short Reads in R
This package provides a framework for the quantification and analysis of Short Reads. It covers a complete workflow starting from raw sequence reads, over creation of alignments and quality control plots, to the quantification of genomic regions of interest. Read alignments are either generated through Rbowtie (data from DNA/ChIP/ATAC/Bis-seq experiments) or Rhisat2 (data from RNA-seq experiments that require spliced alignments), or can be provided in the form of bam files.
Maintained by Michael Stadler. Last updated 1 month ago.
geneticspreprocessingsequencingchipseqrnaseqmethylseqcoveragealignmentqualitycontrolimmunooncologycurlbzip2xz-utilszlibcpp
6 stars 8.63 score 79 scripts 1 dependentsropensci
ckanr:Client for the Comprehensive Knowledge Archive Network ('CKAN') API
Client for 'CKAN' API (<https://ckan.org/>). Includes interface to 'CKAN' 'APIs' for search, list, show for packages, organizations, and resources. In addition, provides an interface to the 'datastore' API.
Maintained by Francisco Alves. Last updated 2 years ago.
databaseopen-datackanapidatadatasetapi-wrapperckan-api
98 stars 8.60 score 448 scripts 4 dependentsbioc
AUCell:AUCell: Analysis of 'gene set' activity in single-cell RNA-seq data (e.g. identify cells with specific gene signatures)
AUCell identifies cells with active gene sets (e.g. signatures, gene modules...) in single-cell RNA-seq data. AUCell uses the "Area Under the Curve" (AUC) to calculate whether a critical subset of the input gene set is enriched within the expressed genes for each cell. The distribution of AUC scores across all the cells allows exploring the relative expression of the signature. Since the scoring method is ranking-based, AUCell is independent of the gene expression units and the normalization procedure. In addition, since the cells are evaluated individually, it can easily be applied to bigger datasets, subsetting the expression matrix if needed.
Maintained by Gert Hulselmans. Last updated 5 months ago.
singlecellgenesetenrichmenttranscriptomicstranscriptiongeneexpressionworkflowstepnormalization
8.59 score 860 scripts 4 dependentsbioc
SPIAT:Spatial Image Analysis of Tissues
SPIAT (Spatial Image Analysis of Tissues) is an R package with a suite of data processing, quality control, visualization and data analysis tools. SPIAT is compatible with data generated from single-cell spatial proteomics platforms (e.g. OPAL, CODEX, MIBI, cellprofiler). SPIAT reads spatial data in the form of X and Y coordinates of cells, marker intensities and cell phenotypes. SPIAT includes six analysis modules that allow visualization, calculation of cell colocalization, categorization of the immune microenvironment relative to tumor areas, analysis of cellular neighborhoods, and the quantification of spatial heterogeneity, providing a comprehensive toolkit for spatial data analysis.
Maintained by Yuzhou Feng. Last updated 16 days ago.
biomedicalinformaticscellbiologyspatialclusteringdataimportimmunooncologyqualitycontrolsinglecellsoftwarevisualization
22 stars 8.59 score 69 scriptstudo-r
BatchJobs:Batch Computing with R
Provides Map, Reduce and Filter variants to generate jobs on batch computing systems like PBS/Torque, LSF, SLURM and Sun Grid Engine. Multicore and SSH systems are also supported. For further details see the project web page.
Maintained by Bernd Bischl. Last updated 3 years ago.
85 stars 8.57 score 616 scripts 3 dependentsbioc
FRASER:Find RAre Splicing Events in RNA-Seq Data
Detection of rare aberrant splicing events in transcriptome profiles. Read count ratio expectations are modeled by an autoencoder to control for confounding factors in the data. Given these expectations, the ratios are assumed to follow a beta-binomial distribution with a junction specific dispersion. Outlier events are then identified as read-count ratios that deviate significantly from this distribution. FRASER is able to detect alternative splicing, but also intron retention. The package aims to support diagnostics in the field of rare diseases where RNA-seq is performed to identify aberrant splicing defects.
Maintained by Christian Mertes. Last updated 5 months ago.
rnaseqalternativesplicingsequencingsoftwaregeneticscoverageaberrant-splicingdiagnosticsoutlier-detectionrare-diseaserna-seqsplicingopenblascpp
44 stars 8.53 score 155 scriptscboettig
duckdbfs:High Performance Remote File System, Database and 'Geospatial' Access Using 'duckdb'
Provides friendly wrappers for creating 'duckdb'-backed connections to tabular datasets ('csv', parquet, etc) on local or remote file systems. This mimics the behaviour of "open_dataset" in the 'arrow' package, but in addition to 'S3' file system also generalizes to any list of 'http' URLs.
Maintained by Carl Boettiger. Last updated 20 days ago.
85 stars 8.51 score 41 scripts 16 dependentsropensci
weatherOz:An API Client for Australian Weather and Climate Data Resources
Provides automated downloading, parsing and formatting of weather data for Australia through API endpoints provided by the Department of Primary Industries and Regional Development ('DPIRD') of Western Australia and by the Science and Technology Division of the Queensland Government's Department of Environment and Science ('DES'). Also retrieves precis and coastal forecasts from the Bureau of Meteorology ('BOM') of the Australian government, and downloads and imports radar and satellite imagery files. 'DPIRD' weather data are accessed through public 'APIs' provided by 'DPIRD', <https://www.agric.wa.gov.au/weather-api-20>, providing access to weather station data from the 'DPIRD' weather station network. Australia-wide weather data are based on Australian Bureau of Meteorology ('BOM') data and accessed through 'SILO' (Scientific Information for Land Owners), Jeffrey et al. (2001) <doi:10.1016/S1364-8152(01)00008-1>. 'DPIRD' data are made available under a Creative Commons Attribution 3.0 Licence (CC BY 3.0 AU) <https://creativecommons.org/licenses/by/3.0/au/deed.en>. SILO data are released under a Creative Commons Attribution 4.0 International licence (CC BY 4.0) <https://creativecommons.org/licenses/by/4.0/>. 'BOM' data are (c) Australian Government Bureau of Meteorology and released under a Creative Commons (CC) Attribution 3.0 licence or Public Access Licence ('PAL') as appropriate, see <http://www.bom.gov.au/other/copyright.shtml> for further details.
Maintained by Rodrigo Pires. Last updated 1 month ago.
dpirdbommeteorological-dataweather-forecastaustraliaweatherweather-datameteorologywestern-australiaaustralia-bureau-of-meteorologywestern-australia-agricultureaustralia-agricultureaustralia-climateaustralia-weatherapi-clientclimatedatarainfallweather-api
31 stars 8.47 score 40 scriptsbioc
TitanCNA:Subclonal copy number and LOH prediction from whole genome sequencing of tumours
Hidden Markov model to segment and predict regions of subclonal copy number alterations (CNA) and loss of heterozygosity (LOH), and estimate cellular prevalence of clonal clusters in tumour whole genome sequencing data.
Maintained by Gavin Ha. Last updated 5 months ago.
sequencingwholegenomednaseqexomeseqstatisticalmethodcopynumbervariationhiddenmarkovmodelgeneticsgenomicvariationimmunooncology10x-genomicscopy-number-variationgenome-sequencinghmmtumor-heterogeneity
97 stars 8.47 score 68 scriptsbioc
ClassifyR:A framework for cross-validated classification problems, with applications to differential variability and differential distribution testing
The software formalises a framework for classification and survival model evaluation in R. There are four stages: data transformation, feature selection, model training, and prediction. The requirements of variable types and variable order are fixed, but specialised variables for functions can also be provided. The framework is wrapped in a driver loop that reproducibly carries out a number of cross-validation schemes. Functions for differential mean, differential variability, and differential distribution are included. Additional functions may be developed by the user, by creating an interface to the framework.
Maintained by Dario Strbenac. Last updated 7 days ago.
6 stars 8.46 score 45 scripts 3 dependents
bioc
BgeeDB:Annotation and gene expression data retrieval from Bgee database. TopAnat, an anatomical entities Enrichment Analysis tool for UBERON ontology
A package for downloading annotation and gene expression data from the Bgee database, and for TopAnat analysis: GO-like enrichment of anatomical terms, mapped to genes by expression patterns.
Maintained by Julien Wollbrett. Last updated 5 months ago.
software dataimport sequencing geneexpression microarray go genesetenrichment bioinformatics enrichment-analysis rna-seq scrna-seq single-cell
15 stars 8.46 score 19 scripts 1 dependent
bioc
multiMiR:Integration of multiple microRNA-target databases with their disease and drug associations
A collection of microRNAs/targets from external resources, including validated microRNA-target databases (miRecords, miRTarBase and TarBase), predicted microRNA-target databases (DIANA-microT, ElMMo, MicroCosm, miRanda, miRDB, PicTar, PITA and TargetScan) and microRNA-disease/drug databases (miR2Disease, Pharmaco-miR VerSe and PhenomiR).
Maintained by Spencer Mahaffey. Last updated 5 months ago.
mirnadata homo_sapiens_data mus_musculus_data rattus_norvegicus_data organismdata microrna-sequence sql
20 stars 8.45 score 141 scripts
bioc
geneplotter:Graphics related functions for Bioconductor
Functions for plotting genomic data
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
8.40 score 249 scripts 10 dependents
bioc
UniProt.ws:R Interface to UniProt Web Services
The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. This package provides a collection of functions for retrieving, processing, and re-packaging UniProt web services. The package makes use of UniProt's modernized REST API and allows mapping of identifiers across different databases.
Maintained by Marcel Ramos. Last updated 3 months ago.
annotation infrastructure go kegg biocarta bioconductor-package core-package
4 stars 8.38 score 167 scripts 4 dependents
pecanproject
PEcAn.SIPNET:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by Mike Dietze. Last updated 19 hours ago.
bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp
216 stars 8.38 score 61 scripts
bioc
EnrichmentBrowser:Seamless navigation through combined results of set-based and network-based enrichment analysis
The EnrichmentBrowser package implements essential functionality for the enrichment analysis of gene expression data. The analysis combines the advantages of set-based and network-based enrichment analysis in order to derive high-confidence gene sets and biological pathways that are differentially regulated in the expression data under investigation. The package also facilitates the visualization and exploration of such sets and pathways.
Maintained by Ludwig Geistlinger. Last updated 5 months ago.
immunooncology microarray rnaseq geneexpression differentialexpression pathways graphandnetwork network genesetenrichment networkenrichment visualization reportwriting
20 stars 8.37 score 164 scripts 3 dependents
pecanproject
PEcAn.LINKAGES:PEcAn Package for Integration of the LINKAGES Model
This module provides functions to link the LINKAGES model to PEcAn.
Maintained by Ann Raiho. Last updated 19 hours ago.
bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp
216 stars 8.37 score 59 scripts
wallaceecomod
wallace:A Modular Platform for Reproducible Modeling of Species Niches and Distributions
The 'shiny' application Wallace is a modular platform for reproducible modeling of species niches and distributions. Wallace guides users through a complete analysis, from the acquisition of species occurrence and environmental data to visualizing model predictions on an interactive map, thus bundling complex workflows into a single, streamlined interface. An extensive vignette, which guides users through most package functionality, can be found on the package's GitHub Pages website: <https://wallaceecomod.github.io/wallace/articles/tutorial-v2.html>.
Maintained by Mary E. Blair. Last updated 24 days ago.
133 stars 8.36 score 96 scripts
bioc
CompoundDb:Creating and Using (Chemical) Compound Annotation Databases
CompoundDb provides functionality to create and use (chemical) compound annotation databases from a variety of different sources such as LipidMaps, HMDB, ChEBI or MassBank. The database format also allows MS/MS spectra to be stored along with compound information. The package also provides a backend for Bioconductor's Spectra package and thus allows experimental MS/MS spectra to be matched against MS/MS spectra in the database. Databases can be stored in SQLite format and are thus portable.
Maintained by Johannes Rainer. Last updated 2 months ago.
massspectrometry metabolomics annotation databases mass-spectrometry
17 stars 8.35 score 69 scripts 1 dependent
smouksassi
ggquickeda:Quickly Explore Your Data Using 'ggplot2' and 'table1' Summary Tables
Quickly and easily perform exploratory data analysis by uploading your data as a 'csv' file. Start generating insights using 'ggplot2' plots and 'table1' tables with descriptive stats, all using an easy-to-use point and click 'Shiny' interface.
Maintained by Samer Mouksassi. Last updated 15 days ago.
73 stars 8.34 score 27 scripts
bioc
igvR:igvR: integrative genomics viewer
Access to igv.js, the Integrative Genomics Viewer running in a web browser.
Maintained by Arkadiusz Gladki. Last updated 5 months ago.
visualization thirdpartyclient genomebrowsers
45 stars 8.33 score 118 scripts
surveydown-dev
surveydown:Markdown-Based Surveys Using 'Quarto' and 'shiny'
Generate surveys using markdown and R code chunks. Surveys are composed of two files: a survey.qmd 'Quarto' file defining the survey content (pages, questions, etc.), and an app.R file defining a 'shiny' app with global settings (libraries, database configuration, etc.) and server configuration options (e.g., conditional skipping and display). Survey data collected from respondents is stored in a 'PostgreSQL' database. Features include controls for conditional skip logic (skip to a page based on an answer to a question), conditional display logic (display a question based on an answer to a question), a customizable progress bar, and a wide variety of question types, including multiple choice (single choice and multiple choices), select, text, numeric, multiple choice buttons, text area, and dates. Because the surveys render into a 'shiny' app, designers can also leverage the reactive capabilities of 'shiny' to create dynamic and interactive surveys.
Maintained by John Paul Helveston. Last updated 4 days ago.
markdown postgres postgresql quarto shiny shiny-apps shiny-r supabase survey surveys
97 stars 8.29 score 133 scripts
bioc
GeneTonic:Enjoy Analyzing And Integrating The Results From Differential Expression Analysis And Functional Enrichment Analysis
This package provides functionality to combine the existing pieces of the transcriptome data and results, making it easier to generate insightful observations and hypotheses. Its usage is made easy with a Shiny application, combining the benefits of interactivity and reproducibility, e.g. by capturing the features and gene sets of interest highlighted during the live session, and creating an HTML report as an artifact where text, code, and output coexist. Using the GeneTonicList as a standardized container for all the required components, it is possible to simplify the generation of multiple visualizations and summaries.
Maintained by Federico Marini. Last updated 3 months ago.
gui geneexpression software transcription transcriptomics visualization differentialexpression pathways reportwriting genesetenrichment annotation go shinyapps bioconductor bioconductor-package data-exploration data-visualization functional-enrichment-analysis gene-expression pathway-analysis reproducible-research rna-seq-analysis rna-seq-data shiny transcriptome user-friendly
77 stars 8.28 score 37 scripts 1 dependent
bioc
crisprDesign:Comprehensive design of CRISPR gRNAs for nucleases and base editors
Provides a comprehensive suite of functions to design and annotate CRISPR guide RNA (gRNA) sequences. This includes on- and off-target search, on-target efficiency scoring, off-target scoring, full gene and TSS contextual annotations, and SNP annotation (human only). It currently supports five types of CRISPR modalities (modes of perturbations): CRISPR knockout, CRISPR activation, CRISPR inhibition, CRISPR base editing, and CRISPR knockdown. All types of CRISPR nucleases are supported, including DNA- and RNA-targeting nucleases such as Cas9, Cas12a, and Cas13d. All types of base editors are also supported. gRNA design can be performed on reference genomes, transcriptomes, and custom DNA and RNA sequences. Both unpaired and paired gRNA designs are enabled.
Maintained by Jean-Philippe Fortin. Last updated 26 days ago.
crispr functionalgenomics genetarget bioconductor bioconductor-package crispr-cas9 crispr-design crispr-target genomics-analysis grna grna-sequence grna-sequences sgrna sgrna-design
22 stars 8.28 score 80 scripts 3 dependents
r-dbi
DBItest:Testing DBI Backends
A helper that tests DBI back ends for conformity to the interface.
Maintained by Kirill Müller. Last updated 15 days ago.
24 stars 8.21 score 11 scripts
darwin-eu
DrugUtilisation:Summarise Patient-Level Drug Utilisation in Data Mapped to the OMOP Common Data Model
Summarise patient-level drug utilisation cohorts using data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model. New users and prevalent users cohorts can be generated and their characteristics, indication and drug use summarised.
Maintained by Martí Català. Last updated 2 months ago.
8.20 score 156 scripts 2 dependents
bioc
POMA:Tools for Omics Data Analysis
The POMA package offers a comprehensive toolkit designed for omics data analysis, streamlining the process from initial visualization to final statistical analysis. Its primary goal is to simplify and unify the various steps involved in omics data processing, making it more accessible and manageable within a single, intuitive R package. Emphasizing reproducibility and user-friendliness, POMA leverages the standardized SummarizedExperiment class from Bioconductor, ensuring seamless integration and compatibility with a wide array of Bioconductor tools. This approach guarantees maximum flexibility and replicability, making POMA an essential asset for researchers handling omics datasets. See <https://github.com/pcastellanoescuder/POMAShiny> and Castellano-Escuder et al. (2021) <doi:10.1371/journal.pcbi.1009148> for more details.
Maintained by Pol Castellano-Escuder. Last updated 4 months ago.
batcheffect classification clustering decisiontree dimensionreduction multidimensionalscaling normalization preprocessing principalcomponent regression rnaseq software statisticalmethod visualization bioconductor bioinformatics data-visualization dimension-reduction exploratory-data-analysis machine-learning omics-data-integration pipeline pre-processing statistical-analysis user-friendly workflow
11 stars 8.16 score 20 scripts 1 dependent
bioc
dreamlet:Scalable differential expression analysis of single cell transcriptomics datasets with complex study designs
Recent advances in single cell/nucleus transcriptomic technology have enabled collection of cohort-scale datasets to study cell type specific gene expression differences associated with disease state, stimulus, and genetic regulation. The scale of these data, complex study designs, and low read count per cell mean that characterizing cell type specific molecular mechanisms requires a user-friendly, purpose-built analytical framework. We have developed the dreamlet package that applies a pseudobulk approach and fits a regression model for each gene and cell cluster to test differential expression across individuals associated with a trait of interest. Use of precision-weighted linear mixed models enables accounting for repeated measures study designs, high dimensional batch effects, and varying sequencing depth or observed cells per biosample.
Maintained by Gabriel Hoffman. Last updated 6 days ago.
rnaseq geneexpression differentialexpression batcheffect qualitycontrol regression genesetenrichment generegulation epigenetics functionalgenomics transcriptomics normalization singlecell preprocessing sequencing immunooncology software cpp
12 stars 8.14 score 128 scripts
pecanproject
PEcAnAssimSequential:PEcAn Functions Used for Ecological Forecasts and Reanalysis
The Predictive Ecosystem Carbon Analyzer (PEcAn) is a scientific workflow management tool that is designed to simplify the management of model parameterization, execution, and analysis. The goal of PEcAn is to streamline the interaction between data and models, and to improve the efficacy of scientific investigation.
Maintained by Mike Dietze. Last updated 19 hours ago.
bayesian cyberinfrastructure data-assimilation data-science ecosystem-model ecosystem-science forecasting meta-analysis national-science-foundation pecan plants jags cpp
216 stars 8.14 score 35 scripts
bioc
motifmatchr:Fast Motif Matching in R
Quickly find motif matches for many motifs and many sequences. Wraps C++ code from the MOODS motif calling library, which was developed by Pasi Rastas, Janne Korhonen, and Petri Martinmäki.
Maintained by Alicia Schep. Last updated 5 months ago.
8.11 score 722 scripts 5 dependents
bioc
monaLisa:Binned Motif Enrichment Analysis and Visualization
Useful functions to work with sequence motifs in the analysis of genomics data. These include methods to annotate genomic regions or sequences with predicted motif hits and to identify motifs that drive observed changes in accessibility or expression. Functions to produce informative visualizations of the obtained results are also provided.
Maintained by Michael Stadler. Last updated 8 days ago.
motifannotation visualization featureextraction epigenetics
40 stars 8.10 score 53 scripts
bioc
ggkegg:Analyzing and visualizing KEGG information using the grammar of graphics
This package aims to import, parse, and analyze KEGG data such as KEGG PATHWAY and KEGG MODULE. The package supports visualizing KEGG information with ggplot2 and ggraph, using the grammar of graphics. The package enables the direct visualization of the results from various omics analysis packages.
Maintained by Noriaki Sato. Last updated 2 months ago.
pathways dataimport kegg ggplot2 ggraph pathway tidygraph visualization
225 stars 8.08 score 30 scripts 1 dependent
bioc
STRINGdb:STRINGdb - Protein-Protein Interaction Networks and Functional Enrichment Analysis
The STRINGdb package provides an R interface to the STRING protein-protein interactions database (https://string-db.org).
Maintained by Damian Szklarczyk. Last updated 5 months ago.
8.08 score 344 scripts 9 dependents
bioc
rBLAST:R Interface for the Basic Local Alignment Search Tool
Seamlessly interfaces the Basic Local Alignment Search Tool (BLAST) running locally to search genetic sequence databases. This work was partially supported by grant no. R21HG005912 from the National Human Genome Research Institute.
Maintained by Michael Hahsler. Last updated 3 months ago.
genetics sequencing sequencematching alignment dataimport bioconductor bioinformatics blast-search
106 stars 8.07 score 106 scripts
r-arcgis
arcgislayers:An Interface to ArcGIS Data Services
Enables users of 'ArcGIS Enterprise', 'ArcGIS Online', or 'ArcGIS Platform' to read, write, publish, or manage vector and raster data via ArcGIS location services REST API endpoints <https://developers.arcgis.com/rest/>.
Maintained by Josiah Parry. Last updated 13 days ago.
50 stars 8.07 score 38 scripts 4 dependents
bioc
FLAMES:FLAMES: Full Length Analysis of Mutations and Splicing in long read RNA-seq data
Semi-supervised isoform detection and annotation from both bulk and single-cell long read RNA-seq data. FLAMES provides automated pipelines for analysing isoforms, as well as intermediate functions for manual execution.
Maintained by Changqing Wang. Last updated 20 hours ago.
rnaseq singlecell transcriptomics dataimport differentialsplicing alternativesplicing geneexpression longread zlib curl bzip2 xz-utils cpp
33 stars 8.04 score 12 scripts
darwin-eu
CohortCharacteristics:Summarise and Visualise Characteristics of Patients in the OMOP CDM
Summarise and visualise the characteristics of patients in data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model (CDM).
Maintained by Marti Catala. Last updated 4 months ago.
1 star 8.03 score 111 scripts 1 dependent
bioc
biovizBase:Basic graphic utilities for visualization of genomic data.
The biovizBase package is designed to provide a set of utilities, color schemes and conventions for genomic data. It serves as the base for various high-level packages for biological data visualization. This saves development effort and encourages consistency.
Maintained by Michael Lawrence. Last updated 5 months ago.
infrastructure visualization preprocessing
8.03 score 273 scripts 74 dependents
bioc
recount3:Explore and download data from the recount3 project
The recount3 package enables access to a large amount of uniformly processed RNA-seq data from human and mouse. You can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junction level with sample metadata and QC statistics. In addition, we provide access to sample coverage BigWig files.
Maintained by Leonardo Collado-Torres. Last updated 4 months ago.
coverage differentialexpression geneexpression rnaseq sequencing software dataimport annotation-agnostic bioconductor count derfinder exon gene human illumina junction mouse recount recount3
33 stars 8.03 score 216 scripts
bioc
simplifyEnrichment:Simplify Functional Enrichment Results
A new clustering algorithm, "binary cut", for clustering similarity matrices of functional terms is implemented in this package. It also provides functions for visualizing, summarizing and comparing the clusterings.
Maintained by Zuguang Gu. Last updated 5 months ago.
software visualization go clustering genesetenrichment
113 stars 8.02 score 196 scripts
bioc
spicyR:Spatial analysis of in situ cytometry data
The spicyR package provides a framework for performing inference on changes in spatial relationships between pairs of cell types for cell-resolution spatial omics technologies. spicyR consists of three primary steps: (i) summarizing the degree of spatial localization between pairs of cell types for each image; (ii) modelling the variability in localization summary statistics as a function of cell counts and (iii) testing for changes in spatial localizations associated with a response variable.
Maintained by Ellis Patrick. Last updated 27 days ago.
singlecell cellbasedassays spatial
9 stars 8.02 score 57 scripts 1 dependent
bioc
netZooR:Unified methods for the inference and analysis of gene regulatory networks
netZooR unifies the implementations of several Network Zoo methods (netzoo, netzoo.github.io) into a single package by creating interfaces between network inference and network analysis methods. Currently, the package has 3 methods for network inference including PANDA and its optimized implementation OTTER (network reconstruction using multiple lines of biological evidence), LIONESS (single-sample network inference), and EGRET (genotype-specific networks). Network analysis methods include CONDOR (community detection), ALPACA (differential community detection), CRANE (significance estimation of differential modules), and MONSTER (estimation of network transition states). In addition, YARN processes gene expression data for tissue-specific analyses and SAMBAR infers missing mutation data based on pathway information.
Maintained by Tara Eicher. Last updated 13 days ago.
networkinference network generegulation geneexpression transcription microarray graphandnetwork gene-regulatory-network transcription-factors
105 stars 7.98 score
darwin-eu
IncidencePrevalence:Estimate Incidence and Prevalence using the OMOP Common Data Model
Calculate incidence and prevalence using data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model. Incidence and prevalence can be estimated for the total population in a database or for a stratification cohort.
Maintained by Edward Burn. Last updated 21 days ago.
9 stars 7.96 score 102 scripts 1 dependent
bioc
motifStack:Plot stacked logos for single or multiple DNA, RNA and amino acid sequence
The motifStack package is designed for graphic representation of multiple motifs with different similarity scores. It works with both DNA/RNA sequence motifs and amino acid sequence motifs. In addition, it provides the flexibility for users to customize the graphic parameters such as the font type and symbol colors.
Maintained by Jianhong Ou. Last updated 3 months ago.
sequencematching visualization sequencing microarray alignment chipchip chipseq motifannotation dataimport
7.93 score 188 scripts 6 dependents
bioc
Category:Category Analysis
A collection of tools for performing category (gene set enrichment) analysis.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
annotation go pathways genesetenrichment
7.93 score 183 scripts 16 dependents
ohdsi
CohortGenerator:Cohort Generation for the OMOP Common Data Model
Generate cohorts and subsets using an Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) Database. Cohorts are defined using 'CIRCE' (<https://github.com/ohdsi/circe-be>) or SQL compatible with 'SqlRender' (<https://github.com/OHDSI/SqlRender>).
Maintained by Anthony Sena. Last updated 6 months ago.
13 stars 7.91 score 165 scripts
bioc
BayesSpace:Clustering and Resolution Enhancement of Spatial Transcriptomes
Tools for clustering and enhancing the resolution of spatial gene expression experiments. BayesSpace clusters a low-dimensional representation of the gene expression matrix, incorporating a spatial prior to encourage neighboring spots to cluster together. The method can enhance the resolution of the low-dimensional representation into "sub-spots", for which features such as gene expression or cell type composition can be imputed.
Maintained by Matt Stone. Last updated 5 months ago.
software clustering transcriptomics geneexpression singlecell immunooncology dataimport openblas cpp openmp
126 stars 7.90 score 278 scripts 1 dependent
darwin-eu
visOmopResults:Graphs and Tables for OMOP Results
Provides methods to transform omop_result objects into formatted tables and figures, facilitating the visualisation of study results working with the Observational Medical Outcomes Partnership (OMOP) Common Data Model.
Maintained by Núria Mercadé-Besora. Last updated 11 days ago.
7.89 score 53 scripts 3 dependents
bioc
beadarray:Quality assessment and low-level analysis for Illumina BeadArray data
The package is able to read bead-level data (raw TIFFs and text files) output by BeadScan as well as bead-summary data from BeadStudio. Methods for quality assessment and low-level analysis are provided.
Maintained by Mark Dunning. Last updated 5 months ago.
microarray onechannel qualitycontrol preprocessing
7.88 score 70 scripts 4 dependents
bioc
biodb:biodb, a library and a development framework for connecting to chemical and biological databases
The biodb package provides access to standard remote chemical and biological databases (ChEBI, KEGG, HMDB, ...), as well as to in-house local database files (CSV, SQLite), with easy retrieval of entries, access to web services, search of compounds by mass and/or name, and mass spectra matching for LCMS and MSMS. Its architecture as a development framework facilitates the development of new database connectors for local projects or inside separate published packages.
Maintained by Pierrick Roger. Last updated 5 months ago.
software infrastructure dataimport kegg biology cheminformatics chemistry databases cpp
11 stars 7.85 score 24 scripts 6 dependents
nmecsys
BETS:Brazilian Economic Time Series
It provides access to and information about the most important Brazilian economic time series - from the Getulio Vargas Foundation <http://portal.fgv.br/en>, the Central Bank of Brazil <http://www.bcb.gov.br> and the Brazilian Institute of Geography and Statistics <http://www.ibge.gov.br>. It also presents tools for managing, analysing (e.g. generating dynamic reports with a complete analysis of a series) and exporting these time series.
Maintained by Talitha Speranza. Last updated 4 years ago.
38 stars 7.82 score 108 scripts
bioc
Rcpi:Molecular Informatics Toolkit for Compound-Protein Interaction in Drug Discovery
A molecular informatics toolkit with an integration of bioinformatics and chemoinformatics tools for drug discovery.
Maintained by Nan Xiao. Last updated 5 months ago.
software dataimport datarepresentation featureextraction cheminformatics biomedicalinformatics proteomics go systemsbiology bioconductor bioinformatics drug-discovery feature-extraction fingerprint molecular-descriptors protein-sequences
37 stars 7.81 score 29 scripts
bioc
SRAdb:A compilation of metadata from NCBI SRA and tools
The Sequence Read Archive (SRA) is the largest public repository of sequencing data from the next generation of sequencing platforms including Roche 454 GS System, Illumina Genome Analyzer, Applied Biosystems SOLiD System, Helicos Heliscope, and others. However, finding data of interest can be challenging using current tools. SRAdb is an attempt to make access to the metadata associated with submission, study, sample, experiment and run much more feasible. This is accomplished by parsing all the NCBI SRA metadata into a SQLite database that can be stored and queried locally. Full-text search in the package makes querying metadata very flexible and powerful. fastq and sra files can be downloaded for doing alignment locally. Besides the ftp protocol, SRAdb has functions supporting the fasp protocol (ascp from Aspera Connect) for faster downloading of large data files over long distances. The SQLite database is updated regularly as new data is added to SRA and can be downloaded at will for the most up-to-date metadata.
Maintained by Jack Zhu. Last updated 4 months ago.
infrastructure sequencing dataimport
2 stars 7.81 score 200 scripts
bioc
debrowser:Interactive Differential Expression Analysis Browser
Bioinformatics platform containing interactive plots and tables for differential gene and region expression studies. Allows visualizing expression data much more deeply in an interactive and faster way. By changing the parameters, users can easily explore different parts of the data in ways that were not practical before. Manually creating and looking at these plots takes time. With DEBrowser users can prepare plots without writing any code. Differential expression, PCA and clustering analysis are made on site and the results are shown in various plots such as scatter, bar, box, volcano, MA plots and heatmaps.
Maintained by Alper Kucukural. Last updated 5 months ago.
sequencing chipseq rnaseq differentialexpression geneexpression clustering immunooncology
61 stars 7.80 score 65 scripts
bioc
PhyloProfile:PhyloProfile
PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights like the gene age estimation or core gene identification.
Maintained by Vinh Tran. Last updated 10 days ago.
software visualization datarepresentation multiplecomparison functionalprediction dimensionreduction bioinformatics heatmap interactive-visualizations orthologs phylogenetic-profile shiny
33 stars 7.79 score 10 scripts