R-universe search: enrichment

egeulgen

pathfindR:Enrichment Analysis Utilizing Active Subnetworks

Enrichment analysis enables researchers to uncover mechanisms underlying a phenotype. However, conventional methods for enrichment analysis do not take into account protein-protein interaction information, resulting in incomplete conclusions. 'pathfindR' is a tool for enrichment analysis utilizing active subnetworks. The main function identifies active subnetworks in a protein-protein interaction network using a user-provided list of genes and associated p values. It then performs enrichment analyses on the identified subnetworks, identifying enriched terms (i.e. pathways or, more broadly, gene sets) that possibly underlie the phenotype of interest. 'pathfindR' also offers functionalities to cluster the enriched terms and identify representative terms in each cluster, to score the enriched terms per sample and to visualize analysis results. The enrichment, clustering and other methods implemented in 'pathfindR' are described in detail in Ulgen E, Ozisik O, Sezerman OU. 2019. 'pathfindR': An R Package for Comprehensive Identification of Enriched Pathways in Omics Data Through Active Subnetworks. Front. Genet. <doi:10.3389/fgene.2019.00858>.

Maintained by Ege Ulgen. Last updated 26 days ago.

active-subnetworks enrichment pathway pathway-enrichment-analysis subnetwork

75.7 match 186 stars 10.13 score 138 scripts

bioc

DOSE:Disease Ontology Semantic and Enrichment analysis

This package implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively for measuring semantic similarities among DO terms and gene products. Enrichment analyses including hypergeometric model and gene set enrichment analysis are also implemented for discovering disease associations of high-throughput biological data.

Maintained by Guangchuang Yu. Last updated 5 months ago.

annotation visualization multiplecomparison genesetenrichment pathways software disease-ontology enrichment-analysis semantic-similarity

29.2 match 119 stars 14.97 score 2.0k scripts 61 dependents

bioc

clusterProfiler:A universal enrichment tool for interpreting omics data

This package supports functional characteristics of both coding and non-coding genomics data for thousands of species with up-to-date gene annotation. It provides a univeral interface for gene functional annotation from a variety of sources and thus can be applied in diverse scenarios. It provides a tidy interface to access, manipulate, and visualize enrichment results to help users achieve efficient data interpretation. Datasets obtained from multiple treatments and time points can be analyzed and compared in a single run, easily revealing functional consensus and differences among distinct conditions.

Maintained by Guangchuang Yu. Last updated 4 months ago.

annotation clustering genesetenrichment go kegg multiplecomparison pathways reactome visualization enrichment-analysis gsea

24.5 match 1.1k stars 17.03 score 11k scripts 48 dependents

moosa-r

rbioapi:User-Friendly R Interface to Biologic Web Services' API

Currently fully supports Enrichr, JASPAR, miEAA, PANTHER, Reactome, STRING, and UniProt! The goal of rbioapi is to provide a user-friendly and consistent interface to biological databases and services. In a way that insulates the user from the technicalities of using web services API and creates a unified and easy-to-use interface to biological and medical web services. This is an ongoing project; New databases and services will be added periodically. Feel free to suggest any databases or services you often use.

Maintained by Moosa Rezwani. Last updated 1 months ago.

api-client bioinformatics biology enrichment enrichment-analysis enrichr jaspar mieaa over-representation-analysis panther reactome string uniprot

50.6 match 20 stars 7.60 score 55 scripts

bioc

fgsea:Fast Gene Set Enrichment Analysis

The package implements an algorithm for fast gene set enrichment analysis. Using the fast algorithm allows to make more permutations and get more fine grained p-values, which allows to use accurate stantard approaches to multiple hypothesis correction.

Maintained by Alexey Sergushichev. Last updated 3 months ago.

geneexpression differentialexpression genesetenrichment pathways cpp

22.3 match 387 stars 16.25 score 3.9k scripts 101 dependents

eltebioinformatics

mulea:Enrichment Analysis Using Multiple Ontologies and False Discovery Rate

Background - Traditional gene set enrichment analyses are typically limited to a few ontologies and do not account for the interdependence of gene sets or terms, resulting in overcorrected p-values. To address these challenges, we introduce mulea, an R package offering comprehensive overrepresentation and functional enrichment analysis. Results - mulea employs a progressive empirical false discovery rate (eFDR) method, specifically designed for interconnected biological data, to accurately identify significant terms within diverse ontologies. mulea expands beyond traditional tools by incorporating a wide range of ontologies, encompassing Gene Ontology, pathways, regulatory elements, genomic locations, and protein domains. This flexibility enables researchers to tailor enrichment analysis to their specific questions, such as identifying enriched transcriptional regulators in gene expression data or overrepresented protein domains in protein sets. To facilitate seamless analysis, mulea provides gene sets (in standardised GMT format) for 27 model organisms, covering 22 ontology types from 16 databases and various identifiers resulting in almost 900 files. Additionally, the muleaData ExperimentData Bioconductor package simplifies access to these pre-defined ontologies. Finally, mulea's architecture allows for easy integration of user-defined ontologies, or GMT files from external sources (e.g., MSigDB or Enrichr), expanding its applicability across diverse research areas. Conclusions - mulea is distributed as a CRAN R package. It offers researchers a powerful and flexible toolkit for functional enrichment analysis, addressing limitations of traditional tools with its progressive eFDR and by supporting a variety of ontologies. Overall, mulea fosters the exploration of diverse biological questions across various model organisms.

Maintained by Tamas Stirling. Last updated 3 months ago.

annotation differentialexpression geneexpression genesetenrichment go graphandnetwork multiplecomparison pathways reactome software transcription visualization enrichment enrichment-analysis functional-enrichment-analysis gene-set-enrichment ontologies transcriptomics cpp

40.7 match 28 stars 7.36 score 34 scripts

bioc

enrichViewNet:From functional enrichment results to biological networks

This package enables the visualization of functional enrichment results as network graphs. First the package enables the visualization of enrichment results, in a format corresponding to the one generated by gprofiler2, as a customizable Cytoscape network. In those networks, both gene datasets (GO terms/pathways/protein complexes) and genes associated to the datasets are represented as nodes. While the edges connect each gene to its dataset(s). The package also provides the option to create enrichment maps from functional enrichment results. Enrichment maps enable the visualization of enriched terms into a network with edges connecting overlapping genes.

Maintained by Astrid Deschênes. Last updated 5 months ago.

biologicalquestion software network networkenrichment go cystocape functional-enrichment

53.3 match 5 stars 5.54 score 6 scripts

ikosmidis

enrichwith:Methods to Enrich R Objects with Extra Components

Provides the "enrich" method to enrich list-like R objects with new, relevant components. The current version has methods for enriching objects of class 'family', 'link-glm', 'lm', 'glm' and 'betareg'. The resulting objects preserve their class, so all methods associated with them still apply. The package also provides the 'enriched_glm' function that has the same interface as 'glm' but results in objects of class 'enriched_glm'. In addition to the usual components in a `glm` object, 'enriched_glm' objects carry an object-specific simulate method and functions to compute the scores, the observed and expected information matrix, the first-order bias, as well as model densities, probabilities, and quantiles at arbitrary parameter values. The package can also be used to produce customizable source code templates for the structured implementation of methods to compute new components and enrich arbitrary objects.

Maintained by Ioannis Kosmidis. Last updated 5 years ago.

infrastructure

39.7 match 6 stars 7.35 score 16 scripts 12 dependents

bioc

NoRCE:NoRCE: Noncoding RNA Sets Cis Annotation and Enrichment

While some non-coding RNAs (ncRNAs) are assigned critical regulatory roles, most remain functionally uncharacterized. This presents a challenge whenever an interesting set of ncRNAs needs to be analyzed in a functional context. Transcripts located close-by on the genome are often regulated together. This genomic proximity on the sequence can hint to a functional association. We present a tool, NoRCE, that performs cis enrichment analysis for a given set of ncRNAs. Enrichment is carried out using the functional annotations of the coding genes located proximal to the input ncRNAs. Other biologically relevant information such as topologically associating domain (TAD) boundaries, co-expression patterns, and miRNA target prediction information can be incorporated to conduct a richer enrichment analysis. To this end, NoRCE includes several relevant datasets as part of its data repository, including cell-line specific TAD boundaries, functional gene sets, and expression data for coding & ncRNAs specific to cancer. Additionally, the users can utilize custom data files in their investigation. Enrichment results can be retrieved in a tabular format or visualized in several different ways. NoRCE is currently available for the following species: human, mouse, rat, zebrafish, fruit fly, worm, and yeast.

Maintained by Gulden Olgun. Last updated 5 months ago.

biologicalquestion differentialexpression genomeannotation genesetenrichment genetarget genomeassembly go

55.5 match 1 stars 4.60 score 6 scripts

bioc

GeneTonic:Enjoy Analyzing And Integrating The Results From Differential Expression Analysis And Functional Enrichment Analysis

This package provides functionality to combine the existing pieces of the transcriptome data and results, making it easier to generate insightful observations and hypothesis. Its usage is made easy with a Shiny application, combining the benefits of interactivity and reproducibility e.g. by capturing the features and gene sets of interest highlighted during the live session, and creating an HTML report as an artifact where text, code, and output coexist. Using the GeneTonicList as a standardized container for all the required components, it is possible to simplify the generation of multiple visualizations and summaries.

Maintained by Federico Marini. Last updated 2 months ago.

gui geneexpression software transcription transcriptomics visualization differentialexpression pathways reportwriting genesetenrichment annotation go shinyapps bioconductor bioconductor-package data-exploration data-visualization functional-enrichment-analysis gene-expression pathway-analysis reproducible-research rna-seq-analysis rna-seq-data shiny transcriptome user-friendly

29.0 match 77 stars 8.28 score 37 scripts 1 dependents

bioc

GSVA:Gene Set Variation Analysis for Microarray and RNA-Seq Data

Gene Set Variation Analysis (GSVA) is a non-parametric, unsupervised method for estimating variation of gene set enrichment through the samples of a expression data set. GSVA performs a change in coordinate systems, transforming the data from a gene by sample matrix to a gene-set by sample matrix, thereby allowing the evaluation of pathway enrichment for each sample. This new matrix of GSVA enrichment scores facilitates applying standard analytical methods like functional enrichment, survival analysis, clustering, CNV-pathway analysis or cross-tissue pathway analysis, in a pathway-centric manner.

Maintained by Robert Castelo. Last updated 3 days ago.

functionalgenomics microarray rnaseq pathways genesetenrichment gene-set-enrichment genomics pathway-enrichment-analysis

16.2 match 210 stars 14.72 score 1.6k scripts 19 dependents

bioc

enrichplot:Visualization of Functional Enrichment Result

The 'enrichplot' package implements several visualization methods for interpreting functional enrichment results obtained from ORA or GSEA analysis. It is mainly designed to work with the 'clusterProfiler' package suite. All the visualization methods are developed based on 'ggplot2' graphics.

Maintained by Guangchuang Yu. Last updated 2 months ago.

annotation genesetenrichment go kegg pathways software visualization enrichment-analysis pathway-analysis

14.3 match 239 stars 15.71 score 3.1k scripts 58 dependents

bioc

EnrichmentBrowser:Seamless navigation through combined results of set-based and network-based enrichment analysis

The EnrichmentBrowser package implements essential functionality for the enrichment analysis of gene expression data. The analysis combines the advantages of set-based and network-based enrichment analysis in order to derive high-confidence gene sets and biological pathways that are differentially regulated in the expression data under investigation. Besides, the package facilitates the visualization and exploration of such sets and pathways.

Maintained by Ludwig Geistlinger. Last updated 5 months ago.

immunooncology microarray rnaseq geneexpression differentialexpression pathways graphandnetwork network genesetenrichment networkenrichment visualization reportwriting

23.6 match 20 stars 9.37 score 164 scripts 3 dependents

bioc

EWCE:Expression Weighted Celltype Enrichment

Used to determine which cell types are enriched within gene lists. The package provides tools for testing enrichments within simple gene lists (such as human disease associated genes) and those resulting from differential expression studies. The package does not depend upon any particular Single Cell Transcriptome dataset and user defined datasets can be loaded in and used in the analyses.

Maintained by Alan Murphy. Last updated 29 days ago.

geneexpression transcription differentialexpression genesetenrichment genetics microarray mrnamicroarray onechannel rnaseq biomedicalinformatics proteomics visualization functionalgenomics singlecell deconvolution single-cell single-cell-rna-seq transcriptomics

23.2 match 55 stars 9.28 score 99 scripts

bioc

BgeeDB:Annotation and gene expression data retrieval from Bgee database. TopAnat, an anatomical entities Enrichment Analysis tool for UBERON ontology

A package for the annotation and gene expression data download from Bgee database, and TopAnat analysis: GO-like enrichment of anatomical terms, mapped to genes by expression patterns.

Maintained by Julien Wollbrett. Last updated 5 months ago.

software dataimport sequencing geneexpression microarray go genesetenrichment bioinformatics enrichment-analysis rna-seq scrna-seq single-cell

25.1 match 15 stars 8.46 score 19 scripts 1 dependents

bioc

maftools:Summarize, Analyze and Visualize MAF Files

Analyze and visualize Mutation Annotation Format (MAF) files from large scale sequencing studies. This package provides various functions to perform most commonly used analyses in cancer genomics and to create feature rich customizable visualzations with minimal effort.

Maintained by Anand Mayakonda. Last updated 5 months ago.

datarepresentation dnaseq visualization drivermutation variantannotation featureextraction classification somaticmutation sequencing functionalgenomics survival bioinformatics cancer-genome-atlas cancer-genomics genomics maf-files tcga curl bzip2 xz-utils zlib

14.3 match 459 stars 14.63 score 948 scripts 18 dependents

bioc

coRdon:Codon Usage Analysis and Prediction of Gene Expressivity

Tool for analysis of codon usage in various unannotated or KEGG/COG annotated DNA sequences. Calculates different measures of CU bias and CU-based predictors of gene expressivity, and performs gene set enrichment analysis for annotated sequences. Implements several methods for visualization of CU and enrichment analysis results.

Maintained by Anamaria Elek. Last updated 5 months ago.

software metagenomics geneexpression genesetenrichment geneprediction visualization kegg pathways genetics cellbiology biomedicalinformatics immunooncology

25.4 match 20 stars 7.71 score 48 scripts 1 dependents

bioc

cogena:co-expressed gene-set enrichment analysis

cogena is a workflow for co-expressed gene-set enrichment analysis. It aims to discovery smaller scale, but highly correlated cellular events that may be of great biological relevance. A novel pipeline for drug discovery and drug repositioning based on the cogena workflow is proposed. Particularly, candidate drugs can be predicted based on the gene expression of disease-related data, or other similar drugs can be identified based on the gene expression of drug-related data. Moreover, the drug mode of action can be disclosed by the associated pathway analysis. In summary, cogena is a flexible workflow for various gene set enrichment analysis for co-expressed genes, with a focus on pathway/GO analysis and drug repositioning.

Maintained by Zhilong Jia. Last updated 5 months ago.

clustering genesetenrichment geneexpression visualization pathways kegg go microarray sequencing systemsbiology datarepresentation dataimport bioconductor bioinformatics

26.0 match 12 stars 7.36 score 32 scripts

bioc

pathlinkR:Analyze and interpret RNA-Seq results

pathlinkR is an R package designed to facilitate analysis of RNA-Seq results. Specifically, our aim with pathlinkR was to provide a number of tools which take a list of DE genes and perform different analyses on them, aiding with the interpretation of results. Functions are included to perform pathway enrichment, with muliplte databases supported, and tools for visualizing these results. Genes can also be used to create and plot protein-protein interaction networks, all from inside of R.

Maintained by Travis Blimkie. Last updated 3 months ago.

genesetenrichment network pathways reactome rnaseq networkenrichment bioinformatics networks pathway-enrichment-analysis visualization

28.6 match 26 stars 6.62 score 2 scripts

bioc

fedup:Fisher's Test for Enrichment and Depletion of User-Defined Pathways

An R package that tests for enrichment and depletion of user-defined pathways using a Fisher's exact test. The method is designed for versatile pathway annotation formats (eg. gmt, txt, xlsx) to allow the user to run pathway analysis on custom annotations. This package is also integrated with Cytoscape to provide network-based pathway visualization that enhances the interpretability of the results.

Maintained by Catherine Ross. Last updated 5 months ago.

genesetenrichment pathways networkenrichment network bioconductor enrichment

34.4 match 7 stars 5.32 score 10 scripts

bioc

EnrichedHeatmap:Making Enriched Heatmaps

Enriched heatmap is a special type of heatmap which visualizes the enrichment of genomic signals on specific target regions. Here we implement enriched heatmap by ComplexHeatmap package. Since this type of heatmap is just a normal heatmap but with some special settings, with the functionality of ComplexHeatmap, it would be much easier to customize the heatmap as well as concatenating to a list of heatmaps to show correspondance between different data sources.

Maintained by Zuguang Gu. Last updated 5 months ago.

software visualization sequencing genomeannotation coverage cpp

16.7 match 190 stars 10.87 score 330 scripts 1 dependents

bioc

monaLisa:Binned Motif Enrichment Analysis and Visualization

Useful functions to work with sequence motifs in the analysis of genomics data. These include methods to annotate genomic regions or sequences with predicted motif hits and to identify motifs that drive observed changes in accessibility or expression. Functions to produce informative visualizations of the obtained results are also provided.

Maintained by Michael Stadler. Last updated 2 months ago.

motifannotation visualization featureextraction epigenetics

22.0 match 40 stars 8.06 score 53 scripts

overton-group

eHDPrep:Quality Control and Semantic Enrichment of Datasets

A tool for the preparation and enrichment of health datasets for analysis (Toner et al. (2023) <doi:10.1093/gigascience/giad030>). Provides functionality for assessing data quality and for improving the reliability and machine interpretability of a dataset. 'eHDPrep' also enables semantic enrichment of a dataset where metavariables are discovered from the relationships between input variables determined from user-provided ontologies.

Maintained by Ian Overton. Last updated 2 years ago.

data-quality health-informatics semantic-enrichment

33.8 match 8 stars 4.90 score 10 scripts

bioc

signatureSearch:Environment for Gene Expression Searching Combined with Functional Enrichment Analysis

This package implements algorithms and data structures for performing gene expression signature (GES) searches, and subsequently interpreting the results functionally with specialized enrichment methods.

Maintained by Brendan Gongol. Last updated 5 months ago.

software geneexpression go kegg networkenrichment sequencing coverage differentialexpression cpp

22.6 match 17 stars 7.18 score 74 scripts 1 dependents

bioc

escape:Easy single cell analysis platform for enrichment

A bridging R package to facilitate gene set enrichment analysis (GSEA) in the context of single-cell RNA sequencing. Using raw count information, Seurat objects, or SingleCellExperiment format, users can perform and visualize ssGSEA, GSVA, AUCell, and UCell-based enrichment calculations across individual cells.

Maintained by Nick Borcherding. Last updated 2 months ago.

software singlecell classification annotation genesetenrichment sequencing genesignaling pathways

27.2 match 5.92 score 138 scripts

bioc

RcisTarget:RcisTarget Identify transcription factor binding motifs enriched on a list of genes or genomic regions

RcisTarget identifies transcription factor binding motifs (TFBS) over-represented on a gene list. In a first step, RcisTarget selects DNA motifs that are significantly over-represented in the surroundings of the transcription start site (TSS) of the genes in the gene-set. This is achieved by using a database that contains genome-wide cross-species rankings for each motif. The motifs that are then annotated to TFs and those that have a high Normalized Enrichment Score (NES) are retained. Finally, for each motif and gene-set, RcisTarget predicts the candidate target genes (i.e. genes in the gene-set that are ranked above the leading edge).

Maintained by Gert Hulselmans. Last updated 5 months ago.

generegulation motifannotation transcriptomics transcription genesetenrichment genetarget

16.4 match 37 stars 9.47 score 191 scripts

bioc

hypeR:An R Package For Geneset Enrichment Workflows

An R Package for Geneset Enrichment Workflows.

Maintained by Anthony Federico. Last updated 5 months ago.

genesetenrichment annotation pathways bioinformatics computational-biology geneset-enrichment-analysis

17.8 match 76 stars 8.22 score 145 scripts

bioc

ViSEAGO:ViSEAGO: a Bioconductor package for clustering biological functions using Gene Ontology and semantic similarity

The main objective of ViSEAGO package is to carry out a data mining of biological functions and establish links between genes involved in the study. We developed ViSEAGO in R to facilitate functional Gene Ontology (GO) analysis of complex experimental design with multiple comparisons of interest. It allows to study large-scale datasets together and visualize GO profiles to capture biological knowledge. The acronym stands for three major concepts of the analysis: Visualization, Semantic similarity and Enrichment Analysis of Gene Ontology. It provides access to the last current GO annotations, which are retrieved from one of NCBI EntrezGene, Ensembl or Uniprot databases for several species. Using available R packages and novel developments, ViSEAGO extends classical functional GO analysis to focus on functional coherence by aggregating closely related biological themes while studying multiple datasets at once. It provides both a synthetic and detailed view using interactive functionalities respecting the GO graph structure and ensuring functional coherence supplied by semantic similarity. ViSEAGO has been successfully applied on several datasets from different species with a variety of biological questions. Results can be easily shared between bioinformaticians and biologists, enhancing reporting capabilities while maintaining reproducibility.

Maintained by Aurelien Brionne. Last updated 2 months ago.

software annotation go genesetenrichment multiplecomparison clustering visualization

21.7 match 6.64 score 22 scripts

bioc

chipenrich:Gene Set Enrichment For ChIP-seq Peak Data

ChIP-Enrich and Poly-Enrich perform gene set enrichment testing using peaks called from a ChIP-seq experiment. The method empirically corrects for confounding factors such as the length of genes, and the mappability of the sequence surrounding genes.

Maintained by Kai Wang. Last updated 4 days ago.

immunooncology chipseq epigenetics functionalgenomics genesetenrichment histonemodification regression

28.9 match 4.94 score 29 scripts

asa12138

ReporterScore:Generalized Reporter Score-Based Enrichment Analysis for Omics Data

Inspired by the classic 'RSA', we developed the improved 'Generalized Reporter Score-based Analysis (GRSA)' method, implemented in the R package 'ReporterScore', along with comprehensive visualization methods and pathway databases. 'GRSA' is a threshold-free method that works well with all types of biomedical features, such as genes, chemical compounds, and microbial species. Importantly, the 'GRSA' supports multi-group and longitudinal experimental designs, because of the included multi-group-compatible statistical methods.

Maintained by Chen Peng. Last updated 2 months ago.

20.3 match 67 stars 6.79 score 13 scripts

reimandlab

ActivePathways:Integrative Pathway Enrichment Analysis of Multivariate Omics Data

Framework for analysing multiple omics datasets in the context of molecular pathways, biological processes and other types of gene sets. The package uses p-value merging to combine gene- or protein-level signals, followed by ranked hypergeometric tests to determine enriched pathways and processes. Genes can be integrated using directional constraints that reflect how the input datasets are expected interact with one another. This approach allows researchers to interpret a series of omics datasets in the context of known biology and gene function, and discover associations that are only apparent when several datasets are combined. The recent version of the package is part of the following publication: Directional integration and pathway enrichment analysis for multi-omics data. Slobodyanyuk M^, Bahcheli AT^, Klein ZP, Bayati M, Strug LJ, Reimand J. Nature Communications (2024) <doi:10.1038/s41467-024-49986-4>.

Maintained by Juri Reimand. Last updated 8 months ago.

15.9 match 107 stars 8.61 score 35 scripts 2 dependents

rpact-com

rpact:Confirmatory Adaptive Clinical Trial Design and Analysis

Design and analysis of confirmatory adaptive clinical trials with continuous, binary, and survival endpoints according to the methods described in the monograph by Wassmer and Brannath (2016) <doi:10.1007/978-3-319-32562-0>. This includes classical group sequential as well as multi-stage adaptive hypotheses tests that are based on the combination testing principle.

Maintained by Friedrich Pahlke. Last updated 9 days ago.

adaptive-design analysis clinical-trials count-data group-sequential-designs power-calculation sample-size-calculation simulation validated fortran cpp

17.1 match 25 stars 7.98 score 110 scripts 1 dependents

bioc

mosdef:MOSt frequently used and useful Differential Expression Functions

This package provides functionality to run a number of tasks in the differential expression analysis workflow. This encompasses the most widely used steps, from running various enrichment analysis tools with a unified interface to creating plots and beautifying table components linking to external websites and databases. This streamlines the generation of comprehensive analysis reports.

Maintained by Federico Marini. Last updated 3 months ago.

geneexpression software transcription transcriptomics differentialexpression visualization reportwriting genesetenrichment go

21.8 match 6.12 score 4 dependents

bioc

ReactomePA:Reactome Pathway Analysis

This package provides functions for pathway analysis based on REACTOME pathway database. It implements enrichment analysis, gene set enrichment analysis and several functions for visualization. This package is not affiliated with the Reactome team.

Maintained by Guangchuang Yu. Last updated 5 months ago.

pathways visualization annotation multiplecomparison genesetenrichment reactome enrichment-analysis reactome-pathway-analysis reactomepa

10.9 match 40 stars 12.25 score 1.5k scripts 7 dependents

bioc

ChIPpeakAnno:Batch annotation of the peaks identified from either ChIP-seq, ChIP-chip experiments, or any experiments that result in large number of genomic interval data

The package encompasses a range of functions for identifying the closest gene, exon, miRNA, or custom features—such as highly conserved elements and user-supplied transcription factor binding sites. Additionally, users can retrieve sequences around the peaks and obtain enriched Gene Ontology (GO) or Pathway terms. In version 2.0.5 and beyond, new functionalities have been introduced. These include features for identifying peaks associated with bi-directional promoters along with summary statistics (peaksNearBDP), summarizing motif occurrences in peaks (summarizePatternInPeaks), and associating additional identifiers with annotated peaks or enrichedGO (addGeneIDs). The package integrates with various other packages such as biomaRt, IRanges, Biostrings, BSgenome, GO.db, multtest, and stat to enhance its analytical capabilities.

Maintained by Jianhong Ou. Last updated 2 months ago.

annotation chipseq chipchip

15.2 match 8.75 score 584 scripts 6 dependents

bioc

knowYourCG:Functional analysis of DNA methylome datasets

KnowYourCG (KYCG) is a supervised learning framework designed for the functional analysis of DNA methylation data. Unlike existing tools that focus on genes or genomic intervals, KnowYourCG directly targets CpG dinucleotides, featuring automated supervised screenings of diverse biological and technical influences, including sequence motifs, transcription factor binding, histone modifications, replication timing, cell-type-specific methylation, and trait-epigenome associations. KnowYourCG addresses the challenges of data sparsity in various methylation datasets, including low-pass Nanopore sequencing, single-cell DNA methylomes, 5-hydroxymethylation profiles, spatial DNA methylation maps, and array-based datasets for epigenome-wide association studies and epigenetic clocks.

Maintained by Goldberg David. Last updated 2 months ago.

epigenetics dnamethylation sequencing singlecell spatial methylationarray zlib

21.7 match 2 stars 6.10 score 4 scripts

bioc

GRaNIE:GRaNIE: Reconstruction cell type specific gene regulatory networks including enhancers using single-cell or bulk chromatin accessibility and RNA-seq data

Genetic variants associated with diseases often affect non-coding regions, thus likely having a regulatory role. To understand the effects of genetic variants in these regulatory regions, identifying genes that are modulated by specific regulatory elements (REs) is crucial. The effect of gene regulatory elements, such as enhancers, is often cell-type specific, likely because the combinations of transcription factors (TFs) that are regulating a given enhancer have cell-type specific activity. This TF activity can be quantified with existing tools such as diffTF and captures differences in binding of a TF in open chromatin regions. Collectively, this forms a gene regulatory network (GRN) with cell-type and data-specific TF-RE and RE-gene links. Here, we reconstruct such a GRN using single-cell or bulk RNAseq and open chromatin (e.g., using ATACseq or ChIPseq for open chromatin marks) and optionally (Capture) Hi-C data. Our network contains different types of links, connecting TFs to regulatory elements, the latter of which is connected to genes in the vicinity or within the same chromatin domain (TAD). We use a statistical framework to assign empirical FDRs and weights to all links using a permutation-based approach.

Maintained by Christian Arnold. Last updated 5 months ago.

software geneexpression generegulation networkinference genesetenrichment biomedicalinformatics genetics transcriptomics atacseq rnaseq graphandnetwork regression transcription chipseq

24.3 match 5.40 score 24 scripts

bioc

universalmotif:Import, Modify, and Export Motifs with R

Allows for importing most common motif types into R for use by functions provided by other Bioconductor motif-related packages. Motifs can be exported into most major motif formats from various classes as defined by other Bioconductor packages. A suite of motif and sequence manipulation and analysis functions are included, including enrichment, comparison, P-value calculation, shuffling, trimming, higher-order motifs, and others.

Maintained by Benjamin Jean-Marie Tremblay. Last updated 4 months ago.

motifannotation motifdiscovery dataimport generegulation motif-analysis motif-enrichment-analysis sequence-logo cpp

11.8 match 28 stars 11.04 score 342 scripts 12 dependents

cran

gprofiler2:Interface to the 'g:Profiler' Toolset

A toolset for functional enrichment analysis and visualization, gene/protein/SNP identifier conversion and mapping orthologous genes across species via 'g:Profiler' (<https://biit.cs.ut.ee/gprofiler/>). The main tools are: (1) 'g:GOSt' - functional enrichment analysis and visualization of gene lists; (2) 'g:Convert' - gene/protein/transcript identifier conversion across various namespaces; (3) 'g:Orth' - orthology search across species; (4) 'g:SNPense' - mapping SNP rs identifiers to chromosome positions, genes and variant effects. This package is an R interface corresponding to the 2019 update of 'g:Profiler' and provides access to 'g:Profiler' for versions 'e94_eg41_p11' and higher. See the package 'gProfileR' for accessing older versions from the 'g:Profiler' toolset.

Maintained by Liis Kolberg. Last updated 1 years ago.

16.2 match 4 stars 7.97 score 1.5k scripts 16 dependents

bioc

LOLA:Locus overlap analysis for enrichment of genomic ranges

Provides functions for testing overlap of sets of genomic regions with public and custom region set (genomic ranges) databases. This makes it possible to do automated enrichment analysis for genomic region sets, thus facilitating interpretation of functional genomics and epigenomics data.

Maintained by Nathan Sheffield. Last updated 5 months ago.

genesetenrichment generegulation genomeannotation systemsbiology functionalgenomics chipseq methylseq sequencing

13.4 match 76 stars 9.34 score 160 scripts

ganglilab

genekitr:Gene Analysis Toolkit

Provides features for searching, converting, analyzing, plotting, and exporting data effortlessly by inputting feature IDs. Enables easy retrieval of feature information, conversion of ID types, gene enrichment analysis, publication-level figures, group interaction plotting, and result export in one Excel file for seamless sharing and communication.

Maintained by Yunze Liu. Last updated 2 months ago.

enrichment-analysis gene id-converter plotting

20.7 match 56 stars 6.00 score 24 scripts 1 dependents

bioc

ELMER:Inferring Regulatory Element Landscapes and Transcription Factor Networks Using Cancer Methylomes

ELMER is designed to use DNA methylation and gene expression from a large number of samples to infere regulatory element landscape and transcription factor network in primary tissue.

Maintained by Tiago Chedraoui Silva. Last updated 5 months ago.

dnamethylation geneexpression motifannotation software generegulation transcription network

16.6 match 7.42 score 176 scripts

bioc

RTN:RTN: Reconstruction of Transcriptional regulatory Networks and analysis of regulons

A transcriptional regulatory network (TRN) consists of a collection of transcription factors (TFs) and the regulated target genes. TFs are regulators that recognize specific DNA sequences and guide the expression of the genome, either activating or repressing the expression the target genes. The set of genes controlled by the same TF forms a regulon. This package provides classes and methods for the reconstruction of TRNs and analysis of regulons.

Maintained by Mauro Castro. Last updated 5 months ago.

transcription network networkinference networkenrichment generegulation geneexpression graphandnetwork genesetenrichment geneticvariability

20.5 match 5.80 score 53 scripts 2 dependents

bioc

TissueEnrich:Tissue-specific gene enrichment analysis

The TissueEnrich package is used to calculate enrichment of tissue-specific genes in a set of input genes. For example, the user can input the most highly expressed genes from RNA-Seq data, or gene co-expression modules to determine which tissue-specific genes are enriched in those datasets. Tissue-specific genes were defined by processing RNA-Seq data from the Human Protein Atlas (HPA) (Uhlén et al. 2015), GTEx (Ardlie et al. 2015), and mouse ENCODE (Shen et al. 2012) using the algorithm from the HPA (Uhlén et al. 2015).The hypergeometric test is being used to determine if the tissue-specific genes are enriched among the input genes. Along with tissue-specific gene enrichment, the TissueEnrich package can also be used to define tissue-specific genes from expression datasets provided by the user, which can then be used to calculate tissue-specific gene enrichments.

Maintained by Ashish Jain. Last updated 5 months ago.

genesetenrichment geneexpression sequencing

19.5 match 5.93 score 57 scripts

bioc

PWMEnrich:PWM enrichment analysis

A toolkit of high-level functions for DNA motif scanning and enrichment analysis built upon Biostrings. The main functionality is PWM enrichment analysis of already known PWMs (e.g. from databases such as MotifDb), but the package also implements high-level functions for PWM scanning and visualisation. The package does not perform "de novo" motif discovery, but is instead focused on using motifs that are either experimentally derived or computationally constructed by other tools.

Maintained by Diego Diez. Last updated 5 months ago.

motifannotation sequencematching software

22.3 match 5.08 score 60 scripts

bioc

MicrobiomeProfiler:An R/shiny package for microbiome functional enrichment analysis

This is an R/shiny package to perform functional enrichment analysis for microbiome data. This package was based on clusterProfiler. Moreover, MicrobiomeProfiler support KEGG enrichment analysis, COG enrichment analysis, Microbe-Disease association enrichment analysis, Metabo-Pathway analysis.

Maintained by Guangchuang Yu. Last updated 5 months ago.

microbiome software visualization kegg

16.7 match 37 stars 6.79 score 22 scripts

bioc

rGREAT:GREAT Analysis - Functional Enrichment on Genomic Regions

GREAT (Genomic Regions Enrichment of Annotations Tool) is a type of functional enrichment analysis directly performed on genomic regions. This package implements the GREAT algorithm (the local GREAT analysis), also it supports directly interacting with the GREAT web service (the online GREAT analysis). Both analysis can be viewed by a Shiny application. rGREAT by default supports more than 600 organisms and a large number of gene set collections, as well as self-provided gene sets and organisms from users. Additionally, it implements a general method for dealing with background regions.

Maintained by Zuguang Gu. Last updated 2 days ago.

genesetenrichment go pathways software sequencing wholegenome genomeannotation coverage cpp

11.2 match 86 stars 9.96 score 320 scripts 1 dependents

bioc

meshes:MeSH Enrichment and Semantic analyses

MeSH (Medical Subject Headings) is the NLM controlled vocabulary used to manually index articles for MEDLINE/PubMed. MeSH terms were associated by Entrez Gene ID by three methods, gendoo, gene2pubmed and RBBH. This association is fundamental for enrichment and semantic analyses. meshes supports enrichment analysis (over-representation and gene set enrichment analysis) of gene list or whole expression profile. The semantic comparisons of MeSH terms provide quantitative ways to compute similarities between genes and gene groups. meshes implemented five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively and supports more than 70 species.

Maintained by Guangchuang Yu. Last updated 5 months ago.

annotation clustering multiplecomparison software enrichment-analysis medical-subject-headings semantic-similarity

14.7 match 11 stars 7.15 score 43 scripts

bioc

xCell2:A Tool for Generic Cell Type Enrichment Analysis

xCell2 provides methods for cell type enrichment analysis using cell type signatures. It includes three main functions - 1. xCell2Train for training custom references objects from bulk or single-cell RNA-seq datasets. 2. xCell2Analysis for conducting the cell type enrichment analysis using the custom reference. 3. xCell2GetLineage for identifying dependencies between different cell types using ontology.

Maintained by Almog Angel. Last updated 2 months ago.

geneexpression transcriptomics microarray rnaseq singlecell differentialexpression immunooncology genesetenrichment

16.9 match 6 stars 6.17 score 15 scripts

ccsosa

GOCompare:Comprehensive GO Terms Comparison Between Species

Supports the assessment of functional enrichment analyses obtained for several lists of genes and provides a workflow to analyze them between two species via weighted graphs. Methods are described in Sosa et al. (2023) <doi:10.1016/j.ygeno.2022.110528>.

Maintained by Chrystian Camilo Sosa. Last updated 4 months ago.

24.9 match 9 stars 4.13 score 1 scripts

bioc

beer:Bayesian Enrichment Estimation in R

BEER implements a Bayesian model for analyzing phage-immunoprecipitation sequencing (PhIP-seq) data. Given a PhIPData object, BEER returns posterior probabilities of enriched antibody responses, point estimates for the relative fold-change in comparison to negative control samples, and more. Additionally, BEER provides a convenient implementation for using edgeR to identify enriched antibody responses.

Maintained by Athena Chen. Last updated 5 months ago.

software statisticalmethod bayesian sequencing coverage jags cpp

18.0 match 10 stars 5.38 score 12 scripts

igordot

msigdbr:MSigDB Gene Sets for Multiple Organisms in a Tidy Data Format

Provides the 'Molecular Signatures Database' (MSigDB) gene sets typically used with the 'Gene Set Enrichment Analysis' (GSEA) software (Subramanian et al. 2005 <doi:10.1073/pnas.0506580102>, Liberzon et al. 2015 <doi:10.1016/j.cels.2015.12.004>, Castanza et al. 2023 <doi:10.1038/s41592-023-02014-7>) as an R data frame. The package includes the human genes as listed in MSigDB as well as the corresponding symbols and IDs for frequently studied model organisms such as mouse, rat, pig, fly, and yeast.

Maintained by Igor Dolgalev. Last updated 3 days ago.

enrichment-analysis gene-sets genomics gsea msigdb pathway-analysis pathways

8.0 match 72 stars 12.01 score 3.6k scripts 21 dependents

bioc

TCGAbiolinks:TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data

The aim of TCGAbiolinks is : i) facilitate the GDC open-access data retrieval, ii) prepare the data using the appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses and iv) to easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.

Maintained by Tiago Chedraoui Silva. Last updated 25 days ago.

dnamethylation differentialmethylation generegulation geneexpression methylationarray differentialexpression pathways network sequencing survival software bioc bioconductor gdc integrative-analysis tcga tcga-data tcgabiolinks

6.5 match 305 stars 14.45 score 1.6k scripts 6 dependents

hanjunwei-lab

MiRSEA:'MicroRNA' Set Enrichment Analysis

The tools for 'MicroRNA Set Enrichment Analysis' can identify risk pathways(or prior gene sets) regulated by microRNA set in the context of microRNA expression data. (1) This package constructs a correlation profile of microRNA and pathways by the hypergeometric statistic test. The gene sets of pathways derived from the three public databases (Kyoto Encyclopedia of Genes and Genomes ('KEGG'); 'Reactome'; 'Biocarta') and the target gene sets of microRNA are provided by four databases('TarBaseV6.0'; 'mir2Disease'; 'miRecords'; 'miRTarBase';). (2) This package can quantify the change of correlation between microRNA for each pathway(or prior gene set) based on a microRNA expression data with cases and controls. (3) This package uses the weighted Kolmogorov-Smirnov statistic to calculate an enrichment score (ES) of a microRNA set that co-regulate to a pathway , which reflects the degree to which a given pathway is associated with the specific phenotype. (4) This package can provide the visualization of the results.

Maintained by Junwei Han. Last updated 5 years ago.

statistics pathways microrna enrichment analysis

20.6 match 4.51 score 16 scripts

bioc

GOfuncR:Gene ontology enrichment using FUNC

GOfuncR performs a gene ontology enrichment analysis based on the ontology enrichment software FUNC. GO-annotations are obtained from OrganismDb or OrgDb packages ('Homo.sapiens' by default); the GO-graph is included in the package and updated regularly (01-May-2021). GOfuncR provides the standard candidate vs. background enrichment analysis using the hypergeometric test, as well as three additional tests: (i) the Wilcoxon rank-sum test that is used when genes are ranked, (ii) a binomial test that is used when genes are associated with two counts and (iii) a Chi-square or Fisher's exact test that is used in cases when genes are associated with four counts. To correct for multiple testing and interdependency of the tests, family-wise error rates are computed based on random permutations of the gene-associated variables. GOfuncR also provides tools for exploring the ontology graph and the annotations, and options to take gene-length or spatial clustering of genes into account. It is also possible to provide custom gene coordinates, annotations and ontologies.

Maintained by Steffi Grote. Last updated 5 months ago.

genesetenrichment go cpp

16.7 match 5.52 score 55 scripts

abbvie-external

OmicNavigator:Open-Source Software for 'Omic' Data Analysis and Visualization

A tool for interactive exploration of the results from 'omics' experiments to facilitate novel discoveries from high-throughput biology. The software includes R functions for the 'bioinformatician' to deposit study metadata and the outputs from statistical analyses (e.g. differential expression, enrichment). These results are then exported to an interactive JavaScript dashboard that can be interrogated on the user's local machine or deployed online to be explored by collaborators. The dashboard includes 'sortable' tables, interactive plots including network visualization, and fine-grained filtering based on statistical significance.

Maintained by John Blischak. Last updated 2 days ago.

bioinformatics genomics omics opencpu

11.8 match 34 stars 7.68 score 31 scripts

bioc

LRcell:Differential cell type change analysis using Logistic/linear Regression

The goal of LRcell is to identify specific sub-cell types that drives the changes observed in a bulk RNA-seq differential gene expression experiment. To achieve this, LRcell utilizes sets of cell marker genes acquired from single-cell RNA-sequencing (scRNA-seq) as indicators for various cell types in the tissue of interest. Next, for each cell type, using its marker genes as indicators, we apply Logistic Regression on the complete set of genes with differential expression p-values to calculate a cell-type significance p-value. Finally, these p-values are compared to predict which one(s) are likely to be responsible for the differential gene expression pattern observed in the bulk RNA-seq experiments. LRcell is inspired by the LRpath[@sartor2009lrpath] algorithm developed by Sartor et al., originally designed for pathway/gene set enrichment analysis. LRcell contains three major components: LRcell analysis, plot generation and marker gene selection. All modules in this package are written in R. This package also provides marker genes in the Prefrontal Cortex (pFC) human brain region, human PBMC and nine mouse brain regions (Frontal Cortex, Cerebellum, Globus Pallidus, Hippocampus, Entopeduncular, Posterior Cortex, Striatum, Substantia Nigra and Thalamus).

Maintained by Wenjing Ma. Last updated 5 months ago.

singlecell genesetenrichment sequencing regression geneexpression differentialexpression enrichment marker-genes

20.2 match 3 stars 4.48 score 5 scripts

bioc

SurfR:Surface Protein Prediction and Identification

Identify Surface Protein coding genes from a list of candidates. Systematically download data from GEO and TCGA or use your own data. Perform DGE on bulk RNAseq data. Perform Meta-analysis. Descriptive enrichment analysis and plots.

Maintained by Aurora Maurizio. Last updated 16 hours ago.

software sequencing rnaseq geneexpression transcription differentialexpression principalcomponent genesetenrichment pathways batcheffect functionalgenomics visualization dataimport functionalprediction geneprediction go dge enrichment-analysis metaanalysis plots proteins public-data surface surfaceome

16.6 match 3 stars 5.43 score 3 scripts

bioc

MIRit:Integrate microRNA and gene expression to decipher pathway complexity

MIRit is an R package that provides several methods for investigating the relationships between miRNAs and genes in different biological conditions. In particular, MIRit allows to explore the functions of dysregulated miRNAs, and makes it possible to identify miRNA-gene regulatory axes that control biological pathways, thus enabling the users to unveil the complexity of miRNA biology. MIRit is an all-in-one framework that aims to help researchers in all the central aspects of an integrative miRNA-mRNA analyses, from differential expression analysis to network characterization.

Maintained by Jacopo Ronchi. Last updated 18 hours ago.

software generegulation networkenrichment networkinference epigenetics functionalgenomics systemsbiology network pathways geneexpression differentialexpression mirna mirna-mrna-interaction mirna-seq mirnaseq-analysis cpp

22.6 match 4.00 score 2 scripts

bioc

IsoformSwitchAnalyzeR:Identify, Annotate and Visualize Isoform Switches with Functional Consequences from both short- and long-read RNA-seq data

Analysis of alternative splicing and isoform switches with predicted functional consequences (e.g. gain/loss of protein domains etc.) from quantification of all types of RNASeq by tools such as Kallisto, Salmon, StringTie, Cufflinks/Cuffdiff etc.

Maintained by Kristoffer Vitting-Seerup. Last updated 5 months ago.

geneexpression transcription alternativesplicing differentialexpression differentialsplicing visualization statisticalmethod transcriptomevariant biomedicalinformatics functionalgenomics systemsbiology transcriptomics rnaseq annotation functionalprediction geneprediction dataimport multiplecomparison batcheffect immunooncology

9.7 match 108 stars 9.26 score 125 scripts

bioc

loci2path:Loci2path: regulatory annotation of genomic intervals based on tissue-specific expression QTLs

loci2path performs statistics-rigorous enrichment analysis of eQTLs in genomic regions of interest. Using eQTL collections provided by the Genotype-Tissue Expression (GTEx) project and pathway collections from MSigDB.

Maintained by Tianlei Xu. Last updated 5 months ago.

functionalgenomics genetics genesetenrichment software geneexpression sequencing coverage biocarta

20.8 match 1 stars 4.30 score 2 scripts

bioc

DEP:Differential Enrichment analysis of Proteomics data

This package provides an integrated analysis workflow for robust and reproducible analysis of mass spectrometry proteomics data for differential protein expression or differential enrichment. It requires tabular input (e.g. txt files) as generated by quantitative analysis softwares of raw mass spectrometry data, such as MaxQuant or IsobarQuant. Functions are provided for data preparation, filtering, variance normalization and imputation of missing values, as well as statistical testing of differentially enriched / expressed proteins. It also includes tools to check intermediate steps in the workflow, such as normalization and missing values imputation. Finally, visualization tools are provided to explore the results, including heatmap, volcano plot and barplot representations. For scientists with limited experience in R, the package also contains wrapper functions that entail the complete analysis workflow and generate a report. Even easier to use are the interactive Shiny apps that are provided by the package.

Maintained by Arne Smits. Last updated 5 months ago.

immunooncology proteomics massspectrometry differentialexpression datarepresentation

12.6 match 7.10 score 628 scripts

ganglilab

geneset:Get Gene Sets for Gene Enrichment Analysis

Gene sets are fundamental for gene enrichment analysis. The package 'geneset' enables querying gene sets from public databases including 'GO' (Gene Ontology Consortium. (2004) <doi:10.1093/nar/gkh036>), 'KEGG' (Minoru et al. (2000) <doi:10.1093/nar/28.1.27>), 'WikiPathway' (Marvin et al. (2020) <doi:10.1093/nar/gkaa1024>), 'MsigDb' (Arthur et al. (2015) <doi:10.1016/j.cels.2015.12.004>), 'Reactome' (David et al. (2011) <doi:10.1093/nar/gkq1018>), 'MeSH' (Ish et al. (2014) <doi:10.4103/0019-5413.139827>), 'DisGeNET' (Janet et al. (2017) <doi:10.1093/nar/gkw943>), 'Disease Ontology' (Lynn et al. (2011) <doi:10.1093/nar/gkr972>), 'Network of Cancer Genes' (Dimitra et al. (2019) <doi:10.1186/s13059-018-1612-0>) and 'COVID-19' (Maxim et al. (2020) <doi:10.21203/rs.3.rs-28582/v1>). Gene sets are stored in the list object which provides data frame of 'geneset' and 'geneset_name'. The 'geneset' has two columns of term ID and gene ID. The 'geneset_name' has two columns of terms ID and term description.

Maintained by Yunze Liu. Last updated 2 years ago.

enrichment-analysis gene geneset-enrichment

18.4 match 9 stars 4.75 score 21 scripts 2 dependents

bioc

MotifPeeker:Benchmarking Epigenomic Profiling Methods Using Motif Enrichment

MotifPeeker is used to compare and analyse datasets from epigenomic profiling methods with motif enrichment as the key benchmark. The package outputs an HTML report consisting of three sections: (1. General Metrics) Overview of peaks-related general metrics for the datasets (FRiP scores, peak widths and motif-summit distances). (2. Known Motif Enrichment Analysis) Statistics for the frequency of user-provided motifs enriched in the datasets. (3. De-Novo Motif Enrichment Analysis) Statistics for the frequency of de-novo discovered motifs enriched in the datasets and compared with known motifs.

Maintained by Hiranyamaya Dash. Last updated 2 months ago.

epigenetics genetics qualitycontrol chipseq multiplecomparison functionalgenomics motifdiscovery sequencematching software alignment bioconductor bioconductor-package chip-seq epigenomics interactive-report motif-enrichment-analysis

15.9 match 2 stars 5.48 score 6 scripts

ftwkoopmans

goat:Gene Set Analysis Using the Gene Set Ordinal Association Test

Perform gene set enrichment analyses using the Gene set Ordinal Association Test (GOAT) algorithm and visualize your results. Koopmans, F. (2024) <doi:10.1038/s42003-024-06454-5>.

Maintained by Frank Koopmans. Last updated 21 days ago.

bioinformatics geneset-enrichment geneset-enrichment-analysis cpp openmp

19.8 match 10 stars 4.40 score 8 scripts

proteomicslab57357

UniprotR:Retrieving Information of Proteins from Uniprot

Connect to Uniprot <https://www.uniprot.org/> to retrieve information about proteins using their accession number such information could be name or taxonomy information, For detailed information kindly read the publication <https://www.sciencedirect.com/science/article/pii/S1874391919303859>.

Maintained by Mohamed Soudy. Last updated 2 years ago.

11.3 match 61 stars 7.65 score 89 scripts 1 dependents

january3

tmod:Feature Set Enrichment Analysis for Metabolomics and Transcriptomics

Methods and feature set definitions for feature or gene set enrichment analysis in transcriptional and metabolic profiling data. Package includes tests for enrichment based on ranked lists of features, functions for visualisation and multivariate functional analysis. See Zyla et al (2019) <doi:10.1093/bioinformatics/btz447>.

Maintained by January Weiner. Last updated 2 months ago.

11.7 match 3 stars 6.88 score 168 scripts 1 dependents

olink-proteomics

OlinkAnalyze:Facilitate Analysis of Proteomic Data from Olink

A collection of functions to facilitate analysis of proteomic data from Olink, primarily NPX data that has been exported from Olink Software. The functions also work on QUANT data from Olink by log- transforming the QUANT data. The functions are focused on reading data, facilitating data wrangling and quality control analysis, performing statistical analysis and generating figures to visualize the results of the statistical analysis. The goal of this package is to help users extract biological insights from proteomic data run on the Olink platform.

Maintained by Kathleen Nevola. Last updated 19 days ago.

olink proteomics proteomics-data-analysis

8.0 match 104 stars 9.72 score 61 scripts

enblacar

SCpubr:Generate Publication Ready Visualizations of Single Cell Transcriptomics Data

A system that provides a streamlined way of generating publication ready plots for known Single-Cell transcriptomics data in a “publication ready” format. This is, the goal is to automatically generate plots with the highest quality possible, that can be used right away or with minimal modifications for a research article.

Maintained by Enrique Blanco-Carmona. Last updated 30 days ago.

software singlecell visualization data-visualization ggplot2 publication-quality-plots seurat single-cell single-cell-genomics single-cell-rna-seq

8.7 match 178 stars 8.71 score 194 scripts

bioc

genomation:Summary, annotation and visualization of genomic data

A package for summary and annotation of genomic intervals. Users can visualize and quantify genomic intervals over pre-defined functional regions, such as promoters, exons, introns, etc. The genomic intervals represent regions with a defined chromosome position, which may be associated with a score, such as aligned reads from HT-seq experiments, TF binding sites, methylation scores, etc. The package can use any tabular genomic feature data as long as it has minimal information on the locations of genomic intervals. In addition, It can use BAM or BigWig files as input.

Maintained by Altuna Akalin. Last updated 5 months ago.

annotation sequencing visualization cpgisland cpp

6.8 match 75 stars 11.09 score 738 scripts 5 dependents

mrcieu

TwoSampleMR:Two Sample MR Functions and Interface to MRC Integrative Epidemiology Unit OpenGWAS Database

A package for performing Mendelian randomization using GWAS summary data. It uses the IEU OpenGWAS database <https://gwas.mrcieu.ac.uk/> to automatically obtain data, and a wide range of methods to run the analysis.

Maintained by Gibran Hemani. Last updated 9 days ago.

6.7 match 467 stars 11.23 score 1.7k scripts 1 dependents

bioc

edgeR:Empirical Analysis of Digital Gene Expression Data in R

Differential expression analysis of sequence count data. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models, quasi-likelihood, and gene set enrichment. Can perform differential analyses of any type of omics data that produces read counts, including RNA-seq, ChIP-seq, ATAC-seq, Bisulfite-seq, SAGE, CAGE, metabolomics, or proteomics spectral counts. RNA-seq analyses can be conducted at the gene or isoform level, and tests can be conducted for differential exon or transcript usage.

Maintained by Yunshun Chen. Last updated 4 days ago.

alternativesplicing batcheffect bayesian biomedicalinformatics cellbiology chipseq clustering coverage differentialexpression differentialmethylation differentialsplicing dnamethylation epigenetics functionalgenomics geneexpression genesetenrichment genetics immunooncology multiplecomparison normalization pathways proteomics qualitycontrol regression rnaseq sage sequencing singlecell systemsbiology timecourse transcription transcriptomics openblas

5.6 match 13.40 score 17k scripts 255 dependents

egeulgen

pathfindR.data:Data Package for 'pathfindR'

This is a data-only package, containing data needed to run the CRAN package 'pathfindR', a package for enrichment analysis utilizing active subnetworks. This package contains protein-protein interaction network data, data related to gene sets and example input/output data.

Maintained by Ege Ulgen. Last updated 11 months ago.

17.6 match 4.21 score 1 scripts 1 dependents

bioc

ChromSCape:Analysis of single-cell epigenomics datasets with a Shiny App

ChromSCape - Chromatin landscape profiling for Single Cells - is a ready-to-launch user-friendly Shiny Application for the analysis of single-cell epigenomics datasets (scChIP-seq, scATAC-seq, scCUT&Tag, ...) from aligned data to differential analysis & gene set enrichment analysis. It is highly interactive, enables users to save their analysis and covers a wide range of analytical steps: QC, preprocessing, filtering, batch correction, dimensionality reduction, vizualisation, clustering, differential analysis and gene set analysis.

Maintained by Pacome Prompsy. Last updated 5 months ago.

shinyapps software singlecell chipseq atacseq methylseq classification clustering epigenetics principalcomponent annotation batcheffect multiplecomparison normalization pathways preprocessing qualitycontrol reportwriting visualization genesetenrichment differentialpeakcalling epigenomics shiny single-cell cpp

12.6 match 14 stars 5.83 score 16 scripts

bioc

SeqGSEA:Gene Set Enrichment Analysis (GSEA) of RNA-Seq Data: integrating differential expression and splicing

The package generally provides methods for gene set enrichment analysis of high-throughput RNA-Seq data by integrating differential expression and splicing. It uses negative binomial distribution to model read count data, which accounts for sequencing biases and biological variation. Based on permutation tests, statistical significance can also be achieved regarding each gene's differential expression and splicing, respectively.

Maintained by Xi Wang. Last updated 5 months ago.

sequencing rnaseq genesetenrichment geneexpression differentialexpression differentialsplicing immunooncology

16.7 match 4.34 score 11 scripts

bioc

EnrichDO:a Global Weighted Model for Disease Ontology Enrichment Analysis

To implement disease ontology (DO) enrichment analysis, this package is designed and presents a double weighted model based on the latest annotations of the human genome with DO terms, by integrating the DO graph topology on a global scale. This package exhibits high accuracy that it can identify more specific DO terms, which alleviates the over enriched problem. The package includes various statistical models and visualization schemes for discovering the associations between genes and diseases from biological big data.

Maintained by Hongyu Fu. Last updated 4 months ago.

annotation visualization genesetenrichment software

15.0 match 4.74 score 9 scripts

junjunlab

GseaVis:Implement for 'GSEA' Enrichment Visualization

Mark your interesting genes on plot and support more parameters to handle your own gene set enrichment analysis plot.

Maintained by Jun Zhang. Last updated 2 months ago.

10.7 match 146 stars 6.60 score 54 scripts

bioc

MOFA2:Multi-Omics Factor Analysis v2

The MOFA2 package contains a collection of tools for training and analysing multi-omic factor analysis (MOFA). MOFA is a probabilistic factor model that aims to identify principal axes of variation from data sets that can comprise multiple omic layers and/or groups of samples. Additional time or space information on the samples can be incorporated using the MEFISTO framework, which is part of MOFA2. Downstream analysis functions to inspect molecular features underlying each factor, vizualisation, imputation etc are available.

Maintained by Ricard Argelaguet. Last updated 5 months ago.

dimensionreduction bayesian visualization factor-analysis mofa multi-omics

7.0 match 319 stars 10.02 score 502 scripts

pwwang

plotthis:High-Level Plotting Built Upon 'ggplot2' and Other Plotting Packages

Provides high-level API and a wide range of options to create stunning, publication-quality plots effortlessly. It is built upon 'ggplot2' and other plotting packages, and is designed to be easy to use and to work seamlessly with 'ggplot2' objects. It is particularly useful for creating complex plots with multiple layers, facets, and annotations. It also provides a set of functions to create plots for specific types of data, such as Venn diagrams, alluvial diagrams, and phylogenetic trees. The package is designed to be flexible and customizable, and to work well with the 'ggplot2' ecosystem. The API can be found at <https://pwwang.github.io/plotthis/reference/index.html>.

Maintained by Panwen Wang. Last updated 18 days ago.

ggplot2 plotting single-cell

12.8 match 34 stars 5.46 score 2 scripts

bioc

fenr:Fast functional enrichment for interactive applications

Perform fast functional enrichment on feature lists (like genes or proteins) using the hypergeometric distribution. Tailored for speed, this package is ideal for interactive platforms such as Shiny. It supports the retrieval of functional data from sources like GO, KEGG, Reactome, Bioplanet and WikiPathways. By downloading and preparing data first, it allows for rapid successive tests on various feature selections without the need for repetitive, time-consuming preparatory steps typical of other packages.

Maintained by Marek Gierlinski. Last updated 23 days ago.

functionalprediction differentialexpression genesetenrichment go kegg reactome proteomics

15.1 match 4.60 score 4 scripts

shixiangwang

sigminer:Extract, Analyze and Visualize Mutational Signatures for Genomic Variations

Genomic alterations including single nucleotide substitution, copy number alteration, etc. are the major force for cancer initialization and development. Due to the specificity of molecular lesions caused by genomic alterations, we can generate characteristic alteration spectra, called 'signature' (Wang, Shixiang, et al. (2021) <DOI:10.1371/journal.pgen.1009557> & Alexandrov, Ludmil B., et al. (2020) <DOI:10.1038/s41586-020-1943-3> & Steele Christopher D., et al. (2022) <DOI:10.1038/s41586-022-04738-6>). This package helps users to extract, analyze and visualize signatures from genomic alteration records, thus providing new insight into cancer study.

Maintained by Shixiang Wang. Last updated 5 months ago.

bayesian-nmf bioinformatics cancer-research cnv copynumber-signatures cosmic-signatures dbs easy-to-use indel mutational-signatures nmf nmf-extraction sbs signature-extraction somatic-mutations somatic-variants visualization cpp

7.3 match 150 stars 9.48 score 123 scripts 2 dependents

bioc

sparrow:Take command of set enrichment analyses through a unified interface

Provides a unified interface to a variety of GSEA techniques from different bioconductor packages. Results are harmonized into a single object and can be interrogated uniformly for quick exploration and interpretation of results. Interactive exploration of GSEA results is enabled through a shiny app provided by a sparrow.shiny sibling package.

Maintained by Steve Lianoglou. Last updated 3 months ago.

genesetenrichment pathways bioinformatics gsea

10.5 match 21 stars 6.58 score 13 scripts

bioc

GSEABenchmarkeR:Reproducible GSEA Benchmarking

The GSEABenchmarkeR package implements an extendable framework for reproducible evaluation of set- and network-based methods for enrichment analysis of gene expression data. This includes support for the efficient execution of these methods on comprehensive real data compendia (microarray and RNA-seq) using parallel computation on standard workstations and institutional computer grids. Methods can then be assessed with respect to runtime, statistical significance, and relevance of the results for the phenotypes investigated.

Maintained by Ludwig Geistlinger. Last updated 5 months ago.

immunooncology microarray rnaseq geneexpression differentialexpression pathways graphandnetwork network genesetenrichment networkenrichment visualization reportwriting bioconductor-package u24ca289073

10.4 match 13 stars 6.55 score 23 scripts

bioc

simplifyEnrichment:Simplify Functional Enrichment Results

A new clustering algorithm, "binary cut", for clustering similarity matrices of functional terms is implemeted in this package. It also provides functions for visualizing, summarizing and comparing the clusterings.

Maintained by Zuguang Gu. Last updated 5 months ago.

software visualization go clustering genesetenrichment

8.5 match 113 stars 8.02 score 196 scripts

bioc

cola:A Framework for Consensus Partitioning

Subgroup classification is a basic task in genomic data analysis, especially for gene expression and DNA methylation data analysis. It can also be used to test the agreement to known clinical annotations, or to test whether there exist significant batch effects. The cola package provides a general framework for subgroup classification by consensus partitioning. It has the following features: 1. It modularizes the consensus partitioning processes that various methods can be easily integrated. 2. It provides rich visualizations for interpreting the results. 3. It allows running multiple methods at the same time and provides functionalities to straightforward compare results. 4. It provides a new method to extract features which are more efficient to separate subgroups. 5. It automatically generates detailed reports for the complete analysis. 6. It allows applying consensus partitioning in a hierarchical manner.

Maintained by Zuguang Gu. Last updated 1 months ago.

clustering geneexpression classification software consensus-clustering cpp

9.1 match 61 stars 7.49 score 112 scripts

bioc

GWENA:Pipeline for augmented co-expression analysis

The development of high-throughput sequencing led to increased use of co-expression analysis to go beyong single feature (i.e. gene) focus. We propose GWENA (Gene Whole co-Expression Network Analysis) , a tool designed to perform gene co-expression network analysis and explore the results in a single pipeline. It includes functional enrichment of modules of co-expressed genes, phenotypcal association, topological analysis and comparison of networks configuration between conditions.

Maintained by Gwenaëlle Lemoine. Last updated 5 months ago.

software geneexpression network clustering graphandnetwork genesetenrichment pathways visualization rnaseq transcriptomics mrnamicroarray microarray networkenrichment sequencing go co-expression enrichment-analysis gene network-analysis pipeline

11.8 match 24 stars 5.76 score 12 scripts

stuart-lab

Signac:Analysis of Single-Cell Chromatin Data

A framework for the analysis and exploration of single-cell chromatin data. The 'Signac' package contains functions for quantifying single-cell chromatin data, computing per-cell quality control metrics, dimension reduction and normalization, visualization, and DNA sequence motif analysis. Reference: Stuart et al. (2021) <doi:10.1038/s41592-021-01282-5>.

Maintained by Tim Stuart. Last updated 7 months ago.

atac bioinformatics single-cell zlib cpp

5.5 match 349 stars 12.19 score 3.7k scripts 1 dependents

bioc

INTACT:Integrate TWAS and Colocalization Analysis for Gene Set Enrichment Analysis

This package integrates colocalization probabilities from colocalization analysis with transcriptome-wide association study (TWAS) scan summary statistics to implicate genes that may be biologically relevant to a complex trait. The probabilistic framework implemented in this package constrains the TWAS scan z-score-based likelihood using a gene-level colocalization probability. Given gene set annotations, this package can estimate gene set enrichment using posterior probabilities from the TWAS-colocalization integration step.

Maintained by Jeffrey Okamoto. Last updated 5 months ago.

bayesian genesetenrichment

12.3 match 15 stars 5.47 score 13 scripts

bioc

goSorensen:Statistical inference based on the Sorensen-Dice dissimilarity and the Gene Ontology (GO)

This package implements inferential methods to compare gene lists in terms of their biological meaning as expressed in the GO. The compared gene lists are characterized by cross-tabulation frequency tables of enriched GO items. Dissimilarity between gene lists is evaluated using the Sorensen-Dice index. The fundamental guiding principle is that two gene lists are taken as similar if they share a great proportion of common enriched GO items.

Maintained by Pablo Flores. Last updated 5 months ago.

annotation go genesetenrichment software microarray pathways geneexpression multiplecomparison graphandnetwork reactome clustering kegg

14.3 match 4.56 score 12 scripts

bioc

HPiP:Host-Pathogen Interaction Prediction

HPiP (Host-Pathogen Interaction Prediction) uses an ensemble learning algorithm for prediction of host-pathogen protein-protein interactions (HP-PPIs) using structural and physicochemical descriptors computed from amino acid-composition of host and pathogen proteins.The proposed package can effectively address data shortages and data unavailability for HP-PPI network reconstructions. Moreover, establishing computational frameworks in that regard will reveal mechanistic insights into infectious diseases and suggest potential HP-PPI targets, thus narrowing down the range of possible candidates for subsequent wet-lab experimental validations.

Maintained by Matineh Rahmatbakhsh. Last updated 5 months ago.

proteomics systemsbiology networkinference structuralprediction geneprediction network

13.1 match 3 stars 4.95 score 6 scripts

bioc

UCell:Rank-based signature enrichment analysis for single-cell data

UCell is a package for evaluating gene signatures in single-cell datasets. UCell signature scores, based on the Mann-Whitney U statistic, are robust to dataset size and heterogeneity, and their calculation demands less computing time and memory than other available methods, enabling the processing of large datasets in a few minutes even on machines with limited computing power. UCell can be applied to any single-cell data matrix, and includes functions to directly interact with SingleCellExperiment and Seurat objects.

Maintained by Massimo Andreatta. Last updated 5 months ago.

singlecell genesetenrichment transcriptomics geneexpression cellbasedassays

6.2 match 143 stars 10.43 score 454 scripts 2 dependents

afukushima

TFactSR:Enrichment Approach to Predict Which Transcription Factors are Regulated

R implementation of 'TFactS' to predict which are the transcription factors (TFs), regulated in a biological condition based on lists of differentially expressed genes (DEGs) obtained from transcriptome experiments. This package is based on the 'TFactS' concept by Essaghir et al. (2010) <doi:10.1093/nar/gkq149> and expands it. It allows users to perform 'TFactS'-like enrichment approach. The package can import and use the original catalogue file from the 'TFactS' as well as users' defined catalogues of interest that are not supported by 'TFactS' (e.g., Arabidopsis).

Maintained by Atsushi Fukushima. Last updated 2 years ago.

network software differentialexpression genetarget geneexpression microarray rnaseq transcription networkenrichment

17.4 match 3.70 score 3 scripts

bioc

RITAN:Rapid Integration of Term Annotation and Network resources

Tools for comprehensive gene set enrichment and extraction of multi-resource high confidence subnetworks. RITAN facilitates bioinformatic tasks for enabling network biology research.

Maintained by Michael Zimmermann. Last updated 5 months ago.

qualitycontrol network networkenrichment networkinference genesetenrichment functionalgenomics graphandnetwork

11.8 match 5.40 score 9 scripts

bioc

ATACseqTFEA:Transcription Factor Enrichment Analysis for ATAC-seq

Assay for Transpose-Accessible Chromatin using sequencing (ATAC-seq) is a technique to assess genome-wide chromatin accessibility by probing open chromatin with hyperactive mutant Tn5 Transposase that inserts sequencing adapters into open regions of the genome. ATACseqTFEA is an improvement of the current computational method that detects differential activity of transcription factors (TFs). ATACseqTFEA not only uses the difference of open region information, but also (or emphasizes) the difference of TFs footprints (cutting sites or insertion sites). ATACseqTFEA provides an easy, rigorous way to broadly assess TF activity changes between two conditions.

Maintained by Jianhong Ou. Last updated 2 months ago.

sequencing dnaseq atacseq mnaseseq generegulation

15.1 match 1 stars 4.18 score 4 scripts

bioc

transite:RNA-binding protein motif analysis

transite is a computational method that allows comprehensive analysis of the regulatory role of RNA-binding proteins in various cellular processes by leveraging preexisting gene expression data and current knowledge of binding preferences of RNA-binding proteins.

Maintained by Konstantin Krismer. Last updated 5 months ago.

geneexpression transcription differentialexpression microarray mrnamicroarray genetics genesetenrichment cpp

14.5 match 4.30 score 20 scripts

bioc

fobitools:Tools for Manipulating the FOBI Ontology

A set of tools for interacting with the Food-Biomarker Ontology (FOBI). A collection of basic manipulation tools for biological significance analysis, graphs, and text mining strategies for annotating nutritional data.

Maintained by Pol Castellano-Escuder. Last updated 4 months ago.

massspectrometry metabolomics software visualization biomedicalinformatics graphandnetwork annotation cheminformatics pathways genesetenrichment biological-intrerpretation biological-knowledge biological-significance-analysis enrichment-analysis food-biomarker-ontology knowledge-graph nutrition obofoundry ontology text-mining

12.2 match 1 stars 5.08 score 5 scripts

bioc

artMS:Analytical R tools for Mass Spectrometry

artMS provides a set of tools for the analysis of proteomics label-free datasets. It takes as input the MaxQuant search result output (evidence.txt file) and performs quality control, relative quantification using MSstats, downstream analysis and integration. artMS also provides a set of functions to re-format and make it compatible with other analytical tools, including, SAINTq, SAINTexpress, Phosfate, and PHOTON. Check [http://artms.org](http://artms.org) for details.

Maintained by David Jimenez-Morales. Last updated 5 months ago.

proteomics differentialexpression biomedicalinformatics systemsbiology massspectrometry annotation qualitycontrol genesetenrichment clustering normalization immunooncology multiplecomparison analysis analytical ap-ms bioconductor bioinformatics mass-spectrometry phosphoproteomics post-translational-modification quantitative-analysis

9.6 match 14 stars 6.41 score 13 scripts

bioc

topGO:Enrichment Analysis for Gene Ontology

topGO package provides tools for testing GO terms while accounting for the topology of the GO graph. Different test statistics and different methods for eliminating local similarities and dependencies between GO terms can be implemented and applied.

Maintained by Adrian Alexa. Last updated 5 months ago.

microarray visualization

6.9 match 8.96 score 2.0k scripts 20 dependents

bioc

MAST:Model-based Analysis of Single Cell Transcriptomics

Methods and models for handling zero-inflated single cell assay data.

Maintained by Andrew McDavid. Last updated 5 months ago.

geneexpression differentialexpression genesetenrichment rnaseq transcriptomics singlecell

4.8 match 230 stars 12.75 score 1.8k scripts 5 dependents

bioc

FELLA:Interpretation and enrichment for metabolomics data

Enrichment of metabolomics data using KEGG entries. Given a set of affected compounds, FELLA suggests affected reactions, enzymes, modules and pathways using label propagation in a knowledge model network. The resulting subnetwork can be visualised and exported.

Maintained by Sergio Picart-Armada. Last updated 5 months ago.

software metabolomics graphandnetwork kegg go pathways network networkenrichment

13.9 match 4.41 score 32 scripts

bioc

mitch:Multi-Contrast Gene Set Enrichment Analysis

mitch is an R package for multi-contrast enrichment analysis. At it’s heart, it uses a rank-MANOVA based statistical approach to detect sets of genes that exhibit enrichment in the multidimensional space as compared to the background. The rank-MANOVA concept dates to work by Cox and Mann (https://doi.org/10.1186/1471-2105-13-S16-S12). mitch is useful for pathway analysis of profiling studies with one, two or more contrasts, or in studies with multiple omics profiling, for example proteomic, transcriptomic, epigenomic analysis of the same samples. mitch is perfectly suited for pathway level differential analysis of scRNA-seq data. We have an established routine for pathway enrichment of Infinium Methylation Array data (see vignette). The main strengths of mitch are that it can import datasets easily from many upstream tools and has advanced plotting features to visualise these enrichments.

Maintained by Mark Ziemann. Last updated 4 months ago.

geneexpression genesetenrichment singlecell transcriptomics epigenetics proteomics differentialexpression reactome dnamethylation methylationarray gene-regulation gene-seq-analysis pathway-analysis

8.6 match 16 stars 7.06 score 15 scripts

bioc

CARNIVAL:A CAusal Reasoning tool for Network Identification (from gene expression data) using Integer VALue programming

An upgraded causal reasoning tool from Melas et al in R with updated assignments of TFs' weights from PROGENy scores. Optimization parameters can be freely adjusted and multiple solutions can be obtained and aggregated.

Maintained by Attila Gabor. Last updated 5 months ago.

transcriptomics geneexpression network causal-models footprints integer-linear-programming pathway-enrichment-analysis

6.7 match 57 stars 9.03 score 90 scripts 1 dependents

jpquast

protti:Bottom-Up Proteomics and LiP-MS Quality Control and Data Analysis Tools

Useful functions and workflows for proteomics quality control and data analysis of both limited proteolysis-coupled mass spectrometry (LiP-MS) (Feng et. al. (2014) <doi:10.1038/nbt.2999>) and regular bottom-up proteomics experiments. Data generated with search tools such as 'Spectronaut', 'MaxQuant' and 'Proteome Discover' can be easily used due to flexibility of functions.

Maintained by Jan-Philipp Quast. Last updated 5 months ago.

data-analysis lip-ms mass-spectrometry omics protein proteomics systems-biology

6.9 match 61 stars 8.58 score 83 scripts

eonurk

cinaR:A Computational Pipeline for Bulk 'ATAC-Seq' Profiles

Differential analyses and Enrichment pipeline for bulk 'ATAC-seq' data analyses. This package combines different packages to have an ultimate package for both data analyses and visualization of 'ATAC-seq' data. Methods are described in 'Karakaslar et al.' (2021) <doi:10.1101/2021.03.05.434143>.

Maintained by Onur Karakaslar. Last updated 10 months ago.

atac-seq differential-analysis enrichment-analysis gene-sets

10.8 match 13 stars 5.52 score 51 scripts

eonurk

cinaRgenesets:Ready-to-Use Curated Gene Sets for 'cinaR'

Immune related gene sets provided along with the 'cinaR' package.

Maintained by Onur Karakaslar. Last updated 4 years ago.

dice enrichment enrichment-analysis gene-sets genesets pbmc wikipathways

17.5 match 3 stars 3.35 score 15 scripts

bioc

RegEnrich:Gene regulator enrichment analysis

This package is a pipeline to identify the key gene regulators in a biological process, for example in cell differentiation and in cell development after stimulation. There are four major steps in this pipeline: (1) differential expression analysis; (2) regulator-target network inference; (3) enrichment analysis; and (4) regulators scoring and ranking.

Maintained by Weiyang Tao. Last updated 5 months ago.

geneexpression transcriptomics rnaseq twochannel transcription genetarget networkenrichment differentialexpression network networkinference genesetenrichment functionalprediction

15.2 match 3.82 score 22 scripts

bioc

benchdamic:Benchmark of differential abundance methods on microbiome data

Starting from a microbiome dataset (16S or WMS with absolute count values) it is possible to perform several analysis to assess the performances of many differential abundance detection methods. A basic and standardized version of the main differential abundance analysis methods is supplied but the user can also add his method to the benchmark. The analyses focus on 4 main aspects: i) the goodness of fit of each method's distributional assumptions on the observed count data, ii) the ability to control the false discovery rate, iii) the within and between method concordances, iv) the truthfulness of the findings if any apriori knowledge is given. Several graphical functions are available for result visualization.

Maintained by Matteo Calgaro. Last updated 4 months ago.

metagenomics microbiome differentialexpression multiplecomparison normalization preprocessing software benchmark differential-abundance-methods

10.1 match 6 stars 5.73 score 8 scripts

josesamos

starschemar:Obtaining Stars from Flat Tables

Data in multidimensional systems is obtained from operational systems and is transformed to adapt it to the new structure. Frequently, the operations to be performed aim to transform a flat table into a star schema. Transformations can be carried out using professional extract, transform and load tools or tools intended for data transformation for end users. With the tools mentioned, this transformation can be carried out, but it requires a lot of work. The main objective of this package is to define transformations that allow obtaining stars from flat tables easily. In addition, it includes basic data cleaning, dimension enrichment, incremental data refresh and query operations, adapted to this context.

Maintained by Jose Samos. Last updated 11 months ago.

10.2 match 7 stars 5.66 score 11 scripts 2 dependents

bioc

multiGSEA:Combining GSEA-based pathway enrichment with multi omics data integration

Extracted features from pathways derived from 8 different databases (KEGG, Reactome, Biocarta, etc.) can be used on transcriptomic, proteomic, and/or metabolomic level to calculate a combined GSEA-based enrichment score.

Maintained by Sebastian Canzler. Last updated 2 months ago.

genesetenrichment pathways reactome biocarta

8.1 match 18 stars 7.06 score 32 scripts

bioc

CeTF:Coexpression for Transcription Factors using Regulatory Impact Factors and Partial Correlation and Information Theory analysis

This package provides the necessary functions for performing the Partial Correlation coefficient with Information Theory (PCIT) (Reverter and Chan 2008) and Regulatory Impact Factors (RIF) (Reverter et al. 2010) algorithm. The PCIT algorithm identifies meaningful correlations to define edges in a weighted network and can be applied to any correlation-based network including but not limited to gene co-expression networks, while the RIF algorithm identify critical Transcription Factors (TF) from gene expression data. These two algorithms when combined provide a very relevant layer of information for gene expression studies (Microarray, RNA-seq and single-cell RNA-seq data).

Maintained by Carlos Alberto Oliveira de Biagi Junior. Last updated 5 months ago.

sequencing rnaseq microarray geneexpression transcription normalization differentialexpression singlecell network regression chipseq immunooncology coverage cpp

13.1 match 4.30 score 9 scripts

bioc

normr:Normalization and difference calling in ChIP-seq data

Robust normalization and difference calling procedures for ChIP-seq and alike data. Read counts are modeled jointly as a binomial mixture model with a user-specified number of components. A fitted background estimate accounts for the effect of enrichment in certain regions and, therefore, represents an appropriate null hypothesis. This robust background is used to identify significantly enriched or depleted regions.

Maintained by Johannes Helmuth. Last updated 5 months ago.

bayesian differentialpeakcalling classification dataimport chipseq ripseq functionalgenomics genetics multiplecomparison normalization peakdetection preprocessing alignment cpp openmp

9.4 match 11 stars 5.93 score 13 scripts

bioc

piano:Platform for integrative analysis of omics data

Piano performs gene set analysis using various statistical methods, from different gene level statistics and a wide range of gene-set collections. Furthermore, the Piano package contains functions for combining the results of multiple runs of gene set analyses.

Maintained by Leif Varemo Wigge. Last updated 5 months ago.

microarray preprocessing qualitycontrol differentialexpression visualization geneexpression genesetenrichment pathways bioconductor bioconductor-package bioinformatics gene-set-enrichment transcriptomics

6.7 match 13 stars 8.30 score 183 scripts 7 dependents

bzhanglab

WebGestaltR:Gene Set Analysis Toolkit WebGestaltR

The web version WebGestalt <https://www.webgestalt.org> supports 12 organisms, 354 gene identifiers and 321,251 function categories. Users can upload the data and functional categories with their own gene identifiers. In addition to the Over-Representation Analysis, WebGestalt also supports Gene Set Enrichment Analysis and Network Topology Analysis. The user-friendly output report allows interactive and efficient exploration of enrichment results. The WebGestaltR package not only supports all above functions but also can be integrated into other pipeline or simultaneously analyze multiple gene lists.

Maintained by John Elizarraras. Last updated 29 days ago.

rust cargo

6.0 match 35 stars 9.18 score 180 scripts

bioc

CSAR:Statistical tools for the analysis of ChIP-seq data

Statistical tools for ChIP-seq data analysis. The package includes the statistical method described in Kaufmann et al. (2009) PLoS Biology: 7(4):e1000090. Briefly, Taking the average DNA fragment size subjected to sequencing into account, the software calculates genomic single-nucleotide read-enrichment values. After normalization, sample and control are compared using a test based on the Poisson distribution. Test statistic thresholds to control the false discovery rate are obtained through random permutation.

Maintained by Jose M Muino. Last updated 5 months ago.

chipseq transcription genetics

12.6 match 4.30 score 6 scripts

bioc

GSEABase:Gene set enrichment data structures and methods

This package provides classes and methods to support Gene Set Enrichment Analysis (GSEA).

Maintained by Bioconductor Package Maintainer. Last updated 1 months ago.

geneexpression genesetenrichment graphandnetwork go kegg

5.2 match 10.27 score 1.5k scripts 77 dependents

bioc

limma:Linear Models for Microarray and Omics Data

Data analysis, linear models and differential expression for omics data.

Maintained by Gordon Smyth. Last updated 4 days ago.

exonarray geneexpression transcription alternativesplicing differentialexpression differentialsplicing genesetenrichment dataimport bayesian clustering regression timecourse microarray micrornaarray mrnamicroarray onechannel proprietaryplatforms twochannel sequencing rnaseq batcheffect multiplecomparison normalization preprocessing qualitycontrol biomedicalinformatics cellbiology cheminformatics epigenetics functionalgenomics genetics immunooncology metabolomics proteomics systemsbiology transcriptomics

3.8 match 13.81 score 16k scripts 585 dependents

bioc

memes:motif matching, comparison, and de novo discovery using the MEME Suite

A seamless interface to the MEME Suite family of tools for motif analysis. 'memes' provides data aware utilities for using GRanges objects as entrypoints to motif analysis, data structures for examining & editing motif lists, and novel data visualizations. 'memes' functions and data structures are amenable to both base R and tidyverse workflows.

Maintained by Spencer Nystrom. Last updated 5 months ago.

dataimport functionalgenomics generegulation motifannotation motifdiscovery sequencematching software

6.0 match 49 stars 8.68 score 117 scripts 1 dependents

bioc

famat:Functional analysis of metabolic and transcriptomic data

Famat is made to collect data about lists of genes and metabolites provided by user, and to visualize it through a Shiny app. Information collected is: - Pathways containing some of the user's genes and metabolites (obtained using a pathway enrichment analysis). - Direct interactions between user's elements inside pathways. - Information about elements (their identifiers and descriptions). - Go terms enrichment analysis performed on user's genes. The Shiny app is composed of: - information about genes, metabolites, and direct interactions between them inside pathways. - an heatmap showing which elements from the list are in pathways (pathways are structured in hierarchies). - hierarchies of enriched go terms using Molecular Function and Biological Process.

Maintained by Mathieu Charles. Last updated 5 months ago.

functionalprediction genesetenrichment pathways go reactome kegg compound gene-ontology genes shiny

13.4 match 1 stars 3.78 score 2 scripts

bioc

GOexpress:Visualise microarray and RNAseq data using gene ontology annotations

The package contains methods to visualise the expression profile of genes from a microarray or RNA-seq experiment, and offers a supervised clustering approach to identify GO terms containing genes with expression levels that best classify two or more predefined groups of samples. Annotations for the genes present in the expression dataset may be obtained from Ensembl through the biomaRt package, if not provided by the user. The default random forest framework is used to evaluate the capacity of each gene to cluster samples according to the factor of interest. Finally, GO terms are scored by averaging the rank (alternatively, score) of their respective gene sets to cluster the samples. P-values may be computed to assess the significance of GO term ranking. Visualisation function include gene expression profile, gene ontology-based heatmaps, and hierarchical clustering of experimental samples using gene expression data.

Maintained by Kevin Rue-Albrecht. Last updated 5 months ago.

software geneexpression transcription differentialexpression genesetenrichment datarepresentation clustering timecourse microarray sequencing rnaseq annotation multiplecomparison pathways go visualization immunooncology bioconductor bioconductor-package bioconductor-stats geneontology geneset-enrichment

7.5 match 9 stars 6.75 score 31 scripts

bioc

Mergeomics:Integrative network analysis of omics data

The Mergeomics pipeline serves as a flexible framework for integrating multidimensional omics-disease associations, functional genomics, canonical pathways and gene-gene interaction networks to generate mechanistic hypotheses. It includes two main parts, 1) Marker set enrichment analysis (MSEA); 2) Weighted Key Driver Analysis (wKDA).

Maintained by Zeyneb Kurt. Last updated 5 months ago.

software

11.6 match 4.30 score 8 scripts

bioc

miRNApath:miRNApath: Pathway Enrichment for miRNA Expression Data

This package provides pathway enrichment techniques for miRNA expression data. Specifically, the set of methods handles the many-to-many relationship between miRNAs and the multiple genes they are predicted to target (and thus affect.) It also handles the gene-to-pathway relationships separately. Both steps are designed to preserve the additive effects of miRNAs on genes, many miRNAs affecting one gene, one miRNA affecting multiple genes, or many miRNAs affecting many genes.

Maintained by James M. Ward. Last updated 5 months ago.

annotation pathways differentialexpression networkenrichment mirna

11.5 match 4.30 score 3 scripts

bioc

SNPhood:SNPhood: Investigate, quantify and visualise the epigenomic neighbourhood of SNPs using NGS data

To date, thousands of single nucleotide polymorphisms (SNPs) have been found to be associated with complex traits and diseases. However, the vast majority of these disease-associated SNPs lie in the non-coding part of the genome, and are likely to affect regulatory elements, such as enhancers and promoters, rather than function of a protein. Thus, to understand the molecular mechanisms underlying genetic traits and diseases, it becomes increasingly important to study the effect of a SNP on nearby molecular traits such as chromatin environment or transcription factor (TF) binding. Towards this aim, we developed SNPhood, a user-friendly *Bioconductor* R package to investigate and visualize the local neighborhood of a set of SNPs of interest for NGS data such as chromatin marks or transcription factor binding sites from ChIP-Seq or RNA- Seq experiments. SNPhood comprises a set of easy-to-use functions to extract, normalize and summarize reads for a genomic region, perform various data quality checks, normalize read counts using additional input files, and to cluster and visualize the regions according to the binding pattern. The regions around each SNP can be binned in a user-defined fashion to allow for analysis of very broad patterns as well as a detailed investigation of specific binding shapes. Furthermore, SNPhood supports the integration with genotype information to investigate and visualize genotype-specific binding patterns. Finally, SNPhood can be employed for determining, investigating, and visualizing allele-specific binding patterns around the SNPs of interest.

Maintained by Christian Arnold. Last updated 5 months ago.

software

12.5 match 3.90 score 1 scripts

mpallocc

autoGO:Auto-GO: Reproducible, Robust and High Quality Ontology Enrichment Visualizations

Auto-GO is a framework that enables automated, high quality Gene Ontology enrichment analysis visualizations. It also features a handy wrapper for Differential Expression analysis around the 'DESeq2' package described in Love et al. (2014) <doi:10.1186/s13059-014-0550-8>. The whole framework is structured in different, independent functions, in order to let the user decide which steps of the analysis to perform and which plot to produce.

Maintained by Fabio Ticconi. Last updated 18 days ago.

12.0 match 2 stars 4.08 score

bioc

epiNEM:epiNEM

epiNEM is an extension of the original Nested Effects Models (NEM). EpiNEM is able to take into account double knockouts and infer more complex network signalling pathways. It is tailored towards large scale double knock-out screens.

Maintained by Martin Pirkl. Last updated 5 months ago.

pathways systemsbiology networkinference network

8.4 match 1 stars 5.83 score 1 scripts 3 dependents

xiaozhangryy

CAESAR.Suite:CAESAR: a Cross-Technology and Cross-Resolution Framework for Spatial Omics Annotation

Biotechnology in spatial omics has advanced rapidly over the past few years, enhancing both throughput and resolution. However, existing annotation pipelines in spatial omics predominantly rely on clustering methods, lacking the flexibility to integrate extensive annotated information from single-cell RNA sequencing (scRNA-seq) due to discrepancies in spatial resolutions, species, or modalities. Here we introduce the CAESAR suite, an open-source software package that provides image-based spatial co-embedding of locations and genomic features. It uniquely transfers labels from scRNA-seq reference, enabling the annotation of spatial omics datasets across different technologies, resolutions, species, and modalities, based on the conserved relationship between signature genes and cells/locations at an appropriate level of granularity. Notably, CAESAR enriches location-level pathways, allowing for the detection of gradual biological pathway activation within spatially defined domain types.

Maintained by Xiao Zhang. Last updated 6 months ago.

openblas cpp openmp

11.3 match 4.30 score 2 scripts

bioc

gsean:Gene Set Enrichment Analysis with Networks

Biological molecules in a living organism seldom work individually. They usually interact each other in a cooperative way. Biological process is too complicated to understand without considering such interactions. Thus, network-based procedures can be seen as powerful methods for studying complex process. However, many methods are devised for analyzing individual genes. It is said that techniques based on biological networks such as gene co-expression are more precise ways to represent information than those using lists of genes only. This package is aimed to integrate the gene expression and biological network. A biological network is constructed from gene expression data and it is used for Gene Set Enrichment Analysis.

Maintained by Dongmin Jung. Last updated 5 months ago.

software statisticalmethod network graphandnetwork genesetenrichment geneexpression networkenrichment pathways differentialexpression

11.9 match 4.00 score 1 scripts

bioc

vissE:Visualising Set Enrichment Analysis Results

This package enables the interpretation and analysis of results from a gene set enrichment analysis using network-based and text-mining approaches. Most enrichment analyses result in large lists of significant gene sets that are difficult to interpret. Tools in this package help build a similarity-based network of significant gene sets from a gene set enrichment analysis that can then be investigated for their biological function using text-mining approaches.

Maintained by Dharmesh D. Bhuva. Last updated 5 months ago.

software geneexpression genesetenrichment networkenrichment network bioinformatics

8.0 match 14 stars 5.90 score 19 scripts

bioc

MutationalPatterns:Comprehensive genome-wide analysis of mutational processes

Mutational processes leave characteristic footprints in genomic DNA. This package provides a comprehensive set of flexible functions that allows researchers to easily evaluate and visualize a multitude of mutational patterns in base substitution catalogues of e.g. healthy samples, tumour samples, or DNA-repair deficient cells. The package covers a wide range of patterns including: mutational signatures, transcriptional and replicative strand bias, lesion segregation, genomic distribution and association with genomic features, which are collectively meaningful for studying the activity of mutational processes. The package works with single nucleotide variants (SNVs), insertions and deletions (Indels), double base substitutions (DBSs) and larger multi base substitutions (MBSs). The package provides functionalities for both extracting mutational signatures de novo and determining the contribution of previously identified mutational signatures on a single sample level. MutationalPatterns integrates with common R genomic analysis workflows and allows easy association with (publicly available) annotation data.

Maintained by Mark van Roosmalen. Last updated 5 months ago.

genetics somaticmutation

6.5 match 7.27 score 251 scripts 1 dependents

bioc

systemPipeR:systemPipeR: Workflow Environment for Data Analysis and Report Generation

systemPipeR is a multipurpose data analysis workflow environment that unifies R with command-line tools. It enables scientists to analyze many types of large- or small-scale data on local or distributed computer systems with a high level of reproducibility, scalability and portability. At its core is a command-line interface (CLI) that adopts the Common Workflow Language (CWL). This design allows users to choose for each analysis step the optimal R or command-line software. It supports both end-to-end and partial execution of workflows with built-in restart functionalities. Efficient management of complex analysis tasks is accomplished by a flexible workflow control container class. Handling of large numbers of input samples and experimental designs is facilitated by consistent sample annotation mechanisms. As a multi-purpose workflow toolkit, systemPipeR enables users to run existing workflows, customize them or design entirely new ones while taking advantage of widely adopted data structures within the Bioconductor ecosystem. Another important core functionality is the generation of reproducible scientific analysis and technical reports. For result interpretation, systemPipeR offers a wide range of plotting functionality, while an associated Shiny App offers many useful functionalities for interactive result exploration. The vignettes linked from this page include (1) a general introduction, (2) a description of technical details, and (3) a collection of workflow templates.

Maintained by Thomas Girke. Last updated 5 months ago.

genetics infrastructure dataimport sequencing rnaseq riboseq chipseq methylseq snp geneexpression coverage genesetenrichment alignment qualitycontrol immunooncology reportwriting workflowstep workflowmanagement

4.1 match 53 stars 11.56 score 344 scripts 3 dependents

bioc

GSEAmining:Make Biological Sense of Gene Set Enrichment Analysis Outputs

Gene Set Enrichment Analysis is a very powerful and interesting computational method that allows an easy correlation between differential expressed genes and biological processes. Unfortunately, although it was designed to help researchers to interpret gene expression data it can generate huge amounts of results whose biological meaning can be difficult to interpret. Many available tools rely on the hierarchically structured Gene Ontology (GO) classification to reduce reundandcy in the results. However, due to the popularity of GSEA many more gene set collections, such as those in the Molecular Signatures Database are emerging. Since these collections are not organized as those in GO, their usage for GSEA do not always give a straightforward answer or, in other words, getting all the meaninful information can be challenging with the currently available tools. For these reasons, GSEAmining was born to be an easy tool to create reproducible reports to help researchers make biological sense of GSEA outputs. Given the results of GSEA, GSEAmining clusters the different gene sets collections based on the presence of the same genes in the leadind edge (core) subset. Leading edge subsets are those genes that contribute most to the enrichment score of each collection of genes or gene sets. For this reason, gene sets that participate in similar biological processes should share genes in common and in turn cluster together. After that, GSEAmining is able to identify and represent for each cluster: - The most enriched terms in the names of gene sets (as wordclouds) - The most enriched genes in the leading edge subsets (as bar plots). In each case, positive and negative enrichments are shown in different colors so it is easy to distinguish biological processes or genes that may be of interest in that particular study.

Maintained by Oriol Arqués. Last updated 5 months ago.

genesetenrichment clustering visualization

11.7 match 4.00 score 7 scripts

liuy12

SCdeconR:Deconvolution of Bulk RNA-Seq Data using Single-Cell RNA-Seq Data as Reference

Streamlined workflow from deconvolution of bulk RNA-seq data to downstream differential expression and gene-set enrichment analysis. Provide various visualization functions.

Maintained by Yuanhang Liu. Last updated 10 months ago.

bulk-rna-seq-deconvolution deconvolution differential-expression ffpe geneset-enrichment-analysis scdeconr single-cell

12.3 match 3 stars 3.78 score 4 scripts

bioc

GenomicSuperSignature:Interpretation of RNA-seq experiments through robust, efficient comparison to public databases

This package provides a novel method for interpreting new transcriptomic datasets through near-instantaneous comparison to public archives without high-performance computing requirements. Through the pre-computed index, users can identify public resources associated with their dataset such as gene sets, MeSH term, and publication. Functions to identify interpretable annotations and intuitive visualization options are implemented in this package.

Maintained by Sehyun Oh. Last updated 5 months ago.

transcriptomics systemsbiology principalcomponent rnaseq sequencing pathways clustering bioconductor-package exploratory-data-analysis gsea mesh principal-component-analysis rna-sequencing-profiles transferlearning

6.7 match 16 stars 6.97 score 59 scripts

ocbe-uio

EnrichIntersect:Enrichment Analysis and Intersecting Sankey Diagram

A flexible tool for enrichment analysis based on user-defined sets. It allows users to perform over-representation analysis of the custom sets among any specified ranked feature list, hence making enrichment analysis applicable to various types of data from different scientific fields. 'EnrichIntersect' also enables an interactive means to visualize identified associations based on, for example, the mix-lasso model (Zhao et al., 2022 <doi:10.1016/j.isci.2022.104767>) or similar methods.

Maintained by Zhi Zhao. Last updated 12 months ago.

10.1 match 4 stars 4.60 score

bioc

autonomics:Unified Statistical Modeling of Omics Data

This package unifies access to Statistal Modeling of Omics Data. Across linear modeling engines (lm, lme, lmer, limma, and wilcoxon). Across coding systems (treatment, difference, deviation, etc). Across model formulae (with/without intercept, random effect, interaction or nesting). Across omics platforms (microarray, rnaseq, msproteomics, affinity proteomics, metabolomics). Across projection methods (pca, pls, sma, lda, spls, opls). Across clustering methods (hclust, pam, cmeans). It provides a fast enrichment analysis implementation. And an intuitive contrastogram visualisation to summarize contrast effects in complex designs.

Maintained by Aditya Bhagwat. Last updated 2 months ago.

software dataimport preprocessing dimensionreduction principalcomponent regression differentialexpression genesetenrichment transcriptomics transcription geneexpression rnaseq microarray proteomics metabolomics massspectrometry

7.8 match 5.95 score 5 scripts

bioc

Moonlight2R:Identify oncogenes and tumor suppressor genes from omics data

The understanding of cancer mechanism requires the identification of genes playing a role in the development of the pathology and the characterization of their role (notably oncogenes and tumor suppressors). We present an updated version of the R/bioconductor package called MoonlightR, namely Moonlight2R, which returns a list of candidate driver genes for specific cancer types on the basis of omics data integration. The Moonlight framework contains a primary layer where gene expression data and information about biological processes are integrated to predict genes called oncogenic mediators, divided into putative tumor suppressors and putative oncogenes. This is done through functional enrichment analyses, gene regulatory networks and upstream regulator analyses to score the importance of well-known biological processes with respect to the studied cancer type. By evaluating the effect of the oncogenic mediators on biological processes or through random forests, the primary layer predicts two putative roles for the oncogenic mediators: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As gene expression data alone is not enough to explain the deregulation of the genes, a second layer of evidence is needed. We have automated the integration of a secondary mutational layer through new functionalities in Moonlight2R. These functionalities analyze mutations in the cancer cohort and classifies these into driver and passenger mutations using the driver mutation prediction tool, CScape-somatic. Those oncogenic mediators with at least one driver mutation are retained as the driver genes. As a consequence, this methodology does not only identify genes playing a dual role (e.g. TSG in one cancer type and OCG in another) but also helps in elucidating the biological processes underlying their specific roles. In particular, Moonlight2R can be used to discover OCGs and TSGs in the same cancer type. This may for instance help in answering the question whether some genes change role between early stages (I, II) and late stages (III, IV). In the future, this analysis could be useful to determine the causes of different resistances to chemotherapeutic treatments. An additional mechanistic layer evaluates if there are mutations affecting the protein stability of the transcription factors (TFs) of the TSGs and OCGs, as that may have an effect on the expression of the genes.

Maintained by Matteo Tiberti. Last updated 2 months ago.

dnamethylation differentialmethylation generegulation geneexpression methylationarray differentialexpression pathways network survival genesetenrichment networkenrichment

7.0 match 5 stars 6.59 score 43 scripts

juananvg

GSEMA:Gene Set Enrichment Meta-Analysis

Performing the different steps of gene set enrichment meta-analysis. It provides different functions that allow the application of meta-analysis based on the combination of effect sizes from different pathways in different studies to obtain significant pathways that are common to all of them.

Maintained by Juan Antonio Villatoro-García. Last updated 5 months ago.

statisticalmethod genesetenrichment pathways

11.5 match 3.90 score 3 scripts

bioc

viper:Virtual Inference of Protein-activity by Enriched Regulon analysis

Inference of protein activity from gene expression data, including the VIPER and msVIPER algorithms

Maintained by Mariano J Alvarez. Last updated 5 months ago.

systemsbiology networkenrichment geneexpression functionalprediction generegulation

6.4 match 7.00 score 342 scripts 5 dependents

andreyshabalin

shiftR:Fast Enrichment Analysis via Circular Permutations

Fast enrichment analysis for locally correlated statistics via circular permutations. The analysis can be performed at multiple significance thresholds for both primary and auxiliary data sets with efficient correction for multiple testing.

Maintained by Andrey A Shabalin. Last updated 6 years ago.

11.0 match 1 stars 4.04 score 11 scripts

bioc

TFEA.ChIP:Analyze Transcription Factor Enrichment

Package to analize transcription factor enrichment in a gene set using data from ChIP-Seq experiments.

Maintained by Laura Puente Santamaría. Last updated 5 months ago.

transcription generegulation genesetenrichment transcriptomics sequencing chipseq rnaseq immunooncology

12.9 match 3.45 score 14 scripts

bioc

gage:Generally Applicable Gene-set Enrichment for Pathway Analysis

GAGE is a published method for gene set (enrichment or GSEA) or pathway analysis. GAGE is generally applicable independent of microarray or RNA-Seq data attributes including sample sizes, experimental designs, assay platforms, and other types of heterogeneity, and consistently achieves superior performance over other frequently used methods. In gage package, we provide functions for basic GAGE analysis, result processing and presentation. We have also built pipeline routines for of multiple GAGE analyses in a batch, comparison between parallel analyses, and combined analysis of heterogeneous data from different sources/studies. In addition, we provide demo microarray data and commonly used gene set data based on KEGG pathways and GO terms. These funtions and data are also useful for gene set analysis using other methods.

Maintained by Weijun Luo. Last updated 5 months ago.

pathways go differentialexpression microarray onechannel twochannel rnaseq genetics multiplecomparison genesetenrichment geneexpression systemsbiology sequencing

5.1 match 5 stars 8.71 score 784 scripts 1 dependents

polmine

polmineR:Verbs and Nouns for Corpus Analysis

Package for corpus analysis using the Corpus Workbench ('CWB', <https://cwb.sourceforge.io>) as an efficient back end for indexing and querying large corpora. The package offers functionality to flexibly create subcorpora and to carry out basic statistical operations (count, co-occurrences etc.). The original full text of documents can be reconstructed and inspected at any time. Beyond that, the package is intended to serve as an interface to packages implementing advanced statistical procedures. Respective data structures (document-term matrices, term-co-occurrence matrices etc.) can be created based on the indexed corpora.

Maintained by Andreas Blaette. Last updated 1 years ago.

5.6 match 49 stars 7.96 score 311 scripts

dsokolo

scMappR:Single Cell Mapper

The single cell mapper (scMappR) R package contains a suite of bioinformatic tools that provide experimentally relevant cell-type specific information to a list of differentially expressed genes (DEG). The function "scMappR_and_pathway_analysis" reranks DEGs to generate cell-type specificity scores called cell-weighted fold-changes. Users input a list of DEGs, normalized counts, and a signature matrix into this function. scMappR then re-weights bulk DEGs by cell-type specific expression from the signature matrix, cell-type proportions from RNA-seq deconvolution and the ratio of cell-type proportions between the two conditions to account for changes in cell-type proportion. With cwFold-changes calculated, scMappR uses two approaches to utilize cwFold-changes to complete cell-type specific pathway analysis. The "process_dgTMatrix_lists" function in the scMappR package contains an automated scRNA-seq processing pipeline where users input scRNA-seq count data, which is made compatible for scMappR and other R packages that analyze scRNA-seq data. We further used this to store hundreds up regularly updating signature matrices. The functions "tissue_by_celltype_enrichment", "tissue_scMappR_internal", and "tissue_scMappR_custom" combine these consistently processed scRNAseq count data with gene-set enrichment tools to allow for cell-type marker enrichment of a generic gene list (e.g. GWAS hits). Reference: Sokolowski,D.J., Faykoo-Martinez,M., Erdman,L., Hou,H., Chan,C., Zhu,H., Holmes,M.M., Goldenberg,A. and Wilson,M.D. (2021) Single-cell mapper (scMappR): using scRNA-seq to infer cell-type specificities of differentially expressed genes. NAR Genomics and Bioinformatics. 3(1). Iqab011. <doi:10.1093/nargab/lqab011>.

Maintained by Dustin Sokolowski. Last updated 2 years ago.

13.4 match 4 stars 3.30 score 9 scripts

wjawaid

enrichR:Provides an R Interface to 'Enrichr'

Provides an R interface to all 'Enrichr' databases. 'Enrichr' is a web-based tool for analysing gene sets and returns any enrichment of common annotated biological features. Quoting from their website 'Enrichment analysis is a computational method for inferring knowledge about an input gene set by comparing it to annotated gene sets representing prior biological knowledge.' See <https://maayanlab.cloud/Enrichr/> for further details.

Maintained by Wajid Jawaid. Last updated 1 months ago.

4.4 match 90 stars 9.96 score 7 dependents

bioc

CBNplot:plot bayesian network inferred from gene expression data based on enrichment analysis results

This package provides the visualization of bayesian network inferred from gene expression data. The networks are based on enrichment analysis results inferred from packages including clusterProfiler and ReactomePA. The networks between pathways and genes inside the pathways can be inferred and visualized.

Maintained by Noriaki Sato. Last updated 5 months ago.

visualization bayesian geneexpression networkinference pathways reactome network networkenrichment genesetenrichment

7.0 match 62 stars 6.27 score 9 scripts

bioc

GIGSEA:Genotype Imputed Gene Set Enrichment Analysis

We presented the Genotype-imputed Gene Set Enrichment Analysis (GIGSEA), a novel method that uses GWAS-and-eQTL-imputed trait-associated differential gene expression to interrogate gene set enrichment for the trait-associated SNPs. By incorporating eQTL from large gene expression studies, e.g. GTEx, GIGSEA appropriately addresses such challenges for SNP enrichment as gene size, gene boundary, SNP distal regulation, and multiple-marker regulation. The weighted linear regression model, taking as weights both imputation accuracy and model completeness, was used to perform the enrichment test, properly adjusting the bias due to redundancy in different gene sets. The permutation test, furthermore, is used to evaluate the significance of enrichment, whose efficiency can be largely elevated by expressing the computational intensive part in terms of large matrix operation. We have shown the appropriate type I error rates for GIGSEA (<5%), and the preliminary results also demonstrate its good performance to uncover the real signal.

Maintained by Shijia Zhu. Last updated 5 months ago.

genesetenrichment snp variantannotation geneexpression generegulation regression differentialexpression

10.2 match 4.30 score 2 scripts

bioc

gep2pep:Creation and Analysis of Pathway Expression Profiles (PEPs)

Pathway Expression Profiles (PEPs) are based on the expression of pathways (defined as sets of genes) as opposed to individual genes. This package converts gene expression profiles to PEPs and performs enrichment analysis of both pathways and experimental conditions, such as "drug set enrichment analysis" and "gene2drug" drug discovery analysis respectively.

Maintained by Francesco Napolitano. Last updated 5 months ago.

geneexpression differentialexpression genesetenrichment dimensionreduction pathways go

9.7 match 4.48 score 4 scripts

bioc

netresponse:Functional Network Analysis

Algorithms for functional network analysis. Includes an implementation of a variational Dirichlet process Gaussian mixture model for nonparametric mixture modeling.

Maintained by Leo Lahti. Last updated 5 months ago.

cellbiology clustering geneexpression genetics network graphandnetwork differentialexpression microarray networkinference transcription

7.6 match 3 stars 5.64 score 21 scripts

bioc

BioNAR:Biological Network Analysis in R

the R package BioNAR, developed to step by step analysis of PPI network. The aim is to quantify and rank each protein’s simultaneous impact into multiple complexes based on network topology and clustering. Package also enables estimating of co-occurrence of diseases across the network and specific clusters pointing towards shared/common mechanisms.

Maintained by Anatoly Sorokin. Last updated 17 days ago.

software graphandnetwork network

7.2 match 3 stars 5.90 score 35 scripts

bioc

spatzie:Identification of enriched motif pairs from chromatin interaction data

Identifies motifs that are significantly co-enriched from enhancer-promoter interaction data. While enhancer-promoter annotation is commonly used to define groups of interaction anchors, spatzie also supports co-enrichment analysis between preprocessed interaction anchors. Supports BEDPE interaction data derived from genome-wide assays such as HiC, ChIA-PET, and HiChIP. Can also be used to look for differentially enriched motif pairs between two interaction experiments.

Maintained by Jennifer Hammelman. Last updated 5 months ago.

dna3dstructure generegulation peakdetection epigenetics functionalgenomics classification hic transcription

9.7 match 4.30 score 5 scripts

acare

hacksig:A Tidy Framework to Hack Gene Expression Signatures

A collection of cancer transcriptomics gene signatures as well as a simple and tidy interface to compute single sample enrichment scores either with the original procedure or with three alternatives: the "combined z-score" of Lee et al. (2008) <doi:10.1371/journal.pcbi.1000217>, the "single sample GSEA" of Barbie et al. (2009) <doi:10.1038/nature08460> and the "singscore" of Foroutan et al. (2018) <doi:10.1186/s12859-018-2435-4>. The 'get_sig_info()' function can be used to retrieve information about each signature implemented.

Maintained by Andrea Carenzo. Last updated 2 years ago.

gene-expression-signatures gene-set-enrichment

7.2 match 19 stars 5.71 score 27 scripts

bioc

EGSEA:Ensemble of Gene Set Enrichment Analyses

This package implements the Ensemble of Gene Set Enrichment Analyses (EGSEA) method for gene set testing. EGSEA algorithm utilizes the analysis results of twelve prominent GSE algorithms in the literature to calculate collective significance scores for each gene set.

Maintained by Monther Alhamdoosh. Last updated 5 months ago.

immunooncology differentialexpression go geneexpression genesetenrichment genetics microarray multiplecomparison onechannel pathways rnaseq sequencing software systemsbiology twochannel metabolomics proteomics kegg graphandnetwork genesignaling genetarget networkenrichment network classification

7.0 match 5.81 score 64 scripts

bioc

phenoTest:Tools to test association between gene expression and phenotype in a way that is efficient, structured, fast and scalable. We also provide tools to do GSEA (Gene set enrichment analysis) and copy number variation.

Tools to test correlation between gene expression and phenotype in a way that is efficient, structured, fast and scalable. GSEA is also provided.

Maintained by Evarist Planet. Last updated 5 months ago.

microarray differentialexpression multiplecomparison clustering classification

8.8 match 4.56 score 9 scripts 1 dependents

bioc

PanomiR:Detection of miRNAs that regulate interacting groups of pathways

PanomiR is a package to detect miRNAs that target groups of pathways from gene expression data. This package provides functionality for generating pathway activity profiles, determining differentially activated pathways between user-specified conditions, determining clusters of pathways via the PCxN package, and generating miRNAs targeting clusters of pathways. These function can be used separately or sequentially to analyze RNA-Seq data.

Maintained by Pourya Naderi. Last updated 5 months ago.

geneexpression genesetenrichment genetarget mirna pathways

8.2 match 3 stars 4.89 score 13 scripts

bioc

decoupleR:decoupleR: Ensemble of computational methods to infer biological activities from omics data

Many methods allow us to extract biological activities from omics data using information from prior knowledge resources, reducing the dimensionality for increased statistical power and better interpretability. Here, we present decoupleR, a Bioconductor package containing different statistical methods to extract these signatures within a unified framework. decoupleR allows the user to flexibly test any method with any resource. It incorporates methods that take into account the sign and weight of network interactions. decoupleR can be used with any omic, as long as its features can be linked to a biological process based on prior knowledge. For example, in transcriptomics gene sets regulated by a transcription factor, or in phospho-proteomics phosphosites that are targeted by a kinase.

Maintained by Pau Badia-i-Mompel. Last updated 5 months ago.

differentialexpression functionalgenomics geneexpression generegulation network software statisticalmethod transcription

3.4 match 230 stars 11.27 score 316 scripts 3 dependents

cran

PEIMAN2:Post-Translational Modification Enrichment, Integration, and Matching Analysis

Functions and mined database from 'UniProt' focusing on post-translational modifications to do single enrichment analysis (SEA) and protein set enrichment analysis (PSEA). Payman Nickchi, Mehdi Mirzaie, Marc Baumann, Amir Ata Saei, Mohieddin Jafari (2022) <bioRxiv:10.1101/2022.11.09.515610>.

Maintained by Payman Nickchi. Last updated 2 years ago.

14.3 match 2.70 score

umich-biostatistics

AEenrich:Adverse Event Enrichment Tests

We extend existing gene enrichment tests to perform adverse event enrichment analysis. Unlike the continuous gene expression data, adverse event data are counts. Therefore, adverse event data has many zeros and ties. We propose two enrichment tests. One is a modified Fisher's exact test based on pre-selected significant adverse events, while the other is based on a modified Kolmogorov-Smirnov statistic. We add Covariate adjustment to improve the analysis."Adverse event enrichment tests using VAERS" Shuoran Li, Lili Zhao (2020) <arXiv:2007.02266>.

Maintained by Michael Kleinsasser. Last updated 2 years ago.

11.0 match 3 stars 3.48 score 1 scripts

bioc

gINTomics:Multi-Omics data integration

gINTomics is an R package for Multi-Omics data integration and visualization. gINTomics is designed to detect the association between the expression of a target and of its regulators, taking into account also their genomics modifications such as Copy Number Variations (CNV) and methylation. What is more, gINTomics allows integration results visualization via a Shiny-based interactive app.

Maintained by Angelo Velle. Last updated 5 months ago.

geneexpression rnaseq microarray visualization copynumbervariation genetarget quarto

7.5 match 3 stars 5.08 score 3 scripts

jokergoo

CePa:Centrality-Based Pathway Enrichment

This package aims to find significant pathways through network topology information. It has several advantages compared with current pathway enrichment tools. First, pathway node instead of single gene is taken as the basic unit when analysing networks to meet the fact that genes must be constructed into complexes to hold normal functions. Second, multiple network centrality measures are applied simultaneously to measure importance of nodes from different aspects to make a full view on the biological system. CePa extends standard pathway enrichment methods, which include both over-representation analysis procedure and gene-set analysis procedure. <https://doi.org/10.1093/bioinformatics/btt008>.

Maintained by Zuguang Gu. Last updated 4 years ago.

5.8 match 3 stars 6.53 score 75 scripts

bioc

traseR:GWAS trait-associated SNP enrichment analyses in genomic intervals

traseR performs GWAS trait-associated SNP enrichment analyses in genomic intervals using different hypothesis testing approaches, also provides various functionalities to explore and visualize the results.

Maintained by li chen. Last updated 5 months ago.

genetics sequencing coverage alignment qualitycontrol dataimport

11.4 match 3.30 score 3 scripts

welch-lab

rliger:Linked Inference of Genomic Experimental Relationships

Uses an extension of nonnegative matrix factorization to identify shared and dataset-specific factors. See Welch J, Kozareva V, et al (2019) <doi:10.1016/j.cell.2019.05.006>, and Liu J, Gao C, Sodicoff J, et al (2020) <doi:10.1038/s41596-020-0391-8> for more details.

Maintained by Yichen Wang. Last updated 2 months ago.

nonnegative-matrix-factorization single-cell openblas cpp

3.4 match 402 stars 10.80 score 334 scripts 1 dependents

bioc

Repitools:Epigenomic tools

Tools for the analysis of enrichment-based epigenomic data. Features include summarization and visualization of epigenomic data across promoters according to gene expression context, finding regions of differential methylation/binding, BayMeth for quantifying methylation etc.

Maintained by Mark Robinson. Last updated 5 months ago.

dnamethylation geneexpression methylseq

6.1 match 5.90 score 267 scripts

moseleybioinformaticslab

categoryCompare2:Meta-Analysis of High-Throughput Experiments Using Feature Annotations

Facilitates comparison of significant annotations (categories) generated on one or more feature lists. Interactive exploration is facilitated through the use of RCytoscape (heavily suggested).

Maintained by Robert M Flight. Last updated 5 months ago.

annotation go multiplecomparison pathways geneexpression bioconductor bioinformatics gene-annotation gene-expression gene-sets

14.5 match 1 stars 2.48 score 9 scripts

bioc

simona:Semantic Similarity on Bio-Ontologies

This package implements infrastructures for ontology analysis by offering efficient data structures, fast ontology traversal methods, and elegant visualizations. It provides a robust toolbox supporting over 70 methods for semantic similarity analysis.

Maintained by Zuguang Gu. Last updated 5 months ago.

software annotation go biomedicalinformatics cpp

5.4 match 16 stars 6.59 score 27 scripts 1 dependents

bioc

categoryCompare:Meta-analysis of high-throughput experiments using feature annotations

Calculates significant annotations (categories) in each of two (or more) feature (i.e. gene) lists, determines the overlap between the annotations, and returns graphical and tabular data about the significant annotations and which combinations of feature lists the annotations were found to be significant. Interactive exploration is facilitated through the use of RCytoscape (heavily suggested).

Maintained by Robert M. Flight. Last updated 5 months ago.

annotation go multiplecomparison pathways geneexpression bioconductor

5.3 match 6 stars 6.68 score

hanjunwei-lab

DTSEA:Drug Target Set Enrichment Analysis

It is a novel tool used to identify the candidate drugs against a particular disease based on the drug target set enrichment analysis. It assumes the most effective drugs are those with a closer affinity in the protein-protein interaction network to the specified disease. (See Gómez-Carballa et al. (2022) <doi: 10.1016/j.envres.2022.112890> and Feng et al. (2022) <doi: 10.7150/ijms.67815> for disease expression profiles; see Wishart et al. (2018) <doi: 10.1093/nar/gkx1037> and Gaulton et al. (2017) <doi: 10.1093/nar/gkw1074> for drug target information; see Kanehisa et al. (2021) <doi: 10.1093/nar/gkaa970> for the details of KEGG database.)

Maintained by Junwei Han. Last updated 2 years ago.

8.1 match 4.32 score 42 scripts

plangfelder

WGCNA:Weighted Correlation Network Analysis

Functions necessary to perform Weighted Correlation Network Analysis on high-dimensional data as originally described in Horvath and Zhang (2005) <doi:10.2202/1544-6115.1128> and Langfelder and Horvath (2008) <doi:10.1186/1471-2105-9-559>. Includes functions for rudimentary data cleaning, construction of correlation networks, module identification, summarization, and relating of variables and modules to sample traits. Also includes a number of utility functions for data manipulation and visualization.

Maintained by Peter Langfelder. Last updated 6 months ago.

cpp

3.6 match 54 stars 9.65 score 5.3k scripts 32 dependents

hanjunwei-lab

ssMutPA:Single-Sample Mutation-Based Pathway Analysis

A systematic bioinformatics tool to perform single-sample mutation-based pathway analysis by integrating somatic mutation data with the Protein-Protein Interaction (PPI) network. In this method, we use local and global weighted strategies to evaluate the effects of network genes from mutations according to the network topology and then calculate the mutation-based pathway enrichment score (ssMutPES) to reflect the accumulated effect of mutations of each pathway. Subsequently, the ssMutPES profiles are used for unsupervised spectral clustering to identify cancer subtypes.

Maintained by Junwei Han. Last updated 5 months ago.

8.5 match 4.00 score 9 scripts

bioc

ttgsea:Tokenizing Text of Gene Set Enrichment Analysis

Functional enrichment analysis methods such as gene set enrichment analysis (GSEA) have been widely used for analyzing gene expression data. GSEA is a powerful method to infer results of gene expression data at a level of gene sets by calculating enrichment scores for predefined sets of genes. GSEA depends on the availability and accuracy of gene sets. There are overlaps between terms of gene sets or categories because multiple terms may exist for a single biological process, and it can thus lead to redundancy within enriched terms. In other words, the sets of related terms are overlapping. Using deep learning, this pakage is aimed to predict enrichment scores for unique tokens or words from text in names of gene sets to resolve this overlapping set issue. Furthermore, we can coin a new term by combining tokens and find its enrichment score by predicting such a combined tokens.

Maintained by Dongmin Jung. Last updated 5 months ago.

software geneexpression genesetenrichment

6.9 match 4.95 score 3 scripts 3 dependents

bioc

phosphonormalizer:Compensates for the bias introduced by median normalization in

It uses the overlap between enriched and non-enriched datasets to compensate for the bias introduced in global phosphorylation after applying median normalization.

Maintained by Sohrab Saraei. Last updated 5 months ago.

software statisticalmethod workflowstep normalization proteomics

9.4 match 3.60 score 2 scripts

bioc

ChIPseeker:ChIPseeker for ChIP peak Annotation, Comparison, and Visualization

This package implements functions to retrieve the nearest genes around the peak, annotate genomic region of the peak, statstical methods for estimate the significance of overlap among ChIP peak data sets, and incorporate GEO database for user to compare the own dataset with those deposited in database. The comparison can be used to infer cooperative regulation and thus can be used to generate hypotheses. Several visualization functions are implemented to summarize the coverage of the peak experiment, average profile and heatmap of peaks binding to TSS regions, genomic annotation, distance to TSS, and overlap of peaks or genes.

Maintained by Guangchuang Yu. Last updated 5 months ago.

annotation chipseq software visualization multiplecomparison atac-seq chip-seq comparison epigenetics epigenomics

2.6 match 234 stars 13.02 score 1.6k scripts 5 dependents

bioc

ATACseqQC:ATAC-seq Quality Control

ATAC-seq, an assay for Transposase-Accessible Chromatin using sequencing, is a rapid and sensitive method for chromatin accessibility analysis. It was developed as an alternative method to MNase-seq, FAIRE-seq and DNAse-seq. Comparing to the other methods, ATAC-seq requires less amount of the biological samples and time to process. In the process of analyzing several ATAC-seq dataset produced in our labs, we learned some of the unique aspects of the quality assessment for ATAC-seq data.To help users to quickly assess whether their ATAC-seq experiment is successful, we developed ATACseqQC package partially following the guideline published in Nature Method 2013 (Greenleaf et al.), including diagnostic plot of fragment size distribution, proportion of mitochondria reads, nucleosome positioning pattern, and CTCF or other Transcript Factor footprints.

Maintained by Jianhong Ou. Last updated 2 months ago.

sequencing dnaseq atacseq generegulation qualitycontrol coverage nucleosomepositioning immunooncology

4.7 match 7.12 score 146 scripts 1 dependents

bioc

CelliD:Unbiased Extraction of Single Cell gene signatures using Multiple Correspondence Analysis

CelliD is a clustering-free multivariate statistical method for the robust extraction of per-cell gene signatures from single-cell RNA-seq. CelliD allows unbiased cell identity recognition across different donors, tissues-of-origin, model organisms and single-cell omics protocols. The package can also be used to explore functional pathways enrichment in single cell data.

Maintained by Akira Cortal. Last updated 5 months ago.

rnaseq singlecell dimensionreduction clustering genesetenrichment geneexpression atacseq openblas cpp openmp

6.8 match 4.85 score 70 scripts

bioc

motifTestR:Perform key tests for binding motifs in sequence data

Taking a set of sequence motifs as PWMs, test a set of sequences for over-representation of these motifs, as well as any positional features within the set of motifs. Enrichment analysis can be undertaken using multiple statistical approaches. The package also contains core functions to prepare data for analysis, and to visualise results.

Maintained by Stevie Pederson. Last updated 7 days ago.

motifannotation chipseq chiponchip sequencematching software

6.7 match 1 stars 4.90 score 2 scripts

bioc

rmspc:Multiple Sample Peak Calling

The rmspc package runs MSPC (Multiple Sample Peak Calling) software using R. The analysis of ChIP-seq samples outputs a number of enriched regions (commonly known as "peaks"), each indicating a protein-DNA interaction or a specific chromatin modification. When replicate samples are analyzed, overlapping peaks are expected. This repeated evidence can therefore be used to locally lower the minimum significance required to accept a peak. MSPC uses combined evidence from replicated experiments to evaluate peak calling output, rescuing peaks, and reduce false positives. It takes any number of replicates as input and improves sensitivity and specificity of peak calling on each, and identifies consensus regions between the input samples.

Maintained by Meriem Bahda. Last updated 18 days ago.

chipseq sequencing chiponchip dataimport rnaseq analysis chip-seq enriched-regions genome-analysis mspc next-generation-sequencing ngs-analysis overlapping-peaks peak peaks

8.0 match 20 stars 4.08 score 5 scripts

dustinsokolowski

DEET:Differential Expression Enrichment Tool

Abstract of Manuscript. Differential gene expression analysis using RNA sequencing (RNA-seq) data is a standard approach for making biological discoveries. Ongoing large-scale efforts to process and normalize publicly available gene expression data enable rapid and systematic reanalysis. While several powerful tools systematically process RNA-seq data, enabling their reanalysis, few resources systematically recompute differentially expressed genes (DEGs) generated from individual studies. We developed a robust differential expression analysis pipeline to recompute 3162 human DEG lists from The Cancer Genome Atlas, Genotype-Tissue Expression Consortium, and 142 studies within the Sequence Read Archive. After measuring the accuracy of the recomputed DEG lists, we built the Differential Expression Enrichment Tool (DEET), which enables users to interact with the recomputed DEG lists. DEET, available through CRAN and RShiny, systematically queries which of the recomputed DEG lists share similar genes, pathways, and TF targets to their own gene lists. DEET identifies relevant studies based on shared results with the user’s gene lists, aiding in hypothesis generation and data-driven literature review. Sokolowski, Dustin J., et al. "Differential Expression Enrichment Tool (DEET): an interactive atlas of human differential gene expression." Nucleic Acids Research Genomics and Bioinformatics (2023).

Maintained by Dustin Sokolowski. Last updated 9 months ago.

11.9 match 2.70 score 2 scripts

bioc

Category:Category Analysis

A collection of tools for performing category (gene set enrichment) analysis.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

annotation go pathways genesetenrichment

3.9 match 7.93 score 183 scripts 16 dependents

bioc

gCrisprTools:Suite of Functions for Pooled Crispr Screen QC and Analysis

Set of tools for evaluating pooled high-throughput screening experiments, typically employing CRISPR/Cas9 or shRNA expression cassettes. Contains methods for interrogating library and cassette behavior within an experiment, identifying differentially abundant cassettes, aggregating signals to identify candidate targets for empirical validation, hypothesis testing, and comprehensive reporting. Version 2.0 extends these applications to include a variety of tools for contextualizing and integrating signals across many experiments, incorporates extended signal enrichment methodologies via the "sparrow" package, and streamlines many formal requirements to aid in interpretablity.

Maintained by Russell Bainer. Last updated 5 months ago.

immunooncology crispr pooledscreens experimentaldesign biomedicalinformatics cellbiology functionalgenomics pharmacogenomics pharmacogenetics systemsbiology differentialexpression genesetenrichment genetics multiplecomparison normalization preprocessing qualitycontrol rnaseq regression software visualization

6.5 match 4.78 score 8 scripts

bioc

miRspongeR:Identification and analysis of miRNA sponge regulation

This package provides several functions to explore miRNA sponge (also called ceRNA or miRNA decoy) regulation from putative miRNA-target interactions or/and transcriptomics data (including bulk, single-cell and spatial gene expression data). It provides eight popular methods for identifying miRNA sponge interactions, and an integrative method to integrate miRNA sponge interactions from different methods, as well as the functions to validate miRNA sponge interactions, and infer miRNA sponge modules, conduct enrichment analysis of miRNA sponge modules, and conduct survival analysis of miRNA sponge modules. By using a sample control variable strategy, it provides a function to infer sample-specific miRNA sponge interactions. In terms of sample-specific miRNA sponge interactions, it implements three similarity methods to construct sample-sample correlation network.

Maintained by Junpeng Zhang. Last updated 5 months ago.

geneexpression biomedicalinformatics networkenrichment survival microarray software singlecell spatial rnaseq cerna mirna sponge

5.2 match 5 stars 5.88 score 8 scripts

bioc

RCAS:RNA Centric Annotation System

RCAS is an R/Bioconductor package designed as a generic reporting tool for the functional analysis of transcriptome-wide regions of interest detected by high-throughput experiments. Such transcriptomic regions could be, for instance, signal peaks detected by CLIP-Seq analysis for protein-RNA interaction sites, RNA modification sites (alias the epitranscriptome), CAGE-tag locations, or any other collection of query regions at the level of the transcriptome. RCAS produces in-depth annotation summaries and coverage profiles based on the distribution of the query regions with respect to transcript features (exons, introns, 5'/3' UTR regions, exon-intron boundaries, promoter regions). Moreover, RCAS can carry out functional enrichment analyses and discriminative motif discovery.

Maintained by Bora Uyar. Last updated 5 months ago.

software genetarget motifannotation motifdiscovery go transcriptomics genomeannotation genesetenrichment coverage

4.9 match 6.32 score 29 scripts 1 dependents

bioc

FGNet:Functional Gene Networks derived from biological enrichment analyses

Build and visualize functional gene and term networks from clustering of enrichment analyses in multiple annotation spaces. The package includes a graphical user interface (GUI) and functions to perform the functional enrichment analysis through DAVID, GeneTerm Linker, gage (GSEA) and topGO.

Maintained by Sara Aibar. Last updated 5 months ago.

annotation go pathways genesetenrichment network visualization functionalgenomics networkenrichment clustering

6.6 match 4.62 score 5 scripts 1 dependents

bioc

spatialHeatmap:spatialHeatmap: Visualizing Spatial Assays in Anatomical Images and Large-Scale Data Extensions

The spatialHeatmap package offers the primary functionality for visualizing cell-, tissue- and organ-specific assay data in spatial anatomical images. Additionally, it provides extended functionalities for large-scale data mining routines and co-visualizing bulk and single-cell data. A description of the project is available here: https://spatialheatmap.org.

Maintained by Jianhai Zhang. Last updated 4 months ago.

spatial visualization microarray sequencing geneexpression datarepresentation network clustering graphandnetwork cellbasedassays atacseq dnaseq tissuemicroarray singlecell cellbiology genetarget

4.9 match 5 stars 6.26 score 12 scripts

sssydysss

TransProR:Analysis and Visualization of Multi-Omics Data

A tool for comprehensive transcriptomic data analysis, with a focus on transcript-level data preprocessing, expression profiling, differential expression analysis, and functional enrichment. It enables researchers to identify key biological processes, disease biomarkers, and gene regulatory mechanisms. 'TransProR' is aimed at researchers and bioinformaticians working with RNA-Seq data, providing an intuitive framework for in-depth analysis and visualization of transcriptomic datasets. The package includes comprehensive documentation and usage examples to guide users through the entire analysis pipeline. The differential expression analysis methods incorporated in the package include 'limma' (Ritchie et al., 2015, <doi:10.1093/nar/gkv007>; Smyth, 2005, <doi:10.1007/0-387-29362-0_23>), 'edgeR' (Robinson et al., 2010, <doi:10.1093/bioinformatics/btp616>), 'DESeq2' (Love et al., 2014, <doi:10.1186/s13059-014-0550-8>), and Wilcoxon tests (Li et al., 2022, <doi:10.1186/s13059-022-02648-4>), providing flexible and robust approaches to RNA-Seq data analysis. For more information, refer to the package vignettes and related publications.

Maintained by Dongyue Yu. Last updated 18 days ago.

4.0 match 174 stars 7.55 score 34 scripts

bioc

ramwas:Fast Methylome-Wide Association Study Pipeline for Enrichment Platforms

A complete toolset for methylome-wide association studies (MWAS). It is specifically designed for data from enrichment based methylation assays, but can be applied to other data as well. The analysis pipeline includes seven steps: (1) scanning aligned reads from BAM files, (2) calculation of quality control measures, (3) creation of methylation score (coverage) matrix, (4) principal component analysis for capturing batch effects and detection of outliers, (5) association analysis with respect to phenotypes of interest while correcting for top PCs and known covariates, (6) annotation of significant findings, and (7) multi-marker analysis (methylation risk score) using elastic net. Additionally, RaMWAS include tools for joint analysis of methlyation and genotype data. This work is published in Bioinformatics, Shabalin et al. (2018) <doi:10.1093/bioinformatics/bty069>.

Maintained by Andrey A Shabalin. Last updated 5 months ago.

dnamethylation sequencing qualitycontrol coverage preprocessing normalization batcheffect principalcomponent differentialmethylation visualization

5.0 match 10 stars 6.08 score 85 scripts

j-mitchel

scITD:Single-Cell Interpretable Tensor Decomposition

Single-cell Interpretable Tensor Decomposition (scITD) employs the Tucker tensor decomposition to extract multicell-type gene expression patterns that vary across donors/individuals. This tool is geared for use with single-cell RNA-sequencing datasets consisting of many source donors. The method has a wide range of potential applications, including the study of inter-individual variation at the population-level, patient sub-grouping/stratification, and the analysis of sample-level batch effects. Each "multicellular process" that is extracted consists of (A) a multi cell type gene loadings matrix and (B) a corresponding donor scores vector indicating the level at which the corresponding loadings matrix is expressed in each donor. Additional methods are implemented to aid in selecting an appropriate number of factors and to evaluate stability of the decomposition. Additional tools are provided for downstream analysis, including integration of gene set enrichment analysis and ligand-receptor analysis. Tucker, L.R. (1966) <doi:10.1007/BF02289464>. Unkel, S., Hannachi, A., Trendafilov, N. T., & Jolliffe, I. T. (2011) <doi:10.1007/s13253-011-0055-9>. Zhou, G., & Cichocki, A. (2012) <doi:10.2478/v10175-012-0051-4>.

Maintained by Jonathan Mitchel. Last updated 2 years ago.

cpp

15.3 match 1.98 score 19 scripts

sabalab

diffEnrich:Given a List of Gene Symbols, Performs Differential Enrichment Analysis

Compare functional enrichment between two experimentally-derived groups of genes or proteins (Peterson, DR., et al.(2018)) <doi: 10.1371/journal.pone.0198139>. Given a list of gene symbols, 'diffEnrich' will perform differential enrichment analysis using the Kyoto Encyclopedia of Genes and Genomes (KEGG) REST API. This package provides a number of functions that are intended to be used in a pipeline. Briefly, the user provides a KEGG formatted species id for either human, mouse or rat, and the package will download and clean species specific ENTREZ gene IDs and map them to their respective KEGG pathways by accessing KEGG's REST API. KEGG's API is used to guarantee the most up-to-date pathway data from KEGG. Next, the user will identify significantly enriched pathways from two gene sets, and finally, the user will identify pathways that are differentially enriched between the two gene sets. In addition to the analysis pipeline, this package also provides a plotting function.

Maintained by Harry Smith. Last updated 3 years ago.

7.3 match 3 stars 4.18 score 10 scripts

bioc

VplotR:Set of tools to make V-plots and compute footprint profiles

The pattern of digestion and protection from DNA nucleases such as DNAse I, micrococcal nuclease, and Tn5 transposase can be used to infer the location of associated proteins. This package contains useful functions to analyze patterns of paired-end sequencing fragment density. VplotR facilitates the generation of V-plots and footprint profiles over single or aggregated genomic loci of interest.

Maintained by Jacques Serizay. Last updated 5 months ago.

nucleosomepositioning coverage sequencing biologicalquestion atacseq alignment

5.3 match 10 stars 5.64 score 11 scripts

bioc

netZooR:Unified methods for the inference and analysis of gene regulatory networks

netZooR unifies the implementations of several Network Zoo methods (netzoo, netzoo.github.io) into a single package by creating interfaces between network inference and network analysis methods. Currently, the package has 3 methods for network inference including PANDA and its optimized implementation OTTER (network reconstruction using mutliple lines of biological evidence), LIONESS (single-sample network inference), and EGRET (genotype-specific networks). Network analysis methods include CONDOR (community detection), ALPACA (differential community detection), CRANE (significance estimation of differential modules), MONSTER (estimation of network transition states). In addition, YARN allows to process gene expresssion data for tissue-specific analyses and SAMBAR infers missing mutation data based on pathway information.

Maintained by Tara Eicher. Last updated 8 days ago.

networkinference network generegulation geneexpression transcription microarray graphandnetwork gene-regulatory-network transcription-factors

3.8 match 105 stars 7.98 score

bioc

cytoMEM:Marker Enrichment Modeling (MEM)

MEM, Marker Enrichment Modeling, automatically generates and displays quantitative labels for cell populations that have been identified from single-cell data. The input for MEM is a dataset that has pre-clustered or pre-gated populations with cells in rows and features in columns. Labels convey a list of measured features and the features' levels of relative enrichment on each population. MEM can be applied to a wide variety of data types and can compare between MEM labels from flow cytometry, mass cytometry, single cell RNA-seq, and spectral flow cytometry using RMSD.

Maintained by Jonathan Irish. Last updated 5 months ago.

proteomics systemsbiology classification flowcytometry datarepresentation dataimport cellbiology singlecell clustering

7.1 match 4.18 score 15 scripts

bioc

CEMiTool:Co-expression Modules identification Tool

The CEMiTool package unifies the discovery and the analysis of coexpression gene modules in a fully automatic manner, while providing a user-friendly html report with high quality graphs. Our tool evaluates if modules contain genes that are over-represented by specific pathways or that are altered in a specific sample group. Additionally, CEMiTool is able to integrate transcriptomic data with interactome information, identifying the potential hubs on each network.

Maintained by Helder Nakaya. Last updated 5 months ago.

geneexpression transcriptomics graphandnetwork mrnamicroarray rnaseq network networkenrichment pathways immunooncology

5.0 match 5.76 score 38 scripts

bioc

SGCP:SGCP: A semi-supervised pipeline for gene clustering using self-training approach in gene co-expression networks

SGC is a semi-supervised pipeline for gene clustering in gene co-expression networks. SGC consists of multiple novel steps that enable the computation of highly enriched modules in an unsupervised manner. But unlike all existing frameworks, it further incorporates a novel step that leverages Gene Ontology information in a semi-supervised clustering method that further improves the quality of the computed modules.

Maintained by Niloofar AghaieAbiane. Last updated 5 months ago.

geneexpression genesetenrichment networkenrichment systemsbiology classification clustering dimensionreduction graphandnetwork neuralnetwork network mrnamicroarray rnaseq visualization bioinformatics genecoexpressionnetwork graphs networkclustering networks self-training semi-supervised-learning unsupervised-learning

5.5 match 2 stars 5.12 score 44 scripts

bioc

lipidr:Data Mining and Analysis of Lipidomics Datasets

lipidr an easy-to-use R package implementing a complete workflow for downstream analysis of targeted and untargeted lipidomics data. lipidomics results can be imported into lipidr as a numerical matrix or a Skyline export, allowing integration into current analysis frameworks. Data mining of lipidomics datasets is enabled through integration with Metabolomics Workbench API. lipidr allows data inspection, normalization, univariate and multivariate analysis, displaying informative visualizations. lipidr also implements a novel Lipid Set Enrichment Analysis (LSEA), harnessing molecular information such as lipid class, total chain length and unsaturation.

Maintained by Ahmed Mohamed. Last updated 5 months ago.

lipidomics massspectrometry normalization qualitycontrol visualization bioconductor

3.8 match 29 stars 7.44 score 40 scripts

bioc

meshr:Tools for conducting enrichment analysis of MeSH

A set of annotation maps describing the entire MeSH assembled using data from MeSH.

Maintained by Koki Tsuyuzaki. Last updated 5 months ago.

annotationdata functionalannotation bioinformatics statistics annotation multiplecomparisons meshdb

6.2 match 4.56 score 9 scripts 1 dependents

bioc

SPONGE:Sparse Partial Correlations On Gene Expression

This package provides methods to efficiently detect competitive endogeneous RNA interactions between two genes. Such interactions are mediated by one or several miRNAs such that both gene and miRNA expression data for a larger number of samples is needed as input. The SPONGE package now also includes spongEffects: ceRNA modules offer patient-specific insights into the miRNA regulatory landscape.

Maintained by Markus List. Last updated 5 months ago.

geneexpression transcription generegulation networkinference transcriptomics systemsbiology regression randomforest machinelearning

5.2 match 5.36 score 38 scripts 1 dependents

bioc

scDataviz:scDataviz: single cell dataviz and downstream analyses

In the single cell World, which includes flow cytometry, mass cytometry, single-cell RNA-seq (scRNA-seq), and others, there is a need to improve data visualisation and to bring analysis capabilities to researchers even from non-technical backgrounds. scDataviz attempts to fit into this space, while also catering for advanced users. Additonally, due to the way that scDataviz is designed, which is based on SingleCellExperiment, it has a 'plug and play' feel, and immediately lends itself as flexibile and compatibile with studies that go beyond scDataviz. Finally, the graphics in scDataviz are generated via the ggplot engine, which means that users can 'add on' features to these with ease.

Maintained by Kevin Blighe. Last updated 5 months ago.

singlecell immunooncology rnaseq geneexpression transcription flowcytometry massspectrometry dataimport

4.4 match 63 stars 6.30 score 16 scripts

bioc

MEDME:Modelling Experimental Data from MeDIP Enrichment

MEDME allows the prediction of absolute and relative methylation levels based on measures obtained by MeDIP-microarray experiments

Maintained by Mattia Pelizzola. Last updated 5 months ago.

microarray cpgisland dnamethylation

6.4 match 4.30 score 2 scripts

bioc

icetea:Integrating Cap Enrichment with Transcript Expression Analysis

icetea (Integrating Cap Enrichment with Transcript Expression Analysis) provides functions for end-to-end analysis of multiple 5'-profiling methods such as CAGE, RAMPAGE and MAPCap, beginning from raw reads to detection of transcription start sites using replicates. It also allows performing differential TSS detection between group of samples, therefore, integrating the mRNA cap enrichment information with transcript expression analysis.

Maintained by Vivek Bhardwaj. Last updated 5 months ago.

immunooncology transcription geneexpression sequencing rnaseq transcriptomics differentialexpression cage expression rna-seq

5.4 match 2 stars 5.08 score 7 scripts

bioc

omXplore:Vizualization tools for 'omics' datasets with R

This package contains a collection of functions (written as shiny modules) for the visualisation and the statistical analysis of omics data. These plots can be displayed individually or embedded in a global Shiny module. Additionaly, it is possible to integrate third party modules to the main interface of the package omXplore.

Maintained by Samuel Wieczorek. Last updated 11 days ago.

software shinyapps massspectrometry datarepresentation gui qualitycontrol prostar2

5.1 match 5.40 score 23 scripts

yangzhao98

esDesign:Adaptive Enrichment Designs with Sample Size Re-Estimation

Software of 'esDesign' is developed to implement the adaptive enrichment designs with sample size re-estimation presented in Lin et al. (2021) <doi: 10.1016/j.cct.2020.106216>. In details, three-proposed trial designs are provided, including the AED1-SSR (or ES1-SSR), AED2-SSR (or ES2-SSR) and AED3-SSR (or ES3-SSR). In addition, this package also contains several widely used adaptive designs, such as the Marker Sequential Test (MaST) design proposed Freidlin et al. (2014) <doi:10.1177/1740774513503739>, the adaptive enrichment designs without early stopping (AED or ES), the sample size re-estimation procedure (SSR) based on the conditional power proposed by Proschan and Hunsberger (1995), and some useful functions. In details, we can calculate the futility and/or efficacy stopping boundaries, the sample size required, calibrate the value of the threshold of the difference between subgroup-specific test statistics, conduct the simulation studies in AED, SSR, AED1-SSR, AED2-SSR and AED3-SSR.

Maintained by Zhao Yang. Last updated 4 years ago.

21.3 match 1.28 score 19 scripts

bioc

DAPAR:Tools for the Differential Analysis of Proteins Abundance with R

The package DAPAR is a Bioconductor distributed R package which provides all the necessary functions to analyze quantitative data from label-free proteomics experiments. Contrarily to most other similar R packages, it is endowed with rich and user-friendly graphical interfaces, so that no programming skill is required (see `Prostar` package).

Maintained by Samuel Wieczorek. Last updated 5 months ago.

proteomics normalization preprocessing massspectrometry qualitycontrol go dataimport prostar1

5.0 match 2 stars 5.42 score 22 scripts 1 dependents

josesamos

geomultistar:Multidimensional Queries Enriched with Geographic Data

Multidimensional systems allow complex queries to be carried out in an easy way. The geographical dimension, together with the temporal dimension, plays a fundamental role in multidimensional systems. Through this package, vector geographic data layers can be associated to the attributes of geographic dimensions, so that the results of multidimensional queries can be obtained directly as vector layers. The multidimensional structures on which we can define the queries can be created from a flat table or imported directly using functions from this package.

Maintained by Jose Samos. Last updated 8 months ago.

5.9 match 2 stars 4.48 score 8 scripts 1 dependents