Showing 158 of 158 results
bioc
maftools:Summarize, Analyze and Visualize MAF Files
Analyze and visualize Mutation Annotation Format (MAF) files from large-scale sequencing studies. This package provides various functions to perform the most commonly used analyses in cancer genomics and to create feature-rich, customizable visualizations with minimal effort.
Maintained by Anand Mayakonda. Last updated 5 months ago.
datarepresentationdnaseqvisualizationdrivermutationvariantannotationfeatureextractionclassificationsomaticmutationsequencingfunctionalgenomicssurvivalbioinformaticscancer-genome-atlascancer-genomicsgenomicsmaf-filestcgacurlbzip2xz-utilszlib
459 stars 14.63 score 948 scripts 18 dependents
bioc
GEOquery:Get data from NCBI Gene Expression Omnibus (GEO)
The NCBI Gene Expression Omnibus (GEO) is a public repository of microarray data. Given the rich and varied nature of this resource, it is only natural to want to apply BioConductor tools to these data. GEOquery is the bridge between GEO and BioConductor.
Maintained by Sean Davis. Last updated 5 months ago.
microarraydataimportonechanneltwochannelsagebioconductorbioinformaticsdata-sciencegenomicsncbi-geo
92 stars 14.46 score 4.1k scripts 44 dependents
bioc
GOSemSim:GO-terms Semantic Similarity Measures
The semantic comparisons of Gene Ontology (GO) annotations provide quantitative ways to compute similarities between genes and gene groups, and have become an important basis for many bioinformatics analysis approaches. GOSemSim is an R package for semantic similarity computation among GO terms, sets of GO terms, gene products and gene clusters. GOSemSim implements five methods, proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively.
Maintained by Guangchuang Yu. Last updated 5 months ago.
annotationgoclusteringpathwaysnetworksoftwarebioinformaticsgene-ontologysemantic-similaritycpp
63 stars 14.12 score 708 scripts 68 dependents
bioc
dada2:Accurate, high-resolution sample inference from amplicon sequencing data
The dada2 package infers exact amplicon sequence variants (ASVs) from high-throughput amplicon sequencing data, replacing the coarser and less accurate OTU clustering approach. The dada2 pipeline takes as input demultiplexed fastq files, and outputs the sequence variants and their sample-wise abundances after removing substitution and chimera errors. Taxonomic classification is available via a native implementation of the RDP naive Bayesian classifier, and species-level assignment to 16S rRNA gene fragments by exact matching.
Maintained by Benjamin Callahan. Last updated 5 months ago.
immunooncologymicrobiomesequencingclassificationmetagenomicsampliconbioconductorbioinformaticsmetabarcodingtaxonomycpp
487 stars 13.17 score 3.0k scripts 4 dependents
bioc
EBImage:Image processing and analysis toolbox for R
EBImage provides general purpose functionality for image processing and analysis. In the context of (high-throughput) microscopy-based cellular assays, EBImage offers tools to segment cells and extract quantitative cellular descriptors. This allows the automation of such tasks using the R programming language and facilitates the use of other tools in the R environment for signal processing, statistical modeling, machine learning and visualization with image data.
Maintained by Andrzej Oleś. Last updated 5 months ago.
visualizationbioinformaticsimage-analysisimage-processingcpp
71 stars 12.77 score 1.5k scripts 33 dependents
bioc
MSnbase:Base Functions and Classes for Mass Spectrometry and Proteomics
MSnbase provides infrastructure for manipulation, processing and visualisation of mass spectrometry and proteomics data, ranging from raw to quantitative and annotated data.
Maintained by Laurent Gatto. Last updated 14 days ago.
immunooncologyinfrastructureproteomicsmassspectrometryqualitycontroldataimportbioconductorbioinformaticsmass-spectrometryproteomics-datavisualisationcpp
131 stars 12.76 score 772 scripts 36 dependents
bioc
SNPRelate:Parallel Computing Toolset for Relatedness and Principal Component Analysis of SNP Data
Genome-wide association studies (GWAS) are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed an R package, SNPRelate, to provide a binary format for single-nucleotide polymorphism (SNP) data in GWAS utilizing CoreArray Genomic Data Structure (GDS) data files. The GDS format offers efficient operations specifically designed for integers with two bits, since a SNP can occupy only two bits. SNPRelate is also designed to accelerate two key computations on SNP data using parallel computing for multi-core symmetric multiprocessing computer architectures: Principal Component Analysis (PCA) and relatedness analysis using Identity-By-Descent measures. The SNP GDS format is also used by the GWASTools package with the support of S4 classes and generic functions. The extended GDS format is implemented in the SeqArray package to support the storage of single nucleotide variations (SNVs), insertion/deletion polymorphisms (indels) and structural variation calls in whole-genome and whole-exome variant data.
Maintained by Xiuwen Zheng. Last updated 5 months ago.
infrastructuregeneticsstatisticalmethodprincipalcomponentbioinformaticsgds-formatpcasimdsnpopenblascpp
105 stars 12.57 score 1.6k scripts 19 dependents
stuart-lab
Signac:Analysis of Single-Cell Chromatin Data
A framework for the analysis and exploration of single-cell chromatin data. The 'Signac' package contains functions for quantifying single-cell chromatin data, computing per-cell quality control metrics, dimension reduction and normalization, visualization, and DNA sequence motif analysis. Reference: Stuart et al. (2021) <doi:10.1038/s41592-021-01282-5>.
Maintained by Tim Stuart. Last updated 7 months ago.
atacbioinformaticssingle-cellzlibcpp
355 stars 12.18 score 3.7k scripts 1 dependents
bioc
SeqArray:Data management of large-scale whole-genome sequence variant calls using GDS files
Data management of large-scale whole-genome sequencing variant calls with thousands of individuals: genotypic data (e.g., SNVs, indels and structural variation calls) and annotations in SeqArray GDS files are stored in an array-oriented and compressed manner, with efficient data access using the R programming language.
Maintained by Xiuwen Zheng. Last updated 5 days ago.
infrastructuredatarepresentationsequencinggeneticsbioinformaticsgds-formatsnpsnvweswgscpp
45 stars 12.11 score 1.1k scripts 9 dependents
bioc
GenomicDataCommons:NIH / NCI Genomic Data Commons Access
Programmatically access the NIH / NCI Genomic Data Commons RESTful service.
Maintained by Sean Davis. Last updated 2 months ago.
dataimportsequencingapi-clientbioconductorbioinformaticscancercore-servicesdata-sciencegenomicsncitcgavignette
87 stars 11.94 score 238 scripts 12 dependents
privefl
bigsnpr:Analysis of Massive SNP Arrays
Easy-to-use, efficient, flexible and scalable tools for analyzing massive SNP arrays. Privé et al. (2018) <doi:10.1093/bioinformatics/bty185>.
Maintained by Florian Privé. Last updated 22 days ago.
big-databioinformaticsmemory-mapped-fileparallel-computingpolygenic-scorespopulation-structure-inferencesnp-datastatistical-methodsopenblaszlibcppopenmp
200 stars 11.44 score 1.5k scripts 3 dependents
bioc
decontam:Identify Contaminants in Marker-gene and Metagenomics Sequencing Data
Simple statistical identification of contaminating sequence features in marker-gene or metagenomics data. Works on any kind of feature derived from environmental sequencing data (e.g. ASVs, OTUs, taxonomic groups, MAGs,...). Requires DNA quantitation data or sequenced negative control samples.
Maintained by Benjamin Callahan. Last updated 5 months ago.
immunooncologymicrobiomesequencingclassificationmetagenomicsampliconbioinformaticscontaminationmetabarcoding
153 stars 11.42 score 524 scripts 6 dependents
bioc
gdsfmt:R Interface to CoreArray Genomic Data Structure (GDS) Files
Provides a high-level R interface to CoreArray Genomic Data Structure (GDS) data files. GDS is portable across platforms, with a hierarchical structure to store multiple scalable array-oriented data sets with metadata information. It is suited for large-scale datasets, especially for data which are much larger than the available random-access memory. The gdsfmt package offers efficient operations specifically designed for integers of less than 8 bits, since a diploid genotype, like a single-nucleotide polymorphism (SNP), usually occupies fewer bits than a byte. Data compression and decompression are available with relatively efficient random access. A GDS file can also be read in parallel by multiple R processes, supported by the package parallel.
Maintained by Xiuwen Zheng. Last updated 13 days ago.
infrastructuredataimportbioinformaticsgds-formatgenomicscpp
18 stars 11.34 score 920 scripts 29 dependents
neuhausi
canvasXpress:Visualization Package for CanvasXpress in R
Enables creation of visualizations using the CanvasXpress framework in R. CanvasXpress is a standalone JavaScript library for reproducible research with complete tracking of data and end-user modifications stored in a single PNG image that can be played back. See <https://www.canvasxpress.org> for more information.
Maintained by Connie Brett. Last updated 12 hours ago.
analyticsbioinformaticschartchartingdashdashboarddata-analyticsdata-sciencedata-visualizationgenomicsgraphsjavascriptnetworknetwork-visualizationpythonreproducible-researchshinyvisualization
297 stars 11.28 score 145 scripts
bioc
karyoploteR:Plot customizable linear genomes displaying arbitrary data
karyoploteR creates karyotype plots of arbitrary genomes and offers a complete set of functions to plot arbitrary data on them. It mimics many R base graphics functions, coupling them with a coordinate change function that automatically maps the chromosome and data coordinates into the plot coordinates. In addition to the provided data plotting functions, it is easy to add new ones.
Maintained by Bernat Gel. Last updated 5 months ago.
visualizationcopynumbervariationsequencingcoveragednaseqchipseqmethylseqdataimportonechannelbioconductorbioinformaticsdata-visualizationgenomegenomics-visualizationplotting-in-r
306 stars 11.22 score 656 scripts 4 dependents
bioc
graphite:GRAPH Interaction from pathway Topological Environment
Graph objects from pathway topology derived from KEGG, Panther, PathBank, PharmGKB, Reactome, SMPDB and WikiPathways databases.
Maintained by Gabriele Sales. Last updated 5 months ago.
pathwaysthirdpartyclientgraphandnetworknetworkreactomekeggmetabolomicsbioinformaticsmirrorpathway-analysis
7 stars 10.17 score 122 scripts 21 dependents
bioc
singscore:Rank-based single-sample gene set scoring method
A simple single-sample gene signature scoring method that uses rank-based statistics to analyze the sample's gene expression profile. It scores the expression activities of gene sets at a single-sample level.
Maintained by Malvika Kharbanda. Last updated 5 months ago.
softwaregeneexpressiongenesetenrichmentbioinformatics
41 stars 10.03 score 124 scripts 4 dependents
nanxstats
protr:Generating Various Numerical Representation Schemes for Protein Sequences
Comprehensive toolkit for generating various numerical features of protein sequences described in Xiao et al. (2015) <DOI:10.1093/bioinformatics/btv042>. For full functionality, the software 'ncbi-blast+' is needed, see <https://blast.ncbi.nlm.nih.gov/doc/blast-help/downloadblastdata.html> for more information.
Maintained by Nan Xiao. Last updated 7 months ago.
bioinformaticsfeature-engineeringfeature-extractionmachine-learningpeptidesprotein-sequencessequence-analysis
52 stars 10.02 score 173 scripts 3 dependents
bioc
splatter:Simple Simulation of Single-cell RNA Sequencing Data
Splatter is a package for the simulation of single-cell RNA sequencing count data. It provides a simple interface for creating complex simulations that are reproducible and well-documented. Parameters can be estimated from real data and functions are provided for comparing real and simulated datasets.
Maintained by Luke Zappia. Last updated 4 months ago.
singlecellrnaseqtranscriptomicsgeneexpressionsequencingsoftwareimmunooncologybioconductorbioinformaticsscrna-seqsimulation
224 stars 9.92 score 424 scripts 1 dependents
bioc
scMerge:scMerge: Merging multiple batches of scRNA-seq data
Like all gene expression data, single-cell data suffers from batch effects and other unwanted variation that make accurate biological interpretation difficult. The scMerge method leverages factor analysis, stably expressed genes (SEGs) and (pseudo-) replicates to remove unwanted variation and merge multiple single-cell datasets. This package contains all the necessary functions in the scMerge pipeline, including the identification of SEGs, replication-identification methods, and merging of single-cell data.
Maintained by Yingxin Lin. Last updated 5 months ago.
batcheffectgeneexpressionnormalizationrnaseqsequencingsinglecellsoftwaretranscriptomicsbioinformaticssingle-cell
67 stars 9.52 score 137 scripts 1 dependents
immunomind
immunarch:Bioinformatics Analysis of T-Cell and B-Cell Immune Repertoires
A comprehensive framework for bioinformatics exploratory analysis of bulk and single-cell T-cell receptor and antibody repertoires. It provides seamless data loading, analysis and visualisation for AIRR (Adaptive Immune Receptor Repertoire) data, both bulk immunosequencing (RepSeq) and single-cell sequencing (scRNAseq). Immunarch implements most of the widely used AIRR analysis methods, such as: clonality analysis, estimation of repertoire similarities in distribution of clonotypes and gene segments, repertoire diversity analysis, annotation of clonotypes using external immune receptor databases and clonotype tracking in vaccination and cancer studies. A successor to our previously published 'tcR' immunoinformatics package (Nazarov 2015) <doi:10.1186/s12859-015-0613-1>.
Maintained by Vadim I. Nazarov. Last updated 1 year ago.
airr-analysisb-cell-receptorbcrbcr-repertoirebioinformaticsigig-repertoireimmune-repertoireimmune-repertoire-analysisimmune-repertoire-dataimmunoglobulinimmunoinformaticsimmunologyrep-seqrepertoire-analysissingle-cellsingle-cell-analysist-cell-receptortcrtcr-repertoirecpp
316 stars 9.49 score 203 scripts
shixiangwang
sigminer:Extract, Analyze and Visualize Mutational Signatures for Genomic Variations
Genomic alterations including single nucleotide substitution, copy number alteration, etc. are the major force for cancer initialization and development. Due to the specificity of molecular lesions caused by genomic alterations, we can generate characteristic alteration spectra, called 'signature' (Wang, Shixiang, et al. (2021) <DOI:10.1371/journal.pgen.1009557> & Alexandrov, Ludmil B., et al. (2020) <DOI:10.1038/s41586-020-1943-3> & Steele Christopher D., et al. (2022) <DOI:10.1038/s41586-022-04738-6>). This package helps users to extract, analyze and visualize signatures from genomic alteration records, thus providing new insight into cancer study.
Maintained by Shixiang Wang. Last updated 6 months ago.
bayesian-nmfbioinformaticscancer-researchcnvcopynumber-signaturescosmic-signaturesdbseasy-to-useindelmutational-signaturesnmfnmf-extractionsbssignature-extractionsomatic-mutationssomatic-variantsvisualizationcpp
150 stars 9.48 score 123 scripts 2 dependents
bioc
rWikiPathways:rWikiPathways - R client library for the WikiPathways API
Use this package to interface with the WikiPathways API. It provides programmatic access to WikiPathways content in multiple data and image formats, including official monthly release files and convenient GMT read/write functions.
Maintained by Egon Willighagen. Last updated 5 months ago.
visualizationgraphandnetworkthirdpartyclientnetworkmetabolomicsbioinformaticsdata-accesspathways
15 stars 9.23 score 131 scripts 3 dependents
dosorio
Peptides:Calculate Indices and Theoretical Physicochemical Properties of Protein Sequences
Includes functions to calculate several physicochemical properties and indices for amino-acid sequences as well as to read and plot 'XVG' output files from the 'GROMACS' molecular dynamics package.
Maintained by Daniel Osorio. Last updated 1 year ago.
bioinformaticscalculate-indicespeptidesprotein-sequencesqsarcpp
82 stars 9.14 score 245 scripts 7 dependents
bioc
sesame:SEnsible Step-wise Analysis of DNA MEthylation BeadChips
Tools for analyzing Illumina Infinium DNA methylation arrays. SeSAMe provides utilities to support analyses of multiple generations of Infinium DNA methylation BeadChips, including preprocessing, quality control, visualization and inference. SeSAMe features accurate detection calling, inference of ethnicity and sex, and advanced quality control routines.
Maintained by Wanding Zhou. Last updated 3 months ago.
dnamethylationmethylationarraypreprocessingqualitycontrolbioinformaticsdna-methylationmicroarray
69 stars 9.08 score 258 scripts 1 dependents
bioc
cmapR:CMap Tools in R
The Connectivity Map (CMap) is a massive resource of perturbational gene expression profiles built by researchers at the Broad Institute and funded by the NIH Library of Integrated Network-Based Cellular Signatures (LINCS) program. Please visit https://clue.io for more information. The cmapR package implements methods to parse, manipulate, and write common CMap data objects, such as annotated matrices and collections of gene sets.
Maintained by Ted Natoli. Last updated 5 months ago.
dataimportdatarepresentationgeneexpressionbioconductorbioinformaticscmap
90 stars 8.86 score 298 scripts
bioc
CellBench:Construct Benchmarks for Single Cell Analysis Methods
This package contains infrastructure for benchmarking analysis methods and access to single cell mixture benchmarking data. It provides a framework for organising analysis methods and testing combinations of methods in a pipeline without explicitly laying out each combination. It also provides utilities for sampling and filtering SingleCellExperiment objects, constructing lists of functions with varying parameters, and multithreaded evaluation of analysis methods.
Maintained by Shian Su. Last updated 5 months ago.
softwareinfrastructuresinglecellbenchmarkbioinformatics
31 stars 8.73 score 98 scripts
ropensci
UCSCXenaTools:Download and Explore Datasets from UCSC Xena Data Hubs
Download and explore datasets from UCSC Xena data hubs, which are a collection of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE, and others. Databases are normalized so they can be combined, linked, filtered, explored and downloaded.
Maintained by Shixiang Wang. Last updated 5 months ago.
api-clientbioinformaticsccledownloadericgctcgatoiltreehouseucscucsc-xena
106 stars 8.55 score 163 scripts 1 dependents
bioc
BgeeDB:Annotation and gene expression data retrieval from Bgee database. TopAnat, an anatomical entities Enrichment Analysis tool for UBERON ontology
A package for the annotation and gene expression data download from Bgee database, and TopAnat analysis: GO-like enrichment of anatomical terms, mapped to genes by expression patterns.
Maintained by Julien Wollbrett. Last updated 5 months ago.
softwaredataimportsequencinggeneexpressionmicroarraygogenesetenrichmentbioinformaticsenrichment-analysisrna-seqscrna-seqsingle-cell
15 stars 8.46 score 19 scripts 1 dependents
bioc
piano:Platform for integrative analysis of omics data
Piano performs gene set analysis using various statistical methods, from different gene level statistics and a wide range of gene-set collections. Furthermore, the Piano package contains functions for combining the results of multiple runs of gene set analyses.
Maintained by Leif Varemo Wigge. Last updated 5 months ago.
microarraypreprocessingqualitycontroldifferentialexpressionvisualizationgeneexpressiongenesetenrichmentpathwaysbioconductorbioconductor-packagebioinformaticsgene-set-enrichmenttranscriptomics
13 stars 8.30 score 183 scripts 7 dependents
bioc
HIBAG:HLA Genotype Imputation with Attribute Bagging
Imputes HLA classical alleles using GWAS SNP data, and it relies on a training set of HLA and SNP genotypes. HIBAG can be used by researchers with published parameter estimates instead of requiring access to large training sample datasets. It combines the concepts of attribute bagging, an ensemble classifier method, with haplotype inference for SNPs and HLA types. Attribute bagging is a technique which improves the accuracy and stability of classifier ensembles using bootstrap aggregating and random variable selection.
Maintained by Xiuwen Zheng. Last updated 4 months ago.
geneticsstatisticalmethodbioinformaticsgpuhlaimputationmhcsnpcpp
30 stars 8.24 score 48 scripts
bioc
hypeR:An R Package For Geneset Enrichment Workflows
An R Package for Geneset Enrichment Workflows.
Maintained by Anthony Federico. Last updated 5 months ago.
genesetenrichmentannotationpathwaysbioinformaticscomputational-biologygeneset-enrichment-analysis
76 stars 8.22 score 145 scripts
branchlab
metasnf:Meta Clustering with Similarity Network Fusion
Framework to facilitate patient subtyping with similarity network fusion and meta clustering. The similarity network fusion (SNF) algorithm was introduced by Wang et al. (2014) in <doi:10.1038/nmeth.2810>. SNF is a data integration approach that can transform high-dimensional and diverse data types into a single similarity network suitable for clustering with minimal loss of information from each initial data source. The meta clustering approach was introduced by Caruana et al. (2006) in <doi:10.1109/ICDM.2006.103>. Meta clustering involves generating a wide range of cluster solutions by adjusting clustering hyperparameters, then clustering the solutions themselves into a manageable number of qualitatively similar solutions, and finally characterizing representative solutions to find ones that are best for the user's specific context. This package provides a framework to easily transform multi-modal data into a wide range of similarity network fusion-derived cluster solutions as well as to visualize, characterize, and validate those solutions. Core package functionality includes easy customization of distance metrics, clustering algorithms, and SNF hyperparameters to generate diverse clustering solutions; calculation and plotting of associations between features, between patients, and between cluster solutions; and standard cluster validation approaches including resampled measures of cluster stability, standard metrics of cluster quality, and label propagation to evaluate generalizability in unseen data. Associated vignettes guide the user through using the package to identify patient subtypes while adhering to best practices for unsupervised learning.
Maintained by Prashanth S Velayudhan. Last updated 3 days ago.
bioinformaticsclusteringmetaclusteringsnf
8 stars 8.21 score 30 scripts
bioc
POMA:Tools for Omics Data Analysis
The POMA package offers a comprehensive toolkit designed for omics data analysis, streamlining the process from initial visualization to final statistical analysis. Its primary goal is to simplify and unify the various steps involved in omics data processing, making it more accessible and manageable within a single, intuitive R package. Emphasizing reproducibility and user-friendliness, POMA leverages the standardized SummarizedExperiment class from Bioconductor, ensuring seamless integration and compatibility with a wide array of Bioconductor tools. This approach guarantees maximum flexibility and replicability, making POMA an essential asset for researchers handling omics datasets. See https://github.com/pcastellanoescuder/POMAShiny. Paper: Castellano-Escuder et al. (2021) <doi:10.1371/journal.pcbi.1009148> for more details.
Maintained by Pol Castellano-Escuder. Last updated 4 months ago.
batcheffectclassificationclusteringdecisiontreedimensionreductionmultidimensionalscalingnormalizationpreprocessingprincipalcomponentregressionrnaseqsoftwarestatisticalmethodvisualizationbioconductorbioinformaticsdata-visualizationdimension-reductionexploratory-data-analysismachine-learningomics-data-integrationpipelinepre-processingstatistical-analysisuser-friendlyworkflow
11 stars 8.16 score 20 scripts 1 dependents
bioc
rBLAST:R Interface for the Basic Local Alignment Search Tool
Seamlessly interfaces the Basic Local Alignment Search Tool (BLAST) running locally to search genetic sequence data bases. This work was partially supported by grant no. R21HG005912 from the National Human Genome Research Institute.
Maintained by Michael Hahsler. Last updated 3 months ago.
geneticssequencingsequencematchingalignmentdataimportbioconductorbioinformaticsblast-search
106 stars 8.07 score 106 scripts
uscbiostats
slurmR:A Lightweight Wrapper for 'Slurm'
'Slurm', Simple Linux Utility for Resource Management <https://slurm.schedmd.com/>, is a popular 'Linux' based software used to schedule jobs in 'HPC' (High Performance Computing) clusters. This R package provides a specialized lightweight wrapper of 'Slurm' with a syntax similar to that found in the 'parallel' R package. The package also includes a method for creating socket cluster objects spanning multiple nodes that can be used with the 'parallel' package.
Maintained by George Vega Yon. Last updated 1 year ago.
60 stars 8.07 score 216 scripts 1 dependents
ms609
Quartet:Comparison of Phylogenetic Trees Using Quartet and Split Measures
Calculates the number of four-taxon subtrees consistent with a pair of cladograms, calculating the symmetric quartet distance of Bandelt & Dress (1986), Reconstructing the shape of a tree from observed dissimilarity data, Advances in Applied Mathematics, 7, 309-343 <doi:10.1016/0196-8858(86)90038-2>, and using the tqDist algorithm of Sand et al. (2014), tqDist: a library for computing the quartet and triplet distances between binary or general trees, Bioinformatics, 30, 2079–2080 <doi:10.1093/bioinformatics/btu157> for pairs of binary trees.
Maintained by Martin R. Smith. Last updated 4 days ago.
bioinformaticscomparisonphylogenetic-treesphylogeneticsquartetquartet-distanceresearch-tooltreecpp
14 stars 8.00 score 40 scripts
oganm
homologene:Quick Access to Homologene and Gene Annotation Updates
A wrapper for the homologene database by the National Center for Biotechnology Information ('NCBI'). It allows searching for gene homologs across species. Data in this package can be found at <ftp://ftp.ncbi.nih.gov/pub/HomoloGene/build68/>. The package also includes an updated version of the homologene database where gene identifiers and symbols are replaced with their latest (at the time of submission) version and functions to fetch latest annotation data to keep updated.
Maintained by Ogan Mancarci. Last updated 1 year ago.
bioinformaticshomologenemancarci-2017ncbi-taxonomyogan-biospecieswrapper
43 stars 7.88 score 164 scripts 4 dependents
bioc
orthogene:Interspecies gene mapping
orthogene is an R package for easy mapping of orthologous genes across hundreds of species. It pulls up-to-date gene ortholog mappings across 700+ organisms. It also provides various utility functions to aggregate/expand common objects (e.g. data.frames, gene expression matrices, lists) using 1:1, many:1, 1:many or many:many gene mappings, both within and between species.
Maintained by Brian Schilder. Last updated 5 months ago.
geneticscomparativegenomicspreprocessingphylogeneticstranscriptomicsgeneexpressionanimal-modelsbioconductorbioconductor-packagebioinformaticsbiomedicinecomparative-genomicsevolutionary-biologygenesgenomicsontologiestranslational-research
42 stars 7.85 score 31 scripts 2 dependents
bioc
Rcpi:Molecular Informatics Toolkit for Compound-Protein Interaction in Drug Discovery
A molecular informatics toolkit with an integration of bioinformatics and chemoinformatics tools for drug discovery.
Maintained by Nan Xiao. Last updated 5 months ago.
softwaredataimportdatarepresentationfeatureextractioncheminformaticsbiomedicalinformaticsproteomicsgosystemsbiologybioconductorbioinformaticsdrug-discoveryfeature-extractionfingerprintmolecular-descriptorsprotein-sequences
37 stars 7.81 score 29 scripts
bioc
PhyloProfile:PhyloProfile
PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights like the gene age estimation or core gene identification.
Maintained by Vinh Tran. Last updated 7 days ago.
softwarevisualizationdatarepresentationmultiplecomparisonfunctionalpredictiondimensionreductionbioinformaticsheatmapinteractive-visualizationsorthologsphylogenetic-profileshiny
33 stars 7.79 score 10 scripts
abbvie-external
OmicNavigator:Open-Source Software for 'Omic' Data Analysis and Visualization
A tool for interactive exploration of the results from 'omics' experiments to facilitate novel discoveries from high-throughput biology. The software includes R functions for the 'bioinformatician' to deposit study metadata and the outputs from statistical analyses (e.g. differential expression, enrichment). These results are then exported to an interactive JavaScript dashboard that can be interrogated on the user's local machine or deployed online to be explored by collaborators. The dashboard includes 'sortable' tables, interactive plots including network visualization, and fine-grained filtering based on statistical significance.
Maintained by John Blischak. Last updated 15 days ago.
bioinformaticsgenomicsomicsopencpu
34 stars 7.68 score 31 scripts
bioc
signeR:Empirical Bayesian approach to mutational signature discovery
The signeR package provides an empirical Bayesian approach to mutational signature discovery. It is designed to analyze single nucleotide variation (SNV) counts in cancer genomes, but can also be applied to other features. Functionalities to characterize signatures or genome samples according to exposure patterns are also provided.
Maintained by Renan Valieris. Last updated 5 months ago.
genomicvariationsomaticmutationstatisticalmethodvisualizationbioconductorbioinformaticsopenblascpp
13 stars 7.67 score 22 scripts
moosa-r
rbioapi:User-Friendly R Interface to Biologic Web Services' API
Currently fully supports Enrichr, JASPAR, miEAA, PANTHER, Reactome, STRING, and UniProt! The goal of rbioapi is to provide a user-friendly and consistent interface to biological databases and services, in a way that insulates the user from the technicalities of using web service APIs and creates a unified and easy-to-use interface to biological and medical web services. This is an ongoing project; new databases and services will be added periodically. Feel free to suggest any databases or services you often use.
Maintained by Moosa Rezwani. Last updated 2 months ago.
api-clientbioinformaticsbiologyenrichmentenrichment-analysisenrichrjasparmieaaover-representation-analysispantherreactomestringuniprot
20 stars 7.60 score 55 scripts
biogenies
tidysq:Tidy Processing and Analysis of Biological Sequences
A tidy approach to analysis of biological sequences. All processing and data-storage functions are heavily optimized to allow the fastest and most efficient data storage.
Maintained by Dominik Rafacz. Last updated 3 months ago.
bioconductorbioinformaticsbiological-sequencesfastas3sequencestibbletidytidyversevctrscpp
40 stars 7.56 score 38 scripts
bioc
scde:Single Cell Differential Expression
The scde package implements a set of statistical methods for analyzing single-cell RNA-seq data. scde fits individual error models for single-cell RNA-seq measurements. These models can then be used for assessment of differential expression between groups of cells, as well as other types of analysis. The scde package also contains the pagoda framework which applies pathway and gene set overdispersion analysis to identify and characterize putative cell subpopulations based on transcriptional signatures. The overall approach to the differential expression analysis is detailed in the following publication: "Bayesian approach to single-cell differential expression analysis" (Kharchenko PV, Silberstein L, Scadden DT, Nature Methods, doi: 10.1038/nmeth.2967). The overall approach to subpopulation identification and characterization is detailed in the following pre-print: "Characterizing transcriptional heterogeneity through pathway and gene set overdispersion analysis" (Fan J, Salathia N, Liu R, Kaeser G, Yung Y, Herman J, Kaper F, Fan JB, Zhang K, Chun J, and Kharchenko PV, Nature Methods, doi:10.1038/nmeth.3734).
Maintained by Evan Biederstedt. Last updated 5 months ago.
immunooncologyrnaseqstatisticalmethoddifferentialexpressionbayesiantranscriptionsoftwareanalysisbioinformaticsheterogenityngssingle-celltranscriptomicsopenblascppopenmp
173 stars 7.53 score 141 scripts
bioc
methrix:Fast and efficient summarization of generic bedGraph files from Bisulfite sequencing
Bedgraph files generated by Bisulfite pipelines often come in various flavors. A critical downstream step requires summarizing these files into methylation/coverage matrices. Methrix performs this data aggregation step and also provides many other useful downstream functions.
Maintained by Anand Mayakonda. Last updated 5 months ago.
dnamethylationsequencingcoveragebedgraphbioinformaticsdna-methylation
31 stars 7.51 score 39 scripts 1 dependent
bioc
koinar:KoinaR - Remote machine learning inference using Koina
A client to simplify fetching predictions from the Koina web service. Koina is a model repository enabling the remote execution of models. Predictions are generated as a response to HTTP/S requests, the standard protocol used for nearly all web traffic.
Maintained by Ludwig Lautenbacher. Last updated 3 months ago.
massspectrometryproteomicsinfrastructuresoftwarebioinformaticsdeep-learningmachine-learningmass-spectrometrypython
34 stars 7.49 score 4 scripts
tongzhou2017
itol.toolkit:Helper Functions for 'Interactive Tree Of Life'
The 'Interactive Tree Of Life' <https://itol.embl.de/> online server can edit and annotate trees interactively. The 'itol.toolkit' package can support all types of annotation templates.
Maintained by Tong Zhou. Last updated 4 months ago.
bioinformaticsitolvisualization
167 stars 7.48 score 60 scripts
ms609
TreeSearch:Phylogenetic Analysis with Discrete Character Data
Reconstruct phylogenetic trees from discrete data. Inapplicable character states are handled using the algorithm of Brazeau, Guillerme and Smith (2019) <doi:10.1093/sysbio/syy083> with the "Morphy" library, under equal or implied step weights. Contains a "shiny" user interface for interactive tree search and exploration of results, including character visualization, rogue taxon detection, tree space mapping, and cluster consensus trees (Smith 2022a, b) <doi:10.1093/sysbio/syab099>, <doi:10.1093/sysbio/syab100>. Profile Parsimony (Faith and Trueman, 2001) <doi:10.1080/10635150118627>, Successive Approximations (Farris, 1969) <doi:10.2307/2412182> and custom optimality criteria are implemented.
Maintained by Martin R. Smith. Last updated 3 days ago.
bioinformaticsmorphological-analysisphylogeneticsresearch-tooltree-searchcpp
7 stars 7.44 score 51 scripts
bodkan
admixr:An Interface for Running 'ADMIXTOOLS' Analyses
An interface for performing all stages of 'ADMIXTOOLS' analyses (<https://reich.hms.harvard.edu/software>) entirely from R. Wrapper functions (D, f4, f3, etc.) completely automate the generation of intermediate configuration files, run 'ADMIXTOOLS' programs on the command-line, and parse output files to extract values of interest. This allows users to focus on the analysis itself instead of worrying about low-level technical details. A set of complementary functions for processing and filtering of data in the 'EIGENSTRAT' format is also provided.
Maintained by Martin Petr. Last updated 1 month ago.
bioinformaticspopgenpopulation-genetics
29 stars 7.42 score 91 scripts
bioc
netSmooth:Network smoothing for scRNAseq
netSmooth is an R package for network smoothing of single cell RNA sequencing data. Using biological networks such as protein-protein interactions as priors for gene co-expression, netSmooth improves cell type identification from noisy, sparse scRNAseq data.
Maintained by Jonathan Ronen. Last updated 5 months ago.
networkgraphandnetworksinglecellrnaseqgeneexpressionsequencingtranscriptomicsnormalizationpreprocessingclusteringdimensionreductionbioinformaticsgenomicssingle-cell
27 stars 7.41 score 4 scripts
bioc
sevenbridges:Seven Bridges Platform API Client and Common Workflow Language Tool Builder in R
R client and utilities for Seven Bridges platform API, from Cancer Genomics Cloud to other Seven Bridges supported platforms.
Maintained by Phil Webster. Last updated 5 months ago.
softwaredataimportthirdpartyclientapi-clientbioconductorbioinformaticscloudcommon-workflow-languagesevenbridges
35 stars 7.40 score 24 scripts
bioc
cogena:co-expressed gene-set enrichment analysis
cogena is a workflow for co-expressed gene-set enrichment analysis. It aims to discover smaller scale, but highly correlated cellular events that may be of great biological relevance. A novel pipeline for drug discovery and drug repositioning based on the cogena workflow is proposed. Particularly, candidate drugs can be predicted based on the gene expression of disease-related data, or other similar drugs can be identified based on the gene expression of drug-related data. Moreover, the drug mode of action can be disclosed by the associated pathway analysis. In summary, cogena is a flexible workflow for various gene set enrichment analyses for co-expressed genes, with a focus on pathway/GO analysis and drug repositioning.
Maintained by Zhilong Jia. Last updated 5 months ago.
clusteringgenesetenrichmentgeneexpressionvisualizationpathwayskegggomicroarraysequencingsystemsbiologydatarepresentationdataimportbioconductorbioinformatics
12 stars 7.36 score 32 scripts
bioc
NormalyzerDE:Evaluation of normalization methods and calculation of differential expression analysis statistics
NormalyzerDE provides screening of normalization methods for LC-MS based expression data. It calculates a range of normalized matrices using both existing approaches and a novel time-segmented approach, calculates performance measures and generates an evaluation report. Furthermore, it provides an easy utility for Limma- or ANOVA- based differential expression analysis.
Maintained by Jakob Willforss. Last updated 5 months ago.
normalizationmultiplecomparisonvisualizationbayesianproteomicsmetabolomicsdifferentialexpressionbioconductorbioinformaticslimma
22 stars 7.30 score 38 scripts 1 dependent
bioc
tidytof:Analyze High-dimensional Cytometry Data Using Tidy Data Principles
This package implements an interactive, scientific analysis pipeline for high-dimensional cytometry data built using tidy data principles. It is specifically designed to play well with both the tidyverse and Bioconductor software ecosystems, with functionality for reading/writing data files, data cleaning, preprocessing, clustering, visualization, modeling, and other quality-of-life functions. tidytof implements a "grammar" of high-dimensional cytometry data analysis.
Maintained by Timothy Keyes. Last updated 5 months ago.
singlecellflowcytometrybioinformaticscytometrydata-sciencesingle-celltidyversecpp
18 stars 7.24 score 35 scripts
lvulliard
BioCircos:Interactive Circular Visualization of Genomic Data using 'htmlwidgets' and 'BioCircos.js'
Implement in 'R' interactive Circos-like visualizations of genomic data, to map information such as genetic variants, genomic fusions and aberrations to a circular genome, as proposed by the 'JavaScript' library 'BioCircos.js', based on the 'JQuery' and 'D3' technologies. The output is by default displayed in stand-alone HTML documents or in the 'RStudio' viewer pane. Moreover it can be integrated in 'R Markdown' documents and 'Shiny' applications.
Maintained by Loan Vulliard. Last updated 6 years ago.
biocircosbioinformaticscircoscircos-graphshtmlwidgetsshiny
37 stars 6.98 score 58 scripts
lvclark
polyRAD:Genotype Calling with Uncertainty from Sequencing Data in Polyploids and Diploids
Read depth data from genotyping-by-sequencing (GBS) or restriction site-associated DNA sequencing (RAD-seq) are imported and used to make Bayesian probability estimates of genotypes in polyploids or diploids. The genotype probabilities, posterior mean genotypes, or most probable genotypes can then be exported for downstream analysis. 'polyRAD' is described by Clark et al. (2019) <doi:10.1534/g3.118.200913>, and the Hind/He statistic for marker filtering is described by Clark et al. (2022) <doi:10.1186/s12859-022-04635-9>. A variant calling pipeline for highly duplicated genomes is also included and is described by Clark et al. (2020, Version 1) <doi:10.1101/2020.01.11.902890>.
Maintained by Lindsay V. Clark. Last updated 20 days ago.
bioinformaticsdna-sequencinggenotype-likelihoodsgenotyping-by-sequencinghacktoberfestrad-seqrad-sequencingsnp-genotypingcpp
28 stars 6.98 score 85 scripts
bioc
isobar:Analysis and quantitation of isobarically tagged MSMS proteomics data
isobar provides methods for preprocessing, normalization, and report generation for the analysis of quantitative mass spectrometry proteomics data labeled with isobaric tags, such as iTRAQ and TMT. Features modules for integrating and validating PTM-centric datasets (isobar-PTM). More information on http://www.ms-isobar.org.
Maintained by Florian P Breitwieser. Last updated 5 months ago.
immunooncologyproteomicsmassspectrometrybioinformaticsmultiplecomparisonsqualitycontrol
10 stars 6.96 score 19 scripts
alexisvdb
singleCellHaystack:A Universal Differential Expression Prediction Tool for Single-Cell and Spatial Genomics Data
One key exploratory analysis step in single-cell genomics data analysis is the prediction of features with different activity levels. For example, we want to predict differentially expressed genes (DEGs) in single-cell RNA-seq data, spatial DEGs in spatial transcriptomics data, or differentially accessible regions (DARs) in single-cell ATAC-seq data. 'singleCellHaystack' predicts differentially active features in single cell omics datasets without relying on the clustering of cells into arbitrary clusters. 'singleCellHaystack' uses Kullback-Leibler divergence to find features (e.g., genes, genomic regions, etc) that are active in subsets of cells that are non-randomly positioned inside an input space (such as 1D trajectories, 2D tissue sections, multi-dimensional embeddings, etc). For the theoretical background of 'singleCellHaystack' we refer to our original paper Vandenbon and Diez (Nature Communications, 2020) <doi:10.1038/s41467-020-17900-3> and our update Vandenbon and Diez (Scientific Reports, 2023) <doi:10.1038/s41598-023-38965-2>.
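The Kullback-Leibler divergence mentioned above can be illustrated with a minimal Python sketch. This is not the 'singleCellHaystack' R API or its actual procedure; the function names, the four-region layout, and the cell counts are hypothetical and serve only to show how a feature concentrated in one part of the input space scores higher than one spread like the reference.

```python
import math


def normalize(counts):
    """Turn non-negative counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """D_KL(P || Q) = sum_i p_i * log(p_i / q_i), in nats."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0:
            if q_i <= 0:
                raise ValueError("q must be positive wherever p is positive")
            total += p_i * math.log(p_i / q_i)
    return total


# Hypothetical data: number of cells falling into four regions of an embedding.
reference = normalize([25, 25, 25, 25])   # all cells
gene_a = normalize([5, 5, 5, 5])          # cells expressing gene A: spread like the reference
gene_b = normalize([17, 1, 1, 1])         # cells expressing gene B: concentrated in one region

score_a = kl_divergence(gene_a, reference)
score_b = kl_divergence(gene_b, reference)
```

Here `score_a` is zero because gene A follows the reference distribution, while `score_b` is positive because gene B is non-randomly positioned.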
Maintained by Alexis Vandenbon. Last updated 1 year ago.
bioinformaticscite-seqpseudotimescatac-seqsingle-cellspatial-proteomicsspatial-transcriptomicstranscriptomics
81 stars 6.71 score 64 scripts
bioc
bioassayR:Cross-target analysis of small molecule bioactivity
bioassayR is a computational tool that enables simultaneous analysis of thousands of bioassay experiments performed over a diverse set of compounds and biological targets. Unique features include support for large-scale cross-target analyses of both public and custom bioassays, generation of high throughput screening fingerprints (HTSFPs), and an optional preloaded database that provides access to a substantial portion of publicly available bioactivity data.
Maintained by Thomas Girke. Last updated 5 months ago.
immunooncologymicrotitreplateassaycellbasedassaysvisualizationinfrastructuredataimportbioinformaticsproteomicsmetabolomics
5 stars 6.70 score 46 scripts
zilong-li
vcfppR:Rapid Manipulation of the Variant Call Format (VCF)
The 'vcfpp.h' (<https://github.com/Zilong-Li/vcfpp>) provides an easy-to-use 'C++' 'API' of 'htslib', offering full functionality for manipulating Variant Call Format (VCF) files. The 'vcfppR' package serves as the R bindings of the 'vcfpp.h' library, enabling rapid processing of both compressed and uncompressed VCF files. Explore a range of powerful features for efficient VCF data manipulation.
Maintained by Zilong Li. Last updated 14 days ago.
bioinformaticsfastrhtslibpopulation-geneticspopulation-genomicsvcfvcf-datavisulizationlibdeflatezlibbzip2xz-utilscurlcpp
13 stars 6.70 score 16 scripts
person-c
easybio:Comprehensive Single-Cell Annotation and Transcriptomic Analysis Toolkit
Provides a comprehensive toolkit for single-cell annotation with the 'CellMarker2.0' database (see Xia Li, Peng Wang, Yunpeng Zhang (2023) <doi:10.1093/nar/gkac947>). Streamlines biological label assignment in single-cell RNA-seq data and facilitates transcriptomic analysis, including preparation of TCGA <https://portal.gdc.cancer.gov/> and GEO <https://www.ncbi.nlm.nih.gov/geo/> datasets, differential expression analysis and visualization of enrichment analysis results. Additional utility functions support various bioinformatics workflows. See Wei Cui (2024) <doi:10.1101/2024.09.14.609619> for more details.
Maintained by Wei Cui. Last updated 25 days ago.
limmageoqueryedgerfgseabioinformaticscellmarker2gsearna-seqsingle-cell
10 stars 6.62 score 35 scripts
bioc
BioCor:Functional similarities
Calculates functional similarities based on the pathways described on KEGG and REACTOME or in gene sets. These similarities can be calculated for pathways or gene sets, genes, or clusters and combined with other similarities. They can be used to improve networks, gene selection, testing relationships...
Maintained by Lluís Revilla Sancho. Last updated 5 months ago.
statisticalmethodclusteringgeneexpressionnetworkpathwaysnetworkenrichmentsystemsbiologybioconductor-packagesbioinformaticsfunctional-similaritygenegene-setspathway-analysissimilaritysimilarity-measurement
14 stars 6.59 score
bioc
pathlinkR:Analyze and interpret RNA-Seq results
pathlinkR is an R package designed to facilitate analysis of RNA-Seq results. Specifically, our aim with pathlinkR was to provide a number of tools which take a list of DE genes and perform different analyses on them, aiding with the interpretation of results. Functions are included to perform pathway enrichment, with multiple databases supported, and tools for visualizing these results. Genes can also be used to create and plot protein-protein interaction networks, all from inside of R.
Maintained by Travis Blimkie. Last updated 3 months ago.
genesetenrichmentnetworkpathwaysreactomernaseqnetworkenrichmentbioinformaticsnetworkspathway-enrichment-analysisvisualization
28 stars 6.59 score 2 scripts
bioc
CiteFuse:CiteFuse: multi-modal analysis of CITE-seq data
The CiteFuse package implements a suite of methods and tools for CITE-seq data from pre-processing to integrative analytics, including doublet detection, network-based modality integration, cell type clustering, differential RNA and protein expression analysis, ADT evaluation, ligand-receptor interaction analysis, and interactive web-based visualisation of the analyses.
Maintained by Yingxin Lin. Last updated 5 months ago.
singlecellgeneexpressionbioinformaticssingle-cellcpp
27 stars 6.59 score 18 scripts
rmgpanw
gtexr:Query the GTEx Portal API
A convenient R interface to the Genotype-Tissue Expression (GTEx) Portal API. For more information on the API, see <https://gtexportal.org/api/v2/redoc>.
Maintained by Alasdair Warwick. Last updated 6 months ago.
api-wrapperbioinformaticseqtlgtexsqtl
6 stars 6.49 score 5 scripts
bioc
doubletrouble:Identification and classification of duplicated genes
doubletrouble aims to identify duplicated genes from whole-genome protein sequences and classify them based on their modes of duplication. The duplication modes are i. segmental duplication (SD); ii. tandem duplication (TD); iii. proximal duplication (PD); iv. transposed duplication (TRD) and; v. dispersed duplication (DD). Transposon-derived duplicates (TRD) can be further subdivided into rTRD (retrotransposon-derived duplication) and dTRD (DNA transposon-derived duplication). If users want a simpler classification scheme, duplicates can also be classified into SD- and SSD-derived (small-scale duplication) gene pairs. Besides classifying gene pairs, users can also classify genes, so that each gene is assigned a unique mode of duplication. Users can also calculate substitution rates per substitution site (i.e., Ka and Ks) from duplicate pairs, find peaks in Ks distributions with Gaussian Mixture Models (GMMs), and classify gene pairs into age groups based on Ks peaks.
Maintained by Fabrício Almeida-Silva. Last updated 15 days ago.
softwarewholegenomecomparativegenomicsfunctionalgenomicsphylogeneticsnetworkclassificationbioinformaticscomparative-genomicsgene-duplicationmolecular-evolutionwhole-genome-duplication
23 stars 6.44 score 17 scripts
bioc
artMS:Analytical R tools for Mass Spectrometry
artMS provides a set of tools for the analysis of proteomics label-free datasets. It takes as input the MaxQuant search result output (evidence.txt file) and performs quality control, relative quantification using MSstats, downstream analysis and integration. artMS also provides a set of functions to re-format the results and make them compatible with other analytical tools, including SAINTq, SAINTexpress, Phosfate, and PHOTON. Check <http://artms.org> for details.
Maintained by David Jimenez-Morales. Last updated 5 months ago.
proteomicsdifferentialexpressionbiomedicalinformaticssystemsbiologymassspectrometryannotationqualitycontrolgenesetenrichmentclusteringnormalizationimmunooncologymultiplecomparisonanalysisanalyticalap-msbioconductorbioinformaticsmass-spectrometryphosphoproteomicspost-translational-modificationquantitative-analysis
14 stars 6.41 score 13 scripts
bioc
ontoProc:processing of ontologies of anatomy, cell lines, and so on
Support harvesting of diverse bioinformatic ontologies, making particular use of the ontologyIndex package on CRAN. We provide snapshots of key ontologies for terms about cells, cell lines, chemical compounds, and anatomy, to help analyze genome-scale experiments, particularly cell x compound screens. Another purpose is to strengthen development of compelling use cases for richer interfaces to emerging ontologies.
Maintained by Vincent Carey. Last updated 15 days ago.
infrastructuregobioinformaticsgenomicsontology
3 stars 6.37 score 75 scripts 2 dependents
ethanbass
chromatographR:Chromatographic Data Analysis Toolset
Tools for high-throughput analysis of HPLC-DAD/UV chromatograms (or similar data). Includes functions for preprocessing, alignment, peak-finding and fitting, peak-table construction, data-visualization, etc. Preprocessing and peak-table construction follow the rough formula laid out in 'alsace' (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C., 2015. <doi:10.1093/bioinformatics/btv299>). Alignment of chromatograms is available using parametric time warping (as implemented in the 'ptw' package) (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C. 2015. <doi:10.1093/bioinformatics/btv299>) or variable penalty dynamic time warping (as implemented in 'VPdtw') (Clifford, D., & Stone, G. 2012. <doi:10.18637/jss.v047.i08>). Peak-finding uses the algorithm by Tom O'Haver <https://terpconnect.umd.edu/~toh/spectrum/PeakFindingandMeasurement.htm>. Peaks are then fitted to a Gaussian or exponential-Gaussian hybrid peak shape using non-linear least squares (Lan, K. & Jorgenson, J. W. 2001. <doi:10.1016/S0021-9673(01)00594-5>). See the vignette for more details and a suggested workflow.
Maintained by Ethan Bass. Last updated 12 days ago.
bioinformaticscheminformaticschromatographygc-fidhplchplc-dadhplc-pdahplv-uvmetabolomicsopen-dataopen-sciencereproducibilityreproducible-research
18 stars 6.36 score 8 scripts 1 dependent
sbg
sevenbridges2:The 'Seven Bridges Platform' API Client
R client and utilities for 'Seven Bridges Platform' API, from 'Cancer Genomics Cloud' to other 'Seven Bridges' supported platforms. API documentation is hosted publicly at <https://docs.sevenbridges.com/docs/the-api>.
Maintained by Marko Trifunovic. Last updated 3 days ago.
api-clientbioinformaticscloudsevenbridges
3 stars 6.32 score 4 scripts
pcruniversum
RDML:Importing Real-Time Thermo Cycler (qPCR) Data from RDML Format Files
Imports real-time thermo cycler (qPCR) data from Real-time PCR Data Markup Language (RDML) and transforms to the appropriate formats of the 'qpcR' and 'chipPCR' packages. Contains a dendrogram visualization for the structure of RDML object and GUI for RDML editing.
Maintained by Konstantin A. Blagodatskikh. Last updated 7 months ago.
21 stars 6.26 score 58 scripts 1 dependent
bioc
CopyNumberPlots:Create Copy-Number Plots using karyoploteR functionality
CopyNumberPlots provides a set of functions extending karyoploteR's functionality to create beautiful, customizable and flexible plots of copy-number related data.
Maintained by Bernat Gel. Last updated 5 months ago.
visualizationcopynumbervariationcoverageonechanneldataimportsequencingdnaseqbioconductorbioconductor-packagebioinformaticscopy-number-variationgenomicsgenomics-visualization
6 stars 6.24 score 16 scripts 2 dependents
bioc
scruff:Single Cell RNA-Seq UMI Filtering Facilitator (scruff)
A pipeline which processes single cell RNA-seq (scRNA-seq) reads from CEL-seq and CEL-seq2 protocols. It demultiplexes scRNA-seq FASTQ files, aligns reads to a reference genome using Rsubread, and generates a UMI-filtered count matrix. It also provides visualizations of read alignments and pre- and post-alignment QC metrics.
Maintained by Zhe Wang. Last updated 5 months ago.
softwaretechnologysequencingalignmentrnaseqsinglecellworkflowsteppreprocessingqualitycontrolvisualizationimmunooncologybioinformaticsscrna-seqsingle-cellumi
8 stars 6.20 score 22 scripts
cogdisreslab
drugfindR:Investigate iLINCS for candidate repurposable drugs
This package provides a convenient way to access the LINCS Signatures available in the iLINCS database. These signatures include Consensus Gene Knockdown Signatures, Gene Overexpression signatures and Chemical Perturbagen Signatures. It also provides a way to enter your own transcriptomic signatures and identify concordant and discordant signatures in the LINCS database.
Maintained by Ali Sajid Imami. Last updated 3 days ago.
lincsilincsdrug repurposingdrug discoverytranscriptomicsgene expressiongene knockdowngene overexpressionchemical perturbagendrugfindrbioinformaticsbioinformatics-pipeline
8 stars 6.19 score 145 scripts
bioc
martini:GWAS Incorporating Networks
martini deals with the low power inherent to GWAS studies by using prior knowledge represented as a network. SNPs are the vertices of the network, and the edges represent biological relationships between them (genomic adjacency, belonging to the same gene, physical interaction between protein products). The network is scanned using SConES, which looks for groups of SNPs maximally associated with the phenotype, that form a close subnetwork.
Maintained by Hector Climente-Gonzalez. Last updated 5 months ago.
softwaregenomewideassociationsnpgeneticvariabilitygeneticsfeatureextractiongraphandnetworknetworkbioinformaticsgenomicsgwasnetwork-analysissnpssystems-biologycpp
4 stars 6.16 score 30 scripts
myles-lewis
glmmSeq:General Linear Mixed Models for Gene-Level Differential Expression
Using mixed effects models to analyse longitudinal gene expression can highlight differences between sample groups over time. The most widely used differential gene expression tools are unable to fit linear mixed effect models, and are less optimal for analysing longitudinal data. This package provides negative binomial and Gaussian mixed effects models to fit gene expression and other biological data across repeated samples. This is particularly useful for investigating changes in RNA-Sequencing gene expression between groups of individuals over time, as described in: Rivellese, F., Surace, A. E., Goldmann, K., Sciacca, E., Cubuk, C., Giorli, G., ... Lewis, M. J., & Pitzalis, C. (2022) Nature medicine <doi:10.1038/s41591-022-01789-0>.
Maintained by Myles Lewis. Last updated 2 months ago.
bioinformaticsdifferential-gene-expressiongene-expressionglmmmixed-modelstranscriptomics
20 stars 6.13 score 45 scripts
bioc
cfDNAPro:cfDNAPro extracts and Visualises biological features from whole genome sequencing data of cell-free DNA
cfDNA fragments carry important features for building cancer sample classification ML models, such as fragment size, and fragment end motif etc. Analyzing and visualizing fragment size metrics, as well as other biological features in a curated, standardized, scalable, well-documented, and reproducible way might be time intensive. This package intends to resolve these problems and simplify the process. It offers two sets of functions for cfDNA feature characterization and visualization.
Maintained by Haichao Wang. Last updated 5 months ago.
visualizationsequencingwholegenomebioinformaticscancer-genomicscancer-researchcell-free-dnaearly-detectiongenomics-visualizationliquid-biopsyswgswhole-genome-sequencing
28 stars 6.04 score 13 scripts
mrcieu
epigraphdb:Interface Package for the 'EpiGraphDB' Platform
The interface package to access data from the 'EpiGraphDB' <https://epigraphdb.org> platform. It provides easy access to the 'EpiGraphDB' platform with functions that query the corresponding REST endpoints on the API <https://api.epigraphdb.org> and return the response data in the 'tibble' data frame format.
Maintained by Yi Liu. Last updated 3 years ago.
api-clientbioinformaticsepidemiologygraph-databasemendelian-randomizationphenotypes
27 stars 6.02 score 13 scripts
bioc
RMassBank:Workflow to process tandem MS files and build MassBank records
Workflow to process tandem MS files and build MassBank records. Functions include automated extraction of tandem MS spectra, formula assignment to tandem MS fragments, recalibration of tandem MS spectra with assigned fragments, spectrum cleanup, automated retrieval of compound information from Internet databases, and export to MassBank records.
Maintained by RMassBank at Eawag. Last updated 5 months ago.
immunooncologybioinformaticsmassspectrometrymetabolomicssoftwareopenjdk
6.02 score 26 scripts
bioc
gemma.R:A wrapper for Gemma's RESTful API to access curated gene expression data and differential expression analyses
Low- and high-level wrappers for Gemma's RESTful API. They enable access to curated expression and differential expression data from over 10,000 published studies. Gemma is a web site, database and a set of tools for the meta-analysis, re-use and sharing of genomics data, currently primarily targeted at the analysis of gene expression profiles.
Maintained by Ogan Mancarci. Last updated 4 months ago.
softwaredataimportmicroarraysinglecellthirdpartyclientdifferentialexpressiongeneexpressionbayesianannotationexperimentaldesignnormalizationbatcheffectpreprocessingbioinformaticsgemmagenomicstranscriptomics
10 stars 5.99 score 26 scripts
bioc
ClustIRR:Clustering of immune receptor repertoires
ClustIRR analyzes repertoires of B- and T-cell receptors. It starts by identifying communities of immune receptors with similar specificities, based on the sequences of their complementarity-determining regions (CDRs). Next, it employs a Bayesian probabilistic model to quantify differential community occupancy (DCO) between repertoires, allowing the identification of expanding or contracting communities in response to e.g. infection or cancer treatment.
Maintained by Simo Kitanovski. Last updated 28 days ago.
clusteringimmunooncologysinglecellsoftwareclassificationb-cell-receptorbioinformaticsimmunoinformaticsimmunologyquantitative-methodsrep-seqrepertoire-analysist-cell-receptorcpp
2 stars 5.95 score 2 scripts
bioc
vissE:Visualising Set Enrichment Analysis Results
This package enables the interpretation and analysis of results from a gene set enrichment analysis using network-based and text-mining approaches. Most enrichment analyses result in large lists of significant gene sets that are difficult to interpret. Tools in this package help build a similarity-based network of significant gene sets from a gene set enrichment analysis that can then be investigated for their biological function using text-mining approaches.
Maintained by Dharmesh D. Bhuva. Last updated 5 months ago.
softwaregeneexpressiongenesetenrichmentnetworkenrichmentnetworkbioinformatics
15 stars 5.93 score 19 scripts
bioc
cummeRbund:Analysis, exploration, manipulation, and visualization of Cufflinks high-throughput sequencing data.
Allows for persistent storage, access, exploration, and manipulation of Cufflinks high-throughput sequencing data. In addition, provides numerous plotting functions for commonly used visualizations.
Maintained by Loyal A. Goff. Last updated 5 months ago.
highthroughputsequencinghighthroughputsequencingdatarnaseqrnaseqdatageneexpressiondifferentialexpressioninfrastructuredataimportdatarepresentationvisualizationbioinformaticsclusteringmultiplecomparisonsqualitycontrol
5.92 score 209 scripts
katrionagoldmann
volcano3D:3D Volcano Plots and Polar Plots for Three-Class Data
Generates interactive plots for analysing and visualising three-class high dimensional data. It is particularly suited to visualising differences in continuous attributes such as gene/protein/biomarker expression levels between three groups. Differential gene/biomarker expression analysis between two classes is typically shown as a volcano plot. However, with three groups this type of visualisation is particularly difficult to interpret. This package generates 3D volcano plots and 3-way polar plots for easier interpretation of three-class data.
Maintained by Katriona Goldmann. Last updated 2 years ago.
bioinformaticsdifferential-expressiondifferential-expression-analysisgene-expressioninteractiveomicsplotlyrna-seqtranscriptomicsvolcanoplots
36 stars 5.90 score 37 scripts
bioc
globaltest:Testing Groups of Covariates/Features for Association with a Response Variable, with Applications to Gene Set Testing
The global test tests groups of covariates (or features) for association with a response variable. This package implements the test with diagnostic plots and multiple testing utilities, along with several functions to facilitate the use of this test for gene set testing of GO and KEGG terms.
Maintained by Jelle Goeman. Last updated 5 months ago.
microarrayonechannelbioinformaticsdifferentialexpressiongopathways
5.89 score 79 scripts 6 dependents
cox-labs
PerseusR:Perseus R Interop
Enables the interoperability between the Perseus platform for omics data analysis (Tyanova et al. 2016) <doi:10.1038/nmeth.3901> and R. It provides the foundation for developing and running Perseus plugins implemented in R by providing all required input and output handling, including data and parameter parsing as described in Rudolph and Cox 2018 <doi:10.1101/447268>.
Maintained by Jan Rudolph. Last updated 3 years ago.
bioinformaticsinteropmaxquantperseusproteomics
13 stars 5.88 score 58 scripts
mt1022
cubar:Codon Usage Bias Analysis
A suite of functions for rapid and flexible analysis of codon usage bias. It provides in-depth analysis at the codon level, including relative synonymous codon usage (RSCU), tRNA weight calculations, machine learning predictions for optimal or preferred codons, and visualization of codon-anticodon pairing. Additionally, it can calculate various gene-specific codon indices such as codon adaptation index (CAI), effective number of codons (ENC), fraction of optimal codons (Fop), tRNA adaptation index (tAI), mean codon stabilization coefficients (CSCg), and GC contents (GC/GC3s/GC4d). It also supports both standard and non-standard genetic code tables found in NCBI, as well as custom genetic code tables.
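As a rough illustration of relative synonymous codon usage (RSCU), the following Python sketch divides each codon's observed count by the count expected if all synonymous codons for that amino acid were used equally. It does not use the 'cubar' R API; the `rscu` function, the three-family codon table (a small subset of the standard genetic code), and the toy sequence are assumptions made for the example.

```python
from collections import Counter

# Assumption: a small subset of the standard genetic code, for illustration only.
SYNONYMOUS_FAMILIES = {
    "Phe": ["TTT", "TTC"],
    "Lys": ["AAA", "AAG"],
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
}


def rscu(sequence, families=SYNONYMOUS_FAMILIES):
    """Return RSCU per codon: observed count / mean count within its synonymous family."""
    usable = len(sequence) - len(sequence) % 3  # ignore a trailing partial codon
    codons = [sequence[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    result = {}
    for family in families.values():
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue  # amino acid absent from the sequence: RSCU undefined
        expected = total / len(family)
        for codon in family:
            result[codon] = counts[codon] / expected
    return result


# Hypothetical coding sequence: Phe x3, Lys x2, Ala x4.
toy_cds = "".join(["TTT", "TTT", "TTC", "AAA", "AAG", "GCT", "GCT", "GCT", "GCC"])
values = rscu(toy_cds)
```

A value above 1 means the codon is used more often than expected under equal usage (here `GCT`), and a value of 0 means a synonymous codon that never appears (here `GCA` and `GCG`).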
Maintained by Hong Zhang. Last updated 4 months ago.
bioinformaticscodon-usagemachine-learningsequence-analysis
6 stars 5.82 score 8 scripts
ineelhere
clintrialx:Connect and Work with Clinical Trials Data Sources
Are you spending too much time fetching and managing clinical trial data? Struggling with complex queries and bulk data extraction? What if you could simplify this process with just a few lines of code? Introducing 'clintrialx' - Fetch clinical trial data from sources like 'ClinicalTrials.gov' <https://clinicaltrials.gov/> and the 'Clinical Trials Transformation Initiative - Access to Aggregate Content of ClinicalTrials.gov' database <https://aact.ctti-clinicaltrials.org/>, supporting pagination and bulk downloads. Also, you can generate HTML reports based on the data obtained from the sources!
Maintained by Indraneel Chakraborty. Last updated 17 days ago.
aactbioinformaticsclinical-dataclinical-trialsclinicaltrialsgovcttidatadata-managementmedical-informaticsr-languagetrials
15 stars 5.76 score 11 scripts
bioc
CNORode:ODE add-on to CellNOptR
Logic based ordinary differential equation (ODE) add-on to CellNOptR.
Maintained by Attila Gabor. Last updated 5 months ago.
immunooncologycellbasedassayscellbiologyproteomicsbioinformaticstimecourse
5.74 score 37 scripts 1 dependent
bioc
sparrow:Take command of set enrichment analyses through a unified interface
Provides a unified interface to a variety of GSEA techniques from different Bioconductor packages. Results are harmonized into a single object and can be interrogated uniformly for quick exploration and interpretation of results. Interactive exploration of GSEA results is enabled through a shiny app provided by a sparrow.shiny sibling package.
Maintained by Steve Lianoglou. Last updated 11 days ago.
genesetenrichmentpathwaysbioinformaticsgsea
21 stars 5.74 score 13 scripts
core-bioinformatics
ClustAssess:Tools for Assessing Clustering
A set of tools for evaluating clustering robustness using proportion of ambiguously clustered pairs (Senbabaoglu et al. (2014) <doi:10.1038/srep06207>), as well as similarity across methods and method stability using element-centric clustering comparison (Gates et al. (2019) <doi:10.1038/s41598-019-44892-y>). Additionally, this package enables stability-based parameter assessment for graph-based clustering pipelines typical in single-cell data analysis.
Maintained by Andi Munteanu. Last updated 1 month ago.
softwaresinglecellrnaseqatacseqnormalizationpreprocessingdimensionreductionvisualizationqualitycontrolclusteringclassificationannotationgeneexpressiondifferentialexpressionbioinformaticsgenomicsmachine-learningparameter-optimizationrobustnesssingle-cellunsupervised-learningcpp
23 stars 5.70 score 18 scripts
pepkit
pepr:Reading Portable Encapsulated Projects
A PEP, or Portable Encapsulated Project, is a dataset that subscribes to the PEP structure for organizing metadata. Written in a simple YAML + CSV format, it is your one-stop solution to metadata management across data analysis environments. This package reads this standardized project configuration structure into R. Described in Sheffield et al. (2021) <doi:10.1093/gigascience/giab077>.
Maintained by Nathan Sheffield. Last updated 1 year ago.
3 stars 5.62 score 20 scripts
g3viz
g3viz:Interactively Visualize Genetic Mutation Data using a Lollipop-Diagram
Interface for 'g3-lollipop' 'JavaScript' library. Visualize genetic mutation data using an interactive lollipop diagram in 'RStudio' or your web browser.
Maintained by Xin Guo. Last updated 7 months ago.
bioinformaticsgenomics-visualizationlollipop-plotvariantsvisualize-mutation-data
31 stars 5.61 score 22 scripts
jmw86069
jamba:Just Analysis Methods Base
Just analysis methods ('jam') base functions focused on bioinformatics. Version- and gene-centric alphanumeric sort, unique name and version assignment, colorized console and 'HTML' output, color ramp and palette manipulation, 'Rmarkdown' cache import, styled 'Excel' worksheet import and export, interpolated raster output from smooth scatter and image plots, list to delimited vector, efficient list tools.
Maintained by James M. Ward. Last updated 21 hours ago.
6 stars 5.52 score
multimeric
TidyMultiqc:Converts 'MultiQC' Reports into Tidy Data Frames
Provides the means to convert 'multiqc_data.json' files, produced by the wonderful 'MultiQC' tool, into tidy data frames for downstream analysis in R. This analysis might involve cohort analysis, quality control visualisation, change-point detection, statistical process control, clustering, or any other type of quality analysis.
Maintained by Michael Milton. Last updated 3 years ago.
18 stars 5.50 score 35 scripts
bioc
TDbasedUFE:Tensor Decomposition Based Unsupervised Feature Extraction
A comprehensive package for tensor decomposition based unsupervised feature extraction. It is applicable to gene expression, DNA methylation, histone modification, and similar data. It can perform multiomics analysis and is also potentially applicable to single-cell omics data sets.
Maintained by Y-h. Taguchi. Last updated 5 months ago.
geneexpressionfeatureextractionmethylationarraysinglecellbioinformaticsdna-methylationgene-expression-profileshistone-modificationsmultiomicstensor-decomposition
5 stars 5.48 score 9 scripts 1 dependent
bioc
BERT:High Performance Data Integration for Large-Scale Analyses of Incomplete Omic Profiles Using Batch-Effect Reduction Trees (BERT)
Provides efficient batch-effect adjustment of data with missing values. BERT arranges all batch-effect corrections into a tree of pairwise computations, which allows parallelization over sub-trees.
Maintained by Yannis Schumann. Last updated 2 months ago.
batcheffectpreprocessingexperimentaldesignqualitycontrolbatch-effectbioconductor-packagebioinformaticsdata-integrationdata-science
2 stars 5.40 score 18 scripts
c4tb
shinyExprPortal:A Configurable 'shiny' Portal for Sharing Analysis of Molecular Expression Data
Enables deploying configuration file-based 'shiny' apps with minimal programming for interactive exploration and analysis showcase of molecular expression data. For exploration, supports visualization of correlations between rows of an expression matrix and a table of observations, such as clinical measures, and comparison of changes in expression over time. For showcase, enables visualizing the results of differential expression from packages such as 'limma', co-expression modules from 'WGCNA' and lower dimensional projections.
Maintained by Rafael Henkin. Last updated 8 months ago.
bioinformaticsdata-analysistranscriptomics
5 stars 5.30 score 8 scripts
bioc
biotmle:Targeted Learning with Moderated Statistics for Biomarker Discovery
Tools for differential expression biomarker discovery based on microarray and next-generation sequencing data that leverage efficient semiparametric estimators of the average treatment effect for variable importance analysis. Estimation and inference of the (marginal) average treatment effects of potential biomarkers are computed by targeted minimum loss-based estimation, with joint, stable inference constructed across all biomarkers using a generalization of moderated statistics for use with the estimated efficient influence function. The procedure accommodates the use of ensemble machine learning for the estimation of nuisance functions.
Maintained by Nima Hejazi. Last updated 5 months ago.
regressiongeneexpressiondifferentialexpressionsequencingmicroarrayrnaseqimmunooncologybioconductorbioconductor-packagebioconductor-packagesbioinformaticsbiomarker-discoverybiostatisticscausal-inferencecomputational-biologymachine-learningstatisticstargeted-learning
5 stars 5.30 score 5 scripts
bioc
DEWSeq:Differential Expressed Windows Based on Negative Binomial Distribution
DEWSeq is a sliding window approach for the analysis of differentially enriched binding regions in eCLIP or iCLIP next-generation sequencing data.
Maintained by bioinformatics team Hentze. Last updated 5 months ago.
sequencinggeneregulationfunctionalgenomicsdifferentialexpressionbioinformaticseclipngs-analysis
5 stars 5.30 score 4 scripts
bioc
epistack:Heatmaps of Stack Profiles from Epigenetic Signals
The main objective of the epistack package is the visualization of stacks of genomic tracks (such as, but not restricted to, ChIP-seq, ATAC-seq, DNA methylation or genomic conservation data) centered at genomic regions of interest. epistack needs three different inputs: 1) a genomic score object, such as ChIP-seq coverage or DNA methylation values, provided as a `GRanges` (easily obtained from `bigwig` or `bam` files). 2) a list of features of interest, such as peaks or transcription start sites, provided as a `GRanges` (easily obtained from `gtf` or `bed` files). 3) a score to sort the features, such as peak height or gene expression value.
Maintained by DEVAILLY Guillaume. Last updated 5 months ago.
rnaseqpreprocessingchipseqgeneexpressioncoveragebioinformatics
6 stars 5.26 score 5 scripts
piplus2
SPUTNIK:Spatially Automatic Denoising for Imaging Mass Spectrometry Toolkit
Set of tools for peak filtering of mass spectrometry imaging data based on spatial distribution of signal. Given a region-of-interest, representing the spatial region where the informative signal is expected to be localized, a series of filters determine which peak signals are characterized by an implausible spatial distribution. The filters reduce the dataset dimension and increase its information vs noise ratio, improving the quality of the unsupervised analysis results and simplifying the chemical interpretation. The methods are described in Inglese P. et al (2019) <doi:10.1093/bioinformatics/bty622>.
Maintained by Paolo Inglese. Last updated 12 months ago.
bioinformaticsdesi-msiimage-processingmaldi-msimaldi-tof-msmass-spectrometrymass-spectrometry-imaging
4 stars 5.24 score 43 scripts
wheretrue
exonr:Scientific Data Processing
This package provides a set of tools for processing scientific data. It's based on the exon Rust package.
Maintained by Trent Hauck. Last updated 2 months ago.
arrowbioinformaticsdatafusionngsproteomicsrustsqlcargo
59 stars 5.23 score 2 scripts
gitter-lab
LPWC:Lag Penalized Weighted Correlation for Time Series Clustering
Computes a time series distance measure for clustering based on weighted correlation and introduction of lags. The lags capture delayed responses in a time series dataset. The timepoints must be specified. T. Chandereng, A. Gitter (2020) <doi:10.1186/s12859-019-3324-1>.
Maintained by Thevaa Chandereng. Last updated 5 years ago.
bioinformaticsclusteringtime-series
20 stars 5.23 score 17 scripts
hautaniemilab
jellyfisher:Visualize Spatiotemporal Tumor Evolution with Jellyfish Plots
Generates interactive Jellyfish plots to visualize spatiotemporal tumor evolution by integrating sample and phylogenetic trees into a unified plot. This approach provides an intuitive way to analyze tumor heterogeneity and evolution over time and across anatomical locations. The Jellyfish plot visualization design was first introduced by Lahtinen, Lavikka, et al. (2023, <doi:10.1016/j.ccell.2023.04.017>). This package also supports visualizing ClonEvol results, a tool developed by Dang, et al. (2017, <doi:10.1093/annonc/mdx517>), for analyzing clonal evolution from multi-sample sequencing data. The 'clonevol' package is not available on CRAN but can be installed from its GitHub repository (<https://github.com/hdng/clonevol>).
Maintained by Kari Lavikka. Last updated 30 days ago.
visualizationphylogeneticssoftwarespatialbioinformaticsphylogenetic-analysistumor-evolutiontumor-heterogeneityvisualization-tool
3 stars 5.18 score 2 scripts
nanxstats
ssw:Striped Smith-Waterman Algorithm for Sequence Alignment using SIMD
Provides an R interface for 'SSW' (Striped Smith-Waterman) via its 'Python' binding 'ssw-py'. 'SSW' is a fast 'C' and 'C++' implementation of the Smith-Waterman algorithm for pairwise sequence alignment using Single-Instruction-Multiple-Data (SIMD) instructions. 'SSW' enhances the standard algorithm by efficiently returning alignment information and suboptimal alignment scores. The core 'SSW' library offers performance improvements for various bioinformatics tasks, including protein database searches, short-read alignments, primary and split-read mapping, structural variant detection, and read-overlap graph generation. These features make 'SSW' particularly useful for genomic applications. Zhao et al. (2013) <doi:10.1371/journal.pone.0082138> developed the original 'C' and 'C++' implementation.
Maintained by Nan Xiao. Last updated 6 months ago.
bioinformaticsreticulatesequence-alignmentsimdsmith-waterman
6 stars 5.18 score
bioc
flowDensity:Sequential Flow Cytometry Data Gating
This package provides tools for automated sequential gating analogous to the manual gating strategy based on the density of the data.
Maintained by Mehrnoush Malek. Last updated 5 months ago.
bioinformaticsflowcytometrycellbiologyclusteringcancerflowcytdatadatarepresentationstemcelldensitygating
5.17 score 83 scripts 3 dependents
bioc
SGCP:SGCP: A semi-supervised pipeline for gene clustering using self-training approach in gene co-expression networks
SGC is a semi-supervised pipeline for gene clustering in gene co-expression networks. SGC consists of multiple novel steps that enable the computation of highly enriched modules in an unsupervised manner. But unlike all existing frameworks, it further incorporates a novel step that leverages Gene Ontology information in a semi-supervised clustering method that further improves the quality of the computed modules.
Maintained by Niloofar AghaieAbiane. Last updated 5 months ago.
geneexpressiongenesetenrichmentnetworkenrichmentsystemsbiologyclassificationclusteringdimensionreductiongraphandnetworkneuralnetworknetworkmrnamicroarrayrnaseqvisualizationbioinformaticsgenecoexpressionnetworkgraphsnetworkclusteringnetworksself-trainingsemi-supervised-learningunsupervised-learning
2 stars 5.12 score 44 scripts
bioc
cTRAP:Identification of candidate causal perturbations from differential gene expression data
Compare differential gene expression results with those from known cellular perturbations (such as gene knock-down, overexpression or small molecules) derived from the Connectivity Map. Such analyses make it possible not only to infer the molecular causes of the observed difference in gene expression but also to identify small molecules that could drive or revert specific transcriptomic alterations.
Maintained by Nuno Saraiva-Agostinho. Last updated 5 months ago.
differentialexpressiongeneexpressionrnaseqtranscriptomicspathwaysimmunooncologygenesetenrichmentbioconductorbioinformaticscmapgene-expressionl1000
5 stars 5.08 score 16 scripts
martinloza
Canek:Batch Correction of Single Cell Transcriptome Data
Non-linear/linear hybrid method for batch-effect correction that uses Mutual Nearest Neighbors (MNNs) to identify similar cells between datasets. Reference: Loza M. et al. (NAR Genomics and Bioinformatics, 2022) <doi:10.1093/nargab/lqac022>.
Maintained by Martin Loza. Last updated 1 year ago.
batch-effectsbioinformaticssingle-cell-rna-seqtranscriptomics
5 stars 5.06 score 23 scripts
bioc
mobileRNA:mobileRNA: Investigate the RNA mobilome & population-scale changes
Genomic analysis can be utilised to identify differences between RNA populations in two conditions, both in production and abundance. This includes the identification of RNAs produced by multiple genomes within a biological system. For example, RNA produced by pathogens within a host or mobile RNAs in plant graft systems. The mobileRNA package provides methods to pre-process, analyse and visualise the sRNA and mRNA populations based on the premise of mapping reads to all genotypes at the same time.
Maintained by Katie Jeynes-Cupper. Last updated 5 months ago.
visualizationrnaseqsequencingsmallrnagenomeassemblyclusteringexperimentaldesignqualitycontrolworkflowstepalignmentpreprocessingbioinformaticsplant-science
4 stars 5.00 score 2 scripts
luciorq
condathis:Run Any CLI Tool on a 'Conda' Environment
Simplifies the execution of command line interface (CLI) tools within isolated and reproducible environments. It enables users to effortlessly manage 'Conda' environments, execute command line tools, handle dependencies, and ensure reproducibility in their data analysis workflows.
Maintained by Lucio Queiroz. Last updated 17 days ago.
bioinformaticscondareproducibilityreproducible-research
10 stars 5.00 score 1 script
nanxstats
grex:Gene ID Mapping for Genotype-Tissue Expression (GTEx) Data
Convert 'Ensembl' gene identifiers from Genotype-Tissue Expression (GTEx) data to identifiers in other annotation systems, including 'Entrez', 'HGNC', and 'UniProt'.
Maintained by Nan Xiao. Last updated 3 years ago.
bioinformaticsgene-expressiongenotype-tissue-expressiongtex
8 stars 4.96 score 23 scripts
bioc
rRDP:Interface to the RDP Classifier
This package installs and interfaces the naive Bayesian classifier for 16S rRNA sequences developed by the Ribosomal Database Project (RDP). With this package the classifier trained with the standard training set can be used or a custom classifier can be trained.
Maintained by Michael Hahsler. Last updated 5 months ago.
geneticssequencinginfrastructureclassificationmicrobiomeimmunooncologyalignmentsequencematchingdataimportbayesianbioconductorbioinformaticsopenjdk
3 stars 4.88 score 6 scripts
bioc
rTRM:Identification of Transcriptional Regulatory Modules from Protein-Protein Interaction Networks
rTRM identifies transcriptional regulatory modules (TRMs) from protein-protein interaction networks.
Maintained by Diego Diez. Last updated 5 months ago.
transcriptionnetworkgeneregulationgraphandnetworkbioconductorbioinformatics
3 stars 4.86 score 3 scripts 1 dependent
bioc
packFinder:de novo Annotation of Pack-TYPE Transposable Elements
Algorithm and tools for in silico pack-TYPE transposon discovery. Filters a given genome for properties unique to DNA transposons and provides tools for the investigation of returned matches. Sequences are input in DNAString format, and ranges are returned as a data frame (in the format returned by as.data.frame(GRanges)).
Maintained by Jack Gisby. Last updated 5 months ago.
geneticssequencematchingannotationbioinformaticstext-mining
7 stars 4.85 score 6 scripts
samilhll
macrosyntR:Draw Ordered Oxford Grids
Use standard genomics file format (BED) and a table of orthologs to illustrate synteny conservation at the genome-wide scale. Significantly conserved linkage groups are identified as described in Simakov et al. (2020) <doi:10.1038/s41559-020-1156-z> and displayed on an Oxford Grid (Edwards (1991) <doi:10.1111/j.1469-1809.1991.tb00394.x>) or a chord diagram as in Simakov et al. (2022) <doi:10.1126/sciadv.abi5884>. The package provides a function that uses a network-based greedy algorithm to find communities (Clauset et al. (2004) <doi:10.1103/PhysRevE.70.066111>) and so automatically order the chromosomes on the plot to improve interpretability.
Maintained by Sami El Hilali. Last updated 10 months ago.
bioinformaticsgenomic-visualizationsgenomics
14 stars 4.85 score 5 scripts
sysbiolab
PathwaySpace:Spatial Projection of Network Signals along Geodesic Paths
For a given graph containing vertices, edges, and a signal associated with the vertices, the 'PathwaySpace' package performs a convolution operation, which involves a weighted combination of neighboring vertices and their associated signals. The package then uses a decay function to project these signals, creating geodesic paths on a 2D-image space. 'PathwaySpace' could have various applications, such as visualizing and analyzing network data in a graphical format that highlights the relationships and signal strengths between vertices. It can be particularly useful for understanding the influence of signals through complex networks. By combining graph theory, signal processing, and visualization, the 'PathwaySpace' package provides a novel way of representing and analyzing graph data.
Maintained by Mauro Castro. Last updated 3 months ago.
bioinformaticsbiological-networksgraph
2 stars 4.85 score 5 scripts
bioc
rmelting:R Interface to MELTING 5
R interface to the MELTING 5 program (https://www.ebi.ac.uk/biomodels/tools/melting/) to compute melting temperatures of nucleic acid duplexes along with other thermodynamic parameters.
Maintained by J. Aravind. Last updated 5 months ago.
biomedicalinformaticscheminformaticsbioconductorbioinformaticsmelting-temperatureopenjdk
2 stars 4.78 score 10 scripts
moseleybioinformaticslab
visualizationQualityControl:Development of visualization methods for quality control
Provides utilities useful for quality control of high-throughput -omics datasets.
Maintained by Robert M Flight. Last updated 1 year ago.
bioinformaticscorrelationquality-controlvisualization
10 stars 4.78 score 30 scripts
andersgs
harrietr:Wrangle Phylogenetic Distance Matrices and Other Utilities
Harriet was Charles Darwin's pet tortoise (possibly). 'harrietr' implements some functions to manipulate distance matrices and phylogenetic trees to make it easier to plot with 'ggplot2' and to manipulate using 'tidyverse' tools.
Maintained by Anders Gonçalves da Silva. Last updated 7 years ago.
bioinformaticsevolutionphylogenetics
12 stars 4.78 score 50 scripts
bioc
GEOfastq:Downloads ENA Fastqs With GEO Accessions
GEOfastq is used to download fastq files from the European Nucleotide Archive (ENA) starting with an accession from the Gene Expression Omnibus (GEO). To do this, sample metadata is retrieved from GEO and the Sequence Read Archive (SRA). SRA run accessions are then used to construct FTP and aspera download links for fastq files generated by the ENA.
Maintained by Alex Pickering. Last updated 5 months ago.
rnaseqdataimportbioinformaticsfastqgene-expressiongeorna-seq
4 stars 4.60 score 6 scripts
bioc
methylInheritance:Permutation-Based Analysis associating Conserved Differentially Methylated Elements Across Multiple Generations to a Treatment Effect
Permutation analysis, based on Monte Carlo sampling, for testing the hypothesis that the number of conserved differentially methylated elements between several generations is associated with an effect inherited from a treatment and that stochastic effects can be dismissed.
Maintained by Astrid Deschênes. Last updated 5 months ago.
biologicalquestionepigeneticsdnamethylationdifferentialmethylationmethylseqsoftwareimmunooncologystatisticalmethodwholegenomesequencinganalysisbioconductorbioinformaticscpgdifferentially-methylated-elementsinheritancemonte-carlo-samplingpermutation
4.60 score 1 script
bioc
meshr:Tools for conducting enrichment analysis of MeSH
A set of annotation maps describing the entire MeSH assembled using data from MeSH.
Maintained by Koki Tsuyuzaki. Last updated 5 months ago.
annotationdatafunctionalannotationbioinformaticsstatisticsannotationmultiplecomparisonsmeshdb
4.56 score 9 scripts 1 dependent
urniaz
kmeRs:K-Mers Similarity Score Matrix and HeatMap
Similarity Score Matrix and HeatMap for nucleic and amino acid k-mers. Similarity score is evaluated by Point Accepted Mutation (PAM) and BLOcks SUbstitution Matrix (BLOSUM). The 30, 40, 70, 120, 250 and 62, 45, 50, 62, 80, 100 matrix versions are available for PAM and BLOSUM, respectively. Alignment is evaluated by local and global alignment.
Maintained by Rafal Urniaz. Last updated 7 months ago.
softwareamino-acidsbioinformaticsnucleic-acidssimilarity-matrix
4.54 score 3 scripts
sergejruff
Virusparies:Visualize and Process Output from 'VirusHunterGatherer'
A collection of tools for downstream analysis of 'VirusHunterGatherer' output. Processing of hittables and plotting of results, enabling better interpretation, is made easier with the provided functions.
Maintained by Ruff Sergej. Last updated 3 months ago.
bioinformaticsdata-drivendiscoverdiscoveryggplot2graphical-tablehidden-markov-modelhmmlearnplotr-programmingsummary-statisticsvirusvirus-discoveryvirus-scanningvirusgatherervirushuntervirushuntergatherervisualization
1 star 4.49 score 28 scripts
bioc
ggseqalign:Minimal Visualization of Sequence Alignments
Simple visualizations of alignments of DNA or AA sequences as well as arbitrary strings. Compatible with Biostrings and ggplot2. The plots are fully customizable using ggplot2 modifiers such as theme().
Maintained by Simeon Lim Rossmann. Last updated 25 days ago.
alignmentmultiplesequencealignmentsoftwarevisualizationbioinformaticsggplot2-enhancementsminimalistic
4.48 score
bioc
TDbasedUFEadv:Advanced package of tensor decomposition based unsupervised feature extraction
This is an advanced version of TDbasedUFE, which is a comprehensive package to perform tensor decomposition based unsupervised feature extraction. In contrast to TDbasedUFE, which performs simple feature selection and multiomics analyses, this package provides more complicated and advanced features that are less commonly required. Only users who require more specific features can make use of its functionality.
Maintained by Y-h. Taguchi. Last updated 5 months ago.
geneexpressionfeatureextractionmethylationarraysinglecellsoftwarebioconductor-packagebioinformaticstensor-decomposition
4.48 score 4 scripts
bioc
biobtreeR:Using biobtree tool from R
The biobtreeR package provides an interface to the [biobtree](https://github.com/tamerh/biobtree) tool, which covers a large set of bioinformatics datasets and provides search and chain-mapping functionality.
Maintained by Tamer Gur. Last updated 5 months ago.
3 stars 4.48 score 3 scripts
yannabraham
bodenmiller:Profiling of Peripheral Blood Mononuclear Cells using CyTOF
This data package contains a subset of the Bodenmiller et al, Nat Biotech 2012 dataset for testing single cell, high dimensional analysis and visualization methods.
Maintained by Yann Abraham. Last updated 4 years ago.
bioinformaticscytofdatasetscience
2 stars 4.45 score 28 scripts
kumes
chatAI4R:Chat-Based Interactive Artificial Intelligence for R
The Large Language Model (LLM) represents a groundbreaking advancement in data science and programming, and also allows us to extend the world of R. A seamless interface for integrating the 'OpenAI' Web APIs into R is provided in this package. This package leverages LLM-based AI techniques, enabling efficient knowledge discovery and data analysis (see 'OpenAI' Web APIs details <https://openai.com/blog/openai-api>). The previous functions such as seamless translation and image generation have been moved to other packages 'deepRstudio' and 'stableDiffusion4R'.
Maintained by Satoshi Kume. Last updated 1 month ago.
aibioinformaticschatgptgptimageimage-generation
14 stars 4.45 score 3 scripts
ftwkoopmans
goat:Gene Set Analysis Using the Gene Set Ordinal Association Test
Perform gene set enrichment analyses using the Gene set Ordinal Association Test (GOAT) algorithm and visualize your results. Koopmans, F. (2024) <doi:10.1038/s42003-024-06454-5>.
Maintained by Frank Koopmans. Last updated 1 month ago.
bioinformaticsgeneset-enrichmentgeneset-enrichment-analysiscppopenmp
10 stars 4.40 score 8 scripts
xueyuancao
GSDA:Gene Set Distance Analysis (GSDA)
The gene-set distance analysis of omic data is implemented by generalizing distance correlations to evaluate the association of a gene set with categorical and censored event-time variables.
Maintained by Xueyuan Cao. Last updated 4 years ago.
microarraybioinformaticsgene expression
1 star 4.30 score 8 scripts
bioc
getDEE2:Programmatic access to the DEE2 RNA expression dataset
Digital Expression Explorer 2 (or DEE2 for short) is a repository of processed RNA-seq data in the form of counts. It was designed so that researchers could undertake re-analysis and meta-analysis of published RNA-seq studies quickly and easily. As of April 2020, over 1 million SRA datasets have been processed. This package provides an R interface to access these expression data. More information about the DEE2 project can be found at the project homepage (http://dee2.io) and main publication (https://doi.org/10.1093/gigascience/giz022).
Maintained by Mark Ziemann. Last updated 2 months ago.
geneexpressiontranscriptomicssequencingbioinformaticsdata-mininggenomicsrna-expressionrna-seq
4 stars 4.20 score 5 scripts
bioc
ReducedExperiment:Containers and tools for dimensionally-reduced -omics representations
Provides SummarizedExperiment-like containers for storing and manipulating dimensionally-reduced assay data. The ReducedExperiment classes allow users to simultaneously manipulate their original dataset and their decomposed data, in addition to other method-specific outputs like feature loadings. Implements utilities and specialised classes for the application of stabilised independent component analysis (sICA) and weighted gene correlation network analysis (WGCNA).
Maintained by Jack Gisby. Last updated 2 months ago.
geneexpressioninfrastructuredatarepresentationsoftwaredimensionreductionnetworkbioconductor-packagebioinformaticsdimensionality-reduction
3 stars 4.13 score 8 scripts
luciorq
isoformic:Isoform-Level Biological Interpretation of Transcriptomic Data
Isoform-level biological interpretation of transcriptomic data.
Maintained by Lucio Rezende Queiroz. Last updated 4 months ago.
bioconductorbioinformaticstranscriptomics
22 stars 4.12 score 4 scripts
bioc
ccrepe:ccrepe_and_nc.score
The CCREPE (Compositionality Corrected by REnormalization and PErmutation) package is designed to assess the significance of general similarity measures in compositional datasets. In microbial abundance data, for example, the total abundances of all microbes sum to one; CCREPE is designed to take this constraint into account when assigning p-values to similarity measures between the microbes. The package has two functions: ccrepe: Calculates similarity measures, p-values and q-values for relative abundances of bugs in one or two body sites using bootstrap and permutation matrices of the data. nc.score: Calculates species-level co-variation and co-exclusion patterns based on an extension of the checkerboard score to ordinal data.
Maintained by Emma Schwager. Last updated 5 months ago.
immunooncologystatisticsmetagenomicsbioinformaticssoftware
4.08 score 7 scripts
sbg
biocompute:Create and Manipulate BioCompute Objects
Tools to create, validate, and export BioCompute Objects described in King et al. (2019) <doi:10.17605/osf.io/h59uh>. Users can encode information in data frames, and compose BioCompute Objects from the domains defined by the standard. A checksum validator and a JSON schema validator are provided. This package also supports exporting BioCompute Objects as JSON, PDF, HTML, or 'Word' documents, and exporting to cloud-based platforms.
Maintained by Soner Koc. Last updated 10 months ago.
biocomputebiocompute-objectsbioinformaticsscience-communicationsevenbridgesstandardizationworkflow
3 stars 4.07 score 13 scripts
vivianstats
scINSIGHT:Interpretation of Heterogeneous Single-Cell Gene Expression Data
We develop a novel matrix factorization tool named 'scINSIGHT' to jointly analyze multiple single-cell gene expression samples from biologically heterogeneous sources, such as different disease phases, treatment groups, or developmental stages. Given multiple gene expression samples from different biological conditions, 'scINSIGHT' simultaneously identifies common and condition-specific gene modules and quantifies their expression levels in each sample in a lower-dimensional space. With the factorized results, the inferred expression levels and memberships of common gene modules can be used to cluster cells and detect cell identities, and the condition-specific gene modules can help compare functional differences in transcriptomes from distinct conditions. Please also see Qian K, Fu SW, Li HW, Li WV (2022) <doi:10.1186/s13059-022-02649-3>.
Maintained by Kun Qian. Last updated 3 years ago.
bioinformaticsgene-expressionintegrationscrna-seqopenblascpp
21 stars 4.02 score 10 scripts
ambuvjyn
baseq:Basic Sequence Processing Tool for Biological Data
Primarily created as an easy and understandable way to perform basic sequence operations surrounding the central dogma of molecular biology.
Maintained by Ambu Vijayan. Last updated 2 years ago.
2 stars 4.00 score
stephenturner
kgp:1000 Genomes Project Metadata
Metadata about populations and data about samples from the 1000 Genomes Project, including the 2,504 samples sequenced for the Phase 3 release and the expanded collection of 3,202 samples with 602 additional trios. The data is described in Auton et al. (2015) <doi:10.1038/nature15393> and Byrska-Bishop et al. (2022) <doi:10.1016/j.cell.2022.08.004>, and raw data is available at <http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/>. See Turner (2022) <doi:10.48550/arXiv.2210.00539> for more details.
Maintained by Stephen Turner. Last updated 2 years ago.
1000genomesbioinformaticsgeneticsgenomicsmetadatapopulation-geneticssequencing
20 stars 4.00 score 3 scripts
cnuge
debar:A Post-Clustering Denoiser for COI-5P Barcode Data
The 'debar' sequence processing pipeline is designed for denoising high throughput sequencing data for the animal DNA barcode marker cytochrome c oxidase I (COI). The package is designed to detect and correct insertion and deletion errors within sequencer outputs. This is accomplished through comparison of input sequences against a profile hidden Markov model (PHMM) using the Viterbi algorithm (for algorithm details see Durbin et al. 1998, ISBN: 9780521629713). Inserted base pairs are removed and deleted base pairs are accounted for through the introduction of a placeholder character. Since the PHMM is a probabilistic representation of the COI barcode, corrections are not always perfect. For this reason 'debar' censors base pairs adjacent to reported indel sites, turning them into placeholder characters (default is 7 base pairs in either direction, this feature can be disabled). Testing has shown that this censorship results in the correct sequence length being restored, and erroneous base pairs being masked the vast majority of the time (>95%).
Maintained by Cameron M. Nugent. Last updated 1 year ago.
bioinformaticsdenoisingdna-barcodingdna-sequencinghidden-markov-modelmachine-learning
1 star 4.00 score 8 scripts
bioc
geneXtendeR:Optimized Functional Annotation Of ChIP-seq Data
geneXtendeR optimizes the functional annotation of ChIP-seq peaks by exploring relative differences in annotating ChIP-seq peak sets to variable-length gene bodies. In contrast to prior techniques, geneXtendeR considers peak annotations beyond just the closest gene, allowing users to see peak summary statistics for the first-closest gene, second-closest gene, ..., n-closest gene whilst ranking the output according to biologically relevant events and iteratively comparing the fidelity of peak-to-gene overlap across a user-defined range of upstream and downstream extensions on the original boundaries of each gene's coordinates. Since different ChIP-seq peak callers produce different differentially enriched peaks with a large variance in peak length distribution and total peak count, annotating peak lists with their nearest genes can often be a noisy process. As such, the goal of geneXtendeR is to robustly link differentially enriched peaks with their respective genes, thereby aiding experimental follow-up and validation in designing primers for a set of prospective gene candidates during qPCR.
Maintained by Bohdan Khomtchouk. Last updated 5 months ago.
chipseqgeneticsannotationgenomeannotationdifferentialpeakcallingcoveragepeakdetectionchiponchiphistonemodificationdataimportnaturallanguageprocessingvisualizationgosoftwarebioconductorbioinformaticscchip-seqcomputational-biologyepigeneticsfunctional-annotation
9 stars 3.95 score 5 scripts
biogenies
CancerGram:Prediction of Anticancer Peptides
Predicts anticancer peptides using random forests trained on the n-gram encoded peptides. The implemented algorithm can be accessed from both the command line and shiny-based GUI. The CancerGram model is too large for CRAN and it has to be downloaded separately from the repository: <https://github.com/BioGenies/CancerGramModel>. For more information see: Burdukiewicz et al. (2020) <doi:10.3390/pharmaceutics12111045>.
Maintained by Michal Burdukiewicz. Last updated 4 years ago.
anticancer-peptidesbioinformaticsk-mern-grampeptide-identificationrandom-forests
4 stars 3.90 score 3 scripts
mhahsler
rMSA:Interface for Popular Multiple Sequence Alignment Tools
Seamlessly interfaces the Multiple Sequence Alignment software packages ClustalW, MAFFT, MUSCLE and Kalign (downloaded separately) and provides support to calculate distances between sequences. This work was partially supported by grant no. R21HG005912 from the National Human Genome Research Institute.
Maintained by Michael Hahsler. Last updated 10 months ago.
geneticssequencinginfrastructurealignmentbioinformaticssequence-alignment
12 stars 3.78 score 7 scripts
sudoms
freqpcr:Estimates Allele Frequency on qPCR DeltaDeltaCq from Bulk Samples
Interval estimation of the population allele frequency from qPCR analysis based on the restriction enzyme digestion (RED)-DeltaDeltaCq method (Osakabe et al. 2017, <doi:10.1016/j.pestbp.2017.04.003>), as well as general DeltaDeltaCq analysis. Compatible with the Cq measurement of DNA extracted from multiple individuals at once, so-called "group-testing", this model assumes that the quantity of DNA extracted from an individual organism follows a gamma distribution. Therefore, the point estimate is robust regarding the uncertainty of the DNA yield.
Maintained by Masaaki Sudo. Last updated 3 years ago.
bioinformaticsfrequency-estimationpcrstatistics
1 star 3.70 score 4 scripts
bioc
mimager:mimager: The Microarray Imager
Easily visualize and inspect microarrays for spatial artifacts.
Maintained by Aaron Wolen. Last updated 5 months ago.
infrastructurevisualizationmicroarraybioconductorbioinformatics
3.70 score 3 scripts
seninp-bioinfo
MetaComp:EDGE Taxonomy Assignments Visualization
Implements routines for metagenome sample taxonomy assignments collection, aggregation, and visualization. Accepts the EDGE-formatted output from GOTTCHA/GOTTCHA2, BWA, Kraken, MetaPhlAn, DIAMOND, and Pangia. Produces SVG and PDF heatmap-like plots comparing taxa abundances across projects.
Maintained by Pavel Senin. Last updated 7 years ago.
bioinformaticscomparative-genomicsedgeheatmapmetagenomicsvisualization
4 stars 3.60 score 7 scripts
imamachi-n
bridger2:Genome-Wide RNA Degradation Analysis Using BRIC-Seq Data
BRIC-seq is a genome-wide approach for determining RNA stability in mammalian cells. This package provides a series of functions for performing quality checks on your BRIC-seq data, calculating the RNA half-life for each transcript, and comparing RNA half-lives between two conditions.
Maintained by Naoto Imamachi. Last updated 8 years ago.
bioinformaticsbric-seqhalf-lifengsrnarpkm-values
3 stars 3.43 score 18 scripts
bioc
tRanslatome:Comparison between multiple levels of gene expression
Detection of differentially expressed genes (DEGs) from the comparison of two biological conditions (treated vs. untreated, diseased vs. normal, mutant vs. wild-type) among different levels of gene expression (transcriptome, translatome, proteome), using several statistical methods: Rank Product, Translational Efficiency, t-test, Limma, ANOTA, DESeq, edgeR. Possibility to plot the results with scatterplots, histograms, MA plots, standard deviation (SD) plots, coefficient of variation (CV) plots. Detection of significantly enriched post-transcriptional regulatory factors (RBPs, miRNAs, etc.) and Gene Ontology terms in the lists of DEGs previously identified for the two expression levels. Comparison of GO terms enriched only in one of the levels or in both. Calculation of the semantic similarity score between the lists of enriched GO terms coming from the two expression levels. Visual examination and comparison of the enriched terms with heatmaps, radar plots and barplots.
Maintained by Toma Tebaldi. Last updated 5 months ago.
cellbiologygeneregulationregulationgeneexpressiondifferentialexpressionmicroarrayhighthroughputsequencingqualitycontrolgomultiplecomparisonsbioinformatics
3.30 score 2 scripts
bioc
mosaics:MOSAiCS (MOdel-based one and two Sample Analysis and Inference for ChIP-Seq)
This package provides functions for fitting MOSAiCS and MOSAiCS-HMM, a statistical framework to analyze one-sample or two-sample ChIP-seq data of transcription factor binding and histone modification.
Maintained by Dongjun Chung. Last updated 5 months ago.
chipseqsequencingtranscriptiongeneticsbioinformaticscpp
3.30 score 8 scripts
moseleybioinformaticslab
categoryCompare2:Meta-Analysis of High-Throughput Experiments Using Feature Annotations
Facilitates comparison of significant annotations (categories) generated on one or more feature lists. Interactive exploration is facilitated through the use of RCytoscape (heavily suggested).
Maintained by Robert M Flight. Last updated 5 months ago.
annotationgomultiplecomparisonpathwaysgeneexpressionbioconductorbioinformaticsgene-annotationgene-expressiongene-sets
1 star 2.48 score 9 scripts
cogdisreslab
BioPathNet:BioPathNet: Three Pod Analysis System
This package aims to provide a simple interface to perform the Three Pod Analysis of an RNASeq dataset. It also provides utility functions to perform the individual components.
Maintained by Ali Sajid Imami. Last updated 2 years ago.
bioinformaticsbioinformatics-pipelineilincstranscriptomics
2 stars 2.00 score 5 scripts
weiliang
MMDvariance:Detecting Differentially Variable Genes Using the Mixture of Marginal Distributions
Gene selection based on variance using the marginal distributions of gene profiles that are characterized by a mixture of three-component multivariate distributions. Please see the reference: Li X, Fu Y, Wang X, DeMeo DL, Tantisira K, Weiss ST, Qiu W. (2018) <doi:10.1155/2018/6591634>.
Maintained by Weiliang Qiu. Last updated 7 years ago.
bioinformaticsdifferentialexpression
1.00 score 2 scripts
yixinzhang-stat
eLNNpairedCov:Model-Based Gene Selection for Paired Data
Model-based clustering for paired data based on the regression of a mixture of Bayesian hierarchical models on covariates. Zhang et al. (2023) <doi:10.1186/s12859-023-05556-x>.
Maintained by Yixin Zhang. Last updated 1 year ago.
bioinformaticsdifferentialexpression
1.00 score
zhang-zeyu
countTransformers:Transform Counts in RNA-Seq Data Analysis
Provides data transformation functions to transform counts in RNA-seq data analysis. Please see the reference: Zhang Z, Yu D, Seo M, Hersh CP, Weiss ST, Qiu W. (2019) <doi:10.1038/s41598-019-41315-w>.
Maintained by Zeyu Zhang. Last updated 6 years ago.
bioinformaticsdifferentialexpression
1.00 score 10 scripts