R-universe search: topic:bioinformatics

bioc

maftools:Summarize, Analyze and Visualize MAF Files

Analyze and visualize Mutation Annotation Format (MAF) files from large-scale sequencing studies. This package provides various functions to perform the most commonly used analyses in cancer genomics and to create feature-rich, customizable visualizations with minimal effort.

Maintained by Anand Mayakonda. Last updated 5 months ago.

datarepresentation dnaseq visualization drivermutation variantannotation featureextraction classification somaticmutation sequencing functionalgenomics survival bioinformatics cancer-genome-atlas cancer-genomics genomics maf-files tcga curl bzip2 xz-utils zlib

459 stars 14.63 score 948 scripts 18 dependents

bioc

GEOquery:Get data from NCBI Gene Expression Omnibus (GEO)

The NCBI Gene Expression Omnibus (GEO) is a public repository of microarray data. Given the rich and varied nature of this resource, it is only natural to want to apply BioConductor tools to these data. GEOquery is the bridge between GEO and BioConductor.

Maintained by Sean Davis. Last updated 5 months ago.

microarray dataimport onechannel twochannel sage bioconductor bioinformatics data-science genomics ncbi-geo

92 stars 14.46 score 4.1k scripts 44 dependents

bioc

GOSemSim:GO-terms Semantic Similarity Measures

The semantic comparisons of Gene Ontology (GO) annotations provide quantitative ways to compute similarities between genes and gene groups, and have become an important basis for many bioinformatics analysis approaches. GOSemSim is an R package for semantic similarity computation among GO terms, sets of GO terms, gene products and gene clusters. GOSemSim implements five methods, proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively.

Maintained by Guangchuang Yu. Last updated 5 months ago.

annotation go clustering pathways network software bioinformatics gene-ontology semantic-similarity cpp

63 stars 14.12 score 708 scripts 68 dependents
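
To make the GOSemSim entry above more concrete, here is a minimal Python sketch of two of the information-content measures it names, Resnik and Lin. This is not GOSemSim's code or API: the toy ontology, the term names and the annotation probabilities are all hypothetical, and only the textbook formulas are shown (Resnik similarity is the information content of the most informative common ancestor; Lin normalizes that by the information content of the two terms).

```python
import math

# Hypothetical toy ontology (child -> parents); not real GO terms.
PARENTS = {"root": [], "A": ["root"], "B": ["root"], "C": ["A"], "D": ["A"]}
# Hypothetical probability that a gene is annotated to each term.
PROB = {"root": 1.0, "A": 0.5, "B": 0.5, "C": 0.25, "D": 0.25}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """IC(t) = log(1 / p(t)); rarer terms carry more information."""
    return math.log(1.0 / PROB[term])


def mica_ic(t1, t2):
    """Information content of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(information_content(t) for t in common)


def resnik(t1, t2):
    return mica_ic(t1, t2)


def lin(t1, t2):
    denom = information_content(t1) + information_content(t2)
    return 2.0 * mica_ic(t1, t2) / denom if denom > 0 else 0.0
```

In this toy setup, the sibling terms C and D share the ancestor A, so their Lin similarity is 0.5, while C and B only share the root and score 0.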

bioc

dada2:Accurate, high-resolution sample inference from amplicon sequencing data

The dada2 package infers exact amplicon sequence variants (ASVs) from high-throughput amplicon sequencing data, replacing the coarser and less accurate OTU clustering approach. The dada2 pipeline takes as input demultiplexed fastq files, and outputs the sequence variants and their sample-wise abundances after removing substitution and chimera errors. Taxonomic classification is available via a native implementation of the RDP naive Bayesian classifier, and species-level assignment to 16S rRNA gene fragments by exact matching.

Maintained by Benjamin Callahan. Last updated 5 months ago.

immunooncology microbiome sequencing classification metagenomics amplicon bioconductor bioinformatics metabarcoding taxonomy cpp

487 stars 13.17 score 3.0k scripts 4 dependents

bioc

EBImage:Image processing and analysis toolbox for R

EBImage provides general purpose functionality for image processing and analysis. In the context of (high-throughput) microscopy-based cellular assays, EBImage offers tools to segment cells and extract quantitative cellular descriptors. This allows the automation of such tasks using the R programming language and facilitates the use of other tools in the R environment for signal processing, statistical modeling, machine learning and visualization with image data.

Maintained by Andrzej Oleś. Last updated 5 months ago.

visualization bioinformatics image-analysis image-processing cpp

71 stars 12.77 score 1.5k scripts 33 dependents

bioc

MSnbase:Base Functions and Classes for Mass Spectrometry and Proteomics

MSnbase provides infrastructure for manipulation, processing and visualisation of mass spectrometry and proteomics data, ranging from raw to quantitative and annotated data.

Maintained by Laurent Gatto. Last updated 14 days ago.

immunooncology infrastructure proteomics massspectrometry qualitycontrol dataimport bioconductor bioinformatics mass-spectrometry proteomics-data visualisation cpp

131 stars 12.76 score 772 scripts 36 dependents

bioc

SNPRelate:Parallel Computing Toolset for Relatedness and Principal Component Analysis of SNP Data

Genome-wide association studies (GWAS) are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed an R package, SNPRelate, to provide a binary format for single-nucleotide polymorphism (SNP) data in GWAS utilizing CoreArray Genomic Data Structure (GDS) data files. The GDS format offers efficient operations specifically designed for two-bit integers, since a SNP can occupy only two bits. SNPRelate is also designed to accelerate two key computations on SNP data using parallel computing for multi-core symmetric multiprocessing computer architectures: Principal Component Analysis (PCA) and relatedness analysis using Identity-By-Descent measures. The SNP GDS format is also used by the GWASTools package with the support of S4 classes and generic functions. The extended GDS format is implemented in the SeqArray package to support the storage of single nucleotide variations (SNVs), insertion/deletion polymorphisms (indels) and structural variation calls in whole-genome and whole-exome variant data.

Maintained by Xiuwen Zheng. Last updated 5 months ago.

infrastructure genetics statisticalmethod principalcomponent bioinformatics gds-format pca simd snp openblas cpp

105 stars 12.57 score 1.6k scripts 19 dependents
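
The SNPRelate entry above mentions that a SNP can be stored in two bits. The following Python sketch illustrates that idea only: it packs four genotype codes into each byte. The function names, the bit order and the use of code 3 for a missing genotype are assumptions made for illustration, and this is not the actual GDS on-disk layout.

```python
def pack_genotypes(genotypes):
    """Pack genotype codes (0..3) into bytes, four codes per byte.

    Assumed coding for this sketch: 0, 1, 2 = allele count, 3 = missing.
    """
    packed = bytearray((len(genotypes) + 3) // 4)
    for i, code in enumerate(genotypes):
        if not 0 <= code <= 3:
            raise ValueError(f"genotype code out of range: {code}")
        # Each code occupies two bits; the first code uses the lowest bits.
        packed[i // 4] |= code << (2 * (i % 4))
    return bytes(packed)


def unpack_genotypes(packed, count):
    """Recover the first `count` genotype codes from packed bytes."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(count)]
```

Five genotypes fit in two bytes here, compared with five bytes if each were stored as a one-byte integer.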

stuart-lab

Signac:Analysis of Single-Cell Chromatin Data

A framework for the analysis and exploration of single-cell chromatin data. The 'Signac' package contains functions for quantifying single-cell chromatin data, computing per-cell quality control metrics, dimension reduction and normalization, visualization, and DNA sequence motif analysis. Reference: Stuart et al. (2021) <doi:10.1038/s41592-021-01282-5>.

Maintained by Tim Stuart. Last updated 7 months ago.

atac bioinformatics single-cell zlib cpp

355 stars 12.18 score 3.7k scripts 1 dependent

bioc

SeqArray:Data management of large-scale whole-genome sequence variant calls using GDS files

Data management of large-scale whole-genome sequencing variant calls with thousands of individuals: genotypic data (e.g., SNVs, indels and structural variation calls) and annotations in SeqArray GDS files are stored in an array-oriented and compressed manner, with efficient data access using the R programming language.

Maintained by Xiuwen Zheng. Last updated 5 days ago.

infrastructure datarepresentation sequencing genetics bioinformatics gds-format snp snv wes wgs cpp

45 stars 12.11 score 1.1k scripts 9 dependents

bioc

GenomicDataCommons:NIH / NCI Genomic Data Commons Access

Programmatically access the NIH / NCI Genomic Data Commons RESTful service.

Maintained by Sean Davis. Last updated 2 months ago.

dataimport sequencing api-client bioconductor bioinformatics cancer core-services data-science genomics nci tcga vignette

87 stars 11.94 score 238 scripts 12 dependents

privefl

bigsnpr:Analysis of Massive SNP Arrays

Easy-to-use, efficient, flexible and scalable tools for analyzing massive SNP arrays. Privé et al. (2018) <doi:10.1093/bioinformatics/bty185>.

Maintained by Florian Privé. Last updated 22 days ago.

big-data bioinformatics memory-mapped-file parallel-computing polygenic-scores population-structure-inference snp-data statistical-methods openblas zlib cpp openmp

200 stars 11.44 score 1.5k scripts 3 dependents

bioc

decontam:Identify Contaminants in Marker-gene and Metagenomics Sequencing Data

Simple statistical identification of contaminating sequence features in marker-gene or metagenomics data. Works on any kind of feature derived from environmental sequencing data (e.g. ASVs, OTUs, taxonomic groups, MAGs,...). Requires DNA quantitation data or sequenced negative control samples.

Maintained by Benjamin Callahan. Last updated 5 months ago.

immunooncology microbiome sequencing classification metagenomics amplicon bioinformatics contamination metabarcoding

153 stars 11.42 score 524 scripts 6 dependents

bioc

gdsfmt:R Interface to CoreArray Genomic Data Structure (GDS) Files

Provides a high-level R interface to CoreArray Genomic Data Structure (GDS) data files. GDS is portable across platforms and has a hierarchical structure for storing multiple scalable, array-oriented data sets with metadata information. It is suited for large-scale datasets, especially data that are much larger than the available random-access memory. The gdsfmt package offers efficient operations specifically designed for integers of less than 8 bits, since a diploid genotype, such as a single-nucleotide polymorphism (SNP), usually occupies fewer bits than a byte. Data compression and decompression are available with relatively efficient random access. A GDS file can also be read in parallel by multiple R processes, as supported by the parallel package.

Maintained by Xiuwen Zheng. Last updated 13 days ago.

infrastructure dataimport bioinformatics gds-format genomics cpp

18 stars 11.34 score 920 scripts 29 dependents

neuhausi

canvasXpress:Visualization Package for CanvasXpress in R

Enables creation of visualizations using the CanvasXpress framework in R. CanvasXpress is a standalone JavaScript library for reproducible research with complete tracking of data and end-user modifications stored in a single PNG image that can be played back. See <https://www.canvasxpress.org> for more information.

Maintained by Connie Brett. Last updated 12 hours ago.

analytics bioinformatics chart charting dash dashboard data-analytics data-science data-visualization genomics graphs javascript network network-visualization python reproducible-research shiny visualization

297 stars 11.28 score 145 scripts

bioc

karyoploteR:Plot customizable linear genomes displaying arbitrary data

karyoploteR creates karyotype plots of arbitrary genomes and offers a complete set of functions to plot arbitrary data on them. It mimics many R base graphics functions, coupling them with a coordinate change function that automatically maps the chromosome and data coordinates into the plot coordinates. In addition to the provided data plotting functions, it is easy to add new ones.

Maintained by Bernat Gel. Last updated 5 months ago.

visualization copynumbervariation sequencing coverage dnaseq chipseq methylseq dataimport onechannel bioconductor bioinformatics data-visualization genome genomics-visualization plotting-in-r

306 stars 11.22 score 656 scripts 4 dependents

bioc

graphite:GRAPH Interaction from pathway Topological Environment

Graph objects from pathway topology derived from KEGG, Panther, PathBank, PharmGKB, Reactome, SMPDB and WikiPathways databases.

Maintained by Gabriele Sales. Last updated 5 months ago.

pathways thirdpartyclient graphandnetwork network reactome kegg metabolomics bioinformatics mirror pathway-analysis

7 stars 10.17 score 122 scripts 21 dependents

bioc

singscore:Rank-based single-sample gene set scoring method

A simple single-sample gene signature scoring method that uses rank-based statistics to analyze the sample's gene expression profile. It scores the expression activities of gene sets at a single-sample level.

Maintained by Malvika Kharbanda. Last updated 5 months ago.

software geneexpression genesetenrichment bioinformatics

41 stars 10.03 score 124 scripts 4 dependents
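
As a rough illustration of the rank-based, single-sample idea described in the singscore entry above, the Python sketch below ranks the genes of one sample by expression and reports the mean rank of the gene set, scaled by the number of genes. It is a deliberately simplified assumption about what such a score can look like, not singscore's actual statistic or API; the function name and the gene labels are hypothetical.

```python
def rank_score(expression, gene_set):
    """Mean normalized rank of a gene set within a single sample.

    expression: dict mapping gene name -> expression value in one sample.
    gene_set:   iterable of gene names; genes absent from the sample are skipped.
    Ties are broken by the original ordering of the genes (an assumption).
    """
    ordered = sorted(expression, key=expression.get)  # lowest expression first
    rank = {gene: position for position, gene in enumerate(ordered, start=1)}
    members = [gene for gene in gene_set if gene in rank]
    if not members:
        raise ValueError("no gene from the set is present in the sample")
    mean_rank = sum(rank[gene] for gene in members) / len(members)
    return mean_rank / len(rank)
```

Because the score only needs the ranks within one sample, it can be computed without reference to any other sample in the dataset.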

nanxstats

protr:Generating Various Numerical Representation Schemes for Protein Sequences

Comprehensive toolkit for generating various numerical features of protein sequences described in Xiao et al. (2015) <DOI:10.1093/bioinformatics/btv042>. For full functionality, the software 'ncbi-blast+' is needed, see <https://blast.ncbi.nlm.nih.gov/doc/blast-help/downloadblastdata.html> for more information.

Maintained by Nan Xiao. Last updated 7 months ago.

bioinformatics feature-engineering feature-extraction machine-learning peptides protein-sequences sequence-analysis

52 stars 10.02 score 173 scripts 3 dependents

bioc

splatter:Simple Simulation of Single-cell RNA Sequencing Data

Splatter is a package for the simulation of single-cell RNA sequencing count data. It provides a simple interface for creating complex simulations that are reproducible and well-documented. Parameters can be estimated from real data and functions are provided for comparing real and simulated datasets.

Maintained by Luke Zappia. Last updated 4 months ago.

singlecell rnaseq transcriptomics geneexpression sequencing software immunooncology bioconductor bioinformatics scrna-seq simulation

224 stars 9.92 score 424 scripts 1 dependent

bioc

scMerge:scMerge: Merging multiple batches of scRNA-seq data

Like all gene expression data, single-cell data suffers from batch effects and other unwanted variations that make accurate biological interpretations difficult. The scMerge method leverages factor analysis, stably expressed genes (SEGs) and (pseudo-) replicates to remove unwanted variations and merge multiple single-cell datasets. This package contains all the necessary functions in the scMerge pipeline, including the identification of SEGs, replication-identification methods, and merging of single-cell data.

Maintained by Yingxin Lin. Last updated 5 months ago.

batcheffect geneexpression normalization rnaseq sequencing singlecell software transcriptomics bioinformatics single-cell

67 stars 9.52 score 137 scripts 1 dependent

immunomind

immunarch:Bioinformatics Analysis of T-Cell and B-Cell Immune Repertoires

A comprehensive framework for bioinformatics exploratory analysis of bulk and single-cell T-cell receptor and antibody repertoires. It provides seamless data loading, analysis and visualisation for AIRR (Adaptive Immune Receptor Repertoire) data, both bulk immunosequencing (RepSeq) and single-cell sequencing (scRNAseq). Immunarch implements most of the widely used AIRR analysis methods, such as: clonality analysis, estimation of repertoire similarities in distribution of clonotypes and gene segments, repertoire diversity analysis, annotation of clonotypes using external immune receptor databases and clonotype tracking in vaccination and cancer studies. A successor to our previously published 'tcR' immunoinformatics package (Nazarov 2015) <doi:10.1186/s12859-015-0613-1>.

Maintained by Vadim I. Nazarov. Last updated 1 year ago.

airr-analysis b-cell-receptor bcr bcr-repertoire bioinformatics ig ig-repertoire immune-repertoire immune-repertoire-analysis immune-repertoire-data immunoglobulin immunoinformatics immunology rep-seq repertoire-analysis single-cell single-cell-analysis t-cell-receptor tcr tcr-repertoire cpp

316 stars 9.49 score 203 scripts

shixiangwang

sigminer:Extract, Analyze and Visualize Mutational Signatures for Genomic Variations

Genomic alterations, including single nucleotide substitutions, copy number alterations, etc., are the major forces driving cancer initiation and development. Due to the specificity of molecular lesions caused by genomic alterations, we can generate characteristic alteration spectra, called 'signatures' (Wang, Shixiang, et al. (2021) <DOI:10.1371/journal.pgen.1009557> & Alexandrov, Ludmil B., et al. (2020) <DOI:10.1038/s41586-020-1943-3> & Steele Christopher D., et al. (2022) <DOI:10.1038/s41586-022-04738-6>). This package helps users to extract, analyze and visualize signatures from genomic alteration records, thus providing new insight into cancer study.

Maintained by Shixiang Wang. Last updated 6 months ago.

bayesian-nmf bioinformatics cancer-research cnv copynumber-signatures cosmic-signatures dbs easy-to-use indel mutational-signatures nmf nmf-extraction sbs signature-extraction somatic-mutations somatic-variants visualization cpp

150 stars 9.48 score 123 scripts 2 dependents

bioc

rWikiPathways:rWikiPathways - R client library for the WikiPathways API

Use this package to interface with the WikiPathways API. It provides programmatic access to WikiPathways content in multiple data and image formats, including official monthly release files and convenient GMT read/write functions.

Maintained by Egon Willighagen. Last updated 5 months ago.

visualization graphandnetwork thirdpartyclient network metabolomics bioinformatics data-access pathways

15 stars 9.23 score 131 scripts 3 dependents

dosorio

Peptides:Calculate Indices and Theoretical Physicochemical Properties of Protein Sequences

Includes functions to calculate several physicochemical properties and indices for amino-acid sequences as well as to read and plot 'XVG' output files from the 'GROMACS' molecular dynamics package.

Maintained by Daniel Osorio. Last updated 1 year ago.

bioinformatics calculate-indices peptides protein-sequences qsar cpp

82 stars 9.14 score 245 scripts 7 dependents

bioc

sesame:SEnsible Step-wise Analysis of DNA MEthylation BeadChips

Tools for analyzing Illumina Infinium DNA methylation arrays. SeSAMe provides utilities to support analyses of multiple generations of Infinium DNA methylation BeadChips, including preprocessing, quality control, visualization and inference. SeSAMe features accurate detection calling, intelligent inference of ethnicity and sex, and advanced quality control routines.

Maintained by Wanding Zhou. Last updated 3 months ago.

dnamethylation methylationarray preprocessing qualitycontrol bioinformatics dna-methylation microarray

69 stars 9.08 score 258 scripts 1 dependent

bioc

cmapR:CMap Tools in R

The Connectivity Map (CMap) is a massive resource of perturbational gene expression profiles built by researchers at the Broad Institute and funded by the NIH Library of Integrated Network-Based Cellular Signatures (LINCS) program. Please visit https://clue.io for more information. The cmapR package implements methods to parse, manipulate, and write common CMap data objects, such as annotated matrices and collections of gene sets.

Maintained by Ted Natoli. Last updated 5 months ago.

dataimport datarepresentation geneexpression bioconductor bioinformatics cmap

90 stars 8.86 score 298 scripts

bioc

CellBench:Construct Benchmarks for Single Cell Analysis Methods

This package contains infrastructure for benchmarking analysis methods and access to single cell mixture benchmarking data. It provides a framework for organising analysis methods and testing combinations of methods in a pipeline without explicitly laying out each combination. It also provides utilities for sampling and filtering SingleCellExperiment objects, constructing lists of functions with varying parameters, and multithreaded evaluation of analysis methods.

Maintained by Shian Su. Last updated 5 months ago.

software infrastructure singlecell benchmark bioinformatics

31 stars 8.73 score 98 scripts

ropensci

UCSCXenaTools:Download and Explore Datasets from UCSC Xena Data Hubs

Download and explore datasets from UCSC Xena data hubs, which are a collection of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE, and others. Databases are normalized so they can be combined, linked, filtered, explored and downloaded.

Maintained by Shixiang Wang. Last updated 5 months ago.

api-client bioinformatics ccle downloader icgc tcga toil treehouse ucsc ucsc-xena

106 stars 8.55 score 163 scripts 1 dependent

bioc

BgeeDB:Annotation and gene expression data retrieval from Bgee database. TopAnat, an anatomical entities Enrichment Analysis tool for UBERON ontology

A package for the annotation and gene expression data download from Bgee database, and TopAnat analysis: GO-like enrichment of anatomical terms, mapped to genes by expression patterns.

Maintained by Julien Wollbrett. Last updated 5 months ago.

software dataimport sequencing geneexpression microarray go genesetenrichment bioinformatics enrichment-analysis rna-seq scrna-seq single-cell

15 stars 8.46 score 19 scripts 1 dependent

bioc

piano:Platform for integrative analysis of omics data

Piano performs gene set analysis using various statistical methods, from different gene level statistics and a wide range of gene-set collections. Furthermore, the Piano package contains functions for combining the results of multiple runs of gene set analyses.

Maintained by Leif Varemo Wigge. Last updated 5 months ago.

microarray preprocessing qualitycontrol differentialexpression visualization geneexpression genesetenrichment pathways bioconductor bioconductor-package bioinformatics gene-set-enrichment transcriptomics

13 stars 8.30 score 183 scripts 7 dependents

bioc

HIBAG:HLA Genotype Imputation with Attribute Bagging

Imputes HLA classical alleles using GWAS SNP data, and it relies on a training set of HLA and SNP genotypes. HIBAG can be used by researchers with published parameter estimates instead of requiring access to large training sample datasets. It combines the concepts of attribute bagging, an ensemble classifier method, with haplotype inference for SNPs and HLA types. Attribute bagging is a technique which improves the accuracy and stability of classifier ensembles using bootstrap aggregating and random variable selection.

Maintained by Xiuwen Zheng. Last updated 4 months ago.

genetics statisticalmethod bioinformatics gpu hla imputation mhc snp cpp

30 stars 8.24 score 48 scripts
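
The HIBAG entry above describes attribute bagging as bootstrap aggregating combined with random variable selection. The Python sketch below shows that general ensemble idea on toy genotype-like rows: each member draws a bootstrap sample of rows and a random subset of columns, classifies by nearest neighbour on those columns, and the ensemble takes a majority vote. The function names, the nearest-neighbour base classifier and the default parameters are assumptions for illustration; HIBAG's own classifiers rely on haplotype inference and are not reproduced here.

```python
import random
from collections import Counter


def hamming(a, b, features):
    """Number of selected features at which two rows differ."""
    return sum(a[i] != b[i] for i in features)


def fit_attribute_bagging(rows, n_members=25, n_features=2, seed=0):
    """Draw one bootstrap sample and one random feature subset per member."""
    rng = random.Random(seed)
    n_rows, n_cols = len(rows), len(rows[0])
    members = []
    for _ in range(n_members):
        sample = [rng.randrange(n_rows) for _ in range(n_rows)]  # with replacement
        features = rng.sample(range(n_cols), n_features)         # without replacement
        members.append((sample, features))
    return members


def predict(members, rows, labels, query):
    """Majority vote over members; each member is a 1-nearest-neighbour rule."""
    votes = Counter()
    for sample, features in members:
        nearest = min(sample, key=lambda i: hamming(rows[i], query, features))
        votes[labels[nearest]] += 1
    return votes.most_common(1)[0][0]
```

Because every member sees different rows and different columns, individual errors tend to be less correlated, which is the usual motivation for this kind of ensemble.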

bioc

hypeR:An R Package For Geneset Enrichment Workflows

An R Package for Geneset Enrichment Workflows.

Maintained by Anthony Federico. Last updated 5 months ago.

genesetenrichment annotation pathways bioinformatics computational-biology geneset-enrichment-analysis

76 stars 8.22 score 145 scripts

branchlab

metasnf:Meta Clustering with Similarity Network Fusion

Framework to facilitate patient subtyping with similarity network fusion and meta clustering. The similarity network fusion (SNF) algorithm was introduced by Wang et al. (2014) in <doi:10.1038/nmeth.2810>. SNF is a data integration approach that can transform high-dimensional and diverse data types into a single similarity network suitable for clustering with minimal loss of information from each initial data source. The meta clustering approach was introduced by Caruana et al. (2006) in <doi:10.1109/ICDM.2006.103>. Meta clustering involves generating a wide range of cluster solutions by adjusting clustering hyperparameters, then clustering the solutions themselves into a manageable number of qualitatively similar solutions, and finally characterizing representative solutions to find ones that are best for the user's specific context. This package provides a framework to easily transform multi-modal data into a wide range of similarity network fusion-derived cluster solutions as well as to visualize, characterize, and validate those solutions. Core package functionality includes easy customization of distance metrics, clustering algorithms, and SNF hyperparameters to generate diverse clustering solutions; calculation and plotting of associations between features, between patients, and between cluster solutions; and standard cluster validation approaches including resampled measures of cluster stability, standard metrics of cluster quality, and label propagation to evaluate generalizability in unseen data. Associated vignettes guide the user through using the package to identify patient subtypes while adhering to best practices for unsupervised learning.

Maintained by Prashanth S Velayudhan. Last updated 3 days ago.

bioinformatics clustering metaclustering snf

8 stars 8.21 score 30 scripts

bioc

POMA:Tools for Omics Data Analysis

The POMA package offers a comprehensive toolkit designed for omics data analysis, streamlining the process from initial visualization to final statistical analysis. Its primary goal is to simplify and unify the various steps involved in omics data processing, making it more accessible and manageable within a single, intuitive R package. Emphasizing reproducibility and user-friendliness, POMA leverages the standardized SummarizedExperiment class from Bioconductor, ensuring seamless integration and compatibility with a wide array of Bioconductor tools. This approach guarantees maximum flexibility and replicability, making POMA an essential asset for researchers handling omics datasets. See https://github.com/pcastellanoescuder/POMAShiny and Castellano-Escuder et al. (2021) <doi:10.1371/journal.pcbi.1009148> for more details.

Maintained by Pol Castellano-Escuder. Last updated 4 months ago.

batcheffect classification clustering decisiontree dimensionreduction multidimensionalscaling normalization preprocessing principalcomponent regression rnaseq software statisticalmethod visualization bioconductor bioinformatics data-visualization dimension-reduction exploratory-data-analysis machine-learning omics-data-integration pipeline pre-processing statistical-analysis user-friendly workflow

11 stars 8.16 score 20 scripts 1 dependent

bioc

rBLAST:R Interface for the Basic Local Alignment Search Tool

Seamlessly interfaces the Basic Local Alignment Search Tool (BLAST) running locally to search genetic sequence databases. This work was partially supported by grant no. R21HG005912 from the National Human Genome Research Institute.

Maintained by Michael Hahsler. Last updated 3 months ago.

genetics sequencing sequencematching alignment dataimport bioconductor bioinformatics blast-search

106 stars 8.07 score 106 scripts

uscbiostats

slurmR:A Lightweight Wrapper for 'Slurm'

'Slurm', Simple Linux Utility for Resource Management <https://slurm.schedmd.com/>, is a popular 'Linux' based software used to schedule jobs in 'HPC' (High Performance Computing) clusters. This R package provides a specialized lightweight wrapper of 'Slurm' with a syntax similar to that found in the 'parallel' R package. The package also includes a method for creating socket cluster objects spanning multiple nodes that can be used with the 'parallel' package.

Maintained by George Vega Yon. Last updated 1 year ago.

bioinformatics hpc slurm

60 stars 8.07 score 216 scripts 1 dependent

ms609

Quartet:Comparison of Phylogenetic Trees Using Quartet and Split Measures

Calculates the number of four-taxon subtrees consistent with a pair of cladograms, calculating the symmetric quartet distance of Bandelt & Dress (1986), Reconstructing the shape of a tree from observed dissimilarity data, Advances in Applied Mathematics, 7, 309-343 <doi:10.1016/0196-8858(86)90038-2>, and using the tqDist algorithm of Sand et al. (2014), tqDist: a library for computing the quartet and triplet distances between binary or general trees, Bioinformatics, 30, 2079–2080 <doi:10.1093/bioinformatics/btu157> for pairs of binary trees.

Maintained by Martin R. Smith. Last updated 4 days ago.

bioinformatics comparison phylogenetic-trees phylogenetics quartet quartet-distance research-tool tree cpp

14 stars 8.00 score 40 scripts
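
The Quartet entry above is about counting four-taxon subtrees on which two trees agree. The brute-force Python sketch below makes that counting concrete for small trees written as nested tuples of leaf names. It enumerates every quartet, which is far slower than the tqDist algorithm the package cites, and the tree encoding and function names are assumptions made for this illustration.

```python
from itertools import combinations


def clades(tree):
    """Return (all leaves, leaf set of every internal node) for a nested-tuple tree."""
    found = []

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.append(leaves)
        return leaves

    return walk(tree), found


def quartet_state(quartet, clade_list):
    """Return the split (e.g. ab|cd) the tree induces on four taxa, or None."""
    four = frozenset(quartet)
    for clade in clade_list:
        inside = clade & four
        if len(inside) == 2:  # this clade separates two taxa from the other two
            return frozenset([inside, four - inside])
    return None  # unresolved in this tree


def compare_quartets(tree_a, tree_b):
    """Count quartets with the same and with a different state in two trees."""
    leaves_a, clades_a = clades(tree_a)
    leaves_b, clades_b = clades(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must have the same leaf set")
    same = different = 0
    for quartet in combinations(sorted(leaves_a), 4):
        if quartet_state(quartet, clades_a) == quartet_state(quartet, clades_b):
            same += 1
        else:
            different += 1
    return same, different
```

For example, the five-taxon trees `(("a", "b"), ("c", ("d", "e")))` and `(("a", "c"), ("b", ("d", "e")))` agree on three of their five quartets. Note that this sketch counts a quartet that is unresolved in both trees as agreeing, which is a simplification.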

oganm

homologene:Quick Access to Homologene and Gene Annotation Updates

A wrapper for the homologene database by the National Center for Biotechnology Information ('NCBI'). It allows searching for gene homologs across species. Data in this package can be found at <ftp://ftp.ncbi.nih.gov/pub/HomoloGene/build68/>. The package also includes an updated version of the homologene database where gene identifiers and symbols are replaced with their latest (at the time of submission) version and functions to fetch latest annotation data to keep updated.

Maintained by Ogan Mancarci. Last updated 1 year ago.

bioinformatics homologene mancarci-2017 ncbi-taxonomy ogan-bio species wrapper

43 stars 7.88 score 164 scripts 4 dependents

bioc

orthogene:Interspecies gene mapping

orthogene is an R package for easy mapping of orthologous genes across hundreds of species. It pulls up-to-date gene ortholog mappings across 700+ organisms. It also provides various utility functions to aggregate/expand common objects (e.g. data.frames, gene expression matrices, lists) using 1:1, many:1, 1:many or many:many gene mappings, both within- and between-species.

Maintained by Brian Schilder. Last updated 5 months ago.

genetics comparativegenomics preprocessing phylogenetics transcriptomics geneexpression animal-models bioconductor bioconductor-package bioinformatics biomedicine comparative-genomics evolutionary-biology genes genomics ontologies translational-research

42 stars 7.85 score 31 scripts 2 dependents

bioc

Rcpi:Molecular Informatics Toolkit for Compound-Protein Interaction in Drug Discovery

A molecular informatics toolkit with an integration of bioinformatics and chemoinformatics tools for drug discovery.

Maintained by Nan Xiao. Last updated 5 months ago.

software dataimport datarepresentation featureextraction cheminformatics biomedicalinformatics proteomics go systemsbiology bioconductor bioinformatics drug-discovery feature-extraction fingerprint molecular-descriptors protein-sequences

37 stars 7.81 score 29 scripts

bioc

PhyloProfile:PhyloProfile

PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights like the gene age estimation or core gene identification.

Maintained by Vinh Tran. Last updated 7 days ago.

software visualization datarepresentation multiplecomparison functionalprediction dimensionreduction bioinformatics heatmap interactive-visualizations orthologs phylogenetic-profile shiny

33 stars 7.79 score 10 scripts

abbvie-external

OmicNavigator:Open-Source Software for 'Omic' Data Analysis and Visualization

A tool for interactive exploration of the results from 'omics' experiments to facilitate novel discoveries from high-throughput biology. The software includes R functions for the 'bioinformatician' to deposit study metadata and the outputs from statistical analyses (e.g. differential expression, enrichment). These results are then exported to an interactive JavaScript dashboard that can be interrogated on the user's local machine or deployed online to be explored by collaborators. The dashboard includes 'sortable' tables, interactive plots including network visualization, and fine-grained filtering based on statistical significance.

Maintained by John Blischak. Last updated 15 days ago.

bioinformatics genomics omics opencpu

34 stars 7.68 score 31 scripts

bioc

signeR:Empirical Bayesian approach to mutational signature discovery

The signeR package provides an empirical Bayesian approach to mutational signature discovery. It is designed to analyze single nucleotide variation (SNV) counts in cancer genomes, but can also be applied to other features as well. Functionalities to characterize signatures or genome samples according to exposure patterns are also provided.

Maintained by Renan Valieris. Last updated 5 months ago.

genomicvariation somaticmutation statisticalmethod visualization bioconductor bioinformatics openblas cpp

13 stars 7.67 score 22 scripts

moosa-r

rbioapi:User-Friendly R Interface to Biologic Web Services' API

Currently fully supports Enrichr, JASPAR, miEAA, PANTHER, Reactome, STRING, and UniProt! The goal of rbioapi is to provide a user-friendly and consistent interface to biological databases and services, in a way that insulates the user from the technicalities of using web service APIs and creates a unified and easy-to-use interface to biological and medical web services. This is an ongoing project; new databases and services will be added periodically. Feel free to suggest any databases or services you often use.

Maintained by Moosa Rezwani. Last updated 2 months ago.

api-client bioinformatics biology enrichment enrichment-analysis enrichr jaspar mieaa over-representation-analysis panther reactome string uniprot

20 stars 7.60 score 55 scripts

biogenies

tidysq:Tidy Processing and Analysis of Biological Sequences

A tidy approach to analysis of biological sequences. All processing and data-storage functions are heavily optimized to allow the fastest and most efficient data storage.

Maintained by Dominik Rafacz. Last updated 3 months ago.

bioconductor bioinformatics biological-sequences fasta s3 sequences tibble tidy tidyverse vctrs cpp

40 stars 7.56 score 38 scripts

bioc

scde:Single Cell Differential Expression

The scde package implements a set of statistical methods for analyzing single-cell RNA-seq data. scde fits individual error models for single-cell RNA-seq measurements. These models can then be used for assessment of differential expression between groups of cells, as well as other types of analysis. The scde package also contains the pagoda framework which applies pathway and gene set overdispersion analysis to identify and characterize putative cell subpopulations based on transcriptional signatures. The overall approach to the differential expression analysis is detailed in the following publication: "Bayesian approach to single-cell differential expression analysis" (Kharchenko PV, Silberstein L, Scadden DT, Nature Methods, doi: 10.1038/nmeth.2967). The overall approach to subpopulation identification and characterization is detailed in the following pre-print: "Characterizing transcriptional heterogeneity through pathway and gene set overdispersion analysis" (Fan J, Salathia N, Liu R, Kaeser G, Yung Y, Herman J, Kaper F, Fan JB, Zhang K, Chun J, and Kharchenko PV, Nature Methods, doi:10.1038/nmeth.3734).

Maintained by Evan Biederstedt. Last updated 5 months ago.

immunooncology rnaseq statisticalmethod differentialexpression bayesian transcription software analysis bioinformatics heterogenity ngs single-cell transcriptomics openblas cpp openmp

173 stars 7.53 score 141 scripts

bioc

methrix:Fast and efficient summarization of generic bedGraph files from Bisulfite sequencing

Bedgraph files generated by bisulfite pipelines often come in various flavors. A critical downstream step requires summarizing these files into methylation/coverage matrices. Methrix performs this data aggregation step and provides many other useful downstream functions.

Maintained by Anand Mayakonda. Last updated 5 months ago.

dnamethylation sequencing coverage bedgraph bioinformatics dna-methylation

31 stars 7.51 score 39 scripts 1 dependent

bioc

koinar:KoinaR - Remote machine learning inference using Koina

A client to simplify fetching predictions from the Koina web service. Koina is a model repository enabling the remote execution of models. Predictions are generated as a response to HTTP/S requests, the standard protocol used for nearly all web traffic.

Maintained by Ludwig Lautenbacher. Last updated 3 months ago.

massspectrometry proteomics infrastructure software bioinformatics deep-learning machine-learning mass-spectrometry python

34 stars 7.49 score 4 scripts

tongzhou2017

itol.toolkit:Helper Functions for 'Interactive Tree Of Life'

The 'Interactive Tree Of Life' <https://itol.embl.de/> online server can edit and annotate trees interactively. The 'itol.toolkit' package can support all types of annotation templates.

Maintained by Tong Zhou. Last updated 4 months ago.

bioinformatics itol visualization

167 stars 7.48 score 60 scripts

ms609

TreeSearch:Phylogenetic Analysis with Discrete Character Data

Reconstruct phylogenetic trees from discrete data. Inapplicable character states are handled using the algorithm of Brazeau, Guillerme and Smith (2019) <doi:10.1093/sysbio/syy083> with the "Morphy" library, under equal or implied step weights. Contains a "shiny" user interface for interactive tree search and exploration of results, including character visualization, rogue taxon detection, tree space mapping, and cluster consensus trees (Smith 2022a, b) <doi:10.1093/sysbio/syab099>, <doi:10.1093/sysbio/syab100>. Profile Parsimony (Faith and Trueman, 2001) <doi:10.1080/10635150118627>, Successive Approximations (Farris, 1969) <doi:10.2307/2412182> and custom optimality criteria are implemented.

Maintained by Martin R. Smith. Last updated 3 days ago.

bioinformatics morphological-analysis phylogenetics research-tool tree-search cpp

7 stars 7.44 score 51 scripts

bodkan

admixr:An Interface for Running 'ADMIXTOOLS' Analyses

An interface for performing all stages of 'ADMIXTOOLS' analyses (<https://reich.hms.harvard.edu/software>) entirely from R. Wrapper functions (D, f4, f3, etc.) completely automate the generation of intermediate configuration files, run 'ADMIXTOOLS' programs on the command-line, and parse output files to extract values of interest. This allows users to focus on the analysis itself instead of worrying about low-level technical details. A set of complementary functions for processing and filtering of data in the 'EIGENSTRAT' format is also provided.

Maintained by Martin Petr. Last updated 1 month ago.

bioinformatics popgen population-genetics

29 stars 7.42 score 91 scripts

bioc

netSmooth:Network smoothing for scRNAseq

netSmooth is an R package for network smoothing of single cell RNA sequencing data. Using biological networks such as protein-protein interactions as priors for gene co-expression, netSmooth improves cell type identification from noisy, sparse scRNAseq data.

Maintained by Jonathan Ronen. Last updated 5 months ago.

network graphandnetwork singlecell rnaseq geneexpression sequencing transcriptomics normalization preprocessing clustering dimensionreduction bioinformatics genomics single-cell

27 stars 7.41 score 4 scripts
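
The network smoothing idea in the netSmooth entry above can be illustrated with a small sketch. This is a simplified, single-step neighbor average written in Python, not the algorithm netSmooth itself implements; the function name `smooth_once`, the mixing weight `alpha`, and the toy gene network are all assumptions made for illustration.

```python
def smooth_once(expr, neighbors, alpha=0.5):
    """Blend each gene's value with the mean value of its network neighbors.

    expr:      dict mapping gene name -> expression value in one cell
    neighbors: dict mapping gene name -> list of neighboring gene names
    alpha:     weight given to the neighbor mean (hypothetical parameter)
    """
    smoothed = {}
    for gene, value in expr.items():
        nbrs = neighbors.get(gene, [])
        if nbrs:
            nbr_mean = sum(expr[n] for n in nbrs) / len(nbrs)
            smoothed[gene] = (1 - alpha) * value + alpha * nbr_mean
        else:
            # Genes without network neighbors are left unchanged.
            smoothed[gene] = value
    return smoothed


# Toy data: gene A is a dropout (0.0) but its neighbors B and C are expressed.
expr = {"A": 0.0, "B": 4.0, "C": 2.0, "D": 1.0}
neighbors = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
smoothed = smooth_once(expr, neighbors)
```

In this toy example the zero value of gene A is pulled toward the values of its interaction partners, which is the intuition behind using a network as a prior for sparse data.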

bioc

sevenbridges:Seven Bridges Platform API Client and Common Workflow Language Tool Builder in R

R client and utilities for Seven Bridges platform API, from Cancer Genomics Cloud to other Seven Bridges supported platforms.

Maintained by Phil Webster. Last updated 5 months ago.

software dataimport thirdpartyclient api-client bioconductor bioinformatics cloud common-workflow-language sevenbridges

35 stars 7.40 score 24 scripts

bioc

cogena:co-expressed gene-set enrichment analysis

cogena is a workflow for co-expressed gene-set enrichment analysis. It aims to discover smaller-scale, but highly correlated, cellular events that may be of great biological relevance. A novel pipeline for drug discovery and drug repositioning based on the cogena workflow is proposed. In particular, candidate drugs can be predicted based on the gene expression of disease-related data, or other similar drugs can be identified based on the gene expression of drug-related data. Moreover, the drug mode of action can be disclosed by the associated pathway analysis. In summary, cogena is a flexible workflow for various gene set enrichment analyses of co-expressed genes, with a focus on pathway/GO analysis and drug repositioning.

Maintained by Zhilong Jia. Last updated 5 months ago.

clustering genesetenrichment geneexpression visualization pathways kegg go microarray sequencing systemsbiology datarepresentation dataimport bioconductor bioinformatics

12 stars 7.36 score 32 scripts

bioc

NormalyzerDE:Evaluation of normalization methods and calculation of differential expression analysis statistics

NormalyzerDE provides screening of normalization methods for LC-MS based expression data. It calculates a range of normalized matrices using both existing approaches and a novel time-segmented approach, calculates performance measures and generates an evaluation report. Furthermore, it provides an easy utility for Limma- or ANOVA- based differential expression analysis.

Maintained by Jakob Willforss. Last updated 5 months ago.

normalization multiplecomparison visualization bayesian proteomics metabolomics differentialexpression bioconductor bioinformatics limma

22 stars 7.30 score 38 scripts 1 dependent

bioc

tidytof:Analyze High-dimensional Cytometry Data Using Tidy Data Principles

This package implements an interactive, scientific analysis pipeline for high-dimensional cytometry data built using tidy data principles. It is specifically designed to play well with both the tidyverse and Bioconductor software ecosystems, with functionality for reading/writing data files, data cleaning, preprocessing, clustering, visualization, modeling, and other quality-of-life functions. tidytof implements a "grammar" of high-dimensional cytometry data analysis.

Maintained by Timothy Keyes. Last updated 5 months ago.

singlecell flowcytometry bioinformatics cytometry data-science single-cell tidyverse cpp

18 stars 7.24 score 35 scripts

lvulliard

BioCircos:Interactive Circular Visualization of Genomic Data using 'htmlwidgets' and 'BioCircos.js'

Implement in 'R' interactive Circos-like visualizations of genomic data, to map information such as genetic variants, genomic fusions and aberrations to a circular genome, as proposed by the 'JavaScript' library 'BioCircos.js', based on the 'JQuery' and 'D3' technologies. The output is by default displayed in stand-alone HTML documents or in the 'RStudio' viewer pane. Moreover it can be integrated in 'R Markdown' documents and 'Shiny' applications.

Maintained by Loan Vulliard. Last updated 6 years ago.

biocircos bioinformatics circos circos-graphs htmlwidgets shiny

37 stars 6.98 score 58 scripts

lvclark

polyRAD:Genotype Calling with Uncertainty from Sequencing Data in Polyploids and Diploids

Read depth data from genotyping-by-sequencing (GBS) or restriction site-associated DNA sequencing (RAD-seq) are imported and used to make Bayesian probability estimates of genotypes in polyploids or diploids. The genotype probabilities, posterior mean genotypes, or most probable genotypes can then be exported for downstream analysis. 'polyRAD' is described by Clark et al. (2019) <doi:10.1534/g3.118.200913>, and the Hind/He statistic for marker filtering is described by Clark et al. (2022) <doi:10.1186/s12859-022-04635-9>. A variant calling pipeline for highly duplicated genomes is also included and is described by Clark et al. (2020, Version 1) <doi:10.1101/2020.01.11.902890>.

Maintained by Lindsay V. Clark. Last updated 20 days ago.

bioinformatics dna-sequencing genotype-likelihoods genotyping-by-sequencing hacktoberfest rad-seq rad-sequencing snp-genotyping cpp

28 stars 6.98 score 85 scripts
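
The Bayesian genotype estimates mentioned in the polyRAD entry above can be sketched in a few lines. The following Python example is a minimal illustration that assumes a binomial read-sampling model with a fixed sequencing error rate; it is not polyRAD's actual model, and the function name `genotype_posterior`, the `error` parameter, and the example priors are hypothetical.

```python
from math import comb


def genotype_posterior(alt_reads, total_reads, priors, ploidy=2, error=0.01):
    """Posterior probability of each genotype (0..ploidy alternative allele copies).

    Assumes reads are drawn binomially, with the expected alternative-read
    fraction equal to the allele dosage adjusted by a symmetric error rate.
    """
    unnormalized = []
    for copies, prior in enumerate(priors):
        dosage = copies / ploidy
        p_alt = dosage * (1 - error) + (1 - dosage) * error
        likelihood = (
            comb(total_reads, alt_reads)
            * p_alt ** alt_reads
            * (1 - p_alt) ** (total_reads - alt_reads)
        )
        unnormalized.append(prior * likelihood)
    total = sum(unnormalized)
    return [value / total for value in unnormalized]


# Diploid example: 5 of 10 reads carry the alternative allele.
posterior = genotype_posterior(alt_reads=5, total_reads=10, priors=[0.25, 0.5, 0.25])
posterior_mean = sum(copies * p for copies, p in enumerate(posterior))
most_probable = posterior.index(max(posterior))
```

The three quantities computed here correspond to the outputs named in the description: genotype probabilities, a posterior mean genotype, and a most probable genotype.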

bioc

isobar:Analysis and quantitation of isobarically tagged MSMS proteomics data

isobar provides methods for preprocessing, normalization, and report generation for the analysis of quantitative mass spectrometry proteomics data labeled with isobaric tags, such as iTRAQ and TMT. Features modules for integrating and validating PTM-centric datasets (isobar-PTM). More information on http://www.ms-isobar.org.

Maintained by Florian P Breitwieser. Last updated 5 months ago.

immunooncology proteomics massspectrometry bioinformatics multiplecomparisons qualitycontrol

10 stars 6.96 score 19 scripts

alexisvdb

singleCellHaystack:A Universal Differential Expression Prediction Tool for Single-Cell and Spatial Genomics Data

One key exploratory analysis step in single-cell genomics data analysis is the prediction of features with different activity levels. For example, we want to predict differentially expressed genes (DEGs) in single-cell RNA-seq data, spatial DEGs in spatial transcriptomics data, or differentially accessible regions (DARs) in single-cell ATAC-seq data. 'singleCellHaystack' predicts differentially active features in single cell omics datasets without relying on the clustering of cells into arbitrary clusters. 'singleCellHaystack' uses Kullback-Leibler divergence to find features (e.g., genes, genomic regions, etc) that are active in subsets of cells that are non-randomly positioned inside an input space (such as 1D trajectories, 2D tissue sections, multi-dimensional embeddings, etc). For the theoretical background of 'singleCellHaystack' we refer to our original paper Vandenbon and Diez (Nature Communications, 2020) <doi:10.1038/s41467-020-17900-3> and our update Vandenbon and Diez (Scientific Reports, 2023) <doi:10.1038/s41598-023-38965-2>.

Maintained by Alexis Vandenbon. Last updated 1 year ago.

bioinformatics cite-seq pseudotime scatac-seq single-cell spatial-proteomics spatial-transcriptomics transcriptomics

81 stars 6.71 score 64 scripts
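
The singleCellHaystack entry above refers to Kullback-Leibler divergence as a way to find features active in non-randomly positioned subsets of cells. A minimal Python sketch of that comparison follows, under the assumption that the input space has been divided into four bins; the binning, the cell counts, and the helper names are hypothetical and do not reproduce the package's procedure.

```python
import math


def normalize(counts):
    """Turn raw per-bin counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """D_KL(P || Q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Reference: how all cells are spread over four bins of the input space.
all_cells = normalize([25, 25, 25, 25])

# Cells expressing a feature that follows the overall cell distribution.
uniform_feature = normalize([5, 5, 5, 5])

# Cells expressing a feature concentrated in the first bin.
localized_feature = normalize([17, 1, 1, 1])

d_uniform = kl_divergence(uniform_feature, all_cells)
d_localized = kl_divergence(localized_feature, all_cells)
```

A feature whose expressing cells mirror the overall distribution gives a divergence of zero, while a spatially concentrated feature gives a clearly positive value, with no clustering of cells required.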

bioc

bioassayR:Cross-target analysis of small molecule bioactivity

bioassayR is a computational tool that enables simultaneous analysis of thousands of bioassay experiments performed over a diverse set of compounds and biological targets. Unique features include support for large-scale cross-target analyses of both public and custom bioassays, generation of high throughput screening fingerprints (HTSFPs), and an optional preloaded database that provides access to a substantial portion of publicly available bioactivity data.

Maintained by Thomas Girke. Last updated 5 months ago.

immunooncology microtitreplateassay cellbasedassays visualization infrastructure dataimport bioinformatics proteomics metabolomics

5 stars 6.70 score 46 scripts

zilong-li

vcfppR:Rapid Manipulation of the Variant Call Format (VCF)

The 'vcfpp.h' (<https://github.com/Zilong-Li/vcfpp>) provides an easy-to-use 'C++' 'API' of 'htslib', offering full functionality for manipulating Variant Call Format (VCF) files. The 'vcfppR' package serves as the R bindings of the 'vcfpp.h' library, enabling rapid processing of both compressed and uncompressed VCF files. Explore a range of powerful features for efficient VCF data manipulation.

Maintained by Zilong Li. Last updated 14 days ago.

bioinformatics fastr htslib population-genetics population-genomics vcf vcf-data visulization libdeflate zlib bzip2 xz-utils curl cpp

13 stars 6.70 score 16 scripts

person-c

easybio:Comprehensive Single-Cell Annotation and Transcriptomic Analysis Toolkit

Provides a comprehensive toolkit for single-cell annotation with the 'CellMarker2.0' database (see Xia Li, Peng Wang, Yunpeng Zhang (2023) <doi: 10.1093/nar/gkac947>). Streamlines biological label assignment in single-cell RNA-seq data and facilitates transcriptomic analysis, including preparation of TCGA <https://portal.gdc.cancer.gov/> and GEO <https://www.ncbi.nlm.nih.gov/geo/> datasets, differential expression analysis and visualization of enrichment analysis results. Additional utility functions support various bioinformatics workflows. See Wei Cui (2024) <doi: 10.1101/2024.09.14.609619> for more details.

Maintained by Wei Cui. Last updated 25 days ago.

limma geoquery edger fgsea bioinformatics cellmarker2 gsea rna-seq single-cell

10 stars 6.62 score 35 scripts

bioc

BioCor:Functional similarities

Calculates functional similarities based on the pathways described on KEGG and REACTOME or in gene sets. These similarities can be calculated for pathways or gene sets, genes, or clusters and combined with other similarities. They can be used to improve networks, gene selection, testing relationships...

Maintained by Lluís Revilla Sancho. Last updated 5 months ago.

statisticalmethod clustering geneexpression network pathways networkenrichment systemsbiology bioconductor-packages bioinformatics functional-similarity gene gene-sets pathway-analysis similarity similarity-measurement

14 stars 6.59 score

bioc

pathlinkR:Analyze and interpret RNA-Seq results

pathlinkR is an R package designed to facilitate analysis of RNA-Seq results. Specifically, our aim with pathlinkR was to provide a number of tools which take a list of DE genes and perform different analyses on them, aiding with the interpretation of results. Functions are included to perform pathway enrichment, with multiple databases supported, and tools for visualizing these results. Genes can also be used to create and plot protein-protein interaction networks, all from inside of R.

Maintained by Travis Blimkie. Last updated 3 months ago.

genesetenrichment network pathways reactome rnaseq networkenrichment bioinformatics networks pathway-enrichment-analysis visualization

28 stars 6.59 score 2 scripts

bioc

CiteFuse:CiteFuse: multi-modal analysis of CITE-seq data

The CiteFuse package implements a suite of methods and tools for CITE-seq data from pre-processing to integrative analytics, including doublet detection, network-based modality integration, cell type clustering, differential RNA and protein expression analysis, ADT evaluation, ligand-receptor interaction analysis, and interactive web-based visualisation of the analyses.

Maintained by Yingxin Lin. Last updated 5 months ago.

singlecell geneexpression bioinformatics single-cell cpp

27 stars 6.59 score 18 scripts

rmgpanw

gtexr:Query the GTEx Portal API

A convenient R interface to the Genotype-Tissue Expression (GTEx) Portal API. For more information on the API, see <https://gtexportal.org/api/v2/redoc>.

Maintained by Alasdair Warwick. Last updated 6 months ago.

api-wrapper bioinformatics eqtl gtex sqtl

6 stars 6.49 score 5 scripts

bioc

doubletrouble:Identification and classification of duplicated genes

doubletrouble aims to identify duplicated genes from whole-genome protein sequences and classify them based on their modes of duplication. The duplication modes are i. segmental duplication (SD); ii. tandem duplication (TD); iii. proximal duplication (PD); iv. transposed duplication (TRD) and; v. dispersed duplication (DD). Transposon-derived duplicates (TRD) can be further subdivided into rTRD (retrotransposon-derived duplication) and dTRD (DNA transposon-derived duplication). If users want a simpler classification scheme, duplicates can also be classified into SD- and SSD-derived (small-scale duplication) gene pairs. Besides classifying gene pairs, users can also classify genes, so that each gene is assigned a unique mode of duplication. Users can also calculate substitution rates per substitution site (i.e., Ka and Ks) from duplicate pairs, find peaks in Ks distributions with Gaussian Mixture Models (GMMs), and classify gene pairs into age groups based on Ks peaks.

Maintained by Fabrício Almeida-Silva. Last updated 15 days ago.

software wholegenome comparativegenomics functionalgenomics phylogenetics network classification bioinformatics comparative-genomics gene-duplication molecular-evolution whole-genome-duplication

23 stars 6.44 score 17 scripts

bioc

artMS:Analytical R tools for Mass Spectrometry

artMS provides a set of tools for the analysis of proteomics label-free datasets. It takes as input the MaxQuant search result output (evidence.txt file) and performs quality control, relative quantification using MSstats, downstream analysis and integration. artMS also provides a set of functions to re-format and make it compatible with other analytical tools, including, SAINTq, SAINTexpress, Phosfate, and PHOTON. Check [http://artms.org](http://artms.org) for details.

Maintained by David Jimenez-Morales. Last updated 5 months ago.

proteomics differentialexpression biomedicalinformatics systemsbiology massspectrometry annotation qualitycontrol genesetenrichment clustering normalization immunooncology multiplecomparison analysis analytical ap-ms bioconductor bioinformatics mass-spectrometry phosphoproteomics post-translational-modification quantitative-analysis

14 stars 6.41 score 13 scripts

bioc

ontoProc:processing of ontologies of anatomy, cell lines, and so on

Support harvesting of diverse bioinformatic ontologies, making particular use of the ontologyIndex package on CRAN. We provide snapshots of key ontologies for terms about cells, cell lines, chemical compounds, and anatomy, to help analyze genome-scale experiments, particularly cell x compound screens. Another purpose is to strengthen development of compelling use cases for richer interfaces to emerging ontologies.

Maintained by Vincent Carey. Last updated 15 days ago.

infrastructure go bioinformatics genomics ontology

3 stars 6.37 score 75 scripts 2 dependents

ethanbass

chromatographR:Chromatographic Data Analysis Toolset

Tools for high-throughput analysis of HPLC-DAD/UV chromatograms (or similar data). Includes functions for preprocessing, alignment, peak-finding and fitting, peak-table construction, data-visualization, etc. Preprocessing and peak-table construction follow the rough formula laid out in 'alsace' (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C., 2015. <doi:10.1093/bioinformatics/btv299>). Alignment of chromatograms is available using parametric time warping (as implemented in the 'ptw' package) (Wehrens, R., Bloemberg, T.G., and Eilers P.H.C. 2015. <doi:10.1093/bioinformatics/btv299>) or variable penalty dynamic time warping (as implemented in 'VPdtw') (Clifford, D., & Stone, G. 2012. <doi:10.18637/jss.v047.i08>). Peak-finding uses the algorithm by Tom O'Haver <https://terpconnect.umd.edu/~toh/spectrum/PeakFindingandMeasurement.htm>. Peaks are then fitted to a Gaussian or exponential-Gaussian hybrid peak shape using non-linear least squares (Lan, K. & Jorgenson, J. W. 2001. <doi:10.1016/S0021-9673(01)00594-5>). See the vignette for more details and suggested workflow.

Maintained by Ethan Bass. Last updated 12 days ago.

bioinformatics cheminformatics chromatography gc-fid hplc hplc-dad hplc-pda hplv-uv metabolomics open-data open-science reproducibility reproducible-research

18 stars 6.36 score 8 scripts 1 dependent

sbg

sevenbridges2:The 'Seven Bridges Platform' API Client

R client and utilities for 'Seven Bridges Platform' API, from 'Cancer Genomics Cloud' to other 'Seven Bridges' supported platforms. API documentation is hosted publicly at <https://docs.sevenbridges.com/docs/the-api>.

Maintained by Marko Trifunovic. Last updated 3 days ago.

api-client bioinformatics cloud sevenbridges

3 stars 6.32 score 4 scripts

pcruniversum

RDML:Importing Real-Time Thermo Cycler (qPCR) Data from RDML Format Files

Imports real-time thermo cycler (qPCR) data from Real-time PCR Data Markup Language (RDML) and transforms to the appropriate formats of the 'qpcR' and 'chipPCR' packages. Contains a dendrogram visualization for the structure of RDML object and GUI for RDML editing.

Maintained by Konstantin A. Blagodatskikh. Last updated 7 months ago.

bioinformatics pcr qpcr rdml

21 stars 6.26 score 58 scripts 1 dependent

bioc

CopyNumberPlots:Create Copy-Number Plots using karyoploteR functionality

CopyNumberPlots has a set of functions extending karyoploteR's functionality to create beautiful, customizable and flexible plots of copy-number related data.

Maintained by Bernat Gel. Last updated 5 months ago.

visualization copynumbervariation coverage onechannel dataimport sequencing dnaseq bioconductor bioconductor-package bioinformatics copy-number-variation genomics genomics-visualization

6 stars 6.24 score 16 scripts 2 dependents

bioc

scruff:Single Cell RNA-Seq UMI Filtering Facilitator (scruff)

A pipeline which processes single cell RNA-seq (scRNA-seq) reads from CEL-seq and CEL-seq2 protocols. Demultiplex scRNA-seq FASTQ files, align reads to reference genome using Rsubread, and generate UMI filtered count matrix. Also provide visualizations of read alignments and pre- and post-alignment QC metrics.

Maintained by Zhe Wang. Last updated 5 months ago.

software technology sequencing alignment rnaseq singlecell workflowstep preprocessing qualitycontrol visualization immunooncology bioinformatics scrna-seq single-cell umi

8 stars 6.20 score 22 scripts

cogdisreslab

drugfindR:Investigate iLINCS for candidate repurposable drugs

This package provides a convenient way to access the LINCS Signatures available in the iLINCS database. These signatures include Consensus Gene Knockdown Signatures, Gene Overexpression signatures and Chemical Perturbagen Signatures. It also provides a way to enter your own transcriptomic signatures and identify concordant and discordant signatures in the LINCS database.

Maintained by Ali Sajid Imami. Last updated 3 days ago.

lincs ilincs drug repurposing drug discovery transcriptomics gene expression gene knockdown gene overexpression chemical perturbagen drugfindr bioinformatics bioinformatics-pipeline

8 stars 6.19 score 145 scripts

bioc

martini:GWAS Incorporating Networks

martini deals with the low power inherent to GWAS studies by using prior knowledge represented as a network. SNPs are the vertices of the network, and the edges represent biological relationships between them (genomic adjacency, belonging to the same gene, physical interaction between protein products). The network is scanned using SConES, which looks for groups of SNPs maximally associated with the phenotype, that form a close subnetwork.

Maintained by Hector Climente-Gonzalez. Last updated 5 months ago.

software genomewideassociation snp geneticvariability genetics featureextraction graphandnetwork network bioinformatics genomics gwas network-analysis snps systems-biology cpp

4 stars 6.16 score 30 scripts

myles-lewis

glmmSeq:General Linear Mixed Models for Gene-Level Differential Expression

Using mixed effects models to analyse longitudinal gene expression can highlight differences between sample groups over time. The most widely used differential gene expression tools are unable to fit linear mixed effect models, and are less optimal for analysing longitudinal data. This package provides negative binomial and Gaussian mixed effects models to fit gene expression and other biological data across repeated samples. This is particularly useful for investigating changes in RNA-Sequencing gene expression between groups of individuals over time, as described in: Rivellese, F., Surace, A. E., Goldmann, K., Sciacca, E., Cubuk, C., Giorli, G., ... Lewis, M. J., & Pitzalis, C. (2022) Nature medicine <doi:10.1038/s41591-022-01789-0>.

Maintained by Myles Lewis. Last updated 2 months ago.

bioinformatics differential-gene-expression gene-expression glmm mixed-models transcriptomics

20 stars 6.13 score 45 scripts

bioc

cfDNAPro:cfDNAPro extracts and Visualises biological features from whole genome sequencing data of cell-free DNA

cfDNA fragments carry important features for building cancer sample classification ML models, such as fragment size, and fragment end motif etc. Analyzing and visualizing fragment size metrics, as well as other biological features in a curated, standardized, scalable, well-documented, and reproducible way might be time intensive. This package intends to resolve these problems and simplify the process. It offers two sets of functions for cfDNA feature characterization and visualization.

Maintained by Haichao Wang. Last updated 5 months ago.

visualization sequencing wholegenome bioinformatics cancer-genomics cancer-research cell-free-dna early-detection genomics-visualization liquid-biopsy swgs whole-genome-sequencing

28 stars 6.04 score 13 scripts

mrcieu

epigraphdb:Interface Package for the 'EpiGraphDB' Platform

The interface package to access data from the 'EpiGraphDB' <https://epigraphdb.org> platform. It provides easy access to the 'EpiGraphDB' platform with functions that query the corresponding REST endpoints on the API <https://api.epigraphdb.org> and return the response data in the 'tibble' data frame format.

Maintained by Yi Liu. Last updated 3 years ago.

api-client bioinformatics epidemiology graph-database mendelian-randomization phenotypes

27 stars 6.02 score 13 scripts

bioc

RMassBank:Workflow to process tandem MS files and build MassBank records

Workflow to process tandem MS files and build MassBank records. Functions include automated extraction of tandem MS spectra, formula assignment to tandem MS fragments, recalibration of tandem MS spectra with assigned fragments, spectrum cleanup, automated retrieval of compound information from Internet databases, and export to MassBank records.

Maintained by RMassBank at Eawag. Last updated 5 months ago.

immunooncology bioinformatics massspectrometry metabolomics software openjdk

6.02 score 26 scripts

bioc

gemma.R:A wrapper for Gemma's Restful API to access curated gene expression data and differential expression analyses

Low- and high-level wrappers for Gemma's RESTful API. They enable access to curated expression and differential expression data from over 10,000 published studies. Gemma is a web site, database and a set of tools for the meta-analysis, re-use and sharing of genomics data, currently primarily targeted at the analysis of gene expression profiles.

Maintained by Ogan Mancarci. Last updated 4 months ago.

software dataimport microarray singlecell thirdpartyclient differentialexpression geneexpression bayesian annotation experimentaldesign normalization batcheffect preprocessing bioinformatics gemma genomics transcriptomics

10 stars 5.99 score 26 scripts

bioc

ClustIRR:Clustering of immune receptor repertoires

ClustIRR analyzes repertoires of B- and T-cell receptors. It starts by identifying communities of immune receptors with similar specificities, based on the sequences of their complementarity-determining regions (CDRs). Next, it employs Bayesian probabilistic models to quantify differential community occupancy (DCO) between repertoires, allowing the identification of expanding or contracting communities in response to e.g. infection or cancer treatment.

Maintained by Simo Kitanovski. Last updated 28 days ago.

clustering immunooncology singlecell software classification b-cell-receptor bioinformatics immunoinformatics immunology quantitative-methods rep-seq repertoire-analysis t-cell-receptor cpp

2 stars 5.95 score 2 scripts

bioc

vissE:Visualising Set Enrichment Analysis Results

This package enables the interpretation and analysis of results from a gene set enrichment analysis using network-based and text-mining approaches. Most enrichment analyses result in large lists of significant gene sets that are difficult to interpret. Tools in this package help build a similarity-based network of significant gene sets from a gene set enrichment analysis that can then be investigated for their biological function using text-mining approaches.

Maintained by Dharmesh D. Bhuva. Last updated 5 months ago.

software geneexpression genesetenrichment networkenrichment network bioinformatics

15 stars 5.93 score 19 scripts

bioc

cummeRbund:Analysis, exploration, manipulation, and visualization of Cufflinks high-throughput sequencing data.

Allows for persistent storage, access, exploration, and manipulation of Cufflinks high-throughput sequencing data. In addition, provides numerous plotting functions for commonly used visualizations.

Maintained by Loyal A. Goff. Last updated 5 months ago.

highthroughputsequencing highthroughputsequencingdata rnaseq rnaseqdata geneexpression differentialexpression infrastructure dataimport datarepresentation visualization bioinformatics clustering multiplecomparisons qualitycontrol

5.92 score 209 scripts

katrionagoldmann

volcano3D:3D Volcano Plots and Polar Plots for Three-Class Data

Generates interactive plots for analysing and visualising three-class high dimensional data. It is particularly suited to visualising differences in continuous attributes such as gene/protein/biomarker expression levels between three groups. Differential gene/biomarker expression analysis between two classes is typically shown as a volcano plot. However, with three groups this type of visualisation is particularly difficult to interpret. This package generates 3D volcano plots and 3-way polar plots for easier interpretation of three-class data.

Maintained by Katriona Goldmann. Last updated 2 years ago.

bioinformatics differential-expression differential-expression-analysis gene-expression interactive omics plotly rna-seq transcriptomics volcanoplots

36 stars 5.90 score 37 scripts

bioc

globaltest:Testing Groups of Covariates/Features for Association with a Response Variable, with Applications to Gene Set Testing

The global test tests groups of covariates (or features) for association with a response variable. This package implements the test with diagnostic plots and multiple testing utilities, along with several functions to facilitate the use of this test for gene set testing of GO and KEGG terms.

Maintained by Jelle Goeman. Last updated 5 months ago.

microarray onechannel bioinformatics differentialexpression go pathways

5.89 score 79 scripts 6 dependents

cox-labs

PerseusR:Perseus R Interop

Enables the interoperability between the Perseus platform for omics data analysis (Tyanova et al. 2016) <doi:10.1038/nmeth.3901> and R. It provides the foundation for developing and running Perseus plugins implemented in R by providing all required input and output handling, including data and parameter parsing as described in Rudolph and Cox 2018 <doi:10.1101/447268>.

Maintained by Jan Rudolph. Last updated 3 years ago.

bioinformatics interop maxquant perseus proteomics

13 stars 5.88 score 58 scripts

mt1022

cubar:Codon Usage Bias Analysis

A suite of functions for rapid and flexible analysis of codon usage bias. It provides in-depth analysis at the codon level, including relative synonymous codon usage (RSCU), tRNA weight calculations, machine learning predictions for optimal or preferred codons, and visualization of codon-anticodon pairing. Additionally, it can calculate various gene-specific codon indices such as codon adaptation index (CAI), effective number of codons (ENC), fraction of optimal codons (Fop), tRNA adaptation index (tAI), mean codon stabilization coefficients (CSCg), and GC contents (GC/GC3s/GC4d). It also supports both standard and non-standard genetic code tables found in NCBI, as well as custom genetic code tables.

Maintained by Hong Zhang. Last updated 4 months ago.

bioinformatics codon-usage machine-learning sequence-analysis

6 stars 5.82 score 8 scripts
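
The relative synonymous codon usage (RSCU) mentioned in the cubar entry above can be illustrated with a minimal Python sketch. This is not cubar's API or implementation. It uses the textbook definition, where a codon's RSCU is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The `SYNONYMOUS` table is a hypothetical three-family subset of the standard genetic code, and the function name `rscu` is an assumption made for illustration.

```python
from collections import Counter

# Hypothetical toy subset of the standard genetic code (illustration only).
SYNONYMOUS = {
    "Phe": ["TTT", "TTC"],
    "Lys": ["AAA", "AAG"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}


def rscu(cds):
    """Return {codon: RSCU} for the codon families listed in SYNONYMOUS."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    result = {}
    for family in SYNONYMOUS.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the sequence: RSCU undefined
        expected = total / len(family)  # count under equal usage
        for codon in family:
            result[codon] = counts[codon] / expected
    return result


# Build the example sequence codon by codon to keep it easy to check.
example_cds = "".join(
    ["TTT", "TTT", "TTC", "AAA", "AAG", "GGT", "GGT", "GGT", "GGC"]
)
example = rscu(example_cds)
```

A value above 1 means the codon is used more often than expected under equal usage, and a value below 1 means it is used less often.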

ineelhere

clintrialx:Connect and Work with Clinical Trials Data Sources

Are you spending too much time fetching and managing clinical trial data? Struggling with complex queries and bulk data extraction? What if you could simplify this process with just a few lines of code? Introducing 'clintrialx' - Fetch clinical trial data from sources like 'ClinicalTrials.gov' <https://clinicaltrials.gov/> and the 'Clinical Trials Transformation Initiative - Access to Aggregate Content of ClinicalTrials.gov' database <https://aact.ctti-clinicaltrials.org/>, supporting pagination and bulk downloads. Also, you can generate HTML reports based on the data obtained from the sources!

Maintained by Indraneel Chakraborty. Last updated 17 days ago.

aact bioinformatics clinical-data clinical-trials clinicaltrialsgov ctti data data-management medical-informatics r-language trials

15 stars 5.76 score 11 scripts

bioc

CNORode:ODE add-on to CellNOptR

Logic based ordinary differential equation (ODE) add-on to CellNOptR.

Maintained by Attila Gabor. Last updated 5 months ago.

immunooncology cellbasedassays cellbiology proteomics bioinformatics timecourse

5.74 score 37 scripts 1 dependent

bioc

sparrow:Take command of set enrichment analyses through a unified interface

Provides a unified interface to a variety of GSEA techniques from different bioconductor packages. Results are harmonized into a single object and can be interrogated uniformly for quick exploration and interpretation of results. Interactive exploration of GSEA results is enabled through a shiny app provided by a sparrow.shiny sibling package.

Maintained by Steve Lianoglou. Last updated 11 days ago.

genesetenrichment pathways bioinformatics gsea

21 stars 5.74 score 13 scripts

core-bioinformatics

ClustAssess:Tools for Assessing Clustering

A set of tools for evaluating clustering robustness using proportion of ambiguously clustered pairs (Senbabaoglu et al. (2014) <doi:10.1038/srep06207>), as well as similarity across methods and method stability using element-centric clustering comparison (Gates et al. (2019) <doi:10.1038/s41598-019-44892-y>). Additionally, this package enables stability-based parameter assessment for graph-based clustering pipelines typical in single-cell data analysis.

Maintained by Andi Munteanu. Last updated 1 month ago.

software singlecell rnaseq atacseq normalization preprocessing dimensionreduction visualization qualitycontrol clustering classification annotation geneexpression differentialexpression bioinformatics genomics machine-learning parameter-optimization robustness single-cell unsupervised-learning cpp

23 stars 5.70 score 18 scripts
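
The proportion of ambiguously clustered pairs (PAC) named in the ClustAssess entry above can be sketched in a few lines of Python. This is not the ClustAssess implementation. It assumes a symmetric consensus matrix whose entry (i, j) is the fraction of clustering runs in which items i and j were grouped together, and it counts the pairs whose consensus value falls in the interval (lower, upper]. The 0.1 and 0.9 thresholds and the function name `pac` are assumptions chosen for illustration.

```python
def pac(consensus, lower=0.1, upper=0.9):
    """Fraction of item pairs with consensus value in (lower, upper]."""
    n = len(consensus)
    # Only the upper triangle is needed because the matrix is symmetric.
    pairs = [consensus[i][j] for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    ambiguous = sum(1 for value in pairs if lower < value <= upper)
    return ambiguous / len(pairs)


# Three items: one pair always co-clustered, one never, one half of the time.
toy_consensus = [
    [1.0, 1.0, 0.5],
    [1.0, 1.0, 0.0],
    [0.5, 0.0, 1.0],
]
toy_pac = pac(toy_consensus)
```

A PAC near 0 indicates that almost every pair is either consistently together or consistently apart, which is the usual reading of a stable clustering.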

pepkit

pepr:Reading Portable Encapsulated Projects

A PEP, or Portable Encapsulated Project, is a dataset that subscribes to the PEP structure for organizing metadata. Written in a simple YAML + CSV format, it is your one-stop solution to metadata management across data analysis environments. This package reads this standardized project configuration structure into R. Described in Sheffield et al. (2021) <doi:10.1093/gigascience/giab077>.

Maintained by Nathan Sheffield. Last updated 1 year ago.

bioinformatics metadata

3 stars 5.62 score 20 scripts

g3viz

g3viz:Interactively Visualize Genetic Mutation Data using a Lollipop-Diagram

Interface for 'g3-lollipop' 'JavaScript' library. Visualize genetic mutation data using an interactive lollipop diagram in 'RStudio' or your web browser.

Maintained by Xin Guo. Last updated 7 months ago.

bioinformatics genomics-visualization lollipop-plot variants visualize-mutation-data

31 stars 5.61 score 22 scripts

jmw86069

jamba:Just Analysis Methods Base

Just analysis methods ('jam') base functions focused on bioinformatics. Version- and gene-centric alphanumeric sort, unique name and version assignment, colorized console and 'HTML' output, color ramp and palette manipulation, 'Rmarkdown' cache import, styled 'Excel' worksheet import and export, interpolated raster output from smooth scatter and image plots, list to delimited vector, efficient list tools.

Maintained by James M. Ward. Last updated 21 hours ago.

bioinformatics

6 stars 5.52 score

multimeric

TidyMultiqc:Converts 'MultiQC' Reports into Tidy Data Frames

Provides the means to convert 'multiqc_data.json' files, produced by the wonderful 'MultiQC' tool, into tidy data frames for downstream analysis in R. This analysis might involve cohort analysis, quality control visualisation, change-point detection, statistical process control, clustering, or any other type of quality analysis.

Maintained by Michael Milton. Last updated 3 years ago.

bioinformatics

18 stars 5.50 score 35 scripts

bioc

TDbasedUFE:Tensor Decomposition Based Unsupervised Feature Extraction

This is a comprehensive package for tensor decomposition based unsupervised feature extraction. It is applicable to gene expression, DNA methylation, histone modification, and similar data, and it can perform multiomics analysis. It is also potentially applicable to single cell omics data sets.

Maintained by Y-h. Taguchi. Last updated 5 months ago.

geneexpression featureextraction methylationarray singlecell bioinformatics dna-methylation gene-expression-profiles histone-modifications multiomics tensor-decomposition

5 stars 5.48 score 9 scripts 1 dependent

bioc

BERT:High Performance Data Integration for Large-Scale Analyses of Incomplete Omic Profiles Using Batch-Effect Reduction Trees (BERT)

Provides efficient batch-effect adjustment of data with missing values. BERT arranges all batch-effect correction steps into a tree of pairwise computations and allows parallelization over sub-trees.

Maintained by Yannis Schumann. Last updated 2 months ago.

batcheffect preprocessing experimentaldesign qualitycontrol batch-effect bioconductor-package bioinformatics data-integration data-science

2 stars 5.40 score 18 scripts

c4tb

shinyExprPortal:A Configurable 'shiny' Portal for Sharing Analysis of Molecular Expression Data

Enables deploying configuration file-based 'shiny' apps with minimal programming for interactive exploration and analysis showcase of molecular expression data. For exploration, supports visualization of correlations between rows of an expression matrix and a table of observations, such as clinical measures, and comparison of changes in expression over time. For showcase, enables visualizing the results of differential expression from packages such as 'limma', co-expression modules from 'WGCNA' and lower dimensional projections.

Maintained by Rafael Henkin. Last updated 8 months ago.

bioinformatics data-analysis transcriptomics

5 stars 5.30 score 8 scripts

bioc

biotmle:Targeted Learning with Moderated Statistics for Biomarker Discovery

Tools for differential expression biomarker discovery based on microarray and next-generation sequencing data that leverage efficient semiparametric estimators of the average treatment effect for variable importance analysis. Estimation and inference of the (marginal) average treatment effects of potential biomarkers are computed by targeted minimum loss-based estimation, with joint, stable inference constructed across all biomarkers using a generalization of moderated statistics for use with the estimated efficient influence function. The procedure accommodates the use of ensemble machine learning for the estimation of nuisance functions.

Maintained by Nima Hejazi. Last updated 5 months ago.

regression geneexpression differentialexpression sequencing microarray rnaseq immunooncology bioconductor bioconductor-package bioconductor-packages bioinformatics biomarker-discovery biostatistics causal-inference computational-biology machine-learning statistics targeted-learning

5 stars 5.30 score 5 scripts

bioc

DEWSeq:Differential Expressed Windows Based on Negative Binomial Distribution

DEWSeq is a sliding window approach for the analysis of differentially enriched binding regions in eCLIP or iCLIP next generation sequencing data.

Maintained by bioinformatics team Hentze. Last updated 5 months ago.

sequencing generegulation functionalgenomics differentialexpression bioinformatics eclip ngs-analysis

5 stars 5.30 score 4 scripts

bioc

epistack:Heatmaps of Stack Profiles from Epigenetic Signals

The main objective of the epistack package is the visualization of stacks of genomic tracks (such as, but not restricted to, ChIP-seq, ATAC-seq, DNA methylation or genomic conservation data) centered at genomic regions of interest. epistack needs three different inputs: 1) a genomic score object, such as ChIP-seq coverage or DNA methylation values, provided as a `GRanges` (easily obtained from `bigwig` or `bam` files). 2) a list of features of interest, such as peaks or transcription start sites, provided as a `GRanges` (easily obtained from `gtf` or `bed` files). 3) a score to sort the features, such as peak height or gene expression value.

Maintained by DEVAILLY Guillaume. Last updated 5 months ago.

rnaseq preprocessing chipseq geneexpression coverage bioinformatics

6 stars 5.26 score 5 scripts

piplus2

SPUTNIK:Spatially Automatic Denoising for Imaging Mass Spectrometry Toolkit

Set of tools for peak filtering of mass spectrometry imaging data based on spatial distribution of signal. Given a region-of-interest, representing the spatial region where the informative signal is expected to be localized, a series of filters determine which peak signals are characterized by an implausible spatial distribution. The filters reduce the dataset dimension and increase its information-to-noise ratio, improving the quality of the unsupervised analysis results and simplifying the chemical interpretation. The methods are described in Inglese P. et al (2019) <doi:10.1093/bioinformatics/bty622>.

Maintained by Paolo Inglese. Last updated 12 months ago.

bioinformatics desi-msi image-processing maldi-msi maldi-tof-ms mass-spectrometry mass-spectrometry-imaging

4 stars 5.24 score 43 scripts

wheretrue

exonr:Scientific Data Processing

This package provides a set of tools for processing scientific data. It's based on the exon Rust package.

Maintained by Trent Hauck. Last updated 2 months ago.

arrow bioinformatics datafusion ngs proteomics rust sql cargo

59 stars 5.23 score 2 scripts

gitter-lab

LPWC:Lag Penalized Weighted Correlation for Time Series Clustering

Computes a time series distance measure for clustering based on weighted correlation and introduction of lags. The lags capture delayed responses in a time series dataset. The timepoints must be specified. T. Chandereng, A. Gitter (2020) <doi:10.1186/s12859-019-3324-1>.

Maintained by Thevaa Chandereng. Last updated 5 years ago.

bioinformatics clustering time-series

20 stars 5.23 score 17 scripts

hautaniemilab

jellyfisher:Visualize Spatiotemporal Tumor Evolution with Jellyfish Plots

Generates interactive Jellyfish plots to visualize spatiotemporal tumor evolution by integrating sample and phylogenetic trees into a unified plot. This approach provides an intuitive way to analyze tumor heterogeneity and evolution over time and across anatomical locations. The Jellyfish plot visualization design was first introduced by Lahtinen, Lavikka, et al. (2023, <doi:10.1016/j.ccell.2023.04.017>). This package also supports visualizing ClonEvol results, a tool developed by Dang, et al. (2017, <doi:10.1093/annonc/mdx517>), for analyzing clonal evolution from multi-sample sequencing data. The 'clonevol' package is not available on CRAN but can be installed from its GitHub repository (<https://github.com/hdng/clonevol>).

Maintained by Kari Lavikka. Last updated 30 days ago.

visualization phylogenetics software spatial bioinformatics phylogenetic-analysis tumor-evolution tumor-heterogeneity visualization-tool

3 stars 5.18 score 2 scripts

nanxstats

ssw:Striped Smith-Waterman Algorithm for Sequence Alignment using SIMD

Provides an R interface for 'SSW' (Striped Smith-Waterman) via its 'Python' binding 'ssw-py'. 'SSW' is a fast 'C' and 'C++' implementation of the Smith-Waterman algorithm for pairwise sequence alignment using Single-Instruction-Multiple-Data (SIMD) instructions. 'SSW' enhances the standard algorithm by efficiently returning alignment information and suboptimal alignment scores. The core 'SSW' library offers performance improvements for various bioinformatics tasks, including protein database searches, short-read alignments, primary and split-read mapping, structural variant detection, and read-overlap graph generation. These features make 'SSW' particularly useful for genomic applications. Zhao et al. (2013) <doi:10.1371/journal.pone.0082138> developed the original 'C' and 'C++' implementation.

Maintained by Nan Xiao. Last updated 6 months ago.

bioinformatics reticulate sequence-alignment simd smith-waterman

6 stars 5.18 score
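
The Smith-Waterman local alignment that the ssw entry above builds on can be shown with a plain dynamic-programming sketch in Python. This is the textbook scalar algorithm, not the striped SIMD variant that 'SSW' implements, and it is not the API of the ssw package or of 'ssw-py'. The scoring values (match +2, mismatch -1, linear gap penalty -1) and the function name `smith_waterman_score` are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between strings a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                           # start a new local alignment
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


identical = smith_waterman_score("ACGT", "ACGT")
embedded = smith_waterman_score("GATTACA", "TTAC")
unrelated = smith_waterman_score("AAAA", "TTTT")
```

Clamping every cell at zero is what makes the alignment local: a poorly matching prefix never penalizes a good match that starts later in either sequence.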

bioc

flowDensity:Sequential Flow Cytometry Data Gating

This package provides tools for automated sequential gating analogous to the manual gating strategy based on the density of the data.

Maintained by Mehrnoush Malek. Last updated 5 months ago.

bioinformatics flowcytometry cellbiology clustering cancer flowcytdata datarepresentation stemcell densitygating

5.17 score 83 scripts 3 dependents

bioc

SGCP:SGCP: A semi-supervised pipeline for gene clustering using self-training approach in gene co-expression networks

SGC is a semi-supervised pipeline for gene clustering in gene co-expression networks. SGC consists of multiple novel steps that enable the computation of highly enriched modules in an unsupervised manner. But unlike all existing frameworks, it further incorporates a novel step that leverages Gene Ontology information in a semi-supervised clustering method that further improves the quality of the computed modules.

Maintained by Niloofar AghaieAbiane. Last updated 5 months ago.

geneexpression genesetenrichment networkenrichment systemsbiology classification clustering dimensionreduction graphandnetwork neuralnetwork network mrnamicroarray rnaseq visualization bioinformatics genecoexpressionnetwork graphs networkclustering networks self-training semi-supervised-learning unsupervised-learning

2 stars 5.12 score 44 scripts

bioc

cTRAP:Identification of candidate causal perturbations from differential gene expression data

Compare differential gene expression results with those from known cellular perturbations (such as gene knock-down, overexpression or small molecules) derived from the Connectivity Map. Such analyses make it possible not only to infer the molecular causes of the observed difference in gene expression but also to identify small molecules that could drive or revert specific transcriptomic alterations.

Maintained by Nuno Saraiva-Agostinho. Last updated 5 months ago.

differentialexpression geneexpression rnaseq transcriptomics pathways immunooncology genesetenrichment bioconductor bioinformatics cmap gene-expression l1000

5 stars 5.08 score 16 scripts

martinloza

Canek:Batch Correction of Single Cell Transcriptome Data

Non-linear/linear hybrid method for batch-effect correction that uses Mutual Nearest Neighbors (MNNs) to identify similar cells between datasets. Reference: Loza M. et al. (NAR Genomics and Bioinformatics, 2020) <doi:10.1093/nargab/lqac022>.

Maintained by Martin Loza. Last updated 1 year ago.

batch-effects bioinformatics single-cell-rna-seq transcriptomics

5 stars 5.06 score 23 scripts

bioc

mobileRNA:mobileRNA: Investigate the RNA mobilome & population-scale changes

Genomic analysis can be utilised to identify differences between RNA populations in two conditions, both in production and abundance. This includes the identification of RNAs produced by multiple genomes within a biological system. For example, RNA produced by pathogens within a host or mobile RNAs in plant graft systems. The mobileRNA package provides methods to pre-process, analyse and visualise the sRNA and mRNA populations based on the premise of mapping reads to all genotypes at the same time.

Maintained by Katie Jeynes-Cupper. Last updated 5 months ago.

visualization rnaseq sequencing smallrna genomeassembly clustering experimentaldesign qualitycontrol workflowstep alignment preprocessing bioinformatics plant-science

4 stars 5.00 score 2 scripts

luciorq

condathis:Run Any CLI Tool on a 'Conda' Environment

Simplifies the execution of command line interface (CLI) tools within isolated and reproducible environments. It enables users to effortlessly manage 'Conda' environments, execute command line tools, handle dependencies, and ensure reproducibility in their data analysis workflows.

Maintained by Lucio Queiroz. Last updated 17 days ago.

bioinformatics conda reproducibility reproducible-research

10 stars 5.00 score 1 script

nanxstats

grex:Gene ID Mapping for Genotype-Tissue Expression (GTEx) Data

Convert 'Ensembl' gene identifiers from Genotype-Tissue Expression (GTEx) data to identifiers in other annotation systems, including 'Entrez', 'HGNC', and 'UniProt'.

Maintained by Nan Xiao. Last updated 3 years ago.

bioinformatics gene-expression genotype-tissue-expression gtex

8 stars 4.96 score 23 scripts

bioc

rRDP:Interface to the RDP Classifier

This package installs and interfaces the naive Bayesian classifier for 16S rRNA sequences developed by the Ribosomal Database Project (RDP). With this package the classifier trained with the standard training set can be used or a custom classifier can be trained.

Maintained by Michael Hahsler. Last updated 5 months ago.

genetics sequencing infrastructure classification microbiome immunooncology alignment sequencematching dataimport bayesian bioconductor bioinformatics openjdk

3 stars 4.88 score 6 scripts

bioc

rTRM:Identification of Transcriptional Regulatory Modules from Protein-Protein Interaction Networks

rTRM identifies transcriptional regulatory modules (TRMs) from protein-protein interaction networks.

Maintained by Diego Diez. Last updated 5 months ago.

transcription network generegulation graphandnetwork bioconductor bioinformatics

3 stars 4.86 score 3 scripts 1 dependent

bioc

packFinder:de novo Annotation of Pack-TYPE Transposable Elements

Algorithm and tools for in silico pack-TYPE transposon discovery. Filters a given genome for properties unique to DNA transposons and provides tools for the investigation of returned matches. Sequences are input in DNAString format, and ranges are returned as a data frame (in the format returned by as.data.frame(GRanges)).

Maintained by Jack Gisby. Last updated 5 months ago.

genetics sequencematching annotation bioinformatics text-mining

7 stars 4.85 score 6 scripts

samilhll

macrosyntR:Draw Ordered Oxford Grids

Use standard genomics file format (BED) and a table of orthologs to illustrate synteny conservation at the genome-wide scale. Significantly conserved linkage groups are identified as described in Simakov et al. (2020) <doi:10.1038/s41559-020-1156-z> and displayed on an Oxford Grid (Edwards (1991) <doi:10.1111/j.1469-1809.1991.tb00394.x>) or a chord diagram as in Simakov et al. (2022) <doi:10.1126/sciadv.abi5884>. The package provides a function that uses a network-based greedy algorithm to find communities (Clauset et al. (2004) <doi:10.1103/PhysRevE.70.066111>) and so automatically order the chromosomes on the plot to improve interpretability.

Maintained by Sami El Hilali. Last updated 10 months ago.

bioinformatics genomic-visualizations genomics

14 stars 4.85 score 5 scripts

sysbiolab

PathwaySpace:Spatial Projection of Network Signals along Geodesic Paths

For a given graph containing vertices, edges, and a signal associated with the vertices, the 'PathwaySpace' package performs a convolution operation, which involves a weighted combination of neighboring vertices and their associated signals. The package then uses a decay function to project these signals, creating geodesic paths on a 2D-image space. 'PathwaySpace' could have various applications, such as visualizing and analyzing network data in a graphical format that highlights the relationships and signal strengths between vertices. It can be particularly useful for understanding the influence of signals through complex networks. By combining graph theory, signal processing, and visualization, the 'PathwaySpace' package provides a novel way of representing and analyzing graph data.

Maintained by Mauro Castro. Last updated 3 months ago.

bioinformatics biological-networks graph

2 stars 4.85 score 5 scripts

bioc

rmelting:R Interface to MELTING 5

R interface to the MELTING 5 program (https://www.ebi.ac.uk/biomodels/tools/melting/) to compute melting temperatures of nucleic acid duplexes along with other thermodynamic parameters.

Maintained by J. Aravind. Last updated 5 months ago.

biomedicalinformatics cheminformatics bioconductor bioinformatics melting-temperature openjdk

2 stars 4.78 score 10 scripts

moseleybioinformaticslab

visualizationQualityControl:Development of visualization methods for quality control

Provides utilities useful for quality control of high-throughput -omics datasets.

Maintained by Robert M Flight. Last updated 1 year ago.

bioinformatics correlation quality-control visualization

10 stars 4.78 score 30 scripts

andersgs

harrietr:Wrangle Phylogenetic Distance Matrices and Other Utilities

Harriet was Charles Darwin's pet tortoise (possibly). 'harrietr' implements some functions to manipulate distance matrices and phylogenetic trees to make it easier to plot with 'ggplot2' and to manipulate using 'tidyverse' tools.

Maintained by Anders Gonçalves da Silva. Last updated 7 years ago.

bioinformatics evolution phylogenetics

12 stars 4.78 score 50 scripts

bioc

GEOfastq:Downloads ENA Fastqs With GEO Accessions

GEOfastq is used to download fastq files from the European Nucleotide Archive (ENA) starting with an accession from the Gene Expression Omnibus (GEO). To do this, sample metadata is retrieved from GEO and the Sequence Read Archive (SRA). SRA run accessions are then used to construct FTP and aspera download links for fastq files generated by the ENA.

Maintained by Alex Pickering. Last updated 5 months ago.

rnaseq dataimport bioinformatics fastq gene-expression geo rna-seq

4 stars 4.60 score 6 scripts

bioc

methylInheritance:Permutation-Based Analysis associating Conserved Differentially Methylated Elements Across Multiple Generations to a Treatment Effect

Permutation analysis, based on Monte Carlo sampling, for testing the hypothesis that the number of conserved differentially methylated elements between several generations is associated with an effect inherited from a treatment, and that a stochastic effect can be dismissed.

Maintained by Astrid Deschênes. Last updated 5 months ago.

biologicalquestion epigenetics dnamethylation differentialmethylation methylseq software immunooncology statisticalmethod wholegenome sequencing analysis bioconductor bioinformatics cpg differentially-methylated-elements inheritance monte-carlo-sampling permutation

4.60 score 1 script

bioc

meshr:Tools for conducting enrichment analysis of MeSH

A set of annotation maps describing the entire MeSH assembled using data from MeSH.

Maintained by Koki Tsuyuzaki. Last updated 5 months ago.

annotationdata functionalannotation bioinformatics statistics annotation multiplecomparisons meshdb

4.56 score 9 scripts 1 dependent

urniaz

kmeRs:K-Mers Similarity Score Matrix and HeatMap

Similarity Score Matrix and HeatMap for nucleic and amino acid k-mers. Similarity score is evaluated by Point Accepted Mutation (PAM) and BLOcks SUbstitution Matrix (BLOSUM). The 30, 40, 70, 120, 250 and 62, 45, 50, 62, 80, 100 matrix versions are available for PAM and BLOSUM, respectively. Alignment is evaluated by local and global alignment.

Maintained by Rafal Urniaz. Last updated 7 months ago.

software amino-acids bioinformatics nucleic-acids similarity-matrix

4.54 score 3 scripts

sergejruff

Virusparies:Visualize and Process Output from 'VirusHunterGatherer'

A collection of tools for downstream analysis of 'VirusHunterGatherer' output. Processing of hittables and plotting of results, enabling better interpretation, is made easier with the provided functions.

Maintained by Ruff Sergej. Last updated 3 months ago.

bioinformatics data-driven discover discovery ggplot2 graphical-table hidden-markov-model hmmlearn plot r-programming summary-statistics virus virus-discovery virus-scanning virusgatherer virushunter virushuntergatherer visualization

1 star 4.49 score 28 scripts

bioc

ggseqalign:Minimal Visualization of Sequence Alignments

Simple visualizations of alignments of DNA or AA sequences as well as arbitrary strings. Compatible with Biostrings and ggplot2. The plots are fully customizable using ggplot2 modifiers such as theme().

Maintained by Simeon Lim Rossmann. Last updated 25 days ago.

alignment multiplesequencealignment software visualization bioinformatics ggplot2-enhancements minimalistic

4.48 score

bioc

TDbasedUFEadv:Advanced package of tensor decomposition based unsupervised feature extraction

This is an advanced version of TDbasedUFE, which is a comprehensive package to perform tensor decomposition based unsupervised feature extraction. In contrast to TDbasedUFE, which performs simple feature selection and multiomics analyses, this package provides more complicated and advanced features that are less commonly required. Only users who need these more specific features are expected to make use of its functionality.

Maintained by Y-h. Taguchi. Last updated 5 months ago.

geneexpression featureextraction methylationarray singlecell software bioconductor-package bioinformatics tensor-decomposition

4.48 score 4 scripts

bioc

biobtreeR:Using biobtree tool from R

The biobtreeR package provides an interface to the [biobtree](https://github.com/tamerh/biobtree) tool, which covers a large set of bioinformatics datasets and provides search and chain-mapping functionality.

Maintained by Tamer Gur. Last updated 5 months ago.

annotation bioinformatics

3 stars 4.48 score 3 scripts

yannabraham

bodenmiller:Profiling of Peripheral Blood Mononuclear Cells using CyTOF

This data package contains a subset of the Bodenmiller et al, Nat Biotech 2012 dataset for testing single cell, high dimensional analysis and visualization methods.

Maintained by Yann Abraham. Last updated 4 years ago.

bioinformatics cytof dataset science

2 stars 4.45 score 28 scripts

kumes

chatAI4R:Chat-Based Interactive Artificial Intelligence for R

The Large Language Model (LLM) represents a groundbreaking advancement in data science and programming, and also allows us to extend the world of R. A seamless interface for integrating the 'OpenAI' Web APIs into R is provided in this package. This package leverages LLM-based AI techniques, enabling efficient knowledge discovery and data analysis (see 'OpenAI' Web APIs details <https://openai.com/blog/openai-api>). The previous functions such as seamless translation and image generation have been moved to other packages 'deepRstudio' and 'stableDiffusion4R'.

Maintained by Satoshi Kume. Last updated 1 month ago.

ai bioinformatics chatgpt gpt image image-generation

14 stars 4.45 score 3 scripts

ftwkoopmans

goat:Gene Set Analysis Using the Gene Set Ordinal Association Test

Perform gene set enrichment analyses using the Gene set Ordinal Association Test (GOAT) algorithm and visualize your results. Koopmans, F. (2024) <doi:10.1038/s42003-024-06454-5>.

Maintained by Frank Koopmans. Last updated 1 month ago.

bioinformatics geneset-enrichment geneset-enrichment-analysis cpp openmp

10 stars 4.40 score 8 scripts

xueyuancao

GSDA:Gene Set Distance Analysis (GSDA)

The gene-set distance analysis of omic data is implemented by generalizing distance correlations to evaluate the association of a gene set with categorical and censored event-time variables.

Maintained by Xueyuan Cao. Last updated 4 years ago.

microarray bioinformatics gene expression

1 star 4.30 score 8 scripts

bioc

getDEE2:Programmatic access to the DEE2 RNA expression dataset

Digital Expression Explorer 2 (or DEE2 for short) is a repository of processed RNA-seq data in the form of counts. It was designed so that researchers could undertake re-analysis and meta-analysis of published RNA-seq studies quickly and easily. As of April 2020, over 1 million SRA datasets have been processed. This package provides an R interface to access these expression data. More information about the DEE2 project can be found at the project homepage (http://dee2.io) and main publication (https://doi.org/10.1093/gigascience/giz022).

Maintained by Mark Ziemann. Last updated 2 months ago.

geneexpression transcriptomics sequencing bioinformatics data-mining genomics rna-expression rna-seq

4 stars 4.20 score 5 scripts

bioc

ReducedExperiment:Containers and tools for dimensionally-reduced -omics representations

Provides SummarizedExperiment-like containers for storing and manipulating dimensionally-reduced assay data. The ReducedExperiment classes allow users to simultaneously manipulate their original dataset and their decomposed data, in addition to other method-specific outputs like feature loadings. Implements utilities and specialised classes for the application of stabilised independent component analysis (sICA) and weighted gene correlation network analysis (WGCNA).

Maintained by Jack Gisby. Last updated 2 months ago.

geneexpression infrastructure datarepresentation software dimensionreduction network bioconductor-package bioinformatics dimensionality-reduction

3 stars 4.13 score 8 scripts

luciorq

isoformic:Isoform-Level Biological Interpretation of Transcriptomic Data

Isoform-level biological interpretation of transcriptomic data.

Maintained by Lucio Rezende Queiroz. Last updated 4 months ago.

bioconductor bioinformatics transcriptomics

22 stars 4.12 score 4 scripts

bioc

ccrepe:ccrepe_and_nc.score

The CCREPE (Compositionality Corrected by REnormalization and PErmutation) package is designed to assess the significance of general similarity measures in compositional datasets. In microbial abundance data, for example, the total abundances of all microbes sum to one; CCREPE is designed to take this constraint into account when assigning p-values to similarity measures between the microbes. The package has two functions. ccrepe calculates similarity measures, p-values and q-values for relative abundances of bugs in one or two body sites using bootstrap and permutation matrices of the data. nc.score calculates species-level co-variation and co-exclusion patterns based on an extension of the checkerboard score to ordinal data.

Maintained by Emma Schwager. Last updated 5 months ago.

immunooncology statistics metagenomics bioinformatics software

4.08 score 7 scripts

sbg

biocompute:Create and Manipulate BioCompute Objects

Tools to create, validate, and export BioCompute Objects described in King et al. (2019) <doi:10.17605/osf.io/h59uh>. Users can encode information in data frames, and compose BioCompute Objects from the domains defined by the standard. A checksum validator and a JSON schema validator are provided. This package also supports exporting BioCompute Objects as JSON, PDF, HTML, or 'Word' documents, and exporting to cloud-based platforms.

Maintained by Soner Koc. Last updated 10 months ago.

biocompute biocompute-objects bioinformatics science-communication sevenbridges standardization workflow

3 stars 4.07 score 13 scripts

vivianstats

scINSIGHT:Interpretation of Heterogeneous Single-Cell Gene Expression Data

We develop a novel matrix factorization tool named 'scINSIGHT' to jointly analyze multiple single-cell gene expression samples from biologically heterogeneous sources, such as different disease phases, treatment groups, or developmental stages. Given multiple gene expression samples from different biological conditions, 'scINSIGHT' simultaneously identifies common and condition-specific gene modules and quantifies their expression levels in each sample in a lower-dimensional space. With the factorized results, the inferred expression levels and memberships of common gene modules can be used to cluster cells and detect cell identities, and the condition-specific gene modules can help compare functional differences in transcriptomes from distinct conditions. Please also see Qian K, Fu SW, Li HW, Li WV (2022) <doi:10.1186/s13059-022-02649-3>.

Maintained by Kun Qian. Last updated 3 years ago.

bioinformatics gene-expression integration scrna-seq openblas cpp

21 stars 4.02 score 10 scripts

ambuvjyn

baseq:Basic Sequence Processing Tool for Biological Data

Primarily created as an easy and understandable way to perform basic sequence operations surrounding the central dogma of molecular biology.

Maintained by Ambu Vijayan. Last updated 2 years ago.

bioinformatics sequencing

2 stars 4.00 score

stephenturner

kgp:1000 Genomes Project Metadata

Metadata about populations and data about samples from the 1000 Genomes Project, including the 2,504 samples sequenced for the Phase 3 release and the expanded collection of 3,202 samples with 602 additional trios. The data is described in Auton et al. (2015) <doi:10.1038/nature15393> and Byrska-Bishop et al. (2022) <doi:10.1016/j.cell.2022.08.004>, and raw data is available at <http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/>. See Turner (2022) <doi:10.48550/arXiv.2210.00539> for more details.

Maintained by Stephen Turner. Last updated 2 years ago.

1000genomes bioinformatics genetics genomics metadata population-genetics sequencing

20 stars 4.00 score 3 scripts

cnuge

debar:A Post-Clustering Denoiser for COI-5P Barcode Data

The 'debar' sequence processing pipeline is designed for denoising high throughput sequencing data for the animal DNA barcode marker cytochrome c oxidase I (COI). The package is designed to detect and correct insertion and deletion errors within sequencer outputs. This is accomplished through comparison of input sequences against a profile hidden Markov model (PHMM) using the Viterbi algorithm (for algorithm details see Durbin et al. 1998, ISBN: 9780521629713). Inserted base pairs are removed and deleted base pairs are accounted for through the introduction of a placeholder character. Since the PHMM is a probabilistic representation of the COI barcode, corrections are not always perfect. For this reason 'debar' censors base pairs adjacent to reported indel sites, turning them into placeholder characters (default is 7 base pairs in either direction; this feature can be disabled). Testing has shown that this censorship results in the correct sequence length being restored, and erroneous base pairs being masked the vast majority of the time (>95%).

Maintained by Cameron M. Nugent. Last updated 1 year ago.

bioinformatics denoising dna-barcoding dna-sequencing hidden-markov-model machine-learning

1 star 4.00 score 8 scripts
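
The censoring step mentioned in the 'debar' description can be illustrated with a minimal sketch. This is not the package's code or API: the function name `censor_around_indels`, the `N` placeholder, and the choice to mask the indel position itself along with its neighbours are all assumptions made for illustration. Only the default window of 7 base pairs in either direction, and the fact that the feature can be disabled, come from the description.

```python
def censor_around_indels(seq, indel_positions, width=7, placeholder="N"):
    """Mask bases within `width` positions of each reported indel site.

    Hypothetical helper, not part of 'debar'. Positions are 0-based.
    A width of 0 or less disables censoring and returns seq unchanged.
    """
    if width <= 0:
        return seq
    masked = list(seq)
    for pos in indel_positions:
        # Clamp the window to the sequence boundaries.
        start = max(0, pos - width)
        end = min(len(seq), pos + width + 1)
        for i in range(start, end):
            masked[i] = placeholder
    return "".join(masked)


# Example with a narrow window so the effect is easy to see.
print(censor_around_indels("ACGTACGTACGT", [5], width=2))  # ACGNNNNNACGT
```

Masking trades a few correct bases for a lower chance of keeping an erroneous one near an imperfectly corrected indel.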

bioc

geneXtendeR:Optimized Functional Annotation Of ChIP-seq Data

geneXtendeR optimizes the functional annotation of ChIP-seq peaks by exploring relative differences in annotating ChIP-seq peak sets to variable-length gene bodies. In contrast to prior techniques, geneXtendeR considers peak annotations beyond just the closest gene, allowing users to see peak summary statistics for the first-closest gene, second-closest gene, ..., n-closest gene whilst ranking the output according to biologically relevant events and iteratively comparing the fidelity of peak-to-gene overlap across a user-defined range of upstream and downstream extensions on the original boundaries of each gene's coordinates. Since different ChIP-seq peak callers produce different differentially enriched peaks with a large variance in peak length distribution and total peak count, annotating peak lists with their nearest genes can often be a noisy process. As such, the goal of geneXtendeR is to robustly link differentially enriched peaks with their respective genes, thereby aiding experimental follow-up and validation in designing primers for a set of prospective gene candidates during qPCR.

Maintained by Bohdan Khomtchouk. Last updated 5 months ago.

chipseq genetics annotation genomeannotation differentialpeakcalling coverage peakdetection chiponchip histonemodification dataimport naturallanguageprocessing visualization go software bioconductor bioinformatics c chip-seq computational-biology epigenetics functional-annotation

9 stars 3.95 score 5 scripts

biogenies

CancerGram:Prediction of Anticancer Peptides

Predicts anticancer peptides using random forests trained on the n-gram encoded peptides. The implemented algorithm can be accessed from both the command line and shiny-based GUI. The CancerGram model is too large for CRAN and it has to be downloaded separately from the repository: <https://github.com/BioGenies/CancerGramModel>. For more information see: Burdukiewicz et al. (2020) <doi:10.3390/pharmaceutics12111045>.

Maintained by Michal Burdukiewicz. Last updated 4 years ago.

anticancer-peptides bioinformatics k-mer n-gram peptide-identification random-forests

4 stars 3.90 score 3 scripts

mhahsler

rMSA:Interface for Popular Multiple Sequence Alignment Tools

Seamlessly interfaces the Multiple Sequence Alignment software packages ClustalW, MAFFT, MUSCLE and Kalign (downloaded separately) and provides support to calculate distances between sequences. This work was partially supported by grant no. R21HG005912 from the National Human Genome Research Institute.

Maintained by Michael Hahsler. Last updated 10 months ago.

genetics sequencing infrastructure alignment bioinformatics sequence-alignment

12 stars 3.78 score 7 scripts

sudoms

freqpcr:Estimates Allele Frequency on qPCR DeltaDeltaCq from Bulk Samples

Interval estimation of the population allele frequency from qPCR analysis based on the restriction enzyme digestion (RED)-DeltaDeltaCq method (Osakabe et al. 2017, <doi:10.1016/j.pestbp.2017.04.003>), as well as general DeltaDeltaCq analysis. Compatible with the Cq measurement of DNA extracted from multiple individuals at once, so called "group-testing", this model assumes that the quantity of DNA extracted from an individual organism follows a gamma distribution. Therefore, the point estimate is robust regarding the uncertainty of the DNA yield.

Maintained by Masaaki Sudo. Last updated 3 years ago.

bioinformatics frequency-estimation pcr statistics

1 star 3.70 score 4 scripts

bioc

mimager:mimager: The Microarray Imager

Easily visualize and inspect microarrays for spatial artifacts.

Maintained by Aaron Wolen. Last updated 5 months ago.

infrastructure visualization microarray bioconductor bioinformatics

3.70 score 3 scripts

seninp-bioinfo

MetaComp:EDGE Taxonomy Assignments Visualization

Implements routines for metagenome sample taxonomy assignments collection, aggregation, and visualization. Accepts the EDGE-formatted output from GOTTCHA/GOTTCHA2, BWA, Kraken, MetaPhlAn, DIAMOND, and Pangia. Produces SVG and PDF heatmap-like plots comparing taxa abundances across projects.

Maintained by Pavel Senin. Last updated 7 years ago.

bioinformatics comparative-genomics edge heatmap metagenomics visualization

4 stars 3.60 score 7 scripts

imamachi-n

bridger2:Genome-Wide RNA Degradation Analysis Using BRIC-Seq Data

BRIC-seq is a genome-wide approach for determining RNA stability in mammalian cells. This package provides a series of functions for performing quality check of your BRIC-seq data, calculation of RNA half-life for each transcript and comparison of RNA half-lives between two conditions.

Maintained by Naoto Imamachi. Last updated 8 years ago.

bioinformatics bric-seq half-life ngs rna rpkm-values

3 stars 3.43 score 18 scripts

bioc

tRanslatome:Comparison between multiple levels of gene expression

Detection of differentially expressed genes (DEGs) from the comparison of two biological conditions (treated vs. untreated, diseased vs. normal, mutant vs. wild-type) among different levels of gene expression (transcriptome, translatome, proteome), using several statistical methods: Rank Product, Translational Efficiency, t-test, Limma, ANOTA, DESeq, edgeR. Possibility to plot the results with scatterplots, histograms, MA plots, standard deviation (SD) plots, coefficient of variation (CV) plots. Detection of significantly enriched post-transcriptional regulatory factors (RBPs, miRNAs, etc) and Gene Ontology terms in the lists of DEGs previously identified for the two expression levels. Comparison of GO terms enriched only in one of the levels or in both. Calculation of the semantic similarity score between the lists of enriched GO terms coming from the two expression levels. Visual examination and comparison of the enriched terms with heatmaps, radar plots and barplots.

Maintained by Toma Tebaldi. Last updated 5 months ago.

cellbiology generegulation regulation geneexpression differentialexpression microarray highthroughputsequencing qualitycontrol go multiplecomparisons bioinformatics

3.30 score 2 scripts

bioc

mosaics:MOSAiCS (MOdel-based one and two Sample Analysis and Inference for ChIP-Seq)

This package provides functions for fitting MOSAiCS and MOSAiCS-HMM, a statistical framework to analyze one-sample or two-sample ChIP-seq data of transcription factor binding and histone modification.

Maintained by Dongjun Chung. Last updated 5 months ago.

chipseq sequencing transcription genetics bioinformatics cpp

3.30 score 8 scripts

moseleybioinformaticslab

categoryCompare2:Meta-Analysis of High-Throughput Experiments Using Feature Annotations

Facilitates comparison of significant annotations (categories) generated on one or more feature lists. Interactive exploration is facilitated through the use of RCytoscape (heavily suggested).

Maintained by Robert M Flight. Last updated 5 months ago.

annotation go multiplecomparison pathways geneexpression bioconductor bioinformatics gene-annotation gene-expression gene-sets

1 star 2.48 score 9 scripts

cogdisreslab

BioPathNet:BioPathNet: Three Pod Analysis System

This package aims to provide a simple interface to perform the Three Pod Analysis of an RNA-Seq dataset. It also provides utility functions to perform the individual components.

Maintained by Ali Sajid Imami. Last updated 2 years ago.

bioinformatics bioinformatics-pipeline ilincs transcriptomics

2 stars 2.00 score 5 scripts

weiliang

MMDvariance:Detecting Differentially Variable Genes Using the Mixture of Marginal Distributions

Gene selection based on variance using the marginal distributions of gene profiles that are characterized by a mixture of three-component multivariate distributions. Please see the reference: Li X, Fu Y, Wang X, DeMeo DL, Tantisira K, Weiss ST, Qiu W. (2018) <doi:10.1155/2018/6591634>.

Maintained by Weiliang Qiu. Last updated 7 years ago.

bioinformatics differentialexpression

1.00 score 2 scripts

yixinzhang-stat

eLNNpairedCov:Model-Based Gene Selection for Paired Data

Model-based clustering for paired data based on the regression of a mixture of Bayesian hierarchical models on covariates. Zhang et al. (2023) <doi:10.1186/s12859-023-05556-x>.

Maintained by Yixin Zhang. Last updated 1 year ago.

bioinformatics differentialexpression

1.00 score

zhang-zeyu

countTransformers:Transform Counts in RNA-Seq Data Analysis

Provides data transformation functions to transform counts in RNA-seq data analysis. Please see the reference: Zhang Z, Yu D, Seo M, Hersh CP, Weiss ST, Qiu W. (2019) <doi:10.1038/s41598-019-41315-w>.

Maintained by Zeyu Zhang. Last updated 6 years ago.

bioinformatics differentialexpression

1.00 score 10 scripts