R-universe search: signatures

shixiangwang

sigminer:Extract, Analyze and Visualize Mutational Signatures for Genomic Variations

Genomic alterations including single nucleotide substitution, copy number alteration, etc. are the major forces for cancer initiation and development. Due to the specificity of molecular lesions caused by genomic alterations, we can generate characteristic alteration spectra, called 'signatures' (Wang, Shixiang, et al. (2021) <DOI:10.1371/journal.pgen.1009557> & Alexandrov, Ludmil B., et al. (2020) <DOI:10.1038/s41586-020-1943-3> & Steele Christopher D., et al. (2022) <DOI:10.1038/s41586-022-04738-6>). This package helps users to extract, analyze and visualize signatures from genomic alteration records, thus providing new insight into cancer study.

Maintained by Shixiang Wang. Last updated 5 months ago.

bayesian-nmf bioinformatics cancer-research cnv copynumber-signatures cosmic-signatures dbs easy-to-use indel mutational-signatures nmf nmf-extraction sbs signature-extraction somatic-mutations somatic-variants visualization cpp

107.1 match 150 stars 9.48 score 123 scripts 2 dependents

bioc

YAPSA:Yet Another Package for Signature Analysis

This package provides functions and routines for supervised analyses of mutational signatures (i.e., the signatures have to be known, cf. L. Alexandrov et al., Nature 2013 and L. Alexandrov et al., bioRxiv 2018). In particular, the family of functions LCD (LCD = linear combination decomposition) can use optimal signature-specific cutoffs, which take care of the different detectability of the different signatures. Moreover, the package provides different sets of mutational signatures, including the COSMIC and PCAWG SNV signatures and the PCAWG Indel signatures; the latter means that, with YAPSA, the concept of supervised analysis of mutational signatures is extended to Indel signatures. YAPSA also provides confidence intervals as computed by profile likelihoods and can perform signature analysis on a stratified mutational catalogue (SMC = stratify mutational catalogue) in order to analyze enrichment and depletion patterns for the signatures in different strata.

Maintained by Zuguang Gu. Last updated 5 months ago.

sequencing dnaseq somaticmutation visualization clustering genomicvariation statisticalmethod biologicalquestion

126.1 match 6.41 score 57 scripts

bioc

signifinder:Collection and implementation of public transcriptional cancer signatures

signifinder is an R package for computing and exploring a compendium of tumor signatures. It computes a variety of signatures, based on gene expression values, and returns single-sample scores. Currently, signifinder contains more than 60 distinct signatures collected from the literature, relating to multiple tumors and multiple cancer processes.

Maintained by Stefania Pirrotta. Last updated 2 months ago.

geneexpression genetarget immunooncology biomedicalinformatics rnaseq microarray reportwriting visualization singlecell spatial genesignaling

115.2 match 7 stars 6.40 score 15 scripts

cloudyr

aws.signature:Amazon Web Services Request Signatures

Generates version 2 and version 4 request signatures for Amazon Web Services ('AWS') <https://aws.amazon.com/> Application Programming Interfaces ('APIs') and provides a mechanism for retrieving credentials from environment variables, 'AWS' credentials files, and 'EC2' instance metadata. For use on 'EC2' instances, users will need to install the suggested package 'aws.ec2metadata' <https://cran.r-project.org/package=aws.ec2metadata>.

Maintained by Jonathan Stott. Last updated 3 years ago.

aws cloudyr iam request-signatures

56.3 match 31 stars 9.96 score 84 scripts 31 dependents
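
The aws.signature entry above mentions version 4 request signatures. As a rough illustration of what such a signature involves, the sketch below derives a Signature Version 4 signing key with a chain of HMAC-SHA256 steps and signs a string with it. It is written in Python with the standard library only and is not the R package's API; the function names, the secret, and the string to sign are hypothetical placeholders, and a real request would first need a properly built canonical request and string to sign.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of `message` under `key`."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive a SigV4 signing key: secret -> date -> region -> service -> 'aws4_request'."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature of an already assembled string to sign."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder inputs for illustration only; none of these are real credentials.
key = derive_signing_key("EXAMPLE-SECRET-NOT-REAL", "20240101", "us-east-1", "s3")
signature = sign(key, "placeholder string to sign")
print(signature)
```

Because the key is scoped to a date, region, and service, changing any of those inputs yields a different signing key and therefore a different signature.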

bioc

TBSignatureProfiler:Profile RNA-Seq Data Using TB Pathway Signatures

Gene signatures of TB progression, TB disease, and other TB disease states have been validated and published previously. This package aggregates known signatures and provides computational tools to apply them to other datasets. The TBSignatureProfiler makes it easy to profile RNA-Seq data using these signatures and includes common signature profiling tools including ASSIGN, GSVA, and ssGSEA. Original models for some gene signatures are also available. A Shiny app provides some of the functionality, alongside detailed command-line access.

Maintained by Aubrey R. Odom. Last updated 3 months ago.

geneexpression differentialexpression bioconductor-package biomarkers gene-signatures tuberculosis

74.3 match 12 stars 7.25 score 23 scripts

bioc

musicatk:Mutational Signature Comprehensive Analysis Toolkit

Mutational signatures are carcinogenic exposures or aberrant cellular processes that can cause alterations to the genome. We created musicatk (MUtational SIgnature Comprehensive Analysis ToolKit) to address shortcomings in versatility and ease of use in other pre-existing computational tools. Although many different types of mutational data have been generated, current software packages do not have a flexible framework to allow users to mix and match different types of mutations in the mutational signature inference process. Musicatk enables users to count and combine multiple mutation types, including SBS, DBS, and indels. Musicatk calculates replication strand, transcription strand and combinations of these features along with discovery from unique and proprietary genomic features associated with any mutation type. Musicatk also implements several methods for discovery of new signatures as well as methods to infer exposure given an existing set of signatures. Musicatk provides functions for visualization and downstream exploratory analysis including the ability to compare signatures between cohorts and find matching signatures in COSMIC V2 or COSMIC V3.

Maintained by Joshua D. Campbell. Last updated 5 months ago.

software biologicalquestion somaticmutation variantannotation

57.9 match 13 stars 7.02 score 20 scripts

bioc

genefu:Computation of Gene Expression-Based Signatures in Breast Cancer

This package contains functions implementing various tasks usually required by gene expression analysis, especially in breast cancer studies: gene mapping between different microarray platforms, identification of molecular subtypes, implementation of published gene signatures, gene selection, and survival analysis.

Maintained by Benjamin Haibe-Kains. Last updated 4 months ago.

differentialexpression geneexpression visualization clustering classification

45.0 match 7.42 score 193 scripts 3 dependents

bioc

MutationalPatterns:Comprehensive genome-wide analysis of mutational processes

Mutational processes leave characteristic footprints in genomic DNA. This package provides a comprehensive set of flexible functions that allows researchers to easily evaluate and visualize a multitude of mutational patterns in base substitution catalogues of e.g. healthy samples, tumour samples, or DNA-repair deficient cells. The package covers a wide range of patterns including: mutational signatures, transcriptional and replicative strand bias, lesion segregation, genomic distribution and association with genomic features, which are collectively meaningful for studying the activity of mutational processes. The package works with single nucleotide variants (SNVs), insertions and deletions (Indels), double base substitutions (DBSs) and larger multi base substitutions (MBSs). The package provides functionalities for both extracting mutational signatures de novo and determining the contribution of previously identified mutational signatures on a single sample level. MutationalPatterns integrates with common R genomic analysis workflows and allows easy association with (publicly available) annotation data.

Maintained by Mark van Roosmalen. Last updated 5 months ago.

genetics somaticmutation

42.6 match 7.27 score 251 scripts 1 dependent

bioc

UCell:Rank-based signature enrichment analysis for single-cell data

UCell is a package for evaluating gene signatures in single-cell datasets. UCell signature scores, based on the Mann-Whitney U statistic, are robust to dataset size and heterogeneity, and their calculation demands less computing time and memory than other available methods, enabling the processing of large datasets in a few minutes even on machines with limited computing power. UCell can be applied to any single-cell data matrix, and includes functions to directly interact with SingleCellExperiment and Seurat objects.

Maintained by Massimo Andreatta. Last updated 5 months ago.

singlecell genesetenrichment transcriptomics geneexpression cellbasedassays

24.1 match 143 stars 10.43 score 454 scripts 2 dependents
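
The UCell entry above says its scores are based on the Mann-Whitney U statistic. The following Python sketch shows one way a rank-based score of that kind can be computed for a single cell: genes are ranked by expression, the U statistic of the signature genes is computed from their ranks, and the result is rescaled to the range 0 to 1. This is an illustration of the general idea only; it is not UCell's implementation (which has its own handling of rank limits and ties), and the function names and toy expression values are made up for the example.

```python
def average_ranks_desc(values):
    """Return 1-based ranks (1 = largest value); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def u_score(expression, signature):
    """Score one cell: 1.0 if the signature genes are the top-ranked genes, 0.0 if they are the lowest."""
    genes = list(expression)
    ranks = dict(zip(genes, average_ranks_desc([expression[g] for g in genes])))
    present = [g for g in dict.fromkeys(signature) if g in ranks]
    n, total = len(present), len(genes)
    if n == 0 or n == total:
        raise ValueError("signature must cover some, but not all, measured genes")
    # Mann-Whitney U of the signature genes: rank sum minus its smallest possible value.
    u = sum(ranks[g] for g in present) - n * (n + 1) / 2
    return 1.0 - u / (n * (total - n))


# Toy single-cell profile (illustrative values, not real data).
cell = {"CD3E": 9.0, "CD2": 7.5, "ACTB": 5.0, "MS4A1": 0.5, "CD19": 0.0}
print(u_score(cell, ["CD3E", "CD2"]))    # signature genes ranked highest
print(u_score(cell, ["MS4A1", "CD19"]))  # signature genes ranked lowest
```

Since only ranks within a cell enter the score, it does not depend on the other cells in the dataset, which fits the robustness to dataset size and heterogeneity that the description claims.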

bioc

decompTumor2Sig:Decomposition of individual tumors into mutational signatures by signature refitting

Uses quadratic programming for signature refitting, i.e., to decompose the mutation catalog from an individual tumor sample into a set of given mutational signatures (either Alexandrov-model signatures or Shiraishi-model signatures), computing weights that reflect the contributions of the signatures to the mutation load of the tumor.

Maintained by Rosario M. Piro. Last updated 5 months ago.

software snp sequencing dnaseq genomicvariation somaticmutation biomedicalinformatics genetics biologicalquestion statisticalmethod

50.7 match 1 star 4.78 score 10 scripts 1 dependent
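
The decompTumor2Sig entry above describes signature refitting: decomposing one tumor's mutation catalog into weights over a set of given signatures. The package is described as using quadratic programming; the Python sketch below swaps in a simpler technique, multiplicative updates for non-negative least squares, to show the shape of the problem. The function names and the three-category toy signatures are hypothetical, and real catalogs have many more mutation categories.

```python
def refit_exposures(catalog, signatures, iterations=500):
    """Estimate non-negative exposures h so that sum_k h[k] * signatures[k] approximates catalog.

    Uses multiplicative updates h <- h * (S^T v) / (S^T S h), which keep h non-negative.
    """
    k = len(signatures)
    cats = range(len(catalog))
    b = [sum(s[c] * catalog[c] for c in cats) for s in signatures]                        # S^T v
    a = [[sum(si[c] * sj[c] for c in cats) for sj in signatures] for si in signatures]    # S^T S
    h = [1.0] * k
    for _ in range(iterations):
        ah = [sum(a[i][j] * h[j] for j in range(k)) for i in range(k)]
        h = [h[i] * b[i] / ah[i] if ah[i] > 0 else 0.0 for i in range(k)]
    return h


def to_weights(exposures):
    """Rescale exposures so that they sum to 1 (relative contributions)."""
    total = sum(exposures)
    return [x / total for x in exposures]


# Toy signatures over three mutation categories (each sums to 1), illustrative only.
sig_a = [0.7, 0.2, 0.1]
sig_b = [0.1, 0.3, 0.6]
# Catalog built as 30 mutations from sig_a plus 70 from sig_b.
catalog = [28.0, 27.0, 45.0]

exposures = refit_exposures(catalog, [sig_a, sig_b])
print(to_weights(exposures))  # close to [0.3, 0.7] for this constructed catalog
```

A quadratic programming solver would typically impose the non-negativity and sum-to-one constraints directly; here non-negativity comes from the update rule and the normalization is applied afterwards.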

bioc

singscore:Rank-based single-sample gene set scoring method

A simple single-sample gene signature scoring method that uses rank-based statistics to analyze the sample's gene expression profile. It scores the expression activities of gene sets at a single-sample level.

Maintained by Malvika Kharbanda. Last updated 5 months ago.

software geneexpression genesetenrichment bioinformatics

23.9 match 41 stars 10.03 score 124 scripts 4 dependents

bioc

maftools:Summarize, Analyze and Visualize MAF Files

Analyze and visualize Mutation Annotation Format (MAF) files from large scale sequencing studies. This package provides various functions to perform most commonly used analyses in cancer genomics and to create feature rich customizable visualizations with minimal effort.

Maintained by Anand Mayakonda. Last updated 5 months ago.

datarepresentation dnaseq visualization drivermutation variantannotation featureextraction classification somaticmutation sequencing functionalgenomics survival bioinformatics cancer-genome-atlas cancer-genomics genomics maf-files tcga curl bzip2 xz-utils zlib

14.5 match 459 stars 14.63 score 948 scripts 18 dependents

dami82

mutSignatures:Decipher Mutational Signatures from Somatic Mutational Catalogs

Cancer cells accumulate DNA mutations as a result of DNA damage and DNA repair processes. This computational framework is aimed at deciphering DNA mutational signatures operating in cancer. The framework includes modules that support raw data import and processing, mutational signature extraction, and results interpretation and visualization. The framework accepts widely used file formats storing information about DNA variants, such as Variant Call Format files. The framework performs Non-Negative Matrix Factorization to extract mutational signatures explaining the observed set of DNA mutations. Bootstrapping is performed as part of the analysis. The framework supports parallelization and is optimized for use on multi-core systems. The software was described by Fantini D et al (2020) <doi:10.1038/s41598-020-75062-0> and is based on a custom R-based implementation of the original MATLAB WTSI framework by Alexandrov LB et al (2013) <doi:10.1016/j.celrep.2012.12.008>.

Maintained by Damiano Fantini. Last updated 2 years ago.

31.6 match 14 stars 5.83 score 48 scripts

bioc

HiLDA:Conducting statistical inference on comparing the mutational exposures of mutational signatures by using hierarchical latent Dirichlet allocation

A package built under the Bayesian framework of applying hierarchical latent Dirichlet allocation. It statistically tests whether the mutational exposures of mutational signatures (Shiraishi-model signatures) are different between two groups. The package also provides inference and visualization.

Maintained by Zhi Yang. Last updated 5 months ago.

software somaticmutation sequencing statisticalmethod bayesian mutational-signatures rjags somatic-mutations cpp jags

33.1 match 3 stars 5.56 score 7 scripts 1 dependent

bioc

PharmacoGx:Analysis of Large-Scale Pharmacogenomic Data

Contains a set of functions to perform large-scale analysis of pharmaco-genomic data. These include the PharmacoSet object for storing the results of pharmacogenomic experiments, as well as a number of functions for computing common summaries of drug-dose response and correlating them with the molecular features in a cancer cell-line.

Maintained by Benjamin Haibe-Kains. Last updated 2 months ago.

geneexpression pharmacogenetics pharmacogenomics software classification datasets pharmacogenomic pharmacogx cpp

16.0 match 68 stars 11.39 score 442 scripts 3 dependents

business-science

timetk:A Tool Kit for Working with Time Series

Easy visualization, wrangling, and feature engineering of time series data for forecasting and machine learning prediction. Consolidates and extends time series functionality from packages including 'dplyr', 'stats', 'xts', 'forecast', 'slider', 'padr', 'recipes', and 'rsample'.

Maintained by Matt Dancho. Last updated 1 year ago.

coercion coercion-functions data-mining dplyr forecast forecasting forecasting-models machine-learning series-decomposition series-signature tibble tidy tidyquant tidyverse time time-series timeseries

12.5 match 625 stars 14.15 score 4.0k scripts 16 dependents

bioc

bugsigdbr:R-side access to published microbial signatures from BugSigDB

The bugsigdbr package implements convenient access to bugsigdb.org from within R/Bioconductor. The goal of the package is to facilitate import of BugSigDB data into R/Bioconductor, provide utilities for extracting microbe signatures, and enable export of the extracted signatures to plain text files in standard file formats such as GMT.

Maintained by Ludwig Geistlinger. Last updated 8 days ago.

dataimport genesetenrichment metagenomics microbiome bioconductor-package

26.8 match 3 stars 6.46 score 48 scripts

jeroen

openssl:Toolkit for Encryption, Signatures and Certificates Based on OpenSSL

Bindings to OpenSSL libssl and libcrypto, plus custom SSH key parsers. Supports RSA, DSA and EC curves P-256, P-384, P-521, and curve25519. Cryptographic signatures can either be created and verified manually or via x509 certificates. AES can be used in cbc, ctr or gcm mode for symmetric encryption; RSA for asymmetric (public key) encryption or EC for Diffie Hellman. High-level envelope functions combine RSA and AES for encrypting arbitrary sized data. Other utilities include key generators, hash functions (md5, sha1, sha256, etc), base64 encoder, a secure random number generator, and 'bignum' math methods for manually performing crypto calculations on large multibyte integers.

Maintained by Jeroen Ooms. Last updated 1 month ago.

openssl

8.2 match 65 stars 18.00 score 632 scripts 5.0k dependents

cstewartgh

QFASA:Quantitative Fatty Acid Signature Analysis

Accurate estimates of the diets of predators are required in many areas of ecology, but for many species current methods are imprecise, limited to the last meal, and often biased. The diversity of fatty acids and their patterns in organisms, coupled with the narrow limitations on their biosynthesis, properties of digestion in monogastric animals, and the prevalence of large storage reservoirs of lipid in many predators, led to the development of quantitative fatty acid signature analysis (QFASA) to study predator diets.

Maintained by Connie Stewart. Last updated 7 months ago.

29.9 match 1 star 4.83 score 17 scripts

acare

hacksig:A Tidy Framework to Hack Gene Expression Signatures

A collection of cancer transcriptomics gene signatures as well as a simple and tidy interface to compute single sample enrichment scores either with the original procedure or with three alternatives: the "combined z-score" of Lee et al. (2008) <doi:10.1371/journal.pcbi.1000217>, the "single sample GSEA" of Barbie et al. (2009) <doi:10.1038/nature08460> and the "singscore" of Foroutan et al. (2018) <doi:10.1186/s12859-018-2435-4>. The 'get_sig_info()' function can be used to retrieve information about each signature implemented.

Maintained by Andrea Carenzo. Last updated 2 years ago.

gene-expression-signatures gene-set-enrichment

24.6 match 19 stars 5.71 score 27 scripts

bioc

cola:A Framework for Consensus Partitioning

Subgroup classification is a basic task in genomic data analysis, especially for gene expression and DNA methylation data analysis. It can also be used to test the agreement to known clinical annotations, or to test whether there exist significant batch effects. The cola package provides a general framework for subgroup classification by consensus partitioning. It has the following features: 1. It modularizes the consensus partitioning processes so that various methods can be easily integrated. 2. It provides rich visualizations for interpreting the results. 3. It allows running multiple methods at the same time and provides functionalities to straightforwardly compare results. 4. It provides a new method to extract features which are more efficient to separate subgroups. 5. It automatically generates detailed reports for the complete analysis. 6. It allows applying consensus partitioning in a hierarchical manner.

Maintained by Zuguang Gu. Last updated 1 month ago.

clustering geneexpression classification software consensus-clustering cpp

18.6 match 61 stars 7.49 score 112 scripts

bioc

SomaticSignatures:Somatic Signatures

The SomaticSignatures package identifies mutational signatures of single nucleotide variants (SNVs). It provides an infrastructure related to the methodology described in Nik-Zainal (2012, Cell), with flexibility in the matrix decomposition algorithms.

Maintained by Julian Gehring. Last updated 5 months ago.

sequencing somaticmutation visualization clustering genomicvariation statisticalmethod

19.9 match 22 stars 6.85 score 54 scripts 1 dependent

bromaghin

qfasar:Quantitative Fatty Acid Signature Analysis in R

An implementation of Quantitative Fatty Acid Signature Analysis (QFASA) in R. QFASA is a method of estimating the diet composition of predators. The fundamental unit of information in QFASA is a fatty acid signature (signature), which is a vector of proportions describing the composition of fatty acids within lipids. Signature data from at least one predator and from samples of all potential prey types are required. Calibration coefficients, which adjust for the differential metabolism of individual fatty acids by predators, are also required. Given those data inputs, a predator signature is modeled as a mixture of prey signatures and its diet estimate is obtained as the mixture that minimizes a measure of distance between the observed and modeled signatures. A variety of estimation options and simulation capabilities are implemented. Please refer to the vignette for additional details and references.

Maintained by Jeffrey F. Bromaghin. Last updated 5 years ago.

42.8 match 2.90 score 40 scripts
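
The qfasar entry above describes the estimation idea: a predator signature is modeled as a mixture of prey signatures, and the diet estimate is the mixture that minimizes a distance between observed and modeled signatures, after calibration coefficients adjust for predator metabolism. The Python sketch below illustrates that idea for two prey types with a plain grid search. It is not the package's code: the squared Euclidean distance, the choice to divide the predator signature by the calibration coefficients and renormalize, the function names, and all numbers are assumptions made for the example.

```python
def normalize(values):
    """Rescale non-negative values so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def estimate_diet(predator, prey_a, prey_b, calibration, steps=100):
    """Return the proportion of prey_a in the diet that best explains the predator signature.

    The predator signature is first put on the prey scale (assumed here: divide by the
    calibration coefficients, then renormalize). Candidate proportions 0, 1/steps, ..., 1
    are scored by squared Euclidean distance, a stand-in for the package's distance measures.
    """
    adjusted = normalize([p / c for p, c in zip(predator, calibration)])
    best_p, best_dist = 0.0, float("inf")
    for i in range(steps + 1):
        p = i / steps
        modeled = [p * a + (1 - p) * b for a, b in zip(prey_a, prey_b)]
        dist = sum((x - m) ** 2 for x, m in zip(adjusted, modeled))
        if dist < best_dist:
            best_p, best_dist = p, dist
    return best_p


# Simulated example (not real data): three fatty acids, two prey types.
prey_a = [0.5, 0.3, 0.2]
prey_b = [0.1, 0.3, 0.6]
calibration = [1.0, 2.0, 0.5]
true_mix = [0.25 * a + 0.75 * b for a, b in zip(prey_a, prey_b)]
# Simulate the predator's observed signature by applying the calibration distortion.
predator = normalize([m * c for m, c in zip(true_mix, calibration)])

print(estimate_diet(predator, prey_a, prey_b, calibration))  # recovers the simulated 0.25
```

With more than two prey types a grid search becomes impractical, and a constrained optimizer over the mixture proportions would be the natural replacement.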

bioc

BulkSignalR:Infer Ligand-Receptor Interactions from bulk expression (transcriptomics/proteomics) data, or spatial transcriptomics

Inference of ligand-receptor (LR) interactions from bulk expression (transcriptomics/proteomics) data, or spatial transcriptomics. BulkSignalR bases its inferences on the LRdb database included in our other package, SingleCellSignalR available from Bioconductor. It relies on a statistical model that is specific to bulk data sets. Different visualization and data summary functions are proposed to help navigate prediction results.

Maintained by Jean-Philippe Villemin. Last updated 3 months ago.

network rnaseq software proteomics transcriptomics networkinference spatial

23.7 match 5.22 score 15 scripts

bioc

TCGAbiolinks:TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data

The aims of TCGAbiolinks are to: i) facilitate the GDC open-access data retrieval, ii) prepare the data using the appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses and iv) easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.

Maintained by Tiago Chedraoui Silva. Last updated 26 days ago.

dnamethylation differentialmethylation generegulation geneexpression methylationarray differentialexpression pathways network sequencing survival software bioc bioconductor gdc integrative-analysis tcga tcga-data tcgabiolinks

8.2 match 305 stars 14.45 score 1.6k scripts 6 dependents

bioc

ASSIGN:Adaptive Signature Selection and InteGratioN (ASSIGN)

ASSIGN is a computational tool to evaluate the pathway deregulation/activation status in individual patient samples. ASSIGN employs a flexible Bayesian factor analysis approach that adapts predetermined pathway signatures derived either from knowledge-based literature or from perturbation experiments to the cell-/tissue-specific pathway signatures. The deregulation/activation level of each context-specific pathway is quantified to a score, which represents the extent to which a patient sample encompasses the pathway deregulation/activation signature.

Maintained by Ying Shen. Last updated 5 months ago.

software geneexpression pathways bayesian

14.7 match 2 stars 7.37 score 65 scripts 1 dependent

sdanzige

ADAPTS:Automated Deconvolution Augmentation of Profiles for Tissue Specific Cells

Tools to construct (or add to) cell-type signature matrices using flow sorted or single cell samples and deconvolve bulk gene expression data. Useful for assessing the quality of single cell RNAseq experiments, estimating the accuracy of signature matrices, and determining cell-type spillover. Please cite: Danziger SA et al. (2019) ADAPTS: Automated Deconvolution Augmentation of Profiles for Tissue Specific cells <doi:10.1371/journal.pone.0224693>.

Maintained by Samuel A Danziger. Last updated 3 years ago.

16.3 match 2 stars 6.56 score 40 scripts 1 dependent

rozen-lab

cosmicsig:Mutational Signatures from COSMIC (Catalogue of Somatic Mutations in Cancer)

A data package with 2 main package variables: 'signature' and 'etiology'. The 'signature' variable contains the latest mutational signature profiles released on COSMIC <https://cancer.sanger.ac.uk/signatures/> for 3 mutation types: * Single base substitutions in the context of preceding and following bases, * Doublet base substitutions, and * Small insertions and deletions. The 'etiology' variable provides the known or hypothesized causes of signatures. 'cosmicsig' stands for COSMIC signatures. Please run ?'cosmicsig' for more information.

Maintained by Steven Rozen. Last updated 2 years ago.

33.9 match 1 star 3.04 score 22 scripts

bioc

biosigner:Signature discovery from omics data

Feature selection is critical in omics data analysis to extract restricted and meaningful molecular signatures from complex and high-dimension data, and to build robust classifiers. This package implements a new method to assess the relevance of the variables for the prediction performances of the classifier. The approach can be run in parallel with the PLS-DA, Random Forest, and SVM binary classifiers. The signatures and the corresponding 'restricted' models are returned, enabling future predictions on new datasets. A Galaxy implementation of the package is available within the Workflow4metabolomics.org online infrastructure for computational metabolomics.

Maintained by Etienne A. Thevenot. Last updated 5 months ago.

classification featureextraction transcriptomics proteomics metabolomics lipidomics massspectrometry

24.2 match 4.00 score 10 scripts

bioc

mastR:Markers Automated Screening Tool in R

mastR is an R package designed for automated screening of signatures of interest for specific research questions. The package is developed for generating refined lists of signature genes from multiple group comparisons based on the results from edgeR and limma differential expression (DE) analysis workflow. It also takes into account the background noise of tissue-specificity, which is often ignored by other marker generation tools. This package is particularly useful for the identification of group markers in various biological and medical applications, including cancer research and developmental biology.

Maintained by Jinjin Chen. Last updated 5 months ago.

software geneexpression transcriptomics differentialexpression visualization

18.9 match 4 stars 5.08 score 3 scripts

cogdisreslab

drugfindR:Investigate iLINCS for candidate repurposable drugs

This package provides a convenient way to access the LINCS Signatures available in the iLINCS database. These signatures include Consensus Gene Knockdown Signatures, Gene Overexpression signatures and Chemical Perturbagen Signatures. It also provides a way to enter your own transcriptomic signatures and identify concordant and discordant signatures in the LINCS database.

Maintained by Ali Sajid Imami. Last updated 20 days ago.

lincs ilincs drug repurposing drug discovery transcriptomics gene expression gene knockdown gene overexpression chemical perturbagen drugfindr bioinformatics bioinformatics-pipeline

15.5 match 8 stars 6.16 score 145 scripts

bioc

SigsPack:Mutational Signature Estimation for Single Samples

Single sample estimation of exposure to mutational signatures. Exposures to known mutational signatures are estimated for single samples, based on quadratic programming algorithms. Bootstrapping the input mutational catalogues provides estimations on the stability of these exposures. The effect of the sequence composition of mutational context can be taken into account by normalising the catalogues.

Maintained by Franziska Schumann. Last updated 5 months ago.

somaticmutation snp variantannotation biomedicalinformatics dnaseq

21.3 match 2 stars 4.30 score 4 scripts

carmonalab

scGate:Marker-Based Cell Type Purification for Single-Cell Sequencing Data

A common bioinformatics task in single-cell data analysis is to purify a cell type or cell population of interest from heterogeneous datasets. 'scGate' automatizes marker-based purification of specific cell populations, without requiring training data or reference gene expression profiles. Briefly, 'scGate' takes as input: i) a gene expression matrix stored in a 'Seurat' object and ii) a “gating model” (GM), consisting of a set of marker genes that define the cell population of interest. The GM can be as simple as a single marker gene, or a combination of positive and negative markers. More complex GMs can be constructed in a hierarchical fashion, akin to gating strategies employed in flow cytometry. 'scGate' evaluates the strength of signature marker expression in each cell using the rank-based method 'UCell', and then performs k-nearest neighbor (kNN) smoothing by calculating the mean 'UCell' score across neighboring cells. kNN-smoothing aims at compensating for the large degree of sparsity in scRNA-seq data. Finally, a universal threshold over kNN-smoothed signature scores is applied in binary decision trees generated from the user-provided gating model, to annotate cells as either “pure” or “impure”, with respect to the cell population of interest. See the related publication Andreatta et al. (2022) <doi:10.1093/bioinformatics/btac141>.

Maintained by Massimo Andreatta. Last updated 1 month ago.

filtering marker-genes scgate signatures single-cell

10.8 match 106 stars 8.38 score 163 scripts

bioc

SigCheck:Check a gene signature's prognostic performance against random signatures, known signatures, and permuted data/metadata

While gene signatures are frequently used to predict phenotypes (e.g. predict prognosis of cancer patients), it is not always clear how optimal or meaningful they are (cf David Venet, Jacques E. Dumont, and Vincent Detours' paper "Most Random Gene Expression Signatures Are Significantly Associated with Breast Cancer Outcome"). Based on suggestions in that paper, SigCheck accepts a data set (as an ExpressionSet) and a gene signature, and compares its performance on survival and/or classification tasks against a) random gene signatures of the same length; b) known, related and unrelated gene signatures; and c) permuted data and/or metadata.

Maintained by Rory Stark. Last updated 30 days ago.

geneexpression classification genesetenrichment

29.0 match 3.00 score 1 script

openbiox

UCSCXenaShiny:Interactive Analysis of UCSC Xena Data

Provides functions and a Shiny application for downloading, analyzing and visualizing datasets from UCSC Xena (<http://xena.ucsc.edu/>), which is a collection of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE, and others.

Maintained by Shixiang Wang. Last updated 4 months ago.

cancer-dataset shiny-apps ucsc-xena

10.1 match 96 stars 8.54 score 35 scripts

bioc

selectKSigs:Selecting the number of mutational signatures using a perplexity-based measure and cross-validation

A package to suggest the number of mutational signatures in a collection of somatic mutations by calculating the cross-validated perplexity score.

Maintained by Zhi Yang. Last updated 5 months ago.

software somaticmutation sequencing statisticalmethod clustering mutational-signatures rjags somatic-mutations cpp jags

21.1 match 3 stars 4.08 score 1 script
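
The selectKSigs entry above relies on a perplexity score. As a small illustration of what perplexity measures, the Python sketch below computes the perplexity of held-out mutation counts under a candidate probability distribution over mutation categories, using the usual definition exp(-log-likelihood / number of observations); lower values mean the candidate predicts the held-out data better. This is not the package's procedure: the function name, the four-category layout, and the two candidate distributions are invented for the example, and the cross-validation loop over different numbers of signatures is omitted.

```python
import math


def perplexity(held_out_counts, probabilities):
    """Perplexity of held-out counts under a categorical model: exp(-sum(n_i * log p_i) / N)."""
    total = sum(held_out_counts)
    if total == 0:
        raise ValueError("held-out data must contain at least one observation")
    log_lik = 0.0
    for count, prob in zip(held_out_counts, probabilities):
        if count == 0:
            continue  # categories without held-out observations do not contribute
        if prob <= 0.0:
            return math.inf  # the model rules out something that was actually observed
        log_lik += count * math.log(prob)
    return math.exp(-log_lik / total)


# Illustrative held-out counts over four mutation categories (not real data).
held_out = [8, 1, 1, 0]
candidates = {
    "uniform": [0.25, 0.25, 0.25, 0.25],
    "skewed": [0.7, 0.1, 0.1, 0.1],
}
scores = {name: perplexity(held_out, probs) for name, probs in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

In a model-selection setting, the same comparison would be made between fitted models with different numbers of signatures, averaging the held-out perplexity across cross-validation folds before picking the lowest.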

bioc

signeR:Empirical Bayesian approach to mutational signature discovery

The signeR package provides an empirical Bayesian approach to mutational signature discovery. It is designed to analyze single nucleotide variation (SNV) counts in cancer genomes, but can also be applied to other features. Functionalities to characterize signatures or genome samples according to exposure patterns are also provided.

Maintained by Renan Valieris. Last updated 5 months ago.

genomicvariation somaticmutation statisticalmethod visualization bioconductor bioinformatics openblas cpp

11.0 match 13 stars 7.67 score 22 scripts

rstudio

reticulate:Interface to 'Python'

Interface to 'Python' modules, classes, and functions. When calling into 'Python', R data types are automatically converted to their equivalent 'Python' types. When values are returned from 'Python' to R they are converted back to R types. Compatible with all versions of 'Python' >= 2.7.

Maintained by Tomasz Kalinowski. Last updated 1 day ago.

cpp

3.8 match 1.7k stars 21.07 score 18k scripts 427 dependents

bioc

BatchQC:Batch Effects Quality Control Software

Sequencing and microarray samples often are collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. BatchQC is a software tool that streamlines batch preprocessing and evaluation by providing interactive diagnostics, visualizations, and statistical analyses to explore the extent to which batch variation impacts the data. BatchQC diagnostics help determine whether batch adjustment needs to be done, and how correction should be applied before proceeding with a downstream analysis. Moreover, BatchQC interactively applies multiple common batch effect approaches to the data and the user can quickly see the benefits of each method. BatchQC is developed as a Shiny App. The output is organized into multiple tabs and each tab features an important part of the batch effect analysis and visualization of the data. The BatchQC interface has the following analysis groups: Summary, Differential Expression, Median Correlations, Heatmaps, Circular Dendrogram, PCA Analysis, Shape, ComBat and SVA.

Maintained by Jessica McClintock. Last updated 5 months ago.

batcheffect graphandnetwork microarray normalization principalcomponent sequencing software visualization qualitycontrol rnaseq preprocessing differentialexpression immunooncology

8.3 match 7 stars 8.99 score 54 scripts

opengeos

whitebox:'WhiteboxTools' R Frontend

An R frontend for the 'WhiteboxTools' library, which is an advanced geospatial data analysis platform developed by Prof. John Lindsay at the University of Guelph's Geomorphometry and Hydrogeomatics Research Group. 'WhiteboxTools' can be used to perform common geographical information systems (GIS) analysis operations, such as cost-distance analysis, distance buffering, and raster reclassification. Remote sensing and image processing tasks include image enhancement (e.g. panchromatic sharpening, contrast adjustments), image mosaicing, numerous filtering operations, simple classification (k-means), and common image transformations. 'WhiteboxTools' also contains advanced tooling for spatial hydrological analysis (e.g. flow-accumulation, watershed delineation, stream network analysis, sink removal), terrain analysis (e.g. common terrain indices such as slope, curvatures, wetness index, hillshading; hypsometric analysis; multi-scale topographic position analysis), and LiDAR data processing. Suggested citation: Lindsay (2016) <doi:10.1016/j.cageo.2016.07.003>.

Maintained by Andrew Brown. Last updated 5 months ago.

geomorphometry geoprocessing geospatial gis hydrology remote-sensing rstudio

7.5 match 173 stars 9.65 score 203 scripts 2 dependents

bioc

viper:Virtual Inference of Protein-activity by Enriched Regulon analysis

Inference of protein activity from gene expression data, including the VIPER and msVIPER algorithms

Maintained by Mariano J Alvarez. Last updated 5 months ago.

systemsbiology networkenrichment geneexpression functionalprediction generegulation

10.0 match 7.00 score 342 scripts 5 dependents

bioc

AUCell:AUCell: Analysis of 'gene set' activity in single-cell RNA-seq data (e.g. identify cells with specific gene signatures)

AUCell identifies cells with active gene sets (e.g. signatures, gene modules...) in single-cell RNA-seq data. AUCell uses the "Area Under the Curve" (AUC) to calculate whether a critical subset of the input gene set is enriched within the expressed genes for each cell. The distribution of AUC scores across all the cells allows exploring the relative expression of the signature. Since the scoring method is ranking-based, AUCell is independent of the gene expression units and the normalization procedure. In addition, since the cells are evaluated individually, it can easily be applied to bigger datasets, subsetting the expression matrix if needed.

Maintained by Gert Hulselmans. Last updated 5 months ago.

singlecell genesetenrichment transcriptomics transcription geneexpression workflowstep normalization

8.1 match 8.59 score 860 scripts 4 dependents

bioc

TMSig:Tools for Molecular Signatures

The TMSig package contains tools to prepare, analyze, and visualize named lists of sets, with an emphasis on molecular signatures (such as gene or kinase sets). It includes fast, memory efficient functions to construct sparse incidence and similarity matrices and filter, cluster, invert, and decompose sets. Additionally, bubble heatmaps can be created to visualize the results of any differential or molecular signatures analysis.

Maintained by Tyler Sagendorf. Last updated 5 months ago.

clustering genesetenrichment graphandnetwork pathways visualization gene-sets molecular-signatures

11.6 match 4 stars 5.60 score 4 scripts

brandmaier

pdc:Permutation Distribution Clustering

Permutation Distribution Clustering is a clustering method for time series. Dissimilarity of time series is formalized as the divergence between their permutation distributions. The permutation distribution was proposed as a measure of the complexity of a time series.

Maintained by Andreas M. Brandmaier. Last updated 2 years ago.

11.4 match 6 stars 5.61 score 25 scripts 9 dependents
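
The idea in the pdc description above can be illustrated with a minimal sketch. It assumes an embedding dimension of 3 and uses the squared Hellinger distance as the divergence; both are choices made here for illustration, and the function names are hypothetical rather than the package's API.

```python
from collections import Counter
from itertools import permutations
from math import sqrt


def permutation_distribution(series, m=3):
    """Relative frequency of each ordinal pattern of length m in the series."""
    counts = Counter()
    for i in range(len(series) - m + 1):
        window = series[i:i + m]
        # The ordinal pattern is the order of indices that sorts the window.
        pattern = tuple(sorted(range(m), key=lambda k: window[k]))
        counts[pattern] += 1
    total = sum(counts.values())
    # Report every possible pattern, including those never observed.
    return {p: counts[p] / total for p in permutations(range(m))}


def squared_hellinger(p, q):
    """Squared Hellinger distance (assumed divergence for this sketch)."""
    return 0.5 * sum((sqrt(p[k]) - sqrt(q[k])) ** 2 for k in p)


rising = [1, 2, 3, 4, 5, 6]
falling = [6, 5, 4, 3, 2, 1]
d_same = squared_hellinger(permutation_distribution(rising),
                           permutation_distribution(rising))
d_diff = squared_hellinger(permutation_distribution(rising),
                           permutation_distribution(falling))
```

A monotonically rising and a monotonically falling series share no ordinal patterns, so their divergence is maximal, while a series compared with itself gives zero.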

feiyoung

ProFAST:Probabilistic Factor Analysis for Spatially-Aware Dimension Reduction

Probabilistic factor analysis for spatially-aware dimension reduction across multi-section spatial transcriptomics data with millions of spatial locations. For more details, see Wei Liu, et al. (2023) <doi:10.1101/2023.07.11.548486>.

Maintained by Wei Liu. Last updated 1 month ago.

openblas cpp

10.6 match 2 stars 5.86 score 12 scripts 1 dependents

bioc

granulator:Rapid benchmarking of methods for in silico deconvolution of bulk RNA-seq data

granulator is an R package for the cell type deconvolution of heterogeneous tissues based on bulk RNA-seq data or single cell RNA-seq expression profiles. The package provides a unified testing interface to rapidly run and benchmark multiple state-of-the-art deconvolution methods. Data for the deconvolution of peripheral blood mononuclear cells (PBMCs) into individual immune cell types is provided as well.

Maintained by Sabina Pfister. Last updated 5 months ago.

rnaseq geneexpression differentialexpression transcriptomics singlecell statisticalmethod regression

13.7 match 3 stars 4.48 score 7 scripts

bioc

easier:Estimate Systems Immune Response from RNA-seq data

This package provides a workflow for the use of the EaSIeR tool, developed to assess patients' likelihood to respond to ICB therapies providing just the patients' RNA-seq data as input. We integrate RNA-seq data with different types of prior knowledge to extract quantitative descriptors of the tumor microenvironment from several points of view, including composition of the immune repertoire, and activity of intra- and extra-cellular communications. Then, we use multi-task machine learning trained on TCGA data to identify how these descriptors can simultaneously predict several state-of-the-art hallmarks of anti-cancer immune response. In this way we derive cancer-specific models and identify cancer-specific systems biomarkers of immune response. These biomarkers have been experimentally validated in the literature and the performance of EaSIeR predictions has been validated using independent datasets from four different cancer types with patients treated with anti-PD1 or anti-PDL1 therapy.

Maintained by Oscar Lapuente-Santana. Last updated 5 months ago.

geneexpression software transcription systemsbiology pathways genesetenrichment immunooncology epigenetics classification biomedicalinformatics regression experimenthubsoftware

13.7 match 4.20 score 16 scripts

bioc

MAST:Model-based Analysis of Single Cell Transcriptomics

Methods and models for handling zero-inflated single cell assay data.

Maintained by Andrew McDavid. Last updated 5 months ago.

geneexpression differentialexpression genesetenrichment rnaseq transcriptomics singlecell

4.5 match 230 stars 12.75 score 1.8k scripts 5 dependents

bioc

PDATK:Pancreatic Ductal Adenocarcinoma Tool-Kit

Pancreatic ductal adenocarcinoma (PDA) has a relatively poor prognosis and is one of the most lethal cancers. Molecular classification of gene expression profiles holds the potential to identify meaningful subtypes which can inform therapeutic strategy in the clinical setting. The Pancreatic Cancer Adenocarcinoma Tool-Kit (PDATK) provides an S4 class-based interface for performing unsupervised subtype discovery, cross-cohort meta-clustering, gene-expression-based classification, and subsequent survival analysis to identify prognostically useful subtypes in pancreatic cancer and beyond. Two novel methods, Consensus Subtypes in Pancreatic Cancer (CSPC) and Pancreatic Cancer Overall Survival Predictor (PCOSP) are included for consensus-based meta-clustering and overall-survival prediction, respectively. Additionally, four published subtype classifiers and three published prognostic gene signatures are included to allow users to easily recreate published results, apply existing classifiers to new data, and benchmark the relative performance of new methods. The use of existing Bioconductor classes as input to all PDATK classes and methods enables integration with existing Bioconductor datasets, including the 21 pancreatic cancer patient cohorts available in the MetaGxPancreas data package. PDATK has been used to replicate results from Sandhu et al (2019) [https://doi.org/10.1200/cci.18.00102] and an additional paper is in the works using CSPC to validate subtypes from the included published classifiers, both of which use the data available in MetaGxPancreas. The inclusion of subtype centroids and prognostic gene signatures from these and other publications will enable researchers and clinicians to classify novel patient gene expression data, allowing the direct clinical application of the classifiers included in PDATK. 
Overall, PDATK provides a rich set of tools to identify and validate useful prognostic and molecular subtypes based on gene-expression data, benchmark new classifiers against existing ones, and apply discovered classifiers on novel patient data to inform clinical decision making.

Maintained by Benjamin Haibe-Kains. Last updated 5 months ago.

geneexpression pharmacogenetics pharmacogenomics software classification survival clustering geneprediction

13.2 match 1 stars 4.31 score 17 scripts

jeroen

gpg:GNU Privacy Guard for R

Bindings to GnuPG for working with OpenPGP (RFC4880) cryptographic methods. Includes utilities for public key encryption, creating and verifying digital signatures, and managing your local keyring. Some functionality depends on the version of GnuPG that is installed on the system. On Windows this package can be used together with 'GPG4Win' which provides a GUI for managing keys and entering passphrases.

Maintained by Jeroen Ooms. Last updated 6 months ago.

gpgme

8.8 match 19 stars 6.25 score 21 scripts

robinhankin

clifford:Arbitrary Dimensional Clifford Algebras

A suite of routines for Clifford algebras, using the 'Map' class of the Standard Template Library. Canonical reference: Hestenes (1987, ISBN 90-277-1673-0, "Clifford algebra to geometric calculus"). Special cases including Lorentz transforms, quaternion multiplication, and Grassmann algebra, are discussed. Vignettes presenting conformal geometric algebra, quaternions and split quaternions, dual numbers, and Lorentz transforms are included. The package follows 'disordR' discipline.

Maintained by Robin K. S. Hankin. Last updated 1 month ago.

cpp

8.1 match 5 stars 6.71 score 4 scripts

bioc

signatureSearch:Environment for Gene Expression Searching Combined with Functional Enrichment Analysis

This package implements algorithms and data structures for performing gene expression signature (GES) searches, and subsequently interpreting the results functionally with specialized enrichment methods.

Maintained by Brendan Gongol. Last updated 5 months ago.

software geneexpression go kegg networkenrichment sequencing coverage differentialexpression cpp

7.5 match 17 stars 7.18 score 74 scripts 1 dependents

wolski

sigora:Signature Overrepresentation Analysis

Pathway analysis statistically links observations on the molecular level to biological processes or pathways on the systems (i.e., organism, organ, tissue, cell) level. Traditionally, pathway analysis methods regard pathways as collections of single genes and treat all genes in a pathway as equally informative. However, this can lead to identifying spurious pathways as statistically significant since components are often shared amongst pathways. SIGORA seeks to avoid this pitfall by focusing on genes or gene pairs that are (as a combination) specific to a single pathway. In relying on such pathway gene-pair signatures (Pathway-GPS), SIGORA inherently uses the status of other genes in the experimental context to identify the most relevant pathways. The current version allows for pathway analysis of human and mouse datasets. In addition, it contains pre-computed Pathway-GPS data for pathways in the KEGG and Reactome pathway repositories and mechanisms for extracting GPS for user-supplied repositories.

Maintained by Witold Wolski. Last updated 3 years ago.

genesetenrichment go software pathways kegg

11.5 match 4.43 score 18 scripts 1 dependents

steverozen

ICAMS:In-Depth Characterization and Analysis of Mutational Signatures ('ICAMS')

Analysis and visualization of experimentally elucidated mutational signatures -- the kind of analysis and visualization in Boot et al., "In-depth characterization of the cisplatin mutational signature in human cell lines and in esophageal and liver tumors", Genome Research 2018, <doi:10.1101/gr.230219.117> and "Characterization of colibactin-associated mutational signature in an Asian oral squamous cell carcinoma and in other mucosal tumor types", Genome Research 2020 <doi:10.1101/gr.255620.119>. 'ICAMS' stands for In-depth Characterization and Analysis of Mutational Signatures. 'ICAMS' has functions to read in variant call files (VCFs) and to collate the corresponding catalogs of mutational spectra and to analyze and plot catalogs of mutational spectra and signatures. Handles both "counts-based" and "density-based" (i.e. representation as mutations per megabase) mutational spectra or signatures.

Maintained by Steve Rozen. Last updated 3 years ago.

9.3 match 8 stars 5.41 score 128 scripts

bioc

NanoStringNCTools:NanoString nCounter Tools

Tools for NanoString Technologies nCounter Technology. Provides support for reading RCC files into an ExpressionSet derived object. Also includes methods for QC and normalization of NanoString data.

Maintained by Maddy Griswold. Last updated 5 months ago.

geneexpression transcription cellbasedassays dataimport transcriptomics proteomics mrnamicroarray proprietaryplatforms rnaseq

7.9 match 6.35 score 94 scripts 4 dependents

bioc

SUITOR:Selecting the number of mutational signatures through cross-validation

An unsupervised cross-validation method to select the optimal number of mutational signatures. A data set of mutational counts is split into training and validation data. Signatures are estimated in the training data and then used to predict the mutations in the validation data.

Maintained by Bill Wheeler. Last updated 5 months ago.

genetics software somaticmutation

12.5 match 4.00 score 2 scripts

andrewdhawan

sigQC:Quality Control Metrics for Gene Signatures

Provides gene signature quality control metrics in publication ready plots. Namely, enables the visualization of properties such as expression, variability, correlation, and comparison of methods of standardisation and scoring metrics.

Maintained by Andrew Dhawan. Last updated 7 months ago.

10.1 match 4 stars 4.89 score 13 scripts

hojsgaard

doBy:Groupwise Statistics, LSmeans, Linear Estimates, Utilities

Utility package containing: 1) Facilities for working with grouped data: 'do' something to data stratified 'by' some variables. 2) LSmeans (least-squares means), general linear estimates. 3) Restrict functions to a smaller domain. 4) Miscellaneous other utilities.

Maintained by Søren Højsgaard. Last updated 4 days ago.

3.3 match 1 stars 14.94 score 3.2k scripts 939 dependents

dsokolo

scMappR:Single Cell Mapper

The single cell mapper (scMappR) R package contains a suite of bioinformatic tools that provide experimentally relevant cell-type specific information to a list of differentially expressed genes (DEG). The function "scMappR_and_pathway_analysis" reranks DEGs to generate cell-type specificity scores called cell-weighted fold-changes. Users input a list of DEGs, normalized counts, and a signature matrix into this function. scMappR then re-weights bulk DEGs by cell-type specific expression from the signature matrix, cell-type proportions from RNA-seq deconvolution and the ratio of cell-type proportions between the two conditions to account for changes in cell-type proportion. With cwFold-changes calculated, scMappR uses two approaches to utilize cwFold-changes to complete cell-type specific pathway analysis. The "process_dgTMatrix_lists" function in the scMappR package contains an automated scRNA-seq processing pipeline where users input scRNA-seq count data, which is made compatible for scMappR and other R packages that analyze scRNA-seq data. We further used this to store hundreds of regularly updated signature matrices. The functions "tissue_by_celltype_enrichment", "tissue_scMappR_internal", and "tissue_scMappR_custom" combine these consistently processed scRNAseq count data with gene-set enrichment tools to allow for cell-type marker enrichment of a generic gene list (e.g. GWAS hits). Reference: Sokolowski,D.J., Faykoo-Martinez,M., Erdman,L., Hou,H., Chan,C., Zhu,H., Holmes,M.M., Goldenberg,A. and Wilson,M.D. (2021) Single-cell mapper (scMappR): using scRNA-seq to infer cell-type specificities of differentially expressed genes. NAR Genomics and Bioinformatics. 3(1). lqab011. <doi:10.1093/nargab/lqab011>.

Maintained by Dustin Sokolowski. Last updated 2 years ago.

14.8 match 4 stars 3.30 score 9 scripts

bioc

supersigs:Supervised mutational signatures

Generate SuperSigs (supervised mutational signatures) from single nucleotide variants in the cancer genome. Functions included in the package allow the user to learn supervised mutational signatures from their data and apply them to new data. The methodology is based on the one described in Afsari (2021, eLife).

Maintained by Albert Kuo. Last updated 5 months ago.

featureextraction classification regression sequencing wholegenome somaticmutation

10.2 match 3 stars 4.78 score 3 scripts

geobosh

Rdpack:Update and Manipulate Rd Documentation Objects

Functions for manipulation of R documentation objects, including functions reprompt() and ereprompt() for updating 'Rd' documentation for functions, methods and classes; 'Rd' macros for citations and import of references from 'bibtex' files for use in 'Rd' files and 'roxygen2' comments; 'Rd' macros for evaluating and inserting snippets of 'R' code and the results of its evaluation or creating graphics on the fly; and many functions for manipulation of references and Rd files.

Maintained by Georgi N. Boshnakov. Last updated 4 months ago.

bibtex bibtex-references citations documentation rd-format roxygen2

3.5 match 30 stars 13.58 score 73 scripts 2.3k dependents

bioc

AnVILBase:Generic functions for interacting with the AnVIL ecosystem

Provides generic functions for interacting with the AnVIL ecosystem. Packages that use either GCP or Azure in AnVIL are built on top of AnVILBase. Extension packages will provide methods for interacting with other cloud providers.

Maintained by Marcel Ramos. Last updated 5 months ago.

software infrastructure

7.2 match 6.68 score 70 scripts 19 dependents

r-forge

ROI:R Optimization Infrastructure

The R Optimization Infrastructure ('ROI') <doi:10.18637/jss.v094.i15> is a sophisticated framework for handling optimization problems in R. Additional information can be found on the 'ROI' homepage <http://roi.r-forge.r-project.org/>.

Maintained by Stefan Theussl. Last updated 2 years ago.

6.3 match 7.68 score 506 scripts 47 dependents

bafuentes

rassta:Raster-Based Spatial Stratification Algorithms

Algorithms for the spatial stratification of landscapes, sampling and modeling of spatially-varying phenomena. These algorithms offer a simple framework for the stratification of geographic space based on raster layers representing landscape factors and/or factor scales. The stratification process follows a hierarchical approach, which is based on first-level units (i.e., classification units) and second-level units (i.e., stratification units). Nonparametric techniques make it possible to measure the correspondence between the geographic space and the landscape configuration represented by the units. These correspondence metrics are useful to define sampling schemes and to model the spatial variability of environmental phenomena. The theoretical background of the algorithms and code examples are presented in Fuentes, Dorantes, and Tipton (2021). <doi:10.31223/X50S57>.

Maintained by Bryan A. Fuentes. Last updated 3 years ago.

ecology geoinformatics hierarchical modeling sampling spatial

7.9 match 16 stars 5.96 score 19 scripts

ropensci

sodium:A Modern and Easy-to-Use Crypto Library

Bindings to 'libsodium' <https://doc.libsodium.org/>: a modern, easy-to-use software library for encryption, decryption, signatures, password hashing and more. Sodium uses curve25519, a state-of-the-art Diffie-Hellman function by Daniel Bernstein, which has become very popular after it was discovered that the NSA had backdoored Dual EC DRBG.

Maintained by Jeroen Ooms. Last updated 3 months ago.

libsodium

3.8 match 70 stars 12.43 score 175 scripts 103 dependents

bioc

DeconRNASeq:Deconvolution of Heterogeneous Tissue Samples for mRNA-Seq data

DeconSeq is an R package for deconvolution of heterogeneous tissues based on mRNA-Seq data. It models expression levels from heterogeneous cell populations in mRNA-Seq as the weighted average of expression from different constituting cell types and predicts cell type proportions of single expression profiles.

Maintained by Ting Gong. Last updated 5 months ago.

differentialexpression

8.7 match 5.16 score 72 scripts
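
The weighted-average model in the description above can be sketched as follows. This is a simplified illustration limited to two cell types, where the least-squares proportion has a closed form. It is not the package's own solver, and all names and numbers are hypothetical.

```python
def mix(signatures, proportions):
    """Bulk expression as the proportion-weighted average of cell-type signatures."""
    n_genes = len(signatures[0])
    return [
        sum(p * sig[g] for p, sig in zip(proportions, signatures))
        for g in range(n_genes)
    ]


def estimate_two_type_proportion(mixture, sig_a, sig_b):
    """Least-squares proportion of type A when mixture = p*A + (1-p)*B.

    Assumes exactly two cell types whose proportions sum to one.
    """
    diff = [a - b for a, b in zip(sig_a, sig_b)]
    numerator = sum((m - b) * d for m, b, d in zip(mixture, sig_b, diff))
    denominator = sum(d * d for d in diff)
    p = numerator / denominator
    # Clip to the valid range of a proportion.
    return min(1.0, max(0.0, p))


# Toy signatures over three genes (made-up values).
sig_a = [10.0, 0.0, 5.0]
sig_b = [0.0, 8.0, 5.0]
bulk = mix([sig_a, sig_b], [0.25, 0.75])
p_a = estimate_two_type_proportion(bulk, sig_a, sig_b)
```

With more than two cell types the same objective becomes a constrained least-squares problem, which needs a proper solver rather than this closed form.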

rozen-lab

mSigTools:Mutational Signature Analysis Tools

Utility functions for mutational signature analysis as described in Alexandrov, L. B. (2020) <doi:10.1038/s41586-020-1943-3>. This package provides two groups of functions. One is for dealing with mutational signature "exposures" (i.e. the counts of mutations in a sample that are due to each mutational signature). The other group of functions is for matching or comparing sets of mutational signatures. 'mSigTools' stands for mutational Signature analysis Tools.

Maintained by Steven Rozen. Last updated 2 years ago.

14.8 match 2 stars 3.00 score 9 scripts

bioc

GeomxTools:NanoString GeoMx Tools

Tools for NanoString Technologies GeoMx Technology. Package provides functions for reading in DCC and PKC files based on an ExpressionSet derived object. Normalization and QC functions are also included.

Maintained by Maddy Griswold. Last updated 5 months ago.

geneexpression transcription cellbasedassays dataimport transcriptomics proteomics mrnamicroarray proprietaryplatforms rnaseq sequencing experimentaldesign normalization spatial

6.0 match 7.11 score 239 scripts 3 dependents

andymckenzie

BRETIGEA:Brain Cell Type Specific Gene Expression Analysis

Analysis of relative cell type proportions in bulk gene expression data. Provides a well-validated set of brain cell type-specific marker genes derived from multiple types of experiments, as described in McKenzie (2018) <doi:10.1038/s41598-018-27293-5>. For brain tissue data sets, there are marker genes available for astrocytes, endothelial cells, microglia, neurons, oligodendrocytes, and oligodendrocyte precursor cells, derived from each of human, mouse, and combined human/mouse data sets. However, if you have access to your own marker genes, the functions can be applied to bulk gene expression data from any tissue. Also implements multiple options for relative cell type proportion estimation using these marker genes, adapting and expanding on approaches from the 'CellCODE' R package described in Chikina (2015) <doi:10.1093/bioinformatics/btv015>. The number of cell type marker genes used in a given analysis can be increased or decreased based on your preferences and the data set. Finally, provides functions to use the estimates to adjust for variability in the relative proportion of cell types across samples prior to downstream analyses.

Maintained by Andrew McKenzie. Last updated 1 year ago.

cell-type gene-expression gene-expression-signatures

6.7 match 15 stars 6.20 score 30 scripts

bioc

hermes:Preprocessing, analyzing, and reporting of RNA-seq data

Provides classes and functions for quality control, filtering, normalization and differential expression analysis of pre-processed `RNA-seq` data. Data can be imported from `SummarizedExperiment` as well as `matrix` objects and can be annotated from `BioMart`. Genes can be filtered by minimum expression or by the presence of required annotations, and samples by sufficient correlation to other samples or by total number of reads. The standard normalization methods including cpm, rpkm and tpm can be used, and `DESeq2` as well as voom differential expression analyses are available.

Maintained by Daniel Sabanés Bové. Last updated 5 months ago.

rnaseq differentialexpression normalization preprocessing qualitycontrol rna-seq statistical-engineering

5.3 match 11 stars 7.77 score 48 scripts 1 dependents

bioc

xcore:xcore expression regulators inference

xcore is an R package for transcription factor activity modeling based on known molecular signatures and the user's gene expression data. The accompanying xcoredata package provides a collection of molecular signatures, constructed from publicly available ChIP-seq experiments. xcore uses ridge regression to model changes in expression as a linear combination of molecular signatures and find their unknown activities. The obtained estimates can be further tested for significance to select molecular signatures with the highest predicted effect on the observed expression changes.

Maintained by Maciej Migdał. Last updated 5 months ago.

geneexpression generegulation epigenetics regression sequencing

10.0 match 4.00 score 8 scripts

bioc

motifStack:Plot stacked logos for single or multiple DNA, RNA and amino acid sequence

The motifStack package is designed for graphic representation of multiple motifs with different similarity scores. It works with both DNA/RNA sequence motif and amino acid sequence motif. In addition, it provides the flexibility for users to customize the graphic parameters such as the font type and symbol colors.

Maintained by Jianhong Ou. Last updated 2 months ago.

sequencematching visualization sequencing microarray alignment chipchip chipseq motifannotation dataimport

5.0 match 7.93 score 188 scripts 6 dependents

bioc

CelliD:Unbiased Extraction of Single Cell gene signatures using Multiple Correspondence Analysis

CelliD is a clustering-free multivariate statistical method for the robust extraction of per-cell gene signatures from single-cell RNA-seq. CelliD allows unbiased cell identity recognition across different donors, tissues-of-origin, model organisms and single-cell omics protocols. The package can also be used to explore functional pathways enrichment in single cell data.

Maintained by Akira Cortal. Last updated 5 months ago.

rnaseq singlecell dimensionreduction clustering genesetenrichment geneexpression atacseq openblas cpp openmp

7.7 match 4.85 score 70 scripts

bioc

GeneExpressionSignature:Gene Expression Signature based Similarity Metric

This package implements the gene expression signature and a distance between signatures. A gene expression signature is represented as a list of genes whose expression is correlated with a biological state of interest. The distance is defined using a nonparametric, rank-based pattern-matching strategy based on the Kolmogorov-Smirnov statistic. Gene expression signatures and their distances can be used to detect similarities among the signatures of drugs, diseases, and biological states of interest.

Maintained by Yang Cao. Last updated 5 months ago.

geneexpression

7.2 match 1 stars 5.00 score 5 scripts

igordot

clustermole:Unbiased Single-Cell Transcriptomic Data Cell Type Identification

Assignment of cell type labels to single-cell RNA sequencing (scRNA-seq) clusters is often a time-consuming process that involves manual inspection of the cluster marker genes complemented with a detailed literature search. This is especially challenging when unexpected or poorly described populations are present. The clustermole R package provides methods to query thousands of human and mouse cell identity markers sourced from a variety of databases.

Maintained by Igor Dolgalev. Last updated 1 year ago.

cell-type cell-type-annotation cell-type-classification cell-type-identification cell-type-matching gene-expression-signatures scrna-seq single-cell

6.7 match 13 stars 5.37 score 36 scripts

bioc

BioQC:Detect tissue heterogeneity in expression profiles with gene sets

BioQC performs quality control of high-throughput expression data based on tissue gene signatures. It can detect tissue heterogeneity in gene expression data. The core algorithm is a Wilcoxon-Mann-Whitney test that is optimised for high performance.

Maintained by Jitao David Zhang. Last updated 5 months ago.

geneexpression qualitycontrol statisticalmethod genesetenrichment cpp

4.3 match 5 stars 8.16 score 86 scripts

bioc

GSgalgoR:An Evolutionary Framework for the Identification and Study of Prognostic Gene Expression Signatures in Cancer

A multi-objective optimization algorithm for disease sub-type discovery based on a non-dominated sorting genetic algorithm. The 'Galgo' framework combines the advantages of clustering algorithms for grouping heterogeneous 'omics' data and the searching properties of genetic algorithms for feature selection. The algorithm searches for the optimal number of clusters, considering the features that maximize the survival difference between sub-types while keeping cluster consistency high.

Maintained by Carlos Catania. Last updated 5 months ago.

geneexpression transcription clustering classification survival

6.3 match 15 stars 5.48 score 6 scripts

bioc

SparseSignatures:SparseSignatures

Point mutations occurring in a genome can be divided into 96 categories based on the base being mutated, the base it is mutated into and its two flanking bases. Therefore, for any patient, it is possible to represent all the point mutations occurring in that patient's tumor as a vector of length 96, where each element represents the count of mutations for a given category in the patient. A mutational signature represents the pattern of mutations produced by a mutagen or mutagenic process inside the cell. Each signature can also be represented by a vector of length 96, where each element represents the probability that this particular mutagenic process generates a mutation in the corresponding one of the 96 above-mentioned categories. In this R package, we provide a set of functions to extract and visualize the mutational signatures that best explain the mutation counts of a large number of patients.

Maintained by Luca De Sano. Last updated 5 months ago.

biomedicalinformatics somaticmutation

5.4 match 11 stars 6.42 score 4 scripts
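
The 96-category representation described above can be sketched as follows. The sketch assumes mutations are already expressed relative to the pyrimidine (C or T) reference base, which gives 6 substitution types times 4 x 4 flanking-base combinations. The label format and function names are hypothetical, not the package's API.

```python
from itertools import product

BASES = "ACGT"
# Six substitution types, assuming the reference base is given as a pyrimidine.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]

# Labels such as "A[C>T]G": 5' flanking base, substitution, 3' flanking base.
CATEGORIES = [
    f"{five}[{sub}]{three}"
    for sub in SUBSTITUTIONS
    for five, three in product(BASES, repeat=2)
]


def count_vector(mutations):
    """Count mutations per category; each mutation is (5' base, ref, alt, 3' base)."""
    index = {label: i for i, label in enumerate(CATEGORIES)}
    counts = [0] * len(CATEGORIES)
    for five, ref, alt, three in mutations:
        counts[index[f"{five}[{ref}>{alt}]{three}"]] += 1
    return counts


# Toy patient with three point mutations (made-up data).
toy_patient = [("A", "C", "T", "G"), ("A", "C", "T", "G"), ("T", "T", "A", "C")]
vec = count_vector(toy_patient)
```

Dividing such a count vector by its sum gives a probability vector of the same length, which is the form the description uses for a signature.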

richierocks

sig:Print Function Signatures

Print function signatures and find overly complicated code.

Maintained by Richard Cotton. Last updated 9 years ago.

7.8 match 4.45 score 19 scripts 1 dependents

nowosad

motif:Local Pattern Analysis

Describes spatial patterns of categorical raster data for any defined regular and irregular areas. Patterns are described quantitatively using built-in signatures based on co-occurrence matrices, and user-defined functions are also supported. It enables spatial analysis such as search, change detection, and clustering to be performed on spatial patterns (Nowosad (2021) <doi:10.1007/s10980-020-01135-0>).

Maintained by Jakub Nowosad. Last updated 7 months ago.

categorical-raster global-ecology landscape-ecology spatial cpp

4.5 match 63 stars 7.48 score 48 scripts

satijalab

Seurat:Tools for Single Cell Genomics

A toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. 'Seurat' aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031>, and Hao, Hao, et al (2020) <doi:10.1101/2020.10.12.335331> for more details.

Maintained by Paul Hoffman. Last updated 1 year ago.

human-cell-atlas single-cell-genomics single-cell-rna-seq cpp

2.0 match 2.4k stars 16.86 score 50k scripts 73 dependents

ropensci

gert:Simple Git Client for R

Simple git client for R based on 'libgit2' <https://libgit2.org> with support for SSH and HTTPS remotes. All functions in 'gert' use basic R data types (such as vectors and data-frames) for their arguments and return values. User credentials are shared with command line 'git' through the git-credential store and ssh keys stored on disk or ssh-agent.

Maintained by Jeroen Ooms. Last updated 4 months ago.

libgit2

2.3 match 154 stars 14.82 score 158 scripts 369 dependents

bioc

CoreGx:Classes and Functions to Serve as the Basis for Other 'Gx' Packages

A collection of functions and classes which serve as the foundation for our lab's suite of R packages, such as 'PharmacoGx' and 'RadioGx'. This package was created to abstract shared functionality from other lab package releases to increase ease of maintainability and reduce code repetition in current and future 'Gx' suite programs. Major features include a 'CoreSet' class, from which 'RadioSet' and 'PharmacoSet' are derived, along with get and set methods for each respective slot. Additional functions related to fitting and plotting dose response curves, quantifying statistical correlation and calculating area under the curve (AUC) or survival fraction (SF) are included. For more details please see the included documentation, as well as: Smirnov, P., Safikhani, Z., El-Hachem, N., Wang, D., She, A., Olsen, C., Freeman, M., Selby, H., Gendoo, D., Grossman, P., Beck, A., Aerts, H., Lupien, M., Goldenberg, A. (2015) <doi:10.1093/bioinformatics/btv723>. Manem, V., Labie, M., Smirnov, P., Kofia, V., Freeman, M., Koritzinksy, M., Abazeed, M., Haibe-Kains, B., Bratman, S. (2018) <doi:10.1101/449793>.

Maintained by Benjamin Haibe-Kains. Last updated 5 months ago.

software pharmacogenomics classification survival

5.0 match 6.53 score 63 scripts 6 dependents

josetamezpena

FRESA.CAD:Feature Selection Algorithms for Computer Aided Diagnosis

Contains a set of utilities for building and testing statistical models (linear, logistic, ordinal or Cox) for Computer Aided Diagnosis/Prognosis applications. Utilities include data adjustment, univariate analysis, model building, model validation, longitudinal analysis, reporting and visualization.

Maintained by Jose Gerardo Tamez-Pena. Last updated 1 month ago.

openblas cpp openmp

5.9 match 7 stars 5.59 score 31 scripts

bioc

hypeR:An R Package For Geneset Enrichment Workflows

An R Package for Geneset Enrichment Workflows.

Maintained by Anthony Federico. Last updated 5 months ago.

genesetenrichment annotation pathways bioinformatics computational-biology geneset-enrichment-analysis

3.9 match 76 stars 8.22 score 145 scripts

olajumokeevangelina

MetabolicSurv:A Biomarker Validation Approach for Classification and Predicting Survival Using Metabolomics Signature

An approach that identifies a metabolic biomarker signature in metabolomics data by discovering predictive metabolites for predicting survival and classifying patients into risk groups. Classifiers are constructed as a linear combination of predictive/important metabolites, prognostic factors and treatment effects if necessary. Several methods are implemented to reduce the metabolomics matrix, such as the principal component analysis of Wold Svante et al. (1987) <doi:10.1016/0169-7439(87)80084-9>, the LASSO method by Robert Tibshirani (1997) <doi:10.1002/(SICI)1097-0258(19970228)16:4%3C385::AID-SIM380%3E3.0.CO;2-3>, and the elastic net approach by Hui Zou and Trevor Hastie (2005) <doi:10.1111/j.1467-9868.2005.00503.x>. Sensitivity analysis on the quantile used for the classification can also be performed to check how the classification groups deviate depending on the quantile specified. Large-scale cross-validation can be performed in order to investigate the most frequently selected predictive metabolites and for internal validation. During the evaluation process, validation is assessed using the distribution of hazard ratios (HR) in the test set, and inference is mainly based on resampling and permutation techniques.

Maintained by Olajumoke Evangelina Owokotomo. Last updated 6 years ago.

7.7 match 4.13 score 27 scripts

ropensci

git2r:Provides Access to Git Repositories

Interface to the 'libgit2' library, which is a pure C implementation of the 'Git' core methods. Provides access to 'Git' repositories to extract data and run some basic 'Git' commands.

Maintained by Stefan Widgren. Last updated 11 days ago.

git git-client libgit2 libgit2-library

2.3 match 218 stars 13.86 score 836 scripts 49 dependents

bioc

deconvR:Simulation and Deconvolution of Omic Profiles

This package provides a collection of functions for deconvolving bulk sample(s) using an atlas of reference omic signature profiles and a user-selected model. Users are given the option to create or extend a reference atlas and also to simulate a bulk signature profile of the desired size for the reference cell types. The package includes a cell-type-specific methylation atlas and Illumina Epic B5 probe IDs that can be used in deconvolution. Additionally, we included BSmeth2Probe to make mapping WGBS data to their probe IDs easier.

Maintained by Irem B. Gündüz. Last updated 5 months ago.

dnamethylation regression geneexpression rnaseq singlecell statisticalmethod transcriptomics bioconductor-package deconvolution dna-methylation omics

5.4 match 10 stars 5.78 score 15 scripts

hojsgaard

gRbase:A Package for Graphical Modelling in R

The 'gRbase' package provides graphical modelling features used by e.g. the packages 'gRain', 'gRim' and 'gRc'. 'gRbase' implements graph algorithms including (i) maximum cardinality search (for marked and unmarked graphs), (ii) moralization, (iii) triangulation, and (iv) creation of a junction tree. 'gRbase' facilitates array operations and implements functions for testing for conditional independence. 'gRbase' illustrates how hierarchical log-linear models may be implemented and describes the concept of graphical meta data. The facilities of the package are documented in the book by Højsgaard, Edwards and Lauritzen (2012, <doi:10.1007/978-1-4614-2299-0>) and in the paper by Dethlefsen and Højsgaard (2005, <doi:10.18637/jss.v014.i17>). Please see 'citation("gRbase")' for citation details.

Maintained by Søren Højsgaard. Last updated 4 months ago.

openblas cpp

3.3 match 3 stars 9.24 score 241 scripts 20 dependents

bioc

phenoTest:Tools to test association between gene expression and phenotype in a way that is efficient, structured, fast and scalable. We also provide tools to do GSEA (Gene set enrichment analysis) and copy number variation.

Tools to test correlation between gene expression and phenotype in a way that is efficient, structured, fast and scalable. GSEA is also provided.

Maintained by Evarist Planet. Last updated 5 months ago.

microarray differentialexpression multiplecomparison clustering classification

6.6 match 4.56 score 9 scripts 1 dependent

xiaozhangryy

CAESAR.Suite:CAESAR: a Cross-Technology and Cross-Resolution Framework for Spatial Omics Annotation

Biotechnology in spatial omics has advanced rapidly over the past few years, enhancing both throughput and resolution. However, existing annotation pipelines in spatial omics predominantly rely on clustering methods, lacking the flexibility to integrate extensive annotated information from single-cell RNA sequencing (scRNA-seq) due to discrepancies in spatial resolutions, species, or modalities. Here we introduce the CAESAR suite, an open-source software package that provides image-based spatial co-embedding of locations and genomic features. It uniquely transfers labels from scRNA-seq reference, enabling the annotation of spatial omics datasets across different technologies, resolutions, species, and modalities, based on the conserved relationship between signature genes and cells/locations at an appropriate level of granularity. Notably, CAESAR enriches location-level pathways, allowing for the detection of gradual biological pathway activation within spatially defined domain types.

Maintained by Xiao Zhang. Last updated 6 months ago.

openblas cpp openmp

7.5 match 1 star 3.95 score 2 scripts

bioc

DEGreport:Report of DEG analysis

Creation of ready-to-share figures of differential expression analyses of count data. It integrates some of the code mentioned in the DESeq2 and edgeR vignettes, and reports a ranked list of genes according to the mean and variability of the fold changes for each selected gene.

Maintained by Lorena Pantano. Last updated 5 months ago.

differentialexpression visualization rnaseq reportwriting geneexpression immunooncology bioconductor differential-expression qc report rna-seq smallrna

3.1 match 24 stars 9.42 score 354 scripts 1 dependent

marioangst

motifr:Motif Analysis in Multi-Level Networks

Tools for motif analysis in multi-level networks. Multi-level networks combine multiple networks in one, e.g. social-ecological networks. Motifs are small configurations of nodes and edges (subgraphs) occurring in networks. 'motifr' can visualize multi-level networks, count multi-level network motifs and compare motif occurrences to baseline models. It also identifies contributions of existing or potential edges to motifs to find critical or missing edges. The package is in many parts an R wrapper for the excellent 'SESMotifAnalyser' 'Python' package written by Tim Seppelt.

Maintained by Mario Angst. Last updated 4 years ago.

5.5 match 15 stars 5.43 score 18 scripts

bioc

progeny:Pathway RespOnsive GENes for activity inference from gene expression

PROGENy is a resource that leverages a large compendium of publicly available signaling perturbation experiments to yield a common core of pathway-responsive genes for human and mouse. These, coupled with any statistical method, can be used to infer pathway activities from bulk or single-cell transcriptomics.

Maintained by Aurélien Dugourd. Last updated 5 months ago.

systemsbiology geneexpression functionalprediction generegulation

3.3 match 99 stars 8.90 score 221 scripts 1 dependent

bioc

rScudo:Signature-based Clustering for Diagnostic Purposes

SCUDO (Signature-based Clustering for Diagnostic Purposes) is a rank-based method for the analysis of gene expression profiles for diagnostic and classification purposes. It is based on the identification of sample-specific gene signatures composed of the most up- and down-regulated genes for that sample. Starting from gene expression data, functions in this package identify sample-specific gene signatures and use them to build a graph of samples. In this graph samples are joined by edges if they have a similar expression profile, according to a pre-computed similarity matrix. The similarity between the expression profiles of two samples is computed using a method similar to GSEA. The graph of samples can then be used to perform community clustering or to perform supervised classification of samples in a testing set.
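The first step described above, building a sample-specific signature from the most up- and down-regulated genes, can be illustrated with a small Python sketch. This is not the rScudo API: `sample_signature`, its parameters and the toy expression values are hypothetical, and the GSEA-like similarity and graph construction are not reproduced here.

```python
def sample_signature(expression, n_top, n_bottom):
    """Return the most up- and down-regulated genes of one sample.

    expression: dict mapping gene name -> expression value for the sample.
    The genes are ranked within the sample, so only their order matters.
    """
    ranked = sorted(expression, key=expression.get, reverse=True)
    return {
        "up": ranked[:n_top],
        # Slice from an explicit index so that n_bottom == 0 gives an empty list.
        "down": ranked[len(ranked) - n_bottom:],
    }


# Hypothetical expression profile of a single sample.
profile = {"g1": 5.0, "g2": 1.0, "g3": 9.0, "g4": 0.5, "g5": 3.0}
signature = sample_signature(profile, n_top=2, n_bottom=2)
```

Because the signature depends only on within-sample ranks, it is unaffected by any monotone rescaling of a sample's expression values.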

Maintained by Matteo Ciciani. Last updated 5 months ago.

geneexpression differentialexpression biomedicalinformatics classification clustering graphandnetwork network proteomics transcriptomics systemsbiology featureextraction

5.7 match 4 stars 5.19 score 13 scripts

leonawicz

tabr:Music Notation Syntax, Manipulation, Analysis and Transcription in R

Provides a music notation syntax and a collection of music programming functions for generating, manipulating, organizing, and analyzing musical information in R. Music syntax can be entered directly in character strings, for example to quickly transcribe short pieces of music. The package contains functions for directly performing various mathematical, logical and organizational operations and musical transformations on special object classes that facilitate working with music data and notation. The same music data can be organized in tidy data frames for a familiar and powerful approach to the analysis of large amounts of structured music data. Functions are available for mapping seamlessly between these formats and their representations of musical information. The package also provides an API to 'LilyPond' (<https://lilypond.org/>) for transcribing musical representations in R into tablature ("tabs") and sheet music. 'LilyPond' is open source music engraving software for generating high quality sheet music based on markup syntax. The package generates 'LilyPond' files from R code and can pass them to the 'LilyPond' command line interface to be rendered into sheet music PDF files or inserted into R markdown documents. The package offers nominal MIDI file output support in conjunction with rendering sheet music. The package can read MIDI files and attempts to structure the MIDI data to integrate as best as possible with the data structures and functionality found throughout the package.

Maintained by Matthew Leonawicz. Last updated 6 months ago.

guitar-tablature lilypond lilypond-api music-analysis music-data music-notation music-programming music-syntax music-transcription sheet-music

3.8 match 132 stars 7.87 score 94 scripts

datawookie

emayili:Send Email Messages

A light, simple tool for sending emails with minimal dependencies.

Maintained by Andrew B. Collier. Last updated 1 month ago.

hacktoberfest

3.0 match 180 stars 9.59 score 95 scripts 3 dependents

mooresm

serrsBayes:Bayesian Modelling of Raman Spectroscopy

Sequential Monte Carlo (SMC) algorithms for fitting a generalised additive mixed model (GAMM) to surface-enhanced resonance Raman spectroscopy (SERRS), using the method of Moores et al. (2016) <arXiv:1604.07299>. Multivariate observations of SERRS are highly collinear and lend themselves to a reduced-rank representation. The GAMM separates the SERRS signal into three components: a sequence of Lorentzian, Gaussian, or pseudo-Voigt peaks; a smoothly-varying baseline; and additive white noise. The parameters of each component of the model are estimated iteratively using SMC. The posterior distributions of the parameters given the observed spectra are represented as a population of weighted particles.

Maintained by Matt Moores. Last updated 4 years ago.

bayesian chemometrics raman sequential-monte-carlo spectroscopy cpp

5.3 match 8 stars 5.46 score 36 scripts

shixiangwang

sigminer.prediction:Train and Predict Cancer Subtype with Keras Model based on Mutational Signatures

Mutational signatures represent mutational processes that occurred during cancer evolution, and are thus stable genetic resources for subtyping. This tool provides functions for training neural network models to predict the subtype a sample belongs to, based on the 'keras' and 'sigminer' packages.

Maintained by Shixiang Wang. Last updated 3 years ago.

keras mutational-signatures prostate-cancer sigminer

10.8 match 8 stars 2.60 score 2 scripts

january3

tmod:Feature Set Enrichment Analysis for Metabolomics and Transcriptomics

Methods and feature set definitions for feature or gene set enrichment analysis in transcriptional and metabolic profiling data. Package includes tests for enrichment based on ranked lists of features, functions for visualisation and multivariate functional analysis. See Zyla et al (2019) <doi:10.1093/bioinformatics/btz447>.

Maintained by January Weiner. Last updated 2 months ago.

4.0 match 3 stars 6.88 score 168 scripts 1 dependent

willju-wangqian

cmpsR:R Implementation of Congruent Matching Profile Segments Method

This is an open-source implementation of the Congruent Matching Profile Segments (CMPS) method (Chen et al. 2019) <doi:10.1016/j.forsciint.2019.109964>. In general, it can be used for objective comparison of striated tool marks, and in our examples, we specifically use it for bullet signature comparisons. The CMPS score is expected to be large if two signatures are similar, so it can also be considered a feature that measures the similarity of two bullet signatures.

Maintained by Wangqian Ju. Last updated 2 years ago.

7.1 match 1 star 3.85 score 14 scripts

kharchenkolab

pagoda2:Single Cell Analysis and Differential Expression

Analyzing and interactively exploring large-scale single-cell RNA-seq datasets. 'pagoda2' primarily performs normalization and differential gene expression analysis, with an interactive application for exploring single-cell RNA-seq datasets. It performs basic tasks such as cell size normalization and gene variance normalization, and can be used to identify subpopulations and run differential expression within individual samples. 'pagoda2' was written to rapidly process modern large-scale scRNAseq datasets of approximately 1e6 cells. The companion web application allows users to explore which gene expression patterns form the different subpopulations within their data. The package also serves as the primary method for preprocessing data for conos, <https://github.com/kharchenkolab/conos>. This package interacts with data available through the 'p2data' package, which is available in a 'drat' repository. To access this data package, see the instructions at <https://github.com/kharchenkolab/pagoda2>. The size of the 'p2data' package is approximately 6 MB.

Maintained by Evan Biederstedt. Last updated 1 year ago.

scrna-seq single-cell single-cell-rna-seq transcriptomics openblas cpp openmp

3.4 match 222 stars 8.00 score 282 scripts

azure

AzureStor:Storage Management in 'Azure'

Manage storage in Microsoft's 'Azure' cloud: <https://azure.microsoft.com/en-us/product-categories/storage/>. On the admin side, 'AzureStor' includes features to create, modify and delete storage accounts. On the client side, it includes an interface to blob storage, file storage, and 'Azure Data Lake Storage Gen2': upload and download files and blobs; list containers and files/blobs; create containers; and so on. Authenticated access to storage is supported, via either a shared access key or a shared access signature (SAS). Part of the 'AzureR' family of packages.

Maintained by Hong Ooi. Last updated 2 years ago.

azure-data-lake azure-sdk-r azure-storage azure-storage-blob azure-storage-file

2.4 match 64 stars 10.72 score 298 scripts 4 dependents

bioc

DART:Denoising Algorithm based on Relevance network Topology

Denoising Algorithm based on Relevance network Topology (DART) is an algorithm designed to evaluate the consistency of prior-information molecular signatures (e.g., in vitro perturbation expression signatures) in independent molecular data (e.g., gene expression data sets). If consistent, a pruning network strategy is then used to infer the activation status of the molecular signature in individual samples.

Maintained by Charles Shijie Zheng. Last updated 5 months ago.

geneexpression differentialexpression graphandnetwork pathways

5.9 match 4.30 score 1 script

bioc

Pigengene:Infers biological signatures from gene expression data

The Pigengene package provides an efficient way to infer biological signatures from gene expression profiles. The signatures are independent of the underlying platform, e.g., the input can be microarray or RNA-Seq data. It can even infer the signatures using data from one platform and evaluate them on the other. Pigengene identifies the modules (clusters) of highly coexpressed genes using coexpression network analysis, summarizes the biological information of each module in an eigengene, learns a Bayesian network that models the probabilistic dependencies between modules, and builds a decision tree based on the expression of eigengenes.

Maintained by Habil Zare. Last updated 5 months ago.

geneexpression rnaseq networkinference network graphandnetwork biomedicalinformatics systemsbiology transcriptomics classification clustering decisiontree dimensionreduction principalcomponent microarray normalization immunooncology

5.5 match 4.56 score 10 scripts 1 dependent

r-spatial

rgee:R Bindings for Calling the 'Earth Engine' API

Earth Engine <https://earthengine.google.com/> client library for R. All of the 'Earth Engine' API classes, modules, and functions are made available. Additional functions implemented include importing (exporting) of Earth Engine spatial objects, extraction of time series, interactive map display, assets management interface, and metadata display. See <https://r-spatial.github.io/rgee/> for further details.

Maintained by Cesar Aybar. Last updated 3 days ago.

earth-engine earthengine google-earth-engine googleearthengine spatial-analysis spatial-data

1.8 match 715 stars 13.77 score 1.9k scripts 3 dependents

songw01

MEGENA:Multiscale Clustering of Geometrical Network

Co-Expression Network Analysis by adopting a network embedding technique. Song W.-M., Zhang B. (2015) Multiscale Embedded Gene Co-expression Network Analysis. PLoS Comput Biol 11(11): e1004574. <doi:10.1371/journal.pcbi.1004574>.

Maintained by Won-Min Song. Last updated 1 year ago.

cpp

3.4 match 49 stars 6.82 score 45 scripts 1 dependent

jellegoeman

penalized:L1 (Lasso and Fused Lasso) and L2 (Ridge) Penalized Estimation in GLMs and in the Cox Model

Fitting possibly high dimensional penalized regression models. The penalty structure can be any combination of an L1 penalty (lasso and fused lasso), an L2 penalty (ridge) and a positivity constraint on the regression coefficients. The supported regression models are linear, logistic and Poisson regression and the Cox Proportional Hazards model. Cross-validation routines allow optimization of the tuning parameters.

Maintained by Jelle Goeman. Last updated 3 years ago.

openblas cpp

3.2 match 4 stars 7.09 score 429 scripts 17 dependents

bioc

ideal:Interactive Differential Expression AnaLysis

This package provides functions for an Interactive Differential Expression AnaLysis of RNA-sequencing datasets, to quickly and effectively extract information downstream of the differential expression step. A Shiny application encapsulates the whole package. Support for reproducibility of the whole analysis is provided by means of a template report which gets automatically compiled and can be stored/shared.

Maintained by Federico Marini. Last updated 3 months ago.

immunooncology geneexpression differentialexpression rnaseq sequencing visualization qualitycontrol gui genesetenrichment reportwriting shinyapps bioconductor differential-expression reproducible-research rna-seq rna-seq-analysis shiny user-friendly

3.3 match 29 stars 6.78 score 5 scripts

ncss-tech

aqp:Algorithms for Quantitative Pedology

The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information, freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offers a convenient platform for bridging the gap between pedometric theory and practice.

Maintained by Dylan Beaudette. Last updated 28 days ago.

digital-soil-mapping ncss-tech nrcs pedology pedometrics soil soil-survey usda

1.9 match 55 stars 11.77 score 1.2k scripts 2 dependents

bioc

ToxicoGx:Analysis of Large-Scale Toxico-Genomic Data

Contains a set of functions to perform large-scale analysis of toxicogenomic data, providing a standardized data structure to hold information relevant to annotation, visualization and statistical analysis of toxicogenomic data.

Maintained by Benjamin Haibe-Kains. Last updated 5 months ago.

geneexpression pharmacogenetics pharmacogenomics software

5.0 match 4.36 score 23 scripts

sistia01

DWLS:Gene Expression Deconvolution Using Dampened Weighted Least Squares

The rapid development of single-cell transcriptomic technologies has helped uncover the cellular heterogeneity within cell populations. However, bulk RNA-seq continues to be the main workhorse for quantifying gene expression levels due to technical simplicity and low cost. To most effectively extract information from bulk data given the new knowledge gained from single-cell methods, we have developed a novel algorithm to estimate the cell-type composition of bulk data from a single-cell RNA-seq-derived cell-type signature. Comparison with existing methods using various real RNA-seq data sets indicates that our new approach is more accurate and comprehensive than previous methods, especially for the estimation of rare cell types. More importantly, our method can detect cell-type composition changes in response to external perturbations, thereby providing a valuable, cost-effective method for dissecting the cell-type-specific effects of drug treatments or condition changes. As such, our method is applicable to a wide range of biological and clinical investigations. Dampened weighted least squares ('DWLS') is an estimation method for gene expression deconvolution, in which the cell-type composition of a bulk RNA-seq data set is computationally inferred. This method corrects common biases towards cell types that are characterized by highly expressed genes and/or are highly prevalent, to provide accurate detection across diverse cell types. See: <https://www.nature.com/articles/s41467-019-10802-z.pdf> for more information about the development of 'DWLS' and the methods behind our functions.

Maintained by Adriana Sistig. Last updated 3 years ago.

5.9 match 2 stars 3.62 score 42 scripts

insightsengineering

teal.modules.hermes:RNA-Seq Analysis Modules to Add to a Teal Application

RNA-seq analysis teal modules based on the `hermes` package.

Maintained by Daniel Sabanés Bové. Last updated 1 year ago.

modules rna-seq-analysis shiny

3.6 match 7 stars 5.54 score 32 scripts

bioc

GSVA:Gene Set Variation Analysis for Microarray and RNA-Seq Data

Gene Set Variation Analysis (GSVA) is a non-parametric, unsupervised method for estimating variation of gene set enrichment through the samples of an expression data set. GSVA performs a change in coordinate systems, transforming the data from a gene by sample matrix to a gene-set by sample matrix, thereby allowing the evaluation of pathway enrichment for each sample. This new matrix of GSVA enrichment scores facilitates applying standard analytical methods like functional enrichment, survival analysis, clustering, CNV-pathway analysis or cross-tissue pathway analysis, in a pathway-centric manner.

Maintained by Robert Castelo. Last updated 4 days ago.

functionalgenomics microarray rnaseq pathways genesetenrichment gene-set-enrichment genomics pathway-enrichment-analysis

1.3 match 210 stars 14.72 score 1.6k scripts 19 dependents

ajitjohnson

imsig:Immune Cell Gene Signatures for Profiling the Microenvironment of Solid Tumours

Estimate the relative abundance of tissue-infiltrating immune subpopulations using gene expression data.

Maintained by Ajit Johnson Nirmal. Last updated 4 years ago.

cancer deconvolution immune transcriptomics

4.7 match 26 stars 4.11 score 8 scripts

r-lib

cpp11:A C++11 Interface for R's C Interface

Provides a header-only C++11 interface to R's C interface. Compared to other approaches, 'cpp11' strives to be safe against long jumps from the C API as well as C++ exceptions, to conform to normal R function semantics, and to support interaction with 'ALTREP' vectors.

Maintained by Davis Vaughan. Last updated 12 days ago.

cpp cpp11

1.1 match 212 stars 17.69 score 104 scripts 8.6k dependents

yuande

signatureSurvival:Signature Survival Analysis

When multiple Cox proportional hazard models are fitted to clinical data (month or year and status) and a set of differentially expressed genes, the results (hazard risks, z-scores and p-values) can be used to create gene-expression signatures. Weights are calculated using the survival p-values of genes and are utilized to calculate expression values of the signature across the selected genes in all patients in a cohort. Single or multiple univariate or multivariate Cox proportional hazard survival analyses of the patients in one cohort can be performed using the gene-expression signature and visualized using our survival plots.

Maintained by Yuan-De Tan. Last updated 2 years ago.

19.0 match 1 star 1.00 score

ropensci

RNeXML:Semantically Rich I/O for the 'NeXML' Format

Provides access to phyloinformatic data in 'NeXML' format. The package adds new functionality to R, such as the possibility to manipulate 'NeXML' objects in more varied and refined ways, and compatibility with 'ape' objects.

Maintained by Carl Boettiger. Last updated 10 months ago.

metadata nexml phylogenetics linked-data

1.9 match 13 stars 9.92 score 100 scripts 19 dependents

r-lib

jose:JavaScript Object Signing and Encryption

Read and write JSON Web Keys (JWK, rfc7517), generate and verify JSON Web Signatures (JWS, rfc7515) and encode/decode JSON Web Tokens (JWT, rfc7519) <https://datatracker.ietf.org/wg/jose/documents/>. These standards provide modern signing and encryption formats that are natively supported by browsers via the JavaScript WebCryptoAPI <https://www.w3.org/TR/WebCryptoAPI/#jose>, and used by services like OAuth 2.0, LetsEncrypt, and Github Apps.
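To make the JWS/JWT structure mentioned above concrete, here is a minimal Python sketch of signing and verifying a compact token with the HS256 (HMAC-SHA256) algorithm, using only the standard library. It does not use or reproduce the 'jose' R package API: the function names `sign_hs256` and `verify_hs256`, the claims and the secret are all hypothetical, and the sketch deliberately ignores registered claims such as expiry.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in compact JWS."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _mac(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a compact token: header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{b64url(_mac(signing_input, secret))}"


def verify_hs256(token: str, secret: bytes) -> dict:
    """Check the signature and return the claims, or raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    if json.loads(b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = _mac(f"{header_b64}.{payload_b64}", secret)
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("invalid signature")
    return json.loads(b64url_decode(payload_b64))


# Hypothetical claims and shared secret.
token = sign_hs256({"sub": "alice"}, b"not-a-real-secret")
claims = verify_hs256(token, b"not-a-real-secret")
```

The verifier accepts only the one algorithm it expects instead of trusting whatever the token header declares, which is a common precaution when checking signed tokens.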

Maintained by Jeroen Ooms. Last updated 5 months ago.

1.7 match 50 stars 10.98 score 63 scripts 35 dependents

louisaslett

ReliabilityTheory:Structural Reliability Analysis

Perform structural reliability analysis, including computation and simulation with system signatures, Samaniego (2007) <doi:10.1007/978-0-387-71797-5>, and survival signatures, Coolen and Coolen-Maturi (2013) <doi:10.1007/978-3-642-30662-4_8>. Additionally supports parametric and topological inference given system lifetime data, Aslett (2012) <https://www.louisaslett.com/PhD_Thesis.pdf>.

Maintained by Louis Aslett. Last updated 6 months ago.

reliability-engineering

4.7 match 7 stars 3.92 score 12 scripts

bioc

RESOLVE:RESOLVE: An R package for the efficient analysis of mutational signatures from cancer genomes

Cancer is a genetic disease caused by somatic mutations in genes controlling key biological functions such as cellular growth and division. Such mutations may arise through both cell-intrinsic and exogenous processes, generating characteristic mutational patterns over the genome named mutational signatures. The study of mutational signatures has become a standard component of modern genomics studies, since it can reveal which (environmental and endogenous) mutagenic processes are active in a tumor, and may highlight markers for therapeutic response. The computational analysis of mutational signatures presents many pitfalls. First, the task of determining the number of signatures is very complex and depends on heuristics. Second, several signatures have no clear etiology, raising doubt as to whether they reflect mutagenic processes or are computational artifacts. Last, approaches for signature assignment are greatly influenced by the set of signatures used for the analysis. To overcome these limitations, we developed RESOLVE (Robust EStimation Of mutationaL signatures Via rEgularization), a framework that allows the efficient extraction and assignment of mutational signatures. RESOLVE implements a novel algorithm that enables (i) efficient extraction, (ii) exposure estimation, and (iii) confidence assessment during the computational inference of mutational signatures.

Maintained by Luca De Sano. Last updated 5 months ago.

biomedicalinformatics somaticmutation

3.9 match 1 stars 4.60 score 3 scripts

cynkra

constructive:Display Idiomatic Code to Construct Most R Objects

Prints code that can be used to recreate R objects. In a sense it is similar to 'base::dput()' or 'base::deparse()' but 'constructive' strives to use idiomatic constructors.

Maintained by Antoine Fabri. Last updated 2 days ago.

2.0 match 137 stars 8.63 score 20 scripts

bioc

scDataviz:scDataviz: single cell dataviz and downstream analyses

In the single-cell world, which includes flow cytometry, mass cytometry, single-cell RNA-seq (scRNA-seq), and others, there is a need to improve data visualisation and to bring analysis capabilities to researchers even from non-technical backgrounds. scDataviz attempts to fit into this space, while also catering for advanced users. Additionally, due to the way that scDataviz is designed, which is based on SingleCellExperiment, it has a 'plug and play' feel, and immediately lends itself as flexible and compatible with studies that go beyond scDataviz. Finally, the graphics in scDataviz are generated via the ggplot engine, which means that users can 'add on' features to these with ease.

Maintained by Kevin Blighe. Last updated 5 months ago.

singlecell immunooncology rnaseq geneexpression transcription flowcytometry massspectrometry dataimport

2.6 match 63 stars 6.30 score 16 scripts

bioc

PanomiR:Detection of miRNAs that regulate interacting groups of pathways

PanomiR is a package to detect miRNAs that target groups of pathways from gene expression data. This package provides functionality for generating pathway activity profiles, determining differentially activated pathways between user-specified conditions, determining clusters of pathways via the PCxN package, and generating miRNAs targeting clusters of pathways. These functions can be used separately or sequentially to analyze RNA-Seq data.

Maintained by Pourya Naderi. Last updated 5 months ago.

geneexpression genesetenrichment genetarget mirna pathways

3.4 match 3 stars 4.89 score 13 scripts

bioc

mistyR:Multiview Intercellular SpaTial modeling framework

mistyR is an implementation of the Multiview Intercellular SpaTial modeling framework (MISTy). MISTy is an explainable machine learning framework for knowledge extraction and analysis of single-cell, highly multiplexed, spatially resolved data. MISTy facilitates an in-depth understanding of marker interactions by profiling the intra- and intercellular relationships. MISTy is a flexible framework able to process a custom number of views. Each of these views can describe a different spatial context, i.e., define a relationship among the observed expressions of the markers, such as intracellular regulation or paracrine regulation. The views can also capture cell-type specific relationships, capture relations between functional footprints or focus on relations between different anatomical regions. Each MISTy view is considered as a potential source of variability in the measured marker expressions. Each MISTy view is then analyzed for its contribution to the total expression of each marker and is explained in terms of the interactions with other measurements that led to the observed contribution.

Maintained by Jovan Tanevski. Last updated 5 months ago.

software biomedicalinformatics cellbiology systemsbiology regression decisiontree singlecell spatial bioconductor biology intercellular machine-learning modular molecular-biology multiview spatial-transcriptomics

2.0 match 51 stars 7.87 score 160 scripts

bioc

ccmap:Combination Connectivity Mapping

Finds drugs and drug combinations that are predicted to reverse or mimic gene expression signatures. These drugs might reverse diseases or mimic healthy lifestyles.

Maintained by Alex Pickering. Last updated 5 months ago.

geneexpression transcription microarray differentialexpression

5.2 match 3.00 score 5 scripts

r-lib

mockery:Mocking Library for R

The two main functionalities of this package are creating mock objects (functions) and selectively intercepting calls to a given function that originate in some other function. It can be used with any testing framework available for R. Mock objects can be injected with either this package's own stub() function or a similar with_mock() facility present in the 'testthat' package.

Maintained by Hadley Wickham. Last updated 1 year ago.

1.3 match 100 stars 11.57 score 504 scripts 5 dependents

s-u

PKI:Public Key Infrastructure for R Based on the X.509 Standard

Public Key Infrastructure functions such as verifying certificates, RSA encryption and signing, which can be used to build PKI infrastructure and perform cryptographic tasks.

Maintained by Simon Urbanek. Last updated 7 months ago.

openssl

1.8 match 18 stars 8.52 score 63 scripts 8 dependents

bioc

timeOmics:Time-Course Multi-Omics data integration

timeOmics is a generic data-driven framework to integrate multi-Omics longitudinal data measured on the same biological samples and select key temporal features with strong associations within the same sample group. The main steps of timeOmics are: 1. Platform and time-specific normalization and filtering steps; 2. Modelling each biological feature into one time expression profile; 3. Clustering features with the same expression profile over time; 4. Post-hoc validation step.

Maintained by Antoine Bodein. Last updated 5 months ago.

clustering featureextraction timecourse dimensionreduction software sequencing microarray metabolomics metagenomics proteomics classification regression immunooncology geneprediction multiplecomparison cluster integration multi-omics time-series

2.5 match 24 stars 5.98 score 10 scripts

bioc

mosbi:Molecular Signature identification using Biclustering

This package is an implementation of the biclustering ensemble method MoSBi (Molecular signature Identification from Biclustering). MoSBi provides standardized interfaces for biclustering results and can combine their results with a multi-algorithm ensemble approach to compute robust ensemble biclusters on molecular omics data. This is done by computing similarity networks of biclusters and filtering for overlaps using a custom error model. After that, the Louvain modularity is used to extract bicluster communities from the similarity network, which can then be converted to ensemble biclusters. Additionally, MoSBi includes several network visualization methods to give an intuitive and scalable overview of the results. MoSBi comes with several biclustering algorithms, but can be easily extended to new biclustering algorithms.

Maintained by Tim Daniel Rose. Last updated 5 months ago.

software statisticalmethod clustering network cpp

3.5 match 4.30 score 8 scripts

bioc

GeneTonic:Enjoy Analyzing And Integrating The Results From Differential Expression Analysis And Functional Enrichment Analysis

This package provides functionality to combine the existing pieces of the transcriptome data and results, making it easier to generate insightful observations and hypotheses. Its usage is made easy with a Shiny application, combining the benefits of interactivity and reproducibility, e.g. by capturing the features and gene sets of interest highlighted during the live session, and creating an HTML report as an artifact where text, code, and output coexist. Using the GeneTonicList as a standardized container for all the required components, it is possible to simplify the generation of multiple visualizations and summaries.

Maintained by Federico Marini. Last updated 2 months ago.

gui geneexpression software transcription transcriptomics visualization differentialexpression pathways reportwriting genesetenrichment annotation go shinyapps bioconductor bioconductor-package data-exploration data-visualization functional-enrichment-analysis gene-expression pathway-analysis reproducible-research rna-seq-analysis rna-seq-data shiny transcriptome user-friendly

1.8 match 77 stars 8.28 score 37 scripts 1 dependents

bioc

xCell2:A Tool for Generic Cell Type Enrichment Analysis

xCell2 provides methods for cell type enrichment analysis using cell type signatures. It includes three main functions: 1. xCell2Train for training custom reference objects from bulk or single-cell RNA-seq datasets. 2. xCell2Analysis for conducting the cell type enrichment analysis using the custom reference. 3. xCell2GetLineage for identifying dependencies between different cell types using ontology.

Maintained by Almog Angel. Last updated 2 months ago.

geneexpression transcriptomics microarray rnaseq singlecell differentialexpression immunooncology genesetenrichment

2.4 match 6 stars 6.17 score 15 scripts

bioc

DECIPHER:Tools for curating, analyzing, and manipulating biological sequences

A toolset for deciphering and managing biological sequences.

Maintained by Erik Wright. Last updated 5 days ago.

clustering genetics sequencing dataimport visualization microarray qualitycontrol qpcr alignment wholegenome microbiome immunooncology geneprediction openmp

1.7 match 8.40 score 1.1k scripts 14 dependents

brazil-data-cube

rstac:Client Library for SpatioTemporal Asset Catalog

Provides functions to access, search and download spacetime earth observation data via SpatioTemporal Asset Catalog (STAC). This package supports version 1.0.0 (and older) of the STAC specification (<https://github.com/radiantearth/stac-spec>). For further details see Simoes et al. (2021) <doi:10.1109/IGARSS47720.2021.9553518>.

Maintained by Felipe Carvalho. Last updated 9 months ago.

geospatial spatiotemporal-asset-catalog stac

1.7 match 72 stars 8.28 score 250 scripts 3 dependents

syedhaider5

SIMMS:Subnetwork Integration for Multi-Modal Signatures

Algorithms to create prognostic biomarkers using biological genesets or networks.

Maintained by Syed Haider. Last updated 3 years ago.

5.9 match 2.30 score 20 scripts

bioc

miloR:Differential neighbourhood abundance testing on a graph

Milo performs single-cell differential abundance testing. Cell states are modelled as representative neighbourhoods on a nearest neighbour graph. Hypothesis testing is performed using either a negative binomial generalized linear model or a negative binomial generalized linear mixed model.

Maintained by Mike Morgan. Last updated 5 months ago.

singlecell multiplecomparison functionalgenomics software openblas cpp openmp

1.3 match 357 stars 10.49 score 340 scripts 1 dependents

bioc

canceR:A Graphical User Interface for accessing and modeling the Cancer Genomics Data of MSKCC

The package is a user-friendly interface based on the cgdsr and other modeling packages to explore, compare, and analyse all available Cancer Data (Clinical data, Gene Mutation, Gene Methylation, Gene Expression, Protein Phosphorylation, Copy Number Alteration) hosted by the Computational Biology Center at Memorial-Sloan-Kettering Cancer Center (MSKCC).

Maintained by Karim Mezhoud. Last updated 5 months ago.

gui geneexpression clustering go genesetenrichment kegg multiplecomparison cancer cancer-data gene gene-expression gene-methylation gene-mutation gene-sets methylation mskcc mutations tcltk

2.5 match 7 stars 5.25 score 17 scripts

bioc

quantiseqr:Quantification of the Tumor Immune contexture from RNA-seq data

This package provides a streamlined workflow for the quanTIseq method, developed to perform the quantification of the Tumor Immune contexture from RNA-seq data. The quantification is performed against the TIL10 signature (dissecting the contributions of ten immune cell types), carefully crafted from a collection of human RNA-seq samples. The TIL10 signature has been extensively validated using simulated, flow cytometry, and immunohistochemistry data.

Maintained by Federico Marini. Last updated 3 months ago.

geneexpression software transcription transcriptomics sequencing microarray visualization annotation immunooncology featureextraction classification statisticalmethod experimenthubsoftware flowcytometry

2.8 match 4.65 score 3 scripts 1 dependents

bioc

SVMDO:Identification of Tumor-Discriminating mRNA Signatures via Support Vector Machines Supported by Disease Ontology

It is an easy-to-use GUI using disease information for detecting tumor/normal sample discriminating gene sets from differentially expressed genes. Our approach is based on an iterative algorithm filtering genes with disease ontology enrichment analysis and the Wilks' lambda criterion, connected to SVM classification model construction. Along with gene set extraction, SVMDO also provides individual prognostic marker detection. The algorithm is designed for FPKM and RPKM normalized RNA-Seq transcriptome datasets.

Maintained by Mustafa Erhan Ozer. Last updated 5 months ago.

genesetenrichment differentialexpression gui classification rnaseq transcriptomics survival machine-learning rna-seq shiny

2.7 match 4.60 score 2 scripts

larssnip

microseq:Basic Biological Sequence Handling

Basic functions for microbial sequence data analysis. The idea is to use generic R data structures as much as possible, making R data wrangling possible also for sequence data.

Maintained by Lars Snipen. Last updated 10 months ago.

cpp

2.3 match 3 stars 5.46 score 54 scripts 3 dependents

cleanzr

klsh:Blocking for Record Linkage

An implementation of the blocking algorithm KLSH in Steorts, Ventura, Sadinle, Fienberg (2014) <DOI:10.1007/978-3-319-11257-2_20>, which is a k-means variant of locality sensitive hashing. The method is illustrated with examples and a vignette.

Maintained by Rebecca Steorts. Last updated 4 years ago.

3.3 match 3.70 score 3 scripts

bioc

AnVILAz:R / Bioconductor Support for the AnVIL Azure Platform

The AnVIL is a cloud computing resource developed in part by the National Human Genome Research Institute. The AnVILAz package supports end-users and developers using the AnVIL platform in the Azure cloud. The package provides a programmatic interface to AnVIL resources, including workspaces, notebooks, tables, and workflows. The package also provides utilities for managing resources, including copying files to and from Azure Blob Storage, and creating shared access signatures (SAS) for secure access to Azure resources.

Maintained by Marcel Ramos. Last updated 5 months ago.

software infrastructure thirdpartyclient

2.2 match 5.45 score 5 scripts

bioc

SynMut:SynMut: Designing Synonymously Mutated Sequences with Different Genomic Signatures

There is increasing demand for designing virus mutants with specific dinucleotide or codon composition. This tool can take dinucleotide preference and/or codon usage bias into account while designing mutants. It is a powerful tool for in silico designs of DNA sequence mutants.

Maintained by Haogao Gu. Last updated 5 months ago.

sequencematching experimentaldesign preprocessing

2.8 match 2 stars 4.30 score 1 scripts

computationalproteomics

proteoDeconv:Enabling Cell-Type Deconvolution of Proteomics Data

Tools for deconvoluting proteomics data to identify and quantify cell types (e.g., immune cell types) in complex biological samples.

Maintained by Måns Zamore. Last updated 2 days ago.

3.1 match 2 stars 3.85 score

enchufa2

RcppXPtrUtils:XPtr Add-Ons for 'Rcpp'

Provides the means to compile user-supplied C++ functions with 'Rcpp' and retrieve an 'XPtr' that can be passed to other C++ components.

Maintained by Iñaki Ucar. Last updated 5 months ago.

rcpp xptr

2.0 match 20 stars 5.98 score 45 scripts 7 dependents

hneth

unicol:The Colors of your University

Most universities use specific color combinations to express their unique brand identity. The 'unicol' package provides the colors and color palettes of various universities for easy plotting and printing in R. We collect and provide a diverse range of color palettes for creating scientific visualizations.

Maintained by Hansjoerg Neth. Last updated 7 months ago.

branding color color-palettes color-schemes corporate-design university-colors visual-identity

1.8 match 9 stars 6.58 score 10 scripts

bioc

oppar:Outlier profile and pathway analysis in R

The R implementation of mCOPA package published by Wang et al. (2012). Oppar provides methods for Cancer Outlier profile Analysis. Although initially developed to detect outlier genes in cancer studies, methods presented in oppar can be used for outlier profile analysis in general. In addition, tools are provided for gene set enrichment and pathway analysis.

Maintained by Soroor Hediyeh zadeh. Last updated 5 months ago.

pathways genesetenrichment systemsbiology geneexpression software

3.5 match 3.30 score 3 scripts

finlaycampbell

outbreaker2:Bayesian Reconstruction of Disease Outbreaks by Combining Epidemiologic and Genomic Data

Bayesian reconstruction of disease outbreaks using epidemiological and genetic information. Jombart T, Cori A, Didelot X, Cauchemez S, Fraser C and Ferguson N. 2014. <doi:10.1371/journal.pcbi.1003457>. Campbell, F, Cori A, Ferguson N, Jombart T. 2019. <doi:10.1371/journal.pcbi.1006930>.

Maintained by Finlay Campbell. Last updated 6 months ago.

cpp

1.5 match 7.67 score 101 scripts 1 dependents

hheiling

glmmPen:High Dimensional Penalized Generalized Linear Mixed Models (pGLMM)

Fits high dimensional penalized generalized linear mixed models using the Monte Carlo Expectation Conditional Minimization (MCECM) algorithm. The purpose of the package is to perform variable selection on both the fixed and random effects simultaneously for generalized linear mixed models. The package supports fitting of Binomial, Gaussian, and Poisson data with canonical links, and supports penalization using the MCP, SCAD, or LASSO penalties. The MCECM algorithm is described in Rashid et al. (2020) <doi:10.1080/01621459.2019.1671197>. The techniques used in the minimization portion of the procedure (the M-step) are derived from the procedures of the 'ncvreg' package (Breheny and Huang (2011) <doi:10.1214/10-AOAS388>) and 'grpreg' package (Breheny and Huang (2015) <doi:10.1007/s11222-013-9424-2>), with appropriate modifications to account for the estimation and penalization of the random effects. The 'ncvreg' and 'grpreg' packages also describe the MCP, SCAD, and LASSO penalties.

Maintained by Hillary Heiling. Last updated 7 months ago.

cpp

3.0 match 6 stars 3.73 score 18 scripts

bioc

GmicR:Combines WGCNA and xCell readouts with Bayesian network learning to generate a Gene-Module Immune-Cell network (GMIC)

This package uses Bayesian network learning to detect relationships between Gene Modules detected by WGCNA and immune cell signatures defined by xCell. It is a hypothesis-generating tool.

Maintained by Richard Virgen-Slane. Last updated 5 months ago.

software systemsbiology graphandnetwork network networkinference gui immunooncology geneexpression qualitycontrol bayesian clustering

2.8 match 4.00 score 2 scripts

bioc

maPredictDSC:Phenotype prediction using microarray data: approach of the best overall team in the IMPROVER Diagnostic Signature Challenge

This package implements the classification pipeline of the best overall team (Team221) in the IMPROVER Diagnostic Signature Challenge. Additional functionality is added to compare 27 combinations of data preprocessing, feature selection and classifier types.

Maintained by Adi Laurentiu Tarca. Last updated 5 months ago.

microarray classification

4.8 match 2.30 score 2 scripts

bioc

sparrow:Take command of set enrichment analyses through a unified interface

Provides a unified interface to a variety of GSEA techniques from different Bioconductor packages. Results are harmonized into a single object and can be interrogated uniformly for quick exploration and interpretation of results. Interactive exploration of GSEA results is enabled through a shiny app provided by a sparrow.shiny sibling package.

Maintained by Steve Lianoglou. Last updated 3 months ago.

genesetenrichment pathways bioinformatics gsea

1.7 match 21 stars 6.58 score 13 scripts

cran

topologyGSA:Gene Set Analysis Exploiting Pathway Topology

Using Gaussian graphical models we propose a novel approach to perform pathway analysis using gene expression. Given the structure of a graph (a pathway) we introduce two statistical tests to compare the mean and the concentration matrices between two groups. Specifically, these tests can be performed on the graph and on its connected components (cliques). The package is based on the method described in Massa M.S., Chiogna M., Romualdi C. (2010) <doi:10.1186/1752-0509-4-121>.

Maintained by Gabriele Sales. Last updated 1 year ago.

6.7 match 1.60 score

coolbutuseless

callme:Easily Compile and Call Inline 'C' Functions

Compile inline 'C' code and easily call with automatically generated wrapper functions. By allowing user-defined headers and compilation flags (preprocessor, compiler and linking flags) the user can configure optimization options and linking to third party libraries. Multiple functions may be defined in a single block of code - which may be defined in a string or a path to a source file.

Maintained by Mike Cheng. Last updated 27 days ago.

c

1.5 match 20 stars 6.95 score 21 scripts

bioc

RNAAgeCalc:A multi-tissue transcriptional age calculator

It has been shown that both DNA methylation and RNA transcription are linked to chronological age and age-related diseases. Several estimators have been developed to predict human aging from DNA level and RNA level. Most of the human transcriptional age predictors are based on microarray data and limited to only a few tissues. To date, transcriptional studies on aging using RNASeq data from different human tissues are limited. The aim of this package is to provide a tool for across-tissue and tissue-specific transcriptional age calculation based on GTEx RNASeq data.

Maintained by Xu Ren. Last updated 5 months ago.

rnaseq geneexpression biological-age elastic-net gene-expression genotype-tissue-expression prediction regularized-regression rna-seq

2.0 match 8 stars 5.20 score 10 scripts

dosorio

retriever:Generate Disease-Specific Response Signatures from the LINCS-L1000 Data

Generates disease-specific drug-response profiles that are independent of time, concentration, and cell-line. Based on the cell lines used as surrogates, the returned profiles represent the unique transcriptional changes induced by a compound in a given disease.

Maintained by Daniel Osorio. Last updated 3 years ago.

4.6 match 2.26 score 12 scripts

bioc

ontoProc:processing of ontologies of anatomy, cell lines, and so on

Support harvesting of diverse bioinformatic ontologies, making particular use of the ontologyIndex package on CRAN. We provide snapshots of key ontologies for terms about cells, cell lines, chemical compounds, and anatomy, to help analyze genome-scale experiments, particularly cell x compound screens. Another purpose is to strengthen development of compelling use cases for richer interfaces to emerging ontologies.

Maintained by Vincent Carey. Last updated 3 days ago.

infrastructure go bioinformatics genomics ontology

1.6 match 3 stars 6.37 score 75 scripts 2 dependents

selkamand

mutaliskRutils:What the Package Does (One Line, Title Case)

What the package does (one paragraph).

Maintained by Sam El-Kamand. Last updated 1 year ago.

4.6 match 2.18 score 1 dependents

jli-stat

ClussCluster:Simultaneous Detection of Clusters and Cluster-Specific Genes in High-Throughput Transcriptome Data

Implements a new method 'ClussCluster' described in Ge Jiang and Jun Li, "Simultaneous Detection of Clusters and Cluster-Specific Genes in High-throughput Transcriptome Data" (Unpublished). Simultaneously performs clustering analysis and signature gene selection on high-dimensional transcriptome data sets. To do so, 'ClussCluster' incorporates a Lasso-type regularization penalty term to the objective function of K-means so that cell-type-specific signature genes can be identified while clustering the cells.

Maintained by Li Jun. Last updated 6 years ago.

3.6 match 2.70 score

bioc

sigsquared:Gene signature generation for functionally validated signaling pathways

By leveraging statistical properties (log-rank test for survival) of patient cohorts defined by binary thresholds, poor-prognosis patients are identified by the sigsquared package via optimization over a cost function reducing type I and II error.

Maintained by UnJin Lee. Last updated 5 months ago.

2.9 match 3.30 score 1 scripts

hanjunwei-lab

pathwayTMB:Pathway Based Tumor Mutational Burden

A systematic bioinformatics tool to develop a new pathway-based gene panel for tumor mutational burden (TMB) assessment (pathway-based tumor mutational burden, PTMB), using somatic mutation files in an efficient manner from either The Cancer Genome Atlas sources or any in-house studies as long as the data is in mutation annotation file (MAF) format. In addition, we develop a multiple machine learning method using the sample's PTMB profiles to identify cancer-specific dysfunction pathways, which can serve as prognostic and predictive biomarkers for cancer immunotherapy.

Maintained by Junwei Han. Last updated 3 years ago.

3.8 match 2.48 score 2 scripts 1 dependents

bioc

pathlinkR:Analyze and interpret RNA-Seq results

pathlinkR is an R package designed to facilitate analysis of RNA-Seq results. Specifically, our aim with pathlinkR was to provide a number of tools which take a list of DE genes and perform different analyses on them, aiding with the interpretation of results. Functions are included to perform pathway enrichment, with multiple databases supported, and tools for visualizing these results. Genes can also be used to create and plot protein-protein interaction networks, all from inside of R.

Maintained by Travis Blimkie. Last updated 3 months ago.

genesetenrichment network pathways reactome rnaseq networkenrichment bioinformatics networks pathway-enrichment-analysis visualization

1.3 match 26 stars 6.62 score 2 scripts

signaturescience

pracpac:Practical 'R' Packaging in 'Docker'

Streamline the creation of 'Docker' images with 'R' packages and dependencies embedded. The 'pracpac' package provides a 'usethis'-like interface to creating Dockerfiles with dependencies managed by 'renv'. The 'pracpac' functionality is described in Nagraj and Turner (2023) <doi:10.48550/arXiv.2303.07876>.

Maintained by VP Nagraj. Last updated 2 years ago.

1.6 match 34 stars 5.53 score 5 scripts

sterniii3

drugdevelopR:Utility-Based Optimal Phase II/III Drug Development Planning

Plan optimal sample size allocation and go/no-go decision rules for phase II/III drug development programs with time-to-event, binary or normally distributed endpoints when assuming fixed treatment effects or a prior distribution for the treatment effect, using methods from Kirchner et al. (2016) <doi:10.1002/sim.6624> and Preussler (2020). Optimal is in the sense of maximal expected utility, where the utility is a function taking into account the expected cost and benefit of the program. It is possible to extend to more complex settings with bias correction (Preussler S et al. (2020) <doi:10.1186/s12874-020-01093-w>), multiple phase III trials (Preussler et al. (2019) <doi:10.1002/bimj.201700241>), multi-arm trials (Preussler et al. (2019) <doi:10.1080/19466315.2019.1702092>), and multiple endpoints (Kieser et al. (2018) <doi:10.1002/pst.1861>).

Maintained by Lukas D. Sauer. Last updated 2 months ago.

1.3 match 3 stars 6.39 score 15 scripts

signaturescience

skater:Utilities for SNP-Based Kinship Analysis

Utilities for single nucleotide polymorphism (SNP) based kinship analysis testing and evaluation. The 'skater' package contains functions for importing, parsing, and analyzing pedigree data, performing relationship degree inference, benchmarking relationship degree classification, and summarizing identity by descent (IBD) segment data. Package functions and methods are described in Turner et al. (2021) "skater: An R package for SNP-based Kinship Analysis, Testing, and Evaluation" <doi:10.1101/2021.07.21.453083>.

Maintained by Stephen Turner. Last updated 2 years ago.

1.5 match 9 stars 5.26 score 7 scripts

bioc

uSORT:uSORT: A self-refining ordering pipeline for gene selection

This package is designed to uncover the intrinsic cell progression path from single-cell RNA-seq data. It incorporates data pre-processing, preliminary PCA gene selection, preliminary cell ordering, feature selection, refined cell ordering, and post-analysis interpretation and visualization.

Maintained by Hao Chen. Last updated 5 months ago.

immunooncology rnaseq gui cellbiology dnaseq

2.4 match 3.30 score

bioc

consICA:consensus Independent Component Analysis

consICA implements a data-driven deconvolution method, consensus independent component analysis (ICA), to decompose heterogeneous omics data and extract features suitable for patient diagnostics and prognostics. The method separates biologically relevant transcriptional signals from technical effects and provides information about the cellular composition and biological processes. The implementation of parallel computing in the package ensures efficient analysis on modern multicore systems.

Maintained by Petr V. Nazarov. Last updated 5 months ago.

technology statisticalmethod sequencing rnaseq transcriptomics classification featureextraction

1.8 match 4.30 score 2 scripts

alenxav

NAM:Nested Association Mapping

Designed for association studies in nested association mapping (NAM) panels, experimental and random panels. The method is described by Xavier et al. (2015) <doi:10.1093/bioinformatics/btv448>. It includes tools for genome-wide associations of multiple populations, marker quality control, population genetics analysis, genome-wide prediction, solving mixed models and finding variance components through likelihood and Bayesian methods.

Maintained by Alencar Xavier. Last updated 5 years ago.

cpp

1.3 match 2 stars 5.72 score 44 scripts 1 dependents

bioc

CARDspa:Spatially Informed Cell Type Deconvolution for Spatial Transcriptomics

CARD is a reference-based deconvolution method that estimates cell type composition in spatial transcriptomics based on cell type specific expression information obtained from reference scRNA-seq data. A key feature of CARD is its ability to accommodate spatial correlation in the cell type composition across tissue locations, enabling accurate and spatially informed cell type deconvolution as well as refined spatial map construction. CARD relies on an efficient optimization algorithm for constrained maximum likelihood estimation and is scalable to spatial transcriptomics with tens of thousands of spatial locations and tens of thousands of genes.

Maintained by Jing Fu. Last updated 15 days ago.

spatial singlecell transcriptomics visualization openblas cpp openmp

1.6 match 4.54 score 3 scripts

bioc

PREDA:Position Related Data Analysis

Package for the position related analysis of quantitative functional genomics data.

Maintained by Francesco Ferrari. Last updated 5 months ago.

software copynumbervariation geneexpression genetics

1.7 match 4.30 score 9 scripts

bioc

escape:Easy single cell analysis platform for enrichment

A bridging R package to facilitate gene set enrichment analysis (GSEA) in the context of single-cell RNA sequencing. Using raw count information, Seurat objects, or SingleCellExperiment format, users can perform and visualize ssGSEA, GSVA, AUCell, and UCell-based enrichment calculations across individual cells.

Maintained by Nick Borcherding. Last updated 2 months ago.

software singlecell classification annotation genesetenrichment sequencing genesignaling pathways

1.2 match 5.92 score 138 scripts

erichare

bulletr:Algorithms for Matching Bullet Lands

Analyze bullet lands using nonparametric methods. We provide a reading routine for x3p files (see <http://www.openfmc.org> for more information) and a host of analysis functions designed to assess the probability that two bullets were fired from the same gun barrel.

Maintained by Eric Hare. Last updated 7 years ago.

1.9 match 3.72 score 52 scripts

hanjunwei-lab

SMDIC:Identification of Somatic Mutation-Driven Immune Cells

A computing tool developed to automatically identify somatic mutation-driven immune cells. The operation modes include: i) inferring the relative abundance matrix of tumor-infiltrating immune cells and integrating it with a particular gene mutation status, ii) detecting differential immune cells with respect to the gene mutation status and converting the abundance matrix of significantly differential immune cells into two binary matrices (one for up-regulated and one for down-regulated), iii) identifying somatic mutation-driven immune cells by comparing the gene mutation status with each immune cell in the binary matrices across all samples, and iv) visualizing the immune cell abundance of samples with different mutation status.

Maintained by Junwei Han. Last updated 5 months ago.

1.6 match 2 stars 4.00 score 5 scripts

atalv

azlogr:Logging in 'R' and Post to 'Azure Log Analytics' Workspace

Extends the functionality of the 'logger' package. Additional logging metadata can be configured to be collected. Logging messages are displayed on the console and can optionally be sent to an 'Azure Log Analytics' workspace in real time.

Maintained by Vivek Atal. Last updated 12 months ago.

azure azure-log-analytics logging

1.7 match 3.70 score 8 scripts

igordot

msigdbr:MSigDB Gene Sets for Multiple Organisms in a Tidy Data Format

Provides the 'Molecular Signatures Database' (MSigDB) gene sets typically used with the 'Gene Set Enrichment Analysis' (GSEA) software (Subramanian et al. 2005 <doi:10.1073/pnas.0506580102>, Liberzon et al. 2015 <doi:10.1016/j.cels.2015.12.004>, Castanza et al. 2023 <doi:10.1038/s41592-023-02014-7>) as an R data frame. The package includes the human genes as listed in MSigDB as well as the corresponding symbols and IDs for frequently studied model organisms such as mouse, rat, pig, fly, and yeast.

Maintained by Igor Dolgalev. Last updated 4 days ago.

enrichment-analysis gene-sets genomics gsea msigdb pathway-analysis pathways

0.5 match 72 stars 12.01 score 3.6k scripts 21 dependents

bioc

decoupleR:decoupleR: Ensemble of computational methods to infer biological activities from omics data

Many methods allow us to extract biological activities from omics data using information from prior knowledge resources, reducing the dimensionality for increased statistical power and better interpretability. Here, we present decoupleR, a Bioconductor package containing different statistical methods to extract these signatures within a unified framework. decoupleR allows the user to flexibly test any method with any resource. It incorporates methods that take into account the sign and weight of network interactions. decoupleR can be used with any omic, as long as its features can be linked to a biological process based on prior knowledge. Examples include gene sets regulated by a transcription factor in transcriptomics, or phosphosites targeted by a kinase in phospho-proteomics.

Maintained by Pau Badia-i-Mompel. Last updated 5 months ago.

differentialexpression functionalgenomics geneexpression generegulation network software statisticalmethod transcription

0.5 match 230 stars 11.27 score 316 scripts 3 dependents

neonira

wyz.code.rdoc:Wizardry Code Offensive Programming R Documentation

Generates, on demand or in batch, any R documentation file, whatever its kind: data, function, class or package. It populates documentation sections, either automatically or by considering your input. Input code can be standard R code or offensive programming code. Documentation content completeness depends on the type of code you use. With offensive programming code, expect generated documentation to be fully completed, from a format and content point of view. With some standard R code, you will have to activate post-processing to fill in any section that requires complements. The validity of produced manual pages is automatically tested against R documentation compliance rules. Documentation language proficiency, wording style, and phrasal adjustments remain your job.

Maintained by Fabien Gelineau. Last updated 3 years ago.

1.9 match 2.70 score 1 script

alexeckert

parallelDist:Parallel Distance Matrix Computation using Multiple Threads

A fast parallelized alternative to R's native 'dist' function to calculate distance matrices for continuous, binary, and multi-dimensional input matrices, which supports a broad variety of 41 predefined distance functions from the 'stats', 'proxy' and 'dtw' R packages, as well as user-defined functions written in C++. For ease of use, the 'parDist' function extends the signature of the 'dist' function and uses the same parameter naming conventions as distance methods of existing R packages. The package is mainly implemented in C++ and leverages the 'RcppParallel' package to parallelize the distance computations with the help of the 'TinyThread' library. Furthermore, the 'Armadillo' linear algebra library is used for optimized matrix operations during distance calculations. The curiously recurring template pattern (CRTP) technique is applied to avoid virtual functions, which improves the Dynamic Time Warping calculations while the implementation stays flexible enough to support different DTW step patterns and normalization methods.

Maintained by Alexander Eckert. Last updated 3 years ago.

data-science distance-computations matrices openblas cpp

0.5 match 51 stars 9.92 score 432 scripts 14 dependents

bioc

iNETgrate:Integrates DNA methylation data with gene expression in a single gene network

The iNETgrate package provides functions to build a correlation network in which nodes are genes. DNA methylation and gene expression data are integrated to define the connections between genes. This network is used to identify modules (clusters) of genes. The biological information in each of the resulting modules is represented by an eigengene. These biological signatures can be used as features e.g., for classification of patients into risk categories. The resulting biological signatures are very robust and give a holistic view of the underlying molecular changes.

Maintained by Habil Zare. Last updated 5 months ago.

geneexpression rnaseq dnamethylation networkinference network graphandnetwork biomedicalinformatics systemsbiology transcriptomics classification clustering dimensionreduction principalcomponent mrnamicroarray normalization geneprediction kegg survival core-services

0.8 match 74 stars 6.21 score 1 script

usdaforestservice

gdalraster:Bindings to the 'Geospatial Data Abstraction Library' Raster API

Interface to the Raster API of the 'Geospatial Data Abstraction Library' ('GDAL', <https://gdal.org>). Bindings are implemented in an exposed C++ class encapsulating a 'GDALDataset' and its raster band objects, along with several stand-alone functions. These support manual creation of uninitialized datasets, creation from existing raster as template, read/set dataset parameters, low level I/O, color tables, raster attribute tables, virtual raster (VRT), and 'gdalwarp' wrapper for reprojection and mosaicing. Includes 'GDAL' algorithms ('dem_proc()', 'polygonize()', 'rasterize()', etc.), and functions for coordinate transformation and spatial reference systems. Calling signatures resemble the native C, C++ and Python APIs provided by the 'GDAL' project. Includes raster 'calc()' to evaluate a given R expression on a layer or stack of layers, with pixel x/y available as variables in the expression; and raster 'combine()' to identify and count unique pixel combinations across multiple input layers, with optional output of the pixel-level combination IDs. Provides raster display using base 'graphics'. Bindings to a subset of the 'OGR' API are also included for managing vector data sources. Bindings to a subset of the Virtual Systems Interface ('VSI') are also included to support operations on 'GDAL' virtual file systems. These are general utility functions that abstract file system operations on URLs, cloud storage services, 'Zip'/'GZip'/'7z'/'RAR' archives, and in-memory files. 'gdalraster' may be useful in applications that need scalable, low-level I/O, or prefer a direct 'GDAL' API.

Maintained by Chris Toney. Last updated 15 hours ago.

gdal geospatial raster vector cpp

0.5 match 42 stars 9.50 score 32 scripts 3 dependents

bioc

tenXplore:ontological exploration of scRNA-seq of 1.3 million mouse neurons from 10x genomics

Perform ontological exploration of scRNA-seq of 1.3 million mouse neurons from 10x genomics.

Maintained by VJ Carey. Last updated 5 months ago.

immunooncology dimensionreduction principalcomponent transcriptomics singlecell

1.1 match 4.18 score 7 scripts

bioc

cmapR:CMap Tools in R

The Connectivity Map (CMap) is a massive resource of perturbational gene expression profiles built by researchers at the Broad Institute and funded by the NIH Library of Integrated Network-Based Cellular Signatures (LINCS) program. Please visit https://clue.io for more information. The cmapR package implements methods to parse, manipulate, and write common CMap data objects, such as annotated matrices and collections of gene sets.

Maintained by Ted Natoli. Last updated 5 months ago.

dataimport datarepresentation geneexpression bioconductor bioinformatics cmap

0.5 match 89 stars 8.85 score 298 scripts

neonira

wyz.code.metaTesting:Wizardry Code Meta Testing

Meta testing is the ability to test a function without having to provide its parameter values. Those values are generated based on the semantic naming of parameters, as introduced by package 'wyz.code.offensiveProgramming'. The value generation logic can be extended with your own data types and generation schemes to meet your most specific requirements and to cover a wide variety of usages, from general use cases to very specific ones. With meta testing, it becomes easier to generate stress test campaigns, non-regression test campaigns and robustness test campaigns, as generated tests can be saved and reused from session to session. The main benefits of using 'wyz.code.metaTesting' are the ability to discover valid and invalid function parameter combinations, the ability to infer valid parameter values, and smart summaries that allow you to focus on dysfunctional cases.

Maintained by Fabien Gelineau. Last updated 1 year ago.

2.3 match 2.00 score

wtwt5237

DisHet:Estimate the Gene Expression Levels and Component Proportions of the Normal, Stroma (Immune) and Tumor Components of Bulk Tumor Samples

Model cell type heterogeneity of bulk renal cell carcinoma. The observed gene expression in a bulk tumor sample is modeled by a log-normal distribution whose location parameter is structured as a linear combination of the component-specific gene expressions.

Maintained by Tao Wang. Last updated 7 years ago.

3.4 match 1.30 score 3 scripts

jmcurran

dafs:Data Analysis for Forensic Scientists

Data and miscellanea to support the book "Introduction to Data Analysis with R for Forensic Scientists". This book was written by James Curran and published by CRC Press in 2010 (ISBN: 978-1-4200-8826-7).

Maintained by James Curran. Last updated 3 years ago.

4.0 match 1 stars 1.08 score 12 scripts

bioc

roastgsa:Rotation based gene set analysis

This package implements a variety of functions useful for gene set analysis using rotations to approximate the null distribution. It contributes implementations of seven test statistic scores that can be used with different goals and interpretations. Several functions are available to complement the statistical results with graphical representations.

Maintained by Adria Caballe. Last updated 5 months ago.

microarray preprocessing normalization geneexpression survival transcription sequencing transcriptomics bayesian clustering regression rnaseq micrornaarray mrnamicroarray functionalgenomics systemsbiology immunooncology differentialexpression genesetenrichment batcheffect multiplecomparison qualitycontrol timecourse metabolomics proteomics epigenetics cheminformatics exonarray onechannel twochannel proprietaryplatforms cellbiology biomedicalinformatics alternativesplicing differentialsplicing dataimport pathways

1.9 match 2.30 score

cran

GenomicSig:Computation of Genomic Signatures

Genomic signatures represent unique features within a species' DNA, enabling the differentiation of species and offering broad applications across various fields. This package provides essential tools for calculating these specific signatures, streamlining the process for researchers and offering a comprehensive and time-saving solution for genomic analysis. The amino acid contents are identified based on the work published by Sandberg et al. (2003) <doi:10.1016/s0378-1119(03)00581-x> and Xiao et al. (2015) <doi:10.1093/bioinformatics/btv042>. The Average Mutual Information Profiles (AMIP) values are calculated based on the work of Bauer et al. (2008) <doi:10.1186/1471-2105-9-48>. The Chaos Game Representation (CGR) plot visualization was done based on the work of Deschavanne et al. (1999) <doi:10.1093/oxfordjournals.molbev.a026048> and Jeffrey et al. (1990) <doi:10.1093/nar/18.8.2163>. The GC content is calculated based on the work published by Nakabachi et al. (2006) <doi:10.1126/science.1134196> and Barbu et al. (1956) <https://pubmed.ncbi.nlm.nih.gov/13363015>. The Oligonucleotide Frequency Derived Error Gradient (OFDEG) values are computed based on the work published by Saeed et al. (2009) <doi:10.1186/1471-2164-10-S3-S10>. The Relative Synonymous Codon Usage (RSCU) values are calculated based on the work published by Elek (2018) <https://urn.nsk.hr/urn:nbn:hr:217:686131>.

Maintained by Anu Sharma. Last updated 6 months ago.

4.1 match 1.00 score

bioc

TypeInfo:Optional Type Specification Prototype

A prototype for a mechanism for specifying the types of parameters and the return value for an R function. This is meta-information that can be used to generate stubs for servers and various interfaces to these functions. Additionally, the arguments in a call to a typed function can be validated using the type specifications. Types can be specified either i) by class name, using either inheritance - is(x, className), or strict instance of - class(x) %in% className, or ii) as a dynamic test given as an R expression which is evaluated at run-time. More precise information and interesting tests can be expressed via ii), and such a test is typically more meaningful, but it is harder to use as meta-data because it requires more effort to interpret and is only available at run-time.

Maintained by Duncan Temple Lang. Last updated 5 months ago.

infrastructure

1.7 match 2.30 score 5 scripts
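
The TypeInfo description distinguishes a check by inheritance from a strict instance-of check. As a rough analogy only, the difference can be sketched in Python rather than R; the class and function names below are hypothetical and are not part of the TypeInfo package.

```python
# Analogy only: Python stand-ins for the two class-based checks the
# description mentions, is(x, className) and class(x) %in% className.
# All names here are hypothetical illustrations.

class Animal:
    pass

class Dog(Animal):
    pass

def matches_by_inheritance(value, cls):
    """Accept instances of cls or of any subclass."""
    return isinstance(value, cls)

def matches_strictly(value, cls):
    """Accept only direct instances of cls, not of its subclasses."""
    return type(value) is cls

rex = Dog()
inherits_ok = matches_by_inheritance(rex, Animal)  # subclass instance passes
strict_ok = matches_strictly(rex, Animal)          # subclass instance is rejected
```

The inheritance check is the more permissive of the two, which is why a type specification would need to say which one it intends.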

bioc

scde:Single Cell Differential Expression

The scde package implements a set of statistical methods for analyzing single-cell RNA-seq data. scde fits individual error models for single-cell RNA-seq measurements. These models can then be used for assessment of differential expression between groups of cells, as well as other types of analysis. The scde package also contains the pagoda framework which applies pathway and gene set overdispersion analysis to identify and characterize putative cell subpopulations based on transcriptional signatures. The overall approach to the differential expression analysis is detailed in the following publication: "Bayesian approach to single-cell differential expression analysis" (Kharchenko PV, Silberstein L, Scadden DT, Nature Methods, doi: 10.1038/nmeth.2967). The overall approach to subpopulation identification and characterization is detailed in the following pre-print: "Characterizing transcriptional heterogeneity through pathway and gene set overdispersion analysis" (Fan J, Salathia N, Liu R, Kaeser G, Yung Y, Herman J, Kaper F, Fan JB, Zhang K, Chun J, and Kharchenko PV, Nature Methods, doi:10.1038/nmeth.3734).

Maintained by Evan Biederstedt. Last updated 5 months ago.

immunooncology rnaseq statisticalmethod differentialexpression bayesian transcription software analysis bioinformatics heterogenity ngs single-cell transcriptomics openblas cpp openmp

0.5 match 173 stars 7.53 score 141 scripts

cozygene

BisqueRNA:Decomposition of Bulk Expression with Single-Cell Sequencing

Provides tools to accurately estimate cell type abundances from heterogeneous bulk expression. A reference-based method utilizes single-cell information to generate a signature matrix and transformation of bulk expression for accurate regression based estimates. A marker-based method utilizes known cell-specific marker genes to measure relative abundances across samples. For more details, see Jew and Alvarez et al (2019) <doi:10.1101/669911>.

Maintained by Brandon Jew. Last updated 4 years ago.

0.5 match 72 stars 6.95 score 124 scripts

manalytics

stppSim:Spatiotemporal Point Patterns Simulation

Generates artificial point patterns marked by their spatial and temporal signatures. The resulting point cloud may exhibit inherent interactions between both signatures. The simulation integrates microsimulation (Holm, E., (2017)<doi:10.1002/9781118786352.wbieg0320>) and agent-based models (Bonabeau, E., (2002)<doi:10.1073/pnas.082080899>), beginning with the configuration of movement characteristics for the specified agents (referred to as 'walkers') and their interactions within the simulation environment. These interactions (Quaglietta, L. and Porto, M., (2019)<doi:10.1186/s40462-019-0154-8>) result in specific spatiotemporal patterns that can be visualized, analyzed, and used for various analytical purposes. Given the growing scarcity of detailed spatiotemporal data across many domains, this package provides an alternative data source for applications in social and life sciences.

Maintained by Monsuru Adepeju. Last updated 8 months ago.

0.8 match 4 stars 4.60 score 5 scripts

appsilon

box.lsp:Provides 'box' Compatibility for 'languageserver'

A 'box' compatible custom language parser for the 'languageserver' package to provide completion and signature hints in code editors.

Maintained by Ricardo Rodrigo Basa. Last updated 6 months ago.

0.5 match 5 stars 6.48 score 2 scripts 1 dependent

cran

ROCSI:Receiver Operating Characteristic Based Signature Identification

Optimal linear combination predictive signatures for maximizing the area between two Receiver Operating Characteristic (ROC) curves (treatment vs. control).

Maintained by Xin Huang. Last updated 3 years ago.

3.4 match 1.00 score

carmonalab

GeneNMF:Non-Negative Matrix Factorization for Single-Cell Omics

A collection of methods to extract gene programs from single-cell gene expression data using non-negative matrix factorization (NMF). 'GeneNMF' contains functions to directly interact with the 'Seurat' toolkit and derive interpretable gene program signatures.

Maintained by Massimo Andreatta. Last updated 10 days ago.

0.5 match 102 stars 6.63 score 12 scripts

cran

msig:An R Package for Exploring Molecular Signatures Database

The Molecular Signatures Database ('MSigDB') is one of the most widely used and comprehensive databases of gene sets for performing gene set enrichment analysis <doi:10.1016/j.cels.2015.12.004>. The 'msig' package provides powerful, easy-to-use and flexible query functions for the 'MSigDB' database. There are 2 query modes in the 'msig' package: online query and local query. Both queries contain 2 steps: gene set name and gene. The online search is divided into 2 modes: registered search and non-registered browse. For registered search, the email address you registered with should be provided. Local queries can be made from a local database, which can be updated with the msig_update() function.

Maintained by Jing Zhang. Last updated 4 years ago.

3.4 match 1.00 score

dipterix

rutabaga:Simple R Tools for Analysis and Visualizations

Provides functions (R, C++) to speed up array calculations. Includes various tools for prettier visualizations via R base plots.

Maintained by Zhengjia Wang. Last updated 3 years ago.

2.0 match 1.70 score 2 scripts

bioc

smartid:Scoring and Marker Selection Method Based on Modified TF-IDF

This package enables automated selection of group-specific signatures, especially for rare populations. The package is developed for generating specific lists of signature genes based on modified Term Frequency-Inverse Document Frequency (TF-IDF) methods. It can also be used as a new gene-set scoring method or data transformation method. Multiple visualization functions are implemented in this package.

Maintained by Jinjin Chen. Last updated 4 months ago.

software geneexpression transcriptomics

0.8 match 1 stars 4.30 score 2 scripts
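
For readers unfamiliar with the TF-IDF idea that the smartid description refers to, here is a minimal sketch of plain (unmodified) TF-IDF applied to a tiny gene-by-cell count matrix. It is not the package's modified method; the function name `tf_idf`, the toy counts, and the choice of an unsmoothed `log(N / n)` inverse frequency are assumptions made for illustration.

```python
import math

def tf_idf(counts):
    """Plain TF-IDF for a gene-by-cell count matrix (hypothetical helper).

    counts: list of rows, one per gene, each a list of per-cell counts.
    TF  = gene count divided by the total count of that cell.
    IDF = log(number of cells / number of cells expressing the gene).
    """
    n_cells = len(counts[0])
    cell_totals = [sum(row[c] for row in counts) for c in range(n_cells)]
    scores = []
    for row in counts:
        n_expressing = sum(1 for x in row if x > 0)
        idf = math.log(n_cells / n_expressing) if n_expressing else 0.0
        scores.append([
            (x / cell_totals[c] if cell_totals[c] else 0.0) * idf
            for c, x in enumerate(row)
        ])
    return scores

# Toy data (made up): gene_a is expressed in every cell, gene_b in one cell.
counts = [
    [5, 4, 6, 5],  # gene_a
    [0, 0, 0, 8],  # gene_b
]
scores = tf_idf(counts)
```

In this toy case the ubiquitous gene gets a score of zero in every cell, while the gene seen in a single cell scores highly there, which is the intuition behind using TF-IDF-style weights to pick markers for rare groups.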

kurthornik

tm.plugin.mail:Text Mining E-Mail Plug-in

A plug-in for the tm text mining framework providing mail handling functionality.

Maintained by Kurt Hornik. Last updated 6 months ago.

1.9 match 1.72 score 26 scripts

federicogiorgi

corto:Inference of Gene Regulatory Networks

We present 'corto' (Correlation Tool), a simple package to infer gene regulatory networks and visualize master regulators from gene expression data using DPI (Data Processing Inequality) and bootstrapping to recover edges. An initial step is performed to calculate all significant edges between a list of source nodes (centroids) and target genes. Then all triplets containing two centroids and one target are tested in a DPI step which removes edges. A bootstrapping process then calculates the robustness of the network, eventually re-adding edges previously removed by DPI. The algorithm has been optimized to run outside a computing cluster, using a fast correlation implementation. The package finally provides functions to calculate network enrichment analysis from RNA-Seq and ATAC-Seq signatures as described in the article by Giorgi lab (2020) <doi:10.1093/bioinformatics/btaa223>.

Maintained by Federico M. Giorgi. Last updated 2 years ago.

0.5 match 20 stars 6.25 score 59 scripts

syedhaider5

iDOS:Integrated Discovery of Oncogenic Signatures

A method to integrate molecular profiles of cancer patients (gene copy number and mRNA abundance) to identify candidate gain of function alterations. These candidate alterations can subsequently be tested further to discover cancer driver alterations. Briefly, this method tests for genomic correlates of mRNA dysregulation and prioritises those where DNA gains/amplifications are associated with elevated mRNA expression of the same gene. For details, see Haider S et al. (2016) "Genomic alterations underlie a pan-cancer metabolic shift associated with tumour hypoxia", Genome Biology, <https://pubmed.ncbi.nlm.nih.gov/27358048/>.

Maintained by Syed Haider. Last updated 1 year ago.

3.1 match 1.00 score 10 scripts

sigminer:Extract, Analyze and Visualize Mutational Signatures for Genomic Variations

YAPSA:Yet Another Package for Signature Analysis

signifinder:Collection and implementation of public transcriptional cancer signatures

aws.signature:Amazon Web Services Request Signatures

TBSignatureProfiler:Profile RNA-Seq Data Using TB Pathway Signatures

musicatk:Mutational Signature Comprehensive Analysis Toolkit

genefu:Computation of Gene Expression-Based Signatures in Breast Cancer

MutationalPatterns:Comprehensive genome-wide analysis of mutational processes

UCell:Rank-based signature enrichment analysis for single-cell data

decompTumor2Sig:Decomposition of individual tumors into mutational signatures by signature refitting

singscore:Rank-based single-sample gene set scoring method

maftools:Summarize, Analyze and Visualize MAF Files

mutSignatures:Decipher Mutational Signatures from Somatic Mutational Catalogs

HiLDA:Conducting statistical inference on comparing the mutational exposures of mutational signatures by using hierarchical latent Dirichlet allocation

PharmacoGx:Analysis of Large-Scale Pharmacogenomic Data

timetk:A Tool Kit for Working with Time Series

bugsigdbr:R-side access to published microbial signatures from BugSigDB

openssl:Toolkit for Encryption, Signatures and Certificates Based on OpenSSL

QFASA:Quantitative Fatty Acid Signature Analysis

hacksig:A Tidy Framework to Hack Gene Expression Signatures

cola:A Framework for Consensus Partitioning

SomaticSignatures:Somatic Signatures

qfasar:Quantitative Fatty Acid Signature Analysis in R

BulkSignalR:Infer Ligand-Receptor Interactions from bulk expression (transcriptomics/proteomics) data, or spatial transcriptomics

TCGAbiolinks:TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data

ASSIGN:Adaptive Signature Selection and InteGratioN (ASSIGN)

ADAPTS:Automated Deconvolution Augmentation of Profiles for Tissue Specific Cells

cosmicsig:Mutational Signatures from COSMIC (Catalogue of Somatic Mutations in Cancer)

biosigner:Signature discovery from omics data

mastR:Markers Automated Screening Tool in R

drugfindR:Investigate iLINCS for candidate repurposable drugs

SigsPack:Mutational Signature Estimation for Single Samples

scGate:Marker-Based Cell Type Purification for Single-Cell Sequencing Data

SigCheck:Check a gene signature's prognostic performance against random signatures, known signatures, and permuted data/metadata

UCSCXenaShiny:Interactive Analysis of UCSC Xena Data

selectKSigs:Selecting the number of mutational signatures using a perplexity-based measure and cross-validation

signeR:Empirical Bayesian approach to mutational signature discovery

reticulate:Interface to 'Python'

BatchQC:Batch Effects Quality Control Software

whitebox:'WhiteboxTools' R Frontend

viper:Virtual Inference of Protein-activity by Enriched Regulon analysis

AUCell:AUCell: Analysis of 'gene set' activity in single-cell RNA-seq data (e.g. identify cells with specific gene signatures)

TMSig:Tools for Molecular Signatures

pdc:Permutation Distribution Clustering

ProFAST:Probabilistic Factor Analysis for Spatially-Aware Dimension Reduction

granulator:Rapid benchmarking of methods for *in silico* deconvolution of bulk RNA-seq data

easier:Estimate Systems Immune Response from RNA-seq data

MAST:Model-based Analysis of Single Cell Transcriptomics

PDATK:Pancreatic Ductal Adenocarcinoma Tool-Kit

gpg:GNU Privacy Guard for R

clifford:Arbitrary Dimensional Clifford Algebras

signatureSearch:Environment for Gene Expression Searching Combined with Functional Enrichment Analysis

sigora:Signature Overrepresentation Analysis

ICAMS:In-Depth Characterization and Analysis of Mutational Signatures ('ICAMS')

NanoStringNCTools:NanoString nCounter Tools

SUITOR:Selecting the number of mutational signatures through cross-validation

sigQC:Quality Control Metrics for Gene Signatures

doBy:Groupwise Statistics, LSmeans, Linear Estimates, Utilities

scMappR:Single Cell Mapper

supersigs:Supervised mutational signatures

Rdpack:Update and Manipulate Rd Documentation Objects

AnVILBase:Generic functions for interacting with the AnVIL ecosystem

ROI:R Optimization Infrastructure

rassta:Raster-Based Spatial Stratification Algorithms

sodium:A Modern and Easy-to-Use Crypto Library

DeconRNASeq:Deconvolution of Heterogeneous Tissue Samples for mRNA-Seq data

mSigTools:Mutational Signature Analysis Tools

GeomxTools:NanoString GeoMx Tools

BRETIGEA:Brain Cell Type Specific Gene Expression Analysis

hermes:Preprocessing, analyzing, and reporting of RNA-seq data

xcore:xcore expression regulators inference

motifStack:Plot stacked logos for single or multiple DNA, RNA and amino acid sequence

CelliD:Unbiased Extraction of Single Cell gene signatures using Multiple Correspondence Analysis

GeneExpressionSignature:Gene Expression Signature based Similarity Metric

clustermole:Unbiased Single-Cell Transcriptomic Data Cell Type Identification

BioQC:Detect tissue heterogeneity in expression profiles with gene sets

GSgalgoR:An Evolutionary Framework for the Identification and Study of Prognostic Gene Expression Signatures in Cancer

SparseSignatures:SparseSignatures

sig:Print Function Signatures

motif:Local Pattern Analysis

granulator:Rapid benchmarking of methods for in silico deconvolution of bulk RNA-seq data