Showing 177 of 177 results
bioc
onlineFDR:Online error rate control
This package allows users to control the false discovery rate (FDR) or familywise error rate (FWER) for online multiple hypothesis testing, where hypotheses arrive in a stream. In this framework, a null hypothesis is rejected based on the evidence against it and on the previous rejection decisions.
Maintained by David S. Robertson. Last updated 5 months ago.
multiplecomparison software statisticalmethod error-rate-control fdr fwer hypothesis-testing cpp
37.5 match 14 stars 6.88 score 26 scripts
bioc
fdrame:FDR adjustments of Microarray Experiments (FDR-AME)
This package contains two main functions. The first is fdr.ma, which takes a normalized expression data array and an experimental design and computes adjusted p-values. It returns the FDR-adjusted p-values and plots, according to the methods described in Reiner, Yekutieli and Benjamini (2002). The second is fdr.gui(), which creates a simple graphical user interface to access fdr.ma.
Maintained by Effi Kenigsberg. Last updated 5 months ago.
microarray differentialexpression multiplecomparison
34.4 match 3.30 score
bioc
SWATH2stats:Transform and Filter SWATH Data for Statistical Packages
This package is intended to transform SWATH data from the OpenSWATH software into a format readable by other statistics packages while performing filtering, annotation and FDR estimation.
Maintained by Peter Blattmann. Last updated 5 months ago.
proteomics annotation experimentaldesign preprocessing massspectrometry immunooncology
17.0 match 1 stars 6.30 score 22 scripts
bioc
OCplus:Operating characteristics plus sample size and local fdr for microarray experiments
This package characterizes the operating characteristics of a microarray experiment, i.e. the trade-off between false discovery rate and the power to detect truly regulated genes. The package includes tools both for planned experiments (for sample size assessment) and for already collected data (identification of differentially expressed genes).
Maintained by Alexander Ploner. Last updated 5 months ago.
microarray differentialexpression multiplecomparison
25.3 match 4.08 score 2 scripts
bioc
clusterProfiler:A universal enrichment tool for interpreting omics data
This package supports functional characteristics of both coding and non-coding genomics data for thousands of species with up-to-date gene annotation. It provides a universal interface for gene functional annotation from a variety of sources and thus can be applied in diverse scenarios. It provides a tidy interface to access, manipulate, and visualize enrichment results to help users achieve efficient data interpretation. Datasets obtained from multiple treatments and time points can be analyzed and compared in a single run, easily revealing functional consensus and differences among distinct conditions.
Maintained by Guangchuang Yu. Last updated 4 months ago.
annotation clustering genesetenrichment go kegg multiplecomparison pathways reactome visualization enrichment-analysis gsea
4.8 match 1.1k stars 17.03 score 11k scripts 48 dependents
moviedo5
fda.usc:Functional Data Analysis and Utilities for Statistical Computing
Routines for exploratory and descriptive analysis of functional data such as depth measurements, atypical curves detection, regression models, supervised classification, unsupervised classification and functional analysis of variance.
Maintained by Manuel Oviedo de la Fuente. Last updated 4 months ago.
functional-data-analysis fortran
7.4 match 12 stars 9.72 score 560 scripts 22 dependents
cran
ssize.fdr:Sample Size Calculations for Microarray Experiments
Functions that calculate appropriate sample sizes for one-sample t-tests, two-sample t-tests, and F-tests for microarray experiments based on desired power while controlling for false discovery rates. For all tests, the standard deviations (variances) among genes can be assumed fixed or random. This is also true for effect sizes among genes in one-sample and two-sample experiments. Functions also output a chart of power versus sample size, a table of power at different sample sizes, and a table of critical test values at different sample sizes.
Maintained by Megan Orr. Last updated 3 years ago.
37.5 match 1 stars 1.78 score 2 dependents
uscbiostats
cit:Causal Inference Test
A likelihood-based hypothesis testing approach is implemented for assessing causal mediation. Described in Millstein, Chen, and Breton (2016), <DOI:10.1093/bioinformatics/btw135>, it could be used to test for mediation of a known causal association between a DNA variant, the 'instrumental variable', and a clinical outcome or phenotype by gene expression or DNA methylation, the potential mediator. Another example would be testing mediation of the effect of a drug on a clinical outcome by the molecular target. The hypothesis test generates a p-value or permutation-based FDR value with confidence intervals to quantify uncertainty in the causal inference. The outcome can be represented by either a continuous or binary variable, the potential mediator is continuous, and the instrumental variable can be continuous or binary and is not limited to a single variable but may be a design matrix representing multiple variables.
Maintained by Joshua Millstein. Last updated 9 months ago.
16.1 match 2 stars 3.81 score 32 scripts
yonghui-ni
FDRsamplesize2:Computing Power and Sample Size for the False Discovery Rate in Multiple Applications
Defines a collection of functions to compute average power and sample size for studies that use the false discovery rate as the final measure of statistical significance. A three-rectangle approximation method of a p-value histogram is proposed to derive a formula to compute the statistical power for analyses that involve the FDR. The methodology paper of this package is under review.
Maintained by Yonghui Ni. Last updated 1 years ago.
33.7 match 1.70 score
dsy109
mixtools:Tools for Analyzing Finite Mixture Models
Analyzes finite mixture models for various parametric and semiparametric settings. This includes mixtures of parametric distributions (normal, multivariate normal, multinomial, gamma), various Reliability Mixture Models (RMMs), mixtures-of-regressions settings (linear regression, logistic regression, Poisson regression, linear regression with changepoints, predictor-dependent mixing proportions, random effects regressions, hierarchical mixtures-of-experts), and tools for selecting the number of components (bootstrapping the likelihood ratio test statistic, mixturegrams, and model selection criteria). Bayesian estimation of mixtures-of-linear-regressions models is available as well as a novel data depth method for obtaining credible bands. This package is based upon work supported by the National Science Foundation under Grant No. SES-0518772 and the Chan Zuckerberg Initiative: Essential Open Source Software for Science (Grant No. 2020-255193).
Maintained by Derek Young. Last updated 9 months ago.
mixture-models mixture-of-experts semiparametric-regression
4.9 match 20 stars 11.34 score 1.4k scripts 56 dependents
bioc
categoryCompare:Meta-analysis of high-throughput experiments using feature annotations
Calculates significant annotations (categories) in each of two (or more) feature (i.e. gene) lists, determines the overlap between the annotations, and returns graphical and tabular data about the significant annotations and the combinations of feature lists in which the annotations were found to be significant. Interactive exploration is facilitated through the use of RCytoscape (heavily suggested).
Maintained by Robert M. Flight. Last updated 5 months ago.
annotation go multiplecomparison pathways geneexpression bioconductor
8.2 match 6 stars 6.68 score
millstei
fdrci:Permutation-Based FDR Point and Confidence Interval Estimation
FDR functions for permutation-based estimators, including pi0 as well as FDR confidence intervals. The confidence intervals account for dependencies between tests by the incorporation of an overdispersion parameter, which is estimated from the permuted data. Also included are options for an analog parametric approach.
Maintained by Joshua Millstein. Last updated 2 years ago.
19.1 match 2.78 score 12 scripts
bioc
Mergeomics:Integrative network analysis of omics data
The Mergeomics pipeline serves as a flexible framework for integrating multidimensional omics-disease associations, functional genomics, canonical pathways and gene-gene interaction networks to generate mechanistic hypotheses. It includes two main parts, 1) Marker set enrichment analysis (MSEA); 2) Weighted Key Driver Analysis (wKDA).
Maintained by Zeyneb Kurt. Last updated 5 months ago.
11.9 match 4.30 score 8 scripts
bioc
OLIN:Optimized local intensity-dependent normalisation of two-color microarrays
Functions for normalisation of two-color microarrays by optimised local regression and for detection of artefacts in microarray data
Maintained by Matthias Futschik. Last updated 5 months ago.
microarray twochannel qualitycontrol preprocessing visualization
9.0 match 4.78 score 2 scripts 1 dependents
disohda
DiscreteFDR:FDR Based Multiple Testing Procedures with Adaptation for Discrete Tests
Implementations of the multiple testing procedures for discrete tests described in the paper Döhler, Durand and Roquain (2018) "New FDR bounds for discrete and heterogeneous tests" <doi:10.1214/18-EJS1441>. The main procedures of the paper (HSU and HSD), their adaptive counterparts (AHSU and AHSD), and the HBR variant are available and are coded to take as input the results of a test procedure from package 'DiscreteTests', or a set of observed p-values and their discrete support under their nulls. A shortcut function to obtain such p-values and supports is also provided, along with a wrapper that applies discrete procedures directly to data.
Maintained by Florian Junge. Last updated 7 days ago.
6.9 match 3 stars 6.02 score 16 scripts 2 dependents
murraymegan
FDRestimation:Estimate, Plot, and Summarize False Discovery Rates
The user can directly compute and display false discovery rates from inputted p-values or z-scores under a variety of assumptions. p.fdr() computes FDRs, adjusted p-values and decision reject vectors from inputted p-values or z-values. get.pi0() estimates the proportion of data that are truly null. plot.p.fdr() plots the FDRs, adjusted p-values, and the raw p-values points against their rejection threshold lines.
Maintained by Megan Murray. Last updated 3 years ago.
11.2 match 6 stars 3.65 score 15 scripts
bioc
PLPE:Local Pooled Error Test for Differential Expression with Paired High-throughput Data
This package performs tests for paired high-throughput data.
Maintained by Soo-heang Eo. Last updated 5 months ago.
proteomics microarray differentialexpression
12.1 match 3.30 score 7 scripts
allenzhuaz
FixSeqMTP:Fixed Sequence Multiple Testing Procedures
Several generalized / directional Fixed Sequence Multiple Testing Procedures (FSMTPs) are developed for testing a sequence of pre-ordered hypotheses while controlling the FWER, FDR and Directional Error (mdFWER). All three FWER controlling generalized FSMTPs are designed under arbitrary dependence, which allow any number of acceptances. Two FDR controlling generalized FSMTPs are respectively designed under arbitrary dependence and independence, which allow more but a given number of acceptances. Two mdFWER controlling directional FSMTPs are respectively designed under arbitrary dependence and independence, which can also make directional decisions based on the signs of the test statistics. The main functions for each proposed generalized / directional FSMTPs are designed to calculate adjusted p-values and critical values, respectively. For users' convenience, the functions also provide the output option for printing decision rules.
Maintained by Yalin Zhu. Last updated 6 years ago.
multiple-testing pre-order sequential-testing
11.7 match 3 stars 3.22 score 11 scripts
bioc
TPP2D:Detection of ligand-protein interactions from 2D thermal profiles (DLPTP)
Detection of ligand-protein interactions from 2D thermal profiles (DLPTP). Performs an FDR-controlled analysis of 2D-TPP experiments by functional analysis of dose-response curves across temperatures.
Maintained by Nils Kurzawa. Last updated 5 months ago.
8.9 match 4.20 score 16 scripts
bioc
csaw:ChIP-Seq Analysis with Windows
Detection of differentially bound regions in ChIP-seq data with sliding windows, with methods for normalization and proper FDR control.
Maintained by Aaron Lun. Last updated 2 months ago.
multiplecomparison chipseq normalization sequencing coverage genetics annotation differentialpeakcalling curl bzip2 xz-utils zlib cpp
4.4 match 8.32 score 498 scripts 7 dependents
bioc
qvalue:Q-value estimation for false discovery rate control
This package takes a list of p-values resulting from the simultaneous testing of many hypotheses and estimates their q-values and local FDR values. The q-value of a test measures the proportion of false positives incurred (called the false discovery rate) when that particular test is called significant. The local FDR measures the posterior probability the null hypothesis is true given the test's p-value. Various plots are automatically generated, allowing one to make sensible significance cut-offs. Several mathematical results have recently been shown on the conservative accuracy of the estimated q-values from this software. The software can be applied to problems in genomics, brain imaging, astrophysics, and data mining.
Maintained by John D. Storey. Last updated 5 months ago.
2.5 match 114 stars 14.06 score 3.0k scripts 139 dependents
bioc
treeclimbR:An algorithm to find optimal signal levels in a tree
The arrangement of hypotheses in a hierarchical structure appears in many research fields and often indicates different resolutions at which data can be viewed. This raises the question of which resolution level the signal should best be interpreted on. treeclimbR provides a flexible method to select optimal resolution levels (potentially different levels in different parts of the tree), rather than cutting the tree at an arbitrary level. treeclimbR uses a tuning parameter to generate candidate resolutions and from these selects the optimal one.
Maintained by Charlotte Soneson. Last updated 3 months ago.
statisticalmethod cellbasedassays
5.0 match 20 stars 7.00 score 45 scripts
protviz
prozor:Minimal Protein Set Explaining Peptide Spectrum Matches
Determine minimal protein set explaining peptide spectrum matches. Utility functions for creating fasta amino acid databases with decoys and contaminants. Peptide false discovery rate estimation for target decoy search results on psm, precursor, peptide and protein level. Computing dynamic swath window sizes based on MS1 or MS2 signal distributions.
Maintained by Witold Wolski. Last updated 4 months ago.
software massspectrometry proteomics experimenthubsoftware
7.9 match 6 stars 4.45 score 93 scripts
snoweye
MixfMRI:Mixture fMRI Clustering Analysis
Utilizes model-based (unsupervised) clustering for functional magnetic resonance imaging (fMRI) data. The developed methods (Chen and Maitra (2023) <doi:10.1002/hbm.26425>) include 2D and 3D clustering analyses (for p-values with voxel locations) and segmentation analyses (for p-values alone) for fMRI data, where p-values indicate the significance level of activation in response to the stimulus of interest. The analyses mainly identify active voxels/signals associated with normal brain behaviors. Analysis pipelines (R scripts) utilizing this package (see examples in 'inst/workflow/') are also implemented with high-performance techniques.
Maintained by Wei-Chen Chen. Last updated 5 months ago.
8.1 match 2 stars 4.26 score 18 scripts
andrewzm
EFDR:Wavelet-Based Enhanced FDR for Detecting Signals from Complete or Incomplete Spatially Aggregated Data
Enhanced False Discovery Rate (EFDR) is a tool to detect anomalies in an image. The image is first transformed into the wavelet domain in order to decorrelate any noise components, following which the coefficients at each resolution are standardised. Statistical tests (in a multiple hypothesis testing setting) are then carried out to find the anomalies. The power of EFDR exceeds that of standard FDR, which would carry out tests on every wavelet coefficient: EFDR chooses which wavelets to test based on a criterion described in Shen et al. (2002). The package also provides elementary tools to interpolate spatially irregular data onto a grid of the required size. The work is based on Shen, X., Huang, H.-C., and Cressie, N. 'Nonparametric hypothesis testing for a spatial signal.' Journal of the American Statistical Association 97.460 (2002): 1122-1140.
Maintained by Andrew Zammit-Mangion. Last updated 2 years ago.
7.2 match 5 stars 4.74 score 22 scripts
kornl
mutoss:Unified Multiple Testing Procedures
Designed to ease the application and comparison of multiple hypothesis testing procedures for FWER, gFWER, FDR and FDX. Methods are standardized and usable by the accompanying 'mutossGUI'.
Maintained by Kornelius Rohmeyer. Last updated 12 months ago.
3.9 match 4 stars 8.44 score 24 scripts 16 dependents
bioc
iCOBRA:Comparison and Visualization of Ranking and Assignment Methods
This package provides functions for calculation and visualization of performance metrics for evaluation of ranking and binary classification (assignment) methods. Various types of performance plots can be generated programmatically. The package also contains a shiny application for interactive exploration of results.
Maintained by Charlotte Soneson. Last updated 3 months ago.
3.6 match 14 stars 8.86 score 192 scripts 1 dependents
ivis4ml
fssemR:Fused Sparse Structural Equation Models to Jointly Infer Gene Regulatory Network
An optimizer of Fused-Sparse Structural Equation Models, which is the state of the art jointly fused sparse maximum likelihood function for structural equation models proposed by Xin Zhou and Xiaodong Cai (2018 <doi:10.1101/466623>).
Maintained by Xin Zhou. Last updated 3 years ago.
6.6 match 4 stars 4.85 score 35 scripts
bioc
PICS:Probabilistic inference of ChIP-seq
Probabilistic inference of ChIP-Seq using an empirical Bayes mixture model approach.
Maintained by Renan Sauteraud. Last updated 5 months ago.
clustering visualization sequencing chipseq gsl
5.8 match 5.48 score 7 scripts 1 dependents
bnaras
pamr:Pam: Prediction Analysis for Microarrays
Some functions for sample classification in microarrays.
Maintained by Balasubramanian Narasimhan. Last updated 9 months ago.
3.9 match 7.90 score 256 scripts 14 dependents
bioc
LPE:Methods for analyzing microarray data using Local Pooled Error (LPE) method
This LPE library is used to do significance analysis of microarray data with a small number of replicates. It uses resampling-based FDR adjustment, and gives less conservative results than traditional 'BH' or 'BY' procedures. Data accepted is raw data in txt format from MAS4, MAS5 or dChip. Data can also be supplied after normalization. The LPE library is primarily used for analyzing data between two conditions. To use it for paired data, see the LPEP library. For using LPE in multiple conditions, use the HEM library.
Maintained by Nitin Jain. Last updated 5 months ago.
microarray differentialexpression
6.6 match 4.58 score 21 scripts 1 dependents
allenzhuaz
MHTdiscrete:Multiple Hypotheses Testing for Discrete Data
A comprehensive tool for almost all existing multiple testing methods for discrete data. The package also provides some novel multiple testing procedures controlling FWER/FDR for discrete data. Given discrete p-values and their domains, the [method].p.adjust function returns adjusted p-values, which can be compared with the nominal significance level alpha to make decisions. For users' convenience, the functions also provide the output option for printing decision rules.
Maintained by Yalin Zhu. Last updated 6 years ago.
adjustment-computations benjamini-hochberg bonferroni discrete-distributions multiple-testing-correction
8.7 match 1 stars 3.27 score 37 scripts
thermostats
RVA:RNAseq Visualization Automation
Automate downstream visualization & pathway analysis in RNAseq analysis. 'RVA' is a collection of functions that efficiently visualize RNAseq differential expression analysis results from summary statistics tables. It also uses Fisher's exact test to evaluate gene set or pathway enrichment in a convenient and efficient manner.
Maintained by Xingpeng Li. Last updated 3 years ago.
5.0 match 9 stars 5.65 score 6 scripts
schw4b
DGM:Dynamic Graphical Models
Dynamic graphical models for multivariate time series data to estimate directed dynamic networks in functional magnetic resonance imaging (fMRI), see Schwab et al. (2017) <doi:10.1016/j.neuroimage.2018.03.074>.
Maintained by Simon Schwab. Last updated 3 years ago.
dynamic-graphical-models functional-connectivity time-varying-connectivity openblas cpp openmp
5.1 match 25 stars 5.49 score 25 scripts
bioc
compcodeR:RNAseq data simulation, differential expression analysis and performance comparison of differential expression methods
This package provides extensive functionality for comparing results obtained by different methods for differential expression analysis of RNAseq data. It also contains functions for simulating count data. Finally, it provides convenient interfaces to several packages for performing the differential expression analysis. These can also be used as templates for setting up and running a user-defined differential analysis workflow within the framework of the package.
Maintained by Charlotte Soneson. Last updated 3 months ago.
immunooncology rnaseq differentialexpression
3.5 match 11 stars 8.06 score 26 scripts
jasinmachkour
TRexSelector:T-Rex Selector: High-Dimensional Variable Selection & FDR Control
Performs fast variable selection in high-dimensional settings while controlling the false discovery rate (FDR) at a user-defined target level. The package is based on the paper Machkour, Muma, and Palomar (2022) <arXiv:2110.06048>.
Maintained by Jasin Machkour. Last updated 1 years ago.
5.9 match 5 stars 4.40 score 5 scripts
dsstoffer
astsa:Applied Statistical Time Series Analysis
Contains data sets and scripts for analyzing time series in both the frequency and time domains including state space modeling as well as supporting the texts Time Series Analysis and Its Applications: With R Examples (5th ed), by R.H. Shumway and D.S. Stoffer. Springer Texts in Statistics, 2025, <https://link.springer.com/book/9783031705830>, and Time Series: A Data Analysis Approach Using R. Chapman-Hall, 2019, <DOI:10.1201/9780429273285>.
Maintained by David Stoffer. Last updated 2 months ago.
3.3 match 7 stars 7.86 score 2.2k scripts 8 dependents
stan-pounds
FDRsampsize:Compute Sample Size that Meets Requirements for Average Power and FDR
Defines a collection of functions to compute average power and sample size for studies that use the false discovery rate as the final measure of statistical significance.
Maintained by Stan Pounds. Last updated 9 years ago.
18.3 match 1.23 score 17 scripts
hrpcisd
locfdr:Computes Local False Discovery Rates
Computation of local false discovery rates.
Maintained by Balasubramanian Narasimhan. Last updated 10 years ago.
3.8 match 5.99 score 106 scripts 14 dependents
bioc
safe:Significance Analysis of Function and Expression
SAFE is a resampling-based method for testing functional categories in gene expression experiments. SAFE can be applied to 2-sample and multi-class comparisons, or simple linear regressions. Other experimental designs can also be accommodated through user-defined functions.
Maintained by Ludwig Geistlinger. Last updated 5 months ago.
differentialexpression pathways genesetenrichment statisticalmethod software
4.0 match 5.60 score 32 scripts 5 dependents
kbroman
qtl:Tools for Analyzing QTL Experiments
Analysis of experimental crosses to identify genes (called quantitative trait loci, QTLs) contributing to variation in quantitative traits. Broman et al. (2003) <doi:10.1093/bioinformatics/btg112>.
Maintained by Karl W Broman. Last updated 7 months ago.
1.8 match 80 stars 12.79 score 2.4k scripts 29 dependents
ichcha-m
cophescan:Adaptation of the Coloc Method for PheWAS
A Bayesian method for Phenome-wide association studies (PheWAS) that identifies causal associations between genetic variants and traits, while simultaneously addressing confounding due to linkage disequilibrium. For details see Manipur et al (2023) <doi:10.1101/2023.06.29.546856>.
Maintained by Ichcha Manipur. Last updated 9 months ago.
3.8 match 6 stars 5.76 score 24 scripts
parsifal9
RFlocalfdr:Significance Level for Random Forest Impurity Importance Scores
Sets a significance level for Random Forest MDI (Mean Decrease in Impurity, Gini or sum of squares) variable importance scores, using an empirical Bayes approach. See Dunne et al. (2022) <doi:10.1101/2022.04.06.487300>.
Maintained by Robert Dunne. Last updated 2 months ago.
4.5 match 1 stars 4.72 score 13 scripts
bioc
miloR:Differential neighbourhood abundance testing on a graph
Milo performs single-cell differential abundance testing. Cell states are modelled as representative neighbourhoods on a nearest neighbour graph. Hypothesis testing is performed using either a negative binomial generalized linear model or a negative binomial generalized linear mixed model.
Maintained by Mike Morgan. Last updated 5 months ago.
singlecell multiplecomparison functionalgenomics software openblas cpp openmp
2.0 match 357 stars 10.49 score 340 scripts 1 dependents
afukushima
DiffCorr:Analyzing and Visualizing Differential Correlation Networks in Biological Data
A method for identifying pattern changes between 2 experimental conditions in correlation networks (e.g., gene co-expression networks), which builds on a commonly used association measure, such as Pearson's correlation coefficient. This package includes functions to calculate correlation matrices for high-dimensional datasets and to test differential correlation, that is, changes in the correlation relationships among variables (e.g., genes and metabolites) between 2 experimental conditions.
Maintained by Atsushi Fukushima. Last updated 6 months ago.
3.1 match 5 stars 6.81 score 29 scripts 1 dependents
bioc
GPA:GPA (Genetic analysis incorporating Pleiotropy and Annotation)
This package provides functions for fitting GPA, a statistical framework to prioritize GWAS results by integrating pleiotropy information and annotation data. In addition, it also includes ShinyGPA, an interactive visualization toolkit to investigate pleiotropic architecture.
Maintained by Dongjun Chung. Last updated 5 months ago.
software statisticalmethod classification genomewideassociation snp genetics clustering multiplecomparison preprocessing geneexpression differentialexpression cpp
3.3 match 14 stars 6.15 score 7 scripts
bioc
ReactomePA:Reactome Pathway Analysis
This package provides functions for pathway analysis based on REACTOME pathway database. It implements enrichment analysis, gene set enrichment analysis and several functions for visualization. This package is not affiliated with the Reactome team.
Maintained by Guangchuang Yu. Last updated 5 months ago.
pathways visualization annotation multiplecomparison genesetenrichment reactome enrichment-analysis reactome-pathway-analysis reactomepa
1.6 match 40 stars 12.25 score 1.5k scripts 7 dependents
dppalomar
spectralGraphTopology:Learning Graphs from Data via Spectral Constraints
In the era of big data and hyperconnectivity, learning high-dimensional structures such as graphs from data has become a prominent task in machine learning and has found applications in many fields such as finance, health care, and networks. 'spectralGraphTopology' is an open source, documented, and well-tested R package for learning graphs from data. It provides implementations of state of the art algorithms such as Combinatorial Graph Laplacian Learning (CGL), Spectral Graph Learning (SGL), Graph Estimation based on Majorization-Minimization (GLE-MM), and Graph Estimation based on Alternating Direction Method of Multipliers (GLE-ADMM). In addition, graph learning has been widely employed for clustering, where specific algorithms are available in the literature. To this end, we provide an implementation of the Constrained Laplacian Rank (CLR) algorithm.
Maintained by Ze Vinicius. Last updated 2 years ago.
3.3 match 2 stars 5.91 score 135 scripts 1 dependents
umich-cphds
bama:High Dimensional Bayesian Mediation Analysis
Perform mediation analysis in the presence of high-dimensional mediators based on the potential outcome framework. Bayesian Mediation Analysis (BAMA), developed by Song et al (2019) <doi:10.1111/biom.13189> and Song et al (2020) <arXiv:2009.11409>, relies on two Bayesian sparse linear mixed models to simultaneously analyze a relatively large number of mediators for a continuous exposure and outcome assuming a small number of mediators are truly active. This sparsity assumption also allows the extension of univariate mediator analysis by casting the identification of active mediators as a variable selection problem and applying Bayesian methods with continuous shrinkage priors on the effects.
Maintained by Mike Kleinsasser. Last updated 2 years ago.
4.0 match 4.80 score 42 scripts 1 dependents
bioc
HEM:Heterogeneous error model for identification of differentially expressed genes under multiple conditions
This package fits heterogeneous error models for analysis of microarray data
Maintained by HyungJun Cho. Last updated 5 months ago.
microarray differentialexpression
4.5 match 4.30 score 6 scripts
bioc
cydar:Using Mass Cytometry for Differential Abundance Analyses
Identifies differentially abundant populations between samples and groups in mass cytometry data. Provides methods for counting cells into hyperspheres, controlling the spatial false discovery rate, and visualizing changes in abundance in the high-dimensional marker space.
Maintained by Aaron Lun. Last updated 5 months ago.
immunooncology flowcytometry multiplecomparison proteomics singlecell cpp
3.3 match 5.64 score 48 scripts
bioc
acde:Artificial Components Detection of Differentially Expressed Genes
This package provides a multivariate inferential analysis method for detecting differentially expressed genes in gene expression data. It uses artificial components, close to the data's principal components but with an exact interpretation in terms of differential genetic expression, to identify differentially expressed genes while controlling the false discovery rate (FDR). The methods in this package are described in the vignette or in the article 'Multivariate Method for Inferential Identification of Differentially Expressed Genes in Gene Expression Experiments' by J. P. Acosta, L. Lopez-Kleine and S. Restrepo (2015, pending publication).
Maintained by Juan Pablo Acosta. Last updated 5 months ago.
differentialexpression timecourse principalcomponent geneexpression microarray mrnamicroarray
5.6 match 3.30 score 1 scripts
bioc
DAPAR:Tools for the Differential Analysis of Proteins Abundance with R
The package DAPAR is a Bioconductor-distributed R package which provides all the necessary functions to analyze quantitative data from label-free proteomics experiments. Unlike most other similar R packages, it is endowed with rich and user-friendly graphical interfaces, so that no programming skill is required (see `Prostar` package).
Maintained by Samuel Wieczorek. Last updated 5 months ago.
proteomics normalization preprocessing massspectrometry qualitycontrol go dataimport prostar1
3.4 match 2 stars 5.42 score 22 scripts 1 dependents
bioc
calm:Covariate Assisted Large-scale Multiple testing
Statistical methods for multiple testing with covariate information. Traditional multiple testing methods only consider a list of test statistics, such as p-values. Our methods incorporate the auxiliary information, such as the lengths of gene coding regions or the minor allele frequencies of SNPs, to improve power.
Maintained by Kun Liang. Last updated 5 months ago.
bayesiandifferentialexpressiongeneexpressionregressionmicroarraysequencingrnaseqmultiplecomparisongeneticsimmunooncologymetabolomicsproteomicstranscriptomics
5.5 match 3.30 score 2 scriptsmeierluk
hdi:High-Dimensional Inference
Implementation of multiple approaches to perform inference in high-dimensional models.
Maintained by Lukas Meier. Last updated 4 years ago.
4.0 match 2 stars 4.47 score 139 scripts 7 dependentswanghaoxue0
SplitKnockoff:Split Knockoffs for Structural Sparsity
Split Knockoff is a data adaptive variable selection framework for controlling the (directional) false discovery rate (FDR) in structural sparsity, where variable selection on linear transformation of parameters is of concern. This proposed scheme relaxes the linear subspace constraint to its neighborhood, often known as variable splitting in optimization. Simulation experiments can be reproduced following the vignette. We include data (both .mat and .csv format) and an application of our method to an Alzheimer's Disease study in this package. 'Split Knockoffs' is first defined in Cao et al. (2021) <arXiv:2103.16159>.
Maintained by Haoxue Wang. Last updated 3 years ago.
4.1 match 3 stars 4.18 score 4 scriptssnoweye
EMCluster:EM Algorithm for Model-Based Clustering of Finite Mixture Gaussian Distribution
EM algorithms and several efficient initialization methods for model-based clustering of finite mixture Gaussian distribution with unstructured dispersion in both unsupervised and semi-supervised learning.
Maintained by Wei-Chen Chen. Last updated 6 months ago.
2.3 match 18 stars 7.53 score 123 scripts 2 dependentsbioc
scDDboost:A compositional model to assess expression changes from single-cell rna-seq data
scDDboost is an R package to analyze changes in the distribution of single-cell expression data between two experimental conditions. Compared to other methods that assess differential expression, scDDboost benefits uniquely from information conveyed by the clustering of cells into cellular subtypes. Through a novel empirical Bayesian formulation it calculates gene-specific posterior probabilities that the marginal expression distribution is the same (or different) between the two conditions. The implementation in scDDboost treats gene-level expression data within each condition as a mixture of negative binomial distributions.
Maintained by Xiuyu Ma. Last updated 1 days ago.
singlecellsoftwareclusteringsequencinggeneexpressiondifferentialexpressionbayesiancpp
3.6 match 4.68 score 19 scriptsbioc
ABarray:Microarray QA and statistical data analysis for Applied Biosystems Genome Survey Microarray (AB1700) gene expression data.
Automated pipeline to perform gene expression analysis for Applied Biosystems Genome Survey Microarray (AB1700) data format. Functions include data preprocessing, filtering, control probe analysis, and statistical analysis in one single function. A GUI is also provided. The raw data, processed data, graphics output and statistical results are organized into folders according to the analysis settings used.
Maintained by Yongming Andrew Sun. Last updated 5 months ago.
microarrayonechannelpreprocessing
3.9 match 4.20 score 3 scriptsannennenne
causalDisco:Tools for Causal Discovery on Observational Data
Various tools for inferring causal models from observational data. The package includes an implementation of the temporal Peter-Clark (TPC) algorithm. Petersen, Osler and Ekstrøm (2021) <doi:10.1093/aje/kwab087>. It also includes general tools for evaluating differences in adjacency matrices, which can be used for evaluating performance of causal discovery procedures.
Maintained by Anne Helby Petersen. Last updated 14 days ago.
3.3 match 19 stars 4.76 score 10 scriptscran
mSTEM:Multiple Testing of Local Extrema for Detection of Change Points
A new approach to detect change points based on smoothing and multiple testing, intended for long data sequences modeled as piecewise constant functions plus stationary Gaussian noise, see Dan Cheng and Armin Schwartzman (2015) <arXiv:1504.06384>.
Maintained by Zhibing He. Last updated 5 years ago.
9.0 match 1.70 scorebioc
phenoTest:Tools to test association between gene expression and phenotype in a way that is efficient, structured, fast and scalable. We also provide tools to do GSEA (Gene set enrichment analysis) and copy number variation.
Tools to test correlation between gene expression and phenotype in a way that is efficient, structured, fast and scalable. GSEA is also provided.
Maintained by Evarist Planet. Last updated 5 months ago.
microarraydifferentialexpressionmultiplecomparisonclusteringclassification
3.3 match 4.56 score 9 scripts 1 dependentsbioc
RNAseqCovarImpute:Impute Covariate Data in RNA Sequencing Studies
The RNAseqCovarImpute package makes linear model analysis for RNA sequencing read counts compatible with multiple imputation (MI) of missing covariates. A major problem with implementing MI in RNA sequencing studies is that the outcome data must be included in the imputation prediction models to avoid bias. This is difficult in omics studies with high-dimensional data. The first method we developed in the RNAseqCovarImpute package surmounts the problem of high-dimensional outcome data by binning genes into smaller groups to analyze pseudo-independently. This method implements covariate MI in gene expression studies by 1) randomly binning genes into smaller groups, 2) creating M imputed datasets separately within each bin, where the imputation predictor matrix includes all covariates and the log counts per million (CPM) for the genes within each bin, 3) estimating gene expression changes using `limma::voom` followed by `limma::lmFit` functions, separately on each M imputed dataset within each gene bin, 4) un-binning the gene sets and stacking the M sets of model results before applying the `limma::squeezeVar` function to apply a variance shrinking Bayesian procedure to each M set of model results, 5) pooling the results with Rubin's rules to produce combined coefficients, standard errors, and P-values, and 6) adjusting P-values for multiplicity to account for false discovery rate (FDR). A faster method uses principal component analysis (PCA) to avoid binning genes while still retaining outcome information in the MI models. Binning genes into smaller groups requires that the MI and limma-voom analysis is run many times (typically hundreds).
The more computationally efficient MI PCA method implements covariate MI in gene expression studies by 1) performing PCA on the log CPM values for all genes using the Bioconductor `PCAtools` package, 2) creating M imputed datasets where the imputation predictor matrix includes all covariates and the optimum number of PCs to retain (e.g., based on Horn's parallel analysis or the number of PCs that account for >80% explained variation), 3) conducting the standard limma-voom pipeline with the `voom` followed by `lmFit` followed by `eBayes` functions on each M imputed dataset, 4) pooling the results with Rubin's rules to produce combined coefficients, standard errors, and P-values, and 5) adjusting P-values for multiplicity to account for false discovery rate (FDR).
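The pooling step named in both pipelines (Rubin's rules) can be sketched generically as follows. This is an illustration in Python, not the package's own code: the function name `pool_rubin` and the input numbers are hypothetical, and it pools a single coefficient across M imputed datasets.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across M imputed datasets with Rubin's rules.

    `estimates` and `std_errors` are hypothetical per-imputation results
    for the same coefficient (one entry per imputed dataset).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations of equal length")

    pooled = mean(estimates)                       # pooled point estimate
    within = mean([se ** 2 for se in std_errors])  # average within-imputation variance
    between = variance(estimates)                  # between-imputation variance (M - 1 denominator)
    total = within + (1 + 1 / m) * between         # total variance of the pooled estimate
    return pooled, math.sqrt(total)


pooled, pooled_se = pool_rubin([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
```

The pooled standard error exceeds every per-imputation standard error here because the between-imputation spread is added to the within-imputation variance.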
Maintained by Brennan Baker. Last updated 5 months ago.
rnaseqgeneexpressiondifferentialexpressionsequencing
3.3 match 1 stars 4.48 score 6 scriptsphilipppro
measures:Performance Measures for Statistical Learning
Provides the largest number of statistical measures in the whole R world. Includes measures of regression, (multiclass) classification and multilabel classification. The measures come mainly from the 'mlr' package and were programmed by several 'mlr' developers.
Maintained by Philipp Probst. Last updated 4 years ago.
3.3 match 1 stars 4.47 score 88 scripts 2 dependentsbioc
ssize:Estimate Microarray Sample Size
Functions for computing and displaying sample size information for gene expression arrays.
Maintained by Gregory R. Warnes. Last updated 5 months ago.
microarraydifferentialexpression
3.5 match 4.18 score 15 scriptsarunabhacodes
CPBayes:Bayesian Meta Analysis for Studying Cross-Phenotype Genetic Associations
A Bayesian meta-analysis method for studying cross-phenotype genetic associations. It uses summary-level data across multiple phenotypes to simultaneously measure the evidence of aggregate-level pleiotropic association and estimate an optimal subset of traits associated with the risk locus. CPBayes is based on a spike and slab prior. The methodology is available from: A Majumdar, T Haldar, S Bhattacharya, JS Witte (2018) <doi:10.1371/journal.pgen.1007139>.
Maintained by Arunabha Majumdar. Last updated 4 years ago.
3.3 match 3 stars 4.26 score 12 scriptsstamats
MKclass:Statistical Classification
Performance measures and scores for statistical classification such as accuracy, sensitivity, specificity, recall, similarity coefficients, AUC, GINI index, Brier score and many more. Calculation of optimal cut-offs and decision stumps (Iba and Langley (1991), <doi:10.1016/B978-1-55860-247-2.50035-8>) for all implemented performance measures. Hosmer-Lemeshow goodness of fit tests (Lemeshow and Hosmer (1982), <doi:10.1093/oxfordjournals.aje.a113284>; Hosmer et al (1997), <doi:10.1002/(SICI)1097-0258(19970515)16:9%3C965::AID-SIM509%3E3.0.CO;2-O>). Statistical and epidemiological risk measures such as relative risk, odds ratio, number needed to treat (Porta (2014), <doi:10.1093%2Facref%2F9780199976720.001.0001>).
Maintained by Matthias Kohl. Last updated 1 years ago.
3.3 match 2 stars 4.26 score 18 scriptsbioc
EBSeq:An R package for gene and isoform differential expression analysis of RNA-seq data
Differential Expression analysis at both gene and isoform level using RNA-seq data
Maintained by Xiuyu Ma. Last updated 2 months ago.
immunooncologystatisticalmethoddifferentialexpressionmultiplecomparisonrnaseqsequencingcpp
1.8 match 7.77 score 162 scripts 6 dependentsff1201
sgs:Sparse-Group SLOPE: Adaptive Bi-Level Selection with FDR Control
Implementation of Sparse-group SLOPE (SGS) (Feser and Evangelou (2023) <doi:10.48550/arXiv.2305.09467>) models. Linear and logistic regression models are supported, both of which can be fit using k-fold cross-validation. Dense and sparse input matrices are supported. In addition, a general Adaptive Three Operator Splitting (ATOS) (Pedregosa and Gidel (2018) <doi:10.48550/arXiv.1804.02339>) implementation is provided. Group SLOPE (gSLOPE) (Brzyski et al. (2019) <doi:10.1080/01621459.2017.1411269>) and group-based OSCAR models (Feser and Evangelou (2024) <doi:10.48550/arXiv.2405.15357>) are also implemented. All models are available with strong screening rules (Feser and Evangelou (2024) <doi:10.48550/arXiv.2405.15357>) for computational speed-up.
Maintained by Fabio Feser. Last updated 13 days ago.
2.8 match 1 stars 4.99 score 13 scripts 1 dependentsbioc
ANCOMBC:Microbiome differential abundance and correlation analyses with bias correction
ANCOMBC is a package containing differential abundance (DA) and correlation analyses for microbiome data. Specifically, the package includes Analysis of Compositions of Microbiomes with Bias Correction 2 (ANCOM-BC2), Analysis of Compositions of Microbiomes with Bias Correction (ANCOM-BC), and Analysis of Composition of Microbiomes (ANCOM) for DA analysis, and Sparse Estimation of Correlations among Microbiomes (SECOM) for correlation analysis. Microbiome data are typically subject to two sources of biases: unequal sampling fractions (sample-specific biases) and differential sequencing efficiencies (taxon-specific biases). Methodologies included in the ANCOMBC package are designed to correct these biases and construct statistically consistent estimators.
Maintained by Huang Lin. Last updated 1 days ago.
differentialexpressionmicrobiomenormalizationsequencingsoftwareancomancombcancombc2correlationdifferential-abundance-analysissecom
1.3 match 120 stars 10.79 score 406 scripts 1 dependentsbioc
limpca:An R package for the linear modeling of high-dimensional designed data based on ASCA/APCA family of methods
This package aims to provide a method for fitting linear models to high-dimensional designed data. limpca applies a GLM (General Linear Model) version of ASCA and APCA to analyse multivariate sample profiles generated by an experimental design. ASCA/APCA provide powerful visualization tools for multivariate structures in the space of each effect of the statistical model linked to the experimental design and, contrary to MANOVA, can deal with multivariate datasets having more variables than observations. The method can handle unbalanced designs.
Maintained by Manon Martin. Last updated 5 months ago.
statisticalmethodprincipalcomponentregressionvisualizationexperimentaldesignmultiplecomparisongeneexpressionmetabolomics
2.3 match 2 stars 5.73 score 2 scriptsbioc
autonomics:Unified Statistical Modeling of Omics Data
This package unifies access to Statistical Modeling of Omics Data. Across linear modeling engines (lm, lme, lmer, limma, and wilcoxon). Across coding systems (treatment, difference, deviation, etc). Across model formulae (with/without intercept, random effect, interaction or nesting). Across omics platforms (microarray, rnaseq, msproteomics, affinity proteomics, metabolomics). Across projection methods (pca, pls, sma, lda, spls, opls). Across clustering methods (hclust, pam, cmeans). It provides a fast enrichment analysis implementation. And an intuitive contrastogram visualisation to summarize contrast effects in complex designs.
Maintained by Aditya Bhagwat. Last updated 2 months ago.
softwaredataimportpreprocessingdimensionreductionprincipalcomponentregressiondifferentialexpressiongenesetenrichmenttranscriptomicstranscriptiongeneexpressionrnaseqmicroarrayproteomicsmetabolomicsmassspectrometry
2.3 match 5.95 score 5 scriptsmodal-inria
MLGL:Multi-Layer Group-Lasso
It implements a new procedure of variable selection in the context of redundancy between explanatory variables, which holds true with high dimensional data (Grimonprez et al. (2023) <doi:10.18637/jss.v106.i03>).
Maintained by Quentin Grimonprez. Last updated 2 years ago.
3.7 match 3 stars 3.61 score 27 scriptsbioc
GGPA:graph-GPA: A graphical model for prioritizing GWAS results and investigating pleiotropic architecture
Genome-wide association studies (GWAS) are a widely used tool for identification of genetic variants associated with phenotypes and diseases, though complex diseases featuring many genetic variants with small effects present difficulties for these traditional studies. By leveraging pleiotropy, the statistical power of a single GWAS can be increased. This package provides functions for fitting graph-GPA, a statistical framework to prioritize GWAS results by integrating pleiotropy. The 'GGPA' package provides a user-friendly interface to fit graph-GPA models, implement association mapping, and generate a phenotype graph.
Maintained by Dongjun Chung. Last updated 5 months ago.
softwarestatisticalmethodclassificationgenomewideassociationsnpgeneticsclusteringmultiplecomparisonpreprocessinggeneexpressiondifferentialexpressionopenblascpp
3.3 match 1 stars 4.00 score 2 scriptshneth
riskyr:Rendering Risk Literacy more Transparent
Risk-related information (like the prevalence of conditions, the sensitivity and specificity of diagnostic tests, or the effectiveness of interventions or treatments) can be expressed in terms of frequencies or probabilities. By providing a toolbox of corresponding metrics and representations, 'riskyr' computes, translates, and visualizes risk-related information in a variety of ways. Adopting multiple complementary perspectives provides insights into the interplay between key parameters and renders teaching and training programs on risk literacy more transparent.
Maintained by Hansjoerg Neth. Last updated 10 months ago.
2x2-matrixbayesian-inferencecontingency-tablerepresentationriskrisk-literacyvisualization
1.7 match 19 stars 7.36 score 80 scriptsbioc
topconfects:Top Confident Effect Sizes
Rank results by confident effect sizes, while maintaining False Discovery Rate and False Coverage-statement Rate control. Topconfects is an alternative presentation of TREAT results with improved usability, eliminating p-values and instead providing confidence bounds. The main application is differential gene expression analysis, providing genes ranked in order of confident log2 fold change, but it can be applied to any collection of effect sizes with associated standard errors.
Maintained by Paul Harrison. Last updated 3 months ago.
geneexpressiondifferentialexpressiontranscriptomicsrnaseqmrnamicroarrayregressionmultiplecomparison
1.7 match 14 stars 7.38 score 18 scripts 2 dependentsbioc
pepXMLTab:Parsing pepXML files and filter based on peptide FDR.
Parsing pepXML files based on the XML package. The package tries to handle pepXML files generated by different software. The output is a peptide-spectrum-matching tabular file. The package also provides a function to filter the PSMs based on FDR.
Maintained by Xiaojing Wang. Last updated 5 months ago.
immunooncologyproteomicsmassspectrometry
3.4 match 3.60 score 9 scriptsbioc
reconsi:Resampling Collapsed Null Distributions for Simultaneous Inference
Improves simultaneous inference under dependence of tests by estimating a collapsed null distribution through resampling. Accounting for the dependence between tests increases the power while reducing the variability of the false discovery proportion. This dependence is common in genomics applications, e.g. when combining flow cytometry measurements with microbiome sequence counts.
Maintained by Stijn Hawinkel. Last updated 5 months ago.
metagenomicsmicrobiomemultiplecomparisonflowcytometry
2.6 match 2 stars 4.60 score 2 scriptsjasinmachkour
tlars:The T-LARS Algorithm: Early-Terminated Forward Variable Selection
Computes the solution path of the Terminating-LARS (T-LARS) algorithm. The T-LARS algorithm is a major building block of the T-Rex selector (see R package 'TRexSelector'). The package is based on the papers Machkour, Muma, and Palomar (2022) <arXiv:2110.06048>, Efron, Hastie, Johnstone, and Tibshirani (2004) <doi:10.1214/009053604000000067>, and Tibshirani (1996) <doi:10.1111/j.2517-6161.1996.tb02080.x>.
Maintained by Jasin Machkour. Last updated 1 years ago.
2.6 match 2 stars 4.48 score 5 scripts 1 dependentsygeunkim
bvhar:Bayesian Vector Heterogeneous Autoregressive Modeling
Tools to model and forecast multivariate time series including Bayesian Vector heterogeneous autoregressive (VHAR) model by Kim & Baek (2023) (<doi:10.1080/00949655.2023.2281644>). 'bvhar' can model Vector Autoregressive (VAR), VHAR, Bayesian VAR (BVAR), and Bayesian VHAR (BVHAR) models.
Maintained by Young Geun Kim. Last updated 18 days ago.
bayesianbayesian-econometricsbvareigenforecastingharpybind11pythonrcppeigentime-seriesvector-autoregressioncppopenmp
1.8 match 6 stars 6.42 score 25 scriptsallenzhuaz
MHTmult:Multiple Hypotheses Testing for Multiple Families/Groups Structure
A comprehensive tool for almost all existing multiple testing methods for multiple families. The package summarizes the existing multiple testing procedures (MTPs) for multiple families, such as double FDR, the group Benjamini-Hochberg (GBH) procedure and the average FDR controlling procedure. The package also provides some novel multiple testing procedures using the selective inference idea.
Maintained by Yalin Zhu. Last updated 3 years ago.
hierarchical-datamultiple-testingmultiplicity
4.3 match 2.70 score 9 scriptsbioc
signatureSearch:Environment for Gene Expression Searching Combined with Functional Enrichment Analysis
This package implements algorithms and data structures for performing gene expression signature (GES) searches, and subsequently interpreting the results functionally with specialized enrichment methods.
Maintained by Brendan Gongol. Last updated 5 months ago.
softwaregeneexpressiongokeggnetworkenrichmentsequencingcoveragedifferentialexpressioncpp
1.6 match 17 stars 7.18 score 74 scripts 1 dependentsbioc
swfdr:Estimation of the science-wise false discovery rate and the false discovery rate conditional on covariates
This package allows users to estimate the science-wise false discovery rate from Jager and Leek, "Empirical estimates suggest most published medical research is true," 2013, Biostatistics, using an EM approach due to the presence of rounding and censoring. It also allows users to estimate the false discovery rate conditional on covariates, using a regression framework, as per Boca and Leek, "A direct approach to estimating false discovery rates conditional on covariates," 2018, PeerJ.
Maintained by Simina M. Boca. Last updated 5 months ago.
multiplecomparisonstatisticalmethodsoftware
1.8 match 3 stars 6.25 score 37 scriptsbioc
scp:Mass Spectrometry-Based Single-Cell Proteomics Data Analysis
Utility functions for manipulating, processing, and analyzing mass spectrometry-based single-cell proteomics data. The package is an extension to the 'QFeatures' package and relies on 'SingleCellExperiment' to enable single-cell proteomics analyses. The package offers the user the functionality to process quantitative tables (as generated by MaxQuant, Proteome Discoverer, and more) into data tables ready for downstream analysis and data visualization.
Maintained by Christophe Vanderaa. Last updated 18 days ago.
geneexpressionproteomicssinglecellmassspectrometrypreprocessingcellbasedassaysbioconductormass-spectrometrysingle-cellsoftware
1.3 match 25 stars 8.94 score 115 scriptshanjunwei-lab
ICDS:Identification of Cancer Dysfunctional Subpathway with Omics Data
Identify Cancer Dysfunctional Sub-pathway by integrating gene expression, DNA methylation and copy number variation, and pathway topological information. 1) First, we calculate the gene risk scores by integrating three kinds of data: DNA methylation, copy number variation, and gene expression. 2) Second, we perform a greedy search algorithm to identify the key dysfunctional sub-pathways within the pathways for which the discriminative scores are locally maximal. 3) Finally, a permutation test is used to calculate the statistical significance level for these key dysfunctional sub-pathways.
Maintained by Junwei Han. Last updated 8 months ago.
3.1 match 3.54 score 3 scriptsbioc
IHW:Independent Hypothesis Weighting
Independent hypothesis weighting (IHW) is a multiple testing procedure that increases power compared to the method of Benjamini and Hochberg by assigning data-driven weights to each hypothesis. The input to IHW is a two-column table of p-values and covariates. The covariate can be any continuous-valued or categorical variable that is thought to be informative on the statistical properties of each hypothesis test, while it is independent of the p-value under the null hypothesis.
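As a point of comparison for the description above, here is a minimal Python sketch of the Benjamini-Hochberg baseline and of a weighted variant with user-supplied weights. It does not reproduce IHW itself: IHW learns its weights from the covariate, whereas the weights below are hypothetical inputs assumed to be non-negative and to average 1. The function names are illustrative and are not the package's API.

```python
def benjamini_hochberg(pvalues, alpha=0.1):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    # Step-up rule: find the largest rank k with p_(k) <= alpha * k / m.
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected


def weighted_bh(pvalues, weights, alpha=0.1):
    """Weighted BH: divide each p-value by its weight, then apply BH.

    Assumes the (hypothetical) weights are non-negative and average to 1.
    A weight of 0 means the hypothesis can never be rejected.
    """
    adjusted = [min(p / w, 1.0) if w > 0 else 1.0
                for p, w in zip(pvalues, weights)]
    return benjamini_hochberg(adjusted, alpha)


pvals = [0.01, 0.04, 0.03, 0.5]
plain = benjamini_hochberg(pvals, alpha=0.05)
weighted = weighted_bh(pvals, [1.0, 0.5, 2.0, 0.5], alpha=0.05)
```

In this toy input, up-weighting the third hypothesis lets the weighted procedure reject it, while plain BH rejects only the first.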
Maintained by Nikos Ignatiadis. Last updated 5 months ago.
immunooncologymultiplecomparisonrnaseq
1.5 match 7.25 score 264 scripts 2 dependentsbioc
BioNet:Routines for the functional analysis of biological networks
This package provides functions for the integrated analysis of protein-protein interaction networks and the detection of functional modules. Different datasets can be integrated into the network by assigning p-values of statistical tests to the nodes of the network. E.g. p-values obtained from the differential expression of the genes from an Affymetrix array are assigned to the nodes of the network. By fitting a beta-uniform mixture model and calculating scores from the p-values, overall scores of network regions can be calculated and an integer linear programming algorithm identifies the maximum scoring subnetwork.
Maintained by Marcus Dittrich. Last updated 5 months ago.
microarraydataimportgraphandnetworknetworknetworkenrichmentgeneexpressiondifferentialexpression
1.8 match 6.14 score 114 scripts 2 dependentscran
rope:Model Selection with FDR Control of Selected Variables
Selects one model with variable selection FDR controlled at a specified level. A q-value for each potential variable is also returned. The input, variable selection counts over many bootstraps for several levels of penalization, is modeled as coming from a beta-binomial mixture distribution.
Maintained by Jonatan Kallus. Last updated 8 years ago.
5.3 match 2.00 scoreaudreyqyfu
MRPC:PC Algorithm with the Principle of Mendelian Randomization
A PC Algorithm with the Principle of Mendelian Randomization. This package implements the MRPC (PC with the principle of Mendelian randomization) algorithm to infer causal graphs. It also contains functions to simulate data under a certain topology, to visualize a graph in different ways, and to compare graphs and quantify the differences. See Badsha and Fu (2019) <doi:10.3389/fgene.2019.00460>, Badsha, Martin and Fu (2021) <doi:10.3389/fgene.2021.651812>.
Maintained by Audrey Fu. Last updated 3 years ago.
2.3 match 8 stars 4.68 score 20 scriptsbioc
dreamlet:Scalable differential expression analysis of single cell transcriptomics datasets with complex study designs
Recent advances in single cell/nucleus transcriptomic technology have enabled collection of cohort-scale datasets to study cell type specific gene expression differences associated with disease state, stimulus, and genetic regulation. The scale of these data, complex study designs, and low read count per cell mean that characterizing cell type specific molecular mechanisms requires a user-friendly, purpose-built analytical framework. We have developed the dreamlet package that applies a pseudobulk approach and fits a regression model for each gene and cell cluster to test differential expression across individuals associated with a trait of interest. Use of precision-weighted linear mixed models enables accounting for repeated measures study designs, high dimensional batch effects, and varying sequencing depth or observed cells per biosample.
Maintained by Gabriel Hoffman. Last updated 5 months ago.
rnaseqgeneexpressiondifferentialexpressionbatcheffectqualitycontrolregressiongenesetenrichmentgeneregulationepigeneticsfunctionalgenomicstranscriptomicsnormalizationsinglecellpreprocessingsequencingimmunooncologysoftwarecpp
1.3 match 12 stars 8.09 score 128 scriptsbioc
INTACT:Integrate TWAS and Colocalization Analysis for Gene Set Enrichment Analysis
This package integrates colocalization probabilities from colocalization analysis with transcriptome-wide association study (TWAS) scan summary statistics to implicate genes that may be biologically relevant to a complex trait. The probabilistic framework implemented in this package constrains the TWAS scan z-score-based likelihood using a gene-level colocalization probability. Given gene set annotations, this package can estimate gene set enrichment using posterior probabilities from the TWAS-colocalization integration step.
Maintained by Jeffrey Okamoto. Last updated 5 months ago.
1.8 match 15 stars 5.47 score 13 scriptsbioc
EBarrays:Unified Approach for Simultaneous Gene Clustering and Differential Expression Identification
EBarrays provides tools for the analysis of replicated/unreplicated microarray data.
Maintained by Ming Yuan. Last updated 5 months ago.
clusteringdifferentialexpression
1.8 match 5.56 score 5 scripts 6 dependentsbioc
DEXSeq:Inference of differential exon usage in RNA-Seq
The package is focused on finding differential exon usage using RNA-seq exon counts between samples with different experimental designs. It provides functions that allow the user to make the necessary statistical tests based on a model that uses the negative binomial distribution to estimate the variance between biological replicates and generalized linear models for testing. The package also provides functions for the visualization and exploration of the results.
Maintained by Alejandro Reyes. Last updated 17 days ago.
immunooncologysequencingrnaseqdifferentialexpressionalternativesplicingdifferentialsplicinggeneexpressionvisualization
1.3 match 7.75 score 330 scripts 6 dependentsjgill22
BaM:Functions and Datasets for "Bayesian Methods: A Social and Behavioral Sciences Approach"
Functions and datasets for Jeff Gill: "Bayesian Methods: A Social and Behavioral Sciences Approach". First, Second, and Third Edition. Published by Chapman and Hall/CRC (2002, 2007, 2014) <doi:10.1201/b17888>.
Maintained by Jeff Gill. Last updated 2 years ago.
6.6 match 1 stars 1.43 score 27 scriptsbioc
sights:Statistics and dIagnostic Graphs for HTS
SIGHTS is a suite of normalization methods, statistical tests, and diagnostic graphical tools for high throughput screening (HTS) assays. HTS assays use microtitre plates to screen large libraries of compounds for their biological, chemical, or biochemical activity.
Maintained by Elika Garg. Last updated 5 months ago.
immunooncologycellbasedassaysmicrotitreplateassaynormalizationmultiplecomparisonpreprocessingqualitycontrolbatcheffectvisualization
2.3 match 4.00 score 9 scriptsbioc
pathwayPCA:Integrative Pathway Analysis with Modern PCA Methodology and Gene Selection
pathwayPCA is an integrative analysis tool that implements the principal component analysis (PCA) based pathway analysis approaches described in Chen et al. (2008), Chen et al. (2010), and Chen (2011). pathwayPCA allows users to: (1) Test pathway association with binary, continuous, or survival phenotypes. (2) Extract relevant genes in the pathways using the SuperPCA and AES-PCA approaches. (3) Compute principal components (PCs) based on the selected genes. These estimated latent variables represent pathway activities for individual subjects, which can then be used to perform integrative pathway analysis, such as multi-omics analysis. (4) Extract relevant genes that drive pathway significance as well as data corresponding to these relevant genes for additional in-depth analysis. (5) Perform analyses with enhanced computational efficiency with parallel computing and enhanced data safety with S4-class data objects. (6) Analyze studies with complex experimental designs, with multiple covariates, and with interaction effects, e.g., testing whether pathway association with clinical phenotype is different between male and female subjects. Citations: Chen et al. (2008) <https://doi.org/10.1093/bioinformatics/btn458>; Chen et al. (2010) <https://doi.org/10.1002/gepi.20532>; and Chen (2011) <https://doi.org/10.2202/1544-6115.1697>.
Maintained by Gabriel Odom. Last updated 5 months ago.
copynumbervariationdnamethylationgeneexpressionsnptranscriptiongenepredictiongenesetenrichmentgenesignalinggenetargetgenomewideassociationgenomicvariationcellbiologyepigeneticsfunctionalgenomicsgeneticslipidomicsmetabolomicsproteomicssystemsbiologytranscriptomicsclassificationdimensionreductionfeatureextractionprincipalcomponentregressionsurvivalmultiplecomparisonpathways
1.1 match 11 stars 7.74 score 42 scriptsbabak-khorsand
EvaluationMeasures:Collection of Model Evaluation Measure Functions
Provides some of the most important measures for evaluating a model. Given the real and predicted classes, measures such as accuracy, sensitivity, specificity, ppv, npv, fmeasure, mcc and others are returned.
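To make the listed measures concrete, the following Python sketch computes them from real and predicted binary classes. It is a generic illustration under the usual confusion-matrix definitions, not the package's implementation; the function name `binary_measures` and the `positive` argument are hypothetical.

```python
import math


def binary_measures(real, predicted, positive=1):
    """Compute common evaluation measures for a binary classifier."""
    if len(real) != len(predicted):
        raise ValueError("real and predicted must have the same length")
    pairs = list(zip(real, predicted))
    tp = sum(1 for r, p in pairs if r == positive and p == positive)
    tn = sum(1 for r, p in pairs if r != positive and p != positive)
    fp = sum(1 for r, p in pairs if r != positive and p == positive)
    fn = sum(1 for r, p in pairs if r == positive and p != positive)

    def ratio(num, den):
        # Undefined measures (zero denominator) are reported as NaN.
        return num / den if den else float("nan")

    sensitivity = ratio(tp, tp + fn)
    ppv = ratio(tp, tp + fp)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "ppv": ppv,
        "npv": ratio(tn, tn + fn),
        "fmeasure": ratio(2 * ppv * sensitivity, ppv + sensitivity),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


scores = binary_measures([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```

Reporting NaN for undefined measures is a design choice of this sketch; other conventions (such as returning 0) are equally common.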
Maintained by Babak Khorsand. Last updated 9 years ago.
4.5 match 1.91 score 27 scripts 1 dependentscran
TestCor:FWER and FDR Controlling Procedures for Multiple Correlation Tests
Different multiple testing procedures for correlation tests are implemented. These procedures were shown theoretically to control asymptotically the Family Wise Error Rate (Roux (2018) <https://tel.archives-ouvertes.fr/tel-01971574v1>) or the False Discovery Rate (Cai & Liu (2016) <doi:10.1080/01621459.2014.999157>). The package gathers four test statistics used in correlation testing, four FWER procedures with either single step or stepdown versions, and four FDR procedures.
Maintained by Gannaz Irene. Last updated 4 years ago.
8.4 match 1 stars 1.00 scorebioc
MSnID:Utilities for Exploration and Assessment of Confidence of LC-MSn Proteomics Identifications
Extracts MS/MS ID data from mzIdentML (leveraging the mzID package) or text files. After collating the search results from multiple datasets, it assesses their identification quality and optimizes filtering criteria to achieve the maximum number of identifications while not exceeding a specified false discovery rate. Also contains a number of utilities to explore the MS/MS results and assess missed and irregular enzymatic cleavages, mass measurement accuracy, etc.
Maintained by Vlad Petyuk. Last updated 5 months ago.
proteomicsmassspectrometryimmunooncology
1.7 match 5.06 score 57 scripts
bioc
synapter:Label-free data analysis pipeline for optimal identification and quantitation
The synapter package provides functionality to reanalyse label-free proteomics data acquired on a Synapt G2 mass spectrometer. One or several runs, possibly processed with additional ion mobility separation to increase identification accuracy, can be combined with other quantitation files to maximise identification and quantitation accuracy.
Maintained by Laurent Gatto. Last updated 5 days ago.
immunooncologymassspectrometryproteomicsqualitycontrol
1.8 match 4 stars 4.73 score 5 scripts
bioc
LimROTS:A Hybrid Method Integrating Empirical Bayes and Reproducibility-Optimized Statistics for Robust Analysis of Proteomics and Metabolomics Data
Differential expression analysis is a prevalent method utilised in the examination of diverse biological data. The reproducibility-optimized test statistic (ROTS) modifies a t-statistic based on the data's intrinsic characteristics and ranks features according to their statistical significance for differential expression between two or more groups (f-statistic). Focussing on proteomics and metabolomics, the current ROTS implementation cannot account for technical or biological covariates such as MS batches or gender differences among the samples. Consequently, we developed LimROTS, which employs a reproducibility-optimized test statistic utilising the limma methodology to simulate complex experimental designs. LimROTS is a hybrid method integrating empirical Bayes and reproducibility-optimized statistics for robust analysis of proteomics and metabolomics data.
Maintained by Ali Mostafa Anwar. Last updated 3 months ago.
softwaregeneexpressiondifferentialexpressionmicroarrayrnaseqproteomicsimmunooncologymetabolomicsmrnamicroarray
1.7 match 1 stars 4.70 score 1 scripts
connor-reid-tiffany
omu:A Metabolomics Analysis Tool for Intuitive Figures and Convenient Metadata Collection
Facilitates the creation of intuitive figures to describe metabolomics data by utilizing Kyoto Encyclopedia of Genes and Genomes (KEGG) hierarchy data, and gathers functional orthology and gene data from the KEGG-REST API.
Maintained by Connor Tiffany. Last updated 1 year ago.
1.6 match 3 stars 4.89 score 52 scripts
smeekes
bootUR:Bootstrap Unit Root Tests
Set of functions to perform various bootstrap unit root tests for both individual time series (including augmented Dickey-Fuller test and union tests), multiple time series and panel data; see Smeekes and Wilms (2023) <doi:10.18637/jss.v106.i12>, Palm, Smeekes and Urbain (2008) <doi:10.1111/j.1467-9892.2007.00565.x>, Palm, Smeekes and Urbain (2011) <doi:10.1016/j.jeconom.2010.11.010>, Moon and Perron (2012) <doi:10.1016/j.jeconom.2012.01.008>, Smeekes and Taylor (2012) <doi:10.1017/S0266466611000387> and Smeekes (2015) <doi:10.1111/jtsa.12110> for key references.
Maintained by Stephan Smeekes. Last updated 1 months ago.
bootstrapdickey-fullerhypothesis-testtime-seriesunit-rootopenblascpp
1.3 match 10 stars 5.91 score 27 scripts
bioc
GeneBreak:Gene Break Detection
Recurrent breakpoint gene detection on copy number aberration profiles.
Maintained by Evert van den Broek. Last updated 5 months ago.
acghcopynumbervariationdnaseqgeneticssequencingwholegenomevisualization
1.6 match 2 stars 4.60 score 6 scripts
bioc
CSAR:Statistical tools for the analysis of ChIP-seq data
Statistical tools for ChIP-seq data analysis. The package includes the statistical method described in Kaufmann et al. (2009) PLoS Biology: 7(4):e1000090. Briefly, taking into account the average size of the DNA fragments subjected to sequencing, the software calculates genomic single-nucleotide read-enrichment values. After normalization, sample and control are compared using a test based on the Poisson distribution. Test statistic thresholds to control the false discovery rate are obtained through random permutation.
Maintained by Jose M Muino. Last updated 5 months ago.
1.7 match 4.30 score 6 scripts
cb4ds
DGEobj.utils:Differential Gene Expression (DGE) Analysis Utility Toolkit
Provides a function toolkit to facilitate reproducible RNA-Seq Differential Gene Expression (DGE) analysis (Law (2015) <doi:10.12688/f1000research.9005.3>). The tools include both analysis work-flow and utility functions: mapping/unit conversion, count normalization, accounting for unknown covariates, and more. This is a complement/cohort to the 'DGEobj' package that provides a flexible container to manage and annotate Differential Gene Expression analysis results.
Maintained by Connie Brett. Last updated 2 months ago.
1.3 match 2 stars 5.26 score 30 scripts 1 dependents
bioc
Motif2Site:Detect binding sites from motifs and ChIP-seq experiments, and compare binding sites across conditions
Detect binding sites using motif IUPAC sequences or BED coordinates and ChIP-seq experiments in BED or BAM format. Combine/compare binding sites across experiments, tissues, or conditions. All normalization and differential steps are done using the TMM-GLM method. Signal decomposition is done by setting motifs as the centers of the mixture of normal distribution curves.
Maintained by Peyman Zarrineh. Last updated 5 months ago.
softwaresequencingchipseqdifferentialpeakcallingepigeneticssequencematching
1.8 match 4.00 score 3 scripts
bioc
mirTarRnaSeq:mirTarRnaSeq
The mirTarRnaSeq R package can be used for interactive statistical analysis of mRNA and miRNA sequencing data. This package utilizes expression or differential expression mRNA and miRNA sequencing results and performs interactive correlation and various GLM (regular GLM, multivariate GLM, and interaction GLM) analyses between mRNA and miRNA experiments. These experiments can be time-point experiments and/or condition experiments.
Maintained by Mercedeh Movassagh. Last updated 5 months ago.
mirnaregressionsoftwaresequencingsmallrnatimecoursedifferentialexpression
1.7 match 4.00 score 9 scripts
nmargaritella
APFr:Multiple Testing Approach using Average Power Function (APF) and Bayes FDR Robust Estimation
Implements a multiple testing approach to the choice of a threshold gamma on the p-values using the Average Power Function (APF) and Bayes False Discovery Rate (FDR) robust estimation. Function apf_fdr() estimates both quantities from either raw data or p-values. Function apf_plot() produces smooth graphs and tables of the relevant results. Details of the methods can be found in Quatto P, Margaritella N, et al. (2019) <doi:10.1177/0962280219844288>.
Maintained by Nicolò Margaritella. Last updated 6 years ago.
6.8 match 1.00 score
bioc
IsoBayes:IsoBayes: Single Isoform protein inference Method via Bayesian Analyses
IsoBayes is a Bayesian method to perform inference on single protein isoforms. Our approach infers the presence/absence of protein isoforms, and also estimates their abundance; additionally, it provides a measure of the uncertainty of these estimates, via: i) the posterior probability that a protein isoform is present in the sample; ii) a posterior credible interval of its abundance. IsoBayes takes liquid chromatography mass spectrometry (MS) data as input, and can work with both PSM counts and intensities. When available, transcript isoform abundances (i.e., TPMs) are also incorporated: TPMs are used to formulate an informative prior for the respective protein isoform relative abundance. We further identify isoforms where the relative abundances of proteins and transcripts differ significantly. We use a two-layer latent variable approach to model two sources of uncertainty typical of MS data: i) peptides may be erroneously detected (even when absent); ii) many peptides are compatible with multiple protein isoforms. In the first layer, we sample the presence/absence of each peptide based on its estimated probability of being mistakenly detected, also known as PEP (i.e., posterior error probability). In the second layer, for peptides that were estimated as being present, we allocate their abundance across the protein isoforms they map to. These two steps allow us to recover the presence and abundance of each protein isoform.
Maintained by Simone Tiberi. Last updated 5 months ago.
statisticalmethodbayesianproteomicsmassspectrometryalternativesplicingsequencingrnaseqgeneexpressiongeneticsvisualizationsoftwarecpp
1.3 match 7 stars 5.39 score 10 scripts
afukushima
TFactSR:Enrichment Approach to Predict Which Transcription Factors are Regulated
R implementation of 'TFactS' to predict which transcription factors (TFs) are regulated in a biological condition, based on lists of differentially expressed genes (DEGs) obtained from transcriptome experiments. This package is based on the 'TFactS' concept by Essaghir et al. (2010) <doi:10.1093/nar/gkq149> and expands it. It allows users to perform a 'TFactS'-like enrichment approach. The package can import and use the original catalogue file from 'TFactS' as well as user-defined catalogues of interest that are not supported by 'TFactS' (e.g., Arabidopsis).
Maintained by Atsushi Fukushima. Last updated 2 years ago.
networksoftwaredifferentialexpressiongenetargetgeneexpressionmicroarrayrnaseqtranscriptionnetworkenrichment
1.8 match 3.70 score 3 scripts
bioc
cycle:Significance of periodic expression pattern in time-series data
Package for assessing the statistical significance of periodic expression based on Fourier analysis and comparison with data generated by different background models
Maintained by Matthias Futschik. Last updated 5 months ago.
1.7 match 3.72 score 13 scripts
cran
fdrtool:Estimation of (Local) False Discovery Rates and Higher Criticism
Estimates both tail area-based false discovery rates (Fdr) as well as local false discovery rates (fdr) for a variety of null models (p-values, z-scores, correlation coefficients, t-scores). The proportion of null values and the parameters of the null distribution are adaptively estimated from the data. In addition, the package contains functions for non-parametric density estimation (Grenander estimator), for monotone regression (isotonic regression and antitonic regression with weights), for computing the greatest convex minorant (GCM) and the least concave majorant (LCM), for the half-normal and correlation distributions, and for computing empirical higher criticism (HC) scores and the corresponding decision threshold.
Maintained by Korbinian Strimmer. Last updated 7 months ago.
0.8 match 3 stars 8.24 score 844 scripts 118 dependents
ccicb
CRUX:Easily explore patterns of somatic variation in cancer using 'CRUX'
Shiny app for exploring somatic variation in cancer. Powered by maftools.
Maintained by Sam El-Kamand. Last updated 1 year ago.
3.2 match 2 stars 2.00 score 5 scripts
bioc
GeneSelectMMD:Gene selection based on the marginal distributions of gene profiles that characterized by a mixture of three-component multivariate distributions
Gene selection based on a mixture of marginal distributions.
Maintained by Weiliang Qiu. Last updated 5 months ago.
1.7 match 3.78 score 1 scripts 1 dependents
stephens999
ashr:Methods for Adaptive Shrinkage, using Empirical Bayes
The R package 'ashr' implements an Empirical Bayes approach for large-scale hypothesis testing and false discovery rate (FDR) estimation based on the methods proposed in M. Stephens, 2016, "False discovery rates: a new deal", <DOI:10.1093/biostatistics/kxw041>. These methods can be applied whenever two sets of summary statistics---estimated effects and standard errors---are available, just as 'qvalue' can be applied to previously computed p-values. Two main interfaces are provided: ash(), which is more user-friendly; and ash.workhorse(), which has more options and is geared toward advanced users. Both ash() and ash.workhorse() also provide a flexible modeling interface that can accommodate a variety of likelihoods (e.g., normal, Poisson) and mixture priors (e.g., uniform, normal).
Maintained by Peter Carbonetto. Last updated 10 months ago.
0.5 match 82 stars 12.10 score 780 scripts 15 dependents
scottpanhan
newIMVC:A Robust Integrated Mean Variance Correlation
Measure the dependence structure between two random variables with IMVC and extend IMVC to hypothesis test, feature screening and FDR control.
Maintained by Han Pan. Last updated 1 year ago.
2.2 match 2.70 score
ranbi1990
ssizeRNA:Sample Size Calculation for RNA-Seq Experimental Design
We propose a procedure for sample size calculation while controlling false discovery rate for RNA-seq experimental design. Our procedure depends on the Voom method proposed for RNA-seq data analysis by Law et al. (2014) <DOI:10.1186/gb-2014-15-2-r29> and the sample size calculation method proposed for microarray experiments by Liu and Hwang (2007) <DOI:10.1093/bioinformatics/btl664>. We develop a set of functions that calculates appropriate sample sizes for two-sample t-test for RNA-seq experiments with fixed or varied set of parameters. The outputs also contain a plot of power versus sample size, a table of power at different sample sizes, and a table of critical test values at different sample sizes. To install this package, please use 'source("http://bioconductor.org/biocLite.R"); biocLite("ssizeRNA")'. For R version 3.5 or greater, please use 'if(!requireNamespace("BiocManager", quietly = TRUE)){install.packages("BiocManager")}; BiocManager::install("ssizeRNA")'.
Maintained by Ran Bi. Last updated 6 years ago.
geneexpressiondifferentialexpressionexperimentaldesignsequencingrnaseqdnaseqmicroarray
1.7 match 1 stars 3.53 score 28 scripts 1 dependents
cran
CEDA:CRISPR Screen and Gene Expression Differential Analysis
Provides analytical methods for analyzing CRISPR screen data at different levels of gene expression. Multi-component normal mixture models and EM algorithms are used for modeling.
Maintained by Lianbo Yu. Last updated 1 year ago.
1.7 match 3.40 score 2 scripts
cran
dSTEM:Multiple Testing of Local Extrema for Detection of Change Points
Simultaneously detect the number and locations of change points in piecewise linear models under stationary Gaussian noise, allowing autocorrelated random noise. The core idea is to transform the problem of detecting change points into the detection of local extrema (local maxima and local minima) through kernel smoothing and differentiation of the data sequence; see Cheng et al. (2020) <doi:10.1214/20-EJS1751>. A fast, computationally inexpensive algorithm called 'dSTEM' is introduced to detect change points based on the 'STEM' algorithm in D. Cheng and A. Schwartzman (2017) <doi:10.1214/16-AOS1458>.
Maintained by Zhibing He. Last updated 2 years ago.
5.0 match 1.00 score
bioc
ROSeq:Modeling expression ranks for noise-tolerant differential expression analysis of scRNA-Seq data
ROSeq - A rank-based approach to modeling gene expression with a filtered and normalized read count matrix. ROSeq takes a filtered and normalized read matrix and cell annotation/condition as input and determines the differentially expressed genes between the contrasting groups of single cells. One of the input parameters is the number of cores to be used.
Maintained by Krishan Gupta. Last updated 5 months ago.
geneexpressiondifferentialexpressionsinglecellcount-datagene-expressiongene-expression-profilesnormalizationpopulationsranktmmtungtung-datasettutorialvignette
1.1 match 2 stars 4.34 score 11 scripts
bioc
randRotation:Random Rotation Methods for High Dimensional Data with Batch Structure
A collection of methods for performing random rotations on high-dimensional, normally distributed data (e.g. microarray or RNA-seq data) with batch structure. The random rotation approach allows exact testing of dependent test statistics with linear models following arbitrary batch effect correction methods.
Maintained by Peter Hettegger. Last updated 5 months ago.
softwaresequencingbatcheffectbiomedicalinformaticsrnaseqpreprocessingmicroarraydifferentialexpressiongeneexpressiongeneticsmicrornaarraynormalizationstatisticalmethod
1.3 match 3.60 score 3 scripts
cran
KnockoffHybrid:Hybrid Analysis of Population and Trio Data with Knockoff Statistics for FDR Control
Identification of putative causal variants in genome-wide association studies using hybrid analysis of both the trio and population designs. The package implements the method in the paper: Yang, Y., Wang, Q., Wang, C., Buxbaum, J., & Ionita-Laza, I. (2024). KnockoffHybrid: A knockoff framework for hybrid analysis of trio and population designs in genome-wide association studies. The American Journal of Human Genetics, in press.
Maintained by Yi Yang. Last updated 9 months ago.
2.8 match 1.70 score
bioc
multtest:Resampling-based multiple hypothesis testing
Non-parametric bootstrap and permutation resampling-based multiple testing procedures (including empirical Bayes methods) for controlling the family-wise error rate (FWER), generalized family-wise error rate (gFWER), tail probability of the proportion of false positives (TPPFP), and false discovery rate (FDR). Several choices of bootstrap-based null distribution are implemented (centered, centered and scaled, quantile-transformed). Single-step and step-wise methods are available. Tests based on a variety of t- and F-statistics (including t-statistics based on regression parameters from linear and survival models as well as those based on correlation parameters) are included. When probing hypotheses with t-statistics, users may also select a potentially faster null distribution which is multivariate normal with mean zero and variance covariance matrix derived from the vector influence function. Results are reported in terms of adjusted p-values, confidence regions and test statistic cutoffs. The procedures are directly applicable to identifying differentially expressed genes in DNA microarray experiments.
Maintained by Katherine S. Pollard. Last updated 5 months ago.
microarraydifferentialexpressionmultiplecomparison
0.5 match 9.34 score 932 scripts 136 dependents
jacky11
cp4p:Calibration Plot for Proteomics
Functions to check whether a vector of p-values respects the assumptions of FDR (false discovery rate) control procedures and to compute adjusted p-values.
Maintained by Quentin Giai Gianetto. Last updated 6 years ago.
2.3 match 2.03 score 18 scripts 1 dependents
jbp7
TEAM:Multiple Hypothesis Testing on an Aggregation Tree Method
An implementation of the TEAM algorithm to identify local differences between two (e.g. case and control) independent, univariate distributions, as described in J Pura, C Chan, and J Xie (2019) <arXiv:1906.07757>. The algorithm is based on embedding a multiple-testing procedure on a hierarchical structure to identify high-resolution differences between two distributions. The hierarchical structure is designed to identify strong, short-range differences at lower layers and weaker, but long-range differences at increasing layers. TEAM yields consistent layer-specific and overall false discovery rate control.
Maintained by John Pura. Last updated 6 years ago.
2.3 match 2.00 score
cran
NewmanOmics:Extending the Newman Studentized Range Statistic to Transcriptomics
Extends the classical Newman studentized range statistic in various ways that can be applied to genome-scale transcriptomic or other expression data.
Maintained by Kevin R. Coombes. Last updated 27 days ago.
1.8 match 2.38 score 12 scripts
cran
SiFINeT:Single Cell Feature Identification with Network Topology
Cluster-independent method based on topology structure of gene co-expression network for identifying feature gene sets, extracting cellular subpopulations, and elucidating intrinsic relationships among these subpopulations. Without prior cell clustering, SifiNet circumvents potential inaccuracies in clustering that may influence subsequent analyses. This method is introduced in Qi Gao, Zhicheng Ji, Liuyang Wang, Kouros Owzar, Qi-Jing Li, Cliburn Chan, Jichun Xie "SifiNet: a robust and accurate method to identify feature gene sets and annotate cells" (2024) <doi:10.1093/nar/gkae307>.
Maintained by Qi Gao. Last updated 2 months ago.
1.5 match 2.70 score
bioc
siggenes:Multiple Testing using SAM and Efron's Empirical Bayes Approaches
Identification of differentially expressed genes and estimation of the False Discovery Rate (FDR) using both the Significance Analysis of Microarrays (SAM) and the Empirical Bayes Analyses of Microarrays (EBAM).
Maintained by Holger Schwender. Last updated 5 months ago.
multiplecomparisonmicroarraygeneexpressionsnpexonarraydifferentialexpression
0.5 match 7.86 score 74 scripts 33 dependents
t-grimes
dnapath:Differential Network Analysis using Gene Pathways
Integrates pathway information into the differential network analysis of two gene expression datasets as described in Grimes, Potter, and Datta (2019) <doi:10.1038/s41598-019-41918-3>. Provides summary functions to break down the results at the pathway, gene, or individual connection level. The differential networks for each pathway of interest can be plotted, and the visualization will highlight any differentially expressed genes and all of the gene-gene associations that are significantly differentially connected.
Maintained by Tyler Grimes. Last updated 24 days ago.
1.7 match 2.30 score 5 scripts
ajbass
sffdr:Surrogate Functional False Discovery Rates for Genome-Wide Association Studies
Pleiotropy-informed significance analysis of genome-wide association studies with surrogate functional false discovery rates (sfFDR). The sfFDR framework adapts the fFDR to leverage informative data from multiple sets of GWAS summary statistics to increase power in a study while accounting for linkage disequilibrium. sfFDR provides estimates of key FDR quantities in a significance analysis, such as the functional local FDR and q-value, and uses these estimates to derive a functional p-value for type I error rate control and a functional local Bayes factor for post-GWAS analyses (e.g., fine mapping and colocalization).
Maintained by Andrew Bass. Last updated 1 months ago.
0.8 match 4 stars 5.00 score 3 scripts
bioc
TargetDecoy:Diagnostic Plots to Evaluate the Target Decoy Approach
A first step in the analysis of mass spectrometry (MS) based proteomics data is to identify peptides and proteins. To do so, the huge number of experimental mass spectra typically have to be assigned to theoretical peptides derived from a sequence database. Search engines are used for this purpose. These tools compare each of the observed spectra to all candidate theoretical spectra derived from the sequence database and calculate a score for each comparison. The observed spectrum is then assigned to the theoretical peptide with the best score, which is also referred to as the peptide-to-spectrum match (PSM). It is crucial for the downstream analysis to evaluate the quality of these matches. Therefore, False Discovery Rate (FDR) control is used to return a reliable list of PSMs. The FDR, however, requires a good characterisation of the score distribution of PSMs that are matched to the wrong peptide (bad target hits). In proteomics, the target decoy approach (TDA) is typically used for this purpose. The TDA method matches the spectra to a database of real (target) and nonsense (decoy) peptides. A popular approach to generate these decoys is to reverse the target database. Hence, all the PSMs that match to a decoy are known to be bad hits, and the distribution of their scores is used to estimate the distribution of the bad-scoring target PSMs. A crucial assumption of the TDA is that the decoy PSM hits have properties similar to those of bad target hits, so that the decoy PSM scores are a good simulation of the bad target PSM scores. Users, however, typically do not evaluate these assumptions. To this end, we developed TargetDecoy to generate diagnostic plots to evaluate the quality of the target decoy method.
Maintained by Elke Debrie. Last updated 5 months ago.
massspectrometryproteomicsqualitycontrolsoftwarevisualizationbioconductormass-spectrometry
0.8 match 1 stars 4.60 score 9 scripts
christianak
ManyTests:Multiple Testing Procedures of Cox (2011) and Wong and Cox (2007)
Performs the multiple testing procedures of Cox (2011) <doi:10.5170/CERN-2011-006> and Wong and Cox (2007) <doi:10.1080/02664760701240014>.
Maintained by Christiana Kartsonaki. Last updated 8 years ago.
3.3 match 1.00 score 5 scripts
nchenderson
rvalues:R-Values for Ranking in High-Dimensional Settings
A collection of functions for computing "r-values" from various kinds of user input such as MCMC output or a list of effect size estimates and associated standard errors. Given a large collection of measurement units, the r-value, r, of a particular unit is a reported percentile that may be interpreted as the smallest percentile at which the unit should be placed in the top r-fraction of units.
Maintained by Nicholas Henderson. Last updated 4 years ago.
2.3 match 1.30 score 20 scripts
kaijunwang19
LPRelevance:Relevance-Integrated Statistical Inference Engine
Provides methods to perform customized inference at the individual level by taking contextual covariates into account. Three main functions are provided in this package: (i) LASER(): generates specially designed artificial relevant samples for a given case; (ii) g2l.proc(): computes customized fdr(z|x); and (iii) rEB.proc(): performs empirical Bayes inference based on LASERs. The details can be found in Mukhopadhyay, S., and Wang, K. (2021) <arXiv:2004.09588>.
Maintained by Kaijun Wang. Last updated 3 years ago.
2.8 match 1.00 score
msesia
knockoff:The Knockoff Filter for Controlled Variable Selection
The knockoff filter is a general procedure for controlling the false discovery rate (FDR) when performing variable selection. For more information, see the website below and the accompanying paper: Candes et al., "Panning for gold: model-X knockoffs for high-dimensional controlled variable selection", J. R. Statist. Soc. B (2018) 80, 3, pp. 551-577.
Maintained by Matteo Sesia. Last updated 3 years ago.
0.5 match 2 stars 5.35 score 248 scripts 5 dependents
bioc
cn.mops:cn.mops - Mixture of Poissons for CNV detection in NGS data
cn.mops (Copy Number estimation by a Mixture Of PoissonS) is a data processing pipeline for copy number variations and aberrations (CNVs and CNAs) from next generation sequencing (NGS) data. The package supplies functions to convert BAM files into read count matrices or genomic ranges objects, which are the input objects for cn.mops. cn.mops models the depths of coverage across samples at each genomic position. Therefore, it does not suffer from read count biases along chromosomes. Using a Bayesian approach, cn.mops decomposes read variations across samples into integer copy numbers and noise by its mixture components and Poisson distributions, respectively. cn.mops guarantees a low FDR because wrong detections are indicated by high noise and filtered out. cn.mops is very fast and written in C++.
Maintained by Gundula Povysil. Last updated 3 months ago.
sequencingcopynumbervariationhomo_sapienscellbiologyhapmapgeneticscpp
0.5 match 5.35 score 94 scripts 4 dependents
bioc
RnaSeqSampleSize:RnaSeqSampleSize
The RnaSeqSampleSize package provides a sample size calculation method based on the negative binomial model and the exact test for assessing differential expression analysis of RNA-seq data. It controls FDR for multiple testing and utilizes the average read count and dispersion distributions from real data to estimate a more reliable sample size. It is also equipped with several unique features, including estimation for genes or pathways of interest, power curve visualization, and parameter optimization.
Maintained by Shilin Zhao Developer. Last updated 5 months ago.
immunooncologyexperimentaldesignsequencingrnaseqgeneexpressiondifferentialexpressioncpp
0.5 match 5.30 score 20 scripts
bioc
scCB2:CB2 improves power of cell detection in droplet-based single-cell RNA sequencing data
scCB2 is an R package implementing CB2 for distinguishing real cells from empty droplets in droplet-based single-cell RNA-seq experiments (especially for 10x Chromium). It is based on clustering similar barcodes and calculating a Monte Carlo p-value for each cluster to test against the background distribution. This cluster-level test outperforms single-barcode-level tests in dealing with low-count barcodes and homogeneous sequencing libraries, while keeping FDR well controlled.
Maintained by Zijian Ni. Last updated 5 months ago.
dataimportrnaseqsinglecellsequencinggeneexpressiontranscriptomicspreprocessingclustering
0.5 match 10 stars 5.30 score 5 scripts
cran
MiDA:Microarray Data Analysis
Set of functions designed to simplify transcriptome analysis and identification of marker molecules using microarray data. The package includes a set of functions that support a full analysis pipeline, including data normalization, summarisation, binary classification, FDR (False Discovery Rate) multiple comparison, and the definition of potential biological markers.
Maintained by Elena Filatova. Last updated 6 years ago.
2.3 match 1.08 score 12 scripts
bioc
magpie:MeRIP-Seq data Analysis for Genomic Power Investigation and Evaluation
This package aims to perform power analysis for the MeRIP-seq study. It calculates FDR, FDC, power, and precision under various study design parameters, including but not limited to sample size, sequencing depth, and testing method. It can also output results into .xlsx files or produce corresponding figures of choice.
Maintained by Daoyu Duan. Last updated 5 months ago.
epitranscriptomicsdifferentialmethylationsequencingrnaseqsoftware
0.5 match 4.60 score 40 scripts
cran
GroupTest:Multiple Testing Procedure for Grouped Hypotheses
Contains functions for a two-stage multiple testing procedure for grouped hypotheses, aiming at controlling both the total posterior false discovery rate and the within-group false discovery rate.
Maintained by Zhigen Zhao. Last updated 9 years ago.
1.8 match 1.30 score
olangsrud
ffmanova:Fifty-Fifty MANOVA
General linear modeling with multiple responses (MANCOVA). An overall p-value for each model term is calculated by the 50-50 MANOVA method by Langsrud (2002) <doi:10.1111/1467-9884.00320>, which handles collinear responses. Rotation testing, described by Langsrud (2005) <doi:10.1007/s11222-005-4789-5>, is used to compute adjusted single response p-values according to familywise error rates and false discovery rates (FDR). The approach to FDR is described in the appendix of Moen et al. (2005) <doi:10.1128/AEM.71.4.2086-2094.2005>. Unbalanced designs are handled by Type II sums of squares as argued in Langsrud (2003) <doi:10.1023/A:1023260610025>. Furthermore, the Type II philosophy is extended to continuous design variables as described in Langsrud et al. (2007) <doi:10.1080/02664760701594246>. This means that the method is invariant to scale changes and that common pitfalls are avoided.
Maintained by Øyvind Langsrud. Last updated 1 year ago.
0.8 match 2 stars 3.00 score 7 scripts
cran
simpleFDR:Simple False Discovery Rate Calculation
Using the adjustment method from Benjamini & Hochberg (1995) <doi:10.1111/j.2517-6161.1995.tb02031.x>, this package determines which variables are significant under repeated testing, given a dataframe of p-values and a user-defined "q" threshold. It then returns the original dataframe along with a significance column where an asterisk denotes a significant p-value after FDR calculation, and NA denotes all other p-values. This package uses the Benjamini & Hochberg method specifically as described in Lee, S., & Lee, D. K. (2018) <doi:10.4097/kja.d.18.00242>.
Maintained by Stephen Wisser. Last updated 3 years ago.
2.3 match 1.00 scorebioc
mbQTL:mbQTL: A package for SNP-Taxa mGWAS analysis
mbQTL is a statistical R package for simultaneous relationship, correlation, and regression studies of 16S rRNA/16S rDNA (microbial) and variant, SNP, SNV (host) data. We apply linear, logistic and correlation-based statistics to identify the relationships between taxa, genus, species and variant, SNP, SNV in the infected host. We produce various statistical significance measures such as P values, FDR, BC and probability estimation to show the significance of these relationships. We also provide various visualization functions to help clarify the results of these analyses. The package is compatible with dataframe, MRexperiment and text formats.
Maintained by Mercedeh Movassagh. Last updated 5 months ago.
snpmicrobiomewholegenomemetagenomicsstatisticalmethodregression
0.5 match 1 stars 4.00 score 3 scriptsbioc
MBttest:Multiple Beta t-Tests
The MBttest method was developed from the beta t-test method of Baggerly et al. (2003). Compared to baySeq (Hardcastle and Kelly 2010), DESeq (Anders and Huber 2010), the exact test (Robinson and Smyth 2007, 2008) and the GLM of McCarthy et al. (2012), MBttest is highly efficient, that is, it has high power, conservative FDR estimation and high stability. MBttest is suitable for transcriptomic data, tag data, and SAGE data (count data) from small samples or a few replicate libraries. It can be used to identify genes, mRNA isoforms or tags differentially expressed between two conditions.
Maintained by Yuan-De Tan. Last updated 5 months ago.
sequencingdifferentialexpressionmultiplecomparisonsagegeneexpressiontranscriptionalternativesplicingcoveragedifferentialsplicing
0.5 match 4.00 score 3 scriptsardenewan
APCanalysis:Analysis of Unreplicated Orthogonal Experiments using All Possible Comparisons
Analysis of data from unreplicated orthogonal experiments such as 2-level factorial and fractional factorial designs and Plackett-Burman designs using the all possible comparisons (APC) methodology developed by Miller (2005) <doi:10.1198/004017004000000608>.
Maintained by Arden Miller. Last updated 7 years ago.
2.0 match 1.00 score 10 scriptsabichat
evabic:Evaluation of Binary Classifiers
Evaluates the performance of binary classifiers. Computes confusion measures (TP, TN, FP, FN), derived measures (TPR, FDR, accuracy, F1, DOR, ..), and area under the curve. Outputs are well suited for nested dataframes.
Maintained by Antoine Bichat. Last updated 3 years ago.
classifiermeasurespredictorsroc-curvestatistics
0.5 match 6 stars 3.62 score 14 scriptsbioc
cypress:Cell-Type-Specific Power Assessment
CYPRESS is a cell-type-specific power tool. This package aims to perform power analysis for cell-type-specific data. It calculates FDR, FDC, and power under various study design parameters, including but not limited to sample size and effect size. It takes as input a SummarizedExperiment (SE) object with observed mixture data (feature by sample matrix) and the cell-type mixture proportions (sample by cell-type matrix). It can solve the cell-type mixture proportions from the reference-free panel from TOAST and conduct tests to identify cell-type-specific differential expression (csDE) genes.
Maintained by Shilin Yu. Last updated 5 months ago.
softwaregeneexpressiondataimportrnaseqsequencing
0.5 match 1 stars 3.70 score 2 scriptsyushengding
DNLC:Differential Network Local Consistency Analysis
Using Local Moran's I for detection of differential network local consistency.
Maintained by Yusheng Ding. Last updated 8 years ago.
1.8 match 1.00 scorerahmasarina
NMTox:Dose-Response Relationship Analysis of Nanomaterial Toxicity
Perform an exploration and a preliminary analysis on the dose-response relationship of nanomaterial toxicity. Several functions are provided for data exploration, including functions for creating a subset of a dataset, frequency tables and plots. Inference for order restricted dose-response data is performed by testing the significance of monotonic dose-response relationship, using Williams, Marcus, M, Modified M and Likelihood ratio tests. Several methods of multiplicity adjustment are also provided. Description of the methods can be found in <https://github.com/rahmasarina/dose-response-analysis/blob/main/Methodology.pdf>.
Maintained by Rahmasari Nur Azizah. Last updated 3 years ago.
1.7 match 1.00 scorecran
JUMP:Replicability Analysis of High-Throughput Experiments
Implementing a computationally scalable false discovery rate control procedure for replicability analysis based on maximum of p-values. Please cite the manuscript corresponding to this package [Lyu, P. et al., (2023), <https://www.biorxiv.org/content/10.1101/2023.02.13.528417v2>].
Maintained by Yan Li. Last updated 2 years ago.
1.7 match 1.00 scorebioc
Mulcom:Calculates Mulcom test
Identification of differentially expressed genes and false discovery rate (FDR) calculation by Multiple Comparison test.
Maintained by Claudio Isella. Last updated 5 months ago.
statisticalmethodmultiplecomparisonmicroarraydifferentialexpressiongeneexpressioncpp
0.5 match 3.00 scorefreejstone
groupwalk:Implement the Group Walk Algorithm
A procedure that uses target-decoy competition (or knockoffs) to reject multiple hypotheses in the presence of group structure. The procedure controls the false discovery rate (FDR) at a user-specified threshold.
Maintained by Jack Freestone. Last updated 3 years ago.
0.5 match 2.70 score 1 scriptsvallejosgroup
bayefdr:Bayesian Estimation and Optimisation of Expected False Discovery Rate
Implements the Bayesian FDR control described by Newton et al. (2004), <doi:10.1093/biostatistics/5.2.155>. Allows optimisation and visualisation of expected error rates based on tail posterior probability tests. Based on code written by Catalina Vallejos for BASiCS, see Beyond comparisons of means: understanding changes in gene expression at the single-cell level Vallejos et al. (2016) <doi:10.1186/s13059-016-0930-3>.
Maintained by Alan OCallaghan. Last updated 3 years ago.
0.5 match 2.70 score 1 scriptsolechnwin
DIME:Differential Identification using Mixture Ensemble
A robust method for identifying differential binding sites in ChIP-seq (Chromatin Immunoprecipitation Sequencing) data when comparing two samples. It considers an ensemble of finite mixture models combined with a local false discovery rate (fdr), allowing for flexible modeling of data. The Differential Identification using Mixture Ensemble (DIME) method is described in: Taslim et al., (2011) <doi:10.1093/bioinformatics/btr165>.
Maintained by Cenny Taslim. Last updated 3 years ago.
0.5 match 2.63 score 43 scriptscran
DiscreteQvalue:Improved q-Values for Discrete Uniform and Homogeneous Tests
We consider a multiple testing procedure used in many modern applications, the q-value method proposed by Storey and Tibshirani (2003), <doi:10.1073/pnas.1530509100>. The q-value method is based on the false discovery rate (FDR), hence versions of the q-value method can be defined depending on which estimator of the proportion of true null hypotheses, pi0, is plugged into the FDR estimator. We implement the q-value method based on two classical pi0 estimators, and furthermore, we propose and implement three versions of the q-value method for homogeneous discrete uniform P-values based on pi0 estimators which take into account the discrete distribution of the P-values.
Maintained by Marta Cousido Rocha. Last updated 5 years ago.
0.8 match 1.00 scorecran
GhostKnockoff:The Knockoff Inference Using Summary Statistics
Functions for multiple knockoff inference using summary statistics, e.g. Z-scores. The knockoff inference is a general procedure for controlling the false discovery rate (FDR) when performing variable selection. This package provides a procedure which performs knockoff inference without ever constructing individual knockoffs (GhostKnockoff). It additionally supports multiple knockoff inference for improved stability and reproducibility. Moreover, it supports meta-analysis of multiple overlapping studies.
Maintained by Zihuai He. Last updated 3 years ago.
0.5 match 1 stars 1.00 scorecran
DNetFinder:Estimating Differential Networks under Semiparametric Gaussian Graphical Models
Provides a modified hierarchical test (Liu (2017) <doi:10.1214/17-AOS1539>) for detecting the structural difference between two Semiparametric Gaussian graphical models. The multiple testing procedure asymptotically controls the false discovery rate (FDR) at a user-specified level. To construct the test statistic, a truncated estimator is used to approximate the transformation functions and two R functions including lassoGGM() and lassoNPN() are provided to compute the lasso estimates of the regression coefficients.
Maintained by Qingyang Zhang. Last updated 2 years ago.
0.5 match 1 stars 1.00 score 8 scriptsmarsdu1989
easyDes:An Easy Way to Descriptive Analysis
Descriptive analysis is essential for publishing medical articles. This package provides an easy way to conduct the descriptive analysis. 1. Both numeric and factor variables can be handled. For numeric variables, a normality test is applied to choose between the parametric and nonparametric test. 2. Two or more groups can be handled. For more than two groups, a post hoc test is applied: 'Tukey' for the numeric variables and 'FDR' for the factor variables. 3. T test, ANOVA or Fisher test can be forced to apply. 4. Mean and standard deviation can be forced to display.
Maintained by Zhicheng Du. Last updated 3 years ago.
0.5 match 1.00 score 1 scripts