R-universe search: fdr

Currently serving 26320 packages, 22487 articles, and 64231 datasets by 1264 organizations, 13665 maintainers and 22074 contributors.

Showing 177 of 177 results

bioc

onlineFDR:Online error rate control

This package allows users to control the false discovery rate (FDR) or familywise error rate (FWER) for online multiple hypothesis testing, where hypotheses arrive in a stream. In this framework, a null hypothesis is rejected based on the evidence against it and on the previous rejection decisions.

Maintained by David S. Robertson. Last updated 5 months ago.

multiplecomparison software statisticalmethod error-rate-control fdr fwer hypothesis-testing cpp

37.5 match 14 stars 6.88 score 26 scripts
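
The rejection rule described in the onlineFDR entry, where each decision depends on earlier rejections, can be illustrated with a small sketch. The Python code below is not the package's API (onlineFDR is an R package); it is a LOND-style rule under stated assumptions: the function name `lond_decisions` is hypothetical, and the weight sequence gamma_i = 6 / (pi^2 * i^2) is chosen here only because it sums to 1, not because it is a package default.

```python
import math


def lond_decisions(p_values, alpha=0.05):
    """LOND-style online testing (illustrative sketch, hypothetical name).

    Hypothesis i is tested at level alpha * gamma_i * (D + 1), where D is
    the number of rejections made before hypothesis i arrived.
    """
    decisions = []
    discoveries = 0  # D: rejections so far
    for i, p in enumerate(p_values, start=1):
        # Assumed weights: 6 / (pi^2 * i^2) sums to 1 over i = 1, 2, ...
        gamma_i = 6.0 / (math.pi ** 2 * i ** 2)
        level = alpha * gamma_i * (discoveries + 1)
        reject = p <= level
        decisions.append(reject)
        discoveries += int(reject)
    return decisions


# Same p-value at position 3, with and without an earlier rejection.
print(lond_decisions([0.001, 0.5, 0.004]))
print(lond_decisions([0.5, 0.5, 0.004]))
```

In this sketch the p-value 0.004 at position 3 is rejected only in the first stream, because the earlier discovery raises the level at which later hypotheses are tested.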

bioc

fdrame:FDR adjustments of Microarray Experiments (FDR-AME)

This package contains two main functions. The first is fdr.ma, which takes a normalized expression data array and an experimental design and computes adjusted p-values. It returns the FDR-adjusted p-values and plots, according to the methods described in (Reiner, Yekutieli and Benjamini 2002). The second is fdr.gui(), which creates a simple graphical user interface to access fdr.ma.

Maintained by Effi Kenigsberg. Last updated 5 months ago.

microarray differentialexpression multiplecomparison

34.4 match 3.30 score

bioc

SWATH2stats:Transform and Filter SWATH Data for Statistical Packages

This package is intended to transform SWATH data from the OpenSWATH software into a format readable by other statistics packages while performing filtering, annotation and FDR estimation.

Maintained by Peter Blattmann. Last updated 5 months ago.

proteomics annotation experimentaldesign preprocessing massspectrometry immunooncology

17.0 match 1 star 6.30 score 22 scripts

bioc

OCplus:Operating characteristics plus sample size and local fdr for microarray experiments

This package characterizes the operating characteristics of a microarray experiment, i.e. the trade-off between false discovery rate and the power to detect truly regulated genes. The package includes tools both for planned experiments (for sample size assessment) and for already collected data (identification of differentially expressed genes).

Maintained by Alexander Ploner. Last updated 5 months ago.

microarray differentialexpression multiplecomparison

25.3 match 4.08 score 2 scripts

bioc

clusterProfiler:A universal enrichment tool for interpreting omics data

This package supports functional characteristics of both coding and non-coding genomics data for thousands of species with up-to-date gene annotation. It provides a universal interface for gene functional annotation from a variety of sources and thus can be applied in diverse scenarios. It provides a tidy interface to access, manipulate, and visualize enrichment results to help users achieve efficient data interpretation. Datasets obtained from multiple treatments and time points can be analyzed and compared in a single run, easily revealing functional consensus and differences among distinct conditions.

Maintained by Guangchuang Yu. Last updated 4 months ago.

annotation clustering genesetenrichment go kegg multiplecomparison pathways reactome visualization enrichment-analysis gsea

4.8 match 1.1k stars 17.03 score 11k scripts 48 dependents

moviedo5

fda.usc:Functional Data Analysis and Utilities for Statistical Computing

Routines for exploratory and descriptive analysis of functional data such as depth measurements, atypical curves detection, regression models, supervised classification, unsupervised classification and functional analysis of variance.

Maintained by Manuel Oviedo de la Fuente. Last updated 4 months ago.

functional-data-analysis fortran

7.4 match 12 stars 9.72 score 560 scripts 22 dependents

cran

ssize.fdr:Sample Size Calculations for Microarray Experiments

Functions that calculate appropriate sample sizes for one-sample t-tests, two-sample t-tests, and F-tests for microarray experiments based on desired power while controlling for false discovery rates. For all tests, the standard deviations (variances) among genes can be assumed fixed or random. This is also true for effect sizes among genes in one-sample and two sample experiments. Functions also output a chart of power versus sample size, a table of power at different sample sizes, and a table of critical test values at different sample sizes.

Maintained by Megan Orr. Last updated 3 years ago.

37.5 match 1 star 1.78 score 2 dependents

uscbiostats

cit:Causal Inference Test

A likelihood-based hypothesis testing approach is implemented for assessing causal mediation. Described in Millstein, Chen, and Breton (2016), <DOI:10.1093/bioinformatics/btw135>, it could be used to test for mediation of a known causal association between a DNA variant, the 'instrumental variable', and a clinical outcome or phenotype by gene expression or DNA methylation, the potential mediator. Another example would be testing mediation of the effect of a drug on a clinical outcome by the molecular target. The hypothesis test generates a p-value or permutation-based FDR value with confidence intervals to quantify uncertainty in the causal inference. The outcome can be represented by either a continuous or binary variable, the potential mediator is continuous, and the instrumental variable can be continuous or binary and is not limited to a single variable but may be a design matrix representing multiple variables.

Maintained by Joshua Millstein. Last updated 9 months ago.

gsl cpp

16.1 match 2 stars 3.81 score 32 scripts

yonghui-ni

FDRsamplesize2:Computing Power and Sample Size for the False Discovery Rate in Multiple Applications

Defines a collection of functions to compute average power and sample size for studies that use the false discovery rate as the final measure of statistical significance. A three-rectangle approximation method of a p-value histogram is proposed to derive a formula to compute the statistical power for analyses that involve the FDR. The methodology paper of this package is under review.

Maintained by Yonghui Ni. Last updated 1 year ago.

33.7 match 1.70 score

dsy109

mixtools:Tools for Analyzing Finite Mixture Models

Analyzes finite mixture models for various parametric and semiparametric settings. This includes mixtures of parametric distributions (normal, multivariate normal, multinomial, gamma), various Reliability Mixture Models (RMMs), mixtures-of-regressions settings (linear regression, logistic regression, Poisson regression, linear regression with changepoints, predictor-dependent mixing proportions, random effects regressions, hierarchical mixtures-of-experts), and tools for selecting the number of components (bootstrapping the likelihood ratio test statistic, mixturegrams, and model selection criteria). Bayesian estimation of mixtures-of-linear-regressions models is available as well as a novel data depth method for obtaining credible bands. This package is based upon work supported by the National Science Foundation under Grant No. SES-0518772 and the Chan Zuckerberg Initiative: Essential Open Source Software for Science (Grant No. 2020-255193).

Maintained by Derek Young. Last updated 9 months ago.

mixture-models mixture-of-experts semiparametric-regression

4.9 match 20 stars 11.34 score 1.4k scripts 56 dependents

bioc

categoryCompare:Meta-analysis of high-throughput experiments using feature annotations

Calculates significant annotations (categories) in each of two (or more) feature (i.e. gene) lists, determines the overlap between the annotations, and returns graphical and tabular data about the significant annotations and which combinations of feature lists the annotations were found to be significant. Interactive exploration is facilitated through the use of RCytoscape (heavily suggested).

Maintained by Robert M. Flight. Last updated 5 months ago.

annotation go multiplecomparison pathways geneexpression bioconductor

8.2 match 6 stars 6.68 score

millstei

fdrci:Permutation-Based FDR Point and Confidence Interval Estimation

FDR functions for permutation-based estimators, including pi0 as well as FDR confidence intervals. The confidence intervals account for dependencies between tests by the incorporation of an overdispersion parameter, which is estimated from the permuted data. Also included are options for an analog parametric approach.

Maintained by Joshua Millstein. Last updated 2 years ago.

19.1 match 2.78 score 12 scripts

myllym

GET:Global Envelopes

Implementation of global envelopes for a set of general d-dimensional vectors T in various applications. A 100(1-alpha)% global envelope is a band bounded by two vectors such that the probability that T falls outside this envelope in any of the d points is equal to alpha. Global means that the probability is controlled simultaneously for all the d elements of the vectors. The global envelopes can be used for graphical Monte Carlo and permutation tests where the test statistic is a multivariate vector or function (e.g. goodness-of-fit testing for point patterns and random sets, functional analysis of variance, functional general linear model, n-sample test of correspondence of distribution functions), for central regions of functional or multivariate data (e.g. outlier detection, functional boxplot) and for global confidence and prediction bands (e.g. confidence band in polynomial regression, Bayesian posterior prediction). See Myllymäki and Mrkvička (2024) <doi:10.18637/jss.v111.i03>, Myllymäki et al. (2017) <doi:10.1111/rssb.12172>, Mrkvička and Myllymäki (2023) <doi:10.1007/s11222-023-10275-7>, Mrkvička et al. (2016) <doi:10.1016/j.spasta.2016.04.005>, Mrkvička et al. (2017) <doi:10.1007/s11222-016-9683-9>, Mrkvička et al. (2020) <doi:10.14736/kyb-2020-3-0432>, Mrkvička et al. (2021) <doi:10.1007/s11009-019-09756-y>, Myllymäki et al. (2021) <doi:10.1016/j.spasta.2020.100436>, Mrkvička et al. (2022) <doi:10.1002/sim.9236>, Dai et al. (2022) <doi:10.5772/intechopen.100124>, Dvořák and Mrkvička (2022) <doi:10.1007/s00180-021-01134-y>, Mrkvička et al. (2023) <doi:10.48550/arXiv.2309.04746>, and Konstantinou et al. (2024) <doi:10.1007/s00180-024-01569-z>.

Maintained by Mari Myllymäki. Last updated 4 months ago.

5.6 match 11 stars 9.33 score 46 scripts 5 dependents

bioc

Mergeomics:Integrative network analysis of omics data

The Mergeomics pipeline serves as a flexible framework for integrating multidimensional omics-disease associations, functional genomics, canonical pathways and gene-gene interaction networks to generate mechanistic hypotheses. It includes two main parts, 1) Marker set enrichment analysis (MSEA); 2) Weighted Key Driver Analysis (wKDA).

Maintained by Zeyneb Kurt. Last updated 5 months ago.

software

11.9 match 4.30 score 8 scripts

bioc

OLIN:Optimized local intensity-dependent normalisation of two-color microarrays

Functions for normalisation of two-color microarrays by optimised local regression and for detection of artefacts in microarray data

Maintained by Matthias Futschik. Last updated 5 months ago.

microarray twochannel qualitycontrol preprocessing visualization

9.0 match 4.78 score 2 scripts 1 dependent

disohda

DiscreteFDR:FDR Based Multiple Testing Procedures with Adaptation for Discrete Tests

Implementations of the multiple testing procedures for discrete tests described in the paper Döhler, Durand and Roquain (2018) "New FDR bounds for discrete and heterogeneous tests" <doi:10.1214/18-EJS1441>. The main procedures of the paper (HSU and HSD), their adaptive counterparts (AHSU and AHSD), and the HBR variant are available and are coded to take as input the results of a test procedure from package 'DiscreteTests', or a set of observed p-values and their discrete support under their nulls. A shortcut function to obtain such p-values and supports is also provided, along with a wrapper for applying discrete procedures directly to data.

Maintained by Florian Junge. Last updated 8 days ago.

cpp

6.9 match 3 stars 6.02 score 16 scripts 2 dependents

murraymegan

FDRestimation:Estimate, Plot, and Summarize False Discovery Rates

The user can directly compute and display false discovery rates from inputted p-values or z-scores under a variety of assumptions. p.fdr() computes FDRs, adjusted p-values and decision reject vectors from inputted p-values or z-values. get.pi0() estimates the proportion of data that are truly null. plot.p.fdr() plots the FDRs, adjusted p-values, and the raw p-values points against their rejection threshold lines.

Maintained by Megan Murray. Last updated 3 years ago.

statistics

11.2 match 6 stars 3.65 score 15 scripts

bioc

PLPE:Local Pooled Error Test for Differential Expression with Paired High-throughput Data

This package performs tests for paired high-throughput data.

Maintained by Soo-heang Eo. Last updated 5 months ago.

proteomics microarray differentialexpression

12.1 match 3.30 score 7 scripts

allenzhuaz

FixSeqMTP:Fixed Sequence Multiple Testing Procedures

Several generalized / directional Fixed Sequence Multiple Testing Procedures (FSMTPs) are developed for testing a sequence of pre-ordered hypotheses while controlling the FWER, FDR and Directional Error (mdFWER). All three FWER controlling generalized FSMTPs are designed under arbitrary dependence, which allow any number of acceptances. Two FDR controlling generalized FSMTPs are respectively designed under arbitrary dependence and independence, which allow more but a given number of acceptances. Two mdFWER controlling directional FSMTPs are respectively designed under arbitrary dependence and independence, which can also make directional decisions based on the signs of the test statistics. The main functions for each proposed generalized / directional FSMTPs are designed to calculate adjusted p-values and critical values, respectively. For users' convenience, the functions also provide the output option for printing decision rules.

Maintained by Yalin Zhu. Last updated 6 years ago.

multiple-testing pre-order sequential-testing

11.7 match 3 stars 3.22 score 11 scripts

bioc

TPP2D:Detection of ligand-protein interactions from 2D thermal profiles (DLPTP)

Detection of ligand-protein interactions from 2D thermal profiles (DLPTP), Performs an FDR-controlled analysis of 2D-TPP experiments by functional analysis of dose-response curves across temperatures.

Maintained by Nils Kurzawa. Last updated 5 months ago.

software proteomics dataimport

8.9 match 4.20 score 16 scripts

bioc

csaw:ChIP-Seq Analysis with Windows

Detection of differentially bound regions in ChIP-seq data with sliding windows, with methods for normalization and proper FDR control.

Maintained by Aaron Lun. Last updated 2 months ago.

multiplecomparison chipseq normalization sequencing coverage genetics annotation differentialpeakcalling curl bzip2 xz-utils zlib cpp

4.4 match 8.32 score 498 scripts 7 dependents

bioc

qvalue:Q-value estimation for false discovery rate control

This package takes a list of p-values resulting from the simultaneous testing of many hypotheses and estimates their q-values and local FDR values. The q-value of a test measures the proportion of false positives incurred (called the false discovery rate) when that particular test is called significant. The local FDR measures the posterior probability the null hypothesis is true given the test's p-value. Various plots are automatically generated, allowing one to make sensible significance cut-offs. Several mathematical results have recently been shown on the conservative accuracy of the estimated q-values from this software. The software can be applied to problems in genomics, brain imaging, astrophysics, and data mining.

Maintained by John D. Storey. Last updated 5 months ago.

multiplecomparisons

2.5 match 114 stars 14.06 score 3.0k scripts 139 dependents
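
The q-value idea in the qvalue entry can be sketched in a few lines. The Python code below is an illustration only, not the package's implementation: it assumes a single fixed tuning value `lam` for estimating the proportion of true nulls (pi0), and the function name `storey_qvalues` and the default `lam=0.5` are assumptions made for this sketch. It expects a non-empty list of p-values.

```python
def storey_qvalues(p_values, lam=0.5):
    """Estimate pi0 and q-values from p-values (illustrative sketch).

    pi0 is estimated from the share of p-values above `lam`, which are
    assumed to come mostly from true null hypotheses.
    """
    m = len(p_values)
    n_above = sum(p > lam for p in p_values)
    pi0 = min(1.0, n_above / (m * (1.0 - lam)))

    # Indices of the p-values in increasing order.
    order = sorted(range(m), key=lambda i: p_values[i])

    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * p_values[i] / rank)
        q_values[i] = running_min
    return pi0, q_values


pi0, q = storey_qvalues([0.01, 0.02, 0.03, 0.8])
print(pi0, q)
```

With pi0 fixed at 1 the same loop yields Benjamini-Hochberg adjusted p-values, so the estimated pi0 is what makes the q-values less conservative in this sketch.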

bioc

treeclimbR:An algorithm to find optimal signal levels in a tree

The arrangement of hypotheses in a hierarchical structure appears in many research fields and often indicates different resolutions at which data can be viewed. This raises the question of which resolution level the signal should best be interpreted on. treeclimbR provides a flexible method to select optimal resolution levels (potentially different levels in different parts of the tree), rather than cutting the tree at an arbitrary level. treeclimbR uses a tuning parameter to generate candidate resolutions and from these selects the optimal one.

Maintained by Charlotte Soneson. Last updated 3 months ago.

statisticalmethod cellbasedassays

5.0 match 20 stars 7.00 score 45 scripts

protviz

prozor:Minimal Protein Set Explaining Peptide Spectrum Matches

Determine minimal protein set explaining peptide spectrum matches. Utility functions for creating fasta amino acid databases with decoys and contaminants. Peptide false discovery rate estimation for target decoy search results on psm, precursor, peptide and protein level. Computing dynamic swath window sizes based on MS1 or MS2 signal distributions.

Maintained by Witold Wolski. Last updated 4 months ago.

software massspectrometry proteomics experimenthubsoftware

7.9 match 6 stars 4.45 score 93 scripts
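
Target-decoy FDR estimation, as mentioned in the prozor entry, can be sketched as follows. This Python code is a simplified illustration and not prozor's implementation: the function name `target_decoy_fdr` is hypothetical, the estimator used is the plain ratio of decoy hits to target hits at or above each score (other variants exist), higher scores are assumed to be better, and ties between scores are not treated specially.

```python
def target_decoy_fdr(scores, is_decoy):
    """Estimate the FDR at each hit's score threshold (illustrative sketch).

    For each hit, the estimate is (#decoys) / (#targets) among all hits
    scoring at least as well, capped at 1.
    """
    # Process hits from best score to worst.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    fdr = [0.0] * len(scores)
    decoys = 0
    targets = 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        # max(targets, 1) avoids division by zero before any target is seen.
        fdr[i] = min(1.0, decoys / max(targets, 1))
    return fdr


scores = [10, 9, 8, 7, 6]
is_decoy = [False, False, True, False, True]
print(target_decoy_fdr(scores, is_decoy))
```

A common follow-up step, omitted here, is to make the estimates monotone in the score so that a stricter threshold never reports a higher FDR.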

snoweye

MixfMRI:Mixture fMRI Clustering Analysis

Utilizing model-based clustering (unsupervised) for functional magnetic resonance imaging (fMRI) data. The developed methods (Chen and Maitra (2023) <doi:10.1002/hbm.26425>) include 2D and 3D clustering analyses (for p-values with voxel locations) and segmentation analyses (for p-values alone) for fMRI data, where p-values indicate the significance level of activation in response to the stimulus of interest. The analyses mainly identify active voxels/signals associated with normal brain behaviors. Analysis pipelines (R scripts) utilizing this package (see examples in 'inst/workflow/') are also implemented with high-performance techniques.

Maintained by Wei-Chen Chen. Last updated 5 months ago.

8.1 match 2 stars 4.26 score 18 scripts

andrewzm

EFDR:Wavelet-Based Enhanced FDR for Detecting Signals from Complete or Incomplete Spatially Aggregated Data

Enhanced False Discovery Rate (EFDR) is a tool to detect anomalies in an image. The image is first transformed into the wavelet domain in order to decorrelate any noise components, following which the coefficients at each resolution are standardised. Statistical tests (in a multiple hypothesis testing setting) are then carried out to find the anomalies. The power of EFDR exceeds that of standard FDR, which would carry out tests on every wavelet coefficient: EFDR chooses which wavelets to test based on a criterion described in Shen et al. (2002). The package also provides elementary tools to interpolate spatially irregular data onto a grid of the required size. The work is based on Shen, X., Huang, H.-C., and Cressie, N. 'Nonparametric hypothesis testing for a spatial signal.' Journal of the American Statistical Association 97.460 (2002): 1122-1140.

Maintained by Andrew Zammit-Mangion. Last updated 2 years ago.

7.2 match 5 stars 4.74 score 22 scripts

kornl

mutoss:Unified Multiple Testing Procedures

Designed to ease the application and comparison of multiple hypothesis testing procedures for FWER, gFWER, FDR and FDX. Methods are standardized and usable by the accompanying 'mutossGUI'.

Maintained by Kornelius Rohmeyer. Last updated 12 months ago.

3.9 match 4 stars 8.44 score 24 scripts 16 dependents

bioc

iCOBRA:Comparison and Visualization of Ranking and Assignment Methods

This package provides functions for calculation and visualization of performance metrics for evaluation of ranking and binary classification (assignment) methods. Various types of performance plots can be generated programmatically. The package also contains a shiny application for interactive exploration of results.

Maintained by Charlotte Soneson. Last updated 3 months ago.

classification visualization

3.6 match 14 stars 8.86 score 192 scripts 1 dependent

ivis4ml

fssemR:Fused Sparse Structural Equation Models to Jointly Infer Gene Regulatory Network

An optimizer of Fused-Sparse Structural Equation Models, which is the state of the art jointly fused sparse maximum likelihood function for structural equation models proposed by Xin Zhou and Xiaodong Cai (2018 <doi:10.1101/466623>).

Maintained by Xin Zhou. Last updated 3 years ago.

cpp

6.6 match 4 stars 4.85 score 35 scripts

bioc

PICS:Probabilistic inference of ChIP-seq

Probabilistic inference of ChIP-Seq using an empirical Bayes mixture model approach.

Maintained by Renan Sauteraud. Last updated 5 months ago.

clustering visualization sequencing chipseq gsl

5.8 match 5.48 score 7 scripts 1 dependent

bnaras

pamr:Pam: Prediction Analysis for Microarrays

Some functions for sample classification in microarrays.

Maintained by Balasubramanian Narasimhan. Last updated 9 months ago.

3.9 match 7.90 score 256 scripts 14 dependents

bioc

LPE:Methods for analyzing microarray data using Local Pooled Error (LPE) method

This LPE library is used to do significance analysis of microarray data with a small number of replicates. It uses resampling-based FDR adjustment, and gives less conservative results than traditional 'BH' or 'BY' procedures. Data accepted is raw data in txt format from MAS4, MAS5 or dChip. Data can also be supplied after normalization. LPE library is primarily used for analyzing data between two conditions. To use it for paired data, see LPEP library. For using LPE in multiple conditions, use HEM library.

Maintained by Nitin Jain. Last updated 5 months ago.

microarray differentialexpression

6.6 match 4.58 score 21 scripts 1 dependent

allenzhuaz

MHTdiscrete:Multiple Hypotheses Testing for Discrete Data

A comprehensive tool for almost all existing multiple testing methods for discrete data. The package also provides some novel multiple testing procedures controlling FWER/FDR for discrete data. Given discrete p-values and their domains, the [method].p.adjust function returns adjusted p-values, which can be compared with the nominal significance level alpha to make decisions. For users' convenience, the functions also provide the output option for printing decision rules.

Maintained by Yalin Zhu. Last updated 6 years ago.

adjustment-computations benjamini-hochberg bonferroni discrete-distributions multiple-testing-correction

8.7 match 1 star 3.27 score 37 scripts

thermostats

RVA:RNAseq Visualization Automation

Automate downstream visualization & pathway analysis in RNAseq analysis. 'RVA' is a collection of functions that efficiently visualize RNAseq differential expression analysis results from summary statistics tables. It also utilizes Fisher's exact test to evaluate gene set or pathway enrichment in a convenient and efficient manner.

Maintained by Xingpeng Li. Last updated 3 years ago.

5.0 match 9 stars 5.65 score 6 scripts

schw4b

DGM:Dynamic Graphical Models

Dynamic graphical models for multivariate time series data to estimate directed dynamic networks in functional magnetic resonance imaging (fMRI), see Schwab et al. (2017) <doi:10.1016/j.neuroimage.2018.03.074>.

Maintained by Simon Schwab. Last updated 3 years ago.

dynamic-graphical-models functional-connectivity time-varying-connectivity openblas cpp openmp

5.1 match 25 stars 5.49 score 25 scripts

bioc

compcodeR:RNAseq data simulation, differential expression analysis and performance comparison of differential expression methods

This package provides extensive functionality for comparing results obtained by different methods for differential expression analysis of RNAseq data. It also contains functions for simulating count data. Finally, it provides convenient interfaces to several packages for performing the differential expression analysis. These can also be used as templates for setting up and running a user-defined differential analysis workflow within the framework of the package.

Maintained by Charlotte Soneson. Last updated 3 months ago.

immunooncology rnaseq differentialexpression

3.5 match 11 stars 8.06 score 26 scripts

adriancorrendo

metrica:Prediction Performance Metrics

A compilation of more than 80 functions designed to quantitatively and visually evaluate prediction performance of regression (continuous variables) and classification (categorical variables) of point-forecast models (e.g. APSIM, DSSAT, DNDC, supervised Machine Learning). For regression, it includes functions to generate plots (scatter, tiles, density, & Bland-Altman plot), and to estimate error metrics (e.g. MBE, MAE, RMSE), error decomposition (e.g. lack of accuracy-precision), model efficiency (e.g. NSE, E1, KGE), indices of agreement (e.g. d, RAC), goodness of fit (e.g. r, R2), adjusted correlation coefficients (e.g. CCC, dcorr), symmetric regression coefficients (intercept, slope), and mean absolute scaled error (MASE) for time series predictions. For classification (binomial and multinomial), it offers functions to generate and plot confusion matrices, and to estimate performance metrics such as accuracy, precision, recall, specificity, F-score, Cohen's Kappa, G-mean, and many more. For more details visit the vignettes <https://adriancorrendo.github.io/metrica/>.

Maintained by Adrian A. Correndo. Last updated 9 months ago.

3.3 match 77 stars 8.18 score 49 scripts

jasinmachkour

TRexSelector:T-Rex Selector: High-Dimensional Variable Selection & FDR Control

Performs fast variable selection in high-dimensional settings while controlling the false discovery rate (FDR) at a user-defined target level. The package is based on the paper Machkour, Muma, and Palomar (2022) <arXiv:2110.06048>.

Maintained by Jasin Machkour. Last updated 1 year ago.

5.9 match 5 stars 4.40 score 5 scripts

dsstoffer

astsa:Applied Statistical Time Series Analysis

Contains data sets and scripts for analyzing time series in both the frequency and time domains including state space modeling as well as supporting the texts Time Series Analysis and Its Applications: With R Examples (5th ed), by R.H. Shumway and D.S. Stoffer. Springer Texts in Statistics, 2025, <https://link.springer.com/book/9783031705830>, and Time Series: A Data Analysis Approach Using R. Chapman-Hall, 2019, <DOI:10.1201/9780429273285>.

Maintained by David Stoffer. Last updated 2 months ago.

3.3 match 7 stars 7.86 score 2.2k scripts 8 dependents

stan-pounds

FDRsampsize:Compute Sample Size that Meets Requirements for Average Power and FDR

Defines a collection of functions to compute average power and sample size for studies that use the false discovery rate as the final measure of statistical significance.

Maintained by Stan Pounds. Last updated 9 years ago.

18.3 match 1.23 score 17 scripts

hrpcisd

locfdr:Computes Local False Discovery Rates

Computation of local false discovery rates.

Maintained by Balasubramanian Narasimhan. Last updated 10 years ago.

3.8 match 5.99 score 106 scripts 14 dependents

bioc

safe:Significance Analysis of Function and Expression

SAFE is a resampling-based method for testing functional categories in gene expression experiments. SAFE can be applied to 2-sample and multi-class comparisons, or simple linear regressions. Other experimental designs can also be accommodated through user-defined functions.

Maintained by Ludwig Geistlinger. Last updated 5 months ago.

differentialexpression pathways genesetenrichment statisticalmethod software

4.0 match 5.60 score 32 scripts 5 dependents

kbroman

qtl:Tools for Analyzing QTL Experiments

Analysis of experimental crosses to identify genes (called quantitative trait loci, QTLs) contributing to variation in quantitative traits. Broman et al. (2003) <doi:10.1093/bioinformatics/btg112>.

Maintained by Karl W Broman. Last updated 7 months ago.

openblas

1.8 match 80 stars 12.79 score 2.4k scripts 29 dependents

ichcha-m

cophescan:Adaptation of the Coloc Method for PheWAS

A Bayesian method for Phenome-wide association studies (PheWAS) that identifies causal associations between genetic variants and traits, while simultaneously addressing confounding due to linkage disequilibrium. For details see Manipur et al (2023) <doi:10.1101/2023.06.29.546856>.

Maintained by Ichcha Manipur. Last updated 9 months ago.

cpp openmp

3.8 match 6 stars 5.76 score 24 scripts

parsifal9

RFlocalfdr:Significance Level for Random Forest Impurity Importance Scores

Sets a significance level for Random Forest MDI (Mean Decrease in Impurity, Gini or sum of squares) variable importance scores, using an empirical Bayes approach. See Dunne et al. (2022) <doi:10.1101/2022.04.06.487300>.

Maintained by Robert Dunne. Last updated 2 months ago.

4.5 match 1 stars 4.72 score 13 scripts

bioc

miloR:Differential neighbourhood abundance testing on a graph

Milo performs single-cell differential abundance testing. Cell states are modelled as representative neighbourhoods on a nearest neighbour graph. Hypothesis testing is performed using either a negative binomial generalized linear model or negative binomial generalized linear mixed model.

Maintained by Mike Morgan. Last updated 5 months ago.

singlecell multiplecomparison functionalgenomics software openblas cpp openmp

2.0 match 357 stars 10.49 score 340 scripts 1 dependents

afukushima

DiffCorr:Analyzing and Visualizing Differential Correlation Networks in Biological Data

A method for identifying pattern changes between 2 experimental conditions in correlation networks (e.g., gene co-expression networks), which builds on a commonly used association measure, such as Pearson's correlation coefficient. This package includes functions to calculate correlation matrices for high-dimensional datasets and to test differential correlation, that is, changes in the correlation relationship among variables (e.g., genes and metabolites) between 2 experimental conditions.

Maintained by Atsushi Fukushima. Last updated 6 months ago.

3.1 match 5 stars 6.81 score 29 scripts 1 dependents

bioc

GPA:GPA (Genetic analysis incorporating Pleiotropy and Annotation)

This package provides functions for fitting GPA, a statistical framework to prioritize GWAS results by integrating pleiotropy information and annotation data. In addition, it also includes ShinyGPA, an interactive visualization toolkit to investigate pleiotropic architecture.

Maintained by Dongjun Chung. Last updated 5 months ago.

software statisticalmethod classification genomewideassociation snp genetics clustering multiplecomparison preprocessing geneexpression differentialexpression cpp

3.3 match 14 stars 6.15 score 7 scripts

bioc

ReactomePA:Reactome Pathway Analysis

This package provides functions for pathway analysis based on REACTOME pathway database. It implements enrichment analysis, gene set enrichment analysis and several functions for visualization. This package is not affiliated with the Reactome team.

Maintained by Guangchuang Yu. Last updated 5 months ago.

pathways visualization annotation multiplecomparison genesetenrichment reactome enrichment-analysis reactome-pathway-analysis reactomepa

1.6 match 40 stars 12.25 score 1.5k scripts 7 dependents

dppalomar

spectralGraphTopology:Learning Graphs from Data via Spectral Constraints

In the era of big data and hyperconnectivity, learning high-dimensional structures such as graphs from data has become a prominent task in machine learning and has found applications in many fields such as finance, health care, and networks. 'spectralGraphTopology' is an open source, documented, and well-tested R package for learning graphs from data. It provides implementations of state of the art algorithms such as Combinatorial Graph Laplacian Learning (CGL), Spectral Graph Learning (SGL), Graph Estimation based on Majorization-Minimization (GLE-MM), and Graph Estimation based on Alternating Direction Method of Multipliers (GLE-ADMM). In addition, graph learning has been widely employed for clustering, where specific algorithms are available in the literature. To this end, we provide an implementation of the Constrained Laplacian Rank (CLR) algorithm.

Maintained by Ze Vinicius. Last updated 2 years ago.

openblas cpp

3.3 match 2 stars 5.91 score 135 scripts 1 dependents

umich-cphds

bama:High Dimensional Bayesian Mediation Analysis

Perform mediation analysis in the presence of high-dimensional mediators based on the potential outcome framework. Bayesian Mediation Analysis (BAMA), developed by Song et al (2019) <doi:10.1111/biom.13189> and Song et al (2020) <arXiv:2009.11409>, relies on two Bayesian sparse linear mixed models to simultaneously analyze a relatively large number of mediators for a continuous exposure and outcome assuming a small number of mediators are truly active. This sparsity assumption also allows the extension of univariate mediator analysis by casting the identification of active mediators as a variable selection problem and applying Bayesian methods with continuous shrinkage priors on the effects.

Maintained by Mike Kleinsasser. Last updated 2 years ago.

openblas cpp

4.0 match 4.80 score 42 scripts 1 dependents

bioc

HEM:Heterogeneous error model for identification of differentially expressed genes under multiple conditions

This package fits heterogeneous error models for analysis of microarray data

Maintained by HyungJun Cho. Last updated 5 months ago.

microarray differentialexpression

4.5 match 4.30 score 6 scripts

bioc

cydar:Using Mass Cytometry for Differential Abundance Analyses

Identifies differentially abundant populations between samples and groups in mass cytometry data. Provides methods for counting cells into hyperspheres, controlling the spatial false discovery rate, and visualizing changes in abundance in the high-dimensional marker space.

Maintained by Aaron Lun. Last updated 5 months ago.

immunooncology flowcytometry multiplecomparison proteomics singlecell cpp

3.3 match 5.64 score 48 scripts

bioc

acde:Artificial Components Detection of Differentially Expressed Genes

This package provides a multivariate inferential analysis method for detecting differentially expressed genes in gene expression data. It uses artificial components, close to the data's principal components but with an exact interpretation in terms of differential genetic expression, to identify differentially expressed genes while controlling the false discovery rate (FDR). The methods in this package are described in the vignette or in the article 'Multivariate Method for Inferential Identification of Differentially Expressed Genes in Gene Expression Experiments' by J. P. Acosta, L. Lopez-Kleine and S. Restrepo (2015, pending publication).

Maintained by Juan Pablo Acosta. Last updated 5 months ago.

differentialexpression timecourse principalcomponent geneexpression microarray mrnamicroarray

5.6 match 3.30 score 1 scripts

bioc

DAPAR:Tools for the Differential Analysis of Proteins Abundance with R

DAPAR is an R package distributed through Bioconductor that provides all the necessary functions to analyze quantitative data from label-free proteomics experiments. Unlike most similar R packages, it comes with rich and user-friendly graphical interfaces, so that no programming skill is required (see the `Prostar` package).

Maintained by Samuel Wieczorek. Last updated 5 months ago.

proteomics normalization preprocessing massspectrometry qualitycontrol go dataimport prostar1

3.4 match 2 stars 5.42 score 22 scripts 1 dependents

bioc

calm:Covariate Assisted Large-scale Multiple testing

Statistical methods for multiple testing with covariate information. Traditional multiple testing methods only consider a list of test statistics, such as p-values. Our methods incorporate the auxiliary information, such as the lengths of gene coding regions or the minor allele frequencies of SNPs, to improve power.

Maintained by Kun Liang. Last updated 5 months ago.

bayesian differentialexpression geneexpression regression microarray sequencing rnaseq multiplecomparison genetics immunooncology metabolomics proteomics transcriptomics

5.5 match 3.30 score 2 scripts

meierluk

hdi:High-Dimensional Inference

Implementation of multiple approaches to perform inference in high-dimensional models.

Maintained by Lukas Meier. Last updated 4 years ago.

4.0 match 2 stars 4.47 score 139 scripts 7 dependents

r-forge

fuzzySim:Fuzzy Similarity in Species Distributions

Functions to compute fuzzy versions of species occurrence patterns based on presence-absence data (including inverse distance interpolation, trend surface analysis, and prevalence-independent favourability obtained from probability of presence), as well as pair-wise fuzzy similarity (based on fuzzy logic versions of commonly used similarity indices) among those occurrence patterns. Also includes functions for model consensus and comparison (overlap and fuzzy similarity, fuzzy loss, fuzzy gain), and for data preparation, such as obtaining unique abbreviations of species names, defining the background region, cleaning and gridding (thinning) point occurrence data onto raster maps, selecting among (pseudo)absences to address survey bias, converting species lists (long format) to presence-absence tables (wide format), transposing part of a data frame, selecting relevant variables for models, assessing the false discovery rate, or analysing and dealing with multicollinearity. Initially described in Barbosa (2015) <doi:10.1111/2041-210X.12372>.

Maintained by A. Marcia Barbosa. Last updated 22 days ago.

3.3 match 2 stars 5.35 score 156 scripts

cran

bnlearn:Bayesian Network Structure Learning, Parameter Learning and Inference

Bayesian network structure learning, parameter learning and inference. This package implements constraint-based (PC, GS, IAMB, Inter-IAMB, Fast-IAMB, MMPC, Hiton-PC, HPC), pairwise (ARACNE and Chow-Liu), score-based (Hill-Climbing and Tabu Search) and hybrid (MMHC, RSMAX2, H2PC) structure learning algorithms for discrete, Gaussian and conditional Gaussian networks, along with many score functions and conditional independence tests. The Naive Bayes and the Tree-Augmented Naive Bayes (TAN) classifiers are also implemented. Some utility functions (model comparison and manipulation, random data generation, arc orientation testing, simple and advanced plots) are included, as well as support for parameter estimation (maximum likelihood and Bayesian) and inference, conditional probability queries, cross-validation, bootstrap and model averaging. Development snapshots with the latest bugfixes are available from <https://www.bnlearn.com/>.

Maintained by Marco Scutari. Last updated 2 months ago.

openblas

2.3 match 57 stars 7.72 score 32 dependents

izmirlig

pwrFDR:FDR Power

Computing Average and TPX Power under various BHFDR type sequential procedures. All of these procedures involve control of some summary of the distribution of the FDP, e.g. the proportion of discoveries which are false in a given experiment. The most widely known of these, the BH-FDR procedure, controls the FDR which is the mean of the FDP. A lesser known procedure, due to Lehmann and Romano, controls the FDX, or probability that the FDP exceeds a user provided threshold. This is less conservative than FWE control procedures but much more conservative than the BH-FDR procedure. This package and the references supporting it introduce a new procedure for controlling the FDX which we call the BH-FDX procedure. This procedure iteratively identifies, given alpha and lower threshold delta, an alpha* less than alpha at which BH-FDR guarantees FDX control. This uses asymptotic approximation and is only slightly more conservative than the BH-FDR procedure. Likewise, we can think of the power in multiple testing experiments in terms of a summary of the distribution of the True Positive Proportion (TPP), the portion of tests truly non-null distributed that are called significant. The package will compute power, sample size or any other missing parameter required for power defined as (i) the mean of the TPP (average power) or (ii) the probability that the TPP exceeds a given value, lambda, (TPX power) via asymptotic approximation. All supplied theoretical results are also obtainable via simulation. The suggested approach is to narrow in on a design via the theoretical approaches and then make final adjustments/verify the results by simulation. The theoretical results are described in Izmirlian, G (2020) Statistics and Probability letters, "<doi:10.1016/j.spl.2020.108713>", and an applied paper describing the methodology with a simulation study is in preparation. See citation("pwrFDR").

Maintained by Grant Izmirlian. Last updated 2 months ago.

6.6 match 2.58 score 19 scripts

wanghaoxue0

SplitKnockoff:Split Knockoffs for Structural Sparsity

Split Knockoff is a data-adaptive variable selection framework for controlling the (directional) false discovery rate (FDR) in structural sparsity, where variable selection on linear transformation of parameters is of concern. This proposed scheme relaxes the linear subspace constraint to its neighborhood, often known as variable splitting in optimization. Simulation experiments can be reproduced by following the vignette. The package includes data (in both .mat and .csv format) and an application of our method to an Alzheimer's disease study. 'Split Knockoffs' is first defined in Cao et al. (2021) <arXiv:2103.16159>.

Maintained by Haoxue Wang. Last updated 3 years ago.

4.1 match 3 stars 4.18 score 4 scripts

snoweye

EMCluster:EM Algorithm for Model-Based Clustering of Finite Mixture Gaussian Distribution

EM algorithms and several efficient initialization methods for model-based clustering of finite mixture Gaussian distribution with unstructured dispersion in both unsupervised and semi-supervised learning.

Maintained by Wei-Chen Chen. Last updated 6 months ago.

openblas

2.3 match 18 stars 7.53 score 123 scripts 2 dependents

bioc

scDDboost:A compositional model to assess expression changes from single-cell rna-seq data

scDDboost is an R package to analyze changes in the distribution of single-cell expression data between two experimental conditions. Compared to other methods that assess differential expression, scDDboost benefits uniquely from information conveyed by the clustering of cells into cellular subtypes. Through a novel empirical Bayesian formulation it calculates gene-specific posterior probabilities that the marginal expression distribution is the same (or different) between the two conditions. The implementation in scDDboost treats gene-level expression data within each condition as a mixture of negative binomial distributions.

Maintained by Xiuyu Ma. Last updated 2 days ago.

singlecell software clustering sequencing geneexpression differentialexpression bayesian cpp

3.6 match 4.68 score 19 scripts

bioc

ABarray:Microarray QA and statistical data analysis for Applied Biosystems Genome Survey Microarray (AB1700) gene expression data.

Automated pipeline to perform gene expression analysis for Applied Biosystems Genome Survey Microarray (AB1700) data format. Functions include data preprocessing, filtering, control probe analysis, statistical analysis in one single function. A GUI interface is also provided. The raw data, processed data, graphics output and statistical results are organized into folders according to the analysis settings used.

Maintained by Yongming Andrew Sun. Last updated 5 months ago.

microarray onechannel preprocessing

3.9 match 4.20 score 3 scripts

annennenne

causalDisco:Tools for Causal Discovery on Observational Data

Various tools for inferring causal models from observational data. The package includes an implementation of the temporal Peter-Clark (TPC) algorithm. Petersen, Osler and Ekstrøm (2021) <doi:10.1093/aje/kwab087>. It also includes general tools for evaluating differences in adjacency matrices, which can be used for evaluating performance of causal discovery procedures.

Maintained by Anne Helby Petersen. Last updated 15 days ago.

3.3 match 19 stars 4.76 score 10 scripts

cran

mSTEM:Multiple Testing of Local Extrema for Detection of Change Points

A new approach to detect change points based on smoothing and multiple testing, designed for long data sequences modeled as piecewise constant functions plus stationary Gaussian noise; see Dan Cheng and Armin Schwartzman (2015) <arXiv:1504.06384>.

Maintained by Zhibing He. Last updated 5 years ago.

9.0 match 1.70 score

bioc

phenoTest:Tools to test association between gene expression and phenotype in a way that is efficient, structured, fast and scalable. We also provide tools to do GSEA (Gene set enrichment analysis) and copy number variation.

Tools to test correlation between gene expression and phenotype in a way that is efficient, structured, fast and scalable. GSEA is also provided.

Maintained by Evarist Planet. Last updated 5 months ago.

microarray differentialexpression multiplecomparison clustering classification

3.3 match 4.56 score 9 scripts 1 dependents

bioc

RNAseqCovarImpute:Impute Covariate Data in RNA Sequencing Studies

The RNAseqCovarImpute package makes linear model analysis for RNA sequencing read counts compatible with multiple imputation (MI) of missing covariates. A major problem with implementing MI in RNA sequencing studies is that the outcome data must be included in the imputation prediction models to avoid bias. This is difficult in omics studies with high-dimensional data. The first method we developed in the RNAseqCovarImpute package surmounts the problem of high-dimensional outcome data by binning genes into smaller groups to analyze pseudo-independently. This method implements covariate MI in gene expression studies by 1) randomly binning genes into smaller groups, 2) creating M imputed datasets separately within each bin, where the imputation predictor matrix includes all covariates and the log counts per million (CPM) for the genes within each bin, 3) estimating gene expression changes using `limma::voom` followed by `limma::lmFit` functions, separately on each of the M imputed datasets within each gene bin, 4) un-binning the gene sets and stacking the M sets of model results before applying the `limma::squeezeVar` function to apply a variance shrinking Bayesian procedure to each of the M sets of model results, 5) pooling the results with Rubin's rules to produce combined coefficients, standard errors, and P-values, and 6) adjusting P-values for multiplicity to account for false discovery rate (FDR). A faster method uses principal component analysis (PCA) to avoid binning genes while still retaining outcome information in the MI models. Binning genes into smaller groups requires that the MI and limma-voom analysis is run many times (typically hundreds).
The more computationally efficient MI PCA method implements covariate MI in gene expression studies by 1) performing PCA on the log CPM values for all genes using the Bioconductor `PCAtools` package, 2) creating M imputed datasets where the imputation predictor matrix includes all covariates and the optimum number of PCs to retain (e.g., based on Horn’s parallel analysis or the number of PCs that account for >80% explained variation), 3) conducting the standard limma-voom pipeline with the `voom` followed by `lmFit` followed by `eBayes` functions on each of the M imputed datasets, 4) pooling the results with Rubin's rules to produce combined coefficients, standard errors, and P-values, and 5) adjusting P-values for multiplicity to account for false discovery rate (FDR).

Maintained by Brennan Baker. Last updated 5 months ago.

rnaseq geneexpression differentialexpression sequencing

3.3 match 1 stars 4.48 score 6 scripts

philipppro

measures:Performance Measures for Statistical Learning

Provides a large collection of statistical measures for R. Includes measures of regression, (multiclass) classification and multilabel classification. The measures come mainly from the 'mlr' package and were programmed by several 'mlr' developers.

Maintained by Philipp Probst. Last updated 4 years ago.

3.3 match 1 stars 4.47 score 88 scripts 2 dependents

bioc

ssize:Estimate Microarray Sample Size

Functions for computing and displaying sample size information for gene expression arrays.

Maintained by Gregory R. Warnes. Last updated 5 months ago.

microarray differentialexpression

3.5 match 4.18 score 15 scripts

arunabhacodes

CPBayes:Bayesian Meta Analysis for Studying Cross-Phenotype Genetic Associations

A Bayesian meta-analysis method for studying cross-phenotype genetic associations. It uses summary-level data across multiple phenotypes to simultaneously measure the evidence of aggregate-level pleiotropic association and estimate an optimal subset of traits associated with the risk locus. CPBayes is based on a spike and slab prior. The methodology is available from: A Majumdar, T Haldar, S Bhattacharya, JS Witte (2018) <doi:10.1371/journal.pgen.1007139>.

Maintained by Arunabha Majumdar. Last updated 4 years ago.

3.3 match 3 stars 4.26 score 12 scripts

stamats

MKclass:Statistical Classification

Performance measures and scores for statistical classification such as accuracy, sensitivity, specificity, recall, similarity coefficients, AUC, GINI index, Brier score and many more. Calculation of optimal cut-offs and decision stumps (Iba and Langley (1991), <doi:10.1016/B978-1-55860-247-2.50035-8>) for all implemented performance measures. Hosmer-Lemeshow goodness of fit tests (Lemeshow and Hosmer (1982), <doi:10.1093/oxfordjournals.aje.a113284>; Hosmer et al (1997), <doi:10.1002/(SICI)1097-0258(19970515)16:9%3C965::AID-SIM509%3E3.0.CO;2-O>). Statistical and epidemiological risk measures such as relative risk, odds ratio, number needed to treat (Porta (2014), <doi:10.1093%2Facref%2F9780199976720.001.0001>).

Maintained by Matthias Kohl. Last updated 2 years ago.

3.3 match 2 stars 4.26 score 18 scripts

bioc

EBSeq:An R package for gene and isoform differential expression analysis of RNA-seq data

Differential Expression analysis at both gene and isoform level using RNA-seq data

Maintained by Xiuyu Ma. Last updated 2 months ago.

immunooncology statisticalmethod differentialexpression multiplecomparison rnaseq sequencing cpp

1.8 match 7.77 score 162 scripts 6 dependents

ff1201

sgs:Sparse-Group SLOPE: Adaptive Bi-Level Selection with FDR Control

Implementation of Sparse-group SLOPE (SGS) (Feser and Evangelou (2023) <doi:10.48550/arXiv.2305.09467>) models. Linear and logistic regression models are supported, both of which can be fit using k-fold cross-validation. Dense and sparse input matrices are supported. In addition, a general Adaptive Three Operator Splitting (ATOS) (Pedregosa and Gidel (2018) <doi:10.48550/arXiv.1804.02339>) implementation is provided. Group SLOPE (gSLOPE) (Brzyski et al. (2019) <doi:10.1080/01621459.2017.1411269>) and group-based OSCAR models (Feser and Evangelou (2024) <doi:10.48550/arXiv.2405.15357>) are also implemented. All models are available with strong screening rules (Feser and Evangelou (2024) <doi:10.48550/arXiv.2405.15357>) for computational speed-up.

Maintained by Fabio Feser. Last updated 14 days ago.

openblas cpp openmp

2.8 match 1 stars 4.99 score 13 scripts 1 dependents

ziqiaow

IMIX:Gaussian Mixture Model for Multi-Omics Data Integration

A multivariate Gaussian mixture model framework to integrate multiple types of genomic data and allow modeling of inter-data-type correlations for association analysis. 'IMIX' can be implemented to test whether a disease is associated with genes in multiple genomic data types, such as DNA methylation, copy number variation, gene expression, etc. It can also study the integration of multiple pathways. 'IMIX' uses the summary statistics of association test outputs and conducts integration analysis for two or three types of genomics data. 'IMIX' features statistically-principled model selection, global FDR control and computational efficiency. Details are described in Ziqiao Wang and Peng Wei (2020) <doi:10.1093/bioinformatics/btaa1001>.

Maintained by Ziqiao Wang. Last updated 2 years ago.

3.9 match 7 stars 3.54 score

bioc

ANCOMBC:Microbiome differential abundance and correlation analyses with bias correction

ANCOMBC is a package containing differential abundance (DA) and correlation analyses for microbiome data. Specifically, the package includes Analysis of Compositions of Microbiomes with Bias Correction 2 (ANCOM-BC2), Analysis of Compositions of Microbiomes with Bias Correction (ANCOM-BC), and Analysis of Composition of Microbiomes (ANCOM) for DA analysis, and Sparse Estimation of Correlations among Microbiomes (SECOM) for correlation analysis. Microbiome data are typically subject to two sources of biases: unequal sampling fractions (sample-specific biases) and differential sequencing efficiencies (taxon-specific biases). Methodologies included in the ANCOMBC package are designed to correct these biases and construct statistically consistent estimators.

Maintained by Huang Lin. Last updated 2 days ago.

differentialexpression microbiome normalization sequencing software ancom ancombc ancombc2 correlation differential-abundance-analysis secom

1.3 match 120 stars 10.79 score 406 scripts 1 dependents

bioc

limpca:An R package for the linear modeling of high-dimensional designed data based on ASCA/APCA family of methods

This package aims to provide a method for fitting linear models to high-dimensional designed data. limpca applies a GLM (General Linear Model) version of ASCA and APCA to analyse multivariate sample profiles generated by an experimental design. ASCA/APCA provide powerful visualization tools for multivariate structures in the space of each effect of the statistical model linked to the experimental design and, unlike MANOVA, can deal with multivariate datasets having more variables than observations. The method can handle unbalanced designs.

Maintained by Manon Martin. Last updated 5 months ago.

statisticalmethod principalcomponent regression visualization experimentaldesign multiplecomparison geneexpression metabolomics

2.3 match 2 stars 5.73 score 2 scripts

bioc

autonomics:Unified Statistical Modeling of Omics Data

This package unifies access to Statistical Modeling of Omics Data. Across linear modeling engines (lm, lme, lmer, limma, and wilcoxon). Across coding systems (treatment, difference, deviation, etc). Across model formulae (with/without intercept, random effect, interaction or nesting). Across omics platforms (microarray, rnaseq, msproteomics, affinity proteomics, metabolomics). Across projection methods (pca, pls, sma, lda, spls, opls). Across clustering methods (hclust, pam, cmeans). It provides a fast enrichment analysis implementation. And an intuitive contrastogram visualisation to summarize contrast effects in complex designs.

Maintained by Aditya Bhagwat. Last updated 2 months ago.

software dataimport preprocessing dimensionreduction principalcomponent regression differentialexpression genesetenrichment transcriptomics transcription geneexpression rnaseq microarray proteomics metabolomics massspectrometry

2.3 match 5.95 score 5 scripts

modal-inria

MLGL:Multi-Layer Group-Lasso

It implements a new procedure of variable selection in the context of redundancy between explanatory variables, which holds true with high dimensional data (Grimonprez et al. (2023) <doi:10.18637/jss.v106.i03>).

Maintained by Quentin Grimonprez. Last updated 2 years ago.

group-lasso variable-selection

3.7 match 3 stars 3.61 score 27 scripts

bioc

GGPA:graph-GPA: A graphical model for prioritizing GWAS results and investigating pleiotropic architecture

Genome-wide association studies (GWAS) are a widely used tool for identification of genetic variants associated with phenotypes and diseases, though complex diseases featuring many genetic variants with small effects present difficulties for these traditional studies. By leveraging pleiotropy, the statistical power of a single GWAS can be increased. This package provides functions for fitting graph-GPA, a statistical framework to prioritize GWAS results by integrating pleiotropy. The 'GGPA' package provides a user-friendly interface to fit graph-GPA models, implement association mapping, and generate a phenotype graph.

Maintained by Dongjun Chung. Last updated 5 months ago.

software statisticalmethod classification genomewideassociation snp genetics clustering multiplecomparison preprocessing geneexpression differentialexpression openblas cpp

3.3 match 1 stars 4.00 score 2 scripts

hneth

riskyr:Rendering Risk Literacy more Transparent

Risk-related information (like the prevalence of conditions, the sensitivity and specificity of diagnostic tests, or the effectiveness of interventions or treatments) can be expressed in terms of frequencies or probabilities. By providing a toolbox of corresponding metrics and representations, 'riskyr' computes, translates, and visualizes risk-related information in a variety of ways. Adopting multiple complementary perspectives provides insights into the interplay between key parameters and renders teaching and training programs on risk literacy more transparent.

Maintained by Hansjoerg Neth. Last updated 10 months ago.

2x2-matrix bayesian-inference contingency-table representation risk risk-literacy visualization

1.7 match 19 stars 7.18 score 80 scripts

bioc

topconfects:Top Confident Effect Sizes

Rank results by confident effect sizes, while maintaining False Discovery Rate and False Coverage-statement Rate control. Topconfects is an alternative presentation of TREAT results with improved usability, eliminating p-values and instead providing confidence bounds. The main application is differential gene expression analysis, providing genes ranked in order of confident log2 fold change, but it can be applied to any collection of effect sizes with associated standard errors.

Maintained by Paul Harrison. Last updated 3 months ago.

geneexpression differentialexpression transcriptomics rnaseq mrnamicroarray regression multiplecomparison

1.7 match 14 stars 7.38 score 18 scripts 2 dependents

bioc

pepXMLTab:Parsing pepXML files and filter based on peptide FDR.

Parsing pepXML files based on the XML package. The package tries to handle pepXML files generated by different software. The output is a tabular file of peptide-spectrum matches (PSMs). The package also provides a function to filter the PSMs based on FDR.

Maintained by Xiaojing Wang. Last updated 5 months ago.

immunooncology proteomics massspectrometry

3.4 match 3.60 score 9 scripts

bioc

reconsi:Resampling Collapsed Null Distributions for Simultaneous Inference

Improves simultaneous inference under dependence of tests by estimating a collapsed null distribution through resampling. Accounting for the dependence between tests increases the power while reducing the variability of the false discovery proportion. This dependence is common in genomics applications, e.g. when combining flow cytometry measurements with microbiome sequence counts.

Maintained by Stijn Hawinkel. Last updated 5 months ago.

metagenomics microbiome multiplecomparison flowcytometry

2.6 match 2 stars 4.60 score 2 scripts

jasinmachkour

tlars:The T-LARS Algorithm: Early-Terminated Forward Variable Selection

Computes the solution path of the Terminating-LARS (T-LARS) algorithm. The T-LARS algorithm is a major building block of the T-Rex selector (see R package 'TRexSelector'). The package is based on the papers Machkour, Muma, and Palomar (2022) <arXiv:2110.06048>, Efron, Hastie, Johnstone, and Tibshirani (2004) <doi:10.1214/009053604000000067>, and Tibshirani (1996) <doi:10.1111/j.2517-6161.1996.tb02080.x>.

Maintained by Jasin Machkour. Last updated 1 years ago.

openblas cpp openmp

2.6 match 2 stars 4.48 score 5 scripts 1 dependent

ygeunkim

bvhar:Bayesian Vector Heterogeneous Autoregressive Modeling

Tools to model and forecast multivariate time series including Bayesian Vector heterogeneous autoregressive (VHAR) model by Kim & Baek (2023) (<doi:10.1080/00949655.2023.2281644>). 'bvhar' can model Vector Autoregressive (VAR), VHAR, Bayesian VAR (BVAR), and Bayesian VHAR (BVHAR) models.

Maintained by Young Geun Kim. Last updated 18 days ago.

bayesian bayesian-econometrics bvar eigen forecasting har pybind11 python rcppeigen time-series vector-autoregression cpp openmp

1.8 match 6 stars 6.42 score 25 scripts

allenzhuaz

MHTmult:Multiple Hypotheses Testing for Multiple Families/Groups Structure

A comprehensive tool covering almost all existing multiple testing methods for multiple families. The package summarizes the existing multiple testing procedures (MTPs) for multiple families, such as double FDR, the group Benjamini-Hochberg (GBH) procedure, and the average FDR controlling procedure. The package also provides some novel multiple testing procedures based on the selective inference idea.

Maintained by Yalin Zhu. Last updated 3 years ago.

hierarchical-data multiple-testing multiplicity

4.3 match 2.70 score 9 scripts

bioc

signatureSearch:Environment for Gene Expression Searching Combined with Functional Enrichment Analysis

This package implements algorithms and data structures for performing gene expression signature (GES) searches, and subsequently interpreting the results functionally with specialized enrichment methods.

Maintained by Brendan Gongol. Last updated 5 months ago.

software geneexpression go kegg networkenrichment sequencing coverage differentialexpression cpp

1.6 match 17 stars 7.18 score 74 scripts 1 dependent

bioc

swfdr:Estimation of the science-wise false discovery rate and the false discovery rate conditional on covariates

This package allows users to estimate the science-wise false discovery rate from Jager and Leek, "Empirical estimates suggest most published medical research is true," 2013, Biostatistics, using an EM approach due to the presence of rounding and censoring. It also allows users to estimate the false discovery rate conditional on covariates, using a regression framework, as per Boca and Leek, "A direct approach to estimating false discovery rates conditional on covariates," 2018, PeerJ.

Maintained by Simina M. Boca. Last updated 5 months ago.

multiplecomparison statisticalmethod software

1.8 match 3 stars 6.25 score 37 scripts

bioc

scp:Mass Spectrometry-Based Single-Cell Proteomics Data Analysis

Utility functions for manipulating, processing, and analyzing mass spectrometry-based single-cell proteomics data. The package is an extension to the 'QFeatures' package and relies on 'SingleCellExperiment' to enable single-cell proteomics analyses. The package lets the user process quantitative tables (as generated by MaxQuant, Proteome Discoverer, and more) into data tables ready for downstream analysis and data visualization.

Maintained by Christophe Vanderaa. Last updated 18 days ago.

geneexpression proteomics singlecell massspectrometry preprocessing cellbasedassays bioconductor mass-spectrometry single-cell software

1.3 match 25 stars 8.94 score 115 scripts

hanjunwei-lab

ICDS:Identification of Cancer Dysfunctional Subpathway with Omics Data

Identifies cancer dysfunctional sub-pathways by integrating gene expression, DNA methylation, copy number variation, and pathway topological information. 1) First, gene risk scores are calculated by integrating three kinds of data: DNA methylation, copy number variation, and gene expression. 2) Second, a greedy search algorithm identifies the key dysfunctional sub-pathways within the pathways for which the discriminative scores are locally maximal. 3) Finally, a permutation test is used to calculate the statistical significance level of these key dysfunctional sub-pathways.

Maintained by Junwei Han. Last updated 8 months ago.

3.1 match 3.54 score 3 scripts

bioc

IHW:Independent Hypothesis Weighting

Independent hypothesis weighting (IHW) is a multiple testing procedure that increases power compared to the method of Benjamini and Hochberg by assigning data-driven weights to each hypothesis. The input to IHW is a two-column table of p-values and covariates. The covariate can be any continuous-valued or categorical variable that is thought to be informative on the statistical properties of each hypothesis test, while it is independent of the p-value under the null hypothesis.

Maintained by Nikos Ignatiadis. Last updated 5 months ago.

immunooncology multiplecomparison rnaseq

1.5 match 7.25 score 264 scripts 2 dependents
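
As a rough illustration of the weighting idea in the IHW description above, the following Python sketch applies a weighted Benjamini-Hochberg step: each p-value is divided by its weight (weights rescaled to average 1), and the ordinary Benjamini-Hochberg adjustment is applied to the result. It is not the IHW package's API. The function names, the example p-values, and the weights are hypothetical, and the weights are supplied by hand here, whereas IHW derives them from the covariate.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def weighted_bh_reject(pvalues, weights, alpha=0.1):
    """Weighted BH: divide each p-value by its weight, then apply plain BH."""
    mean_w = sum(weights) / len(weights)
    scaled = [w / mean_w for w in weights]  # weights now average to 1
    weighted_p = [min(1.0, p / w) if w > 0 else 1.0
                  for p, w in zip(pvalues, scaled)]
    return [q <= alpha for q in bh_adjust(weighted_p)]


# Hypothetical inputs: four tests, with hand-picked weights that favour
# the second and third hypotheses (an assumption for illustration only).
pvalues = [0.001, 0.02, 0.03, 0.5]
plain = weighted_bh_reject(pvalues, [1, 1, 1, 1], alpha=0.035)
weighted = weighted_bh_reject(pvalues, [0.5, 1.5, 1.5, 0.5], alpha=0.035)
print(plain)     # uniform weights reduce to ordinary BH
print(weighted)  # informative weights can yield more rejections
```

With uniform weights the procedure is ordinary Benjamini-Hochberg; in this toy example the informative weights move two further hypotheses below the threshold.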

bioc

BioNet:Routines for the functional analysis of biological networks

This package provides functions for the integrated analysis of protein-protein interaction networks and the detection of functional modules. Different datasets can be integrated into the network by assigning p-values of statistical tests to the nodes of the network. E.g. p-values obtained from the differential expression of the genes from an Affymetrix array are assigned to the nodes of the network. By fitting a beta-uniform mixture model and calculating scores from the p-values, overall scores of network regions can be calculated and an integer linear programming algorithm identifies the maximum scoring subnetwork.

Maintained by Marcus Dittrich. Last updated 5 months ago.

microarray dataimport graphandnetwork network networkenrichment geneexpression differentialexpression

1.8 match 6.14 score 114 scripts 2 dependents

cran

rope:Model Selection with FDR Control of Selected Variables

Selects one model with variable selection FDR controlled at a specified level. A q-value for each potential variable is also returned. The input, variable selection counts over many bootstraps for several levels of penalization, is modeled as coming from a beta-binomial mixture distribution.

Maintained by Jonatan Kallus. Last updated 8 years ago.

5.3 match 2.00 score

audreyqyfu

MRPC:PC Algorithm with the Principle of Mendelian Randomization

A PC Algorithm with the Principle of Mendelian Randomization. This package implements the MRPC (PC with the principle of Mendelian randomization) algorithm to infer causal graphs. It also contains functions to simulate data under a certain topology, to visualize a graph in different ways, and to compare graphs and quantify the differences. See Badsha and Fu (2019) <doi:10.3389/fgene.2019.00460>, Badsha, Martin and Fu (2021) <doi:10.3389/fgene.2021.651812>.

Maintained by Audrey Fu. Last updated 3 years ago.

2.3 match 8 stars 4.68 score 20 scripts

bioc

dreamlet:Scalable differential expression analysis of single cell transcriptomics datasets with complex study designs

Recent advances in single cell/nucleus transcriptomic technology have enabled collection of cohort-scale datasets to study cell type specific gene expression differences associated with disease state, stimulus, and genetic regulation. The scale of these data, complex study designs, and low read count per cell mean that characterizing cell type specific molecular mechanisms requires a user-friendly, purpose-built analytical framework. We have developed the dreamlet package that applies a pseudobulk approach and fits a regression model for each gene and cell cluster to test differential expression across individuals associated with a trait of interest. Use of precision-weighted linear mixed models enables accounting for repeated measures study designs, high dimensional batch effects, and varying sequencing depth or observed cells per biosample.

Maintained by Gabriel Hoffman. Last updated 5 months ago.

rnaseq geneexpression differentialexpression batcheffect qualitycontrol regression genesetenrichment generegulation epigenetics functionalgenomics transcriptomics normalization singlecell preprocessing sequencing immunooncology software cpp

1.3 match 12 stars 7.99 score 128 scripts

bioc

INTACT:Integrate TWAS and Colocalization Analysis for Gene Set Enrichment Analysis

This package integrates colocalization probabilities from colocalization analysis with transcriptome-wide association study (TWAS) scan summary statistics to implicate genes that may be biologically relevant to a complex trait. The probabilistic framework implemented in this package constrains the TWAS scan z-score-based likelihood using a gene-level colocalization probability. Given gene set annotations, this package can estimate gene set enrichment using posterior probabilities from the TWAS-colocalization integration step.

Maintained by Jeffrey Okamoto. Last updated 5 months ago.

bayesian genesetenrichment

1.8 match 15 stars 5.47 score 13 scripts

bioc

EBarrays:Unified Approach for Simultaneous Gene Clustering and Differential Expression Identification

EBarrays provides tools for the analysis of replicated/unreplicated microarray data.

Maintained by Ming Yuan. Last updated 5 months ago.

clustering differentialexpression

1.8 match 5.56 score 5 scripts 6 dependents

bioc

DEXSeq:Inference of differential exon usage in RNA-Seq

The package is focused on finding differential exon usage using RNA-seq exon counts between samples with different experimental designs. It provides functions that allow the user to make the necessary statistical tests based on a model that uses the negative binomial distribution to estimate the variance between biological replicates and generalized linear models for testing. The package also provides functions for the visualization and exploration of the results.

Maintained by Alejandro Reyes. Last updated 18 days ago.

immunooncology sequencing rnaseq differentialexpression alternativesplicing differentialsplicing geneexpression visualization

1.3 match 7.75 score 330 scripts 6 dependents

jgill22

BaM:Functions and Datasets for "Bayesian Methods: A Social and Behavioral Sciences Approach"

Functions and datasets for Jeff Gill: "Bayesian Methods: A Social and Behavioral Sciences Approach". First, Second, and Third Edition. Published by Chapman and Hall/CRC (2002, 2007, 2014) <doi:10.1201/b17888>.

Maintained by Jeff Gill. Last updated 2 years ago.

6.6 match 1 star 1.43 score 27 scripts

haotianxu

changepoints:A Collection of Change-Point Detection Methods

Performs a series of offline and/or online change-point detection algorithms for 1) univariate mean: <doi:10.1214/20-EJS1710>, <arXiv:2006.03283>; 2) univariate polynomials: <doi:10.1214/21-EJS1963>; 3) univariate and multivariate nonparametric settings: <doi:10.1214/21-EJS1809>, <doi:10.1109/TIT.2021.3130330>; 4) high-dimensional covariances: <doi:10.3150/20-BEJ1249>; 5) high-dimensional networks with and without missing values: <doi:10.1214/20-AOS1953>, <arXiv:2101.05477>, <arXiv:2110.06450>; 6) high-dimensional linear regression models: <arXiv:2010.10410>, <arXiv:2207.12453>; 7) high-dimensional vector autoregressive models: <arXiv:1909.06359>; 8) high-dimensional self exciting point processes: <arXiv:2006.03572>; 9) dependent dynamic nonparametric random dot product graphs: <arXiv:1911.07494>; 10) univariate mean against adversarial attacks: <arXiv:2105.10417>.

Maintained by Haotian Xu. Last updated 1 year ago.

openblas cpp

1.6 match 12 stars 5.78 score 25 scripts

bioc

sights:Statistics and dIagnostic Graphs for HTS

SIGHTS is a suite of normalization methods, statistical tests, and diagnostic graphical tools for high throughput screening (HTS) assays. HTS assays use microtitre plates to screen large libraries of compounds for their biological, chemical, or biochemical activity.

Maintained by Elika Garg. Last updated 5 months ago.

immunooncology cellbasedassays microtitreplateassay normalization multiplecomparison preprocessing qualitycontrol batcheffect visualization

2.3 match 4.00 score 9 scripts

bioc

pathwayPCA:Integrative Pathway Analysis with Modern PCA Methodology and Gene Selection

pathwayPCA is an integrative analysis tool that implements the principal component analysis (PCA) based pathway analysis approaches described in Chen et al. (2008), Chen et al. (2010), and Chen (2011). pathwayPCA allows users to: (1) Test pathway association with binary, continuous, or survival phenotypes. (2) Extract relevant genes in the pathways using the SuperPCA and AES-PCA approaches. (3) Compute principal components (PCs) based on the selected genes. These estimated latent variables represent pathway activities for individual subjects, which can then be used to perform integrative pathway analysis, such as multi-omics analysis. (4) Extract relevant genes that drive pathway significance as well as data corresponding to these relevant genes for additional in-depth analysis. (5) Perform analyses with enhanced computational efficiency with parallel computing and enhanced data safety with S4-class data objects. (6) Analyze studies with complex experimental designs, with multiple covariates, and with interaction effects, e.g., testing whether pathway association with clinical phenotype is different between male and female subjects. Citations: Chen et al. (2008) <https://doi.org/10.1093/bioinformatics/btn458>; Chen et al. (2010) <https://doi.org/10.1002/gepi.20532>; and Chen (2011) <https://doi.org/10.2202/1544-6115.1697>.

Maintained by Gabriel Odom. Last updated 5 months ago.

copynumbervariation dnamethylation geneexpression snp transcription geneprediction genesetenrichment genesignaling genetarget genomewideassociation genomicvariation cellbiology epigenetics functionalgenomics genetics lipidomics metabolomics proteomics systemsbiology transcriptomics classification dimensionreduction featureextraction principalcomponent regression survival multiplecomparison pathways

1.1 match 11 stars 7.74 score 42 scripts

babak-khorsand

EvaluationMeasures:Collection of Model Evaluation Measure Functions

Provides some of the most important measures for evaluating a model. Given only the real and predicted classes, measures such as accuracy, sensitivity, specificity, ppv, npv, fmeasure, mcc and ... will be returned.

Maintained by Babak Khorsand. Last updated 9 years ago.

4.5 match 1.91 score 27 scripts 1 dependent
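
The measures named in the EvaluationMeasures description above all follow from the four cells of a binary confusion matrix. The Python sketch below shows that arithmetic under the usual textbook definitions; it is not the package's interface, and the function name and example labels are hypothetical. The false discovery rate is included as 1 minus the PPV, since that is the quantity this search is about.

```python
from math import sqrt


def classification_measures(real, predicted, positive=1):
    """Compute common binary-classification measures from two label lists."""
    tp = fp = tn = fn = 0
    for r, p in zip(real, predicted):
        if p == positive:
            if r == positive:
                tp += 1
            else:
                fp += 1
        else:
            if r == positive:
                fn += 1
            else:
                tn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "fdr": ratio(fp, tp + fp),  # equals 1 - ppv
        "fmeasure": ratio(2 * tp, 2 * tp + fp + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical labels: 4 positives and 6 negatives, with one miss
# and one false alarm.
real = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
measures = classification_measures(real, predicted)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```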

cran

TestCor:FWER and FDR Controlling Procedures for Multiple Correlation Tests

Different multiple testing procedures for correlation tests are implemented. These procedures were shown theoretically to control asymptotically the Family Wise Error Rate (Roux (2018) <https://tel.archives-ouvertes.fr/tel-01971574v1>) or the False Discovery Rate (Cai & Liu (2016) <doi:10.1080/01621459.2014.999157>). The package gathers four test statistics used in correlation testing, four FWER procedures with either single step or stepdown versions, and four FDR procedures.

Maintained by Gannaz Irene. Last updated 4 years ago.

cpp

8.4 match 1 star 1.00 score

bioc

MSnID:Utilities for Exploration and Assessment of Confidence of LC-MSn Proteomics Identifications

Extracts MS/MS ID data from mzIdentML (leveraging mzID package) or text files. After collating the search results from multiple datasets, it assesses their identification quality and optimizes filtering criteria to achieve the maximum number of identifications while not exceeding a specified false discovery rate. Also contains a number of utilities to explore the MS/MS results and assess missed and irregular enzymatic cleavages, mass measurement accuracy, etc.

Maintained by Vlad Petyuk. Last updated 5 months ago.

proteomics massspectrometry immunooncology

1.7 match 5.06 score 57 scripts
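
The filtering goal in the MSnID description above (as many identifications as possible without exceeding a given FDR) can be sketched for a single score cutoff. The Python example below assumes a target-decoy setup in which the FDR is estimated as decoys divided by targets among the accepted matches; that estimator, the function name, and the example data are assumptions made for illustration, and the package itself may optimize several criteria with different methods.

```python
def best_score_threshold(psms, max_fdr=0.01):
    """Pick the score cutoff keeping the most target matches with FDR <= max_fdr.

    psms: list of (score, is_decoy) pairs, higher score meaning better match.
    Returns (threshold, n_targets, estimated_fdr), or None if no cutoff works.
    """
    ranked = sorted(psms, key=lambda x: x[0], reverse=True)
    best = None
    targets = decoys = 0
    for idx, (score, is_decoy) in enumerate(ranked):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Only evaluate a cutoff once all matches tied at this score are counted.
        if idx + 1 < len(ranked) and ranked[idx + 1][0] == score:
            continue
        if targets == 0:
            continue
        fdr = decoys / targets  # assumed target-decoy estimate
        if fdr <= max_fdr and (best is None or targets > best[1]):
            best = (score, targets, fdr)
    return best


# Hypothetical peptide-spectrum matches: (score, is_decoy).
psms = [(9.1, False), (8.7, False), (8.2, False), (7.9, True),
        (7.5, False), (7.0, False), (6.4, True), (6.0, False)]
best = best_score_threshold(psms, max_fdr=0.25)
print(best)  # (threshold, number of target matches kept, estimated FDR)
```

Scanning every cutoff rather than stopping at the first violation matters because the estimated FDR is not monotone in the threshold: it can rise above the limit and fall back below it as more targets are admitted.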

bioc

synapter:Label-free data analysis pipeline for optimal identification and quantitation

The synapter package provides functionality to reanalyse label-free proteomics data acquired on a Synapt G2 mass spectrometer. One or several runs, possibly processed with additional ion mobility separation to increase identification accuracy, can be combined with other quantitation files to maximise identification and quantitation accuracy.

Maintained by Laurent Gatto. Last updated 6 days ago.

immunooncology massspectrometry proteomics qualitycontrol

1.8 match 4 stars 4.73 score 5 scripts

bioc

LimROTS:A Hybrid Method Integrating Empirical Bayes and Reproducibility-Optimized Statistics for Robust Analysis of Proteomics and Metabolomics Data

Differential expression analysis is a prevalent method utilised in the examination of diverse biological data. The reproducibility-optimized test statistic (ROTS) modifies a t-statistic based on the data's intrinsic characteristics and ranks features according to their statistical significance for differential expression between two or more groups (f-statistic). Focussing on proteomics and metabolomics, the current ROTS implementation cannot account for technical or biological covariates such as MS batches or gender differences among the samples. Consequently, we developed LimROTS, which employs a reproducibility-optimized test statistic utilising the limma methodology to simulate complex experimental designs. LimROTS is a hybrid method integrating empirical bayes and reproducibility-optimized statistics for robust analysis of proteomics and metabolomics data.

Maintained by Ali Mostafa Anwar. Last updated 3 months ago.

software geneexpression differentialexpression microarray rnaseq proteomics immunooncology metabolomics mrnamicroarray

1.7 match 1 star 4.70 score 1 script

connor-reid-tiffany

omu:A Metabolomics Analysis Tool for Intuitive Figures and Convenient Metadata Collection

Facilitates the creation of intuitive figures to describe metabolomics data by utilizing Kyoto Encyclopedia of Genes and Genomes (KEGG) hierarchy data, and gathers functional orthology and gene data from the KEGG-REST API.

Maintained by Connor Tiffany. Last updated 1 year ago.

1.6 match 3 stars 4.89 score 52 scripts

smeekes

bootUR:Bootstrap Unit Root Tests

Set of functions to perform various bootstrap unit root tests for both individual time series (including augmented Dickey-Fuller test and union tests), multiple time series and panel data; see Smeekes and Wilms (2023) <doi:10.18637/jss.v106.i12>, Palm, Smeekes and Urbain (2008) <doi:10.1111/j.1467-9892.2007.00565.x>, Palm, Smeekes and Urbain (2011) <doi:10.1016/j.jeconom.2010.11.010>, Moon and Perron (2012) <doi:10.1016/j.jeconom.2012.01.008>, Smeekes and Taylor (2012) <doi:10.1017/S0266466611000387> and Smeekes (2015) <doi:10.1111/jtsa.12110> for key references.

Maintained by Stephan Smeekes. Last updated 1 month ago.

bootstrap dickey-fuller hypothesis-test time-series unit-root openblas cpp

1.3 match 10 stars 5.91 score 27 scripts

bioc

GeneBreak:Gene Break Detection

Recurrent breakpoint gene detection on copy number aberration profiles.

Maintained by Evert van den Broek. Last updated 5 months ago.

acgh copynumbervariation dnaseq genetics sequencing wholegenome visualization

1.6 match 2 stars 4.60 score 6 scripts

dcauseur

ERP:Significance Analysis of Event-Related Potentials Data

Functions for signal detection and identification designed for Event-Related Potentials (ERP) data in a linear model framework. The functional F-test proposed in Causeur, Sheu, Perthame, Rufini (2018, submitted) for analysis of variance issues in ERP designs is implemented for signal detection (tests for mean difference among groups of curves in One-way ANOVA designs for example). Once an experimental effect is declared significant, identification of significant intervals is achieved by the multiple testing procedures reviewed and compared in Sheu, Perthame, Lee and Causeur (2016, <DOI:10.1214/15-AOAS888>). Some of the methods gathered in the package are the classical FDR- and FWER-controlling procedures, also available using function p.adjust. The package also implements the Guthrie-Buchwald procedure (Guthrie and Buchwald, 1991 <DOI:10.1111/j.1469-8986.1991.tb00417.x>), which accounts for the auto-correlation among t-tests to control erroneous detection of short intervals. The Adaptive Factor-Adjustment method is an extension of the method described in Causeur, Chu, Hsieh and Sheu (2012, <DOI:10.3758/s13428-012-0230-0>). It assumes a factor model for the correlation among tests and combines adaptively the estimation of the signal and the updating of the dependence modelling (see Sheu et al., 2016, <DOI:10.1214/15-AOAS888> for further details).

Maintained by David Causeur. Last updated 5 years ago.

2.2 match 3.30 score 20 scripts

bioc

CSAR:Statistical tools for the analysis of ChIP-seq data

Statistical tools for ChIP-seq data analysis. The package includes the statistical method described in Kaufmann et al. (2009) PLoS Biology: 7(4):e1000090. Briefly, taking the average DNA fragment size subjected to sequencing into account, the software calculates genomic single-nucleotide read-enrichment values. After normalization, sample and control are compared using a test based on the Poisson distribution. Test statistic thresholds to control the false discovery rate are obtained through random permutation.

Maintained by Jose M Muino. Last updated 5 months ago.

chipseq transcription genetics

1.7 match 4.30 score 6 scripts
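
The last step of the CSAR description above, choosing a test statistic threshold by permutation, can be illustrated generically. In the Python sketch below the FDR at a threshold is estimated as the average number of permuted scores at or above it divided by the number of observed scores at or above it, and the most permissive passing threshold is returned. This is a common permutation scheme and an assumption here, not necessarily the package's exact procedure; the function name and all numbers are hypothetical.

```python
def permutation_fdr_threshold(observed, permuted_sets, target_fdr=0.05):
    """Return (threshold, estimated_fdr) for the lowest passing threshold.

    observed: test statistics from the real data.
    permuted_sets: one list of test statistics per permutation of the data.
    Returns None when no candidate threshold meets target_fdr.
    """
    # Candidate thresholds are the observed scores, tried from low to high,
    # so the first one that passes keeps the most observed scores.
    for t in sorted(set(observed)):
        n_obs = sum(s >= t for s in observed)  # at least 1 by construction
        n_null = sum(sum(s >= t for s in perm) for perm in permuted_sets)
        n_null /= len(permuted_sets)  # expected false positives per data set
        est_fdr = n_null / n_obs
        if est_fdr <= target_fdr:
            return t, est_fdr
    return None


# Hypothetical scores: four strong signals among background, and two
# permutations of the same data.
observed = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0, 12.0]
permuted_sets = [
    [0.5, 1.5, 2.5, 3.5, 1.0, 2.0, 8.5],
    [1.0, 0.7, 2.2, 3.1, 4.0, 1.1, 2.0],
]
result = permutation_fdr_threshold(observed, permuted_sets, target_fdr=0.2)
print(result)  # (chosen threshold, estimated FDR at that threshold)
```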

cb4ds

DGEobj.utils:Differential Gene Expression (DGE) Analysis Utility Toolkit

Provides a function toolkit to facilitate reproducible RNA-Seq Differential Gene Expression (DGE) analysis (Law (2015) <doi:10.12688/f1000research.9005.3>). The tools include both analysis work-flow and utility functions: mapping/unit conversion, count normalization, accounting for unknown covariates, and more. This is a complement/cohort to the 'DGEobj' package that provides a flexible container to manage and annotate Differential Gene Expression analysis results.

Maintained by Connie Brett. Last updated 2 months ago.

1.3 match 2 stars 5.26 score 30 scripts 1 dependent

bioc

Motif2Site:Detect binding sites from motifs and ChIP-seq experiments, and compare binding sites across conditions

Detect binding sites using motif IUPAC sequences or bed coordinates and ChIP-seq experiments in bed or bam format. Combine/compare binding sites across experiments, tissues, or conditions. All normalization and differential steps are done using the TMM-GLM method. Signal decomposition is done by setting motifs as the centers of the mixture of normal distribution curves.

Maintained by Peyman Zarrineh. Last updated 5 months ago.

software sequencing chipseq differentialpeakcalling epigenetics sequencematching

1.8 match 4.00 score 3 scripts

bioc

mirTarRnaSeq:mirTarRnaSeq

The mirTarRnaSeq R package can be used for interactive mRNA miRNA sequencing statistical analysis. This package utilizes expression or differential expression mRNA and miRNA sequencing results and performs interactive correlation and various GLM (Regular GLM, Multivariate GLM, and Interaction GLM) analyses between mRNA and miRNA experiments. These experiments can be time point experiments and/or condition experiments.

Maintained by Mercedeh Movassagh. Last updated 5 months ago.

mirna regression software sequencing smallrna timecourse differentialexpression

1.7 match 4.00 score 9 scripts

nmargaritella

APFr:Multiple Testing Approach using Average Power Function (APF) and Bayes FDR Robust Estimation

Implements a multiple testing approach to the choice of a threshold gamma on the p-values using the Average Power Function (APF) and Bayes False Discovery Rate (FDR) robust estimation. Function apf_fdr() estimates both quantities from either raw data or p-values. Function apf_plot() produces smooth graphs and tables of the relevant results. Details of the methods can be found in Quatto P, Margaritella N, et al. (2019) <doi:10.1177/0962280219844288>.

Maintained by Nicolò Margaritella. Last updated 6 years ago.

6.8 match 1.00 score

linc2021

HDTSA:High Dimensional Time Series Analysis Tools

An implementation for high-dimensional time series analysis methods, including factor model for vector time series proposed by Lam and Yao (2012) <doi:10.1214/12-AOS970> and Chang, Guo and Yao (2015) <doi:10.1016/j.jeconom.2015.03.024>, martingale difference test proposed by Chang, Jiang and Shao (2023) <doi:10.1016/j.jeconom.2022.09.001>, principal component analysis for vector time series proposed by Chang, Guo and Yao (2018) <doi:10.1214/17-AOS1613>, cointegration analysis proposed by Zhang, Robinson and Yao (2019) <doi:10.1080/01621459.2018.1458620>, unit root test proposed by Chang, Cheng and Yao (2022) <doi:10.1093/biomet/asab034>, white noise test proposed by Chang, Yao and Zhou (2017) <doi:10.1093/biomet/asw066>, CP-decomposition for matrix time series proposed by Chang et al. (2023) <doi:10.1093/jrsssb/qkac011> and Chang et al. (2024) <doi:10.48550/arXiv.2410.05634>, and statistical inference for spectral density matrix proposed by Chang et al. (2022) <doi:10.48550/arXiv.2212.13686>.

Maintained by Chen Lin. Last updated 2 months ago.

cpp

1.7 match 2 stars 3.94 score 11 scripts

bioc

IsoBayes:IsoBayes: Single Isoform protein inference Method via Bayesian Analyses

IsoBayes is a Bayesian method to perform inference on single protein isoforms. Our approach infers the presence/absence of protein isoforms, and also estimates their abundance; additionally, it provides a measure of the uncertainty of these estimates, via: i) the posterior probability that a protein isoform is present in the sample; ii) a posterior credible interval of its abundance. IsoBayes inputs liquid chromatography mass spectrometry (MS) data, and can work with both PSM counts and intensities. When available, transcript isoform abundances (i.e., TPMs) are also incorporated: TPMs are used to formulate an informative prior for the respective protein isoform relative abundance. We further identify isoforms where the relative abundance of proteins and transcripts significantly differ. We use a two-layer latent variable approach to model two sources of uncertainty typical of MS data: i) peptides may be erroneously detected (even when absent); ii) many peptides are compatible with multiple protein isoforms. In the first layer, we sample the presence/absence of each peptide based on its estimated probability of being mistakenly detected, also known as PEP (i.e., posterior error probability). In the second layer, for peptides that were estimated as being present, we allocate their abundance across the protein isoforms they map to. These two steps allow us to recover the presence and abundance of each protein isoform.

Maintained by Simone Tiberi. Last updated 5 months ago.

statisticalmethod bayesian proteomics massspectrometry alternativesplicing sequencing rnaseq geneexpression genetics visualization software cpp

1.3 match 7 stars 5.39 score 10 scripts

afukushima

TFactSR:Enrichment Approach to Predict Which Transcription Factors are Regulated

R implementation of 'TFactS' to predict which transcription factors (TFs) are regulated in a biological condition, based on lists of differentially expressed genes (DEGs) obtained from transcriptome experiments. This package is based on the 'TFactS' concept by Essaghir et al. (2010) <doi:10.1093/nar/gkq149> and expands it. It allows users to perform a 'TFactS'-like enrichment approach. The package can import and use the original catalogue file from 'TFactS' as well as user-defined catalogues of interest that are not supported by 'TFactS' (e.g., Arabidopsis).

Maintained by Atsushi Fukushima. Last updated 2 years ago.

network software differentialexpression genetarget geneexpression microarray rnaseq transcription networkenrichment

1.8 match 3.70 score 3 scripts

bioc

cycle:Significance of periodic expression pattern in time-series data

Package for assessing the statistical significance of periodic expression based on Fourier analysis and comparison with data generated by different background models.

Maintained by Matthias Futschik. Last updated 5 months ago.

microarray timecourse

1.7 match 3.72 score 13 scripts

cran

fdrtool:Estimation of (Local) False Discovery Rates and Higher Criticism

Estimates both tail area-based false discovery rates (Fdr) as well as local false discovery rates (fdr) for a variety of null models (p-values, z-scores, correlation coefficients, t-scores). The proportion of null values and the parameters of the null distribution are adaptively estimated from the data. In addition, the package contains functions for non-parametric density estimation (Grenander estimator), for monotone regression (isotonic regression and antitonic regression with weights), for computing the greatest convex minorant (GCM) and the least concave majorant (LCM), for the half-normal and correlation distributions, and for computing empirical higher criticism (HC) scores and the corresponding decision threshold.

Maintained by Korbinian Strimmer. Last updated 7 months ago.

0.8 match 3 stars 8.24 score 844 scripts 118 dependents

ccicb

CRUX:Easily explore patterns of somatic variation in cancer using 'CRUX'

Shiny app for exploring somatic variation in cancer. Powered by maftools.

Maintained by Sam El-Kamand. Last updated 1 year ago.

3.2 match 2 stars 2.00 score 5 scripts

bioc

GeneSelectMMD:Gene selection based on the marginal distributions of gene profiles that characterized by a mixture of three-component multivariate distributions

Gene selection based on a mixture of marginal distributions.

Maintained by Weiliang Qiu. Last updated 5 months ago.

differentialexpression fortran

1.7 match 3.78 score 1 script 1 dependent

cran

HDMT:A Multiple Testing Procedure for High-Dimensional Mediation Hypotheses

A multiple-testing procedure for high-dimensional mediation hypotheses. Mediation analysis is of rising interest in epidemiology and clinical trials. Among existing methods for mediation analyses, the popular joint significance (JS) test yields an overly conservative type I error rate and therefore low power. In the R package 'HDMT' we implement a multiple-testing procedure that accurately controls the family-wise error rate (FWER) and the false discovery rate (FDR) when using JS for testing high-dimensional mediation hypotheses. The core of our procedure is based on estimating the proportions of three component null hypotheses and deriving the corresponding mixture distribution of null p-values. Results of the data examples include better-behaved quantile-quantile plots and improved detection of novel mediation relationships on the role of DNA methylation in genetic regulation of gene expression. With increasing interest in mediation by molecular intermediaries such as gene expression, the proposed method addresses an unmet methodological challenge. Methods used in the package refer to James Y. Dai, Janet L. Stanford & Michael LeBlanc (2020) <doi:10.1080/01621459.2020.1765785>.

Maintained by James Dai. Last updated 3 years ago.

2.2 match 2.86 score 12 scripts 2 dependents

stephens999

ashr:Methods for Adaptive Shrinkage, using Empirical Bayes

The R package 'ashr' implements an Empirical Bayes approach for large-scale hypothesis testing and false discovery rate (FDR) estimation based on the methods proposed in M. Stephens, 2016, "False discovery rates: a new deal", <DOI:10.1093/biostatistics/kxw041>. These methods can be applied whenever two sets of summary statistics (estimated effects and standard errors) are available, just as 'qvalue' can be applied to previously computed p-values. Two main interfaces are provided: ash(), which is more user-friendly; and ash.workhorse(), which has more options and is geared toward advanced users. Both ash() and ash.workhorse() also provide a flexible modeling interface that can accommodate a variety of likelihoods (e.g., normal, Poisson) and mixture priors (e.g., uniform, normal).

Maintained by Peter Carbonetto. Last updated 10 months ago.

cpp

0.5 match 82 stars 12.10 score 780 scripts 15 dependents

scottpanhan

newIMVC:A Robust Integrated Mean Variance Correlation

Measure the dependence structure between two random variables with IMVC and extend IMVC to hypothesis testing, feature screening and FDR control.

Maintained by Han Pan. Last updated 1 years ago.

2.2 match 2.70 score

ranbi1990

ssizeRNA:Sample Size Calculation for RNA-Seq Experimental Design

We propose a procedure for sample size calculation while controlling the false discovery rate for RNA-seq experimental design. Our procedure depends on the Voom method proposed for RNA-seq data analysis by Law et al. (2014) <DOI:10.1186/gb-2014-15-2-r29> and the sample size calculation method proposed for microarray experiments by Liu and Hwang (2007) <DOI:10.1093/bioinformatics/btl664>. We develop a set of functions that calculate appropriate sample sizes for two-sample t-tests in RNA-seq experiments with a fixed or varied set of parameters. The outputs also contain a plot of power versus sample size, a table of power at different sample sizes, and a table of critical test values at different sample sizes. To install this package, please use 'source("http://bioconductor.org/biocLite.R"); biocLite("ssizeRNA")'. For R version 3.5 or greater, please use 'if(!requireNamespace("BiocManager", quietly = TRUE)){install.packages("BiocManager")}; BiocManager::install("ssizeRNA")'.

Maintained by Ran Bi. Last updated 6 years ago.

geneexpression differentialexpression experimentaldesign sequencing rnaseq dnaseq microarray

1.7 match 1 stars 3.53 score 28 scripts 1 dependents

cran

CEDA:CRISPR Screen and Gene Expression Differential Analysis

Provides analytical methods for analyzing CRISPR screen data at different levels of gene expression. Multi-component normal mixture models and EM algorithms are used for modeling.

Maintained by Lianbo Yu. Last updated 1 years ago.

1.7 match 3.40 score 2 scripts

cran

dSTEM:Multiple Testing of Local Extrema for Detection of Change Points

Simultaneously detect the number and locations of change points in piecewise linear models under stationary Gaussian noise, allowing autocorrelated random noise. The core idea is to transform the problem of detecting change points into the detection of local extrema (local maxima and local minima) through kernel smoothing and differentiation of the data sequence, see Cheng et al. (2020) <doi:10.1214/20-EJS1751>. A fast, computationally inexpensive algorithm called 'dSTEM' is introduced to detect change points based on the 'STEM' algorithm in D. Cheng and A. Schwartzman (2017) <doi:10.1214/16-AOS1458>.

Maintained by Zhibing He. Last updated 2 years ago.

5.0 match 1.00 score

bioc

ROSeq:Modeling expression ranks for noise-tolerant differential expression analysis of scRNA-Seq data

ROSeq is a rank-based approach to modeling gene expression from a filtered and normalized read count matrix. ROSeq takes a filtered and normalized read count matrix and the cell annotation/condition as input and determines the differentially expressed genes between the contrasting groups of single cells. One of the input parameters is the number of cores to be used.

Maintained by Krishan Gupta. Last updated 5 months ago.

geneexpression differentialexpression singlecell count-data gene-expression gene-expression-profiles normalization populations rank tmm tung tung-dataset tutorial vignette

1.1 match 2 stars 4.34 score 11 scripts

bioc

randRotation:Random Rotation Methods for High Dimensional Data with Batch Structure

A collection of methods for performing random rotations on high-dimensional, normally distributed data (e.g. microarray or RNA-seq data) with batch structure. The random rotation approach allows exact testing of dependent test statistics with linear models following arbitrary batch effect correction methods.

Maintained by Peter Hettegger. Last updated 5 months ago.

software sequencing batcheffect biomedicalinformatics rnaseq preprocessing microarray differentialexpression geneexpression genetics micrornaarray normalization statisticalmethod

1.3 match 3.60 score 3 scripts

cran

KnockoffHybrid:Hybrid Analysis of Population and Trio Data with Knockoff Statistics for FDR Control

Identification of putative causal variants in genome-wide association studies using hybrid analysis of both the trio and population designs. The package implements the method in the paper: Yang, Y., Wang, Q., Wang, C., Buxbaum, J., & Ionita-Laza, I. (2024). KnockoffHybrid: A knockoff framework for hybrid analysis of trio and population designs in genome-wide association studies. The American Journal of Human Genetics, in press.

Maintained by Yi Yang. Last updated 9 months ago.

2.8 match 1.70 score

bioc

multtest:Resampling-based multiple hypothesis testing

Non-parametric bootstrap and permutation resampling-based multiple testing procedures (including empirical Bayes methods) for controlling the family-wise error rate (FWER), generalized family-wise error rate (gFWER), tail probability of the proportion of false positives (TPPFP), and false discovery rate (FDR). Several choices of bootstrap-based null distribution are implemented (centered, centered and scaled, quantile-transformed). Single-step and step-wise methods are available. Tests based on a variety of t- and F-statistics (including t-statistics based on regression parameters from linear and survival models as well as those based on correlation parameters) are included. When probing hypotheses with t-statistics, users may also select a potentially faster null distribution which is multivariate normal with mean zero and variance covariance matrix derived from the vector influence function. Results are reported in terms of adjusted p-values, confidence regions and test statistic cutoffs. The procedures are directly applicable to identifying differentially expressed genes in DNA microarray experiments.

Maintained by Katherine S. Pollard. Last updated 5 months ago.

microarray differentialexpression multiplecomparison

0.5 match 9.34 score 932 scripts 136 dependents

jacky11

cp4p:Calibration Plot for Proteomics

Functions to check whether a vector of p-values respects the assumptions of FDR (false discovery rate) control procedures and to compute adjusted p-values.

Maintained by Quentin Giai Gianetto. Last updated 6 years ago.

2.3 match 2.03 score 18 scripts 1 dependents

jbp7

TEAM:Multiple Hypothesis Testing on an Aggregation Tree Method

An implementation of the TEAM algorithm to identify local differences between two (e.g. case and control) independent, univariate distributions, as described in J Pura, C Chan, and J Xie (2019) <arXiv:1906.07757>. The algorithm is based on embedding a multiple-testing procedure on a hierarchical structure to identify high-resolution differences between two distributions. The hierarchical structure is designed to identify strong, short-range differences at lower layers and weaker, but long-range differences at increasing layers. TEAM yields consistent layer-specific and overall false discovery rate control.

Maintained by John Pura. Last updated 6 years ago.

2.3 match 2.00 score

cran

NewmanOmics:Extending the Newman Studentized Range Statistic to Transcriptomics

Extends the classical Newman studentized range statistic in various ways that can be applied to genome-scale transcriptomic or other expression data.

Maintained by Kevin R. Coombes. Last updated 27 days ago.

1.8 match 2.38 score 12 scripts

cran

SiFINeT:Single Cell Feature Identification with Network Topology

Cluster-independent method based on topology structure of gene co-expression network for identifying feature gene sets, extracting cellular subpopulations, and elucidating intrinsic relationships among these subpopulations. Without prior cell clustering, SifiNet circumvents potential inaccuracies in clustering that may influence subsequent analyses. This method is introduced in Qi Gao, Zhicheng Ji, Liuyang Wang, Kouros Owzar, Qi-Jing Li, Cliburn Chan, Jichun Xie "SifiNet: a robust and accurate method to identify feature gene sets and annotate cells" (2024) <doi:10.1093/nar/gkae307>.

Maintained by Qi Gao. Last updated 2 months ago.

openblas cpp openmp

1.5 match 2.70 score

bioc

siggenes:Multiple Testing using SAM and Efron's Empirical Bayes Approaches

Identification of differentially expressed genes and estimation of the False Discovery Rate (FDR) using both the Significance Analysis of Microarrays (SAM) and the Empirical Bayes Analyses of Microarrays (EBAM).

Maintained by Holger Schwender. Last updated 5 months ago.

multiplecomparison microarray geneexpression snp exonarray differentialexpression

0.5 match 7.87 score 74 scripts 34 dependents

t-grimes

dnapath:Differential Network Analysis using Gene Pathways

Integrates pathway information into the differential network analysis of two gene expression datasets as described in Grimes, Potter, and Datta (2019) <doi:10.1038/s41598-019-41918-3>. Provides summary functions to break down the results at the pathway, gene, or individual connection level. The differential networks for each pathway of interest can be plotted, and the visualization will highlight any differentially expressed genes and all of the gene-gene associations that are significantly differentially connected.

Maintained by Tyler Grimes. Last updated 25 days ago.

cpp

1.7 match 2.30 score 5 scripts

ajbass

sffdr:Surrogate Functional False Discovery Rates for Genome-Wide Association Studies

Pleiotropy-informed significance analysis of genome-wide association studies with surrogate functional false discovery rates (sfFDR). The sfFDR framework adapts the fFDR to leverage informative data from multiple sets of GWAS summary statistics to increase power in a study while accommodating linkage disequilibrium. sfFDR provides estimates of key FDR quantities in a significance analysis such as the functional local FDR and q-value, and uses these estimates to derive a functional p-value for type I error rate control and a functional local Bayes' factor for post-GWAS analyses (e.g., fine mapping and colocalization).

Maintained by Andrew Bass. Last updated 1 months ago.

cpp

0.8 match 4 stars 5.00 score 3 scripts

bioc

TargetDecoy:Diagnostic Plots to Evaluate the Target Decoy Approach

A first step in the data analysis of Mass Spectrometry (MS) based proteomics data is to identify peptides and proteins. In this respect, the huge number of experimental mass spectra typically have to be assigned to theoretical peptides derived from a sequence database. Search engines are used for this purpose. These tools compare each of the observed spectra to all candidate theoretical spectra derived from the sequence database and calculate a score for each comparison. The observed spectrum is then assigned to the theoretical peptide with the best score, which is also referred to as the peptide to spectrum match (PSM). It is of course crucial for the downstream analysis to evaluate the quality of these matches. Therefore False Discovery Rate (FDR) control is used to return a reliable list of PSMs. The FDR, however, requires a good characterisation of the score distribution of PSMs that are matched to the wrong peptide (bad target hits). In proteomics, the target decoy approach (TDA) is typically used for this purpose. The TDA method matches the spectra to a database of real (targets) and nonsense peptides (decoys). A popular approach to generate these decoys is to reverse the target database. Hence, all the PSMs that match to a decoy are known to be bad hits and the distribution of their scores is used to estimate the distribution of the bad scoring target PSMs. A crucial assumption of the TDA is that the decoy PSM hits have similar properties as bad target hits so that the decoy PSM scores are a good simulation of the bad target PSM scores. Users, however, typically do not evaluate these assumptions. To this end we developed TargetDecoy to generate diagnostic plots to evaluate the quality of the target decoy method.

Maintained by Elke Debrie. Last updated 5 months ago.

massspectrometry proteomics qualitycontrol software visualization bioconductor mass-spectrometry

0.8 match 1 stars 4.60 score 9 scripts
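
The target decoy idea described in the TargetDecoy entry can be illustrated with a short Python sketch. It is not TargetDecoy's code or API: the function names `tda_fdr` and `lowest_threshold` are hypothetical, higher scores are assumed to be better, and the simple estimator "decoys above threshold divided by targets above threshold" is one common choice among several.

```python
def tda_fdr(target_scores, decoy_scores, threshold):
    """Estimate the FDR among target PSMs scoring at or above `threshold`.

    Assumption: decoy hits above the threshold stand in for the number of
    bad target hits above the same threshold.
    """
    n_targets = sum(1 for s in target_scores if s >= threshold)
    n_decoys = sum(1 for s in decoy_scores if s >= threshold)
    if n_targets == 0:
        return 0.0  # nothing is reported, so no false discoveries
    return min(1.0, n_decoys / n_targets)


def lowest_threshold(target_scores, decoy_scores, max_fdr=0.01):
    """Smallest observed target score whose estimated FDR is <= max_fdr."""
    for t in sorted(set(target_scores)):
        if tda_fdr(target_scores, decoy_scores, t) <= max_fdr:
            return t
    return None  # no threshold meets the requested FDR level


if __name__ == "__main__":
    targets = [9, 8, 7, 6, 5, 4]
    decoys = [5.5, 3, 2]
    print(tda_fdr(targets, decoys, 5))
    print(lowest_threshold(targets, decoys, max_fdr=0.05))
```

The estimate is only as good as the assumption that decoy scores mimic bad target scores, which is the assumption the package's diagnostic plots are meant to check.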

petersyy1677

TCIU:Spacekime Analytics, Time Complexity and Inferential Uncertainty

Provides the core functionality to transform longitudinal data to complex-time (kime) data using analytic and numerical techniques, visualize the original time-series and reconstructed kime-surfaces, and perform model-based (e.g., tensor-linear regression) and model-free classification and clustering methods described in the book Dinov, ID and Velev, MV. (2021) "Data Science: Time Complexity, Inferential Uncertainty, and Spacekime Analytics", De Gruyter STEM Series, ISBN 978-3-11-069780-3. <https://www.degruyter.com/view/title/576646>. The package includes 18 core functions which can be separated into three groups. 1) draw longitudinal data, such as functional magnetic resonance imaging (fMRI) time-series, and forecast or transform the time-series data. 2) simulate real-valued time-series data, e.g., fMRI time-courses, detect the activated areas, report the corresponding p-values, and visualize the p-values in the 3D brain space. 3) Laplace transform and kimesurface reconstructions of the fMRI data.

Maintained by Yueyang Shen. Last updated 6 months ago.

fortran openblas

1.2 match 2.90 score 2 scripts

christianak

ManyTests:Multiple Testing Procedures of Cox (2011) and Wong and Cox (2007)

Performs the multiple testing procedures of Cox (2011) <doi:10.5170/CERN-2011-006> and Wong and Cox (2007) <doi:10.1080/02664760701240014>.

Maintained by Christiana Kartsonaki. Last updated 8 years ago.

3.3 match 1.00 score 5 scripts

mqbssppe

bayesCureRateModel:Bayesian Cure Rate Modeling for Time-to-Event Data

A fully Bayesian approach for estimating a general family of cure rate models in the presence of covariates, see Papastamoulis and Milienos (2024) <doi:10.1007/s11749-024-00942-w>. The promotion time can be modelled (a) parametrically using typical distributional assumptions for time to event data (including the Weibull, Exponential, Gompertz, log-Logistic distributions), or (b) semiparametrically using finite mixtures of distributions. In both cases, user-defined families of distributions are allowed under some specific requirements. Posterior inference is carried out by constructing a Metropolis-coupled Markov chain Monte Carlo (MCMC) sampler, which combines Gibbs sampling for the latent cure indicators and Metropolis-Hastings steps with Langevin diffusion dynamics for parameter updates. The main MCMC algorithm is embedded within a parallel tempering scheme by considering heated versions of the target posterior distribution.

Maintained by Panagiotis Papastamoulis. Last updated 6 months ago.

openblas cpp

1.9 match 1.60 score 1 scripts

nchenderson

rvalues:R-Values for Ranking in High-Dimensional Settings

A collection of functions for computing "r-values" from various kinds of user input such as MCMC output or a list of effect size estimates and associated standard errors. Given a large collection of measurement units, the r-value, r, of a particular unit is a reported percentile that may be interpreted as the smallest percentile at which the unit should be placed in the top r-fraction of units.

Maintained by Nicholas Henderson. Last updated 4 years ago.

2.3 match 1.30 score 20 scripts

cran

KnockoffTrio:GWAS with Trio and Duo Data using Knockoff Statistics for FDR Control

Identification of putative causal variants in genome-wide association studies with trio and duo families. The package calculates the W feature statistics from KnockoffTrio and p-values from the family-based association test (FBAT) using trio and/or duo data. Compared to previous versions, a significant improvement has been made in Version 1.1.0 to allow the package to be applied not only to trio families but also to duo families. The package implements the methods in the paper: "Yang, Y., Wang, C., Liu, L., Buxbaum, J., He, Z., & Ionita-Laza, I. (2022). KnockoffTrio: A knockoff framework for the identification of putative causal variants in genome-wide association studies with trio design. The American Journal of Human Genetics, 109(10), 1761-1776."

Maintained by Yi Yang. Last updated 2 months ago.

2.8 match 1.00 score

kaijunwang19

LPRelevance:Relevance-Integrated Statistical Inference Engine

Provide methods to perform customized inference at individual level by taking contextual covariates into account. Three main functions are provided in this package: (i) LASER(): it generates specially-designed artificial relevant samples for a given case; (ii) g2l.proc(): computes customized fdr(z|x); and (iii) rEB.proc(): performs empirical Bayes inference based on LASERs. The details can be found in Mukhopadhyay, S., and Wang, K (2021, <arXiv:2004.09588>).

Maintained by Kaijun Wang. Last updated 3 years ago.

2.8 match 1.00 score

msesia

knockoff:The Knockoff Filter for Controlled Variable Selection

The knockoff filter is a general procedure for controlling the false discovery rate (FDR) when performing variable selection. For more information, see the website below and the accompanying paper: Candes et al., "Panning for gold: model-X knockoffs for high-dimensional controlled variable selection", J. R. Statist. Soc. B (2018) 80, 3, pp. 551-577.

Maintained by Matteo Sesia. Last updated 3 years ago.

0.5 match 2 stars 5.35 score 248 scripts 5 dependents
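
As an illustration of the selection step mentioned in the knockoff entry, the Python sketch below applies the usual knockoff thresholding rule to a vector of feature statistics W, where large positive values are evidence for a variable. It assumes the W statistics have already been computed; the function names are hypothetical and this is not the package's R interface. With `offset=1` it corresponds to the more conservative "knockoff+" variant.

```python
def knockoff_threshold(w, fdr=0.10, offset=1):
    """Smallest t > 0 such that
    (offset + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= fdr.

    Returns infinity when no candidate threshold satisfies the bound.
    """
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        if (offset + negatives) / max(1, positives) <= fdr:
            return t
    return float("inf")


def knockoff_select(w, fdr=0.10, offset=1):
    """Indices of variables whose statistic reaches the threshold."""
    t = knockoff_threshold(w, fdr, offset)
    return [j for j, x in enumerate(w) if x >= t]


if __name__ == "__main__":
    w = [10, 9, 8, 7, 6, 5, 4.5, 4, 3.5, 3, 2.5, 2, -1, 0.5, -0.2]
    print(knockoff_threshold(w, fdr=0.10))
    print(knockoff_select(w, fdr=0.10))
```

Negative statistics act as an estimate of how many positive ones are false, which is why the offset makes it impossible to select anything when there are only a handful of variables and a strict target level.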

bioc

cn.mops:cn.mops - Mixture of Poissons for CNV detection in NGS data

cn.mops (Copy Number estimation by a Mixture Of PoissonS) is a data processing pipeline for copy number variations and aberrations (CNVs and CNAs) from next generation sequencing (NGS) data. The package supplies functions to convert BAM files into read count matrices or genomic ranges objects, which are the input objects for cn.mops. cn.mops models the depths of coverage across samples at each genomic position. Therefore, it does not suffer from read count biases along chromosomes. Using a Bayesian approach, cn.mops decomposes read variations across samples into integer copy numbers and noise by its mixture components and Poisson distributions, respectively. cn.mops guarantees a low FDR because wrong detections are indicated by high noise and filtered out. cn.mops is very fast and written in C++.

Maintained by Gundula Povysil. Last updated 3 months ago.

sequencing copynumbervariation homo_sapiens cellbiology hapmap genetics cpp

0.5 match 5.35 score 94 scripts 4 dependents

bioc

RnaSeqSampleSize:RnaSeqSampleSize

The RnaSeqSampleSize package provides a sample size calculation method based on the negative binomial model and the exact test for assessing differential expression analysis of RNA-seq data. It controls FDR for multiple testing and utilizes the average read count and dispersion distributions from real data to estimate a more reliable sample size. It is also equipped with several unique features, including estimation for genes or pathways of interest, power curve visualization, and parameter optimization.

Maintained by Shilin Zhao Developer. Last updated 5 months ago.

immunooncology experimentaldesign sequencing rnaseq geneexpression differentialexpression cpp

0.5 match 5.30 score 20 scripts

bioc

scCB2:CB2 improves power of cell detection in droplet-based single-cell RNA sequencing data

scCB2 is an R package implementing CB2 for distinguishing real cells from empty droplets in droplet-based single cell RNA-seq experiments (especially for 10x Chromium). It is based on clustering similar barcodes and calculating a Monte Carlo p-value for each cluster to test against the background distribution. This cluster-level test outperforms single-barcode-level tests in dealing with low count barcodes and homogeneous sequencing libraries, while keeping FDR well controlled.

Maintained by Zijian Ni. Last updated 5 months ago.

dataimport rnaseq singlecell sequencing geneexpression transcriptomics preprocessing clustering

0.5 match 10 stars 5.30 score 5 scripts

xiongzhichen

fdrDiscreteNull:False Discovery Rate Procedures Under Discrete and Heterogeneous Null Distributions

It is known that current false discovery rate (FDR) procedures can be very conservative when applied to multiple testing in the discrete paradigm where p-values (and test statistics) have discrete and heterogeneous null distributions. This package implements more powerful weighted or adaptive FDR procedures for FDR control and estimation in the discrete paradigm. The package takes in the original data set rather than just the p-values in order to carry out the adjustments for discreteness and heterogeneity of p-value distributions. The package implements methods for two types of test statistics and their p-values: (a) the binomial test of whether two independent Poisson distributions have the same mean, (b) Fisher's exact test of whether the conditional distribution is the same as the marginal distribution for two binomial distributions, or of whether two independent binomial distributions have the same probability of success.

Maintained by Xiongzhi Chen. Last updated 5 years ago.

2.6 match 1 stars 1.00 score 6 scripts

cran

MiDA:Microarray Data Analysis

Set of functions designed to simplify transcriptome analysis and identification of marker molecules using microarray data. The package includes a set of functions for performing a full analysis pipeline, including data normalization, summarisation, binary classification, FDR (False Discovery Rate) multiple comparison and the definition of potential biological markers.

Maintained by Elena Filatova. Last updated 6 years ago.

2.3 match 1.08 score 12 scripts

yijuanhu

LDM:Testing Hypotheses About the Microbiome using the Linear Decomposition Model

A single analysis path that includes distance-based ordination, global tests of any effect of the microbiome, and tests of the effects of individual taxa with false-discovery-rate (FDR) control. It accommodates both continuous and discrete covariates as well as interaction terms to be tested either singly or in combination, allows for adjustment of confounding covariates, and uses permutation-based p-values that can control for sample correlations. It can be applied to transformed data, and an omnibus test can combine results from analyses conducted on different transformation scales. It can also be used for testing presence-absence associations based on an infinite number of rarefaction replicates, testing mediation effects of the microbiome, analyzing censored time-to-event outcomes, and for compositional analysis by fitting linear models to centered-log-ratio taxa count data.

Maintained by Yi-Juan Hu. Last updated 2 years ago.

0.5 match 7 stars 4.91 score 23 scripts

bioc

magpie:MeRIP-Seq data Analysis for Genomic Power Investigation and Evaluation

This package aims to perform power analysis for the MeRIP-seq study. It calculates FDR, FDC, power, and precision under various study design parameters, including but not limited to sample size, sequencing depth, and testing method. It can also output results into .xlsx files or produce corresponding figures of choice.

Maintained by Daoyu Duan. Last updated 5 months ago.

epitranscriptomics differentialmethylation sequencing rnaseq software

0.5 match 4.60 score 40 scripts

cran

GroupTest:Multiple Testing Procedure for Grouped Hypotheses

Contains functions for a two-stage multiple testing procedure for grouped hypotheses, aiming at controlling both the total posterior false discovery rate and the within-group false discovery rate.

Maintained by Zhigen Zhao. Last updated 9 years ago.

1.8 match 1.30 score

olangsrud

ffmanova:Fifty-Fifty MANOVA

General linear modeling with multiple responses (MANCOVA). An overall p-value for each model term is calculated by the 50-50 MANOVA method by Langsrud (2002) <doi:10.1111/1467-9884.00320>, which handles collinear responses. Rotation testing, described by Langsrud (2005) <doi:10.1007/s11222-005-4789-5>, is used to compute adjusted single response p-values according to familywise error rates and false discovery rates (FDR). The approach to FDR is described in the appendix of Moen et al. (2005) <doi:10.1128/AEM.71.4.2086-2094.2005>. Unbalanced designs are handled by Type II sums of squares as argued in Langsrud (2003) <doi:10.1023/A:1023260610025>. Furthermore, the Type II philosophy is extended to continuous design variables as described in Langsrud et al. (2007) <doi:10.1080/02664760701594246>. This means that the method is invariant to scale changes and that common pitfalls are avoided.

Maintained by Øyvind Langsrud. Last updated 1 years ago.

0.8 match 2 stars 3.00 score 7 scripts

cran

simpleFDR:Simple False Discovery Rate Calculation

Using the adjustment method from Benjamini & Hochberg (1995) <doi:10.1111/j.2517-6161.1995.tb02031.x>, this package determines which variables are significant under repeated testing with a given dataframe of p values and a user-defined "q" threshold. It then returns the original dataframe along with a significance column where an asterisk denotes a significant p value after FDR calculation, and NA denotes all other p values. This package uses the Benjamini & Hochberg method specifically as described in Lee, S., & Lee, D. K. (2018) <doi:10.4097/kja.d.18.00242>.

Maintained by Stephen Wisser. Last updated 3 years ago.

2.3 match 1.00 score
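
The Benjamini & Hochberg step-up rule referred to in the simpleFDR entry can be sketched in a few lines of Python. This is an illustration of the rule only, not the package's R code: the function names `bh_significant` and `mark` are hypothetical, and the asterisk/None marking merely mimics the significance column the description mentions.

```python
def bh_significant(p_values, q=0.05):
    """Flag p-values that are significant under Benjamini-Hochberg at level q.

    Sort the m p-values, find the largest rank k with p_(k) <= (k / m) * q,
    and declare the k smallest p-values significant.
    """
    m = len(p_values)
    if m == 0:
        return []
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank  # keep the largest rank that passes (step-up)
    significant = [False] * m
    for idx in order[:k_max]:
        significant[idx] = True
    return significant


def mark(p_values, q=0.05):
    """Return '*' for significant p-values and None otherwise."""
    return ["*" if s else None for s in bh_significant(p_values, q)]


if __name__ == "__main__":
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(mark(p, q=0.05))
```

Because the rule is step-up, a p-value that fails its own rank's bound can still be declared significant when a larger p-value passes at a higher rank.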

bioc

mbQTL:mbQTL: A package for SNP-Taxa mGWAS analysis

mbQTL is a statistical R package for simultaneous relationship, correlation and regression studies of 16S rRNA/16S rDNA (microbial) data and variant, SNP, SNV (host) data. We apply linear, logistic and correlation-based statistics to identify the relationships between taxa, genus, species and variant, SNP, SNV in the infected host. We produce various statistical significance measures such as P values, FDR, BC and probability estimation to show the significance of these relationships. Further, we provide various visualization functions for ease and clarification of the results of these analyses. The package is compatible with dataframe, MRexperiment and text formats.

Maintained by Mercedeh Movassagh. Last updated 5 months ago.

snp microbiome wholegenome metagenomics statisticalmethod regression

0.5 match 1 stars 4.00 score 3 scripts

bioc

MBttest:Multiple Beta t-Tests

The MBttest method was developed from the beta t-test method of Baggerly et al. (2003). Compared to baySeq (Hardcastle and Kelly 2010), DESeq (Anders and Huber 2010), the exact test (Robinson and Smyth 2007, 2008) and the GLM of McCarthy et al. (2012), MBttest has high work efficiency, that is, it has high power, high conservativeness of FDR estimation and high stability. MBttest is suitable for transcriptomic data, tag data, and SAGE data (count data) from small samples or a few replicate libraries. It can be used to identify genes, mRNA isoforms or tags differentially expressed between two conditions.

Maintained by Yuan-De Tan. Last updated 5 months ago.

sequencing differentialexpression multiplecomparison sage geneexpression transcription alternativesplicing coverage differentialsplicing

0.5 match 4.00 score 3 scripts

ardenewan

APCanalysis:Analysis of Unreplicated Orthogonal Experiments using All Possible Comparisons

Analysis of data from unreplicated orthogonal experiments such as 2-level factorial and fractional factorial designs and Plackett-Burman designs using the all possible comparisons (APC) methodology developed by Miller (2005) <doi:10.1198/004017004000000608>.

Maintained by Arden Miller. Last updated 7 years ago.

2.0 match 1.00 score 10 scripts

abichat

evabic:Evaluation of Binary Classifiers

Evaluates the performance of binary classifiers. Computes confusion measures (TP, TN, FP, FN), derived measures (TPR, FDR, accuracy, F1, DOR, ..), and area under the curve. Outputs are well suited for nested dataframes.

Maintained by Antoine Bichat. Last updated 3 years ago.

classifier measures predictors roc-curve statistics

0.5 match 6 stars 3.62 score 14 scripts
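
The measures named in the evabic entry follow directly from the four confusion counts. The Python sketch below is a generic illustration with a hypothetical function name, not evabic's interface; undefined ratios are returned as NaN here, which is an assumption of this sketch.

```python
def confusion_measures(predicted, actual):
    """Compute confusion counts and a few derived measures from booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)

    def ratio(num, den):
        # Undefined ratios (zero denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "TP": tp, "TN": tn, "FP": fp, "FN": fn,
        "TPR": ratio(tp, tp + fn),            # sensitivity / recall
        "FDR": ratio(fp, fp + tp),            # false discovery rate
        "accuracy": ratio(tp + tn, len(pairs)),
        "F1": ratio(2 * tp, 2 * tp + fp + fn),
        "DOR": ratio(tp * tn, fp * fn),       # diagnostic odds ratio
    }


if __name__ == "__main__":
    predicted = [True, True, True, False, False, False, True, False]
    actual = [True, True, False, False, False, True, True, False]
    print(confusion_measures(predicted, actual))
```

Note that FDR in this classifier sense is the observed proportion FP / (FP + TP), whereas the testing procedures listed elsewhere on this page control or estimate its expectation.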

bioc

cypress:Cell-Type-Specific Power Assessment

CYPRESS is a cell-type-specific power tool. This package aims to perform power analysis for cell-type-specific data. It calculates FDR, FDC, and power under various study design parameters, including but not limited to sample size and effect size. It takes as input a SummarizedExperiment (SE) object with observed mixture data (feature by sample matrix) and the cell-type mixture proportions (sample by cell-type matrix). It can solve the cell-type mixture proportions from the reference-free panel from TOAST and conduct tests to identify cell-type-specific differential expression (csDE) genes.

Maintained by Shilin Yu. Last updated 5 months ago.

software geneexpression dataimport rnaseq sequencing

0.5 match 1 stars 3.70 score 2 scripts

yushengding

DNLC:Differential Network Local Consistency Analysis

Using Local Moran's I for detection of differential network local consistency.

Maintained by Yusheng Ding. Last updated 8 years ago.

1.8 match 1.00 score

rahmasarina

NMTox:Dose-Response Relationship Analysis of Nanomaterial Toxicity

Perform an exploration and a preliminary analysis of the dose-response relationship of nanomaterial toxicity. Several functions are provided for data exploration, including functions for creating a subset of a dataset, frequency tables and plots. Inference for order-restricted dose-response data is performed by testing the significance of a monotonic dose-response relationship, using Williams, Marcus, M, Modified M and Likelihood ratio tests. Several methods of multiplicity adjustment are also provided. Description of the methods can be found in <https://github.com/rahmasarina/dose-response-analysis/blob/main/Methodology.pdf>.

Maintained by Rahmasari Nur Azizah. Last updated 3 years ago.

1.7 match 1.00 score

cran

JUMP:Replicability Analysis of High-Throughput Experiments

Implements a computationally scalable false discovery rate control procedure for replicability analysis based on the maximum of p-values. Please cite the manuscript corresponding to this package [Lyu, P. et al., (2023), <https://www.biorxiv.org/content/10.1101/2023.02.13.528417v2>].

Maintained by Yan Li. Last updated 2 years ago.

cpp

1.7 match 1.00 score

bioc

Mulcom:Calculates Mulcom test

Identification of differentially expressed genes and false discovery rate (FDR) calculation by Multiple Comparison test.

Maintained by Claudio Isella. Last updated 5 months ago.

statisticalmethod multiplecomparison microarray differentialexpression geneexpression cpp

0.5 match 3.00 score

freejstone

groupwalk:Implement the Group Walk Algorithm

A procedure that uses target-decoy competition (or knockoffs) to reject multiple hypotheses in the presence of group structure. The procedure controls the false discovery rate (FDR) at a user-specified threshold.

Maintained by Jack Freestone. Last updated 3 years ago.

0.5 match 2.70 score 1 scripts

vallejosgroup

bayefdr:Bayesian Estimation and Optimisation of Expected False Discovery Rate

Implements the Bayesian FDR control described by Newton et al. (2004), <doi:10.1093/biostatistics/5.2.155>. Allows optimisation and visualisation of expected error rates based on tail posterior probability tests. Based on code written by Catalina Vallejos for BASiCS, see "Beyond comparisons of means: understanding changes in gene expression at the single-cell level", Vallejos et al. (2016) <doi:10.1186/s13059-016-0930-3>.

Maintained by Alan OCallaghan. Last updated 3 years ago.

0.5 match 2.70 score 1 scripts
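
To make the idea of an expected FDR from posterior probabilities concrete, here is a small Python sketch related to the bayefdr entry. It assumes each test comes with a posterior probability that the effect is real, estimates the expected FDR of a probability cutoff as the mean of (1 - probability) over the selected tests, and then picks the lowest cutoff that meets a target. The function names and the "lowest cutoff at or below the target" rule are assumptions of this sketch, not bayefdr's API or its exact optimisation criterion.

```python
def expected_fdr(probs, threshold):
    """Mean posterior probability of a false call among selected tests."""
    selected = [p for p in probs if p >= threshold]
    if not selected:
        return 0.0  # nothing selected, so no expected false discoveries
    return sum(1.0 - p for p in selected) / len(selected)


def choose_threshold(probs, target=0.05):
    """Lowest observed probability cutoff whose expected FDR is <= target."""
    for t in sorted(set(probs)):
        if expected_fdr(probs, t) <= target:
            return t
    return None  # even the strictest cutoff exceeds the target


if __name__ == "__main__":
    probs = [0.99, 0.95, 0.90, 0.60, 0.30, 0.10]
    for t in (0.6, 0.9):
        print(t, round(expected_fdr(probs, t), 4))
    print(choose_threshold(probs, target=0.10))
```

Lowering the cutoff selects more tests but admits ones with lower posterior probability, so the expected FDR rises; the search simply walks that trade-off.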

olechnwin

DIME:Differential Identification using Mixture Ensemble

A robust method for identifying differential binding sites in ChIP-seq (Chromatin Immunoprecipitation Sequencing) data comparing two samples. It considers an ensemble of finite mixture models combined with a local false discovery rate (fdr), allowing for flexible modeling of data. The Differential Identification using Mixture Ensemble (DIME) method is described in: Taslim et al., (2011) <doi:10.1093/bioinformatics/btr165>.

Maintained by Cenny Taslim. Last updated 3 years ago.

0.5 match 2.63 score 43 scripts

cran

SILGGM:Statistical Inference of Large-Scale Gaussian Graphical Model in Gene Networks

Provides a general framework to perform statistical inference of each gene pair and global inference of whole-scale gene pairs in gene networks using the well-known Gaussian graphical model (GGM) in a time-efficient manner. We focus on high-dimensional settings where p (the number of genes) is allowed to be far larger than n (the number of subjects). Four main approaches are supported in this package: (1) the bivariate nodewise scaled Lasso (Ren et al. (2015) <doi:10.1214/14-AOS1286>); (2) the de-sparsified nodewise scaled Lasso (Jankova and van de Geer (2017) <doi:10.1007/s11749-016-0503-5>); (3) the de-sparsified graphical Lasso (Jankova and van de Geer (2015) <doi:10.1214/15-EJS1031>); (4) GGM estimation with false discovery rate (FDR) control using scaled Lasso or Lasso (Liu (2013) <doi:10.1214/13-AOS1169>). Windows users should install 'Rtools' before installing this package.

Maintained by Rong Zhang. Last updated 7 years ago.

cpp

0.5 match 3 stars 2.40 score 14 scripts 1 dependents

cran

sdafilter:Symmetrized Data Aggregation

We develop a new class of distribution-free multiple testing rules for false discovery rate (FDR) control under general dependence. A key element in our proposal is a symmetrized data aggregation (SDA) approach to incorporating the dependence structure via sample splitting, data screening and information pooling. The proposed SDA filter first constructs a sequence of ranking statistics that fulfill global symmetry properties, and then chooses a data-driven threshold along the ranking to control the FDR. For more information, see the website below and the accompanying paper: Du et al. (2020), "False Discovery Rate Control Under General Dependence By Symmetrized Data Aggregation", <arXiv:2002.11992>.

Maintained by Lilun Du. Last updated 5 years ago.

0.8 match 1.00 score

cran

DiscreteQvalue:Improved q-Values for Discrete Uniform and Homogeneous Tests

We consider a multiple testing procedure used in many modern applications: the q-value method proposed by Storey and Tibshirani (2003), <doi:10.1073/pnas.1530509100>. The q-value method is based on the false discovery rate (FDR), so different versions of the method can be defined depending on which estimator of the proportion of true null hypotheses, pi0, is plugged into the FDR estimator. We implement the q-value method based on two classical pi0 estimators, and furthermore, we propose and implement three versions of the q-value method for homogeneous discrete uniform P-values based on pi0 estimators which take into account the discrete distribution of the P-values.

Maintained by Marta Cousido Rocha. Last updated 5 years ago.

0.8 match 1.00 score
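As a rough illustration of the plug-in idea in the DiscreteQvalue description above, the following minimal Python sketch computes q-values from a list of p-values with a fixed-threshold, Storey-type pi0 estimate. It is not the package's API: the function names `estimate_pi0` and `qvalues` and the tuning parameter `lam` are hypothetical, and the estimator shown is a generic continuous-case choice, not one of the discrete-uniform estimators the package proposes.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Plug-in estimate of the proportion of true nulls (assumed fixed lam)."""
    m = len(pvalues)
    above = sum(1 for p in pvalues if p > lam)
    # p-values above lam are assumed to come mostly from true nulls
    return min(1.0, above / (m * (1.0 - lam)))


def qvalues(pvalues, pi0=None, lam=0.5):
    """q-values: monotone version of the FDR estimate pi0 * m * p_(i) / i."""
    m = len(pvalues)
    if pi0 is None:
        pi0 = estimate_pi0(pvalues, lam)
    # indices of the p-values in increasing order
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # walk from the largest p-value down so each q-value is the minimum
    # estimated FDR over all thresholds at or above that p-value
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        fdr_hat = pi0 * m * pvalues[i] / rank
        running_min = min(running_min, fdr_hat)
        q[i] = running_min
    return q
```

With `pi0=1.0` this reduces to the Benjamini-Hochberg adjusted p-values; a smaller pi0 estimate scales every q-value down proportionally, which is why the choice of estimator matters.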

cran

GhostKnockoff:The Knockoff Inference Using Summary Statistics

Functions for multiple knockoff inference using summary statistics, e.g. Z-scores. The knockoff inference is a general procedure for controlling the false discovery rate (FDR) when performing variable selection. This package provides a procedure which performs knockoff inference without ever constructing individual knockoffs (GhostKnockoff). It additionally supports multiple knockoff inference for improved stability and reproducibility. Moreover, it supports meta-analysis of multiple overlapping studies.

Maintained by Zihuai He. Last updated 3 years ago.

0.5 match 1 stars 1.00 score
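To make the "knockoff inference controls the FDR" statement in the GhostKnockoff description more concrete, here is a minimal Python sketch of the standard single-knockoff selection step: given feature statistics W (large positive values suggest a real signal, and signs are symmetric under the null), it picks the smallest threshold whose estimated false discovery proportion is at most q. The names `knockoff_threshold` and `knockoff_select` are hypothetical, and this is the generic single-knockoff filter, not the summary-statistics or multiple-knockoff procedure the package implements.

```python
def knockoff_threshold(W, q=0.1, offset=1):
    """Smallest t with (offset + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q.

    offset=1 gives the more conservative 'knockoff+' variant.
    """
    candidates = sorted({abs(w) for w in W if w != 0})
    for t in candidates:
        negatives = sum(1 for w in W if w <= -t)  # proxy for false positives
        positives = sum(1 for w in W if w >= t)   # candidate discoveries
        if (offset + negatives) / max(1, positives) <= q:
            return t
    return float("inf")  # no threshold meets the target level


def knockoff_select(W, q=0.1, offset=1):
    """Indices of the features selected at target FDR level q."""
    t = knockoff_threshold(W, q, offset)
    return [j for j, w in enumerate(W) if w >= t]
```

For example, `knockoff_select([5.0, 4.0, 3.0, 2.5, 2.0, -1.0, 0.5], q=0.25)` keeps the five features with statistics of at least 2.0, because that is the first threshold at which the estimated proportion drops to 1/5.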

cran

DNetFinder:Estimating Differential Networks under Semiparametric Gaussian Graphical Models

Provides a modified hierarchical test (Liu (2017) <doi:10.1214/17-AOS1539>) for detecting the structural difference between two semiparametric Gaussian graphical models. The multiple testing procedure asymptotically controls the false discovery rate (FDR) at a user-specified level. To construct the test statistic, a truncated estimator is used to approximate the transformation functions, and two R functions, lassoGGM() and lassoNPN(), are provided to compute the lasso estimates of the regression coefficients.

Maintained by Qingyang Zhang. Last updated 2 years ago.

0.5 match 1 stars 1.00 score 8 scripts

marsdu1989

easyDes:An Easy Way to Descriptive Analysis

Descriptive analysis is essential for publishing medical articles. This package provides an easy way to conduct the descriptive analysis. 1. Both numeric and factor variables can be handled. For numeric variables, a normality test is applied to choose between the parametric and nonparametric test. 2. Two or more groups can be handled. For more than two groups, a post hoc test is applied: 'Tukey' for the numeric variables and 'FDR' for the factor variables. 3. The t test, ANOVA or Fisher test can be forced. 4. Display of the mean and standard deviation can be forced.

Maintained by Zhicheng Du. Last updated 3 years ago.

0.5 match 1.00 score 1 scripts