Showing 177 of 177 results

myllym

GET:Global Envelopes

Implementation of global envelopes for a set of general d-dimensional vectors T in various applications. A 100(1-alpha)% global envelope is a band bounded by two vectors such that the probability that T falls outside this envelope in any of the d points is equal to alpha. Global means that the probability is controlled simultaneously for all the d elements of the vectors. The global envelopes can be used for graphical Monte Carlo and permutation tests where the test statistic is a multivariate vector or function (e.g. goodness-of-fit testing for point patterns and random sets, functional analysis of variance, functional general linear model, n-sample test of correspondence of distribution functions), for central regions of functional or multivariate data (e.g. outlier detection, functional boxplot) and for global confidence and prediction bands (e.g. confidence band in polynomial regression, Bayesian posterior prediction). See Myllymäki and Mrkvička (2024) <doi:10.18637/jss.v111.i03>, Myllymäki et al. (2017) <doi:10.1111/rssb.12172>, Mrkvička and Myllymäki (2023) <doi:10.1007/s11222-023-10275-7>, Mrkvička et al. (2016) <doi:10.1016/j.spasta.2016.04.005>, Mrkvička et al. (2017) <doi:10.1007/s11222-016-9683-9>, Mrkvička et al. (2020) <doi:10.14736/kyb-2020-3-0432>, Mrkvička et al. (2021) <doi:10.1007/s11009-019-09756-y>, Myllymäki et al. (2021) <doi:10.1016/j.spasta.2020.100436>, Mrkvička et al. (2022) <doi:10.1002/sim.9236>, Dai et al. (2022) <doi:10.5772/intechopen.100124>, Dvořák and Mrkvička (2022) <doi:10.1007/s00180-021-01134-y>, Mrkvička et al. (2023) <doi:10.48550/arXiv.2309.04746>, and Konstantinou et al. (2024) <doi:10.1007/s00180-024-01569-z>.
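A minimal sketch of a graphical Monte Carlo test with GET on toy data, assuming the standard `create_curve_set()`/`global_envelope_test()` workflow (the null model and simulations here are made up for illustration):

```r
library(GET)

set.seed(1)
r <- seq(0, 1, length.out = 50)                  # argument values of the functions
obs <- sin(2 * pi * r) + rnorm(50, sd = 0.2)     # observed function (toy data)
sim <- replicate(999, rnorm(50, sd = 0.2))       # simulations under a (toy) null model

cset <- create_curve_set(list(r = r, obs = obs, sim_m = sim))
res <- global_envelope_test(cset, type = "erl")  # extreme rank length envelope
plot(res)                                        # observed curve with the 95% global envelope
attr(res, "p")                                   # Monte Carlo p-value
```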

Maintained by Mari Myllymäki. Last updated 4 months ago.

5.6 match 11 stars 9.33 score 46 scripts 5 dependents

bnaras

pamr:Pam: Prediction Analysis for Microarrays

Some functions for sample classification in microarrays.

Maintained by Balasubramanian Narasimhan. Last updated 9 months ago.

3.9 match 7.90 score 256 scripts 14 dependents

hrpcisd

locfdr:Computes Local False Discovery Rates

Computation of local false discovery rates.

Maintained by Balasubramanian Narasimhan. Last updated 10 years ago.

3.8 match 5.99 score 106 scripts 14 dependents

meierluk

hdi:High-Dimensional Inference

Implementation of multiple approaches to perform inference in high-dimensional models.

Maintained by Lukas Meier. Last updated 4 years ago.

4.0 match 2 stars 4.47 score 139 scripts 7 dependents

izmirlig

pwrFDR:FDR Power

Computing Average and TPX Power under various BH-FDR type sequential procedures. All of these procedures involve control of some summary of the distribution of the FDP, e.g. the proportion of discoveries which are false in a given experiment. The most widely known of these, the BH-FDR procedure, controls the FDR, which is the mean of the FDP. A lesser known procedure, due to Lehmann and Romano, controls the FDX, or probability that the FDP exceeds a user-provided threshold. This is less conservative than FWE control procedures but much more conservative than the BH-FDR procedure. This package and the references supporting it introduce a new procedure for controlling the FDX, which we call the BH-FDX procedure. This procedure iteratively identifies, given alpha and lower threshold delta, an alpha* less than alpha at which BH-FDR guarantees FDX control. It uses an asymptotic approximation and is only slightly more conservative than the BH-FDR procedure. Likewise, we can think of the power in multiple testing experiments in terms of a summary of the distribution of the True Positive Proportion (TPP), the proportion of truly non-null tests that are called significant. The package will compute power, sample size or any other missing parameter required for power defined as (i) the mean of the TPP (average power) or (ii) the probability that the TPP exceeds a given value, lambda (TPX power), via asymptotic approximation. All supplied theoretical results are also obtainable via simulation. The suggested approach is to narrow in on a design via the theoretical approaches and then make final adjustments/verify the results by simulation. The theoretical results are described in Izmirlian, G. (2020), Statistics and Probability Letters, <doi:10.1016/j.spl.2020.108713>; an applied paper describing the methodology with a simulation study is in preparation. See citation("pwrFDR").
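A hedged sketch of what power and sample-size calculations might look like; the argument names below (effect.size, n.sample, r.1, N.tests, average.power) are my reading of the interface and should be checked against ?pwrFDR:

```r
library(pwrFDR)

## Average power of the BH-FDR procedure: effect size 0.8, n = 50 per group,
## 10,000 tests of which 2% are truly non-null, FDR controlled at alpha = 0.15.
avg <- pwrFDR(effect.size = 0.8, n.sample = 50, r.1 = 0.02,
              alpha = 0.15, N.tests = 10000)

## Leave out n.sample and give the target average power to solve for sample size instead.
ss <- pwrFDR(effect.size = 0.8, average.power = 0.80, r.1 = 0.02,
             alpha = 0.15, N.tests = 10000)
```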

Maintained by Grant Izmirlian. Last updated 2 months ago.

6.6 match 2.58 score 19 scripts

bioc

RNAseqCovarImpute:Impute Covariate Data in RNA Sequencing Studies

The RNAseqCovarImpute package makes linear model analysis for RNA sequencing read counts compatible with multiple imputation (MI) of missing covariates. A major problem with implementing MI in RNA sequencing studies is that the outcome data must be included in the imputation prediction models to avoid bias. This is difficult in omics studies with high-dimensional data. The first method we developed in the RNAseqCovarImpute package surmounts the problem of high-dimensional outcome data by binning genes into smaller groups to analyze pseudo-independently. This method implements covariate MI in gene expression studies by 1) randomly binning genes into smaller groups, 2) creating M imputed datasets separately within each bin, where the imputation predictor matrix includes all covariates and the log counts per million (CPM) for the genes within each bin, 3) estimating gene expression changes using `limma::voom` followed by `limma::lmFit`, separately on each of the M imputed datasets within each gene bin, 4) un-binning the gene sets and stacking the M sets of model results before applying `limma::squeezeVar` to shrink variances with an empirical Bayes procedure on each of the M sets of model results, 5) pooling the results with Rubin's rules to produce combined coefficients, standard errors, and P-values, and 6) adjusting P-values for multiplicity to control the false discovery rate (FDR). A faster method uses principal component analysis (PCA) to avoid binning genes while still retaining outcome information in the MI models. Binning genes into smaller groups requires that the MI and limma-voom analysis be run many times (typically hundreds). The more computationally efficient MI PCA method implements covariate MI in gene expression studies by 1) performing PCA on the log CPM values for all genes using the Bioconductor `PCAtools` package, 2) creating M imputed datasets where the imputation predictor matrix includes all covariates and the optimum number of PCs to retain (e.g., based on Horn's parallel analysis or the number of PCs that account for >80% explained variation), 3) conducting the standard limma-voom pipeline with `voom` followed by `lmFit` followed by `eBayes` on each of the M imputed datasets, 4) pooling the results with Rubin's rules to produce combined coefficients, standard errors, and P-values, and 5) adjusting P-values for multiplicity to control the false discovery rate (FDR).
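The MI-PCA workflow above can be pictured with generic tools; the sketch below uses mice, PCAtools and limma directly on toy data and is not the RNAseqCovarImpute API (its wrapper functions and the final Rubin's-rules pooling step are omitted):

```r
library(limma); library(mice); library(PCAtools)

set.seed(1)
counts <- matrix(rpois(2000 * 20, 50), nrow = 2000)     # toy data: 2000 genes x 20 samples
covars <- data.frame(exposure = rnorm(20),
                     age = ifelse(runif(20) < 0.2, NA, rnorm(20, 40, 10)),  # covariate with missingness
                     sex = factor(sample(c("F", "M"), 20, replace = TRUE)))

logcpm <- edgeR::cpm(counts, log = TRUE)
pc <- PCAtools::pca(logcpm, center = TRUE, scale = TRUE)
n_pc <- which(cumsum(pc$variance) > 80)[1]              # PCs explaining >80% of the variation

## Impute covariates with the retained PCs carrying the outcome information.
imp <- mice(cbind(covars, pc$rotated[, seq_len(n_pc)]), m = 10, printFlag = FALSE)

fits <- lapply(seq_len(10), function(i) {
  dat <- complete(imp, i)
  design <- model.matrix(~ exposure + age + sex, data = dat)   # illustrative model
  eBayes(lmFit(voom(counts, design), design))
})
## ...then pool the 10 coefficient/SE sets with Rubin's rules and FDR-adjust the p-values.
```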

Maintained by Brennan Baker. Last updated 5 months ago.

rnaseq, geneexpression, differentialexpression, sequencing

3.3 match 1 stars 4.48 score 6 scripts

bioc

pathwayPCA:Integrative Pathway Analysis with Modern PCA Methodology and Gene Selection

pathwayPCA is an integrative analysis tool that implements the principal component analysis (PCA) based pathway analysis approaches described in Chen et al. (2008), Chen et al. (2010), and Chen (2011). pathwayPCA allows users to: (1) Test pathway association with binary, continuous, or survival phenotypes. (2) Extract relevant genes in the pathways using the SuperPCA and AES-PCA approaches. (3) Compute principal components (PCs) based on the selected genes. These estimated latent variables represent pathway activities for individual subjects, which can then be used to perform integrative pathway analysis, such as multi-omics analysis. (4) Extract relevant genes that drive pathway significance as well as data corresponding to these relevant genes for additional in-depth analysis. (5) Perform analyses with enhanced computational efficiency with parallel computing and enhanced data safety with S4-class data objects. (6) Analyze studies with complex experimental designs, with multiple covariates, and with interaction effects, e.g., testing whether pathway association with clinical phenotype is different between male and female subjects. Citations: Chen et al. (2008) <https://doi.org/10.1093/bioinformatics/btn458>; Chen et al. (2010) <https://doi.org/10.1002/gepi.20532>; and Chen (2011) <https://doi.org/10.2202/1544-6115.1697>.
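A minimal sketch of an AES-PCA pathway association test; the `read_gmt()`/`CreateOmics()`/`AESPCA_pVals()` calls follow my reading of the package vignette, and the file path and data objects (assay_df, pheno_df) are placeholders:

```r
library(pathwayPCA)

gmt <- read_gmt("c2.cp.v7.symbols.gmt")    # pathway collection file (placeholder path)
omics <- CreateOmics(
  assayData_df = assay_df,                 # data frame of sample IDs + gene expression (placeholder)
  pathwayCollection_ls = gmt,
  response = pheno_df,                     # phenotype / survival information (placeholder)
  respType = "survival"
)
aespca_out <- AESPCA_pVals(omics, numPCs = 2, parallel = FALSE)
getPathpVals(aespca_out)                   # pathway-level p-values
```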

Maintained by Gabriel Odom. Last updated 5 months ago.

copynumbervariation, dnamethylation, geneexpression, snp, transcription, geneprediction, genesetenrichment, genesignaling, genetarget, genomewideassociation, genomicvariation, cellbiology, epigenetics, functionalgenomics, genetics, lipidomics, metabolomics, proteomics, systemsbiology, transcriptomics, classification, dimensionreduction, featureextraction, principalcomponent, regression, survival, multiplecomparison, pathways

1.1 match 11 stars 7.74 score 42 scripts

dcauseur

ERP:Significance Analysis of Event-Related Potentials Data

Functions for signal detection and identification designed for Event-Related Potentials (ERP) data in a linear model framework. The functional F-test proposed in Causeur, Sheu, Perthame, Rufini (2018, submitted) for analysis of variance issues in ERP designs is implemented for signal detection (for example, tests for mean differences among groups of curves in one-way ANOVA designs). Once an experimental effect is declared significant, identification of significant intervals is achieved by the multiple testing procedures reviewed and compared in Sheu, Perthame, Lee and Causeur (2016, <DOI:10.1214/15-AOAS888>). Some of the methods gathered in the package are the classical FDR- and FWER-controlling procedures, also available through the function p.adjust. The package also implements the Guthrie-Buchwald procedure (Guthrie and Buchwald, 1991 <DOI:10.1111/j.1469-8986.1991.tb00417.x>), which accounts for the auto-correlation among t-tests to control erroneous detection of short intervals. The Adaptive Factor-Adjustment method is an extension of the method described in Causeur, Chu, Hsieh and Sheu (2012, <DOI:10.3758/s13428-012-0230-0>). It assumes a factor model for the correlation among tests and combines adaptively the estimation of the signal and the updating of the dependence modelling (see Sheu et al., 2016, <DOI:10.1214/15-AOAS888> for further details).
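A small sketch of the two uses named above, on simulated curves; the `erpfatest()` and `gbtest()` calls reflect the procedures the description refers to, but the toy data and argument details are mine (see the package manual for the exact interface):

```r
library(ERP)

set.seed(1)
n <- 40; t_pts <- 100
group <- gl(2, n / 2, labels = c("ctrl", "trt"))
erpdta <- matrix(rnorm(n * t_pts), nrow = n)                          # toy ERP curves: rows = trials
erpdta[group == "trt", 40:60] <- erpdta[group == "trt", 40:60] + 1    # injected group effect

design  <- model.matrix(~ group)                       # model including the group effect
design0 <- model.matrix(~ 1, data = data.frame(group)) # null model without it

fa <- erpfatest(erpdta, design, design0)   # adaptive factor-adjusted tests per time point
gb <- gbtest(erpdta, design, design0)      # Guthrie-Buchwald significant-interval detection
```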

Maintained by David Causeur. Last updated 5 years ago.

2.2 match 3.30 score 20 scripts

bioc

IsoBayes:IsoBayes: Single Isoform protein inference Method via Bayesian Analyses

IsoBayes is a Bayesian method to perform inference on single protein isoforms. Our approach infers the presence/absence of protein isoforms, and also estimates their abundance; additionally, it provides a measure of the uncertainty of these estimates, via: i) the posterior probability that a protein isoform is present in the sample; ii) a posterior credible interval of its abundance. IsoBayes inputs liquid chromatography mass spectrometry (MS) data, and can work with both PSM counts and intensities. When available, transcript isoform abundances (i.e., TPMs) are also incorporated: TPMs are used to formulate an informative prior for the respective protein isoform relative abundance. We further identify isoforms where the relative abundances of proteins and transcripts significantly differ. We use a two-layer latent variable approach to model two sources of uncertainty typical of MS data: i) peptides may be erroneously detected (even when absent); ii) many peptides are compatible with multiple protein isoforms. In the first layer, we sample the presence/absence of each peptide based on its estimated probability of being mistakenly detected, also known as PEP (i.e., posterior error probability). In the second layer, for peptides that were estimated as being present, we allocate their abundance across the protein isoforms they map to. These two steps allow us to recover the presence and abundance of each protein isoform.
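The two-layer sampling idea can be illustrated without the package itself; the base-R sketch below is purely conceptual (toy peptides, made-up PEPs and weights) and is not IsoBayes' interface:

```r
set.seed(1)
peptides <- data.frame(count = c(12, 5, 30),
                       pep   = c(0.01, 0.40, 0.05),                # posterior error probabilities
                       isoforms = I(list("A", c("A", "B"), "B")))  # isoforms each peptide maps to

one_iteration <- function(pep_tbl, iso_weights) {
  present <- runif(nrow(pep_tbl)) > pep_tbl$pep                    # layer 1: sample presence from 1 - PEP
  alloc <- setNames(numeric(length(iso_weights)), names(iso_weights))
  for (i in which(present)) {                                      # layer 2: split counts across isoforms
    iso <- pep_tbl$isoforms[[i]]
    w <- iso_weights[iso] / sum(iso_weights[iso])                  # relative-abundance weights
    alloc[iso] <- alloc[iso] + as.vector(rmultinom(1, pep_tbl$count[i], w))
  }
  alloc
}
one_iteration(peptides, iso_weights = c(A = 0.6, B = 0.4))         # one allocation step
```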

Maintained by Simone Tiberi. Last updated 5 months ago.

statisticalmethod, bayesian, proteomics, massspectrometry, alternativesplicing, sequencing, rnaseq, geneexpression, genetics, visualization, software, cpp

1.3 match 7 stars 5.39 score 10 scripts

ccicb

CRUX:Easily explore patterns of somatic variation in cancer using 'CRUX'

Shiny app for exploring somatic variation in cancer. Powered by maftools.

Maintained by Sam El-Kamand. Last updated 1 year ago.

3.2 match 2 stars 2.00 score 5 scripts

bioc

TargetDecoy:Diagnostic Plots to Evaluate the Target Decoy Approach

A first step in the data analysis of Mass Spectrometry (MS) based proteomics data is to identify peptides and proteins. In this respect, the huge number of experimental mass spectra typically have to be assigned to theoretical peptides derived from a sequence database. Search engines are used for this purpose. These tools compare each of the observed spectra to all candidate theoretical spectra derived from the sequence database and calculate a score for each comparison. The observed spectrum is then assigned to the theoretical peptide with the best score, which is also referred to as the peptide to spectrum match (PSM). It is of course crucial for the downstream analysis to evaluate the quality of these matches. Therefore, False Discovery Rate (FDR) control is used to return a reliable list of PSMs. The FDR, however, requires a good characterisation of the score distribution of PSMs that are matched to the wrong peptide (bad target hits). In proteomics, the target decoy approach (TDA) is typically used for this purpose. The TDA method matches the spectra to a database of real peptides (targets) and nonsense peptides (decoys). A popular approach to generate these decoys is to reverse the target database. Hence, all the PSMs that match to a decoy are known to be bad hits, and the distribution of their scores is used to estimate the distribution of the bad-scoring target PSMs. A crucial assumption of the TDA is that the decoy PSM hits have similar properties to bad target hits, so that the decoy PSM scores are a good simulation of the bad target PSM scores. Users, however, typically do not evaluate these assumptions. To this end we developed TargetDecoy to generate diagnostic plots to evaluate the quality of the target decoy method.
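A minimal sketch of generating the diagnostic plots, assuming the `evalTargetDecoys()` entry point on an mzID object; the file path and the decoy/score variable names are illustrative placeholders:

```r
library(TargetDecoy)
library(mzID)

psms <- mzID("search_results.mzid")           # hypothetical mzIdentML identification file
evalTargetDecoys(psms,
                 decoy = "isdecoy",           # logical variable flagging decoy PSMs (placeholder name)
                 score = "ms-gf:specevalue",  # search-engine score variable (placeholder name)
                 log10 = TRUE)                # returns histogram and PP-plot diagnostics
```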

Maintained by Elke Debrie. Last updated 5 months ago.

massspectrometry, proteomics, qualitycontrol, software, visualization, bioconductor, mass-spectrometry

0.8 match 1 stars 4.60 score 9 scripts