Showing 200 of 311 results

bioc

mixOmics:Omics Data Integration Project

Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing property of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated, and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high-throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics, etc.) but also beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non-exhaustive list of methods includes variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.

Maintained by Eva Hamrud. Last updated 2 days ago.

immunooncology microarray sequencing metabolomics metagenomics proteomics geneprediction multiplecomparison classification regression bioconductor genomics genomics-data genomics-visualization multivariate-analysis multivariate-statistics omics r-pkg r-project

185 stars 13.75 score 1.3k scripts 22 dependents

sachaepskamp

semPlot:Path Diagrams and Visual Analysis of Various SEM Packages' Output

Path diagrams and visual analysis of various SEM packages' output.

Maintained by Sacha Epskamp. Last updated 3 years ago.

63 stars 10.64 score 2.1k scripts 13 dependents

alexiosg

rmgarch:Multivariate GARCH Models

Feasible multivariate GARCH models including DCC, GO-GARCH and Copula-GARCH.

Maintained by Alexios Galanos. Last updated 3 months ago.

openblas cpp openmp

14 stars 8.51 score 294 scripts 2 dependents

sensitivequestions

list:Statistical Methods for the Item Count Technique and List Experiment

Allows researchers to conduct multivariate statistical analyses of survey data with list experiments. This survey methodology is also known as the item count technique or the unmatched count technique and is an alternative to the commonly used randomized response method. The package implements the methods developed by Imai (2011) <doi:10.1198/jasa.2011.ap10415>, Blair and Imai (2012) <doi:10.1093/pan/mpr048>, Blair, Imai, and Lyall (2013) <doi:10.1111/ajps.12086>, Imai, Park, and Greene (2014) <doi:10.1093/pan/mpu017>, Aronow, Coppock, Crawford, and Green (2015) <doi:10.1093/jssam/smu023>, Chou, Imai, and Rosenfeld (2017) <doi:10.1177/0049124117729711>, and Blair, Chou, and Imai (2018) <https://imai.fas.harvard.edu/research/files/listerror.pdf>. This includes a Bayesian MCMC implementation of regression for the standard and multiple sensitive item list experiment designs and a random effects setup, a Bayesian MCMC hierarchical regression model with up to three hierarchical groups, the combined list experiment and endorsement experiment regression model, a joint model of the list experiment that enables the analysis of the list experiment as a predictor in outcome regression models, a method for combining list experiments with direct questions, and methods for diagnosing and adjusting for response error. In addition, the package implements the statistical test that is designed to detect certain failures of list experiments, and a placebo test for the list experiment using data from direct questions.

Maintained by Graeme Blair. Last updated 1 year ago.

openblas

7 stars 6.60 score 191 scripts

bioc

GlobalAncova:Global test for groups of variables via model comparisons

The association between a variable of interest (e.g. two groups) and the global pattern of a group of variables (e.g. a gene set) is tested via a global F-test. We give the following arguments in support of the GlobalAncova approach: After appropriate normalisation, gene-expression-data appear rather symmetrical and outliers are no real problem, so least squares should be rather robust. ANCOVA with interaction yields saturated data modelling e.g. different means per group and gene. Covariate adjustment can help to correct for possible selection bias. Variance homogeneity and uncorrelated residuals cannot be expected. Application of ordinary least squares gives unbiased, but no longer optimal estimates (Gauss-Markov-Aitken). Therefore, using the classical F-test is inappropriate, due to correlation. The test statistic however mirrors deviations from the null hypothesis. In combination with a permutation approach, empirical significance levels can be approximated. Alternatively, an approximation yields asymptotic p-values. The framework is generalized to groups of categorical variables or even mixed data by a likelihood ratio approach. Closed and hierarchical testing procedures are supported. This work was supported by the NGFN grant 01 GR 0459, BMBF, Germany and BMBF grant 01ZX1309B, Germany.

Maintained by Manuela Hummel. Last updated 5 months ago.

microarray onechannel differentialexpression pathways regression

5.31 score 9 scripts 1 dependent

ncchung

jackstraw:Statistical Inference for Unsupervised Learning

Test for association between the observed data and their estimated latent variables. The jackstraw package provides a resampling strategy and testing scheme to estimate statistical significance of association between the observed data and their latent variables. Depending on the data type and the analysis aim, the latent variables may be estimated by principal component analysis (PCA), factor analysis (FA), K-means clustering, and related unsupervised learning algorithms. The jackstraw methods learn over-fitting characteristics inherent in this circular analysis, where the observed data are used to estimate the latent variables and used again to test against those estimated latent variables. When latent variables are estimated by PCA, the jackstraw enables statistical testing for association between observed variables and latent variables, as estimated by low-dimensional principal components (PCs). This essentially leads to identifying variables that are significantly associated with PCs. Similarly, unsupervised clustering, such as K-means clustering, partitioning around medoids (PAM), and others, finds coherent groups in high-dimensional data. The jackstraw estimates statistical significance of cluster membership, by testing association between data and cluster centers. Clustering membership can be improved by using the resulting jackstraw p-values and posterior inclusion probabilities (PIPs), with an application to unsupervised evaluation of cell identities in single cell RNA-seq (scRNA-seq).

Maintained by Neo Christopher Chung. Last updated 3 months ago.

clustering k-means machine-learning pca statistics unsupervised

16 stars 5.29 score 35 scripts

anttonalberdi

hilldiv:Integral Analysis of Diversity Based on Hill Numbers

Tools for analysing, comparing, visualising and partitioning diversity based on Hill numbers. 'hilldiv' is an R package that provides a set of functions to assist analysis of diversity for diet reconstruction, microbial community profiling or more general ecosystem characterisation analyses based on Hill numbers, using OTU/ASV tables and associated phylogenetic trees as inputs. The package includes functions for (phylo)diversity measurement, (phylo)diversity profile plotting, (phylo)diversity comparison between samples and groups, (phylo)diversity partitioning and (dis)similarity measurement. All of these are grounded in abundance-based and incidence-based Hill numbers. The statistical framework developed around Hill numbers encompasses many of the most broadly employed diversity (e.g. richness, Shannon index, Simpson index), phylogenetic diversity (e.g. Faith's PD, Allen's H, Rao's quadratic entropy) and dissimilarity (e.g. Sorensen index, Unifrac distances) metrics. This enables the most common analyses of diversity to be performed while grounded in a single statistical framework. The methods are described in Jost et al. (2007) <DOI:10.1890/06-1736.1>, Chao et al. (2010) <DOI:10.1098/rstb.2010.0272> and Chiu et al. (2014) <DOI:10.1890/12-0960.1>; and reviewed in the framework of molecularly characterised biological systems in Alberdi & Gilbert (2019) <DOI:10.1111/1755-0998.13014>.

Maintained by Antton Alberdi. Last updated 4 years ago.

11 stars 4.35 score 41 scripts

georgekoliopanos

modgo:MOck Data GeneratiOn

Generation of mock data from a real dataset using rank normal inverse transformation.

Maintained by George Koliopanos. Last updated 9 months ago.

1 star 4.00 score 3 scripts

dcauseur

ERP:Significance Analysis of Event-Related Potentials Data

Functions for signal detection and identification designed for Event-Related Potentials (ERP) data in a linear model framework. The functional F-test proposed in Causeur, Sheu, Perthame, Rufini (2018, submitted) for analysis of variance issues in ERP designs is implemented for signal detection (tests for mean difference among groups of curves in One-way ANOVA designs for example). Once an experimental effect is declared significant, identification of significant intervals is achieved by the multiple testing procedures reviewed and compared in Sheu, Perthame, Lee and Causeur (2016, <DOI:10.1214/15-AOAS888>). Some of the methods gathered in the package are the classical FDR- and FWER-controlling procedures, also available using function p.adjust. The package also implements the Guthrie-Buchwald procedure (Guthrie and Buchwald, 1991 <DOI:10.1111/j.1469-8986.1991.tb00417.x>), which accounts for the auto-correlation among t-tests to control erroneous detection of short intervals. The Adaptive Factor-Adjustment method is an extension of the method described in Causeur, Chu, Hsieh and Sheu (2012, <DOI:10.3758/s13428-012-0230-0>). It assumes a factor model for the correlation among tests and combines adaptively the estimation of the signal and the updating of the dependence modelling (see Sheu et al., 2016, <DOI:10.1214/15-AOAS888> for further details).

Maintained by David Causeur. Last updated 5 years ago.

3.30 score 20 scripts

polinasuter

BiDAG:Bayesian Inference for Directed Acyclic Graphs

Implementation of a collection of MCMC methods for Bayesian structure learning of directed acyclic graphs (DAGs), both from continuous and discrete data. For efficient inference on larger DAGs, the space of DAGs is pruned according to the data. To filter the search space, the algorithm employs a hybrid approach, combining constraint-based learning with search and score. A reduced search space is initially defined on the basis of a skeleton obtained by means of the PC-algorithm, and then iteratively improved with search and score. Search and score is then performed following two approaches: Order MCMC, or Partition MCMC. The BGe score is implemented for continuous data and the BDe score is implemented for binary data or categorical data. The algorithms may provide the maximum a posteriori (MAP) graph or a sample (a collection of DAGs) from the posterior distribution given the data. All algorithms are also applicable for structure learning and sampling for dynamic Bayesian networks. References: J. Kuipers, P. Suter, G. Moffa (2022) <doi:10.1080/10618600.2021.2020127>, N. Friedman and D. Koller (2003) <doi:10.1023/A:1020249912095>, J. Kuipers and G. Moffa (2017) <doi:10.1080/01621459.2015.1133426>, M. Kalisch et al. (2012) <doi:10.18637/jss.v047.i11>, D. Geiger and D. Heckerman (2002) <doi:10.1214/aos/1035844981>, P. Suter, J. Kuipers, G. Moffa, N. Beerenwinkel (2023) <doi:10.18637/jss.v105.i09>.

Maintained by Polina Suter. Last updated 2 years ago.

cpp

4 stars 3.29 score 81 scripts 2 dependents