R-universe search: needs:corpcor

bioc

mixOmics:Omics Data Integration Project

Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing property of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated, and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high-throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics, etc.) but also beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non-exhaustive list of methods includes variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.

Maintained by Eva Hamrud. Last updated 2 days ago.

immunooncology microarray sequencing metabolomics metagenomics proteomics geneprediction multiplecomparison classification regression bioconductor genomics genomics-data genomics-visualization multivariate-analysis multivariate-statistics omics r-pkg r-project

185 stars 13.75 score 1.3k scripts 22 dependents

bioc

variancePartition:Quantify and interpret drivers of variation in multilevel gene expression experiments

Quantify and interpret multiple sources of biological and technical variation in gene expression experiments. Uses a linear mixed model to quantify variation in gene expression attributable to individual, tissue, time point, or technical variables. Includes dream differential expression analysis for repeated measures.

Maintained by Gabriel E. Hoffman. Last updated 3 months ago.

rnaseq geneexpression genesetenrichment differentialexpression batcheffect qualitycontrol regression epigenetics functionalgenomics transcriptomics normalization preprocessing microarray immunooncology software

7 stars 11.69 score 1.1k scripts 3 dependents

sachaepskamp

qgraph:Graph Plotting Methods, Psychometric Data Visualization and Graphical Model Estimation

Fork of qgraph - Weighted network visualization and analysis, as well as Gaussian graphical model computation. See Epskamp et al. (2012) <doi:10.18637/jss.v048.i04>.

Maintained by Sacha Epskamp. Last updated 1 year ago.

cpp

69 stars 11.43 score 1.2k scripts 63 dependents

r4ss

r4ss:R Code for Stock Synthesis

A collection of R functions for use with Stock Synthesis, a fisheries stock assessment modeling platform written in ADMB by Dr. Richard D. Methot at the NOAA Northwest Fisheries Science Center. The functions include tools for summarizing and plotting results, manipulating files, visualizing model parameterizations, and various other common stock assessment tasks. This version of '{r4ss}' is compatible with Stock Synthesis versions 3.24 through 3.30 (specifically version 3.30.23.1, from December 2024). Support for 3.24 models is only through the core functions for reading output and plotting.

Maintained by Ian G. Taylor. Last updated 18 days ago.

fisheries fisheries-stock-assessment stock-synthesis

43 stars 11.38 score 1.0k scripts 2 dependents

biometry

bipartite:Visualising Bipartite Networks and Calculating Some (Ecological) Indices

Functions to visualise webs and calculate a series of indices commonly used to describe pattern in (ecological) webs. It focuses on webs consisting of only two levels (bipartite), e.g. pollination webs or predator-prey-webs. Visualisation is important to get an idea of what we are actually looking at, while the indices summarise different aspects of the web's topology.

Maintained by Carsten F. Dormann. Last updated 20 days ago.

cpp

37 stars 10.93 score 592 scripts 15 dependents

bioc

muscat:Multi-sample multi-group scRNA-seq data analysis tools

`muscat` provides various methods and visualization tools for differential state (DS) analysis in multi-sample, multi-group, multi-(cell-)subpopulation scRNA-seq data, including cell-level mixed models and methods based on aggregated “pseudobulk” data, as well as a flexible simulation platform that mimics both single and multi-sample scRNA-seq data.

Maintained by Helena L. Crowell. Last updated 5 months ago.

immunooncology differentialexpression sequencing singlecell software statisticalmethod visualization

184 stars 10.74 score 686 scripts 1 dependent

sachaepskamp

semPlot:Path Diagrams and Visual Analysis of Various SEM Packages' Output

Path diagrams and visual analysis of various SEM packages' output.

Maintained by Sacha Epskamp. Last updated 3 years ago.

63 stars 10.64 score 2.1k scripts 13 dependents

thej022214

corHMM:Hidden Markov Models of Character Evolution

Fits hidden Markov models of discrete character evolution which allow different transition rate classes on different portions of a phylogeny. Beaulieu et al (2013) <doi:10.1093/sysbio/syt034>.

Maintained by Jeremy Beaulieu. Last updated 1 month ago.

13 stars 9.52 score 422 scripts 2 dependents

jclavel

mvMORPH:Multivariate Comparative Tools for Fitting Evolutionary Models to Morphometric Data

Fits multivariate (Brownian Motion, Early Burst, ACDC, Ornstein-Uhlenbeck and Shifts) models of continuous trait evolution on trees and time series. 'mvMORPH' also proposes high-dimensional multivariate comparative tools (linear models using Generalized Least Squares and multivariate tests) based on penalized likelihood. See Clavel et al. (2015) <DOI:10.1111/2041-210X.12420>, Clavel et al. (2019) <DOI:10.1093/sysbio/syy045>, and Clavel & Morlon (2020) <DOI:10.1093/sysbio/syaa010>.

Maintained by Julien Clavel. Last updated 2 months ago.

openblas

17 stars 9.46 score 189 scripts 3 dependents

sachaepskamp

bootnet:Bootstrap Methods for Various Network Estimation Routines

Bootstrap methods to assess accuracy and stability of estimated network structures and centrality indices <doi:10.3758/s13428-017-0862-1>. Allows for flexible specification of any undirected network estimation procedure in R, and offers default sets for various estimation routines.

Maintained by Sacha Epskamp. Last updated 5 months ago.

32 stars 8.94 score 155 scripts 3 dependents

merck

gsDesign2:Group Sequential Design with Non-Constant Effect

The goal of 'gsDesign2' is to enable fixed or group sequential design under non-proportional hazards. To enable highly flexible enrollment, time-to-event and time-to-dropout assumptions, 'gsDesign2' offers piecewise constant enrollment, failure rates, and dropout rates for a stratified population. This package includes three methods for designs: average hazard ratio, weighted logrank tests in Yung and Liu (2019) <doi:10.1111/biom.13196>, and MaxCombo tests. Substantial flexibility on top of what is in the 'gsDesign' package is intended for selecting boundaries.

Maintained by Yujie Zhao. Last updated 2 days ago.

cpp

22 stars 8.91 score 186 scripts

jarrodhadfield

MCMCglmm:MCMC Generalised Linear Mixed Models

Fits Multivariate Generalised Linear Mixed Models (and related models) using Markov chain Monte Carlo techniques (Hadfield 2010 J. Stat. Soft.).

Maintained by Jarrod Hadfield. Last updated 3 months ago.

cpp

2 stars 8.83 score 1.2k scripts 14 dependents

ss3sim

ss3sim:Fisheries Stock Assessment Simulation Testing with Stock Synthesis

A framework for fisheries stock assessment simulation testing with Stock Synthesis (SS3) as described in Anderson et al. (2014) <doi:10.1371/journal.pone.0092725>.

Maintained by Kelli F. Johnson. Last updated 5 months ago.

fisheries simulation stock-synthesis

39 stars 8.72 score 149 scripts

alexiosg

rmgarch:Multivariate GARCH Models

Feasible multivariate GARCH models including DCC, GO-GARCH and Copula-GARCH.

Maintained by Alexios Galanos. Last updated 3 months ago.

openblas cpp openmp

14 stars 8.51 score 294 scripts 2 dependents

hmorlon

RPANDA:Phylogenetic ANalyses of DiversificAtion

Implements macroevolutionary analyses on phylogenetic trees. See Morlon et al. (2010) <DOI:10.1371/journal.pbio.1000493>, Morlon et al. (2011) <DOI:10.1073/pnas.1102543108>, Condamine et al. (2013) <DOI:10.1111/ele.12062>, Morlon et al. (2014) <DOI:10.1111/ele.12251>, Manceau et al. (2015) <DOI:10.1111/ele.12415>, Lewitus & Morlon (2016) <DOI:10.1093/sysbio/syv116>, Drury et al. (2016) <DOI:10.1093/sysbio/syw020>, Manceau et al. (2016) <DOI:10.1093/sysbio/syw115>, Morlon et al. (2016) <DOI:10.1111/2041-210X.12526>, Clavel & Morlon (2017) <DOI:10.1073/pnas.1606868114>, Drury et al. (2017) <DOI:10.1093/sysbio/syx079>, Lewitus & Morlon (2017) <DOI:10.1093/sysbio/syx095>, Drury et al. (2018) <DOI:10.1371/journal.pbio.2003563>, Clavel et al. (2019) <DOI:10.1093/sysbio/syy045>, Maliet et al. (2019) <DOI:10.1038/s41559-019-0908-0>, Billaud et al. (2019) <DOI:10.1093/sysbio/syz057>, Lewitus et al. (2019) <DOI:10.1093/sysbio/syz061>, Aristide & Morlon (2019) <DOI:10.1111/ele.13385>, Maliet et al. (2020) <DOI:10.1111/ele.13592>, Drury et al. (2021) <DOI:10.1371/journal.pbio.3001270>, Perez-Lamarque & Morlon (2022) <DOI:10.1111/mec.16478>, Perez-Lamarque et al. (2022) <DOI:10.1101/2021.08.30.458192>, Mazet et al. (2023) <DOI:10.1111/2041-210X.14195>, Drury et al. (2024) <DOI:10.1016/j.cub.2023.12.055>.

Maintained by Hélène Morlon. Last updated 3 months ago.

24 stars 8.50 score 255 scripts

thej022214

hisse:Hidden State Speciation and Extinction

Sets up and executes a HiSSE model (Hidden State Speciation and Extinction) on a phylogeny and character sets to test for hidden shifts in trait dependent rates of diversification. Beaulieu and O'Meara (2016) <doi:10.1093/sysbio/syw022>.

Maintained by Jeremy Beaulieu. Last updated 2 months ago.

6 stars 8.45 score 152 scripts

thej022214

OUwie:Analysis of Evolutionary Rates in an OU Framework

Estimates rates for continuous character evolution under Brownian motion and a new set of Ornstein-Uhlenbeck based Hansen models that allow both the strength of the pull and stochastic motion to vary across selective regimes. Beaulieu et al (2012).

Maintained by Jeremy Beaulieu. Last updated 12 days ago.

9 stars 8.42 score 161 scripts

bioc

flowStats:Statistical methods for the analysis of flow cytometry data

Methods and functionality to analyse flow data that is beyond the basic infrastructure provided by the flowCore package.

Maintained by Greg Finak. Last updated 5 months ago.

immunooncology flowcytometry cellbasedassays

14 stars 8.27 score 195 scripts 1 dependent

ampl-psych

EMC2:Bayesian Hierarchical Analysis of Cognitive Models of Choice

Fit Bayesian (hierarchical) cognitive models using a linear modeling language interface using particle Metropolis Markov chain Monte Carlo sampling with Gibbs steps. The diffusion decision model (DDM), linear ballistic accumulator model (LBA), racing diffusion model (RDM), and the lognormal race model (LNR) are supported. Additionally, users can specify their own likelihood function and/or opt for non-hierarchical estimation, as well as for a diagonal, blocked or full multivariate normal group-level distribution to test individual differences. Prior specification is facilitated through methods that visualize the (implied) prior. A wide range of plotting functions assist in assessing model convergence and posterior inference. Models can be easily evaluated using functions that plot posterior predictions or using relative model comparison metrics such as information criteria or Bayes factors. References: Stevenson et al. (2024) <doi:10.31234/osf.io/2e4dq>.

Maintained by Niek Stevenson. Last updated 20 days ago.

cpp

13 stars 8.25 score 392 scripts

jmbh

mgm:Estimating Time-Varying k-Order Mixed Graphical Models

Estimation of k-Order time-varying Mixed Graphical Models and mixed VAR(p) models via elastic-net regularized neighborhood regression. For details see Haslbeck & Waldorp (2020) <doi:10.18637/jss.v093.i08>.

Maintained by Jonas Haslbeck. Last updated 19 days ago.

29 stars 8.16 score 125 scripts 6 dependents

bioc

POMA:Tools for Omics Data Analysis

The POMA package offers a comprehensive toolkit designed for omics data analysis, streamlining the process from initial visualization to final statistical analysis. Its primary goal is to simplify and unify the various steps involved in omics data processing, making it more accessible and manageable within a single, intuitive R package. Emphasizing reproducibility and user-friendliness, POMA leverages the standardized SummarizedExperiment class from Bioconductor, ensuring seamless integration and compatibility with a wide array of Bioconductor tools. This approach guarantees maximum flexibility and replicability, making POMA an essential asset for researchers handling omics datasets. See https://github.com/pcastellanoescuder/POMAShiny and Castellano-Escuder et al. (2021) <doi:10.1371/journal.pcbi.1009148> for more details.

Maintained by Pol Castellano-Escuder. Last updated 4 months ago.

batcheffect classification clustering decisiontree dimensionreduction multidimensionalscaling normalization preprocessing principalcomponent regression rnaseq software statisticalmethod visualization bioconductor bioinformatics data-visualization dimension-reduction exploratory-data-analysis machine-learning omics-data-integration pipeline pre-processing statistical-analysis user-friendly workflow

11 stars 8.16 score 20 scripts 1 dependent

bioc

dreamlet:Scalable differential expression analysis of single cell transcriptomics datasets with complex study designs

Recent advances in single cell/nucleus transcriptomic technology have enabled collection of cohort-scale datasets to study cell type specific gene expression differences associated with disease state, stimulus, and genetic regulation. The scale of these data, complex study designs, and low read count per cell mean that characterizing cell type specific molecular mechanisms requires a user-friendly, purpose-built analytical framework. We have developed the dreamlet package that applies a pseudobulk approach and fits a regression model for each gene and cell cluster to test differential expression across individuals associated with a trait of interest. Use of precision-weighted linear mixed models enables accounting for repeated measures study designs, high dimensional batch effects, and varying sequencing depth or observed cells per biosample.

Maintained by Gabriel Hoffman. Last updated 4 days ago.

rnaseq geneexpression differentialexpression batcheffect qualitycontrol regression genesetenrichment generegulation epigenetics functionalgenomics transcriptomics normalization singlecell preprocessing sequencing immunooncology software cpp

12 stars 8.14 score 128 scripts

gfellerlab

SuperCell:Simplification of scRNA-seq data by merging together similar cells

Aggregates large single-cell data into metacell dataset by merging together gene expression of very similar cells.

Maintained by The package maintainer. Last updated 8 months ago.

software coarse-graining scrna-seq-analysis scrna-seq-data

72 stars 8.08 score 93 scripts

bioc

TOAST:Tools for the analysis of heterogeneous tissues

This package is devoted to analyzing high-throughput data (e.g. gene expression microarray, DNA methylation microarray, RNA-seq) from complex tissues. Current functionalities include: (1) detecting cell-type specific or cross-cell type differential signals; (2) tree-based differential analysis; (3) improving variable selection in reference-free deconvolution; and (4) partial reference-free deconvolution with prior knowledge.

Maintained by Ziyi Li. Last updated 5 months ago.

dnamethylation geneexpression differentialexpression differentialmethylation microarray genetarget epigenetics methylationarray

11 stars 8.01 score 104 scripts 3 dependents

bioc

netZooR:Unified methods for the inference and analysis of gene regulatory networks

netZooR unifies the implementations of several Network Zoo methods (netzoo, netzoo.github.io) into a single package by creating interfaces between network inference and network analysis methods. Currently, the package has 3 methods for network inference including PANDA and its optimized implementation OTTER (network reconstruction using multiple lines of biological evidence), LIONESS (single-sample network inference), and EGRET (genotype-specific networks). Network analysis methods include CONDOR (community detection), ALPACA (differential community detection), CRANE (significance estimation of differential modules), and MONSTER (estimation of network transition states). In addition, YARN processes gene expression data for tissue-specific analyses and SAMBAR infers missing mutation data based on pathway information.

Maintained by Tara Eicher. Last updated 12 days ago.

networkinference network generegulation geneexpression transcription microarray graphandnetwork gene-regulatory-network transcription-factors

105 stars 7.98 score

hfgolino

EGAnet:Exploratory Graph Analysis – a Framework for Estimating the Number of Dimensions in Multivariate Data using Network Psychometrics

Implements the Exploratory Graph Analysis (EGA) framework for dimensionality and psychometric assessment. EGA estimates the number of dimensions in psychological data using network estimation methods and community detection algorithms. A bootstrap method is provided to assess the stability of dimensions and items. Fit is evaluated using the Entropy Fit family of indices. Unique Variable Analysis evaluates the extent to which items are locally dependent (or redundant). Network loadings provide similar information to factor loadings and can be used to compute network scores. A bootstrap and permutation approach are available to assess configural and metric invariance. Hierarchical structures can be detected using Hierarchical EGA. Time series and intensive longitudinal data can be analyzed using Dynamic EGA, supporting individual, group, and population level assessments.

Maintained by Hudson Golino. Last updated 13 days ago.

47 stars 7.83 score 61 scripts 1 dependent

fbertran

plsRglm:Partial Least Squares Regression for Generalized Linear Models

Provides (weighted) Partial least squares Regression for generalized linear models and repeated k-fold cross-validation of such models using various criteria <arXiv:1810.01005>. It allows for missing data in the explanatory variables. Bootstrap confidence intervals constructions are also available.

Maintained by Frederic Bertrand. Last updated 2 years ago.

16 stars 7.75 score 103 scripts 5 dependents

gateslab

gimme:Group Iterative Multiple Model Estimation

Data-driven approach for arriving at person-specific time series models. The method first identifies which relations replicate across the majority of individuals to detect signal from noise. These group-level relations are then used as a foundation for starting the search for person-specific (or individual-level) relations. See Gates & Molenaar (2012) <doi:10.1016/j.neuroimage.2012.06.026>.

Maintained by Kathleen M Gates. Last updated 9 days ago.

26 stars 7.61 score 53 scripts

bioc

AlpsNMR:Automated spectraL Processing System for NMR

Reads Bruker NMR data directories both zipped and unzipped. It provides automated and efficient signal processing for untargeted NMR metabolomics. It is able to interpolate the samples, detect outliers, exclude regions, normalize, detect peaks, align the spectra, integrate peaks, manage metadata and visualize the spectra. After spectra processing, it can apply multivariate analysis on extracted data. Efficient plotting with 1-D data is also available. Basic reading of 1D ACD/Labs exported JDX samples is also available.

Maintained by Sergio Oller Moreno. Last updated 5 months ago.

software preprocessing visualization classification cheminformatics metabolomics dataimport

15 stars 7.59 score 12 scripts 1 dependent

sebastian-engelke

graphicalExtremes:Statistical Methodology for Graphical Extreme Value Models

Statistical methodology for sparse multivariate extreme value models. Methods are provided for exact simulation and statistical inference for multivariate Pareto distributions on graphical structures as described in the paper 'Graphical Models for Extremes' by Engelke and Hitz (2020) <doi:10.1111/rssb.12355>.

Maintained by Sebastian Engelke. Last updated 3 months ago.

16 stars 7.38 score 28 scripts 1 dependent

r-forge

pcalg:Methods for Graphical Models and Causal Inference

Functions for causal structure learning and causal inference using graphical models. The main algorithms for causal structure learning are PC (for observational data without hidden variables), FCI and RFCI (for observational data with hidden variables), and GIES (for a mix of data from observational studies (i.e. observational data) and data from experiments involving interventions (i.e. interventional data) without hidden variables). For causal inference the IDA algorithm, the Generalized Backdoor Criterion (GBC), the Generalized Adjustment Criterion (GAC) and some related functions are implemented. Functions for incorporating background knowledge are provided.

Maintained by Markus Kalisch. Last updated 7 months ago.

openblas cpp

7.30 score 700 scripts 19 dependents

sfcheung

semptools:Customizing Structural Equation Modelling Plots

Most functions focus on specific ways to customize a graph. They use a 'qgraph' output as the first argument, and return a modified 'qgraph' object. This allows the functions to be chained by a pipe operator.

Maintained by Shu Fai Cheung. Last updated 3 months ago.

diagram graph lavaan plot sem structural-equation-modeling

7 stars 7.12 score 87 scripts

alexchristensen

NetworkToolbox:Methods and Measures for Brain, Cognitive, and Psychometric Network Analysis

Implements network analysis and graph theory measures used in neuroscience, cognitive science, and psychology. Methods include various filtering methods and approaches such as threshold, dependency (Kenett, Tumminello, Madi, Gur-Gershgoren, Mantegna, & Ben-Jacob, 2010 <doi:10.1371/journal.pone.0015032>), Information Filtering Networks (Barfuss, Massara, Di Matteo, & Aste, 2016 <doi:10.1103/PhysRevE.94.062306>), and Efficiency-Cost Optimization (Fallani, Latora, & Chavez, 2017 <doi:10.1371/journal.pcbi.1005305>). Brain methods include the recently developed Connectome Predictive Modeling (see references in package). Also implements several network measures including local network characteristics (e.g., centrality), community-level network characteristics (e.g., community centrality), global network characteristics (e.g., clustering coefficient), and various other measures associated with the reliability and reproducibility of network analysis.

Maintained by Alexander Christensen. Last updated 2 years ago.

network-analysis

23 stars 7.04 score 101 scripts 4 dependents

bioc

lfa:Logistic Factor Analysis for Categorical Data

Logistic Factor Analysis is a method for a PCA analogue on Binomial data via estimation of latent structure in the natural parameter. The main method estimates genetic population structure from genotype data. There are also methods for estimating individual-specific allele frequencies using the population structure. Lastly, a structured Hardy-Weinberg equilibrium (HWE) test is developed, which quantifies the goodness of fit of the genotype data to the estimated population structure, via the estimated individual-specific allele frequencies (all of which generalizes traditional HWE tests).

Maintained by Alejandro Ochoa. Last updated 5 months ago.

snp dimensionreduction principalcomponent regression openblas

16 stars 7.04 score 57 scripts 1 dependent

psoerensen

qgg:Statistical Tools for Quantitative Genetic Analyses

Provides an infrastructure for efficient processing of large-scale genetic and phenotypic data including core functions for: 1) fitting linear mixed models, 2) constructing marker-based genomic relationship matrices, 3) estimating genetic parameters (heritability and correlation), 4) performing genomic prediction and genetic risk profiling, and 5) single or multi-marker association analyses. Rohde et al. (2019) <doi:10.1101/503631>.

Maintained by Peter Soerensen. Last updated 11 days ago.

fortran openblas cpp

36 stars 7.01 score 47 scripts

sachaepskamp

psychonetrics:Structural Equation Modeling and Confirmatory Network Analysis

Multi-group (dynamical) structural equation models in combination with confirmatory network models from cross-sectional, time-series and panel data <doi:10.31234/osf.io/8ha93>. Allows for confirmatory testing and fit as well as exploratory model search.

Maintained by Sacha Epskamp. Last updated 3 days ago.

openblas cpp

51 stars 6.88 score 41 scripts 1 dependent

cvborkulo

IsingFit:Fitting Ising Models Using the ELasso Method

The network estimation procedure eLasso, which is based on the Ising model, combines l1-regularized logistic regression with model selection based on the Extended Bayesian Information Criterion (EBIC). EBIC is a fit measure that identifies relevant relationships between variables. The resulting network consists of variables as nodes and relevant relationships as edges. Can deal with binary data.

Maintained by Sacha Epskamp. Last updated 1 year ago.

10 stars 6.85 score 25 scripts 5 dependents

vast-lib

tinyVAST:Multivariate Spatio-Temporal Models using Structural Equations

Fits a wide variety of multivariate spatio-temporal models with simultaneous and lagged interactions among variables (including vector autoregressive spatio-temporal ('VAST') dynamics) for areal, continuous, or network spatial domains. It includes time-variable, space-variable, and space-time-variable interactions using dynamic structural equation models ('DSEM') as expressive interface, and the 'mgcv' package to specify splines via the formula interface. See Thorson et al. (2024) <doi:10.48550/arXiv.2401.10193> for more details.

Maintained by James T. Thorson. Last updated 9 days ago.

vector-autoregressive-spatio-temporal-model cpp

14 stars 6.83 score

tom-wolff

ideanet:Integrating Data Exchange and Analysis for Networks ('ideanet')

A suite of convenient tools for social network analysis geared toward students, entry-level users, and non-expert practitioners. ‘ideanet’ features unique functions for the processing and measurement of sociocentric and egocentric network data. These functions automatically generate node- and system-level measures commonly used in the analysis of these types of networks. Outputs from these functions maximize the ability of novice users to employ network measurements in further analyses while making all users less prone to common data analytic errors. Additionally, ‘ideanet’ features an R Shiny graphic user interface that allows novices to explore network data with minimal need for coding.

Maintained by Tom Wolff. Last updated 16 days ago.

6 stars 6.80 score 10 scripts

jmcurran

Hotelling:Hotelling's T^2 Test and Variants

A set of R functions which implements Hotelling's T^2 test and some variants of it. Functions are also included for Aitchison's additive log ratio and centred log ratio transformations.

Maintained by James Curran. Last updated 4 years ago.

2 stars 6.78 score 139 scripts 3 dependents
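
The two log-ratio transformations named in the Hotelling description can be illustrated with a minimal sketch. It is written in plain Python rather than R, and the function names `clr` and `alr` are hypothetical illustrations, not the package's own API. It assumes a composition is a sequence of strictly positive parts.

```python
import math


def _check_positive(composition):
    """Log-ratios are only defined for strictly positive parts."""
    if any(part <= 0 for part in composition):
        raise ValueError("all parts of a composition must be positive")


def clr(composition):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    _check_positive(composition)
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def alr(composition, reference_index=-1):
    """Additive log-ratio: log of each part divided by a reference part.

    The reference part itself is dropped, so the result has one fewer
    element than the input. The last part is the reference by default.
    """
    _check_positive(composition)
    ref = reference_index % len(composition)
    reference = composition[ref]
    return [
        math.log(part / reference)
        for i, part in enumerate(composition)
        if i != ref
    ]


sample = [0.2, 0.3, 0.5]
print(clr(sample))  # values sum to zero up to rounding
print(alr(sample))  # two values, relative to the last part
```

The centred version keeps all parts and constrains them to sum to zero, whereas the additive version drops one part and so depends on which part is chosen as the reference.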

paytonjjones

networktools:Tools for Identifying Important Nodes in Networks

Includes assorted tools for network analysis. Bridge centrality; goldbricker; MDS, PCA, & eigenmodel network plotting.

Maintained by Payton Jones. Last updated 1 month ago.

10 stars 6.75 score 93 scripts 5 dependents

vlyubchich

funtimes:Functions for Time Series Analysis

Nonparametric estimators and tests for time series analysis. The functions use bootstrap techniques and robust nonparametric difference-based estimators to test for the presence of possibly non-monotonic trends and for synchronicity of trends in multiple time series.

Maintained by Vyacheslav Lyubchich. Last updated 2 years ago.

7 stars 6.69 score 93 scripts

sensitivequestions

list:Statistical Methods for the Item Count Technique and List Experiment

Allows researchers to conduct multivariate statistical analyses of survey data with list experiments. This survey methodology is also known as the item count technique or the unmatched count technique and is an alternative to the commonly used randomized response method. The package implements the methods developed by Imai (2011) <doi:10.1198/jasa.2011.ap10415>, Blair and Imai (2012) <doi:10.1093/pan/mpr048>, Blair, Imai, and Lyall (2013) <doi:10.1111/ajps.12086>, Imai, Park, and Greene (2014) <doi:10.1093/pan/mpu017>, Aronow, Coppock, Crawford, and Green (2015) <doi:10.1093/jssam/smu023>, Chou, Imai, and Rosenfeld (2017) <doi:10.1177/0049124117729711>, and Blair, Chou, and Imai (2018) <https://imai.fas.harvard.edu/research/files/listerror.pdf>. This includes a Bayesian MCMC implementation of regression for the standard and multiple sensitive item list experiment designs and a random effects setup, a Bayesian MCMC hierarchical regression model with up to three hierarchical groups, the combined list experiment and endorsement experiment regression model, a joint model of the list experiment that enables the analysis of the list experiment as a predictor in outcome regression models, a method for combining list experiments with direct questions, and methods for diagnosing and adjusting for response error. In addition, the package implements the statistical test that is designed to detect certain failures of list experiments, and a placebo test for the list experiment using data from direct questions.

Maintained by Graeme Blair. Last updated 1 year ago.

openblas

7 stars 6.60 score 191 scripts

bioc

M3C:Monte Carlo Reference-based Consensus Clustering

M3C is a consensus clustering algorithm that uses a Monte Carlo simulation to eliminate overestimation of K and can reject the null hypothesis K=1.

Maintained by Christopher John. Last updated 5 months ago.

clustering geneexpression transcription rnaseq sequencing immunooncology

6.59 score 174 scripts 1 dependent

sonsoleslp

tna:Transition Network Analysis (TNA)

Provides tools for performing Transition Network Analysis (TNA) to study relational dynamics, including functions for building and plotting TNA models, calculating centrality measures, and identifying dominant events and patterns. TNA statistical techniques (e.g., bootstrapping and permutation tests) ensure the reliability of observed insights and confirm that identified dynamics are meaningful. See Saqr et al. (2025) <doi:10.1145/3706468.3706513> for more details on TNA.

Maintained by Sonsoles López-Pernas. Last updated 4 days ago.

educational-data-mining learning-analytics markov-model temporal-analysis

4 stars 6.51 score 5 scripts

jazznbass

scan:Single-Case Data Analyses for Single and Multiple Baseline Designs

A collection of procedures for analysing, visualising, and managing single-case data. These include piecewise linear regression models, multilevel models, overlap indices ('PND', 'PEM', 'PAND', 'PET', 'tau-u', 'baseline corrected tau', 'CDC'), and randomization tests. Data preparation functions support outlier detection, handling missing values, scaling, and custom transformations. An export function helps to generate html, word, and latex tables in a publication friendly style. More details can be found in the online book 'Analyzing single-case data with R and scan', Juergen Wilbert (2025) <https://jazznbass.github.io/scan-Book/>.

Maintained by Juergen Wilbert. Last updated 11 days ago.

4 stars 6.47 score 62 scripts 1 dependent

tidymodels

plsmod:Model Wrappers for Projection Methods

Bindings for additional regression models for use with the 'parsnip' package, including ordinary and sparse partial least squares models for regression and classification (Rohart et al (2017) <doi:10.1371/journal.pcbi.1005752>).

Maintained by Max Kuhn. Last updated 6 months ago.

mixomics

14 stars 6.47 score 59 scripts 1 dependent

bioc

zenith:Gene set analysis following differential expression using linear (mixed) modeling with dream

Zenith performs gene set analysis on the result of differential expression using linear (mixed) modeling with dream by considering the correlation between gene expression traits. This package implements the camera method from the limma package proposed by Wu and Smyth (2012). Zenith is a simple extension of camera to be compatible with linear mixed models implemented in variancePartition::dream().

Maintained by Gabriel Hoffman. Last updated 5 days ago.

rnaseq geneexpression genesetenrichment differentialexpression batcheffect qualitycontrol regression epigenetics functionalgenomics transcriptomics normalization preprocessing microarray immunooncology software

6.39 score 91 scripts 1 dependent

milanwiedemann

lcsm:Univariate and Bivariate Latent Change Score Modelling

Helper functions to implement univariate and bivariate latent change score models in R using the 'lavaan' package. For details about Latent Change Score Modeling (LCSM) see McArdle (2009) <doi:10.1146/annurev.psych.60.110707.163612> and Grimm, An, McArdle, Zonderman and Resnick (2012) <doi:10.1080/10705511.2012.659627>. The package automatically generates 'lavaan' syntax for different model specifications and varying timepoints. The 'lavaan' syntax generated by this package can be returned and further specifications can be added manually. Longitudinal plots as well as simplified path diagrams can be created to visualise data and model specifications. Estimated model parameters and fit statistics can be extracted as data frames. Data for different univariate and bivariate LCSM can be simulated by specifying estimates for model parameters to explore their effects. This package combines the strengths of other R packages like 'lavaan', 'broom', and 'semPlot' by generating 'lavaan' syntax that helps these packages work together.

Maintained by Milan Wiedemann. Last updated 2 years ago.

17 stars 6.34 score 43 scripts

biorgeo

bioregion:Comparison of Bioregionalisation Methods

The main purpose of this package is to propose a transparent methodological framework to compare bioregionalisation methods based on hierarchical and non-hierarchical clustering algorithms (Kreft & Jetz (2010) <doi:10.1111/j.1365-2699.2010.02375.x>) and network algorithms (Lenormand et al. (2019) <doi:10.1002/ece3.4718> and Leroy et al. (2019) <doi:10.1111/jbi.13674>).

Maintained by Maxime Lenormand. Last updated 24 days ago.

biogeography bioregion bioregionalization cpp

7 stars 6.27 score 11 scripts

jsakaluk

dySEM:Dyadic Structural Equation Modeling

Scripting of structural equation models via 'lavaan' for Dyadic Data Analysis, and helper functions for supplemental calculations, tabling, and model visualization. Current models supported include Dyadic Confirmatory Factor Analysis, the Actor–Partner Interdependence Model (observed and latent), the Common Fate Model (observed and latent), Mutual Influence Model (latent), and the Bifactor Dyadic Model (latent).

Maintained by John Sakaluk. Last updated 4 days ago.

6 stars 6.12 score 10 scripts

kosukehamazaki

RAINBOWR:Genome-Wide Association Study with SNP-Set Methods

By using 'RAINBOWR' (Reliable Association INference By Optimizing Weights with R), users can test multiple SNPs (Single Nucleotide Polymorphisms) simultaneously by kernel-based (SNP-set) methods. This package can also be applied to haplotype-based GWAS (Genome-Wide Association Study). Users can test not only additive effects but also dominance and epistatic effects. For details, please see our paper in PLOS Computational Biology: Kosuke Hamazaki and Hiroyoshi Iwata (2020) <doi:10.1371/journal.pcbi.1007663>.

Maintained by Kosuke Hamazaki. Last updated 4 months ago.

cpp

22 stars 5.99 score 22 scripts

bioc

timeOmics:Time-Course Multi-Omics data integration

timeOmics is a generic data-driven framework to integrate multi-Omics longitudinal data measured on the same biological samples and select key temporal features with strong associations within the same sample group. The main steps of timeOmics are: 1. Platform and time-specific normalization and filtering steps; 2. Modelling each biological feature into one time expression profile; 3. Clustering features with the same expression profile over time; 4. Post-hoc validation step.

Maintained by Antoine Bodein. Last updated 5 months ago.

clustering featureextraction timecourse dimensionreduction software sequencing microarray metabolomics metagenomics proteomics classification regression immunooncology geneprediction multiplecomparison cluster integration multi-omics time-series

24 stars 5.98 score 10 scripts

bmaitner

S4DM:Small Sample Size Species Distribution Modeling

Implements a set of distribution modeling methods that are suited to species with small sample sizes (e.g., poorly sampled species or rare species). While these methods can also be used on well-sampled taxa, they are united by the fact that they can be utilized with relatively few data points. More details on the currently implemented methodologies can be found in Drake and Richards (2018) <doi:10.1002/ecs2.2373>, Drake (2015) <doi:10.1098/rsif.2015.0086>, and Drake (2014) <doi:10.1890/ES13-00202.1>.

Maintained by Brian S. Maitner. Last updated 2 months ago.

open-science range-modelling rare-species species-distribution-modeling species-distribution-modelling

4 stars 5.97 score 33 scripts

leoegidi

pivmet:Pivotal Methods for Bayesian Relabelling and k-Means Clustering

Collection of pivotal algorithms for: relabelling the MCMC chains in order to undo the label switching problem in Bayesian mixture models; fitting sparse finite mixtures; initializing the centers of the classical k-means algorithm in order to obtain a better clustering solution. For further details see Egidi, Pappadà, Pauli and Torelli (2018b) <ISBN:9788891910233>.

Maintained by Leonardo Egidi. Last updated 10 months ago.

jags cpp

5 stars 5.94 score 25 scripts

caetanods

ratematrix:Bayesian Estimation of the Evolutionary Rate Matrix

The Evolutionary Rate Matrix is a variance-covariance matrix which describes both the rates of trait evolution and the evolutionary correlation among multiple traits. This package has functions to estimate these parameters using Bayesian MCMC. It is possible to test if the pattern of evolutionary correlations among traits has changed between predictive regimes painted along the branches of the phylogenetic tree. Regimes can be created a priori or estimated as part of the MCMC under a joint estimation approach. The package has functions to run MCMC chains, plot results, evaluate convergence, and summarize posterior distributions.

Maintained by Daniel Caetano. Last updated 2 years ago.

openblas cpp openmp

10 stars 5.91 score 18 scripts 1 dependent

bioc

PathoStat:PathoStat Statistical Microbiome Analysis Package

The purpose of this package is to perform Statistical Microbiome Analysis on metagenomics results from sequencing data samples. In particular, it supports analyses on the PathoScope generated report files. PathoStat provides various functionalities including Relative Abundance charts, Diversity estimates and plots, tests of Differential Abundance, Time Series visualization, and Core OTU analysis.

Maintained by Solaiappan Manimaran. Last updated 5 months ago.

microbiome metagenomics graphandnetwork microarray patternlogic principalcomponent sequencing software visualization rnaseq immunooncology

8 stars 5.90 score 8 scripts

bioc

miRspongeR:Identification and analysis of miRNA sponge regulation

This package provides several functions to explore miRNA sponge (also called ceRNA or miRNA decoy) regulation from putative miRNA-target interactions and/or transcriptomics data (including bulk, single-cell and spatial gene expression data). It provides eight popular methods for identifying miRNA sponge interactions and an integrative method to combine miRNA sponge interactions from different methods, as well as functions to validate miRNA sponge interactions, infer miRNA sponge modules, and conduct enrichment and survival analyses of those modules. By using a sample control variable strategy, it provides a function to infer sample-specific miRNA sponge interactions. For sample-specific miRNA sponge interactions, it implements three similarity methods to construct a sample-sample correlation network.

Maintained by Junpeng Zhang. Last updated 5 months ago.

geneexpression biomedicalinformatics networkenrichment survival microarray software singlecell spatial rnaseq cerna mirna sponge

5 stars 5.88 score 8 scripts

bioc

epiNEM:epiNEM

epiNEM is an extension of the original Nested Effects Models (NEM). EpiNEM is able to take into account double knockouts and infer more complex network signalling pathways. It is tailored towards large scale double knock-out screens.

Maintained by Martin Pirkl. Last updated 5 months ago.

pathways systemsbiology networkinference network

1 star 5.83 score 1 script 3 dependents

bioc

benchdamic:Benchmark of differential abundance methods on microbiome data

Starting from a microbiome dataset (16S or WMS with absolute count values) it is possible to perform several analyses to assess the performance of many differential abundance detection methods. A basic and standardized version of the main differential abundance analysis methods is supplied, but users can also add their own method to the benchmark. The analyses focus on 4 main aspects: i) the goodness of fit of each method's distributional assumptions on the observed count data, ii) the ability to control the false discovery rate, iii) the within and between method concordances, iv) the truthfulness of the findings if any a priori knowledge is given. Several graphical functions are available for result visualization.

Maintained by Matteo Calgaro. Last updated 4 months ago.

metagenomics microbiome differentialexpression multiplecomparison normalization preprocessing software benchmark differential-abundance-methods

8 stars 5.78 score 8 scripts

mmeierer

REndo:Fitting Linear Models with Endogenous Regressors using Latent Instrumental Variables

Fits linear models with endogenous regressors using latent instrumental variable approaches. The methods included in the package are Lewbel's (1997) <doi:10.2307/2171884> higher moments approach as well as Lewbel's (2012) <doi:10.1080/07350015.2012.643126> heteroscedasticity approach, Park and Gupta's (2012) <doi:10.1287/mksc.1120.0718> joint estimation method that uses a Gaussian copula, and Kim and Frees's (2007) <doi:10.1007/s11336-007-9008-1> multilevel generalized method of moments approach that deals with endogeneity in a multilevel setting. These are statistical techniques to address the endogeneity problem where no external instrumental variables are needed. See the publication related to this package in the Journal of Statistical Software for more details: <doi:10.18637/jss.v107.i03>. Note that with version 2.0.0 sweeping changes were introduced which greatly improve functionality and usability but break backwards compatibility.

Maintained by Raluca Gui. Last updated 9 months ago.

cpp

16 stars 5.76 score 23 scripts

ugroempi

relaimpo:Relative Importance of Regressors in Linear Models

Provides several metrics for assessing relative importance in linear models. These can be printed, plotted and bootstrapped. The recommended metric is lmg, which provides a decomposition of the model explained variance into non-negative contributions. There is a version of this package available that additionally provides a new and also recommended metric called pmvd. If you are a non-US user, you can download this extended version from Ulrike Groemping's website.

Maintained by Ulrike Groemping. Last updated 1 year ago.

3 stars 5.75 score 632 scripts 3 dependents

svazzole

sparsevar:Sparse VAR/VECM Models Estimation

A wrapper for sparse VAR/VECM time series models estimation using penalties like ENET (Elastic Net), SCAD (Smoothly Clipped Absolute Deviation) and MCP (Minimax Concave Penalty). Based on the work of Sumanta Basu and George Michailidis <doi:10.1214/15-AOS1315>.

Maintained by Simone Vazzoler. Last updated 4 years ago.

econometrics lasso mcp scad sparse statistics time-series var vecm

11 stars 5.69 score 30 scripts 1 dependent

bioc

debCAM:Deconvolution by Convex Analysis of Mixtures

An R package for fully unsupervised deconvolution of complex tissues. It provides basic functions to perform unsupervised deconvolution on mixture expression profiles by Convex Analysis of Mixtures (CAM) and some auxiliary functions to help understand the subpopulation-specific results. It also implements functions to perform supervised deconvolution based on prior knowledge of molecular markers, S matrix or A matrix. Combining molecular markers from CAM and from prior knowledge can achieve semi-supervised deconvolution of mixtures.

Maintained by Lulu Chen. Last updated 5 months ago.

software cellbiology geneexpression openjdk

7 stars 5.69 score 14 scripts

bioc

methyLImp2:Missing value estimation of DNA methylation data

This package estimates missing values in DNA methylation data. The methyLImp method is based on linear regression, since methylation levels show a high degree of inter-sample correlation. The implementation is parallelised over chromosomes, since probes on different chromosomes are usually independent. A mini-batch approach is available to reduce the runtime when the number of samples is large.

Maintained by Anna Plaksienko. Last updated 2 months ago.

dnamethylation microarray software methylationarray regression imputation methylation missing-value-imputation

6 stars 5.62 score 3 scripts

transbiozi

RMTL:Regularized Multi-Task Learning

Efficient solvers for 10 regularized multi-task learning algorithms applicable for regression, classification, joint feature selection, task clustering, low-rank learning, sparse learning and network incorporation. Based on the accelerated gradient descent method, the algorithms feature a state-of-the-art convergence rate of O(1/k^2). Sparse model structure is induced by solving the proximal operator. Details of the package are described in the paper by Han Cao and Emanuel Schwarz (2018) <doi:10.1093/bioinformatics/bty831>.

Maintained by Han Cao. Last updated 6 years ago.

low-rank-representaion multi-task-learning regularization sparse-coding

19 stars 5.60 score 21 scripts
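To make the "proximal operator" remark in the RMTL entry above concrete: for a plain L1 (lasso) penalty, the proximal operator has a closed form known as soft-thresholding, which sets small coefficients exactly to zero and is how sparsity arises. The sketch below assumes that L1 case only; it is not RMTL's code, and the function name is hypothetical.

```python
def soft_threshold(values, lam):
    """Proximal operator of lam * ||x||_1, applied element-wise.

    Each entry is shrunk towards zero by lam; entries whose magnitude
    is at most lam become exactly zero.
    """
    shrunk = []
    for v in values:
        if v > lam:
            shrunk.append(v - lam)
        elif v < -lam:
            shrunk.append(v + lam)
        else:
            shrunk.append(0.0)
    return shrunk


# Hypothetical coefficient vector after a gradient step.
coefficients = [3.0, -0.5, -2.0, 0.2]
sparse_coefficients = soft_threshold(coefficients, lam=1.0)
print(sparse_coefficients)
```

In a proximal gradient scheme this shrinkage step would follow each gradient update on the smooth part of the objective; other penalties (for example low-rank or group-sparse ones) have different proximal operators.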

benyamindsmith

ig.degree.betweenness:Smith-Pittman Community Detection Algorithm for 'igraph' Objects (2024)

Implements the "Smith-Pittman" community detection algorithm for network analysis using 'igraph' objects. This algorithm combines node degree and betweenness centrality measures to identify communities within networks, with a gradient evident in social partitioning. The package provides functions for community detection, visualization, and analysis of the resulting community structure. Methods are based on results from Smith, Pittman and Xu (2024) <doi:10.48550/arXiv.2411.01394>.

Maintained by Benjamin Smith. Last updated 14 days ago.

community-detection-algorithms igraph

38 stars 5.50 score 11 scripts

molinlab

Holomics:A User-Friendly R 'shiny' Application for Multi-Omics Data Integration and Analysis

A 'shiny' application, which allows you to perform single- and multi-omics analyses using your own omics datasets. After the upload of the omics datasets and a metadata file, single-omics analysis is performed for feature selection and dataset reduction. These datasets are used for pairwise- and multi-omics analyses, where automatic tuning is done to identify correlations between the datasets - the end goal of the recommended 'Holomics' workflow. Methods used in the package were implemented in the package 'mixOmics' by Florian Rohart, Benoît Gautier, Amrit Singh, Kim-Anh Lê Cao (2017) <doi:10.1371/journal.pcbi.1005752> and are described there in further detail.

Maintained by Katharina Munk. Last updated 10 months ago.

7 stars 5.45 score 7 scripts

bips-hb

cpi:Conditional Predictive Impact

A general test for conditional independence in supervised learning algorithms as proposed by Watson & Wright (2021) <doi:10.1007/s10994-021-06030-6>. Implements a conditional variable importance measure which can be applied to any supervised learning algorithm and loss function. Provides statistical inference procedures without parametric assumptions and applies equally well to continuous and categorical predictors and outcomes.

Maintained by Marvin N. Wright. Last updated 4 months ago.

11 stars 5.42 score 24 scripts

andybega

spduration:Split-Population Duration (Cure) Regression

An implementation of split-population duration regression models. Unlike regular duration models, split-population duration models are mixture models that accommodate the presence of a sub-population that is not at risk for failure, e.g. cancer patients who have been cured by treatment. This package implements Weibull and Loglogistic forms for the duration component, and focuses on data with time-varying covariates. These models were originally formulated in Boag (1949) and Berkson and Gage (1952), and extended in Schmidt and Witte (1989).

Maintained by Andreas Beger. Last updated 1 year ago.

mixture-model regression split-population survival-analysis cpp

4 stars 5.38 score 40 scripts

bioc

PLSDAbatch:PLSDA-batch

A novel framework to correct for batch effects prior to any downstream analysis in microbiome data based on Projection to Latent Structures Discriminant Analysis. The main method is named “PLSDA-batch”. It first estimates treatment and batch variation with latent components, then subtracts batch-associated components from the data whilst preserving biological variation of interest. PLSDA-batch is highly suitable for microbiome data as it is non-parametric, multivariate and allows for ordination and data visualisation. Combined with centered log-ratio transformation for addressing uneven library sizes and compositional structure, PLSDA-batch addresses all characteristics of microbiome data that existing correction methods have ignored so far. Two other variants are proposed for 1/ unbalanced batch x treatment designs that are commonly encountered in studies with small sample sizes, and for 2/ selection of discriminative variables amongst treatment groups to avoid overfitting in classification problems. These two variants have widened the scope of applicability of PLSDA-batch to different data settings.

Maintained by Yiwen (Eva) Wang. Last updated 5 months ago.

statisticalmethod dimensionreduction principalcomponent classification microbiome batcheffect normalization visualization

13 stars 5.37 score 18 scripts

msesia

knockoff:The Knockoff Filter for Controlled Variable Selection

The knockoff filter is a general procedure for controlling the false discovery rate (FDR) when performing variable selection. For more information, see the website below and the accompanying paper: Candes et al., "Panning for gold: model-X knockoffs for high-dimensional controlled variable selection", J. R. Statist. Soc. B (2018) 80, 3, pp. 551-577.

Maintained by Matteo Sesia. Last updated 3 years ago.

2 stars 5.35 score 248 scripts 5 dependents
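For readers unfamiliar with how the knockoff filter in the entry above turns feature statistics into a selection: each variable j receives a statistic W_j (large and positive when the original variable looks more important than its knockoff copy), and a data-dependent threshold is chosen so that the estimated false discovery proportion stays below the target level. The sketch below implements that thresholding rule in plain Python under the assumption that the W statistics are already computed. The function names and the example values are hypothetical and this is not the package's own code.

```python
def knockoff_threshold(W, fdr=0.1, offset=1):
    """Smallest t with (offset + #{W_j <= -t}) / max(1, #{W_j >= t}) <= fdr.

    offset=1 corresponds to the more conservative "knockoff+" rule,
    offset=0 to the original knockoff rule. Returns infinity when no
    threshold satisfies the bound, in which case nothing is selected.
    """
    candidates = sorted({abs(w) for w in W if w != 0})
    for t in candidates:
        negatives = sum(1 for w in W if w <= -t)
        positives = sum(1 for w in W if w >= t)
        if (offset + negatives) / max(1, positives) <= fdr:
            return t
    return float("inf")


def knockoff_select(W, fdr=0.1, offset=1):
    """Indices of variables whose statistic reaches the threshold."""
    t = knockoff_threshold(W, fdr, offset)
    return [j for j, w in enumerate(W) if w >= t]


# Hypothetical feature statistics for 12 variables.
W = [5.0, 4.0, 3.5, 3.0, -0.5, 2.5, 0.2, -0.1, 2.0, 1.5, 1.2, 1.0]
threshold = knockoff_threshold(W, fdr=0.25)
selected = knockoff_select(W, fdr=0.25)
print(threshold, selected)
```

The negative statistics act as a stand-in for false positives, which is why their count appears in the numerator of the estimated false discovery proportion.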

bioc

MOSClip:Multi Omics Survival Clip

Topological pathway analysis tool able to integrate multi-omics data. It finds survival-associated modules or significant modules for two-class analysis. This tool has two main methods: pathway tests and module tests. The latter method allows the user to dig inside the pathways themselves.

Maintained by Paolo Martini. Last updated 5 months ago.

software statisticalmethod graphandnetwork survival regression dimensionreduction pathways reactome

5.34 score 5 scripts

bioc

GlobalAncova:Global test for groups of variables via model comparisons

The association between a variable of interest (e.g. two groups) and the global pattern of a group of variables (e.g. a gene set) is tested via a global F-test. We give the following arguments in support of the GlobalAncova approach: After appropriate normalisation, gene-expression-data appear rather symmetrical and outliers are no real problem, so least squares should be rather robust. ANCOVA with interaction yields saturated data modelling e.g. different means per group and gene. Covariate adjustment can help to correct for possible selection bias. Variance homogeneity and uncorrelated residuals cannot be expected. Application of ordinary least squares gives unbiased, but no longer optimal estimates (Gauss-Markov-Aitken). Therefore, using the classical F-test is inappropriate, due to correlation. The test statistic however mirrors deviations from the null hypothesis. In combination with a permutation approach, empirical significance levels can be approximated. Alternatively, an approximation yields asymptotic p-values. The framework is generalized to groups of categorical variables or even mixed data by a likelihood ratio approach. Closed and hierarchical testing procedures are supported. This work was supported by the NGFN grant 01 GR 0459, BMBF, Germany and BMBF grant 01ZX1309B, Germany.

Maintained by Manuela Hummel. Last updated 5 months ago.

microarray onechannel differentialexpression pathways regression

5.31 score 9 scripts 1 dependent

biostatomics

Coxmos:Cox MultiBlock Survival

This software package provides Cox survival analysis for high-dimensional and multiblock datasets. It encompasses a suite of functions ranging from classical Cox regression to newer analyses, including the Cox proportional hazards model, stepwise Cox regression, Elastic-Net Cox regression, Sparse Partial Least Squares Cox regression (sPLS-COX) incorporating three distinct strategies, and two Multiblock-PLS Cox regression (MB-sPLS-COX) methods. This tool is designed to handle high-dimensional data, and provides tools for cross-validation, plot generation, and additional resources for interpreting results. While references are available within the corresponding functions, key literature is listed below. Terry M Therneau (2024) <https://CRAN.R-project.org/package=survival>, Noah Simon et al. (2011) <doi:10.18637/jss.v039.i05>, Philippe Bastien et al. (2005) <doi:10.1016/j.csda.2004.02.005>, Philippe Bastien (2008) <doi:10.1016/j.chemolab.2007.09.009>, Philippe Bastien et al. (2014) <doi:10.1093/bioinformatics/btu660>, Kassu Mehari Beyene and Anouar El Ghouch (2020) <doi:10.1002/sim.8671>, Florian Rohart et al. (2017) <doi:10.1371/journal.pcbi.1005752>.

Maintained by Pedro Salguero García. Last updated 25 days ago.

1 star 5.30 score 5 scripts

langejens

CliquePercolation:Clique Percolation for Networks

Clique percolation community detection for weighted and unweighted networks as well as threshold and plotting functions. For more information see Farkas et al. (2007) <doi:10.1088/1367-2630/9/6/180> and Palla et al. (2005) <doi:10.1038/nature03607>.

Maintained by Jens Lange. Last updated 1 year ago.

4 stars 5.30 score 11 scripts 1 dependent
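As a rough illustration of the clique percolation idea behind the CliquePercolation entry above: in the unweighted case, all cliques of size k are found, two k-cliques are treated as adjacent when they share k-1 nodes, and each community is the union of a chain of adjacent cliques. The sketch below is a brute-force version for small unweighted graphs only. It does not cover the weighted variant or the threshold functions the package provides, and the function name and example graph are hypothetical.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Communities as unions of k-cliques that overlap in k-1 nodes."""
    # Build an adjacency map from the undirected edge list.
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    # Enumerate every k-clique by brute force (fine for small graphs only).
    cliques = [
        frozenset(group)
        for group in combinations(sorted(adjacency), k)
        if all(b in adjacency[a] for a, b in combinations(group, 2))
    ]

    # Union-find over cliques: merge cliques sharing exactly k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # Each community is the union of the nodes of its merged cliques.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical graph: two triangles sharing an edge, a bridge, and a third triangle.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6), (5, 7), (6, 7)]
communities = k_clique_communities(edges, k=3)
print(communities)
```

Nodes that belong to no k-clique are left unassigned, and a node can appear in more than one community, which is the overlapping behaviour that distinguishes clique percolation from partition-based methods.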

ncchung

jackstraw:Statistical Inference for Unsupervised Learning

Test for association between the observed data and their estimated latent variables. The jackstraw package provides a resampling strategy and testing scheme to estimate statistical significance of association between the observed data and their latent variables. Depending on the data type and the analysis aim, the latent variables may be estimated by principal component analysis (PCA), factor analysis (FA), K-means clustering, and related unsupervised learning algorithms. The jackstraw methods learn over-fitting characteristics inherent in this circular analysis, where the observed data are used to estimate the latent variables and used again to test against those estimated latent variables. When latent variables are estimated by PCA, the jackstraw enables statistical testing for association between observed variables and latent variables, as estimated by low-dimensional principal components (PCs). This essentially leads to identifying variables that are significantly associated with PCs. Similarly, unsupervised clustering, such as K-means clustering, partition around medoids (PAM), and others, finds coherent groups in high-dimensional data. The jackstraw estimates statistical significance of cluster membership, by testing association between data and cluster centers. Clustering membership can be improved by using the resulting jackstraw p-values and posterior inclusion probabilities (PIPs), with an application to unsupervised evaluation of cell identities in single cell RNA-seq (scRNA-seq).

Maintained by Neo Christopher Chung. Last updated 3 months ago.

clustering k-means machine-learning pca statistics unsupervised

16 stars 5.29 score 35 scripts

manueleleonelli

bnRep:A Repository of Bayesian Networks from the Academic Literature

A collection of Bayesian networks (discrete, Gaussian, and conditional linear Gaussian) collated from recent academic literature. The 'bnRep_summary' object provides an overview of the Bayesian networks in the repository and the package documentation includes details about the variables in each network. A Shiny app to explore the repository can be launched with 'bnRep_app()' and is available online at <https://manueleleonelli.shinyapps.io/bnRep>. For details see <https://github.com/manueleleonelli/bnRep>.

Maintained by Manuele Leonelli. Last updated 6 months ago.

6 stars 5.18 score 7 scripts

bioc

gcatest:Genotype Conditional Association TEST

GCAT is an association test for genome wide association studies that controls for population structure under a general class of trait models. This test conditions on the trait, which makes it immune to confounding by unmodeled environmental factors. Population structure is modeled via logistic factors, which are estimated using the `lfa` package.

Maintained by Alejandro Ochoa. Last updated 5 months ago.

snp dimensionreduction principalcomponent genomewideassociation

5 stars 5.18 score 4 scripts

cbg-ethz

clustNet:Network-Based Clustering

Network-based clustering using a Bayesian network mixture model with optional covariate adjustment.

Maintained by Fritz Bayer. Last updated 1 year ago.

bayesian-network bayesian-networks clustering dag genomics mixture-model network-clustering

7 stars 5.16 score 41 scripts

fbertran

plsRcox:Partial Least Squares Regression for Cox Models and Related Techniques

Provides Partial least squares Regression and various regular, sparse or kernel, techniques for fitting Cox models in high dimensional settings <doi:10.1093/bioinformatics/btu660>, Bastien, P., Bertrand, F., Meyer N., Maumy-Bertrand, M. (2015), Deviance residuals-based sparse PLS and sparse kernel PLS regression for censored data, Bioinformatics, 31(3):397-404. Cross validation criteria were studied in <arXiv:1810.02962>, Bertrand, F., Bastien, Ph. and Maumy-Bertrand, M. (2018), Cross validating extensions of kernel, sparse or regular partial least squares regression models to censored data.

Maintained by Frederic Bertrand. Last updated 2 years ago.

4 stars 5.13 score 56 scripts 2 dependents

bioc

DepecheR:Determination of essential phenotypic elements of clusters in high-dimensional entities

The purpose of this package is to identify traits in a dataset that can separate groups. This is done on two levels. First, clustering is performed, using an implementation of sparse K-means. Secondly, the generated clusters are used to predict outcomes of groups of individuals based on their distribution of observations in the different clusters. As certain clusters with separating information will be identified, and these clusters are defined by a sparse number of variables, this method can reduce the complexity of data, to only emphasize the data that actually matters.

Maintained by Jakob Theorell. Last updated 5 months ago.

software cellbasedassays transcription differentialexpression datarepresentation immunooncology transcriptomics classification clustering dimensionreduction featureextraction flowcytometry rnaseq singlecell visualization cpp

5.08 score 15 scripts

vandenman

NetworkComparisonTest:Statistical Comparison of Two Networks Based on Several Invariance Measures

This permutation based hypothesis test, suited for several types of data supported by the estimateNetwork function of the bootnet package (Epskamp & Fried, 2018), assesses the difference between two networks based on several invariance measures (network structure invariance, global strength invariance, edge invariance, several centrality measures, etc.). Network structures are estimated with l1-regularization. The Network Comparison Test is suited for comparison of independent (e.g., two different groups) and dependent samples (e.g., one group that is measured twice). See van Borkulo et al. (2021, in press; the final article will be available, upon publication, via its DOI: 10.1037/met0000476).

Maintained by Claudia van Borkulo. Last updated 3 years ago.

5.07 score 70 scripts

bonsook

REN:Regularization Ensemble for Robust Portfolio Optimization

Portfolio optimization is achieved through a combination of regularization techniques and ensemble methods that are designed to generate stable out-of-sample return predictions, particularly in the presence of strong correlations among assets. The package includes functions for data preparation, parallel processing, and portfolio analysis using methods such as Mean-Variance, James-Stein, LASSO, Ridge Regression, and Equal Weighting. It also provides visualization tools and performance metrics, such as the Sharpe ratio, volatility, and maximum drawdown, to assess the results.

Maintained by Bonsoo Koo. Last updated 6 months ago.

1 star 5.04 score 2 scripts

ivaughan

econullnetr:Null Model Analysis for Ecological Networks

Tools for using null models to analyse ecological networks (e.g. food webs, flower-visitation networks, seed-dispersal networks) and detect resource preferences or non-random interactions among network nodes. Tools are provided to run null models, test for and plot preferences, plot and analyse bipartite networks, and export null model results in a form compatible with other network analysis packages. The underlying null model was developed by Agusti et al. (2003) Molecular Ecology <doi:10.1046/j.1365-294X.2003.02014.x> and the full application to ecological networks by Vaughan et al. (2018) econullnetr: an R package using null models to analyse the structure of ecological networks and identify resource selection. Methods in Ecology & Evolution, <doi:10.1111/2041-210X.12907>.

Maintained by Ian Vaughan. Last updated 4 years ago.

7 stars 5.04 score 31 scripts

r-forge

plasma:Partial LeAst Squares for Multiomic Analysis

Contains tools for supervised analyses of incomplete, overlapping multiomics datasets. Applies partial least squares in multiple steps to find models that predict survival outcomes. See Yamaguchi et al. (2023) <doi:10.1101/2023.03.10.532096>.

Maintained by Kevin R. Coombes. Last updated 2 months ago.

4.97 score 13 scripts

rikenbit

iTensor:ICA-Based Matrix/Tensor Decomposition

Some functions for performing ICA, MICA, Group ICA, and Multilinear ICA are implemented. ICA, MICA/Group ICA, and Multilinear ICA extract statistically independent components from a single matrix, multiple matrices, and a single tensor, respectively. For the details of these methods, see the reference section of GitHub README.md <https://github.com/rikenbit/iTensor>.

Maintained by Koki Tsuyuzaki. Last updated 2 years ago.

1 stars 4.95 score 2 scripts 1 dependents

jmbh

mnet:Modeling Group Differences and Moderation Effects in Statistical Network Models

A toolbox for modeling manifest and latent group differences and moderation effects in various statistical network models.

Maintained by Jonas Haslbeck. Last updated 2 months ago.

4.91 score 18 scripts

bioc

mfa:Bayesian hierarchical mixture of factor analyzers for modelling genomic bifurcations

MFA models genomic bifurcations using a Bayesian hierarchical mixture of factor analysers.

Maintained by Kieran Campbell. Last updated 5 months ago.

immunooncology rnaseq geneexpression bayesian singlecell cpp

4.85 score 35 scripts

netcoupler

NetCoupler:Inference of Causal Links Between a Network and an External Variable

The 'NetCoupler' algorithm identifies potential direct effects of correlated, high-dimensional variables formed as a network with an external variable. The external variable may act as the dependent/response variable or as an independent/predictor variable to the network.

Maintained by Luke Johnston. Last updated 1 year ago.

6 stars 4.78 score 7 scripts

annennenne

causalDisco:Tools for Causal Discovery on Observational Data

Various tools for inferring causal models from observational data. The package includes an implementation of the temporal Peter-Clark (TPC) algorithm. Petersen, Osler and Ekstrøm (2021) <doi:10.1093/aje/kwab087>. It also includes general tools for evaluating differences in adjacency matrices, which can be used for evaluating performance of causal discovery procedures.

Maintained by Anne Helby Petersen. Last updated 27 days ago.

19 stars 4.76 score 10 scripts

jakobbossek

mcMST:A Toolbox for the Multi-Criteria Minimum Spanning Tree Problem

Algorithms to approximate the Pareto-front of multi-criteria minimum spanning tree problems.

Maintained by Jakob Bossek. Last updated 2 years ago.

evolutionary-algorithms mcmst minimum-spanning-trees multi-objective-optimization spanningtrees

4 stars 4.73 score 27 scripts
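
The Pareto front mentioned in the entry above is the set of solutions that no other solution dominates. A minimal sketch of that filtering step follows, assuming each candidate spanning tree has already been reduced to a tuple of costs to be minimised. The function names are hypothetical and this is not the package's algorithm, which searches the space of spanning trees.

```python
def dominates(a, b):
    """True if cost vector a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(costs):
    """Keep only the cost vectors that no other vector dominates (minimisation)."""
    return [c for c in costs if not any(dominates(other, c) for other in costs)]


# Hypothetical (weight1, weight2) totals of five candidate spanning trees.
candidates = [(4, 9), (5, 7), (6, 8), (7, 3), (7, 5)]
front = pareto_front(candidates)
```

The filter is quadratic in the number of candidates, which is adequate for a small illustrative set.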

bioc

miRLAB:Dry lab for exploring miRNA-mRNA relationships

Provides tools for exploring miRNA-mRNA relationships, including popular miRNA target prediction methods, ensemble methods that integrate individual methods, functions to get data from online resources, functions to validate the results, and functions to conduct enrichment analyses.

Maintained by Thuc Duy Le. Last updated 5 months ago.

mirna geneexpression networkinference network

4.72 score 11 scripts

kylehamilton

lavaan.shiny:Latent Variable Analysis with Shiny

Interactive shiny application for working with different kinds of latent variable analysis, with the 'lavaan' package. Graphical output for models is provided and different estimators are supported.

Maintained by William Kyle Hamilton. Last updated 9 years ago.

10 stars 4.70 score 1 scripts

bioc

MetNet:Inferring metabolic networks from untargeted high-resolution mass spectrometry data

MetNet contains functionality to infer metabolic network topologies from quantitative data and high-resolution mass/charge information. Using statistical models (including correlation, mutual information, regression and Bayes statistics) and quantitative data (intensity values of features), adjacency matrices are inferred that can be combined into a consensus matrix. Mass differences calculated between mass/charge values of features are matched against a data frame of supplied mass/charge differences referring to transformations of enzymatic activities. In a third step, the two levels of information are combined to form an adjacency matrix inferred from both quantitative and structural information.

Maintained by Thomas Naake. Last updated 5 months ago.

immunooncology metabolomics massspectrometry network regression

4.70 score 1 scripts

audreyqyfu

MRPC:PC Algorithm with the Principle of Mendelian Randomization

Implements the MRPC algorithm (PC with the principle of Mendelian randomization) to infer causal graphs. It also contains functions to simulate data under a given topology, to visualize a graph in different ways, and to compare graphs and quantify their differences. See Badsha and Fu (2019) <doi:10.3389/fgene.2019.00460> and Badsha, Martin and Fu (2021) <doi:10.3389/fgene.2021.651812>.

Maintained by Audrey Fu. Last updated 3 years ago.

8 stars 4.68 score 20 scripts

bkeller2

mlmpower:Power Analysis and Data Simulation for Multilevel Models

A declarative language for specifying multilevel models, solving for population parameters based on specified variance-explained effect size measures, generating data, and conducting power analyses to determine sample size recommendations. The specification allows for any number of within-cluster effects, between-cluster effects, covariate effects at either level, and random coefficients. Moreover, the models do not assume orthogonal effects, and predictors can correlate at either level and accommodate models with multiple interaction effects.

Maintained by Brian T. Keller. Last updated 5 months ago.

3 stars 4.65 score 3 scripts

bioc

nempi:Inferring unobserved perturbations from gene expression data

Takes as input an incomplete perturbation profile and differential gene expression in log odds and infers unobserved perturbations and augments observed ones. The inference is done by iteratively inferring a network from the perturbations and inferring perturbations from the network. The network inference is done by Nested Effects Models.

Maintained by Martin Pirkl. Last updated 5 months ago.

software geneexpression differentialexpression differentialmethylation genesignaling pathways network classification neuralnetwork networkinference atacseq dnaseq rnaseq pooledscreens crispr singlecell systemsbiology

2 stars 4.60 score 2 scripts

jongheepark

NetworkChange:Bayesian Package for Network Changepoint Analysis

Network changepoint analysis for undirected network data. The package implements a hidden Markov network change point model (Park and Sohn (2020)). Functions for break number detection using the approximate marginal likelihood and WAIC are also provided.

Maintained by Jong Hee Park. Last updated 3 years ago.

bayesian changepoint latent-space network

5 stars 4.60 score 16 scripts

bioc

bnem:Training of logical models from indirect measurements of perturbation experiments

bnem combines the use of indirect measurements of Nested Effects Models (package mnem) with the Boolean networks of CellNOptR. Perturbation experiments of signalling nodes in cells are analysed for their effect on the global gene expression profile. Those profiles give evidence for the Boolean regulation of down-stream nodes in the network, e.g., whether two parents activate their child independently (OR-gate) or jointly (AND-gate).

Maintained by Martin Pirkl. Last updated 5 months ago.

pathways systemsbiology networkinference network geneexpression generegulation preprocessing

2 stars 4.60 score 5 scripts

bioc

dce:Pathway Enrichment Based on Differential Causal Effects

Compute differential causal effects (dce) on (biological) networks. Given observational samples from a control experiment and non-control (e.g., cancer) for two genes A and B, we can compute differential causal effects with a (generalized) linear regression. If the causal effect of gene A on gene B in the control samples is different from the causal effect in the non-control samples the dce will differ from zero. We regularize the dce computation by the inclusion of prior network information from pathway databases such as KEGG.

Maintained by Kim Philipp Jablonski. Last updated 3 months ago.

software statisticalmethod graphandnetwork regression geneexpression differentialexpression networkenrichment network kegg bioconductor causality

13 stars 4.59 score 4 scripts

karolinehuth

easybgm:Extracting and Visualizing Bayesian Graphical Models

Fit and visualize the results of a Bayesian analysis of networks commonly found in psychology. The package supports cross-sectional network models fitted with the packages 'BDgraph', 'bgms' and 'BGGM'. It provides the parameter estimates, posterior inclusion probabilities, inclusion Bayes factors, and the posterior density of the parameters. In addition, for 'BDgraph' and 'bgms' it allows assessment of the posterior structure space. Furthermore, the package comes with an extensive suite for visualizing results.

Maintained by Karoline Huth. Last updated 5 months ago.

4.51 score 27 scripts

alexchristensen

SemNeT:Methods and Measures for Semantic Network Analysis

Implements several functions for the analysis of semantic networks including different network estimation algorithms, partial node bootstrapping (Kenett, Anaki, & Faust, 2014 <doi:10.3389/fnhum.2014.00407>), random walk simulation (Kenett & Austerweil, 2016 <http://alab.psych.wisc.edu/papers/files/Kenett16CreativityRW.pdf>), and a function to compute global network measures. Significance tests and plotting features are also implemented.

Maintained by Alexander P. Christensen. Last updated 2 years ago.

semantic-network-analysis

23 stars 4.51 score 28 scripts

bioc

clipper:Gene Set Analysis Exploiting Pathway Topology

Implements topological gene set analysis using a two-step empirical approach. It exploits graph decomposition theory to create a junction tree and reconstruct the most relevant signal path. In the first step clipper selects significant pathways according to statistical tests on the means and the concentration matrices of the graphs derived from pathway topologies. Then, it "clips" the whole pathway identifying the signal paths having the greatest association with a specific phenotype.

Maintained by Paolo Martini. Last updated 5 months ago.

4.48 score 19 scripts

jcdterry

cassandRa:Finds Missing Links and Metric Confidence Intervals in Ecological Bipartite Networks

Provides methods to deal with undersampling in ecological bipartite networks from Terry and Lewis (2020) Ecology <doi:10.1002/ecy.3047>. Includes tools to fit a variety of statistical network models and sample coverage estimators to highlight the most likely missing links. Also includes simple functions to resample from observed networks to generate confidence intervals for common ecological network metrics.

Maintained by Chris Terry. Last updated 10 months ago.

3 stars 4.48 score 4 scripts

kelliejarcher

hdcuremodels:Penalized Mixture Cure Models for High-Dimensional Data

Provides functions for fitting various penalized parametric and semi-parametric mixture cure models with different penalty functions, testing for a significant cure fraction, and testing for sufficient follow-up as described in Fu et al. (2022) <doi:10.1002/sim.9513> and Archer et al. (2024) <doi:10.1186/s13045-024-01553-6>. False discovery rate controlled variable selection is provided using model-X knock-offs.

Maintained by Kellie J. Archer. Last updated 8 days ago.

4.48 score 5 scripts

baeyc

varTestnlme:Variance Components Testing for Linear and Nonlinear Mixed Effects Models

An implementation of the likelihood ratio test (LRT) for testing that, in a (non)linear mixed effects model, the variances of a subset of the random effects are equal to zero. There is no restriction on the subset of variances that can be tested: for example, it is possible to test that all the variances are equal to zero. Note that the implemented test is asymptotic. This package should be used on model fits from the packages 'nlme', 'lme4' (function 'lmer'), and 'saemix'. Charlotte Baey and Estelle Kuhn (2019) <doi:10.18637/jss.v107.i06>.

Maintained by Charlotte Baey. Last updated 2 years ago.

2 stars 4.48 score 4 scripts 1 dependents

pwarncke77

ResIN:Response Item Networks

Contains various tools to perform and visualize Response Item Networks ('ResIN's'). 'ResIN' binarizes ordered-categorical and qualitative response choices from (survey) data, calculates pairwise associations and maps the location of each item response as a node in a force-directed network. Please refer to <https://www.resinmethod.net/> for more details.

Maintained by Philip Warncke. Last updated 6 months ago.

4.48 score 3 scripts

bioc

epistasisGA:An R package to identify multi-snp effects in nuclear family studies using the GADGETS method

This package runs the GADGETS method to identify epistatic effects in nuclear family studies. It also provides functions for permutation-based inference and graphical visualization of the results.

Maintained by Michael Nodzenski. Last updated 5 months ago.

genetics snp geneticvariability openblas cpp

1 stars 4.48 score 5 scripts

bioc

PRONE:The PROteomics Normalization Evaluator

High-throughput omics data are often affected by systematic biases introduced throughout all the steps of a clinical study, from sample collection to quantification. Normalization methods aim to adjust for these biases to make the actual biological signal more prominent. However, selecting an appropriate normalization method is challenging due to the wide range of available approaches. Therefore, a comparative evaluation of unnormalized and normalized data is essential in identifying an appropriate normalization strategy for a specific data set. This R package provides different functions for preprocessing, normalizing, and evaluating different normalization approaches. Furthermore, normalization methods can be evaluated on downstream steps, such as differential expression analysis and statistical enrichment analysis. Spike-in data sets with known ground truth and real-world data sets of biological experiments acquired by either tandem mass tag (TMT) or label-free quantification (LFQ) can be analyzed.

Maintained by Lis Arend. Last updated 9 days ago.

proteomics preprocessing normalization differentialexpression visualization data-analysis evaluation

2 stars 4.41 score 9 scripts

bioc

snm:Supervised Normalization of Microarrays

SNM is a modeling strategy especially designed for normalizing high-throughput genomic data. The underlying premise of our approach is that your data is a function of what we refer to as study-specific variables. These variables are either biological variables that represent the target of the statistical analysis, or adjustment variables that represent factors arising from the experimental or biological setting the data is drawn from. The SNM approach aims to simultaneously model all study-specific variables in order to more accurately characterize the biological or clinical variables of interest.

Maintained by John D. Storey. Last updated 5 months ago.

microarray onechannel twochannel multichannel differentialexpression exonarray geneexpression transcription multiplecomparison preprocessing qualitycontrol

4.41 score 64 scripts

alexiosg

parma:Portfolio Allocation and Risk Management Applications

Provision of a set of models and methods for use in the allocation and management of capital in financial portfolios.

Maintained by Alexios Galanos. Last updated 2 years ago.

openblas

4 stars 4.38 score 12 scripts

anttonalberdi

hilldiv:Integral Analysis of Diversity Based on Hill Numbers

Tools for analysing, comparing, visualising and partitioning diversity based on Hill numbers. 'hilldiv' is an R package that provides a set of functions to assist analysis of diversity for diet reconstruction, microbial community profiling or more general ecosystem characterisation analyses based on Hill numbers, using OTU/ASV tables and associated phylogenetic trees as inputs. The package includes functions for (phylo)diversity measurement, (phylo)diversity profile plotting, (phylo)diversity comparison between samples and groups, (phylo)diversity partitioning and (dis)similarity measurement. All of these are grounded in abundance-based and incidence-based Hill numbers. The statistical framework developed around Hill numbers encompasses many of the most broadly employed diversity (e.g. richness, Shannon index, Simpson index), phylogenetic diversity (e.g. Faith's PD, Allen's H, Rao's quadratic entropy) and dissimilarity (e.g. Sorensen index, Unifrac distances) metrics. This enables the most common analyses of diversity to be performed within a single statistical framework. The methods are described in Jost et al. (2007) <DOI:10.1890/06-1736.1>, Chao et al. (2010) <DOI:10.1098/rstb.2010.0272> and Chiu et al. (2014) <DOI:10.1890/12-0960.1>; and reviewed in the framework of molecularly characterised biological systems in Alberdi & Gilbert (2019) <DOI:10.1111/1755-0998.13014>.

Maintained by Antton Alberdi. Last updated 4 years ago.

11 stars 4.35 score 41 scripts
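
For orientation, the abundance-based Hill number of order q that the entry above builds on is D = (sum of p_i^q)^(1/(1-q)), with the limit at q = 1 equal to the exponential of Shannon entropy. The sketch below shows only this neutral (non-phylogenetic) formula; the function name is an assumption and it does not reproduce the 'hilldiv' interface.

```python
import math


def hill_number(abundances, q):
    """Hill number (effective number of types) of order q for raw abundances."""
    total = sum(abundances)
    p = [a / total for a in abundances if a > 0]   # relative abundances, zeros dropped
    if abs(q - 1.0) < 1e-12:
        # q = 1 is defined as a limit: exp(Shannon entropy).
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))
```

Order q = 0 gives richness, q = 1 the exponential of the Shannon index, and q = 2 the inverse Simpson index, which is how the single framework covers the familiar diversity metrics listed in the description.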

fbertran

c060:Extended Inference for Lasso and Elastic-Net Regularized Cox and Generalized Linear Models

The c060 package provides additional functions to perform stability selection, model validation and parameter tuning for glmnet models.

Maintained by Frederic Bertrand. Last updated 2 years ago.

3 stars 4.35 score 37 scripts

fbertran

plsRbeta:Partial Least Squares Regression for Beta Regression Models

Provides partial least squares regression for (weighted) beta regression models (Bertrand 2013, <http://journal-sfds.fr/article/view/215>) and k-fold cross-validation of such models using various criteria. It allows for missing data in the explanatory variables. Bootstrap confidence interval construction is also available.

Maintained by Frederic Bertrand. Last updated 2 years ago.

2 stars 4.34 score 22 scripts

greenwoodlab

pcev:Principal Component of Explained Variance

Principal component of explained variance (PCEV) is a statistical tool for the analysis of a multivariate response vector. It is a dimension-reduction technique, similar to principal component analysis (PCA), that seeks to maximize the proportion of variance (in the response vector) being explained by a set of covariates.

Maintained by Maxime Turgeon. Last updated 6 years ago.

4 stars 4.30 score 7 scripts

bioc

nethet:A bioconductor package for high-dimensional exploration of biological network heterogeneity

Package nethet is an implementation of statistically solid methodology enabling the analysis of network heterogeneity from high-dimensional data. It combines several implementations of recent statistical innovations useful for estimation and comparison of networks in a heterogeneous, high-dimensional setting. In particular, we provide code for formal two-sample testing in Gaussian graphical models (differential network and GGM-GSA; Stadler and Mukherjee, 2013, 2014) and make a novel network-based clustering algorithm available (mixed graphical lasso, Stadler and Mukherjee, 2013).

Maintained by Nicolas Staedler. Last updated 5 months ago.

clustering graphandnetwork

4.30 score 7 scripts

bioc

RegionalST:Investigating regions of interest and performing regional cell type-specific analysis with spatial transcriptomics data

This package analyzes spatial transcriptomics data through cross-regional cell type-specific analysis. It selects regions of interest (ROIs) and identifies cross-regional cell type-specific differential signals. The ROIs can be selected using an automatic algorithm or through manual selection. It facilitates manual selection of ROIs using a shiny application.

Maintained by Ziyi Li. Last updated 4 months ago.

spatial transcriptomics reactome kegg

4.30 score 8 scripts

jafarilab

NIMAA:Nominal Data Mining Analysis

Functions for nominal data mining based on bipartite graphs, which build a pipeline for analysis and missing value imputation. Methods are mainly from Jafari, Mohieddin, et al. (2021) <doi:10.1101/2021.03.18.436040>; some new ones are also included.

Maintained by Mohieddin Jafari. Last updated 2 years ago.

4 stars 4.30 score 7 scripts

jiangyouxiang

TestAnaAPP:A 'shiny' App for Test Analysis and Visualization

This application provides exploratory and confirmatory factor analysis, classical test theory, unidimensional and multidimensional item response theory, and continuous item response model analysis, through the 'shiny' interactive interface. In addition, it offers rich functionalities for visualizing and downloading results. Users can download figures, tables, and analysis reports via the interactive interface.

Maintained by Youxiang Jiang. Last updated 4 months ago.

4 stars 4.30 score 2 scripts

bioc

mogsa:Multiple omics data integrative clustering and gene set analysis

This package provides a method for doing gene set analysis based on multiple omics data.

Maintained by Chen Meng. Last updated 5 months ago.

geneexpression principalcomponent statisticalmethod clustering software

4.29 score 49 scripts

piyalkarum

rCNV:Detect Copy Number Variants from SNPs Data

Functions in this package import filtered variant call format (VCF) files of SNP data and generate data sets to detect copy number variants, visualize them and do downstream analyses with copy number variants (e.g. environmental association analyses).

Maintained by Piyal Karunarathne. Last updated 26 days ago.

cnv-analysis copy-number-variation gene-duplication genetics genomics landscape-genetics snps cpp

6 stars 4.26 score 4 scripts

andrisignorell

ModTools:Building Regression and Classification Models

Consistent user interface to the most common regression and classification algorithms, such as random forest, neural networks, C5 trees and support vector machines, complemented with a handful of auxiliary functions, such as variable importance and a tuning function for the parameters.

Maintained by Andri Signorell. Last updated 2 months ago.

2 stars 4.20 score 3 scripts

bioc

MICSQTL:MICSQTL (Multi-omic deconvolution, Integration and Cell-type-specific Quantitative Trait Loci)

Our pipeline, MICSQTL, utilizes scRNA-seq reference and bulk transcriptomes to estimate cellular composition in the matched bulk proteomes. The expression of genes and proteins at either bulk level or cell type level can be integrated by the Angle-based Joint and Individual Variation Explained (AJIVE) framework. Meanwhile, MICSQTL can perform cell-type-specific quantitative trait loci (QTL) mapping to proteins or transcripts based on the input of bulk expression data and the estimated cellular composition per molecule type, without the need for single cell sequencing. We use matched transcriptome-proteome from human brain frontal cortex tissue samples to demonstrate the input and output of our tool.

Maintained by Qian Li. Last updated 5 months ago.

geneexpression genetics proteomics rnaseq sequencing singlecell software visualization cellbasedassays coverage

4.18 score 3 scripts

barbaratarantino

SEMdeep:Structural Equation Modeling with Deep Neural Network and Machine Learning

Training and validation of custom (or data-driven) Structural Equation Models using layer-wise Deep Neural Networks or node-wise Machine Learning algorithms, which extend the fitting procedures of the 'SEMgraph' R package <doi:10.32614/CRAN.package.SEMgraph>.

Maintained by Barbara Tarantino. Last updated 2 months ago.

4 stars 4.15 score

desanou

mglasso:Multiscale Graphical Lasso

Inference of multiscale graphical models with a neighborhood selection approach. The method is based on solving a convex optimization problem combining Lasso and fused-group Lasso penalties. This allows simultaneous inference of a conditional independence graph and a clustering partition. The optimization is based on the Continuation with Nesterov smoothing in a Shrinkage-Thresholding Algorithm solver (Hadj-Selem et al. 2018) <doi:10.1109/TMI.2018.2829802> implemented in Python.

Maintained by Edmond Sanou. Last updated 2 years ago.

2 stars 4.11 score 13 scripts

topepo

sparsediscrim:Sparse and Regularized Discriminant Analysis

A collection of sparse and regularized discriminant analysis methods intended for small-sample, high-dimensional data sets. The package features the High-Dimensional Regularized Discriminant Analysis classifier from Ramey et al. (2017) <arXiv:1602.01182>. Other classifiers include those from Dudoit et al. (2002) <doi:10.1198/016214502753479248>, Pang et al. (2009) <doi:10.1111/j.1541-0420.2009.01200.x>, and Tong et al. (2012) <doi:10.1093/bioinformatics/btr690>.

Maintained by Max Kuhn. Last updated 4 years ago.

3 stars 4.11 score 86 scripts

juliengamartin

pedtricks:Visualize, Summarize and Simulate Data from Pedigrees

Tools for sensitivity and power analysis, for calculating statistics describing pedigrees from wild populations, and for visualizing pedigrees. This is a reboot of the methods developed by Morrissey and Wilson (2010) <doi:10.1111/j.1755-0998.2009.02817.x>.

Maintained by Julien Martin. Last updated 7 months ago.

2 stars 4.08 score 1 scripts

pachoning

bigmds:Multidimensional Scaling for Big Data

MDS is a statistical tool for dimensionality reduction that takes as input a distance matrix of dimensions n × n. When n is large, classical algorithms suffer from computational problems and the MDS configuration cannot be obtained. This package addresses these problems by means of six algorithms, two of which are original proposals: Landmark MDS, proposed by De Silva V. and JB. Tenenbaum (2004); Interpolation MDS, proposed by Delicado P. and C. Pachón-García (2021) <arXiv:2007.11919> (original proposal); Reduced MDS, proposed by Paradis E (2018); Pivot MDS, proposed by Brandes U. and C. Pich (2007); Divide-and-conquer MDS, proposed by Delicado P. and C. Pachón-García (2021) <arXiv:2007.11919> (original proposal); and Fast MDS, proposed by Yang, T., J. Liu, L. McMillan and W. Wang (2006).

Maintained by Cristian Pachón García. Last updated 1 year ago.

17 stars 4.08 score 14 scripts

psychbruce

PsychWordVec:Word Embedding Research Framework for Psychological Science

An integrative toolbox of word embedding research that provides: (1) a collection of 'pre-trained' static word vectors in the '.RData' compressed format <https://psychbruce.github.io/WordVector_RData.pdf>; (2) a series of functions to process, analyze, and visualize word vectors; (3) a range of tests to examine conceptual associations, including the Word Embedding Association Test <doi:10.1126/science.aal4230> and the Relative Norm Distance <doi:10.1073/pnas.1720347115>, with permutation test of significance; (4) a set of training methods to locally train (static) word vectors from text corpora, including 'Word2Vec' <arXiv:1301.3781>, 'GloVe' <doi:10.3115/v1/D14-1162>, and 'FastText' <arXiv:1607.04606>; (5) a group of functions to download 'pre-trained' language models (e.g., 'GPT', 'BERT') and extract contextualized (dynamic) word vectors (based on the R package 'text').

Maintained by Han-Wu-Shuang Bao. Last updated 1 year ago.

22 stars 4.04 score 10 scripts

bioc

splineTimeR:Time-course differential gene expression data analysis using spline regression models followed by gene association network reconstruction

This package provides functions for differential gene expression analysis of gene expression time-course data. Natural cubic spline regression models are used. Identified genes may further be used for pathway enrichment analysis and/or the reconstruction of time dependent gene regulatory association networks.

Maintained by Herbert Braselmann. Last updated 5 months ago.

geneexpression differentialexpression timecourse regression genesetenrichment networkenrichment networkinference graphandnetwork

4.01 score 17 scripts

georgiosseitidis

viscomp:Visualize Multi-Component Interventions in Network Meta-Analysis

A set of functions providing several visualization tools for exploring the behavior of the components in a network meta-analysis of multi-component (complex) interventions: components descriptive analysis; heat plot of the two-by-two component combinations; leaving one component combination out scatter plot; violin plot for specific component combinations' effects; density plot for components' effects; waterfall plot for the interventions' effects that differ by a certain component combination; network graph of components; rank heat plot of components for multiple outcomes. The implemented tools are described by Seitidis et al. (2023) <doi:10.1002/jrsm.1617>.

Maintained by Georgios Seitidis. Last updated 2 years ago.

cnma complex multicomponent nma visualization

2 stars 4.00 score 6 scripts

georgekoliopanos

modgo:MOck Data GeneratiOn

Generation of mock data from a real dataset using rank normal inverse transformation.

Maintained by George Koliopanos. Last updated 9 months ago.

1 stars 4.00 score 3 scripts
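
The rank-based inverse normal transformation named in the entry above maps each observation to the standard normal quantile of its rescaled rank. The sketch below is a generic version under stated assumptions: the rank offset `c` (0.5 here), the average-rank handling of ties, and the function name are illustrative choices and are not taken from the package.

```python
from statistics import NormalDist


def rank_inverse_normal(values, c=0.5):
    """Map values to standard normal scores via their (average) ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values so they share one average rank.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0         # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    # (rank - c) / (n - 2c + 1) lies strictly inside (0, 1) for 0 <= c < 1.
    return [standard_normal.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]
```

Because the transformation depends only on ranks, it preserves the ordering of the original variable while giving it an approximately normal marginal distribution.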

ttacail

isobxr:Stable Isotope Box Modelling in R

A set of functions to run simple and composite box-models to describe the dynamic or static distribution of stable isotopes in open or closed systems. The package also allows the sweeping of many parameters in both static and dynamic conditions. The mathematical models used in this package are derived from Albarede, 1995, Introduction to Geochemical Modelling, Cambridge University Press, Cambridge <doi:10.1017/CBO9780511622960>.

Maintained by Theo Tacail. Last updated 11 months ago.

1 stars 4.00 score 2 scripts

flr

ss3om:Tools for Conditioning Fisheries Operating Models Using Stock Synthesis 3

Tools for loading Stock Synthesis (SS3) models into FLR. Used in conditioning of Operating Models based on SS3 by considering structural uncertainty in input parameters and assumptions. A grid of SS3 runs can be created and results loaded on objects of various FLR classes.

Maintained by Iago Mosqueira. Last updated 2 months ago.

fisheries ss3 flr

3.94 score 44 scripts

manueleleonelli

bnmonitor:An Implementation of Sensitivity Analysis in Bayesian Networks

An implementation of sensitivity and robustness methods in Bayesian networks in R. It includes methods to perform parameter variations via a variety of co-variation schemes, to compute sensitivity functions and to quantify the dissimilarity of two Bayesian networks via distances and divergences. It further includes diagnostic methods to assess the goodness of fit of a Bayesian network to data, including global, node and parent-child monitors. Reference: M. Leonelli, R. Ramanathan, R.L. Wilkerson (2022) <doi:10.1016/j.knosys.2023.110882>.

Maintained by Manuele Leonelli. Last updated 6 months ago.

3 stars 3.92 score 14 scripts

alexchristensen

latentFactoR:Data Simulation Based on Latent Factors

Generates data based on latent factor models. Data can be continuous, polytomous, dichotomous, or mixed. Skews, cross-loadings, wording effects, population errors, and local dependencies can be added. All parameters can be manipulated. Data categorization is based on Garrido, Abad, and Ponsoda (2011) <doi:10.1177/0013164410389489>.

Maintained by Alexander Christensen. Last updated 8 months ago.

3 stars 3.88 score 2 scripts

paytonjjones

networktree:Recursive Partitioning of Network Models

Network trees recursively partition the data with respect to covariates. Two network tree algorithms are available: model-based trees based on a multivariate normal model and nonparametric trees based on covariance structures. After partitioning, correlation-based networks (psychometric networks) can be fit on the partitioned data. For details see Jones, Mair, Simon, & Zeileis (2020) <doi:10.1007/s11336-020-09731-4>.

Maintained by Payton Jones. Last updated 3 years ago.

network-analysis psychometrics tree-models

13 stars 3.85 score 11 scripts

bioc

flowVS:Variance stabilization in flow cytometry (and microarrays)

Per-channel variance stabilization from a collection of flow cytometry samples by Bartlett's test for homogeneity of variances. The approach is applicable to microarray data as well.

Maintained by Ariful Azad. Last updated 5 months ago.

immunooncology flowcytometry cellbasedassays microarray

3.82 score 11 scripts
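
Bartlett's test, on which the entry above relies, compares the pooled variance of several groups with the individual group variances. The sketch below computes the standard test statistic and, as an assumption for illustration only, picks an asinh cofactor from a hypothetical candidate list by minimising that statistic; the function names and the candidate grid are not taken from the package.

```python
import math
from statistics import variance


def bartlett_statistic(groups):
    """Bartlett's statistic for homogeneity of variances (about chi-square, k - 1 df)."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    variances = [variance(g) for g in groups]          # sample variances
    dof = sum(sizes) - k
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / dof
    numerator = dof * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1.0 + (sum(1.0 / (n - 1) for n in sizes) - 1.0 / dof) / (3.0 * (k - 1))
    return numerator / correction


def best_asinh_cofactor(groups, candidates):
    """Assumed selection rule: the cofactor whose asinh transform minimises the statistic."""
    def transformed(cofactor):
        return [[math.asinh(x / cofactor) for x in g] for g in groups]
    return min(candidates, key=lambda c: bartlett_statistic(transformed(c)))
```

A statistic near zero indicates that the group variances are close to equal, which is the condition a variance-stabilizing transformation aims for.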

dustinstoltz

text2map:R Tools for Text Matrices, Embeddings, and Networks

This is a collection of functions optimized for working with various kinds of text matrices. Focusing on the text matrix as the primary object - represented either as a base R dense matrix or a 'Matrix' package sparse matrix - allows for a consistent and intuitive interface that stays close to the underlying mathematical foundation of computational text analysis. In particular, the package includes functions for working with word embeddings, text networks, and document-term matrices. Methods developed in Stoltz and Taylor (2019) <doi:10.1007/s42001-019-00048-6>, Taylor and Stoltz (2020) <doi:10.1007/s42001-020-00075-8>, Taylor and Stoltz (2020) <doi:10.15195/v7.a23>, and Stoltz and Taylor (2021) <doi:10.1016/j.poetic.2021.101567>.

Maintained by Dustin Stoltz. Last updated 4 months ago.

3.82 score 22 scripts

sciurus365

quadVAR:Quadratic Vector Autoregression

Estimate quadratic vector autoregression models with the strong hierarchy using the Regularization Algorithm under Marginality Principle (RAMP) by Hao et al. (2018) <doi:10.1080/01621459.2016.1264956>, compare the performance with linear models, and construct networks with partial derivatives.

Maintained by Jingmeng Cui. Last updated 2 months ago.

3.78 score 3 scripts

cran

fastml:Fast Machine Learning Model Training and Evaluation

Streamlines the training, evaluation, and comparison of multiple machine learning models with minimal code by providing comprehensive data preprocessing and support for a wide range of algorithms with hyperparameter tuning. It offers performance metrics and visualization tools to facilitate efficient and effective machine learning workflows.

Maintained by Selcuk Korkmaz. Last updated 23 days ago.

3.76 score

mikehellstern

netgsa:Network-Based Gene Set Analysis

Carry out Network-based Gene Set Analysis by incorporating external information about interactions among genes, as well as novel interactions learned from data. Implements methods described in Shojaie A, Michailidis G (2010) <doi:10.1093/biomet/asq038>, Shojaie A, Michailidis G (2009) <doi:10.1089/cmb.2008.0081>, and Ma J, Shojaie A, Michailidis G (2016) <doi:10.1093/bioinformatics/btw410>

Maintained by Michael Hellstern. Last updated 3 years ago.

cpp

4 stars 3.75 score 28 scripts

xinkaidupsy

IVPP:Invariance Partial Pruning Test

An implementation of the Invariance Partial Pruning (IVPP) approach described in Du, X., Johnson, S. U., Epskamp, S. (2025) The Invariance Partial Pruning Approach to The Network Comparison in Longitudinal Data. IVPP is a two-step method that first tests for global network structural differences with an invariance test and then inspects specific edge differences with partial pruning.

Maintained by Xinkai Du. Last updated 4 days ago.

3.74 score 7 scripts

biometry

tapnet:Trait Matching and Abundance for Predicting Bipartite Networks

Functions to produce, fit and predict from bipartite networks with abundance, trait and phylogenetic information. Its methods are described in detail in Benadi, G., Dormann, C.F., Fruend, J., Stephan, R. & Vazquez, D.P. (2021) Quantitative prediction of interactions in bipartite networks based on traits, abundances, and phylogeny. The American Naturalist, in press.

Maintained by Carsten Dormann. Last updated 6 months ago.

1 stars 3.70 score 2 scripts

fbertran

bootPLS:Bootstrap Hyperparameter Selection for PLS Models and Extensions

Several implementations of non-parametric stable bootstrap-based techniques to determine the numbers of components for Partial Least Squares linear or generalized linear regression models as well as sparse Partial Least Squares linear or generalized linear regression models. The package collects techniques that were published in a book chapter (Magnanensi et al. 2016, 'The Multiple Facets of Partial Least Squares and Related Methods', <doi:10.1007/978-3-319-40643-5_18>) and two articles (Magnanensi et al. 2017, 'Statistics and Computing', <doi:10.1007/s11222-016-9651-4>) and (Magnanensi et al. 2021, 'Frontiers in Applied Mathematics and Statistics', <doi:10.3389/fams.2021.693126>).

Maintained by Frederic Bertrand. Last updated 6 months ago.

1 stars 3.70 score 4 scripts

bioc

cypress:Cell-Type-Specific Power Assessment

CYPRESS is a cell-type-specific power tool. This package aims to perform power analysis for cell-type-specific data. It calculates FDR, FDC, and power under various study design parameters, including but not limited to sample size and effect size. It takes as input a SummarizedExperiment (SE) object with observed mixture data (feature by sample matrix) and the cell-type mixture proportions (sample by cell-type matrix). It can solve the cell-type mixture proportions from the reference-free panel from TOAST and conduct tests to identify cell-type-specific differential expression (csDE) genes.

Maintained by Shilin Yu. Last updated 5 months ago.

software geneexpression dataimport rnaseq sequencing

1 stars 3.70 score 2 scripts

haowang47

PCGII:Partial Correlation Graph with Information Incorporation

Large-scale gene expression studies allow gene network construction to uncover associations among genes. This package is developed for estimating and testing partial correlation graphs with prior information incorporated.

Maintained by Hao Wang. Last updated 1 years ago.

1 stars 3.70 score 10 scripts

dswatson

leakyIV:Leaky Instrumental Variables

Instrumental variables (IVs) are a popular and powerful tool for estimating causal effects in the presence of unobserved confounding. However, classical methods rely on strong assumptions such as the exclusion criterion, which states that instrumental effects must be entirely mediated by treatments. In the so-called "leaky" IV setting, candidate instruments are allowed to have some direct influence on outcomes, rendering the average treatment effect (ATE) unidentifiable. But with limits on the amount of information leakage, we may still recover sharp bounds on the ATE, providing partial identification. This package implements methods for ATE bounding in the leaky IV setting with linear structural equations. For details, see Watson et al. (2024) <doi:10.48550/arXiv.2404.04446>.

Maintained by David S. Watson. Last updated 11 months ago.

1 stars 3.70 score 1 scripts

richardkwo

eff2:Efficient Least Squares for Total Causal Effects

Estimate a total causal effect from observational data under linearity and causal sufficiency. The observational data is supposed to be generated from a linear structural equation model (SEM) with independent and additive noise. The underlying causal DAG associated with the SEM is required to be known up to a maximally oriented partially directed graph (MPDAG), which is a general class of graphs consisting of both directed and undirected edges, including CPDAGs (i.e., essential graphs) and DAGs. Such graphs are usually obtained with structure learning algorithms with added background knowledge. The program is able to estimate every identified effect, including single and multiple treatment variables. Moreover, the resulting estimate has the minimal asymptotic covariance (and hence shortest confidence intervals) among all estimators that are based on the sample covariance.

Maintained by Richard Guo. Last updated 1 years ago.

3.70 score 3 scripts

bips-hb

micd:Multiple Imputation in Causal Graph Discovery

Modified functions of the package 'pcalg' and some additional functions to run the PC and the FCI (Fast Causal Inference) algorithm for constraint-based causal discovery in incomplete and multiply imputed datasets. Foraita R, Friemel J, Günther K, Behrens T, Bullerdiek J, Nimzyk R, Ahrens W, Didelez V (2020) <doi:10.1111/rssa.12565>; Andrews RM, Foraita R, Didelez V, Witte J (2021) <arXiv:2108.13395>; Witte J, Foraita R, Didelez V (2022) <doi:10.1002/sim.9535>.

Maintained by Ronja Foraita. Last updated 2 years ago.

causal-discovery graphical-models multiple-imputation

5 stars 3.70 score 20 scripts

argeorgeson

phantSEM:Create Phantom Variables in Structural Equation Models for Sensitivity Analyses

Create phantom variables, which are variables that were not observed, for the purpose of sensitivity analyses for structural equation models. The package makes it easier for a user to test different combinations of covariances between the phantom variable(s) and observed variables. The package may be used to assess a model's or effect's sensitivity to temporal bias (e.g., if cross-sectional data were collected) or confounding bias.

Maintained by Alexis Georgeson. Last updated 5 months ago.

3.70 score 7 scripts

mw201608

COMBAT:A Combined Association Test for Genes using Summary Statistics

Genome-wide association studies (GWAS) have been widely used for identifying common variants associated with complex diseases. Due to the small effect sizes of common variants, the power to detect individual risk variants is generally low. Complementary to SNP-level analysis, a variety of gene-based association tests have been proposed. However, the power of existing gene-based tests is often dependent on the underlying genetic models, and it is not known a priori which test is optimal. Here we proposed COMBined Association Test (COMBAT) to incorporate strengths from multiple existing gene-based tests, including VEGAS, GATES and simpleM. Compared to individual tests, COMBAT shows higher overall performance and robustness across a wide range of genetic models. The algorithm behind this method is described in Wang et al (2017) <doi:10.1534/genetics.117.300257>.

Maintained by Minghui Wang. Last updated 3 years ago.

3.70 score 7 scripts

martinrd3d

PCRA:Companion to Portfolio Construction and Risk Analysis

A collection of functions and data sets that support teaching a quantitative finance MS level course on Portfolio Construction and Risk Analysis, and the writing of a textbook for such a course. The package is unique in providing several real-world data sets that may be used for problem assignments and student projects. The data sets include cross-sections of stock data from the Center for Research on Security Prices, LLC (CRSP), corresponding factor exposures data from S&P Global, and several SP500 data sets.

Maintained by Doug Martin. Last updated 2 years ago.

3.67 score 94 scripts

suren-rathnayake

deepgmm:Deep Gaussian Mixture Models

Deep Gaussian mixture models as proposed by Viroli and McLachlan (2019) <doi:10.1007/s11222-017-9793-z> provide a generalization of classical Gaussian mixtures to multiple layers. Each layer contains a set of latent variables that follow a mixture of Gaussian distributions. To avoid overparameterized solutions, dimension reduction is applied at each layer by way of factor models.

Maintained by Suren Rathnayake. Last updated 2 years ago.

clustering deep-learning mixed-models

9 stars 3.65 score 8 scripts

fkgruber

SID:Structural Intervention Distance

The code computes the structural intervention distance (SID) between a true directed acyclic graph (DAG) and an estimated DAG. Definition and details about the implementation can be found in J. Peters and P. Bühlmann: "Structural intervention distance (SID) for evaluating causal graphs", Neural Computation 27, pages 771-799, 2015.

Maintained by Fred Gruber. Last updated 1 years ago.

2 stars 3.62 score 21 scripts

paullabonne

BayesMultiMode:Bayesian Mode Inference

A two-step Bayesian approach for mode inference following Cross, Hoogerheide, Labonne and van Dijk (2024) <doi:10.1016/j.econlet.2024.111579>. First, a mixture distribution is fitted on the data using a sparse finite mixture (SFM) Markov chain Monte Carlo (MCMC) algorithm. The number of mixture components does not have to be known; the size of the mixture is estimated endogenously through the SFM approach. Second, the modes of the estimated mixture at each MCMC draw are retrieved using algorithms specifically tailored for mode detection. These estimates are then used to construct posterior probabilities for the number of modes, their locations and uncertainties, providing a powerful tool for mode inference.

Maintained by Paul Labonne. Last updated 5 months ago.

1 stars 3.60 score 8 scripts

bips-hb

tpc:Tiered PC Algorithm

Constraint-based causal discovery using the PC algorithm while accounting for a partial node ordering, for example a partial temporal ordering when the data were collected in different waves of a cohort study. Andrews RM, Foraita R, Didelez V, Witte J (2021) <arXiv:2108.13395> provide a guide on how to use tpc to analyse cohort data.

Maintained by Ronja Foraita. Last updated 2 years ago.

causal-discovery cohort-analysis graphical-models

5 stars 3.60 score 16 scripts

henry-heppe

adproclus:Additive Profile Clustering Algorithms

Obtain overlapping clustering models for object-by-variable data matrices using the Additive Profile Clustering (ADPROCLUS) method. Also contains the low dimensional ADPROCLUS method for simultaneous dimension reduction and overlapping clustering. For reference see Depril, Van Mechelen, Mirkin (2008) <doi:10.1016/j.csda.2008.04.014> and Depril, Van Mechelen, Wilderjans (2012) <doi:10.1007/s00357-012-9112-5>.

Maintained by Henry Heppe. Last updated 7 months ago.

2 stars 3.60 score 2 scripts

mihaiconstantin

powerly:Sample Size Analysis for Psychological Networks and More

An implementation of the sample size computation method for network models proposed by Constantin et al. (2021) <doi:10.31234/osf.io/j5v7u>. The implementation takes the form of a three-step recursive algorithm designed to find an optimal sample size given a model specification and a performance measure of interest. It starts with a Monte Carlo simulation step for computing the performance measure and a statistic at various sample sizes selected from an initial sample size range. It continues with a monotone curve-fitting step for interpolating the statistic across the entire sample size range. The final step employs stratified bootstrapping to quantify the uncertainty around the fitted curve.

Maintained by Mihai Constantin. Last updated 2 years ago.

network-models power-analysis psychology sample-size-calculation

8 stars 3.60 score 3 scripts

jglev

veccompare:Perform Set Operations on Vectors, Automatically Generating All n-Wise Comparisons, and Create Markdown Output

Automates set operations (i.e., comparisons of overlap) between multiple vectors. It also contains a function for automating reporting in 'RMarkdown', by generating markdown output for easy analysis, as well as an 'RMarkdown' template for use with 'RStudio'.

Maintained by Jacob Gerard Levernier. Last updated 8 years ago.

8 stars 3.60 score 10 scripts
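The n-wise comparisons that the veccompare entry above describes can be sketched in a few lines. This is only an illustration of the idea in Python; the function name and the input format are hypothetical and do not reflect the package's interface.

```python
from itertools import combinations


def n_wise_overlaps(named_vectors):
    """Intersect every combination of two or more named vectors.

    Returns a dict mapping a tuple of vector names to the set of elements
    shared by all vectors in that tuple.
    """
    sets = {name: set(values) for name, values in named_vectors.items()}
    overlaps = {}
    for size in range(2, len(sets) + 1):
        for names in combinations(sets, size):
            overlaps[names] = set.intersection(*(sets[n] for n in names))
    return overlaps


example = n_wise_overlaps({"a": [1, 2, 3], "b": [2, 3, 4], "c": [3, 4, 5]})
```

With three vectors this yields three pairwise overlaps and one three-way overlap; the number of comparisons grows quickly with the number of vectors, which is the bookkeeping such a tool automates.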

bbuchsbaum

multivarious:Extensible Data Structures for Multivariate Analysis

Provides a set of basic and extensible data structures and functions for multivariate analysis, including dimensionality reduction techniques, projection methods, and preprocessing functions. The aim of this package is to offer a flexible and user-friendly framework for multivariate analysis that can be easily extended for custom requirements and specific data analysis tasks.

Maintained by Bradley Buchsbaum. Last updated 3 months ago.

3.53 score 17 scripts

bioc

trigger:Transcriptional Regulatory Inference from Genetics of Gene ExpRession

This R package provides tools for the statistical analysis of integrative genomic data that involve some combination of: genotypes, high-dimensional intermediate traits (e.g., gene expression, protein abundance), and higher-order traits (phenotypes). The package includes functions to: (1) construct global linkage maps between genetic markers and gene expression; (2) analyze multiple-locus linkage (epistasis) for gene expression; (3) quantify the proportion of genome-wide variation explained by each locus and identify eQTL hotspots; (4) estimate pair-wise causal gene regulatory probabilities and construct gene regulatory networks; and (5) identify causal genes for a quantitative trait of interest.

Maintained by John D. Storey. Last updated 4 days ago.

geneexpression snp geneticvariability microarray genetics

3.48 score 3 scripts

donaldrwilliams

GGMnonreg:Non-Regularized Gaussian Graphical Models

Estimate non-regularized Gaussian graphical models, Ising models, and mixed graphical models. The current methods consist of multiple regression, a non-parametric bootstrap <doi:10.1080/00273171.2019.1575716>, and Fisher z transformed partial correlations <doi:10.1111/bmsp.12173>. Parameter uncertainty, predictability, and network replicability <doi:10.31234/osf.io/fb4sa> are also implemented.

Maintained by Donald Williams. Last updated 3 years ago.

6 stars 3.48 score 4 scripts
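To make the "Fisher z transformed partial correlations" mentioned in the GGMnonreg entry above concrete, here is a minimal sketch for the three-variable case. It assumes the standard first-order partial correlation formula and a normal approximation for the test; the function names are hypothetical and this is not the package's implementation.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y after conditioning on a single variable z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def fisher_z_test(r, n, n_conditioning):
    """Two-sided test of a zero (partial) correlation via Fisher's z.

    n is the sample size and n_conditioning the number of variables
    conditioned on; the standard error is 1 / sqrt(n - n_conditioning - 3).
    """
    z = math.atanh(r)
    statistic = z * math.sqrt(n - n_conditioning - 3)
    p_value = math.erfc(abs(statistic) / math.sqrt(2))  # two-sided normal p-value
    return statistic, p_value


r = partial_correlation(0.6, 0.3, 0.2)
statistic, p_value = fisher_z_test(r, n=200, n_conditioning=1)
```

In a non-regularized network, an edge between x and y would be kept only if this test rejects the hypothesis of a zero partial correlation at the chosen significance level.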

fhui28

boral:Bayesian Ordination and Regression AnaLysis

Bayesian approaches for analyzing multivariate data in ecology. Estimation is performed using Markov Chain Monte Carlo (MCMC) methods via JAGS. Three types of models may be fitted: 1) With explanatory variables only, boral fits independent column Generalized Linear Models (GLMs) to each column of the response matrix; 2) With latent variables only, boral fits a purely latent variable model for model-based unconstrained ordination; 3) With explanatory and latent variables, boral fits correlated column GLMs with latent variables to account for any residual correlation between the columns of the response matrix.

Maintained by Francis K.C. Hui. Last updated 1 years ago.

jags cpp

2 stars 3.45 score 79 scripts

fbertran

penalizedSVM:Feature Selection SVM using Penalty Functions

Support Vector Machine (SVM) classification with simultaneous feature selection using penalty functions is implemented. The smoothly clipped absolute deviation (SCAD), 'L1-norm', 'Elastic Net' ('L1-norm' and 'L2-norm') and 'Elastic SCAD' (SCAD and 'L2-norm') penalties are available. The tuning parameters can be found using either a fixed grid or an interval search.

Maintained by Frederic Bertrand. Last updated 2 years ago.

1 stars 3.36 score 76 scripts 1 dependents

michhernand

RelimpPCR:Relative Importance PCA Regression

Performs Principal Components Analysis (also known as PCA) dimensionality reduction in the context of a linear regression. In most cases, PCA dimensionality reduction is performed independent of the response variable for a regression. This captures the majority of the variance of the model's predictors, but may not actually be the optimal dimensionality reduction solution for a regression against the response variable. An alternative method, optimized for a regression against the response variable, is to use both PCA and a relative importance measure. This package applies PCA to a given data frame of predictors, and then calculates the relative importance of each PCA factor against the response variable. It outputs ordered factors that are optimized for model fit. By performing dimensionality reduction with this method, an individual can achieve the same r-squared value as performing just PCA, but with fewer PCA factors. References: Yuri Balasanov (2017) <https://ilykei.com>.

Maintained by Michael Hernandez. Last updated 12 months ago.

pca regression

2 stars 3.30 score 2 scripts

dcauseur

ERP:Significance Analysis of Event-Related Potentials Data

Functions for signal detection and identification designed for Event-Related Potentials (ERP) data in a linear model framework. The functional F-test proposed in Causeur, Sheu, Perthame, Rufini (2018, submitted) for analysis of variance issues in ERP designs is implemented for signal detection (tests for mean difference among groups of curves in One-way ANOVA designs for example). Once an experimental effect is declared significant, identification of significant intervals is achieved by the multiple testing procedures reviewed and compared in Sheu, Perthame, Lee and Causeur (2016, <DOI:10.1214/15-AOAS888>). Some of the methods gathered in the package are the classical FDR- and FWER-controlling procedures, also available using function p.adjust. The package also implements the Guthrie-Buchwald procedure (Guthrie and Buchwald, 1991 <DOI:10.1111/j.1469-8986.1991.tb00417.x>), which accounts for the auto-correlation among t-tests to control erroneous detection of short intervals. The Adaptive Factor-Adjustment method is an extension of the method described in Causeur, Chu, Hsieh and Sheu (2012, <DOI:10.3758/s13428-012-0230-0>). It assumes a factor model for the correlation among tests and combines adaptively the estimation of the signal and the updating of the dependence modelling (see Sheu et al., 2016, <DOI:10.1214/15-AOAS888> for further details).

Maintained by David Causeur. Last updated 5 years ago.

3.30 score 20 scripts

bioc

cpvSNP:Gene set analysis methods for SNP association p-values that lie in genes in given gene sets

Gene set analysis methods exist to combine SNP-level association p-values into gene sets, calculating a single association p-value for each gene set. This package implements two such methods that require only the calculated SNP p-values, the gene set(s) of interest, and a correlation matrix (if desired). One method (GLOSSI) requires independent SNPs and the other (VEGAS) can take into account correlation (LD) among the SNPs. Built-in plotting functions are available to help users visualize results.

Maintained by Caitlin McHugh. Last updated 5 months ago.

genetics statisticalmethod pathways genesetenrichment genomicvariation

3.30 score 3 scripts

finyang

flap:Forecast Linear Augmented Projection

The Forecast Linear Augmented Projection (flap) method reduces forecast variance by adjusting the forecasts of multivariate time series to be consistent with the forecasts of linear combinations (components) of the series by projecting all forecasts onto the space where the linear constraints are satisfied. The forecast variance can be reduced monotonically by including more components. For a given number of components, the flap method achieves maximum forecast variance reduction among linear projections.

Maintained by Yangzhuoran Fin Yang. Last updated 9 months ago.

1 stars 3.30 score 2 scripts

polinasuter

BiDAG:Bayesian Inference for Directed Acyclic Graphs

Implementation of a collection of MCMC methods for Bayesian structure learning of directed acyclic graphs (DAGs), both from continuous and discrete data. For efficient inference on larger DAGs, the space of DAGs is pruned according to the data. To filter the search space, the algorithm employs a hybrid approach, combining constraint-based learning with search and score. A reduced search space is initially defined on the basis of a skeleton obtained by means of the PC-algorithm, and then iteratively improved with search and score. Search and score is then performed following two approaches: Order MCMC, or Partition MCMC. The BGe score is implemented for continuous data and the BDe score is implemented for binary data or categorical data. The algorithms may provide the maximum a posteriori (MAP) graph or a sample (a collection of DAGs) from the posterior distribution given the data. All algorithms are also applicable for structure learning and sampling for dynamic Bayesian networks. References: J. Kuipers, P. Suter, G. Moffa (2022) <doi:10.1080/10618600.2021.2020127>, N. Friedman and D. Koller (2003) <doi:10.1023/A:1020249912095>, J. Kuipers and G. Moffa (2017) <doi:10.1080/01621459.2015.1133426>, M. Kalisch et al. (2012) <doi:10.18637/jss.v047.i11>, D. Geiger and D. Heckerman (2002) <doi:10.1214/aos/1035844981>, P. Suter, J. Kuipers, G. Moffa, N.Beerenwinkel (2023) <doi:10.18637/jss.v105.i09>.

Maintained by Polina Suter. Last updated 2 years ago.

cpp

4 stars 3.29 score 81 scripts 2 dependents

annaltyler

cape:Combined Analysis of Pleiotropy and Epistasis for Diversity Outbred Mice

Combined Analysis of Pleiotropy and Epistasis infers predictive networks between genetic variants and phenotypes. It can be used with standard two-parent populations as well as multi-parent populations, such as the Diversity Outbred (DO) mice, Collaborative Cross (CC) mice, or the multi-parent advanced generation intercross (MAGIC) population of Arabidopsis thaliana. It uses complementary information of pleiotropic gene variants across different phenotypes to resolve models of epistatic interactions between alleles. To do this, cape reparametrizes main effect and interaction coefficients from pairwise variant regressions into directed influence parameters. These parameters describe how alleles influence each other, in terms of suppression and enhancement, as well as how gene variants influence phenotypes. All of the final interactions are reported as directed interactions between pairs of parental alleles. For detailed descriptions of the methods used in this package please see the following references. Carter, G. W., Hays, M., Sherman, A. & Galitski, T. (2012) <doi:10.1371/journal.pgen.1003010>. Tyler, A. L., Lu, W., Hendrick, J. J., Philip, V. M. & Carter, G. W. (2013) <doi:10.1371/journal.pcbi.1003270>.

Maintained by Anna Tyler. Last updated 1 years ago.

3.27 score 37 scripts

cran

sda:Shrinkage Discriminant Analysis and CAT Score Variable Selection

Provides an efficient framework for high-dimensional linear and diagonal discriminant analysis with variable selection. The classifier is trained using James-Stein-type shrinkage estimators and predictor variables are ranked using correlation-adjusted t-scores (CAT scores). Variable selection error is controlled using false non-discovery rates or higher criticism.

Maintained by Korbinian Strimmer. Last updated 3 years ago.

3.21 score 3 dependents

cran

GeneNet:Modeling and Inferring Gene Networks

Analyzes gene expression (time series) data with focus on the inference of gene networks. In particular, GeneNet implements the methods of Schaefer and Strimmer (2005a,b,c) and Opgen-Rhein and Strimmer (2006, 2007) for learning large-scale gene association networks (including assignment of putative directions).

Maintained by Korbinian Strimmer. Last updated 3 years ago.

3.18 score 5 dependents

namgillee

VARshrink:Shrinkage Estimation Methods for Vector Autoregressive Models

The vector autoregressive (VAR) model is a fundamental and effective approach for multivariate time series analysis. Shrinkage estimation methods can be applied to high-dimensional VAR models with dimensionality greater than the number of observations, contrary to the standard ordinary least squares method. This package is an integrative package delivering nonparametric, parametric, and semiparametric methods in a unified and consistent manner, such as the multivariate ridge regression in Golub, Heath, and Wahba (1979) <doi:10.2307/1268518>, a James-Stein type nonparametric shrinkage method in Opgen-Rhein and Strimmer (2007) <doi:10.1186/1471-2105-8-S2-S3>, and Bayesian estimation methods using noninformative and informative priors in Lee, Choi, and S.-H. Kim (2016) <doi:10.1016/j.csda.2016.03.007> and Ni and Sun (2005) <doi:10.1198/073500104000000622>.

Maintained by Namgil Lee. Last updated 5 years ago.

3 stars 3.18 score 6 scripts

gwpcor

gwpcormapper:Geographically Weighted Partial Correlation Mapper

An interactive mapping tool for geographically weighted correlation and partial correlation. Geographically weighted partial correlation coefficients are calculated following (Percival and Tsutsumida, 2017)<doi:10.1553/giscience2017_01_s36> and are described in greater detail in (Tsutsumida et al., 2019)<doi:10.5194/ica-abs-1-372-2019> and (Percival et al., 2021)<arXiv:2101.03491>.

Maintained by Joseph Emile Honour Percival. Last updated 3 years ago.

cpp

3 stars 3.18 score 1 scripts

jmbh

fspe:Estimating the Number of Factors in EFA with Out-of-Sample Prediction Errors

Estimating the number of factors in Exploratory Factor Analysis (EFA) with out-of-sample prediction errors using a cross-validation scheme. Haslbeck & van Bork (Preprint) <https://psyarxiv.com/qktsd>.

Maintained by Jonas Haslbeck. Last updated 2 years ago.

1 stars 3.18 score 2 scripts 1 dependents

vwendy

pompom:Person-Oriented Method and Perturbation on the Model

An implementation of a hybrid method of person-oriented method and perturbation on the model. Pompom is the initials of the two methods. The hybrid method will provide a multivariate intraindividual variability metric (iRAM). The person-oriented method used in this package refers to uSEM (unified structural equation modeling, see Kim et al., 2007, Gates et al., 2010 and Gates et al., 2012 for details). Perturbation on the model was conducted according to impulse response analysis introduced in Lutkepohl (2007). Kim, J., Zhu, W., Chang, L., Bentler, P. M., & Ernst, T. (2007) <doi:10.1002/hbm.20259>. Gates, K. M., Molenaar, P. C. M., Hillary, F. G., Ram, N., & Rovine, M. J. (2010) <doi:10.1016/j.neuroimage.2009.12.117>. Gates, K. M., & Molenaar, P. C. M. (2012) <doi:10.1016/j.neuroimage.2012.06.026>. Lutkepohl, H. (2007, ISBN:3540262393).

Maintained by Xiao Yang. Last updated 4 years ago.

3.08 score 24 scripts

epertham

xLLiM:High Dimensional Locally-Linear Mapping

Provides a tool for non linear mapping (non linear regression) using a mixture of regression model and an inverse regression strategy. The methods include the GLLiM model (see Deleforge et al (2015) <DOI:10.1007/s11222-014-9461-5>) based on Gaussian mixtures and a robust version of GLLiM, named SLLiM (see Perthame et al (2016) <DOI:10.1016/j.jmva.2017.09.009>) based on a mixture of Generalized Student distributions. The methods also include BLLiM (see Devijver et al (2017) <arXiv:1701.07899>) which is an extension of GLLiM with a sparse block diagonal structure for large covariance matrices (particularly interesting for transcriptomic data).

Maintained by Emeline Perthame. Last updated 1 years ago.

mixomics

1 stars 3.02 score 21 scripts

marcvidalbadia

pfica:Independent Components Analysis Techniques for Functional Data

This package includes a set of tools to perform smoothed (and non-smoothed) principal/independent components analysis of functional data. Various functional pre-whitening approaches are implemented as discussed in Vidal and Aguilera (2022) "Novel whitening approaches in functional settings", <doi:10.1002/sta4.516>. Further whitening representations of functional data can be derived in terms of a few principal components, providing a powerful avenue to explore hidden structures in low dimensional settings: see Vidal, Rosso and Aguilera (2021) "Bi-smoothed functional independent component analysis for EEG artifact removal", <doi:10.3390/math9111243>.

Maintained by Marc Vidal. Last updated 2 years ago.

b-splines fobi ica kurtosis penalization

2 stars 3.00 score 3 scripts

bips-hb

SRSim:Spontaneous Reporting Simulator (SRSim)

A package for simulating spontaneous reporting data as used in the field of pharmacovigilance.

Maintained by Louis Dijkstra. Last updated 2 months ago.

binary-data pharmacovigilance simulator cpp

5 stars 3.00 score 4 scripts

cran

smdi:Perform Structural Missing Data Investigations

An easy to use implementation of routine structural missing data diagnostics with functions to visualize the proportions of missing observations, investigate missing data patterns and conduct various empirical missing data diagnostic tests. Reference: Weberpals J, Raman SR, Shaw PA, Lee H, Hammill BG, Toh S, Connolly JG, Dandreo KJ, Tian F, Liu W, Li J, Hernández-Muñoz JJ, Glynn RJ, Desai RJ. smdi: an R package to perform structural missing data investigations on partially observed confounders in real-world evidence studies. JAMIA Open. 2024 Jan 31;7(1):ooae008. <doi:10.1093/jamiaopen/ooae008>.

Maintained by Janick Weberpals. Last updated 6 months ago.

3.00 score

robustport

facmodTS:Time Series Models for Asset Returns

Supports teaching methods of estimating and testing time series models for use in robust portfolio construction and analysis. Unique in providing not only classical least squares, but also modern robust model fitting methods which are not much influenced by outliers. Includes returns and risk decompositions, with user choice of standard deviation, value-at-risk, and expected shortfall risk measures. "Robust Statistics Theory and Methods (with R)", R. A. Maronna, R. D. Martin, V. J. Yohai, M. Salibian-Barrera (2019) <doi:10.1002/9781119214656>.

Maintained by Doug Martin. Last updated 22 days ago.

1 stars 3.00 score

niklaspfister

StabilizedRegression:Stabilizing Regression and Variable Selection

Contains an implementation of 'StabilizedRegression', a regression framework for heterogeneous data introduced in Pfister et al. (2021) <arXiv:1911.01850>. The procedure uses averaging to estimate a regression of a set of predictors X on a response variable Y by enforcing stability with respect to a given environment variable. The resulting regression leads to a variable selection procedure that makes it possible to distinguish between stable and unstable predictors. The package further implements a visualization technique which illustrates the trade-off between stability and predictiveness of individual predictors.

Maintained by Niklas Pfister. Last updated 3 years ago.

2 stars 3.00 score 1 scripts

samaneh-bioinformatics

BNrich:Pathway Enrichment Analysis Based on Bayesian Network

A novel pathway enrichment analysis package based on Bayesian networks that investigates the topological features of pathways, following Maleknia et al. (2020) <doi:10.1101/2020.01.13.905448>. First, 187 Kyoto Encyclopedia of Genes and Genomes (KEGG) human non-metabolic pathways, whose cycles were eliminated by a biological approach, enter the analysis as Bayesian network structures. The constructed Bayesian networks are then optimized by the Least Absolute Shrinkage and Selection Operator (lasso), and the parameters are learned from gene expression data. Finally, the impacted pathways are enriched by Fisher’s Exact Test on significant parameters.

Maintained by Samaneh Maleknia. Last updated 5 years ago.

networkenrichment geneexpression pathways bayesian kegg

3.00 score

pilacuan-bonete-luis

LDABiplots:Biplot Graphical Interface for LDA Models

Provides a web-based graphical user interface (GUI) for building biplot representations of news scraped from digital newspapers, using the Bayesian approach of Latent Dirichlet Allocation (LDA) and machine learning algorithms. Contains LDA methods described by Blei, David M., Andrew Y. Ng and Michael I. Jordan (2003) <https://jmlr.org/papers/volume3/blei03a/blei03a.pdf>, and biplot methods described by Gabriel K.R. (1971) <doi:10.1093/biomet/58.3.453> and Galindo-Villardon P. (1986) <https://diarium.usal.es/pgalindo/files/2012/07/Questiio.pdf>.

Maintained by Luis Pilacuan-Bonete. Last updated 3 years ago.

3.00 score 4 scripts

donaldrwilliams

IRCcheck:Irrepresentable Condition Check

Check the irrepresentable condition (IRC) in both L1-regularized regression <doi:10.1109/TIT.2006.883611> and Gaussian graphical models. The IRC requires that the important and unimportant variables are not correlated, at least not all that much, and it is necessary for consistent model selection. Exploring the IRC as a function of the number of variables, assumed sparsity, and effect size can provide valuable insights into the model selection properties of L1-regularization.
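As an illustration of what such a check involves, the sketch below computes one common population form of the irrepresentable condition for the lasso: with active set S, inactive set S^c and covariance matrix Sigma, the condition holds when the largest absolute entry of Sigma[S^c, S] · Sigma[S, S]^(-1) · sign(beta_S) is below 1. This is a minimal sketch in plain Python, not the IRCcheck API; the function names `solve`, `irc_statistic` and `make_sigma`, and the example matrices, are hypothetical and chosen only for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def irc_statistic(sigma, support, signs):
    """Largest |Sigma[j, S] . Sigma[S, S]^{-1} . signs| over inactive variables j.

    The irrepresentable condition (in the form assumed here) holds when
    the returned value is strictly below 1.
    """
    inactive = [j for j in range(len(sigma)) if j not in support]
    sigma_ss = [[sigma[i][j] for j in support] for i in support]
    w = solve(sigma_ss, signs)  # w = Sigma[S, S]^{-1} . signs
    return max(
        (abs(sum(sigma[j][s] * w[k] for k, s in enumerate(support))) for j in inactive),
        default=0.0,
    )


def make_sigma(r):
    """Hypothetical 3-variable example: variables 0 and 1 are active and
    uncorrelated; variable 2 is inactive and has correlation r with both."""
    return [[1.0, 0.0, r], [0.0, 1.0, r], [r, r, 1.0]]


weak = irc_statistic(make_sigma(0.3), [0, 1], [1.0, 1.0])
strong = irc_statistic(make_sigma(0.6), [0, 1], [1.0, 1.0])
print("r = 0.3:", weak, "condition holds:", weak < 1)
print("r = 0.6:", strong, "condition holds:", strong < 1)
```

In this toy example the statistic equals 2r, so the condition holds for weak correlation between the inactive and active variables and fails once that correlation reaches 0.5.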

Maintained by Donald Williams. Last updated 4 years ago.

2 stars 3.00 score 1 scripts

rchen18

RNGforGPD:Random Number Generation for Generalized Poisson Distribution

Generation of univariate and multivariate data that follow the generalized Poisson distribution. The details of the univariate part are explained in Demirtas (2017) <doi:10.1080/03610918.2014.968725>, and the multivariate part is an extension of the correlated Poisson data generation routine that was introduced in Yahav and Shmueli (2012) <doi:10.1002/asmb.901>.

Maintained by Ruizhe Chen. Last updated 4 years ago.

1 stars 3.00 score 11 scripts 3 dependents

vaudigier

clusterMI:Cluster Analysis with Missing Values by Multiple Imputation

Allows clustering of incomplete observations by addressing missing values using multiple imputation. To achieve this goal, the methodology consists of three steps, following Audigier and Niang (2022) <doi:10.1007/s11634-022-00519-1>. I) Missing data imputation using dedicated models. Four multiple imputation methods are proposed: two are based on joint modelling and two are fully sequential methods, as discussed in Audigier et al. (2021) <doi:10.48550/arXiv.2106.04424>. II) Cluster analysis of imputed data sets. Six clustering methods are available (distance-based or model-based), but custom methods can also be easily used. III) Partition pooling. The set of partitions is aggregated using a method based on Non-negative Matrix Factorization. An associated instability measure is computed by bootstrap (see Fang, Y. and Wang, J., 2012 <doi:10.1016/j.csda.2011.09.003>). Among other applications, this instability measure can be used to choose the number of clusters in the presence of missing values. The package also proposes several diagnostic tools to tune the number of imputed data sets, to tune the number of iterations in fully sequential imputation, to check the fit of imputation models, etc.

Maintained by Vincent Audigier. Last updated 1 months ago.

openblas cpp

2.90 score

adafede

sapid:A Strategy to Analyze Plant Extracts Taste In Depth

This package provides the infrastructure to implement a Strategy to Analyze Plant Extracts Taste In Depth.

Maintained by Adriano Rutz. Last updated 3 days ago.

computational metabolomics natural extracts taste

2.90 score

txm676

nos:Compute Node Overlap and Segregation in Ecological Networks

Calculate NOS (node overlap and segregation) and the associated metrics described in Strona and Veech (2015) <DOI:10.1111/2041-210X.12395> and Strona et al. (2017, In Press). The functions provided in the package enable assessment of structural patterns ranging from complete node segregation to perfect nestedness in a variety of network types. In addition, they provide a measure of network modularity.

Maintained by Thomas J. Matthews. Last updated 1 years ago.

2.88 score 15 scripts

granatumx

lilikoi:Metabolomics Personalized Pathway Analysis Tool

A comprehensive analysis tool for metabolomics data. It consists of a variety of functional modules, including several new modules: a pre-processing module for normalization and imputation, an exploratory data analysis module for dimension reduction and source of variation analysis, a classification module with the new deep-learning method and other machine-learning methods, a prognosis module with Cox-PH and neural-network based Cox-nnet methods, and a pathway analysis module to visualize the pathway and interpret metabolite-pathway relationships. References: H. Paul Benton <http://www.metabolomics-forum.com/index.php?topic=281.0> Jeff Xia <https://github.com/cangfengzhe/Metabo/blob/master/MetaboAnalyst/website/name_match.R> Travers Ching, Xun Zhu, Lana X. Garmire (2018) <doi:10.1371/journal.pcbi.1006076>.

Maintained by Lana Garmire. Last updated 2 years ago.

openjdk

1 stars 2.85 score 14 scripts

mjafin

GeneCycle:Identification of Periodically Expressed Genes

The GeneCycle package implements the approaches of Wichert et al. (2004) <doi:10.1093/bioinformatics/btg364>, Ahdesmaki et al. (2005) <doi:10.1186/1471-2105-6-117> and Ahdesmaki et al. (2007) <DOI:10.1186/1471-2105-8-233> for detecting periodically expressed genes from gene expression time series data.

Maintained by Miika Ahdesmaki. Last updated 4 years ago.

1 stars 2.81 score 64 scripts

cran

mixKernel:Omics Data Integration Using Kernel Methods

Kernel-based methods are powerful methods for integrating heterogeneous types of data. mixKernel aims at providing methods to combine kernels for unsupervised exploratory analysis. Different solutions are provided to compute a meta-kernel, in a consensus way or in a way that best preserves the original topology of the data. mixKernel also integrates kernel PCA to visualize similarities between samples in a nonlinear space and from the multiple source point of view <doi:10.1093/bioinformatics/btx682>. A method to select (as well as functions to display) important variables is also provided <doi:10.1093/nargab/lqac014>.

Maintained by Nathalie Vialaneix. Last updated 1 years ago.

2.78 score

helloworld9293

VARcpDetectOnline:Sequential Change Point Detection for High-Dimensional VAR Models

Implements the algorithm introduced in Tian, Y., and Safikhani, A. (2024) <doi:10.5705/ss.202024.0182>, "Sequential Change Point Detection in High-dimensional Vector Auto-regressive Models". This package provides tools for detecting change points in the transition matrices of VAR models, effectively identifying shifts in temporal and cross-correlations within high-dimensional time series data.

Maintained by Yuhan Tian. Last updated 2 months ago.

3 stars 2.78 score

jmanitz

NetOrigin:Origin Estimation for Propagation Processes on Complex Networks

Performs network-based source estimation. Different approaches are available: effective distance median, recursive backtracking, and centrality-based source estimation. Additionally, we provide public transportation network data as well as methods for data preparation, source estimation performance analysis and visualization.

Maintained by Juliane Manitz. Last updated 2 years ago.

2.74 score 11 scripts

rdinnager

fibre:Fast Evolutionary Trait Modelling on Phylogenies using Branch Regression Models

Implements Phylogenetic Branch Regression models which allow for flexible and versatile models of evolution along a phylogeny. The model can be used to detect shifts in rates of evolution along branches. The model uses a continuous and linear model structure and so can be easily combined with other non-phylogenetic statistical structures, as long as they are implemented using the R package INLA. One major use of this is to condition on phylogeny in a standard regression between two traits, thus 'accounting' for phylogenetic structure in the response variable, similar to how pgls is used but allowing for a more flexible phylogenetic model. This also allows the phylogenetic model to be combined with the spatial models that INLA excels at (and with comparable flexibility to those spatial models).

Maintained by Russell Dinnage. Last updated 4 months ago.

3 stars 2.71 score 34 scripts

katherineloor

RcppCensSpatial:Spatial Estimation and Prediction for Censored/Missing Responses

It provides functions to estimate parameters in linear spatial models with censored/missing responses via the Expectation-Maximization (EM), the Stochastic Approximation EM (SAEM), or the Monte Carlo EM (MCEM) algorithm. These algorithms are widely used to compute the maximum likelihood (ML) estimates in problems with incomplete data. The EM algorithm computes the ML estimates when a closed expression for the conditional expectation of the complete-data log-likelihood function is available. In the MCEM algorithm, the conditional expectation is substituted by a Monte Carlo approximation based on many independent simulations of the missing data. In contrast, the SAEM algorithm splits the E-step into simulation and integration steps. This package also approximates the standard error of the estimates using the Louis method. Moreover, it has a function that performs spatial prediction in new locations.

Maintained by Katherine A. L. Valeriano. Last updated 3 years ago.

openblas cpp openmp

2.70 score 1 scripts

cran

ref.ICAR:Objective Bayes Intrinsic Conditional Autoregressive Model for Areal Data

Implements an objective Bayes intrinsic conditional autoregressive prior. This model provides an objective Bayesian approach for modeling spatially correlated areal data using an intrinsic conditional autoregressive prior on a vector of spatial random effects.

Maintained by Erica M. Porter. Last updated 2 months ago.

2.70 score

mrijalussholihin

sae.prop:Small Area Estimation using Fay-Herriot Models with Additive Logistic Transformation

Implements the Additive Logistic Transformation (alr) for Small Area Estimation under the Fay-Herriot Model. Small Area Estimation is used to borrow strength from auxiliary variables to improve the effectiveness of a domain sample size. This package uses the Empirical Best Linear Unbiased Prediction (EBLUP) estimator. The Additive Logistic Transformation (alr) is based on the transformation by Aitchison J (1986). The covariance matrix for the multivariate application is based on the covariance matrix used by Esteban M, Lombardía M, López-Vizcaíno E, Morales D, and Pérez A <doi:10.1007/s11749-019-00688-w>. The non-sampled models are modified area-level models based on models proposed by Anisa R, Kurnia A, and Indahwati I <doi:10.9790/5728-10121519>, with the univariate model using model-3 and the multivariate model using model-1. The MSE is estimated using a Parametric Bootstrap approach. For non-sampled cases, the MSE is estimated using a modified approach proposed by Haris F and Ubaidillah A <doi:10.4108/eai.2-8-2019.2290339>.
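To make the transformation concrete, the sketch below shows the alr mapping in its usual textbook form: a vector of D positive proportions summing to 1 is mapped to D-1 unconstrained values log(p_i / p_D), and the inverse mapping recovers the proportions. It is a minimal illustration in plain Python, not the sae.prop interface; the function names `alr` and `alr_inverse`, the choice of the last component as reference, and the example proportions are assumptions made here.

```python
import math


def alr(proportions):
    """Map D positive proportions to D-1 log-ratios against the last component."""
    reference = proportions[-1]
    return [math.log(p / reference) for p in proportions[:-1]]


def alr_inverse(log_ratios):
    """Map D-1 real values back to D proportions that sum to 1."""
    exps = [math.exp(y) for y in log_ratios]
    denom = 1.0 + sum(exps)
    return [e / denom for e in exps] + [1.0 / denom]


# Hypothetical composition, e.g. shares of three categories in one area.
shares = [0.2, 0.3, 0.5]
transformed = alr(shares)
recovered = alr_inverse(transformed)
print("alr:", transformed)
print("back-transformed:", recovered)
```

Working on the transformed scale lets a linear area-level model be fitted without the sum-to-one constraint, and the inverse step returns predictions to the proportion scale.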

Maintained by M. Rijalus Sholihin. Last updated 3 years ago.

2.70 score