Showing 200 of total 335 results
bommert
stabm:Stability Measures for Feature Selection
An implementation of many measures for the assessment of the stability of feature selection. Both simple measures and measures which take into account the similarities between features are available, see Bommert (2020) <doi:10.17877/DE290R-21906>.
Maintained by Andrea Bommert. Last updated 2 years ago.
46.5 match 6 stars 6.25 score 33 scripts 3 dependents
myaseen208
stability:Stability Analysis of Genotype by Environment Interaction (GEI)
Functionalities to perform Stability Analysis of Genotype by Environment Interaction (GEI) to identify superior and stable genotypes under diverse environments. It performs Eberhart & Russell's ANOVA (1966) (<doi:10.2135/cropsci1966.0011183X000600010011x>), Finlay and Wilkinson (1963) Joint Linear Regression (<doi:10.1071/AR9630742>), Wricke (1962, 1964) Ecovalence, Shukla's stability variance parameter (1972) (<doi:10.1038/hdy.1972.87>) and Kang's (1991) (<doi:10.2134/agronj1991.00021962008300010037x>) simultaneous selection for high yielding and stable parameter.
Maintained by Muhammad Yaseen. Last updated 6 years ago.
68.3 match 3 stars 4.17 score 33 scripts 1 dependents
hofnerb
stabs:Stability Selection with Error Control
Resampling procedures to assess the stability of selected variables with additional finite sample error control for high-dimensional variable selection procedures such as Lasso or boosting. Both standard stability selection (Meinshausen & Buhlmann, 2010, <doi:10.1111/j.1467-9868.2010.00740.x>) and complementary pairs stability selection with improved error bounds (Shah & Samworth, 2013, <doi:10.1111/j.1467-9868.2011.01034.x>) are implemented. The package can be combined with arbitrary user specified variable selection approaches.
Maintained by Benjamin Hofner. Last updated 4 years ago.
machine-learning r-language resampling stability-selection variable-importance variable-selection
27.5 match 26 stars 9.59 score 53 scripts 31 dependents
core-bioinformatics
ClustAssess:Tools for Assessing Clustering
A set of tools for evaluating clustering robustness using proportion of ambiguously clustered pairs (Senbabaoglu et al. (2014) <doi:10.1038/srep06207>), as well as similarity across methods and method stability using element-centric clustering comparison (Gates et al. (2019) <doi:10.1038/s41598-019-44892-y>). Additionally, this package enables stability-based parameter assessment for graph-based clustering pipelines typical in single-cell data analysis.
Maintained by Andi Munteanu. Last updated 1 month ago.
software singlecell rnaseq atacseq normalization preprocessing dimensionreduction visualization qualitycontrol clustering classification annotation geneexpression differentialexpression bioinformatics genomics machine-learning parameter-optimization robustness single-cell unsupervised-learning cpp
24.2 match 23 stars 5.70 score 18 scripts
barbarabodinier
sharp:Stability-enHanced Approaches using Resampling Procedures
In stability selection (N Meinshausen, P Bühlmann (2010) <doi:10.1111/j.1467-9868.2010.00740.x>) and consensus clustering (S Monti et al (2003) <doi:10.1023/A:1023949509487>), resampling techniques are used to enhance the reliability of the results. In this package, hyper-parameters are calibrated by maximising model stability, which is measured under the null hypothesis that all selection (or co-membership) probabilities are identical (B Bodinier et al (2023a) <doi:10.1093/jrsssc/qlad058> and B Bodinier et al (2023b) <doi:10.1093/bioinformatics/btad635>). Functions are readily implemented for the use of LASSO regression, sparse PCA, sparse (group) PLS or graphical LASSO in stability selection, and hierarchical clustering, partitioning around medoids, K means or Gaussian mixture models in consensus clustering.
Maintained by Barbara Bodinier. Last updated 1 year ago.
21.1 match 13 stars 5.91 score 124 scripts
cran
agricolae:Statistical Procedures for Agricultural Research
The original idea was presented in the thesis "A statistical analysis tool for agricultural research", submitted for the degree of Master of Science at the National Engineering University (UNI), Lima, Peru. Some experimental data for the examples come from the CIP and other research. Agricolae offers extensive functionality on experimental design especially for agricultural and plant breeding experiments, which can also be useful for other purposes. It supports planning of lattice, Alpha, Cyclic, Complete Block, Latin Square, Graeco-Latin Squares, augmented block, factorial, split and strip plot designs. There are also various analysis facilities for experimental data, e.g. treatment comparison procedures, several non-parametric comparison tests, biodiversity indexes and consensus cluster.
Maintained by Felipe de Mendiburu. Last updated 1 year ago.
15.0 match 7 stars 7.01 score 15 dependents
bioc
DESeq2:Differential gene expression analysis based on the negative binomial distribution
Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.
Maintained by Michael Love. Last updated 11 days ago.
sequencing rnaseq chipseq geneexpression transcription normalization differentialexpression bayesian regression principalcomponent clustering immunooncology openblas cpp
6.4 match 375 stars 16.11 score 17k scripts 115 dependents
illustratien
toolStability:Tool for Stability Indices Calculation
Tools to calculate stability indices with parametric, non-parametric and probabilistic approaches. The basic data format requirement for 'toolStability' is a data frame with 3 columns including numeric trait values, genotype, and environmental labels. Output format of each function is the dataframe with chosen stability index for each genotype. Function "table_stability" offers the summary table of all stability indices in this package. This R package toolStability is part of the main publication: Wang, Casadebaig and Chen (2023) <doi:10.1007/s00122-023-04264-7>. Analysis pipeline for main publication can be found on github: <https://github.com/Illustratien/Wang_2023_TAAG/tree/V1.0.0>. Sample dataset in this package is derived from another publication: Casadebaig P, Zheng B, Chapman S et al. (2016) <doi:10.1371/journal.pone.0146385>. For detailed documentation of dataset, please see on Zenodo <doi:10.5281/zenodo.4729636>. Indices used in this package are from: Döring TF, Reckling M (2018) <doi:10.1016/j.eja.2018.06.007>. Eberhart SA, Russell WA (1966) <doi:10.2135/cropsci1966.0011183X000600010011x>. Eskridge KM (1990) <doi:10.2135/cropsci1990.0011183X003000020025x>. Finlay KW, Wilkinson GN (1963) <doi:10.1071/AR9630742>. Hanson WD (1970) Genotypic stability. <doi:10.1007/BF00285245>. Lin CS, Binns MR (1988) <https://cdnsciencepub.com/doi/abs/10.4141/cjps88-018>. Nassar R, Hühn M (1987). Pinthus MJ (1973) <doi:10.1007/BF00021563>. Römer T (1917). Shukla GK (1972). Wricke G (1962).
Maintained by Tien-Cheng Wang. Last updated 1 year ago.
analysis-package reproducible-research stability
24.9 match 1 stars 3.74 score 11 scripts
dbosak01
pkgdiff:Identifies Package Differences
Identifies differences between versions of a package. Specifically, the functions help determine if there are breaking changes from one package version to the next. The package also includes a stability assessment, to help you determine the overall stability of a package, or even an entire repository.
Maintained by David Bosak. Last updated 8 hours ago.
16.4 match 1 stars 5.26 score
accelstab
AccelStab:Accelerated Stability Kinetic Modelling
Estimate the Šesták–Berggren kinetic model (degradation model) from experimental data. A closed-form (analytic) solution to the degradation model is implemented as a non-linear fit, allowing for the extrapolation of the degradation of a drug product - both in time and temperature. Parametric bootstrap, with kinetic parameters drawn from the multivariate t-distribution, and analytical formulae (the delta method) are available options to calculate the confidence and prediction intervals. The results (modelling, extrapolations and statistical intervals) can be visualised with multiple plots. The examples illustrate the accelerated stability modelling in drugs and vaccines development.
Maintained by Ben Wells. Last updated 5 months ago.
arrhenius kinetics modelling non-linear-model pharmaceuticals pharmacokinetics sestak-berggren stability statistics temperature temperature-excursion vaccine
23.1 match 1 stars 3.48 score 2 scripts
gleon
rLakeAnalyzer:Lake Physics Tools
Standardized methods for calculating common important derived physical features of lakes including water density based on temperature, thermal layers, thermocline depth, lake number, Wedderburn number, Schmidt stability and others.
Maintained by Luke Winslow. Last updated 4 years ago.
8.5 match 45 stars 9.05 score 280 scripts 1 dependents
kwstat
agridat:Agricultural Datasets
Datasets from books, papers, and websites related to agriculture. Example graphics and analyses are included. Data come from small-plot trials, multi-environment trials, uniformity trials, yield monitors, and more.
Maintained by Kevin Wright. Last updated 28 days ago.
6.9 match 125 stars 11.02 score 1.7k scripts 2 dependents
r-lib
vctrs:Vector Helpers
Defines new notions of prototype and size that are used to provide tools for consistent and well-founded type-coercion and size-recycling, and are in turn connected to ideas of type- and size-stability useful for analysing function interfaces.
Maintained by Davis Vaughan. Last updated 5 months ago.
3.9 match 290 stars 18.97 score 1.1k scripts 13k dependents
sonsoleslp
tna:Transition Network Analysis (TNA)
Provides tools for performing Transition Network Analysis (TNA) to study relational dynamics, including functions for building and plotting TNA models, calculating centrality measures, and identifying dominant events and patterns. TNA statistical techniques (e.g., bootstrapping and permutation tests) ensure the reliability of observed insights and confirm that identified dynamics are meaningful. See (Saqr et al., 2025) <doi:10.1145/3706468.3706513> for more details on TNA.
Maintained by Sonsoles López-Pernas. Last updated 3 days ago.
educational-data-mining learning-analytics markov-model temporal-analysis
11.0 match 4 stars 6.48 score 5 scripts
bioc
evaluomeR:Evaluation of Bioinformatics Metrics
Evaluating the reliability of your own metrics and the measurements done on your own datasets by analysing the stability and goodness of the classifications of such metrics.
Maintained by José Antonio Bernabé-Díaz. Last updated 5 months ago.
clustering classification featureextraction assessment clustering-evaluation evaluome evaluomer metrics
14.4 match 4.82 score 33 scripts
robinhankin
permutations:The Symmetric Group: Permutations of a Finite Set
Manipulates invertible functions from a finite set to itself. Can transform from word form to cycle form and back. To cite the package in publications please use Hankin (2020) "Introducing the permutations R package", SoftwareX, volume 11 <doi:10.1016/j.softx.2020.100453>.
Maintained by Robin K. S. Hankin. Last updated 1 month ago.
8.3 match 6 stars 8.23 score 49 scripts 2 dependents
bioc
transformGamPoi:Variance Stabilizing Transformation for Gamma-Poisson Models
Variance-stabilizing transformations help with the analysis of heteroskedastic data (i.e., data where the variance is not constant, like count data). This package provides two types of variance stabilizing transformations: (1) methods based on the delta method (e.g., 'acosh', 'log(x+1)'), (2) model residual based (Pearson and randomized quantile residuals).
Maintained by Constantin Ahlmann-Eltze. Last updated 5 months ago.
singlecell normalization preprocessing regression cpp
11.3 match 21 stars 5.95 score 21 scripts
bioc
omada:Machine learning tools for automated transcriptome clustering analysis
Symptomatic heterogeneity in complex diseases reveals differences in molecular states that need to be investigated. However, selecting the numerous parameters of an exploratory clustering analysis in RNA profiling studies requires deep understanding of machine learning and extensive computational experimentation. Tools that assist with such decisions without prior field knowledge are nonexistent and further gene association analyses need to be performed independently. We have developed a suite of tools to automate these processes and make robust unsupervised clustering of transcriptomic data more accessible through automated machine learning based functions. The efficiency of each tool was tested with four datasets characterised by different expression signal strengths. Our toolkit’s decisions reflected the real number of stable partitions in datasets where the subgroups are discernible. Even in datasets with less clear biological distinctions, stable subgroups with different expression profiles and clinical associations were found.
Maintained by Sokratis Kariotis. Last updated 5 months ago.
software clustering rnaseq geneexpression
18.3 match 3.60 score 5 scripts
branchlab
metasnf:Meta Clustering with Similarity Network Fusion
Framework to facilitate patient subtyping with similarity network fusion and meta clustering. The similarity network fusion (SNF) algorithm was introduced by Wang et al. (2014) in <doi:10.1038/nmeth.2810>. SNF is a data integration approach that can transform high-dimensional and diverse data types into a single similarity network suitable for clustering with minimal loss of information from each initial data source. The meta clustering approach was introduced by Caruana et al. (2006) in <doi:10.1109/ICDM.2006.103>. Meta clustering involves generating a wide range of cluster solutions by adjusting clustering hyperparameters, then clustering the solutions themselves into a manageable number of qualitatively similar solutions, and finally characterizing representative solutions to find ones that are best for the user's specific context. This package provides a framework to easily transform multi-modal data into a wide range of similarity network fusion-derived cluster solutions as well as to visualize, characterize, and validate those solutions. Core package functionality includes easy customization of distance metrics, clustering algorithms, and SNF hyperparameters to generate diverse clustering solutions; calculation and plotting of associations between features, between patients, and between cluster solutions; and standard cluster validation approaches including resampled measures of cluster stability, standard metrics of cluster quality, and label propagation to evaluate generalizability in unseen data. Associated vignettes guide the user through using the package to identify patient subtypes while adhering to best practices for unsupervised learning.
Maintained by Prashanth S Velayudhan. Last updated 5 days ago.
bioinformatics clustering metaclustering snf
7.5 match 8 stars 8.21 score 30 scripts
myles-lewis
nestedcv:Nested Cross-Validation with 'glmnet' and 'caret'
Implements nested k*l-fold cross-validation for lasso and elastic-net regularised linear models via the 'glmnet' package and other machine learning models via the 'caret' package <doi:10.1093/bioadv/vbad048>. Cross-validation of 'glmnet' alpha mixing parameter and embedded fast filter functions for feature selection are provided. Described as double cross-validation by Stone (1977) <doi:10.1111/j.2517-6161.1977.tb01603.x>. Also implemented is a method using outer CV to measure unbiased model performance metrics when fitting Bayesian linear and logistic regression shrinkage models using the horseshoe prior over parameters to encourage a sparse model as described by Piironen & Vehtari (2017) <doi:10.1214/17-EJS1337SI>.
Maintained by Myles Lewis. Last updated 6 days ago.
7.6 match 12 stars 7.92 score 46 scripts
gdurif
plsgenomics:PLS Analyses for Genomics
Routines for PLS-based genomic analyses, implementing PLS methods for classification with microarray data and prediction of transcription factor activities from combined ChIP-chip analysis. The >=1.2-1 versions include two new classification methods for microarray data: GSIM and Ridge PLS. The >=1.3 versions include a new classification method combining variable selection and compression in logistic regression context: logit-SPLS; and an adaptive version of the sparse PLS.
Maintained by Ghislain Durif. Last updated 12 months ago.
10.8 match 5.55 score 140 scripts 2 dependents
bioc
microSTASIS:Microbiota STability ASsessment via Iterative cluStering
The toolkit 'µSTASIS', or microSTASIS, has been developed for the stability analysis of microbiota in a temporal framework by leveraging iterative clustering. Concretely, the core function uses Hartigan-Wong k-means algorithm as many times as possible for stressing out paired samples from the same individuals to test if they remain together for multiple numbers of clusters over a whole data set of individuals. Moreover, the package includes multiple functions to subset samples from paired times, validate the results or visualize the output.
Maintained by Pedro Sánchez-Sánchez. Last updated 5 months ago.
geneticvariability biomedicalinformatics clustering multiplecomparison microbiome
14.0 match 2 stars 4.30 score 1 scripts
juergenknauer
bigleaf:Physical and Physiological Ecosystem Properties from Eddy Covariance Data
Calculation of physical (e.g. aerodynamic conductance, surface temperature), and physiological (e.g. canopy conductance, water-use efficiency) ecosystem properties from eddy covariance data and accompanying meteorological measurements. Calculations assume the land surface to behave like a 'big-leaf' and return bulk ecosystem/canopy variables.
Maintained by Juergen Knauer. Last updated 8 months ago.
8.3 match 7.23 score 124 scripts 17 dependents
bioc
ReducedExperiment:Containers and tools for dimensionally-reduced -omics representations
Provides SummarizedExperiment-like containers for storing and manipulating dimensionally-reduced assay data. The ReducedExperiment classes allow users to simultaneously manipulate their original dataset and their decomposed data, in addition to other method-specific outputs like feature loadings. Implements utilities and specialised classes for the application of stabilised independent component analysis (sICA) and weighted gene correlation network analysis (WGCNA).
Maintained by Jack Gisby. Last updated 2 months ago.
geneexpression infrastructure datarepresentation software dimensionreduction network bioconductor-package bioinformatics dimensionality-reduction
11.3 match 3 stars 5.18 score 8 scripts
antonio-pgarcia
rrepast:Invoke 'Repast Simphony' Simulation Models
An R and Repast integration tool for running individual-based (IbM) simulation models developed using the 'Repast Simphony' Agent-Based framework directly from R code, with support for multicore execution. This package integrates 'Repast Simphony' models within the R environment, making it easier to run models and analyze their output data for automated parameter calibration and for carrying out uncertainty and sensitivity analysis using the power of the R environment.
Maintained by Antonio Prestes Garcia. Last updated 5 years ago.
12.9 match 3 stars 4.53 score 38 scripts 1 dependents
hannahlowens
climateStability:Estimating Climate Stability from Climate Model Data
Climate stability measures are not formalized in the literature and tools for generating stability metrics from existing data are nascent. This package provides tools for calculating climate stability from raster data encapsulating climate change as a series of time slices. The methods follow Owens and Guralnick <doi:10.17161/bi.v14i0.9786> Biodiversity Informatics.
Maintained by Hannah Owens. Last updated 2 years ago.
11.5 match 7 stars 5.01 score 29 scripts
tianmoul
bootcluster:Bootstrapping Estimates of Clustering Stability
Implementation of the bootstrapping approach for the estimation of clustering stability and its application in estimating the number of clusters, as introduced by Yu et al (2016) <doi:10.1142/9789814749411_0007>. Implementation of the non-parametric bootstrap approach to assessing the stability of module detection in a graph, the extension for the selection of a parameter set that defines a graph from data in a way that optimizes stability and the corresponding visualization functions, as introduced by Tian et al (2021) <doi:10.1002/sam.11495>. Implemented out-of-bag stability estimation function and k-select Smin-based k-selection function as introduced by Liu et al (2022) <doi:10.1002/sam.11593>. Implemented ensemble clustering method based on k-means clustering method, spectral clustering method and hierarchical clustering method.
Maintained by Tianmou Liu. Last updated 4 months ago.
26.5 match 1 stars 2.18 score 3 scripts
mjg211
phaseR:Phase Plane Analysis of One- And Two-Dimensional Autonomous ODE Systems
Performs a qualitative analysis of one- and two-dimensional autonomous ordinary differential equation systems, using phase plane methods. Programs are available to identify and classify equilibrium points, plot the direction field, and plot trajectories for multiple initial conditions. In the one-dimensional case, a program is also available to plot the phase portrait. In the two-dimensional case, programs are additionally available to plot nullclines and stable/unstable manifolds of saddle points. Many example systems are provided for the user. Further details can be found in Grayling (2014) <doi:10.32614/RJ-2014-023>.
Maintained by Michael J Grayling. Last updated 3 years ago.
biological-modeling differential-equations dynamical-systems ecological-modelling lotka-volterra manifolds modeling-dynamic-systems morris-lecar perturbation-analysis phase-plane sir-model species-interactions van-der-pol
8.6 match 15 stars 6.63 score 94 scripts 1 dependents
vpihur
clValid:Validation of Clustering Results
Statistical and biological validation of clustering results. This package implements Dunn Index, Silhouette, Connectivity, Stability, BHI and BSI. Further information can be found in Brock, G et al. (2008) <doi:10.18637/jss.v025.i04>.
Maintained by Vasyl Pihur. Last updated 4 years ago.
7.8 match 5 stars 7.19 score 422 scripts 14 dependents
bioc
SC3:Single-Cell Consensus Clustering
A tool for unsupervised clustering and analysis of single cell RNA-Seq data.
Maintained by Vladimir Kiselev. Last updated 5 months ago.
immunooncology singlecell software classification clustering dimensionreduction supportvectormachine rnaseq visualization transcriptomics datarepresentation gui differentialexpression transcription bioconductor-package human-cell-atlas single-cell-rna-seq openblas cpp
5.2 match 122 stars 10.09 score 374 scripts 1 dependents
cran
clv:Cluster Validation Techniques
The package contains most of the popular internal and external cluster validation methods, ready to use with most of the outputs produced by functions from the package "cluster". It also contains functions and usage examples for the cluster stability approach, which can be applied to algorithms implemented in the "cluster" package as well as to user-defined clustering algorithms.
Maintained by Lukasz Nieweglowski. Last updated 1 year ago.
11.1 match 1 stars 4.73 score 148 scripts 17 dependents
piusdahinden
expirest:Expiry Estimation Procedures
The Australian Regulatory Guidelines for Prescription Medicines (ARGPM), guidance on "Stability testing for prescription medicines", recommends predicting the shelf life of chemically derived medicines from stability data by taking the worst case situation at batch release into account. Consequently, if a change over time is observed, a release limit needs to be specified. Finding a release limit and the associated shelf life is supported, as well as the standard approach that is recommended by guidance Q1E "Evaluation of stability data" from the International Council for Harmonisation (ICH).
Maintained by Pius Dahinden. Last updated 20 days ago.
15.4 match 3.40 score 6 scripts
robjhyndman
tsfeatures:Time Series Feature Extraction
Methods for extracting various features from time series data. The features provided are those from Hyndman, Wang and Laptev (2013) <doi:10.1109/ICDMW.2015.104>, Kang, Hyndman and Smith-Miles (2017) <doi:10.1016/j.ijforecast.2016.09.004> and from Fulcher, Little and Jones (2013) <doi:10.1098/rsif.2013.0048>. Features include spectral entropy, autocorrelations, measures of the strength of seasonality and trend, and so on. Users can also define their own feature functions.
Maintained by Rob Hyndman. Last updated 8 months ago.
4.5 match 254 stars 11.47 score 268 scripts 22 dependents
rjacobucci
regsem:Regularized Structural Equation Modeling
Uses both ridge and lasso penalties (and extensions) to penalize specific parameters in structural equation models. The package offers additional cost functions, cross validation, and other extensions beyond traditional structural equation models. Also contains a function to perform exploratory mediation (XMed).
Maintained by Ross Jacobucci. Last updated 2 years ago.
7.6 match 14 stars 6.63 score 77 scripts
garthtarr
mplot:Graphical Model Stability and Variable Selection Procedures
Model stability and variable inclusion plots [Mueller and Welsh (2010, <doi:10.1111/j.1751-5823.2010.00108.x>); Murray, Heritier and Mueller (2013, <doi:10.1002/sim.5855>)] as well as the adaptive fence [Jiang et al. (2008, <doi:10.1214/07-AOS517>); Jiang et al. (2009, <doi:10.1016/j.spl.2008.10.014>)] for linear and generalised linear models.
Maintained by Garth Tarr. Last updated 4 years ago.
10.6 match 12 stars 4.70 score 42 scripts
robertwbuchkowski
soilfoodwebs:Soil Food Web Analysis
Analyzing soil food webs or any food web measured at equilibrium. The package calculates carbon and nitrogen fluxes and stability properties using methods described by Hunt et al. (1987) <doi:10.1007/BF00260580>, de Ruiter et al. (1995) <doi:10.1126/science.269.5228.1257>, Holtkamp et al. (2011) <doi:10.1016/j.soilbio.2010.10.004>, and Buchkowski and Lindo (2021) <doi:10.1111/1365-2435.13706>. The package can also manipulate the structure of the food web as well as simulate food webs away from equilibrium and run decomposition experiments.
Maintained by Robert Buchkowski. Last updated 11 months ago.
10.9 match 5 stars 4.40 score 4 scripts
tmlange
optRF:Optimising Random Forest Stability by Determining the Optimal Number of Trees
Calculating the stability of random forest with certain numbers of trees. The non-linear relationship between stability and numbers of trees is described using a logistic regression model and used to estimate the optimal number of trees.
Maintained by Thomas Martin Lange. Last updated 1 month ago.
9.9 match 4.78 score
martirm
clustAnalytics:Cluster Evaluation on Graphs
Evaluates the stability and significance of clusters on 'igraph' graphs. Supports weighted and unweighted graphs. Implements the cluster evaluation methods defined by Arratia A, Renedo M (2021) <doi:10.7717/peerj-cs.600>. Also includes an implementation of the Reduced Mutual Information introduced by Newman et al. (2020) <doi:10.1103/PhysRevE.101.042304>.
Maintained by Martí Renedo Mirambell. Last updated 1 year ago.
9.6 match 5 stars 4.92 score 33 scripts
opendendro
dplR:Dendrochronology Program Library in R
Perform tree-ring analyses such as detrending, chronology building, and cross dating. Read and write standard file formats used in dendrochronology.
Maintained by Andy Bunn. Last updated 19 days ago.
4.0 match 39 stars 11.71 score 546 scripts 26 dependents
nicholasjclark
mvgam:Multivariate (Dynamic) Generalized Additive Models
Fit Bayesian Dynamic Generalized Additive Models to multivariate observations. Users can build nonlinear State-Space models that can incorporate semiparametric effects in observation and process components, using a wide range of observation families. Estimation is performed using Markov Chain Monte Carlo with Hamiltonian Monte Carlo in the software 'Stan'. References: Clark & Wells (2023) <doi:10.1111/2041-210X.13974>.
Maintained by Nicholas J Clark. Last updated 19 hours ago.
bayesian-statistics dynamic-factor-models ecological-modelling forecasting gaussian-process generalised-additive-models generalized-additive-models joint-species-distribution-modelling multilevel-models multivariate-timeseries stan time-series-analysis timeseries vector-autoregression vectorautoregression cpp
4.8 match 139 stars 9.85 score 117 scripts
jamesramsay5
fda:Functional Data Analysis
These functions were developed to support functional data analysis as described in Ramsay, J. O. and Silverman, B. W. (2005) Functional Data Analysis. New York: Springer and in Ramsay, J. O., Hooker, Giles, and Graves, Spencer (2009). Functional Data Analysis with R and Matlab (Springer). The package includes data sets and script files that work through many examples, including all but one of the 76 figures in the latter book. Matlab versions are available by ftp from <https://www.psych.mcgill.ca/misc/fda/downloads/FDAfuns/>.
Maintained by James Ramsay. Last updated 4 months ago.
3.6 match 3 stars 12.29 score 2.0k scripts 143 dependents
chavent
ClustOfVar:Clustering of Variables
Cluster analysis of a set of variables. Variables can be quantitative, qualitative or a mixture of both.
Maintained by Marie Chavent. Last updated 5 years ago.
6.7 match 7 stars 6.47 score 142 scripts 2 dependents
lbelzile
mev:Modelling of Extreme Values
Various tools for the analysis of univariate, multivariate and functional extremes. Exact simulation from max-stable processes [Dombry, Engelke and Oesting (2016) <doi:10.1093/biomet/asw008>], R-Pareto processes for various parametric models, including Brown-Resnick (Wadsworth and Tawn, 2014, <doi:10.1093/biomet/ast042>) and Extremal Student (Thibaud and Opitz, 2015, <doi:10.1093/biomet/asv045>). Threshold selection methods, including Wadsworth (2016) <doi:10.1080/00401706.2014.998345>, and Northrop and Coleman (2014) <doi:10.1007/s10687-014-0183-z>. Multivariate extreme diagnostics. Estimation and likelihoods for univariate extremes, e.g., Coles (2001) <doi:10.1007/978-1-4471-3675-0>.
Maintained by Leo Belzile. Last updated 5 months ago.
extreme-value-statistics likelihood-functions max-stable simulation threshold-selection openblas cpp openmp
5.2 match 13 stars 8.23 score 94 scripts 4 dependents
bpfaff
vars:VAR Modelling
Estimation, lag selection, diagnostic testing, forecasting, causality analysis, forecast error variance decomposition and impulse response functions of VAR models and estimation of SVAR and SVEC models.
Maintained by Bernhard Pfaff. Last updated 12 months ago.
4.9 match 7 stars 8.68 score 2.8k scripts 44 dependents
bioc
monaLisa:Binned Motif Enrichment Analysis and Visualization
Useful functions to work with sequence motifs in the analysis of genomics data. These include methods to annotate genomic regions or sequences with predicted motif hits and to identify motifs that drive observed changes in accessibility or expression. Functions to produce informative visualizations of the obtained results are also provided.
Maintained by Michael Stadler. Last updated 2 months ago.
motifannotation visualization featureextraction epigenetics
5.0 match 40 stars 8.06 score 53 scripts
apariciojohan
agriutilities:Utilities for Data Analysis in Agriculture
Utilities designed to make the analysis of field trials easier and more accessible for everyone working in plant breeding. It provides a simple and intuitive interface for conducting single and multi-environmental trial analysis, with minimal coding required. Whether you're a beginner or an experienced user, 'agriutilities' will help you quickly and easily carry out complex analyses with confidence. With built-in functions for fitting Linear Mixed Models, 'agriutilities' is the ideal choice for anyone who wants to save time and focus on interpreting their results. Some of the functions require the R package 'asreml' for the 'ASReml' software, which can be obtained upon purchase from 'VSN' international <https://vsni.co.uk/software/asreml-r/>.
Maintained by Johan Aparicio. Last updated 2 months ago.
5.3 match 18 stars 7.46 score 88 scripts 1 dependentsbiometris
statgenGxE:Genotype by Environment (GxE) Analysis
Analysis of multi environment data of plant breeding experiments following the analyses described in Malosetti, Ribaut, and van Eeuwijk (2013), <doi:10.3389/fphys.2013.00044>. One of a series of statistical genetic packages for streamlining the analysis of typical plant breeding experiments developed by Biometris. Some functions have been created to be used in conjunction with the R package 'asreml' for the 'ASReml' software, which can be obtained upon purchase from 'VSN' international (<https://vsni.co.uk/software/asreml-r/>).
Maintained by Bart-Jan van Rossum. Last updated 6 months ago.
geneticsgxegxe-modellingmulti-trial-analysis
7.0 match 10 stars 5.53 score 17 scriptsannawysocki
stim:Incorporating Stability Information into Cross-Sectional Estimates
The goal of 'stim' is to provide a function for estimating the Stability Informed Model. The Stability Informed Model integrates stability information (how much a variable correlates with itself in the future) into cross-sectional estimates. Wysocki and Rhemtulla (2022) <https://psyarxiv.com/vg5as>.
Maintained by Anna Wysocki. Last updated 2 years ago.
10.3 match 3.70 score 4 scriptspaulnorthrop
threshr:Threshold Selection and Uncertainty for Extreme Value Analysis
Provides functions for the selection of thresholds for use in extreme value models, based mainly on the methodology in Northrop, Attalides and Jonathan (2017) <doi:10.1111/rssc.12159>. It also performs predictive inferences about future extreme values, based either on a single threshold or on a weighted average of inferences from multiple thresholds, using the 'revdbayes' package <https://cran.r-project.org/package=revdbayes>. At the moment only the case where the data can be treated as independent identically distributed observations is considered.
Maintained by Paul J. Northrop. Last updated 1 month ago.
extreme-value-statisticsextremesgeneralizedinferenceparetoplotpredictionthresholdthreshold-selectionuncertainty
6.7 match 6 stars 5.72 score 29 scripts 1 dependentsbioc
lumi:BeadArray Specific Methods for Illumina Methylation and Expression Microarrays
The lumi package provides an integrated solution for the Illumina microarray data analysis. It includes functions of Illumina BeadStudio (GenomeStudio) data input, quality control, BeadArray-specific variance stabilization, normalization and gene annotation at the probe level. It also includes the functions of processing Illumina methylation microarrays, especially Illumina Infinium methylation microarrays.
Maintained by Lei Huang. Last updated 5 months ago.
microarrayonechannelpreprocessingdnamethylationqualitycontroltwochannel
6.1 match 6.27 score 294 scripts 5 dependentsmwsill
s4vd:Biclustering via Sparse Singular Value Decomposition Incorporating Stability Selection
The main function s4vd() performs biclustering via sparse singular value decomposition with a nested stability selection. The result is a biclust object, so all methods of the biclust package can be applied.
Maintained by Martin Sill. Last updated 5 years ago.
7.0 match 4 stars 5.31 score 17 scripts 2 dependentsgjmvanboxtel
gsignal:Signal Processing
R implementation of the 'Octave' package 'signal', containing a variety of signal processing tools, such as signal generation and measurement, correlation and convolution, filtering, filter design, filter analysis and conversion, power spectrum analysis, system identification, decimation and sample rate change, and windowing.
Maintained by Geert van Boxtel. Last updated 2 months ago.
3.6 match 24 stars 10.03 score 133 scripts 34 dependentsmyaseen208
StabilityApp:Stability Analysis App for GEI in Multi-Environment Trials
Provides tools for Genotype by Environment Interaction (GEI) analysis, using statistical models and visualizations to assess genotype performance across environments. It helps researchers explore interaction effects, stability, and adaptability in multi-environment trials, identifying the best-performing genotypes in different conditions. Which Win Where!
Maintained by Muhammad Yaseen. Last updated 4 months ago.
ammibiplotgeiggemetstability-analysis
10.9 match 3.30 scorenathanieltwarog
braidrm:Fitting Combined Action with the BRAID Response Surface Model
Contains functions for evaluating, analyzing, and fitting combined action dose response surfaces with the Bivariate Response to Additive Interacting Doses (BRAID) model of combined action, along with tools for implementing other combination analysis methods, including Bliss independence, combination index, and additional response surface methods.
Maintained by Nathaniel R. Twarog. Last updated 6 months ago.
5.9 match 1 stars 6.10 score 35 scripts 1 dependentsgobbios
EloRating:Animal Dominance Hierarchies by Elo Rating
Provides functions to quantify animal dominance hierarchies. The major focus is on Elo rating and its ability to deal with temporal dynamics in dominance interaction sequences. For static data, David's score and de Vries' I&SI are also implemented. In addition, the package provides functions to assess transitivity, linearity and stability of dominance networks. See Neumann et al. (2011) <doi:10.1016/j.anbehav.2011.07.016> for an introduction.
Maintained by Christof Neumann. Last updated 8 months ago.
5.1 match 4 stars 6.86 score 61 scripts 1 dependentsbioc
flowVS:Variance stabilization in flow cytometry (and microarrays)
Per-channel variance stabilization from a collection of flow cytometry samples by Bartlett's test for homogeneity of variances. The approach is applicable to microarray data as well.
Maintained by Ariful Azad. Last updated 5 months ago.
immunooncologyflowcytometrycellbasedassaysmicroarray
8.8 match 3.82 score 11 scriptsagrocares
OBIC:Calculate the Open Bodem Index (OBI) Score
The Open Bodem Index (OBI) is a method to evaluate the quality of soils of agricultural fields in The Netherlands and the sustainability of the current agricultural practices. The OBI score is based on four main criteria: chemical, physical, biological and management, which consist of more than 21 indicators. By providing results of a soil analysis and management info the 'OBIC' package can be used to calculate the scores, indicators and derivatives that are used by the OBI. More information about the Open Bodem Index can be found at <https://openbodemindex.nl/>.
Maintained by Sven Verweij. Last updated 6 months ago.
4.9 match 11 stars 6.82 score 20 scriptsnashjc
nlsr:Functions for Nonlinear Least Squares Solutions - Updated 2022
Provides tools for working with nonlinear least squares problems. For model estimation, it offers more reliable and robust tools than nls(), where the Gauss-Newton method frequently stops with 'singular gradient' messages. This is accomplished by using, where possible, analytic derivatives to compute the matrix of derivatives and a stabilization of the solution of the estimation equations. Tools for approximate or externally supplied derivative matrices are included. Bounds and masks on parameters are handled properly.
Maintained by John C Nash. Last updated 27 days ago.
4.8 match 7.02 score 94 scripts 5 dependentshfgolino
EGAnet:Exploratory Graph Analysis – a Framework for Estimating the Number of Dimensions in Multivariate Data using Network Psychometrics
Implements the Exploratory Graph Analysis (EGA) framework for dimensionality and psychometric assessment. EGA estimates the number of dimensions in psychological data using network estimation methods and community detection algorithms. A bootstrap method is provided to assess the stability of dimensions and items. Fit is evaluated using the Entropy Fit family of indices. Unique Variable Analysis evaluates the extent to which items are locally dependent (or redundant). Network loadings provide similar information to factor loadings and can be used to compute network scores. A bootstrap and permutation approach are available to assess configural and metric invariance. Hierarchical structures can be detected using Hierarchical EGA. Time series and intensive longitudinal data can be analyzed using Dynamic EGA, supporting individual, group, and population level assessments.
Maintained by Hudson Golino. Last updated 9 days ago.
4.3 match 47 stars 7.80 score 61 scripts 1 dependentsbella2001
causalCmprsk:Nonparametric and Cox-Based Estimation of Average Treatment Effects in Competing Risks
Estimation of average treatment effects (ATE) of point interventions on time-to-event outcomes with K competing risks (K can be 1). The method uses propensity scores and inverse probability weighting for emulation of baseline randomization, which is described in Charpignon et al. (2022) <doi:10.1038/s41467-022-35157-w>.
Maintained by Bella Vakulenko-Lagun. Last updated 2 years ago.
6.9 match 3 stars 4.48 score 8 scriptsbioc
AlpsNMR:Automated spectraL Processing System for NMR
Reads Bruker NMR data directories both zipped and unzipped. It provides automated and efficient signal processing for untargeted NMR metabolomics. It is able to interpolate the samples, detect outliers, exclude regions, normalize, detect peaks, align the spectra, integrate peaks, manage metadata and visualize the spectra. After spectra processing, it can apply multivariate analysis on extracted data. Efficient plotting with 1-D data is also available. Basic reading of 1D ACD/Labs exported JDX samples is also available.
Maintained by Sergio Oller Moreno. Last updated 5 months ago.
softwarepreprocessingvisualizationclassificationcheminformaticsmetabolomicsdataimport
4.0 match 15 stars 7.59 score 12 scripts 1 dependentsbioc
bluster:Clustering Algorithms for Bioconductor
Wraps common clustering algorithms in an easily extended S4 framework. Backends are implemented for hierarchical, k-means and graph-based clustering. Several utilities are also provided to compare and evaluate clustering results.
Maintained by Aaron Lun. Last updated 5 months ago.
immunooncologysoftwaregeneexpressiontranscriptomicssinglecellclusteringcpp
3.2 match 9.43 score 636 scripts 51 dependentspln-team
PLNmodels:Poisson Lognormal Models
The Poisson-lognormal model and variants (Chiquet, Mariadassou and Robin, 2021 <doi:10.3389/fevo.2021.588292>) can be used for a variety of multivariate problems when count data are at play, including principal component analysis for count data, discriminant analysis, model-based clustering and network inference. Implements variational algorithms to fit such models, accompanied by a set of functions for visualization and diagnostics.
Maintained by Julien Chiquet. Last updated 4 days ago.
count-datamultivariate-analysisnetwork-inferencepcapoisson-lognormal-modelopenblascpp
3.1 match 56 stars 9.50 score 226 scriptsbioc
vsn:Variance stabilization and calibration for microarray data
The package implements a method for normalising microarray intensities from single- and multiple-color arrays. It can also be used for data from other technologies, as long as they have a similar format. The method uses a robust variant of the maximum-likelihood estimator for an additive-multiplicative error model and affine calibration. The model incorporates a data calibration step (a.k.a. normalization), a model for the dependence of the variance on the mean intensity and a variance stabilizing data transformation. Differences between transformed intensities are analogous to "normalized log-ratios". However, in contrast to the latter, their variance is independent of the mean, and they are usually more sensitive and specific in detecting differential transcription.
Maintained by Wolfgang Huber. Last updated 5 months ago.
microarrayonechanneltwochannelpreprocessing
3.5 match 8.49 score 924 scripts 51 dependentsjchiquet
quadrupen:Sparsity by Worst-Case Quadratic Penalties
Fits classical sparse regression models with efficient active set algorithms by solving quadratic problems as described by Grandvalet, Chiquet and Ambroise (2017) <doi:10.48550/arXiv.1210.2077>. Also provides a few methods for model selection (cross-validation, stability selection).
Maintained by Julien Chiquet. Last updated 9 months ago.
9.3 match 3.18 score 30 scriptsboost-r
mboost:Model-Based Boosting
Functional gradient descent algorithm (boosting) for optimizing general risk functions utilizing component-wise (penalised) least squares estimates or regression trees as base-learners for fitting generalized linear, additive and interaction models to potentially high-dimensional data. Models and algorithms are described in <doi:10.1214/07-STS242>, a hands-on tutorial is available from <doi:10.1007/s00180-012-0382-5>. The package allows user-specified loss functions and base-learners.
Maintained by Torsten Hothorn. Last updated 4 months ago.
boosting-algorithmsgamglmmachine-learningmboostmodellingr-languagetutorialsvariable-selectionopenblas
2.3 match 72 stars 12.70 score 540 scripts 27 dependentsdatastorm-open
visNetwork:Network Visualization using 'vis.js' Library
Provides an R interface to the 'vis.js' JavaScript charting library. It allows an interactive visualization of networks.
Maintained by Benoit Thieurmel. Last updated 2 years ago.
1.9 match 549 stars 15.14 score 4.1k scripts 195 dependentsbioc
microbiome:Microbiome Analytics
Utilities for microbiome analysis.
Maintained by Leo Lahti. Last updated 5 months ago.
metagenomicsmicrobiomesequencingsystemsbiologyhitchiphitchip-atlashuman-microbiomemicrobiologymicrobiome-analysisphyloseqpopulation-study
2.3 match 290 stars 12.50 score 2.0k scripts 5 dependentsmwheymans
psfmi:Prediction Model Pooling, Selection and Performance Evaluation Across Multiply Imputed Datasets
Pooling, backward and forward selection of linear, logistic and Cox regression models in multiply imputed datasets. Backward and forward selection can be done from the pooled model using Rubin's Rules (RR), the D1, D2, D3, D4 and the median p-values method. This is also possible for Mixed models. The models can contain continuous, dichotomous, categorical and restricted cubic spline predictors and interaction terms between all these types of predictors. The stability of the models can be evaluated using (cluster) bootstrapping. The package further contains functions to pool model performance measures such as ROC/AUC, Reclassification, R-squared, scaled Brier score, H&L test and calibration plots for logistic regression models. Internal validation can be done across multiply imputed datasets with cross-validation or bootstrapping. The adjusted intercept after shrinkage of pooled regression coefficients can be obtained. Backward and forward selection as part of internal validation is possible. A function to externally validate logistic prediction models in multiply imputed datasets is available, as is a function to compare models. For Cox models a strata variable can be included. Eekhout (2017) <doi:10.1186/s12874-017-0404-7>. Wiel (2009) <doi:10.1093/biostatistics/kxp011>. Marshall (2009) <doi:10.1186/1471-2288-9-57>.
Maintained by Martijn Heymans. Last updated 2 years ago.
cox-regressionimputationimputed-datasetslogisticmultiple-imputationpoolpredictorregressionselectionsplinespline-predictors
3.9 match 10 stars 7.17 score 70 scriptshojsgaard
doBy:Groupwise Statistics, LSmeans, Linear Estimates, Utilities
Utility package containing: 1) Facilities for working with grouped data: 'do' something to data stratified 'by' some variables. 2) LSmeans (least-squares means), general linear estimates. 3) Restrict functions to a smaller domain. 4) Miscellaneous other utilities.
Maintained by Søren Højsgaard. Last updated 4 days ago.
1.9 match 1 stars 14.94 score 3.2k scripts 939 dependentsbioc
clusterStab:Compute cluster stability scores for microarray data
This package can be used to estimate the number of clusters in a set of microarray data, as well as test the stability of these clusters.
Maintained by James W. MacDonald. Last updated 5 months ago.
7.0 match 3.90 score 2 scriptsarsilva87
soilphysics:Soil Physical Analysis
Basic and model-based soil physical analyses.
Maintained by Anderson Rodrigo da Silva. Last updated 3 years ago.
5.7 match 11 stars 4.82 score 12 scriptsbioc
genefu:Computation of Gene Expression-Based Signatures in Breast Cancer
This package contains functions implementing various tasks usually required by gene expression analysis, especially in breast cancer studies: gene mapping between different microarray platforms, identification of molecular subtypes, implementation of published gene signatures, gene selection, and survival analysis.
Maintained by Benjamin Haibe-Kains. Last updated 4 months ago.
differentialexpressiongeneexpressionvisualizationclusteringclassification
3.6 match 7.42 score 193 scripts 3 dependentsbioc
limma:Linear Models for Microarray and Omics Data
Data analysis, linear models and differential expression for omics data.
Maintained by Gordon Smyth. Last updated 5 days ago.
exonarraygeneexpressiontranscriptionalternativesplicingdifferentialexpressiondifferentialsplicinggenesetenrichmentdataimportbayesianclusteringregressiontimecoursemicroarraymicrornaarraymrnamicroarrayonechannelproprietaryplatformstwochannelsequencingrnaseqbatcheffectmultiplecomparisonnormalizationpreprocessingqualitycontrolbiomedicalinformaticscellbiologycheminformaticsepigeneticsfunctionalgenomicsgeneticsimmunooncologymetabolomicsproteomicssystemsbiologytranscriptomics
1.9 match 13.81 score 16k scripts 585 dependentsbioc
BioNERO:Biological Network Reconstruction Omnibus
BioNERO aims to integrate all aspects of biological network inference in a single package, including data preprocessing, exploratory analyses, network inference, and analyses for biological interpretations. BioNERO can be used to infer gene coexpression networks (GCNs) and gene regulatory networks (GRNs) from gene expression data. Additionally, it can be used to explore topological properties of protein-protein interaction (PPI) networks. GCN inference relies on the popular WGCNA algorithm. GRN inference is based on the "wisdom of the crowds" principle, which consists in inferring GRNs with multiple algorithms (here, CLR, GENIE3 and ARACNE) and calculating the average rank for each interaction pair. As all steps of network analyses are included in this package, BioNERO spares users from having to learn the syntaxes of several packages and how to communicate between them. Finally, users can also identify consensus modules across independent expression sets and calculate intra- and interspecies module preservation statistics between different networks.
Maintained by Fabricio Almeida-Silva. Last updated 5 months ago.
softwaregeneexpressiongeneregulationsystemsbiologygraphandnetworkpreprocessingnetworknetworkinference
3.2 match 27 stars 7.78 score 50 scripts 1 dependentsmodal-inria
MLGL:Multi-Layer Group-Lasso
It implements a new procedure of variable selection in the context of redundancy between explanatory variables, which holds true with high dimensional data (Grimonprez et al. (2023) <doi:10.18637/jss.v106.i03>).
Maintained by Quentin Grimonprez. Last updated 2 years ago.
6.9 match 3 stars 3.61 score 27 scriptsbioc
IntEREst:Intron-Exon Retention Estimator
This package performs Intron-Exon Retention analysis on RNA-seq data (.bam files).
Maintained by Ali Oghabian. Last updated 4 days ago.
softwarealternativesplicingcoveragedifferentialsplicingsequencingrnaseqalignmentnormalizationdifferentialexpressionimmunooncology
6.0 match 4.16 score 12 scriptsmt1022
cubar:Codon Usage Bias Analysis
A suite of functions for rapid and flexible analysis of codon usage bias. It provides in-depth analysis at the codon level, including relative synonymous codon usage (RSCU), tRNA weight calculations, machine learning predictions for optimal or preferred codons, and visualization of codon-anticodon pairing. Additionally, it can calculate various gene-specific codon indices such as codon adaptation index (CAI), effective number of codons (ENC), fraction of optimal codons (Fop), tRNA adaptation index (tAI), mean codon stabilization coefficients (CSCg), and GC contents (GC/GC3s/GC4d). It also supports both standard and non-standard genetic code tables found in NCBI, as well as custom genetic code tables.
Maintained by Hong Zhang. Last updated 3 months ago.
bioinformaticscodon-usagemachine-learningsequence-analysis
4.3 match 6 stars 5.82 score 8 scriptsjedazard
MVR:Mean-Variance Regularization
Implements a non-parametric method for joint adaptive mean-variance regularization and variance stabilization of high-dimensional data. It is suited for handling difficult problems posed by high-dimensional multivariate datasets (p >> n paradigm). Among those are that the variance is often a function of the mean, variable-specific estimators of variances are not reliable, and test statistics have low power due to a lack of degrees of freedom. Key features include: (i) Normalization and/or variance stabilization of the data, (ii) Computation of mean-variance-regularized t-statistics (F-statistics to follow), (iii) Generation of diverse diagnostic plots, (iv) Computationally efficient implementation using C/C++ interfacing and an option for parallel computing to enjoy a faster and easier experience in the R environment.
Maintained by Jean-Eudes Dazard. Last updated 3 years ago.
6.5 match 1 stars 3.78 score 12 scriptsbioc
spatialDE:R wrapper for SpatialDE
SpatialDE is a method to find spatially variable genes (SVG) from spatial transcriptomics data. This package provides wrappers to use the Python SpatialDE library in R, using reticulate and basilisk.
Maintained by Gabriele Sales. Last updated 5 months ago.
softwaretranscriptomicspythonspatial-datawrapper
5.0 match 3 stars 4.76 score 16 scriptsmagnusdv
pedmut:Mutation Models for Pedigree Likelihood Computations
A collection of functions for modelling mutations in pedigrees with marker data, as used e.g. in likelihood computations with microsatellite data. Implemented models include equal, proportional and stepwise models, as well as random models for experimental work, and custom models allowing the user to apply any valid mutation matrix. Allele lumping is done following the lumpability criteria of Kemeny and Snell (1976), ISBN:0387901922.
Maintained by Magnus Dehli Vigeland. Last updated 1 year ago.
5.0 match 2 stars 4.76 score 5 scripts 19 dependentschrhennig
fpc:Flexible Procedures for Clustering
Various methods for clustering and cluster validation. Fixed point clustering. Linear regression clustering. Clustering by merging Gaussian mixture components. Symmetric and asymmetric discriminant projections for visualisation of the separation of groupings. Cluster validation statistics for distance based clustering including corrected Rand index. Standardisation of cluster validation statistics by random clusterings and comparison between many clustering methods and numbers of clusters based on this. Cluster-wise cluster stability assessment. Methods for estimation of the number of clusters: Calinski-Harabasz, Tibshirani and Walther's prediction strength, Fang and Wang's bootstrap stability. Gaussian/multinomial mixture fitting for mixed continuous/categorical variables. Variable-wise statistics for cluster interpretation. DBSCAN clustering. Interface functions for many clustering methods implemented in R, including estimating the number of clusters with kmeans, pam and clara. Modality diagnosis for Gaussian mixtures. For an overview see package?fpc.
Maintained by Christian Hennig. Last updated 6 months ago.
2.6 match 11 stars 9.25 score 2.6k scripts 70 dependentscran
flexclust:Flexible Cluster Algorithms
The main function kcca implements a general framework for k-centroids cluster analysis supporting arbitrary distance measures and centroid computation. Further cluster methods include hard competitive learning, neural gas, and QT clustering. There are numerous visualization methods for cluster results (neighborhood graphs, convex cluster hulls, barcharts of centroids, ...), and bootstrap methods for the analysis of cluster stability.
Maintained by Bettina Grün. Last updated 17 days ago.
4.1 match 3 stars 5.81 score 52 dependentsbstewart
stm:Estimation of the Structural Topic Model
The Structural Topic Model (STM) allows researchers to estimate topic models with document-level covariates. The package also includes tools for model selection, visualization, and estimation of topic-covariate regressions. Methods developed in Roberts et al. (2014) <doi:10.1111/ajps.12103> and Roberts et al. (2016) <doi:10.1080/01621459.2016.1141684>. Vignette is Roberts et al. (2019) <doi:10.18637/jss.v091.i02>.
Maintained by Brandon Stewart. Last updated 1 year ago.
1.8 match 404 stars 12.63 score 1.6k scripts 6 dependentssachaepskamp
bootnet:Bootstrap Methods for Various Network Estimation Routines
Bootstrap methods to assess accuracy and stability of estimated network structures and centrality indices <doi:10.3758/s13428-017-0862-1>. Allows for flexible specification of any undirected network estimation procedure in R, and offers default sets for various estimation routines.
Maintained by Sacha Epskamp. Last updated 5 months ago.
2.5 match 32 stars 8.92 score 155 scripts 3 dependentsjsegrestin
comstab:Partitioning the Drivers of Stability of Ecological Communities
Contains the basic functions to apply the unified framework for partitioning the drivers of stability of ecological communities. Segrestin et al. (2024) <doi:10.1111/geb.13828>.
Maintained by Jules Segrestin. Last updated 8 months ago.
5.3 match 6 stars 4.18 scoremeierluk
hdi:High-Dimensional Inference
Implementation of multiple approaches to perform inference in high-dimensional models.
Maintained by Lukas Meier. Last updated 4 years ago.
4.9 match 2 stars 4.47 score 139 scripts 7 dependentsjohannes-titz
fastpos:Finds the Critical Sequential Point of Stability for a Pearson Correlation
Finds the critical sample size ("critical point of stability") for a correlation to stabilize in Schoenbrodt and Perugini's definition of sequential stability (see <doi:10.1016/j.jrp.2013.05.009>).
Maintained by Johannes Titz. Last updated 3 years ago.
7.6 match 2.81 score 13 scriptsbioc
Cepo:Cepo for the identification of differentially stable genes
Defining the identity of a cell is fundamental to understanding the heterogeneity of cells in response to various environmental signals and perturbations. We present Cepo, a new method to explore cell identities from single-cell RNA-sequencing data using differential stability as a new metric to define cell identity genes. Cepo computes cell-type specific gene statistics pertaining to differentially stable gene expression.
Maintained by Hani Jieun Kim. Last updated 5 months ago.
classificationgeneexpressionsinglecellsoftwaresequencingdifferentialexpression
4.6 match 4.62 score 14 scripts 1 dependentsschochastics
stabilityAI:Interact with the API of 'stability.ai'
An implementation of calls to interact with the API of 'stability.ai'. The API is documented at <https://platform.stability.ai/docs/api-reference>.
Maintained by David Schoch. Last updated 1 year ago.
7.4 match 14 stars 2.85 score 1 scriptsmhahsler
dbscan:Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Related Algorithms
A fast reimplementation of several density-based algorithms of the DBSCAN family. Includes the clustering algorithms DBSCAN (density-based spatial clustering of applications with noise) and HDBSCAN (hierarchical DBSCAN), the ordering algorithm OPTICS (ordering points to identify the clustering structure), shared nearest neighbor clustering, and the outlier detection algorithms LOF (local outlier factor) and GLOSH (global-local outlier score from hierarchies). The implementations use the kd-tree data structure (from library ANN) for faster k-nearest neighbor search. An R interface to fast kNN and fixed-radius NN search is also provided. Hahsler, Piekenbrock and Doran (2019) <doi:10.18637/jss.v091.i01>.
Maintained by Michael Hahsler. Last updated 2 months ago.
clusteringdbscandensity-based-clusteringhdbscanlofopticscpp
1.3 match 321 stars 15.62 score 1.6k scripts 84 dependentsfanhansen
creditmodel:Toolkit for Credit Modeling, Analysis and Visualization
Provides a highly efficient R tool suite for Credit Modeling, Analysis and Visualization. Contains infrastructure functionalities such as data exploration and preparation, missing values treatment, outliers treatment, variable derivation, variable selection, dimensionality reduction, grid search for hyper parameters, data mining and visualization, model evaluation, strategy analysis etc. This package is designed to make the development of binary classification models (machine learning based models as well as credit scorecard) simpler and faster. References include: (1) Refaat, M. (2011, ISBN: 9781447511199). Credit Risk Scorecard: Development and Implementation Using SAS; (2) Bezdek, James C. FCM: The fuzzy c-means clustering algorithm. Computers & Geosciences (0098-3004), <DOI:10.1016/0098-3004(84)90020-7>.
Maintained by Dongping Fan. Last updated 3 years ago.
5.9 match 4 stars 3.48 score 15 scriptsdrquestion
coglasso:Collaborative Graphical Lasso - Multi-Omics Network Reconstruction
Reconstruct networks from multi-omics data sets with the collaborative graphical lasso (coglasso) algorithm described in Albanese, A., Kohlen, W., and Behrouzi, P. (2024) <arXiv:2403.18602>. Build multiple networks using the coglasso() function, select the best one with stars_coglasso().
Maintained by Alessio Albanese. Last updated 5 months ago.
3.5 match 3 stars 5.62 score 5 scriptsboost-r
gamboostLSS:Boosting Methods for 'GAMLSS'
Boosting models for fitting generalized additive models for location, shape and scale ('GAMLSS') to potentially high dimensional data.
Maintained by Benjamin Hofner. Last updated 20 days ago.
boosting-algorithmsgamboostlssgamlssmachine-learningr-languagevariable-selection
2.3 match 26 stars 8.52 score 163 scripts 1 dependentstkonopka
umap:Uniform Manifold Approximation and Projection
Uniform manifold approximation and projection is a technique for dimension reduction. The algorithm was described by McInnes and Healy (2018) in <arXiv:1802.03426>. This package provides an interface for two implementations. One is written from scratch, including components for nearest-neighbor search and for embedding. The second implementation is a wrapper for 'python' package 'umap-learn' (requires separate installation, see vignette for more details).
Maintained by Tomasz Konopka. Last updated 11 months ago.
dimensionality-reductionumapcpp
1.5 match 132 stars 12.74 score 3.6k scripts 43 dependentsfbertran
c060:Extended Inference for Lasso and Elastic-Net Regularized Cox and Generalized Linear Models
The c060 package provides additional functions to perform stability selection, model validation and parameter tuning for glmnet models.
Maintained by Frederic Bertrand. Last updated 2 years ago.
4.3 match 3 stars 4.35 score 37 scriptsjonthegeek
stbl:Stabilize Function Arguments
A set of consistent, opinionated functions to quickly check function arguments, coerce them to the desired configuration, or deliver informative error messages when that is not possible.
Maintained by Jon Harmon. Last updated 4 months ago.
3.3 match 14 stars 5.58 score 1 scripts 6 dependentssciurus365
Isinglandr:Landscape Construction and Simulation for Ising Networks
A toolbox for constructing potential landscapes for Ising networks. The parameters of the networks can be directly supplied by users or estimated by the 'IsingFit' package by van Borkulo and Epskamp (2016) <https://CRAN.R-project.org/package=IsingFit> from empirical data. The Ising model's Boltzmann distribution is preserved for the potential landscape function. The landscape functions can be used for quantifying and visualizing the stability of network states, as well as visualizing the simulation process.
Maintained by Jingmeng Cui. Last updated 5 months ago.
5.6 match 1 stars 3.30 score 3 scriptsniklaspfister
StabilizedRegression:Stabilizing Regression and Variable Selection
Contains an implementation of 'StabilizedRegression', a regression framework for heterogeneous data introduced in Pfister et al. (2021) <arXiv:1911.01850>. The procedure uses averaging to estimate a regression of a set of predictors X on a response variable Y by enforcing stability with respect to a given environment variable. The resulting regression leads to a variable selection procedure that distinguishes between stable and unstable predictors. The package further implements a visualization technique which illustrates the trade-off between stability and predictiveness of individual predictors.
Maintained by Niklas Pfister. Last updated 3 years ago.
6.2 match 2 stars 3.00 score 1 scriptsgeobosh
uroot:Unit Root Tests for Seasonal Time Series
Seasonal unit roots and seasonal stability tests. P-values based on response surface regressions are available for both tests. P-values based on bootstrap are available for seasonal unit root tests.
Maintained by Georgi N. Boshnakov. Last updated 11 months ago.
2.3 match 2 stars 7.88 score 512 scripts 11 dependentsbioc
mixOmics:Omics Data Integration Project
Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing properties of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated, and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics etc) but also beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non-exhaustive list of methods includes variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.
Maintained by Eva Hamrud. Last updated 4 days ago.
immunooncologymicroarraysequencingmetabolomicsmetagenomicsproteomicsgenepredictionmultiplecomparisonclassificationregressionbioconductorgenomicsgenomics-datagenomics-visualizationmultivariate-analysismultivariate-statisticsomicsr-pkgr-project
1.3 match 182 stars 13.71 score 1.3k scripts 22 dependentscran
fluxweb:Estimate Energy Fluxes in Food Webs
Computes energy fluxes in trophic networks, from resources to their consumers, and can be applied to systems ranging from simple two-species interactions to highly complex food webs. It implements the approach described in Gauzens et al. (2017) <doi:10.1101/229450> to calculate energy fluxes, which are also used to calculate equilibrium stability.
Maintained by Benoit Gauzens. Last updated 6 years ago.
9.0 match 1 stars 2.00 scorerkillick
changepoint.influence:Package to Calculate the Influence of the Data on a Changepoint Segmentation
Allows users to input their data, segmentation and function used for the segmentation (and additional arguments) and the package calculates the influence of the data on the changepoint locations, see Wilms et al. (2022) <doi:10.1080/10618600.2021.2000873>. Currently this can only be used with the changepoint package functions to identify changes, but we plan to extend this. There are options for different types of graphics to assess the influence.
Maintained by Rebecca Killick. Last updated 1 year ago.
6.0 match 3.00 score 3 scriptsboost-r
FDboost:Boosting Functional Regression Models
Regression models for functional data, i.e., scalar-on-function, function-on-scalar and function-on-function regression models, are fitted by a component-wise gradient boosting algorithm. For a manual on how to use 'FDboost', see Brockhaus, Ruegamer, Greven (2017) <doi:10.18637/jss.v094.i10>.
Maintained by David Ruegamer. Last updated 3 months ago.
boostingboosting-algorithmsfunction-on-function-regressionfunction-on-scalar-regressionmachine-learningscalar-on-function-regressionvariable-selection
2.3 match 17 stars 8.00 score 98 scriptssimularia
simulariatools:Simularia Tools for the Analysis of Air Pollution Data
A set of tools developed at Simularia for Simularia, to help with pre-processing and post-processing of meteorological and air quality data.
Maintained by Giuseppe Carlino. Last updated 7 months ago.
air-qualityatmospheric-modellingpollution-predictionvisualization
4.3 match 5 stars 4.18 score 4 scriptsnathanieltwarog
braidReports:Visualize Combined Action Response Surfaces and Report BRAID Analyses
Provides functions to visualize combined action data in 'ggplot2'. Also provides functions for producing full BRAID analysis reports with custom layouts and aesthetics, using the BRAID method originally described in Twarog et al. (2016) <doi:10.1038/srep25523>.
Maintained by Nathaniel R. Twarog. Last updated 6 months ago.
3.4 match 1 stars 5.13 score 15 scriptsbioc
NormqPCR:Functions for normalisation of RT-qPCR data
Functions for the selection of optimal reference genes and the normalisation of real-time quantitative PCR data.
Maintained by James Perkins. Last updated 5 months ago.
microtitreplateassaygeneexpressionqpcr
3.6 match 4.72 score 26 scriptsncss-tech
sharpshootR:A Soil Survey Toolkit
A collection of data processing, visualization, and export functions to support soil survey operations. Many of the functions build on the `SoilProfileCollection` S4 class provided by the aqp package, extending baseline visualization to more elaborate depictions in the context of spatial and taxonomic data. While this package is primarily developed by and for the USDA-NRCS, in support of the National Cooperative Soil Survey, the authors strive for generalization sufficient to support any soil survey operation. Many of the included functions are used by the SoilWeb suite of websites and mobile applications. These functions are provided here, with additional documentation, to enable others to replicate high quality versions of these figures for their own purposes.
Maintained by Dylan Beaudette. Last updated 13 days ago.
2.0 match 18 stars 8.37 score 327 scriptsecospat
ecospat:Spatial Ecology Miscellaneous Methods
Collection of R functions and data sets for the support of spatial ecology analyses with a focus on pre, core and post modelling analyses of species distribution, niche quantification and community assembly. Written by current and former members and collaborators of the ecospat group of Antoine Guisan, Department of Ecology and Evolution (DEE) and Institute of Earth Surface Dynamics (IDYST), University of Lausanne, Switzerland. Read Di Cola et al. (2016) <doi:10.1111/ecog.02671> for details.
Maintained by Olivier Broennimann. Last updated 1 month ago.
1.8 match 32 stars 9.35 score 418 scripts 1 dependentsplangfelder
WGCNA:Weighted Correlation Network Analysis
Functions necessary to perform Weighted Correlation Network Analysis on high-dimensional data as originally described in Horvath and Zhang (2005) <doi:10.2202/1544-6115.1128> and Langfelder and Horvath (2008) <doi:10.1186/1471-2105-9-559>. Includes functions for rudimentary data cleaning, construction of correlation networks, module identification, summarization, and relating of variables and modules to sample traits. Also includes a number of utility functions for data manipulation and visualization.
Maintained by Peter Langfelder. Last updated 6 months ago.
1.7 match 54 stars 9.65 score 5.3k scripts 32 dependentsjacob-long
panelr:Regression Models and Utilities for Repeated Measures and Panel Data
Provides an object type and associated tools for storing and wrangling panel data. Implements several methods for creating regression models that take advantage of the unique aspects of panel data. Among other capabilities, automates the "within-between" (also known as "between-within" and "hybrid") panel regression specification that combines the desirable aspects of both fixed effects and random effects econometric models and fits them as multilevel models (Allison, 2009 <doi:10.4135/9781412993869.d33>; Bell & Jones, 2015 <doi:10.1017/psrm.2014.7>). These models can also be estimated via generalized estimating equations (GEE; McNeish, 2019 <doi:10.1080/00273171.2019.1602504>) and Bayesian estimation is (optionally) supported via 'Stan'. Supports estimation of asymmetric effects models via first differences (Allison, 2019 <doi:10.1177/2378023119826441>) as well as a generalized linear model extension thereof using GEE.
Maintained by Jacob A. Long. Last updated 1 year ago.
1.8 match 101 stars 8.76 score 181 scripts 1 dependentsalexchristensen
NetworkToolbox:Methods and Measures for Brain, Cognitive, and Psychometric Network Analysis
Implements network analysis and graph theory measures used in neuroscience, cognitive science, and psychology. Methods include various filtering methods and approaches such as threshold, dependency (Kenett, Tumminello, Madi, Gur-Gershgoren, Mantegna, & Ben-Jacob, 2010 <doi:10.1371/journal.pone.0015032>), Information Filtering Networks (Barfuss, Massara, Di Matteo, & Aste, 2016 <doi:10.1103/PhysRevE.94.062306>), and Efficiency-Cost Optimization (Fallani, Latora, & Chavez, 2017 <doi:10.1371/journal.pcbi.1005305>). Brain methods include the recently developed Connectome Predictive Modeling (see references in package). Also implements several network measures including local network characteristics (e.g., centrality), community-level network characteristics (e.g., community centrality), global network characteristics (e.g., clustering coefficient), and various other measures associated with the reliability and reproducibility of network analysis.
Maintained by Alexander Christensen. Last updated 2 years ago.
2.3 match 23 stars 6.99 score 101 scripts 4 dependentskirstenlandsiedel
scqe:Stability Controlled Quasi-Experimentation
Functions to implement the stability controlled quasi-experiment (SCQE) approach to study the effects of newly adopted treatments that were not assigned at random. This package contains tools to help users avoid making statistical assumptions that rely on infeasible assumptions. Methods developed in Hazlett (2019) <doi:10.1002/sim.8717>.
Maintained by Kirsten Landsiedel. Last updated 4 years ago.
15.6 match 1.00 score 3 scriptsbioc
monocle:Clustering, differential expression, and trajectory analysis for single-cell RNA-Seq
Monocle performs differential expression and time-series analysis for single-cell expression experiments. It orders individual cells according to progress through a biological process, without knowing ahead of time which genes define progress through that process. Monocle also performs differential expression analysis, clustering, visualization, and other useful tasks on single cell expression data. It is designed to work with RNA-Seq and qPCR data, but could be used with other types as well.
Maintained by Cole Trapnell. Last updated 5 months ago.
immunooncologysequencingrnaseqgeneexpressiondifferentialexpressioninfrastructuredataimportdatarepresentationvisualizationclusteringmultiplecomparisonqualitycontrolcpp
1.8 match 8.89 score 1.6k scripts 2 dependentsdesanou
mglasso:Multiscale Graphical Lasso
Inference of multiscale graphical models with a neighborhood selection approach. The method is based on solving a convex optimization problem combining Lasso and fused-group Lasso penalties. This allows simultaneous inference of a conditional independence graph and a clustering partition. The optimization is based on the Continuation with Nesterov smoothing in a Shrinkage-Thresholding Algorithm solver (Hadj-Selem et al. 2018) <doi:10.1109/TMI.2018.2829802> implemented in python.
Maintained by Edmond Sanou. Last updated 2 years ago.
3.8 match 2 stars 4.11 score 13 scriptsveradjordjilovic
penalizedclr:Integrative Penalized Conditional Logistic Regression
Implements L1 and L2 penalized conditional logistic regression with penalty factors allowing for integration of multiple data sources. Implements stability selection for variable selection.
Maintained by Vera Djordjilovic. Last updated 2 years ago.
5.7 match 1 stars 2.70 score 2 scriptsahoshiyar
ordPens:Selection, Fusion, Smoothing and Principal Components Analysis for Ordinal Variables
Selection, fusion, and/or smoothing of ordinally scaled independent variables using a group lasso, fused lasso or generalized ridge penalty, as well as non-linear principal components analysis for ordinal variables using a second-order difference/smoothing penalty.
Maintained by Aisouda Hoshiyar. Last updated 10 months ago.
4.0 match 2 stars 3.79 score 31 scriptsdatarob
panelvar:Panel Vector Autoregression
We extend two generalized method of moments (GMM) estimators to panel vector autoregression models (PVAR) with p lags of endogenous variables, predetermined and strictly exogenous variables. This general PVAR model contains the first difference GMM estimator by Holtz-Eakin et al. (1988) <doi:10.2307/1913103>, Arellano and Bond (1991) <doi:10.2307/2297968> and the system GMM estimator by Blundell and Bond (1998) <doi:10.1016/S0304-4076(98)00009-8>. We also provide specification tests (Hansen overidentification test, lag selection criterion and stability test of the PVAR polynomial) and classical structural analysis for PVAR models such as orthogonal and generalized impulse response functions, bootstrapped confidence intervals for impulse response analysis and forecast error variance decompositions.
Maintained by Robert Ferstl. Last updated 4 months ago.
5.4 match 9 stars 2.84 score 76 scriptslazappi
clustree:Visualise Clusterings at Different Resolutions
Deciding what resolution to use can be a difficult question when approaching a clustering analysis. One way to approach this problem is to look at how samples move as the number of clusters increases. This package allows you to produce clustering trees, a visualisation for interrogating clusterings as resolution increases.
Maintained by Luke Zappia. Last updated 1 year ago.
clusteringclustering-treesvisualisationvisualization
1.3 match 219 stars 11.40 score 1.9k scripts 5 dependentstidyverse
duckplyr:A 'DuckDB'-Backed Version of 'dplyr'
A drop-in replacement for 'dplyr', powered by 'DuckDB' for performance. Offers convenient utilities for working with in-memory and larger-than-memory data while retaining full 'dplyr' compatibility.
Maintained by Kirill Müller. Last updated 5 days ago.
analyticsdataframedplyrduckdbperformance
1.3 match 309 stars 11.33 score 220 scriptsr-forge
fPortfolio:Rmetrics - Portfolio Selection and Optimization
A collection of functions to optimize portfolios and to analyze them from different points of view.
Maintained by Stefan Theussl. Last updated 3 months ago.
2.3 match 1 stars 6.66 score 179 scripts 2 dependentschongwu-biostat
prclust:Penalized Regression-Based Clustering Method
Clustering is unsupervised and exploratory in nature. Yet, it can be performed through penalized regression with grouping pursuit. In this package, we provide two algorithms for fitting the penalized regression-based clustering (PRclust) with non-convex grouping penalties, such as group truncated lasso, MCP and SCAD. One algorithm is based on a quadratic penalty and the difference convex method. The other algorithm, called DC-ADD, is based on difference convex and ADMM and is more efficient. Generalized cross validation and a stability-based method are provided to select the tuning parameters. The Rand index, adjusted Rand index and Jaccard index are provided to estimate the agreement between estimated cluster memberships and the truth.
Maintained by Chong Wu. Last updated 8 years ago.
5.4 match 2.70 score 6 scriptsbioc
corral:Correspondence Analysis for Single Cell Data
Correspondence analysis (CA) is a matrix factorization method, and is similar to principal components analysis (PCA). Whereas PCA is designed for application to continuous, approximately normally distributed data, CA is appropriate for non-negative, count-based data that are in the same additive scale. The corral package implements CA for dimensionality reduction of a single matrix of single-cell data, as well as a multi-table adaptation of CA that leverages data-optimized scaling to align data generated from different sequencing platforms by projecting into a shared latent space. corral utilizes sparse matrices and a fast implementation of SVD, and can be called directly on Bioconductor objects (e.g., SingleCellExperiment) for easy pipeline integration. The package also includes additional options, including variations of CA to address overdispersion in count data (e.g., Freeman-Tukey chi-squared residual), as well as the option to apply CA-style processing to continuous data (e.g., proteomic TOF intensities) with the Hellinger distance adaptation of CA.
Maintained by Lauren Hsu. Last updated 5 months ago.
batcheffectdimensionreductiongeneexpressionpreprocessingprincipalcomponentsequencingsinglecellsoftwarevisualization
3.1 match 4.64 score 22 scriptsjpgattuso
seacarb:Seawater Carbonate Chemistry
Calculates parameters of the seawater carbonate system and assists the design of ocean acidification perturbation experiments.
Maintained by Jean-Pierre Gattuso. Last updated 1 year ago.
1.8 match 8 stars 8.27 score 350 scripts 5 dependentsygeunkim
bvhar:Bayesian Vector Heterogeneous Autoregressive Modeling
Tools to model and forecast multivariate time series including Bayesian Vector heterogeneous autoregressive (VHAR) model by Kim & Baek (2023) (<doi:10.1080/00949655.2023.2281644>). 'bvhar' can model Vector Autoregressive (VAR), VHAR, Bayesian VAR (BVAR), and Bayesian VHAR (BVHAR) models.
Maintained by Young Geun Kim. Last updated 17 days ago.
bayesianbayesian-econometricsbvareigenforecastingharpybind11pythonrcppeigentime-seriesvector-autoregressioncppopenmp
2.3 match 6 stars 6.42 score 25 scriptsjhorzek
lessSEM:Non-Smooth Regularization for Structural Equation Models
Provides regularized structural equation modeling (regularized SEM) with non-smooth penalty functions (e.g., lasso) building on 'lavaan'. The package is heavily inspired by the ['regsem'](<https://github.com/Rjacobucci/regsem>) and ['lslx'](<https://github.com/psyphh/lslx>) packages.
Maintained by Jannik H. Orzek. Last updated 1 year ago.
lassopsychometricsregularizationregularized-structural-equation-modelsemstructural-equation-modelingopenblascppopenmp
2.0 match 7 stars 7.19 score 223 scriptsroga11
MSTest:Hypothesis Testing for Markov Switching Models
Implementation of hypothesis testing procedures described in Hansen (1992) <doi:10.1002/jae.3950070506>, Carrasco, Hu, & Ploberger (2014) <doi:10.3982/ECTA8609>, Dufour & Luger (2017) <doi:10.1080/07474938.2017.1307548>, and Rodriguez Rondon & Dufour (2024) <https://grodriguezrondon.com/files/RodriguezRondon_Dufour_2024_MonteCarlo_LikelihoodRatioTest_MarkovSwitchingModels_20241015.pdf> that can be used to identify the number of regimes in Markov switching models.
Maintained by Gabriel Rodriguez Rondon. Last updated 20 days ago.
autoregressivebootstraphypothesis-testinglikelihood-ratio-testmarkov-chainmomentsmonte-carlonon-linearregime-switchingtime-seriesopenblascppopenmp
3.4 match 5 stars 4.18 score 3 scriptsr-forge
crch:Censored Regression with Conditional Heteroscedasticity
Different approaches to censored or truncated regression with conditional heteroscedasticity are provided. First, continuous distributions can be used for the (right and/or left censored or truncated) response with separate linear predictors for the mean and variance. Second, cumulative link models for ordinal data (obtained by interval-censoring continuous data) can be employed for heteroscedastic extended logistic regression (HXLR). In the latter type of models, the intercepts depend on the thresholds that define the intervals. Infrastructure for working with censored or truncated normal, logistic, and Student-t distributions, i.e., d/p/q/r functions and distributions3 objects, is also provided.
Maintained by Achim Zeileis. Last updated 3 days ago.
1.8 match 8.11 score 93 scripts 2 dependentslozalojo
mem:The Moving Epidemic Method
The Moving Epidemic Method, created by T Vega and JE Lozano (2012, 2015) <doi:10.1111/j.1750-2659.2012.00422.x>, <doi:10.1111/irv.12330>, allows the weekly assessment of the epidemic and intensity status to help in routine respiratory infections surveillance in health systems. Allows the comparison of different epidemic indicators, timing and shape with past epidemics and across different regions or countries with different surveillance systems. Also, it gives a measure of the performance of the method in terms of sensitivity and specificity of the alert week.
Maintained by Jose E. Lozano. Last updated 2 years ago.
2.3 match 14 stars 6.24 score 82 scripts 1 dependentscrp2a
gamma:Dose Rate Estimation from in-Situ Gamma-Ray Spectrometry Measurements
Process in-situ Gamma-Ray Spectrometry for Luminescence Dating. This package allows importing, inspecting and correcting the energy shifts of gamma-ray spectra. It provides methods for estimating the gamma dose rate by the use of a calibration curve as described in Mercier and Falguères (2007). The package only supports Canberra CNF and TKA and Kromek SPE files.
Maintained by Archéosciences Bordeaux. Last updated 6 months ago.
archaeometrygamma-spectrometrygeochronologyluminescence-dating
2.0 match 6 stars 6.99 score 11 scripts 1 dependentsfbertran
plsRglm:Partial Least Squares Regression for Generalized Linear Models
Provides (weighted) Partial Least Squares Regression for generalized linear models and repeated k-fold cross-validation of such models using various criteria <arXiv:1810.01005>. It allows for missing data in the explanatory variables. Bootstrap confidence interval constructions are also available.
Maintained by Frederic Bertrand. Last updated 2 years ago.
1.8 match 16 stars 7.75 score 103 scripts 5 dependentslbelzile
longevity:Statistical Methods for the Analysis of Excess Lifetimes
A collection of parametric and nonparametric methods for the analysis of survival data. Parametric families implemented include Gompertz-Makeham, exponential and generalized Pareto models and extended models. The package includes an implementation of the nonparametric maximum likelihood estimator for arbitrary truncation and censoring pattern based on Turnbull (1976) <doi:10.1111/j.2517-6161.1976.tb01597.x>, along with graphical goodness-of-fit diagnostics. Parametric models for positive random variables and peaks over threshold models based on extreme value theory are described in Rootzén and Zholud (2017) <doi:10.1007/s10687-017-0305-5>; Belzile et al. (2021) <doi:10.1098/rsos.202097> and Belzile et al. (2022) <doi:10.1146/annurev-statistics-040120-025426>.
Maintained by Leo Belzile. Last updated 4 months ago.
4.0 match 3.48 score 15 scriptsasa12138
MetaNet:Network Analysis for Omics Data
Comprehensive network analysis package. Computes correlation networks quickly and accelerates many analyses through parallel computing. Supports multi-omics data and sub-network search. Handles larger data, with more than 10,000 nodes in each omics layer. Offers various layout methods for multi-omics networks and interfaces to other software ('Gephi', 'Cytoscape', 'ggplot2') for easy visualization. Provides comprehensive calculation of topology indexes, including ecological network stability.
Maintained by Chen Peng. Last updated 11 days ago.
dataimportnetwork analysisomicssoftwarevisualization
2.5 match 13 stars 5.51 score 9 scriptscran
gfboost:Gradient-Free Gradient Boosting
Implementation of routines of the author's PhD thesis on gradient-free Gradient Boosting (Werner, Tino (2020) "Gradient-Free Gradient Boosting", URL '<https://oops.uni-oldenburg.de/id/eprint/4290>').
Maintained by Tino Werner. Last updated 3 years ago.
6.8 match 2.00 scorealexanderlange53
svars:Data-Driven Identification of SVAR Models
Implements data-driven identification methods for structural vector autoregressive (SVAR) models as described in Lange et al. (2021) <doi:10.18637/jss.v097.i05>. Based on an existing VAR model object (provided by e.g. VAR() from the 'vars' package), the structural impact matrix is obtained via data-driven identification techniques (i.e. changes in volatility (Rigobon, R. (2003) <doi:10.1162/003465303772815727>), patterns of GARCH (Normandin, M., Phaneuf, L. (2004) <doi:10.1016/j.jmoneco.2003.11.002>), independent component analysis (Matteson, D. S, Tsay, R. S., (2013) <doi:10.1080/01621459.2016.1150851>), least dependent innovations (Herwartz, H., Ploedt, M., (2016) <doi:10.1016/j.jimonfin.2015.11.001>), smooth transition in variances (Luetkepohl, H., Netsunajev, A. (2017) <doi:10.1016/j.jedc.2017.09.001>) or non-Gaussian maximum likelihood (Lanne, M., Meitz, M., Saikkonen, P. (2017) <doi:10.1016/j.jeconom.2016.06.002>)).
Maintained by Alexander Lange. Last updated 2 years ago.
1.9 match 46 stars 7.22 score 130 scriptsthiloklein
matchingMarkets:Analysis of Stable Matchings
Implements structural estimators to correct for the sample selection bias from observed outcomes in matching markets. This includes one-sided matching of agents into groups as well as two-sided matching of students to schools. The package also contains algorithms to find stable matchings in the three most common matching problems: the stable roommates problem, the college admissions problem, and the house allocation problem.
Maintained by Thilo Klein. Last updated 5 years ago.
2.3 match 40 stars 5.99 score 49 scriptstscnlab
LightLogR:Process Data from Wearable Light Loggers and Optical Radiation Dosimeters
Import, processing, validation, and visualization of personal light exposure measurement data from wearable devices. The package implements features such as the import of data and metadata files, conversion of common file formats, validation of light logging data, verification of crucial metadata, calculation of common parameters, and semi-automated analysis and visualization.
Maintained by Johannes Zauner. Last updated 24 days ago.
dosimetrylighttime-series-analysiswearable-deviceswearable-sensors
2.3 match 12 stars 5.91 score 28 scriptsjeksterslab
simStateSpace:Simulate Data from State Space Models
Provides a streamlined and user-friendly framework for simulating data in state space models, particularly when the number of subjects/units (n) exceeds one, a scenario commonly encountered in social and behavioral sciences. For an introduction to state space models in social and behavioral sciences, refer to Chow, Ho, Hamaker, and Dolan (2010) <doi:10.1080/10705511003661553>.
Maintained by Ivan Jacob Agaloos Pesigan. Last updated 30 days ago.
simulationstate-space-modelopenblascppopenmp
2.3 match 1 stars 5.78 score 75 scripts 2 dependentschristinaheinze
CompareCausalNetworks:Interface to Diverse Estimation Methods of Causal Networks
Unified interface for the estimation of causal networks, including the methods 'backShift' (from package 'backShift'), 'bivariateANM' (bivariate additive noise model), 'bivariateCAM' (bivariate causal additive model), 'CAM' (causal additive model) (from package 'CAM'; the package is temporarily unavailable on the CRAN repository; formerly available versions can be obtained from the archive), 'hiddenICP' (invariant causal prediction with hidden variables), 'ICP' (invariant causal prediction) (from package 'InvariantCausalPrediction'), 'GES' (greedy equivalence search), 'GIES' (greedy interventional equivalence search), 'LINGAM', 'PC' (PC Algorithm), 'FCI' (fast causal inference), 'RFCI' (really fast causal inference) (all from package 'pcalg') and regression.
Maintained by Christina Heinze-Deml. Last updated 5 years ago.
3.3 match 22 stars 3.90 score 12 scripts 1 dependentsfreezenik
bamlss:Bayesian Additive Models for Location, Scale, and Shape (and Beyond)
Infrastructure for estimating probabilistic distributional regression models in a Bayesian framework. The distribution parameters may capture location, scale, shape, etc. and every parameter may depend on complex additive terms (fixed, random, smooth, spatial, etc.) similar to a generalized additive model. The conceptual and computational framework is introduced in Umlauf, Klein, Zeileis (2019) <doi:10.1080/10618600.2017.1407325> and the R package in Umlauf, Klein, Simon, Zeileis (2021) <doi:10.18637/jss.v100.i04>.
Maintained by Nikolaus Umlauf. Last updated 5 months ago.
2.3 match 1 stars 5.76 score 239 scripts 5 dependentsbioc
Rtreemix:Rtreemix: Mutagenetic trees mixture models.
Rtreemix is a package that offers an environment for estimating the mutagenetic trees mixture models from cross-sectional data and using them for various predictions. It includes functions for fitting the trees mixture models, likelihood computations, model comparisons, waiting time estimations, stability analysis, etc.
Maintained by Jasmina Bogojeska. Last updated 26 days ago.
4.5 match 2.86 score 12 scriptscran
DDHFm:Variance Stabilization by Data-Driven Haar-Fisz (for Microarrays)
Contains the normalizing and variance stabilizing Data-Driven Haar-Fisz algorithm. Also contains related algorithms for simulating from certain microarray gene intensity models and evaluation of certain transformations. Contains cDNA and shipping credit flow data.
Maintained by Guy Nason. Last updated 5 months ago.
6.4 match 2.00 scorejaredhuling
fastglm:Fast and Stable Fitting of Generalized Linear Models using 'RcppEigen'
Fits generalized linear models efficiently using 'RcppEigen'. The iteratively reweighted least squares implementation utilizes the step-halving approach of Marschner (2011) <doi:10.32614/RJ-2011-012> to help safeguard against convergence issues.
Maintained by Jared Huling. Last updated 3 years ago.
1.5 match 57 stars 8.47 score 59 scripts 13 dependentssaulo-chaves
ProbBreed:Probability Theory for Selecting Candidates in Plant Breeding
Use probability theory under the Bayesian framework for calculating the risk of selecting candidates in a multi-environment context. Contained are functions used to fit a Bayesian multi-environment model (based on the available presets), extract posterior values and maximum posterior values, compute the variance components, check the model’s convergence, and calculate the probabilities. For both across and within-environments scopes, the package computes the probability of superior performance and the pairwise probability of superior performance. Furthermore, the probability of superior stability and the pairwise probability of superior stability across environments are estimated. A joint probability of superior performance and stability is also provided.
Maintained by Saulo Chaves. Last updated 1 month ago.
2.8 match 8 stars 4.45 score 4 scriptsjtilly
matchingR:Matching Algorithms in R and C++
Computes matching algorithms quickly using Rcpp. Implements the Gale-Shapley Algorithm to compute the stable matching for two-sided markets, such as the stable marriage problem and the college-admissions problem. Implements Irving's Algorithm for the stable roommate problem. Implements the top trading cycle algorithm for the indivisible goods trading problem.
Maintained by Jan Tilly. Last updated 6 months ago.
1.7 match 50 stars 7.09 score 41 scripts 2 dependents
mattcefalu
twang:Toolkit for Weighting and Analysis of Nonequivalent Groups
Provides functions for propensity score estimation and weighting, nonresponse weighting, and diagnosis of the weights.
Maintained by Matthew Cefalu. Last updated 3 years ago.
1.7 match 6 stars 6.83 score 169 scripts 10 dependents
uupharmacometrics
xpose4:Diagnostics for Nonlinear Mixed-Effect Models
A model building aid for nonlinear mixed-effects (population) model analysis using NONMEM, facilitating data set checkout, exploration and visualization, model diagnostics, candidate covariate identification and model comparison. The methods are described in Keizer et al. (2013) <doi:10.1038/psp.2013.24>, and Jonsson et al. (1999) <doi:10.1016/s0169-2607(98)00067-4>.
Maintained by Andrew C. Hooker. Last updated 1 year ago.
diagnosticsnonmempharmacometricspopulation-modelxpose
1.6 match 35 stars 7.30 score 315 scripts
bioc
MOSim:Multi-Omics Simulation (MOSim)
MOSim package simulates multi-omic experiments that mimic regulatory mechanisms within the cell, allowing flexible experimental design including time course and multiple groups.
Maintained by Sonia Tarazona. Last updated 5 months ago.
softwaretimecourseexperimentaldesignrnaseqcpp
1.5 match 9 stars 7.46 score 11 scripts
bioc
UMI4Cats:UMI4Cats: Processing, analysis and visualization of UMI-4C chromatin contact data
UMI-4C is a technique that allows characterization of 3D chromatin interactions with a bait of interest, taking advantage of a sonication step to produce unique molecular identifiers (UMIs) that help remove duplication bias, thus allowing a better differential comparison of chromatin interactions between conditions. This package allows processing of UMI-4C data, starting from FastQ files provided by the sequencing facility. It provides two statistical methods for detecting differential contacts and includes a visualization function to plot integrated information from a UMI-4C assay.
Maintained by Mireia Ramos-Rodriguez. Last updated 5 months ago.
qualitycontrolpreprocessingalignmentnormalizationvisualizationsequencingcoveragechromatinchromatin-interactiongenomicsumi4c
2.0 match 5 stars 5.57 score 7 scripts
g-rho
clustMixType:k-Prototypes Clustering for Mixed Variable-Type Data
Functions to perform k-prototypes partitioning clustering for mixed variable-type data according to Z.Huang (1998): Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Variables, Data Mining and Knowledge Discovery 2, 283-304.
Maintained by Gero Szepannek. Last updated 9 months ago.
1.8 match 1 star 6.07 score 111 scripts 8 dependents
jackdunnnz
iai:Interface to 'Interpretable AI' Modules
An interface to the algorithms of 'Interpretable AI' <https://www.interpretable.ai> from the R programming language. 'Interpretable AI' provides various modules, including 'Optimal Trees' for classification, regression, prescription and survival analysis, 'Optimal Imputation' for missing data imputation and outlier detection, and 'Optimal Feature Selection' for exact sparse regression. The 'iai' package is an open-source project. The 'Interpretable AI' software modules are proprietary products, but free academic and evaluation licenses are available.
Maintained by Jack Dunn. Last updated 5 months ago.
5.4 match 1 star 2.00 score 7 scripts
doserjef
rFIA:Estimation of Forest Variables using the FIA Database
The goal of 'rFIA' is to increase the accessibility and use of the United States Forest Service (USFS) Forest Inventory and Analysis (FIA) Database by providing a user-friendly, open source toolkit to easily query and analyze FIA Data. Designed to accommodate a wide range of potential user objectives, 'rFIA' simplifies the estimation of forest variables from the FIA Database and allows all R users (experts and newcomers alike) to unlock the flexibility inherent to the Enhanced FIA design. Specifically, 'rFIA' improves accessibility to the spatial-temporal estimation capacity of the FIA Database by producing space-time indexed summaries of forest variables within user-defined population boundaries. Direct integration with other popular R packages (e.g., 'dplyr', 'tidyr', and 'sf') facilitates efficient space-time query and data summary, and supports common data representations and API design. The package implements design-based estimation procedures outlined by Bechtold & Patterson (2005) <doi:10.2737/SRS-GTR-80>, and has been validated against estimates and sampling errors produced by FIA 'EVALIDator'. Current development is focused on the implementation of spatially-enabled model-assisted and model-based estimators to improve population, change, and ratio estimates.
Maintained by Jeffrey Doser. Last updated 9 days ago.
compute-estimatesfiafia-databasefia-datamartforest-inventoryforest-variablesinventoriesspace-timespatial
1.8 match 49 stars 5.93 score
ms609
TreeSearch:Phylogenetic Analysis with Discrete Character Data
Reconstruct phylogenetic trees from discrete data. Inapplicable character states are handled using the algorithm of Brazeau, Guillerme and Smith (2019) <doi:10.1093/sysbio/syy083> with the "Morphy" library, under equal or implied step weights. Contains a "shiny" user interface for interactive tree search and exploration of results, including character visualization, rogue taxon detection, tree space mapping, and cluster consensus trees (Smith 2022a, b) <doi:10.1093/sysbio/syab099>, <doi:10.1093/sysbio/syab100>. Profile Parsimony (Faith and Trueman, 2001) <doi:10.1080/10635150118627>, Successive Approximations (Farris, 1969) <doi:10.2307/2412182> and custom optimality criteria are implemented.
Maintained by Martin R. Smith. Last updated 3 days ago.
bioinformaticsmorphological-analysisphylogeneticsresearch-tooltree-searchcpp
1.3 match 7 stars 7.89 score 51 scripts
grvanderploeg
parafac4microbiome:Parallel Factor Analysis Modelling of Longitudinal Microbiome Data
Creation and selection of PARAllel FACtor Analysis (PARAFAC) models of longitudinal microbiome data. You can import your own data with our import functions or use one of the example datasets to create your own PARAFAC models. Selection of the optimal number of components can be done using assessModelQuality() and assessModelStability(). The selected model can then be plotted using plotPARAFACmodel(). The Parallel Factor Analysis method was originally described by Carroll and Chang (1970) <doi:10.1007/BF02310791> and Harshman (1970) <https://www.psychology.uwo.ca/faculty/harshman/wpppfac0.pdf>.
Maintained by Geert Roelof van der Ploeg. Last updated 20 days ago.
dimensionality-reductionmicrobiomemicrobiome-datamultiwaymultiway-algorithmsparallel-factor-analysis
1.7 match 6 stars 6.31 score 13 scripts
piplus2
SPUTNIK:Spatially Automatic Denoising for Imaging Mass Spectrometry Toolkit
Set of tools for peak filtering of mass spectrometry imaging data based on spatial distribution of signal. Given a region-of-interest, representing the spatial region where the informative signal is expected to be localized, a series of filters determine which peak signals are characterized by an implausible spatial distribution. The filters reduce the dataset dimension and increase its information vs noise ratio, improving the quality of the unsupervised analysis results, reducing data dimension and simplifying the chemical interpretation. The methods are described in Inglese P. et al (2019) <doi:10.1093/bioinformatics/bty622>.
Maintained by Paolo Inglese. Last updated 11 months ago.
bioinformaticsdesi-msiimage-processingmaldi-msimaldi-tof-msmass-spectrometrymass-spectrometry-imaging
2.0 match 4 stars 5.24 score 43 scripts
anspiess
reverseR:Linear Regression Stability to Significance Reversal
Tests linear regressions for significance reversal through leave-one(multiple)-out.
Maintained by Andrej-Nikolai Spiess. Last updated 3 months ago.
3.0 match 3 stars 3.48 score
bioc
scGPS:A complete analysis of single cell subpopulations, from identifying subpopulations to analysing their relationship (scGPS = single cell Global Predictions of Subpopulation)
The package implements two main algorithms to answer two key questions: a SCORE (Stable Clustering at Optimal REsolution) to find subpopulations, followed by scGPS to investigate the relationships between subpopulations.
Maintained by Quan Nguyen. Last updated 5 months ago.
singlecellclusteringdataimportsequencingcoverageopenblascpp
2.0 match 4 stars 5.20 score 7 scripts
bioc
CytoGLMM:Conditional Differential Analysis for Flow and Mass Cytometry Experiments
The CytoGLMM R package implements two multiple regression strategies: A bootstrapped generalized linear model (GLM) and a generalized linear mixed model (GLMM). Most current data analysis tools compare expressions across many computationally discovered cell types. CytoGLMM focuses on just one cell type. Our narrower field of application allows us to define a more specific statistical model with easier to control statistical guarantees. As a result, CytoGLMM finds differential proteins in flow and mass cytometry data while reducing biases arising from marker correlations and safeguarding against false discoveries induced by patient heterogeneity.
Maintained by Christof Seiler. Last updated 5 months ago.
flowcytometryproteomicssinglecellcellbasedassayscellbiologyimmunooncologyregressionstatisticalmethodsoftware
1.8 match 2 stars 5.68 score 1 script 1 dependent
tushiqi
MAnorm2:Tools for Normalizing and Comparing ChIP-seq Samples
Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is the premier technology for profiling genome-wide localization of chromatin-binding proteins, including transcription factors and histones with various modifications. This package provides a robust method for normalizing ChIP-seq signals across individual samples or groups of samples. It also designs a self-contained system of statistical models for calling differential ChIP-seq signals between two or more biological conditions as well as for calling hypervariable ChIP-seq signals across samples. Refer to Tu et al. (2021) <doi:10.1101/gr.262675.120> and Chen et al. (2022) <doi:10.1186/s13059-022-02627-9> for associated statistical details.
Maintained by Shiqi Tu. Last updated 2 years ago.
chip-seqdifferential-analysisempirical-bayeswinsorize-values
1.8 match 32 stars 5.48 score 19 scripts
msesia
knockoff:The Knockoff Filter for Controlled Variable Selection
The knockoff filter is a general procedure for controlling the false discovery rate (FDR) when performing variable selection. For more information, see the website below and the accompanying paper: Candes et al., "Panning for gold: model-X knockoffs for high-dimensional controlled variable selection", J. R. Statist. Soc. B (2018) 80, 3, pp. 551-577.
Maintained by Matteo Sesia. Last updated 3 years ago.
1.8 match 2 stars 5.35 score 248 scripts 5 dependents
bioc
DEP:Differential Enrichment analysis of Proteomics data
This package provides an integrated analysis workflow for robust and reproducible analysis of mass spectrometry proteomics data for differential protein expression or differential enrichment. It requires tabular input (e.g. txt files) as generated by quantitative analysis software of raw mass spectrometry data, such as MaxQuant or IsobarQuant. Functions are provided for data preparation, filtering, variance normalization and imputation of missing values, as well as statistical testing of differentially enriched / expressed proteins. It also includes tools to check intermediate steps in the workflow, such as normalization and missing values imputation. Finally, visualization tools are provided to explore the results, including heatmap, volcano plot and barplot representations. For scientists with limited experience in R, the package also contains wrapper functions that entail the complete analysis workflow and generate a report. Even easier to use are the interactive Shiny apps that are provided by the package.
Maintained by Arne Smits. Last updated 5 months ago.
immunooncologyproteomicsmassspectrometrydifferentialexpressiondatarepresentation
1.3 match 7.10 score 628 scripts
kechrislab
HeritSeq:Heritability of Gene Expression for Next-Generation Sequencing
Statistical framework to analyze heritability of gene expression based on next-generation sequencing data and simulating sequencing reads. Variance partition coefficients (VPC) are computed using linear mixed effects and generalized linear mixed effects models. Compound Poisson and negative binomial models are included. Reference: Rudra, Pratyaydipta, et al. "Model based heritability scores for high-throughput sequencing data." BMC bioinformatics 18.1 (2017): 143.
Maintained by W. Jenny Shi. Last updated 6 years ago.
3.6 match 2 stars 2.60 score 8 scripts
f-rousset
spaMM:Mixed-Effect Models, with or without Spatial Random Effects
Inference based on models with or without spatially-correlated random effects, multivariate responses, or non-Gaussian random effects (e.g., Beta). Variation in residual variance (heteroscedasticity) can itself be represented by a mixed-effect model. Both classical geostatistical models (Rousset and Ferdy 2014 <doi:10.1111/ecog.00566>), and Markov random field models on irregular grids (as considered in the 'INLA' package, <https://www.r-inla.org>), can be fitted, with distinct computational procedures exploiting the sparse matrix representations for the latter case and other autoregressive models. Laplace approximations are used for likelihood or restricted likelihood. Penalized quasi-likelihood and other variants discussed in the h-likelihood literature (Lee and Nelder 2001 <doi:10.1093/biomet/88.4.987>) are also implemented.
Maintained by François Rousset. Last updated 9 months ago.
1.9 match 4.94 score 208 scripts 5 dependents
bioc
TrajectoryGeometry:This Package Discovers Directionality in Time and Pseudo-times Series of Gene Expression Patterns
Given a time series or pseudo-times series of gene expression data, we might wish to know: Do the changes in gene expression in these data exhibit directionality? Are there turning points in this directionality? Do different subsets of the data move in different directions? This package uses spherical geometry to probe these sorts of questions. In particular, if we are looking at (say) the first n dimensions of the PCA of gene expression, directionality can be detected as the clustering of points on the (n-1)-dimensional sphere.
Maintained by Michael Shapiro. Last updated 5 months ago.
biologicalquestionstatisticalmethodgeneexpressionsinglecell
2.0 match 4.60 score 7 scripts
rwehrens
BioMark:Find Biomarkers in Two-Class Discrimination Problems
Variable selection methods are provided for several classification methods: the lasso/elastic net, PCLDA, PLSDA, and several t-tests. Two approaches for selecting cutoffs can be used, one based on the stability of model coefficients under perturbation, and the other on higher criticism.
Maintained by Ron Wehrens. Last updated 10 years ago.
3.9 match 2.32 score 21 scripts
andrija-djurovic
PDtoolkit:Collection of Tools for PD Rating Model Development and Validation
The goal of this package is to cover the most common steps in probability of default (PD) rating model development and validation. The main procedures available are those that refer to univariate, bivariate, multivariate analysis, calibration and validation. Along with the accompanying 'monobin' and 'monobinShiny' packages, 'PDtoolkit' provides functions suitable for different data transformation and modeling tasks, such as imputation, monotonic binning of numeric risk factors, binning of categorical risk factors, weights of evidence (WoE) and information value (IV) calculations, WoE coding (replacement of risk factor modalities with WoE values), risk factor clustering, area under curve (AUC) calculation and others. Additionally, the package provides a set of validation functions for testing the homogeneity, heterogeneity, discriminatory and predictive power of the model.
Maintained by Andrija Djurovic. Last updated 1 year ago.
1.9 match 14 stars 4.78 score 86 scripts
bertcarnell
triangle:Distribution Functions and Parameter Estimates for the Triangle Distribution
Provides the "r, q, p, and d" distribution functions for the triangle distribution. Also includes maximum likelihood estimation of parameters.
Maintained by Rob Carnell. Last updated 7 months ago.
1.1 match 3 stars 8.01 score 293 scripts 21 dependents
haoranpopevo
ecode:Ordinary Differential Equation Systems in Ecology
A framework to simulate ecosystem dynamics through ordinary differential equations (ODEs). You create an ODE model, tell 'ecode' to explore its behaviour, and perform numerical simulations on the model. 'ecode' also allows you to fit model parameters by machine learning algorithms. Potential users include researchers who are interested in the dynamics of ecological communities and biogeochemical cycles.
Maintained by Haoran Wu. Last updated 8 months ago.
2.3 match 7 stars 3.85 score 3 scripts
alec-stashevsky
blocklength:Select an Optimal Block-Length to Bootstrap Dependent Data (Block Bootstrap)
A set of functions to select the optimal block-length for a dependent bootstrap (block-bootstrap). Includes the Hall, Horowitz, and Jing (1995) <doi:10.1093/biomet/82.3.561> subsampling-based cross-validation method, the Politis and White (2004) <doi:10.1081/ETC-120028836> Spectral Density Plug-in method, including the Patton, Politis, and White (2009) <doi:10.1080/07474930802459016> correction, and the Lahiri, Furukawa, and Lee (2007) <doi:10.1016/j.stamet.2006.08.002> nonparametric plug-in method, with a corresponding set of S3 plot methods.
Maintained by Alec Stashevsky. Last updated 8 days ago.
block-bootstrapblock-resamplingblocklengthbootbootstrapdepedent-bootstrapdependenthorowitzinferencemebootpolitisresamplestatstimetime-seriestime-series-analysistseries
1.8 match 4 stars 4.78 score 8 scripts
metabocomp
MUVR2:Multivariate Methods with Unbiased Variable Selection
Predictive multivariate modelling for metabolomics. Types: classification and regression. Methods: Partial Least Squares, Random Forest and Elastic Net. Data structures: paired and unpaired. Validation: repeated double cross-validation (Westerhuis et al. (2008) <doi:10.1007/s11306-007-0099-6>, Filzmoser et al. (2009) <doi:10.1002/cem.1225>). Variable selection: performed internally, through tuning in the inner cross-validation loop.
Maintained by Yingxiao Yan. Last updated 6 months ago.
2.3 match 1 star 3.81 score 1 script
wanchanglin
mt:Metabolomics Data Analysis Toolbox
Functions for metabolomics data analysis: data preprocessing, orthogonal signal correction, PCA analysis, PCA-DA analysis, PLS-DA analysis, classification, feature selection, correlation analysis, data visualisation and re-sampling strategies.
Maintained by Wanchang Lin. Last updated 1 year ago.
1.9 match 3 stars 4.57 score 50 scripts
lucaskook
tramicp:Model-Based Causal Feature Selection for General Response Types
Extends invariant causal prediction (Peters et al., 2016, <doi:10.1111/rssb.12167>) to generalized linear and transformation models (Hothorn et al., 2018, <doi:10.1111/sjos.12291>). The methodology is described in Kook et al. (2023, <doi:10.1080/01621459.2024.2395588>).
Maintained by Lucas Kook. Last updated 1 month ago.
2.0 match 7 stars 4.24 score
xcding1212
Sie2nts:Sieve Methods for Non-Stationary Time Series
We provide functions for estimation and inference of locally-stationary time series using sieve methods and a bootstrapping procedure. In addition, the package contains functions to generate Daubechies and Coiflet wavelets by the cascade algorithm and to visualize data.
Maintained by Xiucai Ding. Last updated 2 years ago.
3.4 match 2.48 score 2 scripts 1 dependent
roaldarbol
animovement:An R toolbox for analysing animal movement across space and time
An R toolbox for analysing animal movement across space and time.
Maintained by Mikkel Roald-Arbøl. Last updated 2 months ago.
animal-behaviouranimal-movementneuroethologyneuroscience
1.8 match 10 stars 4.81 score 8 scripts
bioc
spillR:Spillover Compensation in Mass Cytometry Data
Channel interference in mass cytometry can cause spillover and may result in miscounting of protein markers. We develop a nonparametric finite mixture model and use the mixture components to estimate the probability of spillover. We implement our method using expectation-maximization to fit the mixture model.
Maintained by Marco Guazzini. Last updated 5 months ago.
flowcytometryimmunooncologymassspectrometrypreprocessingsinglecellsoftwarestatisticalmethodvisualizationregression
1.9 match 4.48 score 3 scripts
armadillouqam
ClusterStability:Assessment of Stability of Individual Objects or Clusters in Partitioning Solutions
Allows one to assess the stability of individual objects, clusters and whole clustering solutions based on repeated runs of the K-means and K-medoids partitioning algorithms.
Maintained by Etienne Lord. Last updated 2 years ago.
8.2 match 1.00 score 6 scripts
cran
EngrExpt:Data sets from "Introductory Statistics for Engineering Experimentation"
Datasets from Nelson, Coffin and Copeland "Introductory Statistics for Engineering Experimentation" (Elsevier, 2003) with sample code.
Maintained by Douglas Bates. Last updated 16 years ago.
4.0 match 2.05 score 113 scripts
cran
ctrlGene:Assess the Stability of Candidate Housekeeping Genes
A simple way to assess the stability of candidate housekeeping genes is implemented in this package.
Maintained by Shanliang Zhong. Last updated 6 years ago.
5.6 match 1.48 score 1 dependent
mandymejia
fMRIscrub:Scrubbing and Other Data Cleaning Routines for fMRI
Data-driven fMRI denoising with projection scrubbing (Pham et al (2022) <doi:10.1016/j.neuroimage.2023.119972>). Also includes routines for DVARS (Derivatives VARianceS) (Afyouni and Nichols (2018) <doi:10.1016/j.neuroimage.2017.12.098>), motion scrubbing (Power et al (2012) <doi:10.1016/j.neuroimage.2011.10.018>), aCompCor (anatomical Components Correction) (Muschelli et al (2014) <doi:10.1016/j.neuroimage.2014.03.028>), detrending, and nuisance regression. Projection scrubbing is also applicable to other outlier detection tasks involving high-dimensional data.
Maintained by Amanda Mejia. Last updated 2 years ago.
1.8 match 4 stars 4.56 score 15 scripts 1 dependent
blansche
fdm2id:Data Mining and R Programming for Beginners
Contains functions to simplify the use of data mining methods (classification, regression, clustering, etc.), for students and beginners in R programming. Various R packages are used and wrappers are built around the main functions, to standardize the use of data mining methods (input/output): it brings a certain loss of flexibility, but also a gain of simplicity. The package name came from the French "Fouille de Données en Master 2 Informatique Décisionnelle".
Maintained by Alexandre Blansché. Last updated 2 years ago.
5.0 match 1 star 1.62 score 42 scripts
jeremygelb
geocmeans:Implementing Methods for Spatial Fuzzy Unsupervised Classification
Provides functions to apply spatial fuzzy unsupervised classification, visualize and interpret results. This method is well suited when the user wants to analyze data with a fuzzy clustering algorithm and to account for the spatial dimension of the dataset. In addition, indexes for estimating the spatial consistency and classification quality are proposed. The methods were originally proposed in the field of brain imagery (see Cai et al. 2007 <doi:10.1016/j.patcog.2006.07.011> and Zhao et al. 2013 <doi:10.1016/j.dsp.2012.09.016>) and recently applied in geography (see Gelb and Apparicio <doi:10.4000/cybergeo.36414>).
Maintained by Jeremy Gelb. Last updated 4 months ago.
clusteringcmeansfuzzy-classification-algorithmsspatial-analysisspatial-fuzzy-cmeansunsupervised-learningcppopenmp
1.3 match 27 stars 6.08 score 90 scripts
bioc
reconsi:Resampling Collapsed Null Distributions for Simultaneous Inference
Improves simultaneous inference under dependence of tests by estimating a collapsed null distribution through resampling. Accounting for the dependence between tests increases the power while reducing the variability of the false discovery proportion. This dependence is common in genomics applications, e.g. when combining flow cytometry measurements with microbiome sequence counts.
Maintained by Stijn Hawinkel. Last updated 5 months ago.
metagenomicsmicrobiomemultiplecomparisonflowcytometry
1.8 match 2 stars 4.60 score 2 scripts
bdwilliamson
flevr:Flexible, Ensemble-Based Variable Selection with Potentially Missing Data
Perform variable selection in settings with possibly missing data based on extrinsic (algorithm-specific) and intrinsic (population-level) variable importance. Uses a Super Learner ensemble to estimate the underlying prediction functions that give rise to estimates of variable importance. For more information about the methods, please see Williamson and Huang (2023+) <arXiv:2202.12989>.
Maintained by Brian D. Williamson. Last updated 1 year ago.
1.7 match 5 stars 4.88 score 2 scripts
myaseen208
baystability:Bayesian Stability Analysis of Genotype by Environment Interaction (GEI)
Performs general Bayesian estimation method of linear–bilinear models for genotype × environment interaction. The method is explained in Perez-Elizalde, S., Jarquin, D., and Crossa, J. (2011) (<doi:10.1007/s13253-011-0063-9>).
Maintained by Muhammad Yaseen. Last updated 5 months ago.
2.9 match 2.81 score 13 scripts
bioc
PRONE:The PROteomics Normalization Evaluator
High-throughput omics data are often affected by systematic biases introduced throughout all the steps of a clinical study, from sample collection to quantification. Normalization methods aim to adjust for these biases to make the actual biological signal more prominent. However, selecting an appropriate normalization method is challenging due to the wide range of available approaches. Therefore, a comparative evaluation of unnormalized and normalized data is essential in identifying an appropriate normalization strategy for a specific data set. This R package provides different functions for preprocessing, normalizing, and evaluating different normalization approaches. Furthermore, normalization methods can be evaluated on downstream steps, such as differential expression analysis and statistical enrichment analysis. Spike-in data sets with known ground truth and real-world data sets of biological experiments acquired by either tandem mass tag (TMT) or label-free quantification (LFQ) can be analyzed.
Maintained by Lis Arend. Last updated 17 days ago.
proteomicspreprocessingnormalizationdifferentialexpressionvisualizationdata-analysisevaluation
1.8 match 2 stars 4.38 score 9 scripts
frederikziebell
RNAseqQC:Quality Control for RNA-Seq Data
Functions for semi-automated quality control of bulk RNA-seq data.
Maintained by Frederik Ziebell. Last updated 8 months ago.
1.5 match 2 stars 5.21 score 27 scripts
bioc
SLqPCR:Functions for analysis of real-time quantitative PCR data at SIRS-Lab GmbH
Functions for analysis of real-time quantitative PCR data at SIRS-Lab GmbH
Maintained by Matthias Kohl. Last updated 5 months ago.
1.8 match 4.30 score 2 scripts
anikoszabo
Oncotree:Estimating Oncogenetic Trees
Construct and evaluate directed tree structures that model the process of occurrence of genetic alterations during carcinogenesis as described in Szabo, A. and Boucher, K (2002) <doi:10.1016/S0025-5564(02)00086-X>.
Maintained by Aniko Szabo. Last updated 1 year ago.
1.8 match 4.28 score 19 scripts
sunweisurrey
snn:Stabilized Nearest Neighbor Classifier
Implement K-nearest neighbor classifier, weighted nearest neighbor classifier, bagged nearest neighbor classifier, optimal weighted nearest neighbor classifier and stabilized nearest neighbor classifier, and perform model selection via 5 fold cross-validation for them. This package also provides functions for computing the classification error and classification instability of a classification procedure.
Maintained by Wei Sun. Last updated 10 years ago.
7.3 match 1.04 score 11 scripts