Showing 200 of 315 total results
kisungyou
Rdimtools: Dimension Reduction and Estimation Methods
We provide linear and nonlinear dimension reduction techniques. Intrinsic dimension estimation methods for exploratory analysis are also provided. For more details on the package, see the paper by You and Shung (2022) <doi:10.1016/j.simpa.2022.100414>.
Maintained by Kisung You. Last updated 2 years ago.
dimension-estimation, dimension-reduction, manifold-learning, subspace-learning, openblas, cpp, openmp
50.6 match 52 stars 8.37 score 186 scripts 8 dependents
bioc
mixOmics: Omics Data Integration Project
Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing properties of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated, and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high-throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics, etc.) but also beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non-exhaustive list of methods includes variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.
Maintained by Eva Hamrud. Last updated 4 days ago.
immunooncology, microarray, sequencing, metabolomics, metagenomics, proteomics, geneprediction, multiplecomparison, classification, regression, bioconductor, genomics, genomics-data, genomics-visualization, multivariate-analysis, multivariate-statistics, omics, r-pkg, r-project
17.6 match 182 stars 13.71 score 1.3k scripts 22 dependents
adeverse
ade4: Analysis of Ecological Data: Exploratory and Euclidean Methods in Environmental Sciences
Tools for multivariate data analysis. Several methods are provided for the analysis (i.e., ordination) of one-table (e.g., principal component analysis, correspondence analysis), two-table (e.g., coinertia analysis, redundancy analysis), three-table (e.g., RLQ analysis) and K-table (e.g., STATIS, multiple coinertia analysis). The philosophy of the package is described in Dray and Dufour (2007) <doi:10.18637/jss.v022.i04>.
Maintained by Aurélie Siberchicot. Last updated 12 days ago.
14.6 match 39 stars 14.96 score 2.2k scripts 256 dependents
valentint
rrcov: Scalable Robust Estimators with High Breakdown Point
Robust Location and Scatter Estimation and Robust Multivariate Analysis with High Breakdown Point: principal component analysis (Filzmoser and Todorov (2013), <doi:10.1016/j.ins.2012.10.017>), linear and quadratic discriminant analysis (Todorov and Pires (2007)), multivariate tests (Todorov and Filzmoser (2010) <doi:10.1016/j.csda.2009.08.015>), outlier detection (Todorov et al. (2010) <doi:10.1007/s11634-010-0075-2>). See also Todorov and Filzmoser (2009) <urn:isbn:978-3838108148>, Todorov and Filzmoser (2010) <doi:10.18637/jss.v032.i03> and Boudt et al. (2019) <doi:10.1007/s11222-019-09869-x>.
Maintained by Valentin Todorov. Last updated 7 months ago.
18.7 match 2 stars 10.57 score 484 scripts 96 dependents
chrhennig
fpc: Flexible Procedures for Clustering
Various methods for clustering and cluster validation. Fixed point clustering. Linear regression clustering. Clustering by merging Gaussian mixture components. Symmetric and asymmetric discriminant projections for visualisation of the separation of groupings. Cluster validation statistics for distance based clustering including corrected Rand index. Standardisation of cluster validation statistics by random clusterings and comparison between many clustering methods and numbers of clusters based on this. Cluster-wise cluster stability assessment. Methods for estimation of the number of clusters: Calinski-Harabasz, Tibshirani and Walther's prediction strength, Fang and Wang's bootstrap stability. Gaussian/multinomial mixture fitting for mixed continuous/categorical variables. Variable-wise statistics for cluster interpretation. DBSCAN clustering. Interface functions for many clustering methods implemented in R, including estimating the number of clusters with kmeans, pam and clara. Modality diagnosis for Gaussian mixtures. For an overview see package?fpc.
Maintained by Christian Hennig. Last updated 6 months ago.
17.6 match 11 stars 9.25 score 2.6k scripts 70 dependents
neurodata
mgc: Multiscale Graph Correlation
Multiscale Graph Correlation (MGC) is a framework developed by Vogelstein et al. (2019) <DOI:10.7554/eLife.41690> that extends global correlation procedures to be multiscale; consequently, MGC tests typically require far fewer samples than existing methods for a wide variety of dependence structures and dimensionalities, while maintaining computational efficiency. Moreover, MGC provides a simple and elegant multiscale characterization of the potentially complex latent geometry underlying the relationship.
Maintained by Eric Bridgeford. Last updated 4 years ago.
21.1 match 9 stars 7.50 score 59 scripts 2 dependents
bioc
CMA: Synthesis of microarray-based classification
This package provides a comprehensive collection of various microarray-based classification algorithms both from Machine Learning and Statistics. Variable Selection, Hyperparameter tuning, Evaluation and Comparison can be performed combined or stepwise in a user-friendly environment.
Maintained by Roman Hornung. Last updated 5 months ago.
30.7 match 5.09 score 61 scripts
xinghuq
DA: Discriminant Analysis for Evolutionary Inference
Discriminant Analysis (DA) for evolutionary inference (Qin, X. et al, 2020, <doi:10.22541/au.159256808.83862168>), especially for population genetic structure and community structure inference. This package incorporates the commonly used linear and non-linear, local and global supervised learning approaches (discriminant analysis), including Linear Discriminant Analysis of Kernel Principal Components (LDAKPC), Local (Fisher) Linear Discriminant Analysis (LFDA), Local (Fisher) Discriminant Analysis of Kernel Principal Components (LFDAKPC) and Kernel Local (Fisher) Discriminant Analysis (KLFDA). These discriminant analyses can be used to do ecological and evolutionary inference, including demography inference, species identification, and population/community structure inference.
Maintained by Xinghu Qin. Last updated 4 years ago.
biomedicalinformatics, chipseq, clustering, coverage, dnamethylation, differentialexpression, differentialmethylation, software, differentialsplicing, epigenetics, functionalgenomics, geneexpression, genesetenrichment, genetics, immunooncology, multiplecomparison, normalization, pathways, qualitycontrol, rnaseq, regression, sage, sequencing, systemsbiology, timecourse, transcription, transcriptomics, dapc, discriminant-analysis, ecological, kernel, kernel-local, kernel-principle-components, population-structure-inference, principal-components
32.6 match 1 star 4.70 score 1 script
friendly
candisc: Visualizing Generalized Canonical Discriminant and Canonical Correlation Analysis
Functions for computing and visualizing generalized canonical discriminant analyses and canonical correlation analysis for a multivariate linear model. Traditional canonical discriminant analysis is restricted to a one-way 'MANOVA' design and is equivalent to canonical correlation analysis between a set of quantitative response variables and a set of dummy variables coded from the factor variable. The 'candisc' package generalizes this to higher-way 'MANOVA' designs for all factors in a multivariate linear model, computing canonical scores and vectors for each term. The graphic functions provide low-rank (1D, 2D, 3D) visualizations of terms in an 'mlm' via the 'plot.candisc' and 'heplot.candisc' methods. Related plots are now provided for canonical correlation analysis when all predictors are quantitative.
Maintained by Michael Friendly. Last updated 10 months ago.
dimension-reduction, multivariate-linear-models, visualization
17.3 match 15 stars 8.86 score 221 scripts 3 dependents
cran
callback: Computes Statistics from Discrimination Experimental Data
In discrimination experiments, candidates are sent on the same test (e.g. job, house rental) and one examines whether they receive the same outcome. The numbers of non-negative answers are first examined in detail, looking for outcome differences. Then various statistics are computed. This package can also be used for analyzing the results from random experiments.
Maintained by Emmanuel Duguet. Last updated 13 days ago.
48.4 match 2.78 score
tidymodels
parsnip: A Common API to Modeling and Analysis Functions
A common interface is provided to allow users to specify a model without having to remember the different argument names across different functions or computational engines (e.g. 'R', 'Spark', 'Stan', 'H2O', etc).
Maintained by Max Kuhn. Last updated 4 days ago.
8.0 match 612 stars 16.37 score 3.4k scripts 69 dependents
kozodoi
fairness: Algorithmic Fairness Metrics
Offers calculation, visualization and comparison of algorithmic fairness metrics. Fair machine learning is an emerging topic with the overarching aim to critically assess whether ML algorithms reinforce existing social biases. Unfair algorithms can propagate such biases and produce predictions with a disparate impact on various sensitive groups of individuals (defined by sex, gender, ethnicity, religion, income, socioeconomic status, physical or mental disabilities). Fair algorithms possess the underlying foundation that these groups should be treated similarly or have similar prediction outcomes. The fairness R package offers the calculation and comparisons of commonly and less commonly used fairness metrics in population subgroups. These methods are described by Calders and Verwer (2010) <doi:10.1007/s10618-010-0190-x>, Chouldechova (2017) <doi:10.1089/big.2016.0047>, Feldman et al. (2015) <doi:10.1145/2783258.2783311>, Friedler et al. (2018) <doi:10.1145/3287560.3287589> and Zafar et al. (2017) <doi:10.1145/3038912.3052660>. The package also offers convenient visualizations to help understand fairness metrics.
Maintained by Nikita Kozodoi. Last updated 2 years ago.
algorithmic-discrimination, algorithmic-fairness, discrimination, disparate-impact, fairness, fairness-ai, fairness-ml, machine-learning
17.5 match 32 stars 6.82 score 69 scripts 1 dependent
trevorhastie
mda: Mixture and Flexible Discriminant Analysis
Mixture and flexible discriminant analysis, multivariate adaptive regression splines (MARS), BRUTO, and vector-response smoothing splines. Hastie, Tibshirani and Friedman (2009) "Elements of Statistical Learning (second edition, chap 12)" Springer, New York.
Maintained by Trevor Hastie. Last updated 4 months ago.
15.1 match 3 stars 7.60 score 428 scripts 17 dependents
gmcmacran
dann: Discriminant Adaptive Nearest Neighbor Classification
Discriminant Adaptive Nearest Neighbor Classification is a variation of k nearest neighbors where the shape of the neighborhood is data driven. This package implements dann and sub_dann from Hastie (1996) <https://web.stanford.edu/~hastie/Papers/dann_IEEE.pdf>.
Maintained by Greg McMahan. Last updated 8 months ago.
28.4 match 3.74 score 37 scripts
aigorahub
sensR: Thurstonian Models for Sensory Discrimination
Provides tools for sensory discrimination methods: duotrio, tetrad, triangle, 2-AFC, 3-AFC, A-not A, same-different, 2-AC and degree-of-difference. This enables the calculation of d-primes, standard errors of d-primes, sample size and power computations, and comparisons of different d-primes. Methods for profile likelihood confidence intervals and plotting are included. Most methods are described in Brockhoff, P.B. and Christensen, R.H.B. (2010) <doi:10.1016/j.foodqual.2009.04.003>.
Maintained by Dominik Rafacz. Last updated 1 year ago.
19.6 match 7 stars 4.92 score 77 scripts
topepo
sparsediscrim: Sparse and Regularized Discriminant Analysis
A collection of sparse and regularized discriminant analysis methods intended for small-sample, high-dimensional data sets. The package features the High-Dimensional Regularized Discriminant Analysis classifier from Ramey et al. (2017) <arXiv:1602.01182>. Other classifiers include those from Dudoit et al. (2002) <doi:10.1198/016214502753479248>, Pang et al. (2009) <doi:10.1111/j.1541-0420.2009.01200.x>, and Tong et al. (2012) <doi:10.1093/bioinformatics/btr690>.
Maintained by Max Kuhn. Last updated 4 years ago.
22.6 match 3 stars 4.11 score 86 scripts
luca-scr
mclust: Gaussian Mixture Modelling for Model-Based Clustering, Classification, and Density Estimation
Gaussian finite mixture models fitted via EM algorithm for model-based clustering, classification, and density estimation, including Bayesian regularization, dimension reduction for visualisation, and resampling-based inference.
Maintained by Luca Scrucca. Last updated 11 months ago.
7.4 match 21 stars 12.23 score 6.6k scripts 587 dependents
satijalab
Seurat: Tools for Single Cell Genomics
A toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. 'Seurat' aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031>, and Hao, Hao, et al (2020) <doi:10.1101/2020.10.12.335331> for more details.
Maintained by Paul Hoffman. Last updated 1 year ago.
human-cell-atlas, single-cell-genomics, single-cell-rna-seq, cpp
5.3 match 2.4k stars 16.86 score 50k scripts 73 dependents
brianstock
MixSIAR: Bayesian Mixing Models in R
Creates and runs Bayesian mixing models to analyze biological tracer data (e.g. stable isotopes, fatty acids), which estimate the proportions of source (prey) contributions to a mixture (consumer). 'MixSIAR' is not one model, but a framework that allows a user to create a mixing model based on their data structure and research questions, via options for fixed/random effects, source data types, priors, and error terms. 'MixSIAR' incorporates several years of advances since 'MixSIR' and 'SIAR'.
Maintained by Brian Stock. Last updated 4 years ago.
9.8 match 96 stars 9.21 score 122 scripts
thibautjombart
adegenet: Exploratory Analysis of Genetic and Genomic Data
Toolset for the exploration of genetic and genomic data. Adegenet provides formal (S4) classes for storing and handling various genetic data, including genetic markers with varying ploidy and hierarchical population structure ('genind' class), allele counts by population ('genpop'), and genome-wide SNP data ('genlight'). It also implements original multivariate methods (DAPC, sPCA), graphics, statistical tests, simulation tools, distance and similarity measures, and several spatial methods. A range of both empirical and simulated datasets is also provided to illustrate various methods.
Maintained by Zhian N. Kamvar. Last updated 1 month ago.
7.0 match 182 stars 12.60 score 1.9k scripts 29 dependents
cran
MASS: Support Functions and Datasets for Venables and Ripley's MASS
Functions and datasets to support Venables and Ripley, "Modern Applied Statistics with S" (4th edition, 2002).
Maintained by Brian Ripley. Last updated 16 days ago.
7.7 match 19 stars 10.53 score 11k dependents
r-forge
Sleuth3: Data Sets from Ramsey and Schafer's "Statistical Sleuth (3rd Ed)"
Data sets from Ramsey, F.L. and Schafer, D.W. (2013), "The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed)", Cengage Learning.
Maintained by Berwin A Turlach. Last updated 1 year ago.
12.5 match 6.38 score 522 scripts
santagos
dad: Three-Way / Multigroup Data Analysis Through Densities
The data consist of a set of variables measured on several groups of individuals. An estimated probability density function is associated with each group. The package provides tools to create or manage such data and functional methods (principal component analysis, multidimensional scaling, cluster analysis, discriminant analysis...) for such probability densities.
Maintained by Pierre Santagostini. Last updated 4 months ago.
14.1 match 5.33 score 92 scripts
eagerai
fastai: Interface to 'fastai'
The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research into deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.
Maintained by Turgut Abdullayev. Last updated 11 months ago.
audio, collaborative-filtering, darknet, darknet-image-classification, fastai, medical, object-detection, tabular, text, vision
8.0 match 118 stars 9.40 score 76 scripts
rfastofficial
Rfast2: A Collection of Efficient and Extremely Fast R Functions II
A collection of fast statistical and utility functions for data analysis. Functions for regression, maximum likelihood, column-wise statistics and many more have been included. C++ has been utilized to speed up the functions. References: Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>.
Maintained by Manos Papadakis. Last updated 1 year ago.
9.1 match 38 stars 8.09 score 75 scripts 26 dependents
marekslenker
MorphoTools2: Multivariate Morphometric Analysis
Tools for multivariate analyses of morphological data, wrapped in one package, to make the workflow convenient and fast. Statistical and graphical tools provide a comprehensive framework for checking and manipulating input data, statistical analyses, and visualization of results. Several methods are provided for the analysis of raw data, to make the dataset ready for downstream analyses. Integrated statistical methods include hierarchical classification, principal component analysis, principal coordinates analysis, non-metric multidimensional scaling, and multiple discriminant analyses: canonical, stepwise, and classificatory (linear, quadratic, and the non-parametric k nearest neighbours). The philosophy of the package is described in Šlenker et al. 2022.
Maintained by Marek Šlenker. Last updated 6 months ago.
14.5 match 7 stars 5.02 score 9 scripts
r-forge
Sleuth2: Data Sets from Ramsey and Schafer's "Statistical Sleuth (2nd Ed)"
Data sets from Ramsey, F.L. and Schafer, D.W. (2002), "The Statistical Sleuth: A Course in Methods of Data Analysis (2nd ed)", Duxbury.
Maintained by Berwin A Turlach. Last updated 1 year ago.
12.5 match 5.70 score 191 scripts
kjhealy
gssrdoc: Document General Social Survey Variable
The General Social Survey (GSS) is a long-running, mostly annual survey of US households. It is administered by the National Opinion Research Center (NORC). This package contains a tibble with information on the survey variables, together with every variable documented as an R help page. For more information on the GSS see <http://gss.norc.org>.
Maintained by Kieran Healy. Last updated 11 months ago.
31.1 match 2.28 score 38 scripts
terrytangyuan
lfda: Local Fisher Discriminant Analysis
Functions for performing and visualizing Local Fisher Discriminant Analysis (LFDA), Kernel Fisher Discriminant Analysis (KLFDA), and Semi-supervised Local Fisher Discriminant Analysis (SELF).
Maintained by Yuan Tang. Last updated 2 years ago.
dimensionality-reduction, distance-metric-learning, machine-learning, metric-learning, statistics
10.8 match 76 stars 6.50 score 74 scripts 3 dependents
vandomed
pooling: Fit Poolwise Regression Models
Functions for calculating power and fitting regression models in studies where a biomarker is measured in "pooled" samples rather than for each individual. Approaches for handling measurement error follow the framework of Schisterman et al. (2010) <doi:10.1002/sim.3823>.
Maintained by Dane R. Van Domelen. Last updated 5 years ago.
assay-modeling, biomarkers, efficiency, epidemiology, maximum-likelihood, measurement-error, pooling
19.5 match 3.60 score 80 scripts
matloff
dsld: Data Science Looks at Discrimination
Statistical and graphical tools for detecting and measuring discrimination and bias, be it racial, gender, age or other. Detection and remediation of bias in machine learning algorithms. 'Python' interfaces available.
Maintained by Norm Matloff. Last updated 1 month ago.
8.8 match 12 stars 7.81 score 35 scripts
gumeo
accSDA: Accelerated Sparse Discriminant Analysis
Implementation of sparse linear discriminant analysis, which is a supervised classification method for multiple classes. Various novel optimization approaches to this problem are implemented, including alternating direction method of multipliers ('ADMM'), proximal gradient ('PG') and accelerated proximal gradient ('APG') (see Atkins et al. <arXiv:1705.07194>). Functions for performing cross validation are also supplied along with basic prediction and plotting functions. Sparse zero variance discriminant analysis ('SZVD') is also included in the package (see Ames and Hong, <arXiv:1401.5492>). See the 'github' wiki for a more extended description.
Maintained by Gudmundur Einarsson. Last updated 1 year ago.
19.9 match 5 stars 3.40 score 10 scripts
jkrijthe
RSSL: Implementations of Semi-Supervised Learning Approaches for Classification
A collection of implementations of semi-supervised classifiers and methods to evaluate their performance. The package includes implementations of, among others, Implicitly Constrained Learning, Moment Constrained Learning, the Transductive SVM, Manifold regularization, Maximum Contrastive Pessimistic Likelihood estimation, S4VM and WellSVM.
Maintained by Jesse Krijthe. Last updated 1 year ago.
10.8 match 58 stars 6.05 score 128 scripts 1 dependent
uligges
klaR: Classification and Visualization
Miscellaneous functions for classification and visualization, e.g. regularized discriminant analysis, sknn() kernel-density naive Bayes, an interface to 'svmlight' and stepclass() wrapper variable selection for supervised classification, partimat() visualization of classification rules and shardsplot() of cluster results as well as kmodes() clustering for categorical data, corclust() variable clustering, variable extraction from different variable clustering models and weight of evidence preprocessing.
Maintained by Uwe Ligges. Last updated 1 year ago.
7.9 match 5 stars 7.61 score 1.4k scripts 13 dependents
brian-j-smith
MachineShop: Machine Learning Models and Tools
Meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Approaches for model fitting and prediction of numerical, categorical, or censored time-to-event outcomes include traditional regression models, regularization methods, tree-based methods, support vector machines, neural networks, ensembles, data preprocessing, filtering, and model tuning and selection. Performance metrics are provided for model assessment and can be estimated with independent test sets, split sampling, cross-validation, or bootstrap resampling. Resample estimation can be executed in parallel for faster processing and nested in cases of model tuning and selection. Modeling results can be summarized with descriptive statistics; calibration curves; variable importance; partial dependence plots; confusion matrices; and ROC, lift, and other performance curves.
Maintained by Brian J Smith. Last updated 7 months ago.
classification-models, machine-learning, predictive-modeling, regression-models, survival-models
7.4 match 61 stars 7.95 score 121 scripts
jclavel
mvMORPH: Multivariate Comparative Tools for Fitting Evolutionary Models to Morphometric Data
Fits multivariate (Brownian Motion, Early Burst, ACDC, Ornstein-Uhlenbeck and Shifts) models of continuous trait evolution on trees and time series. 'mvMORPH' also proposes high-dimensional multivariate comparative tools (linear models using Generalized Least Squares and multivariate tests) based on penalized likelihood. See Clavel et al. (2015) <DOI:10.1111/2041-210X.12420>, Clavel et al. (2019) <DOI:10.1093/sysbio/syy045>, and Clavel & Morlon (2020) <DOI:10.1093/sysbio/syaa010>.
Maintained by Julien Clavel. Last updated 1 month ago.
5.9 match 17 stars 9.46 score 189 scripts 3 dependents
gzt
MixMatrix: Classification with Matrix Variate Normal and t Distributions
Provides sampling and density functions for matrix variate normal, t, and inverted t distributions; ML estimation for matrix variate normal and t distributions using the EM algorithm, including some restrictions on the parameters; and classification by linear and quadratic discriminant analysis for matrix variate normal and t distributions described in Thompson et al. (2019) <doi:10.1080/10618600.2019.1696208>. Performs clustering with matrix variate normal and t mixture models.
Maintained by Geoffrey Thompson. Last updated 6 months ago.
8.8 match 3 stars 6.19 score 29 scripts 3 dependents
dernarr
ndl: Naive Discriminative Learning
Naive discriminative learning implements learning and classification models based on the Rescorla-Wagner equations and their equilibrium equations.
Maintained by Tino Sering. Last updated 7 years ago.
17.8 match 1 star 3.00 score 66 scripts
topepo
sparseLDA: Sparse Discriminant Analysis
Performs sparse linear discriminant analysis for Gaussians and mixture of Gaussian models.
Maintained by Max Kuhn. Last updated 8 years ago.
9.6 match 7 stars 5.45 score 45 scripts 3 dependents
simsem
semTools: Useful Tools for Structural Equation Modeling
Provides miscellaneous tools for structural equation modeling, many of which extend the 'lavaan' package. For example, latent interactions can be estimated using product indicators (Lin et al., 2010, <doi:10.1080/10705511.2010.488999>) and simple effects probed; analytical power analyses can be conducted (Jak et al., 2021, <doi:10.3758/s13428-020-01479-0>); and scale reliability can be estimated based on estimated factor-model parameters.
Maintained by Terrence D. Jorgensen. Last updated 3 days ago.
3.6 match 79 stars 13.74 score 1.1k scripts 31 dependents
pln-team
PLNmodels: Poisson Lognormal Models
The Poisson-lognormal model and variants (Chiquet, Mariadassou and Robin, 2021 <doi:10.3389/fevo.2021.588292>) can be used for a variety of multivariate problems when count data are at play, including principal component analysis for count data, discriminant analysis, model-based clustering and network inference. Implements variational algorithms to fit such models, accompanied by a set of functions for visualization and diagnostics.
Maintained by Julien Chiquet. Last updated 4 days ago.
count-data, multivariate-analysis, network-inference, pca, poisson-lognormal-model, openblas, cpp
5.0 match 56 stars 9.50 score 226 scripts
mlr-org
mlr3learners: Recommended Learners for 'mlr3'
Recommended Learners for 'mlr3'. Extends 'mlr3' with interfaces to essential machine learning packages on CRAN. This includes, but is not limited to: (penalized) linear and logistic regression, linear and quadratic discriminant analysis, k-nearest neighbors, naive Bayes, support vector machines, and gradient boosting.
Maintained by Marc Becker. Last updated 4 months ago.
classification, learners, machine-learning, mlr3, regression
4.1 match 91 stars 11.51 score 1.5k scripts 10 dependents
runehaubo
ordinal: Regression Models for Ordinal Data
Implementation of cumulative link (mixed) models also known as ordered regression models, proportional odds models, proportional hazards models for grouped survival times and ordered logit/probit/... models. Estimation is via maximum likelihood and mixed models are fitted with the Laplace approximation and adaptive Gauss-Hermite quadrature. Multiple random effect terms are allowed and they may be nested, crossed or partially nested/crossed. Restrictions of symmetry and equidistance can be imposed on the thresholds (cut-points/intercepts). Standard model methods are available (summary, anova, drop-methods, step, confint, predict etc.) in addition to profile methods and slice methods for visualizing the likelihood function and checking convergence.
Maintained by Rune Haubo Bojesen Christensen. Last updated 3 months ago.
3.8 match 38 stars 12.41 score 1.3k scripts 178 dependents
yuqingxx
TULIP: A Toolbox for Linear Discriminant Analysis with Penalties
Integrates several popular high-dimensional methods based on Linear Discriminant Analysis (LDA) and provides a comprehensive and user-friendly toolbox for linear, semi-parametric and tensor-variate classification as mentioned in Yuqing Pan, Qing Mai and Xin Zhang (2019) <arXiv:1904.03469>. Functions are included for covariate adjustment, model fitting, cross validation and prediction.
Maintained by Yuqing Pan. Last updated 4 years ago.
23.1 match 2.00 score 9 scripts
tidymodels
discrim: Model Wrappers for Discriminant Analysis
Bindings for additional classification models for use with the 'parsnip' package. Models include flavors of discriminant analysis, such as linear (Fisher (1936) <doi:10.1111/j.1469-1809.1936.tb02137.x>), regularized (Friedman (1989) <doi:10.1080/01621459.1989.10478752>), and flexible (Hastie, Tibshirani, and Buja (1994) <doi:10.1080/01621459.1994.10476866>), as well as naive Bayes classifiers (Hand and Yu (2007) <doi:10.1111/j.1751-5823.2001.tb00465.x>).
Maintained by Emil Hvitfeldt. Last updated 5 months ago.
5.4 match 29 stars 8.26 score 1.1k scripts 1 dependent
svkucheryavski
mdatools: Multivariate Data Analysis for Chemometrics
Projection-based methods for preprocessing, exploration and analysis of multivariate data used in chemometrics. S. Kucheryavskiy (2020) <doi:10.1016/j.chemolab.2020.103937>.
Maintained by Sergey Kucheryavskiy. Last updated 8 months ago.
5.6 match 35 stars 7.37 score 220 scripts 1 dependent
egenn
rtemis: Machine Learning and Visualization
Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.
Maintained by E.D. Gennatas. Last updated 1 month ago.
data-science, data-visualization, machine-learning, machine-learning-library, visualization
5.8 match 145 stars 7.09 score 50 scripts 2 dependents
cjvanlissa
tidySEM: Tidy Structural Equation Modeling
A tidy workflow for generating, estimating, reporting, and plotting structural equation models using 'lavaan', 'OpenMx', or 'Mplus'. Throughout this workflow, elements of syntax, results, and graphs are represented as 'tidy' data, making them easy to customize. Includes functionality to estimate latent class analyses, and to plot 'dagitty' and 'igraph' objects.
Maintained by Caspar J. van Lissa. Last updated 7 days ago.
3.8 match 58 stars 10.69 score 330 scripts 1 dependent
thothorn
ipred: Improved Predictors
Improved predictive models by indirect classification and bagging for classification, regression and survival problems as well as resampling based estimators of prediction error.
Maintained by Torsten Hothorn. Last updated 8 months ago.
3.7 match 10.76 score 3.3k scripts 411 dependents
ohdsi
PatientLevelPrediction: Develop Clinical Prediction Models Using the Common Data Model
A user friendly way to create patient level prediction models using the Observational Medical Outcomes Partnership Common Data Model. Given a cohort of interest and an outcome of interest, the package can use data in the Common Data Model to build a large set of features. These features can then be used to fit a predictive model with a number of machine learning algorithms. This is further described in Reps (2017) <doi:10.1093/jamia/ocy032>.
Maintained by Egill Fridgeirsson. Last updated 9 days ago.
3.6 match 190 stars 10.85 score 297 scripts
moran79
folda:Forward Stepwise Discriminant Analysis with Pillai's Trace
A novel forward stepwise discriminant analysis framework that integrates Pillai's trace with Uncorrelated Linear Discriminant Analysis (ULDA), providing an improvement over traditional stepwise LDA methods that rely on Wilks' Lambda. A stand-alone ULDA implementation is also provided, offering a more general solution than the one available in the 'MASS' package. It automatically handles missing values and provides visualization tools. For more details, see Wang (2024) <doi:10.48550/arXiv.2409.03136>.
Maintained by Siyu Wang. Last updated 5 months ago.
7.2 match 2 stars 5.18 score 6 scripts 1 dependent
tarnduong
ks:Kernel Smoothing
Kernel smoothers for univariate and multivariate data, with comprehensive visualisation and bandwidth selection capabilities, including for densities, density derivatives, cumulative distributions, clustering, classification, density ridges, significant modal regions, and two-sample hypothesis tests. Chacon & Duong (2018) <doi:10.1201/9780429485572>.
Maintained by Tarn Duong. Last updated 6 months ago.
3.7 match 6 stars 10.14 score 920 scripts 262 dependents
traminer
TraMineR:Trajectory Miner: a Sequence Analysis Toolkit
Set of sequence analysis tools for manipulating, describing and rendering categorical sequences, and more generally mining sequence data in the field of social sciences. Although this sequence analysis package is primarily intended for state or event sequences that describe time use or life courses such as family formation histories or professional careers, its features also apply to many other kinds of categorical sequence data. It accepts many different sequence representations as input and provides tools for converting sequences from one format to another. It offers several functions for describing and rendering sequences, for computing distances between sequences with different metrics (among which optimal matching), original dissimilarity-based analysis tools, and functions for extracting the most frequent event subsequences and identifying the most discriminating ones among them. A user's guide can be found on the TraMineR web page.
Maintained by Gilbert Ritschard. Last updated 3 months ago.
4.5 match 11 stars 8.24 score 534 scripts 13 dependents
cran
dawai:Discriminant Analysis with Additional Information
In applications it is usual that some additional information is available. This package dawai (an acronym for Discriminant Analysis With Additional Information) performs linear and quadratic discriminant analysis with additional information expressed as inequality restrictions among the population means. It also computes several estimations of the true error rate.
Maintained by David Conde. Last updated 5 months ago.
18.1 match 2.00 score
cran
verification:Weather Forecast Verification Utilities
Utilities for verifying discrete, continuous and probabilistic forecasts, and forecasts expressed as parametric distributions are included.
Maintained by Eric Gilleland. Last updated 4 months ago.
8.5 match 3 stars 4.19 score 6 dependents
mneunhoe
RGAN:Generative Adversarial Nets (GAN) in R
An easy way to get started with Generative Adversarial Nets (GAN) in R. The GAN algorithm was initially described by Goodfellow et al. 2014 <https://proceedings.neurips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf>. A GAN can be used to learn the joint distribution of complex data by comparison. A GAN consists of two neural networks, a Generator and a Discriminator, which play an adversarial minimax game. Built-in GAN models make the training of GANs in R possible in one line and make it easy to experiment with different design choices (e.g. different network architectures, value functions, optimizers). The built-in GAN models work with tabular data (e.g. to produce synthetic data) and image data. Methods to post-process the output of GAN models to enhance the quality of samples are available.
Maintained by Marcel Neunhoeffer. Last updated 2 years ago.
8.8 match 19 stars 3.98 score 7 scripts
bioc
MicrobiotaProcess:A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework
MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces the MPSE class, which makes it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under a unified and common framework (tidy-like framework).
Maintained by Shuangbin Xu. Last updated 5 months ago.
visualizationmicrobiomesoftwaremultiplecomparisonfeatureextractionmicrobiome-analysismicrobiome-data
3.6 match 183 stars 9.70 score 126 scripts 1 dependent
liangcj
hds:Hazard Discrimination Summary
Functions for calculating the hazard discrimination summary and its standard errors, as described in Liang and Heagerty (2016) <doi:10.1111/biom.12628>.
Maintained by C. Jason Liang. Last updated 8 years ago.
12.7 match 1 star 2.70 score 3 scripts
jfukuyama
treeDA:Tree-Based Discriminant Analysis
Performs sparse discriminant analysis on a combination of node and leaf predictors when the predictor variables are structured according to a tree, as described in Fukuyama et al. (2017) <doi:10.1371/journal.pcbi.1005706>.
Maintained by Julia Fukuyama. Last updated 4 years ago.
9.2 match 3.70 score 9 scripts
easystats
performance:Assessment of Regression Models Performance
Utilities for computing measures to assess model quality, which are not directly provided by R's 'base' or 'stats' packages. These include e.g. measures like r-squared, intraclass correlation coefficient (Nakagawa, Johnson & Schielzeth (2017) <doi:10.1098/rsif.2017.0213>), root mean squared error or functions to check models for overdispersion, singularity or zero-inflation and more. Functions apply to a large variety of regression models, including generalized linear models, mixed effects models and Bayesian models. References: Lüdecke et al. (2021) <doi:10.21105/joss.03139>.
Maintained by Daniel Lüdecke. Last updated 18 days ago.
aiceasystatshacktoberfestloomachine-learningmixed-modelsmodelsperformancer2statistics
2.0 match 1.1k stars 16.17 score 4.3k scripts 47 dependents
friendly
HistData:Data Sets from the History of Statistics and Data Visualization
The 'HistData' package provides a collection of small data sets that are interesting and important in the history of statistics and data visualization. The goal of the package is to make these available, both for instructional use and for historical research. Some of these present interesting challenges for graphics or analysis in R.
Maintained by Michael Friendly. Last updated 10 months ago.
3.5 match 63 stars 9.19 score 732 scripts 2 dependents
topepo
caret:Classification and Regression Training
Misc functions for training and plotting classification and regression models.
Maintained by Max Kuhn. Last updated 3 months ago.
1.7 match 1.6k stars 19.24 score 61k scripts 303 dependents
apedrods
HiDimDA:High Dimensional Discriminant Analysis
Performs linear discriminant analysis in high dimensional problems based on reliable covariance estimators for problems with (many) more variables than observations. Includes routines for classifier training, prediction, cross-validation and variable selection.
Maintained by Antonio Pedro Duarte Silva. Last updated 5 months ago.
21.1 match 1.52 score 33 scripts
alexanderrobitzsch
CDM:Cognitive Diagnosis Modeling
Functions for cognitive diagnosis modeling and multidimensional item response modeling for dichotomous and polytomous item responses. This package enables the estimation of the DINA and DINO model (Junker & Sijtsma, 2001, <doi:10.1177/01466210122032064>), the multiple group (polytomous) GDINA model (de la Torre, 2011, <doi:10.1007/s11336-011-9207-7>), the multiple choice DINA model (de la Torre, 2009, <doi:10.1177/0146621608320523>), the general diagnostic model (GDM; von Davier, 2008, <doi:10.1348/000711007X193957>), the structured latent class model (SLCA; Formann, 1992, <doi:10.1080/01621459.1992.10475229>) and regularized latent class analysis (Chen, Li, Liu, & Ying, 2017, <doi:10.1007/s11336-016-9545-6>). See George, Robitzsch, Kiefer, Gross, and Uenlue (2017) <doi:10.18637/jss.v074.i02> or Robitzsch and George (2019, <doi:10.1007/978-3-030-05584-4_26>) for further details on estimation and the package structure. For tutorials on how to use the CDM package see George and Robitzsch (2015, <doi:10.20982/tqmp.11.3.p189>) as well as Ravand and Robitzsch (2015).
Maintained by Alexander Robitzsch. Last updated 9 months ago.
cognitive-diagnostic-modelsitem-response-theorycpp
3.6 match 22 stars 8.76 score 138 scripts 28 dependents
modal-inria
RMixtCompUtilities:Utility Functions for 'MixtComp' Outputs
Mixture Composer <https://github.com/modal-inria/MixtComp> is a project to build mixture models with heterogeneous data sets and partially missing data management. This package contains graphical, getter and some utility functions to facilitate the analysis of 'MixtComp' output.
Maintained by Quentin Grimonprez. Last updated 10 months ago.
clusteringcppheterogeneous-datamissing-datamixed-datamixture-modelstatistics
6.0 match 13 stars 5.19 score 2 scripts 1 dependent
amices
mice:Multivariate Imputation by Chained Equations
Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm as described in Van Buuren and Groothuis-Oudshoorn (2011) <doi:10.18637/jss.v045.i03>. Each variable has its own imputation model. Built-in imputation models are provided for continuous data (predictive mean matching, normal), binary data (logistic regression), unordered categorical data (polytomous logistic regression) and ordered categorical data (proportional odds). MICE can also impute continuous two-level data (normal model, pan, second-level variables). Passive imputation can be used to maintain consistency between variables. Various diagnostic plots are available to inspect the quality of the imputations.
Maintained by Stef van Buuren. Last updated 6 days ago.
chained-equationsfcsimputationmicemissing-datamissing-valuesmultiple-imputationmultivariate-datacpp
1.9 match 462 stars 16.50 score 10k scripts 154 dependents
andrewljackson
SIBER:Stable Isotope Bayesian Ellipses in R
Fits bi-variate ellipses to stable isotope data using Bayesian inference with the aim being to describe and compare their isotopic niche.
Maintained by Andrew Jackson. Last updated 10 months ago.
community-ecologyecologyniche-modellingstable-isotopesjagscpp
3.4 match 36 stars 9.13 score 187 scripts 1 dependent
rtdists
rtdists:Response Time Distributions
Provides response time distributions (density/PDF, distribution function/CDF, quantile function, and random generation): (a) Ratcliff diffusion model (Ratcliff & McKoon, 2008, <doi:10.1162/neco.2008.12-06-420>) based on C code by Andreas and Jochen Voss and (b) linear ballistic accumulator (LBA; Brown & Heathcote, 2008, <doi:10.1016/j.cogpsych.2007.12.002>) with different distributions underlying the drift rate.
Maintained by Henrik Singmann. Last updated 3 years ago.
3.4 match 46 stars 8.85 score 116 scripts 2 dependents
corybrunson
ordr:A Tidyverse Extension for Ordinations and Biplots
Ordination comprises several multivariate exploratory and explanatory techniques with theoretical foundations in geometric data analysis; see Podani (2000, ISBN:90-5782-067-6) for techniques and applications and Le Roux & Rouanet (2005) <doi:10.1007/1-4020-2236-0> for foundations. Greenacre (2010, ISBN:978-84-923846) shows how the most established of these, including principal components analysis, correspondence analysis, multidimensional scaling, factor analysis, and discriminant analysis, rely on eigen-decompositions or singular value decompositions of pre-processed numeric matrix data. These decompositions give rise to a set of shared coordinates along which the row and column elements can be measured. The overlay of their scatterplots on these axes, introduced by Gabriel (1971) <doi:10.1093/biomet/58.3.453>, is called a biplot. 'ordr' provides inspection, extraction, manipulation, and visualization tools for several popular ordination classes supported by a set of recovery methods. It is inspired by and designed to integrate into 'tidyverse' workflows provided by Wickham et al (2019) <doi:10.21105/joss.01686>.
Maintained by Jason Cory Brunson. Last updated 13 days ago.
biplotdata-visualizationdimension-reductiongeometric-data-analysisgrammar-of-graphicslog-ratio-analysismultivariate-analysismultivariate-statisticsordinationtidymodelstidyverse
4.1 match 24 stars 7.26 score 28 scripts
harrelfe
Hmisc:Harrell Miscellaneous
Contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, simulation, importing and annotating datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX and html code, recoding variables, caching, simplified parallel computing, encrypting and decrypting data using a safe workflow, general moving window statistical estimation, and assistance in interpreting principal component analysis.
Maintained by Frank E Harrell Jr. Last updated 5 hours ago.
1.7 match 210 stars 17.61 score 17k scripts 750 dependents
patriciamar
ShinyItemAnalysis:Test and Item Analysis via Shiny
Package including functions and interactive shiny application for the psychometric analysis of educational tests, psychological assessments, health-related and other types of multi-item measurements, or ratings from multiple raters.
Maintained by Patricia Martinkova. Last updated 1 month ago.
assessmentdifferential-item-functioningitem-analysisitem-response-theorypsychometricsshiny
3.7 match 44 stars 7.88 score 105 scripts 3 dependents
troutinthemilk
IsotopeR:Stable Isotope Mixing Model
Estimates diet contributions from isotopic sources using JAGS. Includes estimation of concentration dependence and measurement error.
Maintained by Jake Ferguson. Last updated 9 years ago.
10.6 match 1 star 2.70 score
philchalmers
mirt:Multidimensional Item Response Theory
Analysis of discrete response data using unidimensional and multidimensional item analysis models under the Item Response Theory paradigm (Chalmers (2012) <doi:10.18637/jss.v048.i06>). Exploratory and confirmatory item factor analysis models are estimated with quadrature (EM) or stochastic (MHRM) methods. Confirmatory bi-factor and two-tier models are available for modeling item testlets using dimension reduction EM algorithms, while multiple group analyses and mixed effects designs are included for detecting differential item, bundle, and test functioning, and for modeling item and person covariates. Finally, latent class models such as the DINA, DINO, multidimensional latent class, mixture IRT models, and zero-inflated response models are supported, as well as a wide family of probabilistic unfolding models.
Maintained by Phil Chalmers. Last updated 11 days ago.
1.9 match 210 stars 14.98 score 2.5k scripts 40 dependents
jhmaindonald
hddplot:Use Known Groups in High-Dimensional Data to Derive Scores for Plots
Cross-validated linear discriminant calculations determine the optimum number of features. Test and training scores from successive cross-validation steps determine, via a principal components calculation, a low-dimensional global space onto which test scores are projected, in order to plot them. Further functions are included that are intended for didactic use. The package implements, and extends, methods described in J.H. Maindonald and C.J. Burden (2005) <https://journal.austms.org.au/V46/CTAC2004/Main/home.html>.
Maintained by John Maindonald. Last updated 2 years ago.
9.3 match 3.00 score 10 scripts
moran79
LDATree:Oblique Classification Trees with Uncorrelated Linear Discriminant Analysis Splits
A classification tree method that uses Uncorrelated Linear Discriminant Analysis (ULDA) for variable selection, split determination, and model fitting in terminal nodes. It automatically handles missing values and offers visualization tools. For more details, see Wang (2024) <doi:10.48550/arXiv.2410.23147>.
Maintained by Siyu Wang. Last updated 5 months ago.
5.0 match 7 stars 5.54 score 5 scripts
cran
sda:Shrinkage Discriminant Analysis and CAT Score Variable Selection
Provides an efficient framework for high-dimensional linear and diagonal discriminant analysis with variable selection. The classifier is trained using James-Stein-type shrinkage estimators and predictor variables are ranked using correlation-adjusted t-scores (CAT scores). Variable selection error is controlled using false non-discovery rates or higher criticism.
Maintained by Korbinian Strimmer. Last updated 3 years ago.
8.6 match 3.21 score 3 dependents
lbb220
GWmodel:Geographically-Weighted Models
Techniques from a particular branch of spatial statistics, termed geographically-weighted (GW) models. GW models suit situations when data are not described well by some global model, but where there are spatial regions where a suitably localised calibration provides a better description. 'GWmodel' includes functions to calibrate: GW summary statistics (Brunsdon et al., 2002) <doi:10.1016/s0198-9715(01)00009-6>, GW principal components analysis (Harris et al., 2011) <doi:10.1080/13658816.2011.554838>, GW discriminant analysis (Brunsdon et al., 2007) <doi:10.1111/j.1538-4632.2007.00709.x> and various forms of GW regression (Brunsdon et al., 1996) <doi:10.1111/j.1538-4632.1996.tb00936.x>; some of which are provided in basic and robust (outlier resistant) forms.
Maintained by Binbin Lu. Last updated 6 months ago.
4.3 match 18 stars 6.38 score 266 scripts 4 dependents
nanxstats
hdnom:Benchmarking and Visualization Toolkit for Penalized Cox Models
Creates nomogram visualizations for penalized Cox regression models, with the support of reproducible survival model building, validation, calibration, and comparison for high-dimensional data.
Maintained by Nan Xiao. Last updated 6 months ago.
benchmarkhigh-dimensional-datalinear-regressionnomogram-visualizationpenalized-cox-modelssurvival-analysisopenblas
3.4 match 43 stars 8.07 score 68 scripts 1 dependent
lrberge
HDclassif:High Dimensional Supervised Classification and Clustering
Discriminant analysis and data clustering methods for high dimensional data, based on the assumption that high-dimensional data live in different subspaces with low dimensionality, proposing a new parametrization of the Gaussian mixture model which combines the ideas of dimension reduction and constraints on the model.
Maintained by Laurent Berge. Last updated 2 years ago.
5.9 match 1 star 4.59 score 87 scripts 1 dependent
momx
Momocs:Morphometrics using R
The goal of 'Momocs' is to provide a complete, convenient, reproducible and open-source toolkit for 2D morphometrics. It includes most common 2D morphometrics approaches on outlines, open outlines, configurations of landmarks, traditional morphometrics, and facilities for data preparation, manipulation and visualization with a consistent grammar throughout. It allows reproducible, complex morphometrics analyses and other morphometrics approaches should be easy to plug in, or develop from, on top of this canvas.
Maintained by Vincent Bonhomme. Last updated 1 year ago.
3.7 match 51 stars 7.42 score 346 scripts
coatless-rpkg
msos:Data Sets and Functions Used in Multivariate Statistics: Old School by John Marden
Multivariate Analysis methods and data sets used in John Marden's book Multivariate Statistics: Old School (2015) <ISBN:978-1456538835>. This also serves as a companion package for the STAT 571: Multivariate Analysis course offered by the Department of Statistics at the University of Illinois at Urbana-Champaign ('UIUC').
Maintained by James Balamuta. Last updated 1 year ago.
6.5 match 3 stars 4.16 score 32 scripts 1 dependent
easystats
parameters:Processing of Model Parameters
Utilities for processing the parameters of various statistical models. Beyond computing p values, CIs, and other indices for a wide variety of models (see list of supported models using the function 'insight::supported_models()'), this package implements features like bootstrapping or simulating of parameters and models, feature reduction (feature extraction and variable selection) as well as functions to describe data and variable characteristics (e.g. skewness, kurtosis, smoothness or distribution).
Maintained by Daniel Lüdecke. Last updated 2 days ago.
betabootstrapciconfidence-intervalsdata-reductioneasystatsfafeature-extractionfeature-reductionhacktoberfestparameterspcapvaluesregression-modelsrobust-statisticsstandardizestandardized-estimatesstatistical-models
1.7 match 453 stars 15.65 score 1.8k scripts 56 dependents
chemhouse-group
rchemo:Dimension Reduction, Regression and Discrimination for Chemometrics
Data exploration and prediction with focus on high dimensional data and chemometrics. The package was initially designed around partial least squares regression and discrimination models and variants, in particular locally weighted PLS models (LWPLS). It has since been expanded to many other methods for analyzing high dimensional data. The name 'rchemo' comes from the fact that the package is oriented toward chemometrics, but most of the provided methods are fully generic and apply to other domains. Functions such as transform(), predict(), coef() and summary() are available. Tuning the predictive models is facilitated by generic functions gridscore() (validation dataset) and gridcv() (cross-validation). Faster versions are also available for models based on latent variables (LVs) (gridscorelv() and gridcvlv()) and ridge regularization (gridscorelb() and gridcvlb()).
Maintained by Marion Brandolini-Bunlon. Last updated 6 months ago.
7.5 match 3 stars 3.52 score 11 scripts
arsilva87
biotools:Tools for Biometry and Applied Statistics in Agricultural Science
Tools designed to perform and evaluate cluster analysis (including Tocher's algorithm), discriminant analysis and path analysis (standard and under collinearity), as well as some useful miscellaneous tools for dealing with sample size and optimum plot size calculations. A test for seed sample heterogeneity is now available. Mantel's permutation test can be found in this package. A new approach for calculating its power is implemented. biotools also contains tests for genetic covariance components. Heuristic approaches for performing non-parametric spatial predictions of generic response variables and spatial gene diversity are implemented.
Maintained by Anderson Rodrigo da Silva. Last updated 3 years ago.
cluster-analysismultivariate-analysisstatisticstocher
3.5 match 2 stars 7.11 score 161 scripts 1 dependent
santiagobarreda
phonTools:Tools for Phonetic and Acoustic Analyses
Contains tools for the organization, display, and analysis of the sorts of data frequently encountered in phonetics research and experimentation, including the easy creation of IPA vowel plots, and the creation and manipulation of WAVE audio files.
Maintained by Santiago Barreda. Last updated 1 year ago.
4.0 match 4 stars 6.21 score 157 scripts 7 dependents
nicolas-robette
GDAtools:Geometric Data Analysis
Many tools for Geometric Data Analysis (Le Roux & Rouanet (2005) <doi:10.1007/1-4020-2236-0>), such as MCA variants (Specific Multiple Correspondence Analysis, Class Specific Analysis), many graphical and statistical aids to interpretation (structuring factors, concentration ellipses, inductive tests, bootstrap validation, etc.) and multiple-table analysis (Multiple Factor Analysis, between- and inter-class analysis, Principal Component Analysis and Correspondence Analysis with Instrumental Variables, etc.).
Maintained by Nicolas Robette. Last updated 10 months ago.
4.1 match 10 stars 5.93 score 94 scripts 2 dependents
hiroyukiyamamoto
loadings:Loadings for Principal Component Analysis and Partial Least Squares
Computing statistical hypothesis testing for loading in principal component analysis (PCA) (Yamamoto, H. et al. (2014) <doi:10.1186/1471-2105-15-51>), orthogonal smoothed PCA (OS-PCA) (Yamamoto, H. et al. (2021) <doi:10.3390/metabo11030149>), one-sided kernel PCA (Yamamoto, H. (2023) <doi:10.51094/jxiv.262>), partial least squares (PLS) and PLS discriminant analysis (PLS-DA) (Yamamoto, H. et al. (2009) <doi:10.1016/j.chemolab.2009.05.006>), PLS with rank order of groups (PLS-ROG) (Yamamoto, H. (2017) <doi:10.1002/cem.2883>), regularized canonical correlation analysis discriminant analysis (RCCA-DA) (Yamamoto, H. et al. (2008) <doi:10.1016/j.bej.2007.12.009>), multiset PLS and PLS-ROG (Yamamoto, H. (2022) <doi:10.1101/2022.08.30.505949>).
Maintained by Hiroyuki Yamamoto. Last updated 11 months ago.
5.9 match 3 stars 4.08 score 27 scripts 1 dependent
bioc
structToolbox:Data processing & analysis tools for Metabolomics and other omics
An extensive set of data (pre-)processing and analysis methods and tools for metabolomics and other omics, with a strong emphasis on statistics and machine learning. This toolbox allows the user to build extensive and standardised workflows for data analysis. The methods and tools have been implemented using class-based templates provided by the struct (Statistics in R Using Class-based Templates) package. The toolbox includes pre-processing methods (e.g. signal drift and batch correction, normalisation, missing value imputation and scaling), univariate (e.g. ttest, various forms of ANOVA, Kruskal–Wallis test and more) and multivariate statistical methods (e.g. PCA and PLS, including cross-validation and permutation testing) as well as machine learning methods (e.g. Support Vector Machines). The STATistics Ontology (STATO) has been integrated and implemented to provide standardised definitions for the different methods, inputs and outputs.
Maintained by Gavin Rhys Lloyd. Last updated 25 days ago.
workflowstepmetabolomicsbioconductor-packagedimslc-msmachine-learningmultivariate-analysisstatisticsunivariate
3.8 match 10 stars 6.26 score 12 scripts
bioc
limma:Linear Models for Microarray and Omics Data
Data analysis, linear models and differential expression for omics data.
Maintained by Gordon Smyth. Last updated 5 days ago.
exonarraygeneexpressiontranscriptionalternativesplicingdifferentialexpressiondifferentialsplicinggenesetenrichmentdataimportbayesianclusteringregressiontimecoursemicroarraymicrornaarraymrnamicroarrayonechannelproprietaryplatformstwochannelsequencingrnaseqbatcheffectmultiplecomparisonnormalizationpreprocessingqualitycontrolbiomedicalinformaticscellbiologycheminformaticsepigeneticsfunctionalgenomicsgeneticsimmunooncologymetabolomicsproteomicssystemsbiologytranscriptomics
1.7 match 13.81 score 16k scripts 585 dependents
lchen723
NetDA:Network-Based Discriminant Analysis Subject to Multi-Label Classes
Implementation of discriminant analysis with network structures in predictors accommodated to do classification and prediction.
Maintained by Li-Pang Chen. Last updated 3 years ago.
11.8 match 2.00 score 10 scripts
bioc
PLSDAbatch:PLSDA-batch
A novel framework to correct for batch effects prior to any downstream analysis in microbiome data based on Projection to Latent Structures Discriminant Analysis. The main method is named “PLSDA-batch”. It first estimates treatment and batch variation with latent components, then subtracts batch-associated components from the data whilst preserving biological variation of interest. PLSDA-batch is highly suitable for microbiome data as it is non-parametric, multivariate and allows for ordination and data visualisation. Combined with centered log-ratio transformation for addressing uneven library sizes and compositional structure, PLSDA-batch addresses all characteristics of microbiome data that existing correction methods have ignored so far. Two other variants are proposed for 1/ unbalanced batch x treatment designs that are commonly encountered in studies with small sample sizes, and for 2/ selection of discriminative variables amongst treatment groups to avoid overfitting in classification problems. These two variants have widened the scope of applicability of PLSDA-batch to different data settings.
Maintained by Yiwen (Eva) Wang. Last updated 5 months ago.
statisticalmethoddimensionreductionprincipalcomponentclassificationmicrobiomebatcheffectnormalizationvisualization
4.3 match 13 stars 5.37 score 18 scripts
herulor
DFIT:Differential Functioning of Items and Tests
A set of functions to perform Raju, van der Linden and Fleer's (1995, <doi:10.1177/014662169501900405>) Differential Functioning of Items and Tests (DFIT) analyses. It includes functions to use the Monte Carlo Item Parameter Replication approach (Oshima, Raju, & Nanda, 2006, <doi:10.1111/j.1745-3984.2006.00001.x>) for obtaining the associated statistical significance tests cut-off points. They may also be used for a priori and post-hoc power calculations (Cervantes, 2017, <doi:10.18637/jss.v076.i05>).
Maintained by Victor H. Cervantes. Last updated 9 months ago.
9.8 match 2.30 score 20 scripts
tidymodels
shinymodels:Interactive Assessments of Models
Launch a 'shiny' application for 'tidymodels' results. For classification or regression models, the app can be used to determine if there is lack of fit or poorly predicted points.
Maintained by Simon Couch. Last updated 5 months ago.
3.6 match 48 stars 6.21 score 48 scripts
nspyrison
spinifex:Manual Tours, Manual Control of Dynamic Projections of Numeric Multivariate Data
Data visualization tours animate linear projections of multivariate data as the basis (i.e. orientation) changes. The 'spinifex' package generates paths for manual tours by manipulating the contribution of a single variable at a time; see Cook & Buja (1997) <doi:10.1080/10618600.1997.10474754>. Other types of tours, such as grand (random walk) and guided (optimizing some objective function), are available in the 'tourr' package; see Wickham et al. <doi:10.18637/jss.v040.i02>. 'spinifex' builds on 'tourr' and can render tours with 'gganimate' and 'plotly' graphics, and allows for exporting as an .html widget and as a .gif, respectively. This work is fully discussed in Spyrison & Cook (2020) <doi:10.32614/RJ-2020-027>.
Maintained by Nicholas Spyrison. Last updated 2 months ago.
dimensionreductiontoursvisualization
3.5 match 3 stars 6.28 score 105 scripts 1 dependent
mmaechler
sfsmisc:Utilities from 'Seminar fuer Statistik' ETH Zurich
Useful utilities ['goodies'] from Seminar fuer Statistik ETH Zurich, some of which were ported from S-plus in the 1990s. For graphics, have pretty (Log-scale) axes eaxis(), an enhanced Tukey-Anscombe plot, combining histogram and boxplot, 2d-residual plots, a 'tachoPlot()', pretty arrows, etc. For robustness, have a robust F test and robust range(). For system support, notably on Linux, provides 'Sys.*()' functions with more access to system and CPU information. Finally, miscellaneous utilities such as simple efficient prime numbers, integer codes, Duplicated(), toLatex.numeric() and is.whole().
Maintained by Martin Maechler. Last updated 5 months ago.
2.0 match 11 stars 10.87 score 566 scripts 119 dependents
bioc
memes:motif matching, comparison, and de novo discovery using the MEME Suite
A seamless interface to the MEME Suite family of tools for motif analysis. 'memes' provides data aware utilities for using GRanges objects as entrypoints to motif analysis, data structures for examining & editing motif lists, and novel data visualizations. 'memes' functions and data structures are amenable to both base R and tidyverse workflows.
Maintained by Spencer Nystrom. Last updated 5 months ago.
dataimportfunctionalgenomicsgeneregulationmotifannotationmotifdiscoverysequencematchingsoftware
2.4 match 49 stars 8.68 score 117 scripts 1 dependentstidymodels
butcher:Model Butcher
Provides a set of S3 generics to axe components of fitted model objects and help reduce the size of model objects saved to disk.
Maintained by Julia Silge. Last updated 13 days ago.
1.8 match 132 stars 11.54 score 146 scripts 13 dependentsvenpopov
bmm:Easy and Accessible Bayesian Measurement Models Using 'brms'
Fit computational and measurement models using full Bayesian inference. The package provides a simple and accessible interface by translating complex domain-specific models into 'brms' syntax, a powerful and flexible framework for fitting Bayesian regression models using 'Stan'. The package is designed so that users can easily apply state-of-the-art models in various research fields, and so that researchers can use it as a new model development framework. References: Frischkorn and Popov (2023) <doi:10.31234/osf.io/umt57>.
Maintained by Vencislav Popov. Last updated 13 days ago.
3.5 match 15 stars 5.92 score 35 scriptsejbz
BsMD:Bayes Screening and Model Discrimination
Bayes screening and model discrimination follow-up designs.
Maintained by Ernesto Barrios. Last updated 1 year ago.
7.2 match 2.86 score 57 scriptsaflapan
biClassify:Binary Classification Using Extensions of Discriminant Analysis
Implements methods for sample size reduction within Linear and Quadratic Discriminant Analysis in Lapanowski and Gaynanova (2020) <arXiv:2005.03858>. Also includes methods for non-linear discriminant analysis with simultaneous sparse feature selection in Lapanowski and Gaynanova (2019) PMLR 89:1704-1713.
Maintained by Alexander F. Lapanowski. Last updated 3 years ago.
10.3 match 2.00 score 4 scriptsedwardslee
alphabetr:Algorithms for High-Throughput Sequencing of Antigen-Specific T Cells
Provides algorithms for frequency-based pairing of alpha-beta T cell receptors.
Maintained by Edward Lee. Last updated 8 years ago.
4.4 match 8 stars 4.60 score 9 scriptsbioc
DFP:Gene Selection
This package provides a supervised technique able to identify differentially expressed genes, based on the construction of Fuzzy Patterns (FPs). The Fuzzy Patterns are built by applying 3 Membership Functions to discretized gene expression values.
Maintained by Rodrigo Alvarez-Glez. Last updated 5 months ago.
microarraydifferentialexpression
5.3 match 3.78 score 5 scriptsmjnueda
lpda:Linear Programming Discriminant Analysis
Classification method obtained through linear programming. It is advantageous with respect to the classical developments when the distribution of the variables involved is unknown or when the number of variables is much greater than the number of individuals. The LPDA method is published in Nueda et al. (2022) "LPDA: A new classification method based on linear programming" <doi:10.1371/journal.pone.0270403>.
Maintained by Maria Jose Nueda. Last updated 2 years ago.
9.6 match 2.00 score 2 scriptskzychaluk
modelfree:Model-Free Estimation of a Psychometric Function
Local linear estimation of psychometric functions. Provides functions for nonparametric estimation of a psychometric function and for estimation of a derived threshold and slope, and their standard deviations and confidence intervals.
Maintained by Kamila Zychaluk. Last updated 2 years ago.
12.1 match 1.58 score 38 scriptsguokai8
o2plsda:Multiomics Data Integration
Provides functions to do 'O2PLS-DA' analysis for multiple omics data integration. The algorithm comes from "O2-PLS, a two-block (X–Y) latent variable regression (LVR) method with an integral OSC filter", published by Johan Trygg and Svante Wold in 2003 <doi:10.1002/cem.775>. 'O2PLS' is a bidirectional multivariate regression method that aims to separate the covariance between two data sets (it was recently extended to multiple data sets) (Löfstedt and Trygg, 2011 <doi:10.1002/cem.1388>; Löfstedt et al., 2012 <doi:10.1016/j.aca.2013.06.026>) from the systematic sources of variance specific to each data set.
Maintained by Kai Guo. Last updated 27 days ago.
integrationmulti-omicso2plsomicsplsdaopenblascppopenmp
3.5 match 6 stars 4.95 score 6 scriptsjhmaindonald
gamclass:Functions and Data for a Course on Modern Regression and Classification
Functions and data are provided that support a course that emphasizes statistical issues of inference and generalizability. The functions are designed to make it straightforward to illustrate the use of cross-validation, the training/test approach, simulation, and model-based estimates of accuracy. Methods considered are Generalized Additive Modeling, Linear and Quadratic Discriminant Analysis, Tree-based methods, and Random Forests.
Maintained by John Maindonald. Last updated 2 years ago.
3.5 match 4.82 score 44 scriptsmlcollyer
RRPP:Linear Model Evaluation with Randomized Residuals in a Permutation Procedure
Linear model calculations are made for many random versions of data. Using residual randomization in a permutation procedure, sums of squares are calculated over many permutations to generate empirical probability distributions for evaluating model effects. Additionally, coefficients, statistics, fitted values, and residuals generated over many permutations can be used for various procedures including pairwise tests, prediction, classification, and model comparison. This package should provide most tools one could need for the analysis of high-dimensional data, especially in ecology and evolutionary biology, but certainly other fields, as well.
Maintained by Michael Collyer. Last updated 26 days ago.
1.7 match 4 stars 9.84 score 173 scripts 7 dependentsterrytangyuan
dml:Distance Metric Learning in R
State-of-the-art algorithms for distance metric learning, including global and local methods such as Relevant Component Analysis, Discriminative Component Analysis, Local Fisher Discriminant Analysis, etc. These distance metric learning methods are widely applied in feature extraction, dimensionality reduction, clustering, classification, information retrieval, and computer vision problems.
Maintained by Yuan Tang. Last updated 2 years ago.
dimensionality-reductiondistance-metric-learningmachine-learningmetric-learningstatistics
2.8 match 58 stars 5.94 score 8 scripts 1 dependentsbioc
MBECS:Evaluation and correction of batch effects in microbiome data-sets
The Microbiome Batch Effect Correction Suite (MBECS) provides a set of functions to evaluate and mitigate unwanted noise due to processing in batches. To that end, it incorporates a host of batch correcting algorithms (BECA) from various packages. In addition, it offers a correction and reporting pipeline that provides a preliminary look at the characteristics of a data-set before and after correcting for batch effects.
Maintained by Michael Olbrich. Last updated 5 months ago.
batcheffectmicrobiomereportwritingvisualizationnormalizationqualitycontrol
3.5 match 4 stars 4.60 score 4 scriptsdepmix
hmmr:"Mixture and Hidden Markov Models with R" Datasets and Example Code
Datasets and code examples that accompany our book Visser & Speekenbrink (2021), "Mixture and Hidden Markov Models with R", <https://depmix.github.io/hmmr/>.
Maintained by Ingmar Visser. Last updated 4 years ago.
8.0 match 1 stars 2.00 score 7 scriptsmanuelrausch
statConfR:Models of Decision Confidence and Measures of Metacognition
Provides fitting functions and other tools for decision confidence and metacognition researchers, including meta-d'/d', often considered to be the gold standard to measure metacognitive efficiency, and information-theoretic measures of metacognition. Also allows fitting several static models of decision making and confidence.
Maintained by Manuel Rausch. Last updated 12 days ago.
cognitive-modelinginformation-theorymetacognitionsignal-detection-theory
3.2 match 5 stars 4.93 score 8 scriptsrikenbit
guidedPLS:Supervised Dimensional Reduction by Guided Partial Least Squares
Guided partial least squares (guided-PLS) is the combination of partial least squares by singular value decomposition (PLS-SVD) and guided principal component analysis (guided-PCA). For the details of the methods, see the reference section of GitHub README.md <https://github.com/rikenbit/guidedPLS>.
Maintained by Koki Tsuyuzaki. Last updated 2 years ago.
4.0 match 4.00 scoretjetka
SLEMI:Statistical Learning Based Estimation of Mutual Information
An implementation of the algorithm for estimating mutual information and channel capacity from experimental data by classification procedures (logistic regression). Technically, it allows estimation of information-theoretic measures between finite-state input and multivariate, continuous output. The method is described in Jetka et al. (2019) <doi:10.1371/journal.pcbi.1007132>.
Maintained by Tomasz Jetka. Last updated 1 year ago.
channel-capacityinformation-theorylogistic-regressionmutual-information-estimation
3.1 match 4 stars 4.92 score 21 scriptscran
packMBPLSDA:Multi-Block Partial Least Squares Discriminant Analysis
Several functions are provided to implement MBPLSDA: components search, optimal model components number search, optimal model validity test by permutation tests, observed values evaluation of optimal model parameters and predicted categories, bootstrap values evaluation of optimal model parameters and predicted cross-validated categories. The use of this package is described in Brandolini-Bunlon et al. (2019), Multi-block PLS discriminant analysis for the joint analysis of metabolomic and epidemiological data, Metabolomics, 15(10):134.
Maintained by Marion Brandolini-Bunlon. Last updated 3 years ago.
15.0 match 1.00 scorebioc
SVMDO:Identification of Tumor-Discriminating mRNA Signatures via Support Vector Machines Supported by Disease Ontology
It is an easy-to-use GUI using disease information for detecting tumor/normal sample discriminating gene sets from differentially expressed genes. Our approach is based on an iterative algorithm filtering genes with disease ontology enrichment analysis and the Wilks' lambda criterion, connected to SVM classification model construction. Along with gene set extraction, SVMDO also provides individual prognostic marker detection. The algorithm is designed for FPKM and RPKM normalized RNA-Seq transcriptome datasets.
Maintained by Mustafa Erhan Ozer. Last updated 5 months ago.
genesetenrichmentdifferentialexpressionguiclassificationrnaseqtranscriptomicssurvivalmachine-learningrna-seqshiny
3.2 match 4.60 score 2 scriptsvalentint
tclust:Robust Trimmed Clustering
Provides functions for robust trimmed clustering. The methods are described in Garcia-Escudero (2008) <doi:10.1214/07-AOS515>, Fritz et al. (2012) <doi:10.18637/jss.v047.i12>, Garcia-Escudero et al. (2011) <doi:10.1007/s11222-010-9194-z> and others.
Maintained by Valentin Todorov. Last updated 25 days ago.
1.8 match 3 stars 8.02 score 72 scripts 3 dependentssciviews
mlearning:Machine Learning Algorithms with Unified Interface and Confusion Matrices
A unified interface is provided to various machine learning algorithms such as linear or quadratic discriminant analysis, k-nearest neighbors, random forest, support vector machine, etc. It allows training, testing, and applying cross-validation using similar functions and function arguments with a minimalist and clean, formula-based interface. Missing data are processed the same way as in base and stats R functions for all algorithms, both in training and testing. Confusion matrices are also provided with a rich set of metrics calculated and a few specific plots.
Maintained by Philippe Grosjean. Last updated 2 years ago.
4.0 match 3.59 score 26 scripts 1 dependentsdsy109
HoRM:Supplemental Functions and Datasets for "Handbook of Regression Methods"
Supplement for the book "Handbook of Regression Methods" by D. S. Young. Some datasets used in the book are included and documented. Wrapper functions are included that simplify the examples in the textbook, such as code for constructing a regressogram and expanding ANOVA tables to reflect the total sum of squares.
Maintained by Derek S. Young. Last updated 9 months ago.
regression-analysisregression-modelsshiny-apps
4.0 match 3.56 score 73 scriptsparadoxical-rhapsody
PLFD:Portmanteau Local Feature Discrimination for Matrix-Variate Data
The portmanteau local feature discriminant approach first identifies the local discriminant features and their differential structures, then constructs the discriminant rule by pooling the identified local features together. This method is applicable to high-dimensional matrix-variate data. See the paper by Xu, Luo and Chen (2021, <doi:10/gmt2gd>).
Maintained by Zengchao Xu. Last updated 2 years ago.
3.8 match 3.70 score 3 scriptsmchlbckr
lcda:Latent Class Discriminant Analysis
Provides a method for Local Discrimination via Latent Class Models. The approach is described in <https://www.r-project.org/conferences/useR-2009/abstracts/pdf/Bucker.pdf>.
Maintained by Michael Buecker. Last updated 1 year ago.
13.9 match 1.00 score 6 scriptsajmolstad
MatrixLDA:Penalized Matrix-Normal Linear Discriminant Analysis
Fits the penalized matrix-normal model to be used for linear discriminant analysis with matrix-valued predictors. For a description of the method, see Molstad and Rothman (2018) <doi:10.1080/10618600.2018.1476249>.
Maintained by Aaron J. Molstad. Last updated 1 year ago.
5.1 match 1 stars 2.70 score 4 scriptsmspeekenbrink
sdamr:Statistics: Data Analysis and Modelling
Data sets and functions to support the books "Statistics: Data analysis and modelling" by Speekenbrink, M. (2021) <https://mspeekenbrink.github.io/sdam-book/> and "An R companion to Statistics: data analysis and modelling" by Speekenbrink, M. (2021) <https://mspeekenbrink.github.io/sdam-r-companion/>. All datasets analysed in these books are provided in this package. In addition, the package provides functions to compute sample statistics (variance, standard deviation, mode), create raincloud and enhanced Q-Q plots, and expand Anova results into omnibus tests and tests of individual contrasts.
Maintained by Maarten Speekenbrink. Last updated 1 month ago.
3.1 match 5 stars 4.39 score 99 scriptsg-rho
hda:Heteroscedastic Discriminant Analysis
Functions to perform dimensionality reduction for classification if the covariance matrices of the classes are unequal.
Maintained by Gero Szepannek. Last updated 9 years ago.
9.1 match 1.48 score 9 scripts 1 dependentssharifrahmanie
MBMethPred:Medulloblastoma Subgroups Prediction
Utilizing a combination of machine learning models (Random Forest, Naive Bayes, K-Nearest Neighbor, Support Vector Machines, Extreme Gradient Boosting, and Linear Discriminant Analysis) and a deep Artificial Neural Network model, 'MBMethPred' can predict medulloblastoma subgroups, including wingless (WNT), sonic hedgehog (SHH), Group 3, and Group 4 from DNA methylation beta values. See Sharif Rahmani E, Lawarde A, Lingasamy P, Moreno SV, Salumets A and Modhukur V (2023), MBMethPred: a computational framework for the accurate classification of childhood medulloblastoma subgroups using data integration and AI-based approaches. Front. Genet. 14:1233657. <doi:10.3389/fgene.2023.1233657> for more details.
Maintained by Edris Sharif Rahmani. Last updated 1 year ago.
3.6 match 3.70 score 1 scriptsboxiang-wang
sdwd:Sparse Distance Weighted Discrimination
Formulates a sparse distance weighted discrimination (SDWD) for high-dimensional classification and implements a very fast algorithm for computing its solution path with the L1, the elastic-net, and the adaptive elastic-net penalties. More details about the SDWD methodology can be found in Wang and Zou (2016) <doi:10.1080/10618600.2015.1049700>.
Maintained by Boxiang Wang. Last updated 3 years ago.
5.5 match 2.41 score 13 scriptsbbuchsbaum
multivarious:Extensible Data Structures for Multivariate Analysis
Provides a set of basic and extensible data structures and functions for multivariate analysis, including dimensionality reduction techniques, projection methods, and preprocessing functions. The aim of this package is to offer a flexible and user-friendly framework for multivariate analysis that can be easily extended for custom requirements and specific data analysis tasks.
Maintained by Bradley Buchsbaum. Last updated 3 months ago.
3.8 match 3.53 score 17 scriptsrichjjackson
psc:Personalised Synthetic Controls
Allows the comparison of data cohorts (DC) against a Counter Factual Model (CFM) and measures the difference in terms of an efficacy parameter. Allows the application of Personalised Synthetic Controls.
Maintained by Richard Jackson. Last updated 4 months ago.
3.1 match 1 stars 4.23 score 24 scriptsroelandkindt
BiodiversityR:Package for Community Ecology and Suitability Analysis
Graphical User Interface (via the R-Commander) and utility functions (often based on the vegan package) for statistical analysis of biodiversity and ecological communities, including species accumulation curves, diversity indices, Renyi profiles, GLMs for analysis of species abundance and presence-absence, distance matrices, Mantel tests, and cluster, constrained and unconstrained ordination analysis. A book on biodiversity and community ecology analysis is available for free download from the website. In 2012, methods for (ensemble) suitability modelling and mapping were expanded in the package.
Maintained by Roeland Kindt. Last updated 2 months ago.
1.7 match 16 stars 7.42 score 390 scripts 2 dependentsdrizopoulos
JMbayes:Joint Modeling of Longitudinal and Time-to-Event Data under a Bayesian Approach
Shared parameter models for the joint modeling of longitudinal and time-to-event data using MCMC; Dimitris Rizopoulos (2016) <doi:10.18637/jss.v072.i07>.
Maintained by Dimitris Rizopoulos. Last updated 4 years ago.
joint-modelslongitudinal-responsesprediction-modelsurvival-analysisopenblascppopenmpjags
1.8 match 60 stars 6.98 score 80 scriptsmoskante
MixedPsy:Statistical Tools for the Analysis of Psychophysical Data
Tools for the analysis of psychophysical data in R. This package allows estimating the Point of Subjective Equivalence (PSE) and the Just Noticeable Difference (JND), either from a psychometric function or from a Generalized Linear Mixed Model (GLMM). Additionally, the package allows plotting the fitted models and the response data, simulating psychometric functions of different shapes, and simulating data sets. For a description of the use of GLMMs applied to psychophysical data, refer to Moscatelli et al. (2012).
Maintained by Alessandro Moscatelli. Last updated 26 days ago.
3.4 match 5 stars 3.70 score 9 scriptsscarpino
multiDimBio:Multivariate Analysis and Visualization for Biological Data
Code to support a systems biology research program from inception through publication. The methods focus on dimension reduction approaches to detect patterns in complex, multivariate experimental data and place an emphasis on informative visualizations. The goal for this project is to create a package that will evolve over time, thereby remaining relevant and reflective of current methods and techniques. As a result, we encourage suggested additions to the package, both methodological and graphical.
Maintained by Samuel V. Scarpino. Last updated 5 years ago.
5.1 match 2.41 score 26 scriptsdvrbts
labdsv:Ordination and Multivariate Analysis for Ecology
A variety of ordination and community analyses useful in analysis of data sets in community ecology. Includes many of the common ordination methods, with graphical routines to facilitate their interpretation, as well as several novel analyses.
Maintained by David W. Roberts. Last updated 2 years ago.
2.0 match 3 stars 6.08 score 452 scripts 13 dependentsjonasbhend
easyVerification:Ensemble Forecast Verification for Large Data Sets
Set of tools to simplify application of atomic forecast verification metrics for (comparative) verification of ensemble forecasts to large data sets. The forecast metrics are imported from the 'SpecsVerification' package, and additional forecast metrics are provided with this package. Alternatively, new user-defined forecast scores can be implemented using the example scores provided and applied using the functionality of this package.
Maintained by Jonas Bhend. Last updated 2 years ago.
2.0 match 1 stars 6.04 score 61 scripts 4 dependentsips-lmu
emuR:Main Package of the EMU Speech Database Management System
Provides the EMU Speech Database Management System (EMU-SDMS) with database management, data extraction, data preparation and data visualization facilities. See <https://ips-lmu.github.io/The-EMU-SDMS-Manual/> for more details.
Maintained by Markus Jochim. Last updated 1 year ago.
1.8 match 24 stars 6.89 score 135 scripts 1 dependentskhliland
plsVarSel:Variable Selection in Partial Least Squares
Interfaces and methods for variable selection in Partial Least Squares. The methods include filter methods, wrapper methods and embedded methods. Both regression and classification are supported.
Maintained by Kristian Hovde Liland. Last updated 3 days ago.
1.9 match 3 stars 6.33 score 40 scripts 4 dependentswjakethompson
measr:Bayesian Psychometric Measurement Using 'Stan'
Estimate diagnostic classification models (also called cognitive diagnostic models) with 'Stan'. Diagnostic classification models are confirmatory latent class models, as described by Rupp et al. (2010, ISBN: 978-1-60623-527-0). Automatically generate 'Stan' code for the general loglinear cognitive diagnostic model proposed by Henson et al. (2009) <doi:10.1007/s11336-008-9089-5> and other subtypes that introduce additional model constraints. Using the generated 'Stan' code, estimate the model and evaluate the model's performance using model fit indices, information criteria, and reliability metrics.
Maintained by W. Jake Thompson. Last updated 2 months ago.
bayesiancdmcmdstanrcognitive-diagnosiscognitive-diagnostic-modelsdcmdiagnostic-classification-modelspsychometricsrstanstancpp
1.8 match 10 stars 6.75 score 31 scriptsbioc
RCAS:RNA Centric Annotation System
RCAS is an R/Bioconductor package designed as a generic reporting tool for the functional analysis of transcriptome-wide regions of interest detected by high-throughput experiments. Such transcriptomic regions could be, for instance, signal peaks detected by CLIP-Seq analysis for protein-RNA interaction sites, RNA modification sites (alias the epitranscriptome), CAGE-tag locations, or any other collection of query regions at the level of the transcriptome. RCAS produces in-depth annotation summaries and coverage profiles based on the distribution of the query regions with respect to transcript features (exons, introns, 5'/3' UTR regions, exon-intron boundaries, promoter regions). Moreover, RCAS can carry out functional enrichment analyses and discriminative motif discovery.
Maintained by Bora Uyar. Last updated 5 months ago.
softwaregenetargetmotifannotationmotifdiscoverygotranscriptomicsgenomeannotationgenesetenrichmentcoverage
1.8 match 6.32 score 29 scripts 1 dependentsbioc
esetVis:Visualizations of expressionSet Bioconductor object
Utility functions for visualization of expressionSet (or SummarizedExperiment) Bioconductor objects, including spectral map, tsne and linear discriminant analysis. Static plots via the ggplot2 package or interactive plots via the ggvis or rbokeh packages are available.
Maintained by Laure Cougnaud. Last updated 5 months ago.
visualizationdatarepresentationdimensionreductionprincipalcomponentpathways
3.5 match 3.30 score 6 scriptshusson
SensoMineR:Sensory Data Analysis
Statistical Methods to Analyse Sensory Data. SensoMineR: A package for sensory data analysis. S. Le and F. Husson (2008).
Maintained by Francois Husson. Last updated 1 year ago.
2.0 match 5.72 score 108 scripts 3 dependentsbioc
MLInterfaces:Uniform interfaces to R machine learning procedures for data in Bioconductor containers
This package provides uniform interfaces to machine learning code for data in R and Bioconductor containers.
Maintained by Vincent Carey. Last updated 5 months ago.
1.5 match 7.63 score 79 scripts 6 dependentsdavidhofmeyr
PPCI:Projection Pursuit for Cluster Identification
Implements recently developed projection pursuit algorithms for finding optimal linear cluster separators. The clustering algorithms use optimal hyperplane separators based on minimum density, Pavlidis et al. (2016) <https://jmlr.csail.mit.edu/papers/volume17/15-307/15-307.pdf>; minimum normalised cut, Hofmeyr (2017) <doi:10.1109/TPAMI.2016.2609929>; and maximum variance ratio clusterability, Hofmeyr and Pavlidis (2015) <doi:10.1109/SSCI.2015.116>.
Maintained by David Hofmeyr. Last updated 5 years ago.
3.5 match 2 stars 3.26 score 18 scriptsglenmartin31
predRupdate:Prediction Model Validation and Updating
Evaluate the predictive performance of an existing (i.e. previously developed) prediction/prognostic model given relevant information about the existing prediction model (e.g. coefficients) and a new dataset. Provides a range of model updating methods that help tailor the existing model to the new dataset; see Su et al. (2018) <doi:10.1177/0962280215626466>. Techniques to aggregate multiple existing prediction models on the new data are also provided; see Debray et al. (2014) <doi:10.1002/sim.6080> and Martin et al. (2018) <doi:10.1002/sim.7586>.
Maintained by Glen P. Martin. Last updated 7 months ago.
2.0 match 7 stars 5.62 score 9 scriptsadeverse
ade4TkGUI:'ade4' Tcl/Tk Graphical User Interface
A Tcl/Tk GUI for some basic functions in the 'ade4' package.
Maintained by Aurélie Siberchicot. Last updated 6 days ago.
2.3 match 2 stars 4.96 score 5 scriptsdcauseur
FADA:Variable Selection for Supervised Classification in High Dimension
The functions provided in the FADA (Factor Adjusted Discriminant Analysis) package aim at performing supervised classification of high-dimensional and correlated profiles. The procedure combines a decorrelation step based on a factor modeling of the dependence among covariates and a classification method. The available methods are Lasso regularized logistic model (see Friedman et al. (2010)), sparse linear discriminant analysis (see Clemmensen et al. (2011)), shrinkage linear and diagonal discriminant analysis (see M. Ahdesmaki et al. (2010)). More methods of classification can be used on the decorrelated data provided by the package FADA.
Maintained by David Causeur. Last updated 5 years ago.
5.8 match 1.90 score 6 scriptsrwehrens
BioMark:Find Biomarkers in Two-Class Discrimination Problems
Variable selection methods are provided for several classification methods: the lasso/elastic net, PCLDA, PLSDA, and several t-tests. Two approaches for selecting cutoffs can be used, one based on the stability of model coefficients under perturbation, and the other on higher criticism.
Maintained by Ron Wehrens. Last updated 10 years ago.
4.7 match 2.32 score 21 scriptsbpoconnor
DFA.CANCOR:Linear Discriminant Function and Canonical Correlation Analysis
Produces SPSS- and SAS-like output for linear discriminant function analysis and canonical correlation analysis. The methods are described in Manly & Alberto (2017, ISBN:9781498728966), Rencher (2002, ISBN:0-471-41889-7), and Tabachnik & Fidell (2019, ISBN:9780134790541).
Maintained by Brian P. OConnor. Last updated 4 months ago.
5.4 match 2.00 scorehuacheng1985
catR:Generation of IRT Response Patterns under Computerized Adaptive Testing
Provides routines for the generation of response patterns under a unidimensional dichotomous and polytomous computerized adaptive testing (CAT) framework. It holds many standard functions to estimate ability, select the first item(s) to administer and optimally select the next item, as well as several stopping rules. Options to control for item exposure and content balancing are also available (Magis and Barrada (2017) <doi:10.18637/jss.v076.c01>).
Maintained by Cheng Hua. Last updated 3 years ago.
2.7 match 3 stars 4.03 score 107 scripts 1 dependentsvalentint
rda:Shrunken Centroids Regularized Discriminant Analysis
Provides functions implementing the shrunken centroids regularized discriminant analysis for classification purposes in high-dimensional data. The method is described in Guo et al. (2013) <doi:10.1093/biostatistics/kxj035>.
Maintained by Valentin Todorov. Last updated 2 years ago.
3.5 match 3.02 score 21 scriptscran
OBsMD:Objective Bayesian Model Discrimination in Follow-Up Designs
Implements the objective Bayesian methodology proposed in Consonni and Deldossi in order to choose the optimal experiment that better discriminates between competing models; see Deldossi and Nai Ruscone (2020) <doi:10.18637/jss.v094.i02>.
Maintained by Marta Nai Ruscone. Last updated 7 months ago.
6.9 match 1.52 score 33 scriptsevolecolgroup
tidypopgen:Tidy Population Genetics
We provide a tidy grammar of population genetics, facilitating the manipulation and analysis of data on biallelic single nucleotide polymorphisms (SNPs).
Maintained by Andrea Manica. Last updated 3 days ago.
1.8 match 4 stars 5.83 score 8 scriptsjakobraymaekers
classmap:Visualizing Classification Results
Tools to visualize the results of a classification of cases. The graphical displays include stacked plots, silhouette plots, quasi residual plots, and class maps. Implements the techniques described and illustrated in Raymaekers, Rousseeuw and Hubert (2021), Class maps for visualizing classification results, Technometrics, appeared online. <doi:10.1080/00401706.2021.1927849> (open access) and Raymaekers and Rousseeuw (2021), Silhouettes and quasi residual plots for neural nets and tree-based classifiers, <arXiv:2106.08814>. Examples can be found in the vignettes: "Discriminant_analysis_examples", "K_nearest_neighbors_examples", "Support_vector_machine_examples", "Rpart_examples", "Random_forest_examples", and "Neural_net_examples".
Maintained by Jakob Raymaekers. Last updated 2 years ago.
3.4 match 3.08 score 20 scriptscran
transDA:Transformation Discriminant Analysis
Performs transformation discrimination analysis and non-transformation discrimination analysis. It also includes functions for Linear Discriminant Analysis, Quadratic Discriminant Analysis, and Mixture Discriminant Analysis. In the context of mixture discriminant analysis, it offers options for both common covariance matrix (common sigma) and individual covariance matrices (uncommon sigma) for the mixture components.
Maintained by Jing Li. Last updated 4 months ago.
10.2 match 1.00 scoreapedrods
MAINT.Data:Model and Analyse Interval Data
Implements methodologies for modelling interval data by Normal and Skew-Normal distributions, considering appropriate parameterizations of the variance-covariance matrix that take into account the intrinsic nature of interval data and lead to four different possible configuration structures. The Skew-Normal parameters can be estimated by maximum likelihood, while Normal parameters may be estimated by maximum likelihood or robust trimmed maximum likelihood methods.
Maintained by Pedro Duarte Silva. Last updated 2 years ago.
8.9 match 1.15 score 14 scriptslanl
ezECM:Event Categorization Matrix Classification for Nuclear Detonations
Implementation of an Event Categorization Matrix (ECM) detonation detection model and a Bayesian variant. Functions are provided for importing and exporting data, fitting models, and applying decision criteria for categorizing new events. This package implements methods described in the paper "Bayesian Event Categorization Matrix Approach for Nuclear Detonations" Koermer, Carmichael, and Williams (2024) available on arXiv at <doi:10.48550/arXiv.2409.18227>.
Maintained by Scott Koermer. Last updated 5 months ago.
2.0 match 5.08 score 4 scripts
anthonyraborn
repsd:Root Expected Proportion Squared Difference for Detecting DIF
Root Expected Proportion Squared Difference (REPSD) is a nonparametric differential item functioning (DIF) method that (a) allows practitioners to explore for DIF related to small, fine-grained focal groups of examinees, and (b) compares the focal group directly to the composite group that will be used to develop the reported test score scale. Using your provided response matrix with a column that identifies focal group membership, this package provides the REPSD values, a simulated null distribution of possible REPSD values, and the simulated p-values identifying items possibly displaying DIF without requiring enormous sample sizes.
Maintained by Anthony William Raborn. Last updated 2 years ago.
3.8 match 2.70 score 1 scripts
cran
funFEM:Clustering in the Discriminative Functional Subspace
The funFEM algorithm (Bouveyron et al., 2014) clusters functional data by modeling the curves within a common and discriminative functional subspace.
Maintained by Charles Bouveyron. Last updated 3 years ago.
5.3 match 1.84 score 23 scripts 1 dependents
pcruniversum
shinyMolBio:Molecular Biology Visualization Tools for 'Shiny' Apps
Interactive visualization of 'RDML' files via 'shiny' apps. Package provides (1) PCR plate interface with ability to select individual tubes; (2) amplification/melting plots with fast hiding and highlighting individual curves; (3) 2D allelic discrimination plot.
Maintained by Konstantin A. Blagodatskikh. Last updated 4 months ago.
2.4 match 6 stars 4.10 score 14 scripts
cran
TExPosition:Two-Table ExPosition
An extension of ExPosition for two table analyses, specifically, discriminant analyses.
Maintained by Derek Beaton. Last updated 6 years ago.
4.6 match 2.15 score 70 scripts
romanguchenko
rodd:Optimal Discriminating Designs
A collection of functions for numerical construction of optimal discriminating designs. At the current moment T-optimal designs (which maximize the lower bound for the power of F-test for regression model discrimination), KL-optimal designs (for lognormal errors) and their robust analogues can be calculated with the package.
Maintained by Roman Guchenko. Last updated 9 years ago.
9.7 match 1.00 score 3 scripts
tfletcher05
psychometric:Applied Psychometric Theory
Contains functions useful for correlation theory, meta-analysis (validity-generalization), reliability, item analysis, inter-rater reliability, and classical utility.
Maintained by Thomas D. Fletcher. Last updated 1 year ago.
2.3 match 4.24 score 181 scripts 1 dependents
gdurif
plsgenomics:PLS Analyses for Genomics
Routines for PLS-based genomic analyses, implementing PLS methods for classification with microarray data and prediction of transcription factor activities from combined ChIP-chip analysis. The >=1.2-1 versions include two new classification methods for microarray data: GSIM and Ridge PLS. The >=1.3 versions include a new classification method combining variable selection and compression in a logistic regression context: logit-SPLS; and an adaptive version of the sparse PLS.
Maintained by Ghislain Durif. Last updated 12 months ago.
1.7 match 5.55 score 140 scripts 2 dependents
cran
wavethresh:Wavelets Statistics and Transforms
Performs 1, 2 and 3D real and complex-valued wavelet transforms, nondecimated transforms, wavelet packet transforms, nondecimated wavelet packet transforms, multiple wavelet transforms, complex-valued wavelet transforms, wavelet shrinkage for various kinds of data, locally stationary wavelet time series, nonstationary multiscale transfer function modeling, density estimation.
Maintained by Guy Nason. Last updated 7 months ago.
1.6 match 5.89 score 41 dependents
bioc
FuseSOM:A Correlation Based Multiview Self Organizing Maps Clustering For IMC Datasets
A correlation-based multiview self-organizing map for the characterization of cell types in highly multiplexed in situ imaging cytometry assays (`FuseSOM`) is a tool for unsupervised clustering. `FuseSOM` is robust and achieves high accuracy by combining a `Self Organizing Map` architecture and a `Multiview` integration of correlation based metrics. This allows FuseSOM to cluster highly multiplexed in situ imaging cytometry assays.
Maintained by Elijah Willie. Last updated 5 months ago.
singlecellcellbasedassaysclusteringspatial
2.0 match 1 stars 4.71 score 17 scripts
josetamezpena
FRESA.CAD:Feature Selection Algorithms for Computer Aided Diagnosis
Contains a set of utilities for building and testing statistical models (linear, logistic, ordinal or Cox) for Computer Aided Diagnosis/Prognosis applications. Utilities include data adjustment, univariate analysis, model building, model validation, longitudinal analysis, reporting and visualization.
Maintained by Jose Gerardo Tamez-Pena. Last updated 1 months ago.
1.7 match 7 stars 5.59 score 31 scripts
bioc
wavClusteR:Sensitive and highly resolved identification of RNA-protein interaction sites in PAR-CLIP data
The package provides an integrated pipeline for the analysis of PAR-CLIP data. PAR-CLIP-induced transitions are first discriminated from sequencing errors, SNPs and additional non-experimental sources by a non-parametric mixture model. The protein binding sites (clusters) are then resolved at high resolution and cluster statistics are estimated using a rigorous Bayesian framework. Post-processing of the results, data export for UCSC genome browser visualization and motif search analysis are provided. In addition, the package allows integrating RNA-Seq data to estimate the False Discovery Rate of cluster detection. Key functions support parallel multicore computing. Note: while wavClusteR was designed for PAR-CLIP data analysis, it can be applied to the analysis of other NGS data obtained from experimental procedures that induce nucleotide substitutions (e.g. BisSeq).
Maintained by Federico Comoglio. Last updated 5 months ago.
immunooncologysequencingtechnologyripseqrnaseqbayesian
2.0 match 4.60 score 3 scripts
cran
robqda:Robust Quadratic Discriminant Analysis
The minimum covariance determinant estimator is used to perform robust quadratic discriminant analysis, including cross-validation. References: Friedman J., Hastie T. and Tibshirani R. (2009). "The elements of statistical learning", 2nd edition. Springer, Berlin. <doi:10.1007/978-0-387-84858-7>.
Maintained by Michail Tsagris. Last updated 3 months ago.
9.1 match 1.00 score
cran
qdm:Fitting a Quadrilateral Dissimilarity Model to Same-Different Judgments
This package provides different specifications of a Quadrilateral Dissimilarity Model which can be used to fit same-different judgments in order to get a predicted matrix that satisfies regular minimality [Colonius & Dzhafarov, 2006, Measurement and representations of sensations, Erlbaum]. From such a matrix, Fechnerian distances can be computed.
Maintained by Nora Umbach. Last updated 10 years ago.
9.1 match 1.00 score
drizopoulos
JM:Joint Modeling of Longitudinal and Survival Data
Shared parameter models for the joint modeling of longitudinal and time-to-event data.
Maintained by Dimitris Rizopoulos. Last updated 3 years ago.
1.8 match 2 stars 4.93 score 112 scripts 1 dependents
blansche
fdm2id:Data Mining and R Programming for Beginners
Contains functions to simplify the use of data mining methods (classification, regression, clustering, etc.), for students and beginners in R programming. Various R packages are used and wrappers are built around the main functions, to standardize the use of data mining methods (input/output): it brings a certain loss of flexibility, but also a gain of simplicity. The package name came from the French "Fouille de Données en Master 2 Informatique Décisionnelle".
Maintained by Alexandre Blansché. Last updated 2 years ago.
5.4 match 1 stars 1.62 score 42 scripts
bioc
scDiagnostics:Cell type annotation diagnostics
The scDiagnostics package provides diagnostic plots to assess the quality of cell type assignments from single cell gene expression profiles. The implemented functionality makes it possible to assess the reliability of cell type annotations, investigate gene expression patterns, and explore relationships between different cell types in query and reference datasets, allowing users to detect potential misalignments between reference and query datasets. The package also provides visualization capabilities for diagnostic purposes.
Maintained by Anthony Christidis. Last updated 5 months ago.
annotationclassificationclusteringgeneexpressionrnaseqsinglecellsoftwaretranscriptomics
1.1 match 8 stars 7.77 score 46 scripts
cran
IDmeasurer:Assessment of Individual Identity in Animal Signals
Provides tools for assessment and quantification of individual identity information in animal signals. This package accompanies a research article by Linhart et al. (2019) <doi:10.1101/546143>: "Measuring individual identity information in animal signals: Overview and performance of available identity metrics".
Maintained by Pavel Linhart. Last updated 6 years ago.
3.2 match 2.70 score 4 scripts
bioc
DepecheR:Determination of essential phenotypic elements of clusters in high-dimensional entities
The purpose of this package is to identify traits in a dataset that can separate groups. This is done on two levels. First, clustering is performed, using an implementation of sparse K-means. Secondly, the generated clusters are used to predict outcomes of groups of individuals based on their distribution of observations in the different clusters. As certain clusters with separating information will be identified, and these clusters are defined by a sparse number of variables, this method can reduce the complexity of data, to only emphasize the data that actually matters.
Maintained by Jakob Theorell. Last updated 5 months ago.
softwarecellbasedassaystranscriptiondifferentialexpressiondatarepresentationimmunooncologytranscriptomicsclassificationclusteringdimensionreductionfeatureextractionflowcytometryrnaseqsinglecellvisualizationcpp
1.7 match 5.18 score 15 scripts
bioc
ASICS:Automatic Statistical Identification in Complex Spectra
With a set of pure metabolite reference spectra, ASICS quantifies concentration of metabolites in a complex spectrum. The identification of metabolites is performed by fitting a mixture model to the spectra of the library with a sparse penalty. The method and its statistical properties are described in Tardivel et al. (2017) <doi:10.1007/s11306-017-1244-5>.
Maintained by Gaëlle Lefort. Last updated 5 months ago.
softwaredataimportcheminformaticsmetabolomics
1.7 match 5.18 score 30 scripts
bioc
ropls:PCA, PLS(-DA) and OPLS(-DA) for multivariate analysis and feature selection of omics data
Latent variable modeling with Principal Component Analysis (PCA) and Partial Least Squares (PLS) are powerful methods for visualization, regression, classification, and feature selection of omics data where the number of variables exceeds the number of samples and with multicollinearity among variables. Orthogonal Partial Least Squares (OPLS) separately models the variation correlated with the factor of interest (predictive) and the uncorrelated (orthogonal) variation. While performing similarly to PLS, OPLS facilitates interpretation. Successful applications of these chemometrics techniques include spectroscopic data such as Raman spectroscopy, nuclear magnetic resonance (NMR), mass spectrometry (MS) in metabolomics and proteomics, but also transcriptomics data. In addition to scores, loadings and weights plots, the package provides metrics and graphics to determine the optimal number of components (e.g. with the R2 and Q2 coefficients), check the validity of the model by permutation testing, detect outliers, and perform feature selection (e.g. with Variable Importance in Projection or regression coefficients). The package can be accessed via a user interface on the Workflow4Metabolomics.org online resource for computational metabolomics (built upon the Galaxy environment).
Maintained by Etienne A. Thevenot. Last updated 5 months ago.
regressionclassificationprincipalcomponenttranscriptomicsproteomicsmetabolomicslipidomicsmassspectrometryimmunooncology
1.1 match 7.55 score 210 scripts 8 dependents
bioc
geNetClassifier:Classify diseases and build associated gene networks using gene expression profiles
Comprehensive package to automatically train and validate a multi-class SVM classifier based on gene expression data. Provides transparent selection of gene markers, their coexpression networks, and an interface to query the classifier.
Maintained by Sara Aibar. Last updated 5 months ago.
classificationdifferentialexpressionmicroarray
1.9 match 4.38 score 1 scripts 2 dependents
calbertsen
otoclass:Otolith Classification and Proportion Estimation
Methods for classification and analysis of otoliths along with methods for estimating species proportions in samples.
Maintained by Christoffer Moesgaard Albertsen. Last updated 11 months ago.
contour-detectionotolith-classificationotolith-shapeotolithscpp
3.8 match 3 stars 2.18 score
cran
sgPLS:Sparse Group Partial Least Square Methods
Regularized version of partial least square approaches providing sparse, group, and sparse group versions of partial least square regression models (Liquet, B., Lafaye de Micheaux, P., Hejblum B., Thiebaut, R. (2016) <doi:10.1093/bioinformatics/btv535>). A version of PLS discriminant analysis is also provided.
Maintained by Benoit Liquet. Last updated 1 year ago.
5.5 match 1 stars 1.48 score 7 scripts
cran
IPCAPS:Iterative Pruning to Capture Population Structure
An unsupervised clustering algorithm based on iterative pruning for capturing population structure. This version supports ordinal data, which can be applied directly to SNP data to identify fine-level population structure, and it is built on the iterative pruning Principal Component Analysis ('ipPCA') algorithm as explained in Intarapanich et al. (2009) <doi:10.1186/1471-2105-10-382>. The 'IPCAPS' involves an iterative process using multiple splits based on multivariate Gaussian mixture modeling of principal components and 'Expectation-Maximization' clustering as explained in Lebret et al. (2015) <doi:10.18637/jss.v067.i06>. In each iteration, rough clusters and outliers are also identified using the function rubikclust() from the R package 'KRIS'.
Maintained by Kridsadakorn Chaichoompu. Last updated 4 years ago.
4.0 match 2.00 score 10 scripts
giscience-fsu
sperrorest:Perform Spatial Error Estimation and Variable Importance Assessment
Implements spatial error estimation and permutation-based variable importance measures for predictive models using spatial cross-validation and spatial block bootstrap.
Maintained by Alexander Brenning. Last updated 2 years ago.
cross-validationmachine-learningspatial-statisticsspatio-temporal-modelingstatistical-learning
1.3 match 19 stars 6.46 score 46 scripts
jangraffelman
ToolsForCoDa:Multivariate Tools for Compositional Data Analysis
Provides functions for multivariate analysis with compositional data. Includes a function for doing compositional canonical correlation analysis. This analysis requires two data matrices of compositions, which can be adequately transformed and used as entries in a specialized program for canonical correlation analysis, that is able to deal with singular covariance matrices. The methodology is described in Graffelman et al. (2017) <doi:10.1101/144584>. Functions for log-ratio principal component analysis with condition number computations and log-ratio discriminant analysis have been added to the package.
Maintained by Jan Graffelman. Last updated 2 months ago.
3.5 match 2.30 score 8 scripts
converseg
ML2Pvae:Variational Autoencoder Models for IRT Parameter Estimation
Based on the work of Curi, Converse, Hajewski, and Oliveira (2019) <doi:10.1109/IJCNN.2019.8852333>. This package provides easy-to-use functions which create a variational autoencoder (VAE) to be used for parameter estimation in Item Response Theory (IRT) - namely the Multidimensional Logistic 2-Parameter (ML2P) model. To use a neural network as such, nontrivial modifications to the architecture must be made, such as restricting the nonzero weights in the decoder according to some binary matrix Q. The functions in this package allow for straight-forward construction, training, and evaluation so that minimal knowledge of 'tensorflow' or 'keras' is required.
Maintained by Geoffrey Converse. Last updated 3 years ago.
4.0 match 2.00 score 4 scripts
cran
afc:Generalized Discrimination Score
This is an implementation of the Generalized Discrimination Score (also known as Two Alternatives Forced Choice Score, 2AFC) for various representations of forecasts and verifying observations. The Generalized Discrimination Score is a generic forecast verification framework which can be applied to any of the following verification contexts: dichotomous, polychotomous (ordinal and nominal), continuous, probabilistic, and ensemble. A comprehensive description of the Generalized Discrimination Score, including all equations used in this package, is provided by Mason and Weigel (2009) <doi:10.1175/MWR-D-10-05069.1>.
Maintained by Jonas Bhend. Last updated 8 years ago.
7.9 match 1.00 score
cran
TableHC:Higher Criticism Test of Two Frequency Counts Tables
Higher Criticism (HC) test between two frequency tables. Test is based on an adaptation of the Tukey-Donoho-Jin HC statistic to testing frequency tables described in Kipnis (2019) <arXiv:1911.01208>.
Maintained by Alon Kipnis. Last updated 5 years ago.
2.9 match 2.70 score
tesselle
nexus:Sourcing Archaeological Materials by Chemical Composition
Exploration and analysis of compositional data in the framework of Aitchison (1986, ISBN: 978-94-010-8324-9). This package provides tools for chemical fingerprinting and source tracking of ancient materials.
Maintained by Nicolas Frerebeau. Last updated 12 days ago.
archaeologyarchaeological-sciencearchaeometrycompositional-dataprovenance-studies
1.5 match 5.21 score 26 scripts 1 dependents
danielcfurr
edstan:Stan Models for Item Response Theory
Provides convenience functions and pre-programmed Stan models related to item response theory. Its purpose is to make fitting common item response theory models using Stan easy.
Maintained by Daniel C. Furr. Last updated 10 days ago.
1.3 match 8 stars 6.26 score 25 scripts 2 dependents
ainsuotain
kfda:Kernel Fisher Discriminant Analysis
Kernel Fisher Discriminant Analysis (KFDA) is performed using Kernel Principal Component Analysis (KPCA) and Fisher Discriminant Analysis (FDA). There are some similar packages. First, 'lfda' is a package that performs Local Fisher Discriminant Analysis (LFDA) and performs other functions. In particular, 'lfda' seems to be impossible to test because it needs the label information of the data in the function argument. Also, the 'ks' package has a limited dimension, which makes it difficult to analyze properly. This package is a simple and practical package for KFDA based on the paper of Yang, J., Jin, Z., Yang, J. Y., Zhang, D., and Frangi, A. F. (2004) <DOI:10.1016/j.patcog.2003.10.015>.
Maintained by Donghwan Kim. Last updated 7 years ago.
7.5 match 1.00 score 2 scripts
cran
ccda:Combined Cluster and Discriminant Analysis
Implements the combined cluster and discriminant analysis method for finding homogeneous groups of data with known origin as described in Kovacs et al. (2014): Classification into homogeneous groups using combined cluster and discriminant analysis (CCDA). Environmental Modelling & Software. <doi:10.1016/j.envsoft.2014.01.010>.
Maintained by Solt Kovacs. Last updated 5 years ago.
7.5 match 1.00 score
devpsylab
petersenlab:A Collection of R Functions by the Petersen Lab
A collection of R functions that are widely used by the Petersen Lab. Included are functions for various purposes, including evaluating the accuracy of judgments and predictions, performing scoring of assessments, generating correlation matrices, conversion of data between various types, data management, psychometric evaluation, extensions related to latent variable modeling, various plotting capabilities, and other miscellaneous useful functions. By making the package available, we hope to make our methods reproducible and replicable by others and to help others perform their data processing and analysis methods more easily and efficiently. The codebase is provided in Petersen (2025) <doi:10.5281/zenodo.7602890> and on 'CRAN': <doi:10.32614/CRAN.package.petersenlab>. The package is described in "Principles of Psychological Assessment: With Applied Examples in R" (Petersen, 2024, 2025) <doi:10.1201/9781003357421>, <doi:10.25820/work.007199>, <doi:10.5281/zenodo.6466589>.
Maintained by Isaac T. Petersen. Last updated 25 days ago.
data-analysisdata-analysis-in-rdata-managementpsychometrics
1.8 match 1 stars 4.15 score 1 scripts
kylecaudle
rTensor2:MultiLinear Algebra
A set of tools for basic tensor operators. A tensor in the context of data analysis is a multidimensional array. The tools in this package rely on using any discrete transformation (e.g. Fast Fourier Transform (FFT)). Standard tools included are the Eigenvalue decomposition of a tensor, the QR decomposition and LU decomposition. Other functionality includes the inverse of a tensor and the transpose of a symmetric tensor. Functionality in the package is outlined in Kernfeld et al. (2015) <https://www.sciencedirect.com/science/article/pii/S0024379515004358>.
Maintained by Kyle Caudle. Last updated 12 months ago.
3.0 match 2.48 score 2 scripts 1 dependents
cran
robustDA:Robust Mixture Discriminant Analysis
Robust mixture discriminant analysis (RMDA), proposed in Bouveyron & Girard, 2009 <doi:10.1016/j.patcog.2009.03.027>, builds a robust supervised classifier from learning data with label noise. The idea of the proposed method is to confront an unsupervised modeling of the data with the supervised information carried by the labels of the learning data in order to detect inconsistencies. The method is then able to build a robust classifier that takes into account the detected inconsistencies in the labels.
Maintained by Charles Bouveyron. Last updated 4 years ago.
7.4 match 1.00 score 3 scripts