Showing 45 of 45 results
vegandevs
vegan:Community Ecology Package
Ordination methods, diversity analysis and other functions for community and vegetation ecologists.
Maintained by Jari Oksanen. Last updated 29 days ago.
ecological-modelling ecology ordination fortran openblas
472 stars 19.41 score 15k scripts 440 dependents
khliland
pls:Partial Least Squares and Principal Component Regression
Multivariate regression methods: Partial Least Squares Regression (PLSR), Principal Component Regression (PCR) and Canonical Powered Partial Least Squares (CPPLS).
Maintained by Kristian Hovde Liland. Last updated 2 months ago.
37 stars 13.60 score 3.2k scripts 85 dependents
bioc
pcaMethods:A collection of PCA methods
Provides Bayesian PCA, Probabilistic PCA, Nipals PCA, Inverse Non-Linear PCA and the conventional SVD PCA. A cluster based method for missing value estimation is included for comparison. BPCA, PPCA and NipalsPCA may be used to perform PCA on incomplete data as well as for accurate missing value estimation. A set of methods for printing and plotting the results is also provided. All PCA methods make use of the same data structure (pcaRes) to provide a common interface to the PCA results. Initiated at the Max-Planck Institute for Molecular Plant Physiology, Golm, Germany.
Maintained by Henning Redestig. Last updated 5 months ago.
49 stars 13.10 score 538 scripts 73 dependents
markmfredrickson
optmatch:Functions for Optimal Matching
Distance based bipartite matching using minimum cost flow, oriented to matching of treatment and control groups in observational studies ('Hansen' and 'Klopfer' 2006 <doi:10.1198/106186006X137047>). Routines are provided to generate distances from generalised linear models (propensity score matching), formulas giving variables on which to limit matched distances, stratified or exact matching directives, or calipers, alone or in combination.
Maintained by Josh Errickson. Last updated 4 months ago.
47 stars 12.22 score 588 scripts 5 dependents
trinker
qdap:Bridging the Gap Between Qualitative Data and Quantitative Analysis
Automates many of the tasks associated with quantitative discourse analysis of transcripts, including frequency counts of sentence types, words, sentences, turns of talk, and syllables, along with other assorted analysis tasks. The package provides parsing tools for preparing transcript data. Many functions enable the user to aggregate data by any number of grouping variables, providing analysis and seamless integration with other R packages that undertake higher level analysis and visualization of text. This affords the user a more efficient and targeted analysis. 'qdap' is designed for transcript analysis; however, many functions are applicable to other areas of text mining and natural language processing.
Maintained by Tyler Rinker. Last updated 5 years ago.
qdap quantitative-discourse-analysis text-analysis text-mining text-plotting openjdk
176 stars 9.61 score 1.3k scripts 3 dependents
friendly
candisc:Visualizing Generalized Canonical Discriminant and Canonical Correlation Analysis
Functions for computing and visualizing generalized canonical discriminant analyses and canonical correlation analysis for a multivariate linear model. Traditional canonical discriminant analysis is restricted to a one-way 'MANOVA' design and is equivalent to canonical correlation analysis between a set of quantitative response variables and a set of dummy variables coded from the factor variable. The 'candisc' package generalizes this to higher-way 'MANOVA' designs for all factors in a multivariate linear model, computing canonical scores and vectors for each term. The graphic functions provide low-rank (1D, 2D, 3D) visualizations of terms in an 'mlm' via the 'plot.candisc' and 'heplot.candisc' methods. Related plots are now provided for canonical correlation analysis when all predictors are quantitative.
Maintained by Michael Friendly. Last updated 1 day ago.
dimension-reduction multivariate-linear-models visualization
15 stars 8.99 score 221 scripts 3 dependents
bhklab
mRMRe:Parallelized Minimum Redundancy, Maximum Relevance (mRMR)
Computes mutual information matrices from continuous, categorical and survival variables, as well as feature selection with minimum redundancy, maximum relevance (mRMR) and a new ensemble mRMR technique. Published in De Jay et al. (2013) <doi:10.1093/bioinformatics/btt383>.
Maintained by Benjamin Haibe-Kains. Last updated 4 years ago.
19 stars 8.95 score 105 scripts 2 dependents
bioc
GenomicScores:Infrastructure to work with genomewide position-specific scores
Provides infrastructure to store and access genomewide position-specific scores within R and Bioconductor.
Maintained by Robert Castelo. Last updated 2 months ago.
infrastructure genetics annotation sequencing coverage annotationhubsoftware
8 stars 8.71 score 83 scripts 6 dependents
sciviews
SciViews:'SciViews' - Data Processing and Visualization with the 'SciViews::R' Dialect
The 'SciViews::R' dialect provides a set of functions that streamline data input, processing, analysis and visualization, especially, but not exclusively, for beginners or occasional users. It mixes base R and tidyverse, plus another set of CRAN packages, for easy and coherent use of R.
Maintained by Philippe Grosjean. Last updated 7 months ago.
8 stars 7.62 score 116 scripts 1 dependent
bioc
HiCExperiment:Bioconductor class for interacting with Hi-C files in R
R generic interface to Hi-C contact matrices in `.(m)cool`, `.hic` or HiC-Pro derived formats, as well as other Hi-C processed file formats. Contact matrices can be partially parsed using a random access method, allowing a memory-efficient representation of Hi-C data in R. The `HiCExperiment` class stores the Hi-C contacts parsed from local contact matrix files. `HiCExperiment` instances can be further investigated in R using the `HiContacts` analysis package.
Maintained by Jacques Serizay. Last updated 11 days ago.
9 stars 7.02 score 48 scripts 2 dependents
bioc
bioassayR:Cross-target analysis of small molecule bioactivity
bioassayR is a computational tool that enables simultaneous analysis of thousands of bioassay experiments performed over a diverse set of compounds and biological targets. Unique features include support for large-scale cross-target analyses of both public and custom bioassays, generation of high throughput screening fingerprints (HTSFPs), and an optional preloaded database that provides access to a substantial portion of publicly available bioactivity data.
Maintained by Thomas Girke. Last updated 5 months ago.
immunooncology microtitreplateassay cellbasedassays visualization infrastructure dataimport bioinformatics proteomics metabolomics
5 stars 6.70 score 46 scripts
khliland
multiblock:Multiblock Data Fusion in Statistics and Machine Learning
Functions and datasets to support Smilde, Næs and Liland (2021, ISBN: 978-1-119-60096-1) "Multiblock Data Fusion in Statistics and Machine Learning - Applications in the Natural and Life Sciences". This implements and imports a large collection of methods for multiblock data analysis with common interfaces, result and plotting functions, several real data sets and six vignettes covering a range of different applications.
Maintained by Kristian Hovde Liland. Last updated 2 months ago.
14 stars 6.68 score 19 scripts
luqqe
outliers:Tests for Outliers
A collection of some tests commonly used for identifying outliers.
Maintained by Lukasz Komsta. Last updated 3 years ago.
1 star 6.56 score 888 scripts 11 dependents
bioc
COMPASS:Combinatorial Polyfunctionality Analysis of Single Cells
COMPASS is a statistical framework that enables unbiased analysis of antigen-specific T-cell subsets. COMPASS uses a Bayesian hierarchical framework to model all observed cell-subsets and select the most likely to be antigen-specific while regularizing the small cell counts that often arise in multi-parameter space. The model provides a posterior probability of specificity for each cell subset and each sample, which can be used to profile a subject's immune response to external stimuli such as infection or vaccination.
Maintained by Greg Finak. Last updated 5 months ago.
immunooncology flowcytometry cpp
7 stars 6.51 score 42 scripts
rickhelmus
patRoon:Workflows for Mass-Spectrometry Based Non-Target Analysis
Provides an easy-to-use interface to a mass spectrometry based non-target analysis workflow. Various (open-source) tools are combined which provide algorithms for extraction and grouping of features, extraction of MS and MS/MS data, automatic formula and compound annotation and grouping related features to components. In addition, various tools are provided for e.g. data preparation and cleanup, plotting results and automatic reporting.
Maintained by Rick Helmus. Last updated 8 days ago.
mass-spectrometry non-target cpp openjdk
65 stars 6.24 score 43 scripts
dvrbts
labdsv:Ordination and Multivariate Analysis for Ecology
A variety of ordination and community analyses useful in analysis of data sets in community ecology. Includes many of the common ordination methods, with graphical routines to facilitate their interpretation, as well as several novel analyses.
Maintained by David W. Roberts. Last updated 2 years ago.
3 stars 6.05 score 452 scripts 12 dependents
bioc
autonomics:Unified Statistical Modeling of Omics Data
This package unifies access to statistical modeling of omics data across linear modeling engines (lm, lme, lmer, limma, and wilcoxon), coding systems (treatment, difference, deviation, etc.), model formulae (with/without intercept, random effect, interaction or nesting), omics platforms (microarray, rnaseq, msproteomics, affinity proteomics, metabolomics), projection methods (pca, pls, sma, lda, spls, opls) and clustering methods (hclust, pam, cmeans). It also provides a fast enrichment analysis implementation and an intuitive contrastogram visualisation to summarize contrast effects in complex designs.
Maintained by Aditya Bhagwat. Last updated 2 months ago.
software dataimport preprocessing dimensionreduction principalcomponent regression differentialexpression genesetenrichment transcriptomics transcription geneexpression rnaseq microarray proteomics metabolomics massspectrometry
5.95 score 5 scripts
chrisaddy
rrr:Reduced-Rank Regression
Reduced-rank regression, diagnostics and graphics.
Maintained by Chris Addy. Last updated 8 years ago.
10 stars 5.06 score 23 scripts
biometris
douconca:Double Constrained Correspondence Analysis for Trait-Environment Analysis in Ecology
Double constrained correspondence analysis (dc-CA) analyzes (multi-)trait (multi-)environment ecological data by using the 'vegan' package and native R code. The two-step algorithm of ter Braak et al. (2018) is used throughout. This algorithm combines and extends community- (sample-) and species-level analyses, i.e. the usual community weighted means (CWM)-based regression analysis and the species-level analysis of species-niche centroids (SNC)-based regression analysis. The two steps use canonical correspondence analysis to regress the abundance data onto the traits and (weighted) redundancy analysis to regress the CWM of the orthonormalized traits onto the environmental predictors. The function dc_CA() has an option to divide the abundance data of a site by the site total, giving equal site weights. This division has the advantage that the multivariate analysis corresponds with an unweighted (multi-trait) community-level analysis, instead of being weighted. The first step of the algorithm uses vegan::cca(). The second step uses wrda(), or vegan::rda() if the site weights are equal. This version has a predict() function. For details see ter Braak et al. (2018) <doi:10.1007/s10651-017-0395-x>.
Maintained by Bart-Jan van Rossum. Last updated 4 months ago.
correspondence-analysis ecology ecology-modeling multi-environment multi-trait
5.02 score 6 scripts
guokai8
o2plsda:Multiomics Data Integration
Provides functions to do 'O2PLS-DA' analysis for multiple omics data integration. The algorithm comes from "O2-PLS, a two-block (X±Y) latent variable regression (LVR) method with an integral OSC filter", published by Johan Trygg and Svante Wold in 2003 <doi:10.1002/cem.775>. 'O2PLS' is a bidirectional multivariate regression method that aims to separate the covariance between two data sets (it was recently extended to multiple data sets) (Löfstedt and Trygg, 2011 <doi:10.1002/cem.1388>; Löfstedt et al., 2012 <doi:10.1016/j.aca.2013.06.026>) from the systematic sources of variance specific to each data set.
Maintained by Kai Guo. Last updated 1 month ago.
integration multi-omics o2pls omics plsda openblas cpp openmp
7 stars 5.02 score 6 scripts
mbinois
hetGP:Heteroskedastic Gaussian Process Modeling and Design under Replication
Performs Gaussian process regression with heteroskedastic noise following the model by Binois, M., Gramacy, R., Ludkovski, M. (2016) <doi:10.48550/arXiv.1611.05902>, with implementation details in Binois, M. & Gramacy, R. B. (2021) <doi:10.18637/jss.v098.i13>. The input dependent noise is modeled as another Gaussian process. Replicated observations are encouraged as they yield computational savings. Sequential design procedures based on the integrated mean square prediction error and lookahead heuristics are provided, and notably fast update functions when adding new observations.
Maintained by Mickael Binois. Last updated 7 months ago.
5 stars 4.89 score 260 scripts 2 dependents
danielquiroz97
RGCxGC:Preprocessing and Multivariate Analysis of Bidimensional Gas Chromatography Data
Toolbox for chemometrics analysis of bidimensional gas chromatography data. This package imports data from a common scientific data format (NetCDF) and folds it into a 2D chromatogram. It can then perform preprocessing and multivariate analysis. The preprocessing algorithms include baseline correction, smoothing, and peak alignment; for multivariate analysis, multiway principal component analysis is incorporated.
Maintained by Cristian Quiroz-Moreno. Last updated 2 years ago.
chemphys chemometrics gcxgc multiway-algorithms preprocessing
7 stars 4.77 score 17 scripts
bioc
MAIT:Statistical Analysis of Metabolomic Data
The MAIT package contains functions to perform end-to-end statistical analysis of LC/MS metabolomic data. Special emphasis is put on peak annotation and on the modular design of the functions.
Maintained by Pol Sola-Santos. Last updated 5 months ago.
immunooncology massspectrometry metabolomics software
4.60 score 20 scripts
bioc
ChIPanalyser:ChIPanalyser: Predicting Transcription Factor Binding Sites
ChIPanalyser is a package to predict and understand TF binding by utilizing a statistical thermodynamic model. The model incorporates 4 main factors thought to drive TF binding: Chromatin State, Binding energy, Number of bound molecules and a scaling factor modulating TF binding affinity. Taken together, ChIPanalyser produces ChIP-like profiles that closely mimic the patterns seen in real ChIP-seq data.
Maintained by Patrick C.N. Martin. Last updated 5 months ago.
software biologicalquestion workflowstep transcription sequencing chiponchip coverage alignment chipseq sequencematching dataimport peakdetection
4.38 score 12 scripts
analyticsresearchlab
thestats:R Package for Exploring Turkish Higher Education Statistics
A user-friendly R data package that is intended to make Turkish higher education statistics more accessible.
Maintained by Olgun Aydin. Last updated 2 years ago.
14 stars 4.36 score 11 scripts
khliland
HDANOVA:High-Dimensional Analysis of Variance
Functions and datasets to support Smilde, Marini, Westerhuis and Liland (2025, ISBN: 978-1-394-21121-0) "Analysis of Variance for High-Dimensional Data - Applications in Life, Food and Chemical Sciences". This implements and imports a collection of methods for HD-ANOVA data analysis with common interfaces, result and plotting functions, multiple real data sets and four vignettes covering a range of different applications.
Maintained by Kristian Hovde Liland. Last updated 16 days ago.
4.35 score 8 scripts 1 dependent
bioc
scoreInvHap:Get inversion status in predefined regions
scoreInvHap can get the inversion status of samples for known inversions. It uses SNP data as input and requires the following information about the inversion: genotype frequencies in the different haplotypes, R2 between the region SNPs and inversion status, and heterozygote genotypes in the reference. The package includes these data for 21 inversions.
Maintained by Dolors Pelegri-Siso. Last updated 5 months ago.
4.34 score 11 scripts
bioc
cytofQC:Labels normalized cells for CyTOF data and assigns probabilities for each label
cytofQC is a package for initial cleaning of CyTOF data. It uses a semi-supervised approach for labeling cells with their most likely data type (bead, doublet, debris, dead) and the probability that they belong to each label type. This package does not remove data from the dataset, but provides labels and information to aid the data user in cleaning their data. Our algorithm is able to distinguish between doublets and large cells.
Maintained by Jill Lundell. Last updated 5 months ago.
2 stars 4.30 score 3 scripts
pi-kappa-devel
markets:Estimation Methods for Markets in Equilibrium and Disequilibrium
Provides estimation methods for markets in equilibrium and disequilibrium. Supports the estimation of an equilibrium and four disequilibrium models with both correlated and independent shocks. Also provides post-estimation analysis tools, such as aggregation, marginal effect, and shortage calculations. See Karapanagiotis (2024) <doi:10.18637/jss.v108.i02> for an overview of the functionality and examples. The estimation methods are based on full information maximum likelihood techniques given in Maddala and Nelson (1974) <doi:10.2307/1914215>. They are implemented using the analytic derivative expressions calculated in Karapanagiotis (2020) <doi:10.2139/ssrn.3525622>. Standard errors can be estimated by adjusting for heteroscedasticity or clustering. The equilibrium estimation constitutes a case of a system of linear, simultaneous equations. Instead, the disequilibrium models replace the market-clearing condition with a non-linear, short-side rule and allow for different specifications of price dynamics.
Maintained by Pantelis Karapanagiotis. Last updated 1 year ago.
disequilibrium economics finance full-information-maximum-likelihood market-clearing market-models short-side-rule cpp
1 star 4.30 score 9 scripts
bioc
geva:Gene Expression Variation Analysis (GEVA)
Statistical methods to evaluate variations of differential expression (DE) between multiple biological conditions. It takes into account the fold-changes and p-values from previous DE results that use large-scale data (e.g., microarray and RNA-seq) and evaluates which genes would react in response to the distinct experiments. This evaluation involves a unique pipeline of statistical methods, including weighted summarization, quantile detection, cluster analysis, and ANOVA tests, in order to classify a subset of relevant genes whose DE is similar across or dependent on certain biological factors.
Maintained by Itamar José Guimarães Nunes. Last updated 5 months ago.
classification differentialexpression geneexpression microarray multiplecomparison rnaseq systemsbiology transcriptomics
2 stars 4.30 score 4 scripts
selbouhaddani-umc
OmicsPLS:Data Integration with Two-Way Orthogonal Partial Least Squares
Performs the O2PLS data integration method for two datasets, yielding joint and data-specific parts for each dataset. The algorithm automatically switches to a memory-efficient approach to fit O2PLS to high dimensional data. It provides a rigorous and a faster alternative cross-validation method to select the number of components, as well as functions to report proportions of explained variation and to construct plots of the results. See the software article by el Bouhaddani et al (2018) <doi:10.1186/s12859-018-2371-3>, and Trygg and Wold (2003) <doi:10.1002/cem.775>. It also performs Sparse Group (Penalized) O2PLS, see Gu et al (2020) <doi:10.1186/s12859-021-03958-3> and cross-validation for the degree of sparsity.
Maintained by Said el Bouhaddani. Last updated 4 years ago.
3.84 score 57 scripts 1 dependent
gobbios
EloSteepness:Bayesian Dominance Hierarchy Steepness via Elo Rating and David's Scores
Obtain Bayesian posterior distributions of dominance hierarchy steepness (Neumann and Fischer (2023) <doi:10.1111/2041-210X.14021>). Steepness estimation is based on Bayesian implementations of either Elo-rating or David's scores.
Maintained by Christof Neumann. Last updated 2 years ago.
3.70 score 5 scripts
bbuchsbaum
multivarious:Extensible Data Structures for Multivariate Analysis
Provides a set of basic and extensible data structures and functions for multivariate analysis, including dimensionality reduction techniques, projection methods, and preprocessing functions. The aim of this package is to offer a flexible and user-friendly framework for multivariate analysis that can be easily extended for custom requirements and specific data analysis tasks.
Maintained by Bradley Buchsbaum. Last updated 3 months ago.
3.53 score 17 scripts
bioc
ternarynet:Ternary Network Estimation
Gene-regulatory network (GRN) modeling seeks to infer dependencies between genes and thereby provide insight into the regulatory relationships that exist within a cell. This package provides a computational Bayesian approach to GRN estimation from perturbation experiments using a ternary network model, in which gene expression is discretized into one of 3 states: up, unchanged, or down. The ternarynet package includes a parallel implementation of the replica exchange Monte Carlo algorithm for fitting network models, using MPI.
Maintained by McCall N. Matthew. Last updated 5 months ago.
software cellbiology graphandnetwork network bayesian cpp
3.30 score 3 scripts
khliland
ER:Effect + Residual Modelling
Multivariate modeling of data after deflation of interfering effects. EF Mosleth et al. (2021) <doi:10.1038/s41598-021-82388-w> and EF Mosleth et al. (2020) <doi:10.1016/B978-0-12-409547-2.14882-6>.
Maintained by Kristian Hovde Liland. Last updated 2 years ago.
3.00 score 1 script
katie-frank
combinedevents:Calculate Scores and Marks for Track and Field Combined Events
Includes functions to calculate scores and marks for track and field combined events competitions. The functions are based on the scoring tables for combined events set forth by the International Association of Athletics Federations (2001).
Maintained by Katie Frank. Last updated 4 years ago.
1 star 2.70 score 2 scripts
cran
ssMRCD:Spatially Smoothed MRCD Estimator
Estimation of the Spatially Smoothed Minimum Regularized Determinant (ssMRCD) estimator and its usage in an ssMRCD-based outlier detection method as described in Puchhammer and Filzmoser (2023) <doi:10.1080/10618600.2023.2277875> and for sparse robust PCA for multi-source data described in Puchhammer, Wilms and Filzmoser (2024) <doi:10.48550/arXiv.2407.16299>. Included are also complementary visualization and parameter tuning tools.
Maintained by Patricia Puchhammer. Last updated 7 months ago.
2.00 score
roustant
kergp:Gaussian Process Laboratory
Gaussian process regression with an emphasis on kernels. Quantitative and qualitative inputs are accepted. Some pre-defined kernels are available, such as radial or tensor-sum for quantitative inputs, and compound symmetry, low rank, group kernel for qualitative inputs. The user can define new kernels and composite kernels through a formula mechanism. Useful methods include parameter estimation by maximum likelihood, simulation, prediction and leave-one-out validation.
Maintained by Olivier Roustant. Last updated 4 months ago.
1 star 1.83 score 67 scripts
cran
SAutomata:Inference and Learning in Stochastic Automata
Machine learning provides algorithms that can learn from data and make inferences or predictions. Stochastic automata are a class of input/output devices that can model components. This work provides an implementation of an inference algorithm for stochastic automata that is similar to the Viterbi algorithm. Moreover, we specify a learning algorithm using the expectation-maximization technique and provide a more efficient implementation of the Baum-Welch algorithm for stochastic automata. This work is based on "Inference and learning in stochastic automata" by Karl-Heinz Zimmermann (2017) <doi:10.12732/ijpam.v115i3.15>.
Maintained by Muhammad Kashif Hanif. Last updated 6 years ago.
1.00 score