Showing 43 of 43 results
vegandevs
vegan:Community Ecology Package
Ordination methods, diversity analysis and other functions for community and vegetation ecologists.
Maintained by Jari Oksanen. Last updated 29 days ago.
ecological-modelling, ecology, ordination, fortran, openblas
472 stars 19.41 score 15k scripts 440 dependents
bioc
mixOmics:Omics Data Integration Project
Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing properties of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated, and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics, etc.) but also beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non-exhaustive list of methods includes variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.
Maintained by Eva Hamrud. Last updated 16 days ago.
immunooncology, microarray, sequencing, metabolomics, metagenomics, proteomics, geneprediction, multiplecomparison, classification, regression, bioconductor, genomics, genomics-data, genomics-visualization, multivariate-analysis, multivariate-statistics, omics, r-pkg, r-project
182 stars 13.71 score 1.3k scripts 22 dependents
bioc
pcaMethods:A collection of PCA methods
Provides Bayesian PCA, Probabilistic PCA, Nipals PCA, Inverse Non-Linear PCA and the conventional SVD PCA. A cluster based method for missing value estimation is included for comparison. BPCA, PPCA and NipalsPCA may be used to perform PCA on incomplete data as well as for accurate missing value estimation. A set of methods for printing and plotting the results is also provided. All PCA methods make use of the same data structure (pcaRes) to provide a common interface to the PCA results. Initiated at the Max-Planck Institute for Molecular Plant Physiology, Golm, Germany.
Maintained by Henning Redestig. Last updated 5 months ago.
49 stars 13.10 score 538 scripts 73 dependents
msberends
AMR:Antimicrobial Resistance Data Analysis
Functions to simplify and standardise antimicrobial resistance (AMR) data analysis and to work with microbial and antimicrobial properties by using evidence-based methods, as described in <doi:10.18637/jss.v104.i03>.
Maintained by Matthijs S. Berends. Last updated 12 hours ago.
amr, antimicrobial-data, epidemiology, microbiology, software
95 stars 11.83 score 182 scripts 6 dependents
bioc
PCAtools:PCAtools: Everything Principal Components Analysis
Principal Component Analysis (PCA) is a very powerful technique that has wide applicability in data science, bioinformatics, and further afield. It was initially developed to analyse large volumes of data in order to tease out the differences/relationships between the logical entities being analysed. It extracts the fundamental structure of the data without the need to build any model to represent it. This 'summary' of the data is arrived at through a process of reduction that can transform the large number of variables into a lesser number that are uncorrelated (i.e. the 'principal components'), while at the same time being capable of easy interpretation on the original data. PCAtools provides functions for data exploration via PCA, and allows the user to generate publication-ready figures. PCA is performed via BiocSingular - users can also identify optimal number of principal components via different metrics, such as elbow method and Horn's parallel analysis, which has relevance for data reduction in single-cell RNA-seq (scRNA-seq) and high dimensional mass cytometry data.
Maintained by Kevin Blighe. Last updated 5 months ago.
rnaseq, atacseq, geneexpression, transcription, singlecell, principalcomponent, cpp
343 stars 11.12 score 832 scripts 2 dependents
jamovi
jmv:The 'jamovi' Analyses
A suite of common statistical methods such as descriptives, t-tests, ANOVAs, regression, correlation matrices, proportion tests, contingency tables, and factor analysis. This package is also useable from the 'jamovi' statistical spreadsheet (see <https://www.jamovi.org> for more information).
Maintained by Jonathon Love. Last updated 26 days ago.
59 stars 9.58 score 440 scripts
eagerai
fastai:Interface to 'fastai'
The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research into deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.
Maintained by Turgut Abdullayev. Last updated 12 months ago.
audio, collaborative-filtering, darknet, darknet-image-classification, fastai, medical, object-detection, tabular, text, vision
118 stars 9.40 score 76 scripts
babaknaimi
sdm:Species Distribution Modelling
An extensible framework for developing species distribution models using individual and community-based approaches, generating ensembles of models, evaluating the models, and predicting species' potential distributions in space and time. For more information, please check the following paper: Naimi, B., Araujo, M.B. (2016) <doi:10.1111/ecog.01881>.
Maintained by Babak Naimi. Last updated 2 months ago.
24 stars 9.31 score 312 scripts 1 dependent
bioc
SeqVarTools:Tools for variant data
An interface to the fast-access storage format for VCF data provided in SeqArray, with tools for common operations and analysis.
Maintained by Stephanie M. Gogarten. Last updated 5 months ago.
snp, geneticvariability, sequencing, genetics
3 stars 8.76 score 384 scripts 2 dependents
bioboot
bio3d:Biological Structure Analysis
Utilities to process, organize and explore protein structure, sequence and dynamics data. Features include the ability to read and write structure, sequence and dynamic trajectory data, perform sequence and structure database searches, data summaries, atom selection, alignment, superposition, rigid core identification, clustering, torsion analysis, distance matrix analysis, structure and sequence conservation analysis, normal mode analysis, principal component analysis of heterogeneous structure data, and correlation network analysis from normal mode and molecular dynamics data. In addition, various utility functions are provided to enable the statistical and graphical power of the R environment to work with biological sequence and structural data. Please refer to the URLs below for more information.
Maintained by Barry Grant. Last updated 5 months ago.
5 stars 8.49 score 1.4k scripts 10 dependents
rfastofficial
Rfast2:A Collection of Efficient and Extremely Fast R Functions II
A collection of fast statistical and utility functions for data analysis. Functions for regression, maximum likelihood, column-wise statistics and many more have been included. C++ has been utilized to speed up the functions. References: Tsagris M., Papadakis M. (2018). Taking R to its limits: 70+ tips. PeerJ Preprints 6:e26605v1 <doi:10.7287/peerj.preprints.26605v1>.
Maintained by Manos Papadakis. Last updated 1 year ago.
38 stars 8.09 score 75 scripts 26 dependents
antoinelucas64
amap:Another Multidimensional Analysis Package
Tools for Clustering and Principal Component Analysis (with robust methods and parallelized functions).
Maintained by Antoine Lucas. Last updated 5 months ago.
7.73 score 460 scripts 26 dependents
svkucheryavski
mdatools:Multivariate Data Analysis for Chemometrics
Projection-based methods for preprocessing, exploration and analysis of multivariate data used in chemometrics. S. Kucheryavskiy (2020) <doi:10.1016/j.chemolab.2020.103937>.
Maintained by Sergey Kucheryavskiy. Last updated 8 months ago.
36 stars 7.41 score 220 scripts 1 dependent
tkcaccia
KODAMA:Knowledge Discovery by Accuracy Maximization
An unsupervised and semi-supervised learning algorithm that performs feature extraction from noisy and high-dimensional data. It facilitates identification of patterns representing underlying groups on all samples in a data set. Based on Cacciatore S, Tenori L, Luchinat C, Bennett PR, MacIntyre DA. (2017) Bioinformatics <doi:10.1093/bioinformatics/btw705> and Cacciatore S, Luchinat C, Tenori L. (2014) Proc Natl Acad Sci USA <doi:10.1073/pnas.1220873111>.
Maintained by Stefano Cacciatore. Last updated 13 days ago.
1 star 7.00 score 63 scripts 1 dependent
nepem-ufsc
pliman:Tools for Plant Image Analysis
Tools for both single and batch image manipulation and analysis (Olivoto, 2022 <doi:10.1111/2041-210X.13803>) and phytopathometry (Olivoto et al., 2022 <doi:10.1007/S40858-021-00487-5>). The tools can be used for the quantification of leaf area, object counting, extraction of image indexes, shape measurement, object landmark identification, and Elliptical Fourier Analysis of object outlines (Claude (2008) <doi:10.1007/978-0-387-77789-4>). The package also provides a comprehensive pipeline for generating shapefiles with complex layouts and supports high-throughput phenotyping of RGB, multispectral, and hyperspectral orthomosaics. This functionality facilitates field phenotyping using UAV- or satellite-based imagery.
Maintained by Tiago Olivoto. Last updated 1 day ago.
11 stars 6.76 score 476 scripts
bioc
maser:Mapping Alternative Splicing Events to pRoteins
This package provides functionalities for downstream analysis, annotation and visualization of alternative splicing events generated by rMATS.
Maintained by Diogo F.T. Veiga. Last updated 5 months ago.
alternativesplicing, transcriptomics, visualization
17 stars 6.74 score 18 scripts
khliland
multiblock:Multiblock Data Fusion in Statistics and Machine Learning
Functions and datasets to support Smilde, Næs and Liland (2021, ISBN: 978-1-119-60096-1) "Multiblock Data Fusion in Statistics and Machine Learning - Applications in the Natural and Life Sciences". This implements and imports a large collection of methods for multiblock data analysis with common interfaces, result- and plotting functions, several real data sets and six vignettes covering a range of different applications.
Maintained by Kristian Hovde Liland. Last updated 2 months ago.
14 stars 6.68 score 19 scripts
bioc
LEA:LEA: an R package for Landscape and Ecological Association Studies
LEA is an R package dedicated to population genomics, landscape genomics and genotype-environment association tests. LEA can run analyses of population structure and genome-wide tests for local adaptation, and also performs imputation of missing genotypes. The package includes statistical methods for estimating ancestry coefficients from large genotypic matrices and for evaluating the number of ancestral populations (snmf). It performs statistical tests using latent factor mixed models for identifying genetic polymorphisms that exhibit association with environmental gradients or phenotypic traits (lfmm2). In addition, LEA computes values of genetic offset statistics based on new or predicted environments (genetic.gap, genetic.offset). LEA is mainly based on optimized programs that can scale with the dimensions of large data sets.
Maintained by Olivier Francois. Last updated 18 days ago.
software, statistical method, clustering, regression, openblas
6.63 score 534 scripts
bioc
M3C:Monte Carlo Reference-based Consensus Clustering
M3C is a consensus clustering algorithm that uses a Monte Carlo simulation to eliminate overestimation of K and can reject the null hypothesis K=1.
Maintained by Christopher John. Last updated 5 months ago.
clustering, geneexpression, transcription, rnaseq, sequencing, immunooncology
6.59 score 174 scripts 1 dependent
dvrbts
labdsv:Ordination and Multivariate Analysis for Ecology
A variety of ordination and community analyses useful in analysis of data sets in community ecology. Includes many of the common ordination methods, with graphical routines to facilitate their interpretation, as well as several novel analyses.
Maintained by David W. Roberts. Last updated 2 years ago.
3 stars 6.05 score 452 scripts 12 dependents
crj32
Spectrum:Fast Adaptive Spectral Clustering for Single and Multi-View Data
A self-tuning spectral clustering method for single or multi-view data. 'Spectrum' uses a new type of adaptive density aware kernel that strengthens connections in the graph based on common nearest neighbours. It uses a tensor product graph data integration and diffusion procedure to integrate different data sources and reduce noise. 'Spectrum' uses either the eigengap or multimodality gap heuristics to determine the number of clusters. The method is sufficiently flexible so that a wide range of Gaussian and non-Gaussian structures can be clustered with automatic selection of K.
Maintained by Christopher R John. Last updated 5 years ago.
7 stars 5.99 score 47 scripts 1 dependent
bioc
autonomics:Unified Statistical Modeling of Omics Data
This package unifies access to Statistical Modeling of Omics Data. Across linear modeling engines (lm, lme, lmer, limma, and wilcoxon). Across coding systems (treatment, difference, deviation, etc). Across model formulae (with/without intercept, random effect, interaction or nesting). Across omics platforms (microarray, rnaseq, msproteomics, affinity proteomics, metabolomics). Across projection methods (pca, pls, sma, lda, spls, opls). Across clustering methods (hclust, pam, cmeans). It provides a fast enrichment analysis implementation. And an intuitive contrastogram visualisation to summarize contrast effects in complex designs.
Maintained by Aditya Bhagwat. Last updated 2 months ago.
software, dataimport, preprocessing, dimensionreduction, principalcomponent, regression, differentialexpression, genesetenrichment, transcriptomics, transcription, geneexpression, rnaseq, microarray, proteomics, metabolomics, massspectrometry
5.95 score 5 scripts
bioc
rexposome:Exposome exploration and outcome data analysis
Package for exploring the exposome and performing association analyses between exposures and health outcomes.
Maintained by Xavier Escribà Montagut. Last updated 5 months ago.
software, biologicalquestion, infrastructure, dataimport, datarepresentation, biomedicalinformatics, experimentaldesign, multiplecomparison, classification, clustering
5.70 score 28 scripts 1 dependent
blasbenito
spatialRF:Easy Spatial Modeling with Random Forest
Automatic generation and selection of spatial predictors for spatial regression with Random Forest. Spatial predictors are surrogates of variables driving the spatial structure of a response variable. The package offers two methods to generate spatial predictors from a distance matrix among training cases: 1) Moran's Eigenvector Maps (MEMs; Dray, Legendre, and Peres-Neto 2006 <DOI:10.1016/j.ecolmodel.2006.02.015>): computed as the eigenvectors of a weighted matrix of distances; 2) RFsp (Hengl et al. <DOI:10.7717/peerj.5518>): columns of the distance matrix used as spatial predictors. Spatial predictors help minimize the spatial autocorrelation of the model residuals and facilitate an honest assessment of the importance scores of the non-spatial predictors. Additionally, functions to reduce multicollinearity, identify relevant variable interactions, tune random forest hyperparameters, assess model transferability via spatial cross-validation, and explore model results via partial dependence curves and interaction surfaces are included in the package. The modelling functions are built around the highly efficient 'ranger' package (Wright and Ziegler 2017 <DOI:10.18637/jss.v077.i01>).
Maintained by Blas M. Benito. Last updated 3 years ago.
random-forest, spatial-analysis, spatial-regression
114 stars 5.45 score 49 scripts
vanderleidebastiani
SYNCSA:Analysis of Functional and Phylogenetic Patterns in Metacommunities
Analysis of metacommunities based on functional traits and phylogeny of the community components. The functions offered here implement, for the R environment, methods that have been available in the SYNCSA application written in C++ (by Valerio Pillar, available at <http://ecoqua.ecologia.ufrgs.br/SYNCSA.html>).
Maintained by Vanderlei Julio Debastiani. Last updated 5 years ago.
3 stars 5.36 score 28 scripts 1 dependent
gabrielodom
mvMonitoring:Multi-State Adaptive Dynamic Principal Component Analysis for Multivariate Process Monitoring
Use multi-state splitting to apply Adaptive-Dynamic PCA (ADPCA) to data generated from a continuous-time multivariate industrial or natural process. Employ PCA-based dimension reduction to extract linear combinations of relevant features, reducing computational burdens. For a description of ADPCA, see <doi:10.1007/s00477-016-1246-2>, the 2016 paper from Kazor et al. The multi-state application of ADPCA is from a manuscript under current revision entitled "Multi-State Multivariate Statistical Process Control" by Odom, Newhart, Cath, and Hering, and is expected to appear in Q1 of 2018.
Maintained by Gabriel Odom. Last updated 1 year ago.
4 stars 5.24 score 29 scripts
tesselle
nexus:Sourcing Archaeological Materials by Chemical Composition
Exploration and analysis of compositional data in the framework of Aitchison (1986, ISBN: 978-94-010-8324-9). This package provides tools for chemical fingerprinting and source tracking of ancient materials.
Maintained by Nicolas Frerebeau. Last updated 24 days ago.
archaeology, archaeological-science, archaeometry, compositional-data, provenance-studies
5.21 score 26 scripts 1 dependent
bioc
ASICS:Automatic Statistical Identification in Complex Spectra
With a set of pure metabolite reference spectra, ASICS quantifies concentration of metabolites in a complex spectrum. The identification of metabolites is performed by fitting a mixture model to the spectra of the library with a sparse penalty. The method and its statistical properties are described in Tardivel et al. (2017) <doi:10.1007/s11306-017-1244-5>.
Maintained by Gaëlle Lefort. Last updated 5 months ago.
software, dataimport, cheminformatics, metabolomics
5.18 score 30 scripts
friendly
genridge:Generalized Ridge Trace Plots for Ridge Regression
The genridge package introduces generalizations of the standard univariate ridge trace plot used in ridge regression and related methods. These graphical methods show both bias (actually, shrinkage) and precision, by plotting the covariance ellipsoids of the estimated coefficients, rather than just the estimates themselves. 2D and 3D plotting methods are provided, both in the space of the predictor variables and in the transformed space of the PCA/SVD of the predictors.
Maintained by Michael Friendly. Last updated 4 months ago.
bias-variance, graphics, principal-component-analysis, regression-models, ridge-regression, singular-value-decomposition
4 stars 4.84 score 69 scripts
bioc
phenoTest:Tools to test association between gene expression and phenotype in a way that is efficient, structured, fast and scalable. We also provide tools to do GSEA (Gene set enrichment analysis) and copy number variation.
Tools to test correlation between gene expression and phenotype in a way that is efficient, structured, fast and scalable. GSEA is also provided.
Maintained by Evarist Planet. Last updated 5 months ago.
microarray, differentialexpression, multiplecomparison, clustering, classification
4.56 score 9 scripts 1 dependent
loosolab
wilson:Web-Based Interactive Omics Visualization
Tool-set of modules for creating web-based applications that use plot-based strategies to visualize and analyze multi-omics data. This package utilizes the 'shiny' and 'plotly' frameworks to provide a user-friendly dashboard for interactive plotting.
Maintained by Hendrik Schultheis. Last updated 4 years ago.
2 stars 4.30 score 7 scripts
jrvanderdoes
fChange:Functional Change Point Detection and Analysis
Analyze functional data and its change points. Includes functionality to store and process data, summarize and validate assumptions, characterize and perform inference of change points, and provide visualizations. Data is stored as discretely collected observations without requiring the selection of basis functions.
Maintained by Jeremy VanderDoes. Last updated 5 days ago.
1 star 4.04 score
bioc
CellTrails:Reconstruction, visualization and analysis of branching trajectories
CellTrails is an unsupervised algorithm for the de novo chronological ordering, visualization and analysis of single-cell expression data. CellTrails makes use of a geometrically motivated concept of lower-dimensional manifold learning, which exhibits a multitude of virtues that counteract intrinsic noise of single cell data caused by drop-outs, technical variance, and redundancy of predictive variables. CellTrails enables the reconstruction of branching trajectories and provides an intuitive graphical representation of expression patterns along all branches simultaneously. It allows the user to define and infer the expression dynamics of individual and multiple pathways towards distinct phenotypes.
Maintained by Daniel Ellwanger. Last updated 5 months ago.
immunooncology, clustering, datarepresentation, differentialexpression, dimensionreduction, geneexpression, sequencing, singlecell, software, timecourse
4.00 score 7 scripts
rcurtin
mlpack:'Rcpp' Integration for the 'mlpack' Library
A fast, flexible machine learning library, written in C++, that aims to provide fast, extensible implementations of cutting-edge machine learning algorithms. See also Curtin et al. (2023) <doi:10.21105/joss.05026>.
Maintained by Ryan Curtin. Last updated 4 months ago.
3.71 score 20 scripts 8 dependents
bbuchsbaum
multivarious:Extensible Data Structures for Multivariate Analysis
Provides a set of basic and extensible data structures and functions for multivariate analysis, including dimensionality reduction techniques, projection methods, and preprocessing functions. The aim of this package is to offer a flexible and user-friendly framework for multivariate analysis that can be easily extended for custom requirements and specific data analysis tasks.
Maintained by Bradley Buchsbaum. Last updated 3 months ago.
3.53 score 17 scripts
tsukubai
hclusteasy:Determining Hierarchical Clustering Easily
Facilitates hierarchical clustering analysis with functions to read data in 'txt', 'xlsx', and 'xls' formats, apply normalization techniques to the dataset, perform hierarchical clustering, and construct a scatter plot from principal component analysis to evaluate the groups obtained.
Maintained by Henrique Andrade. Last updated 9 months ago.
3.00 score 1 script
sciviews
exploreit:Exploratory Data Analysis for 'SciViews::R'
Multivariate analysis and data exploration for the 'SciViews::R' dialect.
Maintained by Philippe Grosjean. Last updated 11 months ago.
multivariate-analysis, sciviews, statistical-methods
2.70 score 4 scripts
caromillat
MultiGroupO:MultiGroup Method and Simulation Data Analysis
Two new methods for multigroup data analysis, together with data simulation. The first technique, multigroup PCA (mgPCA), is a multivariate exploration approach that takes into account the structure of groups and/or different types of variables. The second technique, Multigroup Dimensionality Reduction (MDR), is another multivariate exploration method, based on projections. In addition, a method called Single Dimension Exploration (SDE) is included for exploring the data; it can help to better observe the behavior of multigroup data with respect to certain variables of interest.
Maintained by Carolina Millapán. Last updated 9 months ago.
2.60 score 4 scripts
zhiweilin27
AnalysisLin:Exploratory Data Analysis
A quick and effective data exploration toolkit. It provides essential features, including a descriptive statistics table for a quick overview of your dataset, interactive distribution plots to visualize variable patterns, Principal Component Analysis for dimensionality reduction and feature analysis, missing value imputation methods, and correlation analysis.
Maintained by Zhiwei Lin. Last updated 1 year ago.
1 star 2.00 score
reyar
Statsomat:Shiny Apps for Automated Data Analysis and Automated Interpretation
Shiny apps for automated data analysis, annotated outputs and human-readable interpretation in natural language. Designed especially for learners and applied researchers. Currently available methods: EDA, EDA with Python, Correlation Analysis, Principal Components Analysis, Confirmatory Factor Analysis.
Maintained by Denise Welsch. Last updated 3 years ago.
1.00 score 6 scripts
coissac
ProcMod:Informative Procrustean Matrix Correlation
Estimates the corrected Procrustean correlation between matrices to remove the overfitting effect. Coissac Eric and Gonindard-Melodelima Christelle (2019) <doi:10.1101/842070>.
Maintained by Eric Coissac. Last updated 4 years ago.
1.00 score