Showing 200 of 1071 total results
mfrasco
Metrics:Evaluation Metrics for Machine Learning
An implementation of evaluation metrics in R that are commonly used in supervised machine learning. It implements metrics for regression, time series, binary classification, classification, and information retrieval problems. It has zero dependencies and a consistent, simple interface for all functions.
Maintained by Michael Frasco. Last updated 6 years ago.
59.5 match 99 stars 13.02 score 6.1k scripts 51 dependents
rstudio
keras3:R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.
Maintained by Tomasz Kalinowski. Last updated 3 days ago.
41.3 match 845 stars 13.57 score 264 scripts 2 dependents
cbielow
PTXQC:Quality Report Generation for MaxQuant and mzTab Results
Generates Proteomics (PTX) quality control (QC) reports for shotgun LC-MS data analyzed with the MaxQuant software suite (from .txt files) or mzTab files (ideally from OpenMS 'QualityControl' tool). Reports are customizable (target thresholds, subsetting) and available in HTML or PDF format. Published in J. Proteome Res., Proteomics Quality Control: Quality Control Software for MaxQuant Results (2015) <doi:10.1021/acs.jproteome.5b00780>.
Maintained by Chris Bielow. Last updated 1 year ago.
drag-and-drop, hacktoberfest, heatmap, match-between-runs, maxquant, metric, mztab, openms, proteomics, quality-control, quality-metrics, report
44.4 match 42 stars 9.35 score 105 scripts 1 dependents
tidymodels
yardstick:Tidy Characterizations of Model Performance
Tidy tools for quantifying how well a model fits a data set, such as confusion matrices, class probability curve summaries, and regression metrics (e.g., RMSE).
Maintained by Emil Hvitfeldt. Last updated 3 days ago.
25.4 match 387 stars 15.47 score 2.2k scripts 60 dependents
epiforecasts
scoringutils:Utilities for Scoring and Assessing Predictions
Facilitates the evaluation of forecasts in a convenient framework based on data.table. It allows users to check their forecasts and diagnose issues, to visualise forecasts and missing data, to transform data before scoring, to handle missing forecasts, to aggregate scores, and to visualise the results of the evaluation. The package mostly focuses on the evaluation of probabilistic forecasts and allows evaluating several different forecast types and input formats. Find more information about the package in the Vignettes as well as in the accompanying paper, <doi:10.48550/arXiv.2205.07090>.
Maintained by Nikos Bosse. Last updated 12 days ago.
forecast-evaluation, forecasting
29.6 match 52 stars 11.37 score 326 scripts 7 dependents
t-kalinowski
keras:R Interface to 'Keras'
Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.
Maintained by Tomasz Kalinowski. Last updated 11 months ago.
30.0 match 10.82 score 10k scripts 54 dependents
microsoft
wpa:Tools for Analysing and Visualising Viva Insights Data
Opinionated functions that enable easier and faster analysis of Viva Insights data. There are three main types of functions in 'wpa': (1) Standard functions create a 'ggplot' visual or a summary table based on a specific Viva Insights metric; (2) Report Generation functions generate HTML reports on a specific analysis area, e.g. Collaboration; (3) Other miscellaneous functions cover more specific applications (e.g. Subject Line text mining) of Viva Insights data. This package adheres to 'tidyverse' principles and works well with the pipe syntax. 'wpa' is built with beginner-to-intermediate R users in mind, and is optimised for simplicity.
Maintained by Martin Chan. Last updated 4 months ago.
44.1 match 30 stars 6.69 score 39 scripts 1 dependents
thie1e
cutpointr:Determine and Evaluate Optimal Cutpoints in Binary Classification Tasks
Estimate cutpoints that optimize a specified metric in binary classification tasks and validate performance using bootstrapping. Some methods for more robust cutpoint estimation are supported, e.g. a parametric method assuming normal distributions, bootstrapped cutpoints, and smoothing of the metric values per cutpoint using Generalized Additive Models. Various plotting functions are included. For an overview of the package see Thiele and Hirschfeld (2021) <doi:10.18637/jss.v098.i11>.
Maintained by Christian Thiele. Last updated 3 months ago.
bootstrapping, cutpoint-optimization, roc-curve, cpp
27.9 match 88 stars 10.44 score 322 scripts 1 dependents
benrwoodard
adobeanalyticsr:R Client for 'Adobe Analytics' API 2.0
Connect to the 'Adobe Analytics' API v2.0 <https://github.com/AdobeDocs/analytics-2.0-apis> which powers 'Analysis Workspace'. The package was developed with the analyst in mind, and it will continue to be developed with the guiding principles of iterative, repeatable, timely analysis.
Maintained by Ben Woodard. Last updated 2 months ago.
41.5 match 18 stars 7.02 score 39 scripts
modeloriented
fairmodels:Flexible Tool for Bias Detection, Visualization, and Mitigation
Measure fairness metrics in one place for many models. Check how large a model's bias is with respect to race, sex, nationality, etc. Use measures such as Statistical Parity and Equal odds to detect discrimination against unprivileged groups. Visualize the bias using a heatmap, radar plot, biplot, bar chart (and more!). Various pre-processing and post-processing bias mitigation algorithms are implemented. The package also supports calculating fairness metrics for regression models. Find more details in Wiśniewski and Biecek (2021) <arXiv:2104.00507>.
Maintained by Jakub Wiśniewski. Last updated 1 month ago.
explain-classifiers, explainable-ml, fairness, fairness-comparison, fairness-ml, model-evaluation
35.4 match 86 stars 7.72 score 51 scripts 1 dependents
jackstat
ModelMetrics:Rapid Calculation of Model Metrics
Collection of metrics for evaluating models written in C++ using 'Rcpp'. Popular metrics include area under the curve, log loss, root mean square error, etc.
Maintained by Tyler Hunt. Last updated 4 years ago.
auc, logloss, machine-learning, metrics, model-evaluation, model-metrics, cpp
21.4 match 29 stars 11.83 score 1.3k scripts 306 dependents
fgazzelloni
hmsidwR:Health Metrics and the Spread of Infectious Diseases
A collection of datasets and supporting functions accompanying Health Metrics and the Spread of Infectious Diseases by Federica Gazzelloni (2024). This package provides data for health metrics calculations, including Disability-Adjusted Life Years (DALYs), Years of Life Lost (YLLs), and Years Lived with Disability (YLDs), as well as additional tools for analyzing and visualizing health data. Federica Gazzelloni (2024) <doi:10.5281/zenodo.10818338>.
Maintained by Federica Gazzelloni. Last updated 2 months ago.
deaths, health-data, infectious-diseases, lifeexpectancy
45.0 match 4 stars 5.48 score 6 scripts
samuel-marsh
scCustomize:Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing
Collection of functions created and/or curated to aid in the visualization and analysis of single-cell data using 'R'. 'scCustomize' aims to provide 1) Customized visualizations for aid in ease of use and to create more aesthetic and functional visuals. 2) Improve speed/reproducibility of common tasks/pieces of code in scRNA-seq analysis with a single or group of functions. For citation please use: Marsh SE (2021) "Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing" <doi:10.5281/zenodo.5706430> RRID:SCR_024675.
Maintained by Samuel Marsh. Last updated 3 months ago.
customization, ggplot2, scrna-seq, seurat, single-cell, single-cell-genomics, single-cell-rna-seq, visualization
27.5 match 242 stars 8.75 score 1.1k scripts
azure
azuremlsdk:Interface to the 'Azure Machine Learning' 'SDK'
Interface to the 'Azure Machine Learning' Software Development Kit ('SDK'). Data scientists can use the 'SDK' to train, deploy, automate, and manage machine learning models on the 'Azure Machine Learning' service. To learn more about 'Azure Machine Learning' visit the website: <https://docs.microsoft.com/en-us/azure/machine-learning/service/overview-what-is-azure-ml>.
Maintained by Diondra Peck. Last updated 3 years ago.
amlcompute, azure, azure-machine-learning, azureml, dsi, machine-learning, rstudio, sdk-r
26.6 match 106 stars 8.91 score 221 scripts
moviedo5
fda.usc:Functional Data Analysis and Utilities for Statistical Computing
Routines for exploratory and descriptive analysis of functional data such as depth measurements, atypical curves detection, regression models, supervised classification, unsupervised classification and functional analysis of variance.
Maintained by Manuel Oviedo de la Fuente. Last updated 4 months ago.
functional-data-analysis, fortran
23.8 match 12 stars 9.72 score 560 scripts 22 dependents
microsoft
vivainsights:Analyze and Visualize Data from 'Microsoft Viva Insights'
Provides a versatile range of functions, including exploratory data analysis, time-series analysis, organizational network analysis, and data validation, while implementing a set of best practices in analyzing and visualizing data specific to 'Microsoft Viva Insights'.
Maintained by Martin Chan. Last updated 23 days ago.
37.0 match 11 stars 6.12 score 68 scripts
spatstat
spatstat.geom:Geometrical Functionality of the 'spatstat' Family
Defines spatial data types and supports geometrical operations on them. Data types include point patterns, windows (domains), pixel images, line segment patterns, tessellations and hyperframes. Capabilities include creation and manipulation of data (using command line or graphical interaction), plotting, geometrical operations (rotation, shift, rescale, affine transformation), convex hull, discretisation and pixellation, Dirichlet tessellation, Delaunay triangulation, pairwise distances, nearest-neighbour distances, distance transform, morphological operations (erosion, dilation, closing, opening), quadrat counting, geometrical measurement, geometrical covariance, colour maps, calculus on spatial domains, Gaussian blur, level sets of images, transects of images, intersections between objects, minimum distance matching. (Excludes spatial data on a network, which are supported by the package 'spatstat.linnet'.)
Maintained by Adrian Baddeley. Last updated 22 hours ago.
classes-and-objects, distance-calculation, geometry, geometry-processing, images, mensuration, plotting, point-patterns, spatial-data, spatial-data-analysis
18.6 match 7 stars 12.11 score 241 scripts 227 dependents
ludvigolsen
cvms:Cross-Validation for Model Selection
Cross-validate one or multiple regression and classification models and get relevant evaluation metrics in a tidy format. Validate the best model on a test set and compare it to a baseline evaluation. Alternatively, evaluate predictions from an external model. Currently supports regression and classification (binary and multiclass). Described in chp. 5 of Jeyaraman, B. P., Olsen, L. R., & Wambugu M. (2019, ISBN: 9781838550134).
Maintained by Ludvig Renbo Olsen. Last updated 9 days ago.
21.1 match 39 stars 10.31 score 492 scripts 5 dependents
rmi-pacta
pacta.multi.loanbook:Run 'PACTA' on Multiple Loan Books Easily
Run Paris Agreement Capital Transition Assessment ('PACTA') analyses on multiple loan books in a structured way. Provides access to standard 'PACTA' metrics and additional 'PACTA'-related metrics for multiple loan books. Results take the form of 'csv' files and plots and are exported to user-specified project paths.
Maintained by Jacob Kastl. Last updated 1 day ago.
climate-change, pacta, pactaverse, sustainable-finance
33.2 match 6.48 score 4 scripts
bioc
evaluomeR:Evaluation of Bioinformatics Metrics
Evaluating the reliability of your own metrics and the measurements done on your own datasets by analysing the stability and goodness of the classifications of such metrics.
Maintained by José Antonio Bernabé-Díaz. Last updated 5 months ago.
clustering, classification, featureextraction, assessment, clustering-evaluation, evaluome, evaluomer, metrics
44.6 match 4.82 score 33 scripts
jacobbien
simulator:An Engine for Running Simulations
A framework for performing simulations such as those common in methodological statistics papers. The design principles of this package are described in greater depth in Bien, J. (2016) "The simulator: An Engine to Streamline Simulations," which is available at <arXiv:1607.00021>.
Maintained by Jacob Bien. Last updated 2 years ago.
29.9 match 52 stars 7.13 score 103 scripts
r-spatialecology
landscapemetrics:Landscape Metrics for Categorical Map Patterns
Calculates landscape metrics for categorical landscape patterns in a tidy workflow. 'landscapemetrics' reimplements the most common metrics from 'FRAGSTATS' (<https://www.fragstats.org/>) and new ones from the current literature on landscape metrics. This package supports 'terra' SpatRaster objects as input arguments. It further provides utility functions to visualize patches, select metrics and building blocks to develop new metrics.
Maintained by Maximilian H.K. Hesselbarth. Last updated 1 month ago.
landscape-ecology, landscape-metrics, raster, spatial, cpp
16.8 match 240 stars 12.47 score 584 scripts 4 dependents
irinagain
iglu:Interpreting Glucose Data from Continuous Glucose Monitors
Implements a wide range of metrics for measuring glucose control and glucose variability based on continuous glucose monitoring data. The list of implemented metrics is summarized in Rodbard (2009) <doi:10.1089/dia.2009.0015>. Additional visualization tools include time-series plots, lasagna plots and ambulatory glucose profile report.
Maintained by Irina Gaynanova. Last updated 9 days ago.
22.8 match 26 stars 9.00 score 39 scripts
vandomed
stocks:Stock Market Analysis
Functions for analyzing and visualizing stock market data. Main features are loading and aligning historical data, calculating performance metrics for individual funds or portfolios (e.g. annualized growth, maximum drawdown, Sharpe/Sortino ratio), and creating graphs.
Maintained by Dane R. Van Domelen. Last updated 5 years ago.
investment-analysis, portfolio-construction, portfolio-optimization, sharpe-ratio, stock-market, time-series, cpp
42.0 match 22 stars 4.63 score 39 scripts
jedick
chem16S:Chemical Metrics for Microbial Communities
Combines taxonomic classifications of high-throughput 16S rRNA gene sequences with reference proteomes of archaeal and bacterial taxa to generate amino acid compositions of community reference proteomes. Calculates chemical metrics including carbon oxidation state ('Zc'), stoichiometric oxidation and hydration state ('nO2' and 'nH2O'), H/C, N/C, O/C, and S/C ratios, grand average of hydropathicity ('GRAVY'), isoelectric point ('pI'), protein length, and average molecular weight of amino acid residues. Uses precomputed reference proteomes for archaea and bacteria derived from the Genome Taxonomy Database ('GTDB'). Also includes reference proteomes derived from the NCBI Reference Sequence ('RefSeq') database and manual mapping from the 'RDP Classifier' training set to 'RefSeq' taxonomy as described by Dick and Tan (2023) <doi:10.1007/s00248-022-01988-9>. Processes taxonomic classifications in 'RDP Classifier' format or OTU tables in 'phyloseq-class' objects from the Bioconductor package 'phyloseq'.
Maintained by Jeffrey Dick. Last updated 6 days ago.
16s-rrna, carbon-oxidation-state, chemical-metrics, genomic-adaptation, microbial-communities
31.3 match 4 stars 5.92 score 8 scripts
philips-software
latrend:A Framework for Clustering Longitudinal Data
A framework for clustering longitudinal datasets in a standardized way. The package provides an interface to existing R packages for clustering longitudinal univariate trajectories, facilitating reproducible and transparent analyses. Additionally, standard tools are provided to support cluster analyses, including repeated estimation, model validation, and model assessment. The interface enables users to compare results between methods, and to implement and evaluate new methods with ease. The 'akmedoids' package is available from <https://github.com/MAnalytics/akmedoids>.
Maintained by Niek Den Teuling. Last updated 2 months ago.
cluster-analysis, clustering-evaluation, clustering-methods, data-science, longitudinal-clustering, longitudinal-data, mixture-models, time-series-analysis
26.4 match 30 stars 6.77 score 26 scripts
bioc
wateRmelon:Illumina DNA methylation array normalization and metrics
15 flavours of betas and three performance metrics, with methods for objects produced by methylumi and minfi packages.
Maintained by Leo C Schalkwyk. Last updated 4 months ago.
dnamethylation, microarray, twochannel, preprocessing, qualitycontrol
19.7 match 7.75 score 247 scripts 2 dependents
tomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 year ago.
18.6 match 3 stars 8.20 score 7.8k scripts 11 dependents
bioc
MsQuality:MsQuality - Quality metric calculation from Spectra and MsExperiment objects
MsQuality provides functionality to calculate quality metrics for mass spectrometry-derived spectral data at the per-sample level. MsQuality relies on the mzQC framework of quality metrics defined by the Human Proteome Organization-Proteomics Standards Initiative (HUPO-PSI). These metrics quantify the quality of spectral raw files using a controlled vocabulary. The package is aimed especially at users who acquire mass spectrometry data on a large scale (e.g. data sets from clinical settings consisting of several thousands of samples). The MsQuality package calculates low-level quality metrics that require minimal information on mass spectrometry data: retention time, m/z values, and associated intensities. MsQuality relies on the Spectra package, or alternatively the MsExperiment package, and its infrastructure to store spectral data.
Maintained by Thomas Naake. Last updated 2 months ago.
metabolomics, proteomics, massspectrometry, qualitycontrol, mass-spectrometry, qc
27.7 match 7 stars 5.45 score 2 scripts
bioc
similaRpeak:Metrics to estimate a level of similarity between two ChIP-Seq profiles
This package calculates metrics which quantify the level of similarity between ChIP-Seq profiles. More specifically, the package implements six pseudometrics specialized in pattern similarity detection in ChIP-Seq profiles.
Maintained by Astrid Deschênes. Last updated 5 months ago.
biologicalquestion, chipseq, genetics, multiplecomparison, differentialexpression, bioconductor, bioconductor-package, chip-profiles, chip-seq, metrics
26.1 match 7 stars 5.62 score 7 scripts
bioc
CNVMetrics:Copy Number Variant Metrics
The CNVMetrics package calculates similarity metrics to facilitate copy number variant comparison among samples and/or methods. Similarity metrics can be employed to compare CNV profiles of genetically unrelated samples as well as those with a common genetic background. Some metrics are based on the shared amplified/deleted regions while other metrics rely on the level of amplification/deletion. The data type used as input is a plain text file containing the genomic position of the copy number variations, as well as the status and/or the log2 ratio values. Finally, a visualization tool is provided to explore resulting metrics.
Maintained by Astrid Deschênes. Last updated 5 months ago.
biologicalquestion, software, copynumbervariation, cnv, copy-number-variation, metrics, r-language
28.9 match 4 stars 5.08 score 8 scripts
branchlab
metasnf:Meta Clustering with Similarity Network Fusion
Framework to facilitate patient subtyping with similarity network fusion and meta clustering. The similarity network fusion (SNF) algorithm was introduced by Wang et al. (2014) in <doi:10.1038/nmeth.2810>. SNF is a data integration approach that can transform high-dimensional and diverse data types into a single similarity network suitable for clustering with minimal loss of information from each initial data source. The meta clustering approach was introduced by Caruana et al. (2006) in <doi:10.1109/ICDM.2006.103>. Meta clustering involves generating a wide range of cluster solutions by adjusting clustering hyperparameters, then clustering the solutions themselves into a manageable number of qualitatively similar solutions, and finally characterizing representative solutions to find ones that are best for the user's specific context. This package provides a framework to easily transform multi-modal data into a wide range of similarity network fusion-derived cluster solutions as well as to visualize, characterize, and validate those solutions. Core package functionality includes easy customization of distance metrics, clustering algorithms, and SNF hyperparameters to generate diverse clustering solutions; calculation and plotting of associations between features, between patients, and between cluster solutions; and standard cluster validation approaches including resampled measures of cluster stability, standard metrics of cluster quality, and label propagation to evaluate generalizability in unseen data. Associated vignettes guide the user through using the package to identify patient subtypes while adhering to best practices for unsupervised learning.
Maintained by Prashanth S Velayudhan. Last updated 4 days ago.
bioinformatics, clustering, metaclustering, snf
17.7 match 8 stars 8.21 score 30 scripts
networkgroupr
fastnet:Large-Scale Social Network Analysis
We present an implementation of the algorithms required to simulate large-scale social networks and retrieve their most relevant metrics.
Maintained by Nazrul Shaikh. Last updated 8 years ago.
42.4 match 5 stars 3.37 score 47 scripts
terrytangyuan
dml:Distance Metric Learning in R
State-of-the-art algorithms for distance metric learning, including global and local methods such as Relevant Component Analysis, Discriminative Component Analysis, Local Fisher Discriminant Analysis, etc. These distance metric learning methods are widely applied in feature extraction, dimensionality reduction, clustering, classification, information retrieval, and computer vision problems.
Maintained by Yuan Tang. Last updated 2 years ago.
dimensionality-reduction, distance-metric-learning, machine-learning, metric-learning, statistics
23.7 match 58 stars 5.94 score 8 scripts 1 dependents
mayer79
MetricsWeighted:Weighted Metrics and Performance Measures for Machine Learning
Provides weighted versions of several metrics and performance measures used in machine learning, including average unit deviances of the Bernoulli, Tweedie, Poisson, and Gamma distributions, see Jorgensen B. (1997, ISBN: 978-0412997112). The package also contains a weighted version of generalized R-squared, see e.g. Cohen, J. et al. (2002, ISBN: 978-0805822236). Furthermore, 'dplyr' chains are supported.
Maintained by Michael Mayer. Last updated 8 months ago.
machine-learning, metrics, performance, statistics
20.2 match 11 stars 6.79 score 75 scripts 5 dependents
schlosslab
mikropml:User-Friendly R Package for Supervised Machine Learning Pipelines
An interface to build machine learning models for classification and regression problems. 'mikropml' implements the ML pipeline described by Topçuoğlu et al. (2020) <doi:10.1128/mBio.00434-20> with reasonable default options for data preprocessing, hyperparameter tuning, cross-validation, testing, model evaluation, and interpretation steps. See the website <https://www.schlosslab.org/mikropml/> for more information, documentation, and examples.
Maintained by Kelly Sovacool. Last updated 2 years ago.
17.2 match 56 stars 7.83 score 86 scripts
geanders
weathermetrics:Functions to Convert Between Weather Metrics
Functions to convert between weather metrics, including conversions for metrics of temperature, air moisture, wind speed, and precipitation. This package also includes functions to calculate the heat index from air temperature and air moisture.
Maintained by Brooke Anderson. Last updated 8 years ago.
16.1 match 23 stars 8.32 score 506 scripts 1 dependents
fhdsl
metricminer:Mine Metrics from Common Places on the Web
Mine metrics from common places on the web through the power of their APIs (application programming interfaces). It also helps put the data in a format that is easily used for a dashboard or other purposes. There is an associated dashboard template and tutorials, currently under development, that help you fully utilize 'metricminer'.
Maintained by Candace Savonen. Last updated 2 days ago.
21.6 match 2 stars 6.13 score 21 scripts
8-bit-sheep
googleAnalyticsR:Google Analytics API into R
Interact with the Google Analytics APIs <https://developers.google.com/analytics/>, including the Core Reporting API (v3 and v4), Management API, User Activity API, GA4's Data API and Admin API, and Multi-Channel Funnel API.
Maintained by Erik Grönroos. Last updated 6 months ago.
analytics, api, google, googleanalyticsr, googleauthr
12.6 match 262 stars 10.11 score 680 scripts 1 dependents
aidanmorales
rTwig:Realistic Quantitative Structure Models
Real Twig is a method to correct branch overestimation in quantitative structure models. Overestimated cylinders are correctly tapered using measured twig diameters of corresponding tree species. Supported quantitative structure modeling software includes 'TreeQSM', 'SimpleForest', 'Treegraph', and 'aRchi'. Also included is a novel database of twig diameters and tools for fractal analysis of point clouds.
Maintained by Aidan Morales. Last updated 12 days ago.
forestry, lidar, modeling, qsm, rcpp, cpp
17.9 match 8 stars 7.10 score 13 scripts
carlos-alberto-silva
rGEDI:NASA's Global Ecosystem Dynamics Investigation (GEDI) Data Visualization and Processing
Set of tools for downloading, reading, visualizing and processing GEDI Level1B, Level2A and Level2B data.
Maintained by Caio Hamamura. Last updated 5 months ago.
20.4 match 169 stars 6.11 score 85 scripts 1 dependents
jedick
canprot:Chemical Analysis of Proteins
Chemical analysis of proteins based on their amino acid compositions. Amino acid compositions can be read from FASTA files and used to calculate chemical metrics including carbon oxidation state and stoichiometric water content, as described in Dick et al. (2020) <doi:10.5194/bg-17-6145-2020>. Other properties that can be calculated include protein length, grand average of hydropathy (GRAVY), isoelectric point (pI), molecular weight (MW), standard molal volume (V0), and metabolic costs (Akashi and Gojobori, 2002 <doi:10.1073/pnas.062526999>; Wagner, 2005 <doi:10.1093/molbev/msi126>; Zhang et al., 2018 <doi:10.1038/s41467-018-06461-1>). A database of amino acid compositions of human proteins derived from UniProt is provided.
Maintained by Jeffrey Dick. Last updated 12 days ago.
amino-acid-composition, chemical-metrics, hydration-state, isoelectric-point, oxidation-state, proteins
18.1 match 3 stars 6.70 score 46 scripts 1 dependents
brian-j-smith
MachineShop:Machine Learning Models and Tools
Meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Approaches for model fitting and prediction of numerical, categorical, or censored time-to-event outcomes include traditional regression models, regularization methods, tree-based methods, support vector machines, neural networks, ensembles, data preprocessing, filtering, and model tuning and selection. Performance metrics are provided for model assessment and can be estimated with independent test sets, split sampling, cross-validation, or bootstrap resampling. Resample estimation can be executed in parallel for faster processing and nested in cases of model tuning and selection. Modeling results can be summarized with descriptive statistics; calibration curves; variable importance; partial dependence plots; confusion matrices; and ROC, lift, and other performance curves.
Maintained by Brian J Smith. Last updated 7 months ago.
classification-models, machine-learning, predictive-modeling, regression-models, survival-models
15.0 match 61 stars 7.95 score 121 scripts
dmpe
urlshorteneR:R Wrapper for the 'Bit.ly' and 'Is.gd'/'v.gd' URL Shortening Services
Allows using two URL shortening services, which also provide expanding and analytic functions. Specifically developed for 'Bit.ly' (which requires OAuth 2.0) and 'is.gd' (no API key).
Maintained by John Malc. Last updated 28 days ago.
bitly, isgd, shorten-urls, shortener, shorturl, url
17.8 match 21 stars 6.70 score 53 scripts 1 dependents
r-lidar
lidR:Airborne LiDAR Data Manipulation and Visualization for Forestry Applications
Airborne LiDAR (Light Detection and Ranging) interface for data manipulation and visualization. Read/write 'las' and 'laz' files, computation of metrics in area based approach, point filtering, artificial point reduction, classification from geographic data, normalization, individual tree segmentation and other manipulations.
Maintained by Jean-Romain Roussel. Last updated 1 month ago.
als, forestry, las, laz, lidar, point-cloud, remote-sensing, openblas, cpp, openmp
8.1 match 623 stars 14.47 score 844 scripts 8 dependents
marinapapa
swaRmverse:Swarm Space Creation
Provides a pipeline for the comparative analysis of collective movement data (e.g. fish schools, bird flocks, baboon troops) by processing 2-dimensional positional data (x,y,t) from GPS trackers or computer vision tracking systems, discretizing events of collective motion, calculating a set of established metrics that characterize each event, and placing the events in a multi-dimensional swarm space constructed from these metrics. The swarm space concept, the metrics and data sets included are described in: Papadopoulou Marina, Furtbauer Ines, O'Bryan Lisa R., Garnier Simon, Georgopoulou Dimitra G., Bracken Anna M., Christensen Charlotte and King Andrew J. (2023) <doi:10.1098/rstb.2022.0068>.
Maintained by Marina Papadopoulou. Last updated 5 months ago.
22.6 match 2 stars 5.13 score 15 scripts
pydemull
activAnalyzer:A 'Shiny' App to Analyze Accelerometer-Measured Daily Physical Behavior Data
A tool to analyse 'ActiGraph' accelerometer data and to implement the use of the PROactive Physical Activity in COPD (chronic obstructive pulmonary disease) instruments. Once analysis is completed, the app can export results to .csv files and generate a report of the measurement. All the configured inputs relevant for interpreting the results are recorded in the report. In addition to the existing 'R' packages that are fully integrated with the app, the app uses some functions from the 'actigraph.sleepr' package developed by Petkova (2021) <https://github.com/dipetkov/actigraph.sleepr/>.
Maintained by Pierre-Yves de Müllenheim. Last updated 6 months ago.
accelerometer, actigraph, app, monitor, shiny
22.4 match 5 stars 5.18 score 8 scripts
molina-valero
FORTLS:Automatic Processing of Terrestrial-Based Technologies Point Cloud Data for Forestry Purposes
Process automation of point cloud data derived from terrestrial-based technologies such as Terrestrial Laser Scanner (TLS) or Mobile Laser Scanner. 'FORTLS' enables (i) detection of trees and estimation of tree-level attributes (e.g. diameters and heights), (ii) estimation of stand-level variables (e.g. density, basal area, mean and dominant height), (iii) computation of metrics related to important forest attributes estimated in Forest Inventories at stand-level, and (iv) optimization of plot design for combining TLS data and field measured data. 'FORTLS' is documented in Molina-Valero et al. (2022, <doi:10.1016/j.envsoft.2022.105337>).
Maintained by Juan Alberto Molina-Valero. Last updated 3 months ago.
forest-inventory forest-monitoring lidar-point-cloud cpp
18.6 match 22 stars 6.16 score 11 scripts
sparklyr
sparklyr:R Interface to Apache Spark
R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.
Maintained by Edgar Ruiz. Last updated 9 days ago.
apache-spark distributed dplyr ide livy machine-learning remote-clusters spark sparklyr
7.4 match 959 stars 15.16 score 4.0k scripts 21 dependents
biorgeo
bioregion:Comparison of Bioregionalisation Methods
The main purpose of this package is to propose a transparent methodological framework to compare bioregionalisation methods based on hierarchical and non-hierarchical clustering algorithms (Kreft & Jetz (2010) <doi:10.1111/j.1365-2699.2010.02375.x>) and network algorithms (Lenormand et al. (2019) <doi:10.1002/ece3.4718> and Leroy et al. (2019) <doi:10.1111/jbi.13674>).
Maintained by Maxime Lenormand. Last updated 10 days ago.
biogeography bioregion bioregionalization cpp
17.6 match 7 stars 6.27 score 11 scripts
zachmayer
caretEnsemble:Ensembles of Caret Models
Functions for creating ensembles of caret models: caretList() and caretStack(). caretList() is a convenience function for fitting multiple caret::train() models to the same dataset. caretStack() will make linear or non-linear combinations of these models, using a caret::train() model as a meta-model.
Maintained by Zachary A. Deane-Mayer. Last updated 3 months ago.
9.1 match 226 stars 11.92 score 780 scripts 1 dependents
phuais
multilandr:Landscape Analysis at Multiple Spatial Scales
Provides a tidy workflow for landscape-scale analysis. 'multilandr' offers tools to generate landscapes at multiple spatial scales and compute landscape metrics, primarily using the 'landscapemetrics' package. It also features utility functions for plotting and analyzing multi-scale landscapes, exploring correlations between metrics, filtering landscapes based on specific conditions, generating landscape gradients for a given metric, and preparing datasets for further statistical analysis. Documentation about 'multilandr' is provided in an introductory vignette included in this package and in the paper by Huais (2024) <doi:10.1007/s10980-024-01930-z>; see citation("multilandr") for details.
Maintained by Pablo Yair Huais. Last updated 27 days ago.
18.8 match 9 stars 5.61 score 5 scripts
globalecologylab
poems:Pattern-Oriented Ensemble Modeling System
A framework of interoperable R6 classes (Chang, 2020, <https://CRAN.R-project.org/package=R6>) for building ensembles of viable models via the pattern-oriented modeling (POM) approach (Grimm et al., 2005, <doi:10.1126/science.1116681>). The package includes classes for encapsulating and generating model parameters, and managing the POM workflow. The workflow includes: model setup; generating model parameters via Latin hyper-cube sampling (Iman & Conover, 1980, <doi:10.1080/03610928008827996>); running multiple sampled model simulations; collating summary results; and validating and selecting an ensemble of models that best match known patterns. By default, model validation and selection utilizes an approximate Bayesian computation (ABC) approach (Beaumont et al., 2002, <doi:10.1093/genetics/162.4.2025>), although alternative user-defined functionality could be employed. The package includes a spatially explicit demographic population model simulation engine, which incorporates default functionality for density dependence, correlated environmental stochasticity, stage-based transitions, and distance-based dispersal. The user may customize the simulator by defining functionality for translocations, harvesting, mortality, and other processes, as well as defining the sequence order for the simulator processes. The framework could also be adapted for use with other model simulators by utilizing its extendable (inheritable) base classes.
Maintained by July Pilowsky. Last updated 19 days ago.
biogeography population-model process-based
13.1 match 10 stars 8.05 score 59 scripts 2 dependents
terrytangyuan
lfda:Local Fisher Discriminant Analysis
Functions for performing and visualizing Local Fisher Discriminant Analysis (LFDA), Kernel Fisher Discriminant Analysis (KLFDA), and Semi-supervised Local Fisher Discriminant Analysis (SELF).
Maintained by Yuan Tang. Last updated 2 years ago.
dimensionality-reduction distance-metric-learning machine-learning metric-learning statistics
16.0 match 76 stars 6.50 score 74 scripts 3 dependents
craig-parylo
cvdprevent:Wrapper for the 'CVD Prevent' Application Programming Interface
Provides an R wrapper to the 'CVD Prevent' application programming interface (API). Users can make API requests through built-in R functions. The Cardiovascular Disease Prevention Audit (CVDPREVENT) is an England-wide primary care audit that automatically extracts routinely held GP health data. <https://bmchealthdocs.atlassian.net/wiki/spaces/CP/pages/317882369/CVDPREVENT+API+Documentation>.
Maintained by Craig Parylo. Last updated 1 months ago.
20.1 match 3 stars 5.02 score 4 scripts
wadpac
GGIR:Raw Accelerometer Data Analysis
A tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from 'GENEActiv' <https://activinsights.com/>, binary (.gt3x) and .csv-export data from 'Actigraph' <https://theactigraph.com> devices, and binary (.cwa) and .csv-export data from 'Axivity' <https://axivity.com>. These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data files from any other sensor brand, provided that the data is stored in csv format. The package also allows for external function embedding.
Maintained by Vincent T van Hees. Last updated 1 days ago.
accelerometer activity-recognition circadian-rhythm movement-sensor sleep
7.6 match 109 stars 13.20 score 342 scripts 3 dependents
bupaverse
edeaR:Exploratory and Descriptive Event-Based Data Analysis
Exploratory and descriptive analysis of event based data. Provides methods for describing and selecting process data, and for preparing event log data for process mining. Builds on the S3-class for event logs implemented in the package 'bupaR'.
Maintained by Gert Janssenswillen. Last updated 3 months ago.
11.0 match 12 stars 9.17 score 149 scripts 8 dependents
tguillerme
dispRity:Measuring Disparity
A modular package for measuring disparity (multidimensional space occupancy). Disparity can be calculated from any matrix defining a multidimensional space. The package provides a set of implemented metrics to measure properties of the space and allows users to provide and test their own metrics. The package also provides functions for looking at disparity in a serial way (e.g. disparity through time) or per groups as well as visualising the results. Finally, this package provides several statistical tests for disparity analysis.
Maintained by Thomas Guillerme. Last updated 1 days ago.
disparity ecology multidimensionality palaeobiology
11.5 match 26 stars 8.69 score 220 scripts 1 dependents
billpetti
baseballr:Acquiring and Analyzing Baseball Data
Provides numerous utilities for acquiring and analyzing baseball data from online sources such as 'Baseball Reference' <https://www.baseball-reference.com/>, 'FanGraphs' <https://www.fangraphs.com/>, and the 'MLB Stats' API <https://www.mlb.com/>.
Maintained by Saiem Gilani. Last updated 4 months ago.
baseball pitchfx sabermetrics statcast
11.2 match 380 stars 8.98 score 582 scripts
mlverse
luz:Higher Level 'API' for 'torch'
A high level interface for 'torch' providing utilities to reduce the amount of code needed for common tasks, abstract away torch details and make the same code work on both the 'CPU' and 'GPU'. It's flexible enough to support expressing a large range of models. It's heavily inspired by 'fastai' by Howard et al. (2020) <arXiv:2002.04688>, 'Keras' by Chollet et al. (2015) and 'PyTorch Lightning' by Falcon et al. (2019) <doi:10.5281/zenodo.3828935>.
Maintained by Daniel Falbel. Last updated 6 months ago.
10.1 match 89 stars 9.88 score 318 scripts 4 dependents
andrewljackson
SIBER:Stable Isotope Bayesian Ellipses in R
Fits bi-variate ellipses to stable isotope data using Bayesian inference with the aim being to describe and compare their isotopic niche.
Maintained by Andrew Jackson. Last updated 10 months ago.
community-ecology ecology niche-modelling stable-isotopes jags cpp
10.9 match 36 stars 9.13 score 187 scripts 1 dependents
mlampros
nmslibR:Non Metric Space (Approximate) Library
A Non-Metric Space Library ('NMSLIB' <https://github.com/nmslib/nmslib>) wrapper, which according to the authors "is an efficient cross-platform similarity search library and a toolkit for evaluation of similarity search methods. The goal of the 'NMSLIB' <https://github.com/nmslib/nmslib> Library is to create an effective and comprehensive toolkit for searching in generic non-metric spaces. Being comprehensive is important, because no single method is likely to be sufficient in all cases. Also note that exact solutions are hardly efficient in high dimensions and/or non-metric spaces. Hence, the main focus is on approximate methods". The wrapper also includes Approximate Kernel k-Nearest-Neighbor functions based on the 'NMSLIB' <https://github.com/nmslib/nmslib> 'Python' Library.
Maintained by Lampros Mouselimis. Last updated 2 years ago.
approximate-nearest-neighbor-search nmslib non-metric python reticulate cpp openmp
19.4 match 12 stars 5.14 score 23 scripts
robwschlegel
heatwaveR:Detect Heatwaves and Cold-Spells
Implements the different methods for defining, detecting, and categorising the extreme events known as heatwaves or cold-spells, as first proposed in Hobday et al. (2016) <doi:10.1016/j.pocean.2015.12.014> and Hobday et al. (2018) <https://www.jstor.org/stable/26542662>. The functions in this package work on both air and water temperature data. These detection algorithms may be used on non-temperature data as well.
Maintained by Robert W. Schlegel. Last updated 2 months ago.
10.6 match 46 stars 9.36 score 343 scripts
easystats
easystats:Framework for Easy Statistical Modeling, Visualization, and Reporting
A meta-package that installs and loads a set of packages from the 'easystats' ecosystem in a single step. This collection of packages provides a unifying and consistent framework for statistical modeling, visualization, and reporting. Additionally, it provides articles targeted at instructors for teaching 'easystats', and a dashboard targeted at new R users for easily conducting statistical analysis by accessing summary results, model fit indices, and visualizations with minimal programming.
Maintained by Daniel Lüdecke. Last updated 11 days ago.
dataanalytics datascience easystats hacktoberfest models performance-metrics regression-models statistics
7.5 match 1.1k stars 13.01 score 1.8k scripts 1 dependents
jmadinlab
habtools:Tools and Metrics for 3D Surfaces and Objects
A collection of functions for sampling and simulating 3D surfaces and objects and estimating metrics like rugosity, fractal dimension, convexity, sphericity, circularity, second moments of area and volume, and more.
Maintained by Nina Schiettekatte. Last updated 10 days ago.
15.9 match 12 stars 6.10 score 9 scripts
satijalab
Seurat:Tools for Single Cell Genomics
A toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. 'Seurat' aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031>, and Hao, Hao, et al (2020) <doi:10.1101/2020.10.12.335331> for more details.
Maintained by Paul Hoffman. Last updated 1 years ago.
human-cell-atlas single-cell-genomics single-cell-rna-seq cpp
5.7 match 2.4k stars 16.86 score 50k scripts 73 dependents
ms609
TreeDist:Calculate and Map Distances Between Phylogenetic Trees
Implements measures of tree similarity, including information-based generalized Robinson-Foulds distances (Phylogenetic Information Distance, Clustering Information Distance, Matching Split Information Distance; Smith 2020) <doi:10.1093/bioinformatics/btaa614>; Jaccard-Robinson-Foulds distances (Bocker et al. 2013) <doi:10.1007/978-3-642-40453-5_13>, including the Nye et al. (2006) metric <doi:10.1093/bioinformatics/bti720>; the Matching Split Distance (Bogdanowicz & Giaro 2012) <doi:10.1109/TCBB.2011.48>; Maximum Agreement Subtree distances; the Kendall-Colijn (2016) distance <doi:10.1093/molbev/msw124>, and the Nearest Neighbour Interchange (NNI) distance, approximated per Li et al. (1996) <doi:10.1007/3-540-61332-3_168>. Includes tools for visualizing mappings of tree space (Smith 2022) <doi:10.1093/sysbio/syab100>, for identifying islands of trees (Silva and Wilkinson 2021) <doi:10.1093/sysbio/syab015>, for calculating the median of sets of trees, and for computing the information content of trees and splits.
Maintained by Martin R. Smith. Last updated 1 months ago.
phylogenetics tree-distance phylogenetic-trees tree-distances trees cpp
9.2 match 32 stars 10.32 score 97 scripts 5 dependents
atheriel
openmetrics:A 'Prometheus' Client for R Using the 'OpenMetrics' Format
Provides a client for the open-source monitoring and alerting toolkit, 'Prometheus', that emits metrics in the 'OpenMetrics' format. Allows users to automatically instrument 'Plumber' and 'Shiny' applications, collect standard process metrics, as well as define custom counter, gauge, and histogram metrics of their own.
Maintained by Aaron Jacobs. Last updated 4 years ago.
metrics openmetrics plumber prometheus prometheus-client shiny
22.1 match 35 stars 4.24 score 5 scripts
braverock
PerformanceAnalytics:Econometric Tools for Performance and Risk Analysis
Collection of econometric functions for performance and risk analysis. In addition to standard risk and performance metrics, this package aims to aid practitioners and researchers in utilizing the latest research in analysis of non-normal return streams. In general, it is most tested on return (rather than price) data on a regular scale, but most functions will work with irregular return data as well, and increasing numbers of functions will work with P&L or price data where possible.
Maintained by Brian G. Peterson. Last updated 3 months ago.
5.7 match 222 stars 15.93 score 4.8k scripts 20 dependents
mikeblazanin
gcplyr:Wrangle and Analyze Growth Curve Data
Easy wrangling and model-free analysis of microbial growth curve data, as commonly output by plate readers. Tools for reshaping common plate reader outputs into 'tidy' formats and merging them with design information, making data easy to work with using 'gcplyr' and other packages. Also streamlines common growth curve processing steps, like smoothing and calculating derivatives, and facilitates model-free characterization and analysis of growth data. See methods at <https://mikeblazanin.github.io/gcplyr/>.
Maintained by Mike Blazanin. Last updated 2 months ago.
11.3 match 30 stars 7.90 score 75 scripts
bioxgeo
geodiv:Methods for Calculating Gradient Surface Metrics
Methods for calculating gradient surface metrics for continuous analysis of landscape features.
Maintained by Annie C. Smith. Last updated 1 years ago.
14.8 match 11 stars 5.88 score 23 scripts 1 dependents
fabrice-rossi
mixvlmc:Variable Length Markov Chains with Covariates
Estimates Variable Length Markov Chains (VLMC) models and VLMC with covariates models from discrete sequences. Supports model selection via information criteria and simulation of new sequences from an estimated model. See Bühlmann, P. and Wyner, A. J. (1999) <doi:10.1214/aos/1018031204> for VLMC and Zanin Zambom, A., Kim, S. and Lopes Garcia, N. (2022) <doi:10.1111/jtsa.12615> for VLMC with covariates.
Maintained by Fabrice Rossi. Last updated 10 months ago.
machine-learning markov-chain markov-model statistics time-series cpp
13.9 match 2 stars 6.23 score 20 scripts
pharmar
riskmetric:Risk Metrics to Evaluating R Packages
Facilities for assessing R packages against a number of metrics to help quantify their robustness.
Maintained by Eli Miller. Last updated 8 days ago.
9.4 match 167 stars 8.89 score 43 scripts
hneth
riskyr:Rendering Risk Literacy more Transparent
Risk-related information (like the prevalence of conditions, the sensitivity and specificity of diagnostic tests, or the effectiveness of interventions or treatments) can be expressed in terms of frequencies or probabilities. By providing a toolbox of corresponding metrics and representations, 'riskyr' computes, translates, and visualizes risk-related information in a variety of ways. Adopting multiple complementary perspectives provides insights into the interplay between key parameters and renders teaching and training programs on risk literacy more transparent.
Maintained by Hansjoerg Neth. Last updated 10 months ago.
2x2-matrix bayesian-inference contingency-table representation risk risk-literacy visualization
11.3 match 19 stars 7.36 score 80 scripts
ropensci
repometrics:Metrics for Your Code Repository
Metrics for your code repository. Call one function to generate an interactive dashboard displaying the state of your code.
Maintained by Mark Padgham. Last updated 3 days ago.
18.3 match 2 stars 4.53 score
bioc
scuttle:Single-Cell RNA-Seq Analysis Utilities
Provides basic utility functions for performing single-cell analyses, focusing on simple normalization, quality control and data transformations. Also provides some helper functions to assist development of other packages.
Maintained by Aaron Lun. Last updated 5 months ago.
immunooncology singlecell rnaseq qualitycontrol preprocessing normalization transcriptomics geneexpression sequencing software dataimport openblas cpp
7.9 match 10.21 score 1.7k scripts 80 dependents
nashjc
optimx:Expanded Replacement and Extension of the 'optim' Function
Provides a replacement and extension of the optim() function to call several function minimization codes in R in a single statement. These methods handle smooth, possibly box constrained functions of several or many parameters. Note that function 'optimr()' was prepared to simplify the incorporation of minimization codes going forward. Also implements some utility codes and some extra solvers, including safeguarded Newton methods. Many methods previously separate are now included here. This is the version for CRAN.
Maintained by John C Nash. Last updated 2 months ago.
6.3 match 2 stars 12.87 score 1.8k scripts 89 dependents
rstudio
shinyloadtest:Load Test Shiny Applications
Assesses the number of concurrent users 'shiny' applications are capable of supporting, and helps direct application changes in order to support a higher number of users. Provides facilities for recording 'shiny' application sessions, playing recorded sessions against a target server at load, and analyzing the resulting metrics.
Maintained by Barret Schloerke. Last updated 7 months ago.
11.0 match 112 stars 7.14 score 61 scripts
mlysy
nicheROVER:Niche Region and Niche Overlap Metrics for Multidimensional Ecological Niches
Implementation of a probabilistic method to calculate 'nicheROVER' (_niche_ _r_egion and niche _over_lap) metrics using multidimensional niche indicator data (e.g., stable isotopes, environmental variables, etc.). The niche region is defined as the joint probability density function of the multidimensional niche indicators at a user-defined probability alpha (e.g., 95%). Uncertainty is accounted for in a Bayesian framework, and the method can be extended to three or more indicator dimensions. It provides directional estimates of niche overlap, accounts for species-specific distributions in multivariate niche space, and produces unique and consistent bivariate projections of the multivariate niche region. The article by Swanson et al. (2015) <doi:10.1890/14-0235.1> provides a detailed description of the methodology. See the package vignette for a worked example using fish stable isotope data.
Maintained by Martin Lysy. Last updated 1 years ago.
11.4 match 9 stars 6.80 score 47 scripts 1 dependents
bioc
spatialFDA:A Tool for Spatial Multi-sample Comparisons
spatialFDA is a package to calculate spatial statistics metrics. The package takes a SpatialExperiment object and calculates spatial statistics metrics using the package spatstat. Then it compares the resulting functions across samples/conditions using functional additive models as implemented in the package refund. Furthermore, it provides exploratory visualisations using functional principal component analysis, also implemented in refund.
Maintained by Martin Emons. Last updated 24 days ago.
software spatial transcriptomics
15.4 match 2 stars 5.00 score 6 scripts
mlampros
KernelKnn:Kernel k Nearest Neighbors
Extends the simple k-nearest neighbors algorithm by incorporating numerous kernel functions and a variety of distance metrics. The package takes advantage of 'RcppArmadillo' to speed up the calculation of distances between observations.
Maintained by Lampros Mouselimis. Last updated 2 years ago.
cpp11 distance-metric kernel-methods knn rcpparmadillo openblas cpp openmp
8.0 match 17 stars 9.16 score 54 scripts 13 dependents
rbarkerclarke
gtexture:Generalized Application of Co-Occurrence Matrices and Haralick Texture
Generalizes application of gray-level co-occurrence matrix (GLCM) metrics to objects outside of images. The current focus is to apply GLCM metrics to the study of biological networks and fitness landscapes that are used in studying evolutionary medicine and biology, particularly the evolution of cancer resistance. The package was used in our publication, Barker-Clarke et al. (2023) <doi:10.1088/1361-6560/ace305>. A general reference to learn more about mathematical oncology can be found at Rockne et al. (2019) <doi:10.1088/1478-3975/ab1a09>.
Maintained by Rowan Barker-Clarke. Last updated 12 months ago.
24.4 match 3.00 score 1 scripts
vjoshy
ATQ:Alert Time Quality - Evaluating Timely Epidemic Metrics
Provides tools for evaluating timely epidemic detection models within school absenteeism-based surveillance systems. Introduces the concept of alert time quality as an evaluation metric. Includes functions to simulate populations, epidemics, and alert metrics associated with epidemic spread using population census data. The methods are based on research published in Vanderkruk et al. (2023) <doi:10.1186/s12889-023-15747-z> and Ward et al. (2019) <doi:10.1186/s12889-019-7521-7>.
Maintained by Vinay Joshy. Last updated 7 months ago.
16.2 match 1 stars 4.48 score 6 scripts
ropensci
nlrx:Setup, Run and Analyze 'NetLogo' Model Simulations from 'R' via 'XML'
Set up, run and analyze 'NetLogo' (<https://ccl.northwestern.edu/netlogo/>) model simulations in 'R'. 'nlrx' experiments use a similar structure to 'NetLogo' Behavior Space experiments. However, 'nlrx' offers more flexibility and additional tools for running and analyzing complex simulation designs and sensitivity analyses. The user defines all information that is needed in an intuitive framework, using class objects. Experiments are submitted from 'R' to 'NetLogo' via 'XML' files that are dynamically written, based on specifications defined by the user. By nesting model calls in future environments, large simulation designs with many runs can be executed in parallel. This also enables simulating 'NetLogo' experiments on remote high performance computing machines. In order to use this package, 'Java' and 'NetLogo' (>= 5.3.1) need to be available on the executing system.
Maintained by Sebastian Hanss. Last updated 6 months ago.
agent-based-modeling individual-based-modelling netlogo peer-reviewed
8.0 match 78 stars 8.86 score 195 scripts
evolecolgroup
tidysdm:Species Distribution Models with Tidymodels
Fit species distribution models (SDMs) using the 'tidymodels' framework, which provides a standardised interface to define models and process their outputs. 'tidysdm' expands 'tidymodels' by providing methods for spatial objects, models and metrics specific to SDMs, as well as a number of specialised functions to process occurrences for contemporary and palaeo datasets. The full functionalities of the package are described in Leonardi et al. (2023) <doi:10.1101/2023.07.24.550358>.
Maintained by Andrea Manica. Last updated 9 days ago.
species-distribution-modelling tidymodels
8.0 match 31 stars 8.82 score 51 scripts
myles-lewis
nestedcv:Nested Cross-Validation with 'glmnet' and 'caret'
Implements nested k*l-fold cross-validation for lasso and elastic-net regularised linear models via the 'glmnet' package and other machine learning models via the 'caret' package <doi:10.1093/bioadv/vbad048>. Cross-validation of 'glmnet' alpha mixing parameter and embedded fast filter functions for feature selection are provided. Described as double cross-validation by Stone (1977) <doi:10.1111/j.2517-6161.1977.tb01603.x>. Also implemented is a method using outer CV to measure unbiased model performance metrics when fitting Bayesian linear and logistic regression shrinkage models using the horseshoe prior over parameters to encourage a sparse model as described by Piironen & Vehtari (2017) <doi:10.1214/17-EJS1337SI>.
Maintained by Myles Lewis. Last updated 5 days ago.
8.7 match 12 stars 7.92 score 46 scripts
bioc
GRmetrics:Calculate growth-rate inhibition (GR) metrics
Functions for calculating and visualizing growth-rate inhibition (GR) metrics.
Maintained by Nicholas Clark. Last updated 5 months ago.
immunooncology cellbasedassays cellbiology software timecourse visualization
14.2 match 1 stars 4.83 score 17 scripts
diegommcc
SpatialDDLS:Deconvolution of Spatial Transcriptomics Data Based on Neural Networks
Deconvolution of spatial transcriptomics data based on neural networks and single-cell RNA-seq data. SpatialDDLS implements a workflow to create neural network models able to make accurate estimates of cell composition of spots from spatial transcriptomics data using deep learning and the meaningful information provided by single-cell RNA-seq data. See Torroja and Sanchez-Cabo (2019) <doi:10.3389/fgene.2019.00978> and Mañanes et al. (2024) <doi:10.1093/bioinformatics/btae072> to get an overview of the method and see some examples of its performance.
Maintained by Diego Mañanes. Last updated 5 months ago.
deconvolution deep-learning neural-network spatial-transcriptomics
13.6 match 5 stars 5.00 score 1 scripts
martinspindler
hdm:High-Dimensional Metrics
Implementation of selected high-dimensional statistical and econometric methods for estimation and inference. Efficient estimators and uniformly valid confidence intervals are provided for various low-dimensional causal/structural parameters which appear in high-dimensional approximately sparse models. Includes functions for fitting heteroscedastic robust Lasso regressions with non-Gaussian errors and for instrumental variable (IV) and treatment effect estimation in a high-dimensional setting. Moreover, the methods enable valid post-selection inference and rely on a theoretically grounded, data-driven choice of the penalty. Chernozhukov, Hansen, Spindler (2016) <arXiv:1603.01700>.
Maintained by Martin Spindler. Last updated 4 years ago.
8.3 match 14 stars 8.17 score 564 scripts 4 dependents
vegandevs
vegan:Community Ecology Package
Ordination methods, diversity analysis and other functions for community and vegetation ecologists.
Maintained by Jari Oksanen. Last updated 16 days ago.
ecological-modelling ecology ordination fortran openblas
3.5 match 472 stars 19.41 score 15k scripts 440 dependents
pgomba
MDPIexploreR:Web Scraping and Bibliometric Analysis of MDPI Journals
Provides comprehensive tools to scrape and analyze data from the MDPI journals. It allows users to extract metrics such as submission-to-acceptance times, article types, and whether articles are part of special issues. The package can also visualize this information through plots. Additionally, 'MDPIexploreR' offers tools to explore patterns of self-citations within articles and provides insights into guest-edited special issues.
Maintained by Pablo Gómez Barreiro. Last updated 4 months ago.
analysis data-analysis data-visualization mdpi metrics scientific-journals visualization web-scraping
10.5 match 20 stars 6.20 score 9 scripts
ltorgo
performanceEstimation:An Infra-Structure for Performance Estimation of Predictive Models
An infra-structure for estimating the predictive performance of predictive models. In this context, it can also be used to compare and/or select among different alternative ways of solving one or more predictive tasks. The main goal of the package is to provide a generic infra-structure to estimate the values of different metrics of predictive performance using different estimation procedures. These estimation tasks can be applied to any solutions (workflows) to the predictive tasks. The package provides easy to use standard workflows that allow the usage of any available R modeling algorithm together with some pre-defined data pre-processing steps and also prediction post-processing methods. It also provides means for addressing issues related to the statistical significance of the observed differences.
Maintained by Luis Torgo. Last updated 8 years ago.
10.9 match 16 stars 5.97 score 195 scripts 1 dependents
hiweller
colordistance:Distance Metrics for Image Color Similarity
Loads and displays images, selectively masks specified background colors, bins pixels by color using either data-dependent or automatically generated color bins, quantitatively measures color similarity among images using one of several distance metrics for comparing pixel color clusters, and clusters images by object color similarity. Uses CIELAB, RGB, or HSV color spaces. Originally written for use with organism coloration (reef fish color diversity, butterfly mimicry, etc), but easily applicable for any image set.
Maintained by Hannah Weller. Last updated 1 years ago.
8.2 match 37 stars 7.93 score 76 scripts 2 dependents
prophet:Automatic Forecasting Procedure
Implements a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.
Maintained by Sean Taylor. Last updated 5 months ago.
4.1 match 19k stars 15.53 score 976 scripts 13 dependents
business-science
modeltime:The Tidymodels Extension for Time Series Modeling
The time series forecasting framework for use with the 'tidymodels' ecosystem. Models include ARIMA, Exponential Smoothing, and additional time series models from the 'forecast' and 'prophet' packages. Refer to "Forecasting Principles & Practice, Second edition" (<https://otexts.com/fpp2/>). Refer to "Prophet: forecasting at scale" (<https://research.facebook.com/blog/2017/02/prophet-forecasting-at-scale/>.).
Maintained by Matt Dancho. Last updated 5 months ago.
arima data-science deep-learning ets forecasting machine-learning machine-learning-algorithms modeltime prophet tbats tidymodeling tidymodels time time-series time-series-analysis timeseries timeseries-forecasting
5.9 match 549 stars 10.57 score 1.1k scripts 7 dependents
bioc
cfDNAPro:cfDNAPro extracts and Visualises biological features from whole genome sequencing data of cell-free DNA
cfDNA fragments carry important features for building cancer sample classification ML models, such as fragment size and fragment end motifs. Analyzing and visualizing fragment size metrics, as well as other biological features, in a curated, standardized, scalable, well-documented, and reproducible way can be time-intensive. This package intends to resolve these problems and simplify the process. It offers two sets of functions for cfDNA feature characterization and visualization.
Maintained by Haichao Wang. Last updated 5 months ago.
visualizationsequencingwholegenomebioinformaticscancer-genomicscancer-researchcell-free-dnaearly-detectiongenomics-visualizationliquid-biopsyswgswhole-genome-sequencing
10.3 match 28 stars 6.04 score 13 scriptsbusiness-science
modeltime.resample:Resampling Tools for Time Series Forecasting
A 'modeltime' extension that implements forecast resampling tools that assess time-based model performance and stability for a single time series, panel data, and cross-sectional time series analysis.
Maintained by Matt Dancho. Last updated 1 years ago.
accuracy-metricsbacktestingbootstrapbootstrappingcross-validationforecastingmodeltimemodeltime-resampleresamplingstatisticstidymodelstime-series
9.3 match 19 stars 6.64 score 38 scripts 1 dependentsemf-creaf
indicspecies:Relationship Between Species and Groups of Sites
Functions to assess the strength and statistical significance of the relationship between species occurrence/abundance and groups of sites [De Caceres & Legendre (2009) <doi:10.1890/08-1823.1>]. Also includes functions to measure species niche breadth using resource categories [De Caceres et al. (2011) <doi:10.1111/J.1600-0706.2011.19679.x>].
Maintained by Miquel De Cáceres. Last updated 23 days ago.
6.5 match 10 stars 9.49 score 386 scripts 4 dependentsewenharrison
finalfit:Quickly Create Elegant Regression Results Tables and Plots when Modelling
Generate regression results tables and plots in final format for publication. Explore models and export directly to PDF and 'Word' using 'RMarkdown'.
Maintained by Ewen Harrison. Last updated 7 months ago.
5.3 match 270 stars 11.43 score 1.0k scriptsyanyachen
MLmetrics:Machine Learning Evaluation Metrics
A collection of evaluation metrics, including loss, score and utility functions, that measure regression, classification and ranking performance.
Maintained by Yachen Yan. Last updated 11 months ago.
5.5 match 69 stars 11.09 score 2.2k scripts 20 dependentsbioc
tidytof:Analyze High-dimensional Cytometry Data Using Tidy Data Principles
This package implements an interactive, scientific analysis pipeline for high-dimensional cytometry data built using tidy data principles. It is specifically designed to play well with both the tidyverse and Bioconductor software ecosystems, with functionality for reading/writing data files, data cleaning, preprocessing, clustering, visualization, modeling, and other quality-of-life functions. tidytof implements a "grammar" of high-dimensional cytometry data analysis.
Maintained by Timothy Keyes. Last updated 5 months ago.
singlecellflowcytometrybioinformaticscytometrydata-sciencesingle-celltidyversecpp
8.3 match 19 stars 7.26 score 35 scriptsfmestre1
lconnect:Simple Tools to Compute Landscape Connectivity Metrics
Provides functions to upload vectorial data and derive landscape connectivity metrics in habitat or matrix systems. Additionally, includes an approach to assess individual patch contribution to the overall landscape connectivity, enabling the prioritization of habitat patches. The computation of landscape connectivity and patch importance is very useful in Landscape Ecology research. The metrics available are: number of components, number of links, size of the largest component, mean size of components, class coincidence probability, landscape coincidence probability, characteristic path length, expected cluster size, area-weighted flux and integral index of connectivity. Pascual-Hortal, L., and Saura, S. (2006) <doi:10.1007/s10980-006-0013-z> Urban, D., and Keitt, T. (2001) <doi:10.2307/2679983> Laita, A., Kotiaho, J., Monkkonen, M. (2011) <doi:10.1007/s10980-011-9620-4>.
Maintained by Frederico Mestre. Last updated 1 years ago.
connectivityhabitat-connectivitylandscapemetricscpp
15.7 match 6 stars 3.78 score 3 scriptsbioc
singleCellTK:Comprehensive and Interactive Analysis of Single Cell RNA-Seq Data
The Single Cell Toolkit (SCTK) in the singleCellTK package provides an interface to popular tools for importing, quality control, analysis, and visualization of single cell RNA-seq data. SCTK allows users to seamlessly integrate tools from various packages at different stages of the analysis workflow. A general "a la carte" workflow gives users access to multiple methods for data importing, calculation of general QC metrics, doublet detection, ambient RNA estimation and removal, filtering, normalization, batch correction or integration, dimensionality reduction, 2-D embedding, clustering, marker detection, differential expression, cell type labeling, pathway analysis, and data exporting. Curated workflows can be used to run Seurat and Celda. Streamlined quality control can be performed on the command line using the SCTK-QC pipeline. Users can analyze their data using commands in the R console or by using an interactive Shiny Graphical User Interface (GUI). Specific analyses or entire workflows can be summarized and shared with comprehensive HTML reports generated by Rmarkdown. Additional documentation and vignettes can be found at camplab.net/sctk.
Maintained by Joshua David Campbell. Last updated 23 days ago.
singlecellgeneexpressiondifferentialexpressionalignmentclusteringimmunooncologybatcheffectnormalizationqualitycontroldataimportgui
5.8 match 181 stars 10.16 score 252 scriptsbioc
EpiCompare:Comparison, Benchmarking & QC of Epigenomic Datasets
EpiCompare is used to compare and analyse epigenetic datasets for quality control and benchmarking purposes. The package outputs an HTML report consisting of three sections: (1. General metrics) Metrics on peaks (percentage of blacklisted and non-standard peaks, and peak widths) and fragments (duplication rate) of samples; (2. Peak overlap) Percentage and statistical significance of overlapping and non-overlapping peaks, including an upset plot; and (3. Functional annotation) Functional annotation (ChromHMM, ChIPseeker and enrichment analysis) of peaks, including peak enrichment around TSS.
Maintained by Hiranyamaya Dash. Last updated 29 days ago.
epigeneticsgeneticsqualitycontrolchipseqmultiplecomparisonfunctionalgenomicsatacseqdnaseseqbenchmarkbenchmarkingbioconductorbioconductor-packagecomparisonhtmlinteractive-reporting
7.8 match 14 stars 7.54 score 46 scriptslaperez
Clustering:Techniques for Evaluating Clustering
The design of this package allows us to run different clustering packages and compare the results between them, to determine which algorithm behaves best on the data provided. See Martos, L.A.P., García-Vico, Á.M., González, P. et al. (2023) <doi:10.1007/s13748-022-00294-2> "Clustering: an R library to facilitate the analysis and comparison of cluster algorithms.", Martos, L.A.P., García-Vico, Á.M., González, P. et al. "A Multiclustering Evolutionary Hyperrectangle-Based Algorithm" <doi:10.1007/s44196-023-00341-3> and Martos, L.A.P., García-Vico, Á.M., González, P. et al. "An Evolutionary Fuzzy System for Multiclustering in Data Streaming" <doi:10.1016/j.procs.2023.12.058>.
Maintained by Luis Alfonso Perez Martos. Last updated 11 months ago.
14.3 match 5 stars 4.04 score 7 scriptsgabferreira
phyloraster:Evolutionary Diversity Metrics for Raster Data
Phylogenetic Diversity (PD, Faith 1992), Evolutionary Distinctiveness (ED, Isaac et al. 2007), Phylogenetic Endemism (PE, Rosauer et al. 2009; Laffan et al. 2016), and Weighted Endemism (WE, Laffan et al. 2016) for presence-absence raster. Faith, D. P. (1992) <doi:10.1016/0006-3207(92)91201-3> Isaac, N. J. et al. (2007) <doi:10.1371/journal.pone.0000296> Laffan, S. W. et al. (2016) <doi:10.1111/2041-210X.12513> Rosauer, D. et al. (2009) <doi:10.1111/j.1365-294X.2009.04311.x>.
Maintained by Gabriela Alves-Ferreira. Last updated 15 days ago.
10.1 match 7 stars 5.66 score 33 scriptsjlmelville
rnndescent:Nearest Neighbor Descent Method for Approximate Nearest Neighbors
The Nearest Neighbor Descent method for finding approximate nearest neighbors by Dong and co-workers (2010) <doi:10.1145/1963405.1963487>. Based on the 'Python' package 'PyNNDescent' <https://github.com/lmcinnes/pynndescent>.
Maintained by James Melville. Last updated 8 months ago.
approximate-nearest-neighbor-searchcpp
7.8 match 11 stars 7.31 score 75 scriptssapfluxnet
sapfluxnetr:Working with 'Sapfluxnet' Project Data
Access, modify, aggregate and plot data from the 'Sapfluxnet' project (<http://sapfluxnet.creaf.cat>), the first global database of sap flow measurements.
Maintained by Victor Granda. Last updated 2 years ago.
8.7 match 25 stars 6.57 score 49 scriptsherveabdi
DistatisR:DiSTATIS Three Way Metric Multidimensional Scaling
Implement DiSTATIS and CovSTATIS (three-way multidimensional scaling). DiSTATIS and CovSTATIS are used to analyze multiple distance/covariance matrices collected on the same set of observations. These methods are based on Abdi, H., Williams, L.J., Valentin, D., & Bennani-Dosse, M. (2012) <doi:10.1002/wics.198>.
Maintained by Herve Abdi. Last updated 1 years ago.
3-way-mdsdistatismetric-multidimensional-scaling
12.9 match 4 stars 4.42 score 44 scriptsnimble-dev
compareMCMCs:Compare MCMC Efficiency from 'nimble' and/or Other MCMC Engines
Manages comparison of MCMC performance metrics from multiple MCMC algorithms. These may come from different MCMC configurations using the 'nimble' package or from other packages. Plug-ins for JAGS via 'rjags' and Stan via 'rstan' are provided. It is possible to write plug-ins for other packages. Performance metrics are held in an MCMCresult class along with samples and timing data. It is easy to apply new performance metrics. Reports are generated as html pages with figures comparing sets of runs. It is possible to configure the html pages, including providing new figure components.
Maintained by Perry de Valpine. Last updated 5 months ago.
12.1 match 1 stars 4.71 score 17 scriptscran
datarobot:'DataRobot' Predictive Modeling API
For working with the 'DataRobot' predictive modeling platform's API <https://www.datarobot.com/>.
Maintained by AJ Alon. Last updated 1 years ago.
16.4 match 2 stars 3.48 scoreflr
FLCore:Core Package of FLR, Fisheries Modelling in R
Core classes and methods for FLR, a framework for fisheries modelling and management strategy simulation in R. Developed by a team of fisheries scientists in various countries. More information can be found at <http://flr-project.org/>.
Maintained by Iago Mosqueira. Last updated 9 days ago.
fisheriesflrfisheries-modelling
6.5 match 16 stars 8.78 score 956 scripts 23 dependentsrstudio
vetiver:Version, Share, Deploy, and Monitor Models
The goal of 'vetiver' is to provide fluent tooling to version, share, deploy, and monitor a trained model. Functions handle both recording and checking the model's input data prototype, and predicting from a remote API endpoint. The 'vetiver' package is extensible, with generics that can support many kinds of models.
Maintained by Julia Silge. Last updated 5 months ago.
5.4 match 185 stars 10.48 score 466 scripts 1 dependentsthijsjanzen
treestats:Phylogenetic Tree Statistics
Collection of phylogenetic tree statistics, collected throughout the literature. All functions have been written to maximize computation speed. The package includes umbrella functions to calculate all statistics, all balance associated statistics, or all branching time related statistics. Furthermore, the 'treestats' package supports summary statistic calculations on Ltables, provides speed-improved coding of branching times, Ltable conversion and includes algorithms to create intermediately balanced trees. Full description can be found in Janzen (2024) <doi:10.1016/j.ympev.2024.108168>.
Maintained by Thijs Janzen. Last updated 6 months ago.
10.4 match 16 stars 5.43 score 16 scripts 1 dependentsbioc
struct:Statistics in R Using Class-based Templates
Defines and includes a set of class-based templates for developing and implementing data processing and analysis workflows, with a strong emphasis on statistics and machine learning. The templates can be used and where needed extended to 'wrap' tools and methods from other packages into a common standardised structure to allow for effective and fast integration. Model objects can be combined into sequences, and sequences nested in iterators using overloaded operators to simplify and improve readability of the code. Ontology lookup has been integrated and implemented to provide standardised definitions for methods, inputs and outputs wrapped using the class-based templates.
Maintained by Gavin Rhys Lloyd. Last updated 5 months ago.
9.3 match 6.04 score 76 scripts 3 dependentsbioc
musicatk:Mutational Signature Comprehensive Analysis Toolkit
Mutational signatures are carcinogenic exposures or aberrant cellular processes that can cause alterations to the genome. We created musicatk (MUtational SIgnature Comprehensive Analysis ToolKit) to address shortcomings in versatility and ease of use in other pre-existing computational tools. Although many different types of mutational data have been generated, current software packages do not have a flexible framework to allow users to mix and match different types of mutations in the mutational signature inference process. Musicatk enables users to count and combine multiple mutation types, including SBS, DBS, and indels. Musicatk calculates replication strand, transcription strand and combinations of these features along with discovery from unique and proprietary genomic feature associated with any mutation type. Musicatk also implements several methods for discovery of new signatures as well as methods to infer exposure given an existing set of signatures. Musicatk provides functions for visualization and downstream exploratory analysis including the ability to compare signatures between cohorts and find matching signatures in COSMIC V2 or COSMIC V3.
Maintained by Joshua D. Campbell. Last updated 5 months ago.
softwarebiologicalquestionsomaticmutationvariantannotation
7.8 match 13 stars 7.02 score 20 scriptsegenn
rtemis:Machine Learning and Visualization
Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.
Maintained by E.D. Gennatas. Last updated 1 months ago.
data-sciencedata-visualizationmachine-learningmachine-learning-libraryvisualization
7.6 match 145 stars 7.09 score 50 scripts 2 dependentsalexwaterboybezzina
CalcThemAll.PRM:Calculate Pesticide Risk Metric (PRM) Values from Multiple Pesticides...Calc Them All
Contains functions which can be used to calculate Pesticide Risk Metric values in aquatic environments from concentrations of multiple pesticides with known species sensitive distributions (SSDs). Pesticides provided by this package have all been validated; however, if users have their own pesticides with SSD values, they can append them to the pesticide_info table to include them in estimates.
Maintained by Alexander Bezzina. Last updated 11 months ago.
11.2 match 2 stars 4.78 scorefsavje
distances:Tools for Distance Metrics
Provides tools for constructing, manipulating and using distance metrics.
Maintained by Fredrik Savje. Last updated 1 years ago.
7.7 match 17 stars 6.92 score 117 scripts 12 dependentsjlmelville
mize:Unconstrained Numerical Optimization Algorithms
Optimization algorithms implemented in R, including conjugate gradient (CG), Broyden-Fletcher-Goldfarb-Shanno (BFGS) and the limited memory BFGS (L-BFGS) methods. Most internal parameters can be set through the call interface. The solvers hold up quite well for higher-dimensional problems.
Maintained by James Melville. Last updated 2 months ago.
conjugate-gradientl-bfgsnumerical-optimization
7.6 match 10 stars 6.95 score 25 scripts 6 dependentsdavidbolin
rSPDE:Rational Approximations of Fractional Stochastic Partial Differential Equations
Functions that compute rational approximations of fractional elliptic stochastic partial differential equations. The package also contains functions for common statistical usage of these approximations. The main references for rSPDE are Bolin, Simas and Xiong (2023) <doi:10.1080/10618600.2023.2231051> for the covariance-based method and Bolin and Kirchner (2020) <doi:10.1080/10618600.2019.1665537> for the operator-based rational approximation. These can be generated by the citation function in R.
Maintained by David Bolin. Last updated 8 days ago.
6.9 match 11 stars 7.57 score 188 scripts 3 dependentsatkinsjeff
forestr:Ecosystem and Canopy Structural Complexity Metrics from LiDAR
Provides a toolkit for calculating forest and canopy structural complexity metrics from terrestrial LiDAR (light detection and ranging). References: Atkins et al. 2018 <doi:10.1111/2041-210X.13061>; Hardiman et al. 2013 <doi:10.3390/f4030537>; Parker et al. 2004 <doi:10.1111/j.0021-8901.2004.00925.x>.
Maintained by Jeff Atkins. Last updated 1 years ago.
10.4 match 29 stars 4.95 score 31 scriptsandrewdhawan
sigQC:Quality Control Metrics for Gene Signatures
Provides gene signature quality control metrics in publication ready plots. Namely, enables the visualization of properties such as expression, variability, correlation, and comparison of methods of standardisation and scoring metrics.
Maintained by Andrew Dhawan. Last updated 7 months ago.
10.6 match 4 stars 4.89 score 13 scriptscdowd
twosamples:Fast Permutation Based Two Sample Tests
Fast randomization based two sample tests. Testing the hypothesis that two samples come from the same distribution using randomization to create p-values. Included tests are: Kolmogorov-Smirnov, Kuiper, Cramer-von Mises, Anderson-Darling, Wasserstein, and DTS. The default test (two_sample) is based on the DTS test statistic, as it is the most powerful, and thus most useful to most users. The DTS test statistic builds on the Wasserstein distance by using a weighting scheme like that of Anderson-Darling. See the companion paper at <arXiv:2007.01360> or <https://codowd.com/public/DTS.pdf> for details of that test statistic, and non-standard uses of the package (parallel for big N, weighted observations, one sample tests, etc). We also include the permutation scheme to make test building simple for others.
Maintained by Connor Dowd. Last updated 2 years ago.
7.5 match 17 stars 6.88 score 62 scripts 8 dependentstidymodels
tune:Tidy Tuning Tools
The ability to tune models is important. 'tune' contains functions and classes to be used in conjunction with other 'tidymodels' packages for finding reasonable values of hyper-parameters in models, pre-processing methods, and post-processing steps.
Maintained by Max Kuhn. Last updated 11 days ago.
3.6 match 293 stars 14.27 score 756 scripts 39 dependentsropensci
spatsoc:Group Animal Relocation Data by Spatial and Temporal Relationship
Detects spatial and temporal groups in GPS relocations (Robitaille et al. (2019) <doi:10.1111/2041-210X.13215>). It can be used to convert GPS relocations to gambit-of-the-group format to build proximity-based social networks. In addition, the randomizations function provides data-stream randomization methods suitable for GPS data.
Maintained by Alec L. Robitaille. Last updated 1 months ago.
5.1 match 24 stars 9.97 score 145 scripts 3 dependentscran
ncappc:NCA Calculations and Population Model Diagnosis
A flexible tool that can perform (i) traditional non-compartmental analysis (NCA) and (ii) Simulation-based posterior predictive checks for population pharmacokinetic (PK) and/or pharmacodynamic (PKPD) models using NCA metrics.
Maintained by Andrew C. Hooker. Last updated 7 years ago.
18.8 match 2.70 scorebioc
MicrobiotaProcess:A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework
MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces the MPSE class, which makes it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under a unified and common framework (tidy-like framework).
Maintained by Shuangbin Xu. Last updated 5 months ago.
visualizationmicrobiomesoftwaremultiplecomparisonfeatureextractionmicrobiome-analysismicrobiome-data
5.2 match 183 stars 9.70 score 126 scripts 1 dependentsr-lib
systemfonts:System Native Font Finding
Provides system native access to the font catalogue. As font handling varies between systems it is difficult to correctly locate installed fonts across different operating systems. The 'systemfonts' package provides bindings to the native libraries on Windows, macOS and Linux for finding font files that can then be used further by e.g. graphic devices. The main use is intended to be from compiled code but 'systemfonts' also provides access from R.
Maintained by Thomas Lin Pedersen. Last updated 2 months ago.
3.3 match 95 stars 15.62 score 384 scripts 990 dependentskozodoi
fairness:Algorithmic Fairness Metrics
Offers calculation, visualization and comparison of algorithmic fairness metrics. Fair machine learning is an emerging topic with the overarching aim to critically assess whether ML algorithms reinforce existing social biases. Unfair algorithms can propagate such biases and produce predictions with a disparate impact on various sensitive groups of individuals (defined by sex, gender, ethnicity, religion, income, socioeconomic status, physical or mental disabilities). Fair algorithms possess the underlying foundation that these groups should be treated similarly or have similar prediction outcomes. The fairness R package offers the calculation and comparisons of commonly and less commonly used fairness metrics in population subgroups. These methods are described by Calders and Verwer (2010) <doi:10.1007/s10618-010-0190-x>, Chouldechova (2017) <doi:10.1089/big.2016.0047>, Feldman et al. (2015) <doi:10.1145/2783258.2783311> , Friedler et al. (2018) <doi:10.1145/3287560.3287589> and Zafar et al. (2017) <doi:10.1145/3038912.3052660>. The package also offers convenient visualizations to help understand fairness metrics.
Maintained by Nikita Kozodoi. Last updated 2 years ago.
algorithmic-discriminationalgorithmic-fairnessdiscriminationdisparate-impactfairnessfairness-aifairness-mlmachine-learning
7.4 match 32 stars 6.82 score 69 scripts 1 dependentsbioc
maftools:Summarize, Analyze and Visualize MAF Files
Analyze and visualize Mutation Annotation Format (MAF) files from large scale sequencing studies. This package provides various functions to perform the most commonly used analyses in cancer genomics and to create feature-rich customizable visualizations with minimal effort.
Maintained by Anand Mayakonda. Last updated 5 months ago.
datarepresentationdnaseqvisualizationdrivermutationvariantannotationfeatureextractionclassificationsomaticmutationsequencingfunctionalgenomicssurvivalbioinformaticscancer-genome-atlascancer-genomicsgenomicsmaf-filestcgacurlbzip2xz-utilszlib
3.5 match 459 stars 14.63 score 948 scripts 18 dependentsdicook
nullabor:Tools for Graphical Inference
Tools for visual inference. Generate null data sets and null plots using permutation and simulation. Calculate distance metrics for a lineup, and examine the distributions of metrics.
Maintained by Di Cook. Last updated 1 months ago.
4.9 match 57 stars 10.38 score 370 scripts 2 dependentskapsner
mlexperiments:Machine Learning Experiments
Provides 'R6' objects to perform parallelized hyperparameter optimization and cross-validation. Hyperparameter optimization can be performed with Bayesian optimization (via 'ParBayesianOptimization' <https://cran.r-project.org/package=ParBayesianOptimization>) and grid search. The optimized hyperparameters can be validated using k-fold cross-validation. Alternatively, hyperparameter optimization and validation can be performed with nested cross-validation. While 'mlexperiments' focuses on core wrappers for machine learning experiments, additional learner algorithms can be supplemented by inheriting from the provided learner base class.
Maintained by Lorenz A. Kapsner. Last updated 11 days ago.
cross-validationexperimenthyperparameter-optimizationhyperparameter-tuningmachine-learningnested
6.6 match 5 stars 7.64 score 49 scripts 2 dependentstbep-tech
tbeptools:Data and Indicators for the Tampa Bay Estuary Program
Several functions are provided for working with Tampa Bay Estuary Program data and indicators, including the water quality report card, tidal creek assessments, Tampa Bay Nekton Index, Tampa Bay Benthic Index, seagrass transect data, habitat report card, and fecal indicator bacteria. Additional functions are provided for miscellaneous tasks, such as reference library curation.
Maintained by Marcus Beck. Last updated 8 days ago.
data-analysistampa-baytbepwater-quality
6.4 match 10 stars 7.86 score 133 scriptsidblr
ndi:Neighborhood Deprivation Indices
Computes various geospatial indices of socioeconomic deprivation and disparity in the United States. Some indices are considered "spatial" because they consider the values of neighboring (i.e., adjacent) census geographies in their computation, while other indices are "aspatial" because they only consider the value within each census geography. Two types of aspatial neighborhood deprivation indices (NDI) are available: (1) based on Messer et al. (2006) <doi:10.1007/s11524-006-9094-x> and (2) based on Andrews et al. (2020) <doi:10.1080/17445647.2020.1750066> and Slotman et al. (2022) <doi:10.1016/j.dib.2022.108002> who use variables chosen by Roux and Mair (2010) <doi:10.1111/j.1749-6632.2009.05333.x>. Both are a decomposition of multiple demographic characteristics from the U.S. Census Bureau American Community Survey 5-year estimates (ACS-5; 2006-2010 onward). Using data from the ACS-5 (2005-2009 onward), the package can also compute indices of racial or ethnic residential segregation, including but not limited to those discussed in Massey & Denton (1988) <doi:10.1093/sf/67.2.281>, and additional indices of socioeconomic disparity.
Maintained by Ian D. Buller. Last updated 7 months ago.
censuscensus-apicensus-datadeprivationdeprivation-statsdisparitygeospatialgeospatial-datametric-developmentprincipal-component-analysissegregation-measuressocio-economic-indicators
7.5 match 21 stars 6.67 score 7 scripts 1 dependentsbioc
pipeComp:pipeComp pipeline benchmarking framework
A simple framework to facilitate the comparison of pipelines involving various steps and parameters. The `pipelineDefinition` class represents pipelines as, minimally, a set of functions consecutively executed on the output of the previous one, and optionally accompanied by step-wise evaluation and aggregation functions. Given such an object, a set of alternative parameters/methods, and benchmark datasets, the `runPipeline` function then proceeds through all combinations of arguments, avoiding recomputing the same step twice and compiling evaluations on the fly to avoid storing potentially large intermediate data.
Maintained by Pierre-Luc Germain. Last updated 5 months ago.
geneexpressiontranscriptomicsclusteringdatarepresentationbenchmarkbioconductorpipeline-benchmarkingpipelinessingle-cell-rna-seq
7.1 match 41 stars 7.02 score 43 scriptsdavid-cortes
recometrics:Evaluation Metrics for Implicit-Feedback Recommender Systems
Calculates evaluation metrics for implicit-feedback recommender systems that are based on low-rank matrix factorization models, given the fitted model matrices and data, thus allowing to compare models from a variety of libraries. Metrics include P@K (precision-at-k, for top-K recommendations), R@K (recall at k), AP@K (average precision at k), NDCG@K (normalized discounted cumulative gain at k), Hit@K (from which the 'Hit Rate' is calculated), RR@K (reciprocal rank at k, from which the 'MRR' or 'mean reciprocal rank' is calculated), ROC-AUC (area under the receiver-operating characteristic curve), and PR-AUC (area under the precision-recall curve). These are calculated on a per-user basis according to the ranking of items induced by the model, using efficient multi-threaded routines. Also provides functions for creating train-test splits for model fitting and evaluation.
Maintained by David Cortes. Last updated 2 months ago.
implicit-feedbackmatrix-factorizationrecommender-systemsopenblascppopenmp
9.1 match 28 stars 5.45 scorebioc
beadarray:Quality assessment and low-level analysis for Illumina BeadArray data
The package is able to read bead-level data (raw TIFFs and text files) output by BeadScan as well as bead-summary data from BeadStudio. Methods for quality assessment and low-level analysis are provided.
Maintained by Mark Dunning. Last updated 5 months ago.
microarrayonechannelqualitycontrolpreprocessing
6.3 match 7.88 score 70 scripts 4 dependentsneptune-ai
neptune:MLOps Metadata Store - Experiment Tracking and Model Registry for Production Teams
An interface to Neptune. A metadata store for MLOps, built for teams that run a lot of experiments. It gives you a single place to log, store, display, organize, compare, and query all your model-building metadata. Neptune is used for: • Experiment tracking: Log, display, organize, and compare ML experiments in a single place. • Model registry: Version, store, manage, and query trained models, and model building metadata. • Monitoring ML runs live: Record and monitor model training, evaluation, or production runs live. For more information see <https://neptune.ai/>.
Maintained by Rafal Jankowski. Last updated 2 years ago.
comparelanguagelogmanagementmetadatametricsmlopsmodelsmonitoringorganizeparametersstoretrackervisualization
10.0 match 14 stars 4.89 score 16 scriptsmandymejia
ciftiTools:Tools for Reading, Writing, Viewing and Manipulating CIFTI Files
CIFTI files contain brain imaging data in "grayordinates," which represent the gray matter as cortical surface vertices (left and right) and subcortical voxels (cerebellum, basal ganglia, and other deep gray matter). 'ciftiTools' provides a unified environment for reading, writing, visualizing and manipulating CIFTI-format data. It supports the "dscalar," "dlabel," and "dtseries" intents. Grayordinate data is read in as a "xifti" object, which is structured for convenient access to the data and metadata, and includes support for surface geometry files to enable spatially-dependent functionality such as static or interactive visualizations and smoothing.
Maintained by Amanda Mejia. Last updated 1 months ago.
5.4 match 47 stars 8.90 score 176 scripts 4 dependentsemeyers
NeuroDecodeR:Decode Information from Neural Activity
Neural decoding is a method of analyzing neural data that uses pattern classifiers to predict experimental conditions based on neural activity. 'NeuroDecodeR' is a system of objects that makes it easy to run neural decoding analyses. For more information on neural decoding see Meyers & Kreiman (2011) <doi:10.7551/mitpress/8404.003.0024>.
Maintained by Ethan Meyers. Last updated 1 years ago.
7.4 match 12 stars 6.49 score 17 scriptsbioc
scPipe:Pipeline for single cell multi-omic data pre-processing
A preprocessing pipeline for single cell RNA-seq/ATAC-seq data that starts from the fastq files and produces a feature count matrix with associated quality control information. It can process fastq data generated by CEL-seq, MARS-seq, Drop-seq, Chromium 10x and SMART-seq protocols.
Maintained by Shian Su. Last updated 3 months ago.
immunooncologysoftwaresequencingrnaseqgeneexpressionsinglecellvisualizationsequencematchingpreprocessingqualitycontrolgenomeannotationdataimportcurlbzip2xz-utilszlibcpp
5.3 match 68 stars 9.02 score 84 scriptstbep-tech
wqtrends:Assess Water Quality Trends with Generalized Additive Models
Assess Water Quality Trends for Long-Term Monitoring Data in Estuaries using Generalized Additive Models following Wood (2017) <doi:10.1201/9781315370279> and Error Propagation with Mixed-Effects Meta-Analysis following Sera et al. (2019) <doi:10.1002/sim.8362>. Methods are available for model fitting, assessment of fit, annual and seasonal trend tests, and visualization of results.
Maintained by Marcus Beck. Last updated 5 days ago.
reportingsan-francisco-baytime-series-analysiswater-quality
8.8 match 10 stars 5.38 score 24 scripts
dvrbts
labdsv:Ordination and Multivariate Analysis for Ecology
A variety of ordination and community analyses useful in analysis of data sets in community ecology. Includes many of the common ordination methods, with graphical routines to facilitate their interpretation, as well as several novel analyses.
Maintained by David W. Roberts. Last updated 2 years ago.
7.8 match 3 stars 6.08 score 452 scripts 13 dependents
spsanderson
healthyR.ai:The Machine Learning and AI Modeling Companion to 'healthyR'
Hospital machine learning and AI data analysis workflow tools, modeling, and automations. This library provides many useful tools to review common administrative hospital data. Some of these include predicting length of stay and readmits. The aim is to provide a simple and consistent verb framework that takes the guesswork out of everything.
Maintained by Steven Sanderson. Last updated 2 months ago.
aiartificial-intelligencehealthcareanalyticshealthyrhealthyversemachine-learning
6.4 match 16 stars 7.37 score 36 scripts 1 dependents
alarm-redist
redistmetrics:Redistricting Metrics
Reliable and flexible tools for scoring redistricting plans using common measures and metrics. These functions provide key direct access to tools useful for non-simulation analyses of redistricting plans, such as for measuring compactness or partisan fairness. Tools are designed to work with the 'redist' package seamlessly.
Maintained by Christopher T. Kenny. Last updated 9 months ago.
6.1 match 10 stars 7.57 score 23 scripts 2 dependents
bioc
miQC:Flexible, probabilistic metrics for quality control of scRNA-seq data
Single-cell RNA-sequencing (scRNA-seq) has made it possible to profile gene expression in tissues at high resolution. An important preprocessing step prior to performing downstream analyses is to identify and remove cells with poor or degraded sample quality using quality control (QC) metrics. Two widely used QC metrics to identify a ‘low-quality’ cell are (i) if the cell includes a high proportion of reads that map to mitochondrial DNA encoded genes (mtDNA) and (ii) if a small number of genes are detected. miQC is a data-driven QC metric that jointly models both the proportion of reads mapping to mtDNA and the number of detected genes with mixture models in a probabilistic framework to predict the low-quality cells in a given dataset.
Maintained by Ariel Hippen. Last updated 5 months ago.
singlecellqualitycontrolgeneexpressionpreprocessingsequencing
7.1 match 19 stars 6.39 score 65 scripts
sonsoleslp
tna:Transition Network Analysis (TNA)
Provides tools for performing Transition Network Analysis (TNA) to study relational dynamics, including functions for building and plotting TNA models, calculating centrality measures, and identifying dominant events and patterns. TNA statistical techniques (e.g., bootstrapping and permutation tests) ensure the reliability of observed insights and confirm that identified dynamics are meaningful. See (Saqr et al., 2025) <doi:10.1145/3706468.3706513> for more details on TNA.
Maintained by Sonsoles López-Pernas. Last updated 2 days ago.
educational-data-mininglearning-analyticsmarkov-modeltemporal-analysis
7.0 match 4 stars 6.48 score 5 scripts
brian-j-smith
MRMCaov:Multi-Reader Multi-Case Analysis of Variance
Estimation and comparison of the performances of diagnostic tests in multi-reader multi-case studies where true case statuses (or ground truths) are known and one or more readers provide test ratings for multiple cases. Reader performance metrics are provided for area under and expected utility of ROC curves, likelihood ratio of positive or negative tests, and sensitivity and specificity. ROC curves can be estimated empirically or with binormal or binormal likelihood-ratio models. Statistical comparisons of diagnostic tests are based on the ANOVA model of Obuchowski-Rockette and the unified framework of Hillis (2005) <doi:10.1002/sim.2024>. The ANOVA can be conducted with data from a full factorial, nested, or partially paired study design; with random or fixed readers or cases; and covariances estimated with the DeLong method, jackknifing, or an unbiased method. Smith and Hillis (2020) <doi:10.1117/12.2549075>.
Maintained by Brian J Smith. Last updated 2 years ago.
8.4 match 12 stars 5.26 score 8 scripts 1 dependents
oeysan
bfw:Bayesian Framework for Computational Modeling
Derived from the work of Kruschke (2015, <ISBN:9780124058880>), the present package aims to provide a framework for conducting Bayesian analysis using Markov chain Monte Carlo (MCMC) sampling utilizing the Just Another Gibbs Sampler ('JAGS', Plummer, 2003, <https://mcmc-jags.sourceforge.io>). The initial version includes several modules for conducting Bayesian equivalents of chi-squared tests, analysis of variance (ANOVA), multiple (hierarchical) regression, softmax regression, and for fitting data (e.g., structural equation modeling).
Maintained by Øystein Olav Skaar. Last updated 3 years ago.
bayesian-data-analysisbayesian-statisticsjagsmcmcpsychological-sciencecpp
7.5 match 10 stars 5.89 score 31 scripts
paws-r
paws:Amazon Web Services Software Development Kit
Interface to Amazon Web Services <https://aws.amazon.com>, including storage, database, and compute services, such as 'Simple Storage Service' ('S3'), 'DynamoDB' 'NoSQL' database, and 'Lambda' functions-as-a-service.
Maintained by Dyfan Jones. Last updated 3 days ago.
3.9 match 332 stars 11.25 score 177 scripts 12 dependents
thibautjombart
treespace:Statistical Exploration of Landscapes of Phylogenetic Trees
Tools for the exploration of distributions of phylogenetic trees. This package includes a 'shiny' interface which can be started from R using treespaceServer(). For further details see Jombart et al. (2017) <DOI:10.1111/1755-0998.12676>.
Maintained by Michelle Kendall. Last updated 2 years ago.
5.8 match 28 stars 7.39 score 63 scripts
jefferislab
RANN:Fast Nearest Neighbour Search (Wraps ANN Library) Using L2 Metric
Finds the k nearest neighbours for every point in a given dataset in O(N log N) time using Arya and Mount's ANN library (v1.1.3). There is support for approximate as well as exact searches, fixed radius searches and 'bd' as well as 'kd' trees. The distance is computed using the L2 (Euclidean) metric. Please see package 'RANN.L1' for the same functionality using the L1 (Manhattan, taxicab) metric.
Maintained by Gregory Jefferis. Last updated 7 months ago.
ann-librarynearest-neighborsnearest-neighbourscpp
3.5 match 58 stars 12.21 score 1.3k scripts 190 dependents
ropensci
EDIutils:An API Client for the Environmental Data Initiative Repository
A client for the Environmental Data Initiative repository REST API. The 'EDI' data repository <https://portal.edirepository.org/nis/home.jsp> is for publication and reuse of ecological data with emphasis on metadata accuracy and completeness. It is built upon the 'PASTA+' software stack <https://pastaplus-core.readthedocs.io/en/latest/index.html#> and was developed in collaboration with the US 'LTER' Network <https://lternet.edu/>. 'EDIutils' includes functions to search and access existing data, evaluate and upload new data, and assist other data management tasks common to repository users.
Maintained by Colin Smith. Last updated 1 years ago.
ecologyeml-metadataopen-accessopen-dataresearch-data-managementresearch-data-repository
6.7 match 10 stars 6.47 score 117 scripts
rstudio
tfestimators:Interface to 'TensorFlow' Estimators
Interface to 'TensorFlow' Estimators <https://www.tensorflow.org/guide/estimator>, a high-level API that provides implementations of many different model types including linear models and deep neural networks.
Maintained by Tomasz Kalinowski. Last updated 3 years ago.
5.1 match 57 stars 8.42 score 170 scripts
tidymodels
probably:Tools for Post-Processing Predicted Values
Models can be improved by post-processing class probabilities, by: recalibration, conversion to hard probabilities, assessment of equivocal zones, and other activities. 'probably' contains tools for conducting these operations as well as calibration tools and conformal inference techniques for regression models.
Maintained by Max Kuhn. Last updated 5 months ago.
3.5 match 115 stars 12.09 score 21k scripts 1 dependents
eagerai
fastai:Interface to 'fastai'
The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research into deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.
Maintained by Turgut Abdullayev. Last updated 11 months ago.
audiocollaborative-filteringdarknetdarknet-image-classificationfastaimedicalobject-detectiontabulartextvision
4.5 match 118 stars 9.40 score 76 scripts
momx
Momocs:Morphometrics using R
The goal of 'Momocs' is to provide a complete, convenient, reproducible and open-source toolkit for 2D morphometrics. It includes most common 2D morphometrics approaches on outlines, open outlines, configurations of landmarks, traditional morphometrics, and facilities for data preparation, manipulation and visualization with a consistent grammar throughout. It allows reproducible, complex morphometrics analyses, and other morphometrics approaches should be easy to plug in, or develop from, on top of this canvas.
Maintained by Vincent Bonhomme. Last updated 1 years ago.
5.7 match 51 stars 7.42 score 346 scripts
andrewmarx
samc:Spatial Absorbing Markov Chains
Implements functions for working with absorbing Markov chains. The implementation is based on the framework described in "Toward a unified framework for connectivity that disentangles movement and mortality in space and time" by Fletcher et al. (2019) <doi:10.1111/ele.13333>, which applies them to spatial ecology. This framework incorporates both resistance and absorption with spatial absorbing Markov chains (SAMC) to provide several short-term and long-term predictions for metrics related to connectivity in landscapes. Despite the ecological context of the framework, this package can be used in any application of absorbing Markov chains.
Maintained by Andrew Marx. Last updated 5 months ago.
absorbing-markov-chainsconnectivitylandscape-ecologylandscape-metricsmarkov-chaincpp
8.0 match 12 stars 5.26 score 15 scripts
nowosad
raceland:Pattern-Based Zoneless Method for Analysis and Visualization of Racial Topography
Implements a computational framework for a pattern-based, zoneless analysis, and visualization of (ethno)racial topography (Dmowska, Stepinski, and Nowosad (2020) <doi:10.1016/j.apgeog.2020.102239>). It is a reimagined approach for analyzing residential segregation and racial diversity based on the concept of 'landscape' used in the domain of landscape ecology.
Maintained by Jakub Nowosad. Last updated 2 years ago.
information-theorylandscaperacial-diversityrasterresidential-segregationspatialcpp
8.0 match 9 stars 5.21 score 12 scripts
nanxstats
RECA:Relevant Component Analysis for Supervised Distance Metric Learning
Relevant Component Analysis (RCA) tries to find a linear transformation of the feature space such that the effect of irrelevant variability is reduced in the transformed space.
Maintained by Nan Xiao. Last updated 11 months ago.
machine-learningmetric-learning
10.4 match 7 stars 4.02 score 4 scripts
tidymodels
tidyclust:A Common API to Clustering
A common interface for specifying clustering models, in the same style as 'parsnip'. Creates a unified interface across different functions and computational engines.
Maintained by Emil Hvitfeldt. Last updated 2 months ago.
5.5 match 111 stars 7.45 score 139 scripts
bioc
sosta:A package for the analysis of anatomical tissue structures in spatial omics data
sosta (Spatial Omics STructure Analysis) is a package for analyzing spatial omics data to explore tissue organization at the anatomical structure level. It reconstructs morphologically relevant structures based on molecular features or cell types. It further calculates a range of structural and shape metrics to quantitatively describe tissue architecture. The package is designed to integrate with other packages for the analysis of spatial (omics) data.
Maintained by Samuel Gunz. Last updated 1 month ago.
softwarespatialtranscriptomicsvisualization
8.5 match 1 stars 4.85 score 2 scripts
umr-amap
BIOMASS:Estimating Aboveground Biomass and Its Uncertainty in Tropical Forests
Contains functions to estimate aboveground biomass/carbon and its uncertainty in tropical forests. These functions allow users to (1) retrieve and correct taxonomy, (2) estimate wood density and its uncertainty, (3) construct height-diameter models, (4) manage tree and plot coordinates, and (5) estimate the aboveground biomass/carbon at the stand level with associated uncertainty. To cite 'BIOMASS', please use citation("BIOMASS"). See more in the article of Réjou-Méchain et al. (2017) <doi:10.1111/2041-210X.12753>.
Maintained by Dominique Lamonica. Last updated 1 day ago.
4.1 match 26 stars 9.90 score 68 scripts 1 dependents
junruidi
ActFrag:Activity Fragmentation Metrics Extracted from Minute Level Activity Data
Recent studies have shown that, on top of total daily active/sedentary volumes, time accumulation strategies provide more sensitive information. This package provides functions to extract commonly used fragmentation metrics to quantify such time accumulation strategies based on minute-level, actigraphy-measured activity count data.
Maintained by Junrui Di. Last updated 5 years ago.
8.8 match 4.62 score 14 scripts 2 dependents
welch-lab
rliger:Linked Inference of Genomic Experimental Relationships
Uses an extension of nonnegative matrix factorization to identify shared and dataset-specific factors. See Welch J, Kozareva V, et al (2019) <doi:10.1016/j.cell.2019.05.006>, and Liu J, Gao C, Sodicoff J, et al (2020) <doi:10.1038/s41596-020-0391-8> for more details.
Maintained by Yichen Wang. Last updated 2 months ago.
nonnegative-matrix-factorizationsingle-cellopenblascpp
3.8 match 402 stars 10.80 score 334 scripts 1 dependents
prabhleenkaur19
aniSNA:Statistical Network Analysis of Animal Social Networks
Obtain network structures from animal GPS telemetry observations and statistically analyse them to assess their adequacy for social network analysis. Methods include pre-network data permutations, bootstrapping techniques to obtain confidence intervals for global and node-level network metrics, and correlation and regression analysis of the local network metrics.
Maintained by Prabhleen Kaur. Last updated 2 months ago.
12.7 match 3.18 score
atsa-es
MARSS:Multivariate Autoregressive State-Space Modeling
The MARSS package provides maximum-likelihood parameter estimation for constrained and unconstrained linear multivariate autoregressive state-space (MARSS) models, including partially deterministic models. MARSS models are a class of dynamic linear model (DLM) and vector autoregressive (VAR) model. Fitting is available via Expectation-Maximization (EM), BFGS (using optim), and 'TMB' (using the 'marssTMB' companion package). Functions are provided for parametric and innovations bootstrapping, Kalman filtering and smoothing, model selection criteria including bootstrap AICb, confidence intervals via the Hessian approximation or bootstrapping, and all conditional residual types. See the user guide for examples of dynamic factor analysis, dynamic linear models, outlier and shock detection, and multivariate AR-p models. Online workshops (lectures, eBook, and computer labs) at <https://atsa-es.github.io/>.
Maintained by Elizabeth Eli Holmes. Last updated 1 years ago.
multivariate-timeseriesstate-space-modelsstatisticstime-series
3.9 match 52 stars 10.34 score 596 scripts 3 dependents
bioc
ChIPQC:Quality metrics for ChIPseq data
Quality metrics for ChIPseq data.
Maintained by Tom Carroll. Last updated 5 months ago.
sequencingchipseqqualitycontrolreportwriting
7.3 match 5.45 score 140 scripts
strohne
volker:High-Level Functions for Tabulating, Charting and Reporting Survey Data
Craft polished tables and plots in Markdown reports. Simply choose whether to treat your data as counts or metrics, and the package will automatically generate well-designed default tables and plots for you. Boiled down to the basics, with labeling features and simple interactive reports. All functions are 'tidyverse' compatible.
Maintained by Jakob Jünger. Last updated 2 days ago.
5.5 match 5 stars 7.16 score 125 scripts
inceptdk
adaptr:Adaptive Trial Simulator
Package that simulates adaptive (multi-arm, multi-stage) clinical trials using adaptive stopping, adaptive arm dropping, and/or adaptive randomisation. Developed as part of the INCEPT (Intensive Care Platform Trial) project (<https://incept.dk/>), primarily supported by a grant from Sygeforsikringen "danmark" (<https://www.sygeforsikring.dk/>).
Maintained by Anders Granholm. Last updated 11 months ago.
7.3 match 13 stars 5.44 score 14 scripts
chjackson
flexsurv:Flexible Parametric Survival and Multi-State Models
Flexible parametric models for time-to-event data, including the Royston-Parmar spline model, generalized gamma and generalized F distributions. Any user-defined parametric distribution can be fitted, given at least an R function defining the probability density or hazard. There are also tools for fitting and predicting from fully parametric multi-state models, based on either cause-specific hazards or mixture models.
Maintained by Christopher Jackson. Last updated 2 months ago.
3.0 match 57 stars 13.31 score 632 scripts 43 dependents
nunompmoniz
IRon:Solving Imbalanced Regression Tasks
Imbalanced domain learning has almost exclusively focused on solving classification tasks, where the objective is to predict cases labelled with a rare class accurately. Such a well-defined approach has been lacking for regression tasks due to two main factors. First, standard regression tasks assume that each value is equally important to the user. Second, standard evaluation metrics focus on assessing the performance of the model on the most common cases. This package contains methods to tackle imbalanced domain learning problems in regression tasks, where the objective is to predict extreme (rare) values. The methods contained in this package are: 1) an automatic and non-parametric method to obtain such relevance functions; 2) visualisation tools; 3) a suite of evaluation measures for optimisation/validation processes; 4) the squared-error relevance area measure, an evaluation metric tailored for imbalanced regression tasks. More information can be found in Ribeiro and Moniz (2020) <doi:10.1007/s10994-020-05900-9>.
Maintained by Nuno Moniz. Last updated 2 years ago.
evaluation-metricsimbalance-dataimbalanced-learningmachine-learningregression
10.1 match 19 stars 3.86 score 38 scripts
bioc
CytoMDS:Low Dimensions projection of cytometry samples
This package implements a low dimensional visualization of a set of cytometry samples, in order to visually assess the 'distances' between them. This, in turn, can greatly help the user to identify quality issues like batch effects or outlier samples, and/or check the presence of potential sample clusters that might align with the experimental design. The CytoMDS algorithm combines, on the one hand, the concept of Earth Mover's Distance (EMD), a.k.a. Wasserstein metric and, on the other hand, the Multi Dimensional Scaling (MDS) algorithm for the low dimensional projection. The package also provides diagnostic tools for checking the quality of the MDS projection, as well as tools to help with the interpretation of the axes of the projection.
Maintained by Philippe Hauchamps. Last updated 2 months ago.
flowcytometryqualitycontroldimensionreductionmultidimensionalscalingsoftwarevisualization
7.3 match 1 stars 5.32 score 2 scripts
modeloriented
survex:Explainable Machine Learning in Survival Analysis
Survival analysis models are commonly used in medicine and other areas. Many of them are too complex to be interpreted by humans. Exploration and explanation are needed, but standard methods do not give a broad enough picture. 'survex' provides easy-to-apply methods for explaining survival models, both complex black-boxes and simpler statistical models. They include methods specific to survival analysis such as SurvSHAP(t) introduced in Krzyzinski et al., (2023) <doi:10.1016/j.knosys.2022.110234>, SurvLIME described in Kovalev et al., (2020) <doi:10.1016/j.knosys.2020.106164> as well as extensions of existing ones described in Biecek et al., (2021) <doi:10.1201/9780429027192>.
Maintained by Mikołaj Spytek. Last updated 9 months ago.
biostatisticsbrier-scorescensored-datacox-modelcox-regressionexplainable-aiexplainable-machine-learningexplainable-mlexplanatory-model-analysisinterpretable-machine-learninginterpretable-mlmachine-learningprobabilistic-machine-learningshapsurvival-analysistime-to-eventvariable-importancexai
4.6 match 110 stars 8.40 score 114 scripts
marlonecobos
mop:Mobility Oriented-Parity Metric
A set of tools to perform multiple versions of the Mobility Oriented-Parity metric. This multivariate analysis helps to characterize levels of dissimilarity between a set of conditions of reference and another set of conditions of interest. If predictive models are transferred to conditions different from those over which models were calibrated (trained), this metric helps to identify transfer conditions that differ substantially from those of calibration. These tools are implemented following principles proposed in Owens et al. (2013) <doi:10.1016/j.ecolmodel.2013.04.011>, and expanded to obtain more detailed results that aid in interpretation.
Maintained by Marlon E. Cobos. Last updated 9 months ago.
7.4 match 7 stars 5.23 score 20 scripts 2 dependents
sprouffske
growthcurver:Simple Metrics to Summarize Growth Curves
Fits the logistic equation to microbial growth curve data (e.g., repeated absorbance measurements taken from a plate reader over time). From this fit, a variety of metrics are provided, including the maximum growth rate, the doubling time, the carrying capacity, the area under the logistic curve, and the time to the inflection point. Method described in Sprouffske and Wagner (2016) <doi:10.1186/s12859-016-1016-7>.
Maintained by Kathleen Sprouffske. Last updated 4 years ago.
5.0 match 46 stars 7.71 score 111 scripts
fcharte
mldr:Exploratory Data Analysis and Manipulation of Multi-Label Data Sets
Exploratory data analysis and manipulation functions for multi-label data sets along with an interactive Shiny application to ease their use.
Maintained by David Charte. Last updated 5 years ago.
5.4 match 23 stars 7.07 score 168 scripts 2 dependents
laresbernardo
lares:Analytics & Machine Learning Sidekick
Auxiliary package for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of family functions, like Machine Learning, Data Wrangling, Marketing Mix Modeling (Robyn), Exploratory, API, and Scrapper, it helps the analyst or data scientist to get quick and robust results, without the need of repetitive coding or advanced R programming skills.
Maintained by Bernardo Lares. Last updated 23 days ago.
analyticsapiautomationautomldata-sciencedescriptive-statisticsh2omachine-learningmarketingmmmpredictive-modelingpuzzlerlanguagerobynvisualization
3.9 match 233 stars 9.84 score 185 scripts 1 dependents
statnet
tsna:Tools for Temporal Social Network Analysis
Temporal SNA tools for continuous- and discrete-time longitudinal networks having vertex, edge, and attribute dynamics stored in the 'networkDynamic' format. This work was supported by grant R01HD68395 from the National Institutes of Health.
Maintained by Skye Bender-deMoll. Last updated 1 years ago.
5.0 match 7 stars 7.65 score 93 scripts 2 dependents
pboutros
OmicsQC:Nominating Quality Control Outliers in Genomic Profiling Studies
A method that analyzes quality control metrics from multi-sample genomic sequencing studies and nominates poor quality samples for exclusion. Per-sample quality control data are transformed into z-scores and aggregated. The distribution of aggregated z-scores is modelled using parametric distributions. The parameters of the optimal model, selected either by goodness-of-fit statistics or user-designation, are used for outlier nomination. Two implementations of the Cosine Similarity Outlier Detection algorithm are provided with flexible parameters for dataset customization.
Maintained by Paul C. Boutros. Last updated 1 years ago.
18.9 match 2.00 score 2 scripts
carmonalab
scGate:Marker-Based Cell Type Purification for Single-Cell Sequencing Data
A common bioinformatics task in single-cell data analysis is to purify a cell type or cell population of interest from heterogeneous datasets. 'scGate' automatizes marker-based purification of specific cell populations, without requiring training data or reference gene expression profiles. Briefly, 'scGate' takes as input: i) a gene expression matrix stored in a 'Seurat' object and ii) a “gating model” (GM), consisting of a set of marker genes that define the cell population of interest. The GM can be as simple as a single marker gene, or a combination of positive and negative markers. More complex GMs can be constructed in a hierarchical fashion, akin to gating strategies employed in flow cytometry. 'scGate' evaluates the strength of signature marker expression in each cell using the rank-based method 'UCell', and then performs k-nearest neighbor (kNN) smoothing by calculating the mean 'UCell' score across neighboring cells. kNN-smoothing aims at compensating for the large degree of sparsity in scRNA-seq data. Finally, a universal threshold over kNN-smoothed signature scores is applied in binary decision trees generated from the user-provided gating model, to annotate cells as either “pure” or “impure”, with respect to the cell population of interest. See the related publication Andreatta et al. (2022) <doi:10.1093/bioinformatics/btac141>.
Maintained by Massimo Andreatta. Last updated 1 month ago.
filteringmarker-genesscgatesignaturessingle-cell
4.5 match 106 stars 8.38 score 163 scripts
spatialnous
alcyon:Spatial Network Analysis
Interface package for 'sala', the spatial network analysis library from the 'depthmapX' software application. The R parts of the code are based on the 'rdepthmap' package. Allows for the analysis of urban and building-scale networks and provides metrics and methods usually found within the Space Syntax domain. Methods in this package are described by K. Al-Sayed, A. Turner, B. Hillier, S. Iida and A. Penn (2014) "Space Syntax methodology", and also by A. Turner (2004) <https://discovery.ucl.ac.uk/id/eprint/2651> "Depthmap 4: a researcher's handbook".
Maintained by Petros Koutsolampros. Last updated 2 months ago.
5.9 match 2 stars 6.34 score 13 scripts
dwoll
DVHmetrics:Analyze Dose-Volume Histograms and Check Constraints
Functionality for analyzing dose-volume histograms (DVH) in radiation oncology: Read DVH text files, calculate DVH metrics as well as generalized equivalent uniform dose (gEUD), biologically effective dose (BED), equivalent dose in 2 Gy fractions (EQD2), normal tissue complication probability (NTCP), and tumor control probability (TCP). Show DVH diagrams, check and visualize quality assurance constraints for the DVH. Includes web-based graphical user interface.
Maintained by Daniel Wollschlaeger. Last updated 16 days ago.
6.2 match 12 stars 6.03 score
alarm-redist
redist:Simulation Methods for Legislative Redistricting
Enables researchers to sample redistricting plans from a pre-specified target distribution using Sequential Monte Carlo and Markov Chain Monte Carlo algorithms. The package allows for the implementation of various constraints in the redistricting process such as geographic compactness and population parity requirements. Tools for analysis such as computation of various summary statistics and plotting functionality are also included. The package implements the SMC algorithm of McCartan and Imai (2023) <doi:10.1214/23-AOAS1763>, the enumeration algorithm of Fifield, Imai, Kawahara, and Kenny (2020) <doi:10.1080/2330443X.2020.1791773>, the Flip MCMC algorithm of Fifield, Higgins, Imai and Tarr (2020) <doi:10.1080/10618600.2020.1739532>, the Merge-split/Recombination algorithms of Carter et al. (2019) <arXiv:1911.01503> and DeFord et al. (2021) <doi:10.1162/99608f92.eb30390f>, and the Short-burst optimization algorithm of Cannon et al. (2020) <arXiv:2011.02288>.
Maintained by Christopher T. Kenny. Last updated 2 months ago.
geospatialgerrymanderingredistrictingsamplingopenblascppopenmp
4.0 match 68 stars 9.17 score 259 scripts
domodwyer
promr:Prometheus 'PromQL' Query Client for 'R'
A native 'R' client library for querying the 'Prometheus' time-series database, using the 'PromQL' query language.
Maintained by Dom Dwyer. Last updated 3 years ago.
10.0 match 9 stars 3.69 score 11 scripts
igraph
igraph:Network Analysis and Visualization
Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.
Maintained by Kirill Müller. Last updated 2 days ago.
complex-networksgraph-algorithmsgraph-theorymathematicsnetwork-analysisnetwork-graphfortranlibxml2glpkopenblascpp
1.8 match 581 stars 21.10 score 31k scripts 1.9k dependents
phiala
ecodist:Dissimilarity-Based Functions for Ecological Analysis
Dissimilarity-based analysis functions including ordination and Mantel test functions, intended for use with spatial and community ecological data. The original package description is in Goslee and Urban (2007) <doi:10.18637/jss.v022.i07>, with further statistical detail in Goslee (2010) <doi:10.1007/s11258-009-9641-0>.
Maintained by Sarah Goslee. Last updated 1 years ago.
3.8 match 9 stars 9.84 score 566 scripts 9 dependents
ailich
GLCMTextures:GLCM Textures of Raster Layers
Calculates grey level co-occurrence matrix (GLCM) based texture measures (Hall-Beyer (2017) <https://prism.ucalgary.ca/bitstream/handle/1880/51900/texture%20tutorial%20v%203_0%20180206.pdf>; Haralick et al. (1973) <doi:10.1109/TSMC.1973.4309314>) of raster layers using a sliding rectangular window. It also includes functions to quantize a raster into grey levels as well as tabulate a glcm and calculate glcm texture metrics for a matrix.
Maintained by Alexander Ilich. Last updated 2 months ago.
5.8 match 12 stars 6.33 score 20 scripts 2 dependents