R-universe search: metric

mfrasco

Metrics:Evaluation Metrics for Machine Learning

An implementation of evaluation metrics in R that are commonly used in supervised machine learning. It implements metrics for regression, time series, binary classification, classification, and information retrieval problems. It has zero dependencies and a consistent, simple interface for all functions.

Maintained by Michael Frasco. Last updated 6 years ago.

59.5 match 99 stars 13.02 score 6.1k scripts 51 dependents

rstudio

keras3:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks API. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both CPU and GPU devices.

Maintained by Tomasz Kalinowski. Last updated 3 days ago.

41.3 match 845 stars 13.57 score 264 scripts 2 dependents

cbielow

PTXQC:Quality Report Generation for MaxQuant and mzTab Results

Generates Proteomics (PTX) quality control (QC) reports for shotgun LC-MS data analyzed with the MaxQuant software suite (from .txt files) or mzTab files (ideally from OpenMS 'QualityControl' tool). Reports are customizable (target thresholds, subsetting) and available in HTML or PDF format. Published in J. Proteome Res., Proteomics Quality Control: Quality Control Software for MaxQuant Results (2015) <doi:10.1021/acs.jproteome.5b00780>.

Maintained by Chris Bielow. Last updated 1 year ago.

drag-and-drop hacktoberfest heatmap match-between-runs maxquant metric mztab openms proteomics quality-control quality-metrics report

44.4 match 42 stars 9.35 score 105 scripts 1 dependents

tidymodels

yardstick:Tidy Characterizations of Model Performance

Tidy tools for quantifying how well a model fits a data set, such as confusion matrices, class probability curve summaries, and regression metrics (e.g., RMSE).

Maintained by Emil Hvitfeldt. Last updated 3 days ago.

25.4 match 387 stars 15.47 score 2.2k scripts 60 dependents

epiforecasts

scoringutils:Utilities for Scoring and Assessing Predictions

Facilitate the evaluation of forecasts in a convenient framework based on data.table. It allows users to check their forecasts and diagnose issues, to visualise forecasts and missing data, to transform data before scoring, to handle missing forecasts, to aggregate scores, and to visualise the results of the evaluation. The package mostly focuses on the evaluation of probabilistic forecasts and allows evaluating several different forecast types and input formats. Find more information about the package in the Vignettes as well as in the accompanying paper, <doi:10.48550/arXiv.2205.07090>.

Maintained by Nikos Bosse. Last updated 12 days ago.

forecast-evaluation forecasting

29.6 match 52 stars 11.37 score 326 scripts 7 dependents

t-kalinowski

keras:R Interface to 'Keras'

Interface to 'Keras' <https://keras.io>, a high-level neural networks 'API'. 'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.

Maintained by Tomasz Kalinowski. Last updated 11 months ago.

30.0 match 10.82 score 10k scripts 54 dependents

microsoft

wpa:Tools for Analysing and Visualising Viva Insights Data

Opinionated functions that enable easier and faster analysis of Viva Insights data. There are three main types of functions in 'wpa': (1) Standard functions create a 'ggplot' visual or a summary table based on a specific Viva Insights metric; (2) Report Generation functions generate HTML reports on a specific analysis area, e.g. Collaboration; (3) Other miscellaneous functions cover more specific applications (e.g. Subject Line text mining) of Viva Insights data. This package adheres to 'tidyverse' principles and works well with the pipe syntax. 'wpa' is built with beginner-to-intermediate R users in mind, and is optimised for simplicity.

Maintained by Martin Chan. Last updated 4 months ago.

workplace-analytics

44.1 match 30 stars 6.69 score 39 scripts 1 dependents

thie1e

cutpointr:Determine and Evaluate Optimal Cutpoints in Binary Classification Tasks

Estimate cutpoints that optimize a specified metric in binary classification tasks and validate performance using bootstrapping. Some methods for more robust cutpoint estimation are supported, e.g. a parametric method assuming normal distributions, bootstrapped cutpoints, and smoothing of the metric values per cutpoint using Generalized Additive Models. Various plotting functions are included. For an overview of the package see Thiele and Hirschfeld (2021) <doi:10.18637/jss.v098.i11>.

Maintained by Christian Thiele. Last updated 3 months ago.

bootstrapping cutpoint-optimization roc-curve cpp

27.9 match 88 stars 10.44 score 322 scripts 1 dependents

benrwoodard

adobeanalyticsr:R Client for 'Adobe Analytics' API 2.0

Connect to the 'Adobe Analytics' API v2.0 <https://github.com/AdobeDocs/analytics-2.0-apis> which powers 'Analysis Workspace'. The package was developed with the analyst in mind, and it will continue to be developed with the guiding principles of iterative, repeatable, timely analysis.

Maintained by Ben Woodard. Last updated 2 months ago.

41.5 match 18 stars 7.02 score 39 scripts

davidbolin

MetricGraph:Random Fields on Metric Graphs

Facilitates creation and manipulation of metric graphs, such as street or river networks. Further facilitates operations and visualizations of data on metric graphs, and the creation of a large class of random fields and stochastic partial differential equations on such spaces. These random fields can be used for simulation, prediction and inference. In particular, linear mixed effects models including random field components can be fitted to data based on computationally efficient sparse matrix representations. Interfaces to the R packages 'INLA' and 'inlabru' are also provided, which facilitate working with Bayesian statistical models on metric graphs. The main references for the methods are Bolin, Simas and Wallin (2024) <doi:10.3150/23-BEJ1647>, Bolin, Kovacs, Kumar and Simas (2023) <doi:10.1090/mcom/3929> and Bolin, Simas and Wallin (2023) <doi:10.48550/arXiv.2304.03190> and <doi:10.48550/arXiv.2304.10372>.

Maintained by David Bolin. Last updated 5 days ago.

cpp

46.2 match 14 stars 6.06 score 275 scripts

modeloriented

fairmodels:Flexible Tool for Bias Detection, Visualization, and Mitigation

Measure fairness metrics in one place for many models. Check how large a model's bias is towards different races, sexes, nationalities, etc. Use measures such as Statistical Parity and Equal odds to detect discrimination against unprivileged groups. Visualize the bias using a heatmap, radar plot, biplot, bar chart (and more!). Various pre-processing and post-processing bias mitigation algorithms are implemented. The package also supports calculating fairness metrics for regression models. Find more details in (Wiśniewski, Biecek (2021)) <arXiv:2104.00507>.

Maintained by Jakub Wiśniewski. Last updated 1 month ago.

explain-classifiers explainable-ml fairness fairness-comparison fairness-ml model-evaluation

35.4 match 86 stars 7.72 score 51 scripts 1 dependents

jackstat

ModelMetrics:Rapid Calculation of Model Metrics

Collection of metrics for evaluating models written in C++ using 'Rcpp'. Popular metrics include area under the curve, log loss, root mean square error, etc.

Maintained by Tyler Hunt. Last updated 4 years ago.

auc logloss machine-learning metrics model-evaluation model-metrics cpp

21.4 match 29 stars 11.83 score 1.3k scripts 306 dependents

fgazzelloni

hmsidwR:Health Metrics and the Spread of Infectious Diseases

A collection of datasets and supporting functions accompanying Health Metrics and the Spread of Infectious Diseases by Federica Gazzelloni (2024). This package provides data for health metrics calculations, including Disability-Adjusted Life Years (DALYs), Years of Life Lost (YLLs), and Years Lived with Disability (YLDs), as well as additional tools for analyzing and visualizing health data. Federica Gazzelloni (2024) <doi:10.5281/zenodo.10818338>.

Maintained by Federica Gazzelloni. Last updated 2 months ago.

deaths health-data infectious-diseases lifeexpectancy

45.0 match 4 stars 5.48 score 6 scripts

samuel-marsh

scCustomize:Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing

Collection of functions created and/or curated to aid in the visualization and analysis of single-cell data using 'R'. 'scCustomize' aims to 1) provide customized visualizations that are easier to use and produce more aesthetic and functional visuals, and 2) improve the speed and reproducibility of common tasks and pieces of code in scRNA-seq analysis with a single function or group of functions. For citation please use: Marsh SE (2021) "Custom Visualizations & Functions for Streamlined Analyses of Single Cell Sequencing" <doi:10.5281/zenodo.5706430> RRID:SCR_024675.

Maintained by Samuel Marsh. Last updated 3 months ago.

customization ggplot2 scrna-seq seurat single-cell single-cell-genomics single-cell-rna-seq visualization

27.5 match 242 stars 8.75 score 1.1k scripts

azure

azuremlsdk:Interface to the 'Azure Machine Learning' 'SDK'

Interface to the 'Azure Machine Learning' Software Development Kit ('SDK'). Data scientists can use the 'SDK' to train, deploy, automate, and manage machine learning models on the 'Azure Machine Learning' service. To learn more about 'Azure Machine Learning' visit the website: <https://docs.microsoft.com/en-us/azure/machine-learning/service/overview-what-is-azure-ml>.

Maintained by Diondra Peck. Last updated 3 years ago.

amlcompute azure azure-machine-learning azureml dsi machine-learning rstudio sdk-r

26.6 match 106 stars 8.91 score 221 scripts

moviedo5

fda.usc:Functional Data Analysis and Utilities for Statistical Computing

Routines for exploratory and descriptive analysis of functional data such as depth measurements, atypical curves detection, regression models, supervised classification, unsupervised classification and functional analysis of variance.

Maintained by Manuel Oviedo de la Fuente. Last updated 4 months ago.

functional-data-analysis fortran

23.8 match 12 stars 9.72 score 560 scripts 22 dependents

microsoft

vivainsights:Analyze and Visualize Data from 'Microsoft Viva Insights'

Provides a versatile range of functions, including exploratory data analysis, time-series analysis, organizational network analysis, and data validation, while implementing a set of best practices for analyzing and visualizing data specific to 'Microsoft Viva Insights'.

Maintained by Martin Chan. Last updated 23 days ago.

37.0 match 11 stars 6.12 score 68 scripts

spatstat

spatstat.geom:Geometrical Functionality of the 'spatstat' Family

Defines spatial data types and supports geometrical operations on them. Data types include point patterns, windows (domains), pixel images, line segment patterns, tessellations and hyperframes. Capabilities include creation and manipulation of data (using command line or graphical interaction), plotting, geometrical operations (rotation, shift, rescale, affine transformation), convex hull, discretisation and pixellation, Dirichlet tessellation, Delaunay triangulation, pairwise distances, nearest-neighbour distances, distance transform, morphological operations (erosion, dilation, closing, opening), quadrat counting, geometrical measurement, geometrical covariance, colour maps, calculus on spatial domains, Gaussian blur, level sets of images, transects of images, intersections between objects, minimum distance matching. (Excludes spatial data on a network, which are supported by the package 'spatstat.linnet'.)

Maintained by Adrian Baddeley. Last updated 22 hours ago.

classes-and-objects distance-calculation geometry geometry-processing images mensuration plotting point-patterns spatial-data spatial-data-analysis

18.6 match 7 stars 12.11 score 241 scripts 227 dependents

ludvigolsen

cvms:Cross-Validation for Model Selection

Cross-validate one or multiple regression and classification models and get relevant evaluation metrics in a tidy format. Validate the best model on a test set and compare it to a baseline evaluation. Alternatively, evaluate predictions from an external model. Currently supports regression and classification (binary and multiclass). Described in chp. 5 of Jeyaraman, B. P., Olsen, L. R., & Wambugu M. (2019, ISBN: 9781838550134).

Maintained by Ludvig Renbo Olsen. Last updated 9 days ago.

21.1 match 39 stars 10.31 score 492 scripts 5 dependents

rmi-pacta

pacta.multi.loanbook:Run 'PACTA' on Multiple Loan Books Easily

Run Paris Agreement Capital Transition Assessment ('PACTA') analyses on multiple loan books in a structured way. Provides access to standard 'PACTA' metrics and additional 'PACTA'-related metrics for multiple loan books. Results take the form of 'csv' files and plots and are exported to user-specified project paths.

Maintained by Jacob Kastl. Last updated 1 day ago.

climate-change pacta pactaverse sustainable-finance

33.2 match 6.48 score 4 scripts

bioc

evaluomeR:Evaluation of Bioinformatics Metrics

Evaluating the reliability of your own metrics and the measurements done on your own datasets by analysing the stability and goodness of the classifications of such metrics.

Maintained by José Antonio Bernabé-Díaz. Last updated 5 months ago.

clustering classification featureextraction assessment clustering-evaluation evaluome evaluomer metrics

44.6 match 4.82 score 33 scripts

jacobbien

simulator:An Engine for Running Simulations

A framework for performing simulations such as those common in methodological statistics papers. The design principles of this package are described in greater depth in Bien, J. (2016) "The simulator: An Engine to Streamline Simulations," which is available at <arXiv:1607.00021>.

Maintained by Jacob Bien. Last updated 2 years ago.

simulation

29.9 match 52 stars 7.13 score 103 scripts

r-spatialecology

landscapemetrics:Landscape Metrics for Categorical Map Patterns

Calculates landscape metrics for categorical landscape patterns in a tidy workflow. 'landscapemetrics' reimplements the most common metrics from 'FRAGSTATS' (<https://www.fragstats.org/>) and new ones from the current literature on landscape metrics. This package supports 'terra' SpatRaster objects as input arguments. It further provides utility functions to visualize patches, select metrics and building blocks to develop new metrics.

Maintained by Maximilian H.K. Hesselbarth. Last updated 1 month ago.

landscape-ecology landscape-metrics raster spatial cpp

16.8 match 240 stars 12.47 score 584 scripts 4 dependents

irinagain

iglu:Interpreting Glucose Data from Continuous Glucose Monitors

Implements a wide range of metrics for measuring glucose control and glucose variability based on continuous glucose monitoring data. The list of implemented metrics is summarized in Rodbard (2009) <doi:10.1089/dia.2009.0015>. Additional visualization tools include time-series plots, lasagna plots and ambulatory glucose profile report.

Maintained by Irina Gaynanova. Last updated 9 days ago.

22.8 match 26 stars 9.00 score 39 scripts

vandomed

stocks:Stock Market Analysis

Functions for analyzing and visualizing stock market data. Main features are loading and aligning historical data, calculating performance metrics for individual funds or portfolios (e.g. annualized growth, maximum drawdown, Sharpe/Sortino ratio), and creating graphs.

Maintained by Dane R. Van Domelen. Last updated 5 years ago.

investment-analysis portfolio-construction portfolio-optimization sharpe-ratio stock-market time-series cpp

42.0 match 22 stars 4.63 score 39 scripts

jedick

chem16S:Chemical Metrics for Microbial Communities

Combines taxonomic classifications of high-throughput 16S rRNA gene sequences with reference proteomes of archaeal and bacterial taxa to generate amino acid compositions of community reference proteomes. Calculates chemical metrics including carbon oxidation state ('Zc'), stoichiometric oxidation and hydration state ('nO2' and 'nH2O'), H/C, N/C, O/C, and S/C ratios, grand average of hydropathicity ('GRAVY'), isoelectric point ('pI'), protein length, and average molecular weight of amino acid residues. Uses precomputed reference proteomes for archaea and bacteria derived from the Genome Taxonomy Database ('GTDB'). Also includes reference proteomes derived from the NCBI Reference Sequence ('RefSeq') database and manual mapping from the 'RDP Classifier' training set to 'RefSeq' taxonomy as described by Dick and Tan (2023) <doi:10.1007/s00248-022-01988-9>. Processes taxonomic classifications in 'RDP Classifier' format or OTU tables in 'phyloseq-class' objects from the Bioconductor package 'phyloseq'.

Maintained by Jeffrey Dick. Last updated 6 days ago.

16s-rrna carbon-oxidation-state chemical-metrics genomic-adaptation microbial-communities

31.3 match 4 stars 5.92 score 8 scripts

philips-software

latrend:A Framework for Clustering Longitudinal Data

A framework for clustering longitudinal datasets in a standardized way. The package provides an interface to existing R packages for clustering longitudinal univariate trajectories, facilitating reproducible and transparent analyses. Additionally, standard tools are provided to support cluster analyses, including repeated estimation, model validation, and model assessment. The interface enables users to compare results between methods, and to implement and evaluate new methods with ease. The 'akmedoids' package is available from <https://github.com/MAnalytics/akmedoids>.

Maintained by Niek Den Teuling. Last updated 2 months ago.

cluster-analysis clustering-evaluation clustering-methods data-science longitudinal-clustering longitudinal-data mixture-models time-series-analysis

26.4 match 30 stars 6.77 score 26 scripts

bioc

wateRmelon:Illumina DNA methylation array normalization and metrics

15 flavours of betas and three performance metrics, with methods for objects produced by methylumi and minfi packages.

Maintained by Leo C Schalkwyk. Last updated 4 months ago.

dnamethylation microarray twochannel preprocessing qualitycontrol

19.7 match 7.75 score 247 scripts 2 dependents

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 year ago.

18.6 match 3 stars 8.20 score 7.8k scripts 11 dependents

adriancorrendo

metrica:Prediction Performance Metrics

A compilation of more than 80 functions designed to quantitatively and visually evaluate prediction performance of regression (continuous variables) and classification (categorical variables) of point-forecast models (e.g. APSIM, DSSAT, DNDC, supervised Machine Learning). For regression, it includes functions to generate plots (scatter, tiles, density, & Bland-Altman plot), and to estimate error metrics (e.g. MBE, MAE, RMSE), error decomposition (e.g. lack of accuracy-precision), model efficiency (e.g. NSE, E1, KGE), indices of agreement (e.g. d, RAC), goodness of fit (e.g. r, R2), adjusted correlation coefficients (e.g. CCC, dcorr), symmetric regression coefficients (intercept, slope), and mean absolute scaled error (MASE) for time series predictions. For classification (binomial and multinomial), it offers functions to generate and plot confusion matrices, and to estimate performance metrics such as accuracy, precision, recall, specificity, F-score, Cohen's Kappa, G-mean, and many more. For more details visit the vignettes <https://adriancorrendo.github.io/metrica/>.

Maintained by Adrian A. Correndo. Last updated 9 months ago.

18.5 match 77 stars 8.18 score 49 scripts

bioc

MsQuality:MsQuality - Quality metric calculation from Spectra and MsExperiment objects

MsQuality provides functionality to calculate quality metrics for mass spectrometry-derived spectral data at the per-sample level. MsQuality relies on the mzQC framework of quality metrics defined by the Human Proteome Organization-Proteomics Standards Initiative (HUPO-PSI). These metrics quantify the quality of spectral raw files using a controlled vocabulary. The package is especially aimed at users who acquire mass spectrometry data on a large scale (e.g. data sets from clinical settings consisting of several thousands of samples). The MsQuality package calculates low-level quality metrics that require minimal information on mass spectrometry data: retention time, m/z values, and associated intensities. MsQuality relies on the Spectra package, or alternatively the MsExperiment package, and its infrastructure to store spectral data.

Maintained by Thomas Naake. Last updated 2 months ago.

metabolomics proteomics massspectrometry qualitycontrol mass-spectrometry qc

27.7 match 7 stars 5.45 score 2 scripts

bioc

similaRpeak:Metrics to estimate a level of similarity between two ChIP-Seq profiles

This package calculates metrics which quantify the level of similarity between ChIP-Seq profiles. More specifically, the package implements six pseudometrics specialized in pattern similarity detection in ChIP-Seq profiles.

Maintained by Astrid Deschênes. Last updated 5 months ago.

biologicalquestion chipseq genetics multiplecomparison differentialexpression bioconductor bioconductor-package chip-profiles chip-seq metrics

26.1 match 7 stars 5.62 score 7 scripts

bioc

CNVMetrics:Copy Number Variant Metrics

The CNVMetrics package calculates similarity metrics to facilitate copy number variant comparison among samples and/or methods. Similarity metrics can be employed to compare CNV profiles of genetically unrelated samples as well as those with a common genetic background. Some metrics are based on the shared amplified/deleted regions while other metrics rely on the level of amplification/deletion. The data type used as input is a plain text file containing the genomic position of the copy number variations, as well as the status and/or the log2 ratio values. Finally, a visualization tool is provided to explore resulting metrics.

Maintained by Astrid Deschênes. Last updated 5 months ago.

biologicalquestion software copynumbervariation cnv copy-number-variation metrics r-language

28.9 match 4 stars 5.08 score 8 scripts

branchlab

metasnf:Meta Clustering with Similarity Network Fusion

Framework to facilitate patient subtyping with similarity network fusion and meta clustering. The similarity network fusion (SNF) algorithm was introduced by Wang et al. (2014) in <doi:10.1038/nmeth.2810>. SNF is a data integration approach that can transform high-dimensional and diverse data types into a single similarity network suitable for clustering with minimal loss of information from each initial data source. The meta clustering approach was introduced by Caruana et al. (2006) in <doi:10.1109/ICDM.2006.103>. Meta clustering involves generating a wide range of cluster solutions by adjusting clustering hyperparameters, then clustering the solutions themselves into a manageable number of qualitatively similar solutions, and finally characterizing representative solutions to find ones that are best for the user's specific context. This package provides a framework to easily transform multi-modal data into a wide range of similarity network fusion-derived cluster solutions as well as to visualize, characterize, and validate those solutions. Core package functionality includes easy customization of distance metrics, clustering algorithms, and SNF hyperparameters to generate diverse clustering solutions; calculation and plotting of associations between features, between patients, and between cluster solutions; and standard cluster validation approaches including resampled measures of cluster stability, standard metrics of cluster quality, and label propagation to evaluate generalizability in unseen data. Associated vignettes guide the user through using the package to identify patient subtypes while adhering to best practices for unsupervised learning.

Maintained by Prashanth S Velayudhan. Last updated 4 days ago.

bioinformatics clustering metaclustering snf

17.7 match 8 stars 8.21 score 30 scripts

networkgroupr

fastnet:Large-Scale Social Network Analysis

We present an implementation of the algorithms required to simulate large-scale social networks and retrieve their most relevant metrics.

Maintained by Nazrul Shaikh. Last updated 8 years ago.

42.4 match 5 stars 3.37 score 47 scripts

terrytangyuan

dml:Distance Metric Learning in R

State-of-the-art algorithms for distance metric learning, including global and local methods such as Relevant Component Analysis, Discriminative Component Analysis, Local Fisher Discriminant Analysis, etc. These distance metric learning methods are widely applied in feature extraction, dimensionality reduction, clustering, classification, information retrieval, and computer vision problems.

Maintained by Yuan Tang. Last updated 2 years ago.

dimensionality-reduction distance-metric-learning machine-learning metric-learning statistics

23.7 match 58 stars 5.94 score 8 scripts 1 dependents

mayer79

MetricsWeighted:Weighted Metrics and Performance Measures for Machine Learning

Provides weighted versions of several metrics and performance measures used in machine learning, including average unit deviances of the Bernoulli, Tweedie, Poisson, and Gamma distributions, see Jorgensen B. (1997, ISBN: 978-0412997112). The package also contains a weighted version of generalized R-squared, see e.g. Cohen, J. et al. (2002, ISBN: 978-0805822236). Furthermore, 'dplyr' chains are supported.

Maintained by Michael Mayer. Last updated 8 months ago.

machine-learning metrics performance statistics

20.2 match 11 stars 6.79 score 75 scripts 5 dependents

schlosslab

mikropml:User-Friendly R Package for Supervised Machine Learning Pipelines

An interface to build machine learning models for classification and regression problems. 'mikropml' implements the ML pipeline described by Topçuoğlu et al. (2020) <doi:10.1128/mBio.00434-20> with reasonable default options for data preprocessing, hyperparameter tuning, cross-validation, testing, model evaluation, and interpretation steps. See the website <https://www.schlosslab.org/mikropml/> for more information, documentation, and examples.

Maintained by Kelly Sovacool. Last updated 2 years ago.

machine-learning

17.2 match 56 stars 7.83 score 86 scripts

geanders

weathermetrics:Functions to Convert Between Weather Metrics

Functions to convert between weather metrics, including conversions for metrics of temperature, air moisture, wind speed, and precipitation. This package also includes functions to calculate the heat index from air temperature and air moisture.

Maintained by Brooke Anderson. Last updated 8 years ago.

16.1 match 23 stars 8.32 score 506 scripts 1 dependents

fhdsl

metricminer:Mine Metrics from Common Places on the Web

Mine metrics from common places on the web through the power of their APIs (application programming interfaces). It also helps put the data in a format that is easily used for a dashboard or other purposes. There is an associated dashboard template and tutorials under development that help you fully utilize 'metricminer'.

Maintained by Candace Savonen. Last updated 2 days ago.

edtech-software

21.6 match 2 stars 6.13 score 21 scripts

8-bit-sheep

googleAnalyticsR:Google Analytics API into R

Interact with the Google Analytics APIs <https://developers.google.com/analytics/>, including the Core Reporting API (v3 and v4), Management API, User Activity API, GA4's Data API and Admin API, and Multi-Channel Funnel API.

Maintained by Erik Grönroos. Last updated 6 months ago.

analytics api google googleanalyticsr googleauthr

12.6 match 262 stars 10.11 score 680 scripts 1 dependents

aidanmorales

rTwig:Realistic Quantitative Structure Models

Real Twig is a method to correct branch overestimation in quantitative structure models. Overestimated cylinders are correctly tapered using measured twig diameters of corresponding tree species. Supported quantitative structure modeling software includes 'TreeQSM', 'SimpleForest', 'Treegraph', and 'aRchi'. Also included is a novel database of twig diameters and tools for fractal analysis of point clouds.

Maintained by Aidan Morales. Last updated 12 days ago.

forestry lidar modeling qsm rcpp cpp

17.9 match 8 stars 7.10 score 13 scripts

carlos-alberto-silva

rGEDI:NASA's Global Ecosystem Dynamics Investigation (GEDI) Data Visualization and Processing

Set of tools for downloading, reading, visualizing and processing GEDI Level1B, Level2A and Level2B data.

Maintained by Caio Hamamura. Last updated 5 months ago.

20.4 match 169 stars 6.11 score 85 scripts 1 dependents

jedick

canprot:Chemical Analysis of Proteins

Chemical analysis of proteins based on their amino acid compositions. Amino acid compositions can be read from FASTA files and used to calculate chemical metrics including carbon oxidation state and stoichiometric water content, as described in Dick et al. (2020) <doi:10.5194/bg-17-6145-2020>. Other properties that can be calculated include protein length, grand average of hydropathy (GRAVY), isoelectric point (pI), molecular weight (MW), standard molal volume (V0), and metabolic costs (Akashi and Gojobori, 2002 <doi:10.1073/pnas.062526999>; Wagner, 2005 <doi:10.1093/molbev/msi126>; Zhang et al., 2018 <doi:10.1038/s41467-018-06461-1>). A database of amino acid compositions of human proteins derived from UniProt is provided.

Maintained by Jeffrey Dick. Last updated 12 days ago.

amino-acid-composition chemical-metrics hydration-state isoelectric-point oxidation-state proteins

18.1 match 3 stars 6.70 score 46 scripts 1 dependents

brian-j-smith

MachineShop:Machine Learning Models and Tools

Meta-package for statistical and machine learning with a unified interface for model fitting, prediction, performance assessment, and presentation of results. Approaches for model fitting and prediction of numerical, categorical, or censored time-to-event outcomes include traditional regression models, regularization methods, tree-based methods, support vector machines, neural networks, ensembles, data preprocessing, filtering, and model tuning and selection. Performance metrics are provided for model assessment and can be estimated with independent test sets, split sampling, cross-validation, or bootstrap resampling. Resample estimation can be executed in parallel for faster processing and nested in cases of model tuning and selection. Modeling results can be summarized with descriptive statistics; calibration curves; variable importance; partial dependence plots; confusion matrices; and ROC, lift, and other performance curves.

Maintained by Brian J Smith. Last updated 7 months ago.

classification-models machine-learning predictive-modeling regression-models survival-models

15.0 match 61 stars 7.95 score 121 scripts

dmpe

urlshorteneR:R Wrapper for the 'Bit.ly' and 'Is.gd'/'v.gd' URL Shortening Services

Allows using two URL shortening services, which also provide expanding and analytic functions. Specifically developed for 'Bit.ly' (which requires OAuth 2.0) and 'is.gd' (no API key).

Maintained by John Malc. Last updated 28 days ago.

bitly isgd shorten-urls shortener shorturl url

17.8 match 21 stars 6.70 score 53 scripts 1 dependents

r-lidar

lidR:Airborne LiDAR Data Manipulation and Visualization for Forestry Applications

Airborne LiDAR (Light Detection and Ranging) interface for data manipulation and visualization. Read/write 'las' and 'laz' files, computation of metrics in an area-based approach, point filtering, artificial point reduction, classification from geographic data, normalization, individual tree segmentation and other manipulations.

Maintained by Jean-Romain Roussel. Last updated 1 month ago.

als forestry las laz lidar point-cloud remote-sensing openblas cpp openmp

8.1 match 623 stars 14.47 score 844 scripts 8 dependents

marinapapa

swaRmverse:Swarm Space Creation

Provides a pipeline for the comparative analysis of collective movement data (e.g. fish schools, bird flocks, baboon troops) by processing 2-dimensional positional data (x,y,t) from GPS trackers or computer vision tracking systems, discretizing events of collective motion, calculating a set of established metrics that characterize each event, and placing the events in a multi-dimensional swarm space constructed from these metrics. The swarm space concept, the metrics and data sets included are described in: Papadopoulou Marina, Furtbauer Ines, O'Bryan Lisa R., Garnier Simon, Georgopoulou Dimitra G., Bracken Anna M., Christensen Charlotte and King Andrew J. (2023) <doi:10.1098/rstb.2022.0068>.

Maintained by Marina Papadopoulou. Last updated 5 months ago.

22.6 match 2 stars 5.13 score 15 scripts

pydemull

activAnalyzer:A 'Shiny' App to Analyze Accelerometer-Measured Daily Physical Behavior Data

A tool to analyse 'ActiGraph' accelerometer data and to implement the use of the PROactive Physical Activity in COPD (chronic obstructive pulmonary disease) instruments. Once analysis is completed, the app allows results to be exported to .csv files and a report of the measurement to be generated. All the configured inputs relevant for interpreting the results are recorded in the report. In addition to the existing 'R' packages that are fully integrated with the app, the app uses some functions from the 'actigraph.sleepr' package developed by Petkova (2021) <https://github.com/dipetkov/actigraph.sleepr/>.

Maintained by Pierre-Yves de Müllenheim. Last updated 6 months ago.

accelerometer actigraph app monitor shiny

22.4 match 5 stars 5.18 score 8 scripts

molina-valero

FORTLS:Automatic Processing of Terrestrial-Based Technologies Point Cloud Data for Forestry Purposes

Process automation of point cloud data derived from terrestrial-based technologies such as Terrestrial Laser Scanner (TLS) or Mobile Laser Scanner. 'FORTLS' enables (i) detection of trees and estimation of tree-level attributes (e.g. diameters and heights), (ii) estimation of stand-level variables (e.g. density, basal area, mean and dominant height), (iii) computation of metrics related to important forest attributes estimated in Forest Inventories at stand-level, and (iv) optimization of plot design for combining TLS data and field measured data. Documentation about 'FORTLS' is described in Molina-Valero et al. (2022, <doi:10.1016/j.envsoft.2022.105337>).

Maintained by Juan Alberto Molina-Valero. Last updated 3 months ago.

forest-inventory forest-monitoring lidar-point-cloud cpp

18.6 match 22 stars 6.16 score 11 scripts

sparklyr

sparklyr:R Interface to Apache Spark

R interface to Apache Spark, a fast and general engine for big data processing, see <https://spark.apache.org/>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.

Maintained by Edgar Ruiz. Last updated 9 days ago.

apache-spark distributed dplyr ide livy machine-learning remote-clusters spark sparklyr

7.4 match 959 stars 15.16 score 4.0k scripts 21 dependents

biorgeo

bioregion:Comparison of Bioregionalisation Methods

The main purpose of this package is to propose a transparent methodological framework to compare bioregionalisation methods based on hierarchical and non-hierarchical clustering algorithms (Kreft & Jetz (2010) <doi:10.1111/j.1365-2699.2010.02375.x>) and network algorithms (Lenormand et al. (2019) <doi:10.1002/ece3.4718> and Leroy et al. (2019) <doi:10.1111/jbi.13674>).

Maintained by Maxime Lenormand. Last updated 10 days ago.

biogeography bioregion bioregionalization cpp

17.6 match 7 stars 6.27 score 11 scripts

eikeluedeling

chillR:Statistical Methods for Phenology Analysis in Temperate Fruit Trees

The phenology of plants (i.e. the timing of their annual life phases) depends on climatic cues. For temperate trees and many other plants, spring phases, such as leaf emergence and flowering, have been found to result from the effects of both cool (chilling) conditions and heat. Fruit tree scientists (pomologists) have developed some metrics to quantify chilling and heat (e.g. see Luedeling (2012) <doi:10.1016/j.scienta.2012.07.011>). 'chillR' contains functions for processing temperature records into chilling (Chilling Hours, Utah Chill Units and Chill Portions) and heat units (Growing Degree Hours). Regarding chilling metrics, Chill Portions are often considered the most promising, but they are difficult to calculate. This package makes it easy. 'chillR' also contains procedures for conducting a PLS analysis relating phenological dates (e.g. bloom dates) to either mean temperatures or mean chill and heat accumulation rates, based on long-term weather and phenology records (Luedeling and Gassner (2012) <doi:10.1016/j.agrformet.2011.10.020>). As of version 0.65, it also includes functions for generating weather scenarios with a weather generator, for conducting climate change analyses for temperature-based climatic metrics and for plotting results from such analyses. Since version 0.70, 'chillR' contains a function for interpolating hourly temperature records.

Maintained by Eike Luedeling. Last updated 4 months ago.

cpp

17.8 match 3 stars 6.13 score 346 scripts 1 dependents

zachmayer

caretEnsemble:Ensembles of Caret Models

Functions for creating ensembles of caret models: caretList() and caretStack(). caretList() is a convenience function for fitting multiple caret::train() models to the same dataset. caretStack() will make linear or non-linear combinations of these models, using a caret::train() model as a meta-model.

Maintained by Zachary A. Deane-Mayer. Last updated 3 months ago.

9.1 match 226 stars 11.92 score 780 scripts 1 dependents

nceas

codyn:Community Dynamics Metrics

Univariate and multivariate temporal and spatial diversity indices, rank abundance curves, and community stability measures. The functions implement measures that are either explicitly temporal and include the option to calculate them over multiple replicates, or spatial and include the option to calculate them over multiple time points. Functions fall into five categories: static diversity indices, temporal diversity indices, spatial diversity indices, rank abundance curves, and community stability measures. The diversity indices are temporal and spatial analogs to traditional diversity indices. Specifically, the package includes functions to calculate community richness, evenness and diversity at a given point in space and time. In addition, it contains functions to calculate species turnover, mean rank shifts, and lags in community similarity between two time points. Details of the methods are available in Hallett et al. (2016) <doi:10.1111/2041-210X.12569> and Avolio et al. (2019) <doi:10.1002/ecs2.2881>.

Maintained by Matthew B. Jones. Last updated 4 years ago.

11.8 match 34 stars 9.07 score 230 scripts

phuais

multilandr:Landscape Analysis at Multiple Spatial Scales

Provides a tidy workflow for landscape-scale analysis. 'multilandr' offers tools to generate landscapes at multiple spatial scales and compute landscape metrics, primarily using the 'landscapemetrics' package. It also features utility functions for plotting and analyzing multi-scale landscapes, exploring correlations between metrics, filtering landscapes based on specific conditions, generating landscape gradients for a given metric, and preparing datasets for further statistical analysis. Documentation about 'multilandr' is provided in an introductory vignette included in this package and in the paper by Huais (2024) <doi:10.1007/s10980-024-01930-z>; see citation("multilandr") for details.

Maintained by Pablo Yair Huais. Last updated 27 days ago.

18.8 match 9 stars 5.61 score 5 scripts

globalecologylab

poems:Pattern-Oriented Ensemble Modeling System

A framework of interoperable R6 classes (Chang, 2020, <https://CRAN.R-project.org/package=R6>) for building ensembles of viable models via the pattern-oriented modeling (POM) approach (Grimm et al., 2005, <doi:10.1126/science.1116681>). The package includes classes for encapsulating and generating model parameters, and managing the POM workflow. The workflow includes: model setup; generating model parameters via Latin hyper-cube sampling (Iman & Conover, 1980, <doi:10.1080/03610928008827996>); running multiple sampled model simulations; collating summary results; and validating and selecting an ensemble of models that best match known patterns. By default, model validation and selection utilizes an approximate Bayesian computation (ABC) approach (Beaumont et al., 2002, <doi:10.1093/genetics/162.4.2025>), although alternative user-defined functionality could be employed. The package includes a spatially explicit demographic population model simulation engine, which incorporates default functionality for density dependence, correlated environmental stochasticity, stage-based transitions, and distance-based dispersal. The user may customize the simulator by defining functionality for translocations, harvesting, mortality, and other processes, as well as defining the sequence order for the simulator processes. The framework could also be adapted for use with other model simulators by utilizing its extendable (inheritable) base classes.

Maintained by July Pilowsky. Last updated 19 days ago.

biogeography population-model process-based

13.1 match 10 stars 8.05 score 59 scripts 2 dependents

terrytangyuan

lfda:Local Fisher Discriminant Analysis

Functions for performing and visualizing Local Fisher Discriminant Analysis (LFDA), Kernel Fisher Discriminant Analysis (KLFDA), and Semi-supervised Local Fisher Discriminant Analysis (SELF).

Maintained by Yuan Tang. Last updated 2 years ago.

dimensionality-reduction distance-metric-learning machine-learning metric-learning statistics

16.0 match 76 stars 6.50 score 74 scripts 3 dependents

craig-parylo

cvdprevent:Wrapper for the 'CVD Prevent' Application Programming Interface

Provides an R wrapper to the 'CVD Prevent' application programming interface (API). Users can make API requests through built-in R functions. The Cardiovascular Disease Prevention Audit (CVDPREVENT) is an England-wide primary care audit that automatically extracts routinely held GP health data. <https://bmchealthdocs.atlassian.net/wiki/spaces/CP/pages/317882369/CVDPREVENT+API+Documentation>.

Maintained by Craig Parylo. Last updated 1 month ago.

20.1 match 3 stars 5.02 score 4 scripts

wadpac

GGIR:Raw Accelerometer Data Analysis

A tool to process and analyse data collected with wearable raw acceleration sensors as described in Migueles and colleagues (JMPB 2019), and van Hees and colleagues (JApplPhysiol 2014; PLoSONE 2015). The package has been developed and tested for binary data from 'GENEActiv' <https://activinsights.com/>, binary (.gt3x) and .csv-export data from 'Actigraph' <https://theactigraph.com> devices, and binary (.cwa) and .csv-export data from 'Axivity' <https://axivity.com>. These devices are currently widely used in research on human daily physical activity. Further, the package can handle accelerometer data files from any other sensor brand, provided that the data are stored in csv format. The package also allows for external function embedding.

Maintained by Vincent T van Hees. Last updated 1 day ago.

accelerometer activity-recognition circadian-rhythm movement-sensor sleep

7.6 match 109 stars 13.20 score 342 scripts 3 dependents

bupaverse

edeaR:Exploratory and Descriptive Event-Based Data Analysis

Exploratory and descriptive analysis of event based data. Provides methods for describing and selecting process data, and for preparing event log data for process mining. Builds on the S3-class for event logs implemented in the package 'bupaR'.

Maintained by Gert Janssenswillen. Last updated 3 months ago.

11.0 match 12 stars 9.17 score 149 scripts 8 dependents

tguillerme

dispRity:Measuring Disparity

A modular package for measuring disparity (multidimensional space occupancy). Disparity can be calculated from any matrix defining a multidimensional space. The package provides a set of implemented metrics to measure properties of the space and allows users to provide and test their own metrics. The package also provides functions for looking at disparity in a serial way (e.g. disparity through time) or per groups as well as visualising the results. Finally, this package provides several statistical tests for disparity analysis.

Maintained by Thomas Guillerme. Last updated 1 day ago.

disparity ecology multidimensionality palaeobiology

11.5 match 26 stars 8.69 score 220 scripts 1 dependents

billpetti

baseballr:Acquiring and Analyzing Baseball Data

Provides numerous utilities for acquiring and analyzing baseball data from online sources such as 'Baseball Reference' <https://www.baseball-reference.com/>, 'FanGraphs' <https://www.fangraphs.com/>, and the 'MLB Stats' API <https://www.mlb.com/>.

Maintained by Saiem Gilani. Last updated 4 months ago.

baseball pitchfx sabermetrics statcast

11.2 match 380 stars 8.98 score 582 scripts

mlverse

luz:Higher Level 'API' for 'torch'

A high level interface for 'torch' providing utilities to reduce the amount of code needed for common tasks, abstract away torch details and make the same code work on both the 'CPU' and 'GPU'. It's flexible enough to support expressing a large range of models. It's heavily inspired by 'fastai' by Howard et al. (2020) <arXiv:2002.04688>, 'Keras' by Chollet et al. (2015) and 'PyTorch Lightning' by Falcon et al. (2019) <doi:10.5281/zenodo.3828935>.

Maintained by Daniel Falbel. Last updated 6 months ago.

10.1 match 89 stars 9.88 score 318 scripts 4 dependents

andrewljackson

SIBER:Stable Isotope Bayesian Ellipses in R

Fits bi-variate ellipses to stable isotope data using Bayesian inference with the aim being to describe and compare their isotopic niche.

Maintained by Andrew Jackson. Last updated 10 months ago.

community-ecology ecology niche-modelling stable-isotopes jags cpp

10.9 match 36 stars 9.13 score 187 scripts 1 dependents

mlampros

nmslibR:Non Metric Space (Approximate) Library

A Non-Metric Space Library ('NMSLIB' <https://github.com/nmslib/nmslib>) wrapper, which according to the authors "is an efficient cross-platform similarity search library and a toolkit for evaluation of similarity search methods. The goal of the 'NMSLIB' <https://github.com/nmslib/nmslib> Library is to create an effective and comprehensive toolkit for searching in generic non-metric spaces. Being comprehensive is important, because no single method is likely to be sufficient in all cases. Also note that exact solutions are hardly efficient in high dimensions and/or non-metric spaces. Hence, the main focus is on approximate methods". The wrapper also includes Approximate Kernel k-Nearest-Neighbor functions based on the 'NMSLIB' <https://github.com/nmslib/nmslib> 'Python' Library.

Maintained by Lampros Mouselimis. Last updated 2 years ago.

approximate-nearest-neighbor-search nmslib non-metric python reticulate cpp openmp

19.4 match 12 stars 5.14 score 23 scripts

robwschlegel

heatwaveR:Detect Heatwaves and Cold-Spells

Provides the different methods for defining, detecting, and categorising the extreme events known as heatwaves or cold-spells, as first proposed in Hobday et al. (2016) <doi:10.1016/j.pocean.2015.12.014> and Hobday et al. (2018) <https://www.jstor.org/stable/26542662>. The functions in this package work on both air and water temperature data. These detection algorithms may be used on non-temperature data as well.

Maintained by Robert W. Schlegel. Last updated 2 months ago.

cpp

10.6 match 46 stars 9.36 score 343 scripts

easystats

easystats:Framework for Easy Statistical Modeling, Visualization, and Reporting

A meta-package that installs and loads a set of packages from the 'easystats' ecosystem in a single step. This collection of packages provides a unifying and consistent framework for statistical modeling, visualization, and reporting. Additionally, it provides articles targeted at instructors for teaching 'easystats', and a dashboard targeted at new R users for easily conducting statistical analysis by accessing summary results, model fit indices, and visualizations with minimal programming.

Maintained by Daniel Lüdecke. Last updated 11 days ago.

dataanalytics datascience easystats hacktoberfest models performance-metrics regression-models statistics

7.5 match 1.1k stars 13.01 score 1.8k scripts 1 dependents

jmadinlab

habtools:Tools and Metrics for 3D Surfaces and Objects

A collection of functions for sampling and simulating 3D surfaces and objects and estimating metrics like rugosity, fractal dimension, convexity, sphericity, circularity, second moments of area and volume, and more.

Maintained by Nina Schiettekatte. Last updated 10 days ago.

15.9 match 12 stars 6.10 score 9 scripts

satijalab

Seurat:Tools for Single Cell Genomics

A toolkit for quality control, analysis, and exploration of single cell RNA sequencing data. 'Seurat' aims to enable users to identify and interpret sources of heterogeneity from single cell transcriptomic measurements, and to integrate diverse types of single cell data. See Satija R, Farrell J, Gennert D, et al (2015) <doi:10.1038/nbt.3192>, Macosko E, Basu A, Satija R, et al (2015) <doi:10.1016/j.cell.2015.05.002>, Stuart T, Butler A, et al (2019) <doi:10.1016/j.cell.2019.05.031>, and Hao, Hao, et al (2020) <doi:10.1101/2020.10.12.335331> for more details.

Maintained by Paul Hoffman. Last updated 1 year ago.

human-cell-atlas single-cell-genomics single-cell-rna-seq cpp

5.7 match 2.4k stars 16.86 score 50k scripts 73 dependents

ms609

TreeDist:Calculate and Map Distances Between Phylogenetic Trees

Implements measures of tree similarity, including information-based generalized Robinson-Foulds distances (Phylogenetic Information Distance, Clustering Information Distance, Matching Split Information Distance; Smith 2020) <doi:10.1093/bioinformatics/btaa614>; Jaccard-Robinson-Foulds distances (Bocker et al. 2013) <doi:10.1007/978-3-642-40453-5_13>, including the Nye et al. (2006) metric <doi:10.1093/bioinformatics/bti720>; the Matching Split Distance (Bogdanowicz & Giaro 2012) <doi:10.1109/TCBB.2011.48>; Maximum Agreement Subtree distances; the Kendall-Colijn (2016) distance <doi:10.1093/molbev/msw124>, and the Nearest Neighbour Interchange (NNI) distance, approximated per Li et al. (1996) <doi:10.1007/3-540-61332-3_168>. Includes tools for visualizing mappings of tree space (Smith 2022) <doi:10.1093/sysbio/syab100>, for identifying islands of trees (Silva and Wilkinson 2021) <doi:10.1093/sysbio/syab015>, for calculating the median of sets of trees, and for computing the information content of trees and splits.

Maintained by Martin R. Smith. Last updated 1 month ago.

phylogenetics tree-distance phylogenetic-trees tree-distances trees cpp

9.2 match 32 stars 10.32 score 97 scripts 5 dependents

atheriel

openmetrics:A 'Prometheus' Client for R Using the 'OpenMetrics' Format

Provides a client for the open-source monitoring and alerting toolkit, 'Prometheus', that emits metrics in the 'OpenMetrics' format. Allows users to automatically instrument 'Plumber' and 'Shiny' applications, collect standard process metrics, as well as define custom counter, gauge, and histogram metrics of their own.

Maintained by Aaron Jacobs. Last updated 4 years ago.

metrics openmetrics plumber prometheus prometheus-client shiny

22.1 match 35 stars 4.24 score 5 scripts

braverock

PerformanceAnalytics:Econometric Tools for Performance and Risk Analysis

Collection of econometric functions for performance and risk analysis. In addition to standard risk and performance metrics, this package aims to aid practitioners and researchers in utilizing the latest research in analysis of non-normal return streams. In general, it is most tested on return (rather than price) data on a regular scale, but most functions will work with irregular return data as well, and increasing numbers of functions will work with P&L or price data where possible.

Maintained by Brian G. Peterson. Last updated 3 months ago.

5.7 match 222 stars 15.93 score 4.8k scripts 20 dependents

mikeblazanin

gcplyr:Wrangle and Analyze Growth Curve Data

Easy wrangling and model-free analysis of microbial growth curve data, as commonly output by plate readers. Tools for reshaping common plate reader outputs into 'tidy' formats and merging them with design information, making data easy to work with using 'gcplyr' and other packages. Also streamlines common growth curve processing steps, like smoothing and calculating derivatives, and facilitates model-free characterization and analysis of growth data. See methods at <https://mikeblazanin.github.io/gcplyr/>.

Maintained by Mike Blazanin. Last updated 2 months ago.

dplyr ggplot2 tidyverse

11.3 match 30 stars 7.90 score 75 scripts

bioxgeo

geodiv:Methods for Calculating Gradient Surface Metrics

Methods for calculating gradient surface metrics for continuous analysis of landscape features.

Maintained by Annie C. Smith. Last updated 1 year ago.

cpp

14.8 match 11 stars 5.88 score 23 scripts 1 dependents

fabrice-rossi

mixvlmc:Variable Length Markov Chains with Covariates

Estimates Variable Length Markov Chains (VLMC) models and VLMC with covariates models from discrete sequences. Supports model selection via information criteria and simulation of new sequences from an estimated model. See Bühlmann, P. and Wyner, A. J. (1999) <doi:10.1214/aos/1018031204> for VLMC and Zanin Zambom, A., Kim, S. and Lopes Garcia, N. (2022) <doi:10.1111/jtsa.12615> for VLMC with covariates.

Maintained by Fabrice Rossi. Last updated 10 months ago.

machine-learning markov-chain markov-model statistics time-series cpp

13.9 match 2 stars 6.23 score 20 scripts

pharmar

riskmetric:Risk Metrics to Evaluating R Packages

Facilities for assessing R packages against a number of metrics to help quantify their robustness.

Maintained by Eli Miller. Last updated 8 days ago.

9.4 match 167 stars 8.89 score 43 scripts

hneth

riskyr:Rendering Risk Literacy more Transparent

Risk-related information (like the prevalence of conditions, the sensitivity and specificity of diagnostic tests, or the effectiveness of interventions or treatments) can be expressed in terms of frequencies or probabilities. By providing a toolbox of corresponding metrics and representations, 'riskyr' computes, translates, and visualizes risk-related information in a variety of ways. Adopting multiple complementary perspectives provides insights into the interplay between key parameters and renders teaching and training programs on risk literacy more transparent.

Maintained by Hansjoerg Neth. Last updated 10 months ago.

2x2-matrix bayesian-inference contingency-table representation risk risk-literacy visualization

11.3 match 19 stars 7.36 score 80 scripts

ropensci

repometrics:Metrics for Your Code Repository

Metrics for your code repository. Call one function to generate an interactive dashboard displaying the state of your code.

Maintained by Mark Padgham. Last updated 3 days ago.

dashboard software-metrics

18.3 match 2 stars 4.53 score

bioc

scuttle:Single-Cell RNA-Seq Analysis Utilities

Provides basic utility functions for performing single-cell analyses, focusing on simple normalization, quality control and data transformations. Also provides some helper functions to assist development of other packages.

Maintained by Aaron Lun. Last updated 5 months ago.

immunooncology singlecell rnaseq qualitycontrol preprocessing normalization transcriptomics geneexpression sequencing software dataimport openblas cpp

7.9 match 10.21 score 1.7k scripts 80 dependents

nashjc

optimx:Expanded Replacement and Extension of the 'optim' Function

Provides a replacement and extension of the optim() function to call several function minimization codes in R in a single statement. These methods handle smooth, possibly box constrained functions of several or many parameters. Note that function 'optimr()' was prepared to simplify the incorporation of minimization codes going forward. Also implements some utility codes and some extra solvers, including safeguarded Newton methods. Many methods previously separate are now included here. This is the version for CRAN.

Maintained by John C Nash. Last updated 2 months ago.

6.3 match 2 stars 12.87 score 1.8k scripts 89 dependents

rstudio

shinyloadtest:Load Test Shiny Applications

Assesses the number of concurrent users 'shiny' applications are capable of supporting, and helps direct application changes in order to support a higher number of users. Provides facilities for recording 'shiny' application sessions, playing recorded sessions against a target server at load, and analyzing the resulting metrics.

Maintained by Barret Schloerke. Last updated 7 months ago.

11.0 match 112 stars 7.14 score 61 scripts

mlysy

nicheROVER:Niche Region and Niche Overlap Metrics for Multidimensional Ecological Niches

Implementation of a probabilistic method to calculate 'nicheROVER' (_niche_ _r_egion and niche _over_lap) metrics using multidimensional niche indicator data (e.g., stable isotopes, environmental variables, etc.). The niche region is defined as the joint probability density function of the multidimensional niche indicators at a user-defined probability alpha (e.g., 95%). Uncertainty is accounted for in a Bayesian framework, and the method can be extended to three or more indicator dimensions. It provides directional estimates of niche overlap, accounts for species-specific distributions in multivariate niche space, and produces unique and consistent bivariate projections of the multivariate niche region. The article by Swanson et al. (2015) <doi:10.1890/14-0235.1> provides a detailed description of the methodology. See the package vignette for a worked example using fish stable isotope data.

Maintained by Martin Lysy. Last updated 1 year ago.

11.4 match 9 stars 6.80 score 47 scripts 1 dependents

bioc

spatialFDA:A Tool for Spatial Multi-sample Comparisons

spatialFDA is a package to calculate spatial statistics metrics. The package takes a SpatialExperiment object and calculates spatial statistics metrics using the package spatstat. Then it compares the resulting functions across samples/conditions using functional additive models as implemented in the package refund. Furthermore, it provides exploratory visualisations using functional principal component analysis, also implemented in refund.

Maintained by Martin Emons. Last updated 24 days ago.

software spatial transcriptomics

15.4 match 2 stars 5.00 score 6 scripts

mlampros

KernelKnn:Kernel k Nearest Neighbors

Extends the simple k-nearest neighbors algorithm by incorporating numerous kernel functions and a variety of distance metrics. The package takes advantage of 'RcppArmadillo' to speed up the calculation of distances between observations.

Maintained by Lampros Mouselimis. Last updated 2 years ago.

cpp11 distance-metric kernel-methods knn rcpparmadillo openblas cpp openmp

8.0 match 17 stars 9.16 score 54 scripts 13 dependents

rbarkerclarke

gtexture:Generalized Application of Co-Occurrence Matrices and Haralick Texture

Generalizes application of gray-level co-occurrence matrix (GLCM) metrics to objects outside of images. The current focus is to apply GLCM metrics to the study of biological networks and fitness landscapes that are used in studying evolutionary medicine and biology, particularly the evolution of cancer resistance. The package was used in our publication, Barker-Clarke et al. (2023) <doi:10.1088/1361-6560/ace305>. A general reference to learn more about mathematical oncology can be found at Rockne et al. (2019) <doi:10.1088/1478-3975/ab1a09>.

Maintained by Rowan Barker-Clarke. Last updated 12 months ago.

24.4 match 3.00 score 1 script

vjoshy

ATQ:Alert Time Quality - Evaluating Timely Epidemic Metrics

Provides tools for evaluating timely epidemic detection models within school absenteeism-based surveillance systems. Introduces the concept of alert time quality as an evaluation metric. Includes functions to simulate populations, epidemics, and alert metrics associated with epidemic spread using population census data. The methods are based on research published in Vanderkruk et al. (2023) <doi:10.1186/s12889-023-15747-z> and Ward et al. (2019) <doi:10.1186/s12889-019-7521-7>.

Maintained by Vinay Joshy. Last updated 7 months ago.

16.2 match 1 star 4.48 score 6 scripts

ropensci

nlrx:Setup, Run and Analyze 'NetLogo' Model Simulations from 'R' via 'XML'

Setup, run and analyze 'NetLogo' (<https://ccl.northwestern.edu/netlogo/>) model simulations in 'R'. 'nlrx' experiments use a structure similar to 'NetLogo's Behavior Space experiments. However, 'nlrx' offers more flexibility and additional tools for running and analyzing complex simulation designs and sensitivity analyses. The user defines all information that is needed in an intuitive framework, using class objects. Experiments are submitted from 'R' to 'NetLogo' via 'XML' files that are dynamically written, based on specifications defined by the user. By nesting model calls in future environments, large simulation designs with many runs can be executed in parallel. This also enables simulating 'NetLogo' experiments on remote high performance computing machines. In order to use this package, 'Java' and 'NetLogo' (>= 5.3.1) need to be available on the executing system.

Maintained by Sebastian Hanss. Last updated 6 months ago.

agent-based-modeling individual-based-modelling netlogo peer-reviewed

8.0 match 78 stars 8.86 score 195 scripts

evolecolgroup

tidysdm:Species Distribution Models with Tidymodels

Fit species distribution models (SDMs) using the 'tidymodels' framework, which provides a standardised interface to define models and process their outputs. 'tidysdm' expands 'tidymodels' by providing methods for spatial objects, models and metrics specific to SDMs, as well as a number of specialised functions to process occurrences for contemporary and palaeo datasets. The full functionalities of the package are described in Leonardi et al. (2023) <doi:10.1101/2023.07.24.550358>.

Maintained by Andrea Manica. Last updated 9 days ago.

species-distribution-modelling tidymodels

8.0 match 31 stars 8.82 score 51 scripts

climateanalytics

foreSIGHT:Systems Insights from Generation of Hydroclimatic Timeseries

A tool to create hydroclimate scenarios, stress test systems and visualize system performance in scenario-neutral climate change impact assessments. Scenario-neutral approaches 'stress-test' the performance of a modelled system by applying a wide range of plausible hydroclimate conditions (see Brown & Wilby (2012) <doi:10.1029/2012EO410001> and Prudhomme et al. (2010) <doi:10.1016/j.jhydrol.2010.06.043>). These approaches allow the identification of hydroclimatic variables that affect the vulnerability of a system to hydroclimate variation and change. This tool enables the generation of perturbed time series using a range of approaches including simple scaling of observed time series (e.g. Culley et al. (2016) <doi:10.1002/2015WR018253>) and stochastic simulation of perturbed time series via an inverse approach (see Guo et al. (2018) <doi:10.1016/j.jhydrol.2016.03.025>). It incorporates 'Richardson-type' weather generator model configurations documented in Richardson (1981) <doi:10.1029/WR017i001p00182>, Richardson and Wright (1984), as well as latent variable type model configurations documented in Bennett et al. (2018) <doi:10.1016/j.jhydrol.2016.12.043>, Rasmussen (2013) <doi:10.1002/wrcr.20164>, Bennett et al. (2019) <doi:10.5194/hess-23-4783-2019> to generate hydroclimate variables on a daily basis (e.g. precipitation, temperature, potential evapotranspiration) and allows a variety of different hydroclimate variable properties, herein called attributes, to be perturbed. Options are included for the easy integration of existing system models both internally in R and externally for seamless 'stress-testing'. A suite of visualization options for the results of a scenario-neutral analysis (e.g. plotting performance spaces and overlaying climate projection information) are also included. Version 1.0 of this package is described in Bennett et al. (2021) <doi:10.1016/j.envsoft.2021.104999>. As further developments in scenario-neutral approaches occur the tool will be updated to incorporate these advances.

Maintained by David McInerney. Last updated 1 years ago.

cpp

19.3 match 1 stars 3.60 score 20 scripts

riazakhan94

ROCit:Performance Assessment of Binary Classifier with Visualization

Sensitivity (or recall or true positive rate), false positive rate, specificity, precision (or positive predictive value), negative predictive value, misclassification rate, accuracy, and F-score are popular metrics for assessing the performance of a binary classifier at a given threshold. The receiver operating characteristic (ROC) curve is a common tool for assessing the overall diagnostic ability of a binary classifier. Rather than depending on a single threshold, the area under the ROC curve (AUC) is a summary statistic of how well a binary classifier performs overall on the classification task. The ROCit package provides the flexibility to easily evaluate threshold-bound metrics. The ROC curve, along with the AUC, can be obtained using different methods, such as empirical, binormal and non-parametric. ROCit encompasses a wide variety of methods for constructing confidence intervals for the ROC curve and AUC. ROCit also offers the option of constructing an empirical gains table, which is a handy tool for direct marketing. The package offers commonly used visualizations, such as the ROC curve, KS plot and lift plot. Along with the built-in default graphics settings, there is room for manual tweaking by providing the necessary values as function arguments. ROCit is a powerful tool offering a range of features, yet it is very easy to use.

Maintained by Md Riaz Ahmed Khan. Last updated 3 years ago.

9.0 match 7.66 score 332 scripts 6 dependents

myles-lewis

nestedcv:Nested Cross-Validation with 'glmnet' and 'caret'

Implements nested k*l-fold cross-validation for lasso and elastic-net regularised linear models via the 'glmnet' package and other machine learning models via the 'caret' package <doi:10.1093/bioadv/vbad048>. Cross-validation of 'glmnet' alpha mixing parameter and embedded fast filter functions for feature selection are provided. Described as double cross-validation by Stone (1977) <doi:10.1111/j.2517-6161.1977.tb01603.x>. Also implemented is a method using outer CV to measure unbiased model performance metrics when fitting Bayesian linear and logistic regression shrinkage models using the horseshoe prior over parameters to encourage a sparse model as described by Piironen & Vehtari (2017) <doi:10.1214/17-EJS1337SI>.

Maintained by Myles Lewis. Last updated 5 days ago.

8.7 match 12 stars 7.92 score 46 scripts

bioc

GRmetrics:Calculate growth-rate inhibition (GR) metrics

Functions for calculating and visualizing growth-rate inhibition (GR) metrics.

Maintained by Nicholas Clark. Last updated 5 months ago.

immunooncology cellbasedassays cellbiology software timecourse visualization

14.2 match 1 stars 4.83 score 17 scripts

diegommcc

SpatialDDLS:Deconvolution of Spatial Transcriptomics Data Based on Neural Networks

Deconvolution of spatial transcriptomics data based on neural networks and single-cell RNA-seq data. SpatialDDLS implements a workflow to create neural network models able to make accurate estimates of cell composition of spots from spatial transcriptomics data using deep learning and the meaningful information provided by single-cell RNA-seq data. See Torroja and Sanchez-Cabo (2019) <doi:10.3389/fgene.2019.00978> and Mañanes et al. (2024) <doi:10.1093/bioinformatics/btae072> to get an overview of the method and see some examples of its performance.

Maintained by Diego Mañanes. Last updated 5 months ago.

deconvolution deep-learning neural-network spatial-transcriptomics

13.6 match 5 stars 5.00 score 1 scripts

martinspindler

hdm:High-Dimensional Metrics

Implementation of selected high-dimensional statistical and econometric methods for estimation and inference. Efficient estimators and uniformly valid confidence intervals are provided for various low-dimensional causal/structural parameters that appear in high-dimensional approximately sparse models. It includes functions for fitting heteroscedastic robust Lasso regressions with non-Gaussian errors and for instrumental variable (IV) and treatment effect estimation in a high-dimensional setting. Moreover, the methods enable valid post-selection inference and rely on a theoretically grounded, data-driven choice of the penalty. Chernozhukov, Hansen, Spindler (2016) <arXiv:1603.01700>.

Maintained by Martin Spindler. Last updated 4 years ago.

8.3 match 14 stars 8.17 score 564 scripts 4 dependents

vegandevs

vegan:Community Ecology Package

Ordination methods, diversity analysis and other functions for community and vegetation ecologists.

Maintained by Jari Oksanen. Last updated 16 days ago.

ecological-modelling ecology ordination fortran openblas

3.5 match 472 stars 19.41 score 15k scripts 440 dependents

pgomba

MDPIexploreR:Web Scraping and Bibliometric Analysis of MDPI Journals

Provides comprehensive tools to scrape and analyze data from the MDPI journals. It allows users to extract metrics such as submission-to-acceptance times, article types, and whether articles are part of special issues. The package can also visualize this information through plots. Additionally, 'MDPIexploreR' offers tools to explore patterns of self-citations within articles and provides insights into guest-edited special issues.

Maintained by Pablo Gómez Barreiro. Last updated 4 months ago.

analysis data-analysis data-visualization mdpi metrics scientific-journals visualization web-scraping

10.5 match 20 stars 6.20 score 9 scripts

ltorgo

performanceEstimation:An Infra-Structure for Performance Estimation of Predictive Models

An infra-structure for estimating the predictive performance of predictive models. In this context, it can also be used to compare and/or select among different alternative ways of solving one or more predictive tasks. The main goal of the package is to provide a generic infra-structure to estimate the values of different metrics of predictive performance using different estimation procedures. These estimation tasks can be applied to any solutions (workflows) to the predictive tasks. The package provides easy-to-use standard workflows that allow the usage of any available R modeling algorithm together with some pre-defined data pre-processing steps and also prediction post-processing methods. It also provides means for addressing issues related to the statistical significance of the observed differences.

Maintained by Luis Torgo. Last updated 8 years ago.

10.9 match 16 stars 5.97 score 195 scripts 1 dependents

hiweller

colordistance:Distance Metrics for Image Color Similarity

Loads and displays images, selectively masks specified background colors, bins pixels by color using either data-dependent or automatically generated color bins, quantitatively measures color similarity among images using one of several distance metrics for comparing pixel color clusters, and clusters images by object color similarity. Uses CIELAB, RGB, or HSV color spaces. Originally written for use with organism coloration (reef fish color diversity, butterfly mimicry, etc), but easily applicable for any image set.

Maintained by Hannah Weller. Last updated 1 years ago.

8.2 match 37 stars 7.93 score 76 scripts 2 dependents

facebook

prophet:Automatic Forecasting Procedure

Implements a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.

Maintained by Sean Taylor. Last updated 5 months ago.

forecasting python cpp

4.1 match 19k stars 15.53 score 976 scripts 13 dependents

cran

FuzzySTs:Fuzzy Statistical Tools

The main goal of this package is to present various fuzzy statistical tools. It intends to provide an implementation of the theoretical and empirical approaches presented in the book entitled "The signed distance measure in fuzzy statistical analysis. Some theoretical, empirical and programming advances" <doi:10.1007/978-3-030-76916-1>. For the theoretical approaches, see Berkachy R. and Donze L. (2019) <doi:10.1007/978-3-030-03368-2_1>. For the empirical approaches, see Berkachy R. and Donze L. (2016) <ISBN:978-989-758-201-1>. Important (non-exhaustive) implementation highlights of this package are as follows: (1) a numerical procedure to estimate the fuzzy difference and the fuzzy square. (2) two numerical methods of fuzzification. (3) a function performing different possibilities of distances, including the signed distance and the generalized signed distance for instance with all its properties. (4) numerical estimations of fuzzy statistical measures such as the variance, the moment, etc. (5) two methods of estimation of the bootstrap distribution of the likelihood ratio in the fuzzy context. (6) an estimation of a fuzzy confidence interval by the likelihood ratio method. (7) testing fuzzy hypotheses and/or fuzzy data by fuzzy confidence intervals in the Kwakernaak - Kruse and Meyer sense. (8) a general method to estimate the fuzzy p-value with fuzzy hypotheses and/or fuzzy data. (9) a method of estimation of global and individual evaluations of linguistic questionnaires. (10) numerical estimations of multi-ways analysis of variance models in the fuzzy context. The unbalance in the considered designs is also foreseen.

Maintained by Redina Berkachy. Last updated 8 months ago.

18.5 match 3.40 score

business-science

modeltime:The Tidymodels Extension for Time Series Modeling

The time series forecasting framework for use with the 'tidymodels' ecosystem. Models include ARIMA, Exponential Smoothing, and additional time series models from the 'forecast' and 'prophet' packages. Refer to "Forecasting Principles & Practice, Second edition" (<https://otexts.com/fpp2/>). Refer to "Prophet: forecasting at scale" (<https://research.facebook.com/blog/2017/02/prophet-forecasting-at-scale/>).

Maintained by Matt Dancho. Last updated 5 months ago.

arima data-science deep-learning ets forecasting machine-learning machine-learning-algorithms modeltime prophet tbats tidymodeling tidymodels time time-series time-series-analysis timeseries timeseries-forecasting

5.9 match 549 stars 10.57 score 1.1k scripts 7 dependents

bioc

cfDNAPro:cfDNAPro extracts and Visualises biological features from whole genome sequencing data of cell-free DNA

cfDNA fragments carry important features for building cancer sample classification ML models, such as fragment size, and fragment end motif etc. Analyzing and visualizing fragment size metrics, as well as other biological features in a curated, standardized, scalable, well-documented, and reproducible way might be time intensive. This package intends to resolve these problems and simplify the process. It offers two sets of functions for cfDNA feature characterization and visualization.

Maintained by Haichao Wang. Last updated 5 months ago.

visualization sequencing wholegenome bioinformatics cancer-genomics cancer-research cell-free-dna early-detection genomics-visualization liquid-biopsy swgs whole-genome-sequencing

10.3 match 28 stars 6.04 score 13 scripts

business-science

modeltime.resample:Resampling Tools for Time Series Forecasting

A 'modeltime' extension that implements forecast resampling tools that assess time-based model performance and stability for a single time series, panel data, and cross-sectional time series analysis.

Maintained by Matt Dancho. Last updated 1 years ago.

accuracy-metrics backtesting bootstrap bootstrapping cross-validation forecasting modeltime modeltime-resample resampling statistics tidymodels time-series

9.3 match 19 stars 6.64 score 38 scripts 1 dependents

emf-creaf

indicspecies:Relationship Between Species and Groups of Sites

Functions to assess the strength and statistical significance of the relationship between species occurrence/abundance and groups of sites [De Caceres & Legendre (2009) <doi:10.1890/08-1823.1>]. Also includes functions to measure species niche breadth using resource categories [De Caceres et al. (2011) <doi:10.1111/J.1600-0706.2011.19679.x>].

Maintained by Miquel De Cáceres. Last updated 23 days ago.

6.5 match 10 stars 9.49 score 386 scripts 4 dependents

ewenharrison

finalfit:Quickly Create Elegant Regression Results Tables and Plots when Modelling

Generate regression results tables and plots in final format for publication. Explore models and export directly to PDF and 'Word' using 'RMarkdown'.

Maintained by Ewen Harrison. Last updated 7 months ago.

5.3 match 270 stars 11.43 score 1.0k scripts

yanyachen

MLmetrics:Machine Learning Evaluation Metrics

A collection of evaluation metrics, including loss, score and utility functions, that measure regression, classification and ranking performance.

Maintained by Yachen Yan. Last updated 11 months ago.

5.5 match 69 stars 11.09 score 2.2k scripts 20 dependents

bioc

tidytof:Analyze High-dimensional Cytometry Data Using Tidy Data Principles

This package implements an interactive, scientific analysis pipeline for high-dimensional cytometry data built using tidy data principles. It is specifically designed to play well with both the tidyverse and Bioconductor software ecosystems, with functionality for reading/writing data files, data cleaning, preprocessing, clustering, visualization, modeling, and other quality-of-life functions. tidytof implements a "grammar" of high-dimensional cytometry data analysis.

Maintained by Timothy Keyes. Last updated 5 months ago.

singlecell flowcytometry bioinformatics cytometry data-science single-cell tidyverse cpp

8.3 match 19 stars 7.26 score 35 scripts

fmestre1

lconnect:Simple Tools to Compute Landscape Connectivity Metrics

Provides functions to upload vectorial data and derive landscape connectivity metrics in habitat or matrix systems. Additionally, includes an approach to assess individual patch contribution to the overall landscape connectivity, enabling the prioritization of habitat patches. The computation of landscape connectivity and patch importance are very useful in Landscape Ecology research. The metrics available are: number of components, number of links, size of the largest component, mean size of components, class coincidence probability, landscape coincidence probability, characteristic path length, expected cluster size, area-weighted flux and integral index of connectivity. Pascual-Hortal, L., and Saura, S. (2006) <doi:10.1007/s10980-006-0013-z> Urban, D., and Keitt, T. (2001) <doi:10.2307/2679983> Laita, A., Kotiaho, J., Monkkonen, M. (2011) <doi:10.1007/s10980-011-9620-4>.

Maintained by Frederico Mestre. Last updated 1 years ago.

connectivity habitat-connectivity landscape metrics cpp

15.7 match 6 stars 3.78 score 3 scripts

bioc

singleCellTK:Comprehensive and Interactive Analysis of Single Cell RNA-Seq Data

The Single Cell Toolkit (SCTK) in the singleCellTK package provides an interface to popular tools for importing, quality control, analysis, and visualization of single cell RNA-seq data. SCTK allows users to seamlessly integrate tools from various packages at different stages of the analysis workflow. A general "a la carte" workflow gives users access to multiple methods for data importing, calculation of general QC metrics, doublet detection, ambient RNA estimation and removal, filtering, normalization, batch correction or integration, dimensionality reduction, 2-D embedding, clustering, marker detection, differential expression, cell type labeling, pathway analysis, and data exporting. Curated workflows can be used to run Seurat and Celda. Streamlined quality control can be performed on the command line using the SCTK-QC pipeline. Users can analyze their data using commands in the R console or by using an interactive Shiny Graphical User Interface (GUI). Specific analyses or entire workflows can be summarized and shared with comprehensive HTML reports generated by Rmarkdown. Additional documentation and vignettes can be found at camplab.net/sctk.

Maintained by Joshua David Campbell. Last updated 23 days ago.

singlecell geneexpression differentialexpression alignment clustering immunooncology batcheffect normalization qualitycontrol dataimport gui

5.8 match 181 stars 10.16 score 252 scripts

bioc

EpiCompare:Comparison, Benchmarking & QC of Epigenomic Datasets

EpiCompare is used to compare and analyse epigenetic datasets for quality control and benchmarking purposes. The package outputs an HTML report consisting of three sections: (1. General metrics) metrics on peaks (percentage of blacklisted and non-standard peaks, and peak widths) and fragments (duplication rate) of samples; (2. Peak overlap) percentage and statistical significance of overlapping and non-overlapping peaks, including an upset plot; and (3. Functional annotation) functional annotation (ChromHMM, ChIPseeker and enrichment analysis) of peaks, including peak enrichment around TSS.

Maintained by Hiranyamaya Dash. Last updated 29 days ago.

epigenetics genetics qualitycontrol chipseq multiplecomparison functionalgenomics atacseq dnaseseq benchmark benchmarking bioconductor bioconductor-package comparison html interactive-reporting

7.8 match 14 stars 7.54 score 46 scripts

laperez

Clustering:Techniques for Evaluating Clustering

The design of this package allows us to run different clustering packages and compare the results between them, to determine which algorithm performs best on the data provided. See Martos, L.A.P., García-Vico, Á.M., González, P. et al. (2023) <doi:10.1007/s13748-022-00294-2> "Clustering: an R library to facilitate the analysis and comparison of cluster algorithms.", Martos, L.A.P., García-Vico, Á.M., González, P. et al. "A Multiclustering Evolutionary Hyperrectangle-Based Algorithm" <doi:10.1007/s44196-023-00341-3> and Martos, L.A.P., García-Vico, Á.M., González, P. et al. "An Evolutionary Fuzzy System for Multiclustering in Data Streaming" <doi:10.1016/j.procs.2023.12.058>.

Maintained by Luis Alfonso Perez Martos. Last updated 11 months ago.

14.3 match 5 stars 4.04 score 7 scripts

gabferreira

phyloraster:Evolutionary Diversity Metrics for Raster Data

Phylogenetic Diversity (PD, Faith 1992), Evolutionary Distinctiveness (ED, Isaac et al. 2007), Phylogenetic Endemism (PE, Rosauer et al. 2009; Laffan et al. 2016), and Weighted Endemism (WE, Laffan et al. 2016) for presence-absence raster. Faith, D. P. (1992) <doi:10.1016/0006-3207(92)91201-3> Isaac, N. J. et al. (2007) <doi:10.1371/journal.pone.0000296> Laffan, S. W. et al. (2016) <doi:10.1111/2041-210X.12513> Rosauer, D. et al. (2009) <doi:10.1111/j.1365-294X.2009.04311.x>.

Maintained by Gabriela Alves-Ferreira. Last updated 15 days ago.

10.1 match 7 stars 5.66 score 33 scripts

jlmelville

rnndescent:Nearest Neighbor Descent Method for Approximate Nearest Neighbors

The Nearest Neighbor Descent method for finding approximate nearest neighbors by Dong and co-workers (2010) <doi:10.1145/1963405.1963487>. Based on the 'Python' package 'PyNNDescent' <https://github.com/lmcinnes/pynndescent>.

Maintained by James Melville. Last updated 8 months ago.

approximate-nearest-neighbor-search cpp

7.8 match 11 stars 7.31 score 75 scripts

sapfluxnet

sapfluxnetr:Working with 'Sapfluxnet' Project Data

Access, modify, aggregate and plot data from the 'Sapfluxnet' project (<http://sapfluxnet.creaf.cat>), the first global database of sap flow measurements.

Maintained by Victor Granda. Last updated 2 years ago.

8.7 match 25 stars 6.57 score 49 scripts

herveabdi

DistatisR:DiSTATIS Three Way Metric Multidimensional Scaling

Implement DiSTATIS and CovSTATIS (three-way multidimensional scaling). DiSTATIS and CovSTATIS are used to analyze multiple distance/covariance matrices collected on the same set of observations. These methods are based on Abdi, H., Williams, L.J., Valentin, D., & Bennani-Dosse, M. (2012) <doi:10.1002/wics.198>.

Maintained by Herve Abdi. Last updated 1 years ago.

3-way-mds distatis metric-multidimensional-scaling

12.9 match 4 stars 4.42 score 44 scripts

nimble-dev

compareMCMCs:Compare MCMC Efficiency from 'nimble' and/or Other MCMC Engines

Manages comparison of MCMC performance metrics from multiple MCMC algorithms. These may come from different MCMC configurations using the 'nimble' package or from other packages. Plug-ins for JAGS via 'rjags' and Stan via 'rstan' are provided. It is possible to write plug-ins for other packages. Performance metrics are held in an MCMCresult class along with samples and timing data. It is easy to apply new performance metrics. Reports are generated as html pages with figures comparing sets of runs. It is possible to configure the html pages, including providing new figure components.

Maintained by Perry de Valpine. Last updated 5 months ago.

12.1 match 1 stars 4.71 score 17 scripts

cran

datarobot:'DataRobot' Predictive Modeling API

For working with the 'DataRobot' predictive modeling platform's API <https://www.datarobot.com/>.

Maintained by AJ Alon. Last updated 1 years ago.

16.4 match 2 stars 3.48 score

flr

FLCore:Core Package of FLR, Fisheries Modelling in R

Core classes and methods for FLR, a framework for fisheries modelling and management strategy simulation in R. Developed by a team of fisheries scientists in various countries. More information can be found at <http://flr-project.org/>.

Maintained by Iago Mosqueira. Last updated 9 days ago.

fisheries flr fisheries-modelling

6.5 match 16 stars 8.78 score 956 scripts 23 dependents

rstudio

vetiver:Version, Share, Deploy, and Monitor Models

The goal of 'vetiver' is to provide fluent tooling to version, share, deploy, and monitor a trained model. Functions handle both recording and checking the model's input data prototype, and predicting from a remote API endpoint. The 'vetiver' package is extensible, with generics that can support many kinds of models.

Maintained by Julia Silge. Last updated 5 months ago.

5.4 match 185 stars 10.48 score 466 scripts 1 dependents

thijsjanzen

treestats:Phylogenetic Tree Statistics

Collection of phylogenetic tree statistics, collected throughout the literature. All functions have been written to maximize computation speed. The package includes umbrella functions to calculate all statistics, all balance associated statistics, or all branching time related statistics. Furthermore, the 'treestats' package supports summary statistic calculations on Ltables, provides speed-improved coding of branching times, Ltable conversion and includes algorithms to create intermediately balanced trees. Full description can be found in Janzen (2024) <doi:10.1016/j.ympev.2024.108168>.

Maintained by Thijs Janzen. Last updated 6 months ago.

cpp

10.4 match 16 stars 5.43 score 16 scripts 1 dependents

bioc

struct:Statistics in R Using Class-based Templates

Defines and includes a set of class-based templates for developing and implementing data processing and analysis workflows, with a strong emphasis on statistics and machine learning. The templates can be used and where needed extended to 'wrap' tools and methods from other packages into a common standardised structure to allow for effective and fast integration. Model objects can be combined into sequences, and sequences nested in iterators using overloaded operators to simplify and improve readability of the code. Ontology lookup has been integrated and implemented to provide standardised definitions for methods, inputs and outputs wrapped using the class-based templates.

Maintained by Gavin Rhys Lloyd. Last updated 5 months ago.

workflowstep

9.3 match 6.04 score 76 scripts 3 dependents

bioc

musicatk:Mutational Signature Comprehensive Analysis Toolkit

Mutational signatures are carcinogenic exposures or aberrant cellular processes that can cause alterations to the genome. We created musicatk (MUtational SIgnature Comprehensive Analysis ToolKit) to address shortcomings in versatility and ease of use in other pre-existing computational tools. Although many different types of mutational data have been generated, current software packages do not have a flexible framework to allow users to mix and match different types of mutations in the mutational signature inference process. Musicatk enables users to count and combine multiple mutation types, including SBS, DBS, and indels. Musicatk calculates replication strand, transcription strand and combinations of these features along with discovery from unique and proprietary genomic feature associated with any mutation type. Musicatk also implements several methods for discovery of new signatures as well as methods to infer exposure given an existing set of signatures. Musicatk provides functions for visualization and downstream exploratory analysis including the ability to compare signatures between cohorts and find matching signatures in COSMIC V2 or COSMIC V3.

Maintained by Joshua D. Campbell. Last updated 5 months ago.

software biologicalquestion somaticmutation variantannotation

7.8 match 13 stars 7.02 score 20 scripts

psavary3

graph4lg:Build Graphs for Landscape Genetics Analysis

Build graphs for landscape genetics analysis. This set of functions can be used to import and convert spatial and genetic data initially in different formats, import landscape graphs created with 'GRAPHAB' software (Foltete et al., 2012) <doi:10.1016/j.envsoft.2012.07.002>, make diagnosis plots of isolation by distance relationships in order to choose how to build genetic graphs, create graphs with a large range of pruning methods, weight their links with several genetic distances, plot and analyse graphs, compare them with other graphs. It uses functions from other packages such as 'adegenet' (Jombart, 2008) <doi:10.1093/bioinformatics/btn129> and 'igraph' (Csardi et Nepusz, 2006) <https://igraph.org/>. It also implements methods commonly used in landscape genetics to create graphs, described by Dyer et Nason (2004) <doi:10.1111/j.1365-294X.2004.02177.x> and Greenbaum et Fefferman (2017) <doi:10.1111/mec.14059>, and to analyse distance data (van Strien et al., 2015) <doi:10.1038/hdy.2014.62>.

Maintained by Paul Savary. Last updated 2 years ago.

12.0 match 3 stars 4.51 score 54 scripts

egenn

rtemis:Machine Learning and Visualization

Advanced Machine Learning and Visualization. Unsupervised Learning (Clustering, Decomposition), Supervised Learning (Classification, Regression), Cross-Decomposition, Bagging, Boosting, Meta-models. Static and interactive graphics.

Maintained by E.D. Gennatas. Last updated 1 months ago.

data-science data-visualization machine-learning machine-learning-library visualization

7.6 match 145 stars 7.09 score 50 scripts 2 dependents

alexwaterboybezzina

CalcThemAll.PRM:Calculate Pesticide Risk Metric (PRM) Values from Multiple Pesticides...Calc Them All

Contains functions which can be used to calculate Pesticide Risk Metric values in aquatic environments from concentrations of multiple pesticides with known species sensitivity distributions (SSDs). The pesticides provided by this package have all been validated; however, if users have their own pesticides with SSD values, they can append them to the pesticide_info table to include them in estimates.

Maintained by Alexander Bezzina. Last updated 11 months ago.

11.2 match 2 stars 4.78 score

fsavje

distances:Tools for Distance Metrics

Provides tools for constructing, manipulating and using distance metrics.

Maintained by Fredrik Savje. Last updated 1 years ago.

cpp

7.7 match 17 stars 6.92 score 117 scripts 12 dependents

thomaschln

opticskxi:OPTICS K-Xi Density-Based Clustering

Density-based clustering methods are well adapted to the clustering of high-dimensional data and enable the discovery of core groups of various shapes despite large amounts of noise. This package provides a novel density-based cluster extraction method, OPTICS k-Xi, and a framework to compare k-Xi models using distance-based metrics to investigate datasets with an unknown number of clusters. The vignette first introduces density-based algorithms with simulated datasets, then presents and evaluates the k-Xi cluster extraction method. Finally, the model comparison framework is described and tested on 2 genetic datasets to identify groups and their discriminating features. The k-Xi algorithm is a novel OPTICS cluster extraction method that specifies the number of clusters directly and, unlike the OPTICS Xi method, does not require fine-tuning of the steepness parameter. Combined with a framework that compares models with varying parameters, the OPTICS k-Xi method can identify groups in noisy datasets with an unknown number of clusters. Results on summarized genetic data of 1,200 patients are in Charlon T. (2019) <doi:10.13097/archive-ouverte/unige:161795>. A short video tutorial can be found at <https://www.youtube.com/watch?v=P2XAjqI5Lc4/>.

Maintained by Thomas Charlon. Last updated 6 days ago.

10.8 match 4.90 score 1 scripts

jlmelville

mize:Unconstrained Numerical Optimization Algorithms

Optimization algorithms implemented in R, including conjugate gradient (CG), Broyden-Fletcher-Goldfarb-Shanno (BFGS) and the limited memory BFGS (L-BFGS) methods. Most internal parameters can be set through the call interface. The solvers hold up quite well for higher-dimensional problems.

Maintained by James Melville. Last updated 2 months ago.

conjugate-gradient l-bfgs numerical-optimization

7.6 match 10 stars 6.95 score 25 scripts 6 dependents

davidbolin

rSPDE:Rational Approximations of Fractional Stochastic Partial Differential Equations

Functions that compute rational approximations of fractional elliptic stochastic partial differential equations. The package also contains functions for common statistical usage of these approximations. The main references for rSPDE are Bolin, Simas and Xiong (2023) <doi:10.1080/10618600.2023.2231051> for the covariance-based method and Bolin and Kirchner (2020) <doi:10.1080/10618600.2019.1665537> for the operator-based rational approximation. These can be generated by the citation function in R.

Maintained by David Bolin. Last updated 8 days ago.

6.9 match 11 stars 7.57 score 188 scripts 3 dependents

atkinsjeff

forestr:Ecosystem and Canopy Structural Complexity Metrics from LiDAR

Provides a toolkit for calculating forest and canopy structural complexity metrics from terrestrial LiDAR (light detection and ranging). References: Atkins et al. 2018 <doi:10.1111/2041-210X.13061>; Hardiman et al. 2013 <doi:10.3390/f4030537>; Parker et al. 2004 <doi:10.1111/j.0021-8901.2004.00925.x>.

Maintained by Jeff Atkins. Last updated 1 years ago.

10.4 match 29 stars 4.95 score 31 scripts

andrewdhawan

sigQC:Quality Control Metrics for Gene Signatures

Provides gene signature quality control metrics in publication ready plots. Namely, enables the visualization of properties such as expression, variability, correlation, and comparison of methods of standardisation and scoring metrics.

Maintained by Andrew Dhawan. Last updated 7 months ago.

10.6 match 4 stars 4.89 score 13 scripts

cdowd

twosamples:Fast Permutation Based Two Sample Tests

Fast randomization based two sample tests. Testing the hypothesis that two samples come from the same distribution using randomization to create p-values. Included tests are: Kolmogorov-Smirnov, Kuiper, Cramer-von Mises, Anderson-Darling, Wasserstein, and DTS. The default test (two_sample) is based on the DTS test statistic, as it is the most powerful, and thus most useful to most users. The DTS test statistic builds on the Wasserstein distance by using a weighting scheme like that of Anderson-Darling. See the companion paper at <arXiv:2007.01360> or <https://codowd.com/public/DTS.pdf> for details of that test statistic, and non-standard uses of the package (parallel for big N, weighted observations, one sample tests, etc). We also include the permutation scheme to make test building simple for others.

Maintained by Connor Dowd. Last updated 2 years ago.

distance-metric ecdf cpp

7.5 match 17 stars 6.88 score 62 scripts 8 dependents
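
The randomization idea in the description above can be sketched in a few lines. This is a minimal illustration in plain Python, not the 'twosamples' API: the function names `ks_statistic` and `permutation_p_value` are hypothetical, and the Kolmogorov-Smirnov statistic stands in for the package's default DTS statistic, which is not reproduced here.

```python
import random


def ks_statistic(a, b):
    """Largest absolute gap between the two empirical CDFs."""
    points = sorted(set(a) | set(b))
    return max(
        abs(sum(x <= p for x in a) / len(a) - sum(x <= p for x in b) / len(b))
        for p in points
    )


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Share of label shufflings whose statistic is at least the observed one."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Re-split the shuffled pool into groups of the original sizes.
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Swapping in another statistic (Kuiper, Cramer-von Mises, and so on) only changes `ks_statistic`; the shuffling loop stays the same.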

tidymodels

tune:Tidy Tuning Tools

The ability to tune models is important. 'tune' contains functions and classes to be used in conjunction with other 'tidymodels' packages for finding reasonable values of hyper-parameters in models, pre-processing methods, and post-processing steps.

Maintained by Max Kuhn. Last updated 11 days ago.

3.6 match 293 stars 14.27 score 756 scripts 39 dependents

ropensci

spatsoc:Group Animal Relocation Data by Spatial and Temporal Relationship

Detects spatial and temporal groups in GPS relocations (Robitaille et al. (2019) <doi:10.1111/2041-210X.13215>). It can be used to convert GPS relocations to gambit-of-the-group format to build proximity-based social networks. In addition, the randomizations function provides data-stream randomization methods suitable for GPS data.

Maintained by Alec L. Robitaille. Last updated 1 months ago.

animal gps network social spatial

5.1 match 24 stars 9.97 score 145 scripts 3 dependents

cran

ncappc:NCA Calculations and Population Model Diagnosis

A flexible tool that can perform (i) traditional non-compartmental analysis (NCA) and (ii) Simulation-based posterior predictive checks for population pharmacokinetic (PK) and/or pharmacodynamic (PKPD) models using NCA metrics.

Maintained by Andrew C. Hooker. Last updated 7 years ago.

18.8 match 2.70 score

bioc

MicrobiotaProcess:A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework

MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces the MPSE class, which makes it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under a unified and common framework (tidy-like framework).

Maintained by Shuangbin Xu. Last updated 5 months ago.

visualization microbiome software multiplecomparison featureextraction microbiome-analysis microbiome-data

5.2 match 183 stars 9.70 score 126 scripts 1 dependents

r-lib

systemfonts:System Native Font Finding

Provides system native access to the font catalogue. As font handling varies between systems it is difficult to correctly locate installed fonts across different operating systems. The 'systemfonts' package provides bindings to the native libraries on Windows, macOS and Linux for finding font files that can then be used further by e.g. graphic devices. The main use is intended to be from compiled code but 'systemfonts' also provides access from R.

Maintained by Thomas Lin Pedersen. Last updated 2 months ago.

fonts fontconfig freetype cpp

3.3 match 95 stars 15.62 score 384 scripts 990 dependents

kozodoi

fairness:Algorithmic Fairness Metrics

Offers calculation, visualization and comparison of algorithmic fairness metrics. Fair machine learning is an emerging topic with the overarching aim to critically assess whether ML algorithms reinforce existing social biases. Unfair algorithms can propagate such biases and produce predictions with a disparate impact on various sensitive groups of individuals (defined by sex, gender, ethnicity, religion, income, socioeconomic status, physical or mental disabilities). Fair algorithms possess the underlying foundation that these groups should be treated similarly or have similar prediction outcomes. The fairness R package offers the calculation and comparisons of commonly and less commonly used fairness metrics in population subgroups. These methods are described by Calders and Verwer (2010) <doi:10.1007/s10618-010-0190-x>, Chouldechova (2017) <doi:10.1089/big.2016.0047>, Feldman et al. (2015) <doi:10.1145/2783258.2783311>, Friedler et al. (2018) <doi:10.1145/3287560.3287589> and Zafar et al. (2017) <doi:10.1145/3038912.3052660>. The package also offers convenient visualizations to help understand fairness metrics.

Maintained by Nikita Kozodoi. Last updated 2 years ago.

algorithmic-discrimination algorithmic-fairness discrimination disparate-impact fairness fairness-ai fairness-ml machine-learning

7.4 match 32 stars 6.82 score 69 scripts 1 dependents

bioc

maftools:Summarize, Analyze and Visualize MAF Files

Analyze and visualize Mutation Annotation Format (MAF) files from large scale sequencing studies. This package provides various functions to perform most commonly used analyses in cancer genomics and to create feature rich customizable visualizations with minimal effort.

Maintained by Anand Mayakonda. Last updated 5 months ago.

datarepresentation dnaseq visualization drivermutation variantannotation featureextraction classification somaticmutation sequencing functionalgenomics survival bioinformatics cancer-genome-atlas cancer-genomics genomics maf-files tcga curl bzip2 xz-utils zlib

3.5 match 459 stars 14.63 score 948 scripts 18 dependents

dicook

nullabor:Tools for Graphical Inference

Tools for visual inference. Generate null data sets and null plots using permutation and simulation. Calculate distance metrics for a lineup, and examine the distributions of metrics.

Maintained by Di Cook. Last updated 1 months ago.

4.9 match 57 stars 10.38 score 370 scripts 2 dependents

kapsner

mlexperiments:Machine Learning Experiments

Provides 'R6' objects to perform parallelized hyperparameter optimization and cross-validation. Hyperparameter optimization can be performed with Bayesian optimization (via 'ParBayesianOptimization' <https://cran.r-project.org/package=ParBayesianOptimization>) and grid search. The optimized hyperparameters can be validated using k-fold cross-validation. Alternatively, hyperparameter optimization and validation can be performed with nested cross-validation. While 'mlexperiments' focuses on core wrappers for machine learning experiments, additional learner algorithms can be supplemented by inheriting from the provided learner base class.

Maintained by Lorenz A. Kapsner. Last updated 11 days ago.

cross-validation experiment hyperparameter-optimization hyperparameter-tuning machine-learning nested

6.6 match 5 stars 7.64 score 49 scripts 2 dependents

tbep-tech

tbeptools:Data and Indicators for the Tampa Bay Estuary Program

Several functions are provided for working with Tampa Bay Estuary Program data and indicators, including the water quality report card, tidal creek assessments, Tampa Bay Nekton Index, Tampa Bay Benthic Index, seagrass transect data, habitat report card, and fecal indicator bacteria. Additional functions are provided for miscellaneous tasks, such as reference library curation.

Maintained by Marcus Beck. Last updated 8 days ago.

data-analysis tampa-bay tbep water-quality

6.4 match 10 stars 7.86 score 133 scripts

idblr

ndi:Neighborhood Deprivation Indices

Computes various geospatial indices of socioeconomic deprivation and disparity in the United States. Some indices are considered "spatial" because they consider the values of neighboring (i.e., adjacent) census geographies in their computation, while other indices are "aspatial" because they only consider the value within each census geography. Two types of aspatial neighborhood deprivation indices (NDI) are available: (1) based on Messer et al. (2006) <doi:10.1007/s11524-006-9094-x> and (2) based on Andrews et al. (2020) <doi:10.1080/17445647.2020.1750066> and Slotman et al. (2022) <doi:10.1016/j.dib.2022.108002> who use variables chosen by Roux and Mair (2010) <doi:10.1111/j.1749-6632.2009.05333.x>. Both are a decomposition of multiple demographic characteristics from the U.S. Census Bureau American Community Survey 5-year estimates (ACS-5; 2006-2010 onward). Using data from the ACS-5 (2005-2009 onward), the package can also compute indices of racial or ethnic residential segregation, including but not limited to those discussed in Massey & Denton (1988) <doi:10.1093/sf/67.2.281>, and additional indices of socioeconomic disparity.

Maintained by Ian D. Buller. Last updated 7 months ago.

census census-api census-data deprivation deprivation-stats disparity geospatial geospatial-data metric-development principal-component-analysis segregation-measures socio-economic-indicators

7.5 match 21 stars 6.67 score 7 scripts 1 dependents

bioc

pipeComp:pipeComp pipeline benchmarking framework

A simple framework to facilitate the comparison of pipelines involving various steps and parameters. The `pipelineDefinition` class represents pipelines as, minimally, a set of functions consecutively executed on the output of the previous one, and optionally accompanied by step-wise evaluation and aggregation functions. Given such an object, a set of alternative parameters/methods, and benchmark datasets, the `runPipeline` function then proceeds through all combinations of arguments, avoiding recomputing the same step twice and compiling evaluations on the fly to avoid storing potentially large intermediate data.

Maintained by Pierre-Luc Germain. Last updated 5 months ago.

geneexpression transcriptomics clustering datarepresentation benchmark bioconductor pipeline-benchmarking pipelines single-cell-rna-seq

7.1 match 41 stars 7.02 score 43 scripts

david-cortes

recometrics:Evaluation Metrics for Implicit-Feedback Recommender Systems

Calculates evaluation metrics for implicit-feedback recommender systems that are based on low-rank matrix factorization models, given the fitted model matrices and data, thus allowing comparison of models from a variety of libraries. Metrics include P@K (precision-at-k, for top-K recommendations), R@K (recall at k), AP@K (average precision at k), NDCG@K (normalized discounted cumulative gain at k), Hit@K (from which the 'Hit Rate' is calculated), RR@K (reciprocal rank at k, from which the 'MRR' or 'mean reciprocal rank' is calculated), ROC-AUC (area under the receiver-operating characteristic curve), and PR-AUC (area under the precision-recall curve). These are calculated on a per-user basis according to the ranking of items induced by the model, using efficient multi-threaded routines. Also provides functions for creating train-test splits for model fitting and evaluation.

Maintained by David Cortes. Last updated 2 months ago.

implicit-feedback matrix-factorization recommender-systems openblas cpp openmp

9.1 match 28 stars 5.45 score
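
Several of the per-user ranking metrics named in the description above are simple enough to spell out. The following is a minimal sketch in plain Python, not the 'recometrics' API: all function names are hypothetical, and dividing P@K by k (rather than by the number of items actually recommended) is an assumed convention that may differ from the package's.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    return sum(item in relevant for item in ranked[:k]) / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(item in relevant for item in ranked[:k]) / len(relevant)


def hit_at_k(ranked, relevant, k):
    """1 if any relevant item is in the top k, else 0."""
    return int(any(item in relevant for item in ranked[:k]))


def reciprocal_rank_at_k(ranked, relevant, k):
    """1 / position of the first relevant item in the top k, or 0.0 if none."""
    for position, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


# One user's ranking (best first) and that user's held-out relevant items.
ranked = ["a", "b", "c", "d"]
relevant = {"b", "d", "z"}
```

Averaging `hit_at_k` over users gives a hit rate, and averaging `reciprocal_rank_at_k` over users gives a mean reciprocal rank.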

bioc

beadarray:Quality assessment and low-level analysis for Illumina BeadArray data

The package is able to read bead-level data (raw TIFFs and text files) output by BeadScan as well as bead-summary data from BeadStudio. Methods for quality assessment and low-level analysis are provided.

Maintained by Mark Dunning. Last updated 5 months ago.

microarray onechannel qualitycontrol preprocessing

6.3 match 7.88 score 70 scripts 4 dependents

neptune-ai

neptune:MLOps Metadata Store - Experiment Tracking and Model Registry for Production Teams

An interface to Neptune. A metadata store for MLOps, built for teams that run a lot of experiments. It gives you a single place to log, store, display, organize, compare, and query all your model-building metadata. Neptune is used for: • Experiment tracking: Log, display, organize, and compare ML experiments in a single place. • Model registry: Version, store, manage, and query trained models, and model building metadata. • Monitoring ML runs live: Record and monitor model training, evaluation, or production runs live. For more information see <https://neptune.ai/>.

Maintained by Rafal Jankowski. Last updated 2 years ago.

compare language log management metadata metrics mlops models monitoring organize parameters store tracker visualization

10.0 match 14 stars 4.89 score 16 scripts

mandymejia

ciftiTools:Tools for Reading, Writing, Viewing and Manipulating CIFTI Files

CIFTI files contain brain imaging data in "grayordinates," which represent the gray matter as cortical surface vertices (left and right) and subcortical voxels (cerebellum, basal ganglia, and other deep gray matter). 'ciftiTools' provides a unified environment for reading, writing, visualizing and manipulating CIFTI-format data. It supports the "dscalar," "dlabel," and "dtseries" intents. Grayordinate data is read in as a "xifti" object, which is structured for convenient access to the data and metadata, and includes support for surface geometry files to enable spatially-dependent functionality such as static or interactive visualizations and smoothing.

Maintained by Amanda Mejia. Last updated 1 months ago.

5.4 match 47 stars 8.90 score 176 scripts 4 dependents

emeyers

NeuroDecodeR:Decode Information from Neural Activity

Neural decoding is a method of analyzing neural data that uses pattern classifiers to predict experimental conditions based on neural activity. 'NeuroDecodeR' is a system of objects that makes it easy to run neural decoding analyses. For more information on neural decoding see Meyers & Kreiman (2011) <doi:10.7551/mitpress/8404.003.0024>.

Maintained by Ethan Meyers. Last updated 1 years ago.

7.4 match 12 stars 6.49 score 17 scripts

bioc

scPipe:Pipeline for single cell multi-omic data pre-processing

A preprocessing pipeline for single cell RNA-seq/ATAC-seq data that starts from the fastq files and produces a feature count matrix with associated quality control information. It can process fastq data generated by CEL-seq, MARS-seq, Drop-seq, Chromium 10x and SMART-seq protocols.

Maintained by Shian Su. Last updated 3 months ago.

immunooncology software sequencing rnaseq geneexpression singlecell visualization sequencematching preprocessing qualitycontrol genomeannotation dataimport curl bzip2 xz-utils zlib cpp

5.3 match 68 stars 9.02 score 84 scripts

tbep-tech

wqtrends:Assess Water Quality Trends with Generalized Additive Models

Assess Water Quality Trends for Long-Term Monitoring Data in Estuaries using Generalized Additive Models following Wood (2017) <doi:10.1201/9781315370279> and Error Propagation with Mixed-Effects Meta-Analysis following Sera et al. (2019) <doi:10.1002/sim.8362>. Methods are available for model fitting, assessment of fit, annual and seasonal trend tests, and visualization of results.

Maintained by Marcus Beck. Last updated 5 days ago.

reporting san-francisco-bay time-series-analysis water-quality

8.8 match 10 stars 5.38 score 24 scripts

dvrbts

labdsv:Ordination and Multivariate Analysis for Ecology

A variety of ordination and community analyses useful in analysis of data sets in community ecology. Includes many of the common ordination methods, with graphical routines to facilitate their interpretation, as well as several novel analyses.

Maintained by David W. Roberts. Last updated 2 years ago.

fortran

7.8 match 3 stars 6.08 score 452 scripts 13 dependents

spsanderson

healthyR.ai:The Machine Learning and AI Modeling Companion to 'healthyR'

Hospital machine learning and AI data analysis workflow tools, modeling, and automations. This library provides many useful tools to review common administrative hospital data. Some of these include predicting length of stay and readmits. The aim is to provide a simple and consistent verb framework that takes the guesswork out of everything.

Maintained by Steven Sanderson. Last updated 2 months ago.

ai artificial-intelligence healthcareanalytics healthyr healthyverse machine-learning

6.4 match 16 stars 7.37 score 36 scripts 1 dependents

green-striped-gecko

dartR.base:Analysing 'SNP' and 'Silicodart' Data - Basic Functions

Facilitates the import and analysis of 'SNP' (single nucleotide 'polymorphism') and 'silicodart' (presence/absence) data. The main focus is on data generated by 'DarT' (Diversity Arrays Technology), however, data from other sequencing platforms can be used once 'SNP' or related fragment presence/absence data from any source is imported. Genetic datasets are stored in a derived 'genlight' format (package 'adegenet'), that allows for a very compact storage of data and metadata. Functions are available for importing and exporting of 'SNP' and 'silicodart' data, for reporting on and filtering on various criteria (e.g. 'callrate', 'heterozygosity', 'reproducibility', maximum allele frequency). Additional functions are available for visualization (e.g. Principal Coordinates Analysis) and creating a spatial representation using maps. 'dartR.base' is the 'base' package of the 'dartRverse' suite of packages. To install the other packages, we recommend installing the 'dartRverse' package, which supports the installation of all packages in the 'dartRverse'. If you want to cite 'dartR', you can find the information by typing citation('dartR.base') in the console.

Maintained by Bernd Gruber. Last updated 12 days ago.

12.2 match 3.84 score 17 scripts 5 dependents

alarm-redist

redistmetrics:Redistricting Metrics

Reliable and flexible tools for scoring redistricting plans using common measures and metrics. These functions provide key direct access to tools useful for non-simulation analyses of redistricting plans, such as for measuring compactness or partisan fairness. Tools are designed to work with the 'redist' package seamlessly.

Maintained by Christopher T. Kenny. Last updated 9 months ago.

openblas cpp

6.1 match 10 stars 7.57 score 23 scripts 2 dependents

bioc

miQC:Flexible, probabilistic metrics for quality control of scRNA-seq data

Single-cell RNA-sequencing (scRNA-seq) has made it possible to profile gene expression in tissues at high resolution. An important preprocessing step prior to performing downstream analyses is to identify and remove cells with poor or degraded sample quality using quality control (QC) metrics. Two widely used QC metrics to identify a ‘low-quality’ cell are (i) if the cell includes a high proportion of reads that map to mitochondrial DNA encoded genes (mtDNA) and (ii) if a small number of genes are detected. miQC is a data-driven QC metric that jointly models both the proportion of reads mapping to mtDNA and the number of detected genes with mixture models in a probabilistic framework to predict the low-quality cells in a given dataset.

Maintained by Ariel Hippen. Last updated 5 months ago.

singlecell qualitycontrol geneexpression preprocessing sequencing

7.1 match 19 stars 6.39 score 65 scripts

sonsoleslp

tna:Transition Network Analysis (TNA)

Provides tools for performing Transition Network Analysis (TNA) to study relational dynamics, including functions for building and plotting TNA models, calculating centrality measures, and identifying dominant events and patterns. TNA statistical techniques (e.g., bootstrapping and permutation tests) ensure the reliability of observed insights and confirm that identified dynamics are meaningful. See (Saqr et al., 2025) <doi:10.1145/3706468.3706513> for more details on TNA.

Maintained by Sonsoles López-Pernas. Last updated 2 days ago.

educational-data-mining learning-analytics markov-model temporal-analysis

7.0 match 4 stars 6.48 score 5 scripts

cran

hydropeak:Detect and Characterize Sub-Daily Flow Fluctuations

An important environmental impact on running water ecosystems is caused by hydropeaking - the discontinuous release of turbine water because of peaks of energy demand. An event-based algorithm is implemented to detect flow fluctuations referring to increase events (IC) and decrease events (DC). For each event, a set of parameters related to the fluctuation intensity is calculated. The framework is introduced in Greimel et al. (2016) "A method to detect and characterize sub-daily flow fluctuations" <doi:10.1002/hyp.10773> and can be used to identify different fluctuation types according to the potential source: e.g., sub-daily flow fluctuations caused by hydropeaking, rainfall, or snow and glacier melt. This is a companion to the package 'hydroroute', which is used to detect and follow hydropower plant-specific hydropeaking waves at the sub-catchment scale and to describe how hydropeaking flow parameters change along the longitudinal flow path as proposed and validated in Greimel et al. (2022).

Maintained by Bettina Grün. Last updated 2 years ago.

18.1 match 2.48 score 1 dependents

brian-j-smith

MRMCaov:Multi-Reader Multi-Case Analysis of Variance

Estimation and comparison of the performances of diagnostic tests in multi-reader multi-case studies where true case statuses (or ground truths) are known and one or more readers provide test ratings for multiple cases. Reader performance metrics are provided for area under and expected utility of ROC curves, likelihood ratio of positive or negative tests, and sensitivity and specificity. ROC curves can be estimated empirically or with binormal or binormal likelihood-ratio models. Statistical comparisons of diagnostic tests are based on the ANOVA model of Obuchowski-Rockette and the unified framework of Hillis (2005) <doi:10.1002/sim.2024>. The ANOVA can be conducted with data from a full factorial, nested, or partially paired study design; with random or fixed readers or cases; and covariances estimated with the DeLong method, jackknifing, or an unbiased method. Smith and Hillis (2020) <doi:10.1117/12.2549075>.

Maintained by Brian J Smith. Last updated 2 years ago.

8.4 match 12 stars 5.26 score 8 scripts 1 dependents

oeysan

bfw:Bayesian Framework for Computational Modeling

Derived from the work of Kruschke (2015, <ISBN:9780124058880>), the present package aims to provide a framework for conducting Bayesian analysis using Markov chain Monte Carlo (MCMC) sampling utilizing the Just Another Gibbs Sampler ('JAGS', Plummer, 2003, <https://mcmc-jags.sourceforge.io>). The initial version includes several modules for conducting Bayesian equivalents of chi-squared tests, analysis of variance (ANOVA), multiple (hierarchical) regression, softmax regression, and for fitting data (e.g., structural equation modeling).

Maintained by Øystein Olav Skaar. Last updated 3 years ago.

bayesian-data-analysis bayesian-statistics jags mcmc psychological-science cpp

7.5 match 10 stars 5.89 score 31 scripts

paws-r

paws:Amazon Web Services Software Development Kit

Interface to Amazon Web Services <https://aws.amazon.com>, including storage, database, and compute services, such as 'Simple Storage Service' ('S3'), 'DynamoDB' 'NoSQL' database, and 'Lambda' functions-as-a-service.

Maintained by Dyfan Jones. Last updated 3 days ago.

aws aws-sdk

3.9 match 332 stars 11.25 score 177 scripts 12 dependents

thibautjombart

treespace:Statistical Exploration of Landscapes of Phylogenetic Trees

Tools for the exploration of distributions of phylogenetic trees. This package includes a 'shiny' interface which can be started from R using treespaceServer(). For further details see Jombart et al. (2017) <DOI:10.1111/1755-0998.12676>.

Maintained by Michelle Kendall. Last updated 2 years ago.

cpp

5.8 match 28 stars 7.39 score 63 scripts

jefferislab

RANN:Fast Nearest Neighbour Search (Wraps ANN Library) Using L2 Metric

Finds the k nearest neighbours for every point in a given dataset in O(N log N) time using Arya and Mount's ANN library (v1.1.3). There is support for approximate as well as exact searches, fixed radius searches and 'bd' as well as 'kd' trees. The distance is computed using the L2 (Euclidean) metric. Please see package 'RANN.L1' for the same functionality using the L1 (Manhattan, taxicab) metric.

Maintained by Gregory Jefferis. Last updated 7 months ago.

ann-library nearest-neighbors nearest-neighbours cpp

3.5 match 58 stars 12.21 score 1.3k scripts 190 dependents
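
The difference between the L2 and L1 metrics mentioned in the description above can be seen with a brute-force neighbour search. This is a minimal sketch in plain Python, not the 'RANN' API and not the ANN library's tree-based algorithm: it scans every point, and the names `l2`, `l1`, and `k_nearest` are hypothetical.

```python
import math


def l2(p, q):
    """Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def l1(p, q):
    """Manhattan (taxicab) distance."""
    return sum(abs(a - b) for a, b in zip(p, q))


def k_nearest(points, query, k, metric=l2):
    """Indices of the k points closest to `query` under `metric` (full scan)."""
    order = sorted(range(len(points)), key=lambda i: metric(points[i], query))
    return order[:k]


points = [(0, 0), (3, 0), (2, 2), (10, 10)]
query = (0, 0)
nearest_l2 = k_nearest(points, query, 2, metric=l2)
nearest_l1 = k_nearest(points, query, 2, metric=l1)
```

Here the second-nearest neighbour of the origin is (2, 2) under L2 but (3, 0) under L1, which is why the choice of metric matters when picking between the two packages.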

ropensci

EDIutils:An API Client for the Environmental Data Initiative Repository

A client for the Environmental Data Initiative repository REST API. The 'EDI' data repository <https://portal.edirepository.org/nis/home.jsp> is for publication and reuse of ecological data with emphasis on metadata accuracy and completeness. It is built upon the 'PASTA+' software stack <https://pastaplus-core.readthedocs.io/en/latest/index.html#> and was developed in collaboration with the US 'LTER' Network <https://lternet.edu/>. 'EDIutils' includes functions to search and access existing data, evaluate and upload new data, and assist other data management tasks common to repository users.

Maintained by Colin Smith. Last updated 1 years ago.

ecology eml-metadata open-access open-data research-data-management research-data-repository

6.7 match 10 stars 6.47 score 117 scripts

rstudio

tfestimators:Interface to 'TensorFlow' Estimators

Interface to 'TensorFlow' Estimators <https://www.tensorflow.org/guide/estimator>, a high-level API that provides implementations of many different model types including linear models and deep neural networks.

Maintained by Tomasz Kalinowski. Last updated 3 years ago.

5.1 match 57 stars 8.42 score 170 scripts

tidymodels

probably:Tools for Post-Processing Predicted Values

Models can be improved by post-processing class probabilities, by: recalibration, conversion to hard probabilities, assessment of equivocal zones, and other activities. 'probably' contains tools for conducting these operations as well as calibration tools and conformal inference techniques for regression models.

Maintained by Max Kuhn. Last updated 5 months ago.

3.5 match 115 stars 12.09 score 21k scripts 1 dependents

eagerai

fastai:Interface to 'fastai'

The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research into deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.

Maintained by Turgut Abdullayev. Last updated 11 months ago.

audio collaborative-filtering darknet darknet-image-classification fastai medical object-detection tabular text vision

4.5 match 118 stars 9.40 score 76 scripts

momx

Momocs:Morphometrics using R

The goal of 'Momocs' is to provide a complete, convenient, reproducible and open-source toolkit for 2D morphometrics. It includes most common 2D morphometrics approaches on outlines, open outlines, configurations of landmarks, traditional morphometrics, and facilities for data preparation, manipulation and visualization with a consistent grammar throughout. It allows reproducible, complex morphometrics analyses and other morphometrics approaches should be easy to plug in, or develop from, on top of this canvas.

Maintained by Vincent Bonhomme. Last updated 1 years ago.

morphometrics

5.7 match 51 stars 7.42 score 346 scripts

andrewmarx

samc:Spatial Absorbing Markov Chains

Implements functions for working with absorbing Markov chains. The implementation is based on the framework described in "Toward a unified framework for connectivity that disentangles movement and mortality in space and time" by Fletcher et al. (2019) <doi:10.1111/ele.13333>, which applies them to spatial ecology. This framework incorporates both resistance and absorption with spatial absorbing Markov chains (SAMC) to provide several short-term and long-term predictions for metrics related to connectivity in landscapes. Despite the ecological context of the framework, this package can be used in any application of absorbing Markov chains.

Maintained by Andrew Marx. Last updated 5 months ago.

absorbing-markov-chains connectivity landscape-ecology landscape-metrics markov-chain cpp

8.0 match 12 stars 5.26 score 15 scripts

nowosad

raceland:Pattern-Based Zoneless Method for Analysis and Visualization of Racial Topography

Implements a computational framework for a pattern-based, zoneless analysis, and visualization of (ethno)racial topography (Dmowska, Stepinski, and Nowosad (2020) <doi:10.1016/j.apgeog.2020.102239>). It is a reimagined approach for analyzing residential segregation and racial diversity based on the concept of 'landscape' used in the domain of landscape ecology.

Maintained by Jakub Nowosad. Last updated 2 years ago.

information-theory landscape racial-diversity raster residential-segregation spatial cpp

8.0 match 9 stars 5.21 score 12 scripts

nanxstats

RECA:Relevant Component Analysis for Supervised Distance Metric Learning

Relevant Component Analysis (RCA) tries to find a linear transformation of the feature space such that the effect of irrelevant variability is reduced in the transformed space.

Maintained by Nan Xiao. Last updated 11 months ago.

machine-learning metric-learning

10.4 match 7 stars 4.02 score 4 scripts

tidymodels

tidyclust:A Common API to Clustering

A common interface to specifying clustering models, in the same style as 'parsnip'. Creates unified interface across different functions and computational engines.

Maintained by Emil Hvitfeldt. Last updated 2 months ago.

5.5 match 111 stars 7.45 score 139 scripts

bioc

sosta:A package for the analysis of anatomical tissue structures in spatial omics data

sosta (Spatial Omics STructure Analysis) is a package for analyzing spatial omics data to explore tissue organization at the anatomical structure level. It reconstructs morphologically relevant structures based on molecular features or cell types. It further calculates a range of structural and shape metrics to quantitatively describe tissue architecture. The package is designed to integrate with other packages for the analysis of spatial (omics) data.

Maintained by Samuel Gunz. Last updated 1 months ago.

software spatial transcriptomics visualization

8.5 match 1 stars 4.85 score 2 scripts

umr-amap

BIOMASS:Estimating Aboveground Biomass and Its Uncertainty in Tropical Forests

Contains functions to estimate aboveground biomass/carbon and its uncertainty in tropical forests. These functions allow users to (1) retrieve and correct taxonomy, (2) estimate wood density and its uncertainty, (3) construct height-diameter models, (4) manage tree and plot coordinates, (5) estimate the aboveground biomass/carbon at the stand level with associated uncertainty. To cite 'BIOMASS', please use citation("BIOMASS"). See more in the article of Réjou-Méchain et al. (2017) <doi:10.1111/2041-210X.12753>.

Maintained by Dominique Lamonica. Last updated 1 days ago.

4.1 match 26 stars 9.90 score 68 scripts 1 dependents

junruidi

ActFrag:Activity Fragmentation Metrics Extracted from Minute Level Activity Data

Recent studies have shown that, on top of total daily active/sedentary volumes, the time accumulation strategies provide more sensitive information. This package provides functions to extract commonly used fragmentation metrics to quantify such time accumulation strategies based on minute level actigraphy-measured activity counts data.

Maintained by Junrui Di. Last updated 5 years ago.

8.8 match 4.62 score 14 scripts 2 dependents

welch-lab

rliger:Linked Inference of Genomic Experimental Relationships

Uses an extension of nonnegative matrix factorization to identify shared and dataset-specific factors. See Welch J, Kozareva V, et al (2019) <doi:10.1016/j.cell.2019.05.006>, and Liu J, Gao C, Sodicoff J, et al (2020) <doi:10.1038/s41596-020-0391-8> for more details.

Maintained by Yichen Wang. Last updated 2 months ago.

nonnegative-matrix-factorization single-cell openblas cpp

3.8 match 402 stars 10.80 score 334 scripts 1 dependents

prabhleenkaur19

aniSNA:Statistical Network Analysis of Animal Social Networks

Obtain network structures from animal GPS telemetry observations and statistically analyse them to assess their adequacy for social network analysis. Methods include pre-network data permutations, bootstrapping techniques to obtain confidence intervals for global and node-level network metrics, and correlation and regression analysis of the local network metrics.

Maintained by Prabhleen Kaur. Last updated 2 months ago.

cpp

12.7 match 3.18 score

atsa-es

MARSS:Multivariate Autoregressive State-Space Modeling

The MARSS package provides maximum-likelihood parameter estimation for constrained and unconstrained linear multivariate autoregressive state-space (MARSS) models, including partially deterministic models. MARSS models are a class of dynamic linear model (DLM) and vector autoregressive (VAR) model. Fitting available via Expectation-Maximization (EM), BFGS (using optim), and 'TMB' (using the 'marssTMB' companion package). Functions are provided for parametric and innovations bootstrapping, Kalman filtering and smoothing, model selection criteria including bootstrap AICb, confidence intervals via the Hessian approximation or bootstrapping, and all conditional residual types. See the user guide for examples of dynamic factor analysis, dynamic linear models, outlier and shock detection, and multivariate AR-p models. Online workshops (lectures, eBook, and computer labs) at <https://atsa-es.github.io/>.

Maintained by Elizabeth Eli Holmes. Last updated 1 years ago.

multivariate-timeseries state-space-models statistics time-series

3.9 match 52 stars 10.34 score 596 scripts 3 dependents

bioc

ChIPQC:Quality metrics for ChIPseq data

Quality metrics for ChIPseq data.

Maintained by Tom Carroll. Last updated 5 months ago.

sequencing chipseq qualitycontrol reportwriting

7.3 match 5.45 score 140 scripts

strohne

volker:High-Level Functions for Tabulating, Charting and Reporting Survey Data

Craft polished tables and plots in Markdown reports. Simply choose whether to treat your data as counts or metrics, and the package will automatically generate well-designed default tables and plots for you. Boiled down to the basics, with labeling features and simple interactive reports. All functions are 'tidyverse' compatible.

Maintained by Jakob Jünger. Last updated 2 days ago.

5.5 match 5 stars 7.16 score 125 scripts

inceptdk

adaptr:Adaptive Trial Simulator

Package that simulates adaptive (multi-arm, multi-stage) clinical trials using adaptive stopping, adaptive arm dropping, and/or adaptive randomisation. Developed as part of the INCEPT (Intensive Care Platform Trial) project (<https://incept.dk/>), primarily supported by a grant from Sygeforsikringen "danmark" (<https://www.sygeforsikring.dk/>).

Maintained by Anders Granholm. Last updated 11 months ago.

7.3 match 13 stars 5.44 score 14 scripts

chjackson

flexsurv:Flexible Parametric Survival and Multi-State Models

Flexible parametric models for time-to-event data, including the Royston-Parmar spline model, generalized gamma and generalized F distributions. Any user-defined parametric distribution can be fitted, given at least an R function defining the probability density or hazard. There are also tools for fitting and predicting from fully parametric multi-state models, based on either cause-specific hazards or mixture models.

Maintained by Christopher Jackson. Last updated 2 months ago.

cpp

3.0 match 57 stars 13.31 score 632 scripts 43 dependents

nunompmoniz

IRon:Solving Imbalanced Regression Tasks

Imbalanced domain learning has almost exclusively focused on solving classification tasks, where the objective is to predict cases labelled with a rare class accurately. Such a well-defined approach has been lacking for regression tasks due to two main factors. First, standard regression tasks assume that each value is equally important to the user. Second, standard evaluation metrics focus on assessing the performance of the model on the most common cases. This package contains methods to tackle imbalanced domain learning problems in regression tasks, where the objective is to predict extreme (rare) values. The methods contained in this package are: 1) an automatic and non-parametric method to obtain such relevance functions; 2) visualisation tools; 3) a suite of evaluation measures for optimisation/validation processes; 4) the squared-error relevance area measure, an evaluation metric tailored for imbalanced regression tasks. More information can be found in Ribeiro and Moniz (2020) <doi:10.1007/s10994-020-05900-9>.

Maintained by Nuno Moniz. Last updated 2 years ago.

evaluation-metrics imbalance-data imbalanced-learning machine-learning regression

10.1 match 19 stars 3.86 score 38 scripts

bioc

CytoMDS:Low Dimensions projection of cytometry samples

This package implements a low dimensional visualization of a set of cytometry samples, in order to visually assess the 'distances' between them. This, in turn, can greatly help the user to identify quality issues like batch effects or outlier samples, and/or check the presence of potential sample clusters that might align with the experimental design. The CytoMDS algorithm combines, on the one hand, the concept of Earth Mover's Distance (EMD), a.k.a. Wasserstein metric and, on the other hand, the Multi Dimensional Scaling (MDS) algorithm for the low dimensional projection. Also, the package provides some diagnostic tools for checking the quality of the MDS projection, as well as tools to help with the interpretation of the axes of the projection.

Maintained by Philippe Hauchamps. Last updated 2 months ago.

flowcytometry qualitycontrol dimensionreduction multidimensionalscaling software visualization

7.3 match 1 stars 5.32 score 2 scripts

modeloriented

survex:Explainable Machine Learning in Survival Analysis

Survival analysis models are commonly used in medicine and other areas. Many of them are too complex to be interpreted by humans. Exploration and explanation are needed, but standard methods do not give a broad enough picture. 'survex' provides easy-to-apply methods for explaining survival models, both complex black-boxes and simpler statistical models. They include methods specific to survival analysis such as SurvSHAP(t) introduced in Krzyzinski et al., (2023) <doi:10.1016/j.knosys.2022.110234>, SurvLIME described in Kovalev et al., (2020) <doi:10.1016/j.knosys.2020.106164> as well as extensions of existing ones described in Biecek et al., (2021) <doi:10.1201/9780429027192>.

Maintained by Mikołaj Spytek. Last updated 9 months ago.

biostatistics brier-scores censored-data cox-model cox-regression explainable-ai explainable-machine-learning explainable-ml explanatory-model-analysis interpretable-machine-learning interpretable-ml machine-learning probabilistic-machine-learning shap survival-analysis time-to-event variable-importance xai

4.6 match 110 stars 8.40 score 114 scripts

marlonecobos

mop:Mobility Oriented-Parity Metric

A set of tools to perform multiple versions of the Mobility Oriented-Parity metric. This multivariate analysis helps to characterize levels of dissimilarity between a set of conditions of reference and another set of conditions of interest. If predictive models are transferred to conditions different from those over which models were calibrated (trained), this metric helps to identify transfer conditions that differ substantially from those of calibration. These tools are implemented following principles proposed in Owens et al. (2013) <doi:10.1016/j.ecolmodel.2013.04.011>, and expanded to obtain more detailed results that aid in interpretation.

Maintained by Marlon E. Cobos. Last updated 9 months ago.

cpp

7.4 match 7 stars 5.23 score 20 scripts 2 dependents

sprouffske

growthcurver:Simple Metrics to Summarize Growth Curves

Fits the logistic equation to microbial growth curve data (e.g., repeated absorbance measurements taken from a plate reader over time). From this fit, a variety of metrics are provided, including the maximum growth rate, the doubling time, the carrying capacity, the area under the logistic curve, and the time to the inflection point. Method described in Sprouffske and Wagner (2016) <doi:10.1186/s12859-016-1016-7>.

Maintained by Kathleen Sprouffske. Last updated 4 years ago.

5.0 match 46 stars 7.71 score 111 scripts

fcharte

mldr:Exploratory Data Analysis and Manipulation of Multi-Label Data Sets

Exploratory data analysis and manipulation functions for multi-label data sets along with an interactive Shiny application to ease their use.

Maintained by David Charte. Last updated 5 years ago.

5.4 match 23 stars 7.07 score 168 scripts 2 dependents

laresbernardo

lares:Analytics & Machine Learning Sidekick

Auxiliary package for better/faster analytics, visualization, data mining, and machine learning tasks. With a wide variety of family functions, like Machine Learning, Data Wrangling, Marketing Mix Modeling (Robyn), Exploratory, API, and Scrapper, it helps the analyst or data scientist to get quick and robust results, without the need of repetitive coding or advanced R programming skills.

Maintained by Bernardo Lares. Last updated 23 days ago.

analytics api automation automl data-science descriptive-statistics h2o machine-learning marketing mmm predictive-modeling puzzle rlanguage robyn visualization

3.9 match 233 stars 9.84 score 185 scripts 1 dependents

statnet

tsna:Tools for Temporal Social Network Analysis

Temporal SNA tools for continuous- and discrete-time longitudinal networks having vertex, edge, and attribute dynamics stored in the 'networkDynamic' format. This work was supported by grant R01HD68395 from the National Institutes of Health.

Maintained by Skye Bender-deMoll. Last updated 1 years ago.

5.0 match 7 stars 7.65 score 93 scripts 2 dependents

pboutros

OmicsQC:Nominating Quality Control Outliers in Genomic Profiling Studies

A method that analyzes quality control metrics from multi-sample genomic sequencing studies and nominates poor quality samples for exclusion. Per-sample quality control data are transformed into z-scores and aggregated. The distribution of aggregated z-scores is modelled using parametric distributions. The parameters of the optimal model, selected either by goodness-of-fit statistics or user-designation, are used for outlier nomination. Two implementations of the Cosine Similarity Outlier Detection algorithm are provided with flexible parameters for dataset customization.

Maintained by Paul C. Boutros. Last updated 1 years ago.

18.9 match 2.00 score 2 scripts

carmonalab

scGate:Marker-Based Cell Type Purification for Single-Cell Sequencing Data

A common bioinformatics task in single-cell data analysis is to purify a cell type or cell population of interest from heterogeneous datasets. 'scGate' automatizes marker-based purification of specific cell populations, without requiring training data or reference gene expression profiles. Briefly, 'scGate' takes as input: i) a gene expression matrix stored in a 'Seurat' object and ii) a “gating model” (GM), consisting of a set of marker genes that define the cell population of interest. The GM can be as simple as a single marker gene, or a combination of positive and negative markers. More complex GMs can be constructed in a hierarchical fashion, akin to gating strategies employed in flow cytometry. 'scGate' evaluates the strength of signature marker expression in each cell using the rank-based method 'UCell', and then performs k-nearest neighbor (kNN) smoothing by calculating the mean 'UCell' score across neighboring cells. kNN-smoothing aims at compensating for the large degree of sparsity in scRNA-seq data. Finally, a universal threshold over kNN-smoothed signature scores is applied in binary decision trees generated from the user-provided gating model, to annotate cells as either “pure” or “impure”, with respect to the cell population of interest. See the related publication Andreatta et al. (2022) <doi:10.1093/bioinformatics/btac141>.

Maintained by Massimo Andreatta. Last updated 1 months ago.

filtering marker-genes scgate signatures single-cell

4.5 match 106 stars 8.38 score 163 scripts

spatialnous

alcyon:Spatial Network Analysis

Interface package for 'sala', the spatial network analysis library from the 'depthmapX' software application. The R parts of the code are based on the 'rdepthmap' package. Allows for the analysis of urban and building-scale networks and provides metrics and methods usually found within the Space Syntax domain. Methods in this package are described by K. Al-Sayed, A. Turner, B. Hillier, S. Iida and A. Penn (2014) "Space Syntax methodology", and also by A. Turner (2004) <https://discovery.ucl.ac.uk/id/eprint/2651> "Depthmap 4: a researcher's handbook".

Maintained by Petros Koutsolampros. Last updated 2 months ago.

cpp openmp

5.9 match 2 stars 6.34 score 13 scripts

dwoll

DVHmetrics:Analyze Dose-Volume Histograms and Check Constraints

Functionality for analyzing dose-volume histograms (DVH) in radiation oncology: Read DVH text files, calculate DVH metrics as well as generalized equivalent uniform dose (gEUD), biologically effective dose (BED), equivalent dose in 2 Gy fractions (EQD2), normal tissue complication probability (NTCP), and tumor control probability (TCP). Show DVH diagrams, check and visualize quality assurance constraints for the DVH. Includes web-based graphical user interface.

Maintained by Daniel Wollschlaeger. Last updated 16 days ago.

6.2 match 12 stars 6.03 score

alarm-redist

redist:Simulation Methods for Legislative Redistricting

Enables researchers to sample redistricting plans from a pre-specified target distribution using Sequential Monte Carlo and Markov Chain Monte Carlo algorithms. The package allows for the implementation of various constraints in the redistricting process such as geographic compactness and population parity requirements. Tools for analysis such as computation of various summary statistics and plotting functionality are also included. The package implements the SMC algorithm of McCartan and Imai (2023) <doi:10.1214/23-AOAS1763>, the enumeration algorithm of Fifield, Imai, Kawahara, and Kenny (2020) <doi:10.1080/2330443X.2020.1791773>, the Flip MCMC algorithm of Fifield, Higgins, Imai and Tarr (2020) <doi:10.1080/10618600.2020.1739532>, the Merge-split/Recombination algorithms of Carter et al. (2019) <arXiv:1911.01503> and DeFord et al. (2021) <doi:10.1162/99608f92.eb30390f>, and the Short-burst optimization algorithm of Cannon et al. (2020) <arXiv:2011.02288>.

Maintained by Christopher T. Kenny. Last updated 2 months ago.

geospatial gerrymandering redistricting sampling openblas cpp openmp

4.0 match 68 stars 9.17 score 259 scripts

domodwyer

promr:Prometheus 'PromQL' Query Client for 'R'

A native 'R' client library for querying the 'Prometheus' time-series database, using the 'PromQL' query language.

Maintained by Dom Dwyer. Last updated 3 years ago.

metrics prometheus timeseries

10.0 match 9 stars 3.69 score 11 scripts

igraph

igraph:Network Analysis and Visualization

Routines for simple graphs and network analysis. It can handle large graphs very well and provides functions for generating random and regular graphs, graph visualization, centrality methods and much more.

Maintained by Kirill Müller. Last updated 2 days ago.

complex-networks graph-algorithms graph-theory mathematics network-analysis network-graph fortran libxml2 glpk openblas cpp

1.8 match 581 stars 21.10 score 31k scripts 1.9k dependents

phiala

ecodist:Dissimilarity-Based Functions for Ecological Analysis

Dissimilarity-based analysis functions including ordination and Mantel test functions, intended for use with spatial and community ecological data. The original package description is in Goslee and Urban (2007) <doi:10.18637/jss.v022.i07>, with further statistical detail in Goslee (2010) <doi:10.1007/s11258-009-9641-0>.

Maintained by Sarah Goslee. Last updated 1 years ago.

openblas

3.8 match 9 stars 9.84 score 566 scripts 9 dependents

ailich

GLCMTextures:GLCM Textures of Raster Layers

Calculates grey level co-occurrence matrix (GLCM) based texture measures (Hall-Beyer (2017) <https://prism.ucalgary.ca/bitstream/handle/1880/51900/texture%20tutorial%20v%203_0%20180206.pdf>; Haralick et al. (1973) <doi:10.1109/TSMC.1973.4309314>) of raster layers using a sliding rectangular window. It also includes functions to quantize a raster into grey levels as well as tabulate a GLCM and calculate GLCM texture metrics for a matrix.

Maintained by Alexander Ilich. Last updated 2 months ago.

openblas cpp openmp

5.8 match 12 stars 6.33 score 20 scripts 2 dependents