R-universe search: exports:plot

stan-dev

rstan:R Interface to Stan

User-facing R functions are provided to parse, compile, test, estimate, and analyze Stan models by accessing the header-only Stan library provided by the 'StanHeaders' package. The Stan project develops a probabilistic programming language that implements full Bayesian statistical inference via Markov Chain Monte Carlo, rough Bayesian inference via 'variational' approximation, and (optionally penalized) maximum likelihood estimation via optimization. In all three cases, automatic differentiation is used to quickly and accurately evaluate gradients without burdening the user with the need to derive the partial derivatives.

Maintained by Ben Goodrich. Last updated 4 days ago.

bayesian-data-analysis bayesian-inference bayesian-statistics mcmc stan cpp

1.1k stars 18.84 score 14k scripts 281 dependents

edzer

sp:Classes and Methods for Spatial Data

Classes and methods for spatial data; the classes document where the spatial location information resides, for 2D or 3D data. Utility functions are provided, e.g. for plotting data as maps, spatial selection, as well as methods for retrieving coordinates, for subsetting, print, summary, etc. From this version, 'rgdal', 'maptools', and 'rgeos' are no longer used at all, see <https://r-spatial.org/r/2023/05/15/evolution4.html> for details.

Maintained by Edzer Pebesma. Last updated 2 months ago.

127 stars 18.63 score 35k scripts 1.3k dependents

rspatial

terra:Spatial Data Analysis

Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).

Maintained by Robert J. Hijmans. Last updated 2 days ago.

geospatial raster spatial vector onetbb proj gdal geos cpp

559 stars 17.64 score 17k scripts 855 dependents

rspatial

raster:Geographic Data Analysis and Modeling

Reading, writing, manipulating, analyzing and modeling of spatial data. This package has been superseded by the "terra" package <https://CRAN.R-project.org/package=terra>.

Maintained by Robert J. Hijmans. Last updated 10 hours ago.

cpp

163 stars 17.23 score 58k scripts 562 dependents

r-forge

colorspace:A Toolbox for Manipulating and Assessing Colors and Palettes

Carries out mapping between assorted color spaces including RGB, HSV, HLS, CIEXYZ, CIELUV, HCL (polar CIELUV), CIELAB, and polar CIELAB. Qualitative, sequential, and diverging color palettes based on HCL colors are provided along with corresponding ggplot2 color scales. Color palette choice is aided by an interactive app (with either a Tcl/Tk or a shiny graphical user interface) and shiny apps with an HCL color picker and a color vision deficiency emulator. Plotting functions for displaying and assessing palettes include color swatches, visualizations of the HCL space, and trajectories in HCL and/or RGB spectrum. Color manipulation functions include: desaturation, lightening/darkening, mixing, and simulation of color vision deficiencies (deutanomaly, protanomaly, tritanomaly). Details can be found on the project web page at <https://colorspace.R-Forge.R-project.org/> and in the accompanying scientific paper: Zeileis et al. (2020, Journal of Statistical Software, <doi:10.18637/jss.v096.i01>).

Maintained by Achim Zeileis. Last updated 4 months ago.

16.23 score 8.2k scripts 8.1k dependents

dankelley

oce:Analysis of Oceanographic Data

Supports the analysis of Oceanographic data, including 'ADCP' measurements, measurements made with 'argo' floats, 'CTD' measurements, sectional data, sea-level time series, coastline and topographic data, etc. Provides specialized functions for calculating seawater properties such as potential temperature in either the 'UNESCO' or 'TEOS-10' equation of state. Produces graphical displays that conform to the conventions of the Oceanographic literature. This package is discussed extensively by Kelley (2018) "Oceanographic Analysis with R" <doi:10.1007/978-1-4939-8844-0>.

Maintained by Dan Kelley. Last updated 20 hours ago.

oceanography fortran cpp

146 stars 15.34 score 4.2k scripts 18 dependents

philchalmers

mirt:Multidimensional Item Response Theory

Analysis of discrete response data using unidimensional and multidimensional item analysis models under the Item Response Theory paradigm (Chalmers (2012) <doi:10.18637/jss.v048.i06>). Exploratory and confirmatory item factor analysis models are estimated with quadrature (EM) or stochastic (MHRM) methods. Confirmatory bi-factor and two-tier models are available for modeling item testlets using dimension reduction EM algorithms, while multiple group analyses and mixed effects designs are included for detecting differential item, bundle, and test functioning, and for modeling item and person covariates. Finally, latent class models such as the DINA, DINO, multidimensional latent class, mixture IRT models, and zero-inflated response models are supported, as well as a wide family of probabilistic unfolding models.

Maintained by Phil Chalmers. Last updated 1 days ago.

irt mirt openblas cpp openmp

212 stars 14.93 score 2.5k scripts 40 dependents

r-lidar

lidR:Airborne LiDAR Data Manipulation and Visualization for Forestry Applications

Airborne LiDAR (Light Detection and Ranging) interface for data manipulation and visualization. Read/write 'las' and 'laz' files, computation of metrics in area based approach, point filtering, artificial point reduction, classification from geographic data, normalization, individual tree segmentation and other manipulations.

Maintained by Jean-Romain Roussel. Last updated 2 months ago.

als forestry las laz lidar point-cloud remote-sensing openblas cpp openmp

623 stars 14.47 score 844 scripts 8 dependents

bioc

xcms:LC-MS and GC-MS Data Analysis

Framework for processing and visualization of chromatographically separated and single-spectra mass spectral data. Imports from AIA/ANDI NetCDF, mzXML, mzData and mzML files. Preprocesses data for high-throughput, untargeted analyte profiling.

Maintained by Steffen Neumann. Last updated 14 days ago.

immunooncology massspectrometry metabolomics bioconductor feature-detection mass-spectrometry peak-detection cpp

196 stars 14.31 score 984 scripts 11 dependents

ipa-tys

ROCR:Visualizing the Performance of Scoring Classifiers

ROC graphs, sensitivity/specificity curves, lift charts, and precision/recall plots are popular examples of trade-off visualizations for specific pairs of performance measures. ROCR is a flexible tool for creating cutoff-parameterized 2D performance curves by freely combining two from over 25 performance measures (new performance measures can be added using a standard interface). Curves from different cross-validation or bootstrapping runs can be averaged by different methods, and standard deviations, standard errors or box plots can be used to visualize the variability across the runs. The parameterization can be visualized by printing cutoff values at the corresponding curve positions, or by coloring the curve according to cutoff. All components of a performance plot can be quickly adjusted using a flexible parameter dispatching mechanism. Despite its flexibility, ROCR is easy to use, with only three commands and reasonable default values for all optional parameters.

Maintained by Felix G.M. Ernst. Last updated 1 years ago.

38 stars 14.29 score 9.2k scripts 217 dependents

edzer

hexbin:Hexagonal Binning Routines

Binning and plotting functions for hexagonal bins.

Maintained by Edzer Pebesma. Last updated 5 months ago.

fortran

37 stars 14.00 score 2.4k scripts 114 dependents

biomodhub

biomod2:Ensemble Platform for Species Distribution Modeling

Functions for species distribution modeling, calibration and evaluation, ensemble of models, ensemble forecasting and visualization. The package permits to run consistently up to 10 single models on a presence/absences (resp presences/pseudo-absences) dataset and to combine them in ensemble models and ensemble projections. Some bench of other evaluation and visualisation tools are also available within the package.

Maintained by Maya Guéguen. Last updated 2 days ago.

95 stars 13.85 score 536 scripts 7 dependents

knausb

vcfR:Manipulate and Visualize VCF Data

Facilitates easy manipulation of variant call format (VCF) data. Functions are provided to rapidly read from and write to VCF files. Once VCF data is read into R a parser function extracts matrices of data. This information can then be used for quality control or other purposes. Additional functions provide visualization of genomic data. Once processing is complete data may be written to a VCF file (*.vcf.gz). It also may be converted into other popular R objects (e.g., genlight, DNAbin). VcfR provides a link between VCF data and familiar R software.

Maintained by Brian J. Knaus. Last updated 1 months ago.

genomics population-genetics population-genomics rcpp vcf-data visualization zlib cpp

256 stars 13.66 score 3.1k scripts 19 dependents

r-forge

robustbase:Basic Robust Statistics

"Essential" Robust Statistics. Tools allowing to analyze data with robust methods. This includes regression methodology including model selections and multivariate statistics where we strive to cover the book "Robust Statistics, Theory and Methods" by 'Maronna, Martin and Yohai'; Wiley 2006.

Maintained by Martin Maechler. Last updated 4 months ago.

fortran openblas

13.38 score 1.7k scripts 480 dependents

bbolker

bbmle:Tools for General Maximum Likelihood Estimation

Methods and functions for fitting maximum likelihood models in R. This package modifies and extends the 'mle' classes in the 'stats4' package.

Maintained by Ben Bolker. Last updated 1 months ago.

25 stars 13.36 score 1.4k scripts 117 dependents

edzer

spacetime:Classes and Methods for Spatio-Temporal Data

Classes and methods for spatio-temporal data, including space-time regular lattices, sparse lattices, irregular data, and trajectories; utility functions for plotting data as map sequences (lattice or animation) or multiple time series; methods for spatial and temporal selection and subsetting, as well as for spatial/temporal/spatio-temporal matching or aggregation, retrieving coordinates, print, summary, etc.

Maintained by Edzer Pebesma. Last updated 2 months ago.

74 stars 13.29 score 628 scripts 72 dependents

biodiverse

unmarked:Models for Data from Unmarked Animals

Fits hierarchical models of animal abundance and occurrence to data collected using survey methods such as point counts, site occupancy sampling, distance sampling, removal sampling, and double observer sampling. Parameters governing the state and observation processes can be modeled as functions of covariates. References: Kellner et al. (2023) <doi:10.1111/2041-210X.14123>, Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.

Maintained by Ken Kellner. Last updated 9 days ago.

openblas cpp openmp

4 stars 13.02 score 652 scripts 12 dependents

csgillespie

poweRlaw:Analysis of Heavy Tailed Distributions

An implementation of maximum likelihood estimators for a variety of heavy tailed distributions, including both the discrete and continuous power law distributions. Additionally, a goodness-of-fit based approach is used to estimate the lower cut-off for the scaling region.

Maintained by Colin Gillespie. Last updated 2 months ago.

clauset powerlaw

112 stars 12.79 score 332 scripts 32 dependents

spedygiorgio

markovchain:Easy Handling Discrete Time Markov Chains

Functions and S4 methods to create and manage discrete time Markov chains more easily. In addition functions to perform statistical (fitting and drawing random variates) and probabilistic (analysis of their structural proprieties) analysis are provided. See Spedicato (2017) <doi:10.32614/RJ-2017-036>. Some functions for continuous times Markov chains depend on the suggested ctmcd package.

Maintained by Giorgio Alfredo Spedicato. Last updated 5 months ago.

ctmc dtmc markov-chain markov-model r-programming rcpp openblas cpp

104 stars 12.78 score 712 scripts 4 dependents

bioc

MSnbase:Base Functions and Classes for Mass Spectrometry and Proteomics

MSnbase provides infrastructure for manipulation, processing and visualisation of mass spectrometry and proteomics data, ranging from raw to quantitative and annotated data.

Maintained by Laurent Gatto. Last updated 14 days ago.

immunooncology infrastructure proteomics massspectrometry qualitycontrol dataimport bioconductor bioinformatics mass-spectrometry proteomics-data visualisation cpp

131 stars 12.76 score 772 scripts 36 dependents

thibautjombart

adegenet:Exploratory Analysis of Genetic and Genomic Data

Toolset for the exploration of genetic and genomic data. Adegenet provides formal (S4) classes for storing and handling various genetic data, including genetic markers with varying ploidy and hierarchical population structure ('genind' class), alleles counts by populations ('genpop'), and genome-wide SNP data ('genlight'). It also implements original multivariate methods (DAPC, sPCA), graphics, statistical tests, simulation tools, distance and similarity measures, and several spatial methods. A range of both empirical and simulated datasets is also provided to illustrate various methods.

Maintained by Zhian N. Kamvar. Last updated 2 months ago.

182 stars 12.60 score 1.9k scripts 29 dependents

data-cleaning

validate:Data Validation Infrastructure

Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity. Supports checks implied by an SDMX DSD file as well. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, Chapter 6 and the JSS paper (2021) <doi:10.18637/jss.v097.i10>.

Maintained by Mark van der Loo. Last updated 24 days ago.

data-cleaning validation

419 stars 12.39 score 448 scripts 8 dependents

asardaes

dtwclust:Time Series Clustering Along with Optimizations for the Dynamic Time Warping Distance

Time series clustering along with optimized techniques related to the Dynamic Time Warping distance and its corresponding lower bounds. Implementations of partitional, hierarchical, fuzzy, k-Shape and TADPole clustering are available. Functionality can be easily extended with custom distance measures and centroid definitions. Implementations of DTW barycenter averaging, a distance based on global alignment kernels, and the soft-DTW distance and centroid routines are also provided. All included distance functions have custom loops optimized for the calculation of cross-distance matrices, including parallelization support. Several cluster validity indices are included.

Maintained by Alexis Sarda. Last updated 8 months ago.

clustering dtw time-series openblas cpp

262 stars 12.35 score 406 scripts 14 dependents

alexkz

kernlab:Kernel-Based Machine Learning Lab

Kernel-based machine learning methods for classification, regression, clustering, novelty detection, quantile regression and dimensionality reduction. Among other methods 'kernlab' includes Support Vector Machines, Spectral Clustering, Kernel PCA, Gaussian Processes and a QP solver.

Maintained by Alexandros Karatzoglou. Last updated 8 months ago.

openblas cpp

21 stars 12.26 score 7.8k scripts 487 dependents

alexiosg

rugarch:Univariate GARCH Models

ARFIMA, in-mean, external regressors and various GARCH flavors, with methods for fit, forecast, simulation, inference and plotting.

Maintained by Alexios Galanos. Last updated 3 months ago.

cpp

26 stars 12.13 score 1.3k scripts 15 dependents

ncss-tech

aqp:Algorithms for Quantitative Pedology

The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information; freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offer a convenient platform for bridging the gap between pedometric theory and practice.

Maintained by Dylan Beaudette. Last updated 1 months ago.

digital-soil-mapping ncss-tech nrcs pedology pedometrics soil soil-survey usda

55 stars 11.90 score 1.2k scripts 2 dependents

rspatial

dismo:Species Distribution Modeling

Methods for species distribution modeling, that is, predicting the environmental similarity of any site to that of the locations of known occurrences of a species.

Maintained by Robert J. Hijmans. Last updated 4 months ago.

cpp

25 stars 11.88 score 2.8k scripts 21 dependents

bioc

graph:graph: A package to handle graph data structures

A package that implements some simple graph handling capabilities.

Maintained by Bioconductor Package Maintainer. Last updated 9 days ago.

graphandnetwork

11.86 score 764 scripts 339 dependents

r-forge

copula:Multivariate Dependence with Copulas

Classes (S4) of commonly used elliptical, Archimedean, extreme-value and other copula families, as well as their rotations, mixtures and asymmetrizations. Nested Archimedean copulas, related tools and special functions. Methods for density, distribution, random number generation, bivariate dependence measures, Rosenblatt transform, Kendall distribution function, perspective and contour plots. Fitting of copula models with potentially partly fixed parameters, including standard errors. Serial independence tests, copula specification tests (independence, exchangeability, radial symmetry, extreme-value dependence, goodness-of-fit) and model selection based on cross-validation. Empirical copula, smoothed versions, and non-parametric estimators of the Pickands dependence function.

Maintained by Martin Maechler. Last updated 23 days ago.

11.83 score 1.2k scripts 86 dependents

kingaa

pomp:Statistical Inference for Partially Observed Markov Processes

Tools for data analysis with partially observed Markov process (POMP) models (also known as stochastic dynamical systems, hidden Markov models, and nonlinear, non-Gaussian, state-space models). The package provides facilities for implementing POMP models, simulating them, and fitting them to time series data by a variety of frequentist and Bayesian methods. It is also a versatile platform for implementation of inference methods for general POMP models.

Maintained by Aaron A. King. Last updated 8 days ago.

abc b-spline differential-equations dynamical-systems iterated-filtering likelihood likelihood-free markov-chain-monte-carlo markov-model mathematical-modelling measurement-error particle-filter sequential-monte-carlo simulation-based-inference sobol-sequence state-space statistical-inference stochastic-processes time-series openblas

114 stars 11.74 score 1.3k scripts 4 dependents

prioritizr

prioritizr:Systematic Conservation Prioritization in R

Systematic conservation prioritization using mixed integer linear programming (MILP). It provides a flexible interface for building and solving conservation planning problems. Once built, conservation planning problems can be solved using a variety of commercial and open-source exact algorithm solvers. By using exact algorithm solvers, solutions can be generated that are guaranteed to be optimal (or within a pre-specified optimality gap). Furthermore, conservation problems can be constructed to optimize the spatial allocation of different management actions or zones, meaning that conservation practitioners can identify solutions that benefit multiple stakeholders. To solve large-scale or complex conservation planning problems, users should install the Gurobi optimization software (available from <https://www.gurobi.com/>) and the 'gurobi' R package (see Gurobi Installation Guide vignette for details). Users can also install the IBM CPLEX software (<https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer>) and the 'cplexAPI' R package (available at <https://github.com/cran/cplexAPI>). Additionally, the 'rcbc' R package (available at <https://github.com/dirkschumacher/rcbc>) can be used to generate solutions using the CBC optimization software (<https://github.com/coin-or/Cbc>). For further details, see Hanson et al. (2025) <doi:10.1111/cobi.14376>.

Maintained by Richard Schuster. Last updated 16 hours ago.

biodiversity conservation conservation-planner optimization prioritization solver spatial cpp

124 stars 11.71 score 584 scripts 2 dependents

luca-scr

GA:Genetic Algorithms

Flexible general-purpose toolbox implementing genetic algorithms (GAs) for stochastic optimisation. Binary, real-valued, and permutation representations are available to optimize a fitness function, i.e. a function provided by users depending on their objective function. Several genetic operators are available and can be combined to explore the best settings for the current task. Furthermore, users can define new genetic operators and easily evaluate their performances. Local search using general-purpose optimisation algorithms can be applied stochastically to exploit interesting regions. GAs can be run sequentially or in parallel, using an explicit master-slave parallelisation or a coarse-grain islands approach. For more details see Scrucca (2013) <doi:10.18637/jss.v053.i04> and Scrucca (2017) <doi:10.32614/RJ-2017-008>.

Maintained by Luca Scrucca. Last updated 7 months ago.

genetic-algorithm optimisation cpp

93 stars 11.58 score 624 scripts 52 dependents

bioc

Rgraphviz:Provides plotting capabilities for R graph objects

Interfaces R with the AT and T graphviz library for plotting R graph objects from the graph package.

Maintained by Kasper Daniel Hansen. Last updated 2 days ago.

graphandnetwork visualization zlib

11.51 score 1.2k scripts 107 dependents

bioc

destiny:Creates diffusion maps

Create and plot diffusion maps.

Maintained by Philipp Angerer. Last updated 4 months ago.

cellbiology cellbasedassays clustering software visualization diffusion-maps dimensionality-reduction cpp

82 stars 11.44 score 792 scripts 1 dependents

bioc

genefilter:genefilter: methods for filtering genes from high-throughput experiments

Some basic functions for filtering genes.

Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.

microarray fortran cpp

11.11 score 2.4k scripts 143 dependents

fmichonneau

phylobase:Base Package for Phylogenetic Structures and Comparative Data

Provides a base S4 class for comparative methods, incorporating one or more trees and trait data.

Maintained by Francois Michonneau. Last updated 1 years ago.

phylogenetics cpp

18 stars 11.10 score 394 scripts 18 dependents

mclements

rstpm2:Smooth Survival Models, Including Generalized Survival Models

R implementation of generalized survival models (GSMs), smooth accelerated failure time (AFT) models and Markov multi-state models. For the GSMs, g(S(t|x))=eta(t,x) for a link function g, survival S at time t with covariates x and a linear predictor eta(t,x). The main assumption is that the time effect(s) are smooth <doi:10.1177/0962280216664760>. For fully parametric models with natural splines, this re-implements Stata's 'stpm2' function, which are flexible parametric survival models developed by Royston and colleagues. We have extended the parametric models to include any smooth parametric smoothers for time. We have also extended the model to include any smooth penalized smoothers from the 'mgcv' package, using penalized likelihood. These models include left truncation, right censoring, interval censoring, gamma frailties and normal random effects <doi:10.1002/sim.7451>, and copulas. For the smooth AFTs, S(t|x) = S_0(t*eta(t,x)), where the baseline survival function S_0(t)=exp(-exp(eta_0(t))) is modelled for natural splines for eta_0, and the time-dependent cumulative acceleration factor eta(t,x)=\int_0^t exp(eta_1(u,x)) du for log acceleration factor eta_1(u,x). The Markov multi-state models allow for a range of models with smooth transitions to predict transition probabilities, length of stay, utilities and costs, with differences, ratios and standardisation.

Maintained by Mark Clements. Last updated 5 months ago.

fortran openblas cpp

27 stars 11.09 score 137 scripts 52 dependents

sgibb

MALDIquant:Quantitative Analysis of Mass Spectrometry Data

A complete analysis pipeline for matrix-assisted laser desorption/ionization-time-of-flight (MALDI-TOF) and other two-dimensional mass spectrometry data. In addition to commonly used plotting and processing methods it includes distinctive features, namely baseline subtraction methods such as morphological filters (TopHat) or the statistics-sensitive non-linear iterative peak-clipping algorithm (SNIP), peak alignment using warping functions, handling of replicated measurements as well as allowing spectra with different resolutions.

Maintained by Sebastian Gibb. Last updated 7 months ago.

maldi maldi-ims maldi-tof-ms mass-spectrometry

62 stars 11.06 score 180 scripts 44 dependents

rkillick

changepoint:Methods for Changepoint Detection

Implements various mainstream and specialised changepoint methods for finding single and multiple changepoints within data. Many popular non-parametric and frequentist methods are included. The cpt.mean(), cpt.var(), cpt.meanvar() functions should be your first point of call.

Maintained by Rebecca Killick. Last updated 4 months ago.

changepoint segmentation

133 stars 11.05 score 736 scripts 40 dependents

metrumresearchgroup

mrgsolve:Simulate from ODE-Based Models

Fast simulation from ordinary differential equation (ODE) based models typically employed in quantitative pharmacology and systems biology.

Maintained by Kyle T Baron. Last updated 9 days ago.

mrgsolve ode openblas cpp

138 stars 10.90 score 1.2k scripts 3 dependents

tyee001

VGAM:Vector Generalized Linear and Additive Models

An implementation of about 6 major classes of statistical regression models. The central algorithm is Fisher scoring and iterative reweighted least squares. At the heart of this package are the vector generalized linear and additive model (VGLM/VGAM) classes. VGLMs can be loosely thought of as multivariate GLMs. VGAMs are data-driven VGLMs that use smoothing. The book "Vector Generalized Linear and Additive Models: With an Implementation in R" (Yee, 2015) <DOI:10.1007/978-1-4939-2818-7> gives details of the statistical framework and the package. Currently only fixed-effects models are implemented. Many (100+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE. The other classes are RR-VGLMs (reduced-rank VGLMs), quadratic RR-VGLMs, doubly constrained RR-VGLMs, quadratic RR-VGLMs, reduced-rank VGAMs, RCIMs (row-column interaction models)---these classes perform constrained and unconstrained quadratic ordination (CQO/UQO) models in ecology, as well as constrained additive ordination (CAO). Hauck-Donner effect detection is implemented. Note that these functions are subject to change; see the NEWS and ChangeLog files for latest changes.

Maintained by Thomas Yee. Last updated 1 months ago.

fortran

10 stars 10.67 score 3.6k scripts 169 dependents

r-forge

surveillance:Temporal and Spatio-Temporal Modeling and Monitoring of Epidemic Phenomena

Statistical methods for the modeling and monitoring of time series of counts, proportions and categorical data, as well as for the modeling of continuous-time point processes of epidemic phenomena. The monitoring methods focus on aberration detection in count data time series from public health surveillance of communicable diseases, but applications could just as well originate from environmetrics, reliability engineering, econometrics, or social sciences. The package implements many typical outbreak detection procedures such as the (improved) Farrington algorithm, or the negative binomial GLR-CUSUM method of Hoehle and Paul (2008) <doi:10.1016/j.csda.2008.02.015>. A novel CUSUM approach combining logistic and multinomial logistic modeling is also included. The package contains several real-world data sets, the ability to simulate outbreak data, and to visualize the results of the monitoring in a temporal, spatial or spatio-temporal fashion. A recent overview of the available monitoring procedures is given by Salmon et al. (2016) <doi:10.18637/jss.v070.i10>. For the retrospective analysis of epidemic spread, the package provides three endemic-epidemic modeling frameworks with tools for visualization, likelihood inference, and simulation. hhh4() estimates models for (multivariate) count time series following Paul and Held (2011) <doi:10.1002/sim.4177> and Meyer and Held (2014) <doi:10.1214/14-AOAS743>. twinSIR() models the susceptible-infectious-recovered (SIR) event history of a fixed population, e.g, epidemics across farms or networks, as a multivariate point process as proposed by Hoehle (2009) <doi:10.1002/bimj.200900050>. twinstim() estimates self-exciting point process models for a spatio-temporal point pattern of infective events, e.g., time-stamped geo-referenced surveillance data, as proposed by Meyer et al. (2012) <doi:10.1111/j.1541-0420.2011.01684.x>. A recent overview of the implemented space-time modeling frameworks for epidemic phenomena is given by Meyer et al. (2017) <doi:10.18637/jss.v077.i11>.

Maintained by Sebastian Meyer. Last updated 8 days ago.

cpp

2 stars 10.67 score 446 scripts 3 dependents

valentint

rrcov:Scalable Robust Estimators with High Breakdown Point

Robust Location and Scatter Estimation and Robust Multivariate Analysis with High Breakdown Point: principal component analysis (Filzmoser and Todorov (2013), <doi:10.1016/j.ins.2012.10.017>), linear and quadratic discriminant analysis (Todorov and Pires (2007)), multivariate tests (Todorov and Filzmoser (2010) <doi:10.1016/j.csda.2009.08.015>), outlier detection (Todorov et al. (2010) <doi:10.1007/s11634-010-0075-2>). See also Todorov and Filzmoser (2009) <urn:isbn:978-3838108148>, Todorov and Filzmoser (2010) <doi:10.18637/jss.v032.i03> and Boudt et al. (2019) <doi:10.1007/s11222-019-09869-x>.

Maintained by Valentin Todorov. Last updated 7 months ago.

fortran openblas

2 stars 10.57 score 484 scripts 96 dependents

bioc

seqLogo:Sequence logos for DNA sequence alignments

seqLogo takes the position weight matrix of a DNA sequence motif and plots the corresponding sequence logo as introduced by Schneider and Stephens (1990).

Maintained by Robert Ivanek. Last updated 5 months ago.

sequencematching

4 stars 10.57 score 304 scripts 29 dependents

ctmm-initiative

ctmm:Continuous-Time Movement Modeling

Functions for identifying, fitting, and applying continuous-space, continuous-time stochastic-process movement models to animal tracking data. The package is described in Calabrese et al (2016) <doi:10.1111/2041-210X.12559>, with models and methods based on those introduced and detailed in Fleming & Calabrese et al (2014) <doi:10.1086/675504>, Fleming et al (2014) <doi:10.1111/2041-210X.12176>, Fleming et al (2015) <doi:10.1103/PhysRevE.91.032107>, Fleming et al (2015) <doi:10.1890/14-2010.1>, Fleming et al (2016) <doi:10.1890/15-1607>, Péron & Fleming et al (2016) <doi:10.1186/s40462-016-0084-7>, Fleming & Calabrese (2017) <doi:10.1111/2041-210X.12673>, Péron et al (2017) <doi:10.1002/ecm.1260>, Fleming et al (2017) <doi:10.1016/j.ecoinf.2017.04.008>, Fleming et al (2018) <doi:10.1002/eap.1704>, Winner & Noonan et al (2018) <doi:10.1111/2041-210X.13027>, Fleming et al (2019) <doi:10.1111/2041-210X.13270>, Noonan & Fleming et al (2019) <doi:10.1186/s40462-019-0177-1>, Fleming et al (2020) <doi:10.1101/2020.06.12.130195>, Noonan et al (2021) <doi:10.1111/2041-210X.13597>, Fleming et al (2022) <doi:10.1111/2041-210X.13815>, Silva et al (2022) <doi:10.1111/2041-210X.13786>, Alston & Fleming et al (2023) <doi:10.1111/2041-210X.14025>.

Maintained by Christen H. Fleming. Last updated 7 days ago.

51 stars 10.54 score 534 scripts 4 dependents

bioc

ChemmineR:Cheminformatics Toolkit for R

ChemmineR is a cheminformatics package for analyzing drug-like small molecule data in R. Its latest version contains functions for efficient processing of large numbers of molecules, physicochemical/structural property predictions, structural similarity searching, classification and clustering of compound libraries with a wide spectrum of algorithms. In addition, it offers visualization functions for compound clustering results and chemical structures.

Maintained by Thomas Girke. Last updated 5 months ago.

cheminformatics biomedicalinformatics pharmacogenetics pharmacogenomics microtitreplateassay cellbasedassays visualization infrastructure dataimport clustering proteomics metabolomics cpp

15 stars 10.45 score 253 scripts 12 dependents

mhahsler

recommenderlab:Lab for Developing and Testing Recommender Algorithms

Provides a research infrastructure to develop and evaluate collaborative filtering recommender algorithms. This includes a sparse representation for user-item matrices, many popular algorithms, top-N recommendations, and cross-validation. Hahsler (2022) <doi:10.48550/arXiv.2205.12371>.

Maintained by Michael Hahsler. Last updated 2 days ago.

collaborative-filtering recommender-system

214 stars 10.42 score 840 scripts 2 dependents

adeverse

adegraphics:An S4 Lattice-Based Package for the Representation of Multivariate Data

Graphical functionalities for the representation of multivariate data. It is a complete re-implementation of the functions available in the 'ade4' package.

Maintained by Aurélie Siberchicot. Last updated 8 months ago.

9 stars 10.37 score 386 scripts 6 dependents

bioc

flowCore:flowCore: Basic structures for flow cytometry data

Provides S4 data structures and basic functions to deal with flow cytometry data.

Maintained by Mike Jiang. Last updated 5 months ago.

immunooncology infrastructure flowcytometry cellbasedassays cpp

10.34 score 1.7k scripts 59 dependents

ssnn-airr

alakazam:Immunoglobulin Clonal Lineage and Diversity Analysis

Provides methods for high-throughput adaptive immune receptor repertoire sequencing (AIRR-Seq; Rep-Seq) analysis. In particular, immunoglobulin (Ig) sequence lineage reconstruction, lineage topology analysis, diversity profiling, amino acid property analysis and gene usage. Citations: Gupta and Vander Heiden, et al (2017) <doi:10.1093/bioinformatics/btv359>, Stern, Yaari and Vander Heiden, et al (2014) <doi:10.1126/scitranslmed.3008879>.

Maintained by Susanna Marquez. Last updated 3 months ago.

software annotationdata cpp

10.33 score 424 scripts 7 dependents

bcgov

ssdtools:Species Sensitivity Distributions

Species sensitivity distributions are cumulative probability distributions which are fitted to toxicity concentrations for different species as described by Posthuma et al.(2001) <isbn:9781566705783>. The ssdtools package uses Maximum Likelihood to fit distributions such as the gamma, log-logistic, log-normal and log-normal log-normal mixture. Multiple distributions can be averaged using Akaike Information Criteria. Confidence intervals on hazard concentrations and proportions are produced by bootstrapping.

Maintained by Joe Thorley. Last updated 1 months ago.

ecotoxicology env species-sensitivity-distribution cpp

33 stars 10.33 score 111 scripts 5 dependents

bioc

Cardinal:A mass spectrometry imaging toolbox for statistical analysis

Implements statistical & computational tools for analyzing mass spectrometry imaging datasets, including methods for efficient pre-processing, spatial segmentation, and classification.

Maintained by Kylie Ariel Bemis. Last updated 3 months ago.

software infrastructure proteomics lipidomics massspectrometry imagingmassspectrometry immunooncology normalization clustering classification regression

48 stars 10.32 score 200 scripts

bioc

BASiCS:Bayesian Analysis of Single-Cell Sequencing data

Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model to perform statistical analyses of single-cell RNA sequencing datasets in the context of supervised experiments (where the groups of cells of interest are known a priori, e.g. experimental conditions or cell types). BASiCS performs built-in data normalisation (global scaling) and technical noise quantification (based on spike-in genes). BASiCS provides an intuitive detection criterion for highly (or lowly) variable genes within a single group of cells. Additionally, BASiCS can compare gene expression patterns between two or more pre-specified groups of cells. Unlike traditional differential expression tools, BASiCS quantifies changes in expression that lie beyond comparisons of means, also allowing the study of changes in cell-to-cell heterogeneity. The latter can be quantified via a biological over-dispersion parameter that measures the excess of variability that is observed with respect to Poisson sampling noise, after normalisation and technical noise removal. Due to the strong mean/over-dispersion confounding that is typically observed for scRNA-seq datasets, BASiCS also tests for changes in residual over-dispersion, defined by residual values with respect to a global mean/over-dispersion trend.

Maintained by Catalina Vallejos. Last updated 5 months ago.

immunooncology normalization sequencing rnaseq software geneexpression transcriptomics singlecell differentialexpression bayesian cellbiology bioconductor-package gene-expression rcpp rcpparmadillo scrna-seq single-cell openblas cpp openmp

83 stars 10.26 score 368 scripts 1 dependents

bioc

graphite:GRAPH Interaction from pathway Topological Environment

Graph objects from pathway topology derived from KEGG, Panther, PathBank, PharmGKB, Reactome SMPDB and WikiPathways databases.

Maintained by Gabriele Sales. Last updated 5 months ago.

pathways thirdpartyclient graphandnetwork network reactome kegg metabolomics bioinformatics mirror pathway-analysis

7 stars 10.17 score 122 scripts 21 dependents

bioc

QDNAseq:Quantitative DNA Sequencing for Chromosomal Aberrations

Quantitative DNA sequencing for chromosomal aberrations. The genome is divided into non-overlapping fixed-sized bins, number of sequence reads in each counted, adjusted with a simultaneous two-dimensional loess correction for sequence mappability and GC content, and filtered to remove spurious regions in the genome. Downstream steps of segmentation and calling are also implemented via packages DNAcopy and CGHcall, respectively.

Maintained by Daoud Sie. Last updated 5 months ago.

copynumbervariation dnaseq genetics genomeannotation preprocessing qualitycontrol sequencing

49 stars 10.10 score 177 scripts 4 dependents

stewid

SimInf:A Framework for Data-Driven Stochastic Disease Spread Simulations

Provides an efficient and very flexible framework to conduct data-driven epidemiological modeling in realistic large scale disease spread simulations. The framework integrates infection dynamics in subpopulations as continuous-time Markov chains using the Gillespie stochastic simulation algorithm and incorporates available data such as births, deaths and movements as scheduled events at predefined time-points. Using C code for the numerical solvers and 'OpenMP' (if available) to divide work over multiple processors ensures high performance when simulating a sample outcome. One of our design goals was to make the package extendable and enable usage of the numerical solvers from other R extension packages in order to facilitate complex epidemiological research. The package contains template models and can be extended with user-defined models. For more details see the paper by Widgren, Bauer, Eriksson and Engblom (2019) <doi:10.18637/jss.v091.i12>. The package also provides functionality to fit models to time series data using the Approximate Bayesian Computation Sequential Monte Carlo ('ABC-SMC') algorithm of Toni and others (2009) <doi:10.1098/rsif.2008.0172>.

Maintained by Stefan Widgren. Last updated 16 days ago.

data-driven epidemiology high-performance-computing markov-chain mathematical-modelling gsl openmp

35 stars 10.09 score 227 scripts

hypertidy

fasterize:Fast Polygon to Raster Conversion

Provides a drop-in replacement for rasterize() from the 'raster' package that takes polygon vector or data frame objects, and is much faster. There is support for the main options provided by the rasterize() function, including setting the field used and background value, and options for aggregating multi-layer rasters. Uses the scan line algorithm attributed to Wylie et al. (1967) <doi:10.1145/1465611.1465619>. Note that repository originally was hosted at 'Github' 'ecohealthalliance/fasterize' but was migrated to 'hypertidy/fasterize' in March 2025, and can be found indexed on 'R universe' <https://cran.r-universe.dev/fasterize>.

Maintained by Michael Sumner. Last updated 20 days ago.

raster rcpp rcpparmadillo sf spatial cpp

182 stars 10.05 score 14 dependents

mages

ChainLadder:Statistical Methods and Models for Claims Reserving in General Insurance

Various statistical methods and models which are typically used for the estimation of outstanding claims reserves in general insurance, including those to estimate the claims development result as required under Solvency II.

Maintained by Markus Gesmann. Last updated 2 months ago.

82 stars 10.04 score 196 scripts 2 dependents

robinhankin

Brobdingnag:Very Large Numbers in R

Very large numbers in R. Real numbers are held using their natural logarithms, plus a logical flag indicating sign. Functionality for complex numbers is also provided. The package includes a vignette that gives a step-by-step introduction to using S4 methods.

Maintained by Robin K. S. Hankin. Last updated 7 months ago.

5 stars 9.92 score 77 scripts 70 dependents

environmentalinformatics-marburg

satellite:Handling and Manipulating Remote Sensing Data

Herein, we provide a broad variety of functions which are useful for handling, manipulating, and visualizing satellite-based remote sensing data. These operations range from mere data import and layer handling (eg subsetting), over Raster* typical data wrangling (eg crop, extend), to more sophisticated (pre-)processing tasks typically applied to satellite imagery (eg atmospheric and topographic correction). This functionality is complemented by a full access to the satellite layers' metadata at any stage and the documentation of performed actions in a separate log file. Currently available sensors include Landsat 4-5 (TM), 7 (ETM+), and 8 (OLI/TIRS Combined), and additional compatibility is ensured for the Landsat Global Land Survey data set.

Maintained by Florian Detsch. Last updated 1 years ago.

cpp

22 stars 9.88 score 61 scripts 27 dependents

ubod

apcluster:Affinity Propagation Clustering

Implements Affinity Propagation clustering introduced by Frey and Dueck (2007) <DOI:10.1126/science.1136800>. The algorithms are largely analogous to the 'Matlab' code published by Frey and Dueck. The package further provides leveraged affinity propagation and an algorithm for exemplar-based agglomerative clustering that can also be used to join clusters obtained from affinity propagation. Various plotting functions are available for analyzing clustering results.

Maintained by Ulrich Bodenhofer. Last updated 11 months ago.

cpp

10 stars 9.81 score 270 scripts 25 dependents

bioc

matter:Out-of-core statistical computing and signal processing

Toolbox for larger-than-memory scientific computing and visualization, providing efficient out-of-core data structures using files or shared memory, for dense and sparse vectors, matrices, and arrays, with applications to nonuniformly sampled signals and images.

Maintained by Kylie A. Bemis. Last updated 4 months ago.

infrastructure datarepresentation dataimport dimensionreduction preprocessing cpp

57 stars 9.52 score 64 scripts 2 dependents

edzer

intervals:Tools for Working with Points and Intervals

Tools for working with and comparing sets of points and intervals.

Maintained by Edzer Pebesma. Last updated 7 months ago.

cpp

11 stars 9.50 score 122 scripts 98 dependents

bioc

snpStats:SnpMatrix and XSnpMatrix classes and methods

Classes and statistical methods for large SNP association studies. This extends the earlier snpMatrix package, allowing for uncertainty in genotypes.

Maintained by David Clayton. Last updated 5 months ago.

microarray snp geneticvariability zlib

9.48 score 674 scripts 20 dependents

sizespectrum

mizer:Dynamic Multi-Species Size Spectrum Modelling

A set of classes and methods to set up and run multi-species, trait based and community size spectrum ecological models, focused on the marine environment.

Maintained by Gustav Delius. Last updated 2 months ago.

ecosystem-model fish-population-dynamics fisheries fisheries-management marine-ecosystem population-dynamics simulation size-structure species-interactions transport-equation cpp

39 stars 9.41 score 207 scripts

eagerai

fastai:Interface to 'fastai'

The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research in to deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.

Maintained by Turgut Abdullayev. Last updated 12 months ago.

audio collaborative-filtering darknet darknet-image-classification fastai medical object-detection tabular text vision

118 stars 9.40 score 76 scripts

bioc

ggmsa:Plot Multiple Sequence Alignment using 'ggplot2'

A visual exploration tool for multiple sequence alignment and associated data. Supports MSA of DNA, RNA, and protein sequences using 'ggplot2'. Multiple sequence alignment can easily be combined with other 'ggplot2' plots, such as phylogenetic tree Visualized by 'ggtree', boxplot, genome map and so on. More features: visualization of sequence logos, sequence bundles, RNA secondary structures and detection of sequence recombinations.

Maintained by Guangchuang Yu. Last updated 3 months ago.

software visualization alignment annotation multiplesequencealignment

210 stars 9.35 score 196 scripts 2 dependents

bioc

multtest:Resampling-based multiple hypothesis testing

Non-parametric bootstrap and permutation resampling-based multiple testing procedures (including empirical Bayes methods) for controlling the family-wise error rate (FWER), generalized family-wise error rate (gFWER), tail probability of the proportion of false positives (TPPFP), and false discovery rate (FDR). Several choices of bootstrap-based null distribution are implemented (centered, centered and scaled, quantile-transformed). Single-step and step-wise methods are available. Tests based on a variety of t- and F-statistics (including t-statistics based on regression parameters from linear and survival models as well as those based on correlation parameters) are included. When probing hypotheses with t-statistics, users may also select a potentially faster null distribution which is multivariate normal with mean zero and variance covariance matrix derived from the vector influence function. Results are reported in terms of adjusted p-values, confidence regions and test statistic cutoffs. The procedures are directly applicable to identifying differentially expressed genes in DNA microarray experiments.

Maintained by Katherine S. Pollard. Last updated 5 months ago.

microarray differentialexpression multiplecomparison

9.34 score 932 scripts 136 dependents

hojsgaard

gRbase:A Package for Graphical Modelling in R

The 'gRbase' package provides graphical modelling features used by e.g. the packages 'gRain', 'gRim' and 'gRc'. 'gRbase' implements graph algorithms including (i) maximum cardinality search (for marked and unmarked graphs). (ii) moralization, (iii) triangulation, (iv) creation of junction tree. 'gRbase' facilitates array operations, 'gRbase' implements functions for testing for conditional independence. 'gRbase' illustrates how hierarchical log-linear models may be implemented and describes concept of graphical meta data. The facilities of the package are documented in the book by Højsgaard, Edwards and Lauritzen (2012, <doi:10.1007/978-1-4614-2299-0>) and in the paper by Dethlefsen and Højsgaard, (2005, <doi:10.18637/jss.v014.i17>). Please see 'citation("gRbase")' for citation details.

Maintained by Søren Højsgaard. Last updated 5 months ago.

openblas cpp

3 stars 9.24 score 241 scripts 20 dependents

bpfaff

urca:Unit Root and Cointegration Tests for Time Series Data

Unit root and cointegration tests encountered in applied econometric analysis are implemented.

Maintained by Bernhard Pfaff. Last updated 10 months ago.

fortran

6 stars 8.95 score 1.4k scripts 270 dependents

kollerma

robustlmm:Robust Linear Mixed Effects Models

Implements the Robust Scoring Equations estimator to fit linear mixed effects models robustly. Robustness is achieved by modification of the scoring equations combined with the Design Adaptive Scale approach.

Maintained by Manuel Koller. Last updated 1 years ago.

openblas cpp

28 stars 8.79 score 138 scripts

flr

FLCore:Core Package of FLR, Fisheries Modelling in R

Core classes and methods for FLR, a framework for fisheries modelling and management strategy simulation in R. Developed by a team of fisheries scientists in various countries. More information can be found at <http://flr-project.org/>.

Maintained by Iago Mosqueira. Last updated 8 days ago.

fisheries flr fisheries-modelling

16 stars 8.78 score 956 scripts 23 dependents

r-forge

distr:Object Oriented Implementation of Distributions

S4-classes and methods for distributions.

Maintained by Peter Ruckdeschel. Last updated 2 months ago.

8.77 score 327 scripts 32 dependents

bart1

move:Visualizing and Analyzing Animal Track Data

Contains functions to access movement data stored in 'movebank.org' as well as tools to visualize and statistically analyze animal movement data, among others functions to calculate dynamic Brownian Bridge Movement Models. Move helps addressing movement ecology questions.

Maintained by Bart Kranstauber. Last updated 4 months ago.

cpp

8.76 score 690 scripts 3 dependents

mikejohnson51

climateR:climateR

Find, subset, and retrive geospatial data by AOI.

Maintained by Mike Johnson. Last updated 4 months ago.

aoi climate dataset geospatial gridded-climate-data weather

187 stars 8.74 score 156 scripts 1 dependents

andrewzm

FRK:Fixed Rank Kriging

A tool for spatial/spatio-temporal modelling and prediction with large datasets. The approach models the field, and hence the covariance function, using a set of basis functions. This fixed-rank basis-function representation facilitates the modelling of big data, and the method naturally allows for non-stationary, anisotropic covariance functions. Discretisation of the spatial domain into so-called basic areal units (BAUs) facilitates the use of observations with varying support (i.e., both point-referenced and areal supports, potentially simultaneously), and prediction over arbitrary user-specified regions. `FRK` also supports inference over various manifolds, including the 2D plane and 3D sphere, and it provides helper functions to model, fit, predict, and plot with relative ease. Version 2.0.0 and above also supports the modelling of non-Gaussian data (e.g., Poisson, binomial, negative-binomial, gamma, and inverse-Gaussian) by employing a generalised linear mixed model (GLMM) framework. Zammit-Mangion and Cressie <doi:10.18637/jss.v098.i04> describe `FRK` in a Gaussian setting, and detail its use of basis functions and BAUs, while Sainsbury-Dale, Zammit-Mangion, and Cressie <doi:10.18637/jss.v108.i10> describe `FRK` in a non-Gaussian setting; two vignettes are available that summarise these papers and provide additional examples.

Maintained by Andrew Zammit-Mangion. Last updated 6 months ago.

cpp

71 stars 8.70 score 188 scripts 1 dependents

pilaboratory

sads:Maximum Likelihood Models for Species Abundance Distributions

Maximum likelihood tools to fit and compare models of species abundance distributions and of species rank-abundance distributions.

Maintained by Paulo I. Prado. Last updated 1 years ago.

23 stars 8.66 score 244 scripts 3 dependents

bioc

pRoloc:A unifying bioinformatics framework for spatial proteomics

The pRoloc package implements machine learning and visualisation methods for the analysis and interogation of quantitiative mass spectrometry data to reliably infer protein sub-cellular localisation.

Maintained by Lisa Breckels. Last updated 1 months ago.

immunooncology proteomics massspectrometry classification clustering qualitycontrol bioconductor proteomics-data spatial-proteomics visualisation openblas cpp

15 stars 8.64 score 101 scripts 2 dependents

actuaryzhang

cplm:Compound Poisson Linear Models

Likelihood-based and Bayesian methods for various compound Poisson linear models based on Zhang, Yanwei (2013) <doi:10.1007/s11222-012-9343-7>.

Maintained by Yanwei (Wayne) Zhang. Last updated 1 years ago.

openblas

16 stars 8.55 score 75 scripts 10 dependents

r-forge

ClassDiscovery:Classes and Methods for "Class Discovery" with Microarrays or Proteomics

Defines the classes used for "class discovery" problems in the OOMPA project (<http://oompa.r-forge.r-project.org/>). Class discovery primarily consists of unsupervised clustering methods with attempts to assess their statistical significance.

Maintained by Kevin R. Coombes. Last updated 2 months ago.

microarray clustering

8.53 score 85 scripts 9 dependents

ropensci

weatherOz:An API Client for Australian Weather and Climate Data Resources

Provides automated downloading, parsing and formatting of weather data for Australia through API endpoints provided by the Department of Primary Industries and Regional Development ('DPIRD') of Western Australia and by the Science and Technology Division of the Queensland Government's Department of Environment and Science ('DES'). As well as the Bureau of Meteorology ('BOM') of the Australian government precis and coastal forecasts, and downloading and importing radar and satellite imagery files. 'DPIRD' weather data are accessed through public 'APIs' provided by 'DPIRD', <https://www.agric.wa.gov.au/weather-api-20>, providing access to weather station data from the 'DPIRD' weather station network. Australia-wide weather data are based on data from the Australian Bureau of Meteorology ('BOM') data and accessed through 'SILO' (Scientific Information for Land Owners) Jeffrey et al. (2001) <doi:10.1016/S1364-8152(01)00008-1>. 'DPIRD' data are made available under a Creative Commons Attribution 3.0 Licence (CC BY 3.0 AU) license <https://creativecommons.org/licenses/by/3.0/au/deed.en>. SILO data are released under a Creative Commons Attribution 4.0 International licence (CC BY 4.0) <https://creativecommons.org/licenses/by/4.0/>. 'BOM' data are (c) Australian Government Bureau of Meteorology and released under a Creative Commons (CC) Attribution 3.0 licence or Public Access Licence ('PAL') as appropriate, see <http://www.bom.gov.au/other/copyright.shtml> for further details.

Maintained by Rodrigo Pires. Last updated 1 months ago.

dpird bom meteorological-data weather-forecast australia weather weather-data meteorology western-australia australia-bureau-of-meteorology western-australia-agriculture australia-agriculture australia-climate australia-weather api-client climate data rainfall weather-api

31 stars 8.47 score 40 scripts

bgoodri

mi:Missing Data Imputation and Model Checking

The mi package provides functions for data manipulation, imputing missing values in an approximate Bayesian framework, diagnostics of the models used to generate the imputations, confidence-building mechanisms to validate some of the assumptions of the imputation algorithm, and functions to analyze multiply imputed data sets with the appropriate degree of sampling uncertainty.

Maintained by Ben Goodrich. Last updated 3 years ago.

2 stars 8.25 score 244 scripts 47 dependents

tomasfryda

h2o:R Interface for the 'H2O' Scalable Machine Learning Platform

R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).

Maintained by Tomas Fryda. Last updated 1 years ago.

3 stars 8.20 score 7.8k scripts 11 dependents

cran

flexmix:Flexible Mixture Modeling

A general framework for finite mixtures of regression models using the EM algorithm is implemented. The E-step and all data handling are provided, while the M-step can be supplied by the user to easily define new models. Existing drivers implement mixtures of standard linear models, generalized linear models and model-based clustering.

Maintained by Bettina Gruen. Last updated 28 days ago.

5 stars 8.19 score 113 dependents

r-hyperspec

hyperSpec:Work with Hyperspectral Data, i.e. Spectra + Meta Information (Spatial, Time, Concentration, ...)

Comfortable ways to work with hyperspectral data sets, i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each of the spectra. The spectra can be data as obtained in XRF, UV/VIS, Fluorescence, AES, NIR, IR, Raman, NMR, MS, etc. More generally, any data that is recorded over a discretized variable, e.g. absorbance = f(wavelength), stored as a vector of absorbance values for discrete wavelengths is suitable.

Maintained by Claudia Beleites. Last updated 10 months ago.

data-wrangling hyperspectral imaging infrared nmr raman spectroscopy uv-vis xrf

16 stars 8.10 score 233 scripts 2 dependents

bioc

openCyto:Hierarchical Gating Pipeline for flow cytometry data

This package is designed to facilitate the automated gating methods in sequential way to mimic the manual gating strategy.

Maintained by Mike Jiang. Last updated 2 days ago.

immunooncology flowcytometry dataimport preprocessing datarepresentation cpp

8.02 score 404 scripts 1 dependents

bioc

motifStack:Plot stacked logos for single or multiple DNA, RNA and amino acid sequence

The motifStack package is designed for graphic representation of multiple motifs with different similarity scores. It works with both DNA/RNA sequence motif and amino acid sequence motif. In addition, it provides the flexibility for users to customize the graphic parameters such as the font type and symbol colors.

Maintained by Jianhong Ou. Last updated 3 months ago.

sequencematching visualization sequencing microarray alignment chipchip chipseq motifannotation dataimport

7.93 score 188 scripts 6 dependents

r-forge

tuneR:Analysis of Music and Speech

Analyze music and speech, extract features like MFCCs, handle wave files and their representation in various ways, read mp3, read midi, perform steps of a transcription, ... Also contains functions ported from the 'rastamat' 'Matlab' package.

Maintained by Uwe Ligges. Last updated 12 months ago.

7.93 score 1.1k scripts 44 dependents

cran

timeSeries:Financial Time Series Objects (Rmetrics)

'S4' classes and various tools for financial time series: Basic functions such as scaling and sorting, subsetting, mathematical operations and statistical functions.

Maintained by Georgi N. Boshnakov. Last updated 6 months ago.

2 stars 7.89 score 146 dependents

bioc

flowWorkspace:Infrastructure for representing and interacting with gated and ungated cytometry data sets.

This package is designed to facilitate comparison of automated gating methods against manual gating done in flowJo. This package allows you to import basic flowJo workspaces into BioConductor and replicate the gating from flowJo using the flowCore functionality. Gating hierarchies, groups of samples, compensation, and transformation are performed so that the output matches the flowJo analysis.

Maintained by Greg Finak. Last updated 22 days ago.

immunooncology flowcytometry dataimport preprocessing datarepresentation zlib openblas cpp

7.89 score 576 scripts 10 dependents

genentech

psborrow2:Bayesian Dynamic Borrowing Analysis and Simulation

Bayesian dynamic borrowing is an approach to incorporating external data to supplement a randomized, controlled trial analysis in which external data are incorporated in a dynamic way (e.g., based on similarity of outcomes); see Viele 2013 <doi:10.1002/pst.1589> for an overview. This package implements the hierarchical commensurate prior approach to dynamic borrowing as described in Hobbes 2011 <doi:10.1111/j.1541-0420.2011.01564.x>. There are three main functionalities. First, 'psborrow2' provides a user-friendly interface for applying dynamic borrowing on the study results handles the Markov Chain Monte Carlo sampling on behalf of the user. Second, 'psborrow2' provides a simulation framework to compare different borrowing parameters (e.g. full borrowing, no borrowing, dynamic borrowing) and other trial and borrowing characteristics (e.g. sample size, covariates) in a unified way. Third, 'psborrow2' provides a set of functions to generate data for simulation studies, and also allows the user to specify their own data generation process. This package is designed to use the sampling functions from 'cmdstanr' which can be installed from <https://stan-dev.r-universe.dev>.

Maintained by Matt Secrest. Last updated 1 months ago.

bayesian-dynamic-borrowing psborrow2 simulation-study

18 stars 7.87 score 16 scripts

bioc

siggenes:Multiple Testing using SAM and Efron's Empirical Bayes Approaches

Identification of differentially expressed genes and estimation of the False Discovery Rate (FDR) using both the Significance Analysis of Microarrays (SAM) and the Empirical Bayes Analyses of Microarrays (EBAM).

Maintained by Holger Schwender. Last updated 5 months ago.

multiplecomparison microarray geneexpression snp exonarray differentialexpression

7.87 score 74 scripts 34 dependents

ericmarcon

entropart:Entropy Partitioning to Measure Diversity

Measurement and partitioning of diversity, based on Tsallis entropy, following Marcon and Herault (2015) <doi:10.18637/jss.v067.i08>. 'entropart' provides functions to calculate alpha, beta and gamma diversity of communities, including phylogenetic and functional diversity. Estimation-bias corrections are available.

Maintained by Eric Marcon. Last updated 2 months ago.

biodiversity diversity entropy-partitioning estimator measure species

9 stars 7.81 score 115 scripts 1 dependents

openpharma

crmPack:Object-Oriented Implementation of CRM Designs

Implements a wide range of model-based dose escalation designs, ranging from classical and modern continual reassessment methods (CRMs) based on dose-limiting toxicity endpoints to dual-endpoint designs taking into account a biomarker/efficacy outcome. The focus is on Bayesian inference, making it very easy to setup a new design with its own JAGS code. However, it is also possible to implement 3+3 designs for comparison or models with non-Bayesian estimation. The whole package is written in a modular form in the S4 class system, making it very flexible for adaptation to new models, escalation or stopping rules. Further details are presented in Sabanes Bove et al. (2019) <doi:10.18637/jss.v089.i10>.

Maintained by Daniel Sabanes Bove. Last updated 2 months ago.

jags cpp

21 stars 7.76 score 208 scripts

bioc

KEGGgraph:KEGGgraph: A graph approach to KEGG PATHWAY in R and Bioconductor

KEGGGraph is an interface between KEGG pathway and graph object as well as a collection of tools to analyze, dissect and visualize these graphs. It parses the regularly updated KGML (KEGG XML) files into graph models maintaining all essential pathway attributes. The package offers functionalities including parsing, graph operation, visualization and etc.

Maintained by Jitao David Zhang. Last updated 5 months ago.

pathways graphandnetwork visualization kegg

7.76 score 114 scripts 23 dependents

trackage

trip:Tracking Data

Access and manipulate spatial tracking data, with straightforward coercion from and to other formats. Filter for speed and create time spent maps from tracking data. There are coercion methods to convert between 'trip' and 'ltraj' from 'adehabitatLT', and between 'trip' and 'psp' and 'ppp' from 'spatstat'. Trip objects can be created from raw or grouped data frames, and from types in the 'sp', sf', 'amt', 'trackeR', 'mousetrap', and other packages, Sumner, MD (2011) <https://figshare.utas.edu.au/articles/thesis/The_tag_location_problem/23209538>.

Maintained by Michael D. Sumner. Last updated 9 months ago.

13 stars 7.72 score 137 scripts 1 dependents

blue-matter

MSEtool:Management Strategy Evaluation Toolkit

Development, simulation testing, and implementation of management procedures for fisheries (see Carruthers & Hordyk (2018) <doi:10.1111/2041-210X.13081>).

Maintained by Adrian Hordyk. Last updated 3 days ago.

cpp

8 stars 7.71 score 163 scripts 3 dependents

adamlilith

fasterRaster:Faster Raster and Spatial Vector Processing Using 'GRASS GIS'

Processing of large-in-memory/large-on disk rasters and spatial vectors using 'GRASS GIS' <https://grass.osgeo.org/>. Most functions in the 'terra' package are recreated. Processing of medium-sized and smaller spatial objects will nearly always be faster using 'terra' or 'sf', but for large-in-memory/large-on-disk objects, 'fasterRaster' may be faster. To use most of the functions, you must have the stand-alone version (not the 'OSGeoW4' installer version) of 'GRASS GIS' 8.0 or higher.

Maintained by Adam B. Smith. Last updated 2 days ago.

aspect distance fragmentation fragmentation-indices gis grass grass-gis raster raster-projection rasterize slope topography vectorization

57 stars 7.68 score 8 scripts

ltorgo

DMwR2:Functions and Data for the Second Edition of "Data Mining with R"

Functions and data accompanying the second edition of the book "Data Mining with R, learning with case studies" by Luis Torgo, published by CRC Press.

Maintained by Luis Torgo. Last updated 8 years ago.

27 stars 7.64 score 380 scripts 2 dependents

flr

ggplotFL:Using ggplot2 in FLR

Using ggplot2 for FLR. Provides (1) overloaded ggplot methods for various FLR classes, (2) ggplot-based versions of standard plots in the FLCore package, and (3) new geoms for using FLR objects.

Maintained by Iago Mosqueira. Last updated 2 months ago.

visualization ggplot2 fisheries flr

4 stars 7.60 score 458 scripts 12 dependents

bioc

ropls:PCA, PLS(-DA) and OPLS(-DA) for multivariate analysis and feature selection of omics data

Latent variable modeling with Principal Component Analysis (PCA) and Partial Least Squares (PLS) are powerful methods for visualization, regression, classification, and feature selection of omics data where the number of variables exceeds the number of samples and with multicollinearity among variables. Orthogonal Partial Least Squares (OPLS) enables to separately model the variation correlated (predictive) to the factor of interest and the uncorrelated (orthogonal) variation. While performing similarly to PLS, OPLS facilitates interpretation. Successful applications of these chemometrics techniques include spectroscopic data such as Raman spectroscopy, nuclear magnetic resonance (NMR), mass spectrometry (MS) in metabolomics and proteomics, but also transcriptomics data. In addition to scores, loadings and weights plots, the package provides metrics and graphics to determine the optimal number of components (e.g. with the R2 and Q2 coefficients), check the validity of the model by permutation testing, detect outliers, and perform feature selection (e.g. with Variable Importance in Projection or regression coefficients). The package can be accessed via a user interface on the Workflow4Metabolomics.org online resource for computational metabolomics (built upon the Galaxy environment).

Maintained by Etienne A. Thevenot. Last updated 5 months ago.

regression classification principalcomponent transcriptomics proteomics metabolomics lipidomics massspectrometry immunooncology

7.55 score 210 scripts 8 dependents

rspatial

predicts:Spatial Prediction Tools

Methods for spatial predictive modeling, especially for spatial distribution models. This includes algorithms for model fitting and prediction, as well as methods for model evaluation.

Maintained by Robert J. Hijmans. Last updated 2 months ago.

10 stars 7.55 score 108 scripts 8 dependents

wenjie2wang

reda:Recurrent Event Data Analysis

Contains implementations of recurrent event data analysis routines including (1) survival and recurrent event data simulation from stochastic process point of view by the thinning method proposed by Lewis and Shedler (1979) <doi:10.1002/nav.3800260304> and the inversion method introduced in Cinlar (1975, ISBN:978-0486497976), (2) the mean cumulative function (MCF) estimation by the Nelson-Aalen estimator of the cumulative hazard rate function, (3) two-sample recurrent event responses comparison with the pseudo-score tests proposed by Lawless and Nadeau (1995) <doi:10.2307/1269617>, (4) gamma frailty model with spline rate function following Fu, et al. (2016) <doi:10.1080/10543406.2014.992524>.

Maintained by Wenjie Wang. Last updated 1 years ago.

mcf mean-cumulative-function recurrent-event survival-analysis cpp

15 stars 7.52 score 55 scripts 3 dependents

tpetzoldt

growthrates:Estimate Growth Rates from Experimental Data

A collection of methods to determine growth rates from experimental data, in particular from batch experiments and plate reader trials.

Maintained by Thomas Petzoldt. Last updated 2 years ago.

27 stars 7.52 score 102 scripts

ropensci

rsat:Dealing with Multiplatform Satellite Images

Downloading, customizing, and processing time series of satellite images for a region of interest. 'rsat' functions allow a unified access to multispectral images from Landsat, MODIS and Sentinel repositories. 'rsat' also offers capabilities for customizing satellite images, such as tile mosaicking, image cropping and new variables computation. Finally, 'rsat' covers the processing, including cloud masking, compositing and gap-filling/smoothing time series of images (Militino et al., 2018 <doi:10.3390/rs10030398> and Militino et al., 2019 <doi:10.1109/TGRS.2019.2904193>).

Maintained by Unai Pérez - Goya. Last updated 11 months ago.

satellite-images

54 stars 7.45 score 52 scripts

bioc

flowViz:Visualization for flow cytometry

Provides visualization tools for flow cytometry data.

Maintained by Mike Jiang. Last updated 5 months ago.

immunooncology infrastructure flowcytometry cellbasedassays visualization

7.44 score 231 scripts 12 dependents

cran

sn:The Skew-Normal and Related Distributions Such as the Skew-t and the SUN

Build and manipulate probability distributions of the skew-normal family and some related ones, notably the skew-t and the SUN families. For the skew-normal and the skew-t distributions, statistical methods are provided for data fitting and model diagnostics, in the univariate and the multivariate case.

Maintained by Adelchi Azzalini. Last updated 2 years ago.

3 stars 7.44 score 92 dependents

ssnn-airr

shazam:Immunoglobulin Somatic Hypermutation Analysis

Provides a computational framework for analyzing mutations in immunoglobulin (Ig) sequences. Includes methods for Bayesian estimation of antigen-driven selection pressure, mutational load quantification, building of somatic hypermutation (SHM) models, and model-dependent distance calculations. Also includes empirically derived models of SHM for both mice and humans. Citations: Gupta and Vander Heiden, et al (2015) <doi:10.1093/bioinformatics/btv359>, Yaari, et al (2012) <doi:10.1093/nar/gks457>, Yaari, et al (2013) <doi:10.3389/fimmu.2013.00358>, Cui, et al (2016) <doi:10.4049/jimmunol.1502263>.

Maintained by Susanna Marquez. Last updated 3 months ago.

7.43 score 222 scripts 2 dependents

spatpomp-org

spatPomp:Inference for Spatiotemporal Partially Observed Markov Processes

Inference on panel data using spatiotemporal partially-observed Markov process (SpatPOMP) models. The 'spatPomp' package extends 'pomp' to include algorithms taking advantage of the spatial structure in order to assist with handling high dimensional processes. See Asfaw et al. (2024) <doi:10.48550/arXiv.2101.01157> for further description of the package.

Maintained by Edward Ionides. Last updated 4 months ago.

2 stars 7.38 score 93 scripts

cran

timeDate:Rmetrics - Chronological and Calendar Objects

The 'timeDate' class fulfils the conventions of the ISO 8601 standard as well as of the ANSI C and POSIX standards. Beyond these standards it provides the "Financial Center" concept which allows to handle data records collected in different time zones and mix them up to have always the proper time stamps with respect to your personal financial center, or alternatively to the GMT reference time. It can thus also handle time stamps from historical data records from the same time zone, even if the financial centers changed day light saving times at different calendar dates.

Maintained by Georgi N. Boshnakov. Last updated 6 months ago.

1 stars 7.37 score 713 dependents

consbiol-unibern

SDMtune:Species Distribution Model Selection

User-friendly framework that enables the training and the evaluation of species distribution models (SDMs). The package implements functions for data driven variable selection and model tuning and includes numerous utilities to display the results. All the functions used to select variables or to tune model hyperparameters have an interactive real-time chart displayed in the 'RStudio' viewer pane during their execution.

Maintained by Sergio Vignali. Last updated 3 months ago.

hyperparameter-tuning species-distribution-modelling variable-selection cpp

25 stars 7.37 score 155 scripts

gagolews

FuzzyNumbers:Tools to Deal with Fuzzy Numbers

S4 classes and methods to deal with fuzzy numbers. They allow for computing any arithmetic operations (e.g., by using the Zadeh extension principle), performing approximation of arbitrary fuzzy numbers by trapezoidal and piecewise linear ones, preparing plots for publications, computing possibility and necessity values for comparisons, etc.

Maintained by Marek Gagolewski. Last updated 3 years ago.

10 stars 7.37 score 91 scripts 17 dependents

jsta

wql:Exploring Water Quality Monitoring Data

Functions to assist in the processing and exploration of data from environmental monitoring programs. The package name stands for "water quality" and reflects the original focus on time series data for physical and chemical properties of water, as well as the biota. Intended for programs that sample approximately monthly, quarterly or annually at discrete stations, a feature of many legacy data sets. Most of the functions should be useful for analysis of similar-frequency time series regardless of the subject matter.

Maintained by Jemma Stachelek. Last updated 2 months ago.

water-quality

12 stars 7.34 score 204 scripts 3 dependents

choi-phd

TestDesign:Optimal Test Design Approach to Fixed and Adaptive Test Construction

Uses the optimal test design approach by Birnbaum (1968, ISBN:9781593119348) and van der Linden (2018) <doi:10.1201/9781315117430> to construct fixed, adaptive, and parallel tests. Supports the following mixed-integer programming (MIP) solver packages: 'Rsymphony', 'highs', 'gurobi', 'lpSolve', and 'Rglpk'. The 'gurobi' package is not available from CRAN; see <https://www.gurobi.com/downloads/>.

Maintained by Seung W. Choi. Last updated 6 months ago.

openblas cpp

3 stars 7.34 score 37 scripts 2 dependents

argocanada

argoFloats:Analysis of Oceanographic Argo Floats

Supports the analysis of oceanographic data recorded by Argo autonomous drifting profiling floats. Functions are provided to (a) download and cache data files, (b) subset data in various ways, (c) handle quality-control flags and (d) plot the results according to oceanographic conventions. A shiny app is provided for easy exploration of datasets. The package is designed to work well with the 'oce' package, providing a wide range of processing capabilities that are particular to oceanographic analysis. See Kelley, Harbin, and Richards (2021) <doi:10.3389/fmars.2021.635922> for more on the scientific context and applications.

Maintained by Dan Kelley. Last updated 1 months ago.

17 stars 7.32 score 203 scripts

kornl

gMCP:Graph Based Multiple Comparison Procedures

Functions and a graphical user interface for graphical described multiple test procedures.

Maintained by Kornelius Rohmeyer. Last updated 1 years ago.

openjdk

10 stars 7.31 score 105 scripts 2 dependents

bioc

flowClust:Clustering for Flow Cytometry

Robust model-based clustering using a t-mixture model with Box-Cox transformation. Note: users should have GSL installed. Windows users: 'consult the README file available in the inst directory of the source distribution for necessary configuration instructions'.

Maintained by Greg Finak. Last updated 5 months ago.

immunooncology clustering visualization flowcytometry

7.30 score 83 scripts 6 dependents

r-forge

pcalg:Methods for Graphical Models and Causal Inference

Functions for causal structure learning and causal inference using graphical models. The main algorithms for causal structure learning are PC (for observational data without hidden variables), FCI and RFCI (for observational data with hidden variables), and GIES (for a mix of data from observational studies (i.e. observational data) and data from experiments involving interventions (i.e. interventional data) without hidden variables). For causal inference the IDA algorithm, the Generalized Backdoor Criterion (GBC), the Generalized Adjustment Criterion (GAC) and some related functions are implemented. Functions for incorporating background knowledge are provided.

Maintained by Markus Kalisch. Last updated 7 months ago.

openblas cpp

7.30 score 700 scripts 19 dependents

robinhankin

onion:Octonions and Quaternions

Quaternions and Octonions are four- and eight- dimensional extensions of the complex numbers. They are normed division algebras over the real numbers and find applications in spatial rotations (quaternions), and string theory and relativity (octonions). The quaternions are noncommutative and the octonions nonassociative. See the package vignette for more details.

Maintained by Robin K. S. Hankin. Last updated 1 months ago.

6 stars 7.27 score 43 scripts 3 dependents

bioc

IHW:Independent Hypothesis Weighting

Independent hypothesis weighting (IHW) is a multiple testing procedure that increases power compared to the method of Benjamini and Hochberg by assigning data-driven weights to each hypothesis. The input to IHW is a two-column table of p-values and covariates. The covariate can be any continuous-valued or categorical variable that is thought to be informative on the statistical properties of each hypothesis test, while it is independent of the p-value under the null hypothesis.

Maintained by Nikos Ignatiadis. Last updated 5 months ago.

immunooncology multiplecomparison rnaseq

7.25 score 264 scripts 2 dependents

edzer

trajectories:Classes and Methods for Trajectory Data

Classes and methods for trajectory data, with support for nesting individual Track objects in track sets (Tracks) and track sets for different entities in collections of Tracks. Methods include selection, generalization, aggregation, intersection, simulation, and plotting.

Maintained by Edzer Pebesma. Last updated 7 months ago.

31 stars 7.25 score 76 scripts 1 dependents

bioc

qpgraph:Estimation of Genetic and Molecular Regulatory Networks from High-Throughput Genomics Data

Estimate gene and eQTL networks from high-throughput expression and genotyping assays.

Maintained by Robert Castelo. Last updated 2 days ago.

microarray geneexpression transcription pathways networkinference graphandnetwork generegulation genetics geneticvariability snp software openblas

3 stars 7.24 score 20 scripts 3 dependents

ropensci

melt:Multiple Empirical Likelihood Tests

Performs multiple empirical likelihood tests. It offers an easy-to-use interface and flexibility in specifying hypotheses and calibration methods, extending the framework to simultaneous inferences. The core computational routines are implemented using the 'Eigen' 'C++' library and 'RcppEigen' interface, with 'OpenMP' for parallel computation. Details of the testing procedures are provided in Kim, MacEachern, and Peruggia (2023) <doi:10.1080/10485252.2023.2206919>. A companion paper by Kim, MacEachern, and Peruggia (2024) <doi:10.18637/jss.v108.i05> is available for further information. This work was supported by the U.S. National Science Foundation under Grants No. SES-1921523 and DMS-2015552.

Maintained by Eunseop Kim. Last updated 11 months ago.

cpp openmp

12 stars 7.24 score 84 scripts

wbnicholson

BigVAR:Dimension Reduction Methods for Multivariate Time Series

Estimates VAR and VARX models with Structured Penalties.

Maintained by Will Nicholson. Last updated 6 months ago.

openblas cpp

58 stars 7.24 score 100 scripts 1 dependents

vpihur

clValid:Validation of Clustering Results

Statistical and biological validation of clustering results. This package implements Dunn Index, Silhouette, Connectivity, Stability, BHI and BSI. Further information can be found in Brock, G et al. (2008) <doi: 10.18637/jss.v025.i04>.

Maintained by Vasyl Pihur. Last updated 4 years ago.

5 stars 7.24 score 422 scripts 14 dependents

dankelley

plan:Tools for Project Planning

Supports the creation of 'burndown' charts and 'gantt' diagrams.

Maintained by Dan Kelley. Last updated 2 years ago.

33 stars 7.23 score 103 scripts

jhorzek

lessSEM:Non-Smooth Regularization for Structural Equation Models

Provides regularized structural equation modeling (regularized SEM) with non-smooth penalty functions (e.g., lasso) building on 'lavaan'. The package is heavily inspired by the ['regsem'](<https://github.com/Rjacobucci/regsem>) and ['lslx'](<https://github.com/psyphh/lslx>) packages.

Maintained by Jannik H. Orzek. Last updated 1 years ago.

lasso psychometrics regularization regularized-structural-equation-model sem structural-equation-modeling openblas cpp openmp

7 stars 7.19 score 223 scripts

paulponcet

statip:Statistical Functions for Probability Distributions and Regression

A collection of miscellaneous statistical functions for probability distributions: 'dbern()', 'pbern()', 'qbern()', 'rbern()' for the Bernoulli distribution, and 'distr2name()', 'name2distr()' for distribution names; probability density estimation: 'densityfun()'; most frequent value estimation: 'mfv()', 'mfv1()'; other statistical measures of location: 'cv()' (coefficient of variation), 'midhinge()', 'midrange()', 'trimean()'; construction of histograms: 'histo()', 'find_breaks()'; calculation of the Hellinger distance: 'hellinger()'; use of classical kernels: 'kernelfun()', 'kernel_properties()'; univariate piecewise-constant regression: 'picor()'.

Maintained by Paul Poncet. Last updated 5 years ago.

2 stars 7.17 score 73 scripts 52 dependents

datastorm-open

rAmCharts:JavaScript Charts Tool

Provides an R interface for using 'AmCharts' Library. Based on 'htmlwidgets', it provides a global architecture to generate 'JavaScript' source code for charts. Most of classes in the library have their equivalent in R with S4 classes; for those classes, not all properties have been referenced but can easily be added in the constructors. Complex properties (e.g. 'JavaScript' object) can be passed as named list. See examples at <https://datastorm-open.github.io/introduction_ramcharts/> and <https://www.amcharts.com/> for more information about the library. The package includes the free version of 'AmCharts' Library. Its only limitation is a small link to the web site displayed on your charts. If you enjoy this library, do not hesitate to refer to this page <https://www.amcharts.com/online-store/> to purchase a licence, and thus support its creators and get a period of Priority Support. See also <https://www.amcharts.com/about/> for more information about 'AmCharts' company.

Maintained by Benoit Thieurmel. Last updated 2 months ago.

49 stars 7.17 score 153 scripts 4 dependents

jellegoeman

penalized:L1 (Lasso and Fused Lasso) and L2 (Ridge) Penalized Estimation in GLMs and in the Cox Model

Fitting possibly high dimensional penalized regression models. The penalty structure can be any combination of an L1 penalty (lasso and fused lasso), an L2 penalty (ridge) and a positivity constraint on the regression coefficients. The supported regression models are linear, logistic and Poisson regression and the Cox Proportional Hazards model. Cross-validation routines allow optimization of the tuning parameters.

Maintained by Jelle Goeman. Last updated 3 years ago.

openblas cpp

4 stars 7.09 score 429 scripts 17 dependents

optad

adoptr:Adaptive Optimal Two-Stage Designs

Optimize one or two-arm, two-stage designs for clinical trials with respect to several implemented objective criteria or custom objectives. Optimization under uncertainty and conditional (given stage-one outcome) constraints are supported. See Pilz et al. (2019) <doi:10.1002/sim.8291> and Kunzmann et al. (2021) <doi:10.18637/jss.v098.i09> for details.

Maintained by Maximilian Pilz. Last updated 6 months ago.

1 stars 7.09 score 39 scripts 1 dependents

khliland

baseline:Baseline Correction of Spectra

Collection of baseline correction algorithms, along with a framework and a Tcl/Tk enabled GUI for optimising baseline algorithm parameters. Typical use of the package is for removing background effects from spectra originating from various types of spectroscopy and spectrometry, possibly optimizing this with regard to regression or classification results. Correction methods include polynomial fitting, weighted local smoothers and many more.

Maintained by Kristian Hovde Liland. Last updated 10 months ago.

9 stars 7.07 score 74 scripts 12 dependents

spedygiorgio

lifecontingencies:Financial and Actuarial Mathematics for Life Contingencies

Classes and methods that allow the user to manage life table, actuarial tables (also multiple decrements tables). Moreover, functions to easily perform demographic, financial and actuarial mathematics on life contingencies insurances calculations are contained therein. See Spedicato (2013) <doi:10.18637/jss.v055.i10>.

Maintained by Giorgio Alfredo Spedicato. Last updated 6 months ago.

actuarial financial life-contingencies life-insurance cpp

61 stars 7.06 score 156 scripts

crp2a

gamma:Dose Rate Estimation from in-Situ Gamma-Ray Spectrometry Measurements

Process in-situ Gamma-Ray Spectrometry for Luminescence Dating. This package allows to import, inspect and correct the energy shifts of gamma-ray spectra. It provides methods for estimating the gamma dose rate by the use of a calibration curve as described in Mercier and Falguères (2007). The package only supports Canberra CNF and TKA and Kromek SPE files.

Maintained by Archéosciences Bordeaux. Last updated 6 months ago.

archaeometry gamma-spectrometry geochronology luminescence-dating

7 stars 7.05 score 11 scripts 1 dependents

yuimaproject

yuima:The YUIMA Project Package for SDEs

Simulation and Inference for SDEs and Other Stochastic Processes.

Maintained by Stefano M. Iacus. Last updated 2 days ago.

openblas cpp

9 stars 7.02 score 92 scripts 2 dependents

bips-hb

innsight:Get the Insights of Your Neural Network

Interpretation methods for analyzing the behavior and individual predictions of modern neural networks in a three-step procedure: Converting the model, running the interpretation method, and visualizing the results. Implemented methods are, e.g., 'Connection Weights' described by Olden et al. (2004) <doi:10.1016/j.ecolmodel.2004.03.013>, layer-wise relevance propagation ('LRP') described by Bach et al. (2015) <doi:10.1371/journal.pone.0130140>, deep learning important features ('DeepLIFT') described by Shrikumar et al. (2017) <doi:10.48550/arXiv.1704.02685> and gradient-based methods like 'SmoothGrad' described by Smilkov et al. (2017) <doi:10.48550/arXiv.1706.03825>, 'Gradient x Input' or 'Vanilla Gradient'. Details can be found in the accompanying scientific paper: Koenen & Wright (2024, Journal of Statistical Software, <doi:10.18637/jss.v111.i08>).

Maintained by Niklas Koenen. Last updated 4 months ago.

30 stars 7.01 score 57 scripts

doccstat

fastcpd:Fast Change Point Detection via Sequential Gradient Descent

Implements fast change point detection algorithm based on the paper "Sequential Gradient Descent and Quasi-Newton's Method for Change-Point Analysis" by Xianyang Zhang, Trisha Dawn <https://proceedings.mlr.press/v206/zhang23b.html>. The algorithm is based on dynamic programming with pruning and sequential gradient descent. It is able to detect change points a magnitude faster than the vanilla Pruned Exact Linear Time(PELT). The package includes examples of linear regression, logistic regression, Poisson regression, penalized linear regression data, and whole lot more examples with custom cost function in case the user wants to use their own cost function.

Maintained by Xingchi Li. Last updated 10 days ago.

change-point-detection cpp custom-function gradient-descent lasso linear-regression logistic-regression offline pelt penalized-regression poisson-regression quasi-newton statistics time-series warm-start fortran openblas cpp openmp

22 stars 7.00 score 7 scripts

roustant

DiceKriging:Kriging Methods for Computer Experiments

Estimation, validation and prediction of kriging models. Important functions : km, print.km, plot.km, predict.km.

Maintained by Olivier Roustant. Last updated 4 years ago.

4 stars 6.99 score 526 scripts 37 dependents

sylvainschmitt

SSDM:Stacked Species Distribution Modelling

Allows to map species richness and endemism based on stacked species distribution models (SSDM). Individuals SDMs can be created using a single or multiple algorithms (ensemble SDMs). For each species, an SDM can yield a habitat suitability map, a binary map, a between-algorithm variance map, and can assess variable importance, algorithm accuracy, and between- algorithm correlation. Methods to stack individual SDMs include summing individual probabilities and thresholding then summing. Thresholding can be based on a specific evaluation metric or by drawing repeatedly from a Bernoulli distribution. The SSDM package also provides a user-friendly interface.

Maintained by Sylvain Schmitt. Last updated 11 months ago.

44 stars 6.99 score 44 scripts

r-forge

oompaBase:Class Unions, Matrix Operations, and Color Schemes for OOMPA

Provides the class unions that must be preloaded in order for the basic tools in the OOMPA (Object-Oriented Microarray and Proteomics Analysis) project to be defined and loaded. It also includes vectorized operations for row-by-row means, variances, and t-tests. Finally, it provides new color schemes. Details on the packages in the OOMPA project can be found at <http://oompa.r-forge.r-project.org/>.

Maintained by Kevin R. Coombes. Last updated 2 months ago.

infrastructure

6.97 score 29 scripts 18 dependents

bioc

ROC:utilities for ROC, with microarray focus

Provide utilities for ROC, with microarray focus.

Maintained by Vince Carey. Last updated 5 months ago.

differentialexpression

6.95 score 70 scripts 8 dependents

bioc

CellNOptR:Training of boolean logic models of signalling networks using prior knowledge networks and perturbation data

This package does optimisation of boolean logic networks of signalling pathways based on a previous knowledge network and a set of data upon perturbation of the nodes in the network.

Maintained by Attila Gabor. Last updated 4 days ago.

cellbasedassays cellbiology proteomics pathways network timecourse immunooncology

6.95 score 98 scripts 6 dependents

archaeostat

ArchaeoPhases:Post-Processing of Markov Chain Monte Carlo Simulations for Chronological Modelling

Statistical analysis of archaeological dates and groups of dates. This package allows to post-process Markov Chain Monte Carlo (MCMC) simulations from 'ChronoModel' <https://chronomodel.com/>, 'Oxcal' <https://c14.arch.ox.ac.uk/oxcal.html> or 'BCal' <https://bcal.shef.ac.uk/>. It provides functions for the study of rhythms of the long term from the posterior distribution of a series of dates (tempo and activity plot). It also allows the estimation and visualization of time ranges from the posterior distribution of groups of dates (e.g. duration, transition and hiatus between successive phases) as described in Philippe and Vibet (2020) <doi:10.18637/jss.v093.c01>.

Maintained by Anne Philippe. Last updated 12 months ago.

archaeology bayesian-statistics geochronology markov-chain radiocarbon-dates

10 stars 6.90 score 66 scripts

kingaa

ouch:Ornstein-Uhlenbeck Models for Phylogenetic Comparative Hypotheses

Fit and compare Ornstein-Uhlenbeck models for evolution along a phylogenetic tree.

Maintained by Aaron A. King. Last updated 5 months ago.

adaptive-regime brownian-motion ornstein-uhlenbeck ornstein-uhlenbeck-models ouch phylogenetic-comparative-hypotheses phylogenetic-comparative-methods phylogenetic-data react

15 stars 6.87 score 68 scripts 4 dependents

flr

FLasher:Projection and Forecasting of Fish Populations, Stocks and Fleets

Projection of future population and fishery dynamics is carried out for a given set of management targets. A system of equations is solved, using Automatic Differentation (AD), for the levels of effort by fishery (fleet) that will result in the required abundances, catches or fishing mortalities.

Maintained by Iago Mosqueira. Last updated 21 days ago.

forecast fisheries flr cpp

2 stars 6.86 score 254 scripts 6 dependents

pcruniversum

chipPCR:Toolkit of Helper Functions to Pre-Process Amplification Data

A collection of functions to pre-process amplification curve data from polymerase chain reaction (PCR) or isothermal amplification reactions. Contains functions to normalize and baseline amplification curves, to detect both the start and end of an amplification reaction, several smoothers (e.g., LOWESS, moving average, cubic splines, Savitzky-Golay), a function to detect false positive amplification reactions and a function to determine the amplification efficiency. Quantification point (Cq) methods include the first (FDM) and second approximate derivative maximum (SDM) methods (calculated by a 5-point-stencil) and the cycle threshold method. Data sets of experimental nucleic acid amplification systems ('VideoScan HCU', capillary convective PCR (ccPCR)) and commercial systems are included. Amplification curves were generated by helicase dependent amplification (HDA), ccPCR or PCR. As detection system intercalating dyes (EvaGreen, SYBR Green) and hydrolysis probes (TaqMan) were used. For more information see: Roediger et al. (2015) <doi:10.1093/bioinformatics/btv205>.

Maintained by Stefan Roediger. Last updated 4 years ago.

8 stars 6.84 score 97 scripts 1 dependents

philips-software

latrend:A Framework for Clustering Longitudinal Data

A framework for clustering longitudinal datasets in a standardized way. The package provides an interface to existing R packages for clustering longitudinal univariate trajectories, facilitating reproducible and transparent analyses. Additionally, standard tools are provided to support cluster analyses, including repeated estimation, model validation, and model assessment. The interface enables users to compare results between methods, and to implement and evaluate new methods with ease. The 'akmedoids' package is available from <https://github.com/MAnalytics/akmedoids>.

Maintained by Niek Den Teuling. Last updated 3 months ago.

cluster-analysis clustering-evaluation clustering-methods data-science longitudinal-clustering longitudinal-data mixture-models time-series-analysis

30 stars 6.77 score 26 scripts

achubaty

grainscape:Landscape Connectivity, Habitat, and Protected Area Networks

Given a landscape resistance surface, creates minimum planar graph (Fall et al. (2007) <doi:10.1007/s10021-007-9038-7>) and grains of connectivity (Galpern et al. (2012) <doi:10.1111/j.1365-294X.2012.05677.x>) models that can be used to calculate effective distances for landscape connectivity at multiple scales. Documentation is provided by several vignettes, and a paper (Chubaty, Galpern & Doctolero (2020) <doi:10.1111/2041-210X.13350>).

Maintained by Alex M Chubaty. Last updated 2 months ago.

habitat-connectivity landscape-connectivity spatial-graphs cpp

19 stars 6.76 score 20 scripts

flr

FLa4a:A Simple and Robust Statistical Catch at Age Model

A simple and robust statistical Catch at Age model that is specifically designed for stocks with intermediate levels of data quantity and quality.

Maintained by Ernesto Jardim. Last updated 7 days ago.

12 stars 6.71 score 177 scripts 2 dependents

dgerlanc

portfolio:Analysing Equity Portfolios

Classes for analysing and implementing equity portfolios, including routines for generating tradelists and calculating exposures to user-specified risk factors.

Maintained by Daniel Gerlanc. Last updated 7 months ago.

finance portfolio-construction risk-modelling

16 stars 6.71 score 106 scripts

bioc

doppelgangR:Identify likely duplicate samples from genomic or meta-data

The main function is doppelgangR(), which takes as minimal input a list of ExpressionSet object, and searches all list pairs for duplicated samples. The search is based on the genomic data (exprs(eset)), phenotype/clinical data (pData(eset)), and "smoking guns" - supposedly unique identifiers found in pData(eset).

Maintained by Levi Waldron. Last updated 5 months ago.

immunooncology rnaseq microarray geneexpression qualitycontrol bioconductor-package

5 stars 6.67 score 31 scripts

r-forge

distrEx:Extensions of Package 'distr'

Extends package 'distr' by functionals, distances, and conditional distributions.

Maintained by Matthias Kohl. Last updated 2 months ago.

6.64 score 107 scripts 17 dependents

bioc

LEA:LEA: an R package for Landscape and Ecological Association Studies

LEA is an R package dedicated to population genomics, landscape genomics and genotype-environment association tests. LEA can run analyses of population structure and genome-wide tests for local adaptation, and also performs imputation of missing genotypes. The package includes statistical methods for estimating ancestry coefficients from large genotypic matrices and for evaluating the number of ancestral populations (snmf). It performs statistical tests using latent factor mixed models for identifying genetic polymorphisms that exhibit association with environmental gradients or phenotypic traits (lfmm2). In addition, LEA computes values of genetic offset statistics based on new or predicted environments (genetic.gap, genetic.offset). LEA is mainly based on optimized programs that can scale with the dimensions of large data sets.

Maintained by Olivier Francois. Last updated 17 days ago.

software statistical method clustering regression openblas

6.63 score 534 scripts

r-forge

distrMod:Object Oriented Implementation of Probability Models

Implements S4 classes for probability models based on packages 'distr' and 'distrEx'.

Maintained by Peter Ruckdeschel. Last updated 2 months ago.

6.60 score 139 scripts 6 dependents

bioc

kebabs:Kernel-Based Analysis of Biological Sequences

The package provides functionality for kernel-based analysis of DNA, RNA, and amino acid sequences via SVM-based methods. As core functionality, kebabs implements following sequence kernels: spectrum kernel, mismatch kernel, gappy pair kernel, and motif kernel. Apart from an efficient implementation of standard position-independent functionality, the kernels are extended in a novel way to take the position of patterns into account for the similarity measure. Because of the flexibility of the kernel formulation, other kernels like the weighted degree kernel or the shifted weighted degree kernel with constant weighting of positions are included as special cases. An annotation-specific variant of the kernels uses annotation information placed along the sequence together with the patterns in the sequence. The package allows for the generation of a kernel matrix or an explicit feature representation in dense or sparse format for all available kernels which can be used with methods implemented in other R packages. With focus on SVM-based methods, kebabs provides a framework which simplifies the usage of existing SVM implementations in kernlab, e1071, and LiblineaR. Binary and multi-class classification as well as regression tasks can be used in a unified way without having to deal with the different functions, parameters, and formats of the selected SVM. As support for choosing hyperparameters, the package provides cross validation - including grouped cross validation, grid search and model selection functions. For easier biological interpretation of the results, the package computes feature weights for all SVMs and prediction profiles which show the contribution of individual sequence positions to the prediction result and indicate the relevance of sequence sections for the learning result and the underlying biological functions.

Maintained by Ulrich Bodenhofer. Last updated 5 months ago.

supportvectormachine classification clustering regression cpp

6.58 score 47 scripts 3 dependents

flr

FLBRP:Reference Points for Fisheries Management

Calculates a range of biological reference points based upon yield per recruit and stock recruit based equilibrium calculations. These include F based reference points like F0.1, FMSY and biomass based reference points like BMSY.

Maintained by Iago Mosqueira. Last updated 4 months ago.

reference points fisheries flr cpp

2 stars 6.58 score 350 scripts 4 dependents

toxpi

toxpiR:Create ToxPi Prioritization Models

Enables users to build 'ToxPi' prioritization models and provides functionality within the grid framework for plotting ToxPi graphs. 'toxpiR' allows for more customization than the 'ToxPi GUI' (<https://toxpi.org>) and integration into existing workflows for greater ease-of-use, reproducibility, and transparency. toxpiR package behaves nearly identically to the GUI; the package documentation includes notes about all differences. The vignettes download example files from <https://github.com/ToxPi/ToxPi-example-files>.

Maintained by Jonathon F Fleming. Last updated 7 months ago.

data-science modeling toxicology

11 stars 6.58 score 19 scripts

spkaluzny

splus2R:Supplemental S-PLUS Functionality in R

Currently there are many functions in S-PLUS that are missing in R. To facilitate the conversion of S-PLUS packages to R packages, this package provides some missing S-PLUS functionality in R.

Maintained by Stephen Kaluzny. Last updated 1 years ago.

1 stars 6.56 score 82 scripts 30 dependents

fbertran

Cascade:Selection, Reverse-Engineering and Prediction in Cascade Networks

A modeling tool allowing gene selection, reverse engineering, and prediction in cascade networks. Jung, N., Bertrand, F., Bahram, S., Vallat, L., and Maumy-Bertrand, M. (2014) <doi:10.1093/bioinformatics/btt705>.

Maintained by Frederic Bertrand. Last updated 2 years ago.

1 stars 6.56 score 40 scripts 2 dependents

bachmannpatrick

CLVTools:Tools for Customer Lifetime Value Estimation

A set of state-of-the-art probabilistic modeling approaches to derive estimates of individual customer lifetime values (CLV). Commonly, probabilistic approaches focus on modelling 3 processes, i.e. individuals' attrition, transaction, and spending process. Latent customer attrition models, which are also known as "buy-'til-you-die models", model the attrition as well as the transaction process. They are used to make inferences and predictions about transactional patterns of individual customers such as their future purchase behavior. Moreover, these models have also been used to predict individuals’ long-term engagement in activities such as playing an online game or posting to a social media platform. The spending process is usually modelled by a separate probabilistic model. Combining these results yields in lifetime values estimates for individual customers. This package includes fast and accurate implementations of various probabilistic models for non-contractual settings (e.g., grocery purchases or hotel visits). All implementations support time-invariant covariates, which can be used to control for e.g., socio-demographics. If such an extension has been proposed in literature, we further provide the possibility to control for time-varying covariates to control for e.g., seasonal patterns. Currently, the package includes the following latent attrition models to model individuals' attrition and transaction process: [1] Pareto/NBD model (Pareto/Negative-Binomial-Distribution), [2] the Extended Pareto/NBD model (Pareto/Negative-Binomial-Distribution with time-varying covariates), [3] the BG/NBD model (Beta-Gamma/Negative-Binomial-Distribution) and the [4] GGom/NBD (Gamma-Gompertz/Negative-Binomial-Distribution). Further, we provide an implementation of the Gamma/Gamma model to model the spending process of individuals.

Maintained by Patrick Bachmann. Last updated 4 months ago.

clv customer-lifetime-value customer-relationship-management openblas gsl cpp openmp

55 stars 6.47 score 12 scripts

r-forge

ClassComparison:Classes and Methods for "Class Comparison" Problems on Microarrays

Defines the classes used for "class comparison" problems in the OOMPA project (<http://oompa.r-forge.r-project.org/>). Class comparison includes tests for differential expression; see Simon's book for details on typical problem types.

Maintained by Kevin R. Coombes. Last updated 2 months ago.

microarray differentialexpression multiplecomparisons

6.46 score 44 scripts 3 dependents

bioc

MultiDataSet:Implementation of MultiDataSet and ResultSet

Implementation of the BRGE's (Bioinformatic Research Group in Epidemiology from Center for Research in Environmental Epidemiology) MultiDataSet and ResultSet. MultiDataSet is designed for integrating multi omics data sets and ResultSet is a container for omics results. This package contains base classes for MEAL and rexposome packages.

Maintained by Xavier Escribà Montagut. Last updated 5 months ago.

software datarepresentation

6.45 score 28 scripts 10 dependents

bstatcomp

bayes4psy:User Friendly Bayesian Data Analysis for Psychology

Contains several Bayesian models for data analysis of psychological tests. A user friendly interface for these models should enable students and researchers to perform professional level Bayesian data analysis without advanced knowledge in programming and Bayesian statistics. This package is based on the Stan platform (Carpenter et el. 2017 <doi:10.18637/jss.v076.i01>).

Maintained by Jure Demšar. Last updated 1 years ago.

cpp

14 stars 6.44 score 33 scripts

bioc

PICS:Probabilistic inference of ChIP-seq

Probabilistic inference of ChIP-Seq using an empirical Bayes mixture model approach.

Maintained by Renan Sauteraud. Last updated 2 days ago.

clustering visualization sequencing chipseq gsl

6.43 score 7 scripts 1 dependents

thibautjombart

apex:Phylogenetic Methods for Multiple Gene Data

Toolkit for the analysis of multiple gene data (Jombart et al. 2017) <doi:10.1111/1755-0998.12567>. 'apex' implements the new S4 classes 'multidna', 'multiphyDat' and associated methods to handle aligned DNA sequences from multiple genes.

Maintained by Klaus Schliep. Last updated 1 years ago.

5 stars 6.40 score 54 scripts

victor-navarro

calmr:Canonical Associative Learning Models and their Representations

Implementations of canonical associative learning models, with tools to run experiment simulations, estimate model parameters, and compare model representations. Experiments and results are represented using S4 classes and methods.

Maintained by Victor Navarro. Last updated 10 months ago.

3 stars 6.40 score 17 scripts

blue-matter

SAMtool:Stock Assessment Methods Toolkit

Simulation tools for closed-loop simulation are provided for the 'MSEtool' operating model to inform data-rich fisheries. 'SAMtool' provides a conditioning model, assessment models of varying complexity with standardized reporting, model-based management procedures, and diagnostic tools for evaluating assessments inside closed-loop simulation.

Maintained by Quang Huynh. Last updated 1 months ago.

cpp

3 stars 6.39 score 36 scripts 1 dependents

ropensci

QuadratiK:Collection of Methods Constructed using Kernel-Based Quadratic Distances

It includes test for multivariate normality, test for uniformity on the d-dimensional Sphere, non-parametric two- and k-sample tests, random generation of points from the Poisson kernel-based density and clustering algorithm for spherical data. For more information see Saraceno G., Markatou M., Mukhopadhyay R. and Golzy M. (2024) <doi:10.48550/arXiv.2402.02290> Markatou, M. and Saraceno, G. (2024) <doi:10.48550/arXiv.2407.16374>, Ding, Y., Markatou, M. and Saraceno, G. (2023) <doi:10.5705/ss.202022.0347>, and Golzy, M. and Markatou, M. (2020) <doi:10.1080/10618600.2020.1740713>.

Maintained by Giovanni Saraceno. Last updated 2 months ago.

cpp

1 stars 6.36 score 27 scripts

stc04003

reReg:Recurrent Event Regression

A comprehensive collection of practical and easy-to-use tools for regression analysis of recurrent events, with or without the presence of a (possibly) informative terminal event described in Chiou et al. (2023) <doi:10.18637/jss.v105.i05>. The modeling framework is based on a joint frailty scale-change model, that includes models described in Wang et al. (2001) <doi:10.1198/016214501753209031>, Huang and Wang (2004) <doi:10.1198/016214504000001033>, Xu et al. (2017) <doi:10.1080/01621459.2016.1173557>, and Xu et al. (2019) <doi:10.5705/SS.202018.0224> as special cases. The implemented estimating procedure does not require any parametric assumption on the frailty distribution. The package also allows the users to specify different model forms for both the recurrent event process and the terminal event.

Maintained by Sy Han (Steven) Chiou. Last updated 2 months ago.

openblas cpp

23 stars 6.35 score 36 scripts 1 dependents

dfriend21

quadtree:Region Quadtrees for Spatial Data

Provides functionality for working with raster-like quadtrees (also called “region quadtrees”), which allow for variable-sized cells. The package allows for flexibility in the quadtree creation process. Several functions defining how to split and aggregate cells are provided, and custom functions can be written for both of these processes. In addition, quadtrees can be created using other quadtrees as “templates”, so that the new quadtree's structure is identical to the template quadtree. The package also includes functionality for modifying quadtrees, querying values, saving quadtrees to a file, and calculating least-cost paths using the quadtree as a resistance surface.

Maintained by Derek Friend. Last updated 2 years ago.

cpp

19 stars 6.34 score 58 scripts

cran

fGarch:Rmetrics - Autoregressive Conditional Heteroskedastic Modelling

Analyze and model heteroskedastic behavior in financial time series.

Maintained by Georgi N. Boshnakov. Last updated 1 years ago.

fortran

7 stars 6.33 score 51 dependents

smoeding

usl:Analyze System Scalability with the Universal Scalability Law

The Universal Scalability Law (Gunther 2007) <doi:10.1007/978-3-540-31010-5> is a model to predict hardware and software scalability. It uses system capacity as a function of load to forecast the scalability for the system.

Maintained by Stefan Moeding. Last updated 3 years ago.

scalability universal-scalability-law usl

36 stars 6.32 score 117 scripts

larmarange

prevR:Estimating Regional Trends of a Prevalence from a DHS and Similar Surveys

Spatial estimation of a prevalence surface or a relative risks surface, using data from a Demographic and Health Survey (DHS) or an analog survey, see Larmarange et al. (2011) <doi:10.4000/cybergeo.24606>.

Maintained by Joseph Larmarange. Last updated 6 months ago.

5 stars 6.26 score 46 scripts

bioc

lumi:BeadArray Specific Methods for Illumina Methylation and Expression Microarrays

The lumi package provides an integrated solution for the Illumina microarray data analysis. It includes functions of Illumina BeadStudio (GenomeStudio) data input, quality control, BeadArray-specific variance stabilization, normalization and gene annotation at the probe level. It also includes the functions of processing Illumina methylation microarrays, especially Illumina Infinium methylation microarrays.

Maintained by Lei Huang. Last updated 5 months ago.

microarray onechannel preprocessing dnamethylation qualitycontrol twochannel

6.26 score 294 scripts 5 dependents

rickhelmus

patRoon:Workflows for Mass-Spectrometry Based Non-Target Analysis

Provides an easy-to-use interface to a mass spectrometry based non-target analysis workflow. Various (open-source) tools are combined which provide algorithms for extraction and grouping of features, extraction of MS and MS/MS data, automatic formula and compound annotation and grouping related features to components. In addition, various tools are provided for e.g. data preparation and cleanup, plotting results and automatic reporting.

Maintained by Rick Helmus. Last updated 7 days ago.

mass-spectrometry non-target cpp openjdk

65 stars 6.24 score 43 scripts

fbertran

Patterns:Deciphering Biological Networks with Patterned Heterogeneous Measurements

A modeling tool dedicated to biological network modeling (Bertrand and others 2020, <doi:10.1093/bioinformatics/btaa855>). It allows for single or joint modeling of, for instance, genes and proteins. It starts with the selection of the actors that will be the used in the reverse engineering upcoming step. An actor can be included in that selection based on its differential measurement (for instance gene expression or protein abundance) or on its time course profile. Wrappers for actors clustering functions and cluster analysis are provided. It also allows reverse engineering of biological networks taking into account the observed time course patterns of the actors. Many inference functions are provided and dedicated to get specific features for the inferred network such as sparsity, robust links, high confidence links or stable through resampling links. Some simulation and prediction tools are also available for cascade networks (Jung and others 2014, <doi:10.1093/bioinformatics/btt705>). Example of use with microarray or RNA-Seq data are provided.

Maintained by Frederic Bertrand. Last updated 11 months ago.

4 stars 6.16 score 18 scripts

clarahapp

funData:An S4 Class for Functional Data

S4 classes for univariate and multivariate functional data with utility functions. See <doi:10.18637/jss.v093.i05> for a detailed description of the package functionalities and its interplay with the MFPCA package for multivariate functional principal component analysis <https://CRAN.R-project.org/package=MFPCA>.

Maintained by Clara Happ-Kurz. Last updated 1 years ago.

14 stars 6.15 score 111 scripts 6 dependents

umr-amap

AMAPVox:LiDAR Data Voxelisation

Read, manipulate and write voxel spaces. Voxel spaces are read from text-based output files of the 'AMAPVox' software. 'AMAPVox' is a LiDAR point cloud voxelisation software that aims at estimating leaf area through several theoretical/numerical approaches. See more in the article Vincent et al. (2017) <doi:10.23708/1AJNMP> and the technical note Vincent et al. (2021) <doi:10.23708/1AJNMP>.

Maintained by Philippe Verley. Last updated 2 months ago.

15 stars 6.13 score 12 scripts

distancedevelopment

dssd:Distance Sampling Survey Design

Creates survey designs for distance sampling surveys. These designs can be assessed for various effort and coverage statistics. Once the user is satisfied with the design characteristics they can generate a set of transects to use in their distance sampling survey. Many of the designs implemented in this R package were first made available in our 'Distance' for Windows software and are detailed in Chapter 7 of Advanced Distance Sampling, Buckland et. al. (2008, ISBN-13: 978-0199225873). Find out more about estimating animal/plant abundance with distance sampling at <http://distancesampling.org/>.

Maintained by Laura Marshall. Last updated 3 months ago.

2 stars 6.11 score 36 scripts 1 dependents

carlos-alberto-silva

rGEDI:NASA's Global Ecosystem Dynamics Investigation (GEDI) Data Visualization and Processing

Set of tools for downloading, reading, visualizing and processing GEDI Level1B, Level2A and Level2B data.

Maintained by Caio Hamamura. Last updated 5 months ago.

169 stars 6.11 score 85 scripts 1 dependents

geobosh

sarima:Simulation and Prediction with Seasonal ARIMA Models

Functions, classes and methods for time series modelling with ARIMA and related models. The aim of the package is to provide consistent interface for the user. For example, a single function autocorrelations() computes various kinds of theoretical and sample autocorrelations. This is work in progress, see the documentation and vignettes for the current functionality. Function sarima() fits extended multiplicative seasonal ARIMA models with trends, exogenous variables and arbitrary roots on the unit circle, which can be fixed or estimated (for the algebraic basis for this see <arXiv:2208.05055>, a paper on the methodology is being prepared).

Maintained by Georgi N. Boshnakov. Last updated 1 years ago.

arima kalman-filter reg-sarima sarima sarimax seasonal time-series xarima openblas cpp

3 stars 6.09 score 112 scripts 1 dependents

bioc

Pedixplorer:Pedigree Functions

Routines to handle family data with a Pedigree object. The initial purpose was to create correlation structures that describe family relationships such as kinship and identity-by-descent, which can be used to model family data in mixed effects models, such as in the coxme function. Also includes a tool for Pedigree drawing which is focused on producing compact layouts without intervention. Recent additions include utilities to trim the Pedigree object with various criteria, and kinship for the X chromosome.

Maintained by Louis Le Nezet. Last updated 13 days ago.

software datarepresentation genetics graphandnetwork visualization kinship pedigree

2 stars 6.08 score 10 scripts

bioc

mgsa:Model-based gene set analysis

Model-based Gene Set Analysis (MGSA) is a Bayesian modeling approach for gene set enrichment. The package mgsa implements MGSA and tools to use MGSA together with the Gene Ontology.

Maintained by Sebastian Bauer. Last updated 5 months ago.

pathways go genesetenrichment openmp

5 stars 6.08 score 12 scripts

bioc

qcmetrics:A Framework for Quality Control

The package provides a framework for generic quality control of data. It permits to create, manage and visualise individual or sets of quality control metrics and generate quality control reports in various formats.

Maintained by Laurent Gatto. Last updated 5 months ago.

immunooncology software qualitycontrol proteomics microarray massspectrometry visualization reportwriting

2 stars 6.03 score 2 dependents

bioc

FELLA:Interpretation and enrichment for metabolomics data

Enrichment of metabolomics data using KEGG entries. Given a set of affected compounds, FELLA suggests affected reactions, enzymes, modules and pathways using label propagation in a knowledge model network. The resulting subnetwork can be visualised and exported.

Maintained by Sergio Picart-Armada. Last updated 5 months ago.

software metabolomics graphandnetwork kegg go pathways network networkenrichment

6.01 score 32 scripts

ltorgo

performanceEstimation:An Infra-Structure for Performance Estimation of Predictive Models

An infra-structure for estimating the predictive performance of predictive models. In this context, it can also be used to compare and/or select among different alternative ways of solving one or more predictive tasks. The main goal of the package is to provide a generic infra-structure to estimate the values of different metrics of predictive performance using different estimation procedures. These estimation tasks can be applied to any solutions (workflows) to the predictive tasks. The package provides easy to use standard workflows that allow the usage of any available R modeling algorithm together with some pre-defined data pre-processing steps and also prediction post- processing methods. It also provides means for addressing issues related with the statistical significance of the observed differences.

Maintained by Luis Torgo. Last updated 8 years ago.

16 stars 5.97 score 195 scripts 1 dependents

jtimonen

lgpr:Longitudinal Gaussian Process Regression

Interpretable nonparametric modeling of longitudinal data using additive Gaussian process regression. Contains functionality for inferring covariate effects and assessing covariate relevances. Models are specified using a convenient formula syntax, and can include shared, group-specific, non-stationary, heterogeneous and temporally uncertain effects. Bayesian inference for model parameters is performed using 'Stan'. The modeling approach and methods are described in detail in Timonen et al. (2021) <doi:10.1093/bioinformatics/btab021>.

Maintained by Juho Timonen. Last updated 7 months ago.

bayesian-inference gaussian-processes longitudinal-data stan cpp

25 stars 5.94 score 69 scripts

comeetie

greed:Clustering and Model Selection with the Integrated Classification Likelihood

An ensemble of algorithms that enable the clustering of networks and data matrices (such as counts, categorical or continuous) with different type of generative models. Model selection and clustering is performed in combination by optimizing the Integrated Classification Likelihood (which is equivalent to minimizing the description length). Several models are available such as: Stochastic Block Model, degree corrected Stochastic Block Model, Mixtures of Multinomial, Latent Block Model. The optimization is performed thanks to a combination of greedy local search and a genetic algorithm (see <arXiv:2002:11577> for more details).

Maintained by Etienne Côme. Last updated 2 years ago.

openblas cpp openmp

14 stars 5.94 score 41 scripts

bioc

normr:Normalization and difference calling in ChIP-seq data

Robust normalization and difference calling procedures for ChIP-seq and alike data. Read counts are modeled jointly as a binomial mixture model with a user-specified number of components. A fitted background estimate accounts for the effect of enrichment in certain regions and, therefore, represents an appropriate null hypothesis. This robust background is used to identify significantly enriched or depleted regions.

Maintained by Johannes Helmuth. Last updated 5 months ago.

bayesian differentialpeakcalling classification dataimport chipseq ripseq functionalgenomics genetics multiplecomparison normalization peakdetection preprocessing alignment cpp openmp

11 stars 5.93 score 13 scripts

jeswheel

panelPomp:Inference for Panel Partially Observed Markov Processes

Data analysis based on panel partially-observed Markov process (PanelPOMP) models. To implement such models, simulate them and fit them to panel data, 'panelPomp' extends some of the facilities provided for time series data by the 'pomp' package. Implemented methods include filtering (panel particle filtering) and maximum likelihood estimation (Panel Iterated Filtering) as proposed in Breto, Ionides and King (2020) "Panel Data Analysis via Mechanistic Models" <doi:10.1080/01621459.2019.1604367>.

Maintained by Jesse Wheeler. Last updated 4 months ago.

5.91 score 45 scripts

bozenne

BuyseTest:Generalized Pairwise Comparisons

Implementation of the Generalized Pairwise Comparisons (GPC) as defined in Buyse (2010) <doi:10.1002/sim.3923> for complete observations, and extended in Peron (2018) <doi:10.1177/0962280216658320> to deal with right-censoring. GPC compare two groups of observations (intervention vs. control group) regarding several prioritized endpoints to estimate the probability that a random observation drawn from one group performs better/worse/equivalently than a random observation drawn from the other group. Summary statistics such as the net treatment benefit, win ratio, or win odds are then deduced from these probabilities. Confidence intervals and p-values are obtained based on asymptotic results (Ozenne 2021 <doi:10.1177/09622802211037067>), non-parametric bootstrap, or permutations. The software enables the use of thresholds of minimal importance difference, stratification, non-prioritized endpoints (O Brien test), and can handle right-censoring and competing-risks.

Maintained by Brice Ozenne. Last updated 16 days ago.

generalized-pairwise-comparisons non-parametric statistics cpp

5 stars 5.91 score 90 scripts

bioc

fabia:FABIA: Factor Analysis for Bicluster Acquisition

Biclustering by "Factor Analysis for Bicluster Acquisition" (FABIA). FABIA is a model-based technique for biclustering, that is clustering rows and columns simultaneously. Biclusters are found by factor analysis where both the factors and the loading matrix are sparse. FABIA is a multiplicative model that extracts linear dependencies between samples and feature patterns. It captures realistic non-Gaussian data distributions with heavy tails as observed in gene expression measurements. FABIA utilizes well understood model selection techniques like the EM algorithm and variational approaches and is embedded into a Bayesian framework. FABIA ranks biclusters according to their information content and separates spurious biclusters from true biclusters. The code is written in C.

Maintained by Andreas Mitterecker. Last updated 5 months ago.

statisticalmethod microarray differentialexpression multiplecomparison clustering visualization

5.84 score 32 scripts 6 dependents

tobiaskley

quantspec:Quantile-Based Spectral Analysis of Time Series

Methods to determine, smooth and plot quantile periodograms for univariate and multivariate time series.

Maintained by Tobias Kley. Last updated 9 years ago.

cpp

10 stars 5.84 score 46 scripts 1 dependents

cran

flexclust:Flexible Cluster Algorithms

The main function kcca implements a general framework for k-centroids cluster analysis supporting arbitrary distance measures and centroid computation. Further cluster methods include hard competitive learning, neural gas, and QT clustering. There are numerous visualization methods for cluster results (neighborhood graphs, convex cluster hulls, barcharts of centroids, ...), and bootstrap methods for the analysis of cluster stability.

Maintained by Bettina Grün. Last updated 28 days ago.

3 stars 5.81 score 52 dependents

reginalexavier

OpenLand:Quantitative Analysis and Visualization of LUCC

Tools for the analysis of land use and cover (LUC) time series. It includes support for loading spatiotemporal raster data and synthesized spatial plotting. Several LUC change (LUCC) metrics in regular or irregular time intervals can be extracted and visualized through one- and multistep sankey and chord diagrams. A complete intensity analysis according to Aldwaik and Pontius (2012) <doi:10.1016/j.landurbplan.2012.02.010> is implemented, including tools for the generation of standardized multilevel output graphics.

Maintained by Reginal Exavier. Last updated 11 months ago.

geography geospatial intensity-analysis land-use-and-land-cover-change luc-maps lulc plot rasters

22 stars 5.80 score 19 scripts

merck

gMCPLite:Lightweight Graph Based Multiple Comparison Procedures

A lightweight fork of 'gMCP' with functions for graphical described multiple test procedures introduced in Bretz et al. (2009) <doi:10.1002/sim.3495> and Bretz et al. (2011) <doi:10.1002/bimj.201000239>. Implements a flexible function using 'ggplot2' to create multiplicity graph visualizations. Contains instructions of multiplicity graph and graphical testing for group sequential design, described in Maurer and Bretz (2013) <doi:10.1080/19466315.2013.807748>, with necessary unit testing using 'testthat'.

Maintained by Nan Xiao. Last updated 1 years ago.

11 stars 5.79 score 14 scripts

hojsgaard

gRim:Graphical Interaction Models

Provides the following types of models: Models for contingency tables (i.e. log-linear models) Graphical Gaussian models for multivariate normal data (i.e. covariance selection models) Mixed interaction models. Documentation about 'gRim' is provided by vignettes included in this package and the book by Højsgaard, Edwards and Lauritzen (2012, <doi:10.1007/978-1-4614-2299-0>); see 'citation("gRim")' for details.

Maintained by Søren Højsgaard. Last updated 5 months ago.

openblas cpp

2 stars 5.77 score 74 scripts

stewid

EpiContactTrace:Epidemiological Tool for Contact Tracing

Routines for epidemiological contact tracing and visualisation of network of contacts.

Maintained by Stefan Widgren. Last updated 6 months ago.

cpp

11 stars 5.74 score 42 scripts

bioc

debCAM:Deconvolution by Convex Analysis of Mixtures

An R package for fully unsupervised deconvolution of complex tissues. It provides basic functions to perform unsupervised deconvolution on mixture expression profiles by Convex Analysis of Mixtures (CAM) and some auxiliary functions to help understand the subpopulation-specific results. It also implements functions to perform supervised deconvolution based on prior knowledge of molecular markers, S matrix or A matrix. Combining molecular markers from CAM and from prior knowledge can achieve semi-supervised deconvolution of mixtures.

Maintained by Lulu Chen. Last updated 5 months ago.

software cellbiology geneexpression openjdk

7 stars 5.69 score 14 scripts

bioc

metabCombiner:Method for Combining LC-MS Metabolomics Feature Measurements

This package aligns LC-HRMS metabolomics datasets acquired from biologically similar specimens analyzed under similar, but not necessarily identical, conditions. Peak-picked and simply aligned metabolomics feature tables (consisting of m/z, rt, and per-sample abundance measurements, plus optional identifiers & adduct annotations) are accepted as input. The package outputs a combined table of feature pair alignments, organized into groups of similar m/z, and ranked by a similarity score. Input tables are assumed to be acquired using similar (but not necessarily identical) analytical methods.

Maintained by Hani Habra. Last updated 5 months ago.

software massspectrometry metabolomics mass-spectrometry

10 stars 5.65 score 5 scripts