Showing 200 of 487 results
stan-dev
rstan:R Interface to Stan
User-facing R functions are provided to parse, compile, test, estimate, and analyze Stan models by accessing the header-only Stan library provided by the 'StanHeaders' package. The Stan project develops a probabilistic programming language that implements full Bayesian statistical inference via Markov Chain Monte Carlo, rough Bayesian inference via 'variational' approximation, and (optionally penalized) maximum likelihood estimation via optimization. In all three cases, automatic differentiation is used to quickly and accurately evaluate gradients without burdening the user with the need to derive the partial derivatives.
Maintained by Ben Goodrich. Last updated 4 days ago.
bayesian-data-analysis, bayesian-inference, bayesian-statistics, mcmc, stan, cpp
1.1k stars 18.84 score 14k scripts 281 dependents
edzer
sp:Classes and Methods for Spatial Data
Classes and methods for spatial data; the classes document where the spatial location information resides, for 2D or 3D data. Utility functions are provided, e.g. for plotting data as maps, spatial selection, as well as methods for retrieving coordinates, for subsetting, print, summary, etc. From this version, 'rgdal', 'maptools', and 'rgeos' are no longer used at all, see <https://r-spatial.org/r/2023/05/15/evolution4.html> for details.
Maintained by Edzer Pebesma. Last updated 2 months ago.
127 stars 18.63 score 35k scripts 1.3k dependents
rspatial
terra:Spatial Data Analysis
Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).
Maintained by Robert J. Hijmans. Last updated 2 days ago.
geospatial, raster, spatial, vector, onetbb, proj, gdal, geos, cpp
559 stars 17.64 score 17k scripts 855 dependents
rspatial
raster:Geographic Data Analysis and Modeling
Reading, writing, manipulating, analyzing and modeling of spatial data. This package has been superseded by the "terra" package <https://CRAN.R-project.org/package=terra>.
Maintained by Robert J. Hijmans. Last updated 10 hours ago.
163 stars 17.23 score 58k scripts 562 dependents
dankelley
oce:Analysis of Oceanographic Data
Supports the analysis of Oceanographic data, including 'ADCP' measurements, measurements made with 'argo' floats, 'CTD' measurements, sectional data, sea-level time series, coastline and topographic data, etc. Provides specialized functions for calculating seawater properties such as potential temperature in either the 'UNESCO' or 'TEOS-10' equation of state. Produces graphical displays that conform to the conventions of the Oceanographic literature. This package is discussed extensively by Kelley (2018) "Oceanographic Analysis with R" <doi:10.1007/978-1-4939-8844-0>.
Maintained by Dan Kelley. Last updated 20 hours ago.
146 stars 15.34 score 4.2k scripts 18 dependents
philchalmers
mirt:Multidimensional Item Response Theory
Analysis of discrete response data using unidimensional and multidimensional item analysis models under the Item Response Theory paradigm (Chalmers (2012) <doi:10.18637/jss.v048.i06>). Exploratory and confirmatory item factor analysis models are estimated with quadrature (EM) or stochastic (MHRM) methods. Confirmatory bi-factor and two-tier models are available for modeling item testlets using dimension reduction EM algorithms, while multiple group analyses and mixed effects designs are included for detecting differential item, bundle, and test functioning, and for modeling item and person covariates. Finally, latent class models such as the DINA, DINO, multidimensional latent class, mixture IRT models, and zero-inflated response models are supported, as well as a wide family of probabilistic unfolding models.
Maintained by Phil Chalmers. Last updated 1 day ago.
212 stars 14.93 score 2.5k scripts 40 dependents
r-lidar
lidR:Airborne LiDAR Data Manipulation and Visualization for Forestry Applications
Airborne LiDAR (Light Detection and Ranging) interface for data manipulation and visualization. Read/write 'las' and 'laz' files, computation of metrics in area based approach, point filtering, artificial point reduction, classification from geographic data, normalization, individual tree segmentation and other manipulations.
Maintained by Jean-Romain Roussel. Last updated 2 months ago.
als, forestry, las, laz, lidar, point-cloud, remote-sensing, openblas, cpp, openmp
623 stars 14.47 score 844 scripts 8 dependents
bioc
xcms:LC-MS and GC-MS Data Analysis
Framework for processing and visualization of chromatographically separated and single-spectra mass spectral data. Imports from AIA/ANDI NetCDF, mzXML, mzData and mzML files. Preprocesses data for high-throughput, untargeted analyte profiling.
Maintained by Steffen Neumann. Last updated 14 days ago.
immunooncology, massspectrometry, metabolomics, bioconductor, feature-detection, mass-spectrometry, peak-detection, cpp
196 stars 14.31 score 984 scripts 11 dependents
edzer
hexbin:Hexagonal Binning Routines
Binning and plotting functions for hexagonal bins.
Maintained by Edzer Pebesma. Last updated 5 months ago.
37 stars 14.00 score 2.4k scripts 114 dependents
biomodhub
biomod2:Ensemble Platform for Species Distribution Modeling
Functions for species distribution modeling, calibration and evaluation, ensemble of models, ensemble forecasting and visualization. The package can consistently run up to 10 single models on a presence/absence (or presence/pseudo-absence) dataset and combine them into ensemble models and ensemble projections. A number of other evaluation and visualisation tools are also available within the package.
Maintained by Maya Guéguen. Last updated 2 days ago.
95 stars 13.85 score 536 scripts 7 dependents
knausb
vcfR:Manipulate and Visualize VCF Data
Facilitates easy manipulation of variant call format (VCF) data. Functions are provided to rapidly read from and write to VCF files. Once VCF data is read into R a parser function extracts matrices of data. This information can then be used for quality control or other purposes. Additional functions provide visualization of genomic data. Once processing is complete data may be written to a VCF file (*.vcf.gz). It also may be converted into other popular R objects (e.g., genlight, DNAbin). VcfR provides a link between VCF data and familiar R software.
Maintained by Brian J. Knaus. Last updated 1 month ago.
genomics, population-genetics, population-genomics, rcpp, vcf-data, visualization, zlib, cpp
256 stars 13.66 score 3.1k scripts 19 dependents
r-forge
robustbase:Basic Robust Statistics
"Essential" Robust Statistics. Tools allowing to analyze data with robust methods. This includes regression methodology including model selections and multivariate statistics where we strive to cover the book "Robust Statistics, Theory and Methods" by 'Maronna, Martin and Yohai'; Wiley 2006.
Maintained by Martin Maechler. Last updated 4 months ago.
13.38 score 1.7k scripts 480 dependents
bbolker
bbmle:Tools for General Maximum Likelihood Estimation
Methods and functions for fitting maximum likelihood models in R. This package modifies and extends the 'mle' classes in the 'stats4' package.
Maintained by Ben Bolker. Last updated 1 month ago.
25 stars 13.36 score 1.4k scripts 117 dependents
edzer
spacetime:Classes and Methods for Spatio-Temporal Data
Classes and methods for spatio-temporal data, including space-time regular lattices, sparse lattices, irregular data, and trajectories; utility functions for plotting data as map sequences (lattice or animation) or multiple time series; methods for spatial and temporal selection and subsetting, as well as for spatial/temporal/spatio-temporal matching or aggregation, retrieving coordinates, print, summary, etc.
Maintained by Edzer Pebesma. Last updated 2 months ago.
74 stars 13.29 score 628 scripts 72 dependents
biodiverse
unmarked:Models for Data from Unmarked Animals
Fits hierarchical models of animal abundance and occurrence to data collected using survey methods such as point counts, site occupancy sampling, distance sampling, removal sampling, and double observer sampling. Parameters governing the state and observation processes can be modeled as functions of covariates. References: Kellner et al. (2023) <doi:10.1111/2041-210X.14123>, Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.
Maintained by Ken Kellner. Last updated 9 days ago.
4 stars 13.02 score 652 scripts 12 dependents
csgillespie
poweRlaw:Analysis of Heavy Tailed Distributions
An implementation of maximum likelihood estimators for a variety of heavy tailed distributions, including both the discrete and continuous power law distributions. Additionally, a goodness-of-fit based approach is used to estimate the lower cut-off for the scaling region.
Maintained by Colin Gillespie. Last updated 2 months ago.
112 stars 12.79 score 332 scripts 32 dependents
spedygiorgio
markovchain:Easy Handling Discrete Time Markov Chains
Functions and S4 methods to create and manage discrete time Markov chains more easily. In addition, functions to perform statistical (fitting and drawing random variates) and probabilistic (analysis of their structural properties) analysis are provided. See Spedicato (2017) <doi:10.32614/RJ-2017-036>. Some functions for continuous time Markov chains depend on the suggested ctmcd package.
Maintained by Giorgio Alfredo Spedicato. Last updated 5 months ago.
ctmc, dtmc, markov-chain, markov-model, r-programming, rcpp, openblas, cpp
104 stars 12.78 score 712 scripts 4 dependents
bioc
MSnbase:Base Functions and Classes for Mass Spectrometry and Proteomics
MSnbase provides infrastructure for manipulation, processing and visualisation of mass spectrometry and proteomics data, ranging from raw to quantitative and annotated data.
Maintained by Laurent Gatto. Last updated 14 days ago.
immunooncology, infrastructure, proteomics, massspectrometry, qualitycontrol, dataimport, bioconductor, bioinformatics, mass-spectrometry, proteomics-data, visualisation, cpp
131 stars 12.76 score 772 scripts 36 dependents
thibautjombart
adegenet:Exploratory Analysis of Genetic and Genomic Data
Toolset for the exploration of genetic and genomic data. Adegenet provides formal (S4) classes for storing and handling various genetic data, including genetic markers with varying ploidy and hierarchical population structure ('genind' class), allele counts by population ('genpop'), and genome-wide SNP data ('genlight'). It also implements original multivariate methods (DAPC, sPCA), graphics, statistical tests, simulation tools, distance and similarity measures, and several spatial methods. A range of both empirical and simulated datasets is also provided to illustrate various methods.
Maintained by Zhian N. Kamvar. Last updated 2 months ago.
182 stars 12.60 score 1.9k scripts 29 dependents
data-cleaning
validate:Data Validation Infrastructure
Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity. Supports checks implied by an SDMX DSD file as well. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, Chapter 6 and the JSS paper (2021) <doi:10.18637/jss.v097.i10>.
Maintained by Mark van der Loo. Last updated 24 days ago.
419 stars 12.39 score 448 scripts 8 dependents
asardaes
dtwclust:Time Series Clustering Along with Optimizations for the Dynamic Time Warping Distance
Time series clustering along with optimized techniques related to the Dynamic Time Warping distance and its corresponding lower bounds. Implementations of partitional, hierarchical, fuzzy, k-Shape and TADPole clustering are available. Functionality can be easily extended with custom distance measures and centroid definitions. Implementations of DTW barycenter averaging, a distance based on global alignment kernels, and the soft-DTW distance and centroid routines are also provided. All included distance functions have custom loops optimized for the calculation of cross-distance matrices, including parallelization support. Several cluster validity indices are included.
Maintained by Alexis Sarda. Last updated 8 months ago.
clustering, dtw, time-series, openblas, cpp
262 stars 12.35 score 406 scripts 14 dependents
alexkz
kernlab:Kernel-Based Machine Learning Lab
Kernel-based machine learning methods for classification, regression, clustering, novelty detection, quantile regression and dimensionality reduction. Among other methods 'kernlab' includes Support Vector Machines, Spectral Clustering, Kernel PCA, Gaussian Processes and a QP solver.
Maintained by Alexandros Karatzoglou. Last updated 8 months ago.
21 stars 12.26 score 7.8k scripts 487 dependents
alexiosg
rugarch:Univariate GARCH Models
ARFIMA, in-mean, external regressors and various GARCH flavors, with methods for fit, forecast, simulation, inference and plotting.
Maintained by Alexios Galanos. Last updated 3 months ago.
26 stars 12.13 score 1.3k scripts 15 dependents
ncss-tech
aqp:Algorithms for Quantitative Pedology
The Algorithms for Quantitative Pedology (AQP) project was started in 2009 to organize a loosely-related set of concepts and source code on the topic of soil profile visualization, aggregation, and classification into this package (aqp). Over the past 8 years, the project has grown into a suite of related R packages that enhance and simplify the quantitative analysis of soil profile data. Central to the AQP project is a new vocabulary of specialized functions and data structures that can accommodate the inherent complexity of soil profile information; freeing the scientist to focus on ideas rather than boilerplate data processing tasks <doi:10.1016/j.cageo.2012.10.020>. These functions and data structures have been extensively tested and documented, applied to projects involving hundreds of thousands of soil profiles, and deeply integrated into widely used tools such as SoilWeb <https://casoilresource.lawr.ucdavis.edu/soilweb-apps>. Components of the AQP project (aqp, soilDB, sharpshootR, soilReports packages) serve an important role in routine data analysis within the USDA-NRCS Soil Science Division. The AQP suite of R packages offer a convenient platform for bridging the gap between pedometric theory and practice.
Maintained by Dylan Beaudette. Last updated 1 month ago.
digital-soil-mapping, ncss-tech, nrcs, pedology, pedometrics, soil, soil-survey, usda
55 stars 11.90 score 1.2k scripts 2 dependents
rspatial
dismo:Species Distribution Modeling
Methods for species distribution modeling, that is, predicting the environmental similarity of any site to that of the locations of known occurrences of a species.
Maintained by Robert J. Hijmans. Last updated 4 months ago.
25 stars 11.88 score 2.8k scripts 21 dependents
bioc
graph:graph: A package to handle graph data structures
A package that implements some simple graph handling capabilities.
Maintained by Bioconductor Package Maintainer. Last updated 9 days ago.
11.86 score 764 scripts 339 dependents
r-forge
copula:Multivariate Dependence with Copulas
Classes (S4) of commonly used elliptical, Archimedean, extreme-value and other copula families, as well as their rotations, mixtures and asymmetrizations. Nested Archimedean copulas, related tools and special functions. Methods for density, distribution, random number generation, bivariate dependence measures, Rosenblatt transform, Kendall distribution function, perspective and contour plots. Fitting of copula models with potentially partly fixed parameters, including standard errors. Serial independence tests, copula specification tests (independence, exchangeability, radial symmetry, extreme-value dependence, goodness-of-fit) and model selection based on cross-validation. Empirical copula, smoothed versions, and non-parametric estimators of the Pickands dependence function.
Maintained by Martin Maechler. Last updated 23 days ago.
11.83 score 1.2k scripts 86 dependents
kingaa
pomp:Statistical Inference for Partially Observed Markov Processes
Tools for data analysis with partially observed Markov process (POMP) models (also known as stochastic dynamical systems, hidden Markov models, and nonlinear, non-Gaussian, state-space models). The package provides facilities for implementing POMP models, simulating them, and fitting them to time series data by a variety of frequentist and Bayesian methods. It is also a versatile platform for implementation of inference methods for general POMP models.
Maintained by Aaron A. King. Last updated 8 days ago.
abc, b-spline, differential-equations, dynamical-systems, iterated-filtering, likelihood, likelihood-free, markov-chain-monte-carlo, markov-model, mathematical-modelling, measurement-error, particle-filter, sequential-monte-carlo, simulation-based-inference, sobol-sequence, state-space, statistical-inference, stochastic-processes, time-series, openblas
114 stars 11.74 score 1.3k scripts 4 dependents
prioritizr
prioritizr:Systematic Conservation Prioritization in R
Systematic conservation prioritization using mixed integer linear programming (MILP). It provides a flexible interface for building and solving conservation planning problems. Once built, conservation planning problems can be solved using a variety of commercial and open-source exact algorithm solvers. By using exact algorithm solvers, solutions can be generated that are guaranteed to be optimal (or within a pre-specified optimality gap). Furthermore, conservation problems can be constructed to optimize the spatial allocation of different management actions or zones, meaning that conservation practitioners can identify solutions that benefit multiple stakeholders. To solve large-scale or complex conservation planning problems, users should install the Gurobi optimization software (available from <https://www.gurobi.com/>) and the 'gurobi' R package (see Gurobi Installation Guide vignette for details). Users can also install the IBM CPLEX software (<https://www.ibm.com/products/ilog-cplex-optimization-studio/cplex-optimizer>) and the 'cplexAPI' R package (available at <https://github.com/cran/cplexAPI>). Additionally, the 'rcbc' R package (available at <https://github.com/dirkschumacher/rcbc>) can be used to generate solutions using the CBC optimization software (<https://github.com/coin-or/Cbc>). For further details, see Hanson et al. (2025) <doi:10.1111/cobi.14376>.
Maintained by Richard Schuster. Last updated 16 hours ago.
biodiversity, conservation, conservation-planner, optimization, prioritization, solver, spatial, cpp
124 stars 11.71 score 584 scripts 2 dependents
luca-scr
GA:Genetic Algorithms
Flexible general-purpose toolbox implementing genetic algorithms (GAs) for stochastic optimisation. Binary, real-valued, and permutation representations are available to optimize a fitness function, i.e. a function provided by users depending on their objective function. Several genetic operators are available and can be combined to explore the best settings for the current task. Furthermore, users can define new genetic operators and easily evaluate their performances. Local search using general-purpose optimisation algorithms can be applied stochastically to exploit interesting regions. GAs can be run sequentially or in parallel, using an explicit master-slave parallelisation or a coarse-grain islands approach. For more details see Scrucca (2013) <doi:10.18637/jss.v053.i04> and Scrucca (2017) <doi:10.32614/RJ-2017-008>.
Maintained by Luca Scrucca. Last updated 7 months ago.
genetic-algorithm, optimisation, cpp
93 stars 11.58 score 624 scripts 52 dependents
bioc
Rgraphviz:Provides plotting capabilities for R graph objects
Interfaces R with the AT&T Graphviz library for plotting R graph objects from the graph package.
Maintained by Kasper Daniel Hansen. Last updated 2 days ago.
graphandnetwork, visualization, zlib
11.51 score 1.2k scripts 107 dependents
bioc
destiny:Creates diffusion maps
Create and plot diffusion maps.
Maintained by Philipp Angerer. Last updated 4 months ago.
cellbiology, cellbasedassays, clustering, software, visualization, diffusion-maps, dimensionality-reduction, cpp
82 stars 11.44 score 792 scripts 1 dependent
bioc
genefilter:genefilter: methods for filtering genes from high-throughput experiments
Some basic functions for filtering genes.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
11.11 score 2.4k scripts 143 dependents
fmichonneau
phylobase:Base Package for Phylogenetic Structures and Comparative Data
Provides a base S4 class for comparative methods, incorporating one or more trees and trait data.
Maintained by Francois Michonneau. Last updated 1 year ago.
18 stars 11.10 score 394 scripts 18 dependents
sgibb
MALDIquant:Quantitative Analysis of Mass Spectrometry Data
A complete analysis pipeline for matrix-assisted laser desorption/ionization-time-of-flight (MALDI-TOF) and other two-dimensional mass spectrometry data. In addition to commonly used plotting and processing methods it includes distinctive features, namely baseline subtraction methods such as morphological filters (TopHat) or the statistics-sensitive non-linear iterative peak-clipping algorithm (SNIP), peak alignment using warping functions, handling of replicated measurements as well as allowing spectra with different resolutions.
Maintained by Sebastian Gibb. Last updated 7 months ago.
maldi, maldi-ims, maldi-tof-ms, mass-spectrometry
62 stars 11.06 score 180 scripts 44 dependents
rkillick
changepoint:Methods for Changepoint Detection
Implements various mainstream and specialised changepoint methods for finding single and multiple changepoints within data. Many popular non-parametric and frequentist methods are included. The cpt.mean(), cpt.var(), cpt.meanvar() functions should be your first port of call.
Maintained by Rebecca Killick. Last updated 4 months ago.
133 stars 11.05 score 736 scripts 40 dependents
metrumresearchgroup
mrgsolve:Simulate from ODE-Based Models
Fast simulation from ordinary differential equation (ODE) based models typically employed in quantitative pharmacology and systems biology.
Maintained by Kyle T Baron. Last updated 9 days ago.
138 stars 10.90 score 1.2k scripts 3 dependents
valentint
rrcov:Scalable Robust Estimators with High Breakdown Point
Robust Location and Scatter Estimation and Robust Multivariate Analysis with High Breakdown Point: principal component analysis (Filzmoser and Todorov (2013), <doi:10.1016/j.ins.2012.10.017>), linear and quadratic discriminant analysis (Todorov and Pires (2007)), multivariate tests (Todorov and Filzmoser (2010) <doi:10.1016/j.csda.2009.08.015>), outlier detection (Todorov et al. (2010) <doi:10.1007/s11634-010-0075-2>). See also Todorov and Filzmoser (2009) <urn:isbn:978-3838108148>, Todorov and Filzmoser (2010) <doi:10.18637/jss.v032.i03> and Boudt et al. (2019) <doi:10.1007/s11222-019-09869-x>.
Maintained by Valentin Todorov. Last updated 7 months ago.
2 stars 10.57 score 484 scripts 96 dependents
bioc
seqLogo:Sequence logos for DNA sequence alignments
seqLogo takes the position weight matrix of a DNA sequence motif and plots the corresponding sequence logo as introduced by Schneider and Stephens (1990).
Maintained by Robert Ivanek. Last updated 5 months ago.
4 stars 10.57 score 304 scripts 29 dependents
bioc
ChemmineR:Cheminformatics Toolkit for R
ChemmineR is a cheminformatics package for analyzing drug-like small molecule data in R. Its latest version contains functions for efficient processing of large numbers of molecules, physicochemical/structural property predictions, structural similarity searching, classification and clustering of compound libraries with a wide spectrum of algorithms. In addition, it offers visualization functions for compound clustering results and chemical structures.
Maintained by Thomas Girke. Last updated 5 months ago.
cheminformatics, biomedicalinformatics, pharmacogenetics, pharmacogenomics, microtitreplateassay, cellbasedassays, visualization, infrastructure, dataimport, clustering, proteomics, metabolomics, cpp
15 stars 10.45 score 253 scripts 12 dependents
mhahsler
recommenderlab:Lab for Developing and Testing Recommender Algorithms
Provides a research infrastructure to develop and evaluate collaborative filtering recommender algorithms. This includes a sparse representation for user-item matrices, many popular algorithms, top-N recommendations, and cross-validation. Hahsler (2022) <doi:10.48550/arXiv.2205.12371>.
Maintained by Michael Hahsler. Last updated 2 days ago.
collaborative-filtering, recommender-system
214 stars 10.42 score 840 scripts 2 dependents
adeverse
adegraphics:An S4 Lattice-Based Package for the Representation of Multivariate Data
Graphical functionalities for the representation of multivariate data. It is a complete re-implementation of the functions available in the 'ade4' package.
Maintained by Aurélie Siberchicot. Last updated 8 months ago.
9 stars 10.37 score 386 scripts 6 dependents
bioc
flowCore:flowCore: Basic structures for flow cytometry data
Provides S4 data structures and basic functions to deal with flow cytometry data.
Maintained by Mike Jiang. Last updated 5 months ago.
immunooncology, infrastructure, flowcytometry, cellbasedassays, cpp
10.34 score 1.7k scripts 59 dependents
ssnn-airr
alakazam:Immunoglobulin Clonal Lineage and Diversity Analysis
Provides methods for high-throughput adaptive immune receptor repertoire sequencing (AIRR-Seq; Rep-Seq) analysis. In particular, immunoglobulin (Ig) sequence lineage reconstruction, lineage topology analysis, diversity profiling, amino acid property analysis and gene usage. Citations: Gupta and Vander Heiden, et al (2017) <doi:10.1093/bioinformatics/btv359>, Stern, Yaari and Vander Heiden, et al (2014) <doi:10.1126/scitranslmed.3008879>.
Maintained by Susanna Marquez. Last updated 3 months ago.
10.33 score 424 scripts 7 dependents
bcgov
ssdtools:Species Sensitivity Distributions
Species sensitivity distributions are cumulative probability distributions which are fitted to toxicity concentrations for different species as described by Posthuma et al. (2001) <isbn:9781566705783>. The ssdtools package uses Maximum Likelihood to fit distributions such as the gamma, log-logistic, log-normal and log-normal log-normal mixture. Multiple distributions can be averaged using Akaike Information Criteria. Confidence intervals on hazard concentrations and proportions are produced by bootstrapping.
Maintained by Joe Thorley. Last updated 1 month ago.
ecotoxicology, env, species-sensitivity-distribution, cpp
33 stars 10.33 score 111 scripts 5 dependents
bioc
Cardinal:A mass spectrometry imaging toolbox for statistical analysis
Implements statistical & computational tools for analyzing mass spectrometry imaging datasets, including methods for efficient pre-processing, spatial segmentation, and classification.
Maintained by Kylie Ariel Bemis. Last updated 3 months ago.
software, infrastructure, proteomics, lipidomics, massspectrometry, imagingmassspectrometry, immunooncology, normalization, clustering, classification, regression
48 stars 10.32 score 200 scripts
bioc
BASiCS:Bayesian Analysis of Single-Cell Sequencing data
Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model to perform statistical analyses of single-cell RNA sequencing datasets in the context of supervised experiments (where the groups of cells of interest are known a priori, e.g. experimental conditions or cell types). BASiCS performs built-in data normalisation (global scaling) and technical noise quantification (based on spike-in genes). BASiCS provides an intuitive detection criterion for highly (or lowly) variable genes within a single group of cells. Additionally, BASiCS can compare gene expression patterns between two or more pre-specified groups of cells. Unlike traditional differential expression tools, BASiCS quantifies changes in expression that lie beyond comparisons of means, also allowing the study of changes in cell-to-cell heterogeneity. The latter can be quantified via a biological over-dispersion parameter that measures the excess of variability that is observed with respect to Poisson sampling noise, after normalisation and technical noise removal. Due to the strong mean/over-dispersion confounding that is typically observed for scRNA-seq datasets, BASiCS also tests for changes in residual over-dispersion, defined by residual values with respect to a global mean/over-dispersion trend.
Maintained by Catalina Vallejos. Last updated 5 months ago.
immunooncology, normalization, sequencing, rnaseq, software, geneexpression, transcriptomics, singlecell, differentialexpression, bayesian, cellbiology, bioconductor-package, gene-expression, rcpp, rcpparmadillo, scrna-seq, single-cell, openblas, cpp, openmp
83 stars 10.26 score 368 scripts 1 dependent
bioc
graphite:GRAPH Interaction from pathway Topological Environment
Graph objects from pathway topology derived from KEGG, Panther, PathBank, PharmGKB, Reactome, SMPDB and WikiPathways databases.
Maintained by Gabriele Sales. Last updated 5 months ago.
pathways, thirdpartyclient, graphandnetwork, network, reactome, kegg, metabolomics, bioinformatics, mirror, pathway-analysis
7 stars 10.17 score 122 scripts 21 dependents
bioc
QDNAseq:Quantitative DNA Sequencing for Chromosomal Aberrations
Quantitative DNA sequencing for chromosomal aberrations. The genome is divided into non-overlapping fixed-sized bins, the number of sequence reads in each is counted, adjusted with a simultaneous two-dimensional loess correction for sequence mappability and GC content, and filtered to remove spurious regions in the genome. Downstream steps of segmentation and calling are also implemented via packages DNAcopy and CGHcall, respectively.
Maintained by Daoud Sie. Last updated 5 months ago.
copynumbervariation, dnaseq, genetics, genomeannotation, preprocessing, qualitycontrol, sequencing
49 stars 10.10 score 177 scripts 4 dependents
stewid
SimInf:A Framework for Data-Driven Stochastic Disease Spread Simulations
Provides an efficient and very flexible framework to conduct data-driven epidemiological modeling in realistic large scale disease spread simulations. The framework integrates infection dynamics in subpopulations as continuous-time Markov chains using the Gillespie stochastic simulation algorithm and incorporates available data such as births, deaths and movements as scheduled events at predefined time-points. Using C code for the numerical solvers and 'OpenMP' (if available) to divide work over multiple processors ensures high performance when simulating a sample outcome. One of our design goals was to make the package extendable and enable usage of the numerical solvers from other R extension packages in order to facilitate complex epidemiological research. The package contains template models and can be extended with user-defined models. For more details see the paper by Widgren, Bauer, Eriksson and Engblom (2019) <doi:10.18637/jss.v091.i12>. The package also provides functionality to fit models to time series data using the Approximate Bayesian Computation Sequential Monte Carlo ('ABC-SMC') algorithm of Toni and others (2009) <doi:10.1098/rsif.2008.0172>.
Maintained by Stefan Widgren. Last updated 16 days ago.
data-drivenepidemiologyhigh-performance-computingmarkov-chainmathematical-modellinggslopenmp
35 stars 10.09 score 227 scripts
hypertidy
fasterize:Fast Polygon to Raster Conversion
Provides a drop-in replacement for rasterize() from the 'raster' package that takes polygon vector or data frame objects, and is much faster. There is support for the main options provided by the rasterize() function, including setting the field used and background value, and options for aggregating multi-layer rasters. Uses the scan line algorithm attributed to Wylie et al. (1967) <doi:10.1145/1465611.1465619>. Note that the repository was originally hosted on 'GitHub' at 'ecohealthalliance/fasterize' but was migrated to 'hypertidy/fasterize' in March 2025, and can be found indexed on 'R universe' <https://cran.r-universe.dev/fasterize>.
Maintained by Michael Sumner. Last updated 20 days ago.
rasterrcpprcpparmadillosfspatialcpp
182 stars 10.05 score 14 dependents
mages
ChainLadder:Statistical Methods and Models for Claims Reserving in General Insurance
Various statistical methods and models which are typically used for the estimation of outstanding claims reserves in general insurance, including those to estimate the claims development result as required under Solvency II.
Maintained by Markus Gesmann. Last updated 2 months ago.
82 stars 10.04 score 196 scripts 2 dependents
robinhankin
Brobdingnag:Very Large Numbers in R
Very large numbers in R. Real numbers are held using their natural logarithms, plus a logical flag indicating sign. Functionality for complex numbers is also provided. The package includes a vignette that gives a step-by-step introduction to using S4 methods.
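The representation described above (a natural logarithm plus a sign flag) can be sketched as follows. This is a Python illustration of the idea only, not the package's S4 classes; the class name `LogNumber` and its methods are hypothetical, and zero and complex numbers are not handled.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LogNumber:
    """A nonzero real number stored as (log|x|, sign flag)."""

    log_abs: float  # natural logarithm of the absolute value
    positive: bool  # True for x > 0, False for x < 0

    @classmethod
    def from_float(cls, x: float) -> "LogNumber":
        if x == 0:
            raise ValueError("zero is not handled in this sketch")
        return cls(math.log(abs(x)), x > 0)

    def __mul__(self, other: "LogNumber") -> "LogNumber":
        # Multiplying magnitudes adds logarithms; equal signs give a positive result.
        return LogNumber(self.log_abs + other.log_abs, self.positive == other.positive)

    def __pow__(self, n: int) -> "LogNumber":
        # Integer powers scale the logarithm; a negative base stays negative for odd n.
        return LogNumber(self.log_abs * n, self.positive or n % 2 == 0)

    def log10(self) -> float:
        return self.log_abs / math.log(10)


googol = LogNumber.from_float(10.0) ** 100
huge = googol ** 10  # 10^1000, well beyond the range of a double
print(huge.log10(), huge.positive)
```

Because only the logarithm is stored, products and powers never overflow, at the cost of losing an exact representation of the value itself.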
Maintained by Robin K. S. Hankin. Last updated 7 months ago.
5 stars 9.92 score 77 scripts 70 dependents
environmentalinformatics-marburg
satellite:Handling and Manipulating Remote Sensing Data
Herein, we provide a broad variety of functions for handling, manipulating, and visualizing satellite-based remote sensing data. These operations range from mere data import and layer handling (e.g. subsetting), through data wrangling typical of Raster* objects (e.g. crop, extend), to more sophisticated (pre-)processing tasks typically applied to satellite imagery (e.g. atmospheric and topographic correction). This functionality is complemented by full access to the satellite layers' metadata at any stage and the documentation of performed actions in a separate log file. Currently available sensors include Landsat 4-5 (TM), 7 (ETM+), and 8 (OLI/TIRS Combined), and additional compatibility is ensured for the Landsat Global Land Survey data set.
Maintained by Florian Detsch. Last updated 1 year ago.
22 stars 9.88 score 61 scripts 27 dependents
ubod
apcluster:Affinity Propagation Clustering
Implements Affinity Propagation clustering introduced by Frey and Dueck (2007) <DOI:10.1126/science.1136800>. The algorithms are largely analogous to the 'Matlab' code published by Frey and Dueck. The package further provides leveraged affinity propagation and an algorithm for exemplar-based agglomerative clustering that can also be used to join clusters obtained from affinity propagation. Various plotting functions are available for analyzing clustering results.
Maintained by Ulrich Bodenhofer. Last updated 11 months ago.
10 stars 9.81 score 270 scripts 25 dependents
bioc
matter:Out-of-core statistical computing and signal processing
Toolbox for larger-than-memory scientific computing and visualization, providing efficient out-of-core data structures using files or shared memory, for dense and sparse vectors, matrices, and arrays, with applications to nonuniformly sampled signals and images.
Maintained by Kylie A. Bemis. Last updated 4 months ago.
infrastructuredatarepresentationdataimportdimensionreductionpreprocessingcpp
57 stars 9.52 score 64 scripts 2 dependents
edzer
intervals:Tools for Working with Points and Intervals
Tools for working with and comparing sets of points and intervals.
Maintained by Edzer Pebesma. Last updated 7 months ago.
11 stars 9.50 score 122 scripts 98 dependents
bioc
snpStats:SnpMatrix and XSnpMatrix classes and methods
Classes and statistical methods for large SNP association studies. This extends the earlier snpMatrix package, allowing for uncertainty in genotypes.
Maintained by David Clayton. Last updated 5 months ago.
microarraysnpgeneticvariabilityzlib
9.48 score 674 scripts 20 dependents
sizespectrum
mizer:Dynamic Multi-Species Size Spectrum Modelling
A set of classes and methods to set up and run multi-species, trait-based and community size spectrum ecological models, focused on the marine environment.
Maintained by Gustav Delius. Last updated 2 months ago.
ecosystem-modelfish-population-dynamicsfisheriesfisheries-managementmarine-ecosystempopulation-dynamicssimulationsize-structurespecies-interactionstransport-equationcpp
39 stars 9.41 score 207 scripts
eagerai
fastai:Interface to 'fastai'
The 'fastai' <https://docs.fast.ai/index.html> library simplifies training fast and accurate neural networks using modern best practices. It is based on research into deep learning best practices undertaken at 'fast.ai', including 'out of the box' support for vision, text, tabular, audio, time series, and collaborative filtering models.
Maintained by Turgut Abdullayev. Last updated 12 months ago.
audiocollaborative-filteringdarknetdarknet-image-classificationfastaimedicalobject-detectiontabulartextvision
118 stars 9.40 score 76 scripts
bioc
ggmsa:Plot Multiple Sequence Alignment using 'ggplot2'
A visual exploration tool for multiple sequence alignment and associated data. Supports MSA of DNA, RNA, and protein sequences using 'ggplot2'. Multiple sequence alignments can easily be combined with other 'ggplot2' plots, such as phylogenetic trees visualized by 'ggtree', boxplots, genome maps and so on. More features: visualization of sequence logos, sequence bundles, RNA secondary structures and detection of sequence recombinations.
Maintained by Guangchuang Yu. Last updated 3 months ago.
softwarevisualizationalignmentannotationmultiplesequencealignment
210 stars 9.35 score 196 scripts 2 dependents
bioc
multtest:Resampling-based multiple hypothesis testing
Non-parametric bootstrap and permutation resampling-based multiple testing procedures (including empirical Bayes methods) for controlling the family-wise error rate (FWER), generalized family-wise error rate (gFWER), tail probability of the proportion of false positives (TPPFP), and false discovery rate (FDR). Several choices of bootstrap-based null distribution are implemented (centered, centered and scaled, quantile-transformed). Single-step and step-wise methods are available. Tests based on a variety of t- and F-statistics (including t-statistics based on regression parameters from linear and survival models as well as those based on correlation parameters) are included. When probing hypotheses with t-statistics, users may also select a potentially faster null distribution which is multivariate normal with mean zero and variance covariance matrix derived from the vector influence function. Results are reported in terms of adjusted p-values, confidence regions and test statistic cutoffs. The procedures are directly applicable to identifying differentially expressed genes in DNA microarray experiments.
Maintained by Katherine S. Pollard. Last updated 5 months ago.
microarraydifferentialexpressionmultiplecomparison
9.34 score 932 scripts 136 dependents
hojsgaard
gRbase:A Package for Graphical Modelling in R
The 'gRbase' package provides graphical modelling features used by e.g. the packages 'gRain', 'gRim' and 'gRc'. 'gRbase' implements graph algorithms including (i) maximum cardinality search (for marked and unmarked graphs), (ii) moralization, (iii) triangulation, and (iv) creation of junction trees. 'gRbase' facilitates array operations and implements functions for testing for conditional independence. 'gRbase' illustrates how hierarchical log-linear models may be implemented and describes the concept of graphical meta data. The facilities of the package are documented in the book by Højsgaard, Edwards and Lauritzen (2012, <doi:10.1007/978-1-4614-2299-0>) and in the paper by Dethlefsen and Højsgaard (2005, <doi:10.18637/jss.v014.i17>). Please see 'citation("gRbase")' for citation details.
Maintained by Søren Højsgaard. Last updated 5 months ago.
3 stars 9.24 score 241 scripts 20 dependents
bpfaff
urca:Unit Root and Cointegration Tests for Time Series Data
Unit root and cointegration tests encountered in applied econometric analysis are implemented.
Maintained by Bernhard Pfaff. Last updated 10 months ago.
6 stars 8.95 score 1.4k scripts 270 dependents
kollerma
robustlmm:Robust Linear Mixed Effects Models
Implements the Robust Scoring Equations estimator to fit linear mixed effects models robustly. Robustness is achieved by modification of the scoring equations combined with the Design Adaptive Scale approach.
Maintained by Manuel Koller. Last updated 1 year ago.
28 stars 8.79 score 138 scripts
flr
FLCore:Core Package of FLR, Fisheries Modelling in R
Core classes and methods for FLR, a framework for fisheries modelling and management strategy simulation in R. Developed by a team of fisheries scientists in various countries. More information can be found at <http://flr-project.org/>.
Maintained by Iago Mosqueira. Last updated 8 days ago.
fisheriesflrfisheries-modelling
16 stars 8.78 score 956 scripts 23 dependents
r-forge
distr:Object Oriented Implementation of Distributions
S4-classes and methods for distributions.
Maintained by Peter Ruckdeschel. Last updated 2 months ago.
8.77 score 327 scripts 32 dependents
bart1
move:Visualizing and Analyzing Animal Track Data
Contains functions to access movement data stored in 'movebank.org' as well as tools to visualize and statistically analyze animal movement data, including functions to calculate dynamic Brownian Bridge Movement Models. Move helps address movement ecology questions.
Maintained by Bart Kranstauber. Last updated 4 months ago.
8.76 score 690 scripts 3 dependents
mikejohnson51
climateR:climateR
Find, subset, and retrieve geospatial data by AOI.
Maintained by Mike Johnson. Last updated 4 months ago.
aoiclimatedatasetgeospatialgridded-climate-dataweather
187 stars 8.74 score 156 scripts 1 dependent
pilaboratory
sads:Maximum Likelihood Models for Species Abundance Distributions
Maximum likelihood tools to fit and compare models of species abundance distributions and of species rank-abundance distributions.
Maintained by Paulo I. Prado. Last updated 1 year ago.
23 stars 8.66 score 244 scripts 3 dependents
bioc
pRoloc:A unifying bioinformatics framework for spatial proteomics
The pRoloc package implements machine learning and visualisation methods for the analysis and interrogation of quantitative mass spectrometry data to reliably infer protein sub-cellular localisation.
Maintained by Lisa Breckels. Last updated 1 month ago.
immunooncologyproteomicsmassspectrometryclassificationclusteringqualitycontrolbioconductorproteomics-dataspatial-proteomicsvisualisationopenblascpp
15 stars 8.64 score 101 scripts 2 dependents
actuaryzhang
cplm:Compound Poisson Linear Models
Likelihood-based and Bayesian methods for various compound Poisson linear models based on Zhang, Yanwei (2013) <doi:10.1007/s11222-012-9343-7>.
Maintained by Yanwei (Wayne) Zhang. Last updated 1 year ago.
16 stars 8.55 score 75 scripts 10 dependents
r-forge
ClassDiscovery:Classes and Methods for "Class Discovery" with Microarrays or Proteomics
Defines the classes used for "class discovery" problems in the OOMPA project (<http://oompa.r-forge.r-project.org/>). Class discovery primarily consists of unsupervised clustering methods with attempts to assess their statistical significance.
Maintained by Kevin R. Coombes. Last updated 2 months ago.
8.53 score 85 scripts 9 dependents
ropensci
weatherOz:An API Client for Australian Weather and Climate Data Resources
Provides automated downloading, parsing and formatting of weather data for Australia through API endpoints provided by the Department of Primary Industries and Regional Development ('DPIRD') of Western Australia and by the Science and Technology Division of the Queensland Government's Department of Environment and Science ('DES'). It also covers the précis and coastal forecasts of the Australian government's Bureau of Meteorology ('BOM'), and the downloading and importing of radar and satellite imagery files. 'DPIRD' weather data are accessed through public 'APIs' provided by 'DPIRD', <https://www.agric.wa.gov.au/weather-api-20>, providing access to weather station data from the 'DPIRD' weather station network. Australia-wide weather data are based on data from the Australian Bureau of Meteorology ('BOM') and accessed through 'SILO' (Scientific Information for Land Owners), Jeffrey et al. (2001) <doi:10.1016/S1364-8152(01)00008-1>. 'DPIRD' data are made available under a Creative Commons Attribution 3.0 Licence (CC BY 3.0 AU) <https://creativecommons.org/licenses/by/3.0/au/deed.en>. SILO data are released under a Creative Commons Attribution 4.0 International licence (CC BY 4.0) <https://creativecommons.org/licenses/by/4.0/>. 'BOM' data are (c) Australian Government Bureau of Meteorology and released under a Creative Commons (CC) Attribution 3.0 licence or Public Access Licence ('PAL') as appropriate, see <http://www.bom.gov.au/other/copyright.shtml> for further details.
Maintained by Rodrigo Pires. Last updated 1 month ago.
dpirdbommeteorological-dataweather-forecastaustraliaweatherweather-datameteorologywestern-australiaaustralia-bureau-of-meteorologywestern-australia-agricultureaustralia-agricultureaustralia-climateaustralia-weatherapi-clientclimatedatarainfallweather-api
31 stars 8.47 score 40 scripts
bgoodri
mi:Missing Data Imputation and Model Checking
The mi package provides functions for data manipulation, imputing missing values in an approximate Bayesian framework, diagnostics of the models used to generate the imputations, confidence-building mechanisms to validate some of the assumptions of the imputation algorithm, and functions to analyze multiply imputed data sets with the appropriate degree of sampling uncertainty.
Maintained by Ben Goodrich. Last updated 3 years ago.
2 stars 8.25 score 244 scripts 47 dependents
tomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 year ago.
3 stars 8.20 score 7.8k scripts 11 dependents
cran
flexmix:Flexible Mixture Modeling
A general framework for finite mixtures of regression models using the EM algorithm is implemented. The E-step and all data handling are provided, while the M-step can be supplied by the user to easily define new models. Existing drivers implement mixtures of standard linear models, generalized linear models and model-based clustering.
Maintained by Bettina Gruen. Last updated 28 days ago.
5 stars 8.19 score 113 dependents
r-hyperspec
hyperSpec:Work with Hyperspectral Data, i.e. Spectra + Meta Information (Spatial, Time, Concentration, ...)
Comfortable ways to work with hyperspectral data sets, i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each of the spectra. The spectra can be data as obtained in XRF, UV/VIS, Fluorescence, AES, NIR, IR, Raman, NMR, MS, etc. More generally, any data that is recorded over a discretized variable, e.g. absorbance = f(wavelength), stored as a vector of absorbance values for discrete wavelengths is suitable.
Maintained by Claudia Beleites. Last updated 10 months ago.
data-wranglinghyperspectralimaginginfrarednmrramanspectroscopyuv-visxrf
16 stars 8.10 score 233 scripts 2 dependents
bioc
openCyto:Hierarchical Gating Pipeline for flow cytometry data
This package is designed to facilitate the use of automated gating methods in a sequential way to mimic the manual gating strategy.
Maintained by Mike Jiang. Last updated 2 days ago.
immunooncologyflowcytometrydataimportpreprocessingdatarepresentationcpp
8.02 score 404 scripts 1 dependent
bioc
motifStack:Plot stacked logos for single or multiple DNA, RNA and amino acid sequence
The motifStack package is designed for graphic representation of multiple motifs with different similarity scores. It works with both DNA/RNA sequence motif and amino acid sequence motif. In addition, it provides the flexibility for users to customize the graphic parameters such as the font type and symbol colors.
Maintained by Jianhong Ou. Last updated 3 months ago.
sequencematchingvisualizationsequencingmicroarrayalignmentchipchipchipseqmotifannotationdataimport
7.93 score 188 scripts 6 dependents
r-forge
tuneR:Analysis of Music and Speech
Analyze music and speech, extract features like MFCCs, handle wave files and their representation in various ways, read mp3, read midi, perform steps of a transcription, ... Also contains functions ported from the 'rastamat' 'Matlab' package.
Maintained by Uwe Ligges. Last updated 12 months ago.
7.93 score 1.1k scripts 44 dependents
cran
timeSeries:Financial Time Series Objects (Rmetrics)
'S4' classes and various tools for financial time series: Basic functions such as scaling and sorting, subsetting, mathematical operations and statistical functions.
Maintained by Georgi N. Boshnakov. Last updated 6 months ago.
2 stars 7.89 score 146 dependents
bioc
flowWorkspace:Infrastructure for representing and interacting with gated and ungated cytometry data sets.
This package is designed to facilitate comparison of automated gating methods against manual gating done in flowJo. This package allows you to import basic flowJo workspaces into BioConductor and replicate the gating from flowJo using the flowCore functionality. Gating hierarchies, groups of samples, compensation, and transformation are performed so that the output matches the flowJo analysis.
Maintained by Greg Finak. Last updated 22 days ago.
immunooncologyflowcytometrydataimportpreprocessingdatarepresentationzlibopenblascpp
7.89 score 576 scripts 10 dependents
genentech
psborrow2:Bayesian Dynamic Borrowing Analysis and Simulation
Bayesian dynamic borrowing is an approach to incorporating external data to supplement a randomized, controlled trial analysis in which external data are incorporated in a dynamic way (e.g., based on similarity of outcomes); see Viele 2013 <doi:10.1002/pst.1589> for an overview. This package implements the hierarchical commensurate prior approach to dynamic borrowing as described in Hobbs 2011 <doi:10.1111/j.1541-0420.2011.01564.x>. There are three main functionalities. First, 'psborrow2' provides a user-friendly interface for applying dynamic borrowing to the study results and handles the Markov Chain Monte Carlo sampling on behalf of the user. Second, 'psborrow2' provides a simulation framework to compare different borrowing parameters (e.g. full borrowing, no borrowing, dynamic borrowing) and other trial and borrowing characteristics (e.g. sample size, covariates) in a unified way. Third, 'psborrow2' provides a set of functions to generate data for simulation studies, and also allows the user to specify their own data generation process. This package is designed to use the sampling functions from 'cmdstanr', which can be installed from <https://stan-dev.r-universe.dev>.
Maintained by Matt Secrest. Last updated 1 month ago.
bayesian-dynamic-borrowingpsborrow2simulation-study
18 stars 7.87 score 16 scripts
bioc
siggenes:Multiple Testing using SAM and Efron's Empirical Bayes Approaches
Identification of differentially expressed genes and estimation of the False Discovery Rate (FDR) using both the Significance Analysis of Microarrays (SAM) and the Empirical Bayes Analyses of Microarrays (EBAM).
Maintained by Holger Schwender. Last updated 5 months ago.
multiplecomparisonmicroarraygeneexpressionsnpexonarraydifferentialexpression
7.87 score 74 scripts 34 dependents
ericmarcon
entropart:Entropy Partitioning to Measure Diversity
Measurement and partitioning of diversity, based on Tsallis entropy, following Marcon and Herault (2015) <doi:10.18637/jss.v067.i08>. 'entropart' provides functions to calculate alpha, beta and gamma diversity of communities, including phylogenetic and functional diversity. Estimation-bias corrections are available.
Maintained by Eric Marcon. Last updated 2 months ago.
biodiversitydiversityentropy-partitioningestimatormeasurespecies
9 stars 7.81 score 115 scripts 1 dependent
openpharma
crmPack:Object-Oriented Implementation of CRM Designs
Implements a wide range of model-based dose escalation designs, ranging from classical and modern continual reassessment methods (CRMs) based on dose-limiting toxicity endpoints to dual-endpoint designs taking into account a biomarker/efficacy outcome. The focus is on Bayesian inference, making it very easy to set up a new design with its own JAGS code. However, it is also possible to implement 3+3 designs for comparison or models with non-Bayesian estimation. The whole package is written in a modular form in the S4 class system, making it very flexible for adaptation to new models, escalation or stopping rules. Further details are presented in Sabanes Bove et al. (2019) <doi:10.18637/jss.v089.i10>.
Maintained by Daniel Sabanes Bove. Last updated 2 months ago.
21 stars 7.76 score 208 scripts
bioc
KEGGgraph:KEGGgraph: A graph approach to KEGG PATHWAY in R and Bioconductor
KEGGgraph is an interface between KEGG pathways and graph objects, as well as a collection of tools to analyze, dissect and visualize these graphs. It parses the regularly updated KGML (KEGG XML) files into graph models maintaining all essential pathway attributes. The package offers functionality including parsing, graph operations, and visualization.
Maintained by Jitao David Zhang. Last updated 5 months ago.
pathwaysgraphandnetworkvisualizationkegg
7.76 score 114 scripts 23 dependents
trackage
trip:Tracking Data
Access and manipulate spatial tracking data, with straightforward coercion from and to other formats. Filter for speed and create time spent maps from tracking data. There are coercion methods to convert between 'trip' and 'ltraj' from 'adehabitatLT', and between 'trip' and 'psp' and 'ppp' from 'spatstat'. Trip objects can be created from raw or grouped data frames, and from types in the 'sp', 'sf', 'amt', 'trackeR', 'mousetrap', and other packages; see Sumner, MD (2011) <https://figshare.utas.edu.au/articles/thesis/The_tag_location_problem/23209538>.
Maintained by Michael D. Sumner. Last updated 9 months ago.
13 stars 7.72 score 137 scripts 1 dependent
blue-matter
MSEtool:Management Strategy Evaluation Toolkit
Development, simulation testing, and implementation of management procedures for fisheries (see Carruthers & Hordyk (2018) <doi:10.1111/2041-210X.13081>).
Maintained by Adrian Hordyk. Last updated 3 days ago.
8 stars 7.71 score 163 scripts 3 dependents
adamlilith
fasterRaster:Faster Raster and Spatial Vector Processing Using 'GRASS GIS'
Processing of large-in-memory/large-on-disk rasters and spatial vectors using 'GRASS GIS' <https://grass.osgeo.org/>. Most functions in the 'terra' package are recreated. Processing of medium-sized and smaller spatial objects will nearly always be faster using 'terra' or 'sf', but for large-in-memory/large-on-disk objects, 'fasterRaster' may be faster. To use most of the functions, you must have the stand-alone version (not the 'OSGeoW4' installer version) of 'GRASS GIS' 8.0 or higher.
Maintained by Adam B. Smith. Last updated 2 days ago.
aspectdistancefragmentationfragmentation-indicesgisgrassgrass-gisrasterraster-projectionrasterizeslopetopographyvectorization
57 stars 7.68 score 8 scripts
ltorgo
DMwR2:Functions and Data for the Second Edition of "Data Mining with R"
Functions and data accompanying the second edition of the book "Data Mining with R, learning with case studies" by Luis Torgo, published by CRC Press.
Maintained by Luis Torgo. Last updated 8 years ago.
27 stars 7.64 score 380 scripts 2 dependents
flr
ggplotFL:Using ggplot2 in FLR
Using ggplot2 for FLR. Provides (1) overloaded ggplot methods for various FLR classes, (2) ggplot-based versions of standard plots in the FLCore package, and (3) new geoms for using FLR objects.
Maintained by Iago Mosqueira. Last updated 2 months ago.
visualizationggplot2fisheriesflr
4 stars 7.60 score 458 scripts 12 dependents
bioc
ropls:PCA, PLS(-DA) and OPLS(-DA) for multivariate analysis and feature selection of omics data
Latent variable modeling with Principal Component Analysis (PCA) and Partial Least Squares (PLS) is a powerful approach to visualization, regression, classification, and feature selection of omics data where the number of variables exceeds the number of samples and there is multicollinearity among variables. Orthogonal Partial Least Squares (OPLS) makes it possible to model separately the variation correlated with the factor of interest (predictive) and the uncorrelated (orthogonal) variation. While performing similarly to PLS, OPLS facilitates interpretation. Successful applications of these chemometrics techniques include spectroscopic data such as Raman spectroscopy, nuclear magnetic resonance (NMR), mass spectrometry (MS) in metabolomics and proteomics, but also transcriptomics data. In addition to scores, loadings and weights plots, the package provides metrics and graphics to determine the optimal number of components (e.g. with the R2 and Q2 coefficients), check the validity of the model by permutation testing, detect outliers, and perform feature selection (e.g. with Variable Importance in Projection or regression coefficients). The package can be accessed via a user interface on the Workflow4Metabolomics.org online resource for computational metabolomics (built upon the Galaxy environment).
Maintained by Etienne A. Thevenot. Last updated 5 months ago.
regressionclassificationprincipalcomponenttranscriptomicsproteomicsmetabolomicslipidomicsmassspectrometryimmunooncology
7.55 score 210 scripts 8 dependents
rspatial
predicts:Spatial Prediction Tools
Methods for spatial predictive modeling, especially for spatial distribution models. This includes algorithms for model fitting and prediction, as well as methods for model evaluation.
Maintained by Robert J. Hijmans. Last updated 2 months ago.
10 stars 7.55 score 108 scripts 8 dependents
wenjie2wang
reda:Recurrent Event Data Analysis
Contains implementations of recurrent event data analysis routines including (1) survival and recurrent event data simulation from a stochastic process point of view by the thinning method proposed by Lewis and Shedler (1979) <doi:10.1002/nav.3800260304> and the inversion method introduced in Cinlar (1975, ISBN:978-0486497976), (2) the mean cumulative function (MCF) estimation by the Nelson-Aalen estimator of the cumulative hazard rate function, (3) two-sample recurrent event responses comparison with the pseudo-score tests proposed by Lawless and Nadeau (1995) <doi:10.2307/1269617>, (4) gamma frailty model with spline rate function following Fu, et al. (2016) <doi:10.1080/10543406.2014.992524>.
Maintained by Wenjie Wang. Last updated 1 year ago.
mcfmean-cumulative-functionrecurrent-eventsurvival-analysiscpp
15 stars 7.52 score 55 scripts 3 dependents
tpetzoldt
growthrates:Estimate Growth Rates from Experimental Data
A collection of methods to determine growth rates from experimental data, in particular from batch experiments and plate reader trials.
Maintained by Thomas Petzoldt. Last updated 2 years ago.
27 stars 7.52 score 102 scripts
ropensci
rsat:Dealing with Multiplatform Satellite Images
Downloading, customizing, and processing time series of satellite images for a region of interest. 'rsat' functions allow unified access to multispectral images from Landsat, MODIS and Sentinel repositories. 'rsat' also offers capabilities for customizing satellite images, such as tile mosaicking, image cropping and computation of new variables. Finally, 'rsat' covers the processing, including cloud masking, compositing and gap-filling/smoothing time series of images (Militino et al., 2018 <doi:10.3390/rs10030398> and Militino et al., 2019 <doi:10.1109/TGRS.2019.2904193>).
Maintained by Unai Pérez - Goya. Last updated 11 months ago.
54 stars 7.45 score 52 scripts
bioc
flowViz:Visualization for flow cytometry
Provides visualization tools for flow cytometry data.
Maintained by Mike Jiang. Last updated 5 months ago.
immunooncologyinfrastructureflowcytometrycellbasedassaysvisualization
7.44 score 231 scripts 12 dependents
cran
sn:The Skew-Normal and Related Distributions Such as the Skew-t and the SUN
Build and manipulate probability distributions of the skew-normal family and some related ones, notably the skew-t and the SUN families. For the skew-normal and the skew-t distributions, statistical methods are provided for data fitting and model diagnostics, in the univariate and the multivariate case.
Maintained by Adelchi Azzalini. Last updated 2 years ago.
3 stars 7.44 score 92 dependents
ssnn-airr
shazam:Immunoglobulin Somatic Hypermutation Analysis
Provides a computational framework for analyzing mutations in immunoglobulin (Ig) sequences. Includes methods for Bayesian estimation of antigen-driven selection pressure, mutational load quantification, building of somatic hypermutation (SHM) models, and model-dependent distance calculations. Also includes empirically derived models of SHM for both mice and humans. Citations: Gupta and Vander Heiden, et al (2015) <doi:10.1093/bioinformatics/btv359>, Yaari, et al (2012) <doi:10.1093/nar/gks457>, Yaari, et al (2013) <doi:10.3389/fimmu.2013.00358>, Cui, et al (2016) <doi:10.4049/jimmunol.1502263>.
Maintained by Susanna Marquez. Last updated 3 months ago.
7.43 score 222 scripts 2 dependents
spatpomp-org
spatPomp:Inference for Spatiotemporal Partially Observed Markov Processes
Inference on panel data using spatiotemporal partially-observed Markov process (SpatPOMP) models. The 'spatPomp' package extends 'pomp' to include algorithms taking advantage of the spatial structure in order to assist with handling high dimensional processes. See Asfaw et al. (2024) <doi:10.48550/arXiv.2101.01157> for further description of the package.
Maintained by Edward Ionides. Last updated 4 months ago.
2 stars 7.38 score 93 scripts
cran
timeDate:Rmetrics - Chronological and Calendar Objects
The 'timeDate' class fulfils the conventions of the ISO 8601 standard as well as of the ANSI C and POSIX standards. Beyond these standards it provides the "Financial Center" concept, which makes it possible to handle data records collected in different time zones and combine them so that they always carry the proper time stamps with respect to your personal financial center, or alternatively to the GMT reference time. It can thus also handle time stamps from historical data records from the same time zone, even if the financial centers changed daylight saving times at different calendar dates.
Maintained by Georgi N. Boshnakov. Last updated 6 months ago.
1 star 7.37 score 713 dependents
consbiol-unibern
SDMtune:Species Distribution Model Selection
User-friendly framework that enables the training and the evaluation of species distribution models (SDMs). The package implements functions for data-driven variable selection and model tuning and includes numerous utilities to display the results. All the functions used to select variables or to tune model hyperparameters have an interactive real-time chart displayed in the 'RStudio' viewer pane during their execution.
Maintained by Sergio Vignali. Last updated 3 months ago.
hyperparameter-tuningspecies-distribution-modellingvariable-selectioncpp
25 stars 7.37 score 155 scripts
gagolews
FuzzyNumbers:Tools to Deal with Fuzzy Numbers
S4 classes and methods to deal with fuzzy numbers. They allow for computing any arithmetic operations (e.g., by using the Zadeh extension principle), performing approximation of arbitrary fuzzy numbers by trapezoidal and piecewise linear ones, preparing plots for publications, computing possibility and necessity values for comparisons, etc.
Maintained by Marek Gagolewski. Last updated 3 years ago.
10 stars 7.37 score 91 scripts 17 dependents
jsta
wql:Exploring Water Quality Monitoring Data
Functions to assist in the processing and exploration of data from environmental monitoring programs. The package name stands for "water quality" and reflects the original focus on time series data for physical and chemical properties of water, as well as the biota. Intended for programs that sample approximately monthly, quarterly or annually at discrete stations, a feature of many legacy data sets. Most of the functions should be useful for analysis of similar-frequency time series regardless of the subject matter.
Maintained by Jemma Stachelek. Last updated 2 months ago.
12 stars 7.34 score 204 scripts 3 dependents
choi-phd
TestDesign:Optimal Test Design Approach to Fixed and Adaptive Test Construction
Uses the optimal test design approach by Birnbaum (1968, ISBN:9781593119348) and van der Linden (2018) <doi:10.1201/9781315117430> to construct fixed, adaptive, and parallel tests. Supports the following mixed-integer programming (MIP) solver packages: 'Rsymphony', 'highs', 'gurobi', 'lpSolve', and 'Rglpk'. The 'gurobi' package is not available from CRAN; see <https://www.gurobi.com/downloads/>.
Maintained by Seung W. Choi. Last updated 6 months ago.
3 stars 7.34 score 37 scripts 2 dependents
argocanada
argoFloats:Analysis of Oceanographic Argo Floats
Supports the analysis of oceanographic data recorded by Argo autonomous drifting profiling floats. Functions are provided to (a) download and cache data files, (b) subset data in various ways, (c) handle quality-control flags and (d) plot the results according to oceanographic conventions. A shiny app is provided for easy exploration of datasets. The package is designed to work well with the 'oce' package, providing a wide range of processing capabilities that are particular to oceanographic analysis. See Kelley, Harbin, and Richards (2021) <doi:10.3389/fmars.2021.635922> for more on the scientific context and applications.
Maintained by Dan Kelley. Last updated 1 month ago.
17 stars 7.32 score 203 scripts
kornl
gMCP:Graph Based Multiple Comparison Procedures
Functions and a graphical user interface for graphically described multiple test procedures.
Maintained by Kornelius Rohmeyer. Last updated 1 year ago.
10 stars 7.31 score 105 scripts 2 dependents
bioc
flowClust:Clustering for Flow Cytometry
Robust model-based clustering using a t-mixture model with Box-Cox transformation. Note: users should have GSL installed. Windows users: 'consult the README file available in the inst directory of the source distribution for necessary configuration instructions'.
Maintained by Greg Finak. Last updated 5 months ago.
immunooncologyclusteringvisualizationflowcytometry
7.30 score 83 scripts 6 dependents
r-forge
pcalg:Methods for Graphical Models and Causal Inference
Functions for causal structure learning and causal inference using graphical models. The main algorithms for causal structure learning are PC (for observational data without hidden variables), FCI and RFCI (for observational data with hidden variables), and GIES (for a mix of data from observational studies (i.e. observational data) and data from experiments involving interventions (i.e. interventional data) without hidden variables). For causal inference the IDA algorithm, the Generalized Backdoor Criterion (GBC), the Generalized Adjustment Criterion (GAC) and some related functions are implemented. Functions for incorporating background knowledge are provided.
Maintained by Markus Kalisch. Last updated 7 months ago.
7.30 score 700 scripts 19 dependents
robinhankin
onion:Octonions and Quaternions
Quaternions and octonions are four- and eight-dimensional extensions of the complex numbers. They are normed division algebras over the real numbers and find applications in spatial rotations (quaternions), and string theory and relativity (octonions). The quaternions are noncommutative and the octonions nonassociative. See the package vignette for more details.
Maintained by Robin K. S. Hankin. Last updated 1 month ago.
6 stars 7.27 score 43 scripts 3 dependents
bioc
IHW:Independent Hypothesis Weighting
Independent hypothesis weighting (IHW) is a multiple testing procedure that increases power compared to the method of Benjamini and Hochberg by assigning data-driven weights to each hypothesis. The input to IHW is a two-column table of p-values and covariates. The covariate can be any continuous-valued or categorical variable that is thought to be informative on the statistical properties of each hypothesis test, while it is independent of the p-value under the null hypothesis.
Maintained by Nikos Ignatiadis. Last updated 5 months ago.
immunooncologymultiplecomparisonrnaseq
7.25 score 264 scripts 2 dependents
edzer
trajectories:Classes and Methods for Trajectory Data
Classes and methods for trajectory data, with support for nesting individual Track objects in track sets (Tracks) and track sets for different entities in collections of Tracks. Methods include selection, generalization, aggregation, intersection, simulation, and plotting.
Maintained by Edzer Pebesma. Last updated 7 months ago.
31 stars 7.25 score 76 scripts 1 dependent
bioc
qpgraph:Estimation of Genetic and Molecular Regulatory Networks from High-Throughput Genomics Data
Estimate gene and eQTL networks from high-throughput expression and genotyping assays.
Maintained by Robert Castelo. Last updated 2 days ago.
microarraygeneexpressiontranscriptionpathwaysnetworkinferencegraphandnetworkgeneregulationgeneticsgeneticvariabilitysnpsoftwareopenblas
3 stars 7.24 score 20 scripts 3 dependents
ropensci
melt:Multiple Empirical Likelihood Tests
Performs multiple empirical likelihood tests. It offers an easy-to-use interface and flexibility in specifying hypotheses and calibration methods, extending the framework to simultaneous inferences. The core computational routines are implemented using the 'Eigen' 'C++' library and 'RcppEigen' interface, with 'OpenMP' for parallel computation. Details of the testing procedures are provided in Kim, MacEachern, and Peruggia (2023) <doi:10.1080/10485252.2023.2206919>. A companion paper by Kim, MacEachern, and Peruggia (2024) <doi:10.18637/jss.v108.i05> is available for further information. This work was supported by the U.S. National Science Foundation under Grants No. SES-1921523 and DMS-2015552.
Maintained by Eunseop Kim. Last updated 11 months ago.
12 stars 7.24 score 84 scripts
wbnicholson
BigVAR:Dimension Reduction Methods for Multivariate Time Series
Estimates VAR and VARX models with Structured Penalties.
Maintained by Will Nicholson. Last updated 6 months ago.
58 stars 7.24 score 100 scripts 1 dependent
vpihur
clValid:Validation of Clustering Results
Statistical and biological validation of clustering results. This package implements the Dunn Index, Silhouette, Connectivity, Stability, BHI and BSI measures. Further information can be found in Brock, G et al. (2008) <doi:10.18637/jss.v025.i04>.
Maintained by Vasyl Pihur. Last updated 4 years ago.
5 stars 7.24 score 422 scripts 14 dependents
dankelley
plan:Tools for Project Planning
Supports the creation of 'burndown' charts and 'gantt' diagrams.
Maintained by Dan Kelley. Last updated 2 years ago.
33 stars 7.23 score 103 scripts
jhorzek
lessSEM:Non-Smooth Regularization for Structural Equation Models
Provides regularized structural equation modeling (regularized SEM) with non-smooth penalty functions (e.g., lasso) building on 'lavaan'. The package is heavily inspired by the ['regsem'](<https://github.com/Rjacobucci/regsem>) and ['lslx'](<https://github.com/psyphh/lslx>) packages.
Maintained by Jannik H. Orzek. Last updated 1 year ago.
lassopsychometricsregularizationregularized-structural-equation-modelsemstructural-equation-modelingopenblascppopenmp
7 stars 7.19 score 223 scripts
paulponcet
statip:Statistical Functions for Probability Distributions and Regression
A collection of miscellaneous statistical functions for probability distributions: 'dbern()', 'pbern()', 'qbern()', 'rbern()' for the Bernoulli distribution, and 'distr2name()', 'name2distr()' for distribution names; probability density estimation: 'densityfun()'; most frequent value estimation: 'mfv()', 'mfv1()'; other statistical measures of location: 'cv()' (coefficient of variation), 'midhinge()', 'midrange()', 'trimean()'; construction of histograms: 'histo()', 'find_breaks()'; calculation of the Hellinger distance: 'hellinger()'; use of classical kernels: 'kernelfun()', 'kernel_properties()'; univariate piecewise-constant regression: 'picor()'.
Maintained by Paul Poncet. Last updated 5 years ago.
2 stars 7.17 score 73 scripts 52 dependents
datastorm-open
rAmCharts:JavaScript Charts Tool
Provides an R interface for using the 'AmCharts' library. Based on 'htmlwidgets', it provides a global architecture to generate 'JavaScript' source code for charts. Most classes in the library have an equivalent S4 class in R; for those classes, not all properties have been referenced, but they can easily be added in the constructors. Complex properties (e.g. a 'JavaScript' object) can be passed as a named list. See examples at <https://datastorm-open.github.io/introduction_ramcharts/> and <https://www.amcharts.com/> for more information about the library. The package includes the free version of the 'AmCharts' library. Its only limitation is a small link to the web site displayed on your charts. If you enjoy this library, do not hesitate to refer to this page <https://www.amcharts.com/online-store/> to purchase a licence, and thus support its creators and get a period of Priority Support. See also <https://www.amcharts.com/about/> for more information about the 'AmCharts' company.
Maintained by Benoit Thieurmel. Last updated 2 months ago.
49 stars 7.17 score 153 scripts 4 dependents
jellegoeman
penalized:L1 (Lasso and Fused Lasso) and L2 (Ridge) Penalized Estimation in GLMs and in the Cox Model
Fits possibly high-dimensional penalized regression models. The penalty structure can be any combination of an L1 penalty (lasso and fused lasso), an L2 penalty (ridge) and a positivity constraint on the regression coefficients. The supported regression models are linear, logistic and Poisson regression and the Cox Proportional Hazards model. Cross-validation routines allow optimization of the tuning parameters.
Maintained by Jelle Goeman. Last updated 3 years ago.
4 stars 7.09 score 429 scripts 17 dependents
optad
adoptr:Adaptive Optimal Two-Stage Designs
Optimize one- or two-arm, two-stage designs for clinical trials with respect to several implemented objective criteria or custom objectives. Optimization under uncertainty and conditional (given stage-one outcome) constraints are supported. See Pilz et al. (2019) <doi:10.1002/sim.8291> and Kunzmann et al. (2021) <doi:10.18637/jss.v098.i09> for details.
Maintained by Maximilian Pilz. Last updated 6 months ago.
1 star 7.09 score 39 scripts 1 dependent
khliland
baseline:Baseline Correction of Spectra
Collection of baseline correction algorithms, along with a framework and a Tcl/Tk enabled GUI for optimising baseline algorithm parameters. Typical use of the package is for removing background effects from spectra originating from various types of spectroscopy and spectrometry, possibly optimizing this with regard to regression or classification results. Correction methods include polynomial fitting, weighted local smoothers and many more.
Maintained by Kristian Hovde Liland. Last updated 10 months ago.
9 stars 7.07 score 74 scripts 12 dependents
spedygiorgio
lifecontingencies:Financial and Actuarial Mathematics for Life Contingencies
Classes and methods that allow the user to manage life tables and actuarial tables (including multiple-decrement tables). The package also contains functions to easily perform demographic, financial and actuarial calculations for life contingencies insurance. See Spedicato (2013) <doi:10.18637/jss.v055.i10>.
Maintained by Giorgio Alfredo Spedicato. Last updated 6 months ago.
actuarialfinanciallife-contingencieslife-insurancecpp
61 stars 7.06 score 156 scripts
crp2a
gamma:Dose Rate Estimation from in-Situ Gamma-Ray Spectrometry Measurements
Process in-situ Gamma-Ray Spectrometry for Luminescence Dating. This package allows users to import, inspect and correct the energy shifts of gamma-ray spectra. It provides methods for estimating the gamma dose rate by the use of a calibration curve as described in Mercier and Falguères (2007). The package only supports Canberra CNF and TKA and Kromek SPE files.
Maintained by Archéosciences Bordeaux. Last updated 6 months ago.
archaeometrygamma-spectrometrygeochronologyluminescence-dating
7 stars 7.05 score 11 scripts 1 dependent
yuimaproject
yuima:The YUIMA Project Package for SDEs
Simulation and Inference for SDEs and Other Stochastic Processes.
Maintained by Stefano M. Iacus. Last updated 2 days ago.
9 stars 7.02 score 92 scripts 2 dependents
doccstat
fastcpd:Fast Change Point Detection via Sequential Gradient Descent
Implements a fast change point detection algorithm based on the paper "Sequential Gradient Descent and Quasi-Newton's Method for Change-Point Analysis" by Xianyang Zhang, Trisha Dawn <https://proceedings.mlr.press/v206/zhang23b.html>. The algorithm is based on dynamic programming with pruning and sequential gradient descent. It is able to detect change points an order of magnitude faster than the vanilla Pruned Exact Linear Time (PELT) method. The package includes examples for linear regression, logistic regression, Poisson regression, and penalized linear regression data, as well as many more examples with custom cost functions for users who want to supply their own.
Maintained by Xingchi Li. Last updated 10 days ago.
change-point-detectioncppcustom-functiongradient-descentlassolinear-regressionlogistic-regressionofflinepeltpenalized-regressionpoisson-regressionquasi-newtonstatisticstime-serieswarm-startfortranopenblascppopenmp
22 stars 7.00 score 7 scripts
roustant
DiceKriging:Kriging Methods for Computer Experiments
Estimation, validation and prediction of kriging models. Important functions: km, print.km, plot.km, predict.km.
Maintained by Olivier Roustant. Last updated 4 years ago.
4 stars 6.99 score 526 scripts 37 dependents
sylvainschmitt
SSDM:Stacked Species Distribution Modelling
Maps species richness and endemism based on stacked species distribution models (SSDM). Individual SDMs can be created using a single or multiple algorithms (ensemble SDMs). For each species, an SDM can yield a habitat suitability map, a binary map, a between-algorithm variance map, and can assess variable importance, algorithm accuracy, and between-algorithm correlation. Methods to stack individual SDMs include summing individual probabilities and thresholding then summing. Thresholding can be based on a specific evaluation metric or by drawing repeatedly from a Bernoulli distribution. The SSDM package also provides a user-friendly interface.
Maintained by Sylvain Schmitt. Last updated 11 months ago.
44 stars 6.99 score 44 scripts
r-forge
oompaBase:Class Unions, Matrix Operations, and Color Schemes for OOMPA
Provides the class unions that must be preloaded in order for the basic tools in the OOMPA (Object-Oriented Microarray and Proteomics Analysis) project to be defined and loaded. It also includes vectorized operations for row-by-row means, variances, and t-tests. Finally, it provides new color schemes. Details on the packages in the OOMPA project can be found at <http://oompa.r-forge.r-project.org/>.
Maintained by Kevin R. Coombes. Last updated 2 months ago.
6.97 score 29 scripts 18 dependents
bioc
ROC:utilities for ROC, with microarray focus
Provides utilities for ROC, with microarray focus.
Maintained by Vince Carey. Last updated 5 months ago.
6.95 score 70 scripts 8 dependents
bioc
CellNOptR:Training of boolean logic models of signalling networks using prior knowledge networks and perturbation data
This package does optimisation of boolean logic networks of signalling pathways based on a previous knowledge network and a set of data upon perturbation of the nodes in the network.
Maintained by Attila Gabor. Last updated 4 days ago.
cellbasedassayscellbiologyproteomicspathwaysnetworktimecourseimmunooncology
6.95 score 98 scripts 6 dependents
archaeostat
ArchaeoPhases:Post-Processing of Markov Chain Monte Carlo Simulations for Chronological Modelling
Statistical analysis of archaeological dates and groups of dates. This package allows users to post-process Markov Chain Monte Carlo (MCMC) simulations from 'ChronoModel' <https://chronomodel.com/>, 'Oxcal' <https://c14.arch.ox.ac.uk/oxcal.html> or 'BCal' <https://bcal.shef.ac.uk/>. It provides functions for the study of rhythms of the long term from the posterior distribution of a series of dates (tempo and activity plot). It also allows the estimation and visualization of time ranges from the posterior distribution of groups of dates (e.g. duration, transition and hiatus between successive phases) as described in Philippe and Vibet (2020) <doi:10.18637/jss.v093.c01>.
Maintained by Anne Philippe. Last updated 12 months ago.
archaeologybayesian-statisticsgeochronologymarkov-chainradiocarbon-dates
10 stars 6.90 score 66 scripts
kingaa
ouch:Ornstein-Uhlenbeck Models for Phylogenetic Comparative Hypotheses
Fit and compare Ornstein-Uhlenbeck models for evolution along a phylogenetic tree.
Maintained by Aaron A. King. Last updated 5 months ago.
adaptive-regimebrownian-motionornstein-uhlenbeckornstein-uhlenbeck-modelsouchphylogenetic-comparative-hypothesesphylogenetic-comparative-methodsphylogenetic-datareact
15 stars 6.87 score 68 scripts 4 dependents
flr
FLasher:Projection and Forecasting of Fish Populations, Stocks and Fleets
Projection of future population and fishery dynamics is carried out for a given set of management targets. A system of equations is solved, using Automatic Differentiation (AD), for the levels of effort by fishery (fleet) that will result in the required abundances, catches or fishing mortalities.
Maintained by Iago Mosqueira. Last updated 21 days ago.
2 stars 6.86 score 254 scripts 6 dependents
philips-software
latrend:A Framework for Clustering Longitudinal Data
A framework for clustering longitudinal datasets in a standardized way. The package provides an interface to existing R packages for clustering longitudinal univariate trajectories, facilitating reproducible and transparent analyses. Additionally, standard tools are provided to support cluster analyses, including repeated estimation, model validation, and model assessment. The interface enables users to compare results between methods, and to implement and evaluate new methods with ease. The 'akmedoids' package is available from <https://github.com/MAnalytics/akmedoids>.
Maintained by Niek Den Teuling. Last updated 3 months ago.
cluster-analysisclustering-evaluationclustering-methodsdata-sciencelongitudinal-clusteringlongitudinal-datamixture-modelstime-series-analysis
30 stars 6.77 score 26 scripts
achubaty
grainscape:Landscape Connectivity, Habitat, and Protected Area Networks
Given a landscape resistance surface, creates minimum planar graph (Fall et al. (2007) <doi:10.1007/s10021-007-9038-7>) and grains of connectivity (Galpern et al. (2012) <doi:10.1111/j.1365-294X.2012.05677.x>) models that can be used to calculate effective distances for landscape connectivity at multiple scales. Documentation is provided by several vignettes, and a paper (Chubaty, Galpern & Doctolero (2020) <doi:10.1111/2041-210X.13350>).
Maintained by Alex M Chubaty. Last updated 2 months ago.
habitat-connectivitylandscape-connectivityspatial-graphscpp
19 stars 6.76 score 20 scripts
flr
FLa4a:A Simple and Robust Statistical Catch at Age Model
A simple and robust statistical Catch at Age model that is specifically designed for stocks with intermediate levels of data quantity and quality.
Maintained by Ernesto Jardim. Last updated 7 days ago.
12 stars 6.71 score 177 scripts 2 dependents
dgerlanc
portfolio:Analysing Equity Portfolios
Classes for analysing and implementing equity portfolios, including routines for generating tradelists and calculating exposures to user-specified risk factors.
Maintained by Daniel Gerlanc. Last updated 7 months ago.
financeportfolio-constructionrisk-modelling
16 stars 6.71 score 106 scripts
bioc
doppelgangR:Identify likely duplicate samples from genomic or meta-data
The main function is doppelgangR(), which takes as minimal input a list of ExpressionSet objects, and searches all list pairs for duplicated samples. The search is based on the genomic data (exprs(eset)), phenotype/clinical data (pData(eset)), and "smoking guns" - supposedly unique identifiers found in pData(eset).
Maintained by Levi Waldron. Last updated 5 months ago.
immunooncologyrnaseqmicroarraygeneexpressionqualitycontrolbioconductor-package
5 stars 6.67 score 31 scripts
r-forge
distrEx:Extensions of Package 'distr'
Extends package 'distr' by functionals, distances, and conditional distributions.
Maintained by Matthias Kohl. Last updated 2 months ago.
6.64 score 107 scripts 17 dependents
bioc
LEA:LEA: an R package for Landscape and Ecological Association Studies
LEA is an R package dedicated to population genomics, landscape genomics and genotype-environment association tests. LEA can run analyses of population structure and genome-wide tests for local adaptation, and also performs imputation of missing genotypes. The package includes statistical methods for estimating ancestry coefficients from large genotypic matrices and for evaluating the number of ancestral populations (snmf). It performs statistical tests using latent factor mixed models for identifying genetic polymorphisms that exhibit association with environmental gradients or phenotypic traits (lfmm2). In addition, LEA computes values of genetic offset statistics based on new or predicted environments (genetic.gap, genetic.offset). LEA is mainly based on optimized programs that can scale with the dimensions of large data sets.
Maintained by Olivier Francois. Last updated 17 days ago.
softwarestatistical methodclusteringregressionopenblas
6.63 score 534 scripts
r-forge
distrMod:Object Oriented Implementation of Probability Models
Implements S4 classes for probability models based on packages 'distr' and 'distrEx'.
Maintained by Peter Ruckdeschel. Last updated 2 months ago.
6.60 score 139 scripts 6 dependents
bioc
kebabs:Kernel-Based Analysis of Biological Sequences
The package provides functionality for kernel-based analysis of DNA, RNA, and amino acid sequences via SVM-based methods. As core functionality, kebabs implements the following sequence kernels: spectrum kernel, mismatch kernel, gappy pair kernel, and motif kernel. Apart from an efficient implementation of standard position-independent functionality, the kernels are extended in a novel way to take the position of patterns into account for the similarity measure. Because of the flexibility of the kernel formulation, other kernels like the weighted degree kernel or the shifted weighted degree kernel with constant weighting of positions are included as special cases. An annotation-specific variant of the kernels uses annotation information placed along the sequence together with the patterns in the sequence. The package allows for the generation of a kernel matrix or an explicit feature representation in dense or sparse format for all available kernels which can be used with methods implemented in other R packages. With focus on SVM-based methods, kebabs provides a framework which simplifies the usage of existing SVM implementations in kernlab, e1071, and LiblineaR. Binary and multi-class classification as well as regression tasks can be used in a unified way without having to deal with the different functions, parameters, and formats of the selected SVM. As support for choosing hyperparameters, the package provides cross validation (including grouped cross validation), grid search and model selection functions. For easier biological interpretation of the results, the package computes feature weights for all SVMs and prediction profiles which show the contribution of individual sequence positions to the prediction result and indicate the relevance of sequence sections for the learning result and the underlying biological functions.
Maintained by Ulrich Bodenhofer. Last updated 5 months ago.
supportvectormachineclassificationclusteringregressioncpp
6.58 score 47 scripts 3 dependents
flr
FLBRP:Reference Points for Fisheries Management
Calculates a range of biological reference points based upon yield per recruit and stock recruit based equilibrium calculations. These include F based reference points like F0.1, FMSY and biomass based reference points like BMSY.
Maintained by Iago Mosqueira. Last updated 4 months ago.
reference pointsfisheriesflrcpp
2 stars 6.58 score 350 scripts 4 dependents
toxpi
toxpiR:Create ToxPi Prioritization Models
Enables users to build 'ToxPi' prioritization models and provides functionality within the grid framework for plotting ToxPi graphs. 'toxpiR' allows for more customization than the 'ToxPi GUI' (<https://toxpi.org>) and integration into existing workflows for greater ease-of-use, reproducibility, and transparency. The 'toxpiR' package behaves nearly identically to the GUI; the package documentation includes notes about all differences. The vignettes download example files from <https://github.com/ToxPi/ToxPi-example-files>.
Maintained by Jonathon F Fleming. Last updated 7 months ago.
data-sciencemodelingtoxicology
11 stars 6.58 score 19 scripts
spkaluzny
splus2R:Supplemental S-PLUS Functionality in R
Currently there are many functions in S-PLUS that are missing in R. To facilitate the conversion of S-PLUS packages to R packages, this package provides some missing S-PLUS functionality in R.
Maintained by Stephen Kaluzny. Last updated 1 year ago.
1 star 6.56 score 82 scripts 30 dependents
fbertran
Cascade:Selection, Reverse-Engineering and Prediction in Cascade Networks
A modeling tool allowing gene selection, reverse engineering, and prediction in cascade networks. Jung, N., Bertrand, F., Bahram, S., Vallat, L., and Maumy-Bertrand, M. (2014) <doi:10.1093/bioinformatics/btt705>.
Maintained by Frederic Bertrand. Last updated 2 years ago.
1 star 6.56 score 40 scripts 2 dependents
bachmannpatrick
CLVTools:Tools for Customer Lifetime Value Estimation
A set of state-of-the-art probabilistic modeling approaches to derive estimates of individual customer lifetime values (CLV). Commonly, probabilistic approaches focus on modelling three processes: individuals' attrition, transaction, and spending processes. Latent customer attrition models, which are also known as "buy-'til-you-die models", model the attrition as well as the transaction process. They are used to make inferences and predictions about transactional patterns of individual customers such as their future purchase behavior. Moreover, these models have also been used to predict individuals’ long-term engagement in activities such as playing an online game or posting to a social media platform. The spending process is usually modelled by a separate probabilistic model. Combining these results yields lifetime value estimates for individual customers. This package includes fast and accurate implementations of various probabilistic models for non-contractual settings (e.g., grocery purchases or hotel visits). All implementations support time-invariant covariates, which can be used to control for e.g., socio-demographics. If such an extension has been proposed in the literature, we further provide the possibility to control for time-varying covariates to control for e.g., seasonal patterns. Currently, the package includes the following latent attrition models to model individuals' attrition and transaction process: [1] the Pareto/NBD model (Pareto/Negative-Binomial-Distribution), [2] the Extended Pareto/NBD model (Pareto/Negative-Binomial-Distribution with time-varying covariates), [3] the BG/NBD model (Beta-Geometric/Negative-Binomial-Distribution) and [4] the GGom/NBD model (Gamma-Gompertz/Negative-Binomial-Distribution). Further, we provide an implementation of the Gamma/Gamma model to model the spending process of individuals.
Maintained by Patrick Bachmann. Last updated 4 months ago.
clvcustomer-lifetime-valuecustomer-relationship-managementopenblasgslcppopenmp
55 stars 6.47 score 12 scripts
r-forge
ClassComparison:Classes and Methods for "Class Comparison" Problems on Microarrays
Defines the classes used for "class comparison" problems in the OOMPA project (<http://oompa.r-forge.r-project.org/>). Class comparison includes tests for differential expression; see Simon's book for details on typical problem types.
Maintained by Kevin R. Coombes. Last updated 2 months ago.
microarraydifferentialexpressionmultiplecomparisons
6.46 score 44 scripts 3 dependents
bioc
MultiDataSet:Implementation of MultiDataSet and ResultSet
Implementation of the BRGE's (Bioinformatic Research Group in Epidemiology from Center for Research in Environmental Epidemiology) MultiDataSet and ResultSet. MultiDataSet is designed for integrating multi omics data sets and ResultSet is a container for omics results. This package contains base classes for MEAL and rexposome packages.
Maintained by Xavier Escribà Montagut. Last updated 5 months ago.
6.45 score 28 scripts 10 dependents
bstatcomp
bayes4psy:User Friendly Bayesian Data Analysis for Psychology
Contains several Bayesian models for data analysis of psychological tests. A user friendly interface for these models should enable students and researchers to perform professional level Bayesian data analysis without advanced knowledge in programming and Bayesian statistics. This package is based on the Stan platform (Carpenter et al. 2017 <doi:10.18637/jss.v076.i01>).
Maintained by Jure Demšar. Last updated 1 year ago.
14 stars 6.44 score 33 scripts
bioc
PICS:Probabilistic inference of ChIP-seq
Probabilistic inference of ChIP-Seq using an empirical Bayes mixture model approach.
Maintained by Renan Sauteraud. Last updated 2 days ago.
clusteringvisualizationsequencingchipseqgsl
6.43 score 7 scripts 1 dependent
thibautjombart
apex:Phylogenetic Methods for Multiple Gene Data
Toolkit for the analysis of multiple gene data (Jombart et al. 2017) <doi:10.1111/1755-0998.12567>. 'apex' implements the new S4 classes 'multidna', 'multiphyDat' and associated methods to handle aligned DNA sequences from multiple genes.
Maintained by Klaus Schliep. Last updated 1 year ago.
5 stars 6.40 score 54 scripts
victor-navarro
calmr:Canonical Associative Learning Models and their Representations
Implementations of canonical associative learning models, with tools to run experiment simulations, estimate model parameters, and compare model representations. Experiments and results are represented using S4 classes and methods.
Maintained by Victor Navarro. Last updated 10 months ago.
3 stars 6.40 score 17 scripts
blue-matter
SAMtool:Stock Assessment Methods Toolkit
Simulation tools for closed-loop simulation are provided for the 'MSEtool' operating model to inform data-rich fisheries. 'SAMtool' provides a conditioning model, assessment models of varying complexity with standardized reporting, model-based management procedures, and diagnostic tools for evaluating assessments inside closed-loop simulation.
Maintained by Quang Huynh. Last updated 1 month ago.
3 stars 6.39 score 36 scripts 1 dependent
ropensci
QuadratiK:Collection of Methods Constructed using Kernel-Based Quadratic Distances
Includes tests for multivariate normality, tests for uniformity on the d-dimensional sphere, non-parametric two- and k-sample tests, random generation of points from the Poisson kernel-based density, and a clustering algorithm for spherical data. For more information see Saraceno G., Markatou M., Mukhopadhyay R. and Golzy M. (2024) <doi:10.48550/arXiv.2402.02290>, Markatou, M. and Saraceno, G. (2024) <doi:10.48550/arXiv.2407.16374>, Ding, Y., Markatou, M. and Saraceno, G. (2023) <doi:10.5705/ss.202022.0347>, and Golzy, M. and Markatou, M. (2020) <doi:10.1080/10618600.2020.1740713>.
Maintained by Giovanni Saraceno. Last updated 2 months ago.
1 star 6.36 score 27 scripts
stc04003
reReg:Recurrent Event Regression
A comprehensive collection of practical and easy-to-use tools for regression analysis of recurrent events, with or without the presence of a (possibly) informative terminal event, as described in Chiou et al. (2023) <doi:10.18637/jss.v105.i05>. The modeling framework is based on a joint frailty scale-change model, which includes models described in Wang et al. (2001) <doi:10.1198/016214501753209031>, Huang and Wang (2004) <doi:10.1198/016214504000001033>, Xu et al. (2017) <doi:10.1080/01621459.2016.1173557>, and Xu et al. (2019) <doi:10.5705/SS.202018.0224> as special cases. The implemented estimating procedure does not require any parametric assumption on the frailty distribution. The package also allows users to specify different model forms for both the recurrent event process and the terminal event.
Maintained by Sy Han (Steven) Chiou. Last updated 2 months ago.
23 stars 6.35 score 36 scripts 1 dependents
dfriend21
quadtree:Region Quadtrees for Spatial Data
Provides functionality for working with raster-like quadtrees (also called “region quadtrees”), which allow for variable-sized cells. The package allows for flexibility in the quadtree creation process. Several functions defining how to split and aggregate cells are provided, and custom functions can be written for both of these processes. In addition, quadtrees can be created using other quadtrees as “templates”, so that the new quadtree's structure is identical to the template quadtree. The package also includes functionality for modifying quadtrees, querying values, saving quadtrees to a file, and calculating least-cost paths using the quadtree as a resistance surface.
Maintained by Derek Friend. Last updated 2 years ago.
19 stars 6.34 score 58 scripts
cran
fGarch:Rmetrics - Autoregressive Conditional Heteroskedastic Modelling
Analyze and model heteroskedastic behavior in financial time series.
Maintained by Georgi N. Boshnakov. Last updated 1 year ago.
7 stars 6.33 score 51 dependents
smoeding
usl:Analyze System Scalability with the Universal Scalability Law
The Universal Scalability Law (Gunther 2007) <doi:10.1007/978-3-540-31010-5> is a model to predict hardware and software scalability. It uses system capacity as a function of load to forecast the scalability for the system.
Maintained by Stefan Moeding. Last updated 3 years ago.
scalabilityuniversal-scalability-lawusl
36 stars 6.32 score 117 scripts
larmarange
prevR:Estimating Regional Trends of a Prevalence from a DHS and Similar Surveys
Spatial estimation of a prevalence surface or a relative risks surface, using data from a Demographic and Health Survey (DHS) or an analogous survey; see Larmarange et al. (2011) <doi:10.4000/cybergeo.24606>.
Maintained by Joseph Larmarange. Last updated 6 months ago.
5 stars 6.26 score 46 scripts
bioc
lumi:BeadArray Specific Methods for Illumina Methylation and Expression Microarrays
The lumi package provides an integrated solution for Illumina microarray data analysis. It includes functions for Illumina BeadStudio (GenomeStudio) data input, quality control, BeadArray-specific variance stabilization, normalization and gene annotation at the probe level. It also includes functions for processing Illumina methylation microarrays, especially Illumina Infinium methylation microarrays.
Maintained by Lei Huang. Last updated 5 months ago.
microarrayonechannelpreprocessingdnamethylationqualitycontroltwochannel
6.26 score 294 scripts 5 dependents
rickhelmus
patRoon:Workflows for Mass-Spectrometry Based Non-Target Analysis
Provides an easy-to-use interface to a mass spectrometry based non-target analysis workflow. Various (open-source) tools are combined which provide algorithms for extraction and grouping of features, extraction of MS and MS/MS data, automatic formula and compound annotation and grouping related features to components. In addition, various tools are provided for e.g. data preparation and cleanup, plotting results and automatic reporting.
Maintained by Rick Helmus. Last updated 7 days ago.
mass-spectrometrynon-targetcppopenjdk
65 stars 6.24 score 43 scripts
clarahapp
funData:An S4 Class for Functional Data
S4 classes for univariate and multivariate functional data with utility functions. See <doi:10.18637/jss.v093.i05> for a detailed description of the package functionalities and its interplay with the MFPCA package for multivariate functional principal component analysis <https://CRAN.R-project.org/package=MFPCA>.
Maintained by Clara Happ-Kurz. Last updated 1 year ago.
14 stars 6.15 score 111 scripts 6 dependents
umr-amap
AMAPVox:LiDAR Data Voxelisation
Read, manipulate and write voxel spaces. Voxel spaces are read from text-based output files of the 'AMAPVox' software. 'AMAPVox' is a LiDAR point cloud voxelisation software that aims at estimating leaf area through several theoretical/numerical approaches. See more in the article Vincent et al. (2017) <doi:10.23708/1AJNMP> and the technical note Vincent et al. (2021) <doi:10.23708/1AJNMP>.
Maintained by Philippe Verley. Last updated 2 months ago.
15 stars 6.13 score 12 scripts
distancedevelopment
dssd:Distance Sampling Survey Design
Creates survey designs for distance sampling surveys. These designs can be assessed for various effort and coverage statistics. Once the user is satisfied with the design characteristics, they can generate a set of transects to use in their distance sampling survey. Many of the designs implemented in this R package were first made available in our 'Distance' for Windows software and are detailed in Chapter 7 of Advanced Distance Sampling, Buckland et al. (2008, ISBN-13: 978-0199225873). Find out more about estimating animal/plant abundance with distance sampling at <http://distancesampling.org/>.
Maintained by Laura Marshall. Last updated 3 months ago.
2 stars 6.11 score 36 scripts 1 dependents
carlos-alberto-silva
rGEDI:NASA's Global Ecosystem Dynamics Investigation (GEDI) Data Visualization and Processing
Set of tools for downloading, reading, visualizing and processing GEDI Level1B, Level2A and Level2B data.
Maintained by Caio Hamamura. Last updated 5 months ago.
169 stars 6.11 score 85 scripts 1 dependents
geobosh
sarima:Simulation and Prediction with Seasonal ARIMA Models
Functions, classes and methods for time series modelling with ARIMA and related models. The aim of the package is to provide a consistent interface for the user. For example, a single function autocorrelations() computes various kinds of theoretical and sample autocorrelations. This is work in progress; see the documentation and vignettes for the current functionality. Function sarima() fits extended multiplicative seasonal ARIMA models with trends, exogenous variables and arbitrary roots on the unit circle, which can be fixed or estimated (for the algebraic basis for this see <arXiv:2208.05055>; a paper on the methodology is being prepared).
Maintained by Georgi N. Boshnakov. Last updated 1 year ago.
arimakalman-filterreg-sarimasarimasarimaxseasonaltime-seriesxarimaopenblascpp
3 stars 6.09 score 112 scripts 1 dependents
bioc
Pedixplorer:Pedigree Functions
Routines to handle family data with a Pedigree object. The initial purpose was to create correlation structures that describe family relationships such as kinship and identity-by-descent, which can be used to model family data in mixed effects models, such as in the coxme function. Also includes a tool for Pedigree drawing which is focused on producing compact layouts without intervention. Recent additions include utilities to trim the Pedigree object with various criteria, and kinship for the X chromosome.
Maintained by Louis Le Nezet. Last updated 13 days ago.
softwaredatarepresentationgeneticsgraphandnetworkvisualizationkinshippedigree
2 stars 6.08 score 10 scripts
bioc
mgsa:Model-based gene set analysis
Model-based Gene Set Analysis (MGSA) is a Bayesian modeling approach for gene set enrichment. The package mgsa implements MGSA and tools to use MGSA together with the Gene Ontology.
Maintained by Sebastian Bauer. Last updated 5 months ago.
pathwaysgogenesetenrichmentopenmp
5 stars 6.08 score 12 scripts
bioc
qcmetrics:A Framework for Quality Control
The package provides a framework for generic quality control of data. It allows users to create, manage and visualise individual quality control metrics or sets of them, and to generate quality control reports in various formats.
Maintained by Laurent Gatto. Last updated 5 months ago.
immunooncologysoftwarequalitycontrolproteomicsmicroarraymassspectrometryvisualizationreportwriting
2 stars 6.03 score 2 dependents
bioc
FELLA:Interpretation and enrichment for metabolomics data
Enrichment of metabolomics data using KEGG entries. Given a set of affected compounds, FELLA suggests affected reactions, enzymes, modules and pathways using label propagation in a knowledge model network. The resulting subnetwork can be visualised and exported.
Maintained by Sergio Picart-Armada. Last updated 5 months ago.
softwaremetabolomicsgraphandnetworkkegggopathwaysnetworknetworkenrichment
6.01 score 32 scripts
ltorgo
performanceEstimation:An Infra-Structure for Performance Estimation of Predictive Models
An infra-structure for estimating the predictive performance of predictive models. In this context, it can also be used to compare and/or select among different alternative ways of solving one or more predictive tasks. The main goal of the package is to provide a generic infra-structure to estimate the values of different metrics of predictive performance using different estimation procedures. These estimation tasks can be applied to any solutions (workflows) to the predictive tasks. The package provides easy-to-use standard workflows that allow the usage of any available R modeling algorithm together with some pre-defined data pre-processing steps and also prediction post-processing methods. It also provides means for addressing issues related to the statistical significance of the observed differences.
Maintained by Luis Torgo. Last updated 8 years ago.
16 stars 5.97 score 195 scripts 1 dependents
jtimonen
lgpr:Longitudinal Gaussian Process Regression
Interpretable nonparametric modeling of longitudinal data using additive Gaussian process regression. Contains functionality for inferring covariate effects and assessing covariate relevances. Models are specified using a convenient formula syntax, and can include shared, group-specific, non-stationary, heterogeneous and temporally uncertain effects. Bayesian inference for model parameters is performed using 'Stan'. The modeling approach and methods are described in detail in Timonen et al. (2021) <doi:10.1093/bioinformatics/btab021>.
Maintained by Juho Timonen. Last updated 7 months ago.
bayesian-inferencegaussian-processeslongitudinal-datastancpp
25 stars 5.94 score 69 scripts
comeetie
greed:Clustering and Model Selection with the Integrated Classification Likelihood
An ensemble of algorithms that enable the clustering of networks and data matrices (such as counts, categorical or continuous) with different types of generative models. Model selection and clustering are performed in combination by optimizing the Integrated Classification Likelihood (which is equivalent to minimizing the description length). Several models are available, such as: Stochastic Block Model, degree corrected Stochastic Block Model, Mixtures of Multinomial, Latent Block Model. The optimization is performed thanks to a combination of greedy local search and a genetic algorithm (see <arXiv:2002.11577> for more details).
Maintained by Etienne Côme. Last updated 2 years ago.
14 stars 5.94 score 41 scripts
bioc
normr:Normalization and difference calling in ChIP-seq data
Robust normalization and difference calling procedures for ChIP-seq and similar data. Read counts are modeled jointly as a binomial mixture model with a user-specified number of components. A fitted background estimate accounts for the effect of enrichment in certain regions and, therefore, represents an appropriate null hypothesis. This robust background is used to identify significantly enriched or depleted regions.
Maintained by Johannes Helmuth. Last updated 5 months ago.
bayesiandifferentialpeakcallingclassificationdataimportchipseqripseqfunctionalgenomicsgeneticsmultiplecomparisonnormalizationpeakdetectionpreprocessingalignmentcppopenmp
11 stars 5.93 score 13 scripts
jeswheel
panelPomp:Inference for Panel Partially Observed Markov Processes
Data analysis based on panel partially-observed Markov process (PanelPOMP) models. To implement such models, simulate them and fit them to panel data, 'panelPomp' extends some of the facilities provided for time series data by the 'pomp' package. Implemented methods include filtering (panel particle filtering) and maximum likelihood estimation (Panel Iterated Filtering) as proposed in Breto, Ionides and King (2020) "Panel Data Analysis via Mechanistic Models" <doi:10.1080/01621459.2019.1604367>.
Maintained by Jesse Wheeler. Last updated 4 months ago.
5.91 score 45 scripts
bozenne
BuyseTest:Generalized Pairwise Comparisons
Implementation of the Generalized Pairwise Comparisons (GPC) as defined in Buyse (2010) <doi:10.1002/sim.3923> for complete observations, and extended in Peron (2018) <doi:10.1177/0962280216658320> to deal with right-censoring. GPC compare two groups of observations (intervention vs. control group) regarding several prioritized endpoints to estimate the probability that a random observation drawn from one group performs better/worse/equivalently than a random observation drawn from the other group. Summary statistics such as the net treatment benefit, win ratio, or win odds are then deduced from these probabilities. Confidence intervals and p-values are obtained based on asymptotic results (Ozenne 2021 <doi:10.1177/09622802211037067>), non-parametric bootstrap, or permutations. The software enables the use of thresholds of minimal importance difference, stratification, non-prioritized endpoints (O'Brien test), and can handle right-censoring and competing-risks.
Maintained by Brice Ozenne. Last updated 16 days ago.
generalized-pairwise-comparisonsnon-parametricstatisticscpp
5 stars 5.91 score 90 scripts
bioc
fabia:FABIA: Factor Analysis for Bicluster Acquisition
Biclustering by "Factor Analysis for Bicluster Acquisition" (FABIA). FABIA is a model-based technique for biclustering, that is, clustering rows and columns simultaneously. Biclusters are found by factor analysis where both the factors and the loading matrix are sparse. FABIA is a multiplicative model that extracts linear dependencies between samples and feature patterns. It captures realistic non-Gaussian data distributions with heavy tails as observed in gene expression measurements. FABIA utilizes well-understood model selection techniques like the EM algorithm and variational approaches and is embedded into a Bayesian framework. FABIA ranks biclusters according to their information content and separates spurious biclusters from true biclusters. The code is written in C.
Maintained by Andreas Mitterecker. Last updated 5 months ago.
statisticalmethodmicroarraydifferentialexpressionmultiplecomparisonclusteringvisualization
5.84 score 32 scripts 6 dependents
tobiaskley
quantspec:Quantile-Based Spectral Analysis of Time Series
Methods to determine, smooth and plot quantile periodograms for univariate and multivariate time series.
Maintained by Tobias Kley. Last updated 9 years ago.
10 stars 5.84 score 46 scripts 1 dependents
cran
flexclust:Flexible Cluster Algorithms
The main function kcca implements a general framework for k-centroids cluster analysis supporting arbitrary distance measures and centroid computation. Further cluster methods include hard competitive learning, neural gas, and QT clustering. There are numerous visualization methods for cluster results (neighborhood graphs, convex cluster hulls, barcharts of centroids, ...), and bootstrap methods for the analysis of cluster stability.
Maintained by Bettina Grün. Last updated 28 days ago.
3 stars 5.81 score 52 dependents
reginalexavier
OpenLand:Quantitative Analysis and Visualization of LUCC
Tools for the analysis of land use and cover (LUC) time series. It includes support for loading spatiotemporal raster data and synthesized spatial plotting. Several LUC change (LUCC) metrics in regular or irregular time intervals can be extracted and visualized through one- and multistep sankey and chord diagrams. A complete intensity analysis according to Aldwaik and Pontius (2012) <doi:10.1016/j.landurbplan.2012.02.010> is implemented, including tools for the generation of standardized multilevel output graphics.
Maintained by Reginal Exavier. Last updated 11 months ago.
geographygeospatialintensity-analysisland-use-and-land-cover-changeluc-mapslulcplotrasters
22 stars 5.80 score 19 scripts
merck
gMCPLite:Lightweight Graph Based Multiple Comparison Procedures
A lightweight fork of 'gMCP' with functions for graphically described multiple test procedures introduced in Bretz et al. (2009) <doi:10.1002/sim.3495> and Bretz et al. (2011) <doi:10.1002/bimj.201000239>. Implements a flexible function using 'ggplot2' to create multiplicity graph visualizations. Contains instructions for multiplicity graphs and graphical testing for group sequential design, described in Maurer and Bretz (2013) <doi:10.1080/19466315.2013.807748>, with necessary unit testing using 'testthat'.
Maintained by Nan Xiao. Last updated 1 year ago.
11 stars 5.79 score 14 scripts
hojsgaard
gRim:Graphical Interaction Models
Provides the following types of models: models for contingency tables (i.e. log-linear models); graphical Gaussian models for multivariate normal data (i.e. covariance selection models); and mixed interaction models. Documentation about 'gRim' is provided by vignettes included in this package and the book by Højsgaard, Edwards and Lauritzen (2012, <doi:10.1007/978-1-4614-2299-0>); see 'citation("gRim")' for details.
Maintained by Søren Højsgaard. Last updated 5 months ago.
2 stars 5.77 score 74 scripts
stewid
EpiContactTrace:Epidemiological Tool for Contact Tracing
Routines for epidemiological contact tracing and visualisation of networks of contacts.
Maintained by Stefan Widgren. Last updated 6 months ago.
11 stars 5.74 score 42 scripts
bioc
debCAM:Deconvolution by Convex Analysis of Mixtures
An R package for fully unsupervised deconvolution of complex tissues. It provides basic functions to perform unsupervised deconvolution on mixture expression profiles by Convex Analysis of Mixtures (CAM) and some auxiliary functions to help understand the subpopulation-specific results. It also implements functions to perform supervised deconvolution based on prior knowledge of molecular markers, S matrix or A matrix. Combining molecular markers from CAM and from prior knowledge can achieve semi-supervised deconvolution of mixtures.
Maintained by Lulu Chen. Last updated 5 months ago.
softwarecellbiologygeneexpressionopenjdk
7 stars 5.69 score 14 scripts
bioc
metabCombiner:Method for Combining LC-MS Metabolomics Feature Measurements
This package aligns LC-HRMS metabolomics datasets acquired from biologically similar specimens analyzed under similar, but not necessarily identical, conditions. Peak-picked and simply aligned metabolomics feature tables (consisting of m/z, rt, and per-sample abundance measurements, plus optional identifiers & adduct annotations) are accepted as input. The package outputs a combined table of feature pair alignments, organized into groups of similar m/z, and ranked by a similarity score. Input tables are assumed to be acquired using similar (but not necessarily identical) analytical methods.
Maintained by Hani Habra. Last updated 5 months ago.
softwaremassspectrometrymetabolomicsmass-spectrometry
10 stars 5.65 score 5 scripts