Showing 200 of 395 results
stan-dev
rstan:R Interface to Stan
User-facing R functions are provided to parse, compile, test, estimate, and analyze Stan models by accessing the header-only Stan library provided by the 'StanHeaders' package. The Stan project develops a probabilistic programming language that implements full Bayesian statistical inference via Markov Chain Monte Carlo, rough Bayesian inference via 'variational' approximation, and (optionally penalized) maximum likelihood estimation via optimization. In all three cases, automatic differentiation is used to quickly and accurately evaluate gradients without burdening the user with the need to derive the partial derivatives.
Maintained by Ben Goodrich. Last updated 4 days ago.
bayesian-data-analysis bayesian-inference bayesian-statistics mcmc stan cpp
1.1k stars 18.84 score 14k scripts 281 dependents
edzer
sp:Classes and Methods for Spatial Data
Classes and methods for spatial data; the classes document where the spatial location information resides, for 2D or 3D data. Utility functions are provided, e.g. for plotting data as maps, spatial selection, as well as methods for retrieving coordinates, for subsetting, print, summary, etc. From this version, 'rgdal', 'maptools', and 'rgeos' are no longer used at all, see <https://r-spatial.org/r/2023/05/15/evolution4.html> for details.
Maintained by Edzer Pebesma. Last updated 2 months ago.
127 stars 18.63 score 35k scripts 1.3k dependents
bioc
Biostrings:Efficient manipulation of biological strings
Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences.
Maintained by Hervé Pagès. Last updated 1 month ago.
sequencematching alignment sequencing genetics dataimport datarepresentation infrastructure bioconductor-package core-package
62 stars 17.77 score 8.6k scripts 1.2k dependents
bioc
GenomicRanges:Representation and manipulation of genomic intervals
The ability to efficiently represent and manipulate genomic annotations and alignments is playing a central role when it comes to analyzing high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.
Maintained by Hervé Pagès. Last updated 4 months ago.
genetics infrastructure datarepresentation sequencing annotation genomeannotation coverage bioconductor-package core-package
44 stars 17.68 score 13k scripts 1.3k dependents
rspatial
terra:Spatial Data Analysis
Methods for spatial data analysis with vector (points, lines, polygons) and raster (grid) data. Methods for vector data include geometric operations such as intersect and buffer. Raster methods include local, focal, global, zonal and geometric operations. The predict and interpolate methods facilitate the use of regression type (interpolation, machine learning) models for spatial prediction, including with satellite remote sensing data. Processing of very large files is supported. See the manual and tutorials on <https://rspatial.org/> to get started. 'terra' replaces the 'raster' package ('terra' can do more, and it is faster and easier to use).
Maintained by Robert J. Hijmans. Last updated 2 days ago.
geospatial raster spatial vector onetbb proj gdal geos cpp
559 stars 17.64 score 17k scripts 855 dependents
rspatial
raster:Geographic Data Analysis and Modeling
Reading, writing, manipulating, analyzing and modeling of spatial data. This package has been superseded by the "terra" package <https://CRAN.R-project.org/package=terra>.
Maintained by Robert J. Hijmans. Last updated 18 hours ago.
163 stars 17.23 score 58k scripts 562 dependents
r-forge
Matrix:Sparse and Dense Matrix Classes and Methods
A rich hierarchy of sparse and dense matrix classes, including general, symmetric, triangular, and diagonal matrices with numeric, logical, or pattern entries. Efficient methods for operating on such matrices, often wrapping the 'BLAS', 'LAPACK', and 'SuiteSparse' libraries.
Maintained by Martin Maechler. Last updated 19 days ago.
1 star 17.23 score 33k scripts 12k dependents
yrosseel
lavaan:Latent Variable Analysis
Fit a variety of latent variable models, including confirmatory factor analysis, structural equation modeling and latent growth curve models.
Maintained by Yves Rosseel. Last updated 2 days ago.
factor-analysis growth-curve-models latent-variables missing-data multilevel-models multivariate-analysis path-analysis psychometrics statistical-modeling structural-equation-modeling
454 stars 16.82 score 8.4k scripts 218 dependents
bioc
GenomeInfoDb:Utilities for manipulating chromosome names, including modifying them to follow a particular naming style
Contains data and functions that define and allow translation between different chromosome sequence naming conventions (e.g., "chr1" versus "1"), including a function that attempts to place sequence names in their natural, rather than lexicographic, order.
Maintained by Hervé Pagès. Last updated 2 months ago.
genetics datarepresentation annotation genomeannotation bioconductor-package core-package
32 stars 16.32 score 1.3k scripts 1.7k dependents
joshuaulrich
quantmod:Quantitative Financial Modelling Framework
Specify, build, trade, and analyse quantitative financial trading strategies.
Maintained by Joshua M. Ulrich. Last updated 26 days ago.
algorithmic-trading charting data-import finance time-series
839 stars 16.17 score 8.1k scripts 343 dependents
bioc
DESeq2:Differential gene expression analysis based on the negative binomial distribution
Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.
Maintained by Michael Love. Last updated 23 days ago.
sequencing rnaseq chipseq geneexpression transcription normalization differentialexpression bayesian regression principalcomponent clustering immunooncology openblas cpp
375 stars 16.11 score 17k scripts 115 dependents
bioc
IRanges:Foundation of integer range manipulation in Bioconductor
Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.
Maintained by Hervé Pagès. Last updated 2 months ago.
infrastructure datarepresentation bioconductor-package core-package
22 stars 16.09 score 2.1k scripts 1.8k dependents
bioc
S4Vectors:Foundation of vector-like and list-like containers in Bioconductor
The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantic of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).
Maintained by Hervé Pagès. Last updated 2 months ago.
infrastructure datarepresentation bioconductor-package core-package
18 stars 16.05 score 1.0k scripts 1.9k dependents
bioc
DelayedArray:A unified framework for working transparently with on-disk and in-memory array-like datasets
Wrapping an array-like object (typically an on-disk object) in a DelayedArray object allows one to perform common array operations on it without loading the object in memory. In order to reduce memory usage and optimize performance, operations on the object are either delayed or executed using a block processing mechanism. Note that this also works on in-memory array-like objects like DataFrame objects (typically with Rle columns), Matrix objects, ordinary arrays, and data frames.
Maintained by Hervé Pagès. Last updated 1 month ago.
infrastructure datarepresentation annotation genomeannotation bioconductor-package core-package u24ca289073
27 stars 15.59 score 538 scripts 1.2k dependents
dankelley
oce:Analysis of Oceanographic Data
Supports the analysis of Oceanographic data, including 'ADCP' measurements, measurements made with 'argo' floats, 'CTD' measurements, sectional data, sea-level time series, coastline and topographic data, etc. Provides specialized functions for calculating seawater properties such as potential temperature in either the 'UNESCO' or 'TEOS-10' equation of state. Produces graphical displays that conform to the conventions of the Oceanographic literature. This package is discussed extensively by Kelley (2018) "Oceanographic Analysis with R" <doi:10.1007/978-1-4939-8844-0>.
Maintained by Dan Kelley. Last updated 1 day ago.
146 stars 15.34 score 4.2k scripts 18 dependents
bioc
AnnotationDbi:Manipulation of SQLite-based annotations in Bioconductor
Implements a user-friendly interface for querying SQLite-based annotation data packages.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
annotation microarray sequencing genomeannotation bioconductor-package core-package
9 stars 15.05 score 3.6k scripts 769 dependents
bioc
DOSE:Disease Ontology Semantic and Enrichment analysis
This package implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively for measuring semantic similarities among DO terms and gene products. Enrichment analyses including hypergeometric model and gene set enrichment analysis are also implemented for discovering disease associations of high-throughput biological data.
Maintained by Guangchuang Yu. Last updated 5 months ago.
annotation visualization multiplecomparison genesetenrichment pathways software disease-ontology enrichment-analysis semantic-similarity
119 stars 14.97 score 2.0k scripts 61 dependents
philchalmers
mirt:Multidimensional Item Response Theory
Analysis of discrete response data using unidimensional and multidimensional item analysis models under the Item Response Theory paradigm (Chalmers (2012) <doi:10.18637/jss.v048.i06>). Exploratory and confirmatory item factor analysis models are estimated with quadrature (EM) or stochastic (MHRM) methods. Confirmatory bi-factor and two-tier models are available for modeling item testlets using dimension reduction EM algorithms, while multiple group analyses and mixed effects designs are included for detecting differential item, bundle, and test functioning, and for modeling item and person covariates. Finally, latent class models such as the DINA, DINO, multidimensional latent class, mixture IRT models, and zero-inflated response models are supported, as well as a wide family of probabilistic unfolding models.
Maintained by Phil Chalmers. Last updated 1 day ago.
212 stars 14.93 score 2.5k scripts 40 dependents
edzer
hexbin:Hexagonal Binning Routines
Binning and plotting functions for hexagonal bins.
Maintained by Edzer Pebesma. Last updated 5 months ago.
37 stars 14.00 score 2.4k scripts 114 dependents
mhahsler
arules:Mining Association Rules and Frequent Itemsets
Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.
Maintained by Michael Hahsler. Last updated 2 months ago.
arules association-rules frequent-itemsets
194 stars 13.99 score 3.3k scripts 28 dependents
biomodhub
biomod2:Ensemble Platform for Species Distribution Modeling
Functions for species distribution modeling, calibration and evaluation, ensemble of models, ensemble forecasting and visualization. The package can consistently run up to 10 single models on a presence/absence (or presence/pseudo-absence) dataset and combine them into ensemble models and ensemble projections. A range of other evaluation and visualisation tools is also available within the package.
Maintained by Maya Guéguen. Last updated 2 days ago.
95 stars 13.85 score 536 scripts 7 dependents
r-dbi
RMySQL:Database Interface and 'MySQL' Driver for R
Legacy 'DBI' interface to 'MySQL' / 'MariaDB' based on old code ported from S-PLUS. A modern 'MySQL' client written in 'C++' is available from the 'RMariaDB' package.
Maintained by Jeroen Ooms. Last updated 2 months ago.
209 stars 13.68 score 3.7k scripts 15 dependents
bbolker
bbmle:Tools for General Maximum Likelihood Estimation
Methods and functions for fitting maximum likelihood models in R. This package modifies and extends the 'mle' classes in the 'stats4' package.
Maintained by Ben Bolker. Last updated 1 month ago.
25 stars 13.36 score 1.4k scripts 117 dependents
brodieg
diffobj:Diffs for R Objects
Generate a colorized diff of two R objects for an intuitive visualization of their differences.
Maintained by Brodie Gaslam. Last updated 3 years ago.
231 stars 13.17 score 107 scripts 494 dependents
biodiverse
unmarked:Models for Data from Unmarked Animals
Fits hierarchical models of animal abundance and occurrence to data collected using survey methods such as point counts, site occupancy sampling, distance sampling, removal sampling, and double observer sampling. Parameters governing the state and observation processes can be modeled as functions of covariates. References: Kellner et al. (2023) <doi:10.1111/2041-210X.14123>, Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.
Maintained by Ken Kellner. Last updated 9 days ago.
4 stars 13.02 score 652 scripts 12 dependents
spedygiorgio
markovchain:Easy Handling Discrete Time Markov Chains
Functions and S4 methods to create and manage discrete-time Markov chains more easily. In addition, functions to perform statistical (fitting and drawing random variates) and probabilistic (analysis of their structural properties) analysis are provided. See Spedicato (2017) <doi:10.32614/RJ-2017-036>. Some functions for continuous-time Markov chains depend on the suggested ctmcd package.
Maintained by Giorgio Alfredo Spedicato. Last updated 5 months ago.
ctmc dtmc markov-chain markov-model r-programming rcpp openblas cpp
104 stars 12.78 score 712 scripts 4 dependents
bioc
rtracklayer:R interface to genome annotation files and the UCSC genome browser
Extensible framework for interacting with multiple genome browsers (currently UCSC built-in) and manipulating annotation tracks in various formats (currently GFF, BED, bedGraph, BED15, WIG, BigWig and 2bit built-in). The user may export/import tracks to/from the supported browsers, as well as query and modify the browser state, such as the current viewport.
Maintained by Michael Lawrence. Last updated 3 days ago.
annotation visualization dataimport zlib openssl curl
12.66 score 6.7k scripts 480 dependents
thibautjombart
adegenet:Exploratory Analysis of Genetic and Genomic Data
Toolset for the exploration of genetic and genomic data. Adegenet provides formal (S4) classes for storing and handling various genetic data, including genetic markers with varying ploidy and hierarchical population structure ('genind' class), alleles counts by populations ('genpop'), and genome-wide SNP data ('genlight'). It also implements original multivariate methods (DAPC, sPCA), graphics, statistical tests, simulation tools, distance and similarity measures, and several spatial methods. A range of both empirical and simulated datasets is also provided to illustrate various methods.
Maintained by Zhian N. Kamvar. Last updated 2 months ago.
182 stars 12.60 score 1.9k scripts 29 dependents
data-cleaning
validate:Data Validation Infrastructure
Declare data validation rules and data quality indicators; confront data with them and analyze or visualize the results. The package supports rules that are per-field, in-record, cross-record or cross-dataset. Rules can be automatically analyzed for rule type and connectivity. Supports checks implied by an SDMX DSD file as well. See also Van der Loo and De Jonge (2018) <doi:10.1002/9781118897126>, Chapter 6 and the JSS paper (2021) <doi:10.18637/jss.v097.i10>.
Maintained by Mark van der Loo. Last updated 24 days ago.
419 stars 12.39 score 448 scripts 8 dependents
melff
memisc:Management of Survey Data and Presentation of Analysis Results
An infrastructure for the management of survey data including value labels, definable missing values, recoding of variables, production of code books, and import of (subsets of) 'SPSS' and 'Stata' files is provided. Further, the package can produce tables and data frames of arbitrary descriptive statistics and (almost) publication-ready tables of regression model estimates, which can be exported to 'LaTeX' and HTML.
Maintained by Martin Elff. Last updated 23 days ago.
46 stars 12.34 score 1.2k scripts 13 dependents
miraisolutions
XLConnect:Excel Connector for R
Provides comprehensive functionality to read, write and format Excel data.
Maintained by Martin Studer. Last updated 30 days ago.
cross-platform excel r-language xlconnect openjdk
130 stars 12.28 score 1.2k scripts 1 dependent
tomoakin
RPostgreSQL:R Interface to the 'PostgreSQL' Database System
Database interface and 'PostgreSQL' driver for 'R'. This package provides a Database Interface 'DBI' compliant driver for 'R' to access 'PostgreSQL' database systems. In order to build and install this package from source, 'PostgreSQL' itself must be present on your system to provide 'PostgreSQL' functionality via its libraries and header files. These files are provided as the 'postgresql-devel' package under some Linux distributions. On 'macOS' and 'Microsoft Windows' systems, the attached 'libpq' library source will be used.
Maintained by Tomoaki Nishiyama. Last updated 15 hours ago.
66 stars 12.11 score 4.5k scripts 19 dependents
r-forge
copula:Multivariate Dependence with Copulas
Classes (S4) of commonly used elliptical, Archimedean, extreme-value and other copula families, as well as their rotations, mixtures and asymmetrizations. Nested Archimedean copulas, related tools and special functions. Methods for density, distribution, random number generation, bivariate dependence measures, Rosenblatt transform, Kendall distribution function, perspective and contour plots. Fitting of copula models with potentially partly fixed parameters, including standard errors. Serial independence tests, copula specification tests (independence, exchangeability, radial symmetry, extreme-value dependence, goodness-of-fit) and model selection based on cross-validation. Empirical copula, smoothed versions, and non-parametric estimators of the Pickands dependence function.
Maintained by Martin Maechler. Last updated 23 days ago.
11.83 score 1.2k scripts 86 dependents
kingaa
pomp:Statistical Inference for Partially Observed Markov Processes
Tools for data analysis with partially observed Markov process (POMP) models (also known as stochastic dynamical systems, hidden Markov models, and nonlinear, non-Gaussian, state-space models). The package provides facilities for implementing POMP models, simulating them, and fitting them to time series data by a variety of frequentist and Bayesian methods. It is also a versatile platform for implementation of inference methods for general POMP models.
Maintained by Aaron A. King. Last updated 8 days ago.
abc b-spline differential-equations dynamical-systems iterated-filtering likelihood likelihood-free markov-chain-monte-carlo markov-model mathematical-modelling measurement-error particle-filter sequential-monte-carlo simulation-based-inference sobol-sequence state-space statistical-inference stochastic-processes time-series openblas
114 stars 11.74 score 1.3k scripts 4 dependents
luca-scr
GA:Genetic Algorithms
Flexible general-purpose toolbox implementing genetic algorithms (GAs) for stochastic optimisation. Binary, real-valued, and permutation representations are available to optimize a fitness function, i.e. a function provided by users depending on their objective function. Several genetic operators are available and can be combined to explore the best settings for the current task. Furthermore, users can define new genetic operators and easily evaluate their performances. Local search using general-purpose optimisation algorithms can be applied stochastically to exploit interesting regions. GAs can be run sequentially or in parallel, using an explicit master-slave parallelisation or a coarse-grain islands approach. For more details see Scrucca (2013) <doi:10.18637/jss.v053.i04> and Scrucca (2017) <doi:10.32614/RJ-2017-008>.
Maintained by Luca Scrucca. Last updated 7 months ago.
genetic-algorithm optimisation cpp
93 stars 11.58 score 624 scripts 52 dependents
bioc
mia:Microbiome analysis
mia implements tools for microbiome analysis based on the SummarizedExperiment, SingleCellExperiment and TreeSummarizedExperiment infrastructure. Data wrangling and analysis in the context of taxonomic data is the main scope. Additional functions for common tasks, such as community indices calculation and summarization, are implemented.
Maintained by Tuomas Borman. Last updated 14 days ago.
microbiome software dataimport analysis bioconductor
52 stars 11.50 score 316 scripts 5 dependents
r-forge
Rmpfr:Interface R to MPFR - Multiple Precision Floating-Point Reliable
Arithmetic (via S4 classes and methods) for arbitrary precision floating point numbers, including transcendental ("special") functions. To this end, the package interfaces to the 'LGPL' licensed 'MPFR' (Multiple Precision Floating-Point Reliable) Library which itself is based on the 'GMP' (GNU Multiple Precision) Library.
Maintained by Martin Maechler. Last updated 4 months ago.
11.30 score 316 scripts 141 dependents
bioc
MAST:Model-based Analysis of Single Cell Transcriptomics
Methods and models for handling zero-inflated single cell assay data.
Maintained by Andrew McDavid. Last updated 5 months ago.
geneexpression differentialexpression genesetenrichment rnaseq transcriptomics singlecell
232 stars 11.28 score 1.8k scripts 5 dependents
fmichonneau
phylobase:Base Package for Phylogenetic Structures and Comparative Data
Provides a base S4 class for comparative methods, incorporating one or more trees and trait data.
Maintained by Francois Michonneau. Last updated 1 year ago.
18 stars 11.10 score 394 scripts 18 dependents
rkillick
changepoint:Methods for Changepoint Detection
Implements various mainstream and specialised changepoint methods for finding single and multiple changepoints within data. Many popular non-parametric and frequentist methods are included. The cpt.mean(), cpt.var(), cpt.meanvar() functions should be your first point of call.
Maintained by Rebecca Killick. Last updated 4 months ago.
133 stars 11.05 score 736 scripts 40 dependents
bioc
DirichletMultinomial:Dirichlet-Multinomial Mixture Model Machine Learning for Microbiome Data
Dirichlet-multinomial mixture models can be used to describe variability in microbial metagenomic data. This package is an interface to code originally made available by Holmes, Harris, and Quince, 2012, PLoS ONE 7(2): 1-15, as discussed further in the man page for this package, ?DirichletMultinomial.
Maintained by Martin Morgan. Last updated 5 months ago.
immunooncology microbiome sequencing clustering classification metagenomics gsl
10 stars 10.91 score 125 scripts 26 dependents
ecmerkle
blavaan:Bayesian Latent Variable Analysis
Fit a variety of Bayesian latent variable models, including confirmatory factor analysis, structural equation models, and latent growth curve models. References: Merkle & Rosseel (2018) <doi:10.18637/jss.v085.i04>; Merkle et al. (2021) <doi:10.18637/jss.v100.i06>.
Maintained by Edgar Merkle. Last updated 9 days ago.
bayesian-statistics factor-analysis growth-curve-models latent-variables missing-data multilevel-models multivariate-analysis path-analysis psychometrics statistical-modeling structural-equation-modeling cpp
92 stars 10.84 score 183 scripts 3 dependents
zdebruine
RcppML:Rcpp Machine Learning Library
Fast machine learning algorithms including matrix factorization and divisive clustering for large sparse and dense matrices.
Maintained by Zach DeBruine. Last updated 2 years ago.
clustering matrix-factorization nmf rcpp rcppeigen sparse-matrix cpp openmp
107 stars 10.66 score 125 scripts 50 dependents
ohdsi
FeatureExtraction:Generating Features for a Cohort
An R interface for generating features for a cohort using data in the Common Data Model. Features can be constructed using default or custom made feature definitions. Furthermore it's possible to aggregate features and get the summary statistics.
Maintained by Ger Inberg. Last updated 8 days ago.
62 stars 10.64 score 209 scripts 2 dependents
valentint
rrcov:Scalable Robust Estimators with High Breakdown Point
Robust Location and Scatter Estimation and Robust Multivariate Analysis with High Breakdown Point: principal component analysis (Filzmoser and Todorov (2013), <doi:10.1016/j.ins.2012.10.017>), linear and quadratic discriminant analysis (Todorov and Pires (2007)), multivariate tests (Todorov and Filzmoser (2010) <doi:10.1016/j.csda.2009.08.015>), outlier detection (Todorov et al. (2010) <doi:10.1007/s11634-010-0075-2>). See also Todorov and Filzmoser (2009) <urn:isbn:978-3838108148>, Todorov and Filzmoser (2010) <doi:10.18637/jss.v032.i03> and Boudt et al. (2019) <doi:10.1007/s11222-019-09869-x>.
Maintained by Valentin Todorov. Last updated 7 months ago.
2 stars 10.57 score 484 scripts 96 dependents
bioc
seqLogo:Sequence logos for DNA sequence alignments
seqLogo takes the position weight matrix of a DNA sequence motif and plots the corresponding sequence logo as introduced by Schneider and Stephens (1990).
Maintained by Robert Ivanek. Last updated 5 months ago.
4 stars 10.57 score 304 scripts 29 dependents
mhahsler
recommenderlab:Lab for Developing and Testing Recommender Algorithms
Provides a research infrastructure to develop and evaluate collaborative filtering recommender algorithms. This includes a sparse representation for user-item matrices, many popular algorithms, top-N recommendations, and cross-validation. Hahsler (2022) <doi:10.48550/arXiv.2205.12371>.
Maintained by Michael Hahsler. Last updated 3 days ago.
collaborative-filtering recommender-system
214 stars 10.42 score 840 scripts 2 dependents
bioc
flowCore:flowCore: Basic structures for flow cytometry data
Provides S4 data structures and basic functions to deal with flow cytometry data.
Maintained by Mike Jiang. Last updated 5 months ago.
immunooncology infrastructure flowcytometry cellbasedassays cpp
10.34 score 1.7k scripts 59 dependents
agrdatasci
gdistance:Distances and Routes on Geographical Grids
Provides classes and functions to calculate various distance measures and routes in heterogeneous geographic spaces represented as grids. The package implements measures to model dispersal histories first presented by van Etten and Hijmans (2010) <doi:10.1371/journal.pone.0012060>. Least-cost distances as well as more complex distances based on (constrained) random walks can be calculated. The distances implemented in the package are used in geographical genetics, accessibility indicators, and may also have applications in other fields of geospatial analysis.
Maintained by Andrew Marx. Last updated 1 year ago.
17 stars 10.34 score 478 scripts 23 dependents
stewid
SimInf:A Framework for Data-Driven Stochastic Disease Spread Simulations
Provides an efficient and very flexible framework to conduct data-driven epidemiological modeling in realistic large scale disease spread simulations. The framework integrates infection dynamics in subpopulations as continuous-time Markov chains using the Gillespie stochastic simulation algorithm and incorporates available data such as births, deaths and movements as scheduled events at predefined time-points. Using C code for the numerical solvers and 'OpenMP' (if available) to divide work over multiple processors ensures high performance when simulating a sample outcome. One of our design goals was to make the package extendable and enable usage of the numerical solvers from other R extension packages in order to facilitate complex epidemiological research. The package contains template models and can be extended with user-defined models. For more details see the paper by Widgren, Bauer, Eriksson and Engblom (2019) <doi:10.18637/jss.v091.i12>. The package also provides functionality to fit models to time series data using the Approximate Bayesian Computation Sequential Monte Carlo ('ABC-SMC') algorithm of Toni and others (2009) <doi:10.1098/rsif.2008.0172>.
Maintained by Stefan Widgren. Last updated 17 days ago.
data-driven epidemiology high-performance-computing markov-chain mathematical-modelling gsl openmp
35 stars 10.09 score 227 scripts
mages
ChainLadder:Statistical Methods and Models for Claims Reserving in General Insurance
Various statistical methods and models which are typically used for the estimation of outstanding claims reserves in general insurance, including those to estimate the claims development result as required under Solvency II.
Maintained by Markus Gesmann. Last updated 2 months ago.
82 stars 10.04 score 196 scripts 2 dependents
ropensci
RNeXML:Semantically Rich I/O for the 'NeXML' Format
Provides access to phyloinformatic data in 'NeXML' format. The package should add new functionality to R, such as the ability to manipulate 'NeXML' objects in more varied and refined ways, and compatibility with 'ape' objects.
Maintained by Carl Boettiger. Last updated 11 months ago.
metadata nexml phylogenetics linked-data
13 stars 9.97 score 100 scripts 19 dependents
bioc
methylumi:Handle Illumina methylation data
This package provides classes for holding and manipulating Illumina methylation data. Based on eSet, it can contain MIAME information, sample information, feature information, and multiple matrices of data. An "intelligent" import function, methylumiR can read the Illumina text files and create a MethyLumiSet. methylumIDAT can directly read raw IDAT files from HumanMethylation27 and HumanMethylation450 microarrays. Normalization, background correction, and quality control features for GoldenGate, Infinium, and Infinium HD arrays are also included.
Maintained by Sean Davis. Last updated 5 months ago.
dnamethylation twochannel preprocessing qualitycontrol cpgisland
9 stars 9.90 score 89 scripts 9 dependents
bioc
snpStats:SnpMatrix and XSnpMatrix classes and methods
Classes and statistical methods for large SNP association studies. This extends the earlier snpMatrix package, allowing for uncertainty in genotypes.
Maintained by David Clayton. Last updated 5 months ago.
microarray snp geneticvariability zlib
9.48 score 674 scripts 20 dependents
sizespectrum
mizer:Dynamic Multi-Species Size Spectrum Modelling
A set of classes and methods to set up and run multi-species, trait based and community size spectrum ecological models, focused on the marine environment.
Maintained by Gustav Delius. Last updated 2 months ago.
ecosystem-model fish-population-dynamics fisheries fisheries-management marine-ecosystem population-dynamics simulation size-structure species-interactions transport-equation cpp
39 stars 9.41 score 207 scripts
reinhardfurrer
spam:SPArse Matrix
Set of functions for sparse matrix algebra. Differences with other sparse matrix packages are: (1) we only support (essentially) one sparse matrix format, (2) based on transparent and simple structure(s), (3) tailored for MCMC calculations within G(M)RF, and (4) it is fast and scalable (with the extension package spam64). Documentation about 'spam' is provided by vignettes included in this package, see also Furrer and Sain (2010) <doi:10.18637/jss.v036.i10>; see 'citation("spam")' for details.
Maintained by Reinhard Furrer. Last updated 2 months ago.
1 stars 9.36 score 420 scripts 439 dependentsbioc
multtest:Resampling-based multiple hypothesis testing
Non-parametric bootstrap and permutation resampling-based multiple testing procedures (including empirical Bayes methods) for controlling the family-wise error rate (FWER), generalized family-wise error rate (gFWER), tail probability of the proportion of false positives (TPPFP), and false discovery rate (FDR). Several choices of bootstrap-based null distribution are implemented (centered, centered and scaled, quantile-transformed). Single-step and step-wise methods are available. Tests based on a variety of t- and F-statistics (including t-statistics based on regression parameters from linear and survival models as well as those based on correlation parameters) are included. When probing hypotheses with t-statistics, users may also select a potentially faster null distribution which is multivariate normal with mean zero and variance covariance matrix derived from the vector influence function. Results are reported in terms of adjusted p-values, confidence regions and test statistic cutoffs. The procedures are directly applicable to identifying differentially expressed genes in DNA microarray experiments.
Maintained by Katherine S. Pollard. Last updated 5 months ago.
microarraydifferentialexpressionmultiplecomparison
9.34 score 932 scripts 136 dependentsbioc
CNEr:CNE Detection and Visualization
Large-scale identification and advanced visualization of sets of conserved noncoding elements.
Maintained by Ge Tan. Last updated 5 months ago.
generegulationvisualizationdataimport
3 stars 9.28 score 35 scripts 19 dependentsbpfaff
urca:Unit Root and Cointegration Tests for Time Series Data
Unit root and cointegration tests encountered in applied econometric analysis are implemented.
Maintained by Bernhard Pfaff. Last updated 10 months ago.
6 stars 8.95 score 1.4k scripts 270 dependentsbioc
marray:Exploratory analysis for two-color spotted microarray data
Class definitions for two-color spotted microarray data. Functions for data input, diagnostic plots, normalization and quality checking.
Maintained by Yee Hwa (Jean) Yang. Last updated 5 months ago.
microarraytwochannelpreprocessing
8.92 score 222 scripts 38 dependentsflr
FLCore:Core Package of FLR, Fisheries Modelling in R
Core classes and methods for FLR, a framework for fisheries modelling and management strategy simulation in R. Developed by a team of fisheries scientists in various countries. More information can be found at <http://flr-project.org/>.
Maintained by Iago Mosqueira. Last updated 8 days ago.
fisheriesflrfisheries-modelling
16 stars 8.78 score 956 scripts 23 dependentsbart1
move:Visualizing and Analyzing Animal Track Data
Contains functions to access movement data stored in 'movebank.org' as well as tools to visualize and statistically analyze animal movement data, including functions to calculate dynamic Brownian Bridge Movement Models. Move helps address movement ecology questions.
Maintained by Bart Kranstauber. Last updated 4 months ago.
8.76 score 690 scripts 3 dependentspilaboratory
sads:Maximum Likelihood Models for Species Abundance Distributions
Maximum likelihood tools to fit and compare models of species abundance distributions and of species rank-abundance distributions.
Maintained by Paulo I. Prado. Last updated 1 years ago.
23 stars 8.66 score 244 scripts 3 dependentsjniedballa
camtrapR:Camera Trap Data Management and Preparation of Occupancy and Spatial Capture-Recapture Analyses
Management of and data extraction from camera trap data in wildlife studies. The package provides a workflow for storing and sorting camera trap photos (and videos), tabulates records of species and individuals, and creates detection/non-detection matrices for occupancy and spatial capture-recapture analyses with great flexibility. In addition, it can visualise species activity data and provides simple mapping functions with GIS export.
Maintained by Juergen Niedballa. Last updated 4 months ago.
occupancy-modelingspatial-capture-recapturewildlife
35 stars 8.65 score 178 scriptsactuaryzhang
cplm:Compound Poisson Linear Models
Likelihood-based and Bayesian methods for various compound Poisson linear models based on Zhang, Yanwei (2013) <doi:10.1007/s11222-012-9343-7>.
Maintained by Yanwei (Wayne) Zhang. Last updated 1 years ago.
16 stars 8.55 score 75 scripts 10 dependentsr-forge
ClassDiscovery:Classes and Methods for "Class Discovery" with Microarrays or Proteomics
Defines the classes used for "class discovery" problems in the OOMPA project (<http://oompa.r-forge.r-project.org/>). Class discovery primarily consists of unsupervised clustering methods with attempts to assess their statistical significance.
Maintained by Kevin R. Coombes. Last updated 2 months ago.
8.53 score 85 scripts 9 dependentsbioc
pwalign:Perform pairwise sequence alignments
The two main functions in the package are pairwiseAlignment() and stringDist(). The former solves (Needleman-Wunsch) global alignment, (Smith-Waterman) local alignment, and (ends-free) overlap alignment problems. The latter computes the Levenshtein edit distance or pairwise alignment score matrix for a set of strings.
Maintained by Hervé Pagès. Last updated 10 days ago.
alignmentsequencematchingsequencinggeneticsbioconductor-package
1 stars 8.48 score 27 scripts 104 dependentstomasfryda
h2o:R Interface for the 'H2O' Scalable Machine Learning Platform
R interface for 'H2O', the scalable open source machine learning platform that offers parallelized implementations of many supervised and unsupervised machine learning algorithms such as Generalized Linear Models (GLM), Gradient Boosting Machines (including XGBoost), Random Forests, Deep Neural Networks (Deep Learning), Stacked Ensembles, Naive Bayes, Generalized Additive Models (GAM), ANOVA GLM, Cox Proportional Hazards, K-Means, PCA, ModelSelection, Word2Vec, as well as a fully automatic machine learning algorithm (H2O AutoML).
Maintained by Tomas Fryda. Last updated 1 years ago.
3 stars 8.20 score 7.8k scripts 11 dependentscran
flexmix:Flexible Mixture Modeling
A general framework for finite mixtures of regression models using the EM algorithm is implemented. The E-step and all data handling are provided, while the M-step can be supplied by the user to easily define new models. Existing drivers implement mixtures of standard linear models, generalized linear models and model-based clustering.
Maintained by Bettina Gruen. Last updated 29 days ago.
5 stars 8.19 score 113 dependentsr-hyperspec
hyperSpec:Work with Hyperspectral Data, i.e. Spectra + Meta Information (Spatial, Time, Concentration, ...)
Comfortable ways to work with hyperspectral data sets, i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each of the spectra. The spectra can be data as obtained in XRF, UV/VIS, Fluorescence, AES, NIR, IR, Raman, NMR, MS, etc. More generally, any data that is recorded over a discretized variable, e.g. absorbance = f(wavelength), stored as a vector of absorbance values for discrete wavelengths is suitable.
Maintained by Claudia Beleites. Last updated 10 months ago.
data-wranglinghyperspectralimaginginfrarednmrramanspectroscopyuv-visxrf
16 stars 8.10 score 233 scripts 2 dependentspolmine
polmineR:Verbs and Nouns for Corpus Analysis
Package for corpus analysis using the Corpus Workbench ('CWB', <https://cwb.sourceforge.io>) as an efficient back end for indexing and querying large corpora. The package offers functionality to flexibly create subcorpora and to carry out basic statistical operations (count, co-occurrences etc.). The original full text of documents can be reconstructed and inspected at any time. Beyond that, the package is intended to serve as an interface to packages implementing advanced statistical procedures. Respective data structures (document-term matrices, term-co-occurrence matrices etc.) can be created based on the indexed corpora.
Maintained by Andreas Blaette. Last updated 1 years ago.
49 stars 7.96 score 311 scriptsbioc
Category:Category Analysis
A collection of tools for performing category (gene set enrichment) analysis.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
annotationgopathwaysgenesetenrichment
7.93 score 183 scripts 16 dependentsr-forge
tuneR:Analysis of Music and Speech
Analyze music and speech, extract features like MFCCs, handle wave files and their representation in various ways, read mp3, read midi, perform steps of a transcription, ... Also contains functions ported from the 'rastamat' 'Matlab' package.
Maintained by Uwe Ligges. Last updated 12 months ago.
7.93 score 1.1k scripts 44 dependentsbiodiverse
ubms:Bayesian Models for Data from Unmarked Animals using 'Stan'
Fit Bayesian hierarchical models of animal abundance and occurrence via the 'rstan' package, the R interface to the 'Stan' C++ library. Supported models include single-season occupancy, dynamic occupancy, and N-mixture abundance models. Covariates on model parameters are specified using a formula-based interface similar to package 'unmarked', while also allowing for estimation of random slope and intercept terms. References: Carpenter et al. (2017) <doi:10.18637/jss.v076.i01>; Fiske and Chandler (2011) <doi:10.18637/jss.v043.i10>.
Maintained by Ken Kellner. Last updated 30 days ago.
distance-samplinghierarchical-modelsn-mixture-modeloccupancystanopenblascpp
36 stars 7.90 score 73 scriptsbioc
siggenes:Multiple Testing using SAM and Efron's Empirical Bayes Approaches
Identification of differentially expressed genes and estimation of the False Discovery Rate (FDR) using both the Significance Analysis of Microarrays (SAM) and the Empirical Bayes Analyses of Microarrays (EBAM).
Maintained by Holger Schwender. Last updated 5 months ago.
multiplecomparisonmicroarraygeneexpressionsnpexonarraydifferentialexpression
7.87 score 74 scripts 34 dependentsbioc
hermes:Preprocessing, analyzing, and reporting of RNA-seq data
Provides classes and functions for quality control, filtering, normalization and differential expression analysis of pre-processed `RNA-seq` data. Data can be imported from `SummarizedExperiment` as well as `matrix` objects and can be annotated from `BioMart`. Filtering for genes with sufficiently high expression or with required annotations, as well as filtering for samples with sufficient correlation to other samples or a sufficient total number of reads, is supported. The standard normalization methods including cpm, rpkm and tpm can be used, and `DESeq2` as well as voom differential expression analyses are available.
Maintained by Daniel Sabanés Bové. Last updated 5 months ago.
rnaseqdifferentialexpressionnormalizationpreprocessingqualitycontrolrna-seqstatistical-engineering
11 stars 7.77 score 48 scripts 1 dependentsbioc
edge:Extraction of Differential Gene Expression
The edge package implements methods for carrying out differential expression analyses of genome-wide gene expression studies. Significance testing using the optimal discovery procedure and generalized likelihood ratio tests (equivalent to F-tests and t-tests) are implemented for general study designs. Special functions are available to facilitate the analysis of common study designs, including time course experiments. Other packages such as sva and qvalue are integrated in edge to provide a wide range of tools for gene expression analysis.
Maintained by John D. Storey. Last updated 5 months ago.
multiplecomparisondifferentialexpressiontimecourseregressiongeneexpressiondataimport
21 stars 7.77 score 62 scriptsopenpharma
crmPack:Object-Oriented Implementation of CRM Designs
Implements a wide range of model-based dose escalation designs, ranging from classical and modern continual reassessment methods (CRMs) based on dose-limiting toxicity endpoints to dual-endpoint designs taking into account a biomarker/efficacy outcome. The focus is on Bayesian inference, making it very easy to set up a new design with its own JAGS code. However, it is also possible to implement 3+3 designs for comparison or models with non-Bayesian estimation. The whole package is written in a modular form in the S4 class system, making it very flexible for adaptation to new models, escalation or stopping rules. Further details are presented in Sabanes Bove et al. (2019) <doi:10.18637/jss.v089.i10>.
Maintained by Daniel Sabanes Bove. Last updated 2 months ago.
21 stars 7.76 score 208 scriptstrackage
trip:Tracking Data
Access and manipulate spatial tracking data, with straightforward coercion from and to other formats. Filter for speed and create time spent maps from tracking data. There are coercion methods to convert between 'trip' and 'ltraj' from 'adehabitatLT', and between 'trip' and 'psp' and 'ppp' from 'spatstat'. Trip objects can be created from raw or grouped data frames, and from types in the 'sp', 'sf', 'amt', 'trackeR', 'mousetrap', and other packages; see Sumner, MD (2011) <https://figshare.utas.edu.au/articles/thesis/The_tag_location_problem/23209538>.
Maintained by Michael D. Sumner. Last updated 9 months ago.
13 stars 7.72 score 137 scripts 1 dependentsblue-matter
MSEtool:Management Strategy Evaluation Toolkit
Development, simulation testing, and implementation of management procedures for fisheries (see Carruthers & Hordyk (2018) <doi:10.1111/2041-210X.13081>).
Maintained by Adrian Hordyk. Last updated 3 days ago.
8 stars 7.71 score 163 scripts 3 dependentsbsaul
geex:An API for M-Estimation
Provides a general, flexible framework for estimating parameters and empirical sandwich variance estimator from a set of unbiased estimating equations (i.e., M-estimation in the vein of Stefanski & Boos (2002) <doi:10.1198/000313002753631330>). All examples from Stefanski & Boos (2002) are published in the corresponding Journal of Statistical Software paper "The Calculus of M-Estimation in R with geex" by Saul & Hudgens (2020) <doi:10.18637/jss.v092.i02>. Also provides an API to compute finite-sample variance corrections.
Maintained by Bradley Saul. Last updated 11 months ago.
asymptoticscovariance-estimatescovariance-estimationestimate-parametersestimating-equationsestimationinferencem-estimationrobustsandwich
8 stars 7.70 score 131 scripts 2 dependentsadamlilith
fasterRaster:Faster Raster and Spatial Vector Processing Using 'GRASS GIS'
Processing of large-in-memory/large-on-disk rasters and spatial vectors using 'GRASS GIS' <https://grass.osgeo.org/>. Most functions in the 'terra' package are recreated. Processing of medium-sized and smaller spatial objects will nearly always be faster using 'terra' or 'sf', but for large-in-memory/large-on-disk objects, 'fasterRaster' may be faster. To use most of the functions, you must have the stand-alone version (not the 'OSGeo4W' installer version) of 'GRASS GIS' 8.0 or higher.
Maintained by Adam B. Smith. Last updated 2 days ago.
aspectdistancefragmentationfragmentation-indicesgisgrassgrass-gisrasterraster-projectionrasterizeslopetopographyvectorization
57 stars 7.68 score 8 scriptsltorgo
DMwR2:Functions and Data for the Second Edition of "Data Mining with R"
Functions and data accompanying the second edition of the book "Data Mining with R, learning with case studies" by Luis Torgo, published by CRC Press.
Maintained by Luis Torgo. Last updated 8 years ago.
27 stars 7.64 score 380 scripts 2 dependentswenjie2wang
reda:Recurrent Event Data Analysis
Contains implementations of recurrent event data analysis routines including (1) survival and recurrent event data simulation from a stochastic process point of view by the thinning method proposed by Lewis and Shedler (1979) <doi:10.1002/nav.3800260304> and the inversion method introduced in Cinlar (1975, ISBN:978-0486497976), (2) the mean cumulative function (MCF) estimation by the Nelson-Aalen estimator of the cumulative hazard rate function, (3) two-sample recurrent event responses comparison with the pseudo-score tests proposed by Lawless and Nadeau (1995) <doi:10.2307/1269617>, (4) gamma frailty model with spline rate function following Fu, et al. (2016) <doi:10.1080/10543406.2014.992524>.
Maintained by Wenjie Wang. Last updated 1 years ago.
mcfmean-cumulative-functionrecurrent-eventsurvival-analysiscpp
15 stars 7.52 score 55 scripts 3 dependentstpetzoldt
growthrates:Estimate Growth Rates from Experimental Data
A collection of methods to determine growth rates from experimental data, in particular from batch experiments and plate reader trials.
Maintained by Thomas Petzoldt. Last updated 2 years ago.
27 stars 7.52 score 102 scriptscran
sn:The Skew-Normal and Related Distributions Such as the Skew-t and the SUN
Build and manipulate probability distributions of the skew-normal family and some related ones, notably the skew-t and the SUN families. For the skew-normal and the skew-t distributions, statistical methods are provided for data fitting and model diagnostics, in the univariate and the multivariate case.
Maintained by Adelchi Azzalini. Last updated 2 years ago.
3 stars 7.44 score 92 dependentsssnn-airr
shazam:Immunoglobulin Somatic Hypermutation Analysis
Provides a computational framework for analyzing mutations in immunoglobulin (Ig) sequences. Includes methods for Bayesian estimation of antigen-driven selection pressure, mutational load quantification, building of somatic hypermutation (SHM) models, and model-dependent distance calculations. Also includes empirically derived models of SHM for both mice and humans. Citations: Gupta and Vander Heiden, et al (2015) <doi:10.1093/bioinformatics/btv359>, Yaari, et al (2012) <doi:10.1093/nar/gks457>, Yaari, et al (2013) <doi:10.3389/fimmu.2013.00358>, Cui, et al (2016) <doi:10.4049/jimmunol.1502263>.
Maintained by Susanna Marquez. Last updated 3 months ago.
7.43 score 222 scripts 2 dependentsbioc
cogena:co-expressed gene-set enrichment analysis
cogena is a workflow for co-expressed gene-set enrichment analysis. It aims to discover smaller-scale but highly correlated cellular events that may be of great biological relevance. A novel pipeline for drug discovery and drug repositioning based on the cogena workflow is proposed. In particular, candidate drugs can be predicted based on the gene expression of disease-related data, or other similar drugs can be identified based on the gene expression of drug-related data. Moreover, the drug mode of action can be disclosed by the associated pathway analysis. In summary, cogena is a flexible workflow for various gene set enrichment analyses of co-expressed genes, with a focus on pathway/GO analysis and drug repositioning.
Maintained by Zhilong Jia. Last updated 5 months ago.
clusteringgenesetenrichmentgeneexpressionvisualizationpathwayskegggomicroarraysequencingsystemsbiologydatarepresentationdataimportbioconductorbioinformatics
12 stars 7.36 score 32 scriptschoi-phd
TestDesign:Optimal Test Design Approach to Fixed and Adaptive Test Construction
Uses the optimal test design approach by Birnbaum (1968, ISBN:9781593119348) and van der Linden (2018) <doi:10.1201/9781315117430> to construct fixed, adaptive, and parallel tests. Supports the following mixed-integer programming (MIP) solver packages: 'Rsymphony', 'highs', 'gurobi', 'lpSolve', and 'Rglpk'. The 'gurobi' package is not available from CRAN; see <https://www.gurobi.com/downloads/>.
Maintained by Seung W. Choi. Last updated 6 months ago.
3 stars 7.34 score 37 scripts 2 dependentsargocanada
argoFloats:Analysis of Oceanographic Argo Floats
Supports the analysis of oceanographic data recorded by Argo autonomous drifting profiling floats. Functions are provided to (a) download and cache data files, (b) subset data in various ways, (c) handle quality-control flags and (d) plot the results according to oceanographic conventions. A shiny app is provided for easy exploration of datasets. The package is designed to work well with the 'oce' package, providing a wide range of processing capabilities that are particular to oceanographic analysis. See Kelley, Harbin, and Richards (2021) <doi:10.3389/fmars.2021.635922> for more on the scientific context and applications.
Maintained by Dan Kelley. Last updated 1 months ago.
17 stars 7.32 score 203 scriptsbioc
flowClust:Clustering for Flow Cytometry
Robust model-based clustering using a t-mixture model with Box-Cox transformation. Note: users should have GSL installed. Windows users: 'consult the README file available in the inst directory of the source distribution for necessary configuration instructions'.
Maintained by Greg Finak. Last updated 5 months ago.
immunooncologyclusteringvisualizationflowcytometry
7.30 score 83 scripts 6 dependentsr-forge
pcalg:Methods for Graphical Models and Causal Inference
Functions for causal structure learning and causal inference using graphical models. The main algorithms for causal structure learning are PC (for observational data without hidden variables), FCI and RFCI (for observational data with hidden variables), and GIES (for a mix of data from observational studies (i.e. observational data) and data from experiments involving interventions (i.e. interventional data) without hidden variables). For causal inference the IDA algorithm, the Generalized Backdoor Criterion (GBC), the Generalized Adjustment Criterion (GAC) and some related functions are implemented. Functions for incorporating background knowledge are provided.
Maintained by Markus Kalisch. Last updated 7 months ago.
7.30 score 700 scripts 19 dependentsbioc
qpgraph:Estimation of Genetic and Molecular Regulatory Networks from High-Throughput Genomics Data
Estimate gene and eQTL networks from high-throughput expression and genotyping assays.
Maintained by Robert Castelo. Last updated 3 days ago.
microarraygeneexpressiontranscriptionpathwaysnetworkinferencegraphandnetworkgeneregulationgeneticsgeneticvariabilitysnpsoftwareopenblas
3 stars 7.24 score 20 scripts 3 dependentsropensci
melt:Multiple Empirical Likelihood Tests
Performs multiple empirical likelihood tests. It offers an easy-to-use interface and flexibility in specifying hypotheses and calibration methods, extending the framework to simultaneous inferences. The core computational routines are implemented using the 'Eigen' 'C++' library and 'RcppEigen' interface, with 'OpenMP' for parallel computation. Details of the testing procedures are provided in Kim, MacEachern, and Peruggia (2023) <doi:10.1080/10485252.2023.2206919>. A companion paper by Kim, MacEachern, and Peruggia (2024) <doi:10.18637/jss.v108.i05> is available for further information. This work was supported by the U.S. National Science Foundation under Grants No. SES-1921523 and DMS-2015552.
Maintained by Eunseop Kim. Last updated 11 months ago.
12 stars 7.24 score 84 scriptsvpihur
clValid:Validation of Clustering Results
Statistical and biological validation of clustering results. This package implements Dunn Index, Silhouette, Connectivity, Stability, BHI and BSI. Further information can be found in Brock, G et al. (2008) <doi:10.18637/jss.v025.i04>.
Maintained by Vasyl Pihur. Last updated 4 years ago.
5 stars 7.24 score 422 scripts 14 dependentsdankelley
plan:Tools for Project Planning
Supports the creation of 'burndown' charts and 'gantt' diagrams.
Maintained by Dan Kelley. Last updated 2 years ago.
33 stars 7.23 score 103 scriptsjhorzek
lessSEM:Non-Smooth Regularization for Structural Equation Models
Provides regularized structural equation modeling (regularized SEM) with non-smooth penalty functions (e.g., lasso) building on 'lavaan'. The package is heavily inspired by the ['regsem'](<https://github.com/Rjacobucci/regsem>) and ['lslx'](<https://github.com/psyphh/lslx>) packages.
Maintained by Jannik H. Orzek. Last updated 1 years ago.
lassopsychometricsregularizationregularized-structural-equation-modelsemstructural-equation-modelingopenblascppopenmp
7 stars 7.19 score 223 scriptssahirbhatnagar
casebase:Fitting Flexible Smooth-in-Time Hazards and Risk Functions via Logistic and Multinomial Regression
Fit flexible and fully parametric hazard regression models to survival data with single event type or multiple competing causes via logistic and multinomial regression. Our formulation allows for arbitrary functional forms of time and its interactions with other predictors for time-dependent hazards and hazard ratios. From the fitted hazard model, we provide functions to readily calculate and plot cumulative incidence and survival curves for a given covariate profile. This approach accommodates any log-linear hazard function of prognostic time, treatment, and covariates, and readily allows for non-proportionality. We also provide a plot method for visualizing incidence density via population time plots. Based on the case-base sampling approach of Hanley and Miettinen (2009) <DOI:10.2202/1557-4679.1125>, Saarela and Arjas (2015) <DOI:10.1111/sjos.12125>, and Saarela (2015) <DOI:10.1007/s10985-015-9352-x>.
Maintained by Sahir Bhatnagar. Last updated 7 months ago.
competing-riskscox-regressionregression-modelssurvival-analysis
9 stars 7.16 score 94 scriptsoptad
adoptr:Adaptive Optimal Two-Stage Designs
Optimize one- or two-arm, two-stage designs for clinical trials with respect to several implemented objective criteria or custom objectives. Optimization under uncertainty and conditional (given stage-one outcome) constraints are supported. See Pilz et al. (2019) <doi:10.1002/sim.8291> and Kunzmann et al. (2021) <doi:10.18637/jss.v098.i09> for details.
Maintained by Maximilian Pilz. Last updated 6 months ago.
1 stars 7.09 score 39 scripts 1 dependentsropensci
taxlist:Handling Taxonomic Lists
Handling taxonomic lists through objects of class 'taxlist'. This package provides functions to import species lists from 'Turboveg' (<https://www.synbiosys.alterra.nl/turboveg/>) and the possibility to create backups from resulting R-objects. Quick displays are also implemented as summary methods.
Maintained by Miguel Alvarez. Last updated 6 months ago.
12 stars 7.07 score 81 scripts 2 dependentsspedygiorgio
lifecontingencies:Financial and Actuarial Mathematics for Life Contingencies
Classes and methods that allow the user to manage life tables and actuarial tables (including multiple decrement tables). Functions to easily perform demographic, financial and actuarial calculations for life contingency insurance are also included. See Spedicato (2013) <doi:10.18637/jss.v055.i10>.
Maintained by Giorgio Alfredo Spedicato. Last updated 6 months ago.
actuarialfinanciallife-contingencieslife-insurancecpp
61 stars 7.06 score 156 scriptsleifeld
btergm:Temporal Exponential Random Graph Models by Bootstrapped Pseudolikelihood
Temporal Exponential Random Graph Models (TERGM) estimated by maximum pseudolikelihood with bootstrapped confidence intervals or Markov Chain Monte Carlo maximum likelihood. Goodness of fit assessment for ERGMs, TERGMs, and SAOMs. Micro-level interpretation of ERGMs and TERGMs. The methods are described in Leifeld, Cranmer and Desmarais (2018), JStatSoft <doi:10.18637/jss.v083.i06>.
Maintained by Philip Leifeld. Last updated 10 days ago.
complex-networksdynamic-analysisergmestimationgoodness-of-fitinferencelongitudinal-datanetwork-analysispredictiontergm
18 stars 7.03 score 83 scripts 2 dependentsdoccstat
fastcpd:Fast Change Point Detection via Sequential Gradient Descent
Implements a fast change point detection algorithm based on the paper "Sequential Gradient Descent and Quasi-Newton's Method for Change-Point Analysis" by Xianyang Zhang, Trisha Dawn <https://proceedings.mlr.press/v206/zhang23b.html>. The algorithm is based on dynamic programming with pruning and sequential gradient descent. It is able to detect change points an order of magnitude faster than the vanilla Pruned Exact Linear Time (PELT). The package includes examples of linear regression, logistic regression, Poisson regression, and penalized linear regression data, as well as many more examples with custom cost functions for users who want to supply their own.
Maintained by Xingchi Li. Last updated 11 days ago.
change-point-detectioncppcustom-functiongradient-descentlassolinear-regressionlogistic-regressionofflinepeltpenalized-regressionpoisson-regressionquasi-newtonstatisticstime-serieswarm-startfortranopenblascppopenmp
22 stars 7.00 score 7 scriptsroustant
DiceKriging:Kriging Methods for Computer Experiments
Estimation, validation and prediction of kriging models. Important functions: km, print.km, plot.km, predict.km.
Maintained by Olivier Roustant. Last updated 4 years ago.
4 stars 6.99 score 526 scripts 37 dependentsbioc
affyPLM:Methods for fitting probe-level models
A package that extends and improves the functionality of the base affy package. Its routines make heavy use of compiled code for speed. The central focus is on the implementation of methods for fitting probe-level models and tools using these models. PLM-based quality assessment tools are also included.
Maintained by Ben Bolstad. Last updated 2 months ago.
microarrayonechannelpreprocessingqualitycontrolopenblaszlib
6.99 score 206 scripts 4 dependentsr-forge
oompaBase:Class Unions, Matrix Operations, and Color Schemes for OOMPA
Provides the class unions that must be preloaded in order for the basic tools in the OOMPA (Object-Oriented Microarray and Proteomics Analysis) project to be defined and loaded. It also includes vectorized operations for row-by-row means, variances, and t-tests. Finally, it provides new color schemes. Details on the packages in the OOMPA project can be found at <http://oompa.r-forge.r-project.org/>.
Maintained by Kevin R. Coombes. Last updated 2 months ago.
6.97 score 29 scripts 18 dependentsskoval
RISmed:Download Content from NCBI Databases
A set of tools to extract bibliographic content from the National Center for Biotechnology Information (NCBI) databases, including PubMed. The name RISmed is a portmanteau of RIS (for Research Information Systems, a common tag format for bibliographic data) and PubMed.
Maintained by Stephanie Kovalchik. Last updated 3 years ago.
38 stars 6.94 score 252 scripts 3 dependentsbioc
GOstats:Tools for manipulating GO and microarrays
A set of tools for interacting with GO and microarray data. A variety of basic manipulation tools for graphs, hypothesis testing and other simple calculations.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
annotationgomultiplecomparisongeneexpressionmicroarraypathwaysgenesetenrichmentgraphandnetwork
6.93 score 528 scripts 12 dependentsarchaeostat
ArchaeoPhases:Post-Processing of Markov Chain Monte Carlo Simulations for Chronological Modelling
Statistical analysis of archaeological dates and groups of dates. This package allows post-processing of Markov Chain Monte Carlo (MCMC) simulations from 'ChronoModel' <https://chronomodel.com/>, 'Oxcal' <https://c14.arch.ox.ac.uk/oxcal.html> or 'BCal' <https://bcal.shef.ac.uk/>. It provides functions for the study of rhythms of the long term from the posterior distribution of a series of dates (tempo and activity plot). It also allows the estimation and visualization of time ranges from the posterior distribution of groups of dates (e.g. duration, transition and hiatus between successive phases) as described in Philippe and Vibet (2020) <doi:10.18637/jss.v093.c01>.
Maintained by Anne Philippe. Last updated 12 months ago.
archaeologybayesian-statisticsgeochronologymarkov-chainradiocarbon-dates
10 stars 6.90 score 66 scriptskingaa
ouch:Ornstein-Uhlenbeck Models for Phylogenetic Comparative Hypotheses
Fit and compare Ornstein-Uhlenbeck models for evolution along a phylogenetic tree.
Maintained by Aaron A. King. Last updated 5 months ago.
adaptive-regimebrownian-motionornstein-uhlenbeckornstein-uhlenbeck-modelsouchphylogenetic-comparative-hypothesesphylogenetic-comparative-methodsphylogenetic-datareact
15 stars 6.87 score 68 scripts 4 dependentsflr
FLasher:Projection and Forecasting of Fish Populations, Stocks and Fleets
Projection of future population and fishery dynamics is carried out for a given set of management targets. A system of equations is solved, using Automatic Differentiation (AD), for the levels of effort by fishery (fleet) that will result in the required abundances, catches or fishing mortalities.
Maintained by Iago Mosqueira. Last updated 21 days ago.
2 stars 6.86 score 254 scripts 6 dependentsbioc
GenomicFiles:Distributed computing by file or by range
This package provides infrastructure for parallel computations distributed 'by file' or 'by range'. User defined MAPPER and REDUCER functions provide added flexibility for data combination and manipulation.
Maintained by Bioconductor Package Maintainer. Last updated 5 months ago.
geneticsinfrastructuredataimportsequencingcoverage
6.86 score 89 scripts 16 dependentsingmarvisser
depmixS4:Dependent Mixture Models - Hidden Markov Models of GLMs and Other Distributions in S4
Fits latent (hidden) Markov models on mixed categorical and continuous (time series) data, otherwise known as dependent mixture models, see Visser & Speekenbrink (2010, <DOI:10.18637/jss.v036.i07>).
Maintained by Ingmar Visser. Last updated 4 years ago.
12 stars 6.85 score 308 scripts 4 dependentsludovikcoba
rrecsys:Environment for Evaluating Recommender Systems
Processes standard recommendation datasets (e.g., a user-item rating matrix) as input and generates rating predictions and lists of recommended items. The standard algorithm implementations included in this package are: Global/Item/User-Average baselines, Weighted Slope One, Item-Based KNN, User-Based KNN, FunkSVD, BPR and weighted ALS. They can be assessed according to the standard offline evaluation methodology (Shani, et al. (2011) <doi:10.1007/978-0-387-85820-3_8>) for recommender systems using measures such as MAE, RMSE, Precision, Recall, F1, AUC, NDCG, RankScore and coverage measures. The package (Coba, et al. (2017) <doi:10.1007/978-3-319-60042-0_36>) is intended for rapid prototyping of recommendation algorithms and education purposes.
Maintained by Ludovik Çoba. Last updated 3 years ago.
23 stars 6.84 score 25 scriptsbioc
maser:Mapping Alternative Splicing Events to pRoteins
This package provides functionalities for downstream analysis, annotation and visualization of alternative splicing events generated by rMATS.
Maintained by Diogo F.T. Veiga. Last updated 5 months ago.
alternativesplicingtranscriptomicsvisualization
17 stars 6.74 score 18 scriptsdgerlanc
portfolio:Analysing Equity Portfolios
Classes for analysing and implementing equity portfolios, including routines for generating tradelists and calculating exposures to user-specified risk factors.
Maintained by Daniel Gerlanc. Last updated 7 months ago.
financeportfolio-constructionrisk-modelling
16 stars 6.71 score 106 scriptsbioc
doppelgangR:Identify likely duplicate samples from genomic or meta-data
The main function is doppelgangR(), which takes as minimal input a list of ExpressionSet objects, and searches all list pairs for duplicated samples. The search is based on the genomic data (exprs(eset)), phenotype/clinical data (pData(eset)), and "smoking guns" - supposedly unique identifiers found in pData(eset).
Maintained by Levi Waldron. Last updated 5 months ago.
immunooncologyrnaseqmicroarraygeneexpressionqualitycontrolbioconductor-package
5 stars 6.67 score 31 scriptsbioc
LEA:LEA: an R package for Landscape and Ecological Association Studies
LEA is an R package dedicated to population genomics, landscape genomics and genotype-environment association tests. LEA can run analyses of population structure and genome-wide tests for local adaptation, and also performs imputation of missing genotypes. The package includes statistical methods for estimating ancestry coefficients from large genotypic matrices and for evaluating the number of ancestral populations (snmf). It performs statistical tests using latent factor mixed models for identifying genetic polymorphisms that exhibit association with environmental gradients or phenotypic traits (lfmm2). In addition, LEA computes values of genetic offset statistics based on new or predicted environments (genetic.gap, genetic.offset). LEA is mainly based on optimized programs that can scale with the dimensions of large data sets.
Maintained by Olivier Francois. Last updated 18 days ago.
softwarestatistical methodclusteringregressionopenblas
6.63 score 534 scriptsrobinhankin
spray:Sparse Arrays and Multivariate Polynomials
Sparse arrays interpreted as multivariate polynomials. Uses 'disordR' discipline (Hankin, 2022, <doi:10.48550/ARXIV.2210.03856>). To cite the package in publications please use Hankin (2022) <doi:10.48550/ARXIV.2210.10848>.
Maintained by Robin K. S. Hankin. Last updated 2 months ago.
2 stars 6.62 score 35 scripts 4 dependentsrobinhankin
disordR:Non-Ordered Vectors
Functionality for manipulating values of associative maps. The package is a dependency for mvp-type packages that use the STL map class: it traps plausible idiom that is ill-defined (implementation-specific) and returns an informative error, rather than returning a possibly incorrect result. To cite the package in publications please use Hankin (2022) <doi:10.48550/ARXIV.2210.03856>.
Maintained by Robin K. S. Hankin. Last updated 5 months ago.
1 stars 6.59 score 20 dependentsflr
FLBRP:Reference Points for Fisheries Management
Calculates a range of biological reference points based upon yield per recruit and stock recruit based equilibrium calculations. These include F based reference points like F0.1, FMSY and biomass based reference points like BMSY.
Maintained by Iago Mosqueira. Last updated 4 months ago.
reference pointsfisheriesflrcpp
2 stars 6.58 score 350 scripts 4 dependentsspkaluzny
splus2R:Supplemental S-PLUS Functionality in R
Currently there are many functions in S-PLUS that are missing in R. To facilitate the conversion of S-PLUS packages to R packages, this package provides some missing S-PLUS functionality in R.
Maintained by Stephen Kaluzny. Last updated 1 years ago.
1 stars 6.56 score 82 scripts 30 dependentsfbertran
Cascade:Selection, Reverse-Engineering and Prediction in Cascade Networks
A modeling tool allowing gene selection, reverse engineering, and prediction in cascade networks. Jung, N., Bertrand, F., Bahram, S., Vallat, L., and Maumy-Bertrand, M. (2014) <doi:10.1093/bioinformatics/btt705>.
Maintained by Frederic Bertrand. Last updated 2 years ago.
1 stars 6.56 score 40 scripts 2 dependentsbioc
deepSNV:Detection of subclonal SNVs in deep sequencing data.
This package provides quantitative variant callers for detecting subclonal mutations in ultra-deep (>=100x coverage) sequencing experiments. The deepSNV algorithm is used for a comparative setup with a control experiment of the same loci and uses a beta-binomial model and a likelihood ratio test to discriminate sequencing errors and subclonal SNVs. The shearwater algorithm computes a Bayes classifier based on a beta-binomial model for variant calling with multiple samples for precisely estimating model parameters - such as local error rates and dispersion - and prior knowledge, e.g. from variation databases such as COSMIC.
Maintained by Moritz Gerstung. Last updated 5 months ago.
geneticvariabilitysnpsequencinggeneticsdataimportcurlbzip2xz-utilszlibcpp
6.53 score 38 scripts 1 dependentsyangjasp
optimall:Allocate Samples Among Strata
Functions for the design process of survey sampling, with specific tools for multi-wave and multi-phase designs. Perform optimum allocation using Neyman (1934) <doi:10.2307/2342192> or Wright (2012) <doi:10.1080/00031305.2012.733679> allocation, split strata based on quantiles or values of known variables, randomly select samples from strata, allocate sampling waves iteratively, and organize a complex survey design. Also includes a Shiny application for observing the effects of different strata splits.
Maintained by Jasper Yang. Last updated 1 months ago.
5 stars 6.49 score 39 scriptsr-forge
ClassComparison:Classes and Methods for "Class Comparison" Problems on Microarrays
Defines the classes used for "class comparison" problems in the OOMPA project (<http://oompa.r-forge.r-project.org/>). Class comparison includes tests for differential expression; see Simon's book for details on typical problem types.
Maintained by Kevin R. Coombes. Last updated 2 months ago.
microarraydifferentialexpressionmultiplecomparisons
6.46 score 44 scripts 3 dependentsbstatcomp
bayes4psy:User Friendly Bayesian Data Analysis for Psychology
Contains several Bayesian models for data analysis of psychological tests. A user friendly interface for these models should enable students and researchers to perform professional level Bayesian data analysis without advanced knowledge in programming and Bayesian statistics. This package is based on the Stan platform (Carpenter et al. 2017 <doi:10.18637/jss.v076.i01>).
Maintained by Jure Demšar. Last updated 1 years ago.
14 stars 6.44 score 33 scriptsbioc
PICS:Probabilistic inference of ChIP-seq
Probabilistic inference of ChIP-Seq using an empirical Bayes mixture model approach.
Maintained by Renan Sauteraud. Last updated 2 days ago.
clusteringvisualizationsequencingchipseqgsl
6.43 score 7 scripts 1 dependentsbioc
quantro:A test for when to use quantile normalization
A data-driven test for the assumptions of quantile normalization using raw data such as objects that inherit eSets (e.g. ExpressionSet, MethylSet). Group level information about each sample (such as Tumor / Normal status) must also be provided because the test assesses if there are global differences in the distributions between the user-defined groups.
Maintained by Stephanie Hicks. Last updated 5 months ago.
normalizationpreprocessingmultiplecomparisonmicroarraysequencing
6.40 score 69 scripts 2 dependentsblue-matter
SAMtool:Stock Assessment Methods Toolkit
Simulation tools for closed-loop simulation are provided for the 'MSEtool' operating model to inform data-rich fisheries. 'SAMtool' provides a conditioning model, assessment models of varying complexity with standardized reporting, model-based management procedures, and diagnostic tools for evaluating assessments inside closed-loop simulation.
Maintained by Quang Huynh. Last updated 1 months ago.
3 stars 6.39 score 36 scripts 1 dependentsr-forge
TailRank:The Tail-Rank Statistic
Implements the tail-rank statistic for selecting biomarkers from a microarray data set, an efficient nonparametric test focused on the distributional tails. See <https://gitlab.com/krcoombes/coombeslab/-/blob/master/doc/papers/tolstoy-new.pdf>.
Maintained by Kevin R. Coombes. Last updated 2 months ago.
6.38 score 37 scripts 3 dependentsropensci
QuadratiK:Collection of Methods Constructed using Kernel-Based Quadratic Distances
Includes tests for multivariate normality, tests for uniformity on the d-dimensional sphere, non-parametric two- and k-sample tests, random generation of points from the Poisson kernel-based density, and a clustering algorithm for spherical data. For more information see Saraceno G., Markatou M., Mukhopadhyay R. and Golzy M. (2024) <doi:10.48550/arXiv.2402.02290>, Markatou, M. and Saraceno, G. (2024) <doi:10.48550/arXiv.2407.16374>, Ding, Y., Markatou, M. and Saraceno, G. (2023) <doi:10.5705/ss.202022.0347>, and Golzy, M. and Markatou, M. (2020) <doi:10.1080/10618600.2020.1740713>.
Maintained by Giovanni Saraceno. Last updated 2 months ago.
1 stars 6.36 score 27 scriptsbioc
NanoStringNCTools:NanoString nCounter Tools
Tools for NanoString Technologies nCounter Technology. Provides support for reading RCC files into an ExpressionSet derived object. Also includes methods for QC and normalization of NanoString data.
Maintained by Maddy Griswold. Last updated 5 months ago.
geneexpressiontranscriptioncellbasedassaysdataimporttranscriptomicsproteomicsmrnamicroarrayproprietaryplatformsrnaseq
6.35 score 94 scripts 4 dependentsstc04003
reReg:Recurrent Event Regression
A comprehensive collection of practical and easy-to-use tools for regression analysis of recurrent events, with or without the presence of a (possibly) informative terminal event described in Chiou et al. (2023) <doi:10.18637/jss.v105.i05>. The modeling framework is based on a joint frailty scale-change model, that includes models described in Wang et al. (2001) <doi:10.1198/016214501753209031>, Huang and Wang (2004) <doi:10.1198/016214504000001033>, Xu et al. (2017) <doi:10.1080/01621459.2016.1173557>, and Xu et al. (2019) <doi:10.5705/SS.202018.0224> as special cases. The implemented estimating procedure does not require any parametric assumption on the frailty distribution. The package also allows the users to specify different model forms for both the recurrent event process and the terminal event.
Maintained by Sy Han (Steven) Chiou. Last updated 2 months ago.
23 stars 6.35 score 36 scripts 1 dependentsdfriend21
quadtree:Region Quadtrees for Spatial Data
Provides functionality for working with raster-like quadtrees (also called “region quadtrees”), which allow for variable-sized cells. The package allows for flexibility in the quadtree creation process. Several functions defining how to split and aggregate cells are provided, and custom functions can be written for both of these processes. In addition, quadtrees can be created using other quadtrees as “templates”, so that the new quadtree's structure is identical to the template quadtree. The package also includes functionality for modifying quadtrees, querying values, saving quadtrees to a file, and calculating least-cost paths using the quadtree as a resistance surface.
Maintained by Derek Friend. Last updated 2 years ago.
19 stars 6.34 score 58 scriptscran
fGarch:Rmetrics - Autoregressive Conditional Heteroskedastic Modelling
Analyze and model heteroskedastic behavior in financial time series.
Maintained by Georgi N. Boshnakov. Last updated 1 years ago.
7 stars 6.33 score 51 dependentssmoeding
usl:Analyze System Scalability with the Universal Scalability Law
The Universal Scalability Law (Gunther 2007) <doi:10.1007/978-3-540-31010-5> is a model to predict hardware and software scalability. It uses system capacity as a function of load to forecast the scalability for the system.
Maintained by Stefan Moeding. Last updated 3 years ago.
scalabilityuniversal-scalability-lawusl
36 stars 6.32 score 117 scriptsjensharbers
agricolaeplotr:Visualization of Design of Experiments from the 'agricolae' Package
Visualization of Design of Experiments from the 'agricolae' package with 'ggplot2' framework. The user provides an experiment design from the 'agricolae' package, calls the corresponding function, and receives a visualization with 'ggplot2' based functions that are specific for each design. As there are many different designs, each design is tested on its type. The output can be modified with standard 'ggplot2' commands or with other packages with 'ggplot2' function extensions.
Maintained by Jens Harbers. Last updated 2 months ago.
8 stars 6.27 score 78 scriptslarmarange
prevR:Estimating Regional Trends of a Prevalence from a DHS and Similar Surveys
Spatial estimation of a prevalence surface or a relative risks surface, using data from a Demographic and Health Survey (DHS) or an analog survey, see Larmarange et al. (2011) <doi:10.4000/cybergeo.24606>.
Maintained by Joseph Larmarange. Last updated 6 months ago.
5 stars 6.26 score 46 scriptsbioc
lumi:BeadArray Specific Methods for Illumina Methylation and Expression Microarrays
The lumi package provides an integrated solution for the Illumina microarray data analysis. It includes functions of Illumina BeadStudio (GenomeStudio) data input, quality control, BeadArray-specific variance stabilization, normalization and gene annotation at the probe level. It also includes the functions of processing Illumina methylation microarrays, especially Illumina Infinium methylation microarrays.
Maintained by Lei Huang. Last updated 5 months ago.
microarrayonechannelpreprocessingdnamethylationqualitycontroltwochannel
6.26 score 294 scripts 5 dependentsbioc
VariantFiltering:Filtering of coding and non-coding genetic variants
Filter genetic variants using different criteria such as inheritance model, amino acid change consequence, minor allele frequencies across human populations, splice site strength, conservation, etc.
Maintained by Robert Castelo. Last updated 2 months ago.
geneticshomo_sapiensannotationsnpsequencinghighthroughputsequencing
4 stars 6.23 score 21 scriptsmu-sigma
HVT:Constructing Hierarchical Voronoi Tessellations and Overlay Heatmaps for Data Analysis
Facilitates building topology preserving maps for data analysis.
Maintained by "Mu Sigma, Inc.". Last updated 1 days ago.
4 stars 6.20 score 1 scriptsclarahapp
funData:An S4 Class for Functional Data
S4 classes for univariate and multivariate functional data with utility functions. See <doi:10.18637/jss.v093.i05> for a detailed description of the package functionalities and its interplay with the MFPCA package for multivariate functional principal component analysis <https://CRAN.R-project.org/package=MFPCA>.
Maintained by Clara Happ-Kurz. Last updated 1 years ago.
14 stars 6.15 score 111 scripts 6 dependentsbioc
Pedixplorer:Pedigree Functions
Routines to handle family data with a Pedigree object. The initial purpose was to create correlation structures that describe family relationships such as kinship and identity-by-descent, which can be used to model family data in mixed effects models, such as in the coxme function. Also includes a tool for Pedigree drawing which is focused on producing compact layouts without intervention. Recent additions include utilities to trim the Pedigree object with various criteria, and kinship for the X chromosome.
Maintained by Louis Le Nezet. Last updated 13 days ago.
softwaredatarepresentationgeneticsgraphandnetworkvisualizationkinshippedigree
2 stars 6.08 score 10 scriptsltorgo
performanceEstimation:An Infra-Structure for Performance Estimation of Predictive Models
An infra-structure for estimating the predictive performance of predictive models. In this context, it can also be used to compare and/or select among different alternative ways of solving one or more predictive tasks. The main goal of the package is to provide a generic infra-structure to estimate the values of different metrics of predictive performance using different estimation procedures. These estimation tasks can be applied to any solutions (workflows) to the predictive tasks. The package provides easy to use standard workflows that allow the usage of any available R modeling algorithm together with some pre-defined data pre-processing steps and also prediction post-processing methods. It also provides means for addressing issues related to the statistical significance of the observed differences.
Maintained by Luis Torgo. Last updated 8 years ago.
16 stars 5.97 score 195 scripts 1 dependentsbioc
normr:Normalization and difference calling in ChIP-seq data
Robust normalization and difference calling procedures for ChIP-seq and alike data. Read counts are modeled jointly as a binomial mixture model with a user-specified number of components. A fitted background estimate accounts for the effect of enrichment in certain regions and, therefore, represents an appropriate null hypothesis. This robust background is used to identify significantly enriched or depleted regions.
Maintained by Johannes Helmuth. Last updated 5 months ago.
bayesiandifferentialpeakcallingclassificationdataimportchipseqripseqfunctionalgenomicsgeneticsmultiplecomparisonnormalizationpeakdetectionpreprocessingalignmentcppopenmp
11 stars 5.93 score 13 scriptsbozenne
BuyseTest:Generalized Pairwise Comparisons
Implementation of the Generalized Pairwise Comparisons (GPC) as defined in Buyse (2010) <doi:10.1002/sim.3923> for complete observations, and extended in Peron (2018) <doi:10.1177/0962280216658320> to deal with right-censoring. GPC compares two groups of observations (intervention vs. control group) regarding several prioritized endpoints to estimate the probability that a random observation drawn from one group performs better/worse/equivalently than a random observation drawn from the other group. Summary statistics such as the net treatment benefit, win ratio, or win odds are then deduced from these probabilities. Confidence intervals and p-values are obtained based on asymptotic results (Ozenne 2021 <doi:10.1177/09622802211037067>), non-parametric bootstrap, or permutations. The software enables the use of thresholds of minimal importance difference, stratification, non-prioritized endpoints (O'Brien test), and can handle right-censoring and competing-risks.
Maintained by Brice Ozenne. Last updated 16 days ago.
generalized-pairwise-comparisonsnon-parametricstatisticscpp
5 stars 5.91 score 90 scriptsbioc
scanMiR:scanMiR
A set of tools for working with miRNA affinity models (KdModels), efficiently scanning for miRNA binding sites, and predicting target repression. It supports scanning using miRNA seeds, full miRNA sequences (enabling 3' alignment) and KdModels, and includes the prediction of slicing and TDMD sites. Finally, it includes utility and plotting functions (e.g. for the visual representation of miRNA-target alignment).
Maintained by Pierre-Luc Germain. Last updated 5 months ago.
mirnasequencematchingalignment
5.89 score 52 scripts 1 dependentsbioc
globaltest:Testing Groups of Covariates/Features for Association with a Response Variable, with Applications to Gene Set Testing
The global test tests groups of covariates (or features) for association with a response variable. This package implements the test with diagnostic plots and multiple testing utilities, along with several functions to facilitate the use of this test for gene set testing of GO and KEGG terms.
Maintained by Jelle Goeman. Last updated 5 months ago.
microarrayonechannelbioinformaticsdifferentialexpressiongopathways
5.89 score 79 scripts 6 dependentsropensci
phylotaR:Automated Phylogenetic Sequence Cluster Identification from 'GenBank'
A pipeline for the identification, within taxonomic groups, of orthologous sequence clusters from 'GenBank' <https://www.ncbi.nlm.nih.gov/genbank/> as the first step in a phylogenetic analysis. The pipeline depends on a local alignment search tool and is, therefore, not dependent on differences in gene naming conventions and naming errors.
Maintained by Shixiang Wang. Last updated 8 months ago.
blastngenbankpeer-reviewedphylogeneticssequence-alignment
23 stars 5.86 score 156 scriptsbioc
fabia:FABIA: Factor Analysis for Bicluster Acquisition
Biclustering by "Factor Analysis for Bicluster Acquisition" (FABIA). FABIA is a model-based technique for biclustering, that is clustering rows and columns simultaneously. Biclusters are found by factor analysis where both the factors and the loading matrix are sparse. FABIA is a multiplicative model that extracts linear dependencies between samples and feature patterns. It captures realistic non-Gaussian data distributions with heavy tails as observed in gene expression measurements. FABIA utilizes well understood model selection techniques like the EM algorithm and variational approaches and is embedded into a Bayesian framework. FABIA ranks biclusters according to their information content and separates spurious biclusters from true biclusters. The code is written in C.
Maintained by Andreas Mitterecker. Last updated 5 months ago.
statisticalmethodmicroarraydifferentialexpressionmultiplecomparisonclusteringvisualization
5.84 score 32 scripts 6 dependentscran
flexclust:Flexible Cluster Algorithms
The main function kcca implements a general framework for k-centroids cluster analysis supporting arbitrary distance measures and centroid computation. Further cluster methods include hard competitive learning, neural gas, and QT clustering. There are numerous visualization methods for cluster results (neighborhood graphs, convex cluster hulls, barcharts of centroids, ...), and bootstrap methods for the analysis of cluster stability.
Maintained by Bettina Grün. Last updated 29 days ago.
3 stars 5.81 score 52 dependentsbioc
BindingSiteFinder:Binding site definition based on iCLIP data
Precise knowledge of the binding sites of an RNA-binding protein (RBP) is key to understanding (post-) transcriptional regulatory processes. Here we present a workflow that describes how exact binding sites can be defined from iCLIP data. The package provides functions for binding site definition and result visualization. For details please see the vignette.
Maintained by Mirko Brüggemann. Last updated 9 days ago.
sequencinggeneexpressiongeneregulationfunctionalgenomicscoveragedataimportbinding-site-classificationbinding-sitesbioconductor-packageicliprna-binding-proteins
6 stars 5.80 score 3 scriptssylvainschmitt
rcontroll:Individual-Based Forest Growth Simulator 'TROLL'
'TROLL' is coded in C++ and it typically simulates hundreds of thousands of individuals over hundreds of years. The 'rcontroll' R package is a wrapper of 'TROLL'. 'rcontroll' includes functions that generate inputs for simulations and run simulations. Finally, it is possible to analyse the 'TROLL' outputs through tables, figures, and maps taking advantage of other R visualisation packages. 'rcontroll' also offers the possibility to generate a virtual LiDAR point cloud that corresponds to a snapshot of the simulated forest.
Maintained by Sylvain Schmitt. Last updated 6 months ago.
5 stars 5.76 score 19 scriptsbioc
demuxmix:Demultiplexing oligo-barcoded scRNA-seq data using regression mixture models
A package for demultiplexing single-cell sequencing experiments of pooled cells labeled with barcode oligonucleotides. The package implements methods to fit regression mixture models for a probabilistic classification of cells, including multiplet detection. Demultiplexing error rates can be estimated, and methods for quality control are provided.
Maintained by Hans-Ulrich Klein. Last updated 5 months ago.
singlecellsequencingpreprocessingclassificationregression
5 stars 5.76 score 19 scripts 1 dependentsbioc
qusage:qusage: Quantitative Set Analysis for Gene Expression
This package is an implementation of the Quantitative Set Analysis for Gene Expression (QuSAGE) method described in (Yaari G. et al, Nucl Acids Res, 2013). This is a novel Gene Set Enrichment-type test, which is designed to provide a faster, more accurate, and easier to understand test for gene expression studies. qusage accounts for inter-gene correlations using the Variance Inflation Factor technique proposed by Wu et al. (Nucleic Acids Res, 2012). In addition, rather than simply evaluating the deviation from a null hypothesis with a single number (a P value), qusage quantifies gene set activity with a complete probability density function (PDF). From this PDF, P values and confidence intervals can be easily extracted. Preserving the PDF also allows for post-hoc analysis (e.g., pair-wise comparisons of gene set activity) while maintaining statistical traceability. Finally, while qusage is compatible with individual gene statistics from existing methods (e.g., LIMMA), a Welch-based method is implemented that is shown to improve specificity. The QuSAGE package also includes a mixed effects model implementation, as described in (Turner JA et al, BMC Bioinformatics, 2015), and a meta-analysis framework as described in (Meng H, et al. PLoS Comput Biol. 2019). For questions, contact Chris Bolen (cbolen1@gmail.com) or Steven Kleinstein (steven.kleinstein@yale.edu).
Maintained by Christopher Bolen. Last updated 5 months ago.
genesetenrichmentmicroarrayrnaseqsoftwareimmunooncology
5.65 score 185 scripts 1 dependentsbioc
flowMeans:Non-parametric Flow Cytometry Data Gating
Identifies cell populations in Flow Cytometry data using non-parametric clustering and segmented-regression-based change point detection. Note: R 2.11.0 or newer is required.
Maintained by Nima Aghaeepour. Last updated 5 months ago.
immunooncologyflowcytometrycellbiologyclustering
5.64 score 36 scripts 2 dependentsgraemeleehickey
bayesDP:Implementation of the Bayesian Discount Prior Approach for Clinical Trials
Functions for data augmentation using the Bayesian discount prior method for single arm and two-arm clinical trials, as described in Haddad et al. (2017) <doi:10.1080/10543406.2017.1300907>. The discount power prior methodology was developed in collaboration with the Medical Device Innovation Consortium (MDIC) Computer Modeling & Simulation Working Group.
Maintained by Graeme L. Hickey. Last updated 3 months ago.
bayesianbayesian-inferencebayesian-statisticsclinical-trialsmdicposterior-predictiveposterior-probabilityprior-distributionopenblascpp
5.56 score 20 scripts 1 dependentsquantsulting
ghyp:Generalized Hyperbolic Distribution and Its Special Cases
Detailed functionality for working with the univariate and multivariate Generalized Hyperbolic distribution and its special cases (Hyperbolic (hyp), Normal Inverse Gaussian (NIG), Variance Gamma (VG), skewed Student-t and Gaussian distribution). Especially, it contains fitting procedures, an AIC-based model selection routine, and functions for the computation of density, quantile, probability, random variates, expected shortfall and some portfolio optimization and plotting routines as well as the likelihood ratio test. In addition, it contains the Generalized Inverse Gaussian distribution. See Chapter 3 of A. J. McNeil, R. Frey, and P. Embrechts. Quantitative risk management: Concepts, techniques and tools. Princeton University Press, Princeton (2005).
Maintained by Marc Weibel. Last updated 7 months ago.
5.55 score 90 scripts 8 dependentsbayesplay
bayesplay:The Bayes Factor Playground
A lightweight modelling syntax for defining likelihoods and priors and for computing Bayes factors for simple one parameter models. It includes functionality for computing and plotting priors, likelihoods, and model predictions. Additional functionality is included for computing and plotting posteriors.
Maintained by Lincoln John Colling. Last updated 1 years ago.
bayesbayesianbayesian-statistics
6 stars 5.54 score 23 scriptsjeffreyhanson
raptr:Representative and Adequate Prioritization Toolkit in R
Biodiversity is in crisis. The overarching aim of conservation is to preserve biodiversity patterns and processes. To this end, protected areas are established to buffer species and preserve biodiversity processes. But resources are limited and so protected areas must be cost-effective. This package contains tools to generate plans for protected areas (prioritizations), using spatially explicit targets for biodiversity patterns and processes. To obtain solutions in a feasible amount of time, this package uses the commercial 'Gurobi' software (obtained from <https://www.gurobi.com/>). For more information on using this package, see Hanson et al. (2018) <doi:10.1111/2041-210X.12862>.
Maintained by Jeffrey O Hanson. Last updated 1 years ago.
8 stars 5.52 score 83 scriptssleire
etrm:Energy Trading and Risk Management
Provides a collection of functions to perform core tasks within Energy Trading and Risk Management (ETRM). Calculation of maximum smoothness forward price curves for electricity and natural gas contracts with flow delivery, as presented in F. E. Benth, S. Koekebakker, and F. Ollmar (2007) <doi:10.3905/jod.2007.694791> and F. E. Benth, J. S. Benth, and S. Koekebakker (2008) <doi:10.1142/6811>. Portfolio insurance trading strategies for price risk management in the forward market, see F. Black (1976) <doi:10.1016/0304-405X(76)90024-6>, T. Bjork (2009) <https://EconPapers.repec.org/RePEc:oxp:obooks:9780199574742>, F. Black and R. W. Jones (1987) <doi:10.3905/jpm.1987.409131> and H. E. Leland (1980) <http://www.jstor.org/stable/2327419>.
Maintained by Anders D. Sleire. Last updated 2 years ago.
commoditiesenergy-tradingrisk-managementtrading-strategies
33 stars 5.52 score 10 scriptsdgerlanc
backtest:Exploring Portfolio-Based Conjectures About Financial Instruments
The backtest package provides facilities for exploring portfolio-based conjectures about financial instruments (stocks, bonds, swaps, options, et cetera).
Maintained by Daniel Gerlanc. Last updated 10 years ago.
20 stars 5.52 score 33 scripts
luciu5
antitrust:Tools for Antitrust Practitioners
A collection of tools for antitrust practitioners, including the ability to calibrate different consumer demand systems and simulate the effects of mergers under different competitive regimes.
Maintained by Charles Taragin. Last updated 6 months ago.
5 stars 5.51 score 36 scripts 2 dependents
fmmgroupva
FMM:Rhythmic Patterns Modeling by FMM Models
Provides a collection of functions to fit and explore single, multi-component and restricted Frequency Modulated Moebius (FMM) models. 'FMM' is a nonlinear parametric regression model capable of fitting non-sinusoidal shapes in rhythmic patterns. Details about the mathematical formulation of 'FMM' models can be found in Rueda et al. (2019) <doi:10.1038/s41598-019-54569-1>.
Maintained by Itziar Fernandez. Last updated 3 days ago.
2 stars 5.48 score
blasif
cocons:Covariate-Based Covariance Functions for Nonstationary Spatial Modeling
Estimation, prediction, and simulation of nonstationary Gaussian processes with modular covariate-based covariance functions. Sources of nonstationarity, such as spatial mean, variance, geometric anisotropy, smoothness, and nugget, can be considered based on spatial characteristics. An induced compactly supported nonstationary covariance function is provided, enabling fast and memory-efficient computations when handling densely sampled domains.
Maintained by Federico Blasi. Last updated 2 months ago.
covariance-matrix cpp estimation gaussian-processes large-dataset nonstationarity optimization prediction cpp
3 stars 5.48 score 1 script
bioc
specL:specL - Prepare Peptide Spectrum Matches for Use in Targeted Proteomics
Provides functions for generating spectra libraries that can be used for MRM SRM MS workflows in proteomics. The package provides a BiblioSpec reader, a function which can add the protein information using a FASTA formatted amino acid file, and an export method for using the created library in the Spectronaut software. The package is developed, tested and used at the Functional Genomics Center Zurich <https://fgcz.ch>.
Maintained by Christian Panse. Last updated 5 months ago.
massspectrometry proteomics dda dia mass-spectrometry
1 star 5.46 score 12 scripts
r-forge
fRegression:Rmetrics - Regression Based Decision and Prediction
A collection of functions for linear and non-linear regression modelling. It implements a wrapper for several regression models available in the base and contributed packages of R.
Maintained by Paul J. Northrop. Last updated 9 days ago.
1 star 5.44 score 23 scripts
ssnn-airr
scoper:Spectral Clustering-Based Method for Identifying B Cell Clones
Provides a computational framework for identification of B cell clones from Adaptive Immune Receptor Repertoire sequencing (AIRR-Seq) data. Three main functions are included (identicalClones, hierarchicalClones, and spectralClones) that perform clustering among sequences of BCRs/IGs (B cell receptors/immunoglobulins) which share the same V gene, J gene and junction length. Nouri N and Kleinstein SH (2018) <doi:10.1093/bioinformatics/bty235>. Nouri N and Kleinstein SH (2019) <doi:10.1101/788620>. Gupta NT, et al. (2017) <doi:10.4049/jimmunol.1601850>.
Maintained by Susanna Marquez. Last updated 2 months ago.
5.43 score 89 scripts
staffanbetner
rethinking:Statistical Rethinking book package
Utilities for fitting and comparing models
Maintained by Richard McElreath. Last updated 4 months ago.
5.42 score 4.4k scripts
simonmoulds
lulcc:Land Use Change Modelling in R
Classes and methods for spatially explicit land use change modelling in R.
Maintained by Simon Moulds. Last updated 5 years ago.
41 stars 5.37 score 38 scripts
r-forge
R2MLwiN:Running 'MLwiN' from Within R
An R command interface to the 'MLwiN' multilevel modelling software package.
Maintained by Zhengzheng Zhang. Last updated 9 days ago.
5.35 score 125 scripts
neotomadb
neotoma2:Working with the Neotoma Paleoecology Database
Access and manipulation of data using the Neotoma Paleoecology Database. <https://api.neotomadb.org/api-docs/>.
Maintained by Dominguez Vidana Socorro. Last updated 8 months ago.
earthcube neotoma nsf paleoecology
8 stars 5.35 score 56 scripts
centerforstatistics-ugent
pim:Fit Probabilistic Index Models
Fit a probabilistic index model as described in Thas et al. (2012) <doi:10.1111/j.1467-9868.2011.01020.x>. The interface to the modeling function has changed in this new version. The old version is still available at R-Forge.
Maintained by Joris Meys. Last updated 3 months ago.
10 stars 5.33 score 43 scripts
eglenn
acs:Download, Manipulate, and Present American Community Survey and Decennial Data from the US Census
Provides a general toolkit for downloading, managing, analyzing, and presenting data from the U.S. Census (<https://www.census.gov/data/developers/data-sets.html>), including SF1 (Decennial short-form), SF3 (Decennial long-form), and the American Community Survey (ACS). Confidence intervals provided with ACS data are converted to standard errors to be bundled with estimates in complex acs objects. Package provides new methods to conduct standard operations on acs objects and present/plot data in statistically appropriate ways.
Maintained by Ezra Haber Glenn. Last updated 6 years ago.
11 stars 5.33 score 430 scripts 3 dependents
fukayak
occumb:Site Occupancy Modeling for Environmental DNA Metabarcoding
Fits multispecies site occupancy models to environmental DNA metabarcoding data collected using a spatially replicated survey design. Model fitting results can be used to evaluate and compare the effectiveness of species detection to find an efficient survey design. Reference: Fukaya et al. (2022) <doi:10.1111/2041-210X.13732>.
Maintained by Keiichi Fukaya. Last updated 2 months ago.
2 stars 5.30 score 10 scripts
pedersen-fisheries-lab
sspm:Spatial Surplus Production Model Framework for Northern Shrimp Populations
Implements a GAM-based (Generalized Additive Models) spatial surplus production model (spatial SPM), aimed at modeling northern shrimp populations in Atlantic Canada but potentially applicable to any stock in any location. The package is opinionated in its implementation of SPMs as it internally makes the choice to use penalized spatial GAMs with time lags. However, it also aims to provide options for the user to customize their model. The methods are described in Pedersen et al. (2022, <https://www.dfo-mpo.gc.ca/csas-sccs/Publications/ResDocs-DocRech/2022/2022_062-eng.html>).
Maintained by Valentin Lucet. Last updated 2 months ago.
3 stars 5.28 score 21 scripts
kkawato
rdlearn:Safe Policy Learning under Regression Discontinuity Design with Multiple Cutoffs
Implements safe policy learning under regression discontinuity designs with multiple cutoffs, based on Zhang et al. (2022) <doi:10.48550/arXiv.2208.13323>. The learned cutoffs are guaranteed to perform no worse than the existing cutoffs in terms of overall outcomes. The 'rdlearn' package also includes features for visualizing the learned cutoffs relative to the baseline and conducting sensitivity analyses.
Maintained by Kentaro Kawato. Last updated 1 month ago.
1 star 5.23 score 4 scripts
bioc
HiTC:High Throughput Chromosome Conformation Capture analysis
The HiTC package was developed to explore high-throughput 'C' data such as 5C or Hi-C. Dedicated R classes as well as standard methods for quality controls, normalization, visualization, and further analysis are also provided.
Maintained by Nicolas Servant. Last updated 5 months ago.
sequencing highthroughputsequencing hic
5.23 score 42 scripts
cran
ICS:Tools for Exploring Multivariate Data via ICS/ICA
Implementation of Tyler, Critchley, Duembgen and Oja's (JRSS B, 2009, <doi:10.1111/j.1467-9868.2009.00706.x>) and Oja, Sirkia and Eriksson's (AJS, 2006, <https://www.ajs.or.at/index.php/ajs/article/view/vol35,%20no2%263%20-%207>) method of two different scatter matrices to obtain an invariant coordinate system or independent components, depending on the underlying assumptions.
Maintained by Klaus Nordhausen. Last updated 10 days ago.
5.20 score 17 dependents
bioc
ASICS:Automatic Statistical Identification in Complex Spectra
With a set of pure metabolite reference spectra, ASICS quantifies concentration of metabolites in a complex spectrum. The identification of metabolites is performed by fitting a mixture model to the spectra of the library with a sparse penalty. The method and its statistical properties are described in Tardivel et al. (2017) <doi:10.1007/s11306-017-1244-5>.
Maintained by Gaëlle Lefort. Last updated 5 months ago.
software dataimport cheminformatics metabolomics
5.18 score 30 scripts
cran
aod:Analysis of Overdispersed Data
Provides a set of functions to analyse overdispersed counts or proportions. Most of the methods are already available elsewhere but are scattered in different packages. The proposed functions should be considered as complements to more sophisticated methods such as generalized estimating equations (GEE) or generalized linear mixed effect models (GLMM).
Maintained by Renaud Lancelot. Last updated 1 year ago.
3 stars 5.15 score 15 dependents
bioc
CMA:Synthesis of microarray-based classification
This package provides a comprehensive collection of various microarray-based classification algorithms both from Machine Learning and Statistics. Variable Selection, Hyperparameter tuning, Evaluation and Comparison can be performed combined or stepwise in a user-friendly environment.
Maintained by Roman Hornung. Last updated 5 months ago.
5.09 score 61 scripts
bioc
topdownr:Investigation of Fragmentation Conditions in Top-Down Proteomics
The topdownr package allows automatic and systematic investigation of fragmentation conditions. It creates Thermo Orbitrap Fusion Lumos method files to test hundreds of fragmentation conditions. Additionally it provides functions to analyse and process the generated MS data and determine the best conditions to maximise overall fragment coverage.
Maintained by Sebastian Gibb. Last updated 5 months ago.
immunooncology infrastructure proteomics massspectrometry coverage mass-spectrometry topdown
1 star 5.08 score
statistikat
x12:Interface to 'X12-ARIMA'/'X13-ARIMA-SEATS' and Structure for Batch Processing of Seasonal Adjustment
The 'X13-ARIMA-SEATS' <https://www.census.gov/data/software/x13as.html> methodology and software are widely used and were developed by the US Census Bureau. They can be accessed from 'R' with this package, and the 'X13-ARIMA-SEATS' binaries are provided by the 'R' package 'x13binary'.
Maintained by Alexander Kowarik. Last updated 3 years ago.
18 stars 5.06 score 57 scripts
yanrong-stacy-song
creditr:Credit Default Swaps
Price credit default swaps using 'C' code from the International Swaps and Derivatives Association CDS Standard Model. See <https://www.cdsmodel.com/cdsmodel/documentation.html> for more information about the model and <https://www.cdsmodel.com/cdsmodel/cds-disclaimer.html> for license details for the 'C' code.
Maintained by Yanrong Song. Last updated 5 days ago.
5.05 score 32 scripts
modal-inria
Rankcluster:Model-Based Clustering for Multivariate Partial Ranking Data
Implementation of a model-based clustering algorithm for ranking data (C. Biernacki, J. Jacques (2013) <doi:10.1016/j.csda.2012.08.008>). Multivariate rankings as well as partial rankings are taken into account. This algorithm is based on an extension of the Insertion Sorting Rank (ISR) model for ranking data, which is a meaningful and effective model parametrized by a position parameter (the modal ranking, denoted mu) and a dispersion parameter (denoted pi). The heterogeneity of the rank population is modelled by a mixture of ISR models, whereas a conditional independence assumption is made for multivariate rankings.
Maintained by Quentin Grimonprez. Last updated 2 years ago.
clustering hacktoberfest rank cpp
1 star 5.05 score 37 scripts 1 dependent
emanuelsommer
portvine:Vine Based (Un)Conditional Portfolio Risk Measure Estimation
Following Sommer (2022) <https://mediatum.ub.tum.de/1658240>, portfolio-level risk estimates (e.g. Value at Risk, Expected Shortfall) are obtained by modeling each asset univariately with an ARMA-GARCH model and then their cross-dependence via a Vine Copula model in a rolling-window fashion. One can even condition on variables/time series at certain quantile levels to stress test the risk measure estimates.
Maintained by Emanuel Sommer. Last updated 1 year ago.
expected-shortfall garch-models value-at-risk vine-copulas cpp
22 stars 5.04 score 6 scripts
alexzwanenburg
familiar:End-to-End Automated Machine Learning and Model Evaluation
Single unified interface for end-to-end modelling of regression, categorical and time-to-event (survival) outcomes. Models created using familiar are self-contained, and their use does not require additional information such as baseline survival, feature clustering, or feature transformation and normalisation parameters. Model performance, calibration, risk group stratification, (permutation) variable importance, individual conditional expectation, partial dependence, and more, are assessed automatically as part of the evaluation process and exported in tabular format and plotted, and may also be computed manually using export and plot functions. Where possible, metrics and values obtained during the evaluation process come with confidence intervals.
Maintained by Alex Zwanenburg. Last updated 6 months ago.
ai explainable-ai machine-learning survival-analysis tabular-data
30 stars 5.03 score 18 scripts
bioc
podkat:Position-Dependent Kernel Association Test
This package provides an association test that is capable of dealing with very rare and even private variants. This is accomplished by a kernel-based approach that takes the positions of the variants into account. The test can be used for pre-processed matrix data, but also directly for variant data stored in VCF files. Association testing can be performed whole-genome, whole-exome, or restricted to pre-defined regions of interest. The test is complemented by tools for analyzing and visualizing the results.
Maintained by Ulrich Bodenhofer. Last updated 5 months ago.
genetics wholegenome annotation variantannotation sequencing dataimport curl bzip2 xz-utils zlib cpp
5.02 score 6 scripts
evolutionary-optimization-laboratory
rmoo:Multi-Objective Optimization in R
The 'rmoo' package is a framework for multi- and many-objective optimization, which allows researchers and users versatility in parameter configuration, as well as tools for analysis, replication and visualization of results. The 'rmoo' package was built as a fork of the 'GA' package by Luca Scrucca (2017) <DOI:10.32614/RJ-2017-008> and implements the Non-Dominated Sorting Genetic Algorithms proposed by K. Deb.
Maintained by Francisco Benitez. Last updated 5 months ago.
metaheuristics multiobjective multiobjective-optimization nsga nsga2 nsga3 optimization pareto-front
30 stars 5.01 score 23 scripts
parksw3
fitode:Tools for Ordinary Differential Equations Model Fitting
Methods and functions for fitting ordinary differential equation (ODE) models in 'R'. Sensitivity equations are used to compute the gradients of ODE trajectories with respect to underlying parameters, which in turn allows for more stable fitting. Other fitting methods, such as MCMC (Markov chain Monte Carlo), are also available.
Maintained by Sang Woo Park. Last updated 1 month ago.
6 stars 5.01 score 34 scripts
bioc
fmrs:Variable Selection in Finite Mixture of AFT Regression and FMR Models
The package obtains parameter estimates, i.e., maximum likelihood estimators (MLE), via the Expectation-Maximization (EM) algorithm for the Finite Mixture of Regression (FMR) models with Normal distribution, and MLE for the Finite Mixture of Accelerated Failure Time Regression (FMAFTR) subject to right censoring with Log-Normal and Weibull distributions via the EM algorithm and the Newton-Raphson algorithm (for Weibull distribution). More importantly, the package obtains the maximum penalized likelihood estimators (MPLE) for both FMR and FMAFTR models (collectively called FMRs). A component-wise tuning parameter selection based on a component-wise BIC is implemented in the package. Furthermore, this package provides Ridge Regression and Elastic Net.
Maintained by Farhad Shokoohi. Last updated 5 months ago.
survival regression dimensionreduction
3 stars 5.00 score 55 scripts 1 dependent
r-forge
plasma:Partial LeAst Squares for Multiomic Analysis
Contains tools for supervised analyses of incomplete, overlapping multiomics datasets. Applies partial least squares in multiple steps to find models that predict survival outcomes. See Yamaguchi et al. (2023) <doi:10.1101/2023.03.10.532096>.
Maintained by Kevin R. Coombes. Last updated 2 months ago.
4.97 score 13 scripts
feiyoung
ILSE:Linear Regression Based on 'ILSE' for Missing Data
Linear regression when covariates include missing values by embedding the correlation information between covariates. It works especially well for block-missing data. 'ILSE' conducts imputation and regression simultaneously and iteratively. For more details, see Huazhen Lin, Wei Liu and Wei Lan (2021) <doi:10.1080/07350015.2019.1635486>.
Maintained by Wei Liu. Last updated 1 year ago.
fiml ilse linear-regression missing-data openblas cpp
2 stars 4.95 score 3 scripts