Showing 22 of 22 results
tidymodels
recipes: Preprocessing and Feature Engineering Steps for Modeling
A recipe prepares your data for modeling. We provide an extensible framework for pipeable sequences of feature engineering steps that provide preprocessing tools to be applied to data. Statistical parameters for the steps can be estimated from an initial data set and then applied to other data sets. The resulting processed output can then be used as inputs for statistical or machine learning models.
Maintained by Max Kuhn. Last updated 3 hours ago.
586 stars 18.79 score 7.2k scripts 383 dependents
mhahsler
arules: Mining Association Rules and Frequent Itemsets
Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.
Maintained by Michael Hahsler. Last updated 2 months ago.
arules, association-rules, frequent-itemsets
194 stars 13.99 score 3.3k scripts 28 dependents
epiforecasts
EpiNow2: Estimate Real-Time Case Counts and Time-Varying Epidemiological Parameters
Estimates the time-varying reproduction number, rate of spread, and doubling time using a range of open-source tools (Abbott et al. (2020) <doi:10.12688/wellcomeopenres.16006.1>), and current best practices (Gostic et al. (2020) <doi:10.1101/2020.06.18.20134858>). It aims to help users avoid some of the limitations of naive implementations in a framework that is informed by community feedback and is actively supported.
Maintained by Sebastian Funk. Last updated 1 month ago.
backcalculation, covid-19, gaussian-processes, open-source, reproduction-number, stan, cpp
123 stars 11.86 score 210 scripts
bioc
EnrichedHeatmap: Making Enriched Heatmaps
An enriched heatmap is a special type of heatmap that visualizes the enrichment of genomic signals on specific target regions. Here we implement the enriched heatmap with the ComplexHeatmap package. Since this type of heatmap is just a normal heatmap with some special settings, the functionality of ComplexHeatmap makes it much easier to customize the heatmap and to concatenate it to a list of heatmaps to show correspondence between different data sources.
Maintained by Zuguang Gu. Last updated 5 months ago.
software, visualization, sequencing, genomeannotation, coverage, cpp
190 stars 10.87 score 330 scripts 1 dependents
murrayefford
secr: Spatially Explicit Capture-Recapture
Functions to estimate the density and size of a spatially distributed animal population sampled with an array of passive detectors, such as traps, or by searching polygons or transects. Models incorporating distance-dependent detection are fitted by maximizing the likelihood. Tools are included for data manipulation and model selection.
Maintained by Murray Efford. Last updated 2 days ago.
3 stars 10.06 score 410 scripts 5 dependents
epiverse-trace
epiparameter: Classes and Helper Functions for Working with Epidemiological Parameters
Classes and helper functions for loading, extracting, converting, manipulating, plotting and aggregating epidemiological parameters for infectious diseases. Epidemiological parameters extracted from the literature are loaded from the 'epiparameterDB' R package.
Maintained by Joshua W. Lambert. Last updated 2 months ago.
data-access, data-package, epidemiology, epiverse, probability-distribution
34 stars 9.82 score 102 scripts 1 dependents
vigou3
actuar: Actuarial Functions and Heavy Tailed Distributions
Functions and data sets for actuarial science: modeling of loss distributions; risk theory and ruin theory; simulation of compound models, discrete mixtures and compound hierarchical models; credibility theory. Support for many additional probability distributions to model insurance loss size and frequency: 23 continuous heavy tailed distributions; the Poisson-inverse Gaussian discrete distribution; zero-truncated and zero-modified extensions of the standard discrete distributions. Support for phase-type distributions commonly used to compute ruin probabilities. Main reference: <doi:10.18637/jss.v025.i07>. Implementation of the Feller-Pareto family of distributions: <doi:10.18637/jss.v103.i06>.
Maintained by Vincent Goulet. Last updated 3 months ago.
12 stars 9.44 score 732 scripts 35 dependents
matloff
regtools: Regression and Classification Tools
Tools for linear, nonlinear and nonparametric regression and classification. Novel graphical methods for assessment of parametric models using nonparametric methods. One vs. All and All vs. All multiclass classification, optional class probabilities adjustment. Nonparametric regression (k-NN) for general dimension, local-linear option. Nonlinear regression with Eicker-White method for dealing with heteroscedasticity. Utilities for converting time series to rectangular form. Utilities for conversion between factors and indicator variables. Some code related to "Statistical Regression and Classification: from Linear Models to Machine Learning", N. Matloff, 2017, CRC, ISBN 9781498710916.
Maintained by Norm Matloff. Last updated 2 months ago.
127 stars 9.39 score 48 scripts 3 dependents
mi2-warsaw
FSelectorRcpp: 'Rcpp' Implementation of 'FSelector' Entropy-Based Feature Selection Algorithms with a Sparse Matrix Support
'Rcpp' (free of 'Java'/'Weka') implementation of 'FSelector' entropy-based feature selection algorithms based on an MDL discretization (Fayyad U. M., Irani K. B.: Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. In 13th International Joint Conference on Artificial Intelligence (IJCAI93), pages 1022-1029, Chambery, France, 1993.) <https://www.ijcai.org/Proceedings/93-2/Papers/022.pdf> with sparse matrix support.
Maintained by Zygmunt Zawadzki. Last updated 6 months ago.
entropy, feature-selection, rcpp, sparse-matrix, cpp
35 stars 8.22 score 78 scripts 1 dependents
bioc
MOSim: Multi-Omics Simulation (MOSim)
The MOSim package simulates multi-omic experiments that mimic regulatory mechanisms within the cell, allowing flexible experimental design including time course and multiple groups.
Maintained by Sonia Tarazona. Last updated 5 months ago.
software, timecourse, experimentaldesign, rnaseq, cpp
9 stars 7.46 score 11 scripts
meyerp-software
infotheo: Information-Theoretic Measures
Implements various measures of information theory based on several entropy estimators.
Maintained by Patrick E. Meyer. Last updated 3 years ago.
6.22 score 480 scripts 44 dependents
cran
entropy: Estimation of Entropy, Mutual Information and Related Quantities
Implements various estimators of entropy for discrete random variables, including the shrinkage estimator by Hausser and Strimmer (2009), the maximum likelihood and the Miller-Madow estimator, various Bayesian estimators, and the Chao-Shen estimator. It also offers an R interface to the NSB estimator. Furthermore, the package provides functions for estimating the Kullback-Leibler divergence, the chi-squared divergence, mutual information, and the chi-squared divergence of independence. It also computes the G statistic and the chi-squared statistic and corresponding p-values. Furthermore, there are functions for discretizing continuous random variables.
Maintained by Korbinian Strimmer. Last updated 3 years ago.
4 stars 5.68 score 62 dependents
gamlss-dev
gamlss2: GAMLSS Infrastructure for Flexible Distributional Regression
Next generation infrastructure for generalized additive models for location, scale, and shape (GAMLSS) and distributional regression more generally. The package provides a fresh reimplementation of the classic 'gamlss' package while being more modular and facilitating the creation of advanced terms and models.
Maintained by Nikolaus Umlauf. Last updated 1 month ago.
8 stars 5.24 score 4 scripts 1 dependents
bioc
HiTC: High Throughput Chromosome Conformation Capture analysis
The HiTC package was developed to explore high-throughput 'C' data such as 5C or Hi-C. Dedicated R classes as well as standard methods for quality controls, normalization, visualization, and further analysis are also provided.
Maintained by Nicolas Servant. Last updated 5 months ago.
sequencing, highthroughputsequencing, hic
5.23 score 42 scripts
cran
biclust: BiCluster Algorithms
The main function biclust() provides several algorithms to find biclusters in two-dimensional data: Cheng and Church (2000, ISBN:1-57735-115-0), spectral (2003) <doi:10.1101/gr.648603>, plaid model (2005) <doi:10.1016/j.csda.2004.02.003>, xmotifs (2003) <doi:10.1142/9789812776303_0008> and bimax (2006) <doi:10.1093/bioinformatics/btl060>. In addition, the package provides methods for data preprocessing (normalization and discretisation), visualisation, and validation of bicluster solutions.
Maintained by Sebastian Kaiser. Last updated 2 years ago.
3 stars 4.74 score 16 dependents
lbelzile
BMAmevt: Multivariate Extremes: Bayesian Estimation of the Spectral Measure
Toolkit for Bayesian estimation of the dependence structure in multivariate extreme value parametric models, following Sabourin and Naveau (2014) <doi:10.1016/j.csda.2013.04.021> and Sabourin, Naveau and Fougeres (2013) <doi:10.1007/s10687-012-0163-0>.
Maintained by Leo Belzile. Last updated 2 years ago.
3.90 score 16 scripts
rbarkerclarke
gtexture: Generalized Application of Co-Occurrence Matrices and Haralick Texture
Generalizes application of gray-level co-occurrence matrix (GLCM) metrics to objects outside of images. The current focus is to apply GLCM metrics to the study of biological networks and fitness landscapes that are used in studying evolutionary medicine and biology, particularly the evolution of cancer resistance. The package was used in our publication, Barker-Clarke et al. (2023) <doi:10.1088/1361-6560/ace305>. A general reference to learn more about mathematical oncology can be found at Rockne et al. (2019) <doi:10.1088/1478-3975/ab1a09>.
Maintained by Rowan Barker-Clarke. Last updated 12 months ago.
3.00 score 1 scripts
cran
cdparcoord: Top Frequency-Based Parallel Coordinates
Parallel coordinate plotting with resolutions for large data sets and missing values.
Maintained by Norm Matloff. Last updated 6 years ago.
2.70 score
cran
rbooster: AdaBoost Framework for Any Classifier
This is a simple package that provides a function for boosting ready-made or custom-made classifiers. The package uses Discrete AdaBoost (<doi:10.1006/jcss.1997.1504>) and Real AdaBoost (<doi:10.1214/aos/1016218223>) for two-class classification, and SAMME (<doi:10.4310/SII.2009.v2.n3.a8>) and SAMME.R (<doi:10.4310/SII.2009.v2.n3.a8>) for multiclass classification.
Maintained by Fatih Saglam. Last updated 3 years ago.
2.70 score
mrmarjan
multiUS: Functions for the Courses Multivariate Analysis and Computer Intensive Methods
Provides utility functions for multivariate analysis (factor analysis, discriminant analysis, and others). The package is primarily written for the courses Multivariate Analysis and Computer Intensive Methods in the master's program in Applied Statistics at the University of Ljubljana.
Maintained by Cugmas Marjan. Last updated 2 years ago.
1.23 score 17 scripts